Monday, October 31, 2011

Friday, October 28, 2011

One Million Healthy Children

Last Thursday, IBM Research announced a collaborative research effort with Georgia Tech that aims to improve pediatric healthcare at the payment and policy level. The project, called One Million Healthy Children, will apply advanced modeling and analytics to thousands of children's healthcare records to better understand the inefficiencies of today's model, which encourages fee for service rather than prevention and precise diagnoses. By examining factors far beyond the realm of medicine - the region's literacy make-up, transportation hubs and access, healthy food stores and socioeconomic status of families - IBM and Georgia Tech hope to give doctors, policymakers and patients a better idea of how to approach disease. The team will first look at diabetes, which accounts for over $174B in costs in the U.S. per year.


The modeling technology used in this initiative was developed at IBM Research - Almaden by a services research team led by Paul Maglio. The Smarter Planet Platform for the Analysis and Simulation of Health is a tool that uses a plug-and-play format to insert the factors that make up a population's health - the system of systems, so to speak. Cheryl Kieliszewski, an IBM researcher on the project, tells us a little more:

We are thrilled to partner with Georgia Tech on the One Million Healthy Children project.  Together, we aim to tackle a difficult, multi-dimensional problem in health – kids experiencing preventable chronic non-communicable diseases, such as obesity and diabetes, at ever younger ages, and which will have a major influence on overall health and well being throughout their lifespan.  To do this, we will explore a number of ways to create and use complex composite models to examine what-if scenarios to improve children's health.  

The partnership brings together Bill Rouse and his team at Georgia Tech, which has expertise in complex adaptive systems, in particular within healthcare modeling, and our team, which has expertise in composite model assembly. On the one hand, it provides the team an opportunity to help understand a difficult societal challenge – keeping our kids healthy – and on the other, it forces the team to confront technical issues, such as how to semi-automatically couple models from different domains, and also social and process issues to support complex decision making. We’re very excited about the potential of this project and for continued work with the Georgia Tech team.


IBM Press Release
"What Is or What If" IBM Research news blog guest post by Professor William Rouse, Georgia Institute of Technology's Tennenbaum Institute Executive Director, co-chair of the National Academies Healthy America Initiative and member of the National Academy of Engineering
eWeek
HealthITNews
Information Week

Friday, October 21, 2011

We won an Emmy!

Earlier today, IBM received a unique award. Together with FOX, an IBM Research project born out of Almaden won an Engineering Emmy award for Innovation from the Academy of Television Arts and Sciences. According to the Academy, by improving the ability of media companies to capture, manage and exploit content in digital form, IBM and Fox have fundamentally changed the way that audio and video content is managed and stored.

The Linear Tape File System (LTFS), invented by IBM's lauded Research Division, enabled major improvements in digital workflow and dramatic reductions in the costs associated with capturing, storing and repurposing media content while providing dramatic improvements in transfer rates, storage density, automated workflow, meta-data capture and content availability.  Combining digital broadcast and IT standards in a broadcast environment, the LTFS has enabled real-time content recording and high-speed recovery of content to be a broadly-supported, multi-industry solution.


Michael Richmond, Brian Biskeborn, David Pease, Arnon Amir (Almaden Research), and Shinobu Fujihara (Yamato) at the 2010 NAB show where LTFS was announced and released.
In a blog post earlier this year, IBMer Tony Pearson, Master Inventor and Senior Managing Consultant for the IBM System Storage product line, wrote:

"With the capabilities of LTFS, IBM has introduced an entirely new role for tape, as an attractive high capacity, easy to use, low cost and shareable storage media. LTFS can make tape usable in a fashion like removable external disk, a giant alternative to floppy diskettes, DVD-RW and USB memory sticks with directory tree access and file-level drag-and-drop capability. LTFS can allow for the passing of information around from one system or employee to another. And as for high video storage capacity, a 1.5TB LTO-5 cartridge can hold about 50 hours of XDCAM HD video!"

David Pease, lead researcher on the project, is a longtime storage research expert at the Almaden lab in San Jose, CA. Having pioneered many of the tape and disk storage technologies to come out of IBM Research over the last decade, David recalls a significant factor in deciding to pursue this project the way he did. "We really needed to make the first version open source," David said. "The idea of a file system that was cross-platform and interoperable was key; we wanted people to have an interface they were familiar with, similar to disk with file folders, drag and drop and double-click, but we also wanted to make sure it wasn't tied to only Windows or only Unix. The real future for acceptance for just about any kind of storage technology is interoperability and that people aren't tied to a platform."

David and his team developed LTFS from concept to fruition in just under three years, an impressive feat in the research world. He shares some thoughts about winning an Emmy for his work:


First, I am truly stunned.  This recognition is more than we ever expected so early in the project, and hopefully it reflects the importance of what we've done.  When we started this work, we said that our goal was to change the tape industry and the Media and Entertainment business; it seems that we are well on the way to realizing these goals.

I have to point out that an idea and project like this are never the work of an individual.  From Ed Childers and the other tape experts in Tucson, to the folks at Almaden who encouraged me to get involved with tape (again), to the team of great researchers and developers who worked on this in my group, to the tape specialists in the Yamato Lab who joined my team or worked to support it, I have to say that we couldn't have gotten here without the efforts of each of you.  Thank you all for making this possible!

Fun fact: This past February, David Pease completed a 41-day motorcycle ride from San Jose, CA to Panama City, Panama with 3 companions on different stages of the trip. You can read about his travels through California, Northern, Central and Southeast Mexico, Guatemala, Honduras, Nicaragua, Costa Rica and finally Panama at his blog here.


*****UPDATE Monday, October 31*****

Here's a picture of David with the Emmy. You can see more pictures of the team from the Awards Ceremony here.



Friday, October 7, 2011

IBM Research - Almaden in Councilwoman Nancy Pyle's Newsletter


In 2005, Nancy began her first term on the San Jose City Council as the representative for District 10. A retired teacher of over 25 years and a former Community College Board Trustee, Nancy represents approximately 100,000 residents from Almaden Valley and Blossom Valley.

As District 10's City Councilmember, Nancy takes an innovative approach to solving today's challenges through creativity and collaboration. She has focused city government on the basics, maintaining our streets and parks, keeping our neighborhoods safe, investing resources in youth and senior programs and making city government more accountable.


Councilmember Pyle is the Chair of the Neighborhood Services and Education Committee and a member of the Airport Competitiveness Committee. She serves as the Council liaison to the Parks and Recreation Commission and the Disability Advisory Commission. Councilmember Pyle also serves on the Santa Clara County Emergency Preparedness Council.

 
Nancy Pyle is a graduate of LeMoyne College in Syracuse, New York, where she majored in French (she’s fluent!), and she earned a master’s degree in Educational Administration from the US International University in San Diego. In 1960, Nancy moved to San Jose with her family, where she served as a teacher, Community Relations Manager, and Legislative Analyst for San Jose Unified School District.

Monday, October 3, 2011

Best Paper Award for "An Optimal Algorithm for the Distinct Elements Problem"

Last month, the winners of IBM's 2010 Pat Goldberg Memorial Best Paper competition in computer science, electrical engineering and math were announced. IBM Research - Almaden computer scientist David Woodruff co-authored one of the winning papers, titled "An Optimal Algorithm for the Distinct Elements Problem" with Daniel M. Kane (Harvard University) and Jelani Nelson (MIT) for PODS 2010 (ACM Symposium on Principles of Database Systems).

The Professional Interest Communities at IBM Research (PICs) reviewed close to 120 papers submitted by IBM Research authors and nominated 34 for best paper consideration. A worldwide Research team reviewed the nominated papers and selected four outstanding papers as the award winners.

All of the submitted papers represent IBM Researchers advancing our field and are indicative of our commitment to long-term, exploratory work that can change the way we look at the world.

Below, David explains a bit about his paper, what it might mean for the advancement of mathematical discovery and what it's like working for IBM.

Estimating the number of distinct attributes is a fundamental practical and theoretical problem in database applications dating back to the 1970s. It arises in trying to optimize a query sequence, where keeping the number of distinct elements small at intermediate stages in the sequence ensures the overall running time is low. It is also useful for comparing two data sets, e.g., how many new items did we get by putting the two datasets together? 
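To make the problem concrete, here is a minimal sketch of one classic small-memory estimator, the k-minimum-values method. It is not the algorithm from the award-winning paper, which is considerably more refined; it only illustrates how a stream can be summarized in far less memory than storing every item. The names `_hash01`, `estimate_distinct` and the parameter `k` are illustrative choices, not taken from the paper.

```python
import hashlib
import heapq


def _hash01(item: str) -> float:
    """Map an item to a pseudo-random, roughly uniform point in [0, 1]."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_distinct(stream, k: int = 256) -> float:
    """Estimate the number of distinct items while keeping only k hash values."""
    heap = []     # max-heap (values negated) holding the k smallest hashes so far
    kept = set()  # the same hashes, for constant-time duplicate checks
    for item in stream:
        h = _hash01(item)
        if h in kept:
            continue  # already counted this item
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        return float(len(heap))  # fewer than k distinct hashes seen: exact count
    # If n distinct hashes are spread uniformly, the k-th smallest sits near k/n.
    return (k - 1) / -heap[0]
```

For example, `estimate_distinct((f"user-{i % 5000}" for i in range(20000)), k=1024)` should land close to 5000 while storing only 1024 hash values; the relative error of this method shrinks roughly as 1/sqrt(k).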

The techniques in this paper will be useful for improving the memory and time complexity of a number of fundamental problems in the data stream literature, related to estimating the number of distinct elements, such as estimating rarity, similarity, union sizes of databases, etc. 

They can also be used for estimating the number of distinct elements in the distributed model, sliding window model, and time-decayed model. In the distributed model, there are multiple servers, each holding a database, and we want to estimate the number of distinct elements in the union of the databases with as little communication and computational overhead as possible. On a router monitoring distinct source-destination traffic passing through it, for instance, we may be interested only in the recent traffic, or we may at least want to give more weight to recent traffic. These variations correspond to the sliding-window and time-decayed models, respectively.
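The distributed setting can be illustrated with a mergeable summary. The sketch below uses linear counting, a simple bitmap technique chosen here for brevity rather than the paper's method: each server hashes its items into a fixed-size bitmap, and the union is estimated from the bitwise OR of the bitmaps, so only the bitmaps travel over the network. The class name `BitmapSketch` and its parameters are hypothetical.

```python
import hashlib
import math


class BitmapSketch:
    """Linear-counting sketch: one bit per hash bucket, mergeable with bitwise OR."""

    def __init__(self, num_bits: int = 1 << 16):
        self.num_bits = num_bits
        self.bits = 0  # a Python int used as a bit vector

    def add(self, item: str) -> None:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % self.num_bits
        self.bits |= 1 << bucket

    def merge(self, other: "BitmapSketch") -> "BitmapSketch":
        """Return the sketch of the union of both inputs."""
        if self.num_bits != other.num_bits:
            raise ValueError("sketches must use the same number of bits")
        merged = BitmapSketch(self.num_bits)
        merged.bits = self.bits | other.bits
        return merged

    def estimate(self) -> float:
        zeros = self.num_bits - bin(self.bits).count("1")
        if zeros == 0:
            return float("inf")  # bitmap saturated; a larger one is needed
        # With n distinct items, the expected fraction of zero bits is about exp(-n / m).
        return -self.num_bits * math.log(zeros / self.num_bits)


# Two servers with overlapping data: 3000 items each, 5000 distinct overall.
server_a, server_b = BitmapSketch(), BitmapSketch()
for i in range(3000):
    server_a.add(f"item-{i}")
for i in range(2000, 5000):
    server_b.add(f"item-{i}")
union_estimate = server_a.merge(server_b).estimate()
```

Each server ships a 65,536-bit (8 KB) bitmap instead of its raw data, and items present on both servers are counted once because they set the same bit. The trade-off is that the bitmap must be sized in proportion to the largest count expected, which is exactly the kind of memory cost that more advanced algorithms reduce.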

My work closes a long-standing problem in the area of data streams. IBM Research has made a sustained effort to design data stream and sub-linear algorithms for a wide variety of problems, e.g., those in graph theory, machine learning, network traffic analysis, numerical linear algebra, and statistics. This work significantly bolsters that effort. 

I’m very grateful for the amazing amount of freedom that I’ve been given at IBM, and I have been able to use it to highly optimize my time and productivity. Interacting with interesting colleagues is one of the most exciting aspects of my job. It’s really enjoyable and a great learning experience.

Research interests: data stream algorithms, communication complexity, numerical linear algebra, graph algorithms, coding theory, and cryptography 

Inspirational figures: The super theory group at IBM Research - Almaden 

When I'm not working, I'm: home remodeling, playing basketball, practicing Chinese, traveling

Favorite travel spots: Banff, Canada. Hangzhou, China. Venice, Italy. 

David Woodruff is a Research Staff Member in IBM Research - Almaden's Principles and Methodologies Group. He received a B.S. in computer science, a B.S. in mathematics, an M.Eng. in computer science and a Ph.D. in computer science, each from MIT. David also contributes to IBM's cognitive computing initiative, SyNAPSE.