Big Data Analytics in Life Science and Healthcare – Scope, Challenges and Success stories

Big data is a large collection of data sets drawn from various sources. When the volume, velocity, variety and veracity of an organization's data exceed its storage or computing capacity, there are some big challenges that need to be addressed.

Analytics research aims to develop complex procedures that run over very large data repositories, with the objective of extracting the useful knowledge hidden in them. One of the most significant application scenarios in which big data arises is, without a doubt, scientific computing.

Among IT tools, Hadoop and Teradata are major players in the big data analytics industry and are used by many companies across a variety of projects.

Big Data analytics in Science and Healthcare:

Scientists and researchers produce huge amounts of data every day through experiments in disciplines such as high-energy physics, astronomy, biology and biomedicine. But extracting useful knowledge for decision-making from these massive data repositories is almost impossible with current DBMS-inspired analysis tools. As a consequence, data-driven approaches using big data analytics in biology, medicine, public policy, social sciences and humanities can replace traditional hypothesis-driven research in science.

Big data analytics is reshaping nearly every aspect of healthcare, which represents nearly a fifth of the U.S. economy. McKinsey has estimated that big data projects can save $300 billion to $450 billion, which many regard as a conservative baseline.

Behavioral Analytics:

Behavioral analytics is the process of understanding how patients or customers interact and the dynamics of those interactions, which helps in deciding which healthcare products to offer them.

Big data analytics is transforming entire industries and redefining how humans interact with companies and each other. It’s changing how we work, live, eat, sleep and play.

Evidence-based medicine is changing how patients are diagnosed and often demonstrates that alternative treatments are more effective (and cost-effective) than conventional care.

Pharmaceutical companies are using real-world signals to track adverse events and focus R&D programs on the medications most likely to succeed. Researchers and epidemiologists map disease outbreaks and break down new pathogens into their genetic components to stop outbreaks. Smarter screening, more powerful population management, reduced claims fraud, health plans that give incentives for healthier behaviors, more efficient surgery center operations: in all these ways, big data is making for a healthier healthcare system.

Challenges and Concerns of big data analytics industry:

Major industries from retail to aeronautics are leveraging big data. But despite the abundance of data in healthcare, and the clear promise of big data analytics, the sector has been slow to put it to work because of a shortage of big data skills among science graduates.

Dealing with the Security Issues of Big Data:

Big data that resides within a Hadoop or Teradata environment can contain personally identifiable information (PII), such as the names, addresses and social security numbers of patients, clients, customers and employees. Because this data is so sensitive, and because of the damage that can be done should it fall into the wrong hands, it is imperative that it be protected from unauthorized access.
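As a rough illustration of one protective step, here is a minimal Python sketch that masks social security numbers in free text and replaces a direct identifier with a keyed hash before the data is handed to analysts. The function names, the SSN pattern and the 16-character token length are assumptions made for this example. They are not part of Hadoop or Teradata, and masking of this kind supplements access controls and encryption rather than replacing them.

```python
import hashlib
import hmac
import re

# Assumed format for U.S. social security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_ssns(text):
    """Replace anything that looks like an SSN with a fixed mask."""
    return SSN_PATTERN.sub("***-**-****", text)


def pseudonymize(value, secret_key):
    """Return a stable token for a name or ID using a keyed hash (HMAC-SHA256).

    The same value and key always give the same token, so records can still
    be joined, but the original value is not stored in the analytics data set.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # hypothetical token length


if __name__ == "__main__":
    key = b"replace-with-a-real-secret"  # hypothetical key, kept outside the data set
    print(mask_ssns("Patient SSN 123-45-6789 on file"))
    print(pseudonymize("Jane Doe", key))
```

The keyed hash is used instead of a plain hash so that someone without the key cannot rebuild the tokens from a list of known names.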

Getting & Keeping the Talent Necessary to Make Use of Big Data Analytics:

Businesses are feeling the data talent shortage. Data scientists are in short supply, and successfully implementing a big data project requires a sophisticated team of developers, data scientists and analysts who also have enough domain knowledge to identify valuable insights.

According to a report by the McKinsey Global Institute (MGI), the United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.

Investment and Cost management:

It’s difficult to project the cost of a big data project, and given how quickly such projects scale, they can quickly eat up resources. The challenge lies in taking into account all costs of the project, from acquiring new hardware to paying a cloud provider to hiring additional personnel. Many companies initiate big data analytics projects but are not able to complete them because of these challenges, and in some cases projects are put on hold or postponed.

Lack of Quality Data:

Data quality is not a new concern, but the ability to store every piece of data a business produces in its original form compounds the problem. Dirty data costs companies in the United States $600 billion every year. Common causes of dirty data that must be addressed include user input errors, duplicate data, and incorrect data linking.
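To make two of these causes concrete, here is a small Python sketch that normalizes records, rejects obviously invalid user input and drops duplicates. The `PatientRecord` fields and the accepted birth-year range are hypothetical and chosen only for illustration. Real pipelines usually need fuzzier matching than the exact comparison used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical record layout used only for this example.
    patient_id: str
    name: str
    birth_year: int


def normalize(record):
    """Trim whitespace and unify casing so equal records compare equal."""
    return PatientRecord(
        patient_id=record.patient_id.strip().upper(),
        name=" ".join(record.name.split()).title(),
        birth_year=record.birth_year,
    )


def clean(records, min_year=1900, max_year=2025):
    """Return (kept, rejected): deduplicated valid records and invalid ones.

    The year bounds are an assumed plausibility check for input errors.
    """
    seen = set()
    kept, rejected = [], []
    for raw in records:
        rec = normalize(raw)
        if not (min_year <= rec.birth_year <= max_year):
            rejected.append(rec)  # likely a user input error
            continue
        if rec in seen:
            continue  # exact duplicate after normalization
        seen.add(rec)
        kept.append(rec)
    return kept, rejected


if __name__ == "__main__":
    rows = [
        PatientRecord(" p001 ", "jane  doe", 1980),
        PatientRecord("P001", "Jane Doe", 1980),
        PatientRecord("P002", "John Roe", 1780),
    ]
    kept, rejected = clean(rows)
    print(kept)
    print(rejected)
```

Normalizing before comparing is what lets the first two rows collapse into one record. Without it, the stray spaces and lowercase letters would hide the duplicate.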

Lack of Actionable Insights:

Having more data doesn’t necessarily lead to actionable insights. A key challenge for data science teams is to identify a clear business objective and the appropriate data sources to collect and analyze to meet that objective. The challenge doesn’t stop there, however. Once key patterns have been identified, businesses must be prepared to act and make necessary changes in order to derive business value from them.

Most industries have already started investing in big data analytics, but their main concern is finding the right talent with the required skills in the market.

Success Stories:

NASA: Unlocking the secrets of the Universe with Big Data

In the time it takes to read this sentence, NASA gathers approximately 1.73 gigabytes of data from its nearly 100 currently active missions.

NASA uses big data analytics to keep hundreds of satellites in orbit and to fulfill its vision of reaching for new heights and revealing the unknown for the benefit of all humankind. NASA collects “big bang” data from across the solar system to unlock the secrets of the universe. This huge collection of data is one of NASA’s most valuable assets, and its strategic importance to the agency’s research and science is huge. The challenges NASA faces are handling, storing, and managing this data in an easily accessible manner.

Video: Big Data Cloud Computing Helps NASA JPL Rover Curiosity Land on Mars
Video: Evolution of NASA Earth Science Data Systems in the Era of Big Data

In projects like the Square Kilometer Array, big data analytics helps aggregate data from tens of thousands of radio telescopes to figure out how galaxies were formed at the “cosmic dawn”. Not surprisingly, NASA embraces extreme innovation in architecting its storage and data management environment. It applies algorithms to data in tens of thousands of different formats from a range of sources, including unmanned rovers and probes, earth-bound telescopes and observatories around the world, and archives it for future analysis as planetary science progresses.

From analyzing real-time solar plasma ejections and monitoring global climate change to optimizing large-scale engineering designs and modernizing the way it approaches mission operations, NASA is a leader in the application of big data.

NASA is involved in analyzing data collected from planes to study safety implications, which in turn will help with commercial airlines’ maintenance procedure improvements and potentially prevent equipment failures. Using advanced algorithms, the agency helped tease out relevant information from a mountain of unstructured data to help predict and prevent safety problems.

Read more in the NASA Blog

Transforming into One Red Cross with Data Driven Insights – Smartly using big data to save lives:

510 is a Red Cross start-up hosted by the Netherlands Red Cross. It focuses on using data and artificial intelligence to help with disaster risk reduction, resilience and response. Its work is complex but holds exciting potential for the humanitarian sector.


Video: Using Data to Help Convert Red Cross “Disaster Donors”

Together with a team of students, volunteers and staff, 510 has co-developed a state-of-the-art data-driven solution, the Priority Index, which predicts damage just hours after a disaster strikes. Organizations such as Red Cross and Red Crescent National Societies, governments or UN OCHA can use these results to better understand the impact of a natural disaster and mobilize a humanitarian response faster, with a focus on the most vulnerable and damaged areas.

This methodology is able to process and learn from similar disasters that happened in the past by comparing historical impact data with secondary data, for instance, population, disaster statistics, wind speeds and rainfall. The whole process forecasts areas that are most likely to be badly affected and should therefore be considered as a priority.
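The idea of learning from similar past disasters can be sketched in a few lines of Python. This is a toy nearest-neighbour estimate and not the actual Priority Index model: the historical figures, the two features (wind speed and rainfall) and the function names are all invented for illustration. Each new area gets the average damage of the most similar historical cases, and areas are then ranked by that estimate.

```python
import math

# Hypothetical historical records:
# (wind_speed_kmh, rainfall_mm, share_of_houses_damaged)
HISTORY = [
    (90.0, 120.0, 0.10),
    (150.0, 300.0, 0.45),
    (200.0, 400.0, 0.80),
    (110.0, 150.0, 0.15),
    (180.0, 350.0, 0.65),
]


def _scale(value, low, high):
    """Min-max scale a value so both features carry comparable weight."""
    return 0.0 if high == low else (value - low) / (high - low)


def predict_damage(wind, rain, history=HISTORY, k=2):
    """Average the damage of the k most similar historical cases."""
    winds = [h[0] for h in history]
    rains = [h[1] for h in history]
    w_lo, w_hi = min(winds), max(winds)
    r_lo, r_hi = min(rains), max(rains)

    def distance(h):
        return math.hypot(
            _scale(wind, w_lo, w_hi) - _scale(h[0], w_lo, w_hi),
            _scale(rain, r_lo, r_hi) - _scale(h[1], r_lo, r_hi),
        )

    nearest = sorted(history, key=distance)[:k]
    return sum(h[2] for h in nearest) / len(nearest)


def prioritize(areas, k=2):
    """areas maps a name to (wind, rain); returns names, worst first."""
    scored = {name: predict_damage(w, r, k=k) for name, (w, r) in areas.items()}
    return sorted(scored, key=scored.get, reverse=True)


if __name__ == "__main__":
    new_areas = {"A": (95.0, 125.0), "B": (190.0, 380.0), "C": (150.0, 300.0)}
    print(prioritize(new_areas))
```

A production model would use many more features, such as population and building type, and would be validated against held-out disasters before anyone relied on its ranking.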

The Geisinger Health System (GHS):

The Geisinger Health System (GHS) is a physician-led health care system in northeastern and central Pennsylvania. Geisinger has been a leader in re-engineering healthcare delivery and has successfully redesigned care for many chronic and acute diseases, conditions, and procedures. GHS focuses on healthcare management design based on proven clinical best practices, and its offering comprises two key service areas: clinical care management advisory services, and clinical content and intelligence.

Video: Geisinger Health Systems Gaining a Competitive Advantage as a Data Driven Business

Geisinger Health System implemented an IT system called the Unified Data Architecture (UDA), which allows big data to be integrated into its existing data analytics and management systems. The UDA’s big data capabilities help to track and analyze patient outcomes, to correlate patients’ genomic sequences with clinical care, and to visualize healthcare data across cohorts of patients and networks of providers. Geisinger’s UDA is the largest practical application of point-of-care big data in healthcare, with thousands of CPUs processing and delivering hundreds of terabytes of data every hour.

Geisinger Health System’s UDA provides a common data space for rapid integration of data from selected internal and external sources. The power to process troves of data from various sources, combined with the ability to integrate and store large volumes of data, makes the UDA uniquely positioned to fill the gap left by traditional healthcare data systems.

Having a large amount of data makes it possible to run analytics on free-text imaging reports, which in turn has helped detect many patients with dangerously large abdominal aortic aneurysms who had no follow-up scheduled for this incidental finding. For more, refer to the Harvard Business Review.
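A very simplified version of this kind of free-text screening can be written with a regular expression. The sketch below is not Geisinger's method. The pattern, the function names and the 5.0 cm threshold are assumptions for illustration only, and real clinical text mining has to deal with negation, abbreviations and varied units.

```python
import re

# Look for "abdominal aortic aneurysm" followed, within the same sentence,
# by a size in centimetres such as "5.8 cm".
AAA_PATTERN = re.compile(
    r"abdominal aortic aneurysm[^.]*?(\d+(?:\.\d+)?)\s*cm",
    re.IGNORECASE,
)


def find_aaa_size_cm(report_text):
    """Return the reported aneurysm size in cm, or None if none is found."""
    match = AAA_PATTERN.search(report_text)
    return float(match.group(1)) if match else None


def needs_follow_up(report_text, follow_up_scheduled, threshold_cm=5.0):
    """Flag reports with a large aneurysm and no follow-up on record.

    threshold_cm is a hypothetical cut-off, not clinical guidance.
    """
    size = find_aaa_size_cm(report_text)
    return size is not None and size >= threshold_cm and not follow_up_scheduled


if __name__ == "__main__":
    report = "Incidental finding: abdominal aortic aneurysm measuring 5.8 cm in diameter."
    print(find_aaa_size_cm(report))
    print(needs_follow_up(report, follow_up_scheduled=False))
```

Restricting the match to a single sentence keeps a measurement from a later, unrelated sentence from being read as the aneurysm size.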


The American Cancer Society – Doesn’t want to battle cancer but wants to eliminate cancer:

The barriers to early detection and effective treatment of cancers are often a lack of knowledge, or an inability to understand and piece together what we already know. Processing the huge volumes of medical data generated by clinical trials, genomic sequencing, and individual case studies promises to unlock secrets that could not be readily observed by human eyes.

Video: American Cancer Society: Using Data and Analytics to Finish the Fight Against Cancer

With a mission to eliminate cancer as a major health problem before the end of this century, the American Cancer Society is focused on a cure and is using data as its primary weapon in that fight. The American Cancer Society wants its data “Fast, Fresh & Better” to reduce latency and encourage a culture rich in analytics to help finish the global fight against cancer.

With Teradata, the American Cancer Society has transformed itself into an organization that is focused and data-driven in pursuit of its mission to cure cancer. As a volunteer nonprofit organization, ACS uses the power of big data analytics to engage better with its donors and volunteers.


Big Data Analytics Strategies for Smart Power Grid: More than just keeping the lights on

Video: Big Data Analytics Strategies for the Smart Power Grid

A smart grid is a complete automation system in which a large pool of sensors is embedded in the existing power grid to control and monitor it using modern information technologies. The data collected from these sensors is huge and has all the characteristics of big data.

Keeping the lights on, during periods of peak demand and throughout all kinds of weather, is a fundamentally data-driven responsibility for utilities. The State of California leverages advanced analytical platforms for situational intelligence and uses powerful visualization and modeling tools to project supply-demand dynamics across the grid. Real-time visibility is critical because that’s when and how people expect to get their electricity. The essential data is huge and diverse: weather forecasts, real-time sensor data, continuous metering. But that is what’s required to optimally balance supply and demand, plug into renewables when necessary and avoid blackouts and service dips.
