There's plenty of hoopla about Hadoop this week as three new solutions come to market. EMC, Hortonworks and Intel have all announced new Hadoop products.
An open-source framework for storing and processing large volumes of diverse data on a scalable cluster of servers, Hadoop has rapidly emerged as the preferred solution for Big Data analytics applications. That's because Hadoop is flexible, scalable, inexpensive and fault-tolerant, and it benefits from rapid adoption and a rich ecosystem backed by massive investment.
However, customers face high hurdles to broadly adopting Hadoop as their single data repository. Among the challenges is a lack of useful interfaces and high-level tooling for Business Intelligence and data mining -- components that are critical to data analytics and building a data-driven enterprise.
How are EMC, Intel and Hortonworks tackling those Big Data mountains with their Hadoop solutions? Each in their own way.
Three New Hadoop Solutions
EMC announced Pivotal HD, which features native integration of its Greenplum massively parallel processing (MPP) database with Apache Hadoop. The new EMC Greenplum-developed HAWQ technology brings 10 years of large-scale data management research and development to Hadoop and delivers performance improvements of more than 100 times compared with existing SQL-like services on top of Hadoop.
Intel's pitch is called Intel Distribution for Apache Hadoop software. The offering, which includes Intel Manager for Apache Hadoop software, is built from the silicon up to deliver industry-leading performance and improved security features. The Intel Distribution is the first to provide complete encryption with support for Intel AES New Instructions in the Intel Xeon processor.
Hortonworks recently released Hortonworks Data Platform for Windows. This is the first and only Hadoop-based platform available on both Windows and Linux and provides interoperability across Windows, Linux and Windows Azure.
EMC Turns Heads
Charles King, principal analyst at Pund-IT, told us EMC, Intel and Hortonworks are taking different approaches and targeting different audiences.
"EMC's Pivotal HD is the likely headline leader of the three, especially given the stunning 10 times to 100 times-plus performance improvements -- in concert with Greenplum's MPP database with Apache Hadoop -- it offers compared to other SQL-like services for Hadoop," King said. "But the company's new HAWQ promises to make a potentially greater impact on the commercial Big Data market."
King said that by creating a true SQL parallel database running on top of the Hadoop Distributed File System, EMC Greenplum is extending the considerable value of Hadoop to organizations that have invested in SQL training and personnel, which is to say virtually every commercial business. And that, he said, means that Big Data benefits could and should become far more accessible than ever before and help overcome the skills shortage often associated with Big Data.