Greetings! 

The HDF Group is pleased to announce that we are actively developing an HDF5 
Spark Connector and are seeking Beta Users for this software.  The HDF5 Spark 
Connector allows users of the Apache Spark open source processing engine to 
natively query data stored in HDF5 files.

This software is being developed in response to interest from members of the 
HDF5 user community. Many of them are interested in using Spark to obtain the 
same kind of speed, scalability, and reliability in data processing that they 
look for in I/O from HDF5.  To date, they have been hampered by Spark's 
inability to directly access HDF5 files.  As a workaround, they have had to 
first convert existing data from HDF5 to another storage format that Spark 
can read directly. We 
consider this software to be an exciting bridge between two very different but 
important and influential open source big data technologies:

. In use for more than 30 years, HDF5 (Hierarchical Data Format 5) addresses 
the problems of how to organize, store, discover, access, analyze, share, and 
preserve data in the face of enormous growth in size and complexity.  Since its 
release, HDF5 has become the de facto standard for the collection, storage, and 
provisioning of large, complex scientific datasets.  HDF5 and its predecessors 
have supported mission-critical computing needs for Big Data and NoSQL with 
open source software since 1989, long before anyone was using the terms Big 
Data, NoSQL, or open source!

. Apache Spark is a powerful open source processing engine built around speed, 
ease of use, and sophisticated analytics. Originally developed at UC Berkeley 
in 2009 and now considered an essential piece of the Hadoop ecosystem, Spark is 
the largest open source project in data processing.  Since its release, Apache 
Spark has seen rapid adoption by enterprises across a wide range of industries. 
Internet powerhouses such as Netflix, Yahoo, and eBay have deployed Spark at 
massive scale, collectively processing multiple petabytes of data on clusters 
of over 8,000 nodes.

The HDF Group is eager to speak with HDF5 users who are interested in joining 
the Beta Test program for the HDF5 Spark Connector.  As a Beta Tester, you will 
have an opportunity to begin using this software and, over the next few months, 
provide crucial feedback to The HDF Group that will help guide the 
functionality and roadmap for this product.

For more information on the HDF5 Spark Connector: 
https://www.hdfgroup.org/downloads/spark-connector/ 

If you're interested in becoming a beta tester, would like to be kept 
up-to-date on this product, or have other questions or concerns, you can use 
this form to communicate with The HDF Group: 
https://www.hdfgroup.org/downloads/spark-connector/signup-hdf5-spark-connector-beta-tester/

