Hi Gil,

Our company currently uses S3 heavily for data storage. Could you explain further what benefits this connector would bring for S3 once the pending patch is released?

Also, I have heard of Swift from others. Could you explain the pros and cons of Swift compared to HDFS? A brief summary would be fine, or you could point me to material that will help me get a better understanding.
Thanks,
Ben

> On Mar 22, 2016, at 6:35 AM, Gil Vernik <g...@il.ibm.com> wrote:
>
> We recently released an object store connector for Spark:
> https://github.com/SparkTC/stocator
> Currently this connector contains a driver for Swift-based object stores
> (like SoftLayer or any other Swift cluster), but it can easily support
> additional object stores.
> There is a pending patch to support the Amazon S3 object store.
>
> The major highlight is that this connector doesn't create any temporary files,
> and so it achieves very fast response times when Spark persists data in the
> object store.
> The new connector supports speculative mode and covers various failure
> scenarios (like two Spark tasks writing into the same object, partially
> corrupted data due to runtime exceptions in the Spark master, etc.). It also
> covers https://issues.apache.org/jira/browse/SPARK-10063 and other known
> issues.
>
> The detailed algorithm for fault tolerance will be released very soon. For
> now, those who are interested can view the implementation in the code itself.
>
> https://github.com/SparkTC/stocator contains all the details on how to set up
> and use it with Spark.
>
> A series of tests showed that the new connector obtains a 70% improvement for
> write operations from Spark to Swift and about a 30% improvement for read
> operations from Swift into Spark (compared to the existing driver that
> Spark uses to integrate with objects stored in Swift).
>
> There is ongoing work to add more coverage and fix some known bugs and
> limitations.
>
> All the best
> Gil