Hi All,

I've submitted a proposal for GORA-386, Spark Backend Support.
As you may know, the Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key-value stores, document stores, and RDBMSs, and analyzing the data with extensive Apache Hadoop MapReduce support. Spark, on the other hand, is an Apache project advertised as "lightning fast cluster computing". It has a thriving open-source community and is the most active Apache project at the moment.

Apache Gora already has MapReduce support; however, there is no generic abstraction layer that would allow plugging in alternative execution engines. In my proposal, I aim to create such an abstraction layer and support Spark as a backend. My goals include transforming Gora's InputFormat into an RDD, a generic abstraction layer for backends, and data storage via a newly developed GoraInputmap. Since Gora will undergo an architectural change, I also plan to test its functionality under the new architecture.

If I can finish my proposal early, I have some further plans: I want to explore mapping Hadoop-style MapReduce jobs into their Spark equivalents. There are some interesting articles on this topic, e.g. [1].

Kind Regards,
Furkan KAMACI

[1] http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/
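P.S. As a rough sketch of what the generic abstraction layer mentioned above could look like, here is a minimal pluggable-engine registry in Java. All names here (ExecutionEngine, EngineRegistry, the stub engines, and their methods) are hypothetical illustrations, not Gora's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical contract every execution backend would implement.
interface ExecutionEngine {
    String name();
    // Returns a status string for illustration; a real engine would submit the job.
    String run(String jobDescription);
}

// Stub backend standing in for the existing MapReduce support.
class MapReduceEngine implements ExecutionEngine {
    public String name() { return "mapreduce"; }
    public String run(String job) { return "mapreduce:" + job; }
}

// Stub backend standing in for the proposed Spark support.
class SparkEngine implements ExecutionEngine {
    public String name() { return "spark"; }
    public String run(String job) { return "spark:" + job; }
}

// Registry that lets callers swap backends without changing job code.
public class EngineRegistry {
    private static final Map<String, ExecutionEngine> ENGINES = new HashMap<>();
    static {
        register(new MapReduceEngine());
        register(new SparkEngine());
    }

    static void register(ExecutionEngine e) { ENGINES.put(e.name(), e); }

    public static ExecutionEngine choose(String name) {
        ExecutionEngine e = ENGINES.get(name);
        if (e == null) throw new IllegalArgumentException("unknown engine: " + name);
        return e;
    }

    public static void main(String[] args) {
        // The same job description runs on whichever backend is selected.
        System.out.println(EngineRegistry.choose("spark").run("wordcount")); // prints "spark:wordcount"
    }
}
```

The point of the sketch is only the shape: job-submission code depends on the interface, so a Spark backend can be dropped in beside the existing MapReduce path.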