Hi Furkan,
In what context are we talking here:
GSoC or just general development?
I am very keen to work towards something we can release as Gora 1.0.
Thank you, Furkan.

On Saturday, March 21, 2015, Furkan KAMACI <furkankam...@gmail.com> wrote:

> As you know, there is an open issue for integrating Apache Spark with
> Apache Gora [1]. Apache Spark is a popular project and, in contrast to
> Hadoop's two-stage disk-based MapReduce paradigm, Spark's in-memory
> primitives provide performance up to 100 times faster for certain
> applications [2]. There are also alternatives to Apache Spark, e.g.
> Apache Tez [3].
>
> When implementing an integration for Spark, we should consider designing
> an abstraction for this kind of execution engine as part of the
> architecture; there is a related issue for it [4].
>
> There is another Apache project, Apache Crunch [5], which aims to provide
> a framework for writing, testing, and running MapReduce pipelines. Its
> goal is to make pipelines that are composed of many user-defined
> functions simple to write, easy to test, and efficient to run. It is a
> high-level tool for writing data pipelines, as opposed to developing
> against the MapReduce, Spark, or Tez APIs directly [6].
>
> I would like to learn how Apache Crunch fits with creating
> multi-execution-engine support for Gora [4]. What benefits would we get
> from integrating Apache Gora with Apache Crunch, and what gaps would
> remain, compared with developing a custom engine for our purpose?
>
> Kind Regards,
> Furkan KAMACI
>
> [1] https://issues.apache.org/jira/browse/GORA-386
> [2] Xin, Reynold; Rosen, Josh; Zaharia, Matei; Franklin, Michael; Shenker,
> Scott; Stoica, Ion (June 2013).
> [3] http://tez.apache.org/
> [4] https://issues.apache.org/jira/browse/GORA-418
> [5] https://crunch.apache.org/
> [6] https://crunch.apache.org/user-guide.html#motivation
>


-- 
*Lewis*
