Hello Spark community,

I am currently trying to implement a proof-of-concept RDD that would
integrate Apache Spark with Apache Ignite (incubating) [1]. My original
idea was to embed an Ignite node in Spark's worker process so that user
code has direct access to the in-memory data; this gives the best
performance and removes the need to explicitly load data into Spark.
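For context, here is a rough sketch (in Scala) of the kind of RDD I have
in mind. The EmbeddedIgnite helper below is just a placeholder for the
embedded node's local data access, not a real API; only the Spark RDD
parts (getPartitions/compute) are actual Spark interfaces:

import scala.reflect.ClassTag

import org.apache.spark.{Partition, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// Placeholder for the embedded Ignite node's local data access
// (the real integration would go through the Ignite cache API).
object EmbeddedIgnite {
  def localPartitionIds(cacheName: String): Array[Int] = Array(0)
  def readPartition[T](cacheName: String, partitionId: Int): Iterator[T] =
    Iterator.empty
}

private case class IgnitePartition(index: Int) extends Partition

class IgniteRDD[T: ClassTag](sc: SparkContext, cacheName: String)
  extends RDD[T](sc, Nil) {

  // One Spark partition per cache partition held by the co-located Ignite node.
  override protected def getPartitions: Array[Partition] =
    EmbeddedIgnite.localPartitionIds(cacheName)
      .map(id => IgnitePartition(id): Partition)

  // Read the partition straight from the embedded node, without copying
  // the data into Spark's own storage first.
  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    EmbeddedIgnite.readPartition[T](cacheName, split.index)
}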

However, after looking at the documentation and the following questions on
the user list [2], [3], I realized that this might be impossible to implement.

So could anybody in the community clarify the following questions, or point
me to the relevant documentation:

   - Does a worker spawn a new process for each application? Is there a way
   for a worker to reuse the same process across different Spark contexts?
   - Is there a way to embed a worker in a user process?
   - Is there a way to attach user logic to worker lifecycle events
   (initialization/destruction)?

Thanks,
Alexey

----

[1] http://ignite.incubator.apache.org/
[2] http://apache-spark-user-list.1001560.n3.nabble.com/Embedding-Spark-Masters-Zk-Workers-SparkContext-App-in-single-JVM-clustered-sorta-for-symmetric-depl-td17711.html
[3] http://apache-spark-user-list.1001560.n3.nabble.com/Sharing-memory-across-applications-td11845.html
