OK.

a. From your comments I understand that there is only one copy of the data, and it resides on the Ignite cluster. The data is not copied to the Spark nodes while the lineage graph of transformations and actions is executed. If my understanding is correct, what happens when a transformation is applied to an IgniteRDD? Does it create a new cache, or just a new RDD?

b. One of the advertised features of IgniteRDD is that it speeds up Spark SQL queries by up to 100 times, using the in-memory indexing capabilities that Ignite provides. Since the IgniteRDD is created from the IgniteContext, I assume we can only execute SQL queries that Ignite itself can execute (through the igniteRDD.sql() API), and not arbitrary Spark SQL queries. Is my understanding right? For example, can we define user-defined functions as we do with Spark SQL?
Thanks.