Hi there, I have a large number of objects that I have to partition into chunks with the help of a binary tree: after every object has been run through the tree, the leaves of the tree contain the chunks. Next I have to process each of those chunks in the same way with a function f(chunk). So I thought that if I could turn the list of chunks into an RDD listOfChunks, I could use Spark by calling listOfChunks.map(f) and do the processing in parallel.
How would you recommend I create the RDD? Is it possible to start with an RDD that is a list of empty chunks and then add my objects one by one to the corresponding chunks? Or would you recommend something else? Thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/creation-of-RDD-from-a-Tree-tp23310.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
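One possible answer, sketched here as an illustration rather than a definitive recipe: RDDs are immutable, so you cannot start with empty chunks and append to them. Instead you can tag each object with the id of the leaf it falls into, parallelize the objects, and let Spark group them by that key. The `Node` class, the `leaf_of` walker, the sample tree, and the names `sc` and `f` below are all hypothetical stand-ins; the grouping step is shown both locally (runnable as-is) and as the equivalent PySpark calls (commented out, assuming an existing SparkContext).

```python
from collections import defaultdict

# Hypothetical binary tree: internal nodes split on a predicate,
# leaves carry an id that identifies a chunk.
class Node:
    def __init__(self, predicate=None, left=None, right=None, leaf_id=None):
        self.predicate = predicate
        self.left = left
        self.right = right
        self.leaf_id = leaf_id

def leaf_of(tree, obj):
    """Walk the tree until a leaf is reached; return its chunk id."""
    node = tree
    while node.leaf_id is None:
        node = node.left if node.predicate(obj) else node.right
    return node.leaf_id

# Toy example tree: split numbers at 10, split the left branch again at 5.
tree = Node(predicate=lambda x: x < 10,
            left=Node(predicate=lambda x: x < 5,
                      left=Node(leaf_id=0), right=Node(leaf_id=1)),
            right=Node(leaf_id=2))

objects = [1, 7, 12, 3, 42]

# Local stand-in for what Spark's groupByKey would compute:
chunks = defaultdict(list)
for obj in objects:
    chunks[leaf_of(tree, obj)].append(obj)
print(dict(chunks))  # → {0: [1, 3], 1: [7], 2: [12, 42]}

# The same idea in PySpark (assuming a SparkContext `sc` and your function `f`;
# the tree must be small enough to ship to the workers in the closure):
# rdd = sc.parallelize(objects)
# listOfChunks = (rdd.map(lambda o: (leaf_of(tree, o), o))
#                    .groupByKey()
#                    .map(lambda kv: list(kv[1])))
# results = listOfChunks.map(f).collect()
```

Keying by leaf id keeps the tree walk embarrassingly parallel (each object is routed independently), and `groupByKey` then materializes one chunk per leaf, ready for `listOfChunks.map(f)`.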