Determination of number of RDDs

2014-12-04 Thread Deep Pradhan
Hi,
I have a graph and I want to create as many RDDs as there are nodes in
the graph. How can I do that?
For example, if I have 10 nodes, I want to create 10 RDDs. Is that possible in
GraphX?
In C we have arrays of pointers. Is there an equivalent array of RDDs in
Spark?
Can we create such an array and then parallelize it?

Thank You


Re: Determination of number of RDDs

2014-12-04 Thread Ankur Dave
At 2014-12-04 02:08:45 -0800, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
> I have a graph and I want to create RDDs equal in number to the nodes in
> the graph. How can I do that?
> If I have 10 nodes then I want to create 10 RDDs. Is that possible in
> GraphX?

This is possible: you can collect the elements to the driver, then create an
RDD for each element.

If you have so many elements that collecting them to the driver is infeasible,
there's probably an alternative solution that doesn't involve creating one RDD
per element.
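
A minimal sketch of the collect-then-create approach described above, assuming a small graph and a local SparkContext. The object name, the toy edge list, and the choice to wrap each vertex in its own single-element RDD are illustrative assumptions, not something taken from the thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical example object; the name and the toy data are assumptions.
object OneRddPerVertex {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("OneRddPerVertex").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Toy graph with three vertices (1, 2, 3); every vertex attribute defaults to 0.
    val edges: RDD[Edge[Int]] =
      sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1)))
    val graph: Graph[Int, Int] = Graph.fromEdges(edges, defaultValue = 0)

    // Step 1: bring the vertices to the driver. Only viable for small graphs.
    val vertices: Array[(VertexId, Int)] = graph.vertices.collect()

    // Step 2: on the driver, create one RDD per collected vertex.
    val perVertex: Array[RDD[(VertexId, Int)]] =
      vertices.map(v => sc.parallelize(Seq(v)))

    println(s"Created ${perVertex.length} RDDs for ${vertices.length} vertices")

    sc.stop()
  }
}
```

Every RDD created this way adds its own scheduling overhead, which is one reason this pattern only makes sense when the number of vertices is small.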

Ankur




RE: Determination of number of RDDs

2014-12-04 Thread Kapil Malik
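Regarding: "Can we create such an array and then parallelize it?"

Parallelizing an array of RDDs, i.e. building an RDD[RDD[x]], is not possible.
Spark does not support nested RDDs: an RDD can only be used from the driver,
not from inside tasks running on the executors.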
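
For contrast, a plain array of RDDs that lives on the driver is fine; it is only the nesting of RDDs inside another RDD that is unsupported. The sketch below is one way this could look, assuming a local SparkContext; the object name and the sample values are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical example object; the name and the data are assumptions.
object ArrayOfRdds {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ArrayOfRdds").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // An ordinary Scala array on the driver; each slot holds a handle to an RDD.
    val rdds: Array[RDD[Int]] = Array.tabulate(10)(i => sc.parallelize(Seq(i)))

    // Looping over the array on the driver works: each count() is a separate job.
    val counts: Array[Long] = rdds.map(_.count())
    println(s"Element counts per RDD: ${counts.mkString(", ")}")

    // The handles can also be merged back into a single RDD.
    val merged: RDD[Int] = sc.union(rdds.toSeq)
    println(s"Merged RDD has ${merged.count()} elements")

    // What is NOT supported is distributing the handles themselves, e.g.
    // sc.parallelize(rdds.toSeq) followed by calling RDD operations inside its tasks.

    sc.stop()
  }
}
```

The array here plays the role of the C array of pointers from the question: it is driver-side bookkeeping, and the parallelism comes from each RDD individually rather than from the array.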
