At 2014-11-11 01:51:43 +0000, "Buttler, David" <buttl...@llnl.gov> wrote:
> I am building a graph from a large CSV file.  Each record contains a couple 
> of nodes and about 10 edges.  When I try to load a large portion of the 
> graph, using multiple partitions, I get inconsistent results in the number of 
> edges between different runs.  However, if I use a single partition, or a 
> small portion of the CSV file (say 1000 rows), then I get a consistent number 
> of edges.  Is there anything I should be aware of as to why this could be 
> happening in GraphX?

Is it possible there's some nondeterminism in the way you're reading the file? 
It would be helpful if you could post the code you're using to load the graph.
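
As an illustration of the kind of nondeterminism meant here, the following is a minimal sketch in plain Python, not GraphX code and not the poster's loader. It assumes a hypothetical two-column "src,dst" CSV layout and a hypothetical helper, stable_vertex_id, that derives each vertex ID from the node name alone. If IDs instead came from random numbers, a per-process salted hash, or a per-partition counter, two runs (or two partitionings) could map the same name to different IDs, and edge counts could drift after deduplication.

```python
import hashlib

def stable_vertex_id(name: str) -> int:
    """Map a node name to a signed 64-bit ID that is the same on every run.

    Hypothetical helper: uses the first 8 bytes of a SHA-256 digest, so the
    result depends only on the name, not on partitioning or process state.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)

def parse_edges(rows):
    """Turn assumed 'src,dst' CSV rows into (src_id, dst_id) pairs."""
    edges = []
    for row in rows:
        src, dst = row.strip().split(",")
        edges.append((stable_vertex_id(src), stable_vertex_id(dst)))
    return edges

def load_in_partitions(rows, num_partitions):
    """Parse each slice of rows independently, as separate partitions would."""
    edges = []
    for i in range(num_partitions):
        edges.extend(parse_edges(rows[i::num_partitions]))
    return edges

# Made-up sample rows, for illustration only.
rows = ["a,b", "b,c", "a,c", "c,d", "d,a"]
single = load_in_partitions(rows, 1)
multi = load_in_partitions(rows, 3)

# The edge set should not depend on how the rows were split.
print(len(single), len(multi))
print(sorted(single) == sorted(multi))
```

If a check like this passes on the real parsing logic but the counts still differ between runs, the nondeterminism is more likely in how the file is read or split than in how IDs are assigned.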

Ankur

