Re: GraphX: .edges.distinct().count() is 10?

2014-04-23 Thread Daniel Darabos
This is caused by https://issues.apache.org/jira/browse/SPARK-1188. I think the fix will be in the next release. But until then, do:
    g.edges.map(_.copy()).distinct.count
On Wed, Apr 23, 2014 at 2:26 AM, Ryan Compton wrote:
> Try this: https://www.dropbox.com/s/xf34l0ta496bdsn/.txt
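
To illustrate why copying each edge before calling distinct can change the count, here is a minimal sketch in plain Scala with no Spark dependency. It assumes the linked issue concerns an iterator that hands back the same mutable object on every step, which is my reading of the issue and is not stated in this thread. MutableEdge, ReuseDemo and reusingIterator are hypothetical names made up for the illustration; they are not part of GraphX.

```scala
// Hypothetical stand-in for an edge; NOT the real org.apache.spark.graphx.Edge.
case class MutableEdge(var srcId: Long, var dstId: Long)

object ReuseDemo {
  // Assumption for illustration: the iterator returns one shared object
  // and overwrites its fields on every step instead of allocating a new one.
  def reusingIterator(pairs: Seq[(Long, Long)]): Iterator[MutableEdge] = {
    val shared = MutableEdge(0L, 0L)
    pairs.iterator.map { case (s, d) =>
      shared.srcId = s
      shared.dstId = d
      shared
    }
  }

  val pairs: Seq[(Long, Long)] = Seq((1L, 2L), (2L, 3L), (3L, 4L))

  // Buffering the iterator keeps three references to the same object,
  // so every buffered element compares equal and distinct keeps one.
  val buggy: Int = reusingIterator(pairs).toList.distinct.size

  // Copying each element as it is produced gives three independent objects.
  val fixed: Int = reusingIterator(pairs).map(_.copy()).toList.distinct.size

  def main(args: Array[String]): Unit = {
    println(buggy) // 1
    println(fixed) // 3
  }
}
```

The copy has to happen before anything buffers the elements, which is why the workaround above places map(_.copy()) ahead of distinct.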

Re: GraphX: .edges.distinct().count() is 10?

2014-04-22 Thread Ryan Compton
Try this: https://www.dropbox.com/s/xf34l0ta496bdsn/.txt
This code:
    println(g.numEdges)
    println(g.numVertices)
    println(g.edges.distinct().count())
gave me
    1
    9294
    2
On Tue, Apr 22, 2014 at 5:14 PM, Ankur Dave wrote:
> I wasn't able to reproduce this with a small test file

Re: GraphX: .edges.distinct().count() is 10?

2014-04-22 Thread Ankur Dave
I wasn't able to reproduce this with a small test file, but I did change the file parsing to use x(1).toLong instead of x(2).toLong. Did you mean to take the third column rather than the second? If so, would you mind posting a larger sample of the file, or even the whole file if possible? Here's