This is caused by https://issues.apache.org/jira/browse/SPARK-1188. I think
the fix will be in the next release. But until then, do:
g.edges.map(_.copy()).distinct.count
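For anyone wondering why the copy helps: as I understand SPARK-1188, the edge iterator hands back the same mutable object on every call to next(), so anything that holds on to elements (as distinct appears to do while combining keys) ends up with many references to one object. Here is a minimal plain-Scala sketch of that effect, with no Spark dependency. SimpleEdge and ReusingIterator are hypothetical stand-ins made up for illustration; they are not GraphX classes.

```scala
// Hypothetical stand-in for an edge; NOT the real GraphX Edge class.
case class SimpleEdge(var srcId: Long, var dstId: Long)

// Hypothetical iterator that reuses one mutable object for every element,
// which is the behavior assumed here to be behind SPARK-1188.
class ReusingIterator(pairs: Array[(Long, Long)]) extends Iterator[SimpleEdge] {
  private val edge = SimpleEdge(0L, 0L)
  private var pos = 0

  def hasNext: Boolean = pos < pairs.length

  def next(): SimpleEdge = {
    // Overwrite the shared object in place instead of allocating a new one.
    edge.srcId = pairs(pos)._1
    edge.dstId = pairs(pos)._2
    pos += 1
    edge
  }
}

object ReuseDemo {
  def main(args: Array[String]): Unit = {
    val pairs = Array((1L, 2L), (3L, 4L), (5L, 6L))

    // Buffering without copying: every slot points at the same object,
    // so all elements compare equal once the iterator is exhausted.
    val shared = new ReusingIterator(pairs).toArray
    println(shared.distinct.length)

    // Copying each element as it is produced keeps the values separate.
    val copied = new ReusingIterator(pairs).map(_.copy()).toArray
    println(copied.distinct.length)
  }
}
```

In this sketch the first count collapses to a single element while the second keeps all three, which is the same reason map(_.copy()) before distinct should give the expected edge count.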
On Wed, Apr 23, 2014 at 2:26 AM, Ryan Compton wrote:
Try this: https://www.dropbox.com/s/xf34l0ta496bdsn/.txt
This code:
println(g.numEdges)
println(g.numVertices)
println(g.edges.distinct().count())
gave me
1
9294
2
On Tue, Apr 22, 2014 at 5:14 PM, Ankur Dave wrote:
I wasn't able to reproduce this with a small test file, but I did change
the file parsing to use x(1).toLong instead of x(2).toLong. Did you mean to
take the third column rather than the second?
If so, would you mind posting a larger sample of the file, or even the
whole file if possible?
Here's