Sent: Saturday, 11 July 2015 03:58
To: Ted Yu; Robin East; user
Subject: Re: Spark GraphX memory requirements + java.lang.OutOfMemoryError: GC
overhead limit exceeded
Hello again.
So I could compute triangle numbers when running the code from the Spark shell
without workers (with the --driver-memory 15g option), but with workers I get
errors. So I run the Spark shell:
./bin/spark-shell --master spark://192.168.0.31:7077 --executor-memory
6900m --driver-memory 15g
and workers (
Yep, I already found it. So I added one line:
val graph = GraphLoader.edgeListFile(sc, "", ...)
val newgraph = graph.convertToCanonicalEdges()
and could successfully count triangles on "newgraph". Next I will test it on
bigger (several GB) networks.
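For intuition, the effect of convertToCanonicalEdges() can be sketched in plain Scala without Spark: reorient every edge so the smaller vertex ID comes first, then merge the duplicates that reorienting creates. The object and method names below are hypothetical; this only illustrates the semantics, not GraphX's actual implementation.

```scala
// Hedged sketch (no Spark): emulate the effect of convertToCanonicalEdges()
// on a plain edge list. Vertex IDs are Longs, as in GraphX.
object CanonicalEdges {
  // Reorient each edge so that srcId < dstId, then merge duplicate edges.
  def canonicalize(edges: Seq[(Long, Long)]): Seq[(Long, Long)] =
    edges.map { case (s, d) => if (s < d) (s, d) else (d, s) }.distinct

  def main(args: Array[String]): Unit = {
    // 28 -> 3 violates srcId < dstId and, once reoriented, duplicates 3 -> 28.
    val raw = Seq((3L, 28L), (28L, 3L), (1L, 2L))
    println(canonicalize(raw)) // List((3,28), (1,2))
  }
}
```

After this transformation every edge satisfies the srcId < dstId invariant that triangleCount() asserts on.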
I am using Spark 1.3 and 1.4 but haven't seen
See SPARK-4917, which went into Spark 1.3.0.
On Fri, Jun 26, 2015 at 2:27 AM, Robin East wrote:
You’ll get this issue if you just take the first 2000 lines of that file. The
problem is that triangleCount() expects srcId < dstId, which is not the case in
the file (e.g. vertex 28). You can get round this by calling
graph.convertToCanonicalEdges(), which removes bi-directional edges and ensures
srcId < dstId.
OK, but what does it mean? I did not change the core files of Spark, so is
it a bug there?
PS: on small datasets (<500 MB) I have no problem.
On 25.06.2015 at 18:02, "Ted Yu" wrote:
The assertion failure from TriangleCount.scala corresponds with the
following lines:
g.outerJoinVertices(counters) {
  (vid, _, optCounter: Option[Int]) =>
    val dblCount = optCounter.getOrElse(0)
    // double count should be even (divisible by two)
    assert((dblCount & 1) == 0)
    dblCount / 2
}
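The reason the counter must be even: every triangle through a vertex v is discovered once via each of the two triangle edges incident to v, so the raw per-vertex count is exactly twice the number of triangles at v, and the algorithm halves it. A hypothetical brute-force check in plain Scala (not Spark's implementation) makes this visible on a tiny graph:

```scala
// Hedged sketch: brute-force per-vertex "double counts" on a small undirected
// graph, illustrating why the counter in TriangleCount.scala is always even.
object TriangleCheck {
  def doubleCounts(edges: Set[(Long, Long)]): Map[Long, Int] = {
    // Build the neighbour sets of each vertex.
    val nbrs: Map[Long, Set[Long]] =
      edges.toSeq.flatMap { case (a, b) => Seq(a -> b, b -> a) }
        .groupBy(_._1).map { case (v, ps) => v -> ps.map(_._2).toSet }
    nbrs.map { case (v, vs) =>
      // For each edge (v, w), count common neighbours of v and w:
      // each triangle at v is hit once per incident triangle edge, i.e. twice.
      v -> vs.toSeq.map(w => (vs & nbrs(w)).size).sum
    }
  }

  def main(args: Array[String]): Unit = {
    // One triangle (1,2,3) plus a pendant edge (3,4).
    val g = Set((1L, 2L), (2L, 3L), (1L, 3L), (3L, 4L))
    val dbl = doubleCounts(g)
    println(dbl)                                   // every value is even
    println(dbl.map { case (v, c) => v -> c / 2 }) // triangles per vertex
  }
}
```

On this graph vertices 1, 2, and 3 each get a double count of 2 (one triangle each), and vertex 4 gets 0, so the assert in the Spark code holds.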
Hello!
I am trying to compute the number of triangles with GraphX, but I get a
memory or heap-size error, even though the dataset is very small (1 GB). I run
the code in spark-shell on a 16 GB RAM machine (I also tried with 2 workers on
separate machines with 8 GB RAM each). So I have 15x more memory than the
dataset.