Re: assertion failed error with GraphX

2015-07-22 Thread Roman Sokolov
I am also having problems with triangle count; the algorithm seems to be very memory-intensive (I could not process even small graphs of ~5 million vertices and 70 million edges with less than 32 GB of RAM on EACH machine). What if I have graphs with a billion edges? How much RAM would I need then?

RE: Spark performance

2015-07-11 Thread Roman Sokolov
Hello. I had the same question. What if I need to store 4-6 TB and run queries? I can't find any clue in the documentation.
On 11.07.2015 03:28, "Mohammed Guller" wrote:
> Hi Ravi,
>
> First, neither Spark nor Spark SQL is a database. Both are compute
> engines, which need to be paired with a storage sy

Re: Spark GraphX memory requirements + java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-07-10 Thread Roman Sokolov
146) ... 10 more
On 26 June 2015 at 14:06, Roman Sokolov wrote:
> Yep, I already found it. So I added 1 line:
>
> val graph = GraphLoader.edgeListFile(sc, "", ...)
> val newgraph = graph.convertToCanonicalEdges()
>
> and could successfully count triangles on "
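The fix quoted in this snippet (calling `convertToCanonicalEdges()` before counting triangles) can be sketched on a tiny in-memory graph. This is only an illustration under stated assumptions: the object name `TriangleCountSketch`, the helper `totalTriangles`, and the sample edge list are hypothetical and not from the thread, and the `partitionBy` step reflects what the Spark 1.x GraphX documentation asks for before `triangleCount()`; later releases may not need it.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}

// Hypothetical helper object, for illustration only.
object TriangleCountSketch {

  // Counts distinct triangles in a graph given as (srcId, dstId) pairs.
  def totalTriangles(sc: SparkContext, pairs: Seq[(Long, Long)]): Long = {
    val edges = sc.parallelize(pairs.map { case (s, d) => Edge(s, d, 1) })
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // Merge bi-directional and duplicate edges so that srcId < dstId,
    // then partition the graph before running the triangle count.
    val canonical = graph
      .convertToCanonicalEdges()
      .partitionBy(PartitionStrategy.RandomVertexCut)

    // triangleCount() stores, per vertex, the number of triangles it is part of.
    // Each triangle is seen from its three corners, hence the division by 3.
    val perVertex = canonical.triangleCount().vertices
    perVertex.map { case (_, count) => count.toLong }.reduce(_ + _) / 3
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("triangle-count-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Sample data (made up): one triangle stored with edges in both directions.
      val pairs = Seq((1L, 2L), (2L, 1L), (2L, 3L), (3L, 2L), (3L, 1L), (1L, 3L))
      println(totalTriangles(sc, pairs))
    } finally {
      sc.stop()
    }
  }
}
```

Canonicalizing first also removes the duplicate reverse edges, so the edge set that `triangleCount()` has to work on is smaller than the raw input.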

Re: Spark GraphX memory requirements + java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-06-26 Thread Roman Sokolov
bi-directional edges and
>> ensures srcId < dstId. Which version of Spark are you on? Can’t remember
>> what version that method was introduced in.
>>
>> Robin
>>
>> On 26 Jun 2015, at 09:44, Roman Sokolov wrote:
>>
>> Ok, but what does it mean

Re: Spark GraphX memory requirements + java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-06-26 Thread Roman Sokolov
> following lines:
>
> g.outerJoinVertices(counters) {
>   (vid, _, optCounter: Option[Int]) =>
>     val dblCount = optCounter.getOrElse(0)
>     // double count should be even (divisible by two)
>     assert((dblCount & 1) == 0)
>
> Cheers
>
> On Thu, Jun 25, 20

Spark GraphX memory requirements + java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-06-25 Thread Roman Sokolov
ecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
--
Best regards, Roman Sokolov