Hi all, I am seeing some interesting behavior from GraphX while running SSSP. I am using standalone mode with 16+1 machines, each with 30 GB of memory and 4 cores. The dataset is 63 GB. However, the input for some stages is huge, about 16 TB!
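
As a rough sanity check on these numbers, here is a small Python sketch of the arithmetic. It assumes that "16+1" means 16 workers plus 1 master, and that the sizes use binary units (1 TB = 1024 GB); the variable names are only for illustration.

```python
# Figures quoted in this post. The 16 workers + 1 master split is an assumption.
WORKERS = 16
MEM_PER_MACHINE_GB = 30
CORES_PER_MACHINE = 4
DATASET_GB = 63
STAGE_INPUT_TB = 16
GB_PER_TB = 1024  # assumption: binary units

# Aggregate resources across the worker machines.
total_mem_gb = WORKERS * MEM_PER_MACHINE_GB
total_cores = WORKERS * CORES_PER_MACHINE

# How much larger the reported stage input is than the dataset on disk.
stage_input_gb = STAGE_INPUT_TB * GB_PER_TB
blowup = stage_input_gb / DATASET_GB

# Fraction of the combined worker memory the raw dataset would occupy.
dataset_share = DATASET_GB / total_mem_gb

print(f"total worker memory: {total_mem_gb} GB, total cores: {total_cores}")
print(f"stage input / dataset size: {blowup:.0f}x")
print(f"dataset as share of worker memory: {dataset_share:.1%}")
```

Under those assumptions, the reported stage input is roughly 260 times the dataset size, while the raw dataset is only about 13% of the workers' combined 480 GB of memory.
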
The computation was taking a very long time, so I stopped it. For your information, I am using the same SSSP code given in the GraphX documentation: http://spark.apache.org/docs/latest/graphx-programming-guide.html#pregel-api I use StorageLevel.MEMORY_ONLY since I have plenty of memory. I would appreciate any comments or help on this issue.

--
Thanks,
-Khaled

[image: Inline image 1]