Yesterday's run died sometime during the night, without any errors. Today, I am running it using GraphFrames instead. It is still spawning new tasks, so there is progress.
From: Felix Cheung [mailto:felixcheun...@hotmail.com]
Sent: Thursday, November 10, 2016 7:50 PM
To: user@spark.apache.org; Shreya Agarwal <shrey...@microsoft.com>
Subject: Re: Strongly Connected Components

It is possible it is dead. Could you check the Spark UI to see if there is any progress?

_____________________________
From: Shreya Agarwal <shrey...@microsoft.com>
Sent: Thursday, November 10, 2016 12:45 AM
Subject: RE: Strongly Connected Components
To: <user@spark.apache.org>

Bump. Anyone? It's been running for 10 hours now. No results.

From: Shreya Agarwal
Sent: Tuesday, November 8, 2016 9:05 PM
To: user@spark.apache.org
Subject: Strongly Connected Components

Hi,

I am running this on a graph with >5B edges and >3B edges and have 2 questions -

1. What is the optimal number of iterations?
2. I am running it for 1 iteration right now on a beefy 100 node cluster, with 300 executors each having 30GB RAM and 5 cores. I have persisted the graph to MEMORY_AND_DISK. And it has been running for 3 hours already. Any ideas on how to speed this up?

Regards,
Shreya
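
For reference, here is a minimal single-machine sketch of what a strongly connected components computation returns. It is not the GraphX or GraphFrames implementation and will not scale to billions of edges. It uses Kosaraju's two-pass algorithm in plain Python, and the function name strongly_connected_components is hypothetical. Something like this may be useful for sanity-checking results on a small sample of the edge list before committing cluster time.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Map each vertex to the smallest vertex id in its strongly connected component.

    Illustrative only: a single-machine Kosaraju sketch, not the Spark algorithm.
    """
    graph = defaultdict(list)    # forward adjacency lists
    reverse = defaultdict(list)  # reversed adjacency lists
    vertices = set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        vertices.update((src, dst))

    # Pass 1: iterative DFS on the forward graph, recording finish order.
    visited = set()
    order = []
    for start in sorted(vertices):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            advanced = False
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    advanced = True
                    break
            if not advanced:
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph, in decreasing finish order.
    # Each tree found here is exactly one strongly connected component.
    component = {}
    for root in reversed(order):
        if root in component:
            continue
        members = [root]
        component[root] = root
        stack = [root]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in component:
                    component[prev] = root
                    members.append(prev)
                    stack.append(prev)
        label = min(members)  # relabel with the smallest member id
        for member in members:
            component[member] = label
    return component


if __name__ == "__main__":
    # Tiny made-up edge list: a 3-cycle, a 2-cycle, and a self-loop.
    sample_edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 4), (6, 6)]
    print(strongly_connected_components(sample_edges))
```

Comparing the component labels from a sketch like this against the cluster output for the same small sample is one way to tell a slow job from an incorrect one.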