If the RDD is cached, you can check its storage information in the Storage tab of the Web UI.
On Wed, May 21, 2014 at 12:31 PM, yxzhao <yxz...@ualr.edu> wrote:
> Thanks Xiangrui. How do I check and make sure the data is distributed
> evenly? Thanks again.
>
> On Wed, May 21, 2014 at 2:17 PM, Xiangrui Meng [via Apache Spark User
> List] <[hidden email]> wrote:
>
>> Many OutOfMemoryErrors in the log. Is your data distributed evenly?
>> -Xiangrui
>>
>> On Wed, May 21, 2014 at 11:23 AM, yxzhao <[hidden email]> wrote:
>>
>>> I ran the PageRank example on a large data set, 5GB in size, using 48
>>> machines. The job got stuck at 14/05/20 21:32:17, as the attached log
>>> shows. It sat there for more than 10 hours, and then I finally killed
>>> it. But I did not find any information explaining why it was stuck.
>>> Any suggestions? Thanks.
>>>
>>> Spark_OK_48_pagerank.log
>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n6185/Spark_OK_48_pagerank.log>
>>>
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
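
One way to answer the "is the data distributed evenly?" question above is to compare per-partition record counts. In PySpark those counts could be gathered with something like `rdd.glom().map(len).collect()` (`glom()` turns each partition into a single list, so `len` gives its size). Below is a minimal pure-Python sketch of the comparison step itself; `skew_ratio` is a hypothetical helper, not a Spark API, and it assumes the counts have already been collected to the driver:

```python
# Sketch: flag data skew given per-partition record counts.
# In PySpark, the counts could come from:
#   counts = rdd.glom().map(len).collect()
# (one integer per partition).

def skew_ratio(partition_counts):
    """Return max/mean partition size; values far above 1.0 suggest skew."""
    if not partition_counts:
        raise ValueError("no partitions")
    mean = sum(partition_counts) / len(partition_counts)
    if mean == 0:
        return 0.0
    return max(partition_counts) / mean

# Example: one hot partition among otherwise even ones.
even = [1000, 990, 1010, 1005]
skewed = [1000, 1000, 1000, 9000]
print(skew_ratio(even))    # close to 1.0
print(skew_ratio(skewed))  # 3.0 -> one partition holds 3x the average
```

A ratio well above 1.0 means one partition (and hence one executor) is doing far more work than the rest, which is a common cause of the OutOfMemoryErrors mentioned earlier; a `repartition()` of the input is one way to even things out.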