
Re: What are the likely causes of org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle?

2015-06-26 Thread XianXing Zhang

Do we have any update on this thread? Has anyone met and solved similar problems before? Any pointers will be greatly appreciated!
Best, XianXing
On Mon, Jun 15, 2015 at 11:48 PM, Jia Yu jia...@asu.edu wrote:
Hi Peng, I got exactly the same error! My shuffle data is also very large. Have you

Re: What are the likely causes of org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle?

2015-06-26 Thread Eugen Cepoi

Are you using YARN? If so, increase the YARN memory overhead option. YARN is probably killing your executors.
On 26 June 2015 at 20:43, XianXing Zhang xianxing.zh...@gmail.com wrote:
Do we have any update on this thread? Has anyone met and solved similar problems before? Any pointers will be

Re: What are the likely causes of org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle?

2015-06-26 Thread XianXing Zhang

Yes, we deployed Spark on top of YARN. Your suggestion was very helpful: I increased the YARN memory overhead option and it helped in most cases. (Sometimes it still fails when the amount of data to be shuffled is large, but I guess if I continue increasing the YARN memory overhead
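The overhead setting discussed above can be sketched as follows. This is a minimal illustration, not the posters' actual configuration. It assumes the Spark 1.x property name `spark.yarn.executor.memoryOverhead` (in MB; later Spark releases use `spark.executor.memoryOverhead`) and assumes a default of the larger of 384 MB or 10% of executor memory, which varies between Spark versions. The helper names and the example sizes are hypothetical.

```python
# Sketch: estimate the default YARN memory overhead for a Spark executor
# and build spark-submit flags that raise it explicitly.
# Assumptions (not taken from the thread): the default is
# max(384 MB, 10% of executor memory), and the property is named
# spark.yarn.executor.memoryOverhead as in Spark 1.x on YARN.

MIN_OVERHEAD_MB = 384    # assumed floor for the default overhead
OVERHEAD_FACTOR = 0.10   # assumed default fraction of executor memory


def default_overhead_mb(executor_memory_mb):
    """Overhead Spark would request if the option is left unset (assumed formula)."""
    return max(MIN_OVERHEAD_MB, int(executor_memory_mb * OVERHEAD_FACTOR))


def container_request_mb(executor_memory_mb, overhead_mb):
    """Total memory requested from YARN per executor container."""
    return executor_memory_mb + overhead_mb


def submit_flags(executor_memory_mb, overhead_mb):
    """spark-submit arguments that set executor memory and an explicit overhead."""
    return [
        "--executor-memory", f"{executor_memory_mb}m",
        "--conf", f"spark.yarn.executor.memoryOverhead={overhead_mb}",
    ]


if __name__ == "__main__":
    executor_mb = 8192   # hypothetical executor heap size
    raised_mb = 2048     # hypothetical raised overhead for a shuffle-heavy job
    print("default overhead (MB):", default_overhead_mb(executor_mb))
    print("container request (MB):", container_request_mb(executor_mb, raised_mb))
    print("spark-submit", " ".join(submit_flags(executor_mb, raised_mb)))
```

YARN kills a container whose physical memory use exceeds the requested total, so raising the overhead enlarges the container without changing the executor heap. That is consistent with the improvement reported in this thread, although the right value depends on the job.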

Re: What are the likely causes of org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle?

2015-06-16 Thread Jia Yu

Hi Peng, I got exactly the same error! My shuffle data is also very large. Have you figured out a method to solve that?
Thanks, Jia
On Fri, Apr 24, 2015 at 7:59 AM, Peng Cheng pc...@uow.edu.au wrote:
I'm deploying a Spark data processing job on an EC2 cluster. The job is small for the cluster

What are the likely causes of org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle?

2015-04-24 Thread Peng Cheng

I'm deploying a Spark data processing job on an EC2 cluster. The job is small for the cluster (16 cores with 120G RAM in total): the largest RDD has only 76k+ rows. But it is heavily skewed in the middle (and thus requires repartitioning), and each row has around 100k of data after serialization. The job

