Hi Sven,

Do you have a small example program that you can share which will allow me
to reproduce this issue?  If you have a workload that runs into this, you
should be able to keep iteratively simplifying the job and reducing the
data set size until you hit a fairly minimal reproduction (assuming the
issue is deterministic, which it sounds like it is).
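The iterative-narrowing loop described above can even be automated. Below is a minimal sketch in plain Python (no Spark required); the `fails` predicate is a hypothetical stand-in for "run the simplified job on this subset of records and check whether the shuffle exception occurs":

```python
# A greedy delta-debugging-style reducer: repeatedly drop chunks of the
# input while the failure still reproduces, halving the chunk size until
# no further chunk can be removed.
def shrink(data, fails):
    assert fails(data), "the full input must reproduce the failure"
    chunk = len(data) // 2
    while chunk >= 1:
        i = 0
        while i < len(data):
            candidate = data[:i] + data[i + chunk:]
            if candidate and fails(candidate):
                data = candidate  # smaller input still fails; keep it
            else:
                i += chunk        # this chunk is needed; move on
        chunk //= 2
    return data

# Example: pretend the job only fails when record 7 is present.
minimal = shrink(list(range(100)), lambda d: 7 in d)
print(minimal)  # [7]
```

This greedy reduction isn't guaranteed to find the global minimum when several records interact, but for a deterministic failure it typically gets you to a small failing input quickly.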

On Tue, Dec 30, 2014 at 9:49 AM, Sven Krasser <kras...@gmail.com> wrote:

> Hey all,
>
> Since upgrading to 1.2.0, a PySpark job that worked fine in 1.1.1 fails
> during shuffle. I've tried reverting from the sort-based shuffle back to
> the hash one, and that fails as well. Does anyone see similar problems or
> have an idea of where to look next?
>
> For the sort-based shuffle I get a bunch of exceptions like this in the
> executor logs:
>
> 2014-12-30 03:13:04,061 ERROR [Executor task launch worker-2] 
> executor.Executor (Logging.scala:logError(96)) - Exception in task 4523.0 in 
> stage 1.0 (TID 4524)
> org.apache.spark.SparkException: PairwiseRDD: unexpected value: 
> List([B@130dc7ad)
>       at 
> org.apache.spark.api.python.PairwiseRDD$$anonfun$compute$2.apply(PythonRDD.scala:307)
>       at 
> org.apache.spark.api.python.PairwiseRDD$$anonfun$compute$2.apply(PythonRDD.scala:305)
>       at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>       at 
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:219)
>       at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>       at org.apache.spark.scheduler.Task.run(Task.scala:56)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
>
> For the hash-based shuffle, there are now a bunch of these exceptions in the 
> logs:
>
>
> 2014-12-30 04:14:01,688 ERROR [Executor task launch worker-0] 
> executor.Executor (Logging.scala:logError(96)) - Exception in task 4479.0 in 
> stage 1.0 (TID 4480)
> java.io.FileNotFoundException: 
> /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1419905501183_0004/spark-local-20141230035728-8fc0/23/merged_shuffle_1_68_0
>  (No such file or directory)
>       at java.io.FileOutputStream.open(Native Method)
>       at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>       at 
> org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
>       at 
> org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:192)
>       at 
> org.apache.spark.shuffle.hash.HashShuffleWriter$$anonfun$write$1.apply(HashShuffleWriter.scala:67)
>       at 
> org.apache.spark.shuffle.hash.HashShuffleWriter$$anonfun$write$1.apply(HashShuffleWriter.scala:65)
>       at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>       at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>       at 
> org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>       at org.apache.spark.scheduler.Task.run(Task.scala:56)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
>
> Thank you!
> -Sven
>
>
>
> --
> http://sites.google.com/site/krasser/?utm_source=sig
>
