[ https://issues.apache.org/jira/browse/SPARK-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071969#comment-16071969 ]

Lasantha Fernando commented on SPARK-21172:
-------------------------------------------

I have encountered the same issue in Spark 1.6.3 as well. I initially thought it was due to a configuration problem and have tried many configuration parameters, such as different compression codecs and changing the number of default shuffle partitions. However, I am still hitting this issue quite consistently, and it only happens when dealing with larger data sets (usually larger than 1 TB).

These are some of the configuration parameters I am using.

{noformat}
spark.driver.memory                                     16g
spark.executor.cores                                    6
spark.executor.memory                                   22g
spark.io.compression.codec                              snappy
spark.sql.shuffle.partitions                            2400
spark.shuffle.spill.numElementsForceSpillThreshold      50000000
spark.serializer                                        org.apache.spark.serializer.KryoSerializer
spark.kryoserializer.buffer.max                         512m
{noformat}
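
For reference, a minimal sketch of how the same settings could be applied programmatically instead of via spark-defaults.conf. Everything outside the .set(...) calls (the application name, the SparkContext creation) is illustrative and not part of the failing job:

{code}
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative only: mirrors the configuration entries listed above.
val conf = new SparkConf()
  .setAppName("large-shuffle-job") // hypothetical application name
  .set("spark.driver.memory", "16g") // normally passed at spark-submit time, before the driver JVM starts
  .set("spark.executor.cores", "6")
  .set("spark.executor.memory", "22g")
  .set("spark.io.compression.codec", "snappy")
  .set("spark.sql.shuffle.partitions", "2400")
  .set("spark.shuffle.spill.numElementsForceSpillThreshold", "50000000")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryoserializer.buffer.max", "512m")

val sc = new SparkContext(conf)
{code}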

The stack trace is given below. Any pointers as to why this issue might be 
occurring?

{noformat}
17/07/01 04:46:06 WARN scheduler.TaskSetManager: Lost task 1494.0 in stage 9.0 (TID 9444, s04b.la.org): java.io.EOFException: reached end of stream after reading 212805 bytes; 671088640 bytes expected
        at org.spark-project.guava.io.ByteStreams.readFully(ByteStreams.java:735)
        at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:119)
        at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:102)
        at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
        at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:168)
        at org.apache.spark.sql.execution.Sort$$anonfun$1.apply(Sort.scala:90)
        at org.apache.spark.sql.execution.Sort$$anonfun$1.apply(Sort.scala:64)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$21.apply(RDD.scala:728)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$21.apply(RDD.scala:728)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:88)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{noformat}
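
The failing call is the shaded Guava ByteStreams.readFully, which the row deserializer uses to read exactly the number of bytes announced by the preceding length field. Below is a minimal, self-contained sketch of that contract, a simplified re-implementation for illustration only (not Spark's actual code); the sizes and names are made up, but it reproduces the same kind of EOFException message when the stream ends before the expected byte count is reached:

{code}
object ReadFullyDemo {
  import java.io.{ByteArrayInputStream, EOFException, InputStream}

  // Simplified stand-in for the shaded Guava ByteStreams.readFully:
  // keep reading until the buffer is full, and fail if the stream ends first.
  def readFully(in: InputStream, buf: Array[Byte]): Unit = {
    var total = 0
    while (total < buf.length) {
      val n = in.read(buf, total, buf.length - total)
      if (n == -1) {
        throw new EOFException(
          s"reached end of stream after reading $total bytes; ${buf.length} bytes expected")
      }
      total += n
    }
  }

  def main(args: Array[String]): Unit = {
    // The serializer first reads a length header, then expects exactly that
    // many bytes of row data. A truncated or corrupted shuffle block ends early.
    val expected = 64                                           // hypothetical row size from the header
    val truncated = new ByteArrayInputStream(new Array[Byte](10)) // only 10 bytes actually present
    readFully(truncated, new Array[Byte](expected))             // throws java.io.EOFException
  }
}
{code}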


> EOFException reached end of stream in UnsafeRowSerializer
> ---------------------------------------------------------
>
>                 Key: SPARK-21172
>                 URL: https://issues.apache.org/jira/browse/SPARK-21172
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 2.0.1
>            Reporter: liupengcheng
>              Labels: shuffle
>
> Spark SQL job failed because of the following exception. Seems like a bug in the shuffle stage.
> The shuffle read size for a single task is tens of GB.
> {code}
> org.apache.spark.SparkException: Task failed while writing rows
>       at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:264)
>       at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>       at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>       at org.apache.spark.scheduler.Task.run(Task.scala:86)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.EOFException: reached end of stream after reading 9034374 bytes; 1684891936 bytes expected
>       at org.spark_project.guava.io.ByteStreams.readFully(ByteStreams.java:735)
>       at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:127)
>       at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:110)
>       at scala.collection.Iterator$$anon$12.next(Iterator.scala:444)
>       at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>       at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
>       at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
>       at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>       at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply$mcV$sp(WriterContainer.scala:255)
>       at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:253)
>       at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:253)
>       at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1345)
>       at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:259)
>       ... 8 more
> {code}



