[ 
https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-3277:
-------------------------------

    Description: 
I tested the LZ4 compression,and it come up with such problem.(with wordcount)
Also I tested the snappy and LZF,and they were OK.
At last I set the  "spark.shuffle.spill" as false to avoid such exeception, but 
once open this "switch", this error would come.
It seems that if num of the[ words is few, wordcount will go through,but if it 
is a complex text ,this problem will show
Exeception Info as follow:
{code}
java.lang.AssertionError: assertion failed
        at scala.Predef$.assert(Predef.scala:165)
        at 
org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.<init>(ExternalAppendOnlyMap.scala:416)
        at 
org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:235)
        at 
org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:150)
        at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
        at 
org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:55)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
{code}


  was:
I tested the LZ4 compression,and it come up with such problem.(with wordcount)
Also I tested the snappy and LZF,and they were OK.
At last I set the  "spark.shuffle.spill" as false to avoid such exeception, but 
once open this "switch", this error would come.
It seems that if num of the words is few, wordcount will go through,but if it 
is a complex text ,this problem will show
Exeception Info as follow:
java.lang.AssertionError: assertion failed
        at scala.Predef$.assert(Predef.scala:165)
        at 
org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.<init>(ExternalAppendOnlyMap.scala:416)
        at 
org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:235)
        at 
org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:150)
        at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
        at 
org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:55)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)



> LZ4 compression cause the the ExternalSort exception
> ----------------------------------------------------
>
>                 Key: SPARK-3277
>                 URL: https://issues.apache.org/jira/browse/SPARK-3277
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.2, 1.1.0, 1.2.0
>            Reporter: hzw
>            Priority: Blocker
>         Attachments: test_lz4_bug.patch
>
>
> I tested the LZ4 compression,and it come up with such problem.(with wordcount)
> Also I tested the snappy and LZF,and they were OK.
> At last I set the  "spark.shuffle.spill" as false to avoid such exeception, 
> but once open this "switch", this error would come.
> It seems that if num of the[ words is few, wordcount will go through,but if 
> it is a complex text ,this problem will show
> Exeception Info as follow:
> {code}
> java.lang.AssertionError: assertion failed
>         at scala.Predef$.assert(Predef.scala:165)
>         at 
> org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.<init>(ExternalAppendOnlyMap.scala:416)
>         at 
> org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:235)
>         at 
> org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:150)
>         at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
>         at 
> org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:55)
>         at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:54)
>         at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:722)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to