[jira] [Commented] (SPARK-3277) LZ4 compression causes the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113741#comment-14113741 ]

Mridul Muralidharan commented on SPARK-3277:


This looks like unrelated changes pushed to BlockObjectWriter as part of the 
introduction of ShuffleWriteMetrics.
I had introduced checks and also documented that we must not infer size based 
on the position of the stream after flush, since close can write data to the 
streams (and one flush can result in more data being generated which need not 
be flushed to the streams).

Apparently this logic was modified subsequently, causing this bug.
The solution would be to revert the change that updates shuffleBytesWritten 
before the close of the stream; it must be done after close, based on 
file.length.
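
The point about flush versus close can be demonstrated with any compressing stream. This is a minimal sketch, not Spark code: `GZIPOutputStream` stands in for the LZ4 stream (lz4 is not in the JDK), but the failure mode is the same — bytes the compressor is still holding only reach the sink on close, so the stream position at flush time undercounts.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// Demonstrates why the stream position after flush() must not be used
// to infer bytes written: a compressing stream may hold data back
// until close().
public class FlushVsClose {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(sink);
        gz.write(new byte[4096]);
        gz.flush();                    // compressor still buffers its input
        int afterFlush = sink.size();  // typically just the gzip header
        gz.close();                    // pending blocks + trailer written now
        int afterClose = sink.size();
        System.out.println("after flush: " + afterFlush
                + ", after close: " + afterClose);
        // A size metric such as shuffleBytesWritten must therefore be
        // taken after close (e.g. from file.length), not at flush time.
    }
}
```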

 LZ4 compression causes the ExternalSort exception
 

 Key: SPARK-3277
 URL: https://issues.apache.org/jira/browse/SPARK-3277
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.0.2, 1.1.0, 1.2.0
Reporter: hzw
Assignee: Andrew Or
Priority: Blocker
 Fix For: 1.1.0

 Attachments: test_lz4_bug.patch


 I tested LZ4 compression, and it came up with this problem (with wordcount).
 I also tested Snappy and LZF, and they were OK.
 Finally I set spark.shuffle.spill to false to avoid the exception,
 but once this switch is on, the error appears.
 It seems that if the number of words is small, wordcount goes through, but
 with a complex text the problem shows up.
 Exception info as follows:
 {code}
 java.lang.AssertionError: assertion failed
 at scala.Predef$.assert(Predef.scala:165)
 at org.apache.spark.util.collection.ExternalAppendOnlyMap$DiskMapIterator.<init>(ExternalAppendOnlyMap.scala:416)
 at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:235)
 at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:150)
 at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
 at org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:55)
 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
 at org.apache.spark.scheduler.Task.run(Task.scala:54)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3277) LZ4 compression causes the ExternalSort exception

2014-08-28 Thread hzw (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113789#comment-14113789 ]

hzw commented on SPARK-3277:


Sorry, I cannot understand it clearly since I'm not familiar with the code of 
this class.
Can you point to the line of code where it goes wrong, or make a PR to fix 
this problem?




[jira] [Commented] (SPARK-3277) LZ4 compression causes the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114014#comment-14114014 ]

Mridul Muralidharan commented on SPARK-3277:


[~matei] Attaching a patch which reproduces the bug consistently.
I suspect the issue is more serious than what I detailed above: spill to disk 
seems completely broken, if I understood the assertion message correctly.
Unfortunately, this is based on a few minutes of free time I could grab, so a 
more principled debugging session is definitely warranted!






[jira] [Commented] (SPARK-3277) LZ4 compression causes the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114026#comment-14114026 ]

Mridul Muralidharan commented on SPARK-3277:


[~hzw] Did you notice this against 1.0.2?
I did not think the changes for consolidated shuffle were backported to that 
branch; [~mateiz] can comment more, though.




[jira] [Commented] (SPARK-3277) LZ4 compression causes the ExternalSort exception

2014-08-28 Thread Matei Zaharia (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114247#comment-14114247 ]

Matei Zaharia commented on SPARK-3277:
--

Thanks Mridul -- I think Andrew and Patrick have figured this out.




[jira] [Commented] (SPARK-3277) LZ4 compression causes the ExternalSort exception

2014-08-28 Thread Mridul Muralidharan (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114484#comment-14114484 ]

Mridul Muralidharan commented on SPARK-3277:


Sounds great, thanks!
I suspect it is because for LZF we configure it to write the block on flush 
(a partial block if there is insufficient data to fill it); but for LZ4, 
either such a config does not exist or we don't use it.
As a result, flush becomes a no-op when the data in the current block is 
insufficient to produce a compressed block, while close will force the 
partial block to be written out.

Which is why the assertion lists all sizes as 0.
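
The suspected block-buffering behaviour can be sketched as follows. This is a hypothetical stand-in, not the actual LZ4 stream: a writer whose flush() only passes completed blocks to the sink, while close() forces the partial block out, reproducing the "all sizes are 0 at flush time" symptom the assertion reports.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical block-buffering stream: flush() emits nothing for a
// partially filled block; only close() forces the partial block out.
public class BlockBufferedStream extends OutputStream {
    private final OutputStream sink;
    private final byte[] block;
    private int filled = 0;

    public BlockBufferedStream(OutputStream sink, int blockSize) {
        this.sink = sink;
        this.block = new byte[blockSize];
    }

    @Override public void write(int b) throws IOException {
        block[filled++] = (byte) b;
        if (filled == block.length) {       // only full blocks are emitted
            sink.write(block, 0, filled);
            filled = 0;
        }
    }

    // flush() is a no-op for the partial block: nothing reaches the sink.
    @Override public void flush() throws IOException { sink.flush(); }

    // close() writes the partial block, so the bytes appear only now.
    @Override public void close() throws IOException {
        if (filled > 0) { sink.write(block, 0, filled); filled = 0; }
        sink.close();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BlockBufferedStream s = new BlockBufferedStream(sink, 65536);
        s.write(new byte[100]);
        s.flush();
        System.out.println("visible after flush: " + sink.size()); // 0
        s.close();
        System.out.println("visible after close: " + sink.size()); // 100
    }
}
```

If a size metric is sampled at flush time on such a stream, every spill smaller than one block measures as 0 bytes, which matches the assertion failing with all sizes 0.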

