Hey all

We have a user running Scalding on Cascading3 on Tez. The exception below tends
to crop up in DAGs that hang indefinitely (this DAG has 140 vertices).

It looks like the flag exception BufferTooSmallException isn't being caught, so
the buffer is never reset. Nor does the exception, once passed up to the
thread, cause the Node/DAG to fail.

Or is this a misinterpretation?
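
To make the expectation concrete, here is a minimal Java sketch of the
catch-and-retry pattern I would expect around a flag exception. It is not the
Tez code: FlagExceptionSketch, FixedBuffer, SketchWriter and the String record
type are all hypothetical stand-ins, and only the name BufferTooSmallException
is borrowed from the trace.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of a buffer-backed writer. NOT the Tez implementation.
class FlagExceptionSketch {

    // Flag exception: means "this record did not fit", not "the task failed".
    static class BufferTooSmallException extends IOException {}

    // Fixed-capacity stream that raises the flag exception on overflow.
    static class FixedBuffer extends OutputStream {
        final byte[] data;
        int pos = 0;

        FixedBuffer(int capacity) { data = new byte[capacity]; }

        @Override
        public void write(int b) throws IOException {
            if (pos >= data.length) throw new BufferTooSmallException();
            data[pos++] = (byte) b;
        }
    }

    static class SketchWriter {
        final int capacity;
        final List<byte[]> spills = new ArrayList<>();
        FixedBuffer current;

        SketchWriter(int capacity) {
            this.capacity = capacity;
            this.current = new FixedBuffer(capacity);
        }

        void write(String record) throws IOException {
            int mark = current.pos;              // start of this record
            try {
                serialize(record, current);
            } catch (BufferTooSmallException e) {
                current.pos = mark;              // discard the partial record
                if (mark == 0) {
                    // An empty buffer could not hold it: now it is a real failure.
                    throw new IOException("record larger than buffer", e);
                }
                spill();                         // hand off the full buffer
                write(record);                   // retry on the fresh buffer
            }
        }

        // Stand-in for the serializer chain; it must let the flag exception through.
        private static void serialize(String record, OutputStream out) throws IOException {
            byte[] bytes = record.getBytes(StandardCharsets.UTF_8);
            DataOutputStream data = new DataOutputStream(out);
            data.writeInt(bytes.length);
            data.write(bytes);
        }

        private void spill() {
            spills.add(Arrays.copyOf(current.data, current.pos));
            current = new FixedBuffer(capacity);
        }
    }
}
```

In this sketch the retry only works if every layer between the buffer and the
writer lets the flag exception propagate unchanged. Whether that holds for the
serializers in the trace below is exactly the open question.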

ckw


2015-03-23 11:32:40,445 INFO [TezChild] writers.UnorderedPartitionedKVWriter: Moving to next buffer and triggering spill
2015-03-23 11:32:40,496 INFO [UnorderedOutSpiller [E61683F3D94D46C2998CDC61CD112750]] writers.UnorderedPartitionedKVWriter: Finished spill 1
2015-03-23 11:32:40,496 INFO [UnorderedOutSpiller [E61683F3D94D46C2998CDC61CD112750]] writers.UnorderedPartitionedKVWriter: Spill# 1 complete.
2015-03-23 11:32:41,185 ERROR [TezChild] hadoop.TupleSerialization$SerializationElementWriter: failed serializing token: 181 with classname: scala.Tuple2
org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter$BufferTooSmallException
        at org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter$ByteArrayOutputStream.write(UnorderedPartitionedKVWriter.java:651)
        at org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter$ByteArrayOutputStream.write(UnorderedPartitionedKVWriter.java:646)
        at java.io.DataOutputStream.write(DataOutputStream.java:88)
        at java.io.DataOutputStream.writeInt(DataOutputStream.java:198)
        at com.twitter.chill.hadoop.KryoSerializer.serialize(KryoSerializer.java:50)
        at cascading.tuple.hadoop.TupleSerialization$SerializationElementWriter.write(TupleSerialization.java:705)
        at cascading.tuple.io.TupleOutputStream.writeElement(TupleOutputStream.java:114)
        at cascading.tuple.io.TupleOutputStream.write(TupleOutputStream.java:89)
        at cascading.tuple.io.TupleOutputStream.writeTuple(TupleOutputStream.java:64)
        at cascading.tuple.hadoop.io.TupleSerializer.serialize(TupleSerializer.java:37)
        at cascading.tuple.hadoop.io.TupleSerializer.serialize(TupleSerializer.java:28)
        at org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.write(UnorderedPartitionedKVWriter.java:212)
        at org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.write(UnorderedPartitionedKVWriter.java:194)
        at cascading.flow.tez.stream.element.OldOutputCollector.collect(OldOutputCollector.java:51)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[…]

—
Chris K Wensel
[email protected]