We ultimately solved this by setting a huge stack size of 100m in spark-env.sh on the slave nodes. Two things were deceptive about this:
1) That a gigantic stack is needed for deserialization.
2) The docs seem to imply that the slave settings are determined at runtime from the scheduler - this is not the case globally.

On Tue, Sep 17, 2013 at 12:38 PM, Gary Malouf <[email protected]> wrote:

> If more context is needed, I am happy to provide it. This is a very
> troubling issue for us as it seriously limits how much data we can look at
> a time in Spark. For now, I am able to revert to Hive to get the job done..
>
>
> On Fri, Sep 13, 2013 at 3:19 PM, Gary Malouf <[email protected]> wrote:
>
>> I previously was having issues with StackOverflows when working with one
>> or two days worth of data. Steps I have taken since then:
>>
>> 1) Increase stack size (Xss) from default to 2m to as high as 200m
>> 2) Activate Kryo serialization
>> 3) Implement custom serializers for my protobuf messages
>>
>> While these changes have allowed me to grab up to 10 days worth of data,
>> I can not really go beyond that without the dreaded StackOverflowError:
>>
>> java.lang.StackOverflowError
>>     at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2291)
>>     at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2584)
>>     at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2594)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>
>>
>> Seems like it gets stuck in an infinite loop of deserialization. Has
>> anyone found ways to work through this?
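For anyone hitting the same trace: the frames above are all in java.io.ObjectInputStream, which reads nested objects recursively, so the stack depth it needs grows with the depth of the object graph rather than looping forever. Below is a minimal standalone sketch of that effect. It is not taken from our Spark job; the DeepGraphDemo class, the Node chain, and the chain length and stack sizes are all hypothetical choices for illustration. It uses the standard Thread constructor that accepts a requested stack size, which the JVM is free to treat as a hint.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicReference;

public class DeepGraphDemo {

    /** Hypothetical stand-in for a deeply nested record: a singly linked chain. */
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        Node next;
    }

    /** Builds the chain iteratively, so construction itself needs no deep stack. */
    static Node buildChain(int length) {
        Node head = null;
        for (int i = 0; i < length; i++) {
            Node n = new Node();
            n.next = head;
            head = n;
        }
        return head;
    }

    /** Java-serializes the chain, reads it back, and returns the number of nodes read. */
    static int roundTrip(Node head) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(head); // recurses once per nested object
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            int count = 0;
            for (Node n = (Node) in.readObject(); n != null; n = n.next) {
                count++;
            }
            return count;
        }
    }

    /**
     * Runs the round trip on a thread with the requested stack size (a hint to the JVM).
     * Returns "ok:<count>" on success, otherwise a short description of the failure.
     */
    public static String roundTripWithStack(int length, long stackBytes) {
        AtomicReference<String> result = new AtomicReference<>("not run");
        Runnable task = () -> {
            try {
                result.set("ok:" + roundTrip(buildChain(length)));
            } catch (StackOverflowError e) {
                result.set("StackOverflowError");
            } catch (Exception e) {
                result.set(e.toString());
            }
        };
        Thread t = new Thread(null, task, "roundtrip", stackBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
        return result.get();
    }

    public static void main(String[] args) {
        // Same data, two requested stack sizes: 256 KB versus 256 MB.
        System.out.println("small stack: " + roundTripWithStack(20_000, 256L * 1024));
        System.out.println("large stack: " + roundTripWithStack(20_000, 256L * 1024 * 1024));
    }
}
```

On a typical HotSpot JVM I would expect the small-stack run to report a StackOverflowError and the large-stack run to succeed, but the exact threshold depends on the JVM and platform, so treat the numbers as illustrative only.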
