We are using protobuf, which under the covers does have this kind of deeply nested structure. I was under the impression that the custom serializer solution would work out. It helped some, but ultimately I needed a larger stack size.
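To illustrate the failure mode discussed in this thread, here is a minimal sketch in plain Java. It is not taken from the actual job: the Node class, the chain length of 20,000, and the two stack sizes are all hypothetical, chosen only to show that default Java serialization recurses once per level of nesting and that a bigger thread stack lets the same round trip finish. Note that the stackSize argument to the Thread constructor is only a hint that some platforms may ignore.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class DeepGraphDemo {
    /** Hypothetical linked node; every link adds one level of nesting. */
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final int value;
        final Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    /** Builds a chain of the given length iteratively (no recursion here). */
    static Node chain(int length) {
        Node head = null;
        for (int i = 0; i < length; i++) head = new Node(i, head);
        return head;
    }

    /** Java-serializes and deserializes the chain; returns the length read back. */
    static int roundTrip(Node head) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(head); // recurses once per nested Node
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            int n = 0;
            for (Node cur = (Node) in.readObject(); cur != null; cur = cur.next) n++;
            return n;
        }
    }

    /**
     * Runs roundTrip on a thread that requests the given stack size.
     * Returns the chain length, or -1 if a StackOverflowError occurred.
     */
    static int roundTripWithStack(int length, long stackBytes) throws InterruptedException {
        final int[] result = {-1};
        final Node head = chain(length);
        Thread t = new Thread(null, () -> {
            try {
                result[0] = roundTrip(head);
            } catch (StackOverflowError e) {
                result[0] = -1; // nesting was deeper than this stack allows
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }, "roundtrip", stackBytes);
        t.start();
        t.join();
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        int length = 20_000; // hypothetical nesting depth
        System.out.println("256 KB stack: " + roundTripWithStack(length, 256L * 1024));
        System.out.println("256 MB stack: " + roundTripWithStack(length, 256L * 1024 * 1024));
    }
}
```

On a JVM that honors the requested stack size, the small-stack run would be expected to report -1 and the large-stack run the full chain length. The recursion here is finite: its depth simply tracks the depth of the object graph, which is consistent with a larger -Xss making the error go away.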
On Mon, Sep 23, 2013 at 3:11 PM, Reynold Xin <[email protected]> wrote:

> Hi Gary,
>
> I am really confused here - what does your custom serializer do? Do you
> have some data structure that is having a giant nested structure?
>
> --
> Reynold Xin, AMPLab, UC Berkeley
> http://rxin.org
>
> On Tue, Sep 17, 2013 at 1:40 PM, Gary Malouf <[email protected]> wrote:
>
>> We ultimately solved this by putting a huge stack size of 100m on the
>> slave nodes' spark-env.sh. Two things deceiving about this:
>>
>> 1) That a gigantic stack is needed for deserialization
>>
>> 2) The docs seem to imply that the slave settings are determined at
>> runtime from the scheduler - this is not the case globally.
>>
>> On Tue, Sep 17, 2013 at 12:38 PM, Gary Malouf <[email protected]> wrote:
>>
>>> If more context is needed, I am happy to provide it. This is a very
>>> troubling issue for us as it seriously limits how much data we can look at
>>> a time in Spark. For now, I am able to revert to Hive to get the job done..
>>>
>>> On Fri, Sep 13, 2013 at 3:19 PM, Gary Malouf <[email protected]> wrote:
>>>
>>>> I previously was having issues with StackOverflows when working with
>>>> one or two days worth of data.
>>>> Steps I have taken since then:
>>>>
>>>> 1) Increase stack size (Xss) from default to 2m to as high as 200m
>>>> 2) Activate Kryo serialization
>>>> 3) Implement custom serializers for my protobuf messages
>>>>
>>>> While these changes have allowed me to grab up to 10 days worth of
>>>> data, I can not really go beyond that without the dreaded
>>>> StackOverflowError:
>>>>
>>>> java.lang.StackOverflowError
>>>>     at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2291)
>>>>     at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2584)
>>>>     at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2594)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1704)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1342)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>
>>>> Seems like it gets stuck in an infinite loop of deserialization. Has
>>>> anyone found ways to work through this?
