Hi, thanks for the response.
I discovered that my problem was some executors hitting OOM; tracing
through the executor logs uncovered it. The driver log usually does not
reflect the OOM error, which causes confusion for users.

This is just what I found on my side; I'm not sure whether the OP was
hitting the same problem.
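As an aside, the failure mode discussed downthread (a deeply linked structure
overflowing the stack during default Java serialization) can be reproduced in
isolation, outside Spark. Below is a minimal, self-contained Java sketch; the
class name, node count, and helper method are illustrative, not from anyone's
actual job:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class DeepListSerialization {

    // A singly linked list: default Java serialization walks the `next`
    // chain recursively, consuming a set of stack frames per node.
    static class Node implements Serializable {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // Returns true if serializing a list of `n` nodes overflows the stack.
    static boolean overflowsOnSerialize(int n) {
        // Build the list iteratively -- no recursion happens here.
        Node head = new Node(0);
        Node tail = head;
        for (int i = 1; i < n; i++) {
            tail.next = new Node(i);
            tail = tail.next;
        }
        try {
            ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
            out.writeObject(head);  // recurses once per node via writeObject0/defaultWriteFields
            out.close();
            return false;
        } catch (StackOverflowError e) {
            return true;            // the failure mode seen in the executor log
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("10 nodes overflow: " + overflowsOnSerialize(10));
        System.out.println("1,000,000 nodes overflow: " + overflowsOnSerialize(1_000_000));
    }
}
```

With a default-sized thread stack, a short list serializes fine while a
million-node list overflows long before serialization completes, producing
exactly the repeating writeObject0/defaultWriteFields frames shown in the
trace below.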

On Wed, Feb 11, 2015 at 12:03 AM, Arush Kharbanda <
ar...@sigmoidanalytics.com> wrote:

> Hi
>
> Can you share the code you are trying to run.
>
> Thanks
> Arush
>
> On Wed, Feb 11, 2015 at 9:12 AM, Tianshuo Deng <td...@twitter.com.invalid>
> wrote:
>
>> I have seen the same problem. It causes some tasks to fail, but not the
>> whole job.
>> I hope someone can shed some light on what the cause of this could be.
>>
>> On Mon, Jan 26, 2015 at 9:49 AM, Aaron Davidson <ilike...@gmail.com>
>> wrote:
>>
>>> It looks like something weird is going on with your object
>>> serialization, perhaps a funny form of self-reference which is not detected
>>> by ObjectOutputStream's typical loop avoidance. That, or you have some data
>>> structure like a linked list with a parent pointer and you have many
>>> thousand elements.
>>>
>>> Assuming the stack trace is coming from an executor, it is probably a
>>> problem with the objects you're sending back as results, so I would
>>> carefully examine these and maybe try serializing some using
>>> ObjectOutputStream manually.
>>>
>>> If your program looks like
>>> foo.map { row => doComplexOperation(row) }.take(10)
>>>
>>> you can also try changing it to
>>> foo.map { row => doComplexOperation(row); 1 }.take(10)
>>>
>>> to avoid serializing the result of that complex operation, which should
>>> help narrow down where exactly the problematic objects are coming from.
>>>
>>> On Mon, Jan 26, 2015 at 8:31 AM, octavian.ganea <
>>> octavian.ga...@inf.ethz.ch> wrote:
>>>
>>>> Here is the first error I get at the executors:
>>>>
>>>> 15/01/26 17:27:04 ERROR ExecutorUncaughtExceptionHandler: Uncaught
>>>> exception in thread Thread[handle-message-executor-16,5,main]
>>>> java.lang.StackOverflowError
>>>>         at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
>>>>         at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1840)
>>>>         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1533)
>>>>         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>>>         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>>>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>>>>         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
>>>>         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
>>>>         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
>>>>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
>>>>         ... (the writeObject0 / writeOrdinaryObject / writeSerialData /
>>>>         defaultWriteFields cycle repeats many more times)
>>>>
>>>> If you have any pointers on how to debug this, that would be very
>>>> useful. I tried running with both Spark 1.2.0 and 1.1.1 and got the
>>>> same error.
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Lost-task-connection-closed-tp21361p21371.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>
>>>>
>>>>
>>>
>>
>
>
> --
>
>
> *Arush Kharbanda* || Technical Teamlead
>
> ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
>
