Could you provide a script to reproduce this problem?

Thanks!

On Wed, Oct 8, 2014 at 9:13 PM, Sung Hwan Chung
<coded...@cs.stanford.edu> wrote:
> This is also happening to me on a regular basis when the job is large and uses
> relatively large serialized objects in each RDD lineage.  The bad part is that
> this exception always kills the whole job.
>
>
> On Fri, Sep 26, 2014 at 11:17 AM, Brad Miller <bmill...@eecs.berkeley.edu>
> wrote:
>>
>> FWIW I suspect that each count operation is an opportunity for you to
>> trigger the bug, and each filter operation increases the likelihood of
>> setting up the bug.  I normally don't come across this error until my job
>> has been running for an hour or two and had a chance to build up longer
>> lineages for some RDDs.  It sounds like your data is a bit smaller, so you can
>> build up long lineages more quickly.
>>
>> If you can reduce your number of filter operations (for example by
>> combining some into a single function), that may help.  It may also help to
>> introduce persistence or checkpointing at intermediate stages so that the
>> lineages that have to get replayed aren't as long.
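>>
>> Something along these lines is what I have in mind (a rough Scala sketch;
>> records, the predicates p1/p2, and the checkpoint path are just placeholders
>> for whatever your job actually uses):
>>
>>   import org.apache.spark.storage.StorageLevel
>>
>>   // Instead of several separate passes like
>>   //   records.filter(p1).count()
>>   //   records.filter(p1).filter(p2).count()
>>   // combine the predicates and compute both counts in one pass:
>>   val (c1, c2) = records.map { r =>
>>     (if (p1(r)) 1L else 0L, if (p1(r) && p2(r)) 1L else 0L)
>>   }.reduce((x, y) => (x._1 + y._1, x._2 + y._2))
>>
>>   // Cut long lineages with persistence plus checkpointing:
>>   sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")  // placeholder path
>>   val filtered = records.filter(p1).persist(StorageLevel.MEMORY_AND_DISK)
>>   filtered.checkpoint()  // lineage is truncated once the next action runs
>>   filtered.count()
>>
>> Checkpointing writes the RDD out to storage, so a failed task only has to
>> replay from that point rather than from the start of the lineage.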
>>
>> On Fri, Sep 26, 2014 at 11:10 AM, Arun Ahuja <aahuj...@gmail.com> wrote:
>>>
>>> No, for me as well it is non-deterministic.  It happens in a piece of code
>>> that does many filters and counts on a small set of records (~1k-10k).  The
>>> original set is persisted in memory, and we have a Kryo serializer set for it.
>>> The task itself takes in just a few filtering parameters.  With the same
>>> settings, it has sometimes completed successfully and sometimes failed during
>>> this step.
>>>
>>> Arun
>>>
>>> On Fri, Sep 26, 2014 at 1:32 PM, Brad Miller <bmill...@eecs.berkeley.edu>
>>> wrote:
>>>>
>>>> I've had multiple jobs crash due to "java.io.IOException: unexpected
>>>> exception type"; I've been running the 1.1 branch for some time and am now
>>>> running the 1.1 release binaries. Note that I only use PySpark. I haven't
>>>> kept detailed notes or the tracebacks around, since there are other problems
>>>> that have caused me greater grief (namely "key not found" errors).
>>>>
>>>> For me the exception seems to occur non-deterministically, which is a bit
>>>> interesting since the error message shows that the same stage has failed
>>>> multiple times.  Are you able to consistently reproduce the bug across
>>>> multiple invocations at the same place?
>>>>
>>>> On Fri, Sep 26, 2014 at 6:11 AM, Arun Ahuja <aahuj...@gmail.com> wrote:
>>>>>
>>>>> Has anyone else seen this error in task deserialization?  The task is
>>>>> processing a small amount of data and doesn't seem to have much data hanging
>>>>> off of the closure.  I've only seen this with Spark 1.1.
>>>>>
>>>>> Job aborted due to stage failure: Task 975 in stage 8.0 failed 4 times,
>>>>> most recent failure: Lost task 975.3 in stage 8.0 (TID 24777, host.com):
>>>>> java.io.IOException: unexpected exception type
>>>>>         java.io.ObjectStreamClass.throwMiscException(ObjectStreamClass.java:1538)
>>>>>         java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1025)
>>>>>         java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>>>>>         java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>>         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>>         java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>>>         java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>>>         java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>>>         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>>>         java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>>         org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
>>>>>         org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
>>>>>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:159)
>>>>>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>         java.lang.Thread.run(Thread.java:744)
>>>>
>>>>
>>>
>>
>
