Maybe I was bailing too early, Kay.  I'm sure I waited at least 15 mins,
but maybe not 30.



On Wed, Oct 30, 2013 at 3:45 PM, Kay Ousterhout <[email protected]> wrote:

> Patrick: I don't think this was caused by a recent merge -- pretty sure I
> was seeing it last week.
>
> Mark: Are you sure the examples assembly is hanging, as opposed to just
> taking a long time?  It takes ~30 minutes on my machine (not doubting that
> the Java version update fixes it -- just pointing out that if you wait, it
> may actually finish).
>
> Evan: One thing to note is that the log message is wrong (see
> https://github.com/apache/incubator-spark/pull/126): the task is actually
> failing just once, not 4 times.  Doesn't help fix the issue -- but just
> thought I'd point it out in case anyone else is trying to look into this.
>
>
> On Wed, Oct 30, 2013 at 2:08 PM, Patrick Wendell <[email protected]>
> wrote:
>
> > This may have been caused by a recent merge since a bunch of people
> > independently hit it in the last 48 hours.
> >
> > One debugging step would be to narrow it down to which merge caused
> > it. I don't have time personally today, but just a suggestion for ppl
> > for whom this is blocking progress.
> >
> > - Patrick
> >
> > On Wed, Oct 30, 2013 at 1:44 PM, Mark Hamstra <[email protected]>
> > wrote:
> > > > What JDK version are you using, Evan?
> > >
> > > > I tried to reproduce your problem earlier today, but I wasn't even able
> > > > to get through the assembly build -- kept hanging when trying to build
> > > > the examples assembly.  Forgoing the assembly and running the tests
> > > > would hang on FileServerSuite "Dynamically adding JARS locally" -- no
> > > > stack trace, just hung.  And I was actually seeing a very similar stack
> > > > trace to yours from a test suite of our own running against
> > > > 0.8.1-SNAPSHOT -- not exactly the same because line numbers were
> > > > different once it went into the Java runtime, and it eventually ended up
> > > > someplace a little different.  That got me curious about differences in
> > > > Java versions, so I updated to the latest Oracle release (1.7.0_45).
> > > > Now it cruises right through the build and test of Spark master from
> > > > before Matei merged your PR.  Then I logged into a machine that has
> > > > 1.7.0_15 (7u15-2.3.7-0ubuntu1~11.10.1, actually) installed, and I'm
> > > > right back to the hanging during the examples assembly (but it passes
> > > > FileServerSuite, oddly enough).  Upgrading the JDK didn't improve the
> > > > results of the ClearStory test suite I was looking at, so my misery
> > > > isn't over; but yours might be with a newer JDK....
> > >
> > >
> > >
> > > On Wed, Oct 30, 2013 at 12:44 PM, Evan Chan <[email protected]> wrote:
> > >
> > >> Must be a local environment thing, because AmpLab Jenkins can't
> > >> reproduce it..... :-p
> > >>
> > > >> On Wed, Oct 30, 2013 at 11:10 AM, Josh Rosen <[email protected]> wrote:
> > > >> > Someone on the users list also encountered this exception:
> > > >> >
> > > >> > https://mail-archives.apache.org/mod_mbox/incubator-spark-user/201310.mbox/%3C64474308D680D540A4D8151B0F7C03F7025EF289%40SHSMSX104.ccr.corp.intel.com%3E
> > >> >
> > >> >
> > >> > On Wed, Oct 30, 2013 at 9:40 AM, Evan Chan <[email protected]> wrote:
> > >> >
> > >> >> I'm at the latest
> > >> >>
> > >> >> commit f0e23a023ce1356bc0f04248605c48d4d08c2d05
> > >> >> Merge: aec9bf9 a197137
> > >> >> Author: Reynold Xin <[email protected]>
> > >> >> Date:   Tue Oct 29 01:41:44 2013 -0400
> > >> >>
> > >> >>
> > >> >> and seeing this when I do a "test-only FileServerSuite":
> > >> >>
> > >> >> 13/10/30 09:35:04.300 INFO DAGScheduler: Completed ResultTask(0, 0)
> > > >> >> 13/10/30 09:35:04.307 INFO LocalTaskSetManager: Loss was due to java.io.StreamCorruptedException
> > > >> >> java.io.StreamCorruptedException: invalid type code: AC
> > > >> >>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
> > > >> >>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
> > > >> >>         at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:39)
> > > >> >>         at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:101)
> > > >> >>         at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
> > > >> >>         at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:440)
> > > >> >>         at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:26)
> > > >> >>         at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27)
> > > >> >>         at org.apache.spark.Aggregator.combineCombinersByKey(Aggregator.scala:53)
> > > >> >>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$2.apply(PairRDDFunctions.scala:95)
> > > >> >>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$2.apply(PairRDDFunctions.scala:94)
> > > >> >>         at org.apache.spark.rdd.MapPartitionsWithContextRDD.compute(MapPartitionsWithContextRDD.scala:40)
> > > >> >>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
> > > >> >>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
> > > >> >>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:107)
> > > >> >>         at org.apache.spark.scheduler.Task.run(Task.scala:53)
> > > >> >>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:212)
> > > >> >>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> > > >> >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> > > >> >>         at java.lang.Thread.run(Thread.java:680)
> > >> >>
> > >> >>
> > >> >> Anybody else seen this yet?
> > >> >>
> > > >> >> I have a really simple PR and this fails without my change, so I may
> > > >> >> go ahead and submit it anyways.
> > >> >>
> > >> >> --
> > >> >> --
> > >> >> Evan Chan
> > >> >> Staff Engineer
> > >> >> [email protected]  |
> > >> >>
> > >>
> > >>
> > >>
> > >> --
> > >> --
> > >> Evan Chan
> > >> Staff Engineer
> > >> [email protected]  |
> > >>
> >
>
