Maybe I was bailing too early, Kay. I'm sure I waited at least 15 mins, but maybe not 30.
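One more data point for anyone digging into the StreamCorruptedException below: "invalid type code: AC" is the classic symptom of a second ObjectOutputStream writing a fresh serialization-stream header (the magic bytes 0xAC 0xED) into the middle of bytes that a single ObjectInputStream is reading -- the reader hits 0xAC where it expects a type code. I'm not claiming that's what the shuffle fetch path is actually doing here, just that it's the usual way to produce that exact message. A minimal standalone sketch, plain JDK serialization with nothing Spark-specific:

import java.io._

object StreamMagicRepro {
  def main(args: Array[String]): Unit = {
    val buf = new ByteArrayOutputStream()

    // Writer 1: its constructor emits the stream header, then one object.
    val out1 = new ObjectOutputStream(buf)
    out1.writeObject("first")
    out1.flush()

    // Writer 2 on the SAME underlying stream: its constructor emits a
    // second header (0xAC 0xED ...) into the middle of the byte stream.
    val out2 = new ObjectOutputStream(buf)
    out2.writeObject("second")
    out2.flush()

    // A single reader consumes one header at construction, reads "first"
    // fine, then trips over the 0xAC byte of the second header:
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    println(in.readObject()) // "first"
    println(in.readObject()) // StreamCorruptedException: invalid type code: AC
  }
}

If that is the failure mode, it would point at something re-wrapping an already-open stream on the fetch path rather than at the data itself -- I haven't verified that against the code, so treat it as a hypothesis.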
On Wed, Oct 30, 2013 at 3:45 PM, Kay Ousterhout <[email protected]> wrote:

> Patrick: I don't think this was caused by a recent merge -- pretty sure I
> was seeing it last week.
>
> Mark: Are you sure the examples assembly is hanging, as opposed to just
> taking a long time? It takes ~30 minutes on my machine (not doubting that
> the Java version update fixes it -- just pointing out that if you wait, it
> may actually finish).
>
> Evan: One thing to note is that the log message is wrong (see
> https://github.com/apache/incubator-spark/pull/126): the task is actually
> failing just once, not 4 times. That doesn't help fix the issue -- but I
> thought I'd point it out in case anyone else is trying to look into this.
>
>
> On Wed, Oct 30, 2013 at 2:08 PM, Patrick Wendell <[email protected]> wrote:
>
> > This may have been caused by a recent merge, since a bunch of people
> > independently hit it in the last 48 hours.
> >
> > One debugging step would be to narrow it down to which merge caused it.
> > I don't have time personally today, but just a suggestion for people for
> > whom this is blocking progress.
> >
> > - Patrick
> >
> > On Wed, Oct 30, 2013 at 1:44 PM, Mark Hamstra <[email protected]> wrote:
> >
> > > What JDK version are you using, Evan?
> > >
> > > I tried to reproduce your problem earlier today, but I wasn't even able
> > > to get through the assembly build -- it kept hanging when trying to
> > > build the examples assembly. Forgoing the assembly and running the
> > > tests would hang on FileServerSuite "Dynamically adding JARS locally"
> > > -- no stack trace, just hung. And I was actually seeing a very similar
> > > stack trace to yours from a test suite of our own running against
> > > 0.8.1-SNAPSHOT -- not exactly the same, because line numbers were
> > > different once it went into the Java runtime, and it eventually ended
> > > up someplace a little different. That got me curious about differences
> > > in Java versions, so I updated to the latest Oracle release (1.7.0_45).
> > > Now it cruises right through the build and test of Spark master from
> > > before Matei merged your PR. Then I logged into a machine that has
> > > 1.7.0_15 (7u15-2.3.7-0ubuntu1~11.10.1, actually) installed, and I'm
> > > right back to the hanging during the examples assembly (but it passes
> > > FileServerSuite, oddly enough). Upgrading the JDK didn't improve the
> > > results of the ClearStory test suite I was looking at, so my misery
> > > isn't over; but yours might be with a newer JDK....
> > >
> > >
> > > On Wed, Oct 30, 2013 at 12:44 PM, Evan Chan <[email protected]> wrote:
> > >
> > >> Must be a local environment thing, because AmpLab Jenkins can't
> > >> reproduce it..... :-p
> > >>
> > >> On Wed, Oct 30, 2013 at 11:10 AM, Josh Rosen <[email protected]> wrote:
> > >>
> > >> > Someone on the users list also encountered this exception:
> > >> >
> > >> > https://mail-archives.apache.org/mod_mbox/incubator-spark-user/201310.mbox/%3C64474308D680D540A4D8151B0F7C03F7025EF289%40SHSMSX104.ccr.corp.intel.com%3E
> > >> >
> > >> >
> > >> > On Wed, Oct 30, 2013 at 9:40 AM, Evan Chan <[email protected]> wrote:
> > >> >
> > >> >> I'm at the latest
> > >> >>
> > >> >> commit f0e23a023ce1356bc0f04248605c48d4d08c2d05
> > >> >> Merge: aec9bf9 a197137
> > >> >> Author: Reynold Xin <[email protected]>
> > >> >> Date:   Tue Oct 29 01:41:44 2013 -0400
> > >> >>
> > >> >> and I'm seeing this when I do a "test-only FileServerSuite":
> > >> >>
> > >> >> 13/10/30 09:35:04.300 INFO DAGScheduler: Completed ResultTask(0, 0)
> > >> >> 13/10/30 09:35:04.307 INFO LocalTaskSetManager: Loss was due to java.io.StreamCorruptedException
> > >> >> java.io.StreamCorruptedException: invalid type code: AC
> > >> >>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
> > >> >>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
> > >> >>   at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:39)
> > >> >>   at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:101)
> > >> >>   at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
> > >> >>   at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:440)
> > >> >>   at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:26)
> > >> >>   at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27)
> > >> >>   at org.apache.spark.Aggregator.combineCombinersByKey(Aggregator.scala:53)
> > >> >>   at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$2.apply(PairRDDFunctions.scala:95)
> > >> >>   at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$2.apply(PairRDDFunctions.scala:94)
> > >> >>   at org.apache.spark.rdd.MapPartitionsWithContextRDD.compute(MapPartitionsWithContextRDD.scala:40)
> > >> >>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
> > >> >>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
> > >> >>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:107)
> > >> >>   at org.apache.spark.scheduler.Task.run(Task.scala:53)
> > >> >>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:212)
> > >> >>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> > >> >>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> > >> >>   at java.lang.Thread.run(Thread.java:680)
> > >> >>
> > >> >> Has anybody else seen this yet?
> > >> >>
> > >> >> I have a really simple PR, and this fails without my change, so I
> > >> >> may go ahead and submit it anyway.
> > >> >>
> > >> >> --
> > >> >> --
> > >> >> Evan Chan
> > >> >> Staff Engineer
> > >> >> [email protected] |
> > >> >>
> > >>
> > >>
> > >> --
> > >> --
> > >> Evan Chan
> > >> Staff Engineer
> > >> [email protected] |
> > >>
