Hello,

Sorry for the late reply.
When I tried the LogQuery example this time, things now seem to be fine!

...
14/10/10 04:01:21 INFO scheduler.DAGScheduler: Stage 0 (collect at LogQuery.scala:80) finished in 0.429 s
14/10/10 04:01:21 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool defa
14/10/10 04:01:21 INFO spark.SparkContext: Job finished: collect at LogQuery.scala:80, took 12.802743914 s
(10.10.10.10,"FRED",GET http://images.com/2013/Generic.jpg HTTP/1.1) bytes=621 n=2

Not sure if this is the correct output for that example.

Our mesos/spark builds have been updated since I last wrote. Possibly, the JDK version was updated to 1.7.0_67. If you are using an older JDK, maybe try updating that?
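In case it's useful, here is a minimal sketch of how a job can be pointed at Mesos and flipped between coarse-grained and fine-grained mode from code. The master URL is a placeholder and this is only an illustration, not our exact setup:

import org.apache.spark.{SparkConf, SparkContext}

// Placeholder master URL; fill in the real Mesos master host.
val conf = new SparkConf()
  .setAppName("LogQuery on Mesos")
  .setMaster("mesos://<mesos-master>:5050")
  // spark.mesos.coarse defaults to false (fine-grained scheduling);
  // set it to "true" to run the same job in coarse-grained mode.
  .set("spark.mesos.coarse", "true")
val sc = new SparkContext(conf)

Per the rest of the thread below, the duplicate block manager registration has only been reported with the fine-grained setting.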
- Fi

Fairiz "Fi" Azizi

On Wed, Oct 8, 2014 at 7:54 AM, RJ Nowling <rnowl...@gmail.com> wrote:
> Yep! That's the example I was talking about.
>
> Is an error message printed when it hangs? I get:
>
> 14/09/30 13:23:14 ERROR BlockManagerMasterActor: Got two different block manager registrations on 20140930-131734-1723727882-5050-1895-1
>
> On Tue, Oct 7, 2014 at 8:36 PM, Fairiz Azizi <code...@gmail.com> wrote:
>
>> Sure, could you point me to the example?
>>
>> The only thing I could find was
>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/LogQuery.scala
>>
>> So do you mean running it like:
>> MASTER="mesos://xxxxxxx:5050" ./run-example LogQuery
>>
>> I tried that, and I can see the job run and the tasks complete on the slave nodes, but the client process seems to hang forever; it's probably a different problem. BTW, only a dozen or so tasks kick off.
>>
>> I actually haven't done much with Scala and Spark (it's been all Python).
>>
>> Fi
>>
>> Fairiz "Fi" Azizi
>>
>> On Tue, Oct 7, 2014 at 6:29 AM, RJ Nowling <rnowl...@gmail.com> wrote:
>>
>>> I was able to reproduce it on a small 4-node cluster (1 mesos master and 3 mesos slaves) with relatively low-end specs. As I said, I just ran the log query examples with the fine-grained mesos mode.
>>>
>>> Spark 1.1.0 and mesos 0.20.1.
>>>
>>> Fairiz, could you try running the logquery example included with Spark and see what you get?
>>>
>>> Thanks!
>>>
>>> On Mon, Oct 6, 2014 at 8:07 PM, Fairiz Azizi <code...@gmail.com> wrote:
>>>
>>>> That's what's great about Spark, the community is so active! :)
>>>>
>>>> I compiled Mesos 0.20.1 from the source tarball.
>>>>
>>>> Using the Mapr3 Spark 1.1.0 distribution from the Spark downloads page (spark-1.1.0-bin-mapr3.tgz).
>>>>
>>>> I see no problems for the workloads we are trying.
>>>>
>>>> However, the cluster is small (less than 100 cores across 3 nodes).
>>>>
>>>> The workload reads in just a few gigabytes from HDFS, via an IPython notebook Spark shell.
>>>>
>>>> thanks,
>>>> Fi
>>>>
>>>> Fairiz "Fi" Azizi
>>>>
>>>> On Mon, Oct 6, 2014 at 9:20 AM, Timothy Chen <tnac...@gmail.com> wrote:
>>>>
>>>>> OK, I created SPARK-3817 to track this; will try to repro it as well.
>>>>>
>>>>> Tim
>>>>>
>>>>> On Mon, Oct 6, 2014 at 6:08 AM, RJ Nowling <rnowl...@gmail.com> wrote:
>>>>> > I've recently run into this issue as well. I get it from running Spark
>>>>> > examples such as log query. Maybe that'll help reproduce the issue.
>>>>> >
>>>>> > On Monday, October 6, 2014, Gurvinder Singh <gurvinder.si...@uninett.no> wrote:
>>>>> >>
>>>>> >> The issue does not occur if the task at hand has a small number of map
>>>>> >> tasks. I have a task which has 978 map tasks, and I see this error:
>>>>> >>
>>>>> >> 14/10/06 09:34:40 ERROR BlockManagerMasterActor: Got two different block
>>>>> >> manager registrations on 20140711-081617-711206558-5050-2543-5
>>>>> >>
>>>>> >> Here is the log from the mesos-slave where this container was running:
>>>>> >>
>>>>> >> http://pastebin.com/Q1Cuzm6Q
>>>>> >>
>>>>> >> If you look at the Spark code where the error is produced, you will
>>>>> >> see that it simply exits, saying in the comments "this should never
>>>>> >> happen, lets just quit" :-)
>>>>> >>
>>>>> >> - Gurvinder
>>>>> >>
>>>>> >> On 10/06/2014 09:30 AM, Timothy Chen wrote:
>>>>> >> > (Hit enter too soon...)
>>>>> >> >
>>>>> >> > What is your setup and steps to repro this?
>>>>> >> >
>>>>> >> > Tim
>>>>> >> >
>>>>> >> > On Mon, Oct 6, 2014 at 12:30 AM, Timothy Chen <tnac...@gmail.com> wrote:
>>>>> >> >> Hi Gurvinder,
>>>>> >> >>
>>>>> >> >> I tried fine-grained mode before and didn't run into that problem.
>>>>> >> >>
>>>>> >> >> On Sun, Oct 5, 2014 at 11:44 PM, Gurvinder Singh
>>>>> >> >> <gurvinder.si...@uninett.no> wrote:
>>>>> >> >>> On 10/06/2014 08:19 AM, Fairiz Azizi wrote:
>>>>> >> >>>> The Spark online docs indicate that Spark is compatible with Mesos
>>>>> >> >>>> 0.18.1.
>>>>> >> >>>>
>>>>> >> >>>> I've gotten it to work just fine on 0.18.1 and 0.18.2.
>>>>> >> >>>>
>>>>> >> >>>> Has anyone tried Spark on a newer version of Mesos, i.e. Mesos
>>>>> >> >>>> v0.20.0?
>>>>> >> >>>>
>>>>> >> >>>> -Fi
>>>>> >> >>>>
>>>>> >> >>> Yeah, we are using Spark 1.1.0 with Mesos 0.20.1. It runs fine in
>>>>> >> >>> coarse mode; in fine-grained mode there is an issue with conflicting
>>>>> >> >>> BlockManager names. I have been waiting for it to be fixed, but it is
>>>>> >> >>> still there.
>>>>> >> >>>
>>>>> >> >>> -Gurvinder
>>>>> >> >>>
>>>>> >> >>> ---------------------------------------------------------------------
>>>>> >> >>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>>>>> >> >>> For additional commands, e-mail: dev-h...@spark.apache.org
>>>>> >> >>>
>>>>> >>
>>>>> >> ---------------------------------------------------------------------
>>>>> >> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>>>>> >> For additional commands, e-mail: dev-h...@spark.apache.org
>>>>> >>
>>>>> >
>>>>> > --
>>>>> > em rnowl...@gmail.com
>>>>> > c 954.496.2314
>>>>>
>>>
>>> --
>>> em rnowl...@gmail.com
>>> c 954.496.2314
>>>
>>
>
> --
> em rnowl...@gmail.com
> c 954.496.2314
>
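For anyone trying to reproduce the fine-grained failure discussed above, here is a rough, hypothetical sketch of the kind of job involved: many map tasks over an HDFS input, with spark.mesos.coarse left at its default of false. The master URL, input path, and job body are placeholders, not the actual workload from this thread:

import org.apache.spark.{SparkConf, SparkContext}

object ManyMapTasksRepro {
  def main(args: Array[String]): Unit = {
    // Placeholders: substitute the real Mesos master host and HDFS path.
    // Leaving spark.mesos.coarse unset keeps the default fine-grained mode.
    val conf = new SparkConf()
      .setAppName("ManyMapTasksRepro")
      .setMaster("mesos://<mesos-master>:5050")
    val sc = new SparkContext(conf)

    // A few gigabytes of text split across many HDFS blocks yields hundreds
    // of map tasks, in the same ballpark as the 978-task job described above.
    val lines = sc.textFile("hdfs:///path/to/a/few/gigabytes/of/logs")
    val totalChars = lines.map(_.length.toLong).reduce(_ + _)
    println("Total characters across " + lines.partitions.length + " partitions: " + totalChars)

    sc.stop()
  }
}

Whether this reliably triggers the duplicate registration presumably depends on the cluster, since Tim did not hit it in fine-grained mode while RJ and Gurvinder did.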