Re: Spark on Mesos 0.20
Hello,

Sorry for the delay (again); we were busy upgrading our cluster from MapR 3.0.x to MapR 3.1.1.26113.GA. I updated my builds to reference the native Hadoop libraries shipped with this distribution, and installed Snappy (I no longer see the "unable to load native hadoop libraries" warning, and I can see that the Snappy library loads).

I ran the example against a directory of Apache log files totaling about 4.4 GB, and things seem to work fine:

time MASTER=mesos://*:5050 /opt/spark/current/bin/run-example LogQuery maprfs:///user/hive/warehouse/apachelog/dt=20141017/16

14/10/18 05:23:21 INFO scheduler.DAGScheduler: Stage 0 (collect at LogQuery.scala:80) finished in 1.704 s
14/10/18 05:23:21 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool default
14/10/18 05:23:21 INFO spark.SparkContext: Job finished: collect at LogQuery.scala:80, took 40.533904277 s
(null,null,null) bytes=0 n=16682940

real 0m51.393s
user 0m19.130s
sys 0m4.120s

So this combination of software works fine for me:

Spark 1.1.0
Mesos 0.20.1
MapR 3.1.1.26113.GA (spark-1.1.0-bin-mapr3.tgz)

Note: one thing you might try is increasing your spark.executor.memory setting; mine was set to 8 GB in the spark-defaults.conf file.

Hope this helps,
Fairiz "Fi" Azizi

On Thu, Oct 9, 2014 at 11:35 PM, Gurvinder Singh gurvinder.si...@uninett.no wrote:

On 10/10/2014 06:11 AM, Fairiz Azizi wrote:

Hello, Sorry for the late reply. When I tried the LogQuery example this time, things now seem to be fine! ...
14/10/10 04:01:21 INFO scheduler.DAGScheduler: Stage 0 (collect at LogQuery.scala:80) finished in 0.429 s
14/10/10 04:01:21 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool defa
14/10/10 04:01:21 INFO spark.SparkContext: Job finished: collect at LogQuery.scala:80, took 12.802743914 s
(10.10.10.10,FRED,GET http://images.com/2013/Generic.jpg HTTP/1.1) bytes=621 n=2

Not sure if this is the correct response for that example. Our mesos/spark builds have been updated since I last wrote; possibly the JDK version was updated to 1.7.0_67. If you are using an older JDK, maybe try updating that?

I have tested on current JDK 7 and I am now running JDK 8; the problem still exists. Can you run LogQuery on data of, say, 100+ GB, so that you have more map tasks? We start to see the issue on larger tasks.

- Gurvinder

- Fi
Fairiz "Fi" Azizi

On Wed, Oct 8, 2014 at 7:54 AM, RJ Nowling rnowl...@gmail.com wrote:

Yep! That's the example I was talking about. Is an error message printed when it hangs? I get:

14/09/30 13:23:14 ERROR BlockManagerMasterActor: Got two different block manager registrations on 20140930-131734-1723727882-5050-1895-1

On Tue, Oct 7, 2014 at 8:36 PM, Fairiz Azizi code...@gmail.com wrote:

Sure, could you point me to the example? The only thing I could find was
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/LogQuery.scala

So do you mean running it like:

MASTER=mesos://xxx:5050 ./run-example LogQuery

I tried that, and I can see the job run and the tasks complete on the slave nodes, but the client process seems to hang forever; that is probably a different problem. BTW, only a dozen or so tasks kick off. I actually haven't done much with Scala and Spark (it's been all Python).
Fi
Fairiz "Fi" Azizi

On Tue, Oct 7, 2014 at 6:29 AM, RJ Nowling rnowl...@gmail.com wrote:

I was able to reproduce it on a small 4-node cluster (1 Mesos master and 3 Mesos slaves) with relatively low-end specs. As I said, I just ran the log query examples with the fine-grained Mesos mode. Spark 1.1.0 and Mesos 0.20.1.

Fairiz, could you try running the LogQuery example included with Spark and see what you get? Thanks!

On Mon, Oct 6, 2014 at 8:07 PM, Fairiz Azizi code...@gmail.com wrote:

That's what's great about Spark, the community is so active! :)

I compiled Mesos 0.20.1 from the source tarball. I am using the MapR3 Spark 1.1.0 distribution from the Spark downloads page (spark-1.1.0-bin-mapr3.tgz). I see no problems for the workloads we are trying. However, the cluster is small (fewer than 100 cores across 3 nodes). The workloads read in just a few gigabytes from HDFS, via an ipython notebook spark shell.
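[Editor's note: Fi's message above suggests raising spark.executor.memory in spark-defaults.conf. A minimal sketch of that change follows; the /tmp/spark-demo path is a stand-in for a real Spark home such as /opt/spark/current, and 8g is the value Fi mentions.]

```shell
# Stand-in Spark home for illustration only (a real install would be
# something like /opt/spark/current).
SPARK_HOME=/tmp/spark-demo
mkdir -p "$SPARK_HOME/conf"

# spark-defaults.conf is read by spark-submit and bin/run-example at launch.
cat >> "$SPARK_HOME/conf/spark-defaults.conf" <<'EOF'
spark.executor.memory   8g
EOF

# Show the resulting setting.
grep 'spark.executor.memory' "$SPARK_HOME/conf/spark-defaults.conf"
```

With the setting in place, the example would be relaunched the same way as in the thread (MASTER=mesos://<master>:5050 bin/run-example LogQuery <path>); the same property can also be passed per job via spark-submit's --conf flag.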
Re: Spark on Mesos 0.20
Hello,

Sorry for the late reply. When I tried the LogQuery example this time, things now seem to be fine! ...

14/10/10 04:01:21 INFO scheduler.DAGScheduler: Stage 0 (collect at LogQuery.scala:80) finished in 0.429 s
14/10/10 04:01:21 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool defa
14/10/10 04:01:21 INFO spark.SparkContext: Job finished: collect at LogQuery.scala:80, took 12.802743914 s
(10.10.10.10,FRED,GET http://images.com/2013/Generic.jpg HTTP/1.1) bytes=621 n=2

Not sure if this is the correct response for that example. Our mesos/spark builds have been updated since I last wrote; possibly the JDK version was updated to 1.7.0_67. If you are using an older JDK, maybe try updating that?

- Fi
Fairiz "Fi" Azizi

On Wed, Oct 8, 2014 at 7:54 AM, RJ Nowling rnowl...@gmail.com wrote:

Yep! That's the example I was talking about. Is an error message printed when it hangs? I get:

14/09/30 13:23:14 ERROR BlockManagerMasterActor: Got two different block manager registrations on 20140930-131734-1723727882-5050-1895-1

On Tue, Oct 7, 2014 at 8:36 PM, Fairiz Azizi code...@gmail.com wrote:

Sure, could you point me to the example? The only thing I could find was
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/LogQuery.scala

So do you mean running it like:

MASTER=mesos://xxx:5050 ./run-example LogQuery

I tried that, and I can see the job run and the tasks complete on the slave nodes, but the client process seems to hang forever; that is probably a different problem. BTW, only a dozen or so tasks kick off. I actually haven't done much with Scala and Spark (it's been all Python).

Fi
Fairiz "Fi" Azizi

On Tue, Oct 7, 2014 at 6:29 AM, RJ Nowling rnowl...@gmail.com wrote:

I was able to reproduce it on a small 4-node cluster (1 Mesos master and 3 Mesos slaves) with relatively low-end specs. As I said, I just ran the log query examples with the fine-grained Mesos mode.
Spark 1.1.0 and Mesos 0.20.1.

Fairiz, could you try running the LogQuery example included with Spark and see what you get? Thanks!

On Mon, Oct 6, 2014 at 8:07 PM, Fairiz Azizi code...@gmail.com wrote:

That's what's great about Spark, the community is so active! :)

I compiled Mesos 0.20.1 from the source tarball. I am using the MapR3 Spark 1.1.0 distribution from the Spark downloads page (spark-1.1.0-bin-mapr3.tgz). I see no problems for the workloads we are trying. However, the cluster is small (fewer than 100 cores across 3 nodes). The workloads read in just a few gigabytes from HDFS, via an ipython notebook spark shell.

thanks,
Fairiz "Fi" Azizi

On Mon, Oct 6, 2014 at 9:20 AM, Timothy Chen tnac...@gmail.com wrote:

Ok, I created SPARK-3817 to track this; I will try to repro it as well.

Tim

On Mon, Oct 6, 2014 at 6:08 AM, RJ Nowling rnowl...@gmail.com wrote:

I've recently run into this issue as well. I get it from running Spark examples such as log query. Maybe that'll help reproduce the issue.

On Monday, October 6, 2014, Gurvinder Singh gurvinder.si...@uninett.no wrote:

The issue does not occur if the task at hand has a small number of map tasks. I have a task which has 978 map tasks, and I see this error:

14/10/06 09:34:40 ERROR BlockManagerMasterActor: Got two different block manager registrations on 20140711-081617-711206558-5050-2543-5

Here is the log from the mesos-slave where this container was running:
http://pastebin.com/Q1Cuzm6Q

If you look at the code where Spark produces this error, you will see that it simply exits, saying in the comments "this should never happen, lets just quit" :-)

- Gurvinder

On 10/06/2014 09:30 AM, Timothy Chen wrote:

(Hit enter too soon...)

What is your setup and steps to repro this?

Tim

On Mon, Oct 6, 2014 at 12:30 AM, Timothy Chen tnac...@gmail.com wrote:

Hi Gurvinder,

I tried fine-grained mode before and didn't get into that problem.
On Sun, Oct 5, 2014 at 11:44 PM, Gurvinder Singh gurvinder.si...@uninett.no wrote:

On 10/06/2014 08:19 AM, Fairiz Azizi wrote:

The Spark online docs indicate that Spark is compatible with Mesos 0.18.1. I've gotten it to work just fine on 0.18.1 and 0.18.2. Has anyone tried Spark on a newer version of Mesos, i.e. Mesos v0.20.0?

-Fi

Yeah, we are using Spark 1.1.0 with Mesos 0.20.1. It runs fine in coarse-grained mode; in fine-grained mode there is an issue with BlockManager name conflicts. I have been waiting for it to be fixed, but it is still there.

-Gurvinder

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
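[Editor's note: Gurvinder's observation that coarse-grained mode runs fine suggests a workaround while the fine-grained BlockManager conflict is open: switch modes via the documented spark.mesos.coarse property. A sketch follows; /tmp/spark-demo is a stand-in for a real Spark home.]

```shell
# Stand-in Spark home for illustration (assumption, not a real install path).
SPARK_HOME=/tmp/spark-demo
mkdir -p "$SPARK_HOME/conf"

# spark.mesos.coarse=true makes Spark hold one long-lived Mesos task per
# slave instead of one Mesos task per Spark task (fine-grained mode).
cat >> "$SPARK_HOME/conf/spark-defaults.conf" <<'EOF'
spark.mesos.coarse   true
EOF

grep 'spark.mesos.coarse' "$SPARK_HOME/conf/spark-defaults.conf"
```

Coarse-grained mode trades the fine-grained mode's dynamic per-task resource sharing for fixed reservations, so it avoids the repeated executor registrations implicated in the error above.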
Re: Spark on Mesos 0.20
Sure, could you point me to the example? The only thing I could find was
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/LogQuery.scala

So do you mean running it like:

MASTER=mesos://xxx:5050 ./run-example LogQuery

I tried that, and I can see the job run and the tasks complete on the slave nodes, but the client process seems to hang forever; that is probably a different problem. BTW, only a dozen or so tasks kick off. I actually haven't done much with Scala and Spark (it's been all Python).

Fi
Fairiz "Fi" Azizi

On Tue, Oct 7, 2014 at 6:29 AM, RJ Nowling rnowl...@gmail.com wrote:

I was able to reproduce it on a small 4-node cluster (1 Mesos master and 3 Mesos slaves) with relatively low-end specs. As I said, I just ran the log query examples with the fine-grained Mesos mode. Spark 1.1.0 and Mesos 0.20.1.

Fairiz, could you try running the LogQuery example included with Spark and see what you get? Thanks!

On Mon, Oct 6, 2014 at 8:07 PM, Fairiz Azizi code...@gmail.com wrote:

That's what's great about Spark, the community is so active! :)

I compiled Mesos 0.20.1 from the source tarball. I am using the MapR3 Spark 1.1.0 distribution from the Spark downloads page (spark-1.1.0-bin-mapr3.tgz). I see no problems for the workloads we are trying. However, the cluster is small (fewer than 100 cores across 3 nodes). The workloads read in just a few gigabytes from HDFS, via an ipython notebook spark shell.

thanks,
Fairiz "Fi" Azizi

On Mon, Oct 6, 2014 at 9:20 AM, Timothy Chen tnac...@gmail.com wrote:

Ok, I created SPARK-3817 to track this; I will try to repro it as well.

Tim

On Mon, Oct 6, 2014 at 6:08 AM, RJ Nowling rnowl...@gmail.com wrote:

I've recently run into this issue as well. I get it from running Spark examples such as log query. Maybe that'll help reproduce the issue.

On Monday, October 6, 2014, Gurvinder Singh gurvinder.si...@uninett.no wrote:

The issue does not occur if the task at hand has a small number of map tasks.
I have a task which has 978 map tasks, and I see this error:

14/10/06 09:34:40 ERROR BlockManagerMasterActor: Got two different block manager registrations on 20140711-081617-711206558-5050-2543-5

Here is the log from the mesos-slave where this container was running:
http://pastebin.com/Q1Cuzm6Q

If you look at the code where Spark produces this error, you will see that it simply exits, saying in the comments "this should never happen, lets just quit" :-)

- Gurvinder

On 10/06/2014 09:30 AM, Timothy Chen wrote:

(Hit enter too soon...)

What is your setup and steps to repro this?

Tim

On Mon, Oct 6, 2014 at 12:30 AM, Timothy Chen tnac...@gmail.com wrote:

Hi Gurvinder,

I tried fine-grained mode before and didn't get into that problem.

On Sun, Oct 5, 2014 at 11:44 PM, Gurvinder Singh gurvinder.si...@uninett.no wrote:

On 10/06/2014 08:19 AM, Fairiz Azizi wrote:

The Spark online docs indicate that Spark is compatible with Mesos 0.18.1. I've gotten it to work just fine on 0.18.1 and 0.18.2. Has anyone tried Spark on a newer version of Mesos, i.e. Mesos v0.20.0?

-Fi

Yeah, we are using Spark 1.1.0 with Mesos 0.20.1. It runs fine in coarse-grained mode; in fine-grained mode there is an issue with BlockManager name conflicts. I have been waiting for it to be fixed, but it is still there.

-Gurvinder

--
em rnowl...@gmail.com
c 954.496.2314
Spark on Mesos 0.20
The Spark online docs indicate that Spark is compatible with Mesos 0.18.1. I've gotten it to work just fine on 0.18.1 and 0.18.2. Has anyone tried Spark on a newer version of Mesos, i.e. Mesos v0.20.0?

-Fi
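[Editor's note: whatever the Mesos version, the wiring Spark 1.x needs to talk to a Mesos master is the same per the Spark docs: spark-env.sh points at the Mesos native library, and slaves fetch a Spark build from a shared location. A sketch follows; /tmp/spark-demo, the libmesos.so path, and the HDFS URL are all illustrative stand-ins that vary by install.]

```shell
# Stand-in Spark home; a real deployment would edit its own conf/spark-env.sh.
SPARK_HOME=/tmp/spark-demo
mkdir -p "$SPARK_HOME/conf"

cat >> "$SPARK_HOME/conf/spark-env.sh" <<'EOF'
# Location of the Mesos native library on each node (install-specific).
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
# A Spark build every slave can download to launch executors.
export SPARK_EXECUTOR_URI=hdfs://namenode/tmp/spark-1.1.0-bin-mapr3.tgz
EOF

grep 'export' "$SPARK_HOME/conf/spark-env.sh"
```

Jobs are then submitted with a mesos://<master-host>:5050 master URL, as in the commands quoted elsewhere in this thread.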
Re: Spark on Mesos 0.20
That's what's great about Spark, the community is so active! :)

I compiled Mesos 0.20.1 from the source tarball. I am using the MapR3 Spark 1.1.0 distribution from the Spark downloads page (spark-1.1.0-bin-mapr3.tgz). I see no problems for the workloads we are trying. However, the cluster is small (fewer than 100 cores across 3 nodes). The workloads read in just a few gigabytes from HDFS, via an ipython notebook spark shell.

thanks,
Fairiz "Fi" Azizi

On Mon, Oct 6, 2014 at 9:20 AM, Timothy Chen tnac...@gmail.com wrote:

Ok, I created SPARK-3817 to track this; I will try to repro it as well.

Tim

On Mon, Oct 6, 2014 at 6:08 AM, RJ Nowling rnowl...@gmail.com wrote:

I've recently run into this issue as well. I get it from running Spark examples such as log query. Maybe that'll help reproduce the issue.

On Monday, October 6, 2014, Gurvinder Singh gurvinder.si...@uninett.no wrote:

The issue does not occur if the task at hand has a small number of map tasks. I have a task which has 978 map tasks, and I see this error:

14/10/06 09:34:40 ERROR BlockManagerMasterActor: Got two different block manager registrations on 20140711-081617-711206558-5050-2543-5

Here is the log from the mesos-slave where this container was running:
http://pastebin.com/Q1Cuzm6Q

If you look at the code where Spark produces this error, you will see that it simply exits, saying in the comments "this should never happen, lets just quit" :-)

- Gurvinder

On 10/06/2014 09:30 AM, Timothy Chen wrote:

(Hit enter too soon...)

What is your setup and steps to repro this?

Tim

On Mon, Oct 6, 2014 at 12:30 AM, Timothy Chen tnac...@gmail.com wrote:

Hi Gurvinder,

I tried fine-grained mode before and didn't get into that problem.
On Sun, Oct 5, 2014 at 11:44 PM, Gurvinder Singh gurvinder.si...@uninett.no wrote:

On 10/06/2014 08:19 AM, Fairiz Azizi wrote:

The Spark online docs indicate that Spark is compatible with Mesos 0.18.1. I've gotten it to work just fine on 0.18.1 and 0.18.2. Has anyone tried Spark on a newer version of Mesos, i.e. Mesos v0.20.0?

-Fi

Yeah, we are using Spark 1.1.0 with Mesos 0.20.1. It runs fine in coarse-grained mode; in fine-grained mode there is an issue with BlockManager name conflicts. I have been waiting for it to be fixed, but it is still there.

-Gurvinder