Re: Hadoop on Mesos: FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler

2017-08-01 Thread tommy xiao
Traiano,

I don't use it anymore either; I just wanted to share it with you.

2017-07-31 2:27 GMT+08:00 Traiano Welcome :

> Hi Tommy
>
> On Sun, Jul 30, 2017 at 9:37 PM, tommy xiao  wrote:
>
>> why not use Myriad?
>>
>> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Home
>>
>
>
> I'm in doubt about the future of this project. I'm told it's likely to be
> discontinued soon due to the lack of contributors.
> In any case - have you perhaps seen a successful deployment of this?
>
>>
>>
>>
>> 2017-07-23 17:27 GMT+08:00 Traiano Welcome :
>>
>>>
>>> Hi List!
>>>
>>> I'm working on configuring hadoop to use the mesos scheduler, using the
>>> procedure outlined in "Apache Mesos Essentials" here:
>>>
>>> https://pastebin.com/y1ERJZqq
>>>
>>> Currently I've a 3 node mesos cluster, with an HDFS namenode
>>> communicating successfully with two HDFS data nodes. However, when I try to
>>> start up the jobtracker it fails with the following error:
>>>
>>> 17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the
>>> TaskScheduler
>>> java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobQueueTaskScheduler
>>>
>>> Some more context around the error:
>>>
>>>  17/07/22 18:44:38 INFO mapred.CompletedJobStatusStore: Completed job
>>> store is inactive
>>>  17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the
>>> TaskScheduler
>>>  java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobQueueTaskScheduler
>>>
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:359)
>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:312)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>> at java.lang.Class.forName0(Native Method)
>>> at java.lang.Class.forName(Class.java:195)
>>> at org.apache.hadoop.mapred.MesosScheduler.start(MesosScheduler.java:160)
>>> at org.apache.hadoop.mapred.JobTracker.offerService(JobTracker.java:2186)
>>> at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4548)
>>> 17/07/22 18:44:38 INFO mapred.JobTracker: SHUTDOWN_MSG:
>>>
>>> Is there some way I could debug this further to trace the root cause of
>>> this error?
>>>
>>> Here is a full paste of the debug output when starting up the jobtracker:
>>>
>>>  https://pastebin.com/a61wN4vQ
>>>
>>>
>>>
>>
>>
>> --
>> Deshi Xiao
>> Twitter: xds2000
>> E-mail: xiaods(AT)gmail.com
>>
>
>


-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com


Re: Hadoop on Mesos: FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler

2017-07-30 Thread Traiano Welcome
Hi Tommy

On Sun, Jul 30, 2017 at 9:37 PM, tommy xiao  wrote:

> why not use Myriad?
>
> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Home
>


I'm in doubt about the future of this project. I'm told it's likely to be
discontinued soon due to the lack of contributors.
In any case - have you perhaps seen a successful deployment of this?

>
>
>
> 2017-07-23 17:27 GMT+08:00 Traiano Welcome :
>
>>
>> Hi List!
>>
>> I'm working on configuring hadoop to use the mesos scheduler, using the
>> procedure outlined in "Apache Mesos Essentials" here:
>>
>> https://pastebin.com/y1ERJZqq
>>
>> Currently I've a 3 node mesos cluster, with an HDFS namenode
>> communicating successfully with two HDFS data nodes. However, when I try to
>> start up the jobtracker it fails with the following error:
>>
>> 17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the
>> TaskScheduler
>> java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobQueueTaskScheduler
>>
>> Some more context around the error:
>>
>>  17/07/22 18:44:38 INFO mapred.CompletedJobStatusStore: Completed job
>> store is inactive
>>  17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the
>> TaskScheduler
>>  java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobQueueTaskScheduler
>>
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:359)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:312)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:195)
>> at org.apache.hadoop.mapred.MesosScheduler.start(MesosScheduler.java:160)
>> at org.apache.hadoop.mapred.JobTracker.offerService(JobTracker.java:2186)
>> at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4548)
>> 17/07/22 18:44:38 INFO mapred.JobTracker: SHUTDOWN_MSG:
>>
>> Is there some way I could debug this further to trace the root cause of
>> this error?
>>
>> Here is a full paste of the debug output when starting up the jobtracker:
>>
>>  https://pastebin.com/a61wN4vQ
>>
>>
>>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>


Re: Hadoop on Mesos: FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler

2017-07-30 Thread tommy xiao
why not use Myriad?

https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Home


2017-07-23 17:27 GMT+08:00 Traiano Welcome :

>
> Hi List!
>
> I'm working on configuring hadoop to use the mesos scheduler, using the
> procedure outlined in "Apache Mesos Essentials" here:
>
> https://pastebin.com/y1ERJZqq
>
> Currently I've a 3 node mesos cluster, with an HDFS namenode communicating
> successfully with two HDFS data nodes. However, when I try to start up the
> jobtracker it fails with the following error:
>
> 17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the
> TaskScheduler
> java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobQueueTaskScheduler
>
> Some more context around the error:
>
>  17/07/22 18:44:38 INFO mapred.CompletedJobStatusStore: Completed job
> store is inactive
>  17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the
> TaskScheduler
>  java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobQueueTaskScheduler
>
> at java.net.URLClassLoader$1.run(URLClassLoader.java:359)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:312)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:195)
> at org.apache.hadoop.mapred.MesosScheduler.start(MesosScheduler.java:160)
> at org.apache.hadoop.mapred.JobTracker.offerService(JobTracker.java:2186)
> at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4548)
> 17/07/22 18:44:38 INFO mapred.JobTracker: SHUTDOWN_MSG:
>
> Is there some way I could debug this further to trace the root cause of
> this error?
>
> Here is a full paste of the debug output when starting up the jobtracker:
>
>  https://pastebin.com/a61wN4vQ
>
>
>


-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com


Re: How to deploy Hadoop on Mesos

2017-07-28 Thread Traiano Welcome
Hadoop definitely seems to be on the list of frameworks for mesos:

http://mesos.apache.org/documentation/latest/frameworks/

Has anyone recently tested getting it to work?




On Thu, Jul 27, 2017 at 5:39 PM, Stephen Gran 
wrote:

> Hi,
>
> On 27/07/17 13:54, Traiano Welcome wrote:
> > Hi Stephen
> >
> >
> > On Thu, Jul 27, 2017 at 12:19 PM, Stephen Gran 
> wrote:
> > Both spark and flink integrate natively with mesos, so no need for an
> > intermediate yarn layer.  For batch work, we're looking at the aurora
> > project for job scheduling.
> >
> >
> >
> > I haven't looked at Aurora before - would you consider it a drop in
> > replacement for hadoop for distributed batch workloads?
>
> It's definitely not a drop in replacement - they have very different
> APIs and capabilities.  What aurora gives us is a DSL to build the DAG
> of an execution, and with a little work, some primitives to run those
> executions.  So, the functionality ends up being similar for 'just
> batch', but the language, the bindings, etc are all very different.
>
> Cheers,
> --
> Stephen Gran
> Senior Technical Architect
>
> picture the possibilities | piksel.com
>


Re: How to deploy Hadoop on Mesos

2017-07-27 Thread Stephen Gran
Hi,

On 27/07/17 13:54, Traiano Welcome wrote:
> Hi Stephen
> 
> 
> On Thu, Jul 27, 2017 at 12:19 PM, Stephen Gran  
> wrote:
> Both spark and flink integrate natively with mesos, so no need for an
> intermediate yarn layer.  For batch work, we're looking at the aurora
> project for job scheduling.
> 
> 
> 
> I haven't looked at Aurora before - would you consider it a drop in
> replacement for hadoop for distributed batch workloads?

It's definitely not a drop in replacement - they have very different
APIs and capabilities.  What aurora gives us is a DSL to build the DAG
of an execution, and with a little work, some primitives to run those
executions.  So, the functionality ends up being similar for 'just
batch', but the language, the bindings, etc are all very different.
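
For a flavour of that DSL, here is a minimal job sketch adapted from Aurora's
hello-world tutorial (the cluster, role and names are illustrative, not from
this thread):

# hello.aurora -- the shape of Aurora's Python-based DSL
hello = Process(name='hello', cmdline='echo hello from aurora')

hello_task = Task(
  processes=[hello],
  resources=Resources(cpu=1, ram=64 * MB, disk=64 * MB))

jobs = [Job(
  cluster='devcluster',
  role='www-data',
  environment='devel',
  name='hello',
  task=hello_task)]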

Cheers,
-- 
Stephen Gran
Senior Technical Architect

picture the possibilities | piksel.com


Re: How to deploy Hadoop on Mesos

2017-07-27 Thread Traiano Welcome
Hi Stephen


On Thu, Jul 27, 2017 at 12:19 PM, Stephen Gran <stephen.g...@piksel.com>
wrote:

> Hi,
>
> So typically people run two sorts of workloads on hadoop -
> ad-hoc/scheduled batch work, and stream workloads (spark, flink, etc.).
>
>

I'm definitely sure we'll be using hadoop for batch workloads.
We will integrate Spark with mesos for streaming workloads.

> Both spark and flink integrate natively with mesos, so no need for an
> intermediate yarn layer.  For batch work, we're looking at the aurora
> project for job scheduling.
>
>

I haven't looked at Aurora before - would you consider it a drop in
replacement for hadoop for distributed batch workloads?



> hadoop brings some interesting things, but I've not found integration
> with mesos to ever be pain-free, so we're moving to other tools instead
> of continuing down the path of trying to get hadoop working with mesos.
>
>

Understandably :-) I think I might take your advice here. Even if a one-time
integration of hadoop and mesos was successful, the pain of having to
keep the integration functional over time through rapid updates and code
changes between two unrelated project codebases would be a nightmare.


Good luck!
>
> On 27/07/17 08:50, Traiano Welcome wrote:
> > Hi Stephen
> >
> >
> > On Wed, Jul 26, 2017 at 5:18 PM, Stephen Gran <stephen.g...@piksel.com
> > <mailto:stephen.g...@piksel.com>> wrote:
> >
> > Hi,
> >
> > It is having discussions about whether to stop, as it's having
> trouble
> > getting enough contributors.
> >
> > I guess I'd ask what you need to run on hadoop, why you're looking at
> > mesos, and then see what else is in that space.
> >
> >
> >
> > I don't know what we'd need to run on hadoop at this point - it's open
> > ended, and for our developers to decide. However, should this make a
> > difference?
> >
> > We have mesos in place as a resource scheduler for a number of
> > frameworks and would like to resource manage it using the same
> > semantics, tools and mechanisms mesos provides.
> >
> > I've looked at two books so far that show how this is done, so it seems
> > this way of managing hadoop is in use in places (ref: "Apache Mesos
> > Essentials", "Mastering Mesos"), however these books are probably out of
> > date because the procedure they describe for integrating mesos and
> > hadoop is broken.
> >
> > Cheers,
> >
> > On 26/07/17 14:13, Brandon Gulla wrote:
> > > Have you looked into Apache Myriad?
> > >
> > > http://myriad.apache.org/
> > >
> > > On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <
> trai...@gmail.com <mailto:trai...@gmail.com>
> > > <mailto:trai...@gmail.com <mailto:trai...@gmail.com>>> wrote:
> > >
> > > Hi
> > >
> > > Would anyone know of some reliable guides to deploying  apache
> > > hadoop on top of the mesos scheduler?
> > >
> > > Thanks,
> > > Traiano
> > >
> > >
> > >
> > >
> > > --
> > > Brandon
> >
> > --
> > Stephen Gran
> > Senior Technical Architect
> >
> > picture the possibilities | piksel.com <http://piksel.com>
> >
> >
>
> --
> Stephen Gran
> Senior Technical Architect
>
> picture the possibilities | piksel.com
>


Re: How to deploy Hadoop on Mesos

2017-07-27 Thread Stephen Gran
Hi,

So typically people run two sorts of workloads on hadoop -
ad-hoc/scheduled batch work, and stream workloads (spark, flink, etc.).

Both spark and flink integrate natively with mesos, so no need for an
intermediate yarn layer.  For batch work, we're looking at the aurora
project for job scheduling.

hadoop brings some interesting things, but I've not found integration
with mesos to ever be pain-free, so we're moving to other tools instead
of continuing down the path of trying to get hadoop working with mesos.

Good luck!

On 27/07/17 08:50, Traiano Welcome wrote:
> Hi Stephen
> 
> 
> On Wed, Jul 26, 2017 at 5:18 PM, Stephen Gran <stephen.g...@piksel.com
> <mailto:stephen.g...@piksel.com>> wrote:
> 
> Hi,
> 
> It is having discussions about whether to stop, as it's having trouble
> getting enough contributors.
> 
> I guess I'd ask what you need to run on hadoop, why you're looking at
> mesos, and then see what else is in that space.
> 
> 
> 
> I don't know what we'd need to run on hadoop at this point - it's open
> ended, and for our developers to decide. However, should this make a
> difference?
> 
> We have mesos in place as a resource scheduler for a number of
> frameworks and would like to resource manage it using the same
> semantics, tools and mechanisms mesos provides.
> 
> I've looked at two books so far that show how this is done, so it seems
> this way of managing hadoop is in use in places (ref: "Apache Mesos
> Essentials", "Mastering Mesos"), however these books are probably out of
> date because the procedure they describe for integrating mesos and
> hadoop is broken.
> 
> Cheers,
> 
> On 26/07/17 14:13, Brandon Gulla wrote:
> > Have you looked into Apache Myriad?
> >
> > http://myriad.apache.org/
> >
> > On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <trai...@gmail.com 
> <mailto:trai...@gmail.com>
> > <mailto:trai...@gmail.com <mailto:trai...@gmail.com>>> wrote:
> >
> > Hi
> >
> > Would anyone know of some reliable guides to deploying  apache
> > hadoop on top of the mesos scheduler?
> >
> > Thanks,
> > Traiano
> >
> >
> >
> >
> > --
> > Brandon
> 
> --
> Stephen Gran
> Senior Technical Architect
> 
> picture the possibilities | piksel.com <http://piksel.com>
> 
> 

-- 
Stephen Gran
Senior Technical Architect

picture the possibilities | piksel.com


Re: How to deploy Hadoop on Mesos

2017-07-27 Thread Traiano Welcome
Hi Brandon

On Wed, Jul 26, 2017 at 5:13 PM, Brandon Gulla 
wrote:

> Have you looked into Apache Myriad?
>
> http://myriad.apache.org/
>


I took a brief look and thought "more flaky, half-cooked stuff that doesn't
work in production and will cause a system engineer a world of pain to get
working reliably." ... So no, avoid like the plague :-)




>
>
> On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome 
> wrote:
>
>> Hi
>>
>> Would anyone know of some reliable guides to deploying  apache hadoop on
>> top of the mesos scheduler?
>>
>> Thanks,
>> Traiano
>>
>
>
>
> --
> Brandon
>


Re: How to deploy Hadoop on Mesos

2017-07-27 Thread Traiano Welcome
Hi Stephen


On Wed, Jul 26, 2017 at 5:18 PM, Stephen Gran <stephen.g...@piksel.com>
wrote:

> Hi,
>
> It is having discussions about whether to stop, as it's having trouble
> getting enough contributors.
>
> I guess I'd ask what you need to run on hadoop, why you're looking at
> mesos, and then see what else is in that space.
>
>

I don't know what we'd need to run on hadoop at this point - it's open
ended, and for our developers to decide. However, should this make a
difference?

We have mesos in place as a resource scheduler for a number of frameworks
and would like to resource manage it using the same semantics, tools and
mechanisms mesos provides.

I've looked at two books so far that show how this is done, so it seems
this way of managing hadoop is in use in places (ref: "Apache Mesos
Essentials", "Mastering Mesos"), however these books are probably out of
date because the procedure they describe for integrating mesos and hadoop
is broken.

> Cheers,
>
> On 26/07/17 14:13, Brandon Gulla wrote:
> > Have you looked into Apache Myriad?
> >
> > http://myriad.apache.org/
> >
> > On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome <trai...@gmail.com
> > <mailto:trai...@gmail.com>> wrote:
> >
> > Hi
> >
> > Would anyone know of some reliable guides to deploying  apache
> > hadoop on top of the mesos scheduler?
> >
> > Thanks,
> > Traiano
> >
> >
> >
> >
> > --
> > Brandon
>
> --
> Stephen Gran
> Senior Technical Architect
>
> picture the possibilities | piksel.com
>


Re: How to deploy Hadoop on Mesos

2017-07-26 Thread Stephen Gran
Hi,

It is having discussions about whether to stop, as it's having trouble
getting enough contributors.

I guess I'd ask what you need to run on hadoop, why you're looking at
mesos, and then see what else is in that space.

Cheers,

On 26/07/17 14:13, Brandon Gulla wrote:
> Have you looked into Apache Myriad? 
> 
> http://myriad.apache.org/
> 
> On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome wrote:
> 
> Hi
> 
> Would anyone know of some reliable guides to deploying  apache
> hadoop on top of the mesos scheduler?
> 
> Thanks,
> Traiano
> 
> 
> 
> 
> -- 
> Brandon

-- 
Stephen Gran
Senior Technical Architect

picture the possibilities | piksel.com


Re: How to deploy Hadoop on Mesos

2017-07-26 Thread Brandon Gulla
Have you looked into Apache Myriad?

http://myriad.apache.org/

On Wed, Jul 26, 2017 at 4:12 AM, Traiano Welcome  wrote:

> Hi
>
> Would anyone know of some reliable guides to deploying  apache hadoop on
> top of the mesos scheduler?
>
> Thanks,
> Traiano
>



-- 
Brandon


How to deploy Hadoop on Mesos

2017-07-26 Thread Traiano Welcome
Hi

Would anyone know of some reliable guides to deploying Apache Hadoop on
top of the Mesos scheduler?

Thanks,
Traiano


Hadoop on Mesos: FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler

2017-07-23 Thread Traiano Welcome
Hi List!

I'm working on configuring hadoop to use the mesos scheduler, using the
procedure outlined in "Apache Mesos Essentials" here:

https://pastebin.com/y1ERJZqq

Currently I've a 3 node mesos cluster, with an HDFS namenode communicating
successfully with two HDFS data nodes. However, when I try to start up the
jobtracker it fails with the following error:

17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the
TaskScheduler
java.lang.ClassNotFoundException:
org.apache.hadoop.mapred.JobQueueTaskScheduler

Some more context around the error:

 17/07/22 18:44:38 INFO mapred.CompletedJobStatusStore: Completed job store
is inactive
 17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the
TaskScheduler
 java.lang.ClassNotFoundException:
org.apache.hadoop.mapred.JobQueueTaskScheduler

at java.net.URLClassLoader$1.run(URLClassLoader.java:359)
at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:312)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:195)
at org.apache.hadoop.mapred.MesosScheduler.start(MesosScheduler.java:160)
at org.apache.hadoop.mapred.JobTracker.offerService(JobTracker.java:2186)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4548)
17/07/22 18:44:38 INFO mapred.JobTracker: SHUTDOWN_MSG:

Is there some way I could debug this further to trace the root cause of
this error?

Here is a full paste of the debug output when starting up the jobtracker:

 https://pastebin.com/a61wN4vQ
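
One way to narrow this down is to check whether the missing class exists in
any jar the JobTracker can actually see. A minimal sketch, assuming
HADOOP_HOME points at the same install the jobtracker was started from:

for jar in $(find "$HADOOP_HOME" -name '*.jar'); do
  # list each jar's contents and look for the MRv1 scheduler class
  if unzip -l "$jar" 2>/dev/null | grep -q 'mapred/JobQueueTaskScheduler.class'; then
    echo "found in: $jar"
  fi
done

If nothing turns up, the hadoop build being run likely doesn't ship the
MRv1 (JobTracker-era) classes at all, which would explain the
ClassNotFoundException.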


Hadoop on Mesos Memory Configuration Question

2015-10-01 Thread Ajit Jagdale
Hi all,

I'm new to Mesos and to using Hadoop over Mesos.  I've been trying to
determine if Mesos memory configurations are affecting the memory that I
allocate to Hadoop mappers and reducers (in Hadoop's mapred-site.xml
file).  When I set values for the mappers, something seems to interfere with
allocating that memory.

Cluster setup:
- 1 master node and 6 slave nodes
- There is no /etc/mesos-slave/resource file, so memory is configured by
Mesos.  My understanding of this is that since there are no explicit memory
settings on each slave node, Mesos is giving the asking application
(Hadoop) all of the available memory minus 1GB for running the OS.

But there still must be some Mesos memory configuration somewhere, right?
Something that knows how big a slice of memory is.  I'm not sure where that is.

Any suggestions on how Mesos' memory allocation process could be affecting
Hadoop's memory allocation would be appreciated.

Thanks,
Ajit
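
For what it's worth, auto-detection can be taken out of the picture by
advertising resources explicitly on each slave. A minimal sketch, assuming the
Mesosphere packages (which read flag files from /etc/mesos-slave/):

# advertise exactly 4 CPUs and 6GB of RAM to frameworks on this slave
echo 'cpus:4;mem:6144' | sudo tee /etc/mesos-slave/resources
sudo service mesos-slave restart

Whatever mapred.child.java.opts asks for still has to fit inside the slot
sizes the framework derives from these advertised resources.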


Re: Hadoop on Mesos. HDFS question.

2015-07-09 Thread Adam Bordelon
Blatant product plug: The easiest way to run hdfs-mesos on Mesos (using
Marathon) is to launch a DCOS (EE) cluster. https://mesosphere.com/product/
You might also want to look at the custom DCOS config in
https://github.com/mesosphere/hdfs/tree/master/example-conf/mesosphere-dcos

For basic instructions on running an app in Marathon, see
https://mesosphere.github.io/marathon/docs/application-basics.html
See the marathon.json in DCOS Universe:
https://github.com/mesosphere/universe/blob/version-1.x/repo/packages/H/hdfs/0/marathon.json
Just replace any of the {{moustache}} variables with values like the
defaults in
https://github.com/mesosphere/universe/blob/version-1.x/repo/packages/H/hdfs/0/config.json
You can also simplify the command to: cd hdfs-mesos* && ./bin/hdfs-mesos
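
For the Marathon route, something like the following would register the
scheduler as a long-running app (a sketch only; the Marathon host, resource
numbers and tarball URL below are placeholders, adapted from the universe
marathon.json above):

curl -X POST http://marathon.example.com:8080/v2/apps \
  -H 'Content-Type: application/json' \
  -d '{
        "id": "/hdfs-mesos",
        "cmd": "cd hdfs-mesos* && ./bin/hdfs-mesos",
        "cpus": 0.5,
        "mem": 1024,
        "instances": 1,
        "uris": ["http://example.com/hdfs-mesos-0.1.1.tgz"]
      }'

Marathon will then restart the scheduler elsewhere if it or its node dies,
which is the HA behaviour described in the quoted message below.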

On Mon, Jul 6, 2015 at 2:19 PM, Kk Bk kkbr...@gmail.com wrote:

 Adam

 I would like to choose option 2. Can you provide pointers as to how to run
 hdfs-mesos using Marathon?

 -Bhargav

 On Sun, Jul 5, 2015 at 10:53 PM, Adam Bordelon a...@mesosphere.io wrote:

 Kk,

 There are two options for running the hdfs framework on Mesos.
 - If you already have the hadoop/hdfs binaries on all your nodes, you can
 follow the instructions in
 https://github.com/mesosphere/hdfs#if-you-have-hadoop-pre-installed-in-your-cluster
 to tell the scheduler to use the preinstalled NN/DN binaries.
 - Otherwise, you can run the hdfs framework scheduler `bin/hdfs-mesos`
 on any node that can reach the Mesos master and slaves, and it can serve
 out the binaries itself. Note that this node may not necessarily be the
 same node on which either of the namenodes end up running. Some choose to
 run the hdfs-mesos scheduler on a Mesos master node, but you can achieve
 framework scheduler HA if you run it via another framework like Marathon
 that can restart the scheduler (elsewhere) if it or its node dies. See
 example (templatized) Marathon json in
 https://github.com/mesosphere/universe/tree/version-1.x/repo/packages/H/hdfs/0

 On Fri, Jul 3, 2015 at 11:31 AM, Kk Bk kkbr...@gmail.com wrote:

 Thanks guys for the response.

 1) I use trusty. Seems like CDH4 does not have support for Trusty.

 2) Followed instructions as per link https://github.com/mesosphere/hdfs.
 Able to build hdfs-mesos-*.tgz

 Should I copy this file to all nodes (I have a multi-node Mesos cluster),
 or just the Mesos master node where I plan to keep the namenode for
 Hadoop?

 On Fri, Jul 3, 2015 at 8:34 AM, Tom Arnfeld t...@duedil.com wrote:

  It might be worth taking a look at the install documentation on the
 Hadoop on Mesos product here; https://github.com/mesos/hadoop

 For our installations I don't think we really do much more than
 installing the apt packages you mentioned and then installing the
 hadoop-mesos jars.. plus adding the appropriate configuration.

 On Friday, Jul 3, 2015 at 3:52 pm, Kk Bk kkbr...@gmail.com, wrote:

 I am trying to install Hadoop on Mesos on ubuntu servers, So followed
 instruction as per link
 https://open.mesosphere.com/tutorials/run-hadoop-on-mesos/#step-2.

 Step-2 of link says to install HDFS using as per link
 http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_4_4.html
 .

 Question: Is it sufficient to run following commands

 1) On Namenode: sudo apt-get install hadoop-hdfs-namenode
 2) On Datanode: sudo apt-get install hadoop-0.20-mapreduce-tasktracker
 hadoop-hdfs-datanode

 Or just follow the instructions on the mesosphere link that installs
 HDFS ?


Re: Hadoop on Mesos. HDFS question.

2015-07-05 Thread Adam Bordelon
Kk,

There are two options for running the hdfs framework on Mesos.
- If you already have the hadoop/hdfs binaries on all your nodes, you can
follow the instructions in
https://github.com/mesosphere/hdfs#if-you-have-hadoop-pre-installed-in-your-cluster
to tell the scheduler to use the preinstalled NN/DN binaries.
- Otherwise, you can run the hdfs framework scheduler `bin/hdfs-mesos` on
any node that can reach the Mesos master and slaves, and it can serve out
the binaries itself. Note that this node may not necessarily be the same
node on which either of the namenodes end up running. Some choose to run
the hdfs-mesos scheduler on a Mesos master node, but you can achieve
framework scheduler HA if you run it via another framework like Marathon
that can restart the scheduler (elsewhere) if it or its node dies. See
example (templatized) Marathon json in
https://github.com/mesosphere/universe/tree/version-1.x/repo/packages/H/hdfs/0

On Fri, Jul 3, 2015 at 11:31 AM, Kk Bk kkbr...@gmail.com wrote:

 Thanks guys for the response.

 1) I use trusty. Seems like CDH4 does not have support for Trusty.

 2) Followed instructions as per link https://github.com/mesosphere/hdfs.
 Able to build hdfs-mesos-*.tgz

 Should I copy this file to all nodes (I have a multi-node Mesos cluster), or
 just the Mesos master node where I plan to keep the namenode for Hadoop?

 On Fri, Jul 3, 2015 at 8:34 AM, Tom Arnfeld t...@duedil.com wrote:

  It might be worth taking a look at the install documentation on the
 Hadoop on Mesos product here; https://github.com/mesos/hadoop

 For our installations I don't think we really do much more than
 installing the apt packages you mentioned and then installing the
 hadoop-mesos jars.. plus adding the appropriate configuration.

 On Friday, Jul 3, 2015 at 3:52 pm, Kk Bk kkbr...@gmail.com, wrote:

 I am trying to install Hadoop on Mesos on ubuntu servers, So followed
 instruction as per link
 https://open.mesosphere.com/tutorials/run-hadoop-on-mesos/#step-2.

 Step-2 of link says to install HDFS using as per link
 http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_4_4.html
 .

 Question: Is it sufficient to run following commands

 1) On Namenode: sudo apt-get install hadoop-hdfs-namenode
 2) On Datanode: sudo apt-get install hadoop-0.20-mapreduce-tasktracker
 hadoop-hdfs-datanode

 Or just follow the instructions on the mesosphere link that installs
 HDFS ?


Re: Hadoop on Mesos. HDFS question.

2015-07-03 Thread haosdent
You just need to install HDFS via:

sudo apt-get install hadoop-hdfs-namenode hadoop-hdfs-secondarynamenode \
  hadoop-hdfs-datanode hadoop-client

Then continue to follow the Mesosphere link's steps; the Mesosphere link
doesn't contain instructions for installing HDFS.

On Fri, Jul 3, 2015 at 10:51 PM, Kk Bk kkbr...@gmail.com wrote:

 I am trying to install Hadoop on Mesos on ubuntu servers, So followed
 instruction as per link
 https://open.mesosphere.com/tutorials/run-hadoop-on-mesos/#step-2.

 Step-2 of link says to install HDFS using as per link
 http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_4_4.html
 .

 Question: Is it sufficient to run following commands

 1) On Namenode: sudo apt-get install hadoop-hdfs-namenode
 2) On Datanode: sudo apt-get install hadoop-0.20-mapreduce-tasktracker
 hadoop-hdfs-datanode

 Or just follow the instructions on the mesosphere link that installs HDFS ?

-- 
Best Regards,
Haosdent Huang


Hadoop on Mesos. HDFS question.

2015-07-03 Thread Kk Bk
I am trying to install Hadoop on Mesos on Ubuntu servers, so I followed the
instructions at
https://open.mesosphere.com/tutorials/run-hadoop-on-mesos/#step-2.

Step 2 of that link says to install HDFS as per
http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_4_4.html
.

Question: is it sufficient to run the following commands?

1) On the namenode: sudo apt-get install hadoop-hdfs-namenode
2) On the datanodes: sudo apt-get install hadoop-0.20-mapreduce-tasktracker
hadoop-hdfs-datanode

Or should I just follow the instructions on the mesosphere link that installs
HDFS?


Re: Hadoop on Mesos. HDFS question.

2015-07-03 Thread Tom Arnfeld
It might be worth taking a look at the install documentation on the Hadoop on 
Mesos product here; https://github.com/mesos/hadoop



For our installations I don't think we really do much more than installing the 
apt packages you mentioned and then installing the hadoop-mesos jars.. plus 
adding the appropriate configuration.

On Friday, Jul 3, 2015 at 3:52 pm, Kk Bk kkbr...@gmail.com, wrote:

I am trying to install Hadoop on Mesos on ubuntu servers, So followed 
instruction as per link 
https://open.mesosphere.com/tutorials/run-hadoop-on-mesos/#step-2.




Step-2 of link says to install HDFS using as per link 
http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_4_4.html.




Question: Is it sufficient to run following commands




1) On Namenode: sudo apt-get install hadoop-hdfs-namenode

2) On Datanode: sudo apt-get install hadoop-0.20-mapreduce-tasktracker 
hadoop-hdfs-datanode




Or just follow the instructions on the mesosphere link that installs HDFS ?

Re: Hadoop on Mesos. HDFS question.

2015-07-03 Thread Kk Bk
Thanks guys for the response.

1) I use trusty. Seems like CDH4 does not have support for Trusty.

2) Followed instructions as per link https://github.com/mesosphere/hdfs.
Able to build hdfs-mesos-*.tgz

Should I copy this file to all nodes (I have a multi-node Mesos cluster), or
just the Mesos master node where I plan to keep the namenode for Hadoop?

On Fri, Jul 3, 2015 at 8:34 AM, Tom Arnfeld t...@duedil.com wrote:

  It might be worth taking a look at the install documentation on the
 Hadoop on Mesos product here; https://github.com/mesos/hadoop

 For our installations I don't think we really do much more than installing
 the apt packages you mentioned and then installing the hadoop-mesos jars..
 plus adding the appropriate configuration.

 On Friday, Jul 3, 2015 at 3:52 pm, Kk Bk kkbr...@gmail.com, wrote:

 I am trying to install Hadoop on Mesos on ubuntu servers, So followed
 instruction as per link
 https://open.mesosphere.com/tutorials/run-hadoop-on-mesos/#step-2.

 Step-2 of link says to install HDFS using as per link
 http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_4_4.html
 .

 Question: Is it sufficient to run following commands

 1) On Namenode: sudo apt-get install hadoop-hdfs-namenode
 2) On Datanode: sudo apt-get install hadoop-0.20-mapreduce-tasktracker
 hadoop-hdfs-datanode

 Or just follow the instructions on the mesosphere link that installs HDFS
 ?


Re: hadoop on mesos odd issues with heartbeat and ghost task trackers.

2015-03-03 Thread Tom Arnfeld
Hi John,

Not sure if you ended up getting to the bottom of the issue, but often when
the scheduler gives up and hits this timeout it's because something funky
happened in mesos and the scheduler wasn't updated correctly. Could you
describe the state of mesos (with some logs too, if possible) while this
happens?

Tom.

On 25 February 2015 at 17:01, John Omernik j...@omernik.com wrote:

 I am running hadoop-on-mesos 0.0.8 on Mesos 0.21.0.  I am running into
 a weird issue where two of my nodes never really complete the check-in
 process when a task tracker is run on them: the jobtracker waits for
 their heartbeat, the trackers think they are running successfully, and
 tasks that would be assigned to them stay in a hung/pending state
 waiting for the heartbeat.

 Basically, in the jobtracker log I see the below (pending reduce tasks
 is 1 and inactive slots is 2, launched but no heartbeat yet), so the
 jobtracker just sits there waiting while the nodes think they're
 running fine.

 Is there a way to have the JobTracker give up on a task tracker
 sooner?  This waiting for timeout period seems odd.

 Thanks!

 (if there is any other information I can provide, please let me know)



 Job Tracker Log:

Pending Map Tasks: 0

Pending Reduce Tasks: 1

   Running Map Tasks: 0

Running Reduce Tasks: 0

  Idle Map Slots: 2

   Idle Reduce Slots: 0

  Inactive Map Slots: 2 (launched but no hearbeat yet)

   Inactive Reduce Slots: 2 (launched but no hearbeat yet)

Needed Map Slots: 0

 Needed Reduce Slots: 0

  Unhealthy Trackers: 0

 2015-02-25 10:57:01,930 INFO mapred.ResourcePolicy [Thread-1290]:
 Satisfied map and reduce slots needed.

 2015-02-25 10:57:02,083 INFO mapred.MesosScheduler [IPC Server handler
 7 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:31264.

 2015-02-25 10:57:02,097 INFO mapred.MesosScheduler [IPC Server handler
 0 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:50060.

 2015-02-25 10:57:02,148 INFO mapred.MesosScheduler [IPC Server handler
 4 on 7676]: Unknown/exited TaskTracker: http://moonman:31182.

 2015-02-25 10:57:02,392 INFO mapred.MesosScheduler [IPC Server handler
 1 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:31264.

 2015-02-25 10:57:02,403 INFO mapred.MesosScheduler [IPC Server handler
 3 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:50060.

 2015-02-25 10:57:02,459 INFO mapred.MesosScheduler [IPC Server handler
 6 on 7676]: Unknown/exited TaskTracker: http://moonman:31182.

 2015-02-25 10:57:02,702 INFO mapred.MesosScheduler [IPC Server handler
 4 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:31264.

 2015-02-25 10:57:02,714 INFO mapred.MesosScheduler [IPC Server handler
 5 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:50060.
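
On the give-up-sooner question above: stock MRv1 only declares a tasktracker
lost after mapred.tasktracker.expiry.interval, which defaults to 600000 ms
(10 minutes). Lowering it in the jobtracker's mapred-site.xml should shorten
that wait; a sketch (one minute; untested against the mesos framework):

<property>
  <name>mapred.tasktracker.expiry.interval</name>
  <value>60000</value>
</property>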



Re: Hadoop on Mesos

2015-01-30 Thread Alex
Hi Tom,

Thanks a lot for your reply, it's very helpful.

On 01/29/2015 05:54 PM, Tom Arnfeld wrote:
 Hi Alex,

 Great to hear you're hoping to use Hadoop on Mesos. We've been running
 it for a good 6 months and it's been awesome.

 I'll answer the simpler question first, running multiple job trackers
 should be just fine.. even multiple JTs with HA enabled (we do this).
 The mesos scheduler for Hadoop will ship all configuration options
 needed for each TaskTracker within mesos, so there's nothing you need
 to have specifically configured on each slave..

 # Slow slot allocations

 If you only have a few slaves, not many resources and a large amount
 of resources per slot, you might end up with a pretty small slot
 allocation (e.g 5 mappers and 1 reducer). Because of the nature of
 Hadoop, slots are static for each TaskTracker and the framework does a
 /best effort/ to figure out what balance of map/reduce slots to launch
 on the cluster.

 Because of this, the current stable version of the framework has a few
 issues when running on small clusters, especially when you don't
 configure min/max slot capacity for each JobTracker. Few links below

 - https://github.com/mesos/hadoop/issues/32
 - https://github.com/mesos/hadoop/issues/31
 - https://github.com/mesos/hadoop/issues/28
 - https://github.com/mesos/hadoop/issues/26

 Having said that, we've been working on a solution to this problem
 which enables Hadoop to launch different types of slots over the
 lifetime of a single job, meaning you could start with 5 maps and 1
 reduce, and then end with 0 maps and 6 reduce. It's not perfect, but
 it's a decent optimisation if you still need to use Hadoop.

 - https://github.com/mesos/hadoop/pull/33

 You may also want to look into how large your executor URI is (the one
 containing the hadoop source that gets downloaded for each task
 tracker) and how long that takes to download.. it might be that the
 task trackers are taking a while to bootstrap.

Do you have any idea of when your pull request will be merged? It looks
pretty interesting, even if we're just playing around at this point. Is
your hadoop-mesos-0.0.9.jar available for download somewhere, or do I
have to build it myself? In the meantime, I'm adding more slaves to see
if this makes the problem go away, at least for demos.


 # HA Hadoop JTs

 The framework currently does not support a full HA setup, however
 that's not a huge issue. The JT will automatically restart jobs where
 they left off on it's own when a failover occurs, but for the time
 being all the track trackers will be killed and new ones spawned.
 Depending on your setup, this could be a fairly negligible time.

I'm not sure I understand. I know task trackers will get restarted,
that's not what I'm worried about. The issue I see is with the JT: it's
started on one master only. If that master goes down, the framework goes
down. I was kind of hoping to be able to do something like this:

<property>
  <name>mapred.job.tracker</name>
  <value>zk://mesos01:2181,mesos02:2181,mesos03:2181/hadoop530</value>
</property>

Perhaps this doesn't actually work as I would expect. It doesn't look
like there's been any progress on issue #28, unfortunately...


 # Multiple versions of hadoop on the cluster

 This is totally fine, each JT configuration can be given it's own
 hadoop tar.gz file with the right version in it, and they will all
 happily share the Mesos cluster.

I guess you have to have multiple startup scripts for this, and also
multiple versions of Hadoop on the masters. Any pointers on how you've
set this up would be much appreciated.

Cheers,
Alex
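
On the multiple-versions question: with the mesos/hadoop framework each
jobtracker runs from its own conf directory, and the version its task trackers
get is whatever tarball its executor URI points at. A sketch of the
per-cluster knob (host, port and path are placeholders):

<property>
  <name>mapred.mesos.executor.uri</name>
  <value>hdfs://namenode:9000/tarballs/hadoop-2.5.0-cdh5.3.0-mesos.tar.gz</value>
</property>

Each startup script then just points HADOOP_CONF_DIR at the right directory
before starting its jobtracker.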


Re: Hadoop on Mesos

2015-01-29 Thread Tom Arnfeld
Hi Alex,

Great to hear you're hoping to use Hadoop on Mesos. We've been running it for a
good 6 months and it's been awesome.

I'll answer the simpler question first: running multiple job trackers should be
just fine, even multiple JTs with HA enabled (we do this). The mesos scheduler
for Hadoop will ship all configuration options needed for each TaskTracker
within mesos, so there's nothing you need to have specifically configured on
each slave.

# Slow slot allocations

If you only have a few slaves, not many resources and a large amount of
resources per slot, you might end up with a pretty small slot allocation (e.g. 5
mappers and 1 reducer). Because of the nature of Hadoop, slots are static for
each TaskTracker and the framework does a best effort to figure out what
balance of map/reduce slots to launch on the cluster.

Because of this, the current stable version of the framework has a few issues
when running on small clusters, especially when you don't configure min/max
slot capacity for each JobTracker. A few links below:

- https://github.com/mesos/hadoop/issues/32
- https://github.com/mesos/hadoop/issues/31
- https://github.com/mesos/hadoop/issues/28
- https://github.com/mesos/hadoop/issues/26

Having said that, we've been working on a solution to this problem which
enables Hadoop to launch different types of slots over the lifetime of a single
job, meaning you could start with 5 maps and 1 reduce, and then end with 0 maps
and 6 reduce. It's not perfect, but it's a decent optimisation if you still
need to use Hadoop.

- https://github.com/mesos/hadoop/pull/33
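
For the min/max slot capacity mentioned above, the knobs in the framework's
mapred-site.xml look like this (a sketch; the values are illustrative):

<property>
  <name>mapred.mesos.total.map.slots.minimum</name>
  <value>5</value>
</property>
<property>
  <name>mapred.mesos.total.reduce.slots.minimum</name>
  <value>1</value>
</property>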


You may also want to look into how large your executor URI is (the one 
containing the hadoop source that gets downloaded for each task tracker) and 
how long that takes to download.. it might be that the task trackers are taking 
a while to bootstrap.
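
A rough way to check that (a sketch; the URI below is a placeholder for
whatever your mapred.mesos.executor.uri actually points at):

# time how long the executor tarball takes to land on a slave
time hadoop fs -copyToLocal hdfs://namenode:9000/hadoop-executor.tar.gz /tmp/executor-test.tar.gz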

# HA Hadoop JTs

The framework currently does not support a full HA setup, however that's not a
huge issue. The JT will automatically restart jobs where they left off on its
own when a failover occurs, but for the time being all the task trackers will
be killed and new ones spawned. Depending on your setup, this could be a fairly
negligible time.

# Multiple versions of hadoop on the cluster

This is totally fine: each JT configuration can be given its own hadoop tar.gz
file with the right version in it, and they will all happily share the Mesos
cluster.

I hope this makes sense! Ping me on irc (tarnfeld) if you run into anything
funky on that branch for flexi trackers.

Tom.


--


Tom Arnfeld

Developer // DueDil

On Thu, Jan 29, 2015 at 4:09 PM, Alex alex.m.lis...@gmail.com wrote:

 Hi guys,
 I'm a Hadoop and Mesos n00b, so please be gentle. I'm trying to set up a
 Mesos cluster, and my ultimate goal is to introduce Mesos in my
 organization by showing off it's ability to run multiple Hadoop
 clusters, plus other stuff, on the same resources. I'd like to be able
 to do this with a HA configuration as close as possible to something we
 would run in production.
 I've successfully set up a Mesos cluster with 3 masters and 4 slaves,
 but I'm having trouble getting Hadoop jobs to run on top of it. I'm
 using Mesos 0.21.1 and Hadoop CDH 5.3.0. Initially I tried to follow the
 Mesosphere tutorial[1], but it looks like it is very outdated and I
 didn't get very far. Then I tried following the instructions in the
 github repo[2], but they're also less than ideal.
 I've managed to get a Hadoop jobtracker running on one of the masters, I
 can submit jobs to it and they eventually finish. The strange thing is
 that they take a really long time to start the reduce task, so much so
 that the first few times I thought it wasn't working at all. Here's part
 of the output for a simple wordcount example:
 15/01/29 16:37:58 INFO mapred.JobClient:  map 0% reduce 0%
 15/01/29 16:39:23 INFO mapred.JobClient:  map 25% reduce 0%
 15/01/29 16:39:31 INFO mapred.JobClient:  map 50% reduce 0%
 15/01/29 16:39:34 INFO mapred.JobClient:  map 75% reduce 0%
 15/01/29 16:39:37 INFO mapred.JobClient:  map 100% reduce 0%
 15/01/29 16:56:25 INFO mapred.JobClient:  map 100% reduce 100%
 15/01/29 16:56:29 INFO mapred.JobClient: Job complete: job_201501291533_0004
 Mesos started 3 task trackers which ran the map tasks pretty fast, but
 then it looks like it was stuck for quite a while before launching a
 fourth task tracker to run the reduce task. Is this normal, or is there
 something wrong here?
 More questions: my configuration file looks a lot like the example in
 the github repo, but that's listed as being representative of a
 pseudo-distributed configuration. What should it look like for a real
 distributed setup? How can I go about running multiple Hadoop clusters?
 Currently, all three masters have the same configuration file, so they
 all create a different framework. How should things be set up for a
 high-availability Hadoop framework that can

Re: Alternate HDFS Filesystems + Hadoop on Mesos

2014-08-24 Thread Connor Doyle

 Also, fwiw I'm interested in rallying folks on a Tachyon Framework in the 
 not-too-distant future, for anyone who is interested.  Probably follow the 
 spark model and try to push upstream.   

Hi Tim, late follow-up:

The not-too-distant future is here!  Adam and I took a stab at a Tachyon
framework during the MesosCon hackathon
(http://github.com/mesosphere/tachyon-mesos).
We started writing in Scala, but we're not at all opposed to switching to Java,
especially if the work can be upstreamed.
--
Connor

 
 
 On Fri, Aug 15, 2014 at 5:16 PM, John Omernik j...@omernik.com wrote:
 I tried hdfs:/// and hdfs://cldbnode:7222/ Neither worked (examples below) I 
 really think the hdfs vs other prefixes should be looked at. Like I said 
 above, the tachyon project just added a env variable to address this.  
 
 
 
 hdfs://cldbnode:7222/
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I0815 19:14:17.101666 22022 fetcher.cpp:76] Fetching URI 
 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
 I0815 19:14:17.101780 22022 fetcher.cpp:105] Downloading resource from 
 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz' to 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0003/executors/executor_Task_Tracker_5/runs/b3174e72-75ea-48be-bbb8-a9a6cc605018/hadoop-0.20.2-mapr-4.0.0.tgz'
 E0815 19:14:17.778833 22022 fetcher.cpp:109] HDFS copyToLocal failed: hadoop 
 fs -copyToLocal 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz' 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0003/executors/executor_Task_Tracker_5/runs/b3174e72-75ea-48be-bbb8-a9a6cc605018/hadoop-0.20.2-mapr-4.0.0.tgz'
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please 
 use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties 
 files.
 -copyToLocal: Wrong FS: 
 maprfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz, expected: 
 hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
 Failed to fetch: hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Failed to synchronize with slave (it's probably exited)
 
 
 
 hdfs:/// 
 
 
 I0815 19:10:45.006803 21508 fetcher.cpp:76] Fetching URI 
 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
 I0815 19:10:45.007099 21508 fetcher.cpp:105] Downloading resource from 
 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz' to 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0002/executors/executor_Task_Tracker_2/runs/22689054-aff6-4f7c-9746-a068a11ff000/hadoop-0.20.2-mapr-4.0.0.tgz'
 E0815 19:10:45.681922 21508 fetcher.cpp:109] HDFS copyToLocal failed: hadoop 
 fs -copyToLocal 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz' 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0002/executors/executor_Task_Tracker_2/runs/22689054-aff6-4f7c-9746-a068a11ff000/hadoop-0.20.2-mapr-4.0.0.tgz'
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please 
 use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties 
 files.
 -copyToLocal: Wrong FS: maprfs:/mesos/hadoop-0.20.2-mapr-4.0.0.tgz, 
 expected: hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz
  Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
 Failed to fetch: hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Failed to synchronize with slave (it's probably exited)
 
 
 On Fri, Aug 15, 2014 at 5:38 PM, John Omernik j...@omernik.com wrote:
 I am away from my cluster right now, I tried doing a hadoop fs -ls
 maprfs:// and that worked.   When I tried hadoop fs -ls hdfs:/// it failed
 with wrong fs type.  With that error I didn't try it in the mapred-site.  I 
 will try it.  Still...why hard code the file prefixes? I guess I am curious 
 on how glusterfs would work, or others as they pop up. 
 
 On Aug 15, 2014 5:04 PM, Adam Bordelon a...@mesosphere.io wrote:
 Can't you just use the hdfs:// protocol for maprfs? That should work just 
 fine.
 
 
 On Fri, Aug 15, 2014 at 2:50 PM, John Omernik j...@omernik.com wrote:
 Thanks all.
 
 I realized MapR has a workaround for me that I will try soon, in that I
 have the MapR FS NFS-mounted on each node, i.e. I should be able to get the
 tar from there.
 
 That said, perhaps someone with better coding skills than me could 
 provide an env variable where a user could provide the HDFS prefixes to 
 try. I know we did that with the tachyon project and it works well for 
 other HDFS compatible fs implementations, perhaps that would work here?  
 Hard coding a pluggable system seems like a long term issue that will 
 keep coming up.
 
 On Aug 15, 2014 4:02 PM, Tim St Clair tstcl...@redhat.com wrote:
 The uri doesn't currently start with any of the known types (at least

Re: Alternate HDFS Filesystems + Hadoop on Mesos

2014-08-18 Thread Adam Bordelon
Okay, I guess MapRFS is protocol compatible with HDFS, but not
uri-compatible. I know the MapR guys have gotten MapR on Mesos working.
They may have more answers for you on how they accomplished this.

 why hard code the file prefixes?
We allow any uri, so we need to have handlers coded for each type of
protocol group, which so far includes hdfs/hftp/s3/s3n which use
hdfs::copyToLocal, or http/https/ftp/ftps which use net::download, or
file:// or an absolute/relative path for files pre-populated on the machine
(uses 'cp'). MapRFS (and Tachyon) would probably fit into the
hdfs::copyToLocal group so easily that it would be a one-line fix each.

 I really think the hdfs vs other prefixes should be looked at
I agree. Could you file a JIRA with your request? It should be an easy
enough change for us to pick up. I would also like to see Tachyon as a
possible filesystem for the fetcher.


On Fri, Aug 15, 2014 at 5:16 PM, John Omernik j...@omernik.com wrote:

 I tried hdfs:/// and hdfs://cldbnode:7222/ Neither worked (examples below)
 I really think the hdfs vs other prefixes should be looked at. Like I said
 above, the tachyon project just added a env variable to address this.



 hdfs://cldbnode:7222/

 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I0815 19:14:17.101666 22022 fetcher.cpp:76] Fetching URI 
 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
 I0815 19:14:17.101780 22022 fetcher.cpp:105] Downloading resource from 
 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz' to 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0003/executors/executor_Task_Tracker_5/runs/b3174e72-75ea-48be-bbb8-a9a6cc605018/hadoop-0.20.2-mapr-4.0.0.tgz'
 E0815 19:14:17.778833 22022 fetcher.cpp:109] HDFS copyToLocal failed: hadoop 
 fs -copyToLocal 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz' 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0003/executors/executor_Task_Tracker_5/runs/b3174e72-75ea-48be-bbb8-a9a6cc605018/hadoop-0.20.2-mapr-4.0.0.tgz'
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
 org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
 -copyToLocal: Wrong FS: 
 maprfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz, expected: 
 hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
 Failed to fetch: hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Failed to synchronize with slave (it's probably exited)




 hdfs:///




 I0815 19:10:45.006803 21508 fetcher.cpp:76] Fetching URI 
 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
 I0815 19:10:45.007099 21508 fetcher.cpp:105] Downloading resource from 
 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz' to 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0002/executors/executor_Task_Tracker_2/runs/22689054-aff6-4f7c-9746-a068a11ff000/hadoop-0.20.2-mapr-4.0.0.tgz'
 E0815 19:10:45.681922 21508 fetcher.cpp:109] HDFS copyToLocal failed: hadoop 
 fs -copyToLocal 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz' 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0002/executors/executor_Task_Tracker_2/runs/22689054-aff6-4f7c-9746-a068a11ff000/hadoop-0.20.2-mapr-4.0.0.tgz'
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
 org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
 -copyToLocal: Wrong FS: maprfs:/mesos/hadoop-0.20.2-mapr-4.0.0.tgz, expected: 
 hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
 Failed to fetch: hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Failed to synchronize with slave (it's probably exited)



 On Fri, Aug 15, 2014 at 5:38 PM, John Omernik j...@omernik.com wrote:

 I am away from my cluster right now, I tried doing a hadoop fs -ls
 maprfs:// and that worked.   When I tried hadoop fs -ls hdfs:/// it failed
 with wrong fs type.  With that error I didn't try it in the mapred-site.  I
 will try it.  Still...why hard code the file prefixes? I guess I am curious
 on how glusterfs would work, or others as they pop up.
  On Aug 15, 2014 5:04 PM, Adam Bordelon a...@mesosphere.io wrote:

 Can't you just use the hdfs:// protocol for maprfs? That should work
 just fine.


 On Fri, Aug 15, 2014 at 2:50 PM, John Omernik j...@omernik.com wrote:

 Thanks all.

 I realized MapR has a workaround for me that I will try soon, in that I
 have the MapR FS NFS-mounted on each node, i.e. I should be able to get the tar
 from there.

 That said, perhaps someone with better coding skills than me could
 provide an env variable where a user could provide the HDFS prefixes to
 try. I know we did

Re: Alternate HDFS Filesystems + Hadoop on Mesos

2014-08-18 Thread John Omernik
Adam - I am new to using Jira properly. (I couldn't find the JIRA for the
Tachyon change as an example, so I linked to the code... is that ok?)

I created

https://issues.apache.org/jira/browse/MESOS-1711

If you wouldn't mind taking a quick look to make sure I filled things out
correctly to get addressed I'd appreciate it. If you want to hit me up off
list with any recommendations on what I did to make it better in the
future, I'd appreciate it as well.

Thanks!

John



On Mon, Aug 18, 2014 at 4:43 AM, Adam Bordelon a...@mesosphere.io wrote:

 Okay, I guess MapRFS is protocol compatible with HDFS, but not
 uri-compatible. I know the MapR guys have gotten MapR on Mesos working.
 They may have more answers for you on how they accomplished this.

  why hard code the file prefixes?
 We allow any uri, so we need to have handlers coded for each type of
 protocol group, which so far includes hdfs/hftp/s3/s3n which use
 hdfs::copyToLocal, or http/https/ftp/ftps which use net::download, or
 file:// or an absolute/relative path for files pre-populated on the machine
 (uses 'cp'). MapRFS (and Tachyon) would probably fit into the
 hdfs::copyToLocal group so easily that it would be a one-line fix each.

  I really think the hdfs vs other prefixes should be looked at
 I agree. Could you file a JIRA with your request? It should be an easy
 enough change for us to pick up. I would also like to see Tachyon as a
 possible filesystem for the fetcher.


 On Fri, Aug 15, 2014 at 5:16 PM, John Omernik j...@omernik.com wrote:

 I tried hdfs:/// and hdfs://cldbnode:7222/ Neither worked (examples
 below) I really think the hdfs vs other prefixes should be looked at. Like
 I said above, the tachyon project just added a env variable to address
 this.



 hdfs://cldbnode:7222/

 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I0815 19:14:17.101666 22022 fetcher.cpp:76] Fetching URI 
 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
 I0815 19:14:17.101780 22022 fetcher.cpp:105] Downloading resource from 
 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz' to 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0003/executors/executor_Task_Tracker_5/runs/b3174e72-75ea-48be-bbb8-a9a6cc605018/hadoop-0.20.2-mapr-4.0.0.tgz'
 E0815 19:14:17.778833 22022 fetcher.cpp:109] HDFS copyToLocal failed: hadoop 
 fs -copyToLocal 'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz' 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0003/executors/executor_Task_Tracker_5/runs/b3174e72-75ea-48be-bbb8-a9a6cc605018/hadoop-0.20.2-mapr-4.0.0.tgz'
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please 
 use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties 
 files.
 -copyToLocal: Wrong FS: 
 maprfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz, expected: 
 hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc] [-crc] 
 src ... localdst
 Failed to fetch: hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Failed to synchronize with slave (it's probably exited)




 hdfs:///





 I0815 19:10:45.006803 21508 fetcher.cpp:76] Fetching URI 
 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
 I0815 19:10:45.007099 21508 fetcher.cpp:105] Downloading resource from 
 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz' to 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0002/executors/executor_Task_Tracker_2/runs/22689054-aff6-4f7c-9746-a068a11ff000/hadoop-0.20.2-mapr-4.0.0.tgz'
 E0815 19:10:45.681922 21508 fetcher.cpp:109] HDFS copyToLocal failed: hadoop 
 fs -copyToLocal 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz' 
 '/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0002/executors/executor_Task_Tracker_2/runs/22689054-aff6-4f7c-9746-a068a11ff000/hadoop-0.20.2-mapr-4.0.0.tgz'
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please 
 use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties 
 files.
 -copyToLocal: Wrong FS: maprfs:/mesos/hadoop-0.20.2-mapr-4.0.0.tgz, 
 expected: hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc] [-crc] 
 src ... localdst
 Failed to fetch: hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz
 Failed to synchronize with slave (it's probably exited)



 On Fri, Aug 15, 2014 at 5:38 PM, John Omernik j...@omernik.com wrote:

 I am away from my cluster right now. I tried doing a hadoop fs -ls
 maprfs:// and that worked. When I tried hadoop fs -ls hdfs:/// it failed
 with a wrong-FS-type error. Given that error I didn't try it in
 mapred-site; I will try it. Still... why hard-code the file prefixes? I
 am curious how glusterfs would work, or others as they pop up.

Re: Alternate HDFS Filesystems + Hadoop on Mesos

2014-08-15 Thread John Omernik
I tried hdfs:/// and hdfs://cldbnode:7222/; neither worked (examples below).
I really think the hdfs-vs-other-prefixes handling should be looked at. Like
I said above, the Tachyon project just added an env variable to address this.



hdfs://cldbnode:7222/

WARNING: Logging before InitGoogleLogging() is written to STDERR
I0815 19:14:17.101666 22022 fetcher.cpp:76] Fetching URI
'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
I0815 19:14:17.101780 22022 fetcher.cpp:105] Downloading resource from
'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz' to
'/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0003/executors/executor_Task_Tracker_5/runs/b3174e72-75ea-48be-bbb8-a9a6cc605018/hadoop-0.20.2-mapr-4.0.0.tgz'
E0815 19:14:17.778833 22022 fetcher.cpp:109] HDFS copyToLocal failed:
hadoop fs -copyToLocal
'hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
'/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0003/executors/executor_Task_Tracker_5/runs/b3174e72-75ea-48be-bbb8-a9a6cc605018/hadoop-0.20.2-mapr-4.0.0.tgz'
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated.
Please use org.apache.hadoop.log.metrics.EventCounter in all the
log4j.properties files.
-copyToLocal: Wrong FS:
maprfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz,
expected: hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz
Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc]
[-crc] src ... localdst
Failed to fetch: hdfs://hadoopmapr1:7222/mesos/hadoop-0.20.2-mapr-4.0.0.tgz
Failed to synchronize with slave (it's probably exited)




hdfs:///


I0815 19:10:45.006803 21508 fetcher.cpp:76] Fetching URI
'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
I0815 19:10:45.007099 21508 fetcher.cpp:105] Downloading resource from
'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz' to
'/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0002/executors/executor_Task_Tracker_2/runs/22689054-aff6-4f7c-9746-a068a11ff000/hadoop-0.20.2-mapr-4.0.0.tgz'
E0815 19:10:45.681922 21508 fetcher.cpp:109] HDFS copyToLocal failed:
hadoop fs -copyToLocal 'hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz'
'/tmp/mesos/slaves/20140815-103603-1677764800-5050-24315-2/frameworks/20140815-154511-1677764800-5050-7162-0002/executors/executor_Task_Tracker_2/runs/22689054-aff6-4f7c-9746-a068a11ff000/hadoop-0.20.2-mapr-4.0.0.tgz'
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated.
Please use org.apache.hadoop.log.metrics.EventCounter in all the
log4j.properties files.
-copyToLocal: Wrong FS: maprfs:/mesos/hadoop-0.20.2-mapr-4.0.0.tgz,
expected: hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz
Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc]
[-crc] src ... localdst
Failed to fetch: hdfs:///mesos/hadoop-0.20.2-mapr-4.0.0.tgz
Failed to synchronize with slave (it's probably exited)



On Fri, Aug 15, 2014 at 5:38 PM, John Omernik j...@omernik.com wrote:

 I am away from my cluster right now. I tried doing a hadoop fs -ls
 maprfs:// and that worked. When I tried hadoop fs -ls hdfs:/// it failed
 with a wrong-FS-type error. Given that error I didn't try it in
 mapred-site; I will try it. Still... why hard-code the file prefixes? I
 am curious how glusterfs would work, or others as they pop up.
  On Aug 15, 2014 5:04 PM, Adam Bordelon a...@mesosphere.io wrote:

 Can't you just use the hdfs:// protocol for maprfs? That should work just
 fine.


 On Fri, Aug 15, 2014 at 2:50 PM, John Omernik j...@omernik.com wrote:

 Thanks all.

 I realized MapR has a workaround for me that I will try soon, in that I
 have MapR-FS NFS-mounted on each node, i.e. I should be able to get the
 tar from there.

 That said, perhaps someone with better coding skills than mine could
 provide an env variable where a user could specify the HDFS prefixes to
 try. We did that with the Tachyon project and it works well for other
 HDFS-compatible filesystem implementations; perhaps that would work here
 (see the hypothetical sketch below)? Hard-coding a pluggable system seems
 like a long-term issue that will keep coming up.
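
 A hypothetical sketch of what such a knob might look like (the variable
 name is invented here for illustration, not an existing Mesos option):

   # slave environment: comma-separated URI schemes to hand to `hadoop fs`
   export MESOS_FETCHER_HDFS_SCHEMES="hdfs,hftp,s3,s3n,maprfs"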
  On Aug 15, 2014 4:02 PM, Tim St Clair tstcl...@redhat.com wrote:

 The URI doesn't currently start with any of the known types (at least on
 first grok).
 You could redirect via a proxy that does the job for you.

 If you had some FUSE mount, that would work too.

 Cheers,
 Tim

 --

 *From: *John Omernik j...@omernik.com
 *To: *user@mesos.apache.org
 *Sent: *Friday, August 15, 2014 3:55:02 PM
 *Subject: *Alternate HDFS Filesystems + Hadoop on Mesos

 I am on a wonderful journey trying to get Hadoop on Mesos working with
 MapR. I feel like I am close, but when the slaves try to run the packaged
 Hadoop, I get the error below. The odd thing is, I KNOW I got Spark
 running on Mesos pulling both data and the packages from MapRFS, so I am
 confused about why there is an issue

Re: Please Help me about hadoop on Mesos

2014-01-27 Thread Vinod Kone

 I have some questions about running Hadoop on top of Mesos; please help me.
 1. When a TaskTracker is launched, if n CPU cores are allocated to it, it
 can only launch n-1 map tasks. Could someone tell me why? And if I want to
 run a map-only job, what should I do to run n map tasks on an n-CPU
 resource offer?


This is because 1 CPU is allocated to the TaskTracker process itself, so an
n-CPU task tracker has n-1 CPUs left for map slots.
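
One hedged workaround sketch for the map-only case: the hadoop-mesos
framework exposes per-slot resource sizing in mapred-site.xml (property name
as I recall it from the mesos/hadoop README; verify against your version):

  <property>
    <!-- smaller slots let more map slots fit into the same n-CPU offer -->
    <name>mapred.mesos.slot.cpus</name>
    <value>0.5</value>
  </property>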


 2. After a TaskTracker is launched, under what conditions will its status
 update to FINISHED?
 In my cluster, sometimes it never ends until I restart the JobTracker;
 sometimes it ends when there is no task or job left in the JobTracker.


The expected case is that the task tracker is finished/killed when there is
no task/job assigned to it. If a task tracker stays idle for a long time,
it's probably a bug (@brenden can correct me if the semantics have changed
around this). Some logs would help diagnose the issue.


 3. How do I use DRF with weights? I run two frameworks on Mesos, and I
 want to give them different proportions of resources.

Give each framework a different role (FrameworkInfo.role) and give weights
to each role via master command-line flags (see --roles and --weights in
./master --help).
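
For example, a sketch using the flag syntax from that era's mesos-master
--help (verify on your build; role names are illustrative):

  # weight the hadoop role 2x relative to the other role
  mesos-master --roles="hadoop,other" --weights="hadoop=2,other=1"

The Hadoop framework would then register with FrameworkInfo.role set to
hadoop.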



 Please help me!
 Thank you very much!





Re: Re: Please Help me about hadoop on Mesos

2014-01-27 Thread Vinod Kone
On Mon, Jan 27, 2014 at 10:07 AM, HUO Jing huoj...@ihep.ac.cn wrote:

 So, at the very beginning, if all the resources are assigned to Hadoop,
 and after that there are always enough jobs in the JobTracker, does that
 mean the other framework will never get resources?
 Is that fair?


That is correct. Currently there is no concept of preemption of resources
in Mesos. While this is likely to change in the future, in the short term
you could reserve resources for frameworks (see --resources in ./slave
--help) to avoid starvation.
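
A sketch of such a static reservation on a slave (old-style role-qualified
resource syntax; role name and amounts are illustrative):

  # pin 2 CPUs and 4 GB of memory to the "other" role
  mesos-slave --resources="cpus(other):2;mem(other):4096"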


Re: Hadoop on Mesos use local cdh4 installation instead of tar.gz

2014-01-02 Thread Damien Hardy
Hello,

Using the Hadoop distribution packages is possible (here cdh4.1.2):
an archive is mandatory for the hadoop-mesos framework, so I created and
deployed a small dummy file that costs very little to fetch and untar.

In mapred-site.xml, I override mapred.mesos.executor.directory and
mapred.mesos.executor.command so that the Mesos task directory is used for
the job and the locally deployed Cloudera TaskTracker is executed.

+  <property>
+    <name>mapred.mesos.executor.uri</name>
+    <value>hdfs://hdfscluster/tmp/dummy.tar.gz</value>
+  </property>
+  <property>
+    <name>mapred.mesos.executor.directory</name>
+    <value>./</value>
+  </property>
+  <property>
+    <name>mapred.mesos.executor.command</name>
+    <value>. /etc/default/hadoop-0.20; env ; $HADOOP_HOME/bin/hadoop
org.apache.hadoop.mapred.MesosExecutor</value>
+  </property>
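
For completeness, a minimal sketch of creating and deploying that dummy
archive (the HDFS path matches the config above; contents don't matter):

  touch dummy
  tar czf dummy.tar.gz dummy
  hadoop fs -put dummy.tar.gz hdfs://hdfscluster/tmp/dummy.tar.gz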

Add some env vars in /etc/default/hadoop-0.20 so Hadoop services can find
the hadoop-mesos jar and libmesos:

+export
HADOOP_CLASSPATH=/usr/lib/hadoop-mesos/hadoop-mesos.jar:$HADOOP_HOME/contrib/fairscheduler/hadoop-fairscheduler-2.0.0-mr1-cdh4.1.2.jar:$HADOOP_CLASSPATH
+export MESOS_NATIVE_LIBRARY=/usr/lib/libmesos.so

I created a hadoop-mesos deb to be deployed with the Hadoop distribution.
My goal is to avoid the -copyToLocal of TaskTracker code for each Mesos
task, with no special manipulation of the Hadoop distribution code needed
(only config).

Regards,

On 31/12/2013 16:45, Damien Hardy wrote:
 I'm now able to use Snappy compression by adding
 
 export JAVA_LIBRARY_PATH=/usr/lib/hadoop/lib/native/
 
 in my /etc/default/mesos-slave (an environment variable for the
 mesos-slave process, used by my init.d script).
 
 This env var is propagated to the executor JVM, so the TaskTracker can
 find libsnappy.so and use it.
 
 Starting to use the local deployment of cdh4 ...
 
 Reading the source, it seems something could be done using
 mapred.mesos.executor.directory and mapred.mesos.executor.command
 to use a local Hadoop.
 
 
On 31/12/2013 15:08, Damien Hardy wrote:
 Hello,

 Happy new year 2014 @mesos users.

 I am trying to get MapReduce cdh4.1.2 running on Mesos.

 It seems to be working mostly, but a few things are still problematic.
 
   * MR1 code is already deployed locally along with HDFS; is there a way
 to use it instead of a tar.gz stored on HDFS that is copied locally and
 untarred?
 
   * If not: the tar.gz distribution of cdh4 seems not to support Snappy
 compression. Is there a way to correct that?

 Best regards,

 

-- 
Damien HARDY
IT Infrastructure Architect
Viadeo - 30 rue de la Victoire - 75009 Paris - France
PGP : 45D7F89A





Hadoop on Mesos use local cdh4 installation instead of tar.gz

2013-12-31 Thread Damien Hardy
Hello,

Happy new year 2014 @mesos users.

I am trying to get MapReduce cdh4.1.2 running on Mesos.

It seems to be working mostly, but a few things are still problematic.

  * MR1 code is already deployed locally along with HDFS; is there a way to
use it instead of a tar.gz stored on HDFS that is copied locally and
untarred?

  * If not: the tar.gz distribution of cdh4 seems not to support Snappy
compression. Is there a way to correct that?

Best regards,

-- 
Damien HARDY





Re: Hadoop on Mesos use local cdh4 installation instead of tar.gz

2013-12-31 Thread Damien Hardy
I'm now able to use Snappy compression by adding

export JAVA_LIBRARY_PATH=/usr/lib/hadoop/lib/native/

in my /etc/default/mesos-slave (an environment variable for the mesos-slave
process, used by my init.d script).

This env var is propagated to the executor JVM, so the TaskTracker can find
libsnappy.so and use it.

Starting to use the local deployment of cdh4 ...

Reading the source, it seems something could be done using
mapred.mesos.executor.directory and mapred.mesos.executor.command
to use a local Hadoop.


On 31/12/2013 15:08, Damien Hardy wrote:
 Hello,
 
 Happy new year 2014 @mesos users.
 
 I am trying to get MapReduce cdh4.1.2 running on Mesos.
 
 It seems to be working mostly, but a few things are still problematic.
 
   * MR1 code is already deployed locally along with HDFS; is there a way
 to use it instead of a tar.gz stored on HDFS that is copied locally and
 untarred?
 
   * If not: the tar.gz distribution of cdh4 seems not to support Snappy
 compression. Is there a way to correct that?
 
 Best regards,
 

-- 
Damien HARDY





Fwd: Crashed when configure Hadoop on Mesos

2013-12-11 Thread Azuryy Yu
Hi,
I downloaded hadoop-mesos from here:
https://github.com/mesos/hadoop

I changed the Mesos version to <mesos.version>0.14.2</mesos.version> in the
pom.xml, and the build succeeded.
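
For context, a sketch of where that property would sit, assuming the usual
Maven properties block (surrounding elements elided):

  <properties>
    <mesos.version>0.14.2</mesos.version>
  </properties>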

Then I downloaded mesos-0.14.2 and built it successfully; I can start a
Mesos cluster with 3 nodes and see all of them on the web UI.

But when I start the JobTracker, strictly following
https://github.com/mesos/hadoop/blob/master/README.md,
the JobTracker cannot start and throws exceptions:
13/12/11 15:07:55 INFO mapred.MesosScheduler: Starting MesosScheduler
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7febaee37d09, pid=14013, tid=140650221684480
#
# JRE version: Java(TM) SE Runtime Environment (7.0_45-b18) (build
1.7.0_45-b18)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode
linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x632d09]  jni_GetByteArrayElements+0x89
#
# Failed to write core dump. Core dumps have been disabled. To enable core
dumping, try ulimit -c unlimited before starting Java again
#
# An error report file with more information is saved as:
# /home/hadoop/hadoop-1.2.1/logs/jt_error_gc.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp

*It seems the JVM crashed during a JNI call, but can someone tell me how to
fix it? Thanks.*
My Java version: 1.7.0_45
hadoop-core: built from 2.2.0
HDFS is started with HA.

My mapred-site.xml:
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.MesosScheduler</value>
  </property>
  <property>
    <name>mapred.mesos.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
  <property>
    <name>mapred.mesos.master</name>
    <value>webdm.test.com:5050</value>
  </property>
  <property>
    <name>mapred.mesos.executor.uri</name>
    <value>hdfs://webdm-cluster/data/mesos/hadoop-2.2.0.tar.gz</value>
  </property>
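
Two things worth trying here, grounded in the JRE message itself plus a
guess: enable core dumps before retrying, and double-check that the
libmesos.so picked up at runtime comes from the same 0.14.2 build that the
hadoop-mesos jar was compiled against (a version mismatch is a classic
cause of JNI segfaults; the library path below is illustrative):

  ulimit -c unlimited        # as the crash report suggests
  export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos-0.14.2.so
  # then restart the JobTracker and inspect the core dump / hs_err log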