Re: Multiple disks with Mesos

2014-10-08 Thread Dick Davies
To answer point 2) - yes, your executors will create their 'sandboxes'
under work_dir.
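
As a rough illustration of where a task's files end up, the sketch below builds the sandbox directory for one executor run under work_dir. The directory layout is an assumption about how slaves of this era arrange sandboxes, so check it against a real slave; the function name and the example IDs are made up for illustration.

```python
import posixpath


def sandbox_path(work_dir, slave_id, framework_id, executor_id, run_id):
    """Build the expected sandbox directory for one executor run.

    Assumption: the slave lays sandboxes out as
    work_dir/slaves/<slave>/frameworks/<framework>/executors/<executor>/runs/<run>.
    """
    return posixpath.join(
        work_dir,
        "slaves", slave_id,
        "frameworks", framework_id,
        "executors", executor_id,
        "runs", run_id,
    )


if __name__ == "__main__":
    # Hypothetical IDs; /mnt/stitched stands in for a volume spanning several disks.
    print(sandbox_path("/mnt/stitched/mesos", "S1", "F1", "default", "R1"))
```

Because everything hangs off the single work_dir, stdout, stderr and task-created directories all live on whatever volume backs that one path.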


On 8 October 2014 00:13, Arunabha Ghosh  wrote:
> Thanks Steven !
>
> On Tue, Oct 7, 2014 at 4:08 PM, Steven Schlansker
>  wrote:
>>
>>
>> On Oct 7, 2014, at 4:06 PM, Arunabha Ghosh  wrote:
>>
>> > Hi,
>> >  I would like to run Mesos slaves on machines that have multiple
>> > disks. According to the Mesos configuration page I can specify a work_dir
>> > argument to the slaves.
>> >
>> > 1) Can the work_dir argument contain multiple directories ?
>> >
>> > 2) Is the work_dir where Mesos will place all of its data ? So If I
>> > started a task on Mesos, would the slave place the task's data (stderr,
>> > stdout, task created directories) inside work_dir ?
>>
>> We stitch our disks together before Mesos gets its hands on it using a
>> technology such as LVM or btrfs, so that the work_dir is spread across the
>> multiple disks transparently.
>>
>


Re: Multiple disks with Mesos

2014-10-08 Thread Damien Hardy
Hello,

I run Mesos on top of Hadoop HDFS.
Hadoop handles JBOD configurations well.

Today Mesos can only work on one of the disks and cannot take advantage
of the other disks (to use non-HDFS space).

It would be a great feature for Mesos to handle JBOD too; it would deal
with disk failure better than LVM, for example.

Cheers,

On 08/10/2014 01:06, Arunabha Ghosh wrote:
> Hi,
>  I would like to run Mesos slaves on machines that have multiple
> disks. According to the Mesos configuration page I can
> specify a work_dir argument to the slaves.
> 
> 1) Can the work_dir argument contain multiple directories ?
> 
> 2) Is the work_dir where Mesos will place all of its data ? So If I
> started a task on Mesos, would the slave place the task's data (stderr,
> stdout, task created directories) inside work_dir ?
> 
> Thanks,
> Arunabha

-- 
Damien HARDY





Re: Multiple Network interfaces

2014-10-08 Thread Jay Buffington
My reading of the code is that this is not supported.

I have this same problem.  I'm trying to work around an issue with a
stubborn application that requires that all instances in a cluster run
on the same port.  Therefore, I have 5+ interfaces per slave and I
want task A and task B to bind to the same port on different
interfaces on the same slave.

Since the ports resource is simply a Range[1] there is nowhere to
stick the IP they belong to.  I think to get what you want mesos would
need to introduce a new first class IP resource in the C++ Resources
abstraction[2] which has an IP (Scalar) and a list of ports (Range).

I opened https://issues.apache.org/jira/browse/MESOS-1874 to track the
work that needs to be done to support this.

Jay

[1] https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L335
[2] https://github.com/apache/mesos/blob/master/src/common/resources.cpp
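
To make the idea concrete, here is a minimal sketch of what an IP-scoped ports resource could look like. It is purely hypothetical: nothing like `IpPorts` or `place` exists in Mesos, the IP is modelled as a plain string for readability, and the addresses and port numbers are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class IpPorts:
    """Hypothetical resource: one IP address plus the port ranges free on it."""
    ip: str
    ranges: List[Tuple[int, int]] = field(default_factory=list)  # inclusive (begin, end)

    def has_port(self, port: int) -> bool:
        return any(begin <= port <= end for begin, end in self.ranges)

    def take(self, port: int) -> None:
        """Remove one port, splitting the range that contains it."""
        for i, (begin, end) in enumerate(self.ranges):
            if begin <= port <= end:
                pieces = []
                if begin < port:
                    pieces.append((begin, port - 1))
                if port < end:
                    pieces.append((port + 1, end))
                self.ranges[i:i + 1] = pieces
                return
        raise ValueError(f"port {port} is not free on {self.ip}")


def place(offer: List[IpPorts], port: int) -> Optional[str]:
    """Pick the first interface in the offer where `port` is still free."""
    for res in offer:
        if res.has_port(port):
            res.take(port)
            return res.ip
    return None


if __name__ == "__main__":
    # Two interfaces on one slave, each offering the same port range.
    offer = [IpPorts("10.0.0.1", [(31000, 32000)]),
             IpPorts("10.0.0.2", [(31000, 32000)])]
    print(place(offer, 31500))  # task A
    print(place(offer, 31500))  # task B, same port, next interface
```

With the port tied to an IP, task A and task B can both claim the same port number without colliding, which a bare port Range cannot express.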

On Tue, Oct 7, 2014 at 5:09 PM, Diptanu Choudhury  wrote:
> Hi,
>
> I am wondering if Mesos can offer resources from multiple network
> interfaces? We would like to attach multiple Network Interfaces on EC2
> instances and would like to bind specific applications that we run on mesos
> on specific interfaces?
>
> So basically I am wondering if Mesos can offer ports from different network
> interfaces on the same slave??
>
> --
> Thanks,
> Diptanu Choudhury
> Web - www.linkedin.com/in/diptanu
> Twitter - @diptanu


Re: Multiple Network interfaces

2014-10-08 Thread CCAAT

Hi,

Possible solution?  Attach a computer with multiple ethernet cards.
One is used to interface to the slave via the single port. On the
attached computer (basically a secure router) you run Network Address
Translation [1] and other codes to make the multiple interfaces
available on different IPs and ports.



In fact, I'm not sure that you could not do this directly on the slave 
itself.   I have not examined the mesos code, but this is pretty 
standard when multiple IP interfaces are desired.


In fact there are many ways to set up multiple IP addresses on the same
(physical) ethernet card:  "IP aliasing" [2]. Surely there are more?
[3].


A bit more detail on exactly what you are trying to accomplish might
help in finding the correct solution, or in writing a detailed feature
request for the developers to consider.


hth,
James


[1]http://en.wikipedia.org/wiki/Network_address_translation

[2] 
http://www.tecmint.com/create-multiple-ip-addresses-to-one-single-network-interface/


[3] https://docs.docker.com/articles/networking/



On 10/08/14 10:52, Jay Buffington wrote:

> My reading of the code is that this is not supported.
>
> I have this same problem.  I'm trying to work around an issue with a
> stubborn application that requires that all instances in a cluster run
> on the same port.  Therefore, I have 5+ interfaces per slave and I
> want task A and task B to bind to the same port on different
> interfaces on the same slave.
>
> Since the ports resource is simply a Range[1] there is nowhere to
> stick the IP they belong to.  I think to get what you want mesos would
> need to introduce a new first class IP resource in the C++ Resources
> abstraction[2] which has an IP (Scalar) and a list of ports (Range).
>
> I opened https://issues.apache.org/jira/browse/MESOS-1874 to track the
> work that needs to be done to support this.
>
> Jay
>
> [1] https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L335
> [2] https://github.com/apache/mesos/blob/master/src/common/resources.cpp
>
> On Tue, Oct 7, 2014 at 5:09 PM, Diptanu Choudhury  wrote:
>
>> Hi,
>>
>> I am wondering if Mesos can offer resources from multiple network
>> interfaces? We would like to attach multiple Network Interfaces on EC2
>> instances and would like to bind specific applications that we run on mesos
>> on specific interfaces?
>>
>> So basically I am wondering if Mesos can offer ports from different network
>> interfaces on the same slave??
>>
>> --
>> Thanks,
>> Diptanu Choudhury
>> Web - www.linkedin.com/in/diptanu
>> Twitter - @diptanu






Re: Multiple Network interfaces

2014-10-08 Thread Ankur Chauhan
Is using bridged networking with Docker an option? That would map the service
port (fixed) to a dynamically assigned port. Granted, it will not utilize
different NICs, but it's something.
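
A small sketch of that mapping, assuming the host ports come from a range the slave has offered: every task keeps the same fixed container port and gets its own host port. The function name, task names and port numbers are invented for illustration; this only computes the pairs and does not talk to Docker.

```python
from typing import Dict, Iterable, Tuple


def map_ports(container_port: int, tasks: Iterable[str],
              host_range: Tuple[int, int]) -> Dict[str, Tuple[int, int]]:
    """Give every task its own host port while the container port stays fixed.

    Returns {task: (host_port, container_port)}; host_range is inclusive.
    """
    begin, end = host_range
    mapping = {}
    next_port = begin
    for task in tasks:
        if next_port > end:
            raise RuntimeError("offered host port range exhausted")
        mapping[task] = (next_port, container_port)
        next_port += 1
    return mapping


if __name__ == "__main__":
    # Both instances believe they listen on 8080; the host sees two ports.
    for task, (host, container) in map_ports(8080, ["A", "B"], (31000, 31001)).items():
        print(f"task {task}: host port {host} -> container port {container}")
```

Each pair has the shape that `docker run -p HOST:CONTAINER` publishes, so the application sees its fixed port even though all traffic still enters through one interface.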


> On Oct 8, 2014, at 8:52 AM, Jay Buffington  wrote:
> 
> My reading of the code is that this is not supported.
> 
> I have this same problem.  I'm trying to work around an issue with a
> stubborn application that requires that all instances in a cluster run
> on the same port.  Therefore, I have 5+ interfaces per slave and I
> want task A and task B to bind to the same port on different
> interfaces on the same slave.
> 
> Since the ports resource is simply a Range[1] there is nowhere to
> stick the IP they belong to.  I think to get what you want mesos would
> need to introduce a new first class IP resource in the C++ Resources
> abstraction[2] which has an IP (Scalar) and a list of ports (Range).
> 
> I opened https://issues.apache.org/jira/browse/MESOS-1874 to track the
> work that needs to be done to support this.
> 
> Jay
> 
> [1] https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L335
> [2] https://github.com/apache/mesos/blob/master/src/common/resources.cpp
> 
>> On Tue, Oct 7, 2014 at 5:09 PM, Diptanu Choudhury  wrote:
>> Hi,
>> 
>> I am wondering if Mesos can offer resources from multiple network
>> interfaces? We would like to attach multiple Network Interfaces on EC2
>> instances and would like to bind specific applications that we run on mesos
>> on specific interfaces?
>> 
>> So basically I am wondering if Mesos can offer ports from different network
>> interfaces on the same slave??
>> 
>> --
>> Thanks,
>> Diptanu Choudhury
>> Web - www.linkedin.com/in/diptanu
>> Twitter - @diptanu


Re: Multiple Network interfaces

2014-10-08 Thread Tim St Clair
Not via Mesos directly (at least not that I'm aware of); however, it may be
possible if you use Docker and do it indirectly by configuring multiple bridge
adapters, or via pipework (https://github.com/jpetazzo/pipework).

Another option might be Kubernetes on Mesos; I hear that openvswitch exists
under the hood to enable this capability, depending on whether that meets your
requirements.

Best of Luck, 
Tim 

- Original Message -

> From: "Diptanu Choudhury" 
> To: user@mesos.apache.org
> Sent: Tuesday, October 7, 2014 7:09:44 PM
> Subject: Multiple Network interfaces

> Hi,

> I am wondering if Mesos can offer resources from multiple network interfaces?
> We would like to attach multiple Network Interfaces on EC2 instances and
> would like to bind specific applications that we run on mesos on specific
> interfaces?

> So basically I am wondering if Mesos can offer ports from different network
> interfaces on the same slave??

> --
> Thanks,
> Diptanu Choudhury
> Web - www.linkedin.com/in/diptanu
> Twitter - @diptanu


Re: Multiple disks with Mesos

2014-10-08 Thread Tim St Clair
+1, stitching can be done outside of mesos prior to init. 

- Original Message -
> From: "Steven Schlansker" 
> To: user@mesos.apache.org
> Sent: Tuesday, October 7, 2014 6:08:21 PM
> Subject: Re: Multiple disks with Mesos
> 
> 
> On Oct 7, 2014, at 4:06 PM, Arunabha Ghosh  wrote:
> 
> > Hi,
> >  I would like to run Mesos slaves on machines that have multiple disks.
> >  According to the Mesos configuration page I can specify a work_dir
> >  argument to the slaves.
> > 
> > 1) Can the work_dir argument contain multiple directories ?
> > 
> > 2) Is the work_dir where Mesos will place all of its data ? So If I started
> > a task on Mesos, would the slave place the task's data (stderr, stdout,
> > task created directories) inside work_dir ?
> 
> We stitch our disks together before Mesos gets its hands on it using a
> technology such as LVM or btrfs, so that the work_dir is spread across the
> multiple disks transparently.
> 
> 

-- 
Cheers,
Timothy St. Clair
Red Hat Inc.


HDFS Mesos Framework

2014-10-08 Thread Luke Amdor
Has anyone started work on a Hadoop HDFS Mesos framework? I know many of us
just run HDFS alongside the Mesos slaves, but I was looking for something a
little simpler. Possibly with federation support and quorum journals?

-- 
*Luke Amdor* | Platform Lead Architect | Banno | *ProfitStars®*
Des Moines IA 50309 | Cell 515.231.4033


Re: HDFS Mesos Framework

2014-10-08 Thread Tim Chen
Brenden Matthews has an HDFS framework that is still in progress:
https://github.com/brndnmtthws/hdfs

Contributions are welcome as well!

Tim

On Wed, Oct 8, 2014 at 9:51 AM, Luke Amdor  wrote:

> Has anyone started work on a Hadoop HDFS Mesos framework? I know many of
> us just run HDFS alongside the Mesos slaves, but I was looking for
> something a little simpler. Possibly with federation support and quorum
> journals?
>
> --
> *Luke Amdor* | Platform Lead Architect | Banno | *ProfitStars®*
> Des Moines IA 50309 | Cell 515.231.4033
>


BTRFS on a DFS

2014-10-08 Thread CCAAT

Hello,


Well I did not want to "hijack a thread" but since several folks are
dancing right next to my current quandary, I figured I'd start another,
companion thread closely related to several of the current threads.

So, I'm building up (2)  clusters of (3) machines each;  the first 
difference is one set will be running systemd and the other set will be 
running openrc. The purpose is to help flush out differences related to 
how system/kernel based resources are managed by systemd vs openrc, aid 
in diagnosing deep memory (OOM) issues and test various kernel 
optimizations.



That said, what I'm working on is installing btrfs as the file system on 
each pair of mirrored (Raid 1) drives on each system. Later on with 
btrfs, I can migrate to (Raid 10) or (Raid 6) when R6 becomes stable.

What I'm proposing to do is use a Distributed File System (DFS) like
lustre on top of btrfs for these (2) mesos + spark clusters. Later on
I might try FhGFS as well as gluster after lustre.

With btrfs, it's pretty straightforward to combine several disks into
one "pool" of space.  If anyone has set up btrfs for mesos, suggestions
are most welcome.
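
For reference, a minimal sketch of how such a pool might be created. It only builds and prints the `mkfs.btrfs` command line rather than running it; the device names are hypothetical, and the raid1 profiles are an assumption matching the mirrored pairs described above.

```python
import shlex
from typing import List


def mkfs_btrfs_command(devices: List[str], data_profile: str = "raid1",
                       metadata_profile: str = "raid1") -> List[str]:
    """Build (but do not run) a mkfs.btrfs command spanning several devices."""
    if len(devices) < 2:
        raise ValueError("a multi-device pool needs at least two devices")
    return ["mkfs.btrfs", "-d", data_profile, "-m", metadata_profile, *devices]


if __name__ == "__main__":
    # Hypothetical device names; double-check them before running anything,
    # because mkfs destroys existing data on the listed devices.
    print(shlex.join(mkfs_btrfs_command(["/dev/sdb", "/dev/sdc"])))
```

Once the filesystem exists, mounting any one member device brings up the whole pool, and that mount point can then be handed to the slave as its work_dir.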


Or can I just install mesos on the cluster-btrfs without a DFS ? 
Pros/Cons of these approaches? I'm really trying to  avoid the legacy 
codes such as Hadoop and HDFS.



Suggestions and comments are most welcome.


James



Re: Multiple Network interfaces

2014-10-08 Thread Diptanu Choudhury
I treat Mesos as a resource allocator, so while I am aware of these
proposed workarounds, it would be awesome if Mesos did the resource
allocation for network interfaces the same way it does for CPU, Memory,
Disk, etc.

On Wed, Oct 8, 2014 at 9:28 AM, Tim St Clair  wrote:

> Not via mesos directly (at least not that I'm aware of), however it may be
> possible if you use Docker and do it indirectly by configuring multiple
> bridge adapters || via (https://github.com/jpetazzo/pipework).
>
> Another option might be Kube on Mesos, I hear that openvswitch exists
> under the hood to enable this capability, depending on whether that meets
> your requirements.
>
> Best of Luck,
> Tim
>
> --
>
> *From: *"Diptanu Choudhury" 
> *To: *user@mesos.apache.org
> *Sent: *Tuesday, October 7, 2014 7:09:44 PM
> *Subject: *Multiple Network interfaces
>
>
> Hi,
>
> I am wondering if Mesos can offer resources from multiple network
> interfaces? We would like to attach multiple Network Interfaces on EC2
> instances and would like to bind specific applications that we run on mesos
> on specific interfaces?
>
> So basically I am wondering if Mesos can offer ports from different
> network interfaces on the same slave??
>
> --
> Thanks,
> Diptanu Choudhury
> Web - www.linkedin.com/in/diptanu
> Twitter - @diptanu 
>
>
>
>


-- 
Thanks,
Diptanu Choudhury
Web - www.linkedin.com/in/diptanu
Twitter - @diptanu 


Re: Multiple Network interfaces

2014-10-08 Thread Diptanu Choudhury
On AWS, there is no need to share the same network interfaces between
multiple applications as we can attach multiple network interfaces per
instance. The only problem is that Mesos doesn't know about all these
other interfaces.

On Wed, Oct 8, 2014 at 9:28 AM, Tim St Clair  wrote:

> Not via mesos directly (at least not that I'm aware of), however it may be
> possible if you use Docker and do it indirectly by configuring multiple
> bridge adapters || via (https://github.com/jpetazzo/pipework).
>
> Another option might be Kube on Mesos, I hear that openvswitch exists
> under the hood to enable this capability, depending on whether that meets
> your requirements.
>
> Best of Luck,
> Tim
>
> --
>
> *From: *"Diptanu Choudhury" 
> *To: *user@mesos.apache.org
> *Sent: *Tuesday, October 7, 2014 7:09:44 PM
> *Subject: *Multiple Network interfaces
>
>
> Hi,
>
> I am wondering if Mesos can offer resources from multiple network
> interfaces? We would like to attach multiple Network Interfaces on EC2
> instances and would like to bind specific applications that we run on mesos
> on specific interfaces?
>
> So basically I am wondering if Mesos can offer ports from different
> network interfaces on the same slave??
>
> --
> Thanks,
> Diptanu Choudhury
> Web - www.linkedin.com/in/diptanu
> Twitter - @diptanu 
>
>
>
>


-- 
Thanks,
Diptanu Choudhury
Web - www.linkedin.com/in/diptanu
Twitter - @diptanu 


Message retries for TASK_FINISHED

2014-10-08 Thread Colleen Lee
We're encountering issues where our scheduler will never receive the
TASK_FINISHED message, causing the scheduler driver to never be stopped.

Logs from scheduler:

2014-10-01 05:48:14,549 INFO [JobManager] >>> Running job
DTdJgUhTEeSnPCIACwSpXg
2014-10-01 06:06:06,536 INFO [JobScheduler] >>> Job DTdJgUhTEeSnPCIACwSpXg:
Registered as 20140927-080358-3165886730-5050-928-0069 to master
'20140927-080358-3165886730-5050-928'
2014-10-01 06:06:06,538 INFO [JobScheduler] >>> Job DTdJgUhTEeSnPCIACwSpXg:
Found matching offer, declining all other offers
2014-10-01 06:06:08,706 INFO [JobScheduler] >>> Job DTdJgUhTEeSnPCIACwSpXg:
Got status update task_id {
  value: "DTdJgUhTEeSnPCIACwSpXg"
}
state: TASK_RUNNING
slave_id {
  value: "20140818-235718-3165886730-5050-901-4"
}
timestamp: 1.41214356869682E9

Most of these log lines are generated on our side, but they indicate that
the driver was run, the framework successfully registered with the master,
the framework successfully accepted offers, and the framework successfully
received a TASK_RUNNING status update.

Logs from the master:

/var/log/mesos/mesos-master.INFO:I1001 06:06:06.540736   951
master.hpp:655] Adding task DTdJgUhTEeSnPCIACwSpXg with resources
cpus(*):1; mem(*):1536 on slave 20140818-235718-3165886730-5050-901-4
(ip-10-51-165-231.ec2.internal)
/var/log/mesos/mesos-master.INFO:I1001 06:06:06.541306   951
master.cpp:3111] Launching task DTdJgUhTEeSnPCIACwSpXg of framework
20140927-080358-3165886730-5050-928-0069 with resources cpus(*):1;
mem(*):1536 on slave 20140818-235718-3165886730-5050-901-4 at slave(1)@
10.51.165.231:5051 (ip-10-51-165-231.ec2.internal)
/var/log/mesos/mesos-master.INFO:I1001 06:06:08.699162   951
master.cpp:2628] Status update TASK_RUNNING (UUID:
0e17e133-c1d0-4a56-9cef-afb45a486045) for task DTdJgUhTEeSnPCIACwSpXg of
framework 20140927-080358-3165886730-5050-928-0069 from slave
20140818-235718-3165886730-5050-901-4 at slave(1)@10.51.165.231:5051
(ip-10-51-165-231.ec2.internal)
/var/log/mesos/mesos-master.INFO:I1001 06:16:15.295013   945
master.cpp:2628] Status update TASK_FINISHED (UUID:
587869ec-a4ef-439a-a706-8a46fbc9fde8) for task DTdJgUhTEeSnPCIACwSpXg of
framework 20140927-080358-3165886730-5050-928-0069 from slave
20140818-235718-3165886730-5050-901-4 at slave(1)@10.51.165.231:5051
(ip-10-51-165-231.ec2.internal)
/var/log/mesos/mesos-master.INFO:I1001 06:16:15.295222   945
master.hpp:673] Removing task DTdJgUhTEeSnPCIACwSpXg with resources
cpus(*):1; mem(*):1536 on slave 20140818-235718-3165886730-5050-901-4
(ip-10-51-165-231.ec2.internal)
/var/log/mesos/mesos-master.INFO:W1001 06:16:25.295990   951
master.cpp:2621] Could not lookup task for status update TASK_FINISHED
(UUID: 587869ec-a4ef-439a-a706-8a46fbc9fde8) for task
DTdJgUhTEeSnPCIACwSpXg of framework
20140927-080358-3165886730-5050-928-0069 from slave
20140818-235718-3165886730-5050-901-4 at slave(1)@10.51.165.231:5051
(ip-10-51-165-231.ec2.internal)

The "Could not lookup task" messages have been recurring ever since.
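
For anyone triaging a similar log, here is a small sketch that groups the master's status-update lines by task and counts the updates it could not match. It assumes each wrapped log entry has first been joined back into a single line, and the regular expression covers only the two message shapes quoted in this mail; the function name is made up.

```python
import re
from collections import defaultdict

# Matches the two master messages quoted above, one log entry per line.
STATUS_RE = re.compile(
    r"(?P<kind>Could not lookup task for status update|Status update) "
    r"(?P<state>TASK_[A-Z]+) \(UUID: (?P<uuid>[0-9a-f-]+)\) "
    r"for task (?P<task>\S+)"
)


def summarize(lines):
    """Group status updates by task and count updates the master could not match."""
    seen = defaultdict(list)      # task id -> states the master accepted
    orphaned = defaultdict(int)   # (task id, uuid) -> number of failed lookups
    for line in lines:
        m = STATUS_RE.search(line)
        if not m:
            continue
        if m.group("kind").startswith("Could not"):
            orphaned[(m.group("task"), m.group("uuid"))] += 1
        else:
            seen[m.group("task")].append(m.group("state"))
    return dict(seen), dict(orphaned)
```

A UUID that keeps appearing in the second dictionary is the same update being re-sent, which is the recurring pattern described above.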

Logs from the slave:

/var/log/mesos/mesos-slave.INFO:I1001 06:06:06.543417  1207 slave.cpp:933]
Got assigned task DTdJgUhTEeSnPCIACwSpXg for framework
20140927-080358-3165886730-5050-928-0069
/var/log/mesos/mesos-slave.INFO:I1001 06:06:06.543709  1207 slave.cpp:1043]
Launching task DTdJgUhTEeSnPCIACwSpXg for framework
20140927-080358-3165886730-5050-928-0069
/var/log/mesos/mesos-slave.INFO:I1001 06:06:06.544926  1207 slave.cpp:1153]
Queuing task 'DTdJgUhTEeSnPCIACwSpXg' for executor default of framework
'20140927-080358-3165886730-5050-928-0069
/var/log/mesos/mesos-slave.INFO:I1001 06:06:08.644618  1205 slave.cpp:1783]
Flushing queued task DTdJgUhTEeSnPCIACwSpXg for executor 'default' of
framework 20140927-080358-3165886730-5050-928-0069
/var/log/mesos/mesos-slave.INFO:I1001 06:06:08.698570  1204 slave.cpp:2018]
Handling status update TASK_RUNNING (UUID:
0e17e133-c1d0-4a56-9cef-afb45a486045) for task DTdJgUhTEeSnPCIACwSpXg of
framework 20140927-080358-3165886730-5050-928-0069 from executor(1)@
10.51.165.231:42450
/var/log/mesos/mesos-slave.INFO:I1001 06:06:08.698796  1204
status_update_manager.cpp:320] Received status update TASK_RUNNING (UUID:
0e17e133-c1d0-4a56-9cef-afb45a486045) for task DTdJgUhTEeSnPCIACwSpXg of
framework 20140927-080358-3165886730-5050-928-0069
/var/log/mesos/mesos-slave.INFO:I1001 06:06:08.698974  1204
status_update_manager.cpp:373] Forwarding status update TASK_RUNNING (UUID:
0e17e133-c1d0-4a56-9cef-afb45a486045) for task DTdJgUhTEeSnPCIACwSpXg of
framework 20140927-080358-3165886730-5050-928-0069 to
master@10.153.179.188:5050
/var/log/mesos/mesos-slave.INFO:I1001 06:06:08.699262  1204 slave.cpp:2145]
Sending acknowledgement for status update TASK_RUNNING (UUID:
0e17e133-c1d0-4a56-9cef-afb45a486045) for task DTdJgUhTEeSnPCIACwSpXg of
framework 20140927-080358-3165886730-5050-928-0069 to executor(1)@
10.51.165.231:42450
/var/log/mesos/mesos-slave.INFO:I1001 06:06:08.711634  1208
status_update_manager.cpp:398] Received status update ac