That's a great point, Itamar, and something we discussed quite some time ago 
here but never implemented. These are the first few options that spring to 
mind...




- Are you using docker containers for your tasks? Why not use containers 
pre-configured on the box for these services too?

- Build some custom init scripts for your services (perhaps systemd and the 
like can do this for you) that will drop your PIDs into cgroups after they 
launch, which would allow you to reserve those resources you need using the 
same resource system as the popular container tools.

- Do you need to actually reserve these resources? Perhaps if you're only 
concerned about memory or CPU, you could just advertise your slaves as having 
less than the machine actually has (using the --resources flag to mesos-slave).
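To make the second option (dropping PIDs into cgroups after launch) a little more concrete, here is a minimal Python sketch of what such an init hook might do. It assumes a cgroup filesystem mounted at /sys/fs/cgroup/cpu; the helper name `add_pid_to_cgroup` and the group name are hypothetical, and it would need to run as root on a real machine.

```python
from pathlib import Path


def add_pid_to_cgroup(pid: int, group: str, root: str = "/sys/fs/cgroup/cpu") -> Path:
    """Create the cgroup directory if needed and move process `pid` into it.

    `root` is an assumed mount point for the cgroup hierarchy; adjust it to
    match how cgroups are mounted on your machines.
    """
    group_dir = Path(root) / group
    # On a real cgroup filesystem, creating the directory creates the cgroup.
    group_dir.mkdir(parents=True, exist_ok=True)
    procs_file = group_dir / "cgroup.procs"
    # Writing a PID to cgroup.procs moves that process into the cgroup.
    with open(procs_file, "w") as f:
        f.write(f"{pid}\n")
    return procs_file
```

An init script could call something like `add_pid_to_cgroup(service_pid, "aux/logging")` right after starting the service, and then set limits on that group with whatever tooling you already use.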




With any of these three approaches you are still going to need to modify the 
--resources flag on each slave to ensure fewer resources than are actually 
available are advertised to the cluster.
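As a rough illustration of that last step, the sketch below computes a --resources value that holds back some headroom for the system-level services. The helper name and the default headroom (1 CPU, 1024 MB) are assumptions for the example, not recommended values; only the "cpus:N;mem:N" format comes from mesos-slave itself.

```python
def advertised_resources(total_cpus, total_mem_mb, reserve_cpus=1, reserve_mem_mb=1024):
    """Build a --resources value that advertises less than the machine has.

    The reserve_* defaults are illustrative assumptions; pick numbers that
    match what your auxiliary services actually use.
    """
    cpus = total_cpus - reserve_cpus
    mem = total_mem_mb - reserve_mem_mb
    if cpus <= 0 or mem <= 0:
        raise ValueError("reservation leaves nothing to advertise")
    return f"cpus:{cpus};mem:{mem}"


# A 16-CPU machine with 64 GB of RAM, keeping 1 CPU and 1 GB for the host.
print("--resources=" + advertised_resources(16, 65536))
```

Whatever provisions the slave (Chef, a pre-configured image, etc.) could generate this value per machine type rather than hard-coding it.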




Maybe those options are of some use. If you do end up implementing something in 
this area for setting aside resources for these auxiliary services, I'd love 
to know how you end up doing it!




--


Tom Arnfeld

Developer // DueDil






On Thursday, Jan 8, 2015 at 7:32 am, Itamar Ostricher <ita...@yowza3d.com> 
wrote:

Thanks everybody for all your insights!


I totally agree with the last response from Tom.

The per-node services definitely belong to the level that provisions the 
machine and the mesos-slave service itself (in our case, pre-configured GCE 
images).




So I guess the problem I wanted to solve is more general - how can I make sure 
there are resources reserved for all of the system-level stuff that is running 
outside of the mesos context?

To be more specific, if I have a machine with 16 CPUs, it is common that my 
framework will schedule 16 heavy number-crunching processes on it.

This can starve anything else that's running on the machine... (like the 
logging aggregation service, and the mesos-slave service itself)

(this probably explains the lost tasks we've been observing)

What's the best-practice solution for this situation?





On Wed, Jan 7, 2015 at 2:09 AM, Tom Arnfeld <t...@duedil.com> wrote:

I completely agree with Charles, though I think I can appreciate what you're 
trying to do here. Take the log aggregation service as an example: you want 
that on every slave to aggregate logs, but want to avoid using yet another 
layer of configuration management to deploy it.




I'm of the opinion that these kinds of auxiliary services which all work 
together (the mesos-slave process included) to define what we mean by a "slave" 
are the responsibility of whoever/whatever is provisioning the mesos-slave 
process and possibly even the machine itself. In our case, that's Chef. IMO 
once a slave registers with the mesos cluster it's immediately ready to start 
doing work, and mesos will actually start offering that slave immediately.




If you continue down this path you're also going to run into a variety of 
interesting timing issues when these services fail, or when you want to upgrade 
them. I'd suggest running these aux services under a more advanced process 
monitor, like M/Monit, instead of through mesos (via Marathon).




Think of it another way: would you want something running through mesos to 
install apt package updates once a day? That'd be super weird, so why would log 
aggregation be any different?


--


Tom Arnfeld

Developer // DueDil







On Tue, Jan 6, 2015 at 11:57 PM, Charles Baker <cnob...@gmail.com> wrote:



It seems like an 'anti-pattern' (for lack of a better term) to attempt to force 
locality on a bunch of dependency services launched through Marathon. I thought 
the whole idea of Mesos (and Marathon) was to treat the data center as one 
giant computer in which it fundamentally should not matter where your services 
are launched. I obviously don't know the details of the use-case and may be 
grossly misunderstanding what you are trying to do, but to me it sounds 
like you are attempting to shoehorn a non-distributed application into a 
distributed architecture. If this is the case, you may want to revisit your 
implementation and try to decouple the application's requirement of node-level 
dependency locality. It is also a good opportunity to possibly redesign a 
monolithic application into a distributed one.



On Tue, Jan 6, 2015 at 12:53 PM, David Greenberg <dsg123456...@gmail.com> wrote:

Tom is absolutely correct--you also need to ensure that your "special tasks" 
run as a user which is assigned a role w/ a special reservation to ensure they 
can always launch.



On Tue, Jan 6, 2015 at 2:38 PM, Tom Arnfeld <t...@duedil.com> wrote:

I'm not sure if I'm fully aware of the use case but if you use a different 
framework (aka Marathon) to launch these services, should the service die and 
need to be re-launched (or even the slave restarts) could you not be in a 
position where another framework has consumed all resources on that slave and 
your "core" tasks cannot launch?


Maybe if you're just using Marathon it might provide a sort of priority to 
decide who gets what resources first, but with multiple frameworks you might 
need to look into the slave resource reservations and framework roles.
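For the reservations-and-roles route, mesos-slave's --resources flag accepts a role in parentheses after the resource name, e.g. "cpus(role):1". Below is a minimal sketch of building such a value; the role name "aux" and the helper `split_resources` are hypothetical, and the framework launching the core services would also have to register with that role.

```python
def split_resources(total, reserved, role):
    """Format a --resources value that reserves part of each resource for `role`.

    `total` and `reserved` map resource names (e.g. "cpus", "mem") to amounts.
    Anything not reserved is left unreserved and offered to every framework.
    """
    parts = []
    for name, amount in total.items():
        held = reserved.get(name, 0)
        if held > amount:
            raise ValueError(f"cannot reserve {held} of {amount} {name}")
        if held:
            parts.append(f"{name}({role}):{held}")
        if amount - held:
            parts.append(f"{name}:{amount - held}")
    return ";".join(parts)


# Hypothetical role "aux" holding 1 CPU and 1 GB on a 16-CPU, 64 GB slave.
print(split_resources({"cpus": 16, "mem": 65536}, {"cpus": 1, "mem": 1024}, "aux"))
```

With a value like this, other frameworks should only ever be offered the unreserved portion, so the "core" tasks would still have somewhere to launch after a restart.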




FWIW We're configuring these things out of band (via Chef to be specific).




Hope this helps!


--


Tom Arnfeld

Developer // DueDil













On Tue, Jan 6, 2015 at 9:05 AM, Itamar Ostricher <ita...@yowza3d.com> wrote:

Hi,


I was wondering if the best approach to do what I want is to use mesos itself, 
or other Linux system tools.




There are a bunch of services that our framework assumes are running on all 
participating slaves (e.g. logging service, data-bridge service, etc.).

One approach to do that is at the infrastructure level, making sure that slave 
nodes are configured correctly (e.g. with pre-configured images, or other 
provisioning systems).

Another approach would be to use mesos itself (maybe with something like 
Marathon) to schedule these services on all slave nodes.




The advantage of the mesos-based approach is that it becomes trivial to account 
for the resource consumption of said services (e.g. make sure there's always at 
least 1 CPU dedicated to this).

I'm not sure how to achieve something similar with the system-approach.




Does anyone have any insights on this?
