I have a simple setup where a framework runs with a role, and some 
resources are reserved in the cluster for that role.
Resource offers arrive at the framework as a list of two resource 
sets: one general (cpus(*), etc.) and one specific to the role 
(cpus("role1"), etc.).

So far so good. If two tasks are launched, each using one of the two 
resource sets, things work.

But problems start when I need to launch multiple smaller tasks (with a 
total resource consumption equal to what was offered). I do this by creating 
resource objects and attaching them to tasks, using calls from the 
standard Mesos samples (Python):
                    task = mesos_pb2.TaskInfo()
                    cpus = task.resources.add()
                    cpus.name = "cpus"
                    cpus.scalar.value = TASK_CPUS

checking that the total doesn't surpass the offered resources. This starts 
fine, but soon I get TASK_ERROR messages, because the master's validator finds 
that the tasks request more resources than are available in the offer. 
This obviously happens because all task resources, as defined above, come 
with the "*" role, while the offer resources are split between "*" and "role1"! 
OK, then I assign a role to the task resources by adding
                   cpus.role = "role1"

But this fails again, and for the same reason.
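
To make the failure concrete, here is a minimal sketch of the per-role accounting that the validator appears to apply. It is plain Python with no Mesos dependency; the helper names (total_by_role, fits_offer) and the numbers (2 cpus per role, four 1-cpu tasks) are made up for illustration and are not the master's actual code.

```python
# Hypothetical stand-in for a per-role check; NOT the Mesos master's real validator.

def total_by_role(tasks):
    """Sum the requested cpus of all tasks, separately for each role."""
    totals = {}
    for task in tasks:
        for role, cpus in task.items():
            totals[role] = totals.get(role, 0.0) + cpus
    return totals


def fits_offer(offered, requested):
    """True only if, for every role, the request does not exceed the offer."""
    return all(cpus <= offered.get(role, 0.0)
               for role, cpus in requested.items())


# Assumed offer: 2 unreserved cpus plus 2 cpus reserved for "role1".
offer = {"*": 2.0, "role1": 2.0}

# Four 1-cpu tasks, labelled three different ways.
all_star = [{"*": 1.0}] * 4            # no role set on the task resources
all_role1 = [{"role1": 1.0}] * 4       # cpus.role = "role1" on every task
split = [{"*": 1.0}, {"*": 1.0}, {"role1": 1.0}, {"role1": 1.0}]

print(fits_offer(offer, total_by_role(all_star)))   # False: 4 > 2 for "*"
print(fits_offer(offer, total_by_role(all_role1)))  # False: 4 > 2 for "role1"
print(fits_offer(offer, total_by_role(split)))      # True: 2 <= 2 for each role
```

Under this accounting, the same four tasks pass only when their resources are labelled so that each role's share of the offer is respected, even though the total is 4 cpus in all three cases.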

Shouldn't this work differently? When a resource offer is received by a 
framework with role "role1", why should it care which part is unreserved 
and which part is reserved for "role1"? When a task launch request is 
received by the master from a framework with a role, why can't it check 
only the total resource amount, instead of treating unreserved and 
reserved resources separately? They are reserved for this role anyway. Or 
am I missing something?


Regards, 
Gidon


