@Tom, one more question: what is your task run time? If the task run time
is too short, e.g. 100ms, the resources will be returned to the allocator when
the task finishes and will not be re-allocated until the next allocation cycle.
Klaus Ma (马达) | PMP® | Advisory Software Engineer
Platform OpenSource
Hi Tom,
I saw that the two frameworks with roles are consuming most of the
resources, so I think you could run more tests after removing those two
frameworks.
Another thing I want to mention is that the DRF allocator may have some
issues when there are plenty of frameworks, and the community is
Currently, changing any --attributes or --resources requires draining the
agent and killing all running tasks.
See https://issues.apache.org/jira/browse/MESOS-1739
You could do a `mesos-slave --recovery=cleanup` which essentially kills all
the tasks and clears the work_dir; then restart with a
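A rough sketch of that drain-and-restart cycle, using the flag spelling from this thread (the service name, work_dir path, master address, and attribute value below are placeholders, not taken from the thread):

```shell
# Stop the agent (however it is supervised on your hosts).
sudo systemctl stop mesos-slave

# One-off cleanup recovery: kills all tasks and clears checkpointed
# state so the agent will accept a changed --attributes/--resources set.
mesos-slave --recovery=cleanup --work_dir=/var/lib/mesos

# Restart the agent with the new attributes.
mesos-slave --master=zk://zk1:2181/mesos \
            --work_dir=/var/lib/mesos \
            --attributes="rack:r1"
```

Note this is exactly the destructive path MESOS-1739 aims to avoid: every task on the agent is killed.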
Hi all,
The vote for Mesos 0.27.1 (rc1) has passed with the
following votes.
+1 (Binding)
--
Bernd Mathiske
Joris Van Remoortere
Vinod Kone
+1 (Non-binding)
--
Zhitao Li
Jörg Schad
There were no 0 or -1 votes.
Please find the release at:
IIRC you can avoid the issue by either using a different work_dir for the
agent, or removing (and, possibly, re-creating) it.
I'm afraid I don't have a running instance of Mesos on this machine and
can't test it out.
Also (and this is strictly my opinion :) I would consider a change of
attribute
Thanks for the responses. Filed a ticket for this:
- https://issues.apache.org/jira/browse/MESOS-4737
- Erik
On Mon, Feb 22, 2016 at 1:23 PM, Sargun Dhillon wrote:
> As someone who has been there and back again (Reusing task-IDs, and
> realizing it's a terrible idea),
As someone who has been there and back again (Reusing task-IDs, and
realizing it's a terrible idea), I'd put some advice in the docs +
mesos.proto to compose task IDs from GUIDs, and note that it's
dangerous to reuse them.
I would advocate for a mechanism to prevent the usage of non-unique
IDs for
For now, I would vote for updating the comments in mesos.proto to warn
users not to reuse task IDs.
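The GUID-based scheme suggested above can be sketched framework-side like this (a minimal illustration, not Mesos API code; `make_task_id` and the "webapp" prefix are made-up names):

```python
import uuid

def make_task_id(app_name: str) -> str:
    """Compose a task ID from a readable prefix plus a GUID, so a retry
    of the same logical task can never collide with an earlier task ID."""
    return f"{app_name}.{uuid.uuid4()}"

# Two launches of the same logical task get distinct IDs.
first = make_task_id("webapp")
second = make_task_id("webapp")
assert first != second
assert first.startswith("webapp.")
```

The resulting string would then be placed in the task's `TaskID` when launching it.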
On Sun, Feb 21, 2016 at 9:05 PM, Klaus Ma wrote:
> Yes, it's dangerous to reuse TaskIDs; there's a JIRA (MESOS-3070) that
> the Master will crash on failover with duplicated
Zhitao,
In my experience the best way to manage these attributes is to keep
attribute changes minimal (i.e. one attribute at a time) and roll them
out slowly across the cluster. This way you can catch unsafe mutations
quickly and roll back if needed.
I don't think there is a whitelist/blacklist
Hi,
We recently discovered that updating attributes on Mesos agents is a very
risky operation, and has the potential to send agent(s) into a crash loop if
not done properly, with errors like "Failed to perform recovery:
Incompatible slave info detected". This combined with --recovery_timeout made the
Hey guys,
After some extra thought, we came to what we think is a nice interface for the
Mesos authorizer [1] which will allow users of Mesos to plug in their custom
backends in a nice way. Please share your thoughts with us in case we missed
something or there are improvements we can make to
Hi Guangya,
Most of the agents do not have a role, so they use the default wildcard role
for resources. Also none of the frameworks have a role, therefore they fall
into the wildcard role too.
Frameworks are being offered resources up to a certain level of fairness but no
further. The issue
If none of the frameworks has a role, then no framework can consume reserved
resources, so I think that at least the frameworks
20160219-164457-67375276-5050-28802-0014 and
20160219-164457-67375276-5050-28802-0015
should have a role.
Can you please show some details for the following:
1) Master start
Ah yes, sorry, my mistake: there are a couple of agents with a dev role, and only
one or two frameworks connect to the cluster with that role, but not very
often. Whether they're connected or not doesn't seem to cause any change in
allocation behaviour.
No other agents have roles.
Hi Tom,
I think that your cluster must have some role/weight configuration,
because I can see that at least two agents have the role "dev"
configured.
I0219 18:08:26.284010 28810 hierarchical.hpp:1025] Filtered
ports(dev):[3000-5000]; cpus(dev):10; mem(dev):63488; disk(dev):153600
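For reference, role-scoped resources like those in the log line above come from the agent's --resources flag; a hypothetical agent invocation that would yield exactly that dev-role set (master address and work_dir are placeholders):

```shell
mesos-slave --master=zk://zk1:2181/mesos \
            --work_dir=/var/lib/mesos \
            --resources="cpus(dev):10;mem(dev):63488;disk(dev):153600;ports(dev):[3000-5000]"
```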
If this is of any use to anyone: There is also an outstanding branch of Docker
which has checkpoint/restore functionality in it (based on CRIU I believe)
which is hopefully being merged into experimental soon.
From: Sharma Podila [spod...@netflix.com]
Sent: 19
Would you be able to elaborate a bit more on how you did this?
From: Mauricio Garavaglia [mauri...@medallia.com]
Sent: 19 February 2016 19:20
To: user@mesos.apache.org
Subject: Re: AW: Feature request: move in-flight containers w/o stopping them
Mesos is not only