8:10, Guangya Liu <gyliu...@gmail.com> wrote:
>>>>
>>>> Hi Tom,
>>>>
>>>> I traced the agent "20160112-165226-67375276-5050-22401-S199" and
>>>> found that it keeps being declined by many frameworks: once a framework got
>>> 1) … the weight for each role?
>>> 2) Do you start all agents without any reservation?
>>>
>>> Thanks,
>>>
>>> Guangya
>>>
>>> On Sun, Feb 21, 2016 at 9:23 AM, Klaus Ma <klaus1982...@gmail.com
>>> <mailto:klaus1982...@gmail.com>> wrote:
>>> Hi Tom,
>>>
>>> What's the allocation interval? Is there any way to reduce the
>>> framework's filter timeout?
>>>
>>> According to the log, there are ~12 frameworks on a cluster with ~42
>>> agents; the filter duration is 5 sec, and there are ~60 filter events
>>> in each second (e.g. 65 in 18:08:34). For example, framework
>>> (20160219-164457-67375276-5050-28802-0015) just got resources from 6
>>> agents and filtered the other 36 agents at 18:08:35 (egrep
>>> "Alloca|Filtered" mesos-master.log | grep
>>> "20160219-164457-67375276-5050-28802-0015" | grep "18:08:35").
>>>
>>> Thanks,
>>> Klaus
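The grep pipeline above can also be done in a few lines of Python; this is a hedged sketch assuming the standard glog line format of the master log, where the second whitespace-separated field is the timestamp (`filtered_per_second` is a hypothetical helper name):

```python
# Tally how many "Filtered" log lines a given framework produced per second.
# Assumes glog-style lines, e.g.:
#   I0219 18:08:35.123456 28802 hierarchical.cpp:... Filtered ... framework ...
from collections import Counter

def filtered_per_second(lines, framework_id):
    counts = Counter()
    for line in lines:
        if "Filtered" in line and framework_id in line:
            fields = line.split()
            if len(fields) > 1:
                # Keep whole seconds only (drop the microsecond suffix).
                counts[fields[1].split(".")[0]] += 1
    return counts
```

Feeding it `open("mesos-master.log")` gives the per-second counts behind a figure like the 65-in-one-second mentioned above.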
From: t...@duedil.com
Subject: Re: Mesos sometimes not allocating the entire cluster
Hi Ben,
I've only just seen your email! Really appreciate the reply, that's
certainly an interesting bug and we'll try that patch and see how we get on.
Cheers,
Tom.
On 29 January 2016 at 19:54, Benjamin Mahler wrote:
Hi Tom,
I suspect you may be tripping the following issue:
https://issues.apache.org/jira/browse/MESOS-4302
Please have a read through this and see if it applies here. You may also be
able to apply the fix to your cluster to see if that helps things.
Ben
On Wed, Jan 20, 2016 at 10:19 AM, Tom
Can you share the whole log of the master? It'll be helpful :).
Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
Platform OpenSource Technology, STG, IBM GCG
+86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
On Thu, Jan 21, 2016 at 11:57 PM, Tom Arnfeld wrote:
>
I can’t send the entire log as there’s a lot of activity on the cluster all
the time; is there anything in particular you’re looking for?
> On 22 Jan 2016, at 12:46, Klaus Ma wrote:
>
> Can you share the whole log of the master? It'll be helpful :).
>
>
> Da (Klaus), Ma
Yes, it seems the Hadoop framework did not consume all offered resources: if
a framework launches a task (1 CPU) on an offer (10 CPUs), the other 9 CPUs
will be returned to the master (recoverResources).
Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
Platform OpenSource Technology, STG, IBM GCG
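The accounting described above can be sketched as a toy illustration (this is not Mesos code; `split_offer` is a hypothetical helper showing only the arithmetic):

```python
# Toy model of the behaviour described above: a framework launches a task
# using part of an offer, and the unused remainder is what the master
# recovers via recoverResources.
def split_offer(offer, task):
    unused = {r: offer[r] - task.get(r, 0) for r in offer}
    return task, unused

# Launching a 1-CPU task against a 10-CPU offer leaves 9 CPUs to recover.
task, recovered = split_offer({"cpus": 10, "mem": 4096},
                              {"cpus": 1, "mem": 512})
```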
Thanks everyone!
Stephan - There are a couple of useful points there, will definitely give it
a read.
Klaus - Thanks, we're running a bunch of different frameworks; in that list
there's Hadoop MRv1, Apache Spark, Marathon and a couple of home-grown
frameworks we have. In this particular case the
Guangya - Nope, there are no outstanding offers for any frameworks; the ones
that are getting offers are responding properly.
Klaus - This was just a sample of logs for a single agent, the cluster has
at least ~40 agents at any one time.
On 21 January 2016 at 15:20, Guangya Liu
Can you please check whether there are any outstanding offers in the cluster
that have not been accepted by any framework? You can check this via the
/master/state.json endpoint.
If there are outstanding offers, you can start the master with the
offer_timeout flag to let the master rescind offers that are not accepted
within the timeout.
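A minimal sketch of that check, assuming a master reachable at localhost:5050 and the `frameworks` / `offers` fields exposed by the state endpoint (`outstanding_offers` is a hypothetical helper):

```python
# Count outstanding offers per framework from /master/state.json.
import json
from urllib.request import urlopen

def outstanding_offers(state):
    return {fw.get("name", fw.get("id", "?")): len(fw.get("offers", []))
            for fw in state.get("frameworks", [])}

# Against a live master (hypothetical address):
# state = json.load(urlopen("http://localhost:5050/master/state.json"))
# print(outstanding_offers(state))
```

A framework with a persistently non-zero count here is sitting on offers it never accepts or declines.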
Do you mean that only one slave is offered to some framework while the
others are starving?
The Mesos allocator (DRF) offers resources by host; so if there's only one
host, the other frameworks cannot get resources. We have several JIRAs open
on how to balance resources between frameworks.
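The DRF ordering mentioned above can be sketched roughly like this (a simplification; the real Mesos allocator sorts roles and frameworks hierarchically, and these function names are illustrative):

```python
# Rough sketch of DRF: a framework's dominant share is the largest
# fraction of any single resource it has been allocated; the allocator
# offers resources to the framework with the lowest dominant share first.
def dominant_share(allocated, total):
    return max(allocated.get(r, 0) / total[r] for r in total)

def drf_order(frameworks, total):
    return sorted(frameworks,
                  key=lambda fw: dominant_share(fw["allocated"], total))
```

A framework that has already been allocated a large slice of any one resource sorts last, which is why a busy framework can stop seeing offers for a while.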
Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
Platform OpenSource Technology, STG, IBM GCG
From: Tom Arnfeld <t...@duedil.com>
Sent: Wednesday, January 20, 2016 7:19 PM
To: user@mesos.apache.org
Subject: Mesos sometimes not allocating the entire cluster
Hey,
I've noticed some interesting behaviour recently when we have lots of
different frameworks connected to our Mesos cluster at once, all using a
variety of different shares. Some of the frameworks don't get offered more
resources (for long periods of time, hours even) leaving the cluster under
Hi Tom,
Which framework are you using, e.g. Swarm, Marathon or something else? And
which language package are you using?
DRF will sort roles/frameworks by allocation ratio and offer all "available"
resources by slave; but if the resources are too small (< 0.1 CPU) or the
resources were