[jira] [Commented] (MESOS-6200) Hope mesos support soft and hard cpu/memory resource in the task

2019-07-31 Thread Deshi Xiao (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-6200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897570#comment-16897570
 ] 

Deshi Xiao commented on MESOS-6200:
---

Thanks, got it.

Benjamin Mahler (JIRA) wrote on Wed, Jul 31, 2019 at 6:25 AM:



> Hope mesos support soft and hard cpu/memory resource in the task
> 
>
> Key: MESOS-6200
> URL: https://issues.apache.org/jira/browse/MESOS-6200
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization, docker, scheduler api
>Affects Versions: 0.28.2
> Environment: CentOS 7 
> Kernel 3.10.0-327.28.3.el7.x86_64
> Mesos 0.28.2
> Docker 1.11.2
>Reporter: Lei Xu
>Priority: Major
>  Labels: resource-management
>
> The Docker executor could support soft and hard resource limits to enable 
> more flexible resource sharing among applications.
> ||  || CPU || Memory ||
> | hard limit| --cpu-period & --cpu-quota | --memory & --memory-swap|
> | soft limit| --cpu-shares | --memory-reservation|
> Currently the task protobuf message has only one resource struct, which is 
> used to describe the cgroup limit. The Docker executor handles it as 
> follows, setting only --memory and --cpu-shares:
> {code}
>   if (resources.isSome()) {
>     // TODO(yifan): Support other resources (e.g. disk).
>     Option<double> cpus = resources.get().cpus();
>     if (cpus.isSome()) {
>       uint64_t cpuShare =
>         std::max((uint64_t) (CPU_SHARES_PER_CPU * cpus.get()), MIN_CPU_SHARES);
>       argv.push_back("--cpu-shares");
>       argv.push_back(stringify(cpuShare));
>     }
>     Option<Bytes> mem = resources.get().mem();
>     if (mem.isSome()) {
>       Bytes memLimit = std::max(mem.get(), MIN_MEMORY);
>       argv.push_back("--memory");
>       argv.push_back(stringify(memLimit.bytes()));
>     }
>   }
> {code}
> I hope that the executor and the protobuf message could split the resources 
> into two parts, soft and hard, so that the user could set two levels of 
> resource limits for Docker containers.
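
As a rough illustration of the request, the soft/hard split in the table could map onto Docker flags roughly as follows. This is only a sketch, not the Docker executor's code: `ResourceLimits`, `dockerLimitArgs`, the zero-means-unset convention, and all constant values are assumptions made for the example. Only the flag names come from the issue.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Names follow the snippet in the issue; the values are assumptions.
constexpr uint64_t CPU_SHARES_PER_CPU = 1024;
constexpr uint64_t MIN_CPU_SHARES = 2;
constexpr uint64_t CPU_PERIOD_US = 100000;  // Assumed 100 ms CFS period.
constexpr uint64_t MIN_MEMORY_BYTES = 32ull * 1024 * 1024;  // Assumed floor.

// Hypothetical soft/hard split of a task's resources.
// A value of 0 means "not set" in this sketch.
struct ResourceLimits {
  double softCpus = 0.0;
  double hardCpus = 0.0;
  uint64_t softMemBytes = 0;
  uint64_t hardMemBytes = 0;
};

// Builds the `docker run` limit flags for whichever limits are set.
std::vector<std::string> dockerLimitArgs(const ResourceLimits& limits) {
  assert(limits.softCpus >= 0.0 && limits.hardCpus >= 0.0);

  std::vector<std::string> argv;

  // Soft CPU limit: a relative weight, only enforced under contention.
  if (limits.softCpus > 0.0) {
    uint64_t shares = std::max(
        static_cast<uint64_t>(CPU_SHARES_PER_CPU * limits.softCpus),
        MIN_CPU_SHARES);
    argv.push_back("--cpu-shares");
    argv.push_back(std::to_string(shares));
  }

  // Hard CPU limit: a quota of CPU time per scheduling period.
  if (limits.hardCpus > 0.0) {
    uint64_t quota = static_cast<uint64_t>(CPU_PERIOD_US * limits.hardCpus);
    argv.push_back("--cpu-period");
    argv.push_back(std::to_string(CPU_PERIOD_US));
    argv.push_back("--cpu-quota");
    argv.push_back(std::to_string(quota));
  }

  // Soft memory limit.
  if (limits.softMemBytes > 0) {
    argv.push_back("--memory-reservation");
    argv.push_back(std::to_string(limits.softMemBytes));
  }

  // Hard memory limit, with the same kind of floor as the original snippet.
  if (limits.hardMemBytes > 0) {
    uint64_t memLimit = std::max(limits.hardMemBytes, MIN_MEMORY_BYTES);
    argv.push_back("--memory");
    argv.push_back(std::to_string(memLimit));
  }

  return argv;
}
```

With a soft limit of 0.5 CPUs and a hard limit of 2 CPUs, this sketch would emit `--cpu-shares 512 --cpu-period 100000 --cpu-quota 200000`. `--memory-swap` from the table is left out for brevity.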



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (MESOS-9919) Health check performance decreases on large machines

2019-07-31 Thread Greg Mann (JIRA)
Greg Mann created MESOS-9919:


 Summary: Health check performance decreases on large machines
 Key: MESOS-9919
 URL: https://issues.apache.org/jira/browse/MESOS-9919
 Project: Mesos
  Issue Type: Task
  Components: agent, containerization
Reporter: Greg Mann


In recent testing, it appears that the performance of Mesos command health 
checks decreases dramatically on nodes with large numbers of cores and lots of 
memory. This may be due to the changes in the cost of forking the agent process 
on such nodes. We need to investigate this issue to understand the root cause.





[jira] [Commented] (MESOS-8069) Role-related endpoints need to reflect hierarchical accounting.

2019-07-31 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897302#comment-16897302
 ] 

Benjamin Mahler commented on MESOS-8069:


This was done for the v0 /roles endpoint but still needs to be done for v1 
GET_ROLES.

> Role-related endpoints need to reflect hierarchical accounting.
> ---
>
> Key: MESOS-8069
> URL: https://issues.apache.org/jira/browse/MESOS-8069
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, HTTP API, master
>Reporter: Benjamin Mahler
>Assignee: Till Toenshoff
>Priority: Major
>  Labels: mesosphere, multitenancy, resource-management
> Attachments: Screen Shot 2018-03-06 at 15.06.04.png
>
>
> With the introduction of hierarchical roles, the role-related endpoints need 
> to be updated to provide aggregated accounting information.
> For example, information about how many resources are allocated to "/eng" 
> should include the resources allocated to "/eng/frontend" and "/eng/backend", 
> since quota guarantees and limits are also applied on the aggregation.
> This also affects the UI display, for example the 'Roles' tab.
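
The aggregation described above can be sketched as a roll-up of each role's direct allocation into all of its ancestors. This is an illustration only, not the Mesos master's implementation: `aggregateByRole` is a hypothetical helper, and a single `double` stands in for a full set of resources.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: given the resources allocated directly to each role,
// returns the totals that include every descendant role's allocation.
std::map<std::string, double> aggregateByRole(
    const std::map<std::string, double>& direct) {
  std::map<std::string, double> total;

  for (const auto& entry : direct) {
    assert(!entry.first.empty());

    // Charge the allocation to the role itself, then to each ancestor,
    // e.g. "/eng/frontend" -> "/eng".
    std::string role = entry.first;
    while (true) {
      total[role] += entry.second;

      std::string::size_type slash = role.rfind('/');
      if (slash == std::string::npos || slash == 0) {
        break;  // Reached a top-level role.
      }
      role.erase(slash);
    }
  }

  return total;
}
```

For direct allocations of 1 to "/eng", 2 to "/eng/frontend" and 4 to "/eng/backend", the sketch reports 7 for "/eng" and leaves the two child roles at 2 and 4.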





[jira] [Created] (MESOS-9918) Agent fails to scale many tasks/containers with command health checks

2019-07-31 Thread Greg Mann (JIRA)
Greg Mann created MESOS-9918:


 Summary: Agent fails to scale many tasks/containers with command 
health checks
 Key: MESOS-9918
 URL: https://issues.apache.org/jira/browse/MESOS-9918
 Project: Mesos
  Issue Type: Task
  Components: agent, containerization
Reporter: Greg Mann


When ~50 containers are launched simultaneously in a task group on an agent, 
all of which specify command health checks, they will fail to become healthy. 
The {{LAUNCH_NESTED_CONTAINER_SESSION}} calls for the health checks time out, 
leading to task group failure.

We should both investigate the cause of the timeouts (based on previous 
profiling efforts, it is likely due to the cost of forking from the agent 
process) and consider rate-limiting options to allow operators to 
simultaneously scale large numbers of containers.
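
One possible shape for such rate limiting is a token bucket placed in front of the launch calls. The sketch below is purely illustrative and is not an existing Mesos mechanism: `TokenBucket`, its parameters, and the choice to pass the current time in explicitly are all assumptions.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical token bucket: allows short bursts of up to `burst` launches,
// then admits launches at `ratePerSecond` on average.
class TokenBucket {
public:
  TokenBucket(double ratePerSecond, double burst)
    : rate_(ratePerSecond), burst_(burst), tokens_(burst), lastSeconds_(0.0) {
    assert(ratePerSecond > 0.0 && burst >= 1.0);
  }

  // Returns true if one launch may proceed at time `nowSeconds`.
  // The caller supplies the clock so the behavior is deterministic.
  bool tryAcquire(double nowSeconds) {
    // Refill in proportion to the elapsed time, capped at the burst size.
    double elapsed = std::max(0.0, nowSeconds - lastSeconds_);
    tokens_ = std::min(burst_, tokens_ + elapsed * rate_);
    lastSeconds_ = std::max(lastSeconds_, nowSeconds);

    if (tokens_ < 1.0) {
      return false;  // Caller should queue or retry the launch later.
    }

    tokens_ -= 1.0;
    return true;
  }

private:
  double rate_;
  double burst_;
  double tokens_;
  double lastSeconds_;
};
```

A caller would check `tryAcquire` before each nested container launch and defer the launch when it returns false, which spreads forks out over time instead of issuing them all at once.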





[jira] [Assigned] (MESOS-9427) Revisit quota documentation.

2019-07-31 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler reassigned MESOS-9427:
--

Assignee: Benjamin Mahler

> Revisit quota documentation.
> 
>
> Key: MESOS-9427
> URL: https://issues.apache.org/jira/browse/MESOS-9427
> Project: Mesos
>  Issue Type: Documentation
>  Components: allocation, documentation
>Reporter: Benjamin Mahler
>Assignee: Benjamin Mahler
>Priority: Major
>  Labels: multitenancy, resource-management
>
> At this point the quota documentation in the docs/ folder has become rather 
> stale. It would be good to at least update any inaccuracies and ideally 
> re-write it to better reflect the current thinking.





[jira] [Commented] (MESOS-9758) Take ports out of the roles endpoints.

2019-07-31 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897300#comment-16897300
 ] 

Benjamin Mahler commented on MESOS-9758:


v0 /roles no longer has ports, but v1 GET_ROLES still has them.

> Take ports out of the roles endpoints.
> --
>
> Key: MESOS-9758
> URL: https://issues.apache.org/jira/browse/MESOS-9758
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Meng Zhu
>Priority: Major
>  Labels: resource-management
>
> It does not make sense to combine ports across agents.





[jira] [Assigned] (MESOS-9845) Add docs for automatic agent draining

2019-07-31 Thread Greg Mann (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-9845:


Assignee: Greg Mann

> Add docs for automatic agent draining
> -
>
> Key: MESOS-9845
> URL: https://issues.apache.org/jira/browse/MESOS-9845
> Project: Mesos
>  Issue Type: Task
>  Components: documentation
>Reporter: Greg Mann
>Assignee: Greg Mann
>Priority: Major
>  Labels: foundations, mesosphere
>
> Will probably require:
> * A separate page describing the feature (in lieu of or superseding the 
> maintenance doc)
> * Updates to the API docs, for master and agent APIs. Any GET_STATE or 
> similar call changes will also be included.


