[jira] [Created] (MESOS-10067) Update the `update()` method of cgroups subsystem interface to handle container resource limits

2019-12-05 Thread Qian Zhang (Jira)
Qian Zhang created MESOS-10067:
--

 Summary: Update the `update()` method of cgroups subsystem 
interface to handle container resource limits
 Key: MESOS-10067
 URL: https://issues.apache.org/jira/browse/MESOS-10067
 Project: Mesos
  Issue Type: Task
  Components: containerization
Reporter: Qian Zhang
Assignee: Qian Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (MESOS-10047) Update the CPU subsystem in the cgroup isolator to set container's CPU resource limits

2019-12-05 Thread Qian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/MESOS-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Zhang reassigned MESOS-10047:
--

Assignee: Qian Zhang

> Update the CPU subsystem in the cgroup isolator to set container's CPU 
> resource limits
> --
>
> Key: MESOS-10047
> URL: https://issues.apache.org/jira/browse/MESOS-10047
> Project: Mesos
>  Issue Type: Task
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Major
>






[jira] [Created] (MESOS-10066) mesos-docker-executor process dies when agent stops. Recovery fails when agent returns

2019-12-05 Thread Dalton Matos Coelho Barreto (Jira)
Dalton Matos Coelho Barreto created MESOS-10066:
---

 Summary: mesos-docker-executor process dies when agent stops. 
Recovery fails when agent returns
 Key: MESOS-10066
 URL: https://issues.apache.org/jira/browse/MESOS-10066
 Project: Mesos
  Issue Type: Bug
  Components: agent, containerization, docker, executor
Affects Versions: 1.7.3
Reporter: Dalton Matos Coelho Barreto


Hello all,

The documentation about Agent Recovery shows two conditions for the recovery to 
be possible:
 - The agent must have recovery enabled (default true?);
 - The scheduler must register itself saying that it has checkpointing enabled.

In my tests I'm using Marathon as the scheduler, and Mesos itself sees Marathon 
as a checkpoint-enabled scheduler:
{noformat}
$ curl -sL 10.234.172.27:5050/state | jq '.frameworks[] | {"name": .name, 
"id": .id, "checkpoint": .checkpoint, "active": .active}'
{
  "name": "asgard-chronos",
  "id": "4783cf15-4fb1-4c75-90fe-44eeec5258a7-0001",
  "checkpoint": true,
  "active": true
}
{
  "name": "marathon",
  "id": "4783cf15-4fb1-4c75-90fe-44eeec5258a7-",
  "checkpoint": true,
  "active": true
}
{noformat}
Here is what I'm using:
 # Mesos Master, 1.4.1
 # Mesos Agent 1.7.3
 # Using docker image {{mesos/mesos-centos:1.7.x}}
 # Docker sock mounted from the host
 # Docker binary also mounted from the host
 # Marathon: 1.4.12
 # Docker
{noformat}
Client: Docker Engine - Community
 Version:           19.03.5
 API version:       1.39 (downgraded from 1.40)
 Go version:        go1.12.12
 Git commit:        633a0ea838
 Built:             Wed Nov 13 07:22:05 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.2
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.6
  Git commit:       6247962
  Built:            Sun Feb 10 03:42:13 2019
  OS/Arch:          linux/amd64
  Experimental:     false
{noformat}

h2. The problem

Here is the Marathon test app, a simple {{sleep 99d}} based on the {{debian}} 
Docker image.
{noformat}
{
  "id": "/sleep",
  "cmd": "sleep 99d",
  "cpus": 0.1,
  "mem": 128,
  "disk": 0,
  "instances": 1,
  "constraints": [],
  "acceptedResourceRoles": [
    "*"
  ],
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "debian",
      "network": "HOST",
      "privileged": false,
      "parameters": [],
      "forcePullImage": true
    }
  },
  "labels": {},
  "portDefinitions": []
}
{noformat}
This task runs fine and gets scheduled on the right agent, which is running 
Mesos agent 1.7.3 (using the Docker image {{mesos/mesos-centos:1.7.x}}).

Here is a sample log:
{noformat}
mesos-slave_1  | I1205 13:24:21.391464 19849 slave.cpp:2403] Authorizing 
task 'sleep.8c187c41-1762-11ea-a2e5-02429217540f' for framework 
4783cf15-4fb1-4c75-90fe-44eeec5258a7-
mesos-slave_1  | I1205 13:24:21.392707 19849 slave.cpp:2846] Launching task 
'sleep.8c187c41-1762-11ea-a2e5-02429217540f' for framework 
4783cf15-4fb1-4c75-90fe-44eeec5258a7-
mesos-slave_1  | I1205 13:24:21.392895 19849 paths.cpp:748] Creating 
sandbox 
'/var/lib/mesos/agent/slaves/79ad3a13-b567-4273-ac8c-30378d35a439-S60499/frameworks/4783cf15-4fb1-4c75-90fe-44eeec5258a7-/executors/sleep.8c187c41-1762-11ea-a2e5-02429217540f/runs/53ec0ef3-3290-476a-b2b6-385099e9b923'
mesos-slave_1  | I1205 13:24:21.394399 19849 paths.cpp:748] Creating 
sandbox 
'/var/lib/mesos/agent/meta/slaves/79ad3a13-b567-4273-ac8c-30378d35a439-S60499/frameworks/4783cf15-4fb1-4c75-90fe-44eeec5258a7-/executors/sleep.8c187c41-1762-11ea-a2e5-02429217540f/runs/53ec0ef3-3290-476a-b2b6-385099e9b923'
mesos-slave_1  | I1205 13:24:21.394918 19849 slave.cpp:9068] Launching 
executor 'sleep.8c187c41-1762-11ea-a2e5-02429217540f' of framework 
4783cf15-4fb1-4c75-90fe-44eeec5258a7- with resources 
[{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"},{"allocation_info":{"role":"*"},"name":"mem","scalar":{"value":32.0},"type":"SCALAR"}]
 in work directory 
'/var/lib/mesos/agent/slaves/79ad3a13-b567-4273-ac8c-30378d35a439-S60499/frameworks/4783cf15-4fb1-4c75-90fe-44eeec5258a7-/executors/sleep.8c187c41-1762-11ea-a2e5-02429217540f/runs/53ec0ef3-3290-476a-b2b6-385099e9b923'
mesos-slave_1  | I1205 13:24:21.396499 19849 slave.cpp:3078] Queued task 
'sleep.8c187c41-1762-11ea-a2e5-02429217540f' for executor 
'sleep.8c187c41-1762-11ea-a2e5-02429217540f' of framework 
4783cf15-4fb1-4c75-90fe-44eeec5258a7-
mesos-slave_1  | I1205 13:24:21.397038 19849 slave.cpp:3526] Launching 
container 53ec0ef3-3290-476a-b2b6-385099e9b92

[jira] [Commented] (MESOS-9968) WWWAuthenticate header parsing fails when commas are in (quoted) realm

2019-12-05 Thread Benno Evers (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988921#comment-16988921
 ] 

Benno Evers commented on MESOS-9968:


1.8.x:
{noformat}
commit 21ec06ed44c1fbd2272081b20bdcee630759f52d
Author: Benjamin Bannier 
Date:   Mon Sep 23 10:24:50 2019 +0200

Fixed parsing of HTTP authentication headers.

This patch adds support for quoted strings in `Www-Authenticate` headers
and allows to use spaces when delimiting authentication attributes of
the form `WWW-Autenticate: a=b, c=d`, both of with are allowed by
RFC2617.

Review: https://reviews.apache.org/r/71534

commit 32d6937bee6c2c43d769daa7b95b33856b9b8364
Author: Benjamin Bannier 
Date:   Mon Sep 23 10:23:27 2019 +0200

Cleaned up `HTTPTest.WWWAuthenticateHeader`.

This patch removes a number of error-prone temporaries previously reused
in the test.

Review: https://reviews.apache.org/r/71533
{noformat}

1.9.x:
{noformat}
commit 5fa73f683c38c025b0e650de24474e0fdf95d1f4
Author: Benjamin Bannier 
Date:   Mon Sep 23 10:24:50 2019 +0200

Fixed parsing of HTTP authentication headers.

This patch adds support for quoted strings in `Www-Authenticate` headers
and allows to use spaces when delimiting authentication attributes of
the form `WWW-Autenticate: a=b, c=d`, both of with are allowed by
RFC2617.

Review: https://reviews.apache.org/r/71534

commit 5f6d218a3123ec35b3a14ce20e72b5ca3594cef2
Author: Benjamin Bannier 
Date:   Mon Sep 23 10:23:27 2019 +0200

Cleaned up `HTTPTest.WWWAuthenticateHeader`.

This patch removes a number of error-prone temporaries previously reused
in the test.

Review: https://reviews.apache.org/r/71533
{noformat}

> WWWAuthenticate header parsing fails when commas are in (quoted) realm
> --
>
> Key: MESOS-9968
> URL: https://issues.apache.org/jira/browse/MESOS-9968
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, libprocess
>Reporter: Jan Schlicht
>Assignee: Benjamin Bannier
>Priority: Major
> Fix For: 1.10.0
>
>
> This was discovered when trying to launch the 
> {{[nvcr.io/nvidia/tensorflow:19.08-py3|http://nvcr.io/nvidia/tensorflow:19.08-py3]}}
>  image using the Mesos containerizer. This launch fails with
> {noformat}
> Failed to launch container: Failed to get WWW-Authenticate header: Unexpected 
> auth-param format: 
> 'realm="https://nvcr.io/proxy_auth?scope=repository:nvidia/tensorflow:pull' 
> in 
> 'realm="https://nvcr.io/proxy_auth?scope=repository:nvidia/tensorflow:pull,push";'
> {noformat}
> This is because the [header tokenization in 
> libprocess|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/http.cpp#L640]
>  can't handle commas in quoted realm values.
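
To illustrate the failure mode described above, here is a minimal sketch of quote-aware auth-param splitting. It is written in Python purely for illustration and is not the libprocess implementation; the function names `split_auth_params` and `parse_auth_params` are hypothetical, and the sample value in the usage note is simplified from the error message above. The idea is to treat a comma as a delimiter only when it appears outside a double-quoted string.

```python
def split_auth_params(value):
    """Split a header value on commas that are outside double quotes."""
    params = []
    current = []
    in_quotes = False
    escaped = False
    for ch in value:
        if escaped:
            # The character after a backslash inside quotes is kept literally.
            current.append(ch)
            escaped = False
        elif ch == "\\" and in_quotes:
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "," and not in_quotes:
            # Only an unquoted comma ends the current auth-param.
            piece = "".join(current).strip()
            if piece:
                params.append(piece)
            current = []
        else:
            current.append(ch)
    piece = "".join(current).strip()
    if piece:
        params.append(piece)
    return params


def parse_auth_params(value):
    """Map auth-param names to values, stripping surrounding quotes."""
    result = {}
    for param in split_auth_params(value):
        name, sep, raw = param.partition("=")
        if not sep:
            raise ValueError("Unexpected auth-param format: %r" % param)
        raw = raw.strip()
        if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
            raw = raw[1:-1]
        result[name.strip()] = raw
    return result
```

With this approach, a value such as `realm="https://nvcr.io/proxy_auth?scope=repository:nvidia/tensorflow:pull,push"` stays a single parameter, while `a=b, c=d` still splits into two. The sketch strips the surrounding quotes but does not unescape backslash sequences.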





[jira] [Commented] (MESOS-10062) Implement relative path computation for stout

2019-12-05 Thread Benjamin Bannier (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988766#comment-16988766
 ] 

Benjamin Bannier commented on MESOS-10062:
--

Reviews: 
https://reviews.apache.org/r/71878/
https://reviews.apache.org/r/71879/
https://reviews.apache.org/r/71880/
https://reviews.apache.org/r/71881/
https://reviews.apache.org/r/71882/

> Implement relative path computation for stout
> -
>
> Key: MESOS-10062
> URL: https://issues.apache.org/jira/browse/MESOS-10062
> Project: Mesos
>  Issue Type: Task
>Reporter: Benno Evers
>Assignee: Benjamin Bannier
>Priority: Major
>
> When using executor domain sockets, we might need to specify relative paths 
> in order to stay below the path length limit of 108 characters.
> To do so, we should implement a `path::relative_path()` function in stout 
> that can compute the relative path between two directories.
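
The computation described above can be sketched as follows. This is a Python illustration of one common approach, not the stout implementation or its API: the function name `relative_path` and its argument order are assumptions. It drops the shared leading components, then emits one `..` for each remaining component of the base directory followed by the rest of the target path. It assumes neither input contains `..` components or symlinks.

```python
def relative_path(path, base):
    """Return `path` expressed relative to the directory `base`."""
    if path.startswith("/") != base.startswith("/"):
        raise ValueError("paths must both be absolute or both be relative")

    # Drop empty components (from repeated or trailing slashes) and ".".
    path_parts = [p for p in path.split("/") if p not in ("", ".")]
    base_parts = [p for p in base.split("/") if p not in ("", ".")]

    # Count the leading components the two paths have in common.
    common = 0
    for a, b in zip(path_parts, base_parts):
        if a != b:
            break
        common += 1

    # Climb out of what remains of `base`, then descend into `path`.
    parts = [".."] * (len(base_parts) - common) + path_parts[common:]
    return "/".join(parts) if parts else "."
```

For example, `relative_path("/a/x/y", "/a/b/c")` yields `../../x/y`, which can be much shorter than the absolute path when both directories share a long prefix such as an agent work directory.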


