[jira] [Issue Comment Deleted] (MESOS-5184) Mesos does not validate role info when framework registered with specified role
[ https://issues.apache.org/jira/browse/MESOS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian Qiu updated MESOS-5184: Comment: was deleted (was: We also need to validate role when update weight and quota.) > Mesos does not validate role info when framework registered with specified > role > --- > > Key: MESOS-5184 > URL: https://issues.apache.org/jira/browse/MESOS-5184 > Project: Mesos > Issue Type: Bug > Components: general >Affects Versions: 0.28.0 >Reporter: Liqiang Lin > Fix For: 0.29.0 > > > When framework registered with specified role, Mesos does not validate the > role info. It will accept the subscription and send unreserved resources as > offer to the framework. > {code} > # cat register.json > { > "framework_id": {"value" : "test1"}, > "type":"SUBSCRIBE", > "subscribe":{ > "framework_info":{ > "user":"root", > "name":"test1", > "failover_timeout":60, > "role":"/test/test1", > "id":{"value":"test1"}, > "principal":"test1", > "capabilities":[{"type":"REVOCABLE_RESOURCES"}] > }, > "force":true > } > } > # curl -v http://192.168.56.110:5050/api/v1/scheduler -H "Content-type: > application/json" -X POST -d @register.json > * Hostname was NOT found in DNS cache > * Trying 192.168.56.110... 
> * Connected to 192.168.56.110 (192.168.56.110) port 5050 (#0) > > POST /api/v1/scheduler HTTP/1.1 > > User-Agent: curl/7.35.0 > > Host: 192.168.56.110:5050 > > Accept: */* > > Content-type: application/json > > Content-Length: 265 > > > * upload completely sent off: 265 out of 265 bytes > < HTTP/1.1 200 OK > < Date: Wed, 06 Apr 2016 21:34:18 GMT > < Transfer-Encoding: chunked > < Mesos-Stream-Id: 8b2c6740-b619-49c3-825a-e6ae780f4edc > < Content-Type: application/json > < > 69 > {"subscribed":{"framework_id":{"value":"test1"}},"type":"SUBSCRIBED"}20 > {"type":"HEARTBEAT"}1531 > {"offers":{"offers":[{"agent_id":{"value":"2cd5576e-6260-4262-a62c-b0dc45c86c45-S0"},"attributes":[{"name":"mesos_agent_type","text":{"value":"IBM_MESOS_EGO"},"type":"TEXT"},{"name":"hostname","text":{"value":"mesos2"},"type":"TEXT"}],"framework_id":{"value":"test1"},"hostname":"mesos2","id":{"value":"5b84aad8-dd60-40b3-84c2-93be6b7aa81c-O0"},"resources":[{"name":"disk","role":"*","scalar":{"value":20576.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"},{"name":"mem","role":"*","scalar":{"value":3952.0},"type":"SCALAR"},{"name":"cpus","role":"*","scalar":{"value":4.0},"type":"SCALAR"}],"url":{"address":{"hostname":"mesos2","ip":"192.168.56.110","port":5051},"path":"\/slave(1)","scheme":"http"}},{"agent_id":{"value":"2cd5576e-6260-4262-a62c-b0dc45c86c45-S1"},"attributes":[{"name":"mesos_agent_type","text":{"value":"IBM_MESOS_EGO"},"type":"TEXT"},{"name":"hostname","text":{"value":"mesos1"},"type":"TEXT"}],"framework_id":{"value":"test1"},"hostname":"mesos1","id":{"value":"5b84aad8-dd60-40b3-84c2-93be6b7aa81c-O1"},"resources":[{"name":"disk","role":"*","scalar":{"value":21468.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"},{"name":"mem","role":"*","scalar":{"value":3952.0},"type":"SCALAR"},{"name":"cpus","role":"*","scalar":{"value":4.0},"type":"SCALAR"}],"url":{"address":{"hostname":"mesos1","ip":"192.168.56.111","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}20 > {"type":"HEARTBEAT"}20 > {code} > As you can see, the role under which the framework registered is "/test/test1", which > is an invalid role according to > [MESOS-2210|https://issues.apache.org/jira/browse/MESOS-2210]. > And the Mesos master log: > {code} > I0407 05:34:18.132333 20672 master.cpp:2107] Received subscription request > for HTTP framework 'test1' > I0407 05:34:18.133515 20672 master.cpp:2198] Subscribing framework 'test1' > with checkpointing disabled and capabilities [ REVOCABLE_RESOURCES ] > I0407 05:34:18.135027 20674 hierarchical.cpp:264] Added framework test1 > I0407 05:34:18.138746 20672 master.cpp:5659] Sending 2 offers to framework > test1 (test1) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
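The missing check the ticket asks for could look roughly like the sketch below. This is a hypothetical illustration, not the actual Mesos implementation; the exact rule set (no slashes or whitespace, no empty name, no "." or "..", no leading "-") is an assumption based on the discussion in MESOS-2210:

```python
def is_valid_role(role: str) -> bool:
    """Reject role names like "/test/test1" before subscribing a framework.

    Rules assumed from MESOS-2210; not the real Mesos validation code.
    """
    if role in ("", ".", ".."):
        return False
    if role.startswith("-"):
        return False
    # Slashes, backslashes, and whitespace are disallowed.
    return not any(c in role for c in "/\\ \t\n")
```

With a check like this in the SUBSCRIBE path, the call above could be rejected with an error instead of producing offers.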
[jira] [Commented] (MESOS-5184) Mesos does not validate role info when framework registered with specified role
[ https://issues.apache.org/jira/browse/MESOS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236654#comment-15236654 ] Jian Qiu commented on MESOS-5184: - We also need to validate role when update weight and quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5060) Requesting /files/read.json with a negative length value causes subsequent /files requests to 404.
[ https://issues.apache.org/jira/browse/MESOS-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236611#comment-15236611 ] zhou xing commented on MESOS-5060: -- Hi Greg, thanks for the reminder! Could you help shepherd this ticket? Otherwise I will send a mail to the mailing list to find a shepherd. > Requesting /files/read.json with a negative length value causes subsequent > /files requests to 404. > -- > > Key: MESOS-5060 > URL: https://issues.apache.org/jira/browse/MESOS-5060 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0 > Environment: Mesos 0.23.0 on CentOS 6, also Mesos 0.28.0 on OSX >Reporter: Tom Petr >Assignee: zhou xing >Priority: Minor > Fix For: 0.29.0 > > > I accidentally hit a slave's /files/read.json endpoint with a negative length > (ex. http://hostname:5051/files/read.json?path=XXX&offset=0&length=-100). The > HTTP request timed out after 30 seconds with nothing relevant in the slave > logs, and subsequent calls to any of the /files endpoints on that slave > immediately returned an HTTP 404 response. We ultimately got things working > again by restarting the mesos-slave process (checkpointing FTW!), but it'd be > wise to guard against negative lengths on the slave's end too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
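A guard of the kind the reporter suggests could be sketched as follows. The function name and error strings are illustrative, not Mesos's actual files module; the idea is simply to reject bad parameters before they reach the file-reading code:

```python
from typing import Optional

def validate_read_params(offset: int, length: Optional[int]) -> Optional[str]:
    # Return an error message (which the caller would map to HTTP 400)
    # instead of letting a negative value hang the request or corrupt
    # subsequent /files requests.
    if offset < 0:
        return "Negative offset provided: {}.".format(offset)
    if length is not None and length < 0:
        return "Negative length provided: {}.".format(length)
    return None
```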
[jira] [Commented] (MESOS-4828) XFS disk quota isolator
[ https://issues.apache.org/jira/browse/MESOS-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236604#comment-15236604 ] Yan Xu commented on MESOS-4828: --- [~jieyu] Hey, the main reason is for consistency with {{posix/disk}}. I realize that there hasn't been too strict of a convention to follow. I don't have a strong preference about it but was aiming for consistency. If we agree to start to use "disk/du" I think "disk/xfs" is fine. Of course leaving "posix/disk" for a deprecation cycle is reasonable. > XFS disk quota isolator > --- > > Key: MESOS-4828 > URL: https://issues.apache.org/jira/browse/MESOS-4828 > Project: Mesos > Issue Type: Epic > Components: isolation >Reporter: James Peach >Assignee: James Peach > > Implement a disk resource isolator using XFS project quotas. Compared to the > {{posix/disk}} isolator, this doesn't need to scan the filesystem > periodically, and applications receive an {{EDQUOT}} error instead of being > summarily killed. > This initial implementation only isolates sandbox directory resources, since > isolation doesn't have any visibility into the lifecycle of volumes, > which is needed to assign and track project IDs. > The build dependencies for this are the XFS headers (from xfsprogs-devel) and > libblkid. We need libblkid or the equivalent to map filesystem paths to block > devices in order to apply quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
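The behavioral difference the epic describes, an application seeing {{EDQUOT}} on write rather than being summarily killed, can be illustrated with a small helper. This is an illustration only (it assumes a POSIX platform where errno defines EDQUOT), not part of the isolator:

```python
import errno

def is_quota_exceeded(exc: OSError) -> bool:
    # Under an XFS project quota, a write that exceeds the sandbox limit
    # fails with EDQUOT ("Disk quota exceeded"); the task can catch this
    # and react, instead of being killed as with the posix/disk isolator.
    return exc.errno == errno.EDQUOT
```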
[jira] [Updated] (MESOS-5184) Mesos does not validate role info when framework registered with specified role
[ https://issues.apache.org/jira/browse/MESOS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liqiang Lin updated MESOS-5184: --- Description: When framework registered with specified role, Mesos does not validate the role info. It will accept the subscription and send unreserved resources as offer to the framework. {code} # cat register.json { "framework_id": {"value" : "test1"}, "type":"SUBSCRIBE", "subscribe":{ "framework_info":{ "user":"root", "name":"test1", "failover_timeout":60, "role":"/test/test1", "id":{"value":"test1"}, "principal":"test1", "capabilities":[{"type":"REVOCABLE_RESOURCES"}] }, "force":true } } # curl -v http://192.168.56.110:5050/api/v1/scheduler -H "Content-type: application/json" -X POST -d @register.json * Hostname was NOT found in DNS cache * Trying 192.168.56.110... * Connected to 192.168.56.110 (192.168.56.110) port 5050 (#0) > POST /api/v1/scheduler HTTP/1.1 > User-Agent: curl/7.35.0 > Host: 192.168.56.110:5050 > Accept: */* > Content-type: application/json > Content-Length: 265 > * upload completely sent off: 265 out of 265 bytes < HTTP/1.1 200 OK < Date: Wed, 06 Apr 2016 21:34:18 GMT < Transfer-Encoding: chunked < Mesos-Stream-Id: 8b2c6740-b619-49c3-825a-e6ae780f4edc < Content-Type: application/json < 69 {"subscribed":{"framework_id":{"value":"test1"}},"type":"SUBSCRIBED"}20 {"type":"HEARTBEAT"}1531 
{"offers":{"offers":[{"agent_id":{"value":"2cd5576e-6260-4262-a62c-b0dc45c86c45-S0"},"attributes":[{"name":"mesos_agent_type","text":{"value":"IBM_MESOS_EGO"},"type":"TEXT"},{"name":"hostname","text":{"value":"mesos2"},"type":"TEXT"}],"framework_id":{"value":"test1"},"hostname":"mesos2","id":{"value":"5b84aad8-dd60-40b3-84c2-93be6b7aa81c-O0"},"resources":[{"name":"disk","role":"*","scalar":{"value":20576.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"},{"name":"mem","role":"*","scalar":{"value":3952.0},"type":"SCALAR"},{"name":"cpus","role":"*","scalar":{"value":4.0},"type":"SCALAR"}],"url":{"address":{"hostname":"mesos2","ip":"192.168.56.110","port":5051},"path":"\/slave(1)","scheme":"http"}},{"agent_id":{"value":"2cd5576e-6260-4262-a62c-b0dc45c86c45-S1"},"attributes":[{"name":"mesos_agent_type","text":{"value":"IBM_MESOS_EGO"},"type":"TEXT"},{"name":"hostname","text":{"value":"mesos1"},"type":"TEXT"}],"framework_id":{"value":"test1"},"hostname":"mesos1","id":{"value":"5b84aad8-dd60-40b3-84c2-93be6b7aa81c-O1"},"resources":[{"name":"disk","role":"*","scalar":{"value":21468.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"},{"name":"mem","role":"*","scalar":{"value":3952.0},"type":"SCALAR"},{"name":"cpus","role":"*","scalar":{"value":4.0},"type":"SCALAR"}],"url":{"address":{"hostname":"mesos1","ip":"192.168.56.111","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}20 {"type":"HEARTBEAT"}20 {code} As you can see, the role under which the framework registered is "/test/test1", which is an invalid role according to [MESOS-2210|https://issues.apache.org/jira/browse/MESOS-2210]. And the Mesos master log: {code} I0407 05:34:18.132333 20672 master.cpp:2107] Received subscription request for HTTP framework 'test1' I0407 05:34:18.133515 20672 master.cpp:2198] Subscribing framework 'test1' with checkpointing disabled and capabilities [
REVOCABLE_RESOURCES ] I0407 05:34:18.135027 20674 hierarchical.cpp:264] Added framework test1 I0407 05:34:18.138746 20672 master.cpp:5659] Sending 2 offers to framework test1 (test1) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5185) Accessibility for Mesos Web UI
haosdent created MESOS-5185: --- Summary: Accessibility for Mesos Web UI Key: MESOS-5185 URL: https://issues.apache.org/jira/browse/MESOS-5185 Project: Mesos Issue Type: Epic Components: webui Reporter: haosdent Priority: Minor Currently, the Mesos Web UI does not fully support accessibility features for people with disabilities. For example, the Web UI could support screen readers that read page content aloud for blind users. We could fix such issues by making Mesos Web UI pages conform to the [WAI-ARIA standard|https://www.w3.org/WAI/intro/aria]. We could update the webui according to the [Accessibility Design Guidelines for the Web|https://msdn.microsoft.com/en-us/library/aa291312(v=vs.71).aspx] and https://www.w3.org/standards/webdesign/accessibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5184) Mesos does not validate role info when framework registered with specified role
Liqiang Lin created MESOS-5184: -- Summary: Mesos does not validate role info when framework registered with specified role Key: MESOS-5184 URL: https://issues.apache.org/jira/browse/MESOS-5184 Project: Mesos Issue Type: Bug Components: general Affects Versions: 0.28.0 Reporter: Liqiang Lin Fix For: 0.29.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5183) Provide backup/restore functionality for the registry.
Benjamin Mahler created MESOS-5183: -- Summary: Provide backup/restore functionality for the registry. Key: MESOS-5183 URL: https://issues.apache.org/jira/browse/MESOS-5183 Project: Mesos Issue Type: Epic Components: master Reporter: Benjamin Mahler Priority: Critical Currently there is no built-in support for backup/restore of the registry state. The current suggestion is to back up the LevelDB directories across each master and to restore them. This can be error prone and it requires that operators deal directly with the underlying storage layer. Ideally, the master provides a means to extract the complete registry contents for backup purposes, and has the ability to restore its state from a backup. As a note, the {{/registrar(1)/registry}} endpoint currently provides an ability to extract the state as JSON. There is currently no built-in support for restoring from backups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
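Until built-in support exists, the extraction half can be approximated against the {{/registrar(1)/registry}} endpoint the ticket mentions. A minimal sketch (the helper names are ours, and it assumes a reachable master; restore has no equivalent endpoint):

```python
import json
import urllib.request

def registry_backup_url(master: str) -> str:
    # The /registrar(1)/registry endpoint serves the registry state as JSON.
    return master.rstrip("/") + "/registrar(1)/registry"

def backup_registry(master: str) -> dict:
    # Fetch and parse the current registry contents for backup purposes.
    with urllib.request.urlopen(registry_backup_url(master)) as resp:
        return json.load(resp)
```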
[jira] [Created] (MESOS-5182) mesos-executor (CommandScheduler) does not accept offer with revocable resources
Liqiang Lin created MESOS-5182: -- Summary: mesos-executor (CommandScheduler) does not accept offer with revocable resources Key: MESOS-5182 URL: https://issues.apache.org/jira/browse/MESOS-5182 Project: Mesos Issue Type: Bug Components: framework Affects Versions: 0.28.0 Reporter: Liqiang Lin Fix For: 0.29.0 Currently mesos-executor (CommandScheduler) does not accept offers that contain revocable resources. As a result, this example framework cannot be used to verify cases where tasks are launched with revocable resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
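The gap can be pictured with a small predicate: a scheduler that only counts non-revocable CPUs sees zero usable CPUs in an offer that carries only revocable ones, and declines it. The sketch below is illustrative (field names loosely mirror the v1 JSON offers shown elsewhere in this digest, where a revocable resource carries a {{revocable}} field), not the CommandScheduler's code:

```python
def usable_cpus(offer: dict, include_revocable: bool) -> float:
    # Sum the cpus in an offer, optionally counting revocable resources.
    total = 0.0
    for r in offer.get("resources", []):
        if r.get("name") != "cpus":
            continue
        if "revocable" in r and not include_revocable:
            continue  # a naive scheduler skips revocable resources entirely
        total += r["scalar"]["value"]
    return total
```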
[jira] [Commented] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236424#comment-15236424 ] wangqun commented on MESOS-5148: [~haosd...@gmail.com] Thanks for helping me understand it. > Supporting Container Images in Mesos Containerizer doesn't work by using > marathon api > - > > Key: MESOS-5148 > URL: https://issues.apache.org/jira/browse/MESOS-5148 > Project: Mesos > Issue Type: Bug >Reporter: wangqun > > Hi > I used the Marathon API to create tasks to test Supporting Container > Images in Mesos Containerizer. > My steps are the following: > 1) Run the process on the master node. > sudo /usr/sbin/mesos-master --zk=zk://10.0.0.4:2181/mesos --port=5050 > --log_dir=/var/log/mesos --cluster=mesosbay --hostname=10.0.0.4 --ip=10.0.0.4 > --quorum=1 --work_dir=/var/lib/mesos > 2) Run the process on the slave node. > sudo /usr/sbin/mesos-slave --master=zk://10.0.0.4:2181/mesos > --log_dir=/var/log/mesos --containerizers=docker,mesos > --executor_registration_timeout=5mins --hostname=10.0.0.5 --ip=10.0.0.5 > --isolation=docker/runtime,filesystem/linux --work_dir=/tmp/mesos/slave > --image_providers=docker --executor_environment_variables="{}" > 3) Create a JSON file to specify the container to be managed by Mesos. 
> sudo touch mesos.json > sudo vim mesos.json > { > "container": { > "type": "MESOS", > "docker": { > "image": "library/redis" > } > }, > "id": "ubuntumesos", > "instances": 1, > "cpus": 0.5, > "mem": 512, > "uris": [], > "cmd": "ping 8.8.8.8" > } > 4)sudo curl -X POST -H "Content-Type: application/json" > localhost:8080/v2/apps -d...@mesos.json > 5)sudo curl http://localhost:8080/v2/tasks > {"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]} > 6) sudo docker run -ti --net=host redis redis-cli > Could not connect to Redis at 127.0.0.1:6379: Connection refused > not connected> > 7) > I0409 01:43:48.774868 3492 slave.cpp:3886] Executor > 'ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' of framework > ffb72d7c-dd63-4c30-abea-bb746ab2c326- exited with status 0 > I0409 01:43:48.781307 3492 slave.cpp:3990] Cleaning up executor > 'ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' of framework > ffb72d7c-dd63-4c30-abea-bb746ab2c326- at executor(1)@10.0.0.5:60134 > I0409 01:43:48.808364 3492 slave.cpp:4078] Cleaning up framework > ffb72d7c-dd63-4c30-abea-bb746ab2c326- > I0409 01:43:48.811336 3493 gc.cpp:55] Scheduling > '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce/runs/24d0872d-1ba1-4384-be11-a20c82893ea4' > for gc 6.9070953778days in the future > I0409 01:43:48.817401 3493 gc.cpp:55] Scheduling > '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' > for gc 6.9065992889days in the future > I0409 01:43:48.823158 3493 gc.cpp:55] Scheduling > 
'/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce/runs/24d0872d-1ba1-4384-be11-a20c82893ea4' > for gc 6.9065273185days in the future > I0409 01:43:48.826216 3491 status_update_manager.cpp:282] Closing status > update streams for framework ffb72d7c-dd63-4c30-abea-bb746ab2c326- > I0409 01:43:48.835602 3493 gc.cpp:55] Scheduling > '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' > for gc 6.9064716444days in the future > I0409 01:43:48.838580 3493 gc.cpp:55] Scheduling > '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' > for gc 6.9041064889days in the future > I0409 01:43:48.844699 3493 gc.cpp:55] Scheduling > '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' > for gc 6.902654163days in the future > I0409 01:44:01.623440 3494 slave.cpp:4374] Current disk usage 27.10%. Max > allowed age: 4.403153217546436days > I0409 01:44:32.339310 3494 slave.cpp:1361] Got assigned task > ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce for framework > ffb72d7c-dd63-4c30-abea-bb746ab2c326-
[jira] [Updated] (MESOS-5173) Allow master/agent to take multiple --modules flags
[ https://issues.apache.org/jira/browse/MESOS-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-5173: -- Sprint: Mesosphere Sprint 33 > Allow master/agent to take multiple --modules flags > --- > > Key: MESOS-5173 > URL: https://issues.apache.org/jira/browse/MESOS-5173 > Project: Mesos > Issue Type: Task >Reporter: Kapil Arya >Assignee: Kapil Arya > Labels: mesosphere > Fix For: 0.29.0 > > > When loading multiple modules into the master/agent, one has to merge all module > metadata (library name, module name, parameters, etc.) into a single JSON > file which is then passed to the --modules flag. This quickly becomes > cumbersome, especially if the modules come from different > vendors/developers. > An alternative would be to allow multiple invocations of the --modules flag that > can then be passed on to the module manager. That way, each flag corresponds > to just one module library and the modules from that library. > Another approach is to create a new flag (e.g., --modules-dir) that contains > a path to a directory containing multiple JSON files. One can think > of it as analogous to systemd units. The operator drops a new file > into this directory and the file is automatically picked up by the > master/agent module manager. Further, the naming scheme can also be inherited > to prefix the filename with an "NN_" to signify load order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
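The "--modules-dir" idea with "NN_" prefixes could behave like this small sketch (illustrative, not the eventual implementation): files are processed in lexicographic order, so a numeric prefix determines load order, just as with systemd units:

```python
def module_load_order(filenames):
    # A "10_", "20_" numeric prefix makes lexicographic order equal to the
    # intended load order; non-JSON files in the directory are ignored.
    return sorted(n for n in filenames if n.endswith(".json"))
```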
[jira] [Updated] (MESOS-5171) Expose state/state.hpp to public headers
[ https://issues.apache.org/jira/browse/MESOS-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-5171: -- Sprint: Mesosphere Sprint 33 > Expose state/state.hpp to public headers > > > Key: MESOS-5171 > URL: https://issues.apache.org/jira/browse/MESOS-5171 > Project: Mesos > Issue Type: Task > Components: replicated log >Reporter: Kapil Arya >Assignee: Kapil Arya > Labels: mesosphere > Fix For: 0.29.0 > > > We want the modules to be able to use the replicated log along with the APIs to > communicate with ZooKeeper. This change would require us to expose at least > the following headers: state/storage.hpp, and any additional files that > state.hpp depends on (e.g., zookeeper/authentication.hpp). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5180) Scheduler driver does not detect disconnection with master and reregister.
[ https://issues.apache.org/jira/browse/MESOS-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-5180: -- Description: The existing implementation of the scheduler driver does not re-register with the master under some network partition cases. When a scheduler registers with the master: 1) master links to the framework 2) framework links to the master It is possible for either of these links to break *without* the master changing. (Currently, the scheduler driver will only re-register if the master changes). If both links break or if just link (1) breaks, the master views the framework as {{inactive}} and {{disconnected}}. This means the framework will not receive any more events (such as offers) from the master until it re-registers. There is currently no way for the scheduler to detect a one-way link breakage. If link (2) breaks, it makes (almost) no difference to the scheduler. The scheduler usually uses the link to send messages to the master, but libprocess will create another socket if the persistent one is not available. To fix link breakages for (1+2) and (2), the scheduler driver should implement an {{::exited}} event handler for the master's {{pid}} and trigger a master (re-)detection upon a disconnection. This in turn should make the driver (re-)register with the master. The scheduler library already does this: https://github.com/apache/mesos/blob/master/src/scheduler/scheduler.cpp#L395 See the related issue MESOS-5181 for link (1) breakage. was: The existing implementation of the scheduler driver does not re-register with the master under some network partition cases. When a scheduler registers with the master: 1) master links to the framework 2) framework links to the master It is possible for either of these links to break *without* the master changing. (Currently, the scheduler driver will only re-register if the master changes). 
If both links break or if just link (1) breaks, the master views the framework as {{inactive}} and {{disconnected}}. This means the framework will not receive any more events (such as offers) from the master until it re-registers. There is currently no way for the scheduler to detect a one-way link breakage. if link (2) breaks, it makes (almost) no difference to the scheduler. The scheduler usually uses the link to send messages to the master, but libprocess will create another socket if the persistent one is not available. To fix link breakages for (1+2) and (2), the scheduler driver should implement a `::exited` event handler for the master's {{pid}} and re-register in this case. See the related issue MESOS-5181 for link (1) breakage. > Scheduler driver does not detect disconnection with master and reregister. > -- > > Key: MESOS-5180 > URL: https://issues.apache.org/jira/browse/MESOS-5180 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.24.0 >Reporter: Joseph Wu >Assignee: Anand Mazumdar > Labels: mesosphere > > The existing implementation of the scheduler driver does not re-register with > the master under some network partition cases. > When a scheduler registers with the master: > 1) master links to the framework > 2) framework links to the master > It is possible for either of these links to break *without* the master > changing. (Currently, the scheduler driver will only re-register if the > master changes). > If both links break or if just link (1) breaks, the master views the > framework as {{inactive}} and {{disconnected}}. This means the framework > will not receive any more events (such as offers) from the master until it > re-registers. There is currently no way for the scheduler to detect a > one-way link breakage. > if link (2) breaks, it makes (almost) no difference to the scheduler. 
The > scheduler usually uses the link to send messages to the master, but > libprocess will create another socket if the persistent one is not available. > To fix link breakages for (1+2) and (2), the scheduler driver should > implement a `::exited` event handler for the master's {{pid}} and trigger a > master (re-)detection upon a disconnection. This in turn should make the > driver (re)-register with the master. The scheduler library already does > this: > https://github.com/apache/mesos/blob/master/src/scheduler/scheduler.cpp#L395 > See the related issue MESOS-5181 for link (1) breakage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
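The proposed fix above (an `::exited` handler for the master's pid that triggers re-detection and re-registration) can be sketched as follows. This is an illustrative Python model, not Mesos' actual C++ driver code; the names `SchedulerDriver` and `on_master_exited` are hypothetical stand-ins for the libprocess machinery.

```python
class SchedulerDriver:
    """Illustrative model of the proposed fix: treat a broken
    persistent link to the master as a disconnection and
    re-register, even if the master itself has not changed."""

    def __init__(self, detector):
        self.detector = detector  # callable that finds the current master
        self.connected = False
        self.master = None

    def register(self, master):
        self.master = master
        self.connected = True

    def on_master_exited(self, pid):
        # Proposed `::exited` handler: the link to the master broke.
        # Instead of staying silently "registered" on the scheduler
        # side, trigger master (re-)detection, which in turn causes
        # re-registration.
        if self.master is not None and pid == self.master:
            self.connected = False
            self.register(self.detector())


# Usage: simulate a link breakage without a master change.
driver = SchedulerDriver(detector=lambda: "master@10.0.0.1:5050")
driver.register("master@10.0.0.1:5050")
driver.on_master_exited("master@10.0.0.1:5050")
assert driver.connected  # the driver re-registered after the break
```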
[jira] [Updated] (MESOS-5180) Scheduler driver does not detect disconnection with master and reregister.
[ https://issues.apache.org/jira/browse/MESOS-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-5180: -- Story Points: 2 (was: 1) > Scheduler driver does not detect disconnection with master and reregister. > -- > > Key: MESOS-5180 > URL: https://issues.apache.org/jira/browse/MESOS-5180 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.24.0 >Reporter: Joseph Wu >Assignee: Anand Mazumdar > Labels: mesosphere > > The existing implementation of the scheduler driver does not re-register with > the master under some network partition cases. > When a scheduler registers with the master: > 1) master links to the framework > 2) framework links to the master > It is possible for either of these links to break *without* the master > changing. (Currently, the scheduler driver will only re-register if the > master changes). > If both links break or if just link (1) breaks, the master views the > framework as {{inactive}} and {{disconnected}}. This means the framework > will not receive any more events (such as offers) from the master until it > re-registers. There is currently no way for the scheduler to detect a > one-way link breakage. > if link (2) breaks, it makes (almost) no difference to the scheduler. The > scheduler usually uses the link to send messages to the master, but > libprocess will create another socket if the persistent one is not available. > To fix link breakages for (1+2) and (2), the scheduler driver should > implement a `::exited` event handler for the master's {{pid}} and re-register > in this case. > See the related issue MESOS-5181 for link (1) breakage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5181) Master should reject calls from the scheduler driver if the scheduler is not connected.
[ https://issues.apache.org/jira/browse/MESOS-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-5181: -- Description: When a scheduler registers, the master will create a link from master to scheduler. If this link breaks, the master will consider the scheduler {{inactive}} and mark it as {{disconnected}}. This causes a couple problems: 1) Master does not send offers to {{inactive}} schedulers. But these schedulers might consider themselves "registered" in a one-way network partition scenario. 2) Any calls from the {{inactive}} scheduler is still accepted, which leaves the scheduler in a starved, but semi-functional state. See the related issue for more context: MESOS-5180 There should be an additional guard for registered, but {{inactive}} schedulers here: https://github.com/apache/mesos/blob/94f4f4ebb7d491ec6da1473b619600332981dd8e/src/master/master.cpp#L1977 The HTTP API already does this: https://github.com/apache/mesos/blob/94f4f4ebb7d491ec6da1473b619600332981dd8e/src/master/http.cpp#L459 Since the scheduler driver cannot return a 403, it may be necessary to return a {{Event::ERROR}} and force the scheduler to abort. was: When a scheduler registers, the master will create a link from master to scheduler. If this link breaks, the master will consider the scheduler {{inactive}} and {{disconnected}}. This causes a couple problems: 1) Master does not send offers to {{inactive}} schedulers. But these schedulers are still considered "registered". 2) Any calls from the {{inactive}} scheduler is still accepted, which leaves the scheduler in a starved, but semi-functional state. 
See the related issue for more context: MESOS-5180 There should be an additional guard for registered, but {{inactive}} schedulers here: https://github.com/apache/mesos/blob/94f4f4ebb7d491ec6da1473b619600332981dd8e/src/master/master.cpp#L1977 The HTTP API already does this: https://github.com/apache/mesos/blob/94f4f4ebb7d491ec6da1473b619600332981dd8e/src/master/http.cpp#L459 Since the scheduler driver cannot return a 403, it may be necessary to return a {{Event::ERROR}} and force the scheduler to abort. > Master should reject calls from the scheduler driver if the scheduler is not > connected. > --- > > Key: MESOS-5181 > URL: https://issues.apache.org/jira/browse/MESOS-5181 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.24.0 >Reporter: Joseph Wu >Assignee: Anand Mazumdar > Labels: mesosphere > > When a scheduler registers, the master will create a link from master to > scheduler. If this link breaks, the master will consider the scheduler > {{inactive}} and mark it as {{disconnected}}. > This causes a couple problems: > 1) Master does not send offers to {{inactive}} schedulers. But these > schedulers might consider themselves "registered" in a one-way network > partition scenario. > 2) Any calls from the {{inactive}} scheduler is still accepted, which leaves > the scheduler in a starved, but semi-functional state. > See the related issue for more context: MESOS-5180 > There should be an additional guard for registered, but {{inactive}} > schedulers here: > https://github.com/apache/mesos/blob/94f4f4ebb7d491ec6da1473b619600332981dd8e/src/master/master.cpp#L1977 > The HTTP API already does this: > https://github.com/apache/mesos/blob/94f4f4ebb7d491ec6da1473b619600332981dd8e/src/master/http.cpp#L459 > Since the scheduler driver cannot return a 403, it may be necessary to return > a {{Event::ERROR}} and force the scheduler to abort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
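The guard described above can be sketched as follows. This is a hedged Python illustration, not the actual C++ code in `master.cpp`; `handle_scheduler_call` and the `registered`/`active` flags are hypothetical names modeling the master's framework state.

```python
class Event:
    ERROR = "ERROR"


def handle_scheduler_call(framework, call):
    """Illustrative guard: reject calls from registered-but-inactive
    schedulers instead of silently accepting them."""
    if not framework.get("registered"):
        return (Event.ERROR, "Framework is not registered")
    if not framework.get("active"):
        # The scheduler driver cannot receive an HTTP 403, so the
        # rejection is surfaced as an Event::ERROR, forcing the
        # scheduler to abort rather than run starved but
        # semi-functional.
        return (Event.ERROR, "Framework is registered but inactive")
    return ("ACCEPTED", None)


# An inactive scheduler's call is rejected rather than accepted.
result = handle_scheduler_call({"registered": True, "active": False}, "KILL")
assert result[0] == Event.ERROR
```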
[jira] [Updated] (MESOS-5180) Scheduler driver does not detect disconnection with master and reregister.
[ https://issues.apache.org/jira/browse/MESOS-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-5180: - Description: The existing implementation of the scheduler driver does not re-register with the master under some network partition cases. When a scheduler registers with the master: 1) master links to the framework 2) framework links to the master It is possible for either of these links to break *without* the master changing. (Currently, the scheduler driver will only re-register if the master changes). If both links break or if just link (1) breaks, the master views the framework as {{inactive}} and {{disconnected}}. This means the framework will not receive any more events (such as offers) from the master until it re-registers. There is currently no way for the scheduler to detect a one-way link breakage. if link (2) breaks, it makes (almost) no difference to the scheduler. The scheduler usually uses the link to send messages to the master, but libprocess will create another socket if the persistent one is not available. To fix link breakages for (1+2) and (2), the scheduler driver should implement a `::exited` event handler for the master's {{pid}} and re-register in this case. See the related issue MESOS-5181 for link (1) breakage. was: The existing implementation of the scheduler driver does not re-register with the master under some network partition cases. When a scheduler registers with the master: 1) master links to the framework 2) framework links to the master It is possible for either of these links to break *without* the master changing. (Currently, the scheduler driver will only re-register if the master changes). If both links break or if just link (1) breaks, the master views the framework as {{inactive}} and {{disconnected}}. This means the framework will not receive any more events (such as offers) from the master until it re-registers. 
There is currently no way for the scheduler to detect a one-way link breakage. if link (2) breaks, it makes (almost) no difference to the scheduler. The scheduler usually uses the link to send messages to the master, but libprocess will create another socket if the persistent one is not available. To fix link breakages for (1+2) and (2), the scheduler driver should implement a `::exited` event handler for the master's {{pid}} and re-register in this case. See the related issue [TODO] for link (1) breakage. > Scheduler driver does not detect disconnection with master and reregister. > -- > > Key: MESOS-5180 > URL: https://issues.apache.org/jira/browse/MESOS-5180 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.24.0 >Reporter: Joseph Wu >Assignee: Anand Mazumdar > Labels: mesosphere > > The existing implementation of the scheduler driver does not re-register with > the master under some network partition cases. > When a scheduler registers with the master: > 1) master links to the framework > 2) framework links to the master > It is possible for either of these links to break *without* the master > changing. (Currently, the scheduler driver will only re-register if the > master changes). > If both links break or if just link (1) breaks, the master views the > framework as {{inactive}} and {{disconnected}}. This means the framework > will not receive any more events (such as offers) from the master until it > re-registers. There is currently no way for the scheduler to detect a > one-way link breakage. > if link (2) breaks, it makes (almost) no difference to the scheduler. The > scheduler usually uses the link to send messages to the master, but > libprocess will create another socket if the persistent one is not available. > To fix link breakages for (1+2) and (2), the scheduler driver should > implement a `::exited` event handler for the master's {{pid}} and re-register > in this case. 
> See the related issue MESOS-5181 for link (1) breakage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5181) Master should reject calls from the scheduler driver if the scheduler is not connected.
Joseph Wu created MESOS-5181: Summary: Master should reject calls from the scheduler driver if the scheduler is not connected. Key: MESOS-5181 URL: https://issues.apache.org/jira/browse/MESOS-5181 Project: Mesos Issue Type: Bug Components: scheduler driver Affects Versions: 0.24.0 Reporter: Joseph Wu Assignee: Anand Mazumdar When a scheduler registers, the master will create a link from master to scheduler. If this link breaks, the master will consider the scheduler {{inactive}} and {{disconnected}}. This causes a couple of problems: 1) Master does not send offers to {{inactive}} schedulers. But these schedulers are still considered "registered". 2) Any calls from the {{inactive}} scheduler are still accepted, which leaves the scheduler in a starved but semi-functional state. See the related issue for more context: MESOS-5180 There should be an additional guard for registered, but {{inactive}} schedulers here: https://github.com/apache/mesos/blob/94f4f4ebb7d491ec6da1473b619600332981dd8e/src/master/master.cpp#L1977 The HTTP API already does this: https://github.com/apache/mesos/blob/94f4f4ebb7d491ec6da1473b619600332981dd8e/src/master/http.cpp#L459 Since the scheduler driver cannot return a 403, it may be necessary to return an {{Event::ERROR}} and force the scheduler to abort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5180) Scheduler driver does not detect disconnection with master and reregister.
Joseph Wu created MESOS-5180: Summary: Scheduler driver does not detect disconnection with master and reregister. Key: MESOS-5180 URL: https://issues.apache.org/jira/browse/MESOS-5180 Project: Mesos Issue Type: Bug Components: scheduler driver Affects Versions: 0.24.0 Reporter: Joseph Wu Assignee: Anand Mazumdar The existing implementation of the scheduler driver does not re-register with the master under some network partition cases. When a scheduler registers with the master: 1) master links to the framework 2) framework links to the master It is possible for either of these links to break *without* the master changing. (Currently, the scheduler driver will only re-register if the master changes). If both links break or if just link (1) breaks, the master views the framework as {{inactive}} and {{disconnected}}. This means the framework will not receive any more events (such as offers) from the master until it re-registers. There is currently no way for the scheduler to detect a one-way link breakage. if link (2) breaks, it makes (almost) no difference to the scheduler. The scheduler usually uses the link to send messages to the master, but libprocess will create another socket if the persistent one is not available. To fix link breakages for (1+2) and (2), the scheduler driver should implement a `::exited` event handler for the master's {{pid}} and re-register in this case. See the related issue [TODO] for link (1) breakage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-5166) ExamplesTest.DynamicReservationFramework is slow
[ https://issues.apache.org/jira/browse/MESOS-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Klaus Ma reassigned MESOS-5166: --- Assignee: Klaus Ma > ExamplesTest.DynamicReservationFramework is slow > > > Key: MESOS-5166 > URL: https://issues.apache.org/jira/browse/MESOS-5166 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Benjamin Bannier >Assignee: Klaus Ma > Labels: examples, mesosphere > > For an unoptimized build under OS X > {{ExamplesTest.DynamicReservationFramework}} currently takes more than 13 > seconds on my machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5179) Enhance the error message for Duration flag
Guangya Liu created MESOS-5179: -- Summary: Enhance the error message for Duration flag Key: MESOS-5179 URL: https://issues.apache.org/jira/browse/MESOS-5179 Project: Mesos Issue Type: Bug Reporter: Guangya Liu Assignee: Guangya Liu Enhance the error message for https://github.com/apache/mesos/blob/4dfa91fc21f80204f5125b2e2f35c489f8fb41d8/3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp#L70 to list all of the supported duration units. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
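The requested enhancement amounts to enumerating the accepted units in the parse error. A minimal Python sketch (not the actual stout C++ implementation; the unit list mirrors the suffixes stout's `Duration` accepts):

```python
import re

# Units accepted by stout's Duration (ns, us, ms, secs, mins, hrs,
# days, weeks), mapped to seconds. Listing them in the error message
# is the enhancement this ticket asks for.
UNITS = {
    "ns": 1e-9, "us": 1e-6, "ms": 1e-3,
    "secs": 1.0, "mins": 60.0, "hrs": 3600.0,
    "days": 86400.0, "weeks": 604800.0,
}


def parse_duration(value):
    """Parse strings like '5mins' into seconds; on failure, name
    every supported unit in the error instead of a bare rejection."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([a-z]+)", value)
    if not match or match.group(2) not in UNITS:
        raise ValueError(
            "Failed to parse duration '%s': expected a number followed "
            "by one of %s" % (value, ", ".join(sorted(UNITS))))
    return float(match.group(1)) * UNITS[match.group(2)]


assert parse_duration("5mins") == 300.0
```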
[jira] [Updated] (MESOS-5170) Adapt json creation for authorization based endpoint filtering.
[ https://issues.apache.org/jira/browse/MESOS-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5170: -- Fix Version/s: 0.29.0 > Adapt json creation for authorization based endpoint filtering. > --- > > Key: MESOS-5170 > URL: https://issues.apache.org/jira/browse/MESOS-5170 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: authorization, mesosphere, security > Fix For: 0.29.0 > > > For authorization based endpoint filtering we need to adapt the json endpoint > creation as discussed in MESOS-4931. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5170) Adapt json creation for authorization based endpoint filtering.
[ https://issues.apache.org/jira/browse/MESOS-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5170: -- Assignee: Joerg Schad > Adapt json creation for authorization based endpoint filtering. > --- > > Key: MESOS-5170 > URL: https://issues.apache.org/jira/browse/MESOS-5170 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: authorization, mesosphere, security > Fix For: 0.29.0 > > > For authorization based endpoint filtering we need to adapt the json endpoint > creation as discussed in MESOS-4931. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5169) Introduce new Authorizer Actions for Authorized based filtering of endpoints.
[ https://issues.apache.org/jira/browse/MESOS-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5169: -- Assignee: Joerg Schad Sprint: Mesosphere Sprint 33 Fix Version/s: 0.29.0 Description: For authorization based endpoint filtering we need to introduce the authorizer actions outlined via MESOS-4932. (was: For authorization based endpoint filtering we need to introduce the authorizer actions outlined via MESOS-493.) Component/s: security > Introduce new Authorizer Actions for Authorized based filtering of endpoints. > - > > Key: MESOS-5169 > URL: https://issues.apache.org/jira/browse/MESOS-5169 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: authorization, mesosphere, security > Fix For: 0.29.0 > > > For authorization based endpoint filtering we need to introduce the > authorizer actions outlined via MESOS-4932. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5168) Benchmark overhead of authorization based filtering.
[ https://issues.apache.org/jira/browse/MESOS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5168: -- Assignee: Joerg Schad Sprint: Mesosphere Sprint 33 Fix Version/s: 0.29.0 > Benchmark overhead of authorization based filtering. > > > Key: MESOS-5168 > URL: https://issues.apache.org/jira/browse/MESOS-5168 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: authorization, mesosphere, security > Fix For: 0.29.0 > > > When adding authorization based filtering as outlined in MESOS-4931 we need > to be careful, especially for performance-critical endpoints such as /state. > We should ensure via a benchmark that performance does not degrade below an > acceptable level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3578) ProvisionerDockerLocalStoreTest.MetadataManagerInitialization is flaky
[ https://issues.apache.org/jira/browse/MESOS-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236150#comment-15236150 ] Anand Mazumdar commented on MESOS-3578: --- Logs from another ASF CI run. {code} [ RUN ] ProvisionerDockerLocalStoreTest.MetadataManagerInitialization E0411 17:14:46.692386 32652 shell.hpp:106] Command 'hadoop version 2>&1' failed; this is the output: sh: 1: hadoop: not found I0411 17:14:46.692488 32652 fetcher.cpp:59] Skipping URI fetcher plugin 'hadoop' as it could not be created: Failed to create HDFS client: Failed to execute 'hadoop version 2>&1'; the command was either not found or exited with a non-zero exit status: 127 I0411 17:14:46.692757 32652 local_puller.cpp:90] Creating local puller with docker registry '/tmp/s6Ahtf/images' I0411 17:14:46.695791 32678 metadata_manager.cpp:159] Looking for image 'abc' I0411 17:14:46.696559 32678 local_puller.cpp:142] Untarring image 'abc' from '/tmp/s6Ahtf/images/abc.tar' to '/tmp/s6Ahtf/store/staging/qf0NsJ' I0411 17:14:46.741811 32685 local_puller.cpp:162] The repositories JSON file for image 'abc' is '{"abc":{"latest":"456"}}' I0411 17:14:46.742210 32685 local_puller.cpp:290] Extracting layer tar ball '/tmp/s6Ahtf/store/staging/qf0NsJ/123/layer.tar to rootfs '/tmp/s6Ahtf/store/staging/qf0NsJ/123/rootfs' I0411 17:14:46.747326 32685 local_puller.cpp:290] Extracting layer tar ball '/tmp/s6Ahtf/store/staging/qf0NsJ/456/layer.tar to rootfs '/tmp/s6Ahtf/store/staging/qf0NsJ/456/rootfs' ../../src/tests/containerizer/provisioner_docker_tests.cpp:210: Failure (imageInfo).failure(): Collect failed: Subprocess 'tar, tar, -x, -f, /tmp/s6Ahtf/store/staging/qf0NsJ/123/layer.tar, -C, /tmp/s6Ahtf/store/staging/qf0NsJ/123/rootfs' failed: tar: This does not look like a tar archive tar: Exiting with failure status due to previous errors [ FAILED ] ProvisionerDockerLocalStoreTest.MetadataManagerInitialization (204 ms) {code} > 
ProvisionerDockerLocalStoreTest.MetadataManagerInitialization is flaky > -- > > Key: MESOS-3578 > URL: https://issues.apache.org/jira/browse/MESOS-3578 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Anand Mazumdar > Labels: flaky-test, mesosphere > > Showed up on ASF CI: > https://builds.apache.org/job/Mesos/881/COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,OS=ubuntu:14.04,label_exp=docker%7C%7CHadoop/consoleFull > {code} > [ RUN ] ProvisionerDockerLocalStoreTest.MetadataManagerInitialization > Using temporary directory > '/tmp/ProvisionerDockerLocalStoreTest_MetadataManagerInitialization_9ynmgE' > I0929 02:36:44.066397 30457 local_puller.cpp:127] Untarring image from > '/tmp/ProvisionerDockerLocalStoreTest_MetadataManagerInitialization_9ynmgE/store/staging/aZND7C' > to > '/tmp/ProvisionerDockerLocalStoreTest_MetadataManagerInitialization_9ynmgE/images/abc:latest.tar' > ../../src/tests/containerizer/provisioner_docker_tests.cpp:843: Failure > (layers).failure(): Collect failed: Untar failed with exit code: exited with > status 2 > [ FAILED ] ProvisionerDockerLocalStoreTest.MetadataManagerInitialization > (181 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-5064) Remove default value for the agent `work_dir`
[ https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220643#comment-15220643 ] Greg Mann edited comment on MESOS-5064 at 4/11/16 10:58 PM: Reviews here: https://reviews.apache.org/r/46003/ https://reviews.apache.org/r/46005/ https://reviews.apache.org/r/46004/ https://reviews.apache.org/r/45562/ was (Author: greggomann): Reviews here: https://reviews.apache.org/r/46003/ https://reviews.apache.org/r/46005/ https://reviews.apache.org/r/46004/ https://reviews.apache.org/r/45562/ https://reviews.apache.org/r/46038/ > Remove default value for the agent `work_dir` > - > > Key: MESOS-5064 > URL: https://issues.apache.org/jira/browse/MESOS-5064 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan >Assignee: Greg Mann > > Following a crash report from the user we need to be more explicit about the > dangers of using {{/tmp}} as agent {{work_dir}}. In addition, we can remove > the default value for the {{\-\-work_dir}} flag, forcing users to explicitly > set the work directory for the agent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5159) Add test to verify error when requesting fractional GPUs
[ https://issues.apache.org/jira/browse/MESOS-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236104#comment-15236104 ] Kevin Klues commented on MESOS-5159: Updated to fail with TASK_ERROR semantics based on MESOS-5178 https://reviews.apache.org/r/45970/ > Add test to verify error when requesting fractional GPUs > > > Key: MESOS-5159 > URL: https://issues.apache.org/jira/browse/MESOS-5159 > Project: Mesos > Issue Type: Task >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: gpu, mesosphere > > Fractional GPU requests should immediately cause a TASK_FAILED without ever > launching the task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5178) Add logic to validate for non-fractional GPU requests in the master
Kevin Klues created MESOS-5178: -- Summary: Add logic to validate for non-fractional GPU requests in the master Key: MESOS-5178 URL: https://issues.apache.org/jira/browse/MESOS-5178 Project: Mesos Issue Type: Task Reporter: Kevin Klues Assignee: Kevin Klues We should not put this logic directly into the 'Resources::validate()' function. The primary reason is that the existing 'Resources::validate()' function doesn't consider the semantics of any particular resource when performing its validation (it only makes sure that the fields in the 'Resource' protobuf message are correctly formed). Since a fractional 'gpus' resource is actually well-formed (and only semantically incorrect), we should push this validation logic up into the master. Moreover, the existing logic to construct a 'Resources' object from a 'RepeatedPtrField' silently drops any resources that don't pass 'Resources::validate()'. This means that if we were to push the non-fractional 'gpus' validation into 'Resources::validate()', the 'gpus' resources would just be silently dropped rather than causing a TASK_ERROR in the master. This is obviously *not* the desired behaviour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
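The semantic check the ticket proposes, sketched in Python (illustrative only; `validate_task_resources` is a hypothetical name, and the real check lives in the master's C++ task validation, not in `Resources::validate()`):

```python
def validate_task_resources(resources):
    """Master-side semantic check: a fractional 'gpus' resource is
    well-formed, so generic Resources::validate()-style checks pass
    it. Enforcing whole-number gpus here means the task gets a
    TASK_ERROR with a message, instead of the resource being
    silently dropped during Resources construction."""
    for resource in resources:
        if resource["name"] == "gpus":
            value = resource["scalar"]["value"]
            if value != int(value):
                return "Task requested fractional gpus: %s" % value
    return None  # no validation error


# A half-GPU request is rejected with an explanation.
error = validate_task_resources([{"name": "gpus", "scalar": {"value": 0.5}}])
assert error is not None
```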
[jira] [Updated] (MESOS-5174) Update the balloon-framework to run on test clusters
[ https://issues.apache.org/jira/browse/MESOS-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-5174: - Sprint: Mesosphere Sprint 33 > Update the balloon-framework to run on test clusters > > > Key: MESOS-5174 > URL: https://issues.apache.org/jira/browse/MESOS-5174 > Project: Mesos > Issue Type: Improvement > Components: framework, technical debt >Reporter: Joseph Wu >Assignee: Joseph Wu > Labels: mesosphere, tech-debt > > There are a couple of problems with the balloon framework that prevent it > from being deployed (easily) on an actual cluster: > * The framework accepts 100% of memory in an offer. This means the expected > behavior (finish or OOM) is dependent on the offer size. > * The framework assumes the {{balloon-executor}} binary is available on each > agent. This is generally only true in the build environment or in > single-agent test environments. > * The framework does not specify CPUs with the executor. This is required by > many isolators. > * The executor's {{TASK_FINISHED}} logic path was untested and is flaky. > * The framework has no metrics. > * The framework only launches a single task and then exits. With this > behavior, we can't have useful metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4541) Default work_dir slave to /var/lib/mesos instead of /tmp
[ https://issues.apache.org/jira/browse/MESOS-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235989#comment-15235989 ] Greg Mann commented on MESOS-4541: -- In MESOS-5064, we've opted for eliminating the default {{work_dir}} for the agent; this will require users to specify the work directory explicitly. Closing this ticket as "Won't Fix". > Default work_dir slave to /var/lib/mesos instead of /tmp > > > Key: MESOS-4541 > URL: https://issues.apache.org/jira/browse/MESOS-4541 > Project: Mesos > Issue Type: Improvement >Reporter: Nick van 't Hart > > Centos cleanup Daily systemd service > /usr/lib/systemd/system/systemd-tmpfiles-clean.service > # This file is part of systemd. > # > # systemd is free software; you can redistribute it and/or modify it > # under the terms of the GNU Lesser General Public License as published by > # the Free Software Foundation; either version 2.1 of the License, or > # (at your option) any later version. > [Unit] > Description=Cleanup of Temporary Directories > Documentation=man:tmpfiles.d(5) man:systemd-tmpfiles(8) > DefaultDependencies=no > Wants=local-fs.target > After=systemd-readahead-collect.service systemd-readahead-replay.service > local-fs.target > Before=sysinit.target shutdown.target > ConditionDirectoryNotEmpty=|/usr/lib/tmpfiles.d > ConditionDirectoryNotEmpty=|/usr/local/lib/tmpfiles.d > ConditionDirectoryNotEmpty=|/etc/tmpfiles.d > ConditionDirectoryNotEmpty=|/run/tmpfiles.d > [Service] > Type=oneshot > ExecStart=/usr/bin/systemd-tmpfiles --clean > IOSchedulingClass=idle > http://www.freedesktop.org/software/systemd/man/systemd-tmpfiles.html > systemd-tmpfiles creates, deletes, and cleans up volatile and temporary files > and directories, based on the configuration file format and location > specified in tmpfiles.d(5). 
> /usr/lib/tmpfiles.d/tmp.conf > delete all files older than 10 days in /tmp/* > change default work_dir for mesos from /tmp to /var/lib/mesos/ > Problems: > - mesos slave crashes when deploying from marathon (state of running tasks lost) > - mesos slave restart recovery will not work, because > /tmp/mesos/meta/slaves/latest could not be found > For now, maybe add some extra documentation for the work_dir option when using it in > production. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5064) Remove default value for the agent `work_dir`
[ https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5064: - Story Points: 2 (was: 1) > Remove default value for the agent `work_dir` > - > > Key: MESOS-5064 > URL: https://issues.apache.org/jira/browse/MESOS-5064 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan >Assignee: Greg Mann > > Following a crash report from the user we need to be more explicit about the > dangers of using {{/tmp}} as agent {{work_dir}}. In addition, we can remove > the default value for the {{\-\-work_dir}} flag, forcing users to explicitly > set the work directory for the agent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235933#comment-15235933 ] Benjamin Mahler commented on MESOS-4705: Which patch? This one? https://reviews.apache.org/r/44379/ It still does not contain the information related to perf stat formats that [~haosd...@gmail.com] provided earlier in this thread. Can you add that? With respect to https://reviews.apache.org/r/44255/, happy to discuss further, but let's do that outside of this ticket since it is not related. > Slave failed to sample container with perf event > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > > When sampling container with perf event on Centos7 with kernel > 3.10.0-123.el7.x86_64, slave complained with below error spew: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > it's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > with kernel version below 3.12 > On 3.10.0-123.el7.x86_64 kernel, the format is with 6 tokens as below: > value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed, please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
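The parsing fix discussed above can be sketched as follows. This is a hedged Python illustration, not Mesos' actual `perf.cpp`: it assumes a 3-field format (value,event,cgroup) for the layout Mesos expects, and accepts the 6-field format (value,unit,event,cgroup,running,ratio) that the ticket reports for kernel 3.10.0-123.el7.

```python
def parse_perf_sample_line(line):
    """Accept both the assumed 3-field perf sample layout
    (value,event,cgroup) and the 6-field layout
    (value,unit,event,cgroup,running,ratio) seen on
    3.10.0-123.el7, instead of failing on an unexpected
    field count."""
    fields = line.split(",")
    if len(fields) == 3:
        value, event, cgroup = fields
    elif len(fields) == 6:
        value, _unit, event, cgroup, _running, _ratio = fields
    else:
        raise ValueError(
            "Unexpected number of fields in perf sample line: %r" % line)
    return {"value": value, "event": event, "cgroup": cgroup}


# The exact line from the ticket's error message parses cleanly.
sample = parse_perf_sample_line(
    "25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,"
    "10059827422,100.00")
assert sample["event"] == "cycles"
```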
[jira] [Commented] (MESOS-5176) LinuxFilesystemIsolatorTest.ROOT_RecoverOrphanedPersistentVolume is flaky
[ https://issues.apache.org/jira/browse/MESOS-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235893#comment-15235893 ] Greg Mann commented on MESOS-5176: -- [~kaysoky] > LinuxFilesystemIsolatorTest.ROOT_RecoverOrphanedPersistentVolume is flaky > - > > Key: MESOS-5176 > URL: https://issues.apache.org/jira/browse/MESOS-5176 > Project: Mesos > Issue Type: Bug > Components: tests > Environment: CentOS 7, with libevent and SSL enabled >Reporter: Greg Mann > Labels: mesosphere > > Observed on the internal Mesosphere CI: > {code} > [07:10:58] : [Step 11/11] [ RUN ] > LinuxFilesystemIsolatorTest.ROOT_RecoverOrphanedPersistentVolume > [07:10:58]W: [Step 11/11] I0410 07:10:58.289384 32129 cluster.cpp:149] > Creating default 'local' authorizer > [07:10:58]W: [Step 11/11] I0410 07:10:58.317526 32129 leveldb.cpp:174] > Opened db in 27.91929ms > [07:10:58]W: [Step 11/11] I0410 07:10:58.318943 32129 leveldb.cpp:181] > Compacted db in 1.383973ms > [07:10:58]W: [Step 11/11] I0410 07:10:58.318989 32129 leveldb.cpp:196] > Created db iterator in 18603ns > [07:10:58]W: [Step 11/11] I0410 07:10:58.319000 32129 leveldb.cpp:202] > Seeked to beginning of db in 1529ns > [07:10:58]W: [Step 11/11] I0410 07:10:58.319008 32129 leveldb.cpp:271] > Iterated through 0 keys in the db in 358ns > [07:10:58]W: [Step 11/11] I0410 07:10:58.319046 32129 replica.cpp:779] > Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned > [07:10:58]W: [Step 11/11] I0410 07:10:58.319627 32143 recover.cpp:447] > Starting replica recovery > [07:10:58]W: [Step 11/11] I0410 07:10:58.319852 32143 recover.cpp:473] > Replica is in EMPTY status > [07:10:58]W: [Step 11/11] I0410 07:10:58.320796 32145 replica.cpp:673] > Replica in EMPTY status received a broadcasted recover request from > (17047)@172.30.2.121:48158 > [07:10:58]W: [Step 11/11] I0410 07:10:58.321202 32146 recover.cpp:193] > Received a recover response from a replica in EMPTY status > [07:10:58]W: 
[Step 11/11] I0410 07:10:58.321650 32150 recover.cpp:564] > Updating replica status to STARTING > [07:10:58]W: [Step 11/11] I0410 07:10:58.323005 32149 master.cpp:382] > Master 57a2cf4e-da76-4801-a887-c0c84ad59d0d (ip-172-30-2-121.mesosphere.io) > started on 172.30.2.121:48158 > [07:10:58]W: [Step 11/11] I0410 07:10:58.323022 32149 master.cpp:384] Flags > at startup: --acls="" --allocation_interval="1secs" > --allocator="HierarchicalDRF" --authenticate="true" > --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/fWC4sn/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/fWC4sn/master" > --zk_session_timeout="10secs" > [07:10:58]W: [Step 11/11] I0410 07:10:58.323227 32149 master.cpp:433] > Master only allowing authenticated frameworks to register > [07:10:58]W: [Step 11/11] I0410 07:10:58.323237 32149 master.cpp:438] > Master only allowing authenticated agents to register > [07:10:58]W: [Step 11/11] I0410 07:10:58.323243 32149 credentials.hpp:37] > Loading credentials for authentication from '/tmp/fWC4sn/credentials' > [07:10:58]W: [Step 11/11] I0410 07:10:58.323498 32149 master.cpp:480] Using > default 'crammd5' authenticator > [07:10:58]W: [Step 11/11] I0410 07:10:58.323616 32149 master.cpp:551] Using > default 'basic' HTTP authenticator > 
[07:10:58]W: [Step 11/11] I0410 07:10:58.323739 32149 master.cpp:589] > Authorization enabled > [07:10:58]W: [Step 11/11] I0410 07:10:58.323884 32150 > whitelist_watcher.cpp:77] No whitelist given > [07:10:58]W: [Step 11/11] I0410 07:10:58.323920 32143 hierarchical.cpp:142] > Initialized hierarchical allocator process > [07:10:58]W: [Step 11/11] I0410 07:10:58.324103 32148 leveldb.cpp:304] > Persisting metadata (8 bytes) to leveldb took 2.27166ms > [07:10:58]W: [Step 11/11] I0410 07:10:58.324126 32148 replica.cpp:320] > Persisted replica status to STARTING > [07:10:58]W: [Step 11/11] I0410 07:10:58.324322 32146 recover.cpp:473] > Replica is in STARTING status > [07:10:58]W: [Step 11/11] I0410
[jira] [Commented] (MESOS-5177) LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutorWithVolumes is flaky
[ https://issues.apache.org/jira/browse/MESOS-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235896#comment-15235896 ] Greg Mann commented on MESOS-5177: -- [~tnachen] > LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutorWithVolumes > is flaky > > > Key: MESOS-5177 > URL: https://issues.apache.org/jira/browse/MESOS-5177 > Project: Mesos > Issue Type: Bug > Components: isolation, tests >Affects Versions: 0.28.0 > Environment: CentOS 7, with libevent and SSL enabled >Reporter: Greg Mann > Labels: mesosphere > > Observed on the internal Mesosphere CI: > {code} > [19:35:11] : [Step 11/11] [ RUN ] > LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutorWithVolumes > [19:35:11]W: [Step 11/11] I0411 19:35:11.907374 31187 cluster.cpp:149] > Creating default 'local' authorizer > [19:35:11]W: [Step 11/11] I0411 19:35:11.912621 31187 leveldb.cpp:174] > Opened db in 5.045872ms > [19:35:11]W: [Step 11/11] I0411 19:35:11.914330 31187 leveldb.cpp:181] > Compacted db in 1.6835ms > [19:35:11]W: [Step 11/11] I0411 19:35:11.914373 31187 leveldb.cpp:196] > Created db iterator in 17681ns > [19:35:11]W: [Step 11/11] I0411 19:35:11.914386 31187 leveldb.cpp:202] > Seeked to beginning of db in 1769ns > [19:35:11]W: [Step 11/11] I0411 19:35:11.914393 31187 leveldb.cpp:271] > Iterated through 0 keys in the db in 306ns > [19:35:11]W: [Step 11/11] I0411 19:35:11.914429 31187 replica.cpp:779] > Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned > [19:35:11]W: [Step 11/11] I0411 19:35:11.914922 31206 recover.cpp:447] > Starting replica recovery > [19:35:11]W: [Step 11/11] I0411 19:35:11.915133 31206 recover.cpp:473] > Replica is in EMPTY status > [19:35:11]W: [Step 11/11] I0411 19:35:11.916041 31203 replica.cpp:673] > Replica in EMPTY status received a broadcasted recover request from > (16968)@172.30.2.184:40532 > [19:35:11]W: [Step 11/11] I0411 19:35:11.916425 31202 recover.cpp:193] > Received a 
recover response from a replica in EMPTY status > [19:35:11]W: [Step 11/11] I0411 19:35:11.916898 31201 recover.cpp:564] > Updating replica status to STARTING > [19:35:11]W: [Step 11/11] I0411 19:35:11.917946 31207 master.cpp:382] > Master abd3c4ca-5e96-4cbe-8814-a9c5ebd1767b (ip-172-30-2-184.mesosphere.io) > started on 172.30.2.184:40532 > [19:35:11]W: [Step 11/11] I0411 19:35:11.917966 31207 master.cpp:384] Flags > at startup: --acls="" --allocation_interval="1secs" > --allocator="HierarchicalDRF" --authenticate="true" > --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/0PzkwC/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/0PzkwC/master" > --zk_session_timeout="10secs" > [19:35:11]W: [Step 11/11] I0411 19:35:11.918198 31207 master.cpp:433] > Master only allowing authenticated frameworks to register > [19:35:11]W: [Step 11/11] I0411 19:35:11.918207 31207 master.cpp:438] > Master only allowing authenticated agents to register > [19:35:11]W: [Step 11/11] I0411 19:35:11.918213 31207 credentials.hpp:37] > Loading credentials for authentication from '/tmp/0PzkwC/credentials' > [19:35:11]W: [Step 11/11] I0411 19:35:11.918454 31207 master.cpp:480] Using > default 'crammd5' authenticator > [19:35:11]W: [Step 11/11] I0411 19:35:11.918587 31207 
master.cpp:551] Using > default 'basic' HTTP authenticator > [19:35:11]W: [Step 11/11] I0411 19:35:11.918615 31205 leveldb.cpp:304] > Persisting metadata (8 bytes) to leveldb took 1.524112ms > [19:35:11]W: [Step 11/11] I0411 19:35:11.918644 31205 replica.cpp:320] > Persisted replica status to STARTING > [19:35:11]W: [Step 11/11] I0411 19:35:11.918750 31207 master.cpp:589] > Authorization enabled > [19:35:11]W: [Step 11/11] I0411 19:35:11.918856 31204 recover.cpp:473] > Replica is in STARTING status > [19:35:11]W: [Step 11/11] I0411 19:35:11.918908 31201 hierarchical.cpp:142] > Initialized hierarchical allocator process > [19:35:11]W: [Step 11/11] I0411 19:35:11.918912 3
[jira] [Created] (MESOS-5177) LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutorWithVolumes is flaky
Greg Mann created MESOS-5177: Summary: LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutorWithVolumes is flaky Key: MESOS-5177 URL: https://issues.apache.org/jira/browse/MESOS-5177 Project: Mesos Issue Type: Bug Components: isolation, tests Affects Versions: 0.28.0 Environment: CentOS 7, with libevent and SSL enabled Reporter: Greg Mann Observed on the internal Mesosphere CI: {code} [19:35:11] : [Step 11/11] [ RUN ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutorWithVolumes [19:35:11]W: [Step 11/11] I0411 19:35:11.907374 31187 cluster.cpp:149] Creating default 'local' authorizer [19:35:11]W: [Step 11/11] I0411 19:35:11.912621 31187 leveldb.cpp:174] Opened db in 5.045872ms [19:35:11]W: [Step 11/11] I0411 19:35:11.914330 31187 leveldb.cpp:181] Compacted db in 1.6835ms [19:35:11]W: [Step 11/11] I0411 19:35:11.914373 31187 leveldb.cpp:196] Created db iterator in 17681ns [19:35:11]W: [Step 11/11] I0411 19:35:11.914386 31187 leveldb.cpp:202] Seeked to beginning of db in 1769ns [19:35:11]W: [Step 11/11] I0411 19:35:11.914393 31187 leveldb.cpp:271] Iterated through 0 keys in the db in 306ns [19:35:11]W: [Step 11/11] I0411 19:35:11.914429 31187 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned [19:35:11]W: [Step 11/11] I0411 19:35:11.914922 31206 recover.cpp:447] Starting replica recovery [19:35:11]W: [Step 11/11] I0411 19:35:11.915133 31206 recover.cpp:473] Replica is in EMPTY status [19:35:11]W: [Step 11/11] I0411 19:35:11.916041 31203 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (16968)@172.30.2.184:40532 [19:35:11]W: [Step 11/11] I0411 19:35:11.916425 31202 recover.cpp:193] Received a recover response from a replica in EMPTY status [19:35:11]W: [Step 11/11] I0411 19:35:11.916898 31201 recover.cpp:564] Updating replica status to STARTING [19:35:11]W: [Step 11/11] I0411 19:35:11.917946 31207 master.cpp:382] Master abd3c4ca-5e96-4cbe-8814-a9c5ebd1767b 
(ip-172-30-2-184.mesosphere.io) started on 172.30.2.184:40532 [19:35:11]W: [Step 11/11] I0411 19:35:11.917966 31207 master.cpp:384] Flags at startup: --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/0PzkwC/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="100secs" --registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/0PzkwC/master" --zk_session_timeout="10secs" [19:35:11]W: [Step 11/11] I0411 19:35:11.918198 31207 master.cpp:433] Master only allowing authenticated frameworks to register [19:35:11]W: [Step 11/11] I0411 19:35:11.918207 31207 master.cpp:438] Master only allowing authenticated agents to register [19:35:11]W: [Step 11/11] I0411 19:35:11.918213 31207 credentials.hpp:37] Loading credentials for authentication from '/tmp/0PzkwC/credentials' [19:35:11]W: [Step 11/11] I0411 19:35:11.918454 31207 master.cpp:480] Using default 'crammd5' authenticator [19:35:11]W: [Step 11/11] I0411 19:35:11.918587 31207 master.cpp:551] Using default 'basic' HTTP authenticator [19:35:11]W: [Step 11/11] I0411 19:35:11.918615 31205 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 1.524112ms [19:35:11]W: [Step 11/11] I0411 19:35:11.918644 31205 replica.cpp:320] Persisted replica status to STARTING [19:35:11]W: [Step 11/11] I0411 19:35:11.918750 
31207 master.cpp:589] Authorization enabled [19:35:11]W: [Step 11/11] I0411 19:35:11.918856 31204 recover.cpp:473] Replica is in STARTING status [19:35:11]W: [Step 11/11] I0411 19:35:11.918908 31201 hierarchical.cpp:142] Initialized hierarchical allocator process [19:35:11]W: [Step 11/11] I0411 19:35:11.918912 31208 whitelist_watcher.cpp:77] No whitelist given [19:35:11]W: [Step 11/11] I0411 19:35:11.919694 31202 replica.cpp:673] Replica in STARTING status received a broadcasted recover request from (16970)@172.30.2.184:40532 [19:35:11]W: [Step 11/11] I0411 19:35:11.920127 31205 recover.cpp:193] Received a recover response from a replica in STARTING status [19:35:11]W: [Step 11/11] I0411
[jira] [Created] (MESOS-5176) LinuxFilesystemIsolatorTest.ROOT_RecoverOrphanedPersistentVolume is flaky
Greg Mann created MESOS-5176: Summary: LinuxFilesystemIsolatorTest.ROOT_RecoverOrphanedPersistentVolume is flaky Key: MESOS-5176 URL: https://issues.apache.org/jira/browse/MESOS-5176 Project: Mesos Issue Type: Bug Components: tests Environment: CentOS 7, with libevent and SSL enabled Reporter: Greg Mann Observed on the internal Mesosphere CI: {code} [07:10:58] : [Step 11/11] [ RUN ] LinuxFilesystemIsolatorTest.ROOT_RecoverOrphanedPersistentVolume [07:10:58]W: [Step 11/11] I0410 07:10:58.289384 32129 cluster.cpp:149] Creating default 'local' authorizer [07:10:58]W: [Step 11/11] I0410 07:10:58.317526 32129 leveldb.cpp:174] Opened db in 27.91929ms [07:10:58]W: [Step 11/11] I0410 07:10:58.318943 32129 leveldb.cpp:181] Compacted db in 1.383973ms [07:10:58]W: [Step 11/11] I0410 07:10:58.318989 32129 leveldb.cpp:196] Created db iterator in 18603ns [07:10:58]W: [Step 11/11] I0410 07:10:58.319000 32129 leveldb.cpp:202] Seeked to beginning of db in 1529ns [07:10:58]W: [Step 11/11] I0410 07:10:58.319008 32129 leveldb.cpp:271] Iterated through 0 keys in the db in 358ns [07:10:58]W: [Step 11/11] I0410 07:10:58.319046 32129 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned [07:10:58]W: [Step 11/11] I0410 07:10:58.319627 32143 recover.cpp:447] Starting replica recovery [07:10:58]W: [Step 11/11] I0410 07:10:58.319852 32143 recover.cpp:473] Replica is in EMPTY status [07:10:58]W: [Step 11/11] I0410 07:10:58.320796 32145 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (17047)@172.30.2.121:48158 [07:10:58]W: [Step 11/11] I0410 07:10:58.321202 32146 recover.cpp:193] Received a recover response from a replica in EMPTY status [07:10:58]W: [Step 11/11] I0410 07:10:58.321650 32150 recover.cpp:564] Updating replica status to STARTING [07:10:58]W: [Step 11/11] I0410 07:10:58.323005 32149 master.cpp:382] Master 57a2cf4e-da76-4801-a887-c0c84ad59d0d (ip-172-30-2-121.mesosphere.io) started on 172.30.2.121:48158 
[07:10:58]W: [Step 11/11] I0410 07:10:58.323022 32149 master.cpp:384] Flags at startup: --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/fWC4sn/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="100secs" --registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/fWC4sn/master" --zk_session_timeout="10secs" [07:10:58]W: [Step 11/11] I0410 07:10:58.323227 32149 master.cpp:433] Master only allowing authenticated frameworks to register [07:10:58]W: [Step 11/11] I0410 07:10:58.323237 32149 master.cpp:438] Master only allowing authenticated agents to register [07:10:58]W: [Step 11/11] I0410 07:10:58.323243 32149 credentials.hpp:37] Loading credentials for authentication from '/tmp/fWC4sn/credentials' [07:10:58]W: [Step 11/11] I0410 07:10:58.323498 32149 master.cpp:480] Using default 'crammd5' authenticator [07:10:58]W: [Step 11/11] I0410 07:10:58.323616 32149 master.cpp:551] Using default 'basic' HTTP authenticator [07:10:58]W: [Step 11/11] I0410 07:10:58.323739 32149 master.cpp:589] Authorization enabled [07:10:58]W: [Step 11/11] I0410 07:10:58.323884 32150 whitelist_watcher.cpp:77] No whitelist given [07:10:58]W: [Step 11/11] I0410 07:10:58.323920 32143 hierarchical.cpp:142] Initialized hierarchical allocator process [07:10:58]W: [Step 11/11] I0410 
07:10:58.324103 32148 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 2.27166ms [07:10:58]W: [Step 11/11] I0410 07:10:58.324126 32148 replica.cpp:320] Persisted replica status to STARTING [07:10:58]W: [Step 11/11] I0410 07:10:58.324322 32146 recover.cpp:473] Replica is in STARTING status [07:10:58]W: [Step 11/11] I0410 07:10:58.325204 32143 replica.cpp:673] Replica in STARTING status received a broadcasted recover request from (17049)@172.30.2.121:48158 [07:10:58]W: [Step 11/11] I0410 07:10:58.325527 32145 recover.cpp:193] Received a recover response from a replica in STARTING status [07:10:58]W: [Step 11/11] I0410 07:10:58.325860 32150 master.cpp:1832] The newly elected leader is m
[jira] [Commented] (MESOS-5175) LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint is flaky
[ https://issues.apache.org/jira/browse/MESOS-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235886#comment-15235886 ] Greg Mann commented on MESOS-5175: -- [~jieyu] > LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint is flaky > - > > Key: MESOS-5175 > URL: https://issues.apache.org/jira/browse/MESOS-5175 > Project: Mesos > Issue Type: Bug > Components: tests > Environment: CentOS 7 with SSL and libevent enabled >Reporter: Greg Mann > Labels: mesosphere > > Observed on the internal Mesosphere CI: > {code} > [07:12:07] : [Step 11/11] [ RUN ] > LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint > [07:12:08]W: [Step 11/11] I0410 07:12:08.906998 32129 linux.cpp:81] Making > '/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH' > a shared mount > [07:12:08]W: [Step 11/11] I0410 07:12:08.923028 32129 > linux_launcher.cpp:101] Using /sys/fs/cgroup/freezer as the freezer hierarchy > for the Linux launcher > [07:12:08]W: [Step 11/11] I0410 07:12:08.923751 32144 > containerizer.cpp:682] Starting container > '86d04a91-e7b0-4b8f-9706-b9969796b5d1' for executor 'test_executor' of > framework '' > [07:12:08]W: [Step 11/11] I0410 07:12:08.924296 32148 provisioner.cpp:285] > Provisioning image rootfs > '/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab' > for container 86d04a91-e7b0-4b8f-9706-b9969796b5d1 > [07:12:08]W: [Step 11/11] I0410 07:12:08.924885 32145 copy.cpp:127] Copying > layer path '/tmp/WwQa3Q/test_image' to rootfs > '/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab' > [07:12:13]W: [Step 11/11] I0410 
07:12:13.627612 32145 linux.cpp:355] Bind > mounting work directory from '/tmp/WwQa3Q/sandbox' to > '/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab/mnt/mesos/sandbox' > for container 86d04a91-e7b0-4b8f-9706-b9969796b5d1 > [07:12:13]W: [Step 11/11] I0410 07:12:13.648669 32147 > linux_launcher.cpp:281] Cloning child process with flags = CLONE_NEWNS > [07:12:13]W: [Step 11/11] + > /mnt/teamcity/work/4240ba9ddd0997c3/build/src/mesos-containerizer mount > --help=false --operation=make-rslave --path=/ > [07:12:13]W: [Step 11/11] + grep -E > /mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/.+ > /proc/self/mountinfo > [07:12:13]W: [Step 11/11] + grep -v 86d04a91-e7b0-4b8f-9706-b9969796b5d1 > [07:12:13]W: [Step 11/11] + cut '-d ' -f5 > [07:12:13]W: [Step 11/11] + xargs --no-run-if-empty umount -l > [07:12:13]W: [Step 11/11] + mount -n --rbind /tmp/WwQa3Q > /mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab/mnt/mesos/sandbox/mountpoint > [07:12:13] : [Step 11/11] Changing root to > /mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab > [07:12:13]W: [Step 11/11] I0410 07:12:13.827551 32145 > containerizer.cpp:1674] Executor for container > '86d04a91-e7b0-4b8f-9706-b9969796b5d1' has exited > [07:12:13]W: [Step 11/11] I0410 07:12:13.827607 32145 > containerizer.cpp:1439] Destroying container > '86d04a91-e7b0-4b8f-9706-b9969796b5d1' > [07:12:13]W: [Step 11/11] I0410 07:12:13.830469 32145 cgroups.cpp:2676] > Freezing cgroup > 
/sys/fs/cgroup/freezer/mesos/86d04a91-e7b0-4b8f-9706-b9969796b5d1 > [07:12:13]W: [Step 11/11] I0410 07:12:13.832928 32143 cgroups.cpp:1409] > Successfully froze cgroup > /sys/fs/cgroup/freezer/mesos/86d04a91-e7b0-4b8f-9706-b9969796b5d1 after > 2.412032ms > [07:12:13]W: [Step 11/11] I0410 07:12:13.835292 32150 cgroups.cpp:2694] > Thawing cgroup > /sys/fs/cgroup/freezer/mesos/86d04a91-e7b0-4b8f-9706-b9969796b5d1 > [07:12:13]W: [Step 11/11] I0410 07:12:13.837411 32150 cgroups.cpp:1438] > Successfullly thawed cgroup > /sys/fs/cgroup/freezer/mesos/86d04a91-e7b0-4b8f-9706-b9969796b5d1 after > 2.07616ms > [07:12:13]W: [Step 11/11] I0410 07:12:13.840045 32148 linux.cpp:817] > Unmounting sandbox/work directory > '/mn
[jira] [Created] (MESOS-5175) LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint is flaky
Greg Mann created MESOS-5175: Summary: LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint is flaky Key: MESOS-5175 URL: https://issues.apache.org/jira/browse/MESOS-5175 Project: Mesos Issue Type: Bug Components: tests Environment: CentOS 7 with SSL and libevent enabled Reporter: Greg Mann Observed on the internal Mesosphere CI: {code} [07:12:07] : [Step 11/11] [ RUN ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint [07:12:08]W: [Step 11/11] I0410 07:12:08.906998 32129 linux.cpp:81] Making '/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH' a shared mount [07:12:08]W: [Step 11/11] I0410 07:12:08.923028 32129 linux_launcher.cpp:101] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher [07:12:08]W: [Step 11/11] I0410 07:12:08.923751 32144 containerizer.cpp:682] Starting container '86d04a91-e7b0-4b8f-9706-b9969796b5d1' for executor 'test_executor' of framework '' [07:12:08]W: [Step 11/11] I0410 07:12:08.924296 32148 provisioner.cpp:285] Provisioning image rootfs '/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab' for container 86d04a91-e7b0-4b8f-9706-b9969796b5d1 [07:12:08]W: [Step 11/11] I0410 07:12:08.924885 32145 copy.cpp:127] Copying layer path '/tmp/WwQa3Q/test_image' to rootfs '/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab' [07:12:13]W: [Step 11/11] I0410 07:12:13.627612 32145 linux.cpp:355] Bind mounting work directory from '/tmp/WwQa3Q/sandbox' to 
'/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab/mnt/mesos/sandbox' for container 86d04a91-e7b0-4b8f-9706-b9969796b5d1 [07:12:13]W: [Step 11/11] I0410 07:12:13.648669 32147 linux_launcher.cpp:281] Cloning child process with flags = CLONE_NEWNS [07:12:13]W: [Step 11/11] + /mnt/teamcity/work/4240ba9ddd0997c3/build/src/mesos-containerizer mount --help=false --operation=make-rslave --path=/ [07:12:13]W: [Step 11/11] + grep -E /mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/.+ /proc/self/mountinfo [07:12:13]W: [Step 11/11] + grep -v 86d04a91-e7b0-4b8f-9706-b9969796b5d1 [07:12:13]W: [Step 11/11] + cut '-d ' -f5 [07:12:13]W: [Step 11/11] + xargs --no-run-if-empty umount -l [07:12:13]W: [Step 11/11] + mount -n --rbind /tmp/WwQa3Q /mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab/mnt/mesos/sandbox/mountpoint [07:12:13] : [Step 11/11] Changing root to /mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab [07:12:13]W: [Step 11/11] I0410 07:12:13.827551 32145 containerizer.cpp:1674] Executor for container '86d04a91-e7b0-4b8f-9706-b9969796b5d1' has exited [07:12:13]W: [Step 11/11] I0410 07:12:13.827607 32145 containerizer.cpp:1439] Destroying container '86d04a91-e7b0-4b8f-9706-b9969796b5d1' [07:12:13]W: [Step 11/11] I0410 07:12:13.830469 32145 cgroups.cpp:2676] Freezing cgroup /sys/fs/cgroup/freezer/mesos/86d04a91-e7b0-4b8f-9706-b9969796b5d1 [07:12:13]W: [Step 11/11] I0410 07:12:13.832928 32143 cgroups.cpp:1409] Successfully 
froze cgroup /sys/fs/cgroup/freezer/mesos/86d04a91-e7b0-4b8f-9706-b9969796b5d1 after 2.412032ms [07:12:13]W: [Step 11/11] I0410 07:12:13.835292 32150 cgroups.cpp:2694] Thawing cgroup /sys/fs/cgroup/freezer/mesos/86d04a91-e7b0-4b8f-9706-b9969796b5d1 [07:12:13]W: [Step 11/11] I0410 07:12:13.837411 32150 cgroups.cpp:1438] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/86d04a91-e7b0-4b8f-9706-b9969796b5d1 after 2.07616ms [07:12:13]W: [Step 11/11] I0410 07:12:13.840045 32148 linux.cpp:817] Unmounting sandbox/work directory '/mnt/teamcity/temp/buildTmp/LinuxFilesystemIsolatorTest_ROOT_VolumeFromHostSandboxMountPoint_aSovaH/provisioner/containers/86d04a91-e7b0-4b8f-9706-b9969796b5d1/backends/copy/rootfses/104f1991-f54a-4dd0-ab92-48ff2d3bebab/mnt/mesos/sandbox' for container 86d04a91-e7b0-4b8f-9706-b9969796b5d1 [07:12:13]W: [Step 11/11] I0410 07:12:13.840504 32150 provisioner.cpp:330] Destr
[jira] [Updated] (MESOS-5139) ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar is flaky
[ https://issues.apache.org/jira/browse/MESOS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gilbert Song updated MESOS-5139: Sprint: Mesosphere Sprint 33 Story Points: 2 > ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar is flaky > -- > > Key: MESOS-5139 > URL: https://issues.apache.org/jira/browse/MESOS-5139 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.28.0 > Environment: Ubuntu14.04 >Reporter: Vinod Kone >Assignee: Gilbert Song > Labels: mesosphere > > Found this on ASF CI while testing 0.28.1-rc2 > {code} > [ RUN ] ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar > E0406 18:29:30.870481 520 shell.hpp:93] Command 'hadoop version 2>&1' > failed; this is the output: > sh: 1: hadoop: not found > E0406 18:29:30.870576 520 fetcher.cpp:59] Failed to create URI fetcher > plugin 'hadoop': Failed to create HDFS client: Failed to execute 'hadoop > version 2>&1'; the command was either not found or exited with a non-zero > exit status: 127 > I0406 18:29:30.871052 520 local_puller.cpp:90] Creating local puller with > docker registry '/tmp/3l8ZBv/images' > I0406 18:29:30.873325 539 metadata_manager.cpp:159] Looking for image 'abc' > I0406 18:29:30.874438 539 local_puller.cpp:142] Untarring image 'abc' from > '/tmp/3l8ZBv/images/abc.tar' to '/tmp/3l8ZBv/store/staging/5tw8bD' > I0406 18:29:30.901916 547 local_puller.cpp:162] The repositories JSON file > for image 'abc' is '{"abc":{"latest":"456"}}' > I0406 18:29:30.902304 547 local_puller.cpp:290] Extracting layer tar ball > '/tmp/3l8ZBv/store/staging/5tw8bD/123/layer.tar to rootfs > '/tmp/3l8ZBv/store/staging/5tw8bD/123/rootfs' > I0406 18:29:30.909144 547 local_puller.cpp:290] Extracting layer tar ball > '/tmp/3l8ZBv/store/staging/5tw8bD/456/layer.tar to rootfs > '/tmp/3l8ZBv/store/staging/5tw8bD/456/rootfs' > ../../src/tests/containerizer/provisioner_docker_tests.cpp:183: Failure > (imageInfo).failure(): Collect failed: Subprocess 'tar, tar, -x, -f, > 
/tmp/3l8ZBv/store/staging/5tw8bD/456/layer.tar, -C, > /tmp/3l8ZBv/store/staging/5tw8bD/456/rootfs' failed: tar: This does not look > like a tar archive > tar: Exiting with failure status due to previous errors > [ FAILED ] ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar (243 ms) > {code} --
[jira] [Commented] (MESOS-3781) Replace Master/Slave Terminology Phase I - Add duplicate agent flags
[ https://issues.apache.org/jira/browse/MESOS-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235826#comment-15235826 ] Vinod Kone commented on MESOS-3781: --- This is the workflow we use: Open --> Accept --> In Progress --> Reviewable. There should be buttons up top for making these transitions; sometimes they are at the top level and sometimes they are underneath the "Workflow" button. Once a ticket is "In Progress", you can click the "Post Review" button to post the RB link as a comment and transition the ticket to "Reviewable". > Replace Master/Slave Terminology Phase I - Add duplicate agent flags > - > > Key: MESOS-3781 > URL: https://issues.apache.org/jira/browse/MESOS-3781 > Project: Mesos > Issue Type: Task >Reporter: Diana Arroyo >Assignee: Jay Guo > --
[jira] [Updated] (MESOS-3214) Replace boost foreach with range-based for
[ https://issues.apache.org/jira/browse/MESOS-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-3214: Sprint: Mesosphere Sprint 33 > Replace boost foreach with range-based for > -- > > Key: MESOS-3214 > URL: https://issues.apache.org/jira/browse/MESOS-3214 > Project: Mesos > Issue Type: Task > Components: stout >Reporter: Michael Park >Assignee: Michael Park > Labels: mesosphere > > It's desirable to replace the boost {{foreach}} macro with the C++11 > range-based {{for}}. This will help avoid some of the pitfalls of boost > {{foreach}} such as dealing with types with commas in them, as well as > improving compiler diagnostics by avoiding the macro expansion. > One way to accomplish this is to replace the existing {{foreach (const Elem& > elem, container)}} pattern with {{for (const Elem& elem : container)}}. We > could support {{foreachkey}} and {{foreachvalue}} semantics via adaptors > {{keys}} and {{values}} which would be used like this: {{for (const Key& key > : keys(container))}}, {{for (const Value& value : values(container))}}. This > leaves {{foreachpair}} which cannot be used with {{for}}. I think it would be > desirable to support {{foreachpair}} for cases where the implicit unpacking > is useful. > Another approach is to keep {{foreach}}, {{foreachpair}}, {{foreachkey}} and > {{foreachvalue}}, but simply implement them based on range-based {{for}}. For > example, {{#define foreach(elem, container) for (elem : container)}}. While > the consistency in the names is desirable, the unnecessary indirection of the > macro definition is not. > It's unclear to me which approach we would favor in Mesos, so please share > your thoughts and preferences. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3214) Replace boost foreach with range-based for
[ https://issues.apache.org/jira/browse/MESOS-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park reassigned MESOS-3214: --- Assignee: Michael Park > Replace boost foreach with range-based for > -- > > Key: MESOS-3214 > URL: https://issues.apache.org/jira/browse/MESOS-3214 > Project: Mesos > Issue Type: Task > Components: stout >Reporter: Michael Park >Assignee: Michael Park > Labels: mesosphere > > It's desirable to replace the boost {{foreach}} macro with the C++11 > range-based {{for}}. This will help avoid some of the pitfalls of boost > {{foreach}} such as dealing with types with commas in them, as well as > improving compiler diagnostics by avoiding the macro expansion. > One way to accomplish this is to replace the existing {{foreach (const Elem& > elem, container)}} pattern with {{for (const Elem& elem : container)}}. We > could support {{foreachkey}} and {{foreachvalue}} semantics via adaptors > {{keys}} and {{values}} which would be used like this: {{for (const Key& key > : keys(container))}}, {{for (const Value& value : values(container))}}. This > leaves {{foreachpair}} which cannot be used with {{for}}. I think it would be > desirable to support {{foreachpair}} for cases where the implicit unpacking > is useful. > Another approach is to keep {{foreach}}, {{foreachpair}}, {{foreachkey}} and > {{foreachvalue}}, but simply implement them based on range-based {{for}}. For > example, {{#define foreach(elem, container) for (elem : container)}}. While > the consistency in the names is desirable, the unnecessary indirection of the > macro definition is not. > It's unclear to me which approach we would favor in Mesos, so please share > your thoughts and preferences. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4882) Add support for command and arguments to mesos-execute.
[ https://issues.apache.org/jira/browse/MESOS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4882: --- Sprint: Mesosphere Sprint 33 Affects Version/s: 0.28.0 0.27.2 Story Points: 5 Labels: cli mesosphere (was: ) Description: {{CommandInfo}} protobuf support two kinds of command: {code} // There are two ways to specify the command: // 1) If 'shell == true', the command will be launched via shell //(i.e., /bin/sh -c 'value'). The 'value' specified will be //treated as the shell command. The 'arguments' will be ignored. // 2) If 'shell == false', the command will be launched by passing //arguments to an executable. The 'value' specified will be //treated as the filename of the executable. The 'arguments' //will be treated as the arguments to the executable. This is //similar to how POSIX exec families launch processes (i.e., //execlp(value, arguments(0), arguments(1), ...)). {code} The mesos-execute cannot handle 2) now, enabling 2) can help with testing and running one off tasks. was: The commandInfo support two kind of command: {code} // There are two ways to specify the command: // 1) If 'shell == true', the command will be launched via shell //(i.e., /bin/sh -c 'value'). The 'value' specified will be //treated as the shell command. The 'arguments' will be ignored. // 2) If 'shell == false', the command will be launched by passing //arguments to an executable. The 'value' specified will be //treated as the filename of the executable. The 'arguments' //will be treated as the arguments to the executable. This is //similar to how POSIX exec families launch processes (i.e., //execlp(value, arguments(0), arguments(1), ...)). {code} The mesos-execute cannot handle 2) now, enabling 2) can help some unit test with isolator. Issue Type: Improvement (was: Bug) Summary: Add support for command and arguments to mesos-execute. (was: Enabled mesos-execute treat command as executable value and arguments.) 
> Add support for command and arguments to mesos-execute. > --- > > Key: MESOS-4882 > URL: https://issues.apache.org/jira/browse/MESOS-4882 > Project: Mesos > Issue Type: Improvement >Affects Versions: 0.28.0, 0.27.2 >Reporter: Guangya Liu >Assignee: Guangya Liu > Labels: cli, mesosphere > > The {{CommandInfo}} protobuf supports two kinds of command: > {code} > // There are two ways to specify the command: > // 1) If 'shell == true', the command will be launched via shell > // (i.e., /bin/sh -c 'value'). The 'value' specified will be > // treated as the shell command. The 'arguments' will be ignored. > // 2) If 'shell == false', the command will be launched by passing > // arguments to an executable. The 'value' specified will be > // treated as the filename of the executable. The 'arguments' > // will be treated as the arguments to the executable. This is > // similar to how POSIX exec families launch processes (i.e., > // execlp(value, arguments(0), arguments(1), ...)). > {code} > mesos-execute cannot currently handle 2); enabling it would help with testing > and running one-off tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5155) Consolidate authorization actions for quota.
[ https://issues.apache.org/jira/browse/MESOS-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235759#comment-15235759 ] Alexander Rukletsov commented on MESOS-5155: I'm afraid so, because we also have to update ACLs which we've published in 0.27. > Consolidate authorization actions for quota. > > > Key: MESOS-5155 > URL: https://issues.apache.org/jira/browse/MESOS-5155 > Project: Mesos > Issue Type: Improvement >Reporter: Alexander Rukletsov >Assignee: Zhitao Li > Labels: mesosphere > > We should have just a single authz action: {{UPDATE_QUOTA_WITH_ROLE}}. In > retrospect, it was a mistake to introduce multiple actions. > The actions that are not symmetrical are register/teardown and dynamic > reservations. They are implemented this way because the entities that > perform one action differ from the entities that perform the other. For > example, register framework is issued by a framework, teardown by an > operator. What is a good way to identify a framework? The role it runs in > may differ on each launch and makes no sense in a multi-role framework > setup; better is a sort of group id, which is its principal. Dynamic > reservations and persistent volumes can be issued by both frameworks and > operators, hence similar reasoning applies. > Now, quota is associated with a role and set only by operators. Do we need > to care about the principals that set it? Not that much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4908) Tasks cannot be killed forcefully.
[ https://issues.apache.org/jira/browse/MESOS-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4908: --- Sprint: Mesosphere Sprint 33 Story Points: 5 > Tasks cannot be killed forcefully. > -- > > Key: MESOS-4908 > URL: https://issues.apache.org/jira/browse/MESOS-4908 > Project: Mesos > Issue Type: Improvement >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: mesosphere > > Currently there is no way for a scheduler to instruct the executor to kill a > certain task immediately, skipping any possible timeouts and / or kill > policies. This may be desirable in cases like, e.g., the kill policy is 10 > minutes but something went wrong, so the scheduler decides to issue a > forceful kill. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5139) ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar is flaky
[ https://issues.apache.org/jira/browse/MESOS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5139: - Labels: mesosphere (was: ) > ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar is flaky > -- > > Key: MESOS-5139 > URL: https://issues.apache.org/jira/browse/MESOS-5139 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.28.0 > Environment: Ubuntu14.04 >Reporter: Vinod Kone >Assignee: Gilbert Song > Labels: mesosphere > > Found this on ASF CI while testing 0.28.1-rc2 > {code} > [ RUN ] ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar > E0406 18:29:30.870481 520 shell.hpp:93] Command 'hadoop version 2>&1' > failed; this is the output: > sh: 1: hadoop: not found > E0406 18:29:30.870576 520 fetcher.cpp:59] Failed to create URI fetcher > plugin 'hadoop': Failed to create HDFS client: Failed to execute 'hadoop > version 2>&1'; the command was either not found or exited with a non-zero > exit status: 127 > I0406 18:29:30.871052 520 local_puller.cpp:90] Creating local puller with > docker registry '/tmp/3l8ZBv/images' > I0406 18:29:30.873325 539 metadata_manager.cpp:159] Looking for image 'abc' > I0406 18:29:30.874438 539 local_puller.cpp:142] Untarring image 'abc' from > '/tmp/3l8ZBv/images/abc.tar' to '/tmp/3l8ZBv/store/staging/5tw8bD' > I0406 18:29:30.901916 547 local_puller.cpp:162] The repositories JSON file > for image 'abc' is '{"abc":{"latest":"456"}}' > I0406 18:29:30.902304 547 local_puller.cpp:290] Extracting layer tar ball > '/tmp/3l8ZBv/store/staging/5tw8bD/123/layer.tar to rootfs > '/tmp/3l8ZBv/store/staging/5tw8bD/123/rootfs' > I0406 18:29:30.909144 547 local_puller.cpp:290] Extracting layer tar ball > '/tmp/3l8ZBv/store/staging/5tw8bD/456/layer.tar to rootfs > '/tmp/3l8ZBv/store/staging/5tw8bD/456/rootfs' > ../../src/tests/containerizer/provisioner_docker_tests.cpp:183: Failure > (imageInfo).failure(): Collect failed: Subprocess 'tar, tar, -x, -f, > 
/tmp/3l8ZBv/store/staging/5tw8bD/456/layer.tar, -C, > /tmp/3l8ZBv/store/staging/5tw8bD/456/rootfs' failed: tar: This does not look > like a tar archive > tar: Exiting with failure status due to previous errors > [ FAILED ] ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar (243 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235678#comment-15235678 ] Adam B commented on MESOS-1739: --- This is why we suggest that changes like this will need to notify (all?) frameworks of the change in attributes, so the framework can make the right choice about what to do with its tasks based on the new information. I'm not sure, however, how we should handle frameworks that don't understand the new "attributes changed" message. > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Epic >Reporter: Patrick Reilly > Labels: external-volumes, mesosphere, myriad > > Make it so that either via a slave restart or an out-of-process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5155) Consolidate authorization actions for quota.
[ https://issues.apache.org/jira/browse/MESOS-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-5155: --- Sprint: Mesosphere Sprint 33 > Consolidate authorization actions for quota. > > > Key: MESOS-5155 > URL: https://issues.apache.org/jira/browse/MESOS-5155 > Project: Mesos > Issue Type: Improvement >Reporter: Alexander Rukletsov >Assignee: Zhitao Li > Labels: mesosphere > > We should have just a single authz action: {{UPDATE_QUOTA_WITH_ROLE}}. In > retrospect, it was a mistake to introduce multiple actions. > The actions that are not symmetrical are register/teardown and dynamic > reservations. They are implemented this way because the entities that > perform one action differ from the entities that perform the other. For > example, register framework is issued by a framework, teardown by an > operator. What is a good way to identify a framework? The role it runs in > may differ on each launch and makes no sense in a multi-role framework > setup; better is a sort of group id, which is its principal. Dynamic > reservations and persistent volumes can be issued by both frameworks and > operators, hence similar reasoning applies. > Now, quota is associated with a role and set only by operators. Do we need > to care about the principals that set it? Not that much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4941) Support update existing quota.
[ https://issues.apache.org/jira/browse/MESOS-4941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4941: --- Shepherd: Alexander Rukletsov (was: Joris Van Remoortere) Sprint: Mesosphere Sprint 33 Story Points: 8 > Support update existing quota. > -- > > Key: MESOS-4941 > URL: https://issues.apache.org/jira/browse/MESOS-4941 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Zhitao Li >Assignee: Zhitao Li > Labels: Quota, mesosphere > > We want to support updating an existing quota without the cycle of delete and > recreate. This avoids the possible starvation risk of losing the quota > between delete and recreate, and also makes the interface friendly. > Design doc: > https://docs.google.com/document/d/1c8fJY9_N0W04FtUQ_b_kZM6S0eePU7eYVyfUP14dSys -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5174) Update the balloon-framework to run on test clusters
[ https://issues.apache.org/jira/browse/MESOS-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235663#comment-15235663 ] Joseph Wu commented on MESOS-5174: -- || Review || Summary || | https://reviews.apache.org/r/45604/ | First 4 bullet points in the description | | https://reviews.apache.org/r/45905/ | Metrics | > Update the balloon-framework to run on test clusters > > > Key: MESOS-5174 > URL: https://issues.apache.org/jira/browse/MESOS-5174 > Project: Mesos > Issue Type: Improvement > Components: framework, technical debt >Reporter: Joseph Wu >Assignee: Joseph Wu > Labels: mesosphere, tech-debt > > There are a couple of problems with the balloon framework that prevent it > from being deployed (easily) on an actual cluster: > * The framework accepts 100% of memory in an offer. This means the expected > behavior (finish or OOM) is dependent on the offer size. > * The framework assumes the {{balloon-executor}} binary is available on each > agent. This is generally only true in the build environment or in > single-agent test environments. > * The framework does not specify CPUs with the executor. This is required by > many isolators. > * The executor's {{TASK_FINISHED}} logic path was untested and is flaky. > * The framework has no metrics. > * The framework only launches a single task and then exits. With this > behavior, we can't have useful metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5155) Consolidate authorization actions for quota.
[ https://issues.apache.org/jira/browse/MESOS-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235662#comment-15235662 ] Adam B commented on MESOS-5155: --- +1 to getting rid of DESTROY_QUOTA_WITH_PRINCIPAL, but has it made it into a release already? Do we need to put it through a deprecation cycle? > Consolidate authorization actions for quota. > > > Key: MESOS-5155 > URL: https://issues.apache.org/jira/browse/MESOS-5155 > Project: Mesos > Issue Type: Improvement >Reporter: Alexander Rukletsov >Assignee: Zhitao Li > Labels: mesosphere > > We should have just a single authz action: {{UPDATE_QUOTA_WITH_ROLE}}. In > retrospect, it was a mistake to introduce multiple actions. > The actions that are not symmetrical are register/teardown and dynamic > reservations. They are implemented this way because the entities that > perform one action differ from the entities that perform the other. For > example, register framework is issued by a framework, teardown by an > operator. What is a good way to identify a framework? The role it runs in > may differ on each launch and makes no sense in a multi-role framework > setup; better is a sort of group id, which is its principal. Dynamic > reservations and persistent volumes can be issued by both frameworks and > operators, hence similar reasoning applies. > Now, quota is associated with a role and set only by operators. Do we need > to care about the principals that set it? Not that much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5174) Update the balloon-framework to run on test clusters
Joseph Wu created MESOS-5174: Summary: Update the balloon-framework to run on test clusters Key: MESOS-5174 URL: https://issues.apache.org/jira/browse/MESOS-5174 Project: Mesos Issue Type: Improvement Components: framework, technical debt Reporter: Joseph Wu Assignee: Joseph Wu There are a couple of problems with the balloon framework that prevent it from being deployed (easily) on an actual cluster: * The framework accepts 100% of memory in an offer. This means the expected behavior (finish or OOM) is dependent on the offer size. * The framework assumes the {{balloon-executor}} binary is available on each agent. This is generally only true in the build environment or in single-agent test environments. * The framework does not specify CPUs with the executor. This is required by many isolators. * The executor's {{TASK_FINISHED}} logic path was untested and is flaky. * The framework has no metrics. * The framework only launches a single task and then exits. With this behavior, we can't have useful metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-5155) Consolidate authorization actions for quota.
[ https://issues.apache.org/jira/browse/MESOS-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhitao Li reassigned MESOS-5155: Assignee: Zhitao Li > Consolidate authorization actions for quota. > > > Key: MESOS-5155 > URL: https://issues.apache.org/jira/browse/MESOS-5155 > Project: Mesos > Issue Type: Improvement >Reporter: Alexander Rukletsov >Assignee: Zhitao Li > Labels: mesosphere > > We should have just a single authz action: {{UPDATE_QUOTA_WITH_ROLE}}. In > retrospect, it was a mistake to introduce multiple actions. > The actions that are not symmetrical are register/teardown and dynamic > reservations. They are implemented this way because the entities that > perform one action differ from the entities that perform the other. For > example, register framework is issued by a framework, teardown by an > operator. What is a good way to identify a framework? The role it runs in > may differ on each launch and makes no sense in a multi-role framework > setup; better is a sort of group id, which is its principal. Dynamic > reservations and persistent volumes can be issued by both frameworks and > operators, hence similar reasoning applies. > Now, quota is associated with a role and set only by operators. Do we need > to care about the principals that set it? Not that much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4760) Expose metrics and gauges for fetcher cache usage and hit rate
[ https://issues.apache.org/jira/browse/MESOS-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhitao Li updated MESOS-4760: - Assignee: Michael Browning (was: Zhitao Li) > Expose metrics and gauges for fetcher cache usage and hit rate > -- > > Key: MESOS-4760 > URL: https://issues.apache.org/jira/browse/MESOS-4760 > Project: Mesos > Issue Type: Improvement > Components: fetcher, statistics >Reporter: Michael Browning >Assignee: Michael Browning >Priority: Minor > Labels: features, fetcher, statistics, uber > > To evaluate the fetcher cache and calibrate the value of the > fetcher_cache_size flag, it would be useful to have metrics and gauges on > agents that expose operational statistics like cache hit rate, occupied cache > size, and time spent downloading resources that were not present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4760) Expose metrics and gauges for fetcher cache usage and hit rate
[ https://issues.apache.org/jira/browse/MESOS-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhitao Li reassigned MESOS-4760: Assignee: Zhitao Li > Expose metrics and gauges for fetcher cache usage and hit rate > -- > > Key: MESOS-4760 > URL: https://issues.apache.org/jira/browse/MESOS-4760 > Project: Mesos > Issue Type: Improvement > Components: fetcher, statistics >Reporter: Michael Browning >Assignee: Zhitao Li >Priority: Minor > Labels: features, fetcher, statistics, uber > > To evaluate the fetcher cache and calibrate the value of the > fetcher_cache_size flag, it would be useful to have metrics and gauges on > agents that expose operational statistics like cache hit rate, occupied cache > size, and time spent downloading resources that were not present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3739) Mesos does not set Content-Type for 400 Bad Request
[ https://issues.apache.org/jira/browse/MESOS-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3739: -- Assignee: Vinod Kone Sprint: Mesosphere Sprint 33 > Mesos does not set Content-Type for 400 Bad Request > --- > > Key: MESOS-3739 > URL: https://issues.apache.org/jira/browse/MESOS-3739 > Project: Mesos > Issue Type: Bug > Components: HTTP API >Affects Versions: 0.24.0, 0.24.1, 0.25.0 >Reporter: Ben Whitehead >Assignee: Vinod Kone > Labels: mesosphere > > While integrating with the HTTP Scheduler API I encountered the following > scenario. > The message below was serialized to protobuf and sent as the POST body > {code:title=message} > call { > type: ACKNOWLEDGE, > acknowledge: { > uuid: , > agentID: { value: "20151012-182734-16777343-5050-8978-S2" }, > taskID: { value: "task-1" } > } > } > {code} > {code:title=Request Headers} > POST /api/v1/scheduler HTTP/1.1 > Content-Type: application/x-protobuf > Accept: application/x-protobuf > Content-Length: 73 > Host: localhost:5050 > User-Agent: RxNetty Client > {code} > I received the following response > {code:title=Response Headers} > HTTP/1.1 400 Bad Request > Date: Wed, 14 Oct 2015 23:21:36 GMT > Content-Length: 74 > Failed to validate Scheduler::Call: Expecting 'framework_id' to be present > {code} > Even though my accept header made no mention of {{text/plain}} the message > body returned to me is {{text/plain}}. Additionally, there is no > {{Content-Type}} header set on the response so I can't even do anything > intelligently in my response handler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5166) ExamplesTest.DynamicReservationFramework is slow
[ https://issues.apache.org/jira/browse/MESOS-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-5166: Sprint: Mesosphere Sprint 33 > ExamplesTest.DynamicReservationFramework is slow > > > Key: MESOS-5166 > URL: https://issues.apache.org/jira/browse/MESOS-5166 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Benjamin Bannier > Labels: examples, mesosphere > > For an unoptimized build under OS X > {{ExamplesTest.DynamicReservationFramework}} currently takes more than 13 > seconds on my machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5166) ExamplesTest.DynamicReservationFramework is slow
[ https://issues.apache.org/jira/browse/MESOS-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-5166: Shepherd: Michael Park > ExamplesTest.DynamicReservationFramework is slow > > > Key: MESOS-5166 > URL: https://issues.apache.org/jira/browse/MESOS-5166 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Benjamin Bannier > Labels: examples, mesosphere > > For an unoptimized build under OS X > {{ExamplesTest.DynamicReservationFramework}} currently takes more than 13 > seconds on my machine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4689) Design doc for v1 Operator API
[ https://issues.apache.org/jira/browse/MESOS-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-4689: -- Sprint: Mesosphere Sprint 29, Mesosphere Sprint 33 (was: Mesosphere Sprint 29) > Design doc for v1 Operator API > -- > > Key: MESOS-4689 > URL: https://issues.apache.org/jira/browse/MESOS-4689 > Project: Mesos > Issue Type: Documentation >Reporter: Vinod Kone >Assignee: Kevin Klues > > We need to design how the v1 operator API (all the HTTP endpoints exposed by > master/agent that are not for scheduler/executor interactions) looks and > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3558) Implement HTTPCommandExecutor that uses the Executor Library
[ https://issues.apache.org/jira/browse/MESOS-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3558: -- Sprint: Mesosphere Sprint 33 > Implement HTTPCommandExecutor that uses the Executor Library > -- > > Key: MESOS-3558 > URL: https://issues.apache.org/jira/browse/MESOS-3558 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar >Assignee: Qian Zhang > Labels: mesosphere > > Instead of using the {{MesosExecutorDriver}}, we should make the > {{CommandExecutor}} in {{src/launcher/executor.cpp}} use the new Executor > HTTP Library that we create in {{MESOS-3550}}. > This would act as a good validation of the {{HTTP API}} implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-5173) Allow master/agent to take multiple --modules flags
[ https://issues.apache.org/jira/browse/MESOS-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya reassigned MESOS-5173: - Assignee: Kapil Arya > Allow master/agent to take multiple --modules flags > --- > > Key: MESOS-5173 > URL: https://issues.apache.org/jira/browse/MESOS-5173 > Project: Mesos > Issue Type: Task >Reporter: Kapil Arya >Assignee: Kapil Arya > Labels: mesosphere > Fix For: 0.29.0 > > > When loading multiple modules into master/agent, one has to merge all module > metadata (library name, module name, parameters, etc.) into a single JSON > file which is then passed to the --modules flag. This quickly becomes > cumbersome, especially if the modules come from different > vendors/developers. > An alternative would be to allow multiple invocations of the --modules flag, > each of which is passed on to the module manager. That way, each flag > corresponds to just one module library and the modules from that library. > Another approach is to introduce a new flag (e.g., --modules-dir) that takes > a path to a directory containing multiple JSON files. One can think > of it as analogous to systemd units: the operator drops a new file > into this directory and the file is automatically picked up by the > master/agent module manager. Further, a naming scheme that prefixes the > filename with "NN_" could signify load order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-5119) Support directory structure in CommandInfo.URI.filename in fetcher
[ https://issues.apache.org/jira/browse/MESOS-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Browning reassigned MESOS-5119: --- Assignee: Michael Browning > Support directory structure in CommandInfo.URI.filename in fetcher > -- > > Key: MESOS-5119 > URL: https://issues.apache.org/jira/browse/MESOS-5119 > Project: Mesos > Issue Type: Improvement > Components: fetcher >Reporter: Yan Xu >Assignee: Michael Browning > > In MESOS-4735, {{CommandInfo.URI.filename}} is added but there is no > validation to make sure it's a simple basename, so people can actually > specify the filename to be something like {{path/to/file}} but the validation > [won't catch it|https://reviews.apache.org/r/45046/#comment190155]. The fetch > will fail later in {{download()}} because it cannot open a destination file > without its parent directory. > Instead of fixing this by disallowing such output filename, we could actually > support this behavior. There are use cases where multiple fetch targets have > the same basename but they are organized by a directory hierarchy. > {noformat:title=} > root/app.dat > root/parent/app.dat > root/parent/child/app.dat > {noformat} > It looks to me that supporting this is straightforward and we just need to 1) > make sure the output path is within the sandbox and 2) recursively mkdirs for > the parent dirs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5145) protobuf vendored but its dependencies are not
[ https://issues.apache.org/jira/browse/MESOS-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5145: - Labels: mesosphere (was: ) > protobuf vendored but its dependencies are not > -- > > Key: MESOS-5145 > URL: https://issues.apache.org/jira/browse/MESOS-5145 > Project: Mesos > Issue Type: Bug > Components: build >Reporter: David Robinson > Labels: mesosphere > > Updating [protobuf from 2.5 to > 2.6.1|https://github.com/apache/mesos/commit/51872fba7f94d80e55c9cc9b46f96780a938f626] > has caused Mesos builds to fail if pypi.python.org is unreachable. > Protobuf-2.6.1 requires > [google-apputils|https://pypi.python.org/pypi/google-apputils] and if it's > not available the build process will attempt to download it from pypi. > Prior to this change it was possible to build Mesos without Internet access. > If the build process reaches out to arbitrary things on the Internet it's > impossible to guarantee build reproducibility. > {noformat:title=snippet from setup.py in protobuf-2.6.1.tar.gz} > setup(name = 'protobuf', > version = '2.6.1', > ... > setup_requires = ['google-apputils'], > ... > ) > {noformat} > {noformat:title=snippet from build log} > 08:20:49 DEBUG: Building protobuf Python egg ... 
> 08:20:49 DEBUG: cd ../3rdparty/libprocess/3rdparty/protobuf-2.6.1/python && > \ > 08:20:49 DEBUG: CC="gcc" \ > 08:20:49 DEBUG: CXX="g++" \ > 08:20:49 DEBUG: CFLAGS="-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic > -Wno-unused-local-typedefs" \ > 08:20:49 DEBUG: CXXFLAGS="-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 > -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic > -Wno-unused-local-typedefs -Wno-maybe-uninitialized -std=c++11" > \ > 08:20:49 DEBUG: > PYTHONPATH=/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26 > \ > 08:20:49 DEBUG: /usr/bin/python2.7 setup.py build bdist_egg > 08:20:49 DEBUG: Download error on > http://pypi.python.org/simple/google-apputils/: [Errno 111] Connection > refused -- Some packages may not be found! > 08:20:49 DEBUG: Download error on > http://pypi.python.org/simple/google-apputils/: [Errno 111] Connection > refused -- Some packages may not be found! > 08:20:49 DEBUG: Couldn't find index page for 'google-apputils' (maybe > misspelled?) > 08:20:49 DEBUG: Download error on http://pypi.python.org/simple/: [Errno 111] > Connection refused -- Some packages may not be found! 
> 08:20:49 DEBUG: No local packages or download links found for google-apputils > 08:20:49 DEBUG: Traceback (most recent call last): > 08:20:49 DEBUG: File "setup.py", line 200, in > 08:20:49 DEBUG: "Protocol Buffers are Google's data interchange format.", > 08:20:49 DEBUG: File "/usr/lib64/python2.7/distutils/core.py", line 111, in > setup > 08:20:49 DEBUG: _setup_distribution = dist = klass(attrs) > 08:20:49 DEBUG: File > "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py", > line 221, in __init__ > 08:20:49 DEBUG: self.fetch_build_eggs(attrs.pop('setup_requires')) > 08:20:49 DEBUG: File > "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py", > line 245, in fetch_build_eggs > 08:20:49 DEBUG: parse_requirements(requires), > installer=self.fetch_build_egg > 08:20:49 DEBUG: File > "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py", > line 580, in resolve > 08:20:49 DEBUG: dist = best[req.key] = env.best_match(req, self, > installer) > 08:20:49 DEBUG: File > "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py", > line 825, in best_match > 08:20:49 DEBUG: return self.obtain(req, installer) # try and > download/install > 08:20:49 DEBUG: File > "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py", > line 837, in obtain > 08:20:49 DEBUG: return installer(requirement) > 08:20:49 DEBUG: File > "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py", > line 294, in fetch_build_egg > 08:20:49 DEBUG: return cmd.easy_install(req) > 08:20:49 DEBUG: File > "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/command/easy_install.py", > line 584, in easy_install > 08:20:49 DEBUG: raise DistutilsError(msg) > 08:20:49 DEBUG: distutils.errors.DistutilsError: Could not find suitable > distribution for Requirement.parse('google-apputils') > {noformat} -- This message was sent by Atlassian JIRA 
(v6.3.4#6332)
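The failure above boils down to {{setup_requires}} triggering a network fetch at build time. As an illustration only (not a proposed Mesos patch), a build wrapper could fail fast with a clear message when a build-time requirement is not importable locally; the distribution-to-module mapping for google-apputils below is an assumption of this sketch:

```python
import importlib.util

def missing_build_requires(setup_requires):
    """Return the build-time requirements that are not importable
    locally, so an offline build can abort with a clear error instead
    of reaching out to pypi. Distribution names don't always match
    module names; the google-apputils mapping here is an assumption."""
    known_modules = {"google-apputils": "google.apputils"}
    missing = []
    for req in setup_requires:
        module = known_modules.get(req, req.replace("-", "_"))
        # Only the top-level package needs to be locatable for this check.
        if importlib.util.find_spec(module.split(".")[0]) is None:
            missing.append(req)
    return missing
```

Checking this list before invoking `setup.py` would preserve the pre-2.6.1 property that a build either succeeds or fails deterministically without Internet access.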
[jira] [Commented] (MESOS-2533) Support HTTP checks in Mesos health check program
[ https://issues.apache.org/jira/browse/MESOS-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235528#comment-15235528 ] haosdent commented on MESOS-2533: - [~medzin] After reading the issue, how about using the Marathon command health check as a workaround for now? Marathon's command health check depends on the Mesos health check as well. Since it will still take some time to move this patch forward and get it merged, I'm afraid you won't be able to use it soon. > Support HTTP checks in Mesos health check program > - > > Key: MESOS-2533 > URL: https://issues.apache.org/jira/browse/MESOS-2533 > Project: Mesos > Issue Type: Bug >Reporter: Niklas Quarfot Nielsen >Assignee: haosdent > Labels: mesosphere > > Currently, only commands are supported but our health check protobuf enables > users to encode HTTP checks as well. We should wire this up in the health > check program or remove the http field from the protobuf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-5064) Remove default value for the agent `work_dir`
[ https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220643#comment-15220643 ] Greg Mann edited comment on MESOS-5064 at 4/11/16 5:18 PM: --- Reviews here: https://reviews.apache.org/r/46003/ https://reviews.apache.org/r/46005/ https://reviews.apache.org/r/46004/ https://reviews.apache.org/r/45562/ https://reviews.apache.org/r/46038/ was (Author: greggomann): Reviews here: https://reviews.apache.org/r/45562/ https://reviews.apache.org/r/45563/ > Remove default value for the agent `work_dir` > - > > Key: MESOS-5064 > URL: https://issues.apache.org/jira/browse/MESOS-5064 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan >Assignee: Greg Mann > > Following a crash report from the user we need to be more explicit about the > dangers of using {{/tmp}} as agent {{work_dir}}. In addition, we can remove > the default value for the {{\-\-work_dir}} flag, forcing users to explicitly > set the work directory for the agent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2533) Support HTTP checks in Mesos health check program
[ https://issues.apache.org/jira/browse/MESOS-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235486#comment-15235486 ] Adam Medziński commented on MESOS-2533: --- [~haosd...@gmail.com] thanks for the suggestion, but I asked about this because of my issue on the Marathon GitHub: https://github.com/mesosphere/marathon/issues/3728 > Support HTTP checks in Mesos health check program > - > > Key: MESOS-2533 > URL: https://issues.apache.org/jira/browse/MESOS-2533 > Project: Mesos > Issue Type: Bug >Reporter: Niklas Quarfot Nielsen >Assignee: haosdent > Labels: mesosphere > > Currently, only commands are supported but our health check protobuf enables > users to encode HTTP checks as well. We should wire this up in the health > check program or remove the http field from the protobuf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4922) Setup proper /etc/hostname, /etc/hosts and /etc/resolv.conf for containers in network/cni isolator.
[ https://issues.apache.org/jira/browse/MESOS-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235468#comment-15235468 ] Jie Yu commented on MESOS-4922: --- commit 00141e4a56a81525fec1f86f2b212dcbc04e3a8c Author: Avinash sridharan Date: Mon Apr 11 09:51:16 2016 -0700 Adding a stout interface for `sethostname` system call in linux. Review: https://reviews.apache.org/r/45953/ > Setup proper /etc/hostname, /etc/hosts and /etc/resolv.conf for containers in > network/cni isolator. > --- > > Key: MESOS-4922 > URL: https://issues.apache.org/jira/browse/MESOS-4922 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Qian Zhang >Assignee: Avinash Sridharan > Labels: mesosphere > > The network/cni isolator needs to properly setup /etc/hostname and /etc/hosts > for the container with a hostname (e.g., randomly generated) and the assigned > IP returned by CNI plugin. > We should consider the following cases: > 1) container is using host filesystem > 2) container is using a different filesystem > 3) custom executor and command executor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5173) Allow master/agent to take multiple --modules flags
Kapil Arya created MESOS-5173: - Summary: Allow master/agent to take multiple --modules flags Key: MESOS-5173 URL: https://issues.apache.org/jira/browse/MESOS-5173 Project: Mesos Issue Type: Task Reporter: Kapil Arya Fix For: 0.29.0 When loading multiple modules into master/agent, one has to merge all module metadata (library name, module name, parameters, etc.) into a single json file which is then passed on to the --modules flag. This quickly becomes cumbersome, especially if the modules are coming from different vendors/developers. An alternative would be to allow multiple invocations of the --modules flag that can then be passed on to the module manager. That way, each flag corresponds to just one module library and the modules from that library. Another approach is to create a new flag (e.g., --modules-dir) that contains a path to a directory containing multiple json files. One can think of it as analogous to systemd units. The operator drops a new file into this directory and the file is automatically picked up by the master/agent module manager. Further, the naming scheme can prefix the filename with an "NN_" to signify load order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
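The --modules-dir idea sketched above — a drop-in directory with "NN_"-prefixed files controlling load order — could look roughly like this. This is only an illustration in Python (the module manager is C++), and the manifest shape simply mirrors the existing --modules JSON format:

```python
import json
import os

def load_modules_dir(path):
    """Merge Mesos-style module manifests from a drop-in directory.

    Files are processed in sorted filename order, so an 'NN_' prefix
    (e.g. 10_foo.json before 20_bar.json) controls load order, much
    like systemd unit drop-ins. The {"libraries": [...]} layout mirrors
    the --modules JSON format; this is a sketch, not Mesos code."""
    merged = {"libraries": []}
    for name in sorted(os.listdir(path)):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(path, name)) as f:
            merged["libraries"].extend(json.load(f).get("libraries", []))
    return merged
```

Each file then describes exactly one vendor's module library, and ordering stays explicit without any single merged file to maintain.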
[jira] [Updated] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.
[ https://issues.apache.org/jira/browse/MESOS-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-4891: -- Shepherd: Jie Yu > Add a '/containers' endpoint to the agent to list all the active containers. > > > Key: MESOS-4891 > URL: https://issues.apache.org/jira/browse/MESOS-4891 > Project: Mesos > Issue Type: Improvement > Components: slave >Reporter: Jie Yu >Assignee: Jay Guo > Labels: mesosphere > Attachments: screenshot.png > > > This endpoint will be similar to /monitor/statistics.json endpoint, but it'll > also contain the 'container_status' about the container (see ContainerStatus > in mesos.proto). We'll eventually deprecate the /monitor/statistics.json > endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.
[ https://issues.apache.org/jira/browse/MESOS-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-4891: -- Story Points: 8 > Add a '/containers' endpoint to the agent to list all the active containers. > > > Key: MESOS-4891 > URL: https://issues.apache.org/jira/browse/MESOS-4891 > Project: Mesos > Issue Type: Improvement > Components: slave >Reporter: Jie Yu >Assignee: Jay Guo > Labels: mesosphere > Attachments: screenshot.png > > > This endpoint will be similar to /monitor/statistics.json endpoint, but it'll > also contain the 'container_status' about the container (see ContainerStatus > in mesos.proto). We'll eventually deprecate the /monitor/statistics.json > endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5172) Registry puller cannot fetch blobs correctly from some private repos.
Gilbert Song created MESOS-5172: --- Summary: Registry puller cannot fetch blobs correctly from some private repos. Key: MESOS-5172 URL: https://issues.apache.org/jira/browse/MESOS-5172 Project: Mesos Issue Type: Bug Components: containerization Reporter: Gilbert Song Assignee: Gilbert Song When the registry puller is pulling a private repository from some private registry (e.g., quay.io), errors may occur when fetching blobs, even though fetching the repo's manifest completes correctly. The error message is `Unexpected HTTP response '400 Bad Request' when trying to download the blob`. This may arise from the blob-fetching logic, or from an incorrectly formatted URI when requesting blobs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
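For context on where a malformed URI could come from: the Docker Registry v2 API addresses blobs at /v2/&lt;name&gt;/blobs/&lt;digest&gt;, where &lt;name&gt; for a private repository includes its namespace. A sketch with illustrative names (this is not the puller's actual code):

```python
def blob_url(registry, repository, digest):
    """Build a Docker Registry v2 blob URL.

    'repository' must be the full name including namespace (e.g.
    'myorg/myrepo' on quay.io); dropping the namespace or mangling the
    digest is an easy way to get a '400 Bad Request'. All values here
    are illustrative."""
    return "https://{}/v2/{}/blobs/{}".format(registry, repository, digest)
```

Comparing the URI the puller actually emits against this shape would be a quick first check when debugging the 400 response.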
[jira] [Updated] (MESOS-4944) Improve overlay backend so that it's writable
[ https://issues.apache.org/jira/browse/MESOS-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-4944: -- Shepherd: Jie Yu Sprint: Mesosphere Sprint 32 Labels: mesosphere (was: ) > Improve overlay backend so that it's writable > - > > Key: MESOS-4944 > URL: https://issues.apache.org/jira/browse/MESOS-4944 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Jie Yu >Assignee: Shuai Lin > Labels: mesosphere > Fix For: 0.29.0 > > > Currently, the overlay backend will provision a read-only FS. We can use an > empty directory from the container sandbox to act as the upper layer so that > it's writable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4944) Improve overlay backend so that it's writable
[ https://issues.apache.org/jira/browse/MESOS-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-4944: -- Story Points: 5 > Improve overlay backend so that it's writable > - > > Key: MESOS-4944 > URL: https://issues.apache.org/jira/browse/MESOS-4944 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Jie Yu >Assignee: Shuai Lin > Labels: mesosphere > Fix For: 0.29.0 > > > Currently, the overlay backend will provision a read-only FS. We can use an > empty directory from the container sandbox to act as the upper layer so that > it's writable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4944) Improve overlay backend so that it's writable
[ https://issues.apache.org/jira/browse/MESOS-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-4944: -- Component/s: containerization > Improve overlay backend so that it's writable > - > > Key: MESOS-4944 > URL: https://issues.apache.org/jira/browse/MESOS-4944 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Jie Yu >Assignee: Shuai Lin > Labels: mesosphere > Fix For: 0.29.0 > > > Currently, the overlay backend will provision a read-only FS. We can use an > empty directory from the container sandbox to act as the upper layer so that > it's writable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
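The idea in MESOS-4944 — an empty directory in the container sandbox acting as the writable upper layer — maps onto overlayfs mount options roughly as below. This Python sketch only builds the option string (the isolator itself is C++), and the overlay_upper/overlay_work directory names are invented:

```python
import os

def overlay_mount_options(lower_layers, sandbox):
    """Build overlayfs options for a writable provisioned filesystem.

    overlayfs stacks 'lowerdir' entries left to right (topmost first)
    and requires 'workdir' to live on the same filesystem as
    'upperdir'; placing both in the container sandbox makes the mount
    writable, per MESOS-4944. Directory names are invented for this
    sketch."""
    upper = os.path.join(sandbox, "overlay_upper")
    work = os.path.join(sandbox, "overlay_work")
    os.makedirs(upper, exist_ok=True)
    os.makedirs(work, exist_ok=True)
    return "lowerdir={},upperdir={},workdir={}".format(
        ":".join(lower_layers), upper, work)
```

Without an upperdir, overlayfs mounts read-only, which is exactly the current limitation the ticket describes.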
[jira] [Created] (MESOS-5171) Expose state/state.hpp to public headers
Kapil Arya created MESOS-5171: - Summary: Expose state/state.hpp to public headers Key: MESOS-5171 URL: https://issues.apache.org/jira/browse/MESOS-5171 Project: Mesos Issue Type: Task Components: replicated log Reporter: Kapil Arya Assignee: Kapil Arya Fix For: 0.29.0 We want the Modules to be able to use the replicated log along with the APIs to communicate with Zookeeper. This change would require us to expose at least the following headers: state/storage.hpp, and any additional files that state.hpp depends on (e.g., zookeeper/authentication.hpp). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5170) Adapt json creation for authorization based endpoint filtering.
[ https://issues.apache.org/jira/browse/MESOS-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-5170: --- Story Points: 5 (was: 3) Labels: authorization mesosphere security (was: mesosphere security) > Adapt json creation for authorization based endpoint filtering. > --- > > Key: MESOS-5170 > URL: https://issues.apache.org/jira/browse/MESOS-5170 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad > Labels: authorization, mesosphere, security > > For authorization based endpoint filtering we need to adapt the json endpoint > creation as discussed in MESOS-4931. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5169) Introduce new Authorizer Actions for Authorized based filtering of endpoints.
[ https://issues.apache.org/jira/browse/MESOS-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-5169: --- Labels: authorization mesosphere security (was: ) > Introduce new Authorizer Actions for Authorized based filtering of endpoints. > - > > Key: MESOS-5169 > URL: https://issues.apache.org/jira/browse/MESOS-5169 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad > Labels: authorization, mesosphere, security > > For authorization based endpoint filtering we need to introduce the > authorizer actions outlined via MESOS-4931. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5168) Benchmark overhead of authorization based filtering.
[ https://issues.apache.org/jira/browse/MESOS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-5168: --- Labels: authorization mesosphere security (was: mesosphere security) > Benchmark overhead of authorization based filtering. > > > Key: MESOS-5168 > URL: https://issues.apache.org/jira/browse/MESOS-5168 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad > Labels: authorization, mesosphere, security > > When adding authorization based filtering as outlined in MESOS-4931 we need > to be careful, especially for performance-critical endpoints such as /state. > We should ensure via a benchmark that performance does not degrade below an > acceptable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5168) Benchmark overhead of authorization based filtering.
Joerg Schad created MESOS-5168: -- Summary: Benchmark overhead of authorization based filtering. Key: MESOS-5168 URL: https://issues.apache.org/jira/browse/MESOS-5168 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad When adding authorization based filtering as outlined in MESOS-4931 we need to be careful, especially for performance-critical endpoints such as /state. We should ensure via a benchmark that performance does not degrade below an acceptable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5168) Benchmark overhead of authorization based filtering.
[ https://issues.apache.org/jira/browse/MESOS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-5168: --- Labels: mesosphere security (was: ) > Benchmark overhead of authorization based filtering. > > > Key: MESOS-5168 > URL: https://issues.apache.org/jira/browse/MESOS-5168 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad > Labels: mesosphere, security > > When adding authorization based filtering as outlined in MESOS-4931 we need > to be careful, especially for performance-critical endpoints such as /state. > We should ensure via a benchmark that performance does not degrade below an > acceptable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5170) Adapt json creation for authorization based endpoint filtering.
[ https://issues.apache.org/jira/browse/MESOS-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-5170: --- Labels: mesosphere security (was: mesosphere) > Adapt json creation for authorization based endpoint filtering. > --- > > Key: MESOS-5170 > URL: https://issues.apache.org/jira/browse/MESOS-5170 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad > Labels: mesosphere, security > > For authorization based endpoint filtering we need to adapt the json endpoint > creation as discussed in MESOS-4931. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5170) Adapt json creation for authorization based endpoint filtering.
Joerg Schad created MESOS-5170: -- Summary: Adapt json creation for authorization based endpoint filtering. Key: MESOS-5170 URL: https://issues.apache.org/jira/browse/MESOS-5170 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad For authorization based endpoint filtering we need to adapt the json endpoint creation as discussed in MESOS-4931. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5169) Introduce new Authorizer Actions for Authorized based filtering of endpoints.
Joerg Schad created MESOS-5169: -- Summary: Introduce new Authorizer Actions for Authorized based filtering of endpoints. Key: MESOS-5169 URL: https://issues.apache.org/jira/browse/MESOS-5169 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad For authorization based endpoint filtering we need to introduce the authorizer actions outlined via MESOS-4931. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5168) Benchmark overhead of authorization based filtering.
[ https://issues.apache.org/jira/browse/MESOS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-5168: --- Story Points: 3 > Benchmark overhead of authorization based filtering. > > > Key: MESOS-5168 > URL: https://issues.apache.org/jira/browse/MESOS-5168 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad > > When adding authorization based filtering as outlined in MESOS-4931 we need > to be careful, especially for performance-critical endpoints such as /state. > We should ensure via a benchmark that performance does not degrade below an > acceptable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4931) Authorization based filtering for endpoints.
[ https://issues.apache.org/jira/browse/MESOS-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-4931: --- Description: Some endpoints such as /state should be filtered depending on which information the user is authorized to see. For example a user should be only able to see tasks he is authorized to see. (was: Some endpoints -such as state- should be filtered depending on which information the user is authorized to see. For example a user should be only able to see tasks he is authorized to see.) > Authorization based filtering for endpoints. > > > Key: MESOS-4931 > URL: https://issues.apache.org/jira/browse/MESOS-4931 > Project: Mesos > Issue Type: Epic > Components: security >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: authorization, mesosphere, security > Fix For: 0.29.0 > > > Some endpoints such as /state should be filtered depending on which > information the user is authorized to see. For example a user should be only > able to see tasks he is authorized to see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
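One way to picture the filtering this epic describes: given a /state-style document and a per-task authorization predicate (standing in here for the real Authorizer, which Mesos implements in C++), strip the tasks the caller may not see. The document shape is deliberately simplified:

```python
def filter_state(state, is_authorized):
    """Return a copy of a simplified /state document with tasks the
    caller is not authorized to see removed. 'is_authorized' is a
    caller-supplied predicate; a real implementation would consult the
    configured Authorizer rather than a plain function."""
    filtered = dict(state)
    filtered["frameworks"] = [
        # dict(fw, tasks=...) copies each framework, so the original
        # state document is left untouched.
        dict(fw, tasks=[t for t in fw.get("tasks", []) if is_authorized(t)])
        for fw in state.get("frameworks", [])
    ]
    return filtered
```

The copy-then-filter shape also hints at the performance concern raised in MESOS-5168: filtering happens per request, on top of the existing JSON generation.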
[jira] [Updated] (MESOS-5130) Enable `network/cni` isolator in `MesosContainerizer` as the default `network` isolator.
[ https://issues.apache.org/jira/browse/MESOS-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avinash Sridharan updated MESOS-5130: - Sprint: Mesosphere Sprint 32 Description: Currently there are no default `network` isolators for `MesosContainerizer`. With the development of the `network/cni` isolator we have an interface to run Mesos on a multitude of IP networks. Given that it's based on an open standard (the CNI spec) which is gathering a lot of traction from vendors (calico, weave, coreOS) and already works on some default networks (bridge, ipvlan, macvlan), it makes sense to make it the default network isolator. (was: The CNI network isolator needs to be enabled by default. ) > Enable `network/cni` isolator in `MesosContainerizer` as the default > `network` isolator. > > > Key: MESOS-5130 > URL: https://issues.apache.org/jira/browse/MESOS-5130 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Avinash Sridharan >Assignee: Avinash Sridharan > Labels: mesosphere > > Currently there are no default `network` isolators for `MesosContainerizer`. > With the development of the `network/cni` isolator we have an interface to > run Mesos on a multitude of IP networks. Given that it's based on an open > standard (the CNI spec) which is gathering a lot of traction from vendors > (calico, weave, coreOS) and already works on some default networks (bridge, > ipvlan, macvlan), it makes sense to make it the default network isolator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5064) Remove default value for the agent `work_dir`
[ https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5064: - Summary: Remove default value for the agent `work_dir` (was: Document avoiding using `/tmp` as agent’s work directory in production) > Remove default value for the agent `work_dir` > - > > Key: MESOS-5064 > URL: https://issues.apache.org/jira/browse/MESOS-5064 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan >Assignee: Greg Mann > > Following a crash report from the user we need to be more explicit about the > dangers of using {{/tmp}} as agent {{work_dir}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5064) Remove default value for the agent `work_dir`
[ https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5064: - Description: Following a crash report from the user we need to be more explicit about the dangers of using {{/tmp}} as agent {{work_dir}}. In addition, we can remove the default value for the {{\-\-work_dir}} flag, forcing users to explicitly set the work directory for the agent. (was: Following a crash report from the user we need to be more explicit about the dangers of using {{/tmp}} as agent {{work_dir}}) > Remove default value for the agent `work_dir` > - > > Key: MESOS-5064 > URL: https://issues.apache.org/jira/browse/MESOS-5064 > Project: Mesos > Issue Type: Bug >Reporter: Artem Harutyunyan >Assignee: Greg Mann > > Following a crash report from the user we need to be more explicit about the > dangers of using {{/tmp}} as agent {{work_dir}}. In addition, we can remove > the default value for the {{\-\-work_dir}} flag, forcing users to explicitly > set the work directory for the agent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3084) PPC64LE architecture support on third-party libraries
[ https://issues.apache.org/jira/browse/MESOS-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235275#comment-15235275 ] Neil Conway commented on MESOS-3084: Hi [~ykrips] -- I believe that this JIRA (and the associated review requests) can be closed, because the libraries in question have been updated. Is that correct? Thanks. > PPC64LE architecture support on third-party libraries > - > > Key: MESOS-3084 > URL: https://issues.apache.org/jira/browse/MESOS-3084 > Project: Mesos > Issue Type: Improvement > Components: general, libprocess >Affects Versions: 0.22.1 > Environment: Ubuntu 14.04 ppc64le >Reporter: Jihun Kang >Assignee: Jihun Kang >Priority: Minor > > Some third-party libraries fell behind their upstream development cycle; newer > upstream releases already support the ppc64 and ppc64le architectures, but these > changes have not been applied to the vendored copies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5151) Marathon Pass Dynamic Value with Parameters Resource in Docker Configuration
[ https://issues.apache.org/jira/browse/MESOS-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235206#comment-15235206 ] Greg Mann commented on MESOS-5151: -- [~jesada], thanks for the extra information. I believe that Marathon already supports passing arbitrary command-line parameters to the Docker CLI; for example, see the JSON provided at the bottom of this Github issue: https://github.com/mesosphere/marathon/issues/3111 > Marathon Pass Dynamic Value with Parameters Resource in Docker Configuration > > > Key: MESOS-5151 > URL: https://issues.apache.org/jira/browse/MESOS-5151 > Project: Mesos > Issue Type: Wish > Components: docker >Affects Versions: 0.28.0 > Environment: software >Reporter: Jesada Gonkratoke > > "parameters": [ >{ "key": "add-host", "value": "dockerhost:$(hostname -i)" } > ] > }, > # I want to add dynamic host ip -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2533) Support HTTP checks in Mesos health check program
[ https://issues.apache.org/jira/browse/MESOS-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235105#comment-15235105 ] haosdent commented on MESOS-2533: - [~medzin] Thank you for your inquiry. This patch is stale now; I think I need to rebase it again. But if you want HTTP checks now, you could use a command check as a workaround. For example, you could use {{curl xxx}} as your health check command. > Support HTTP checks in Mesos health check program > - > > Key: MESOS-2533 > URL: https://issues.apache.org/jira/browse/MESOS-2533 > Project: Mesos > Issue Type: Bug >Reporter: Niklas Quarfot Nielsen >Assignee: haosdent > Labels: mesosphere > > Currently, only commands are supported but our health check protobuf enables > users to encode HTTP checks as well. We should wire this up in the health > check program or remove the http field from the protobuf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
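The curl-style workaround suggested above amounts to: perform an HTTP GET and report success only for a healthy status. A minimal Python equivalent (the URL and timeout are illustrative; a real command health check would run such a script and key off its exit status):

```python
from urllib.error import URLError
from urllib.request import urlopen

def http_health_check(url, timeout=5):
    """Return True iff 'url' answers with a 2xx status within 'timeout'.

    Equivalent in spirit to using 'curl --fail <url>' as a command
    health check while native HTTP checks are not yet wired up."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (URLError, OSError):
        # Connection refused, DNS failure, timeout: all count as unhealthy.
        return False
```

A health check command would wrap this in a tiny script exiting 0 on True and 1 on False, which is all the command-based checker observes.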
[jira] [Commented] (MESOS-3070) Master CHECK failure if a framework uses duplicated task id.
[ https://issues.apache.org/jira/browse/MESOS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235104#comment-15235104 ] Klaus Ma commented on MESOS-3070: - Ping [~vinodkone]/[~jieyu] :). > Master CHECK failure if a framework uses duplicated task id. > > > Key: MESOS-3070 > URL: https://issues.apache.org/jira/browse/MESOS-3070 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.22.1 >Reporter: Jie Yu >Assignee: Klaus Ma > > We observed this in one of our testing clusters. > One framework (under development) keeps launching tasks using the same > task_id. We don't expect the master to crash even if the framework is not > doing what it's supposed to do. However, under a certain series of events, this can > happen and keep crashing the master. > 1) frameworkA launches task 'task_id_1' on slaveA > 2) master fails over > 3) slaveA has not re-registered yet > 4) frameworkA re-registers and launches task 'task_id_1' on slaveB > 5) slaveA re-registers and adds task 'task_id_1' to frameworkA > 6) CHECK failure in addTask > {noformat} > I0716 21:52:50.759305 28805 master.hpp:159] Adding task 'task_id_1' with > resources cpus(*):4; mem(*):32768 on slave > 20150417-232509-1735470090-5050-48870-S25 (hostname) > ... > ... > F0716 21:52:50.760136 28805 master.hpp:362] Check failed: > !tasks.contains(task->task_id()) Duplicate task 'task_id_1' of framework > > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4070) numify() handles negative numbers inconsistently.
[ https://issues.apache.org/jira/browse/MESOS-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235077#comment-15235077 ] Yong Tang commented on MESOS-4070: -- Hi [~jieyu], I am wondering if you have had a chance to take a look at the review request: https://reviews.apache.org/r/45011/ And since you initially opened this JIRA ticket (MESOS-4070), would it be possible for you to shepherd it? Thanks. > numify() handles negative numbers inconsistently. > - > > Key: MESOS-4070 > URL: https://issues.apache.org/jira/browse/MESOS-4070 > Project: Mesos > Issue Type: Bug > Components: stout >Reporter: Jie Yu >Assignee: Yong Tang > Labels: tech-debt > > As pointed out by [~neilc] in this review: > https://reviews.apache.org/r/40988 > {noformat} > Try num2 = numify("-10"); > EXPECT_SOME_EQ(-10, num2); > // TODO(neilc): This is inconsistent with the handling of non-hex numbers. > EXPECT_ERROR(numify("-0x10")); > {noformat}
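For comparison, Python's int(s, 0) exhibits the consistent behavior the ticket asks of stout's numify(): a leading '-' composes the same way for decimal and hexadecimal literals. This sketch is only a reference point, not a proposed implementation:

```python
def numify(s):
    """Parse a decimal or hexadecimal string, treating a leading '-'
    uniformly (the consistency MESOS-4070 asks stout's numify() for).
    base=0 lets the '0x' prefix select the base automatically."""
    return int(s, 0)
```

Under this behavior, "-0x10" parses to -16 exactly as "-10" parses to -10, instead of the hex case erroring out.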
[jira] [Commented] (MESOS-4621) --disable-optimize triggers optimized builds.
[ https://issues.apache.org/jira/browse/MESOS-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235072#comment-15235072 ] Yong Tang commented on MESOS-4621: -- Hi [~tillt], just wondering if you get a chance to take a look at the review request for: https://reviews.apache.org/r/44911/ Or do you think we should close this issue MESOS-4621 and continue on the other issue MESOS-2537 by [~jamespeach]? > --disable-optimize triggers optimized builds. > - > > Key: MESOS-4621 > URL: https://issues.apache.org/jira/browse/MESOS-4621 > Project: Mesos > Issue Type: Bug >Reporter: Till Toenshoff >Assignee: Yong Tang >Priority: Minor > > The toggle-logic of the build configuration argument {{optimize}} appears to > be implemented incorrectly. When using the perfectly legal invocation; > {noformat} > ../configure --disable-optimize > {noformat} > What you get here is enabled optimizing {{O2}}. > {noformat} > ccache g++ -Qunused-arguments -fcolor-diagnostics > -DPACKAGE_NAME=\"libprocess\" -DPACKAGE_TARNAME=\"libprocess\" > -DPACKAGE_VERSION=\"0.0.1\" -DPACKAGE_STRING=\"libprocess\ 0.0.1\" > -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"libprocess\" > -DVERSION=\"0.0.1\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 > -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 > -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 > -DLT_OBJDIR=\".libs/\" -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 > -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 > -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBCURL=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 > -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBDL=1 -I. 
> -I../../../../3rdparty/libprocess/3rdparty > -I../../../../3rdparty/libprocess/3rdparty/stout/include -Iprotobuf-2.5.0/src > -Igmock-1.7.0/gtest/include -Igmock-1.7.0/include -isystem boost-1.53.0 > -Ipicojson-1.3.0 -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS -Iglog-0.3.3/src > -I/usr/local/opt/openssl/include -I/usr/local/opt/libevent/include > -I/usr/local/opt/subversion/include/subversion-1 -I/usr/include/apr-1 > -I/usr/include/apr-1.0 -O2 -Wno-unused-local-typedef -std=c++11 > -stdlib=libc++ -DGTEST_USE_OWN_TR1_TUPLE=1 -DGTEST_LANG_CXX11 -MT > stout_tests-flags_tests.o -MD -MP -MF .deps/stout_tests-flags_tests.Tpo -c -o > stout_tests-flags_tests.o `test -f 'stout/tests/flags_tests.cpp' || echo > '../../../../3rdparty/libprocess/3rdparty/'`stout/tests/flags_tests.cpp > {noformat} > It would be more straightforward for this argument to actually disable > optimization.
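The bug pattern here, a disable flag that nonetheless enables the feature, can be modeled outside autoconf. A hypothetical Python sketch of the intended toggle (the function name and default are illustrative, not the actual configure.ac logic):

```python
def optimize_flags(args):
    """Model of the intended --enable-optimize/--disable-optimize toggle:
    emit -O2 only when optimization has not been disabled. Illustrative
    sketch; the real logic lives in Mesos' configure.ac m4 macros."""
    enabled = True  # assumed default; last flag on the command line wins
    for arg in args:
        if arg == "--enable-optimize":
            enabled = True
        elif arg == "--disable-optimize":
            enabled = False
    return ["-O2"] if enabled else ["-O0"]

# The reported bug is that Mesos' configure produced -O2 in this case:
assert optimize_flags(["--disable-optimize"]) == ["-O0"]
assert optimize_flags([]) == ["-O2"]
```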
[jira] [Updated] (MESOS-5167) Add tests for `network/cni` isolator
[ https://issues.apache.org/jira/browse/MESOS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qian Zhang updated MESOS-5167: -- Shepherd: Jie Yu > Add tests for `network/cni` isolator > > > Key: MESOS-5167 > URL: https://issues.apache.org/jira/browse/MESOS-5167 > Project: Mesos > Issue Type: Task > Components: test >Reporter: Qian Zhang >Assignee: Qian Zhang > > We need to add tests to verify the functionality of the `network/cni` isolator.
[jira] [Created] (MESOS-5167) Add tests for `network/cni` isolator
Qian Zhang created MESOS-5167: - Summary: Add tests for `network/cni` isolator Key: MESOS-5167 URL: https://issues.apache.org/jira/browse/MESOS-5167 Project: Mesos Issue Type: Task Components: test Reporter: Qian Zhang Assignee: Qian Zhang We need to add tests to verify the functionality of the `network/cni` isolator.
[jira] [Created] (MESOS-5166) ExamplesTest.DynamicReservationFramework is slow
Benjamin Bannier created MESOS-5166: --- Summary: ExamplesTest.DynamicReservationFramework is slow Key: MESOS-5166 URL: https://issues.apache.org/jira/browse/MESOS-5166 Project: Mesos Issue Type: Bug Components: test Reporter: Benjamin Bannier For an unoptimized build under OS X {{ExamplesTest.DynamicReservationFramework}} currently takes more than 13 seconds on my machine.
[jira] [Commented] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234997#comment-15234997 ] Deshi Xiao commented on MESOS-1739: --- Logging yujie's comment here: {quote} This is a high-level question: I am not sure whether adding attributes is safe. For instance, my framework has the following rule: only schedule tasks to agents that do not have the attribute "not_safe". Now, say agent A initially lacks that attribute. My framework lands several tasks on that agent. Later, when the agent restarts, the operator adds the new attribute "not_safe". Suddenly, I have tasks running on unsafe boxes. Oops. {quote} > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Epic >Reporter: Patrick Reilly > Labels: external-volumes, mesosphere, myriad > > Make it so that either via a slave restart or an out-of-process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be.
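The invariant in the quoted comment, never place tasks on agents carrying a given attribute, amounts to a placement-time filter on offers; the danger is that nothing re-evaluates tasks already running when an agent's attributes change on restart. A hypothetical sketch (names are illustrative, not Mesos API):

```python
def schedulable(agent_attributes, forbidden="not_safe"):
    """Placement-time check: accept an offer only if the agent does not
    carry the forbidden attribute. Illustrative sketch of the framework
    rule described in the comment; not actual Mesos scheduler code."""
    return forbidden not in agent_attributes

# At launch time, agent A looks safe, so tasks land on it:
assert schedulable({"hostname": "agent-a"})

# After a restart, the operator adds "not_safe". New offers are rejected,
# but tasks already running on agent A are never re-checked, which is the
# hazard raised against allowing attribute changes on restart:
assert not schedulable({"hostname": "agent-a", "not_safe": "true"})
```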
[jira] [Created] (MESOS-5165) Add tests to ensure the installed Python tools work
Benjamin Bannier created MESOS-5165: --- Summary: Add tests to ensure the installed Python tools work Key: MESOS-5165 URL: https://issues.apache.org/jira/browse/MESOS-5165 Project: Mesos Issue Type: Bug Components: python api, test Reporter: Benjamin Bannier We should check at least that the installed tools are complete, and probably also add some integration tests.
[jira] [Updated] (MESOS-5010) Installation of mesos python package is incomplete
[ https://issues.apache.org/jira/browse/MESOS-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-5010: -- Sprint: Mesosphere Sprint 32 Fix Version/s: 0.29.0 > Installation of mesos python package is incomplete > -- > > Key: MESOS-5010 > URL: https://issues.apache.org/jira/browse/MESOS-5010 > Project: Mesos > Issue Type: Bug > Components: python api >Affects Versions: 0.26.0, 0.28.0, 0.27.2, 0.29.0 >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Fix For: 0.29.0 > > > The installation of the mesos python package is incomplete, i.e., the files > {{cli.py}}, {{futures.py}}, and {{http.py}} are not installed. > {code} > % ../configure --enable-python > % make install DESTDIR=$PWD/D > % PYTHONPATH=$PWD/D/usr/local/lib/python2.7/site-packages:$PYTHONPATH python > -c 'from mesos import http' > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > ImportError: cannot import name http > {code} > This appears to be first broken with {{d1d70b9}} (MESOS-3969, [Upgraded > bundled pip to 7.1.2.|https://reviews.apache.org/r/40630]). Bisecting in > {{pip}}-land shows that our install becomes broken for {{pip-6.0.1}} and > later (we are using {{pip-7.1.2}}).
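A completeness check of the kind MESOS-5165 asks for can be as simple as attempting to import each expected submodule and reporting the failures. A generic sketch (the helper name is made up; the `mesos.*` module list comes from this ticket):

```python
import importlib

def missing_modules(names):
    """Return the subset of dotted module names that fail to import.
    Hypothetical helper for an install-completeness test."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# For the broken install in this ticket, one would expect
#   missing_modules(["mesos.cli", "mesos.futures", "mesos.http"])
# to report all three. Demonstrated here against stdlib names:
assert missing_modules(["json", "no_such_module_xyz"]) == ["no_such_module_xyz"]
```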