[jira] [Commented] (MESOS-8198) Update the ReconcileOfferOperations protos

2017-11-13 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250472#comment-16250472
 ] 

Greg Mann commented on MESOS-8198:
--

Review here: https://reviews.apache.org/r/63768/

> Update the ReconcileOfferOperations protos
> --
>
> Key: MESOS-8198
> URL: https://issues.apache.org/jira/browse/MESOS-8198
> Project: Mesos
>  Issue Type: Task
>Reporter: Gastón Kleiman
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Some protos have been committed, but they follow an event-based API.
> We decided to follow the request/response model for this API, so we need to 
> update the protos.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-8198) Update the ReconcileOfferOperations protos

2017-11-13 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-8198:
-
Shepherd: Vinod Kone

> Update the ReconcileOfferOperations protos
> --
>
> Key: MESOS-8198
> URL: https://issues.apache.org/jira/browse/MESOS-8198
> Project: Mesos
>  Issue Type: Task
>Reporter: Gastón Kleiman
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Some protos have been committed, but they follow an event-based API.
> We decided to follow the request/response model for this API, so we need to 
> update the protos.





[jira] [Assigned] (MESOS-8198) Update the ReconcileOfferOperations protos

2017-11-13 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-8198:


Assignee: Greg Mann

> Update the ReconcileOfferOperations protos
> --
>
> Key: MESOS-8198
> URL: https://issues.apache.org/jira/browse/MESOS-8198
> Project: Mesos
>  Issue Type: Task
>Reporter: Gastón Kleiman
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Some protos have been committed, but they follow an event-based API.
> We decided to follow the request/response model for this API, so we need to 
> update the protos.





[jira] [Commented] (MESOS-8172) Agent --authenticate_http_executors commandline flag unrecognized in 1.4.0

2017-11-08 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244125#comment-16244125
 ] 

Greg Mann commented on MESOS-8172:
--

Was Mesos built with SSL enabled? Executor authentication requires SSL for now, 
but we could improve the error messaging in this case.

> Agent --authenticate_http_executors commandline flag unrecognized in 1.4.0
> --
>
> Key: MESOS-8172
> URL: https://issues.apache.org/jira/browse/MESOS-8172
> Project: Mesos
>  Issue Type: Bug
>  Components: executor, security
>Affects Versions: 1.4.0
> Environment: Ubuntu 16.04.3 with Mesos 1.4.0 compiled from source 
> tarball.
>Reporter: Dan Leary
>Assignee: Greg Mann
>
> Apparently the mesos-agent authenticate_http_executors command-line argument 
> was introduced in 1.3.0 by MESOS-6365. But running "mesos-agent 
> --authenticate_http_executors ..." in 1.4.0 yields
> {noformat}
> Failed to load unknown flag 'authenticate_http_executors'
> {noformat}
> ...followed by a usage report that does not include 
> "--authenticate_http_executors".
> Presumably this means executor authentication is no longer configurable.
> It is still documented at 
> https://mesos.apache.org/documentation/latest/authentication/#agent





[jira] [Commented] (MESOS-8132) Design a library to send offer operation status updates

2017-11-01 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235219#comment-16235219
 ] 

Greg Mann commented on MESOS-8132:
--

Short design doc 
[here|https://docs.google.com/a/mesosphere.io/document/d/1hGPQA2pGjUwiR93J1mZuANByXupv42PRYZgi-HU8mb8/edit?usp=sharing].

> Design a library to send offer operation status updates
> ---
>
> Key: MESOS-8132
> URL: https://issues.apache.org/jira/browse/MESOS-8132
> Project: Mesos
>  Issue Type: Task
>Reporter: Greg Mann
>Assignee: Greg Mann
>Priority: Major
>  Labels: mesosphere
>
> As detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#],
>  we need to add a library to do the following:
> * Send offer operation status updates
> * Checkpoint pending/unacknowledged operations
> * Retry operation status updates until an acknowledgement is received
> This should be a common library which can be used by the agent (for its 
> default resources) and by local resource providers. In the future, it can 
> also be used by external resource providers.
> We should write a short design doc to explore precisely how this will be 
> implemented. It can probably be modeled after the task status update manager 
> in the agent.





[jira] [Commented] (MESOS-8130) Add placeholder handlers for offer operation feedback

2017-10-30 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225743#comment-16225743
 ] 

Greg Mann commented on MESOS-8130:
--

Review here: https://reviews.apache.org/r/63322/

> Add placeholder handlers for offer operation feedback
> -
>
> Key: MESOS-8130
> URL: https://issues.apache.org/jira/browse/MESOS-8130
> Project: Mesos
>  Issue Type: Task
>  Components: agent, master
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> In order to sketch out the flow of messages necessary to facilitate offer 
> operation feedback, we should add some empty placeholder handlers to the 
> master and agent as detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Commented] (MESOS-8130) Add placeholder handlers for offer operation feedback

2017-10-30 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225744#comment-16225744
 ] 

Greg Mann commented on MESOS-8130:
--

{code}
commit 6ecbf02c21d3cfdb74c56cbdde5d2c5879149ae9
Author: Greg Mann g...@mesosphere.io
Date:   Mon Oct 30 13:02:18 2017 -0700

Added placeholder handlers and other changes for operation updates.

This patch adds empty placeholder handler functions which will
be used for offer operation status updates as well as their
acknowledgement and reconciliation.

A number of switch statements are also updated to handle new
enum values and validation code is added.

Review: https://reviews.apache.org/r/63322/
{code}

> Add placeholder handlers for offer operation feedback
> -
>
> Key: MESOS-8130
> URL: https://issues.apache.org/jira/browse/MESOS-8130
> Project: Mesos
>  Issue Type: Task
>  Components: agent, master
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> In order to sketch out the flow of messages necessary to facilitate offer 
> operation feedback, we should add some empty placeholder handlers to the 
> master and agent as detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Commented] (MESOS-8131) Add new protobuf messages for offer operation feedback

2017-10-30 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225731#comment-16225731
 ] 

Greg Mann commented on MESOS-8131:
--

{code}
commit e6bec836af3a672a0838cd6a1b7687f087d5594f
Author: Greg Mann 
Date:   Mon Oct 30 13:00:58 2017 -0700

Added protobuf messages for V1 scheduler operation feedback.

This patch adds new and updated protobuf messages to facilitate
offer operation status updates, as well as acknowledgement of
those updates and operation status reconciliation.

Review: https://reviews.apache.org/r/63321/
{code}

> Add new protobuf messages for offer operation feedback
> --
>
> Key: MESOS-8131
> URL: https://issues.apache.org/jira/browse/MESOS-8131
> Project: Mesos
>  Issue Type: Task
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
> Fix For: 1.5.0
>
>
> We should add the necessary protobuf messages for offer operation feedback as 
> detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Commented] (MESOS-8131) Add new protobuf messages for offer operation feedback

2017-10-30 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225729#comment-16225729
 ] 

Greg Mann commented on MESOS-8131:
--

Review here: https://reviews.apache.org/r/63321/

> Add new protobuf messages for offer operation feedback
> --
>
> Key: MESOS-8131
> URL: https://issues.apache.org/jira/browse/MESOS-8131
> Project: Mesos
>  Issue Type: Task
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> We should add the necessary protobuf messages for offer operation feedback as 
> detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Updated] (MESOS-8054) Feedback for offer operations

2017-10-30 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-8054:
-
Shepherd: Greg Mann

> Feedback for offer operations
> -
>
> Key: MESOS-8054
> URL: https://issues.apache.org/jira/browse/MESOS-8054
> Project: Mesos
>  Issue Type: Epic
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>
> Only LAUNCH operations provide feedback on success or failure. All operations 
> should do so. RESERVE, UNRESERVE, CREATE, DESTROY, CREATE_VOLUME, and 
> DESTROY_VOLUME should all provide feedback on success or failure.





[jira] [Updated] (MESOS-8140) Executors should clear their auth tokens

2017-10-30 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-8140:
-
Shepherd: Greg Mann
  Labels: security  (was: )

> Executors should clear their auth tokens
> 
>
> Key: MESOS-8140
> URL: https://issues.apache.org/jira/browse/MESOS-8140
> Project: Mesos
>  Issue Type: Bug
>  Components: executor, security
>Reporter: James Peach
>Assignee: James Peach
>  Labels: security
> Fix For: 1.5.0
>
>
> The built-in executors should clear {{MESOS_EXECUTOR_AUTHENTICATION_TOKEN}} 
> from their environment since otherwise tasks running as the same user in the 
> same container can trivially inspect it.
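As a rough illustration of the fix described above, here is a minimal sketch of reading a variable and then removing it from the process environment, so that processes started afterwards do not inherit it. The helper name {{takeEnv}} is hypothetical and is not taken from the Mesos code base; it assumes a POSIX system where {{unsetenv}} is available, and it is not thread-safe if other threads modify the environment concurrently.

```cpp
#include <cstdlib>  // std::getenv; unsetenv is POSIX and declared here on Linux
#include <string>

// Hypothetical helper (not a Mesos API): copies the value of `name` into
// `*out`, removes the variable from this process's environment, and returns
// true. Returns false and leaves `*out` untouched if the variable is not set.
bool takeEnv(const std::string& name, std::string* out)
{
  const char* value = std::getenv(name.c_str());
  if (value == nullptr) {
    return false;
  }

  // Copy first: the pointer returned by getenv may become invalid once the
  // environment is modified.
  *out = value;

  ::unsetenv(name.c_str());
  return true;
}
```

An executor following this pattern would call {{takeEnv("MESOS_EXECUTOR_AUTHENTICATION_TOKEN", &token)}} once at startup, keep the token in memory, and launch tasks only afterwards.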





[jira] [Commented] (MESOS-8130) Add placeholder handlers for offer operation feedback

2017-10-26 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220059#comment-16220059
 ] 

Greg Mann commented on MESOS-8130:
--

Review here: https://reviews.apache.org/r/63322/

> Add placeholder handlers for offer operation feedback
> -
>
> Key: MESOS-8130
> URL: https://issues.apache.org/jira/browse/MESOS-8130
> Project: Mesos
>  Issue Type: Task
>  Components: agent, master
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> In order to sketch out the flow of messages necessary to facilitate offer 
> operation feedback, we should add some empty placeholder handlers to the 
> master and agent as detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Issue Comment Deleted] (MESOS-8130) Add placeholder handlers for offer operation feedback

2017-10-26 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-8130:
-
Comment: was deleted

(was: Review here: https://reviews.apache.org/r/63322/)

> Add placeholder handlers for offer operation feedback
> -
>
> Key: MESOS-8130
> URL: https://issues.apache.org/jira/browse/MESOS-8130
> Project: Mesos
>  Issue Type: Task
>  Components: agent, master
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> In order to sketch out the flow of messages necessary to facilitate offer 
> operation feedback, we should add some empty placeholder handlers to the 
> master and agent as detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Assigned] (MESOS-8132) Design a library to send offer operation status updates

2017-10-25 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-8132:


Assignee: Greg Mann

> Design a library to send offer operation status updates
> ---
>
> Key: MESOS-8132
> URL: https://issues.apache.org/jira/browse/MESOS-8132
> Project: Mesos
>  Issue Type: Task
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> As detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#],
>  we need to add a library to do the following:
> * Send offer operation status updates
> * Checkpoint pending/unacknowledged operations
> * Retry operation status updates until an acknowledgement is received
> This should be a common library which can be used by the agent (for its 
> default resources) and by local resource providers. In the future, it can 
> also be used by external resource providers.
> We should write a short design doc to explore precisely how this will be 
> implemented. It can probably be modeled after the task status update manager 
> in the agent.





[jira] [Created] (MESOS-8132) Design a library to send offer operation status updates

2017-10-25 Thread Greg Mann (JIRA)
Greg Mann created MESOS-8132:


 Summary: Design a library to send offer operation status updates
 Key: MESOS-8132
 URL: https://issues.apache.org/jira/browse/MESOS-8132
 Project: Mesos
  Issue Type: Task
Reporter: Greg Mann


As detailed in the [offer operation feedback design 
doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#],
 we need to add a library to do the following:
* Send offer operation status updates
* Checkpoint pending/unacknowledged operations
* Retry operation status updates until an acknowledgement is received

This should be a common library which can be used by the agent (for its default 
resources) and by local resource providers. In the future, it can also be used 
by external resource providers.

We should write a short design doc to explore precisely how this will be 
implemented. It can probably be modeled after the task status update manager in 
the agent.
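
To make the three responsibilities listed above concrete, here is a small in-memory sketch. It is an assumption-laden illustration, not the design from the linked document: the names {{UpdateManagerSketch}} and {{OperationStatusUpdate}}, the string-typed fields, and the callback-based sender are all hypothetical, and the "checkpoint" is just a map in memory where a real library would persist to disk and drive {{retry()}} from a timer.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an operation status update message.
struct OperationStatusUpdate
{
  std::string operationId;
  std::string state;
};

// Sketch of a manager that sends updates, remembers unacknowledged ones,
// and resends them until an acknowledgement arrives.
class UpdateManagerSketch
{
public:
  using Sender = std::function<void(const OperationStatusUpdate&)>;

  explicit UpdateManagerSketch(Sender sender) : send_(std::move(sender)) {}

  // Records the update as pending (stand-in for checkpointing) and sends it.
  void update(const OperationStatusUpdate& u)
  {
    pending_[u.operationId] = u;
    send_(u);
  }

  // Resends every update that has not been acknowledged yet.
  void retry()
  {
    for (const auto& entry : pending_) {
      send_(entry.second);
    }
  }

  // Drops the pending update for `operationId`.
  // Returns false if no such update was pending.
  bool acknowledge(const std::string& operationId)
  {
    return pending_.erase(operationId) > 0;
  }

  std::size_t pendingCount() const { return pending_.size(); }

private:
  Sender send_;
  std::map<std::string, OperationStatusUpdate> pending_;
};
```

Passing the sender as a callback is one way the same class could be shared by the agent and by resource providers, since each caller decides where updates are delivered.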





[jira] [Commented] (MESOS-8131) Add new protobuf messages for offer operation feedback

2017-10-25 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219160#comment-16219160
 ] 

Greg Mann commented on MESOS-8131:
--

This ticket refers to the framework API parts of the offer operation feedback 
feature. Storage-related work has already resulted in a couple of reviews for 
other protobufs in the design:
* https://reviews.apache.org/r/63001/
* https://reviews.apache.org/r/63094/

> Add new protobuf messages for offer operation feedback
> --
>
> Key: MESOS-8131
> URL: https://issues.apache.org/jira/browse/MESOS-8131
> Project: Mesos
>  Issue Type: Task
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> We should add the necessary protobuf messages for offer operation feedback as 
> detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Updated] (MESOS-8131) Add new protobuf messages for offer operation feedback

2017-10-25 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-8131:
-
Shepherd: Jie Yu

> Add new protobuf messages for offer operation feedback
> --
>
> Key: MESOS-8131
> URL: https://issues.apache.org/jira/browse/MESOS-8131
> Project: Mesos
>  Issue Type: Task
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> We should add the necessary protobuf messages for offer operation feedback as 
> detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Assigned] (MESOS-8131) Add new protobuf messages for offer operation feedback

2017-10-25 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-8131:


Assignee: Greg Mann

> Add new protobuf messages for offer operation feedback
> --
>
> Key: MESOS-8131
> URL: https://issues.apache.org/jira/browse/MESOS-8131
> Project: Mesos
>  Issue Type: Task
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> We should add the necessary protobuf messages for offer operation feedback as 
> detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Created] (MESOS-8131) Add new protobuf messages for offer operation feedback

2017-10-25 Thread Greg Mann (JIRA)
Greg Mann created MESOS-8131:


 Summary: Add new protobuf messages for offer operation feedback
 Key: MESOS-8131
 URL: https://issues.apache.org/jira/browse/MESOS-8131
 Project: Mesos
  Issue Type: Task
Reporter: Greg Mann


We should add the necessary protobuf messages for offer operation feedback as 
detailed in the [offer operation feedback design 
doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Created] (MESOS-8130) Add placeholder handlers for offer operation feedback

2017-10-25 Thread Greg Mann (JIRA)
Greg Mann created MESOS-8130:


 Summary: Add placeholder handlers for offer operation feedback
 Key: MESOS-8130
 URL: https://issues.apache.org/jira/browse/MESOS-8130
 Project: Mesos
  Issue Type: Task
  Components: agent, master
Reporter: Greg Mann


In order to sketch out the flow of messages necessary to facilitate offer 
operation feedback, we should add some empty placeholder handlers to the master 
and agent as detailed in the [offer operation feedback design 
doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Assigned] (MESOS-8130) Add placeholder handlers for offer operation feedback

2017-10-25 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-8130:


Assignee: Greg Mann

> Add placeholder handlers for offer operation feedback
> -
>
> Key: MESOS-8130
> URL: https://issues.apache.org/jira/browse/MESOS-8130
> Project: Mesos
>  Issue Type: Task
>  Components: agent, master
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> In order to sketch out the flow of messages necessary to facilitate offer 
> operation feedback, we should add some empty placeholder handlers to the 
> master and agent as detailed in the [offer operation feedback design 
> doc|https://docs.google.com/document/d/1GGh14SbPTItjiweSZfann4GZ6PCteNrn-1y4pxOjgcI/edit#].





[jira] [Commented] (MESOS-8126) Consider decoupling the authorization logic from response creation.

2017-10-24 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217127#comment-16217127
 ] 

Greg Mann commented on MESOS-8126:
--

Agreed - I think breaking the authorization code out of {{createAgentResponse}} 
would clean things up.

If we use a helper which modifies the {{GetAgents::Agent}} in-place, like we do 
in {{convertResourceFormat}}, then we could avoid extra copies as a result of 
the refactor.
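
A rough sketch of that shape, under stated assumptions: the types and function names below ({{AgentSketch}}, {{createAgentSketch}}, {{filterRolesInPlace}}) are hypothetical and do not mirror the actual Mesos signatures, and filtering a list of roles is only a stand-in for whatever the real authorization step removes. The point is that response creation does no authorization, and the separate helper mutates the already-built object rather than returning a filtered copy.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-agent part of a response.
struct AgentSketch
{
  std::string id;
  std::vector<std::string> roles;
};

// Response creation only: builds the object, performs no authorization.
AgentSketch createAgentSketch(
    const std::string& id,
    const std::vector<std::string>& roles)
{
  AgentSketch agent;
  agent.id = id;
  agent.roles = roles;
  return agent;
}

// Predicate deciding whether the caller may see a given role.
using RoleAcceptor = std::function<bool(const std::string&)>;

// Authorization only: removes roles the acceptor rejects, modifying
// `*agent` in place so no second copy of the response is made.
void filterRolesInPlace(AgentSketch* agent, const RoleAcceptor& accept)
{
  std::vector<std::string>& roles = agent->roles;
  roles.erase(
      std::remove_if(
          roles.begin(),
          roles.end(),
          [&accept](const std::string& role) { return !accept(role); }),
      roles.end());
}
```

A caller that wants custom authorization can then skip {{filterRolesInPlace}} and apply its own step, instead of calling the creation helper without an acceptor.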

> Consider decoupling the authorization logic from response creation.
> ---
>
> Key: MESOS-8126
> URL: https://issues.apache.org/jira/browse/MESOS-8126
> Project: Mesos
>  Issue Type: Task
>Reporter: Michael Park
>
> Currently the {{createAgentResponse}} function performs some authorization,
> given an optional {{rolesAcceptor}}. The {{_getAgents}} function uses this
> helper *with* a {{rolesAcceptor}}. {{createAgentAdded}}, on the other hand,
> uses the helper *without* a {{rolesAcceptor}} and is passed to
> {{Master::Subscriber::send}} for authorization post-hoc.
> At first glance, it seemed like there were 2 authorizations being done for no
> reason, and it seems like it could be beneficial to actually pull the
> authorization logic out of the response creation logic, rather than coupling
> them and bypassing authorization when we want *custom* authorization logic.





[jira] [Commented] (MESOS-6985) os::getenv() can segfault

2017-10-23 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16215880#comment-16215880
 ] 

Greg Mann commented on MESOS-6985:
--

Hey [~ipronin]! The approach you proposed here back in January sounds good to 
me. Do you have any cycles to work on this at present? If so, I can shepherd 
the ticket.

> os::getenv() can segfault
> -
>
> Key: MESOS-6985
> URL: https://issues.apache.org/jira/browse/MESOS-6985
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
> Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without 
> libevent/SSL
>Reporter: Greg Mann
>Assignee: Ilya Pronin
>  Labels: reliability, stout
> Attachments: 
> MasterMaintenanceTest.InverseOffersFilters-truncated.txt, 
> MasterTest.MultipleExecutors.txt
>
>
> This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 
> and has been produced by the tests {{MasterTest.MultipleExecutors}} and 
> {{MasterMaintenanceTest.InverseOffersFilters}}. In both cases, 
> {{os::getenv()}} segfaults with the same stack trace:
> {code}
> *** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are 
> using GNU date ***
> PC: @ 0x2ad59e3ae82d (unknown)
> I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0
> *** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; 
> stack trace: ***
> I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: 
> executor(75)@172.17.0.2:45752 with pid 28591
> @ 0x2ad5ab953197 (unknown)
> @ 0x2ad5ab957479 (unknown)
> @ 0x2ad59e165330 (unknown)
> @ 0x2ad59e3ae82d (unknown)
> @ 0x2ad594631358 os::getenv()
> @ 0x2ad59aba6acf mesos::internal::slave::executorEnvironment()
> @ 0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor()
> @ 0x2ad59ab818a2 mesos::internal::slave::Slave::_run()
> @ 0x2ad59ac1ec10 
> _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_
> @ 0x2ad59ac1e6bf 
> _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> @ 0x2ad59bce2304 std::function<>::operator()()
> @ 0x2ad59bcc9824 process::ProcessBase::visit()
> @ 0x2ad59bd4028e process::DispatchEvent::visit()
> @ 0x2ad594616df1 process::ProcessBase::serve()
> @ 0x2ad59bcc72b7 process::ProcessManager::resume()
> @ 0x2ad59bcd567c 
> process::ProcessManager::init_threads()::$_2::operator()()
> @ 0x2ad59bcd5585 
> _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2ad59bcd std::_Bind_simple<>::operator()()
> @ 0x2ad59bcd552c std::thread::_Impl<>::_M_run()
> @ 0x2ad59d9e6a60 (unknown)
> @ 0x2ad59e15d184 start_thread
> @ 0x2ad59e46d37d (unknown)
> make[4]: *** [check-local] Segmentation fault
> {code}
> Find attached the full log from a failed run of 
> {{MasterTest.MultipleExecutors}} and a truncated log from a failed run of 
> {{MasterMaintenanceTest.InverseOffersFilters}}.





[jira] [Updated] (MESOS-6985) os::getenv() can segfault

2017-10-23 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-6985:
-
Shepherd: Greg Mann

> os::getenv() can segfault
> -
>
> Key: MESOS-6985
> URL: https://issues.apache.org/jira/browse/MESOS-6985
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
> Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without 
> libevent/SSL
>Reporter: Greg Mann
>  Labels: reliability, stout
> Attachments: 
> MasterMaintenanceTest.InverseOffersFilters-truncated.txt, 
> MasterTest.MultipleExecutors.txt
>
>
> This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 
> and has been produced by the tests {{MasterTest.MultipleExecutors}} and 
> {{MasterMaintenanceTest.InverseOffersFilters}}. In both cases, 
> {{os::getenv()}} segfaults with the same stack trace:
> {code}
> *** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are 
> using GNU date ***
> PC: @ 0x2ad59e3ae82d (unknown)
> I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0
> *** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; 
> stack trace: ***
> I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: 
> executor(75)@172.17.0.2:45752 with pid 28591
> @ 0x2ad5ab953197 (unknown)
> @ 0x2ad5ab957479 (unknown)
> @ 0x2ad59e165330 (unknown)
> @ 0x2ad59e3ae82d (unknown)
> @ 0x2ad594631358 os::getenv()
> @ 0x2ad59aba6acf mesos::internal::slave::executorEnvironment()
> @ 0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor()
> @ 0x2ad59ab818a2 mesos::internal::slave::Slave::_run()
> @ 0x2ad59ac1ec10 
> _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_
> @ 0x2ad59ac1e6bf 
> _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> @ 0x2ad59bce2304 std::function<>::operator()()
> @ 0x2ad59bcc9824 process::ProcessBase::visit()
> @ 0x2ad59bd4028e process::DispatchEvent::visit()
> @ 0x2ad594616df1 process::ProcessBase::serve()
> @ 0x2ad59bcc72b7 process::ProcessManager::resume()
> @ 0x2ad59bcd567c 
> process::ProcessManager::init_threads()::$_2::operator()()
> @ 0x2ad59bcd5585 
> _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2ad59bcd std::_Bind_simple<>::operator()()
> @ 0x2ad59bcd552c std::thread::_Impl<>::_M_run()
> @ 0x2ad59d9e6a60 (unknown)
> @ 0x2ad59e15d184 start_thread
> @ 0x2ad59e46d37d (unknown)
> make[4]: *** [check-local] Segmentation fault
> {code}
> Find attached the full log from a failed run of 
> {{MasterTest.MultipleExecutors}} and a truncated log from a failed run of 
> {{MasterMaintenanceTest.InverseOffersFilters}}.





[jira] [Assigned] (MESOS-6985) os::getenv() can segfault

2017-10-23 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-6985:


Assignee: Ilya Pronin

> os::getenv() can segfault
> -
>
> Key: MESOS-6985
> URL: https://issues.apache.org/jira/browse/MESOS-6985
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
> Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without 
> libevent/SSL
>Reporter: Greg Mann
>Assignee: Ilya Pronin
>  Labels: reliability, stout
> Attachments: 
> MasterMaintenanceTest.InverseOffersFilters-truncated.txt, 
> MasterTest.MultipleExecutors.txt
>
>
> This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 
> and has been produced by the tests {{MasterTest.MultipleExecutors}} and 
> {{MasterMaintenanceTest.InverseOffersFilters}}. In both cases, 
> {{os::getenv()}} segfaults with the same stack trace:
> {code}
> *** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are 
> using GNU date ***
> PC: @ 0x2ad59e3ae82d (unknown)
> I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0
> *** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; 
> stack trace: ***
> I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: 
> executor(75)@172.17.0.2:45752 with pid 28591
> @ 0x2ad5ab953197 (unknown)
> @ 0x2ad5ab957479 (unknown)
> @ 0x2ad59e165330 (unknown)
> @ 0x2ad59e3ae82d (unknown)
> @ 0x2ad594631358 os::getenv()
> @ 0x2ad59aba6acf mesos::internal::slave::executorEnvironment()
> @ 0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor()
> @ 0x2ad59ab818a2 mesos::internal::slave::Slave::_run()
> @ 0x2ad59ac1ec10 
> _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_
> @ 0x2ad59ac1e6bf 
> _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> @ 0x2ad59bce2304 std::function<>::operator()()
> @ 0x2ad59bcc9824 process::ProcessBase::visit()
> @ 0x2ad59bd4028e process::DispatchEvent::visit()
> @ 0x2ad594616df1 process::ProcessBase::serve()
> @ 0x2ad59bcc72b7 process::ProcessManager::resume()
> @ 0x2ad59bcd567c 
> process::ProcessManager::init_threads()::$_2::operator()()
> @ 0x2ad59bcd5585 
> _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2ad59bcd std::_Bind_simple<>::operator()()
> @ 0x2ad59bcd552c std::thread::_Impl<>::_M_run()
> @ 0x2ad59d9e6a60 (unknown)
> @ 0x2ad59e15d184 start_thread
> @ 0x2ad59e46d37d (unknown)
> make[4]: *** [check-local] Segmentation fault
> {code}
> Find attached the full log from a failed run of 
> {{MasterTest.MultipleExecutors}} and a truncated log from a failed run of 
> {{MasterMaintenanceTest.InverseOffersFilters}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-8117) Update Getting Started documentation

2017-10-20 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212940#comment-16212940
 ] 

Greg Mann commented on MESOS-8117:
--

{code}
commit 8386e22f20d9d20836df6111221cb3afdaf2a3ba
Author: Andrew Schwartzmeyer 
Date:   Fri Oct 20 10:37:24 2017 -0700

Moved building docs to `building.md`.

The existing "Getting Started" documentation does not cover how to "get
started" with Mesos, but instead how to build it from source on multiple
platforms. Also added a link to `configuration.md` in the build
documentation section, as it was not obvious.

Review: https://reviews.apache.org/r/63093/
{code}
{code}
commit b71478750dce4a26d84c1840a4e6d73349a6f0db
Author: Andrew Schwartzmeyer 
Date:   Fri Oct 20 10:37:25 2017 -0700

Added the Getting Started landing page.

After moving the build documentation to its own page, we can now have a
real "Getting Started" page suitable for anyone to get started with
Mesos. It is purposefully short, and therefore not overwhelming.

Review: https://reviews.apache.org/r/63095/
{code}

> Update Getting Started documentation
> 
>
> Key: MESOS-8117
> URL: https://issues.apache.org/jira/browse/MESOS-8117
> Project: Mesos
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Andrew Schwartzmeyer
>Assignee: Andrew Schwartzmeyer
>  Labels: docuentation, microsoft
> Fix For: 1.5.0
>
>
> Our "getting started" landing page does not explain how to get started with 
> Mesos; it explains how to build it. Build instructions should live in their 
> own file, and the getting started page should be a landing page, as the 
> community page is. As someone who has onboarded other developers, I speak 
> from experience when I say we need real "getting started" info.
> This work was started at the docathon at Mesosphere a couple of weeks ago.





[jira] [Updated] (MESOS-7434) SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.

2017-10-20 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7434:
-
Story Points: 5  (was: 2)

> SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.
> -
>
> Key: MESOS-7434
> URL: https://issues.apache.org/jira/browse/MESOS-7434
> Project: Mesos
>  Issue Type: Bug
> Environment: Debian 8
> CentOS 6
> other Linux distros
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: flaky, flaky-test, mesosphere
> Attachments: RestartSlaveRequireExecutorAuthentication is 
> flaky_failure_log_centos6.txt, 
> RestartSlaveRequireExecutorAuthentication_failure_log_debian8.txt, 
> SlaveTest.RestartSlaveRequireExecAuth-Ubuntu-16.txt
>
>
> This test failure has been observed on an internal CI system. It occurs on a 
> variety of Linux distributions. It seems that using {{cat}} as the task 
> command may be problematic; see attached log file 
> {{SlaveTest.RestartSlaveRequireExecutorAuthentication.txt}}.





[jira] [Commented] (MESOS-7434) SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.

2017-10-20 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212903#comment-16212903
 ] 

Greg Mann commented on MESOS-7434:
--

This was observed recently on our internal CI, on Ubuntu 16; logs attached to 
this ticket.

> SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.
> -
>
> Key: MESOS-7434
> URL: https://issues.apache.org/jira/browse/MESOS-7434
> Project: Mesos
>  Issue Type: Bug
> Environment: Debian 8
> CentOS 6
> other Linux distros
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: flaky, flaky-test, mesosphere
> Attachments: RestartSlaveRequireExecutorAuthentication is 
> flaky_failure_log_centos6.txt, 
> RestartSlaveRequireExecutorAuthentication_failure_log_debian8.txt, 
> SlaveTest.RestartSlaveRequireExecAuth-Ubuntu-16.txt
>
>
> This test failure has been observed on an internal CI system. It occurs on a 
> variety of Linux distributions. It seems that using {{cat}} as the task 
> command may be problematic; see attached log file 
> {{SlaveTest.RestartSlaveRequireExecutorAuthentication.txt}}.





[jira] [Updated] (MESOS-7434) SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.

2017-10-20 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7434:
-
Attachment: SlaveTest.RestartSlaveRequireExecAuth-Ubuntu-16.txt

> SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.
> -
>
> Key: MESOS-7434
> URL: https://issues.apache.org/jira/browse/MESOS-7434
> Project: Mesos
>  Issue Type: Bug
> Environment: Debian 8
> CentOS 6
> other Linux distros
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: flaky, flaky-test, mesosphere
> Attachments: RestartSlaveRequireExecutorAuthentication is 
> flaky_failure_log_centos6.txt, 
> RestartSlaveRequireExecutorAuthentication_failure_log_debian8.txt, 
> SlaveTest.RestartSlaveRequireExecAuth-Ubuntu-16.txt
>
>
> This test failure has been observed on an internal CI system. It occurs on a 
> variety of Linux distributions. It seems that using {{cat}} as the task 
> command may be problematic; see attached log file 
> {{SlaveTest.RestartSlaveRequireExecutorAuthentication.txt}}.





[jira] [Updated] (MESOS-8091) Allow the KillPolicy to specify a signal

2017-10-13 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-8091:
-
Story Points: 3  (was: 1)
 Description: As specified in the design doc of MESOS-7951, the default 
executor should be updated to allow the framework to specify a particular 
signal to be used when initiating task termination.  (was: The {{KillPolicy}} 
protobuf message should be updated to match the design doc of MESOS-7951.)
 Summary: Allow the KillPolicy to specify a signal  (was: Update the 
KillPolicy protobuf message)

> Allow the KillPolicy to specify a signal
> 
>
> Key: MESOS-8091
> URL: https://issues.apache.org/jira/browse/MESOS-8091
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Greg Mann
>  Labels: mesosphere
>
> As specified in the design doc of MESOS-7951, the default executor should be 
> updated to allow the framework to specify a particular signal to be used when 
> initiating task termination.





[jira] [Created] (MESOS-8092) Allow the KillPolicy to specify a command

2017-10-13 Thread Greg Mann (JIRA)
Greg Mann created MESOS-8092:


 Summary: Allow the KillPolicy to specify a command
 Key: MESOS-8092
 URL: https://issues.apache.org/jira/browse/MESOS-8092
 Project: Mesos
  Issue Type: Improvement
Reporter: Greg Mann


As specified in the design doc of MESOS-7951, the default executor should be 
extended to allow the specification of a command in the {{KillPolicy}}.





[jira] [Created] (MESOS-8091) Update the KillPolicy protobuf message

2017-10-13 Thread Greg Mann (JIRA)
Greg Mann created MESOS-8091:


 Summary: Update the KillPolicy protobuf message
 Key: MESOS-8091
 URL: https://issues.apache.org/jira/browse/MESOS-8091
 Project: Mesos
  Issue Type: Improvement
Reporter: Greg Mann


The {{KillPolicy}} protobuf message should be updated to match the design doc 
of MESOS-7951.





[jira] [Commented] (MESOS-564) Update Contribution Documentation

2017-10-12 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202150#comment-16202150
 ] 

Greg Mann commented on MESOS-564:
-

Review here: https://reviews.apache.org/r/62548/

> Update Contribution Documentation
> -
>
> Key: MESOS-564
> URL: https://issues.apache.org/jira/browse/MESOS-564
> Project: Mesos
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Dave Lester
>Assignee: Greg Mann
>  Labels: documentation, mesosphere
>
> Our contribution guide is currently fairly verbose, and it focuses on the 
> ReviewBoard workflow for making code contributions. It would be helpful for 
> new contributors to have a first-time contribution guide which focuses on 
> using GitHub PRs to make small contributions, since that workflow has a 
> smaller barrier to entry for new users.





[jira] [Updated] (MESOS-7914) Replace usage of `ObjectApprover` with `AuthorizationAcceptor`

2017-10-12 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7914:
-
Sprint: Mesosphere Sprint 62, Mesosphere Sprint 63, Mesosphere Sprint 64  
(was: Mesosphere Sprint 62, Mesosphere Sprint 63, Mesosphere Sprint 64, 
Mesosphere Sprint 65)

> Replace usage of `ObjectApprover` with `AuthorizationAcceptor`
> --
>
> Key: MESOS-7914
> URL: https://issues.apache.org/jira/browse/MESOS-7914
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 1.4.0
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authorization, mesosphere
>
> Now that the {{AuthorizationAcceptor}} class has been added, we can replace 
> all occurrences of {{getObjectApprover}} with 
> {{AuthorizationAcceptor::create}}.





[jira] [Updated] (MESOS-8067) Extended KillPolicy

2017-10-10 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-8067:
-
Epic Name: Extended KillPolicy  (was: Extend the KillPolicy)

> Extended KillPolicy
> ---
>
> Key: MESOS-8067
> URL: https://issues.apache.org/jira/browse/MESOS-8067
> Project: Mesos
>  Issue Type: Epic
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>






[jira] [Assigned] (MESOS-8067) Extended KillPolicy

2017-10-10 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-8067:


Assignee: Greg Mann

> Extended KillPolicy
> ---
>
> Key: MESOS-8067
> URL: https://issues.apache.org/jira/browse/MESOS-8067
> Project: Mesos
>  Issue Type: Epic
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>






[jira] [Created] (MESOS-8067) Extended KillPolicy

2017-10-10 Thread Greg Mann (JIRA)
Greg Mann created MESOS-8067:


 Summary: Extended KillPolicy
 Key: MESOS-8067
 URL: https://issues.apache.org/jira/browse/MESOS-8067
 Project: Mesos
  Issue Type: Epic
Reporter: Greg Mann








[jira] [Updated] (MESOS-7951) Design Doc for Extended KillPolicy

2017-10-10 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7951:
-
Summary: Design Doc for Extended KillPolicy  (was: Extend the KillPolicy)

> Design Doc for Extended KillPolicy
> --
>
> Key: MESOS-7951
> URL: https://issues.apache.org/jira/browse/MESOS-7951
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, executor, HTTP API
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
> Fix For: 1.5.0
>
>
> After introducing the {{KillPolicy}} in MESOS-4909, some interactions with 
> framework developers have led to the suggestion of a couple possible 
> improvements to this interface. Namely,
> * Allowing the framework to specify a command to be run to initiate 
> termination, rather than a signal to be sent, would allow some developers to 
> avoid wrapping their application in a signal handler. This is useful because 
> a signal handler wrapper modifies the application's process tree, which may 
> make introspection and debugging more difficult in the case of well-known 
> services with standard debugging procedures.
> * In the case of terminations which do begin with a signal, it would be 
> useful to allow the framework to specify the signal to be sent, rather than 
> assuming SIGTERM. PostgreSQL, for example, permits several shutdown types, 
> each initiated with a [different 
> signal|https://www.postgresql.org/docs/9.3/static/server-shutdown.html].
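As a rough illustration of the two proposals above, the sketch below models a kill action that either runs a command or sends a configurable signal. The names `KillAction`, `PostgresShutdown`, `postgresKillAction`, and `startsWithCommand` are hypothetical and are not the Mesos `KillPolicy` protobuf fields; the signal mapping follows the PostgreSQL shutdown documentation linked above (SIGTERM for smart, SIGINT for fast, SIGQUIT for immediate).

```cpp
#include <signal.h>

#include <cassert>
#include <string>

// Hypothetical model of an extended kill policy; the names here are
// illustrative and are not the actual Mesos protobuf fields.
struct KillAction
{
  std::string command;   // If non-empty, run this to initiate termination.
  int signal = SIGTERM;  // Otherwise send this signal (SIGTERM by default).
};

// PostgreSQL shutdown types, each initiated by a different signal.
enum class PostgresShutdown { SMART, FAST, IMMEDIATE };

// Builds a signal-based kill action for the requested shutdown type.
KillAction postgresKillAction(PostgresShutdown mode)
{
  KillAction action;

  switch (mode) {
    case PostgresShutdown::SMART:     action.signal = SIGTERM; break;
    case PostgresShutdown::FAST:      action.signal = SIGINT;  break;
    case PostgresShutdown::IMMEDIATE: action.signal = SIGQUIT; break;
  }

  return action;
}

// True when termination starts with a command rather than a signal.
bool startsWithCommand(const KillAction& action)
{
  return !action.command.empty();
}
```

A framework that prefers the command-based variant would leave the signal at its default and set only the command, for example `pg_ctl stop -m fast`, so that no signal handler wrapper is needed around the application.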





[jira] [Commented] (MESOS-7951) Extend the KillPolicy

2017-09-25 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16179729#comment-16179729
 ] 

Greg Mann commented on MESOS-7951:
--

Design doc here: 
https://docs.google.com/document/d/1xRaOEe2K7OIVrDTOY9UDwwJbCIwXF3wZUrXYl8Pqy24/edit?usp=sharing

> Extend the KillPolicy
> -
>
> Key: MESOS-7951
> URL: https://issues.apache.org/jira/browse/MESOS-7951
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, executor, HTTP API
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> After introducing the {{KillPolicy}} in MESOS-4909, some interactions with 
> framework developers have led to the suggestion of a couple possible 
> improvements to this interface. Namely,
> * Allowing the framework to specify a command to be run to initiate 
> termination, rather than a signal to be sent, would allow some developers to 
> avoid wrapping their application in a signal handler. This is useful because 
> a signal handler wrapper modifies the application's process tree, which may 
> make introspection and debugging more difficult in the case of well-known 
> services with standard debugging procedures.
> * In the case of terminations which do begin with a signal, it would be 
> useful to allow the framework to specify the signal to be sent, rather than 
> assuming SIGTERM. PostgreSQL, for example, permits several shutdown types, 
> each initiated with a [different 
> signal|https://www.postgresql.org/docs/9.3/static/server-shutdown.html].





[jira] [Updated] (MESOS-564) Update Contribution Documentation

2017-09-25 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-564:

Summary: Update Contribution Documentation  (was: Update 'Mesos Developers 
Guide' Contribution Documentation)

> Update Contribution Documentation
> -
>
> Key: MESOS-564
> URL: https://issues.apache.org/jira/browse/MESOS-564
> Project: Mesos
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Dave Lester
>Assignee: Greg Mann
>  Labels: documentation, mesosphere
>
> Our contribution guide is currently fairly verbose, and it focuses on the 
> ReviewBoard workflow for making code contributions. It would be helpful for 
> new contributors to have a first-time contribution guide which focuses on 
> using GitHub PRs to make small contributions, since that workflow has a 
> smaller barrier to entry for new users.





[jira] [Updated] (MESOS-564) Update 'Mesos Developers Guide' Contribution Documentation

2017-09-25 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-564:

 Sprint: Mesosphere Sprint 64
 Labels: documentation mesosphere  (was: twitter)
Description: Our contribution guide is currently fairly verbose, and it 
focuses on the ReviewBoard workflow for making code contributions. It would be 
helpful for new contributors to have a first-time contribution guide which 
focuses on using GitHub PRs to make small contributions, since that workflow 
has a smaller barrier to entry for new users.
Component/s: documentation
 Issue Type: Improvement  (was: Bug)

> Update 'Mesos Developers Guide' Contribution Documentation
> --
>
> Key: MESOS-564
> URL: https://issues.apache.org/jira/browse/MESOS-564
> Project: Mesos
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Dave Lester
>Assignee: Greg Mann
>  Labels: documentation, mesosphere
>
> Our contribution guide is currently fairly verbose, and it focuses on the 
> ReviewBoard workflow for making code contributions. It would be helpful for 
> new contributors to have a first-time contribution guide which focuses on 
> using GitHub PRs to make small contributions, since that workflow has a 
> smaller barrier to entry for new users.





[jira] [Assigned] (MESOS-564) Update 'Mesos Developers Guide' Contribution Documentation

2017-09-25 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-564:
---

Assignee: Greg Mann

> Update 'Mesos Developers Guide' Contribution Documentation
> --
>
> Key: MESOS-564
> URL: https://issues.apache.org/jira/browse/MESOS-564
> Project: Mesos
>  Issue Type: Bug
>Reporter: Dave Lester
>Assignee: Greg Mann
>  Labels: twitter
>






[jira] [Commented] (MESOS-7914) Replace usage of `ObjectApprover` with `AuthorizationAcceptor`

2017-09-19 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172487#comment-16172487
 ] 

Greg Mann commented on MESOS-7914:
--

Reviews here:
https://reviews.apache.org/r/61924/
https://reviews.apache.org/r/61925/

> Replace usage of `ObjectApprover` with `AuthorizationAcceptor`
> --
>
> Key: MESOS-7914
> URL: https://issues.apache.org/jira/browse/MESOS-7914
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 1.4.0
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authorization, mesosphere
>
> Now that the {{AuthorizationAcceptor}} class has been added, we can replace 
> all occurrences of {{getObjectApprover}} with 
> {{AuthorizationAcceptor::create}}.





[jira] [Updated] (MESOS-7941) Send TASK_STARTING status from built-in executors

2017-09-14 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7941:
-
Sprint:   (was: Mesosphere Sprint 63)

> Send TASK_STARTING status from built-in executors
> -
>
> Key: MESOS-7941
> URL: https://issues.apache.org/jira/browse/MESOS-7941
> Project: Mesos
>  Issue Type: Bug
>Reporter: Benno Evers
>Assignee: Benno Evers
>
> All executors have the option to send out a TASK_STARTING status update to 
> signal to the scheduler that they received the command to launch the task.
> It would be good if our built-in executors did this, for the reasons laid 
> out in 
> https://mail-archives.apache.org/mod_mbox/mesos-dev/201708.mbox/%3CCA%2B9TLTzkEVM0CKvY%2B%3D0%3DwjrN6hYFAt0401Y7b8tysDWx1WZzdw%40mail.gmail.com%3E
> This will also fix MESOS-6790.





[jira] [Updated] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-09-14 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7601:
-
Shepherd: Greg Mann  (was: Jie Yu)

> Some container launch failures are mistakenly treated as errors.
> 
>
> Key: MESOS-7601
> URL: https://issues.apache.org/jira/browse/MESOS-7601
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.3.0
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: containerizer, mesosphere, tech-debt
>
> I've observed a case where a scheduler stops (i.e., calls TEARDOWN) while some 
> of its tasks are being launched. Although this is valid behaviour, the agent 
> prints an error and increments the container launch error metrics.
> Below are log excerpts for one such framework, 
> {{6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092}}.
> *Master log*
> {noformat}
> [centos@ip-172-31-6-200 ~]$ journalctl _PID=29716 --since "2 hours ago" 
> --no-pager | grep 
> "6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092"
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226218 29724 master.cpp:6072] Updating 
> info for framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226356 29728 hierarchical.cpp:274] Added 
> framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226405 29728 hierarchical.cpp:379] 
> Deactivated framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.228570 29728 hierarchical.cpp:343] 
> Activated framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.246068 29721 master.cpp:7105] Sending 1 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.247851 29721 master.cpp:7194] Sending 1 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.912937 29728 master.cpp:4806] Processing 
> DECLINE call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509464 ] for 
> framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:59.804184 29727 master.cpp:7105] Sending 2 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:59.804411 29727 master.cpp:7194] Sending 2 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.248924 29721 master.cpp:7105] Sending 2 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.249289 29721 master.cpp:7194] Sending 2 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.249724 29721 master.cpp:3851] Processing 
> ACCEPT call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509469 ] on 
> agent 36a25adb-4ea2-49d3-a195-448cff1dc146-S35 at slave(1)@172.31.13.122:5051 
> (172.31.13.122) for framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> s

[jira] [Commented] (MESOS-7916) Improve the test coverage of the DefaultExecutor.

2017-09-13 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165041#comment-16165041
 ] 

Greg Mann commented on MESOS-7916:
--

{code}
commit e7df335a484131450ff15bcd2ee325ea40dc8155
Author: Gastón Kleiman gas...@mesosphere.io
Date:   Wed Sep 13 09:21:23 2017 -0700

Cleaned up DefaultExecutor tests.

Updated the DefaultExecutor tests to use test helpers where possible.
Also made the boilerplate initialization code consistent across tests.

Review: https://reviews.apache.org/r/61982/
{code}

> Improve the test coverage of the DefaultExecutor.
> -
>
> Key: MESOS-7916
> URL: https://issues.apache.org/jira/browse/MESOS-7916
> Project: Mesos
>  Issue Type: Improvement
>  Components: executor
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> We should write tests for the {{DefaultExecutor}} to cover the following 
> common scenarios:
> # -Start a task that uses a GPU, and make sure that it is made available to 
> the task.-
> # -Launch a Docker task with a health check.-
> # -Launch two tasks and verify that they can access a volume owned by the 
> Executor via {{sandbox_path}} volumes.-
> # -Launch two tasks, each one in its own task group, and verify that they can 
> access a volume owned by the Executor via {{sandbox_path}} volumes.-
> # -Launch a task that uses an env secret, make sure that it is accessible.-
> # Launch a task using a URI and make sure that the artifact is accessible.
> # Launch a task using a Docker image + URIs, make sure that the fetched 
> artifact is accessible.
> # Launch one task and ensure that (health) checks can read from a persistent 
> volume.
> # Ensure that the executor's env is NOT inherited by the nested tasks.





[jira] [Commented] (MESOS-7877) Audit test code for undefined behavior in accessing container elements

2017-09-13 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165042#comment-16165042
 ] 

Greg Mann commented on MESOS-7877:
--

{code}
commit 1f4d7ef27e0e4936c1ea15d4e56d778e35a92507
Author: Gastón Kleiman gas...@mesosphere.io
Date:   Wed Sep 13 09:21:20 2017 -0700

Added new overloads for the `createExecutorInfo` test helper method.

These new overloads make it possible to specify framework ID, executor
resources, and executor ID as a protobuf message rather than a string.

Review: https://reviews.apache.org/r/62197/
{code}
{code}
commit 2a6f6b7aedf05b23ae0fe04364159c87f6c5cea8
Author: Gastón Kleiman gas...@mesosphere.io
Date:   Wed Sep 13 09:21:25 2017 -0700

Changed `EXPECT` to `ASSERT` when relying on the assertion afterwards.

A common pattern in our tests is to check that at least one offer is
received using:

'EXPECT_FALSE(offers->offers().empty())'

The test then accesses the first element of the array returned by
`offers->offers()` to extract information such as the agent ID.

This patch makes the tests that follow this pattern use `ASSERT_FALSE`
instead of `EXPECT_FALSE` to avoid invalid memory accesses when the
array is empty.

Review: https://reviews.apache.org/r/62042/
{code}

> Audit test code for undefined behavior in accessing container elements
> --
>
> Key: MESOS-7877
> URL: https://issues.apache.org/jira/browse/MESOS-7877
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Reporter: Benjamin Bannier
>Assignee: Gastón Kleiman
>Priority: Minor
>  Labels: mesosphere, newbie, tech-debt, test
>
> We do not always guard against accessing elements of empty containers; for 
> example, we use patterns like the following:
> {code}
> Future<vector<Offer>> offers;
> // Satisfy offers.
> EXPECT_FALSE(offers->empty());
> const auto& offer = (*offers)[0];
> {code}
> While the intention here is to diagnose an empty {{offers}}, the code still 
> exhibits undefined behavior in the element access if {{offers}} is indeed 
> empty (compilers may aggressively exploit undefined behavior to, e.g., 
> remove "impossible" code). Instead, one should prevent any access to the 
> elements of an empty container, e.g.,
> {code}
> ASSERT_FALSE(offers->empty()); // Prevent execution of rest of test body.
> {code}
> We should audit and fix existing test code for such incorrect checks and 
> variations involving e.g., {{EXPECT_NE}}.
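The guard that `ASSERT_FALSE` provides can be sketched without GoogleTest. In this minimal example, `Offers` and `firstOffer` are hypothetical stand-ins and not Mesos types; the point is only that the function returns before any element is touched when the container is empty, which is what a fatal assertion does to the test body.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the offers a test receives; not a Mesos type.
using Offers = std::vector<std::string>;

// Mirrors the `ASSERT_FALSE(offers->empty())` pattern: bail out before any
// element is accessed. Returns nullptr when there is nothing to access.
const std::string* firstOffer(const Offers& offers)
{
  if (offers.empty()) {
    return nullptr; // Stop here; `offers[0]` would be undefined behavior.
  }

  return &offers[0]; // Safe: the container is known to be non-empty.
}
```

An `EXPECT_FALSE`-style check corresponds to recording the failure and then continuing to the `offers[0]` access anyway, which is exactly the case the audit is meant to eliminate.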





[jira] [Commented] (MESOS-7914) Replace usage of `ObjectApprover` with `AuthorizationAcceptor`

2017-09-12 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163828#comment-16163828
 ] 

Greg Mann commented on MESOS-7914:
--

Note that it would also be useful to add some error logging 
[here|https://github.com/apache/mesos/blob/5125b80ea50b5babd7636234605b66c627780834/src/common/http.cpp#L1212-L1216]
 in an {{onFailed}} handler to log the case where the authorizer fails to 
return a valid object approver.

> Replace usage of `ObjectApprover` with `AuthorizationAcceptor`
> --
>
> Key: MESOS-7914
> URL: https://issues.apache.org/jira/browse/MESOS-7914
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 1.4.0
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authorization, mesosphere
>
> Now that the {{AuthorizationAcceptor}} class has been added, we can replace 
> all occurrences of {{getObjectApprover}} with 
> {{AuthorizationAcceptor::create}}.





[jira] [Assigned] (MESOS-7951) Extend the KillPolicy

2017-09-08 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-7951:


Assignee: Greg Mann

> Extend the KillPolicy
> -
>
> Key: MESOS-7951
> URL: https://issues.apache.org/jira/browse/MESOS-7951
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, executor, HTTP API
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> After introducing the {{KillPolicy}} in MESOS-4909, some interactions with 
> framework developers have led to the suggestion of a couple of possible 
> improvements to this interface. Namely,
> * Allowing the framework to specify a command to be run to initiate 
> termination, rather than a signal to be sent, would allow some developers to 
> avoid wrapping their application in a signal handler. This is useful because 
> a signal handler wrapper modifies the application's process tree, which may 
> make introspection and debugging more difficult in the case of well-known 
> services with standard debugging procedures.
> * In the case of terminations which do begin with a signal, it would be 
> useful to allow the framework to specify the signal to be sent, rather than 
> assuming SIGTERM. PostgreSQL, for example, permits several shutdown types, 
> each initiated with a [different 
> signal|https://www.postgresql.org/docs/9.3/static/server-shutdown.html].





[jira] [Created] (MESOS-7951) Extend the KillPolicy

2017-09-08 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7951:


 Summary: Extend the KillPolicy
 Key: MESOS-7951
 URL: https://issues.apache.org/jira/browse/MESOS-7951
 Project: Mesos
  Issue Type: Improvement
  Components: agent, executor, HTTP API
Reporter: Greg Mann


After introducing the {{KillPolicy}} in MESOS-4909, some interactions with 
framework developers have led to the suggestion of a couple of possible 
improvements to this interface. Namely,
* Allowing the framework to specify a command to be run to initiate 
termination, rather than a signal to be sent, would allow some developers to 
avoid wrapping their application in a signal handler. This is useful because a 
signal handler wrapper modifies the application's process tree, which may make 
introspection and debugging more difficult in the case of well-known services 
with standard debugging procedures.
* In the case of terminations which do begin with a signal, it would be useful 
to allow the framework to specify the signal to be sent, rather than assuming 
SIGTERM. PostgreSQL, for example, permits several shutdown types, each 
initiated with a [different 
signal|https://www.postgresql.org/docs/9.3/static/server-shutdown.html].





[jira] [Commented] (MESOS-7916) Improve the test coverage of the DefaultExecutor.

2017-08-30 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148260#comment-16148260
 ] 

Greg Mann commented on MESOS-7916:
--

{code}
commit be7d2ca48a765c247b644fb54d142602ac487d61
Author: Gastón Kleiman 
Date:   Wed Aug 30 17:20:38 2017 -0700

Added tests to ensure that tasks can access their parent's volumes.

These tests verify that sibling tasks can share a persistent volume
owned by their parent executor using 'sandbox_path' volumes.

Review: https://reviews.apache.org/r/61921/
{code}
{code}
commit 065d2a801396e90adb619e839f062ae153249ca0
Author: Gastón Kleiman 
Date:   Wed Aug 30 17:20:36 2017 -0700

Added a test that uses environment secrets and the DefaultExecutor.

This test checks that environment secrets are properly resolved and
exposed to tasks started by the DefaultExecutor.

Review: https://reviews.apache.org/r/61920/
{code}

> Improve the test coverage of the DefaultExecutor.
> -
>
> Key: MESOS-7916
> URL: https://issues.apache.org/jira/browse/MESOS-7916
> Project: Mesos
>  Issue Type: Improvement
>  Components: executor
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> We should write tests for the {{DefaultExecutor}} to cover the following 
> common scenarios:
> # Start a task that uses a GPU, and make sure that it is made available to 
> the task.
> # Launch a Docker task with a health check.
> # Launch two tasks and verify that they can access a volume owned by the 
> Executor via {{sandbox_path}} volumes.
> # Launch two tasks, each one in its own task group, and verify that they can 
> access a volume owned by the Executor via {{sandbox_path}} volumes.
> # Launch one task and ensure that (health) checks can read from a persistent 
> volume.
> # Launch a task using a URI and make sure that the artifact is accessible.
> # Launch a task using a Docker image + URIs, make sure that the fetched 
> artifact is accessible.
> # Write a test that ensures that the executor's env is NOT inherited by the 
> nested tasks.
> # Launch a task that uses an env secret, make sure that it is accessible.





[jira] [Commented] (MESOS-7785) Pass Operator API subscription events through authorizer

2017-08-30 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147985#comment-16147985
 ] 

Greg Mann commented on MESOS-7785:
--

{code}
commit e4d56bcb65f7bf9805eff18e6a9249eb7512f745
Author: Quinn Leng 
Date:   Tue Aug 29 13:13:19 2017 -0700

Added authorization for V1 events.

Added authorization filtering for the master V1 operator event
stream. Subscribers will only receive events that their
principal is authorized to see. The new test
'MasterAPITest.EventAuthorizationFiltering' verifies this
behavior.

Review: https://reviews.apache.org/r/61189/
{code}
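
The filtering described in the commit message above can be sketched roughly as follows. This is only an illustration: {{Event}}, its {{role}} field, and {{filterEvents}} are hypothetical names, and the real operator API events are protobuf messages checked by the Mesos authorizer rather than by a plain predicate.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical event type; it only stands in for the real protobuf events.
struct Event
{
  std::string type;
  std::string role;  // Assumed attribute that the authorization decision uses.
};

// Returns only the events that the given approval predicate accepts, in
// their original order. The predicate stands in for an authorizer call.
inline std::vector<Event> filterEvents(
    const std::vector<Event>& events,
    const std::function<bool(const Event&)>& approved)
{
  assert(approved != nullptr);  // The predicate must be callable.

  std::vector<Event> visible;
  for (const Event& event : events) {
    if (approved(event)) {
      visible.push_back(event);
    }
  }
  return visible;
}
```

A subscriber whose principal may only see one role would pass a predicate that compares {{event.role}} against that role; all other events are dropped before they reach the stream.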

> Pass Operator API subscription events through authorizer 
> -
>
> Key: MESOS-7785
> URL: https://issues.apache.org/jira/browse/MESOS-7785
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Mathew Appelman
>Assignee: Quinn
> Fix For: 1.5.0
>
>
> In order to consume the subscription endpoint from the Operator API in the 
> DC/OS UI, we must ensure a user can only receive events they are authorized 
> to consume.





[jira] [Updated] (MESOS-7914) Replace usage of `ObjectApprover` with `AuthorizationAcceptor`

2017-08-26 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7914:
-
Sprint: Mesosphere Sprint 62

> Replace usage of `ObjectApprover` with `AuthorizationAcceptor`
> --
>
> Key: MESOS-7914
> URL: https://issues.apache.org/jira/browse/MESOS-7914
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 1.4.0
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authorization, mesosphere
>
> Now that the {{AuthorizationAcceptor}} class has been added, we can replace 
> all occurrences of {{getObjectApprover}} with 
> {{AuthorizationAcceptor::create}}.





[jira] [Assigned] (MESOS-7914) Replace usage of `ObjectApprover` with `AuthorizationAcceptor`

2017-08-24 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-7914:


Assignee: Greg Mann

> Replace usage of `ObjectApprover` with `AuthorizationAcceptor`
> --
>
> Key: MESOS-7914
> URL: https://issues.apache.org/jira/browse/MESOS-7914
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 1.4.0
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authorization, mesosphere
>
> Now that the {{AuthorizationAcceptor}} class has been added, we can replace 
> all occurrences of {{getObjectApprover}} with 
> {{AuthorizationAcceptor::create}}.





[jira] [Updated] (MESOS-7913) Authorization Improvements

2017-08-24 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7913:
-
Labels: authorization mesosphere  (was: authorization)

> Authorization Improvements
> --
>
> Key: MESOS-7913
> URL: https://issues.apache.org/jira/browse/MESOS-7913
> Project: Mesos
>  Issue Type: Epic
>  Components: security
>Reporter: Greg Mann
>  Labels: authorization, mesosphere
>
> This epic is meant to collect tickets for improvements to authorization in 
> Mesos.





[jira] [Created] (MESOS-7914) Replace usage of `ObjectApprover` with `AuthorizationAcceptor`

2017-08-24 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7914:


 Summary: Replace usage of `ObjectApprover` with 
`AuthorizationAcceptor`
 Key: MESOS-7914
 URL: https://issues.apache.org/jira/browse/MESOS-7914
 Project: Mesos
  Issue Type: Improvement
  Components: security
Affects Versions: 1.4.0
Reporter: Greg Mann


Now that the {{AuthorizationAcceptor}} class has been added, we can replace all 
occurrences of {{getObjectApprover}} with {{AuthorizationAcceptor::create}}.





[jira] [Created] (MESOS-7913) Authorization Improvements

2017-08-24 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7913:


 Summary: Authorization Improvements
 Key: MESOS-7913
 URL: https://issues.apache.org/jira/browse/MESOS-7913
 Project: Mesos
  Issue Type: Epic
  Components: security
Reporter: Greg Mann


This epic is meant to collect tickets for improvements to authorization in 
Mesos.





[jira] [Commented] (MESOS-7888) Track fetcher task success and failures

2017-08-22 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137451#comment-16137451
 ] 

Greg Mann commented on MESOS-7888:
--

[~xujyan] [~jpe...@apache.org] FYI: this didn't make it into the {{1.4.0-rc1}} 
tag: https://github.com/apache/mesos/commits/1.4.0-rc1

I saw the fix version on this ticket and just wanted to make sure you knew.

> Track fetcher task success and failures
> ---
>
> Key: MESOS-7888
> URL: https://issues.apache.org/jira/browse/MESOS-7888
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, statistics
>Reporter: James Peach
>Assignee: James Peach
>Priority: Minor
> Fix For: 1.4.0
>
>
> In MESOS-7524, we added fetcher metrics for total task fetches and failed 
> task fetches. For consistency with the similar metrics in MESOS-7842, we 
> should switch these to track the successful task fetches and the failed task 
> fetches. Operators can derive the total by adding the succeeded and failed 
> counts.
> i.e., replace {{containerizer/fetcher/task_fetches_total}} with 
> {{containerizer/fetcher/task_fetches_succeeded}}.
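
As a rough illustration of the metric change described in the ticket, the total can be derived from the two tracked counters instead of being stored. The struct and member names below are hypothetical and are not the fetcher's actual metrics code.

```cpp
#include <cstdint>

// Hypothetical counters mirroring the succeeded/failed metrics in the ticket.
struct FetcherMetrics
{
  uint64_t taskFetchesSucceeded = 0;
  uint64_t taskFetchesFailed = 0;

  // Records the outcome of one task fetch.
  void record(bool success)
  {
    if (success) {
      ++taskFetchesSucceeded;
    } else {
      ++taskFetchesFailed;
    }
  }

  // The total is not tracked separately; it is the sum of both counters.
  uint64_t total() const
  {
    return taskFetchesSucceeded + taskFetchesFailed;
  }
};
```

Keeping only the two outcome counters means they cannot drift out of sync with a separately maintained total.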





[jira] [Commented] (MESOS-7861) Include check output in the DefaultExecutor log

2017-08-21 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135973#comment-16135973
 ] 

Greg Mann commented on MESOS-7861:
--

{code}
commit 8347ec09f15989b822f48c43c6547c25b5f4
Author: Gastón Kleiman 
Date:   Mon Aug 21 15:28:39 2017 -0700

Raised the logging level of some check and health check messages.

Some users pointed out that always logging the result of checks and
health checks makes it easier to debug problems.

Review: https://reviews.apache.org/r/61791/
{code}
{code}
commit 6d778fa45a73723b857db9b2ce92c3d15fb3373f
Author: Gastón Kleiman 
Date:   Mon Aug 21 15:28:37 2017 -0700

Made the log output handling of TCP and HTTP checks consistent.

Review: https://reviews.apache.org/r/61766/
{code}
{code}
commit 0a01bc38eba08da8ef8b4ae152c95a57c39d73f3
Author: Gastón Kleiman 
Date:   Mon Aug 21 15:28:36 2017 -0700

Included nested command check output in the executor logs.

This patch updates the checker and health checker to include the output
of COMMAND checks and health checks in its logs by default. This has
the effect of including these logs in the executor output for easier
debugging.

Review: https://reviews.apache.org/r/61697/
{code}

> Include check output in the DefaultExecutor log
> ---
>
> Key: MESOS-7861
> URL: https://issues.apache.org/jira/browse/MESOS-7861
> Project: Mesos
>  Issue Type: Improvement
>  Components: executor
>Affects Versions: 1.3.0
>Reporter: Michael Browning
>Assignee: Gastón Kleiman
>  Labels: check, default-executor, health-check, mesosphere
>
> With the default executor, health and readiness checks are run in their own 
> nested containers, whose sandboxes are cleaned up right before performing the 
> next check. This makes access to stdout/stderr of previous runs of the check 
> command effectively impossible.
> Although the exit code of the command being run is reported in a task status, 
> it is often necessary to see the command's actual output when debugging a 
> framework issue, so the ability to access this output via the executor logs 
> would be helpful.





[jira] [Updated] (MESOS-7861) Include check output in the DefaultExecutor log

2017-08-16 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7861:
-
Shepherd: Greg Mann

> Include check output in the DefaultExecutor log
> ---
>
> Key: MESOS-7861
> URL: https://issues.apache.org/jira/browse/MESOS-7861
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Affects Versions: 1.3.0
>Reporter: Michael Browning
>Assignee: Gastón Kleiman
>Priority: Minor
>  Labels: check, default-executor, health-check, mesosphere
>
> With the default executor, health and readiness checks are run in their own 
> nested containers, whose sandboxes are cleaned up right before performing the 
> next check. This makes access to stdout/stderr of previous runs of the check 
> command effectively impossible.
> Although the exit code of the command being run is reported in a task status, 
> it is often necessary to see the command's actual output when debugging a 
> framework issue, so the ability to access this output via the executor logs 
> would be helpful.





[jira] [Comment Edited] (MESOS-7661) Libprocess timers with long durations trigger immediately

2017-08-14 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126403#comment-16126403
 ] 

Greg Mann edited comment on MESOS-7661 at 8/14/17 9:19 PM:
---

The commits here, along with those from MESOS-7660, will help us avoid many 
cases in which we would overflow. Note, however, that fundamentally this issue 
still exists, since we have not made changes to the arithmetic operators.

After some discussion, we are opting to address this issue at the Mesos level, 
by restricting the lengths of durations that users can supply, so I'm closing 
this ticket as "Won't Fix".


was (Author: greggomann):
The commits here, along with those from MESOS-7660, will prevent us from 
overflowing. Note, however, that fundamentally this issue still exists, since 
we have not made changes to the arithmetic operators.

> Libprocess timers with long durations trigger immediately
> -
>
> Key: MESOS-7661
> URL: https://issues.apache.org/jira/browse/MESOS-7661
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> {{process::delay()}} will schedule a method to be run right away when called 
> with a very long {{Duration}}.
> This happens because [{{Timeout}} tries to add two long 
> durations|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/libprocess/include/process/timeout.hpp#L33-L38],
>  leading to an [integer overflow in 
> {{Duration}}|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/stout/include/stout/duration.hpp#L116].
> I'd expect libprocess to either:
>   1. Never run the method.
>   2. Schedule it in the longest possible {{Duration}}.
> {{Duration::operator+=()}} should probably also handle integer overflows 
> differently. If an addition leads to an integer overflow, it might make more 
> sense to return {{Duration::max()}} than a negative duration.
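
The saturating behaviour suggested at the end of the description could look roughly like the sketch below. It assumes durations are modeled as signed 64-bit nanosecond counts; {{saturatingAdd}}, {{deadline}}, {{MAX_NANOS}} and {{MIN_NANOS}} are hypothetical names, and this is not the stout {{Duration}} implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption: a duration is a signed 64-bit count of nanoseconds.
constexpr int64_t MAX_NANOS = std::numeric_limits<int64_t>::max();
constexpr int64_t MIN_NANOS = std::numeric_limits<int64_t>::min();

// Adds two nanosecond counts, clamping to the representable range. The range
// check happens before the addition because signed overflow is undefined
// behavior in C++.
inline int64_t saturatingAdd(int64_t a, int64_t b)
{
  if (b > 0 && a > MAX_NANOS - b) {
    return MAX_NANOS;
  }
  if (b < 0 && a < MIN_NANOS - b) {
    return MIN_NANOS;
  }
  return a + b;
}

// Computes the point in time at which a timer should fire. In this sketch a
// negative delay is treated as a caller bug.
inline int64_t deadline(int64_t now, int64_t delay)
{
  assert(delay >= 0);
  return saturatingAdd(now, delay);
}
```

With this approach, a very long delay yields the largest representable deadline instead of wrapping to a negative value, so the timer would not fire immediately.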





[jira] [Commented] (MESOS-7661) Libprocess timers with long durations trigger immediately

2017-08-14 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126403#comment-16126403
 ] 

Greg Mann commented on MESOS-7661:
--

The commits here, along with those from MESOS-7660, will prevent us from 
overflowing. Note, however, that fundamentally this issue still exists, since 
we have not made changes to the arithmetic operators.

> Libprocess timers with long durations trigger immediately
> -
>
> Key: MESOS-7661
> URL: https://issues.apache.org/jira/browse/MESOS-7661
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> {{process::delay()}} will schedule a method to be run right away when called 
> with a very long {{Duration}}.
> This happens because [{{Timeout}} tries to add two long 
> durations|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/libprocess/include/process/timeout.hpp#L33-L38],
>  leading to an [integer overflow in 
> {{Duration}}|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/stout/include/stout/duration.hpp#L116].
> I'd expect libprocess to either:
>   1. Never run the method.
>   2. Schedule it in the longest possible {{Duration}}.
> {{Duration::operator+=()}} should probably also handle integer overflows 
> differently. If an addition leads to an integer overflow, it might make more 
> sense to return {{Duration::max()}} than a negative duration.





[jira] [Commented] (MESOS-7661) Libprocess timers with long durations trigger immediately

2017-08-14 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126397#comment-16126397
 ] 

Greg Mann commented on MESOS-7661:
--

{code}
commit 1efe264ebcf998c248cb7eecba57bd65e2060645
Author: Gastón Kleiman 
Date:   Mon Aug 14 13:52:50 2017 -0700

Stout: Made boundary checking in Duration consistent.

Review: https://reviews.apache.org/r/61601/
{code}
{code}
commit f4348182c1c5b832743166cfdad9b1a84bc2824e
Author: Gastón Kleiman 
Date:   Mon Aug 14 13:52:49 2017 -0700

Stout: Made `Duration::parse()` handle durations out of range.

Made `Duration::parse()` return an error if the argument is out of the
range that a `Duration` can represent.

Review: https://reviews.apache.org/r/60721/
{code}

> Libprocess timers with long durations trigger immediately
> -
>
> Key: MESOS-7661
> URL: https://issues.apache.org/jira/browse/MESOS-7661
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> {{process::delay()}} will schedule a method to be run right away when called 
> with a very long {{Duration}}.
> This happens because [{{Timeout}} tries to add two long 
> durations|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/libprocess/include/process/timeout.hpp#L33-L38],
>  leading to an [integer overflow in 
> {{Duration}}|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/stout/include/stout/duration.hpp#L116].
> I'd expect libprocess to either:
>   1. Never run the method.
>   2. Schedule it in the longest possible {{Duration}}.
> {{Duration::operator+=()}} should probably also handle integer overflows 
> differently. If an addition leads to an integer overflow, it might make more 
> sense to return {{Duration::max()}} than a negative duration.





[jira] [Commented] (MESOS-7660) HierarchicalAllocator uses the default filter instead of a very long one

2017-08-14 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126392#comment-16126392
 ] 

Greg Mann commented on MESOS-7660:
--

{code}
commit 2fe2bb26a425da9aaf1d7cf34019dd347d0cf9a4
Author: Gastón Kleiman 
Date:   Mon Aug 14 13:52:54 2017 -0700

Added MESOS-7660 to the changelog.

This patch adds MESOS-7660 to the changelog and adds a missing period
to the existing text.

Review: https://reviews.apache.org/r/61621/
{code}
{code}
commit 183cceef366586f4a55b6ba7144c4a8277eb9962
Author: Gastón Kleiman 
Date:   Mon Aug 14 13:52:52 2017 -0700

Fixed the default filter used by the allocator.

If a framework accepts/refuses an offer using a very long filter, the
`HierarchicalAllocator` will use the default filter instead, meaning
that it will filter the resources for only 5 seconds. This can happen
when a framework sets `Filter::refuse_seconds` to a number of seconds
larger than what fits in `Duration`.

This patch makes the hierarchical allocator cap the filter duration to
at most 365 days.

Review: https://reviews.apache.org/r/60525/
{code}
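
A minimal sketch of the capping behaviour described in the commit message above, under stated assumptions: only the 5-second default and the 365-day cap come from the ticket. The function and constant names are hypothetical, and the handling of negative values is an assumption made for illustration, not a description of the allocator's code.

```cpp
#include <algorithm>

// The 5-second default and the 365-day cap are taken from the commit message.
constexpr double DEFAULT_REFUSE_SECONDS = 5.0;
constexpr double MAX_REFUSE_SECONDS = 365.0 * 24 * 60 * 60;

// Returns the filter duration (in seconds) to apply for a requested value.
inline double effectiveRefuseSeconds(double requested)
{
  // Assumption: a negative request is invalid and falls back to the default.
  if (requested < 0) {
    return DEFAULT_REFUSE_SECONDS;
  }

  // Very long requests are capped instead of silently becoming the default.
  return std::min(requested, MAX_REFUSE_SECONDS);
}
```

Capping keeps the framework's intent ("filter for a very long time") intact, whereas falling back to the default shortens the filter to a few seconds.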

> HierarchicalAllocator uses the default filter instead of a very long one
> 
>
> Key: MESOS-7660
> URL: https://issues.apache.org/jira/browse/MESOS-7660
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
> Fix For: 1.5.0
>
>
> If a framework accepts/refuses an offer using a very long filter, [the 
> {{HierarchicalAllocator}} will use the default {{Filter}} 
> instead|https://github.com/apache/mesos/blob/master/src/master/allocator/mesos/hierarchical.cpp#L1046-L1052].
>  This means that it will filter the resources for only 5 seconds.
> This can happen when a framework sets {{Filter::refuse_seconds}} to a number 
> of seconds [larger than what fits in 
> {{Duration}}|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/stout/include/stout/duration.hpp#L401-L405].
> The following [tests are 
> flaky|https://issues.apache.org/jira/browse/MESOS-7514] because of this: 
> {{ReservationTest.ReserveShareWithinRole}} and 
> {{ReservationTest.PreventUnreservingAlienResources}}.





[jira] [Commented] (MESOS-7871) Agent fails assertion during request to '/state'

2017-08-09 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120820#comment-16120820
 ] 

Greg Mann commented on MESOS-7871:
--

{code}
commit db8d097c9565e9b6f60531f9eb3f993a6c60fd72
Author: Greg Mann 
Date:   Wed Aug 9 10:00:46 2017 -0700

Added a test to verify the fix for a failed agent assertion.

This patch adds 'SlaveTest.GetStateTaskGroupPending', which confirms
the fix for MESOS-7871. The test verifies that requests to the agent's
'/state' endpoint are successful when there are pending tasks on the
agent which were launched as part of a task group.

Review: https://reviews.apache.org/r/61534
{code}
{code}
commit 4f4807394944d23d3a6f79249ce49e2494a88350
Author: Andrei Budnik 
Date:   Wed Aug 9 11:06:40 2017 -0700

Moved task validation from `getExecutorInfo` to `runTask` on agent.

Previously, `getExecutorInfo` was called only in `runTask`, so it
asserted the invariant that a task should have either CommandInfo
or ExecutorInfo set but not both. This is true for individual
tasks, but it is not necessarily true for tasks which are part of a
task group, since the master injects the task group's ExecutorInfo.

Now `getExecutorInfo` is also called to calculate allocated
resources of tasks which might be part of a task group, which could
violate this invariant, so the assertion has been moved.

Review: https://reviews.apache.org/r/61524/
{code}
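
The idea behind the second commit above, moving an invariant check out of a shared helper and into the one call site where it actually holds, can be sketched as follows. The {{Task}} struct and both functions are hypothetical simplifications and do not reflect the agent's actual code.

```cpp
#include <string>

// Hypothetical, heavily simplified task: it records only whether the command
// and executor fields are set.
struct Task
{
  bool hasCommand;
  bool hasExecutor;
};

// Validation that belongs at the launch site of an individual task: exactly
// one of the two fields must be set.
inline bool isValidIndividualTask(const Task& task)
{
  return task.hasCommand != task.hasExecutor;
}

// Shared helper with no assertion of its own, so it is also safe to call for
// tasks in a task group, which may carry both fields.
inline std::string executorSource(const Task& task)
{
  return task.hasExecutor ? "executor" : "command";
}
```

Because the helper no longer asserts, callers that only need to compute allocated resources can use it for task-group tasks, while the launch path still rejects malformed individual tasks.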

> Agent fails assertion during request to '/state'
> 
>
> Key: MESOS-7871
> URL: https://issues.apache.org/jira/browse/MESOS-7871
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Greg Mann
>Assignee: Andrei Budnik
>  Labels: mesosphere
> Fix For: 1.4.0
>
>
> While processing requests to {{/state}}, the Mesos agent calls 
> {{Framework::allocatedResources()}}, which in turn calls 
> {{Slave::getExecutorInfo()}} on executors associated with the framework's 
> pending tasks.
> In the case of tasks launched as part of task groups, this leads to the 
> failure of the assertion 
> [here|https://github.com/apache/mesos/blob/a31dd52ab71d2a529b55cd9111ec54acf7550ded/src/slave/slave.cpp#L4983-L4985].
>  This means that the check will fail if the agent processes a request to 
> {{/state}} at a time when it has pending tasks launched as part of a task 
> group.
> This assertion should be removed since this helper function is now used with 
> task groups.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7871) Agent fails assertion during request to '/state'

2017-08-09 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120422#comment-16120422
 ] 

Greg Mann commented on MESOS-7871:
--

Test and comment updates:
https://reviews.apache.org/r/61534/
https://reviews.apache.org/r/61535/

> Agent fails assertion during request to '/state'
> 
>
> Key: MESOS-7871
> URL: https://issues.apache.org/jira/browse/MESOS-7871
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Greg Mann
>Assignee: Andrei Budnik
>  Labels: mesosphere
>
> While processing requests to {{/state}}, the Mesos agent calls 
> {{Framework::allocatedResources()}}, which in turn calls 
> {{Slave::getExecutorInfo()}} on executors associated with the framework's 
> pending tasks.
> In the case of tasks launched as part of task groups, this leads to the 
> failure of the assertion 
> [here|https://github.com/apache/mesos/blob/a31dd52ab71d2a529b55cd9111ec54acf7550ded/src/slave/slave.cpp#L4983-L4985].
>  This means that the check will fail if the agent processes a request to 
> {{/state}} at a time when it has pending tasks launched as part of a task 
> group.
> This assertion should be removed since this helper function is now used with 
> task groups.





[jira] [Assigned] (MESOS-7871) Agent fails assertion during request to '/state'

2017-08-08 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-7871:


Assignee: Greg Mann

> Agent fails assertion during request to '/state'
> 
>
> Key: MESOS-7871
> URL: https://issues.apache.org/jira/browse/MESOS-7871
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> While processing requests to {{/state}}, the Mesos agent calls 
> {{Framework::allocatedResources()}}, which in turn calls 
> {{Slave::getExecutorInfo()}} on executors associated with the framework's 
> pending tasks.
> In the case of tasks launched as part of task groups, this leads to the 
> failure of the assertion 
> [here|https://github.com/apache/mesos/blob/a31dd52ab71d2a529b55cd9111ec54acf7550ded/src/slave/slave.cpp#L4983-L4985].
>  This means that the check will fail if the agent processes a request to 
> {{/state}} at a time when it has pending tasks launched as part of a task 
> group.
> This assertion should be removed since this helper function is now used with 
> task groups.





[jira] [Created] (MESOS-7871) Agent fails assertion during request to '/state'

2017-08-08 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7871:


 Summary: Agent fails assertion during request to '/state'
 Key: MESOS-7871
 URL: https://issues.apache.org/jira/browse/MESOS-7871
 Project: Mesos
  Issue Type: Bug
  Components: agent
Reporter: Greg Mann


While processing requests to {{/state}}, the Mesos agent calls 
{{Framework::allocatedResources()}}, which in turn calls 
{{Slave::getExecutorInfo()}} on executors associated with the framework's 
pending tasks.

In the case of tasks launched as part of task groups, this leads to the failure 
of the assertion 
[here|https://github.com/apache/mesos/blob/a31dd52ab71d2a529b55cd9111ec54acf7550ded/src/slave/slave.cpp#L4983-L4985].
 This means that the check will fail if the agent processes a request to 
{{/state}} at a time when it has pending tasks launched as part of a task group.

This assertion should be removed since this helper function is now used with 
task groups.





[jira] [Commented] (MESOS-7416) Filter results of `/master/slaves` and the v1 call GET_AGENTS

2017-08-02 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111872#comment-16111872
 ] 

Greg Mann commented on MESOS-7416:
--

{code}
commit e87569b2ae3c7f8303ce146f882c340b4fdd5ca4
Author: Alexander Rojas 
Date:   Wed Aug 2 13:14:07 2017 -0700

Added full authz for non summarized fields of `/slaves` endpoint.

Fields were authorized based on partial elements of each
resource. Moreover, some fields which required authorization were not
being authorized at all. This patch enables full authorization of all
fields.

Review: https://reviews.apache.org/r/61257/
{code}
{code}
commit 2fe2562455d899545f2f6cbace989489867b8ee7
Author: Alexander Rojas 
Date:   Wed Aug 2 13:14:01 2017 -0700

Enabled filtering of the 'GET_AGENTS' v1 API call.

Enables filtering of the results of calls to the 'GET_AGENTS' v1
API. It filters the contents of different resources entries based
on the 'VIEW_ROLE' permissions of the principal doing the request
based on resource roles, allocation roles and reservations.

Review: https://reviews.apache.org/r/61171/
{code}

> Filter results of `/master/slaves` and the v1 call GET_AGENTS
> -
>
> Key: MESOS-7416
> URL: https://issues.apache.org/jira/browse/MESOS-7416
> Project: Mesos
>  Issue Type: Task
>  Components: HTTP API, master
>Reporter: Alexander Rojas
>Assignee: Alexander Rojas
>  Labels: mesosphere, security
> Fix For: 1.4.0
>
>
> Both the {{/master/slaves}} endpoint and the v1 API call {{GET_AGENTS}} 
> return full information about the agent state, which probably needs to be 
> filtered for certain uses, particularly in a multi-tenancy scenario.
> The kind of leaked data includes specific role names and their specific 
> allocations.





[jira] [Commented] (MESOS-7851) Master stores old resource format in the registry

2017-08-02 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111663#comment-16111663
 ] 

Greg Mann commented on MESOS-7851:
--

Note that when this issue is resolved, the {{authorizeResource()}} helper 
introduced in [this patch|https://reviews.apache.org/r/61171/] should be 
updated.

> Master stores old resource format in the registry
> -
>
> Key: MESOS-7851
> URL: https://issues.apache.org/jira/browse/MESOS-7851
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Greg Mann
>  Labels: master, mesosphere, reservation
>
> We intend for the master to store all internal resource representations in 
> the new, post-reservation-refinement format. However, [when persisting 
> registered agents to the 
> registrar|https://github.com/apache/mesos/blob/498a000ac1bb8f51dc871f22aea265424a407a17/src/master/master.cpp#L5861-L5876],
>  the master does not convert the resources; agents provide resources in the 
> pre-reservation-refinement format, and these resources are stored as-is. This 
> means that after recovery, any agents in the master's {{slaves.recovered}} 
> map will have {{SlaveInfo.resources}} in the pre-reservation-refinement 
> format.
> We should update the master to convert these resources before persisting them 
> to the registry.





[jira] [Created] (MESOS-7851) Master stores old resource format in the registry

2017-08-02 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7851:


 Summary: Master stores old resource format in the registry
 Key: MESOS-7851
 URL: https://issues.apache.org/jira/browse/MESOS-7851
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Greg Mann


We intend for the master to store all internal resource representations in the 
new, post-reservation-refinement format. However, [when persisting registered 
agents to the 
registrar|https://github.com/apache/mesos/blob/498a000ac1bb8f51dc871f22aea265424a407a17/src/master/master.cpp#L5861-L5876],
 the master does not convert the resources; agents provide resources in the 
pre-reservation-refinement format, and these resources are stored as-is. This 
means that after recovery, any agents in the master's {{slaves.recovered}} map 
will have {{SlaveInfo.resources}} in the pre-reservation-refinement format.

We should update the master to convert these resources before persisting them 
to the registry.
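
The conversion proposed above could look roughly like the following Python sketch. It assumes that the pre-reservation-refinement format carries a single `role` plus an optional `reservation`, and that the post-refinement format carries a `reservations` list with a `type` on each entry. These field names reflect one reading of the two formats and should be treated as assumptions; the helper name is hypothetical and this is not the code Mesos uses.

```python
import copy


def to_post_reservation_refinement(resource: dict) -> dict:
    """Convert one resource dict to the (assumed) post-refinement layout."""
    converted = copy.deepcopy(resource)

    # Already converted: leave it untouched so the helper is idempotent.
    if converted.get("reservations"):
        return converted

    role = converted.pop("role", "*")
    reservation = converted.pop("reservation", None)

    if reservation is not None:
        # Dynamic reservation: carry over its fields and attach the role.
        converted["reservations"] = [
            {"type": "DYNAMIC", "role": role, **reservation}
        ]
    elif role != "*":
        # A non-default role without a reservation is a static reservation.
        converted["reservations"] = [{"type": "STATIC", "role": role}]
    # Otherwise the resource is unreserved and gets no reservations.
    return converted


def convert_agent_resources(agent_info: dict) -> dict:
    """Convert every resource of an agent before it is persisted."""
    converted = copy.deepcopy(agent_info)
    converted["resources"] = [
        to_post_reservation_refinement(r)
        for r in agent_info.get("resources", [])
    ]
    return converted
```

Making the helper idempotent means it is safe to apply to agents whose resources were already stored in the new format.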





[jira] [Commented] (MESOS-7416) Filter results of `/master/slaves` and the v1 call GET_AGENTS

2017-07-25 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100838#comment-16100838
 ] 

Greg Mann commented on MESOS-7416:
--

[~arojas] I didn't catch this during review, but looking back at the above 
commit, it doesn't look like it touches the v1 handler at all?

> Filter results of `/master/slaves` and the v1 call GET_AGENTS
> -
>
> Key: MESOS-7416
> URL: https://issues.apache.org/jira/browse/MESOS-7416
> Project: Mesos
>  Issue Type: Task
>  Components: HTTP API, master
>Reporter: Alexander Rojas
>Assignee: Alexander Rojas
>  Labels: mesosphere, security
> Fix For: 1.4.0
>
>
> Both the {{/master/slaves}} endpoint and the v1 API call {{GET_AGENTS}} 
> return full information about agent state, which probably needs to be 
> filtered for certain uses, particularly in a multi-tenancy scenario.
> The leaked data includes specific role names and their specific 
> allocations.





[jira] [Created] (MESOS-7829) Improve completed task/framework garbage collection

2017-07-25 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7829:


 Summary: Improve completed task/framework garbage collection
 Key: MESOS-7829
 URL: https://issues.apache.org/jira/browse/MESOS-7829
 Project: Mesos
  Issue Type: Improvement
  Components: master
Reporter: Greg Mann


The Mesos master currently uses two flags to determine how it garbage collects 
completed tasks and frameworks from memory:
* {{--max_completed_frameworks}}
* {{--max_completed_tasks_per_framework}}

Setting these parameters correctly can be difficult, since there may be a large 
variance in the size of Task and Framework objects kept in memory. Launching a 
framework which makes use of task labels to pass data of significant size can 
quickly lead to performance issues if the master is retaining a large number of 
completed tasks.

We should explore other ways of garbage collecting completed frameworks and 
tasks, which could better handle the variation in the size of task metadata.
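
One possible direction, sketched here purely for illustration, is to bound retained completed tasks by their total size instead of their count. The `CompletedTaskBuffer` class and its byte budget are hypothetical and do not correspond to any existing Mesos flag or data structure.

```python
from collections import deque


class CompletedTaskBuffer:
    """Retains completed tasks until a total byte budget is exceeded."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.total_bytes = 0
        self._tasks = deque()  # (task_id, size_bytes), oldest first

    def add(self, task_id: str, size_bytes: int) -> list:
        """Record a completed task and return the IDs evicted to make room."""
        self._tasks.append((task_id, size_bytes))
        self.total_bytes += size_bytes

        evicted = []
        # Evict oldest entries first until the budget is respected again.
        while self.total_bytes > self.max_bytes and self._tasks:
            old_id, old_size = self._tasks.popleft()
            self.total_bytes -= old_size
            evicted.append(old_id)
        return evicted

    def task_ids(self) -> list:
        return [task_id for task_id, _ in self._tasks]
```

With this policy a handful of tasks carrying large labels cannot pin an unbounded amount of memory; a single task larger than the whole budget is simply not retained.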





[jira] [Updated] (MESOS-6101) Add event for Framework added to master operator API

2017-07-21 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-6101:
-
Sprint: Mesosphere Sprint 60

> Add event for Framework added to master operator API
> ---
>
> Key: MESOS-6101
> URL: https://issues.apache.org/jira/browse/MESOS-6101
> Project: Mesos
>  Issue Type: Task
>Reporter: Zhitao Li
>Assignee: Quinn
>
> Consider the following case:
> 1) a subscriber connects to the master;
> 2) a new scheduler registers as a new framework;
> 3) a task is launched by this framework.
> In this sequence, the subscriber has no way to learn the FrameworkInfo 
> belonging to the FrameworkId.
> We should support an event for this (e.g. one sent when framework info in the 
> master is added or changed).
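
The gap described in the quoted issue can be sketched from the subscriber's side as follows. The event names, the dictionary shapes and the `Subscriber` class are assumptions chosen for this Python illustration; they are not taken from the operator API definition.

```python
class Subscriber:
    """Tracks frameworks and tasks from a stream of master events."""

    def __init__(self):
        self.frameworks = {}     # framework ID -> FrameworkInfo-like dict
        self.orphan_tasks = []   # tasks whose framework was never announced

    def handle(self, event: dict) -> None:
        kind = event["type"]
        if kind in ("FRAMEWORK_ADDED", "FRAMEWORK_UPDATED"):
            # With such an event, the subscriber learns the framework info.
            info = event["framework_info"]
            self.frameworks[info["id"]] = info
        elif kind == "TASK_ADDED":
            task = event["task"]
            if task["framework_id"] not in self.frameworks:
                # Without a framework event, only the ID is known here.
                self.orphan_tasks.append(task["task_id"])
```

If the master emits no framework event, every task of a framework registered after the subscription ends up in `orphan_tasks`, which is the situation the ticket wants to avoid.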





[jira] [Updated] (MESOS-6101) Add Framework events to master's operator API

2017-07-21 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-6101:
-
Summary: Add Framework events to master's operator API  (was: Add event for 
Framework added to master operator API)

> Add Framework events to master's operator API
> 
>
> Key: MESOS-6101
> URL: https://issues.apache.org/jira/browse/MESOS-6101
> Project: Mesos
>  Issue Type: Task
>Reporter: Zhitao Li
>Assignee: Quinn
>
> Consider the following case:
> 1) a subscriber connects to the master;
> 2) a new scheduler registers as a new framework;
> 3) a task is launched by this framework.
> In this sequence, the subscriber has no way to learn the FrameworkInfo 
> belonging to the FrameworkId.
> We should support an event for this (e.g. one sent when framework info in the 
> master is added or changed).





[jira] [Updated] (MESOS-6101) Add event for Framework added to master operator API

2017-07-20 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-6101:
-
Shepherd: Anand Mazumdar  (was: Greg Mann)

> Add event for Framework added to master operator API
> ---
>
> Key: MESOS-6101
> URL: https://issues.apache.org/jira/browse/MESOS-6101
> Project: Mesos
>  Issue Type: Task
>Reporter: Zhitao Li
>Assignee: Quinn
>
> Consider the following case:
> 1) a subscriber connects to the master;
> 2) a new scheduler registers as a new framework;
> 3) a task is launched by this framework.
> In this sequence, the subscriber has no way to learn the FrameworkInfo 
> belonging to the FrameworkId.
> We should support an event for this (e.g. one sent when framework info in the 
> master is added or changed).





[jira] [Commented] (MESOS-7818) Add more filtering options for unversioned operator API

2017-07-20 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095017#comment-16095017
 ] 

Greg Mann commented on MESOS-7818:
--

Let's collect specific requirements here and break out into additional tickets 
if necessary.

cc [~klueska] [~cinchurge]

> Add more filtering options for unversioned operator API
> ---
>
> Key: MESOS-7818
> URL: https://issues.apache.org/jira/browse/MESOS-7818
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Greg Mann
>  Labels: api, mesosphere, operator
>
> The Mesos CLI hits {{/state}} to get the state of the Mesos cluster, which 
> can cause performance issues in large clusters. To optimize the CLI for large 
> clusters, we can add more filtering options to unversioned operator endpoints 
> like {{/tasks}}, so that the CLI can request results for only those tasks 
> which match certain criteria.





[jira] [Commented] (MESOS-2258) Enable filtering of task information in master/state.json

2017-07-20 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095014#comment-16095014
 ] 

Greg Mann commented on MESOS-2258:
--

Closing this in favor of MESOS-7818.

> Enable filtering of task information in master/state.json
> -
>
> Key: MESOS-2258
> URL: https://issues.apache.org/jira/browse/MESOS-2258
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Niklas Quarfot Nielsen
>
> The master's state endpoint can grow huge (several MB) in large 
> installations due to data on all running and completed tasks, while other 
> pieces of information (counters, attached slaves and frameworks) are still 
> useful to poll frequently.
> We could add query parameters to state.json to filter out task information 
> and/or introduce a /metadata.json endpoint with everything but task information.
> Any thoughts?





[jira] [Created] (MESOS-7818) Add more filtering options for unversioned operator API

2017-07-20 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7818:


 Summary: Add more filtering options for unversioned operator API
 Key: MESOS-7818
 URL: https://issues.apache.org/jira/browse/MESOS-7818
 Project: Mesos
  Issue Type: Improvement
  Components: master
Reporter: Greg Mann


The Mesos CLI hits {{/state}} to get the state of the Mesos cluster, which can 
cause performance issues in large clusters. To optimize the CLI for large 
clusters, we can add more filtering options to unversioned operator endpoints 
like {{/tasks}}, so that the CLI can request results for only those tasks which 
match certain criteria.
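
To make the idea concrete, here is a small Python sketch of criteria-based filtering driven by a query string. The parameter names (`framework_id`, `state`) and the `filter_tasks` helper are hypothetical; which criteria the endpoint should actually accept is exactly what this ticket is meant to collect.

```python
from urllib.parse import parse_qsl


def filter_tasks(tasks: list, query_string: str) -> list:
    """Return only the tasks matching every key=value pair in the query."""
    criteria = dict(parse_qsl(query_string))
    return [
        task
        for task in tasks
        # A task matches when it has each requested field with that value.
        if all(key in task and str(task[key]) == value
               for key, value in criteria.items())
    ]
```

An empty query string yields no criteria, so all tasks are returned and the unfiltered behavior is preserved.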





[jira] [Updated] (MESOS-7630) Add simple filtering to unversioned operator API

2017-07-20 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7630:
-
Sprint: Mesosphere Sprint 59

> Add simple filtering to unversioned operator API
> 
>
> Key: MESOS-7630
> URL: https://issues.apache.org/jira/browse/MESOS-7630
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, master
>Reporter: Quinn
>Assignee: Quinn
>  Labels: agent, api, http, master, mesosphere
> Fix For: 1.4.0
>
>
> Add filtering for the following endpoints:
> - {{/frameworks}}
> - {{/slaves}}
> - {{/tasks}}
> - {{/containers}}
> We should investigate whether to use a RESTful style or a query string to 
> filter for a specific resource. We should also figure out whether it's 
> necessary to filter a list of resources.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7434) SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.

2017-07-20 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7434:
-
Sprint: Mesosphere Sprint 58  (was: Mesosphere Sprint 58, Mesosphere Sprint 
59)

> SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.
> -
>
> Key: MESOS-7434
> URL: https://issues.apache.org/jira/browse/MESOS-7434
> Project: Mesos
>  Issue Type: Bug
> Environment: Debian 8
> CentOS 6
> other Linux distros
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: flaky, flaky-test, mesosphere
> Attachments: 
> RestartSlaveRequireExecutorAuthentication_failure_log_debian8.txt, 
> RestartSlaveRequireExecutorAuthentication is flaky_failure_log_centos6.txt
>
>
> This test failure has been observed on an internal CI system. It occurs on a 
> variety of Linux distributions. It seems that using {{cat}} as the task 
> command may be problematic; see attached log file 
> {{SlaveTest.RestartSlaveRequireExecutorAuthentication.txt}}.





[jira] [Commented] (MESOS-7630) Add simple filtering to unversioned operator API

2017-07-18 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092386#comment-16092386
 ] 

Greg Mann commented on MESOS-7630:
--

{code}
commit 916a5c9fdbc7619b7c9356c21afb83e043feef88
Author: Quinn Leng 
Date:   Tue Jul 18 17:07:02 2017 -0700

Added test cases for /slaves, /containers, /frameworks endpoints.

Added query parameter test cases for '/slaves' and '/frameworks' on
the master, and '/containers' on the agent.

Review: https://reviews.apache.org/r/60847/
{code}
{code}
commit 8363449c130298b9c77560c5df583dc1226dd17c
Author: Quinn Leng 
Date:   Tue Jul 18 17:06:59 2017 -0700

Added filtering to /slaves, /containers and /frameworks endpoints.

Added query parameter support for the '/slaves', '/frameworks' and
'/containers' endpoints.

This allows slaves, frameworks and containers to be queried
by ID.

If no ID is specified, all records are returned, consistent
with current behavior.

Review: https://reviews.apache.org/r/60822/
{code}
{code}
commit aa244baa45d8db84e98e6dca9944a3f679da70d1
Author: Quinn Leng 
Date:   Tue Jul 18 17:06:56 2017 -0700

Added class definition for the 'IDAcceptor'.

This commit contains the class definition for 'IDAcceptor', which is
used to filter IDs in the '/master/frameworks', '/master/slaves',
'/master/tasks', and '/slave/containers' endpoints.

Review: https://reviews.apache.org/r/60820/
{code}
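
The behavior described in these commit messages (filter by ID when one is given, return everything otherwise) can be sketched as below. This is a Python approximation for illustration only: the class borrows the `IDAcceptor` name from the commit, but its interface and the `list_frameworks` helper are assumptions, not the C++ class added in the review.

```python
from typing import Optional


class IDAcceptor:
    """Accepts every ID when no target is set, else only the target ID."""

    def __init__(self, target_id: Optional[str] = None):
        self.target_id = target_id

    def accept(self, candidate_id: str) -> bool:
        return self.target_id is None or candidate_id == self.target_id


def list_frameworks(frameworks: list,
                    framework_id: Optional[str] = None) -> list:
    """Apply the acceptor the way an endpoint handler might."""
    acceptor = IDAcceptor(framework_id)
    return [f for f in frameworks if acceptor.accept(f["id"])]
```

Treating a missing ID as "accept all" lets the handler use one code path for both filtered and unfiltered requests.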

> Add simple filtering to unversioned operator API
> 
>
> Key: MESOS-7630
> URL: https://issues.apache.org/jira/browse/MESOS-7630
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, master
>Reporter: Quinn
>Assignee: Quinn
>  Labels: agent, api, http, master, mesosphere
>
> Add filtering for the following endpoints:
> - {{/frameworks}}
> - {{/slaves}}
> - {{/tasks}}
> - {{/containers}}
> We should investigate whether to use a RESTful style or a query string to 
> filter for a specific resource. We should also figure out whether it's 
> necessary to filter a list of resources.





[jira] [Commented] (MESOS-7630) Add simple filtering to unversioned operator API

2017-07-13 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086807#comment-16086807
 ] 

Greg Mann commented on MESOS-7630:
--

{code}
commit 9e208293ba482d843e5c56a40d997ba18e764b58
Author: Quinn Leng 
Date:   Thu Jul 13 17:44:03 2017 -0700

Refactored authorization acceptors into a single class.

Replaced different authorization-related Acceptor classes with one
AuthorizationAcceptor class.

Removed the ObjectAcceptor parent class, since no inheritance features
are provided by it.

Review: https://reviews.apache.org/r/60716/
{code}
{code}
commit 15656be2f65cc4eeaf053b47133ca0bd43d5c166
Author: Quinn Leng 
Date:   Thu Jul 13 17:43:59 2017 -0700

Added constructors for ObjectApprover::Object.

Added new constructors and updated all places where
ObjectApprover::Objects are constructed to use new
constructors.

Review: https://reviews.apache.org/r/60279/
{code}

> Add simple filtering to unversioned operator API
> 
>
> Key: MESOS-7630
> URL: https://issues.apache.org/jira/browse/MESOS-7630
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, master
>Reporter: Quinn
>Assignee: Quinn
>  Labels: agent, api, http, master, mesosphere
>
> Add filtering for the following endpoints:
> - {{/frameworks}}
> - {{/slaves}}
> - {{/tasks}}
> - {{/containers}}
> We should investigate whether to use a RESTful style or a query string to 
> filter for a specific resource. We should also figure out whether it's 
> necessary to filter a list of resources.





[jira] [Assigned] (MESOS-7602) Add filtering capabilities to the master/agent operator APIs

2017-07-10 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-7602:


Assignee: Quinn

> Add filtering capabilities to the master/agent operator APIs
> 
>
> Key: MESOS-7602
> URL: https://issues.apache.org/jira/browse/MESOS-7602
> Project: Mesos
>  Issue Type: Epic
>  Components: agent, HTTP API, master
>Reporter: Greg Mann
>Assignee: Quinn
>  Labels: api, http, mesosphere
>
> We would like to add filtering capabilities to both the unversioned operator 
> HTTP endpoints and the V1 operator APIs on the master and agent.





[jira] [Updated] (MESOS-7602) Add filtering capabilities to the master/agent operator APIs

2017-07-10 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7602:
-
Shepherd: Greg Mann

> Add filtering capabilities to the master/agent operator APIs
> 
>
> Key: MESOS-7602
> URL: https://issues.apache.org/jira/browse/MESOS-7602
> Project: Mesos
>  Issue Type: Epic
>  Components: agent, HTTP API, master
>Reporter: Greg Mann
>Assignee: Quinn
>  Labels: api, http, mesosphere
>
> We would like to add filtering capabilities to both the unversioned operator 
> HTTP endpoints and the V1 operator APIs on the master and agent.





[jira] [Updated] (MESOS-7434) SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.

2017-07-06 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7434:
-
Shepherd: Till Toenshoff

> SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.
> -
>
> Key: MESOS-7434
> URL: https://issues.apache.org/jira/browse/MESOS-7434
> Project: Mesos
>  Issue Type: Bug
> Environment: Debian 8
> CentOS 6
> other Linux distros
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: flaky, flaky-test, mesosphere
> Attachments: 
> RestartSlaveRequireExecutorAuthentication_failure_log_debian8.txt, 
> RestartSlaveRequireExecutorAuthentication is flaky_failure_log_centos6.txt
>
>
> This test failure has been observed on an internal CI system. It occurs on a 
> variety of Linux distributions. It seems that using {{cat}} as the task 
> command may be problematic; see attached log file 
> {{SlaveTest.RestartSlaveRequireExecutorAuthentication.txt}}.





[jira] [Commented] (MESOS-7630) Add simple filtering to unversioned operator API

2017-06-30 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070910#comment-16070910
 ] 

Greg Mann commented on MESOS-7630:
--

{code}
commit 0d277bb64fa5a4d0b4f741daedf64095beab4773
Author: Quinn Leng 
Date:   Fri Jun 30 16:58:34 2017 -0700

Added filtering to the '/tasks' endpoint.

Added filtering to the '/tasks' endpoint.

Review: https://reviews.apache.org/r/60107/
{code}

> Add simple filtering to unversioned operator API
> 
>
> Key: MESOS-7630
> URL: https://issues.apache.org/jira/browse/MESOS-7630
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, master
>Reporter: Quinn
>Assignee: Quinn
>  Labels: agent, api, http, master, mesosphere
>
> Add filtering for the following endpoints:
> - {{/frameworks}}
> - {{/slaves}}
> - {{/tasks}}
> - {{/containers}}
> We should investigate whether to use a RESTful style or a query string to 
> filter for a specific resource. We should also figure out whether it's 
> necessary to filter a list of resources.





[jira] [Commented] (MESOS-7726) MasterTest.IgnoreOldAgentReregistration test is flaky

2017-06-30 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070853#comment-16070853
 ] 

Greg Mann commented on MESOS-7726:
--

The linked ticket, MESOS-7562, is similar but fails on the timeout of a 
different future. Leaving them both open for the time being in case these are 
discrete issues.

> MasterTest.IgnoreOldAgentReregistration test is flaky
> -
>
> Key: MESOS-7726
> URL: https://issues.apache.org/jira/browse/MESOS-7726
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Neil Conway
>  Labels: flaky-test, mesosphere-oncall
>
> Observed this on ASF CI.
> {code}
> [ RUN  ] MasterTest.IgnoreOldAgentReregistration
> I0627 05:23:06.031154  4917 cluster.cpp:162] Creating default 'local' 
> authorizer
> I0627 05:23:06.033433  4945 master.cpp:438] Master 
> a8778782-0da1-49a5-9cb8-9f6d11701733 (c43debbe7e32) started on 
> 172.17.0.4:41747
> I0627 05:23:06.033457  4945 master.cpp:440] Flags at startup: --acls="" 
> --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate_agents="true" --authenticate_frameworks="true" 
> --authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
> --authenticate_http_readwrite="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/2BARnF/credentials" 
> --filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
> --hostname_lookup="true" --http_authenticators="basic" 
> --http_framework_authenticators="basic" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" 
> --max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
> --recovery_agent_removal_limit="100%" --registry="in_memory" 
> --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
> --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
> --registry_store_timeout="100secs" --registry_strict="false" 
> --root_submissions="true" --user_sorter="drf" --version="false" 
> --webui_dir="/mesos/mesos-1.4.0/_inst/share/mesos/webui" 
> --work_dir="/tmp/2BARnF/master" --zk_session_timeout="10secs"
> I0627 05:23:06.033771  4945 master.cpp:490] Master only allowing 
> authenticated frameworks to register
> I0627 05:23:06.033787  4945 master.cpp:504] Master only allowing 
> authenticated agents to register
> I0627 05:23:06.033798  4945 master.cpp:517] Master only allowing 
> authenticated HTTP frameworks to register
> I0627 05:23:06.033812  4945 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/2BARnF/credentials'
> I0627 05:23:06.034080  4945 master.cpp:562] Using default 'crammd5' 
> authenticator
> I0627 05:23:06.034221  4945 http.cpp:974] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-readonly'
> I0627 05:23:06.034409  4945 http.cpp:974] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-readwrite'
> I0627 05:23:06.034569  4945 http.cpp:974] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-scheduler'
> I0627 05:23:06.034688  4945 master.cpp:642] Authorization enabled
> I0627 05:23:06.034862  4938 whitelist_watcher.cpp:77] No whitelist given
> I0627 05:23:06.034868  4950 hierarchical.cpp:169] Initialized hierarchical 
> allocator process
> I0627 05:23:06.037211  4957 master.cpp:2161] Elected as the leading master!
> I0627 05:23:06.037236  4957 master.cpp:1700] Recovering from registrar
> I0627 05:23:06.037333  4938 registrar.cpp:345] Recovering registrar
> I0627 05:23:06.038146  4938 registrar.cpp:389] Successfully fetched the 
> registry (0B) in 768256ns
> I0627 05:23:06.038290  4938 registrar.cpp:493] Applied 1 operations in 
> 30798ns; attempting to update the registry
> I0627 05:23:06.038861  4938 registrar.cpp:550] Successfully updated the 
> registry in 510976ns
> I0627 05:23:06.038960  4938 registrar.cpp:422] Successfully recovered 
> registrar
> I0627 05:23:06.039364  4941 hierarchical.cpp:207] Skipping recovery of 
> hierarchical allocator: nothing to recover
> I0627 05:23:06.039594  4958 master.cpp:1799] Recovered 0 agents from the 
> registry (129B); allowing 10mins for agents to re-register
> I0627 05:23:06.043999  4917 containerizer.cpp:230] Using isolation: 
> posix/cpu,posix/mem,filesystem/posix,network/cni,environment_secret
> W0627 05:23:06.044456  4917 backend.cpp:76] Failed to create 'aufs' backend: 
> AufsBackend requires root privileges
> W0627 05:23:06.044548  4917 backend.cpp:76] Failed to create 'bind' backend: 
> BindBackend requires root privileges
> I0627 05:23:06.044580  4917 provisioner.cpp

[jira] [Updated] (MESOS-7562) MasterTest.IgnoreOldAgentReregistration is flaky

2017-06-30 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7562:
-
 Labels: flaky flaky-test mesosphere mesosphere-oncall  (was: )
Component/s: test

> MasterTest.IgnoreOldAgentReregistration is flaky
> 
>
> Key: MESOS-7562
> URL: https://issues.apache.org/jira/browse/MESOS-7562
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: flaky, flaky-test, mesosphere, mesosphere-oncall
>
> {noformat}
> [ RUN  ] MasterTest.IgnoreOldAgentReregistration
> I0524 16:29:07.143152 29236 cluster.cpp:162] Creating default 'local' 
> authorizer
> I0524 16:29:07.149690 29287 master.cpp:436] Master 
> 3912ae61-36a4-468c-bef5-82f082370f3d (core-dev) started on 10.0.49.2:42980
> I0524 16:29:07.149724 29287 master.cpp:438] Flags at startup: --acls="" 
> --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate_agents="true" --authenticate_frameworks="true" 
> --authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
> --authenticate_http_readwrite="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/gg4ie7/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --http_authenticators="basic" --http_framework_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" 
> --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" 
> --max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
> --recovery_agent_removal_limit="100%" --registry="in_memory" 
> --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
> --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
> --registry_store_timeout="100secs" --registry_strict="false" 
> --root_submissions="true" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/gg4ie7/master" 
> --zk_session_timeout="10secs"
> I0524 16:29:07.149896 29287 master.cpp:488] Master only allowing 
> authenticated frameworks to register
> I0524 16:29:07.149905 29287 master.cpp:502] Master only allowing 
> authenticated agents to register
> I0524 16:29:07.149912 29287 master.cpp:515] Master only allowing 
> authenticated HTTP frameworks to register
> I0524 16:29:07.149920 29287 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/gg4ie7/credentials'
> I0524 16:29:07.150065 29287 master.cpp:560] Using default 'crammd5' 
> authenticator
> I0524 16:29:07.150133 29287 http.cpp:975] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-readonly'
> I0524 16:29:07.150168 29287 http.cpp:975] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-readwrite'
> I0524 16:29:07.150223 29287 http.cpp:975] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-scheduler'
> I0524 16:29:07.150259 29287 master.cpp:640] Authorization enabled
> I0524 16:29:07.151617 29274 master.cpp:2161] Elected as the leading master!
> I0524 16:29:07.151644 29274 master.cpp:1700] Recovering from registrar
> I0524 16:29:07.152218 29261 registrar.cpp:389] Successfully fetched the 
> registry (0B) in 505088ns
> I0524 16:29:07.152268 29261 registrar.cpp:493] Applied 1 operations in 
> 4200ns; attempting to update the registry
> I0524 16:29:07.152664 29261 registrar.cpp:550] Successfully updated the 
> registry in 371200ns
> I0524 16:29:07.152703 29261 registrar.cpp:422] Successfully recovered 
> registrar
> I0524 16:29:07.153328 29291 master.cpp:1799] Recovered 0 agents from the 
> registry (119B); allowing 10mins for agents to re-register
> I0524 16:29:07.160094 29236 containerizer.cpp:230] Using isolation: 
> posix/cpu,posix/mem,filesystem/posix,network/cni,environment_secret
> W0524 16:29:07.160295 29236 backend.cpp:76] Failed to create 'overlay' 
> backend: OverlayBackend requires root privileges
> W0524 16:29:07.160326 29236 backend.cpp:76] Failed to create 'bind' backend: 
> BindBackend requires root privileges
> I0524 16:29:07.160334 29236 provisioner.cpp:255] Using default backend 'copy'
> I0524 16:29:07.161916 29236 cluster.cpp:448] Creating default 'local' 
> authorizer
> I0524 16:29:07.162616 29276 slave.cpp:225] Mesos agent started on 
> (7738)@10.0.49.2:42980
> I0524 16:29:07.162644 29276 slave.cpp:226] Flags at startup: --acls="" 
> --appc_simple_discovery_uri_prefix="http://" 
> --appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="true" 
> --authenticate_http_readwrite="true" --authenticatee="crammd5" 
> --authentication_backoff_factor="1secs" --au

[jira] [Commented] (MESOS-7743) Authorization for framework effective capabilities.

2017-06-29 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069236#comment-16069236
 ] 

Greg Mann commented on MESOS-7743:
--

cc [~arojas]

> Authorization for framework effective capabilities.
> ---
>
> Key: MESOS-7743
> URL: https://issues.apache.org/jira/browse/MESOS-7743
> Project: Mesos
>  Issue Type: Bug
>  Components: modules, security
>Reporter: James Peach
>
> As noted by [~greggomann], we should add an authorization hook to the 
> application of framework effective capabilities so that authorization modules 
> can make fine-grained decisions about which effective capabilities they are 
> willing to allow specific tasks to hold.
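
A hook of the kind requested here might, in spirit, look like the following Python sketch. The policy layout (a mapping from principal to permitted capability names) and the `authorize_capabilities` function are hypothetical and only illustrate a fine-grained decision; they do not describe the Mesos authorizer interface.

```python
def authorize_capabilities(principal: str,
                           requested: list,
                           allowed_by_principal: dict) -> tuple:
    """Return (approved, denied_capabilities) for a task's request."""
    allowed = allowed_by_principal.get(principal, frozenset())
    # Anything requested but not explicitly permitted is denied.
    denied = sorted(set(requested) - set(allowed))
    return (not denied, denied)
```

Returning the denied capabilities alongside the verdict would let the caller report which capability caused a task to be rejected.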





[jira] [Updated] (MESOS-7680) Stop using EXIT() in master/agent initialization code

2017-06-15 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7680:
-
Description: 
The initialization of master/agent dependencies is currently inconsistent. For 
some dependencies, we initialize them outside of the actor and then inject them 
via the constructor; for example, in {{main.cpp}} and {{cluster.cpp}}.

Some other dependencies are created/initialized within the master/slave's 
{{initialize()}} method. In this case, if the dependency creation fails, we use 
{{EXIT(EXIT_FAILURE)}} to terminate the process. In the case of tests, this is 
problematic. If I create multiple agents, for example, and one of their 
dependencies fails to initialize successfully, the entire test harness would 
exit :-(

During some discussion, [~jieyu] proposed an alternative: instead of using 
{{EXIT}} when dependency creation fails, we could terminate the master/agent 
libprocess process. In the case of the production binaries, this would cause 
the executable to exit. In the case of our tests, this would allow a single 
test to fail, while the test harness continues running.



  was:
The initialization of master/agent dependencies is currently inconsistent. For 
some dependencies, we initialize them outside of the actor and then inject them 
via the constructor; for example, in {{main.cpp}} and {{cluster.cpp}}.

Some other dependencies are created/initialized within the master/slave's 
{{initialize()}} method. In this case, if the dependency creation fails, we use 
{{EXIT(EXIT_FAILURE)}} to terminate the process. In the case of tests, this is 
problematic. If I create multiple agents, for example, and one of their 
dependencies fails to initialize successfully, the entire test harness would 
exit :-(

Instead of using {{EXIT}} when dependency creation fails, we should terminate 
the master/agent libprocess process. In the case of the production binaries, 
this will cause the executable to exit. In the case of our tests, this will 
allow a single test to fail, while the test harness continues running.




> Stop using EXIT() in master/agent initialization code
> -
>
> Key: MESOS-7680
> URL: https://issues.apache.org/jira/browse/MESOS-7680
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, master
>Reporter: Greg Mann
>  Labels: mesosphere
>
> The initialization of master/agent dependencies is currently inconsistent. 
> For some dependencies, we initialize them outside of the actor and then 
> inject them via the constructor; for example, in {{main.cpp}} and 
> {{cluster.cpp}}.
> Some other dependencies are created/initialized within the master/slave's 
> {{initialize()}} method. In this case, if the dependency creation fails, we 
> use {{EXIT(EXIT_FAILURE)}} to terminate the process. In the case of tests, 
> this is problematic. If I create multiple agents, for example, and one of 
> their dependencies fails to initialize successfully, the entire test harness 
> would exit :-(
> During some discussion, [~jieyu] proposed an alternative: instead of using 
> {{EXIT}} when dependency creation fails, we could terminate the master/agent 
> libprocess process. In the case of the production binaries, this would cause 
> the executable to exit. In the case of our tests, this would allow a single 
> test to fail, while the test harness continues running.





[jira] [Created] (MESOS-7680) Stop using EXIT() in master/agent initialization code

2017-06-15 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7680:


 Summary: Stop using EXIT() in master/agent initialization code
 Key: MESOS-7680
 URL: https://issues.apache.org/jira/browse/MESOS-7680
 Project: Mesos
  Issue Type: Improvement
  Components: agent, master
Reporter: Greg Mann


The initialization of master/agent dependencies is currently inconsistent. For 
some dependencies, we initialize them outside of the actor and then inject them 
via the constructor; for example, in {{main.cpp}} and {{cluster.cpp}}.

Some other dependencies are created/initialized within the master/slave's 
{{initialize()}} method. In this case, if the dependency creation fails, we use 
{{EXIT(EXIT_FAILURE)}} to terminate the process. In the case of tests, this is 
problematic. If I create multiple agents, for example, and one of their 
dependencies fails to initialize successfully, the entire test harness would 
exit :-(

Instead of using {{EXIT}} when dependency creation fails, we should terminate 
the master/agent libprocess process. In the case of the production binaries, 
this will cause the executable to exit. In the case of our tests, this will 
allow a single test to fail, while the test harness continues running.
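
To illustrate the difference, here is a minimal, self-contained sketch. It does not use libprocess or any Mesos code: {{FakeAgent}}, {{createDependency}} and {{exitCodeFor}} are hypothetical names invented for this example. The idea is only that a failed dependency marks the one actor as terminated, and the caller (a production {{main()}} or a test) decides what to do about it.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a master/agent actor; this is not the Mesos class.
class FakeAgent
{
public:
  // `createDependency` models a dependency that is created inside
  // initialize(). It returns an empty string on success and an error
  // message on failure.
  explicit FakeAgent(std::function<std::string()> createDependency)
    : createDependency_(std::move(createDependency)) {}

  // Instead of calling exit() on failure, record the error and mark only
  // this actor as terminated.
  void initialize()
  {
    failure_ = createDependency_();
    terminated_ = !failure_.empty();
  }

  bool terminated() const { return terminated_; }
  const std::string& failure() const { return failure_; }

private:
  std::function<std::string()> createDependency_;
  std::string failure_;
  bool terminated_ = false;
};

// What a production main() might do: exit non-zero only if the actor
// terminated. A test would instead fail a single assertion and move on.
int exitCodeFor(const FakeAgent& agent)
{
  return agent.terminated() ? 1 : 0;
}
```

With this shape, a test that creates several agents can check {{terminated()}} on each of them, and one failing dependency no longer takes down the whole test binary.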







[jira] [Updated] (MESOS-7434) SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.

2017-06-14 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7434:
-
Sprint: Mesosphere Sprint 58

> SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.
> -
>
> Key: MESOS-7434
> URL: https://issues.apache.org/jira/browse/MESOS-7434
> Project: Mesos
>  Issue Type: Bug
> Environment: Debian 8
> CentOS 6
> other Linux distros
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: flaky, flaky-test, mesosphere
> Attachments: 
> RestartSlaveRequireExecutorAuthentication_failure_log_debian8.txt, 
> RestartSlaveRequireExecutorAuthentication is flaky_failure_log_centos6.txt
>
>
> This test failure has been observed on an internal CI system. It occurs on a 
> variety of Linux distributions. It seems that using {{cat}} as the task 
> command may be problematic; see attached log file 
> {{SlaveTest.RestartSlaveRequireExecutorAuthentication.txt}}.





[jira] [Updated] (MESOS-7661) Libprocess timers with long durations trigger immediately

2017-06-13 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-7661:
-
Summary: Libprocess timers with long durations trigger immediately  (was: 
Libprocess runs long timers right ahead)

> Libprocess timers with long durations trigger immediately
> -
>
> Key: MESOS-7661
> URL: https://issues.apache.org/jira/browse/MESOS-7661
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Gastón Kleiman
>  Labels: mesosphere
>
> {{process::delay()}} will schedule a method to be run immediately when called 
> with a very long {{Duration}}.
> This happens because [{{Timeout}} tries to add two long 
> durations|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/libprocess/include/process/timeout.hpp#L33-L38],
>  leading to an [integer overflow in 
> {{Duration}}|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/stout/include/stout/duration.hpp#L116].
> I'd expect libprocess to either:
>   1. Never run the method.
>   2. Schedule it in the longest possible {{Duration}}.
> {{Duration::operator+=()}} should probably also handle integer overflows 
> differently. If an addition leads to an integer overflow, it might make more 
> sense to return {{Duration::max()}} than a negative duration.





[jira] [Commented] (MESOS-7661) Libprocess runs long timers right ahead

2017-06-13 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048536#comment-16048536
 ] 

Greg Mann commented on MESOS-7661:
--

I could imagine something like:
{code}
Duration& operator+=(const Duration& that)
{
  if (max() - that < *this) {
    nanos = max().nanos;
  } else {
    nanos += that.nanos;
  }

  return *this;
}
{code}
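
A self-contained version of the same idea, for anyone who wants to try it outside of stout. {{SimpleDuration}} is a simplified stand-in that I am assuming here (just an {{int64_t}} nanosecond count), not the real {{Duration}} class. It also clamps in the negative direction, which the snippet above does not attempt, and it compares the raw nanosecond values so that the check itself cannot overflow.

```cpp
#include <cstdint>
#include <limits>

// Simplified stand-in for stout's Duration: only a nanosecond count.
struct SimpleDuration
{
  int64_t nanos;

  static SimpleDuration max()
  {
    return SimpleDuration{std::numeric_limits<int64_t>::max()};
  }

  static SimpleDuration min()
  {
    return SimpleDuration{std::numeric_limits<int64_t>::min()};
  }

  // Saturating addition: clamp to max()/min() instead of overflowing.
  SimpleDuration& operator+=(const SimpleDuration& that)
  {
    if (that.nanos > 0 && nanos > max().nanos - that.nanos) {
      nanos = max().nanos;  // The sum would overflow upwards.
    } else if (that.nanos < 0 && nanos < min().nanos - that.nanos) {
      nanos = min().nanos;  // The sum would overflow downwards.
    } else {
      nanos += that.nanos;
    }

    return *this;
  }
};
```

Adding 10ns to a value that is 5ns below the maximum yields {{max()}} here rather than a negative duration.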

cc [~bmahler] [~kaysoky]

> Libprocess runs long timers right ahead
> ---
>
> Key: MESOS-7661
> URL: https://issues.apache.org/jira/browse/MESOS-7661
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Gastón Kleiman
>  Labels: mesosphere
>
> {{process::delay()}} will schedule a method to be run immediately when called 
> with a very long {{Duration}}.
> This happens because [{{Timeout}} tries to add two long 
> durations|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/libprocess/include/process/timeout.hpp#L33-L38],
>  leading to an [integer overflow in 
> {{Duration}}|https://github.com/apache/mesos/blob/13cae29e7832d8bb879c68847ad0df449d227f17/3rdparty/stout/include/stout/duration.hpp#L116].
> I'd expect libprocess to either:
>   1. Never run the method.
>   2. Schedule it in the longest possible {{Duration}}.
> {{Duration::operator+=()}} should probably also handle integer overflows 
> differently. If an addition leads to an integer overflow, it might make more 
> sense to return {{Duration::max()}} than a negative duration.





[jira] [Created] (MESOS-7629) Parsing to protobuf leads to API call validation errors

2017-06-06 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7629:


 Summary: Parsing to protobuf leads to API call validation errors
 Key: MESOS-7629
 URL: https://issues.apache.org/jira/browse/MESOS-7629
 Project: Mesos
  Issue Type: Bug
  Components: stout
Reporter: Greg Mann


The {{::protobuf::parse()}} function will [silently drop unrecognized 
fields|https://github.com/apache/mesos/blob/7ec3269d51d7d180aa857140097c170c469d7959/3rdparty/stout/include/stout/protobuf.hpp#L589],
 which makes sense in the context of maintaining backward-compatibility across 
different Mesos versions which may add or remove fields from protobuf messages. 
However, since we [rely on this protobuf 
parsing|https://github.com/apache/mesos/blob/7ec3269d51d7d180aa857140097c170c469d7959/src/master/http.cpp#L514-L520]
 in some places for validation of user-supplied JSON, this can lead to API 
endpoints returning successful 2XX responses, when in fact the JSON was 
malformed and the call has not been completed as submitted.

We should consider adding a parameter to API calls which allows users to 
enable/disable ignoring unrecognized fields in the call. If the default 
behavior for JSON requests was to return an error rather than ignore 
unrecognized fields, then our parsing code would catch malformed JSON 
submissions. The user could opt-in to the "ignore unrecognized fields" behavior 
when backwards compatibility is a concern.
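
As a rough sketch of the proposed behavior (not of the actual stout parser): the code below models a request as a flat map of field names to values and checks it against the set of field names the message type knows about. {{unrecognizedFields}}, {{validate}} and the {{ignoreUnrecognized}} parameter are hypothetical names for this example only; no such parameter exists in the API today.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Returns the names of supplied fields that the message type does not know.
std::vector<std::string> unrecognizedFields(
    const std::set<std::string>& knownFields,
    const std::map<std::string, std::string>& supplied)
{
  std::vector<std::string> unknown;
  for (const auto& field : supplied) {
    if (knownFields.count(field.first) == 0) {
      unknown.push_back(field.first);
    }
  }
  return unknown;
}

// Hypothetical validation step. In strict mode (the proposed default) any
// unrecognized field is an error. With `ignoreUnrecognized` set, such
// fields are accepted silently, which mirrors the current behavior.
bool validate(
    const std::set<std::string>& knownFields,
    const std::map<std::string, std::string>& supplied,
    bool ignoreUnrecognized,
    std::string* error)
{
  if (ignoreUnrecognized) {
    return true;
  }

  const std::vector<std::string> unknown =
    unrecognizedFields(knownFields, supplied);
  if (unknown.empty()) {
    return true;
  }

  *error = "Unrecognized field '" + unknown.front() + "'";
  return false;
}
```

A request with a misspelled field name would then be rejected by default instead of producing a 2XX response, and a client that needs to talk to older or newer Mesos versions could opt out.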



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (MESOS-7629) Parsing to protobuf leads to API call validation errors

2017-06-06 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16039076#comment-16039076
 ] 

Greg Mann commented on MESOS-7629:
--

cc [~bmahler]

> Parsing to protobuf leads to API call validation errors
> ---
>
> Key: MESOS-7629
> URL: https://issues.apache.org/jira/browse/MESOS-7629
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Greg Mann
>  Labels: api, json, mesosphere, parsing, protobuf
>
> The {{::protobuf::parse()}} function will [silently drop unrecognized 
> fields|https://github.com/apache/mesos/blob/7ec3269d51d7d180aa857140097c170c469d7959/3rdparty/stout/include/stout/protobuf.hpp#L589],
>  which makes sense in the context of maintaining backward-compatibility 
> across different Mesos versions which may add or remove fields from protobuf 
> messages. However, since we [rely on this protobuf 
> parsing|https://github.com/apache/mesos/blob/7ec3269d51d7d180aa857140097c170c469d7959/src/master/http.cpp#L514-L520]
>  in some places for validation of user-supplied JSON, this can lead to API 
> endpoints returning successful 2XX responses, when in fact the JSON was 
> malformed and the call has not been completed as submitted.
> We should consider adding a parameter to API calls which allows users to 
> enable/disable ignoring unrecognized fields in the call. If the default 
> behavior for JSON requests was to return an error rather than ignore 
> unrecognized fields, then our parsing code would catch malformed JSON 
> submissions. The user could opt-in to the "ignore unrecognized fields" 
> behavior when backwards compatibility is a concern.





[jira] [Created] (MESOS-7602) Add filtering capabilities to the master/agent operator APIs

2017-06-01 Thread Greg Mann (JIRA)
Greg Mann created MESOS-7602:


 Summary: Add filtering capabilities to the master/agent operator 
APIs
 Key: MESOS-7602
 URL: https://issues.apache.org/jira/browse/MESOS-7602
 Project: Mesos
  Issue Type: Epic
  Components: agent, HTTP API, master
Reporter: Greg Mann


We would like to add filtering capabilities to both the unversioned operator 
HTTP endpoints and the V1 operator APIs on the master and agent.





[jira] [Comment Edited] (MESOS-7542) Add executor reconnection retry logic to the agent

2017-05-25 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025602#comment-16025602
 ] 

Greg Mann edited comment on MESOS-7542 at 5/26/17 12:48 AM:


Implementation/tests of the agent-side behavior:
https://reviews.apache.org/r/59584/
https://reviews.apache.org/r/59585/
https://reviews.apache.org/r/59586/
https://reviews.apache.org/r/59587/


was (Author: greggomann):
Implementation/tests of the agent-side behavior:
https://reviews.apache.org/r/59584/
https://reviews.apache.org/r/59584/
https://reviews.apache.org/r/59584/
https://reviews.apache.org/r/59584/

> Add executor reconnection retry logic to the agent
> --
>
> Key: MESOS-7542
> URL: https://issues.apache.org/jira/browse/MESOS-7542
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, executor
>Reporter: Greg Mann
>Assignee: Benjamin Mahler
>  Labels: mesosphere
>
> Currently, the agent sends a single {{ReconnectExecutorMessage}} to PID-based 
> executors during recovery. It would be more robust to have the agent retry 
> these messages until {{executor_reregister_timeout}} has elapsed.
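
A minimal sketch of the retry loop described above, independent of the agent code. The retry interval, the {{send}} and {{connected}} callbacks and the function name are assumptions made for this example; the ticket itself only names {{executor_reregister_timeout}}. Time is simulated with a counter so the example never sleeps.

```cpp
#include <chrono>
#include <functional>

// Calls `send` once per `interval` until `connected` returns true or
// `timeout` has elapsed. Returns the number of messages sent.
int retryUntilTimeout(
    std::chrono::seconds timeout,
    std::chrono::seconds interval,
    const std::function<void()>& send,
    const std::function<bool()>& connected)
{
  // A non-positive interval would never make progress.
  if (interval <= std::chrono::seconds(0)) {
    return 0;
  }

  int sent = 0;
  for (std::chrono::seconds elapsed(0);
       elapsed < timeout && !connected();
       elapsed += interval) {
    send();  // Stands in for sending one reconnect message.
    ++sent;
  }

  return sent;
}
```

With a 10 second timeout and a 2 second interval, an executor that never answers receives five messages rather than one, and the loop stops early as soon as the executor is seen as connected.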





[jira] [Commented] (MESOS-7542) Add executor reconnection retry logic to the agent

2017-05-25 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025602#comment-16025602
 ] 

Greg Mann commented on MESOS-7542:
--

Implementation/tests of the agent-side behavior:
https://reviews.apache.org/r/59584/
https://reviews.apache.org/r/59584/
https://reviews.apache.org/r/59584/
https://reviews.apache.org/r/59584/

> Add executor reconnection retry logic to the agent
> --
>
> Key: MESOS-7542
> URL: https://issues.apache.org/jira/browse/MESOS-7542
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, executor
>Reporter: Greg Mann
>Assignee: Benjamin Mahler
>  Labels: mesosphere
>
> Currently, the agent sends a single {{ReconnectExecutorMessage}} to PID-based 
> executors during recovery. It would be more robust to have the agent retry 
> these messages until {{executor_reregister_timeout}} has elapsed.




