[jira] [Comment Edited] (MESOS-4144) Allow for dynamic updating of --roles

2015-12-13 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055276#comment-15055276
 ] 

Yong Qiao Wang edited comment on MESOS-4144 at 12/14/15 2:03 AM:
-

Currently, the Implicit Roles epic changes the behavior of the master when 
the --roles flag is NOT specified. Previously, only the "*" role could be 
used if --roles was not specified. Now, omitting --roles means that any role 
can be used. This is called "implicit roles". So Implicit Roles cannot be 
used to dynamically update --roles.

Actually, this JIRA is duplicated by MESOS-3177 (Dynamic Roles/Weights). But 
currently, we are using Implicit Roles to support dynamic roles implicitly, 
and the --roles flag will be removed in the future.
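
The flag semantics described above can be sketched roughly as follows. This 
is only an illustration, not Mesos code: the function name, the `whitelist` 
representation of the --roles flag, and the assumption that "*" is always 
usable are hypothetical, and real role-name validation is ignored.

```python
from typing import Optional, Set

DEFAULT_ROLE = "*"


def is_valid_role(role: str,
                  whitelist: Optional[Set[str]],
                  implicit_roles: bool = True) -> bool:
    """Decide whether a framework may register with `role`.

    Assumptions (not Mesos code): `whitelist` stands in for the parsed
    --roles flag, with None meaning the flag was omitted, and the "*"
    role is treated as always usable.
    """
    if role == DEFAULT_ROLE:
        return True
    if whitelist is not None:
        # Explicit roles: only roles listed in --roles are accepted.
        return role in whitelist
    # Flag omitted: any role with implicit roles, only "*" before them.
    return implicit_roles
```

Under these assumptions, is_valid_role("analytics", None) accepts the role, 
while is_valid_role("analytics", {"web"}) rejects it.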





> Allow for dynamic updating of --roles
> -
>
> Key: MESOS-4144
> URL: https://issues.apache.org/jira/browse/MESOS-4144
> Project: Mesos
> Issue Type: Bug
> Components: master
> Affects Versions: 0.25.0
> Reporter: John Omernik
>
> Roles must be specified when the master starts. If the environment changes 
> and more roles are needed, the masters must be restarted. Instead, there 
> should be a way, via an API, for properly authorized principals to add or 
> remove roles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)












[jira] [Updated] (MESOS-4143) Reserve/UnReserve Dynamic Reservation Endpoints allow reservations on non-existing roles

2015-12-13 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-4143:
--
Summary: Reserve/UnReserve Dynamic Reservation Endpoints allow reservations 
on non-existing roles  (was: Reserse/UnReserve Dynamic Reservation Endpoints 
allow reservations on non-existing roles)

> Reserve/UnReserve Dynamic Reservation Endpoints allow reservations on 
> non-existing roles
> 
>
> Key: MESOS-4143
> URL: https://issues.apache.org/jira/browse/MESOS-4143
> Project: Mesos
> Issue Type: Bug
> Components: general
> Affects Versions: 0.25.0, 0.26.0
> Reporter: John Omernik
>
> When working with dynamic reservations via the /reserve and /unreserve 
> endpoints, it is possible to reserve resources for roles that have not been 
> specified via the --roles flag on the master. However, these roles are not 
> usable because they have not been defined, nor are they added to the list of 
> available roles. 
> Per the mailing list, changing roles after the fact is not possible at this 
> time (that may be another JIRA). More importantly, the /reserve and 
> /unreserve endpoints should not allow reservations for roles not specified 
> by --roles.







[jira] [Commented] (MESOS-4143) Reserve/UnReserve Dynamic Reservation Endpoints allow reservations on non-existing roles

2015-12-13 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055566#comment-15055566
 ] 

Yong Qiao Wang commented on MESOS-4143:
---

With explicit roles (--roles is specified when the Mesos master starts up), 
this is a bug, but with implicit roles (--roles is NOT specified), it is not. 
I suggest fixing this issue after Implicit Roles (MESOS-4085) is committed; 
then we can call Master::validRole to do the check.
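
The check suggested above could look roughly like the sketch below. It is 
written in Python for brevity and is not the Mesos implementation: the 
function name, the simplified dict shape of a resource, and the `whitelist` 
stand-in for the --roles flag are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set


def unknown_reservation_roles(resources: List[Dict[str, object]],
                              whitelist: Optional[Set[str]]) -> List[str]:
    """Return the roles in a reserve request that should be rejected.

    Assumptions (hypothetical model): each resource is a dict with an
    optional "role" key, and `whitelist` is None when --roles was not
    given, in which case every role is acceptable (implicit roles).
    """
    if whitelist is None:
        return []
    unknown: List[str] = []
    for resource in resources:
        role = str(resource.get("role", "*"))
        # "*" is assumed to be always valid; report each bad role once.
        if role != "*" and role not in whitelist and role not in unknown:
            unknown.append(role)
    return unknown
```

An endpoint handler built on this idea would refuse the request whenever the 
returned list is non-empty, which matches the behavior the reporter asks for 
in the explicit-roles case.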



[jira] [Updated] (MESOS-3943) Support dynamic weight in allocator

2015-12-11 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3943:
--
   Shepherd: Adam B
Description: Currently, the RoleInfo protobuf is never used for serialization, 
so I think we can remove it from allocator.proto and define a struct for 
communication between the allocator and the master. But for displaying role 
information, the current serialization approach (calling model(role*) in 
http.cpp) is not ideal, and we should define another RoleInfo protobuf for 
serialization. Following other components (such as quota), I propose defining 
the role protobuf in a separate package rather than in mesos.proto.  (was: The 
Mesos allocator should be aware of role changes, including adding, updating, 
and deleting a role, so in this ticket we will extend the allocator interface 
based on the design 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#)
Summary: Support dynamic weight in allocator  (was: Dynamic 
roles/weights support in allocator)

> Support dynamic weight in allocator
> ---
>
> Key: MESOS-3943
> URL: https://issues.apache.org/jira/browse/MESOS-3943
> Project: Mesos
> Issue Type: Task
> Reporter: Yong Qiao Wang
> Assignee: Yong Qiao Wang
>
> Currently, the RoleInfo protobuf is never used for serialization, so I think 
> we can remove it from allocator.proto and define a struct for communication 
> between the allocator and the master. But for displaying role information, 
> the current serialization approach (calling model(role*) in http.cpp) is not 
> ideal, and we should define another RoleInfo protobuf for serialization. 
> Following other components (such as quota), I propose defining the role 
> protobuf in a separate package rather than in mesos.proto.





[jira] [Commented] (MESOS-3943) Support dynamic weight in allocator

2015-12-11 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052486#comment-15052486
 ] 

Yong Qiao Wang commented on MESOS-3943:
---

Added the review request (RR): https://reviews.apache.org/r/40469/



[jira] [Commented] (MESOS-4097) Change /roles endpoint to include quotas, weights, reserved resources?

2015-12-09 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049884#comment-15049884
 ] 

Yong Qiao Wang commented on MESOS-4097:
---

Some concerns are below:
1. We removed RoleInfo in MESOS-4085, but are we now going to show RoleInfo 
(all role-related configuration) via /roles?
2. The current design would make the operator experience worse. For example, 
after an operator configures the quota for a role with the /quota endpoint, 
they have to check the configuration with another endpoint (/roles), which 
returns information for all active roles and cannot query a specific one.


> Change /roles endpoint to include quotas, weights, reserved resources?
> --
>
> Key: MESOS-4097
> URL: https://issues.apache.org/jira/browse/MESOS-4097
> Project: Mesos
> Issue Type: Improvement
> Reporter: Neil Conway
> Labels: mesosphere, quota, reservations, roles
>
> MESOS-4085 changes the behavior of the {{/roles}} endpoint: rather than 
> listing all the explicitly defined roles, we will now only list those roles 
> that have one or more registered frameworks.
> As suggested by [~alexr] in code review, this could be improved -- an 
> operator might reasonably expect to see all the roles that have
> * non-default weight
> * non-default quota
> * non-default ACLs?
> * any static or dynamically reserved resources
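
The listing rule proposed in the description can be sketched as a simple 
filter. This is a hypothetical model rather than the /roles implementation: 
the RoleState fields, the default weight of 1.0, and the function name are 
assumptions, and ACLs are left out because the description marks them as an 
open question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

DEFAULT_WEIGHT = 1.0  # assumed default weight for a role


@dataclass
class RoleState:
    """Hypothetical per-role bookkeeping used only for this sketch."""
    frameworks: int = 0
    weight: float = DEFAULT_WEIGHT
    quota: Optional[Dict[str, float]] = None
    reserved: Dict[str, float] = field(default_factory=dict)


def roles_to_list(roles: Dict[str, RoleState]) -> List[str]:
    """Return the role names an operator would expect to see."""
    def is_visible(state: RoleState) -> bool:
        return (state.frameworks > 0
                or state.weight != DEFAULT_WEIGHT
                or state.quota is not None
                or any(amount > 0 for amount in state.reserved.values()))

    return sorted(name for name, state in roles.items() if is_visible(state))
```

With this rule, a role that has no registered frameworks still appears as 
long as it carries a non-default weight, a quota, or reserved resources.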





[jira] [Commented] (MESOS-4097) Change /roles endpoint to include quotas, weights, reserved resources?

2015-12-09 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049894#comment-15049894
 ] 

Yong Qiao Wang commented on MESOS-4097:
---

Maybe we should restore RoleInfo and improve /roles to meet the above 
requirements. MESOS-3791 does something similar.



[jira] [Commented] (MESOS-4097) Change /roles endpoint to include quotas, weights, reserved resources?

2015-12-09 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049956#comment-15049956
 ] 

Yong Qiao Wang commented on MESOS-4097:
---

OK, per my understanding of role management, Mesos prefers to set 
role-related configuration through separate endpoints, such as /quota to set 
quota, /reserve to dynamically set reservations, and maybe /weights to set 
weights, and to use the unified /roles endpoint to show all role-related 
information. In that case, once /roles shows a role's quota information, do 
we need to defer the function for querying quota via /quota? 



[jira] [Comment Edited] (MESOS-4097) Change /roles endpoint to include quotas, weights, reserved resources?

2015-12-09 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049956#comment-15049956
 ] 

Yong Qiao Wang edited comment on MESOS-4097 at 12/10/15 4:27 AM:
-

OK, per my understanding of role management, Mesos prefers to set 
role-related configuration through separate endpoints, such as /quota to set 
quota, /reserve to dynamically set reservations, and maybe /weights to set 
weights, but to use the unified /roles endpoint to show all role-related 
information. In that case, once /roles shows a role's quota information, do 
we need to defer the function for querying quota via /quota? 
[~neilc] and [~alexr], do you agree?





[jira] [Comment Edited] (MESOS-3942) Enhance endpoint /roles for adding a new role

2015-12-07 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044617#comment-15044617
 ] 

Yong Qiao Wang edited comment on MESOS-3942 at 12/7/15 9:13 AM:


After discussion with Adam B, we will support both Dynamic Roles and 
Implicit Roles. A new role will be implicitly created when a framework 
registers, as part of the Implicit Roles ticket, so I am marking this JIRA 
as invalid.


was (Author: jamesyongqiaowang):
A new role will be implicitly created when a framework registers, as part of 
the Implicit Roles ticket, so I am marking this JIRA as invalid.

> Enhance endpoint /roles for adding a new role
> -
>
> Key: MESOS-3942
> URL: https://issues.apache.org/jira/browse/MESOS-3942
> Project: Mesos
> Issue Type: Task
> Reporter: Yong Qiao Wang
> Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint /roles so that a 
> new role can be added at runtime, as outlined in the design doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Commented] (MESOS-4085) Implement implicit roles

2015-12-07 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046253#comment-15046253
 ] 

Yong Qiao Wang commented on MESOS-4085:
---

[~neilc], we also have a requirement to register a Mesos framework with any 
role, so this improvement is urgent and important for us. I see that this 
feature could be released before Dec 21 based on the current plan; can we 
work together to speed up this release? After you create subtasks for this 
improvement, can you assign some to me?

> Implement implicit roles
> 
>
> Key: MESOS-4085
> URL: https://issues.apache.org/jira/browse/MESOS-4085
> Project: Mesos
> Issue Type: Improvement
> Reporter: Neil Conway
> Assignee: Neil Conway
> Labels: mesosphere, roles
>
> See also design doc: MESOS-4000.





[jira] [Commented] (MESOS-3946) Test for role management

2015-12-07 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046117#comment-15046117
 ] 

Yong Qiao Wang commented on MESOS-3946:
---

Thanks [~neilc], I will have a look and add more tests later.

> Test for role management
> 
>
> Key: MESOS-3946
> URL: https://issues.apache.org/jira/browse/MESOS-3946
> Project: Mesos
> Issue Type: Task
> Reporter: Yong Qiao Wang
> Assignee: Yong Qiao Wang
>
> Add tests for dynamic role configuration.





[jira] [Commented] (MESOS-3944) Move RoleInfo message out of allocator.proto

2015-12-07 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046150#comment-15046150
 ] 

Yong Qiao Wang commented on MESOS-3944:
---

Good points! I had the same concern before designing DR (Dynamic Roles), but 
currently, in DR, we need to persist RoleInfo, so it is needed now. For ACLs, 
a role is just one of the objects of ACLs, and ACLs can also be configured 
for users and framework principals, so I think it does not make sense to 
include them in RoleInfo. For quota, can you have a talk with Alexander 
(owner of quota support)? I think quota and resource reservation are heavy 
configuration for a role and can be managed by a separate project, but weight 
and grace period are light configuration for a role. There is no need to add 
separate endpoints for managing them; we can add them to RoleInfo and manage 
them with the /roles endpoint.
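
As a rough illustration of the split proposed above, the sketch below keeps 
only the "light" settings in a persisted role record. It is not the Mesos 
RoleInfo protobuf: the field names, the JSON encoding, and the helper 
functions are assumptions chosen for the example, and quota, reservations, 
and ACLs are deliberately left out because the comment assigns them to other 
components.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RoleInfo:
    """Hypothetical persisted record holding only light role settings."""
    name: str
    weight: float = 1.0                        # assumed default weight
    grace_period_secs: Optional[float] = None  # None means "not configured"


def serialize(info: RoleInfo) -> str:
    """Encode the record for persistence (JSON stands in for protobuf)."""
    return json.dumps(asdict(info), sort_keys=True)


def deserialize(blob: str) -> RoleInfo:
    """Rebuild the record from its persisted form."""
    return RoleInfo(**json.loads(blob))
```

A record such as RoleInfo("ads", weight=2.5) survives a serialize and 
deserialize round trip unchanged, which is the property persistence needs.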

> Move RoleInfo message out of allocator.proto 
> -
>
> Key: MESOS-3944
> URL: https://issues.apache.org/jira/browse/MESOS-3944
> Project: Mesos
> Issue Type: Task
> Reporter: Yong Qiao Wang
> Assignee: Yong Qiao Wang
>
> Currently the role protobuf is defined in allocator.proto. We will move it 
> out and define the role protobuf in a separate package, and this protobuf 
> message will serve as the internal representation for role-related 
> information (e.g. for persisting roles).





[jira] [Commented] (MESOS-4085) Implement implicit roles

2015-12-07 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046367#comment-15046367
 ] 

Yong Qiao Wang commented on MESOS-4085:
---

Good job, this is a great surprise for me. I look forward to your patch, thanks!



[jira] [Commented] (MESOS-3988) Implicit roles

2015-11-28 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030801#comment-15030801
 ] 

Yong Qiao Wang commented on MESOS-3988:
---

In Mesos, many concepts are built around roles, such as quota, static/dynamic 
reservations, weights, etc. I think the role is a central concept, so why do we 
need to relax it? Are there detailed user requirements for this project? If we 
relax the concept of a role, some things will become confusing:

1. As a framework developer, if a framework can register with any role, why do 
I need to be aware of roles at all? Do we plan to remove the role parameter 
from FrameworkInfo?

2. As a cluster operator, if I do not know which roles will be used by end 
users/frameworks, how do I configure quota, ACLs for the register_frameworks 
action, and static reservations?

3. In Mesos, the role determines which resources frameworks can use. If the 
static role list is removed, how does a cluster admin define the resource 
plan?

4. In the traditional DRF allocator, the total number of roles affects each 
role's fair share of the Mesos cluster. If a framework can register with any 
role and the master does not check it, how do we control and guarantee the 
fair share?

5. For dynamic ACLs/weights, will we configure them for any role (because the 
role list will be removed in the implicit roles project)? The configured role 
would then be available for frameworks to use, is that right? If yes, this 
seems contradictory: we do not relax the role concept, we only merge the role 
configuration with the weights/ACLs configuration. I think that is complex and 
not easy for end users to understand.

As I understand it, the role concept is not actually relaxed in this design; 
we just spread its management across several places. For example, we can 
configure roles in the ACLs or weights management endpoints and persist roles 
in the ACLs or weights replicated log, and frameworks can only use the roles 
configured in ACLs or weights to apply the related policy. So I still propose 
the centralized management method in MESOS-3177. 
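
To make question 4 concrete, here is a small Python sketch of how the number 
of roles changes each role's share. It is a simplification, assuming each 
role's share is just its weight divided by the sum of all weights; the real 
DRF allocator also looks at dominant resource usage. The function name 
`fair_shares` and the role names are hypothetical.

```python
def fair_shares(weights):
    """Return each role's fraction of the cluster, proportional to its weight.

    Simplified model (assumption): share = weight / sum of all weights.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one role with a positive weight is required")
    return {role: weight / total for role, weight in weights.items()}


# Two operator-configured roles with equal weight.
configured = {"analytics": 1.0, "web": 1.0}

# The same cluster after two frameworks register with new, unchecked roles.
with_implicit = dict(configured, adhoc1=1.0, adhoc2=1.0)

before = fair_shares(configured)
after = fair_shares(with_implicit)
```

In this toy model, every additional role that the master accepts without a 
check shrinks the share of the roles the operator planned for.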

> Implicit roles
> --
>
> Key: MESOS-3988
> URL: https://issues.apache.org/jira/browse/MESOS-3988
> Project: Mesos
>  Issue Type: Epic
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere, roles
>
> At present, Mesos uses a static list of roles that are configured when the 
> master starts up. This places some severe limitations on how roles can be 
> used (e.g., changing the set of roles requires restarting all the masters).
> As an alternative (or a precursor) to implementing full-blown dynamic roles, 
> we could instead relax the concept of roles, so that:
> * frameworks can register with any role (subject to ACLs/authz)
> * reservations can be made for any role
> Open questions, at least to me:
> * This would mean weights cannot be configured dynamically. Is that okay?
> * Is this feature useful enough without dynamic ACL changes?
> * If we implement this (+ dynamic ACLs), do we also need dynamic roles?





[jira] [Commented] (MESOS-3946) Test for role management

2015-11-25 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026475#comment-15026475
 ] 

Yong Qiao Wang commented on MESOS-3946:
---

Good suggestion; we can consider this after the main tasks are done.

> Test for role management
> 
>
> Key: MESOS-3946
> URL: https://issues.apache.org/jira/browse/MESOS-3946
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Add tests for dynamic role configuration.





[jira] [Updated] (MESOS-3947) Authenticate /roles request

2015-11-25 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3947:
--
Description: 
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.

  was:
/roles requests, except those using the GET method, need to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.


> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles endpoint needs to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.





[jira] [Updated] (MESOS-3947) Authenticate /roles request

2015-11-25 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3947:
--
Description: 
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.

In addition, a query request to the /roles endpoint does not change the state 
of roles/weights in the Mesos master, so for that reason, and for backward 
compatibility, it will not be authenticated.

  was:
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.

In addition, a query request to the /roles endpoint does not change the state 
of roles/weights in the Mesos master, so for that reason, and for backward 
compatibility, it does not need to be authenticated.


> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles endpoint needs to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.
> In addition, a query request to the /roles endpoint does not change the state 
> of roles/weights in the Mesos master, so for that reason, and for backward 
> compatibility, it will not be authenticated.
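
The rule in this description (authenticate everything except queries) can be 
sketched in Python as follows. This is only an illustration under stated 
assumptions: it assumes HTTP Basic credentials in the `Authorization` header 
and an in-memory principal/secret table; the function name 
`is_authenticated` and its parameters are hypothetical and are not taken 
from `Master::Http`.

```python
import base64
import hmac


def is_authenticated(method, headers, credentials):
    """Decide whether a /roles request may proceed.

    method:      HTTP method of the request, e.g. "GET" or "POST".
    headers:     dict of request headers.
    credentials: dict of principal -> secret (assumed credential store).
    """
    # Query requests do not change roles/weights, so they are let through.
    if method == "GET":
        return True

    scheme, _, encoded = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return False

    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False

    principal, sep, secret = decoded.partition(":")
    if not sep:
        return False

    expected = credentials.get(principal)
    if expected is None:
        return False
    # Constant-time comparison avoids leaking how much of the secret matched.
    return hmac.compare_digest(secret.encode("utf-8"), expected.encode("utf-8"))
```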





[jira] [Updated] (MESOS-3947) Authenticate /roles request

2015-11-25 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3947:
--
Description: 
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.

In addition, a query request to the /roles endpoint does not change the state 
of roles/weights in the Mesos master, so for that reason, and for backward 
compatibility, it does not need to be authenticated.

  was:
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.


> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles endpoint needs to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.
> In addition, a query request to the /roles endpoint does not change the state 
> of roles/weights in the Mesos master, so for that reason, and for backward 
> compatibility, it does not need to be authenticated.





[jira] [Commented] (MESOS-3791) Enhance the existing HTTP endpoint /roles

2015-11-25 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026469#comment-15026469
 ] 

Yong Qiao Wang commented on MESOS-3791:
---

The JSON response format of the /roles endpoint for GET requests is changed in 
this ticket. I am not sure whether we need to keep it consistent with the 
previous format for backward compatibility. [~adam-mesos], any suggestions?

> Enhance the existing HTTP endpoint /roles
> -
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint to query roles as 
> outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Commented] (MESOS-3947) Authenticate /roles request

2015-11-25 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026499#comment-15026499
 ] 

Yong Qiao Wang commented on MESOS-3947:
---

Thanks [~marco-mesos] for the information. I have updated the description of 
this ticket to address your concern.

> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles endpoint needs to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.
> In addition, a query request to the /roles endpoint does not change the state 
> of roles/weights in the Mesos master, so for that reason, and for backward 
> compatibility, it will not be authenticated.





[jira] [Updated] (MESOS-4007) Persist role information to registry

2015-11-25 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-4007:
--
Description: 
To handle Mesos master recovery and failover, the Mesos master needs to 
persist the roles and weights information in the registry: 
- On first boot, the first leading master initializes the replicated log 
with the roles/weights specified by the command-line flags (--roles and 
--weights). The flag values are only used to bootstrap the cluster, after 
which point the registry becomes the source of truth.

- At runtime, entries in the replicated log can only be added/removed/updated 
through the operator REST API.

- On Mesos master restart/failover, if the replicated log for roles/weights 
already exists, the master uses the registry values, ignores the flags 
(--roles/--weights), and logs a warning that the flag values are being 
ignored.

- As future work, we can educate end users to create the replicated log to 
initialize the supported roles/weights before the Mesos cluster is 
bootstrapped, and to reset the roles/weights configuration by updating the 
replicated log.


  was:Persist role information to registry across master recovery/failover.


> Persist role information to registry
> 
>
> Key: MESOS-4007
> URL: https://issues.apache.org/jira/browse/MESOS-4007
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> To handle Mesos master recovery and failover, the Mesos master needs to 
> persist the roles and weights information in the registry: 
> - On first boot, the first leading master initializes the replicated log 
> with the roles/weights specified by the command-line flags (--roles and 
> --weights). The flag values are only used to bootstrap the cluster, after 
> which point the registry becomes the source of truth.
> - At runtime, entries in the replicated log can only be added/removed/updated 
> through the operator REST API.
> - On Mesos master restart/failover, if the replicated log for roles/weights 
> already exists, the master uses the registry values, ignores the flags 
> (--roles/--weights), and logs a warning that the flag values are being 
> ignored.
> - As future work, we can educate end users to create the replicated log to 
> initialize the supported roles/weights before the Mesos cluster is 
> bootstrapped, and to reset the roles/weights configuration by updating the 
> replicated log.
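
The precedence between the flags and the registry described above can be 
sketched in Python like this. It is a minimal illustration, not the master's 
actual recovery code: `recover_roles`, its parameters and the logger name are 
hypothetical, and the registry is modelled as a plain dict (or `None` when 
nothing has been persisted yet).

```python
import logging

logger = logging.getLogger("master")


def recover_roles(registry_roles, flag_roles):
    """Pick the role -> weight table a master should use at startup.

    registry_roles: dict read from the replicated log, or None on first boot.
    flag_roles:     dict parsed from --roles/--weights (may be empty).
    Returns (roles, must_persist), where must_persist tells the caller to
    write the chosen table into the registry.
    """
    if registry_roles is None:
        # First boot: bootstrap from the flags and persist the result.
        return dict(flag_roles), True

    if flag_roles:
        # The registry is the source of truth once it exists.
        logger.warning("Ignoring --roles/--weights; using registry values")
    return dict(registry_roles), False
```

After the first boot, the only way to change the table in this model is an 
explicit update of the registry, which matches the "operator REST API only" 
rule in the description.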





[jira] [Commented] (MESOS-3942) Enhance endpoint /roles for adding a new role

2015-11-25 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026417#comment-15026417
 ] 

Yong Qiao Wang commented on MESOS-3942:
---

RR: https://reviews.apache.org/r/40697/

> Enhance endpoint /roles for adding a new role
> -
>
> Key: MESOS-3942
> URL: https://issues.apache.org/jira/browse/MESOS-3942
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint /roles to support 
> adding a new role at runtime, as outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Updated] (MESOS-4008) Master recovery with the persisted roles in registry

2015-11-25 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-4008:
--
Description: 
To handle Mesos master recovery and failover, the Mesos master needs to 
persist the roles and weights information in the registry: 
- On first boot, the first leading master initializes the replicated log 
with the roles/weights specified by the command-line flags (--roles and 
--weights). The flag values are only used to bootstrap the cluster, after 
which point the registry becomes the source of truth.

- At runtime, entries in the replicated log can only be added/removed/updated 
through the operator REST API.

- On Mesos master restart/failover, if the replicated log for roles/weights 
already exists, the master uses the registry values, ignores the flags 
(--roles/--weights), and logs a warning that the flag values are being 
ignored.

- As future work, we can educate end users to create the replicated log to 
initialize the supported roles/weights before the Mesos cluster is 
bootstrapped, and to reset the roles/weights configuration by updating the 
replicated log.


  was:
To handle Mesos master recovery and failover, the Mesos master needs to 
persist the roles and weights information in the registry: 
On first boot, the first leading master initializes the replicated log with 
the roles/weights specified by the command-line flags (--roles and --weights). 
The flag values are only used to bootstrap the cluster, after which point the 
registry becomes the source of truth.
At runtime, entries in the replicated log can only be added/removed/updated 
through the operator REST API.
On Mesos master restart/failover, if the replicated log for roles/weights 
already exists, the master uses the registry values, ignores the flags 
(--roles/--weights), and logs a warning that the flag values are being 
ignored.
As future work, we can educate end users to create the replicated log to 
initialize the supported roles/weights before the Mesos cluster is 
bootstrapped, and to reset the roles/weights configuration by updating the 
replicated log.



> Master recovery with the persisted roles in registry
> 
>
> Key: MESOS-4008
> URL: https://issues.apache.org/jira/browse/MESOS-4008
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> To handle Mesos master recovery and failover, the Mesos master needs to 
> persist the roles and weights information in the registry: 
> - On first boot, the first leading master initializes the replicated log 
> with the roles/weights specified by the command-line flags (--roles and 
> --weights). The flag values are only used to bootstrap the cluster, after 
> which point the registry becomes the source of truth.
> - At runtime, entries in the replicated log can only be added/removed/updated 
> through the operator REST API.
> - On Mesos master restart/failover, if the replicated log for roles/weights 
> already exists, the master uses the registry values, ignores the flags 
> (--roles/--weights), and logs a warning that the flag values are being 
> ignored.
> - As future work, we can educate end users to create the replicated log to 
> initialize the supported roles/weights before the Mesos cluster is 
> bootstrapped, and to reset the roles/weights configuration by updating the 
> replicated log.





[jira] [Created] (MESOS-4008) Master recovery with the persisted roles in registry

2015-11-25 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-4008:
-

 Summary: Master recovery with the persisted roles in registry
 Key: MESOS-4008
 URL: https://issues.apache.org/jira/browse/MESOS-4008
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


To handle Mesos master recovery and failover, the Mesos master needs to 
persist the roles and weights information in the registry: 
On first boot, the first leading master initializes the replicated log with 
the roles/weights specified by the command-line flags (--roles and --weights). 
The flag values are only used to bootstrap the cluster, after which point the 
registry becomes the source of truth.
At runtime, entries in the replicated log can only be added/removed/updated 
through the operator REST API.
On Mesos master restart/failover, if the replicated log for roles/weights 
already exists, the master uses the registry values, ignores the flags 
(--roles/--weights), and logs a warning that the flag values are being 
ignored.
As future work, we can educate end users to create the replicated log to 
initialize the supported roles/weights before the Mesos cluster is 
bootstrapped, and to reset the roles/weights configuration by updating the 
replicated log.






[jira] [Created] (MESOS-4007) Persist role information to registry

2015-11-25 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-4007:
-

 Summary: Persist role information to registry
 Key: MESOS-4007
 URL: https://issues.apache.org/jira/browse/MESOS-4007
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


Persist role information to registry across master recovery/failover.





[jira] [Commented] (MESOS-3177) Dynamic roles/weights configuration at runtime

2015-11-24 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024538#comment-15024538
 ] 

Yong Qiao Wang commented on MESOS-3177:
---

[~adam-mesos], can you help review the related patches with REVIEWABLE status? 
Thanks!

> Dynamic roles/weights configuration at runtime
> --
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Epic
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Commented] (MESOS-3946) Test for role management

2015-11-23 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023691#comment-15023691
 ] 

Yong Qiao Wang commented on MESOS-3946:
---

The original idea was to add the tests for the role add/update/remove endpoint 
in this ticket. Maybe we should add the tests in the related tickets rather 
than open a separate ticket. I will mark this as an invalid task later. Thanks!

> Test for role management
> 
>
> Key: MESOS-3946
> URL: https://issues.apache.org/jira/browse/MESOS-3946
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Add tests for dynamic role configuration.





[jira] [Updated] (MESOS-3942) Enhance endpoint /roles for adding a new role

2015-11-22 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3942:
--
Summary: Enhance endpoint /roles for adding a new role  (was: Enhance 
endpoint /roles for adding/updating/removing role)

> Enhance endpoint /roles for adding a new role
> -
>
> Key: MESOS-3942
> URL: https://issues.apache.org/jira/browse/MESOS-3942
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint /roles to support 
> adding/updating/deleting roles, as outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Updated] (MESOS-3942) Enhance endpoint /roles for adding a new role

2015-11-22 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3942:
--
Description: 
In this ticket, we will enhance the existing HTTP endpoint /roles to support 
adding a new role at runtime, as outlined in the Design Doc: 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#

  was:
In this ticket, we will enhance the existing HTTP endpoint /roles to support 
adding/updating/deleting roles, as outlined in the Design Doc: 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#


> Enhance endpoint /roles for adding a new role
> -
>
> Key: MESOS-3942
> URL: https://issues.apache.org/jira/browse/MESOS-3942
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint /roles to support 
> adding a new role at runtime, as outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Created] (MESOS-3980) Enhance endpoint /roles for removing a role

2015-11-22 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3980:
-

 Summary: Enhance endpoint /roles for removing a role
 Key: MESOS-3980
 URL: https://issues.apache.org/jira/browse/MESOS-3980
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


In this ticket, we will enhance the existing HTTP endpoint /roles to support 
removing a role at runtime, as outlined in the Design Doc: 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Commented] (MESOS-3944) Move RoleInfo message out of allocator.proto

2015-11-18 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15010456#comment-15010456
 ] 

Yong Qiao Wang commented on MESOS-3944:
---

Append the RR: https://reviews.apache.org/r/40431/

> Move RoleInfo message out of allocator.proto 
> -
>
> Key: MESOS-3944
> URL: https://issues.apache.org/jira/browse/MESOS-3944
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Currently the role protobuf is defined in allocator.proto. We will move it 
> out and define it in a separate package; this protobuf message will serve as 
> the internal representation for role-related information (e.g. for 
> persisting roles).





[jira] [Created] (MESOS-3947) Authenticate /roles request

2015-11-18 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3947:
-

 Summary: Authenticate /roles request
 Key: MESOS-3947
 URL: https://issues.apache.org/jira/browse/MESOS-3947
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang


/roles requests, except those using the GET method, need to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.





[jira] [Created] (MESOS-3945) Add operator documentation for role management

2015-11-18 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3945:
-

 Summary: Add operator documentation for role management
 Key: MESOS-3945
 URL: https://issues.apache.org/jira/browse/MESOS-3945
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


Add an operator guide for role management which describes basic usage of the 
/roles endpoint.





[jira] [Assigned] (MESOS-3947) Authenticate /roles request

2015-11-18 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-3947:
-

Assignee: Yong Qiao Wang

> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles requests, except those using the GET method, need to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.





[jira] [Created] (MESOS-3946) Test for role management

2015-11-18 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3946:
-

 Summary: Test for role management
 Key: MESOS-3946
 URL: https://issues.apache.org/jira/browse/MESOS-3946
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


Add tests for dynamic role configuration.





[jira] [Created] (MESOS-3948) Authorize /roles request

2015-11-18 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3948:
-

 Summary: Authorize /roles request
 Key: MESOS-3948
 URL: https://issues.apache.org/jira/browse/MESOS-3948
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang


When a /roles request is made, the updated role should be authorized.

This ticket will authorize /roles requests with ACLs. 





[jira] [Created] (MESOS-3956) Update Allocator interface to support dynamic roles

2015-11-18 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3956:
-

 Summary: Update Allocator interface to support dynamic roles
 Key: MESOS-3956
 URL: https://issues.apache.org/jira/browse/MESOS-3956
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


An allocator should be notified when a role is being added/updated or removed. 
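
One way to picture this notification is an observer-style callback 
interface. The Python sketch below is only an illustration: the class and 
method names (`RoleObserver`, `role_added`, `role_updated`, `role_removed`, 
`RecordingAllocator`, `RoleManager`) are hypothetical and are not the Mesos 
Allocator interface, which is written in C++.

```python
class RoleObserver:
    """Hypothetical callbacks; not the real Mesos Allocator interface."""

    def role_added(self, name, weight):
        raise NotImplementedError

    def role_updated(self, name, weight):
        raise NotImplementedError

    def role_removed(self, name):
        raise NotImplementedError


class RecordingAllocator(RoleObserver):
    """Toy allocator that only mirrors the current role weights."""

    def __init__(self):
        self.weights = {}

    def role_added(self, name, weight):
        self.weights[name] = weight

    def role_updated(self, name, weight):
        self.weights[name] = weight

    def role_removed(self, name):
        self.weights.pop(name, None)


class RoleManager:
    """Owns the role table and notifies the allocator after every change."""

    def __init__(self, allocator):
        self._roles = {}
        self._allocator = allocator

    def add(self, name, weight=1.0):
        if name in self._roles:
            raise KeyError("role already exists: " + name)
        self._roles[name] = weight
        self._allocator.role_added(name, weight)

    def update(self, name, weight):
        if name not in self._roles:
            raise KeyError("unknown role: " + name)
        self._roles[name] = weight
        self._allocator.role_updated(name, weight)

    def remove(self, name):
        del self._roles[name]  # raises KeyError for an unknown role
        self._allocator.role_removed(name)
```

The point of the design is that the allocator never has to poll the role 
table; it sees each change exactly once, after the table has been updated.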



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3956) Update Allocator interface to support dynamic roles

2015-11-18 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013015#comment-15013015
 ] 

Yong Qiao Wang commented on MESOS-3956:
---

Append RR: https://reviews.apache.org/r/40469/

> Update Allocator interface to support dynamic roles
> ---
>
> Key: MESOS-3956
> URL: https://issues.apache.org/jira/browse/MESOS-3956
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> An allocator should be notified when a role is being added/updated or 
> removed. 





[jira] [Commented] (MESOS-3477) Add design doc for roles/weights configuration

2015-11-17 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15010035#comment-15010035
 ] 

Yong Qiao Wang commented on MESOS-3477:
---

Hi [~adam-mesos], I have updated this design doc to address all the comments. 
Could you help review it again? I will break down the tasks and start coding 
this week.

> Add design doc for roles/weights configuration
> --
>
> Key: MESOS-3477
> URL: https://issues.apache.org/jira/browse/MESOS-3477
> Project: Mesos
>  Issue Type: Documentation
>  Components: master
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>






[jira] [Issue Comment Deleted] (MESOS-3477) Add design doc for roles/weights configuration

2015-11-17 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3477:
--
Comment: was deleted

(was: Hi [~adam-mesos], [~cmaloney], the design doc has been updated; could you 
give it another review? Any comments are welcome. Thanks!)

> Add design doc for roles/weights configuration
> --
>
> Key: MESOS-3477
> URL: https://issues.apache.org/jira/browse/MESOS-3477
> Project: Mesos
>  Issue Type: Documentation
>  Components: master
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>






[jira] [Updated] (MESOS-3477) Add design doc for roles/weights configuration

2015-11-17 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3477:
--
Shepherd: Adam B

> Add design doc for roles/weights configuration
> --
>
> Key: MESOS-3477
> URL: https://issues.apache.org/jira/browse/MESOS-3477
> Project: Mesos
>  Issue Type: Documentation
>  Components: master
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>






[jira] [Updated] (MESOS-3791) Introduce HTTP endpoints for Role management

2015-11-17 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3791:
--
Description: 
Will implement the HTTP endpoints for role management as outlined in the Design 
Doc: 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#



  was:Currently, there is already an endpoint named /roles in Mesos, which is 
used to query all roles information, in this JIRA, we will extend this endpoint 
to also support add, remove and update actions. It means that we will have a 
single REST-like endpoint with multiple http verbs to distinguish between 
different actions. 


> Introduce HTTP endpoints for Role management
> 
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Will implement the HTTP endpoints for role management as outlined in the 
> Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Updated] (MESOS-3791) Enhance the existing HTTP endpoint /roles

2015-11-17 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3791:
--
Summary: Enhance the existing HTTP endpoint /roles  (was: Introduce HTTP 
endpoints for Role management)

> Enhance the existing HTTP endpoint /roles
> -
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Will implement the HTTP endpoints for role management as outlined in the 
> Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Updated] (MESOS-3791) Enhance the existing HTTP endpoint /roles

2015-11-17 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3791:
--
Description: 
In this ticket, we will enhance the existing HTTP endpoint to query roles as 
outlined in the Design Doc: 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#



  was:
Will implement the HTTP endpoints for role management as outlined in the Design 
Doc: 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#




> Enhance the existing HTTP endpoint /roles
> -
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint to query roles as 
> outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Created] (MESOS-3943) Dynamic roles/weights support in allocator

2015-11-17 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3943:
-

 Summary: Dynamic roles/weights support in allocator
 Key: MESOS-3943
 URL: https://issues.apache.org/jira/browse/MESOS-3943
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


The Mesos allocator should be aware of role changes, which include adding, 
updating, and deleting a role. In this ticket, we will extend the allocator 
interface based on the design: 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#
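
As a rough illustration of what "role-aware" could mean for an allocator, here 
is a minimal sketch. It is not the Mesos allocator interface: the class name 
RoleAwareAllocator and the methods addRole, updateRole, removeRole, hasRole and 
weight are hypothetical names chosen for this example, and the design doc 
remains the authoritative description.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch only: NOT the actual Mesos allocator interface.
// It tracks a weight per role and exposes the three role changes named
// in the ticket (add, update, remove).
class RoleAwareAllocator {
public:
  // Registers a new role. Returns false if the role already exists.
  bool addRole(const std::string& role, double weight) {
    return weights_.emplace(role, weight).second;
  }

  // Changes the weight of an existing role. Returns false if unknown.
  bool updateRole(const std::string& role, double weight) {
    std::map<std::string, double>::iterator it = weights_.find(role);
    if (it == weights_.end()) {
      return false;
    }
    it->second = weight;
    return true;
  }

  // Forgets a role. Returns false if the role was not known.
  bool removeRole(const std::string& role) {
    return weights_.erase(role) > 0;
  }

  bool hasRole(const std::string& role) const {
    return weights_.count(role) > 0;
  }

  // Precondition: hasRole(role) is true.
  double weight(const std::string& role) const {
    return weights_.at(role);
  }

private:
  std::map<std::string, double> weights_;
};
```

Returning a bool from each call keeps the sketch small; a real interface would 
more likely report errors in whatever way the surrounding code base already 
does.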





[jira] [Assigned] (MESOS-3944) Move RoleInfo message out of allocator.proto

2015-11-17 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-3944:
-

Assignee: Yong Qiao Wang

> Move RoleInfo message out of allocator.proto 
> -
>
> Key: MESOS-3944
> URL: https://issues.apache.org/jira/browse/MESOS-3944
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Currently the role protobuf is defined in allocator.proto. We will move it out 
> and define it in a separate package, and this protobuf message will serve as 
> the internal representation for role-related information (e.g. for 
> persisting roles).





[jira] [Created] (MESOS-3944) Move RoleInfo message out of allocator.proto

2015-11-17 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3944:
-

 Summary: Move RoleInfo message out of allocator.proto 
 Key: MESOS-3944
 URL: https://issues.apache.org/jira/browse/MESOS-3944
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang


Currently the role protobuf is defined in allocator.proto. We will move it out 
and define it in a separate package, and this protobuf message will serve as 
the internal representation for role-related information (e.g. for persisting 
roles).





[jira] [Commented] (MESOS-3791) Enhance the existing HTTP endpoint /roles

2015-11-17 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15010124#comment-15010124
 ] 

Yong Qiao Wang commented on MESOS-3791:
---

Here is the RR: https://reviews.apache.org/r/40424/

Hey [~adam-mesos], could you please review this change?

> Enhance the existing HTTP endpoint /roles
> -
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint to query roles as 
> outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Updated] (MESOS-3791) Enhance the existing HTTP endpoint /roles

2015-11-17 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3791:
--
Issue Type: Task  (was: Bug)

> Enhance the existing HTTP endpoint /roles
> -
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint to query roles as 
> outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Created] (MESOS-3942) Enhance endpoint /roles for adding/updating/removing role

2015-11-17 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3942:
-

 Summary: Enhance endpoint /roles for adding/updating/removing role
 Key: MESOS-3942
 URL: https://issues.apache.org/jira/browse/MESOS-3942
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


In this ticket, we will enhance the existing HTTP endpoint /roles so that it 
can add, update, and delete roles, as outlined in the Design Doc: 
https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#





[jira] [Commented] (MESOS-2936) Create a design document for Quota support in Master

2015-11-16 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15007932#comment-15007932
 ] 

Yong Qiao Wang commented on MESOS-2936:
---

Hey [~alexr] and [~js84], as we discussed above, roles should only be added 
via the /roles endpoint, and the /quota endpoint should not modify roles. So I 
think we should remove the TODO comment at src/master/quota_handler.cpp:204, 
shown below. 

{code}
  // TODO(alexr): Once we are able to dynamically add roles, we should stop
  // checking whether the requested role is known to the master, because an
  // operator may set quota for a role that is about to be introduced.
  if (!master->roles.contains(create.get().role())) {
    return BadRequest("Failed to validate set quota request query string: ('" +
                      request.body +"')': Unknown role: '" +
                      create.get().role() + "'");
  }
{code}
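
To make the intended behaviour concrete (the quota endpoint only validates the 
role and never creates it), here is a self-contained sketch. It is not the 
real quota handler: ValidationResult and validateQuotaRole are hypothetical 
names, and a plain std::set stands in for the master's role registry.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for an HTTP-level result; not a Mesos type.
struct ValidationResult {
  bool ok;
  std::string error;  // Empty when ok is true.
};

// Rejects a quota request whose role is not already known. The set of
// known roles is only read here: creating a role is left to the /roles
// endpoint, as argued in the comment above.
ValidationResult validateQuotaRole(const std::set<std::string>& knownRoles,
                                   const std::string& role) {
  ValidationResult result;
  if (knownRoles.count(role) == 0) {
    result.ok = false;
    result.error = "Unknown role: '" + role + "'";
    return result;
  }
  result.ok = true;
  return result;
}
```

With this split, an operator who wants quota for a new role first creates the 
role and then sets the quota, so the existence check can stay in place.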



> Create a design document for Quota support in Master
> 
>
> Key: MESOS-2936
> URL: https://issues.apache.org/jira/browse/MESOS-2936
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Create a design document for the Quota feature support in Mesos Master 
> (excluding allocator) to be shared with the Mesos community.
> Design Doc:
> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit?usp=sharing





[jira] [Commented] (MESOS-2936) Create a design document for Quota support in Master

2015-11-16 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15008149#comment-15008149
 ] 

Yong Qiao Wang commented on MESOS-2936:
---

Thanks very much. I understand now. 

> Create a design document for Quota support in Master
> 
>
> Key: MESOS-2936
> URL: https://issues.apache.org/jira/browse/MESOS-2936
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Create a design document for the Quota feature support in Mesos Master 
> (excluding allocator) to be shared with the Mesos community.
> Design Doc:
> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit?usp=sharing





[jira] [Commented] (MESOS-2864) Master should not change the state of a terminal task if it receives another terminal update

2015-10-29 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980342#comment-14980342
 ] 

Yong Qiao Wang commented on MESOS-2864:
---

This issue is fixed by setting the uuid when the task is already terminated.

RR: https://reviews.apache.org/r/39754/
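
The behaviour requested in the ticket description (keep the first terminal 
state, still forward later updates) can be sketched in isolation as follows. 
This is not the patch in the RR: TaskState, Task, isTerminal and 
applyStatusUpdate are simplified, hypothetical stand-ins for illustration.

```cpp
#include <cassert>

// Simplified, hypothetical task states; not the Mesos protobuf enum.
enum class TaskState { STAGING, RUNNING, FINISHED, FAILED, KILLED, LOST };

bool isTerminal(TaskState state) {
  return state == TaskState::FINISHED || state == TaskState::FAILED ||
         state == TaskState::KILLED || state == TaskState::LOST;
}

struct Task {
  TaskState state;
};

// Applies a status update to the stored task. Returns true if the stored
// state changed. A task that is already terminal keeps its state; the
// caller is still expected to forward the update in both cases.
bool applyStatusUpdate(Task& task, TaskState updated) {
  if (isTerminal(task.state)) {
    return false;
  }
  task.state = updated;
  return true;
}
```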

> Master should not change the state of a terminal task if it receives another 
> terminal update
> 
>
> Key: MESOS-2864
> URL: https://issues.apache.org/jira/browse/MESOS-2864
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
> Fix For: 0.26.0
>
>
> Currently, when the master receives a terminal update for an already 
> terminated (but unacknowledged) task it changes the state to the latest 
> update. This is confusing because the slave doesn't change the state of the 
> task in such a case. Master should just forward the update without changing 
> the task state.





[jira] [Updated] (MESOS-3791) Introduce HTTP endpoints for Role management

2015-10-22 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3791:
--
Shepherd: Adam B

> Introduce HTTP endpoints for Role management
> 
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Mesos already has an endpoint named /roles, which is used to query 
> information about all roles. In this JIRA, we will extend this endpoint to 
> also support add, remove, and update actions. This means that we will have a 
> single REST-like endpoint that uses different HTTP verbs to distinguish 
> between the actions. 





[jira] [Updated] (MESOS-3791) Introduce HTTP endpoints for Role management

2015-10-22 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3791:
--
Description: Mesos already has an endpoint named /roles, which is used to 
query information about all roles. In this JIRA, we will extend this endpoint 
to also support add, remove, and update actions. This means that we will have 
a single REST-like endpoint that uses different HTTP verbs to distinguish 
between the actions. 
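
A minimal sketch of the "one endpoint, several HTTP verbs" idea follows. The 
mapping of verbs to actions is an assumption made for this example (the 
description does not say which verb maps to which action), and RoleAction and 
actionForMethod are hypothetical names, not Mesos code.

```cpp
#include <cassert>
#include <string>

// Hypothetical actions a single /roles endpoint could support.
enum class RoleAction { QUERY, ADD, UPDATE, REMOVE, UNSUPPORTED };

// Chooses the action from the HTTP method of the request. The mapping
// below is assumed for illustration only.
RoleAction actionForMethod(const std::string& method) {
  if (method == "GET") {
    return RoleAction::QUERY;
  }
  if (method == "POST") {
    return RoleAction::ADD;
  }
  if (method == "PUT") {
    return RoleAction::UPDATE;
  }
  if (method == "DELETE") {
    return RoleAction::REMOVE;
  }
  return RoleAction::UNSUPPORTED;
}
```

A handler built this way would typically answer an unsupported verb with a 
"405 Method Not Allowed" style response rather than ignoring the request.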

> Introduce HTTP endpoints for Role management
> 
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Mesos already has an endpoint named /roles, which is used to query 
> information about all roles. In this JIRA, we will extend this endpoint to 
> also support add, remove, and update actions. This means that we will have a 
> single REST-like endpoint that uses different HTTP verbs to distinguish 
> between the actions. 





[jira] [Updated] (MESOS-3177) Dynamic roles/weights configuration at runtime

2015-10-22 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3177:
--
Issue Type: Epic  (was: Improvement)
   Summary: Dynamic roles/weights configuration at runtime  (was: Make 
Mesos own configuration of roles/weights)

> Dynamic roles/weights configuration at runtime
> --
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Epic
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Created] (MESOS-3791) Introduce HTTP endpoints for Role management

2015-10-22 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3791:
-

 Summary: Introduce HTTP endpoints for Role management
 Key: MESOS-3791
 URL: https://issues.apache.org/jira/browse/MESOS-3791
 Project: Mesos
  Issue Type: Bug
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang








[jira] [Assigned] (MESOS-2092) Make ACLs dynamic

2015-10-21 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-2092:
-

Assignee: Yong Qiao Wang

> Make ACLs dynamic
> -
>
> Key: MESOS-2092
> URL: https://issues.apache.org/jira/browse/MESOS-2092
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Alexander Rukletsov
>Assignee: Yong Qiao Wang
>  Labels: mesosphere, newbie
>
> Master loads ACLs once during its launch and there is no way to update them 
> in a running master. Making them dynamic will allow for updating ACLs on the 
> fly, for example granting a new framework necessary rights.





[jira] [Commented] (MESOS-1832) Slave should accept PingSlaveMessage but not "PING" message.

2015-10-21 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966799#comment-14966799
 ] 

Yong Qiao Wang commented on MESOS-1832:
---

[~vinodkone], here is the related RR for this ticket: 
https://reviews.apache.org/r/39516/

> Slave should accept PingSlaveMessage but not "PING" message.
> 
>
> Key: MESOS-1832
> URL: https://issues.apache.org/jira/browse/MESOS-1832
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> The slave handles both the "PING" message and PingSlaveMessage until 0.22.0 
> for backwards compatibility (https://reviews.apache.org/r/25867/).
> In 0.23.0, the slave no longer needs to handle "PING".





[jira] [Commented] (MESOS-2936) Create a design document for Quota support in Master

2015-10-21 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966524#comment-14966524
 ] 

Yong Qiao Wang commented on MESOS-2936:
---

Hey [~alexr], we are planning to add a separate endpoint, /roles, to add and 
remove roles in the Role Dynamic Configuration project (MESOS-3177). I found 
that the quota request in the quota design can also be used to add a role, so I 
think there is some overlap between these two projects. In my understanding, 
quota should be viewed as an attribute of a role, so in the quota management 
project the quota update action (PUT) should be enough to manage 
(add/update/remove) the quota of an existing role. If the role does not exist, 
you need to create it with the /roles endpoint before configuring its quota. So 
maybe we need to remove the quota request action from the quota management 
endpoint. [~alexr], any thoughts on this?

In addition, you are welcome to review the dynamic role configuration design; 
your comments are important to me. Thanks in advance.

> Create a design document for Quota support in Master
> 
>
> Key: MESOS-2936
> URL: https://issues.apache.org/jira/browse/MESOS-2936
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Create a design document for the Quota feature support in Mesos Master 
> (excluding allocator) to be shared with the Mesos community.
> Design Doc:
> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit?usp=sharing





[jira] [Commented] (MESOS-3477) Add design doc for roles/weights configuration

2015-10-21 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966441#comment-14966441
 ] 

Yong Qiao Wang commented on MESOS-3477:
---

Hi [~adam-mesos], [~cmaloney], the design doc has been updated. Could you 
please review it again? Any comments are welcome. Thanks!

> Add design doc for roles/weights configuration
> --
>
> Key: MESOS-3477
> URL: https://issues.apache.org/jira/browse/MESOS-3477
> Project: Mesos
>  Issue Type: Documentation
>  Components: master
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>






[jira] [Comment Edited] (MESOS-2255) SlaveRecoveryTest/0.MasterFailover is flaky

2015-10-15 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958448#comment-14958448
 ] 

Yong Qiao Wang edited comment on MESOS-2255 at 10/15/15 7:12 AM:
-

[~xujyan], I ran the test case SlaveRecoveryTest/0.MasterFailover again on 
OS X (10.10.4), and it passed:

{noformat:title=}
Yongs-MacBook-Pro:bin yqwyq$ ./mesos-tests.sh --gtest_filter=SlaveRecoveryTest/0.MasterFailover
..
..
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SlaveRecoveryTest/0, where TypeParam = mesos::internal::slave::MesosContainerizer
[ RUN      ] SlaveRecoveryTest/0.MasterFailover
I1015 14:58:55.538914 1939460864 exec.cpp:136] Version: 0.26.0
..
..
[       OK ] SlaveRecoveryTest/0.MasterFailover (1397 ms)
[----------] 1 test from SlaveRecoveryTest/0 (1397 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1406 ms total)
[  PASSED  ] 1 test.
{noformat}

Could you let me know on which OS/version you ran this test case? I need to 
reproduce this problem. Thanks!


was (Author: jamesyongqiaowang):
[~xujyan], I ran the test case SlaveRecoveryTest/0.MasterFailover again on OS 
X(10.10.4), but I found it work well:

{noformat:title=}
Yongs-MacBook-Pro:bin yqwyq$ ./mesos-tests.sh 
--gtest_filter=SlaveRecoveryTest/0.MasterFailover
..
..
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from SlaveRecoveryTest/0, where TypeParam = 
mesos::internal::slave::MesosContainerizer
[ RUN  ] SlaveRecoveryTest/0.MasterFailover
I1015 14:58:55.538914 1939460864 exec.cpp:136] Version: 0.26.0
..
..
[   OK ] SlaveRecoveryTest/0.MasterFailover (1397 ms)
[--] 1 test from SlaveRecoveryTest/0 (1397 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (1406 ms total)
[  PASSED  ] 1 test.
{noformat}

Could you let me know which OS/version you ran this case?

> SlaveRecoveryTest/0.MasterFailover is flaky
> ---
>
> Key: MESOS-2255
> URL: https://issues.apache.org/jira/browse/MESOS-2255
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Yan Xu
>Assignee: Yong Qiao Wang
>  Labels: flaky, twitter
>
> {noformat:title=}
> [ RUN  ] SlaveRecoveryTest/0.MasterFailover
> Using temporary directory '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0'
> I0123 07:45:49.818686 17634 leveldb.cpp:176] Opened db in 31.195549ms
> I0123 07:45:49.821962 17634 leveldb.cpp:183] Compacted db in 3.190936ms
> I0123 07:45:49.822049 17634 leveldb.cpp:198] Created db iterator in 47324ns
> I0123 07:45:49.822069 17634 leveldb.cpp:204] Seeked to beginning of db in 
> 2038ns
> I0123 07:45:49.822084 17634 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 484ns
> I0123 07:45:49.822160 17634 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0123 07:45:49.824241 17660 recover.cpp:449] Starting replica recovery
> I0123 07:45:49.825217 17660 recover.cpp:475] Replica is in EMPTY status
> I0123 07:45:49.827020 17660 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I0123 07:45:49.827453 17659 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I0123 07:45:49.828047 17659 recover.cpp:566] Updating replica status to 
> STARTING
> I0123 07:45:49.838543 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 10.24963ms
> I0123 07:45:49.838580 17659 replica.cpp:323] Persisted replica status to 
> STARTING
> I0123 07:45:49.848836 17659 recover.cpp:475] Replica is in STARTING status
> I0123 07:45:49.850039 17659 replica.cpp:641] Replica in STARTING status 
> received a broadcasted recover request
> I0123 07:45:49.850286 17659 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I0123 07:45:49.850754 17659 recover.cpp:566] Updating replica status to VOTING
> I0123 07:45:49.853698 17655 master.cpp:262] Master 
> 20150123-074549-16842879-44955-17634 (utopic) started on 127.0.1.1:44955
> I0123 07:45:49.853981 17655 master.cpp:308] Master only allowing 
> authenticated frameworks to register
> I0123 07:45:49.853997 17655 master.cpp:313] Master only allowing 
> authenticated slaves to register
> I0123 07:45:49.854038 17655 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0/credentials'
> I0123 07:45:49.854557 17655 master.cpp:357] Authorization enabled
> I0123 07:45:49.859633 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 8.742923ms
> I0123 07:45:49.859853 17659 replica.cpp:323] 

[jira] [Commented] (MESOS-2255) SlaveRecoveryTest/0.MasterFailover is flaky

2015-10-15 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958448#comment-14958448
 ] 

Yong Qiao Wang commented on MESOS-2255:
---

[~xujyan], I ran the test case SlaveRecoveryTest/0.MasterFailover again on 
OS X (10.10.4), and it passed:

{noformat:title=}
Yongs-MacBook-Pro:bin yqwyq$ ./mesos-tests.sh --gtest_filter=SlaveRecoveryTest/0.MasterFailover
..
..
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SlaveRecoveryTest/0, where TypeParam = mesos::internal::slave::MesosContainerizer
[ RUN      ] SlaveRecoveryTest/0.MasterFailover
I1015 14:58:55.538914 1939460864 exec.cpp:136] Version: 0.26.0
..
..
[       OK ] SlaveRecoveryTest/0.MasterFailover (1397 ms)
[----------] 1 test from SlaveRecoveryTest/0 (1397 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1406 ms total)
[  PASSED  ] 1 test.
{noformat}

Could you let me know on which OS/version you ran this test case?

> SlaveRecoveryTest/0.MasterFailover is flaky
> ---
>
> Key: MESOS-2255
> URL: https://issues.apache.org/jira/browse/MESOS-2255
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Yan Xu
>Assignee: Yong Qiao Wang
>  Labels: flaky, twitter
>
> {noformat:title=}
> [ RUN  ] SlaveRecoveryTest/0.MasterFailover
> Using temporary directory '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0'
> I0123 07:45:49.818686 17634 leveldb.cpp:176] Opened db in 31.195549ms
> I0123 07:45:49.821962 17634 leveldb.cpp:183] Compacted db in 3.190936ms
> I0123 07:45:49.822049 17634 leveldb.cpp:198] Created db iterator in 47324ns
> I0123 07:45:49.822069 17634 leveldb.cpp:204] Seeked to beginning of db in 
> 2038ns
> I0123 07:45:49.822084 17634 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 484ns
> I0123 07:45:49.822160 17634 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0123 07:45:49.824241 17660 recover.cpp:449] Starting replica recovery
> I0123 07:45:49.825217 17660 recover.cpp:475] Replica is in EMPTY status
> I0123 07:45:49.827020 17660 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I0123 07:45:49.827453 17659 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I0123 07:45:49.828047 17659 recover.cpp:566] Updating replica status to 
> STARTING
> I0123 07:45:49.838543 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 10.24963ms
> I0123 07:45:49.838580 17659 replica.cpp:323] Persisted replica status to 
> STARTING
> I0123 07:45:49.848836 17659 recover.cpp:475] Replica is in STARTING status
> I0123 07:45:49.850039 17659 replica.cpp:641] Replica in STARTING status 
> received a broadcasted recover request
> I0123 07:45:49.850286 17659 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I0123 07:45:49.850754 17659 recover.cpp:566] Updating replica status to VOTING
> I0123 07:45:49.853698 17655 master.cpp:262] Master 
> 20150123-074549-16842879-44955-17634 (utopic) started on 127.0.1.1:44955
> I0123 07:45:49.853981 17655 master.cpp:308] Master only allowing 
> authenticated frameworks to register
> I0123 07:45:49.853997 17655 master.cpp:313] Master only allowing 
> authenticated slaves to register
> I0123 07:45:49.854038 17655 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0/credentials'
> I0123 07:45:49.854557 17655 master.cpp:357] Authorization enabled
> I0123 07:45:49.859633 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 8.742923ms
> I0123 07:45:49.859853 17659 replica.cpp:323] Persisted replica status to 
> VOTING
> I0123 07:45:49.860327 17658 recover.cpp:580] Successfully joined the Paxos 
> group
> I0123 07:45:49.860703 17654 recover.cpp:464] Recover process terminated
> I0123 07:45:49.859591 17655 master.cpp:1219] The newly elected leader is 
> master@127.0.1.1:44955 with id 20150123-074549-16842879-44955-17634
> I0123 07:45:49.864702 17655 master.cpp:1232] Elected as the leading master!
> I0123 07:45:49.864904 17655 master.cpp:1050] Recovering from registrar
> I0123 07:45:49.865406 17660 registrar.cpp:313] Recovering registrar
> I0123 07:45:49.866576 17660 log.cpp:660] Attempting to start the writer
> I0123 07:45:49.868638 17658 replica.cpp:477] Replica received implicit 
> promise request with proposal 1
> I0123 07:45:49.872521 17658 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 3.848859ms
> I0123 07:45:49.872555 17658 replica.cpp:345] Persisted promised to 1
> I0123 07:45:49.873769 17661 coordinator.cpp:230] Coordinator attemping to 
> fill 

[jira] [Comment Edited] (MESOS-2255) SlaveRecoveryTest/0.MasterFailover is flaky

2015-10-15 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958448#comment-14958448
 ] 

Yong Qiao Wang edited comment on MESOS-2255 at 10/15/15 7:11 AM:
-

[~xujyan], I ran the test case SlaveRecoveryTest/0.MasterFailover again on 
OS X (10.10.4), and it passed:

{noformat:title=}
Yongs-MacBook-Pro:bin yqwyq$ ./mesos-tests.sh --gtest_filter=SlaveRecoveryTest/0.MasterFailover
..
..
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SlaveRecoveryTest/0, where TypeParam = mesos::internal::slave::MesosContainerizer
[ RUN      ] SlaveRecoveryTest/0.MasterFailover
I1015 14:58:55.538914 1939460864 exec.cpp:136] Version: 0.26.0
..
..
[       OK ] SlaveRecoveryTest/0.MasterFailover (1397 ms)
[----------] 1 test from SlaveRecoveryTest/0 (1397 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1406 ms total)
[  PASSED  ] 1 test.
{noformat}

Could you let me know on which OS/version you ran this test case?


was (Author: jamesyongqiaowang):
[~xujyan], I ran the test case SlaveRecoveryTest/0.MasterFailover again on OS 
X(10.10.4), but I found it work well:

{noformat:title=}
Yongs-MacBook-Pro:bin yqwyq$ ./mesos-tests.sh 
--gtest_filter=SlaveRecoveryTest/0.MasterFailover
..
..
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from SlaveRecoveryTest/0, where TypeParam = 
mesos::internal::slave::MesosContainerizer
[ RUN  ] SlaveRecoveryTest/0.MasterFailover
I1015 14:58:55.538914 1939460864 exec.cpp:136] Version: 0.26.0
..
..
[   OK ] SlaveRecoveryTest/0.MasterFailover (1397 ms)
[--] 1 test from SlaveRecoveryTest/0 (1397 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (1406 ms total)
[  PASSED  ] 1 test.
{noformat:title=}

Could you let me know which OS/version you ran this case?

> SlaveRecoveryTest/0.MasterFailover is flaky
> ---
>
> Key: MESOS-2255
> URL: https://issues.apache.org/jira/browse/MESOS-2255
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Yan Xu
>Assignee: Yong Qiao Wang
>  Labels: flaky, twitter
>
> {noformat:title=}
> [ RUN  ] SlaveRecoveryTest/0.MasterFailover
> Using temporary directory '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0'
> I0123 07:45:49.818686 17634 leveldb.cpp:176] Opened db in 31.195549ms
> I0123 07:45:49.821962 17634 leveldb.cpp:183] Compacted db in 3.190936ms
> I0123 07:45:49.822049 17634 leveldb.cpp:198] Created db iterator in 47324ns
> I0123 07:45:49.822069 17634 leveldb.cpp:204] Seeked to beginning of db in 
> 2038ns
> I0123 07:45:49.822084 17634 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 484ns
> I0123 07:45:49.822160 17634 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0123 07:45:49.824241 17660 recover.cpp:449] Starting replica recovery
> I0123 07:45:49.825217 17660 recover.cpp:475] Replica is in EMPTY status
> I0123 07:45:49.827020 17660 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I0123 07:45:49.827453 17659 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I0123 07:45:49.828047 17659 recover.cpp:566] Updating replica status to 
> STARTING
> I0123 07:45:49.838543 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 10.24963ms
> I0123 07:45:49.838580 17659 replica.cpp:323] Persisted replica status to 
> STARTING
> I0123 07:45:49.848836 17659 recover.cpp:475] Replica is in STARTING status
> I0123 07:45:49.850039 17659 replica.cpp:641] Replica in STARTING status 
> received a broadcasted recover request
> I0123 07:45:49.850286 17659 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I0123 07:45:49.850754 17659 recover.cpp:566] Updating replica status to VOTING
> I0123 07:45:49.853698 17655 master.cpp:262] Master 
> 20150123-074549-16842879-44955-17634 (utopic) started on 127.0.1.1:44955
> I0123 07:45:49.853981 17655 master.cpp:308] Master only allowing 
> authenticated frameworks to register
> I0123 07:45:49.853997 17655 master.cpp:313] Master only allowing 
> authenticated slaves to register
> I0123 07:45:49.854038 17655 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0/credentials'
> I0123 07:45:49.854557 17655 master.cpp:357] Authorization enabled
> I0123 07:45:49.859633 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 8.742923ms
> I0123 07:45:49.859853 17659 replica.cpp:323] Persisted replica status to 
> VOTING
> 

[jira] [Commented] (MESOS-3022) export additional metrics from scheduler driver

2015-10-15 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958379#comment-14958379
 ] 

Yong Qiao Wang commented on MESOS-3022:
---

[~bmahler], per my investigation, the following per-type message counters 
should be added to the scheduler metrics:
{code}
process::metrics::Counter messages_registered_framework;
process::metrics::Counter messages_reregistered_framework;
process::metrics::Counter messages_resource_offers;
process::metrics::Counter messages_rescind_offer;
process::metrics::Counter messages_status_update;
process::metrics::Counter messages_executor_to_framework;
process::metrics::Counter messages_slave_lost;
process::metrics::Counter messages_framework_error_messages;
{code}

If I have missed anything, could you let me know?
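
As a rough illustration of counting messages by type, here is a minimal Python sketch. It is not the scheduler driver's C++ implementation: the `MessageCounters` class, its `record` and `snapshot` methods, and the `scheduler/` key prefix are assumptions made for illustration. Only the counter names come from the list in this comment.

```python
# Counter names are taken from the proposed list; everything else is hypothetical.
MESSAGE_TYPES = (
    "messages_registered_framework",
    "messages_reregistered_framework",
    "messages_resource_offers",
    "messages_rescind_offer",
    "messages_status_update",
    "messages_executor_to_framework",
    "messages_slave_lost",
    "messages_framework_error_messages",
)


class MessageCounters:
    """Hypothetical per-type message counters (not the Mesos API)."""

    def __init__(self):
        # Start every known counter at zero so it shows up in snapshots.
        self._counts = {name: 0 for name in MESSAGE_TYPES}

    def record(self, message_type):
        # Reject unknown types instead of silently creating new counters.
        if message_type not in self._counts:
            raise KeyError("unknown message type: " + message_type)
        self._counts[message_type] += 1

    def snapshot(self):
        # Assumed key layout, mirroring the existing "scheduler/..." metrics.
        return {"scheduler/" + name: count
                for name, count in self._counts.items()}
```

Starting every counter at zero means a snapshot always lists all message types, which makes a missing message type easy to tell apart from one that was never received.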

> export additional metrics from scheduler driver
> ---
>
> Key: MESOS-3022
> URL: https://issues.apache.org/jira/browse/MESOS-3022
> Project: Mesos
>  Issue Type: Improvement
>Reporter: David Robinson
>Assignee: Yong Qiao Wang
>Priority: Minor
>
> The scheduler driver only exports the metrics below, but ideally it would 
> export its version and a count of messages by message type.
> {code}
> $ curl -s localhost:20902/metrics/snapshot | python -m json.tool
> {
> "scheduler/event_queue_dispatches": 0,
> "scheduler/event_queue_messages": 0,
> "system/cpus_total": 24,
> "system/load_15min": 0.49,
> "system/load_1min": 0.36,
> "system/load_5min": 0.46,
> "system/mem_free_bytes": 269713408,
> "system/mem_total_bytes": 33529266176
> }
> {code}
> The scheduler driver version could be used during troubleshooting to identify 
> frameworks that are using an old, potentially backwards incompatible, 
> scheduler driver (eg, a framework hasn't been restarted after a Mesos deploy, 
> so it still links against an old incompatible libmesos).
> A count of messages by message type would help identify a problem w/ a 
> specific feature, eg task reconciliation.
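
To make the gap concrete, the snapshot quoted in the description can be filtered down to the metrics the scheduler driver itself exports. This is a minimal Python sketch under stated assumptions: the `scheduler_metrics` helper is a hypothetical name, and the JSON is copied from the description rather than fetched from a live endpoint.

```python
import json

# Sample snapshot copied from the issue description.
SNAPSHOT = """
{
    "scheduler/event_queue_dispatches": 0,
    "scheduler/event_queue_messages": 0,
    "system/cpus_total": 24,
    "system/load_15min": 0.49,
    "system/load_1min": 0.36,
    "system/load_5min": 0.46,
    "system/mem_free_bytes": 269713408,
    "system/mem_total_bytes": 33529266176
}
"""


def scheduler_metrics(snapshot_json):
    """Return only the keys under the "scheduler/" prefix (hypothetical helper)."""
    metrics = json.loads(snapshot_json)
    return {key: value for key, value in metrics.items()
            if key.startswith("scheduler/")}
```

Applied to the sample, only the two event queue metrics remain, with no version and no per-type message counts.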





[jira] [Assigned] (MESOS-2255) SlaveRecoveryTest/0.MasterFailover is flaky

2015-10-14 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-2255:
-

Assignee: Yong Qiao Wang

> SlaveRecoveryTest/0.MasterFailover is flaky
> ---
>
> Key: MESOS-2255
> URL: https://issues.apache.org/jira/browse/MESOS-2255
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Yan Xu
>Assignee: Yong Qiao Wang
>  Labels: flaky, twitter
>
> {noformat:title=}
> [ RUN  ] SlaveRecoveryTest/0.MasterFailover
> Using temporary directory '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0'
> I0123 07:45:49.818686 17634 leveldb.cpp:176] Opened db in 31.195549ms
> I0123 07:45:49.821962 17634 leveldb.cpp:183] Compacted db in 3.190936ms
> I0123 07:45:49.822049 17634 leveldb.cpp:198] Created db iterator in 47324ns
> I0123 07:45:49.822069 17634 leveldb.cpp:204] Seeked to beginning of db in 
> 2038ns
> I0123 07:45:49.822084 17634 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 484ns
> I0123 07:45:49.822160 17634 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0123 07:45:49.824241 17660 recover.cpp:449] Starting replica recovery
> I0123 07:45:49.825217 17660 recover.cpp:475] Replica is in EMPTY status
> I0123 07:45:49.827020 17660 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I0123 07:45:49.827453 17659 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I0123 07:45:49.828047 17659 recover.cpp:566] Updating replica status to 
> STARTING
> I0123 07:45:49.838543 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 10.24963ms
> I0123 07:45:49.838580 17659 replica.cpp:323] Persisted replica status to 
> STARTING
> I0123 07:45:49.848836 17659 recover.cpp:475] Replica is in STARTING status
> I0123 07:45:49.850039 17659 replica.cpp:641] Replica in STARTING status 
> received a broadcasted recover request
> I0123 07:45:49.850286 17659 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I0123 07:45:49.850754 17659 recover.cpp:566] Updating replica status to VOTING
> I0123 07:45:49.853698 17655 master.cpp:262] Master 
> 20150123-074549-16842879-44955-17634 (utopic) started on 127.0.1.1:44955
> I0123 07:45:49.853981 17655 master.cpp:308] Master only allowing 
> authenticated frameworks to register
> I0123 07:45:49.853997 17655 master.cpp:313] Master only allowing 
> authenticated slaves to register
> I0123 07:45:49.854038 17655 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0/credentials'
> I0123 07:45:49.854557 17655 master.cpp:357] Authorization enabled
> I0123 07:45:49.859633 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 8.742923ms
> I0123 07:45:49.859853 17659 replica.cpp:323] Persisted replica status to 
> VOTING
> I0123 07:45:49.860327 17658 recover.cpp:580] Successfully joined the Paxos 
> group
> I0123 07:45:49.860703 17654 recover.cpp:464] Recover process terminated
> I0123 07:45:49.859591 17655 master.cpp:1219] The newly elected leader is 
> master@127.0.1.1:44955 with id 20150123-074549-16842879-44955-17634
> I0123 07:45:49.864702 17655 master.cpp:1232] Elected as the leading master!
> I0123 07:45:49.864904 17655 master.cpp:1050] Recovering from registrar
> I0123 07:45:49.865406 17660 registrar.cpp:313] Recovering registrar
> I0123 07:45:49.866576 17660 log.cpp:660] Attempting to start the writer
> I0123 07:45:49.868638 17658 replica.cpp:477] Replica received implicit 
> promise request with proposal 1
> I0123 07:45:49.872521 17658 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 3.848859ms
> I0123 07:45:49.872555 17658 replica.cpp:345] Persisted promised to 1
> I0123 07:45:49.873769 17661 coordinator.cpp:230] Coordinator attemping to 
> fill missing position
> I0123 07:45:49.875474 17658 replica.cpp:378] Replica received explicit 
> promise request for position 0 with proposal 2
> I0123 07:45:49.880878 17658 leveldb.cpp:343] Persisting action (8 bytes) to 
> leveldb took 5.364021ms
> I0123 07:45:49.880913 17658 replica.cpp:679] Persisted action at 0
> I0123 07:45:49.882619 17657 replica.cpp:511] Replica received write request 
> for position 0
> I0123 07:45:49.882998 17657 leveldb.cpp:438] Reading position from leveldb 
> took 150092ns
> I0123 07:45:49.886488 17657 leveldb.cpp:343] Persisting action (14 bytes) to 
> leveldb took 3.269189ms
> I0123 07:45:49.886536 17657 replica.cpp:679] Persisted action at 0
> I0123 07:45:49.887181 17657 replica.cpp:658] Replica received learned notice 
> for position 0
> I0123 07:45:49.892900 17657 leveldb.cpp:343] Persisting action (16 bytes) to 
> leveldb took 5.690093ms
> I0123 

[jira] [Commented] (MESOS-2255) SlaveRecoveryTest/0.MasterFailover is flaky

2015-10-14 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956331#comment-14956331
 ] 

Yong Qiao Wang commented on MESOS-2255:
---

I will re-run this test case and fix it if it is still a problem.

> SlaveRecoveryTest/0.MasterFailover is flaky
> ---
>
> Key: MESOS-2255
> URL: https://issues.apache.org/jira/browse/MESOS-2255
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Yan Xu
>Assignee: Yong Qiao Wang
>  Labels: flaky, twitter
>
> {noformat:title=}
> [ RUN  ] SlaveRecoveryTest/0.MasterFailover
> Using temporary directory '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0'
> I0123 07:45:49.818686 17634 leveldb.cpp:176] Opened db in 31.195549ms
> I0123 07:45:49.821962 17634 leveldb.cpp:183] Compacted db in 3.190936ms
> I0123 07:45:49.822049 17634 leveldb.cpp:198] Created db iterator in 47324ns
> I0123 07:45:49.822069 17634 leveldb.cpp:204] Seeked to beginning of db in 
> 2038ns
> I0123 07:45:49.822084 17634 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 484ns
> I0123 07:45:49.822160 17634 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0123 07:45:49.824241 17660 recover.cpp:449] Starting replica recovery
> I0123 07:45:49.825217 17660 recover.cpp:475] Replica is in EMPTY status
> I0123 07:45:49.827020 17660 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I0123 07:45:49.827453 17659 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I0123 07:45:49.828047 17659 recover.cpp:566] Updating replica status to 
> STARTING
> I0123 07:45:49.838543 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 10.24963ms
> I0123 07:45:49.838580 17659 replica.cpp:323] Persisted replica status to 
> STARTING
> I0123 07:45:49.848836 17659 recover.cpp:475] Replica is in STARTING status
> I0123 07:45:49.850039 17659 replica.cpp:641] Replica in STARTING status 
> received a broadcasted recover request
> I0123 07:45:49.850286 17659 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I0123 07:45:49.850754 17659 recover.cpp:566] Updating replica status to VOTING
> I0123 07:45:49.853698 17655 master.cpp:262] Master 
> 20150123-074549-16842879-44955-17634 (utopic) started on 127.0.1.1:44955
> I0123 07:45:49.853981 17655 master.cpp:308] Master only allowing 
> authenticated frameworks to register
> I0123 07:45:49.853997 17655 master.cpp:313] Master only allowing 
> authenticated slaves to register
> I0123 07:45:49.854038 17655 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0/credentials'
> I0123 07:45:49.854557 17655 master.cpp:357] Authorization enabled
> I0123 07:45:49.859633 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 8.742923ms
> I0123 07:45:49.859853 17659 replica.cpp:323] Persisted replica status to 
> VOTING
> I0123 07:45:49.860327 17658 recover.cpp:580] Successfully joined the Paxos 
> group
> I0123 07:45:49.860703 17654 recover.cpp:464] Recover process terminated
> I0123 07:45:49.859591 17655 master.cpp:1219] The newly elected leader is 
> master@127.0.1.1:44955 with id 20150123-074549-16842879-44955-17634
> I0123 07:45:49.864702 17655 master.cpp:1232] Elected as the leading master!
> I0123 07:45:49.864904 17655 master.cpp:1050] Recovering from registrar
> I0123 07:45:49.865406 17660 registrar.cpp:313] Recovering registrar
> I0123 07:45:49.866576 17660 log.cpp:660] Attempting to start the writer
> I0123 07:45:49.868638 17658 replica.cpp:477] Replica received implicit 
> promise request with proposal 1
> I0123 07:45:49.872521 17658 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 3.848859ms
> I0123 07:45:49.872555 17658 replica.cpp:345] Persisted promised to 1
> I0123 07:45:49.873769 17661 coordinator.cpp:230] Coordinator attemping to 
> fill missing position
> I0123 07:45:49.875474 17658 replica.cpp:378] Replica received explicit 
> promise request for position 0 with proposal 2
> I0123 07:45:49.880878 17658 leveldb.cpp:343] Persisting action (8 bytes) to 
> leveldb took 5.364021ms
> I0123 07:45:49.880913 17658 replica.cpp:679] Persisted action at 0
> I0123 07:45:49.882619 17657 replica.cpp:511] Replica received write request 
> for position 0
> I0123 07:45:49.882998 17657 leveldb.cpp:438] Reading position from leveldb 
> took 150092ns
> I0123 07:45:49.886488 17657 leveldb.cpp:343] Persisting action (14 bytes) to 
> leveldb took 3.269189ms
> I0123 07:45:49.886536 17657 replica.cpp:679] Persisted action at 0
> I0123 07:45:49.887181 17657 replica.cpp:658] Replica received learned notice 
> for position 0
> I0123 07:45:49.892900 17657 leveldb.cpp:343] 

[jira] [Comment Edited] (MESOS-2255) SlaveRecoveryTest/0.MasterFailover is flaky

2015-10-14 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956331#comment-14956331
 ] 

Yong Qiao Wang edited comment on MESOS-2255 at 10/14/15 6:06 AM:
-

I will re-run this test case and fix it if it is still a problem. [~xujyan], do 
you have any updates on this ticket?


was (Author: jamesyongqiaowang):
I will re-run this test case and fix it if it is still a problem.

> SlaveRecoveryTest/0.MasterFailover is flaky
> ---
>
> Key: MESOS-2255
> URL: https://issues.apache.org/jira/browse/MESOS-2255
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Yan Xu
>Assignee: Yong Qiao Wang
>  Labels: flaky, twitter
>
> {noformat:title=}
> [ RUN  ] SlaveRecoveryTest/0.MasterFailover
> Using temporary directory '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0'
> I0123 07:45:49.818686 17634 leveldb.cpp:176] Opened db in 31.195549ms
> I0123 07:45:49.821962 17634 leveldb.cpp:183] Compacted db in 3.190936ms
> I0123 07:45:49.822049 17634 leveldb.cpp:198] Created db iterator in 47324ns
> I0123 07:45:49.822069 17634 leveldb.cpp:204] Seeked to beginning of db in 
> 2038ns
> I0123 07:45:49.822084 17634 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 484ns
> I0123 07:45:49.822160 17634 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0123 07:45:49.824241 17660 recover.cpp:449] Starting replica recovery
> I0123 07:45:49.825217 17660 recover.cpp:475] Replica is in EMPTY status
> I0123 07:45:49.827020 17660 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I0123 07:45:49.827453 17659 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I0123 07:45:49.828047 17659 recover.cpp:566] Updating replica status to 
> STARTING
> I0123 07:45:49.838543 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 10.24963ms
> I0123 07:45:49.838580 17659 replica.cpp:323] Persisted replica status to 
> STARTING
> I0123 07:45:49.848836 17659 recover.cpp:475] Replica is in STARTING status
> I0123 07:45:49.850039 17659 replica.cpp:641] Replica in STARTING status 
> received a broadcasted recover request
> I0123 07:45:49.850286 17659 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I0123 07:45:49.850754 17659 recover.cpp:566] Updating replica status to VOTING
> I0123 07:45:49.853698 17655 master.cpp:262] Master 
> 20150123-074549-16842879-44955-17634 (utopic) started on 127.0.1.1:44955
> I0123 07:45:49.853981 17655 master.cpp:308] Master only allowing 
> authenticated frameworks to register
> I0123 07:45:49.853997 17655 master.cpp:313] Master only allowing 
> authenticated slaves to register
> I0123 07:45:49.854038 17655 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/SlaveRecoveryTest_0_MasterFailover_dtF7o0/credentials'
> I0123 07:45:49.854557 17655 master.cpp:357] Authorization enabled
> I0123 07:45:49.859633 17659 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 8.742923ms
> I0123 07:45:49.859853 17659 replica.cpp:323] Persisted replica status to 
> VOTING
> I0123 07:45:49.860327 17658 recover.cpp:580] Successfully joined the Paxos 
> group
> I0123 07:45:49.860703 17654 recover.cpp:464] Recover process terminated
> I0123 07:45:49.859591 17655 master.cpp:1219] The newly elected leader is 
> master@127.0.1.1:44955 with id 20150123-074549-16842879-44955-17634
> I0123 07:45:49.864702 17655 master.cpp:1232] Elected as the leading master!
> I0123 07:45:49.864904 17655 master.cpp:1050] Recovering from registrar
> I0123 07:45:49.865406 17660 registrar.cpp:313] Recovering registrar
> I0123 07:45:49.866576 17660 log.cpp:660] Attempting to start the writer
> I0123 07:45:49.868638 17658 replica.cpp:477] Replica received implicit 
> promise request with proposal 1
> I0123 07:45:49.872521 17658 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 3.848859ms
> I0123 07:45:49.872555 17658 replica.cpp:345] Persisted promised to 1
> I0123 07:45:49.873769 17661 coordinator.cpp:230] Coordinator attemping to 
> fill missing position
> I0123 07:45:49.875474 17658 replica.cpp:378] Replica received explicit 
> promise request for position 0 with proposal 2
> I0123 07:45:49.880878 17658 leveldb.cpp:343] Persisting action (8 bytes) to 
> leveldb took 5.364021ms
> I0123 07:45:49.880913 17658 replica.cpp:679] Persisted action at 0
> I0123 07:45:49.882619 17657 replica.cpp:511] Replica received write request 
> for position 0
> I0123 07:45:49.882998 17657 leveldb.cpp:438] Reading position from leveldb 
> took 150092ns
> I0123 07:45:49.886488 17657 leveldb.cpp:343] Persisting action (14 bytes) to 
> leveldb took 3.269189ms
> I0123 

[jira] [Issue Comment Deleted] (MESOS-3022) export additional metrics from scheduler driver

2015-10-14 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3022:
--
Comment: was deleted

(was: Hi [~benjaminhindman], [~jieyu] and [~vinodkone], could you help to 
review this patch?)

> export additional metrics from scheduler driver
> ---
>
> Key: MESOS-3022
> URL: https://issues.apache.org/jira/browse/MESOS-3022
> Project: Mesos
>  Issue Type: Improvement
>Reporter: David Robinson
>Assignee: Yong Qiao Wang
>Priority: Minor
>
> The scheduler driver only exports the metrics below, but ideally it would 
> export its version and a count of messages by message type.
> {code}
> $ curl -s localhost:20902/metrics/snapshot | python -m json.tool
> {
> "scheduler/event_queue_dispatches": 0,
> "scheduler/event_queue_messages": 0,
> "system/cpus_total": 24,
> "system/load_15min": 0.49,
> "system/load_1min": 0.36,
> "system/load_5min": 0.46,
> "system/mem_free_bytes": 269713408,
> "system/mem_total_bytes": 33529266176
> }
> {code}
> The scheduler driver version could be used during troubleshooting to identify 
> frameworks that are using an old, potentially backwards incompatible, 
> scheduler driver (eg, a framework hasn't been restarted after a Mesos deploy, 
> so it still links against an old incompatible libmesos).
> A count of messages by message type would help identify a problem w/ a 
> specific feature, eg task reconciliation.





[jira] [Issue Comment Deleted] (MESOS-3022) export additional metrics from scheduler driver

2015-10-14 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3022:
--
Comment: was deleted

(was: Hi [~benjaminhindman], [~jieyu] and [~vinodkone], could you help to 
review this patch? Thanks in advance!)

> export additional metrics from scheduler driver
> ---
>
> Key: MESOS-3022
> URL: https://issues.apache.org/jira/browse/MESOS-3022
> Project: Mesos
>  Issue Type: Improvement
>Reporter: David Robinson
>Assignee: Yong Qiao Wang
>Priority: Minor
>
> The scheduler driver only exports the metrics below, but ideally it would 
> export its version and a count of messages by message type.
> {code}
> $ curl -s localhost:20902/metrics/snapshot | python -m json.tool
> {
> "scheduler/event_queue_dispatches": 0,
> "scheduler/event_queue_messages": 0,
> "system/cpus_total": 24,
> "system/load_15min": 0.49,
> "system/load_1min": 0.36,
> "system/load_5min": 0.46,
> "system/mem_free_bytes": 269713408,
> "system/mem_total_bytes": 33529266176
> }
> {code}
> The scheduler driver version could be used during troubleshooting to identify 
> frameworks that are using an old, potentially backwards incompatible, 
> scheduler driver (eg, a framework hasn't been restarted after a Mesos deploy, 
> so it still links against an old incompatible libmesos).
> A count of messages by message type would help identify a problem w/ a 
> specific feature, eg task reconciliation.





[jira] [Commented] (MESOS-3022) export additional metrics from scheduler driver

2015-10-14 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958207#comment-14958207
 ] 

Yong Qiao Wang commented on MESOS-3022:
---

Thanks, [~bmahler]. I understand your comments and will update my patch to 
add metrics for messages by type. 

In addition, the current patch adds the other events (exiteds, https, 
terminates) to the metrics. I think we should keep that change. Do you agree?

> export additional metrics from scheduler driver
> ---
>
> Key: MESOS-3022
> URL: https://issues.apache.org/jira/browse/MESOS-3022
> Project: Mesos
>  Issue Type: Improvement
>Reporter: David Robinson
>Assignee: Yong Qiao Wang
>Priority: Minor
>
> The scheduler driver only exports the metrics below, but ideally it would 
> export its version and a count of messages by message type.
> {code}
> $ curl -s localhost:20902/metrics/snapshot | python -m json.tool
> {
> "scheduler/event_queue_dispatches": 0,
> "scheduler/event_queue_messages": 0,
> "system/cpus_total": 24,
> "system/load_15min": 0.49,
> "system/load_1min": 0.36,
> "system/load_5min": 0.46,
> "system/mem_free_bytes": 269713408,
> "system/mem_total_bytes": 33529266176
> }
> {code}
> The scheduler driver version could be used during troubleshooting to identify 
> frameworks that are using an old, potentially backwards incompatible, 
> scheduler driver (eg, a framework hasn't been restarted after a Mesos deploy, 
> so it still links against an old incompatible libmesos).
> A count of messages by message type would help identify a problem w/ a 
> specific feature, eg task reconciliation.





[jira] [Comment Edited] (MESOS-2864) Master should not change the state of a terminal task if it receives another terminal update

2015-10-13 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730597#comment-14730597
 ] 

Yong Qiao Wang edited comment on MESOS-2864 at 10/13/15 7:40 AM:
-

Hi [~vinodkone], do you have any comments on the added test?


was (Author: jamesyongqiaowang):
Hi [~vinodkone], any comments for this fix?

> Master should not change the state of a terminal task if it receives another 
> terminal update
> 
>
> Key: MESOS-2864
> URL: https://issues.apache.org/jira/browse/MESOS-2864
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>
> Currently, when the master receives a terminal update for an already 
> terminated (but unacknowledged) task it changes the state to the latest 
> update. This is confusing because the slave doesn't change the state of the 
> task in such a case. Master should just forward the update without changing 
> the task state.
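
The behavior requested in the description can be sketched as follows. This is a minimal Python illustration, not the master's actual code: `handle_status_update`, the dict-based task, and the `forward` callback are hypothetical, while the terminal state names are the Mesos `TaskState` values of that era.

```python
# Terminal task states in Mesos at the time of this ticket.
TERMINAL_STATES = frozenset({
    "TASK_FINISHED",
    "TASK_FAILED",
    "TASK_KILLED",
    "TASK_LOST",
    "TASK_ERROR",
})


def handle_status_update(task, new_state, forward):
    """Apply a status update to `task` (a dict with a "state" key).

    If the task is already terminal, its state is left untouched,
    but the update is still forwarded (hypothetical helper).
    """
    if task["state"] not in TERMINAL_STATES:
        task["state"] = new_state
    # The update is always forwarded, whether or not the state changed.
    forward(new_state)
```

With this rule, a task that finished and then receives a second terminal update stays finished, matching what the description says the slave already does.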





[jira] [Commented] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-10-13 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954553#comment-14954553
 ] 

Yong Qiao Wang commented on MESOS-3177:
---

Maybe we could treat quota as a parameter of a role. Then we would not need to 
add separate HTTP endpoints for quota management, and could use the role 
management HTTP endpoints for quota configuration instead. [~alex-mesos], any 
thoughts on this?
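
As a rough sketch of this idea, a role could carry quota next to its weight, so that a single update path covers both. Everything here is hypothetical and for illustration only: the `Role` fields, the `update_role` helper, and the resource names in the quota mapping are assumptions, not an existing Mesos API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Role:
    """Hypothetical role record with quota as an optional parameter."""
    name: str
    weight: float = 1.0
    quota: Optional[Dict[str, float]] = None  # e.g. {"cpus": 4.0}


def update_role(roles, name, **changes):
    """Create or update a role; only weight and quota may be changed."""
    role = roles.get(name) or Role(name)
    for key, value in changes.items():
        if key not in ("weight", "quota"):
            raise ValueError("unsupported role parameter: " + key)
        setattr(role, key, value)
    roles[name] = role
    return role
```

Under this shape, a single role management endpoint could accept both weight and quota changes, which is the saving the comment points at.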

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Issue Comment Deleted] (MESOS-2864) Master should not change the state of a terminal task if it receives another terminal update

2015-10-13 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-2864:
--
Comment: was deleted

(was: Hi [~vinodkone], any comments for the updated code diff?)

> Master should not change the state of a terminal task if it receives another 
> terminal update
> 
>
> Key: MESOS-2864
> URL: https://issues.apache.org/jira/browse/MESOS-2864
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>
> Currently, when the master receives a terminal update for an already 
> terminated (but unacknowledged) task it changes the state to the latest 
> update. This is confusing because the slave doesn't change the state of the 
> task in such a case. Master should just forward the update without changing 
> the task state.





[jira] [Issue Comment Deleted] (MESOS-2864) Master should not change the state of a terminal task if it receives another terminal update

2015-10-13 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-2864:
--
Comment: was deleted

(was: Hi [~vi...@twitter.com], could you give me a review for this fix? Any 
comments are welcome.)

> Master should not change the state of a terminal task if it receives another 
> terminal update
> 
>
> Key: MESOS-2864
> URL: https://issues.apache.org/jira/browse/MESOS-2864
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>
> Currently, when the master receives a terminal update for an already 
> terminated (but unacknowledged) task it changes the state to the latest 
> update. This is confusing because the slave doesn't change the state of the 
> task in such a case. Master should just forward the update without changing 
> the task state.





[jira] [Issue Comment Deleted] (MESOS-2864) Master should not change the state of a terminal task if it receives another terminal update

2015-10-13 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-2864:
--
Comment: was deleted

(was: Hi [~vinodkone], any comments for the added test?)

> Master should not change the state of a terminal task if it receives another 
> terminal update
> 
>
> Key: MESOS-2864
> URL: https://issues.apache.org/jira/browse/MESOS-2864
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>
> Currently, when the master receives a terminal update for an already 
> terminated (but unacknowledged) task it changes the state to the latest 
> update. This is confusing because the slave doesn't change the state of the 
> task in such a case. Master should just forward the update without changing 
> the task state.





[jira] [Commented] (MESOS-2864) Master should not change the state of a terminal task if it receives another terminal update

2015-10-13 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954910#comment-14954910
 ] 

Yong Qiao Wang commented on MESOS-2864:
---

Hi [~vinodkone], I have addressed your comments. Do you have any comments on 
the updated code changes?

> Master should not change the state of a terminal task if it receives another 
> terminal update
> 
>
> Key: MESOS-2864
> URL: https://issues.apache.org/jira/browse/MESOS-2864
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>
> Currently, when the master receives a terminal update for an already 
> terminated (but unacknowledged) task it changes the state to the latest 
> update. This is confusing because the slave doesn't change the state of the 
> task in such a case. Master should just forward the update without changing 
> the task state.
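
One way to read the proposed behavior, as a minimal sketch rather than the actual master code: once a task has reached a terminal state, a later terminal update is still forwarded to the framework, but the stored state is left untouched. `TaskState`, `TaskRecord`, and `handleStatusUpdate` are hypothetical names used only for illustration; they are not Mesos APIs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified task states (not the Mesos protobuf enum).
enum class TaskState { RUNNING, FINISHED, FAILED, KILLED, LOST };

// In this toy model every state other than RUNNING is terminal.
bool isTerminal(TaskState state) { return state != TaskState::RUNNING; }

struct TaskRecord {
  TaskState state = TaskState::RUNNING;
  std::vector<TaskState> forwarded;  // Updates passed on to the framework.
};

// Always forwards the update; only overwrites the stored state while the
// task has not yet reached a terminal state.
void handleStatusUpdate(TaskRecord& task, TaskState update) {
  if (!isTerminal(task.state)) {
    task.state = update;
  }
  task.forwarded.push_back(update);
}
```

With this ordering, a task that finished and then receives a second terminal update keeps its first terminal state, which would match what the slave does according to the description above.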





[jira] [Commented] (MESOS-1832) Slave should accept PingSlaveMessage but not "PING" message.

2015-10-13 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954906#comment-14954906
 ] 

Yong Qiao Wang commented on MESOS-1832:
---

[~vinodkone], 0.25.0 has been released, so can this ticket be fixed now?

> Slave should accept PingSlaveMessage but not "PING" message.
> 
>
> Key: MESOS-1832
> URL: https://issues.apache.org/jira/browse/MESOS-1832
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> Slave handles both the "PING" message and PingSlaveMessage until 0.22.0 for 
> backwards compatibility (https://reviews.apache.org/r/25867/).
> In 0.23.0, the slave no longer needs to handle "PING".





[jira] [Commented] (MESOS-3338) Dynamic reservations are not counted as used resources in the master

2015-10-13 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954935#comment-14954935
 ] 

Yong Qiao Wang commented on MESOS-3338:
---

In the optimistic offer design, dynamically reserved resources are treated as 
Reserved Resources rather than Used Resources; Used Resources in that design 
refers to the allocated resources.

> Dynamic reservations are not counted as used resources in the master
> 
>
> Key: MESOS-3338
> URL: https://issues.apache.org/jira/browse/MESOS-3338
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation, master
>Reporter: Alexander Rukletsov
>Assignee: Guangya Liu
>Priority: Minor
>  Labels: mesosphere, persistent-volumes
>
> Dynamically reserved resources should be considered used or allocated and 
> hence reflected in Mesos bookkeeping structures and {{state.json}}.
> I expanded the {{ReservationTest.ReserveThenUnreserve}} test with the 
> following section:
> {code}
>   // Check that the Master counts the reservation as a used resource.
>   {
>     Future<process::http::Response> response =
>       process::http::get(master.get(), "state.json");
>     AWAIT_READY(response);
>     Try<JSON::Object> parse = JSON::parse<JSON::Object>(response.get().body);
>     ASSERT_SOME(parse);
>     Result<JSON::Number> cpus =
>       parse.get().find<JSON::Number>("slaves[0].used_resources.cpus");
>     ASSERT_SOME_EQ(JSON::Number(1), cpus);
>   }
> {code}
> and got
> {noformat}
> ../../../src/tests/reservation_tests.cpp:168: Failure
> Value of: (cpus).get()
>   Actual: 0
> Expected: JSON::Number(1)
> Which is: 1
> {noformat}





[jira] [Commented] (MESOS-3022) export additional metrics from scheduler driver

2015-09-24 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905872#comment-14905872
 ] 

Yong Qiao Wang commented on MESOS-3022:
---

Hi [~bmahler], the messages by type have been added in my patch. Could you give 
me a further review and let me know your concerns? Thanks!

> export additional metrics from scheduler driver
> ---
>
> Key: MESOS-3022
> URL: https://issues.apache.org/jira/browse/MESOS-3022
> Project: Mesos
>  Issue Type: Improvement
>Reporter: David Robinson
>Assignee: Yong Qiao Wang
>Priority: Minor
>
> The scheduler driver only exports the metrics below, but ideally it would 
> export its version and a count of messages by message type.
> {code}
> $ curl -s localhost:20902/metrics/snapshot | python -m json.tool
> {
> "scheduler/event_queue_dispatches": 0,
> "scheduler/event_queue_messages": 0,
> "system/cpus_total": 24,
> "system/load_15min": 0.49,
> "system/load_1min": 0.36,
> "system/load_5min": 0.46,
> "system/mem_free_bytes": 269713408,
> "system/mem_total_bytes": 33529266176
> }
> {code}
> The scheduler driver version could be used during troubleshooting to identify 
> frameworks that are using an old, potentially backwards incompatible, 
> scheduler driver (eg, a framework hasn't been restarted after a Mesos deploy, 
> so it still links against an old incompatible libmesos).
> A count of messages by message type would help identify a problem w/ a 
> specific feature, eg task reconciliation.





[jira] [Commented] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-21 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900287#comment-14900287
 ] 

Yong Qiao Wang commented on MESOS-3177:
---

Thanks [~thomasr]. I think this requirement should be covered by the Quota 
proposal. This ticket should focus only on adding, removing, and updating 
roles/weights.

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Updated] (MESOS-3477) Add design doc for roles/weights configuration

2015-09-20 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3477:
--
Summary: Add design doc for roles/weights configuration  (was: Add design 
doc for roles/weights configuraiton)

> Add design doc for roles/weights configuration
> --
>
> Key: MESOS-3477
> URL: https://issues.apache.org/jira/browse/MESOS-3477
> Project: Mesos
>  Issue Type: Documentation
>  Components: master
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>






[jira] [Created] (MESOS-3477) Add design doc for roles/weights configuraiton

2015-09-20 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3477:
-

 Summary: Add design doc for roles/weights configuraiton
 Key: MESOS-3477
 URL: https://issues.apache.org/jira/browse/MESOS-3477
 Project: Mesos
  Issue Type: Documentation
  Components: master
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang








[jira] [Assigned] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-20 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-3177:
-

Assignee: Yong Qiao Wang

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Comment Edited] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-18 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804888#comment-14804888
 ] 

Yong Qiao Wang edited comment on MESOS-3177 at 9/18/15 6:07 AM:


Thanks [~cmaloney] for your quick reply.

[~thomasr], could you share some ideas with me? If possible, I'd like to work 
together with you on the design of this ticket. Thanks! 


was (Author: jamesyongqiaowang):
Thanks [~cmaloney] for your quick reply.

[~thomasr], are you working on this ticket now? If you do not have time for 
this now, I would like to re-assign this ticket to myself and try to propose a 
detailed design for it. Thanks! 

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Thomas Rampelberg
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Commented] (MESOS-3022) export additional metrics from scheduler driver

2015-09-18 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14805145#comment-14805145
 ] 

Yong Qiao Wang commented on MESOS-3022:
---

Thanks [~bmahler]. I know version information will be available through 
MESOS-1841, so I only added messages by type in this patch. I am sorry, but I 
could not follow your comment; could you give me detailed comments based on the 
code changes in the RR? 

> export additional metrics from scheduler driver
> ---
>
> Key: MESOS-3022
> URL: https://issues.apache.org/jira/browse/MESOS-3022
> Project: Mesos
>  Issue Type: Improvement
>Reporter: David Robinson
>Assignee: Yong Qiao Wang
>Priority: Minor
>
> The scheduler driver only exports the metrics below, but ideally it would 
> export its version and a count of messages by message type.
> {code}
> $ curl -s localhost:20902/metrics/snapshot | python -m json.tool
> {
> "scheduler/event_queue_dispatches": 0,
> "scheduler/event_queue_messages": 0,
> "system/cpus_total": 24,
> "system/load_15min": 0.49,
> "system/load_1min": 0.36,
> "system/load_5min": 0.46,
> "system/mem_free_bytes": 269713408,
> "system/mem_total_bytes": 33529266176
> }
> {code}
> The scheduler driver version could be used during troubleshooting to identify 
> frameworks that are using an old, potentially backwards incompatible, 
> scheduler driver (eg, a framework hasn't been restarted after a Mesos deploy, 
> so it still links against an old incompatible libmesos).
> A count of messages by message type would help identify a problem w/ a 
> specific feature, eg task reconciliation.





[jira] [Updated] (MESOS-3403) Add support for removing no re-registered slaves with timeout(--slave_reregister_timeout) from an external allocator

2015-09-18 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3403:
--
Shepherd: Vinod Kone

> Add support for removing no re-registered slaves with 
> timeout(--slave_reregister_timeout) from an external allocator
> 
>
> Key: MESOS-3403
> URL: https://issues.apache.org/jira/browse/MESOS-3403
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> For an external Mesos allocator that does not run in the same OS process as 
> the Mesos master (and may even be deployed on a different host), the Mesos 
> allocator module should be implemented as a proxy that delegates calls to 
> the actual allocator.
> The external allocator stores the total and allocated resources itself, so 
> after Mesos master recovery (such as a fail-over) it needs to sync up with 
> the Mesos master. Under normal circumstances, all slaves reregister after 
> master recovery, so the total and used resources of each slave can be 
> synced in the allocator->addSlave call. In the abnormal case where a slave 
> does not reregister after master recovery, the master calls 
> Master::removeSlave(const Registry::Slave& slave) to remove the slave from 
> the Registry after the timeout (slave_reregister_timeout), but this function 
> does not ask the allocator to remove the related resources. To keep the 
> external allocator in sync in this case, 
> Master::removeSlave(const Registry::Slave& slave) should be enhanced to call 
> allocator->removeSlave, which removes the related resources from the 
> external allocator.
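
The proposed change can be illustrated with a toy model. This is a sketch under assumptions, not Mesos code: `ProxyAllocator`, `ToyMaster`, and all of their methods are hypothetical stand-ins for the allocator module and the master's recovery path described above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical proxy that mirrors the slave set held by an external allocator.
class ProxyAllocator {
 public:
  void addSlave(const std::string& slaveId) { slaves_.insert(slaveId); }
  void removeSlave(const std::string& slaveId) { slaves_.erase(slaveId); }
  bool knows(const std::string& slaveId) const {
    return slaves_.count(slaveId) > 0;
  }

 private:
  std::set<std::string> slaves_;
};

// Hypothetical, heavily simplified master; not the Mesos Master class.
class ToyMaster {
 public:
  explicit ToyMaster(ProxyAllocator* allocator) : allocator_(allocator) {
    assert(allocator_ != nullptr);
  }

  // After fail-over, every slave read from the registry is awaited.
  void recoverSlave(const std::string& slaveId) { pending_.insert(slaveId); }

  // A slave that comes back is (re)added to the allocator.
  void reregisterSlave(const std::string& slaveId) {
    pending_.erase(slaveId);
    allocator_->addSlave(slaveId);
  }

  // When the re-registration timeout fires, slaves that never came back are
  // dropped, and the allocator is told so it can release their resources.
  void onReregisterTimeout() {
    for (const std::string& slaveId : pending_) {
      allocator_->removeSlave(slaveId);
    }
    pending_.clear();
  }

 private:
  ProxyAllocator* allocator_;
  std::set<std::string> pending_;
};
```

In this model the external allocator may still hold entries from before the fail-over; the call inside `onReregisterTimeout` is the step the ticket asks for, and without it the stale slave would stay in the allocator's view.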



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3403) Add support for removing no re-registered slaves with timeout(--slave_reregister_timeout) from an external allocator

2015-09-18 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3403:
--
Shepherd:   (was: Vinod Kone)

> Add support for removing no re-registered slaves with 
> timeout(--slave_reregister_timeout) from an external allocator
> 
>
> Key: MESOS-3403
> URL: https://issues.apache.org/jira/browse/MESOS-3403
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> For an external Mesos allocator that does not run in the same OS process as 
> the Mesos master (and may even be deployed on a different host), the Mesos 
> allocator module should be implemented as a proxy that delegates calls to 
> the actual allocator.
> The external allocator stores the total and allocated resources itself, so 
> after Mesos master recovery (such as a fail-over) it needs to sync up with 
> the Mesos master. Under normal circumstances, all slaves reregister after 
> master recovery, so the total and used resources of each slave can be 
> synced in the allocator->addSlave call. In the abnormal case where a slave 
> does not reregister after master recovery, the master calls 
> Master::removeSlave(const Registry::Slave& slave) to remove the slave from 
> the Registry after the timeout (slave_reregister_timeout), but this function 
> does not ask the allocator to remove the related resources. To keep the 
> external allocator in sync in this case, 
> Master::removeSlave(const Registry::Slave& slave) should be enhanced to call 
> allocator->removeSlave, which removes the related resources from the 
> external allocator.





[jira] [Commented] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-17 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791639#comment-14791639
 ] 

Yong Qiao Wang commented on MESOS-3177:
---

Thanks [~cmaloney] for your kind reply. I have some questions and comments on 
your thoughts above:

1. As far as I know, roles and weights are currently not persisted in the 
replicated log. Do you mean that we should persist them?

2. If the answer to #1 is yes, then I think the initial replicated log entries 
for roles and weights would be created when the Mesos master starts for the 
first time, and their content would be the roles and weights specified by the 
--roles and --weights flags. Is that right?

3. To add a new role ("add_role"), only two places in the code need to 
change:

Add a new HTTP endpoint in master.cpp that adds a new item to 
{code}
hashmap<std::string, Role*> roles;
{code}

and calls the allocator to update the RoleSorter.

4. To remove an existing role ("remove_role"), I think the following should be 
ensured before the role is removed: 
  - Kill all tasks that use resources reserved for this role;
  - Shut down all executors that use resources reserved for this role;
  - Unreserve the resources dynamically reserved for this role;
  - Destroy the persistent volumes that use resources reserved for this 
role;
  - Remove all frameworks associated with this role?
  - Remove the ACLs related to this role;

5. Do you mean authorization rather than authentication in your comments above?

[~cmaloney], any comments on my thoughts above are welcome.
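
The add and remove steps from points 3 and 4 can be sketched as follows. This is only an illustration under assumptions: `RoleUsage` and `RoleRegistry` are hypothetical names, the counters stand in for the tasks, reservations, volumes, and frameworks listed above, and nothing here reflects the real master or allocator interfaces.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical bookkeeping for one role; none of these fields are Mesos types.
struct RoleUsage {
  double weight = 1.0;
  int runningTasks = 0;
  int dynamicReservations = 0;
  int persistentVolumes = 0;
  int frameworks = 0;

  // True once nothing is attached to the role any more.
  bool idle() const {
    return runningTasks == 0 && dynamicReservations == 0 &&
           persistentVolumes == 0 && frameworks == 0;
  }
};

class RoleRegistry {
 public:
  // Adds a role with the given weight; returns false if it already exists.
  bool addRole(const std::string& name, double weight) {
    assert(weight > 0.0);
    if (roles_.count(name) > 0) {
      return false;
    }
    RoleUsage usage;
    usage.weight = weight;
    roles_[name] = usage;
    return true;
  }

  // Removes a role only after everything attached to it has been cleaned up.
  bool removeRole(const std::string& name) {
    auto it = roles_.find(name);
    if (it == roles_.end() || !it->second.idle()) {
      return false;
    }
    roles_.erase(it);
    return true;
  }

  // Returns the bookkeeping entry for a role, or nullptr if it is unknown.
  RoleUsage* find(const std::string& name) {
    auto it = roles_.find(name);
    return it == roles_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, RoleUsage> roles_;
};
```

Refusing the removal while anything is still attached is one possible policy; the alternative implied by the list above is for the remove operation itself to perform the cleanup before erasing the role.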

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Thomas Rampelberg
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Commented] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-17 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791661#comment-14791661
 ] 

Yong Qiao Wang commented on MESOS-3177:
---

In addition, when we remove an existing role, we also need to ask the related 
slaves to release the resources that were previously reserved for that role. 
[~cmaloney], any thoughts on this? Thanks! 

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Thomas Rampelberg
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Commented] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-17 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804888#comment-14804888
 ] 

Yong Qiao Wang commented on MESOS-3177:
---

Thanks [~cmaloney] for your quick reply.

[~thomasr], are you working on this ticket now? If you do not have time for 
this now, I would like to re-assign this ticket to myself and try to propose a 
detailed design for it. Thanks! 

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Thomas Rampelberg
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Commented] (MESOS-3406) Should not update the framework info when framework->pid != from

2015-09-10 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738489#comment-14738489
 ] 

Yong Qiao Wang commented on MESOS-3406:
---

[~vinodkone], [~benjaminhindman], [~bmahler] and [~jieyu], any thoughts on 
this? Thanks!

> Should not update the framework info when framework->pid != from
> 
>
> Key: MESOS-3406
> URL: https://issues.apache.org/jira/browse/MESOS-3406
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> The current logic of Master::_subscribe is as follows:
> {code}
> if (frameworks.registered.contains(frameworkInfo.id())) {
>   framework->updateFrameworkInfo(frameworkInfo);
>   allocator->updateFramework(framework->id(), framework->info);
>   framework->reregisteredTime = Clock::now();
>   .
>   .
>   if (subscribe.force()) {
>     ..
>   } else if (framework->pid != from) {
>     LOG(ERROR) << "Disallowing subscription attempt of"
>                << " framework " << *framework
>                << " because it is not expected from " << from;
>     FrameworkErrorMessage message;
>     message.set_message("Framework failed over");
>     send(from, message);
>   } else {
>     ..
>   }
> {code}
> If the framework is already registered but its pid does not equal from, the 
> master sends a "Framework failed over" message to the framework to reject 
> the registration, but the framework info is still updated. I think this is 
> a bug: when the registration is rejected, the framework info should not be 
> updated. 
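
The ordering the report asks for can be sketched like this: validate the sender first, and only touch stored state once the attempt is accepted. This is a minimal illustration, not the actual fix; `Framework`, `SubscribeResult`, and `resubscribe` are hypothetical simplifications of the code quoted above.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified framework record; not the Mesos Framework struct.
struct Framework {
  std::string pid;
  std::string name;
};

enum class SubscribeResult { UPDATED, REJECTED };

// Rejects the attempt before touching any stored state, so a refused
// re-registration leaves the framework info unchanged.
SubscribeResult resubscribe(Framework& framework,
                            const std::string& from,
                            const std::string& newName,
                            bool force) {
  if (!force && framework.pid != from) {
    // The real master would send "Framework failed over" to `from` here.
    return SubscribeResult::REJECTED;
  }
  framework.pid = from;
  framework.name = newName;
  return SubscribeResult::UPDATED;
}
```

The only difference from the quoted logic is where the update happens: it moves below the pid check instead of running unconditionally at the top of the block.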





[jira] [Commented] (MESOS-3406) Should not update the framework info when framework->pid != from

2015-09-10 Thread Yong Qiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738518#comment-14738518
 ] 

Yong Qiao Wang commented on MESOS-3406:
---

Thanks to [~gyliu] for the reminder; this bug is a duplicate of MESOS-3169.

> Should not update the framework info when framework->pid != from
> 
>
> Key: MESOS-3406
> URL: https://issues.apache.org/jira/browse/MESOS-3406
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> The current logic of Master::_subscribe is as follows:
> {code}
> if (frameworks.registered.contains(frameworkInfo.id())) {
>   framework->updateFrameworkInfo(frameworkInfo);
>   allocator->updateFramework(framework->id(), framework->info);
>   framework->reregisteredTime = Clock::now();
>   .
>   .
>   if (subscribe.force()) {
>     ..
>   } else if (framework->pid != from) {
>     LOG(ERROR) << "Disallowing subscription attempt of"
>                << " framework " << *framework
>                << " because it is not expected from " << from;
>     FrameworkErrorMessage message;
>     message.set_message("Framework failed over");
>     send(from, message);
>   } else {
>     ..
>   }
> {code}
> If the framework is already registered but its pid does not equal from, the 
> master sends a "Framework failed over" message to the framework to reject 
> the registration, but the framework info is still updated. I think this is 
> a bug: when the registration is rejected, the framework info should not be 
> updated. 




