[jira] [Created] (MESOS-6583) Add Seccomp support for mesos-execute

2016-11-14 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6583:
--

 Summary: Add Seccomp support for mesos-execute
 Key: MESOS-6583
 URL: https://issues.apache.org/jira/browse/MESOS-6583
 Project: Mesos
  Issue Type: Task
Reporter: Jay Guo
Assignee: Jay Guo


Users should be able to specify a Seccomp profile when launching a task using 
mesos-execute.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-6584) Add tests for Seccomp support

2016-11-14 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6584:
--

 Summary: Add tests for Seccomp support
 Key: MESOS-6584
 URL: https://issues.apache.org/jira/browse/MESOS-6584
 Project: Mesos
  Issue Type: Task
Reporter: Jay Guo
Assignee: Jay Guo


Add unit tests as well as integration tests for Seccomp support.





[jira] [Created] (MESOS-6585) Create a user guide to document security features including capabilities and seccomp support

2016-11-14 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6585:
--

 Summary: Create a user guide to document security features 
including capabilities and seccomp support
 Key: MESOS-6585
 URL: https://issues.apache.org/jira/browse/MESOS-6585
 Project: Mesos
  Issue Type: Task
Reporter: Jay Guo


We should have a user guide to document security features in Mesos, including 
but not limited to: capabilities, seccomp.





[jira] [Commented] (MESOS-6585) Create a user guide to document security features including capabilities and seccomp support

2016-11-14 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663134#comment-15663134
 ] 

Jay Guo commented on MESOS-6585:


CC [~bbannier]

> Create a user guide to document security features including capabilities and 
> seccomp support
> 
>
> Key: MESOS-6585
> URL: https://issues.apache.org/jira/browse/MESOS-6585
> Project: Mesos
>  Issue Type: Task
>Reporter: Jay Guo
>
> We should have a user guide to document security features in Mesos, including 
> but not limited to: capabilities, seccomp.





[jira] [Commented] (MESOS-5795) Add support for Nvidia GPUs in the docker containerizer

2016-11-14 Thread yongyu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663163#comment-15663163
 ] 

yongyu commented on MESOS-5795:
---

What progress has been made on this?
I would like to know.

> Add support for Nvidia GPUs in the docker containerizer
> ---
>
> Key: MESOS-5795
> URL: https://issues.apache.org/jira/browse/MESOS-5795
> Project: Mesos
>  Issue Type: Epic
>  Components: docker, isolation
>Reporter: Kevin Klues
>  Labels: gpu, mesosphere
>
> In order to support Nvidia GPUs with docker containers in Mesos, we need to 
> be able to consolidate all Nvidia libraries into a common volume and inject 
> that volume into the container. This tracks the support in the docker 
> containerizer. The mesos containerizer support has already been completed in 
> MESOS-5401.
> More info on why this is necessary here: 
> https://github.com/NVIDIA/nvidia-docker/





[jira] [Commented] (MESOS-6581) Add Seccomp support at Mesos Agent level

2016-11-14 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663232#comment-15663232
 ] 

Jay Guo commented on MESOS-6581:


Initial patches for review:
https://reviews.apache.org/r/53604/
https://reviews.apache.org/r/53605/
https://reviews.apache.org/r/53606/
https://reviews.apache.org/r/53607/
https://reviews.apache.org/r/53608/

> Add Seccomp support at Mesos Agent level
> 
>
> Key: MESOS-6581
> URL: https://issues.apache.org/jira/browse/MESOS-6581
> Project: Mesos
>  Issue Type: Task
> Environment: Linux Only
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> Operators of a Mesos cluster should be able to enforce a set of Seccomp rules 
> on a Mesos Agent to defend against potential exploits through syscalls. 
> When enabled, every container launched on the Agent must comply with the 
> Seccomp filter or be killed.





[jira] [Assigned] (MESOS-6427) Add documentation for rlimit support of Mesos containerizer

2016-11-14 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-6427:
---

Assignee: Benjamin Bannier

> Add documentation for rlimit support of Mesos containerizer
> ---
>
> Key: MESOS-6427
> URL: https://issues.apache.org/jira/browse/MESOS-6427
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization, documentation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>






[jira] [Created] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)
Markus Jura created MESOS-6586:
--

 Summary: Teardown endpoint should remove framework
 Key: MESOS-6586
 URL: https://issues.apache.org/jira/browse/MESOS-6586
 Project: Mesos
  Issue Type: Improvement
  Components: cli, framework api, HTTP API
Affects Versions: 1.0.1
Reporter: Markus Jura


The Mesos {[/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.
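
For readers who want to reproduce the behaviour described above, the following is a minimal sketch of the teardown request. It assumes the master exposes the operator endpoint at {{/master/teardown}} and accepts a form-encoded {{frameworkId}} field; the master address and framework ID are placeholders, and the helper name is hypothetical. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_teardown_request(master: str, framework_id: str) -> Request:
    """Build (but do not send) a POST request asking the master to tear down a framework.

    `master` is a base URL such as "http://master.example:5050" (placeholder).
    The endpoint path and the `frameworkId` field name are assumptions of this sketch.
    """
    body = urlencode({"frameworkId": framework_id}).encode("ascii")
    return Request(f"{master}/master/teardown", data=body, method="POST")
```

Passing the returned object to {{urllib.request.urlopen}} would send it. As the description notes, the framework's own scheduler process is not notified by this call.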





[jira] [Updated] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jura updated MESOS-6586:
---
Description: 
The Mesos {[teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

  was:
The Mesos {[/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.


> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {[teardown}} endpoint is:
> - Removing the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framework, e.g. by using the DC/OS UI.
> A possible solution would be to provide an additional callback {{teardown}} 
> at the scheduler API that will notify the framework that the mesos-master has 
> initiated a teardown. Mesos-master should only mark the framework as removed 
> if the framework has been successfully terminated, e.g. the framework could 
> send a message to mesos-master indicating that the termination was successful 
> / has been started.





[jira] [Updated] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jura updated MESOS-6586:
---
Description: 
The Mesos {{/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

  was:
The Mesos {[teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.


> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {{/teardown}} endpoint is:
> - Removing the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framework, e.g. by using the DC/OS UI.
> A possible solution would be to provide an additional callback {{teardown}} 
> at the scheduler API that will notify the framework that the mesos-master has 
> initiated a teardown. Mesos-master should only mark the framework as removed 
> if the framework has been successfully terminated, e.g. the framework could 
> send a message to mesos-master indicating that the termination was successful 
> / has been started.





[jira] [Updated] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jura updated MESOS-6586:
---
Description: 
The Mesos {{/teardown}} endpoint:
- Removes the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

This change will also affect the {{dcos service shutdown}} command which uses 
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
{{dcos service shutdown service-id}} command shuts down all components of the 
framework, not only the executors and tasks.

Tested on DC/OS with the frameworks conductr and elasticsearch.

  was:
The Mesos {{/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.


> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {{/teardown}} endpoint:
> - Removes the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framework, e.g. by using the DC/OS UI.
> A possible solution would be to provide an additional callback {{teardown}} 
> at the scheduler API that will notify the framework that the mesos-master has 
> initiated a teardown. Mesos-master should only mark the framework as removed 
> if the framework has been successfully terminated, e.g. the framework could 
> send a message to mesos-master indicating that the termination was successful 
> / has been started.
> This change will also affect the {{dcos service shutdown}} command which uses 
> the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
> {{dcos service shutdown service-id}} command shuts down all components of the 
> framework, not only the executors and tasks.
> Tested on DC/OS with the frameworks conductr and elasticsearch.





[jira] [Created] (MESOS-6587) Unable to cache certain archives with custom outputFile

2016-11-14 Thread Stephen Hankinson (JIRA)
Stephen Hankinson created MESOS-6587:


 Summary: Unable to cache certain archives with custom outputFile
 Key: MESOS-6587
 URL: https://issues.apache.org/jira/browse/MESOS-6587
 Project: Mesos
  Issue Type: Bug
  Components: fetcher
Reporter: Stephen Hankinson
Assignee: Stephen Hankinson
Priority: Minor


When caching an archive that is retrieved from a signed URL from somewhere like 
Amazon S3 or Azure Blob, the archive is not decompressed properly even when a 
valid compression suffix is set on the output_file parameter.

An example log is shown below:

I1114 14:49:49.689990 39178 logging.cpp:194] INFO level logging started!
I1114 14:49:49.690237 39178 fetcher.cpp:498] Fetcher Info: 
{"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/68721b22-f102-443a-887c-b1df78f40bf5-S8\/root","items":[{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c4-docker.tar_5.255&sr=b","uri":{"cache":true,"executable":false,"extract":true,"output_file":"docker.tar.gz","value":"https:\/\/reportresources.blob.core.windows.net\/mesos\/docker.tar.gz?sig=thesignaturegoeshere"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/68721b22-f102-443a-887c-b1df78f40bf5-S8\/frameworks\/68721b22-f102-443a-887c-b1df78f40bf5-\/executors\/test.97c76288-aa79-11e6-9316-70b3d582\/runs\/a21ecf01-e80a-4d2b-b094-34d442081818","user":"root"}
I1114 14:49:49.692350 39178 fetcher.cpp:409] Fetching URI 
'https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere'
I1114 14:49:49.692369 39178 fetcher.cpp:306] Fetching from cache
W1114 14:49:49.692384 39178 fetcher.cpp:350] Copying instead of extracting 
resource from URI with 'extract' flag, because it does not seem to be an 
archive: 
https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere
I1114 14:49:49.692464 39178 fetcher.cpp:167] Copying resource with command:cp 
'/tmp/mesos/fetch/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/root/c4-docker.tar_5.255&sr=b'
 
'/var/lib/mesos/slave/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/frameworks/68721b22-f102-443a-887c-b1df78f40bf5-/executors/test.97c76288-aa79-11e6-9316-70b3d582/runs/a21ecf01-e80a-4d2b-b094-34d442081818/docker.tar.gz'
I1114 14:49:49.694368 39178 fetcher.cpp:547] Fetched 
'https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere'
 to 
'/var/lib/mesos/slave/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/frameworks/68721b22-f102-443a-887c-b1df78f40bf5-/executors/test.97c76288-aa79-11e6-9316-70b3d582/runs/a21ecf01-e80a-4d2b-b094-34d442081818/docker.tar.gz'

Even though the output_file is set to docker.tar.gz, the archive is copied 
instead of extracted because of the signature suffix from the source URL.
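
The failure mode can be illustrated with a small sketch. This is not the fetcher's code: the suffix list and both function names are assumptions made for illustration. It shows why a suffix check on the full signed URL fails, and how preferring output_file (or at least dropping the query string) would classify the download as an archive.

```python
from typing import Optional
from urllib.parse import urlparse

# Suffixes treated as archives in this sketch (an assumption, not the fetcher's actual list).
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".tar.bz2", ".tbz2", ".tar.xz", ".txz", ".zip")


def looks_like_archive(name: str) -> bool:
    """Return True if the name ends with one of the known archive suffixes."""
    return name.lower().endswith(ARCHIVE_SUFFIXES)


def should_extract(uri_value: str, output_file: Optional[str] = None) -> bool:
    """Decide whether to extract, preferring output_file over the source URI.

    When output_file is not set, only the path component of the URI is checked,
    so a signature in the query string cannot hide the suffix.
    """
    if output_file:
        return looks_like_archive(output_file)
    return looks_like_archive(urlparse(uri_value).path)
```

With the URI from the log above, looks_like_archive applied to the whole URI returns False because the string ends in the signature, while should_extract returns True both with output_file set to docker.tar.gz and without it.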





[jira] [Updated] (MESOS-6587) Unable to cache certain archives with custom output_file

2016-11-14 Thread Stephen Hankinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Hankinson updated MESOS-6587:
-
Summary: Unable to cache certain archives with custom output_file  (was: 
Unable to cache certain archives with custom outputFile)

> Unable to cache certain archives with custom output_file
> 
>
> Key: MESOS-6587
> URL: https://issues.apache.org/jira/browse/MESOS-6587
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Reporter: Stephen Hankinson
>Assignee: Stephen Hankinson
>Priority: Minor
>
> When caching an archive that is retrieved from a signed URL from somewhere 
> like Amazon S3 or Azure Blob, the archive is not decompressed properly even 
> when a valid compression suffix is set on the output_file parameter.
> An example log is shown below:
> I1114 14:49:49.689990 39178 logging.cpp:194] INFO level logging started!
> I1114 14:49:49.690237 39178 fetcher.cpp:498] Fetcher Info: 
> {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/68721b22-f102-443a-887c-b1df78f40bf5-S8\/root","items":[{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c4-docker.tar_5.255&sr=b","uri":{"cache":true,"executable":false,"extract":true,"output_file":"docker.tar.gz","value":"https:\/\/reportresources.blob.core.windows.net\/mesos\/docker.tar.gz?sig=thesignaturegoeshere"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/68721b22-f102-443a-887c-b1df78f40bf5-S8\/frameworks\/68721b22-f102-443a-887c-b1df78f40bf5-\/executors\/test.97c76288-aa79-11e6-9316-70b3d582\/runs\/a21ecf01-e80a-4d2b-b094-34d442081818","user":"root"}
> I1114 14:49:49.692350 39178 fetcher.cpp:409] Fetching URI 
> 'https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere'
> I1114 14:49:49.692369 39178 fetcher.cpp:306] Fetching from cache
> W1114 14:49:49.692384 39178 fetcher.cpp:350] Copying instead of extracting 
> resource from URI with 'extract' flag, because it does not seem to be an 
> archive: 
> https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere
> I1114 14:49:49.692464 39178 fetcher.cpp:167] Copying resource with command:cp 
> '/tmp/mesos/fetch/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/root/c4-docker.tar_5.255&sr=b'
>  
> '/var/lib/mesos/slave/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/frameworks/68721b22-f102-443a-887c-b1df78f40bf5-/executors/test.97c76288-aa79-11e6-9316-70b3d582/runs/a21ecf01-e80a-4d2b-b094-34d442081818/docker.tar.gz'
> I1114 14:49:49.694368 39178 fetcher.cpp:547] Fetched 
> 'https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere'
>  to 
> '/var/lib/mesos/slave/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/frameworks/68721b22-f102-443a-887c-b1df78f40bf5-/executors/test.97c76288-aa79-11e6-9316-70b3d582/runs/a21ecf01-e80a-4d2b-b094-34d442081818/docker.tar.gz'
> Even though the output_file is set to docker.tar.gz, the archive is copied 
> instead of extracted because of the signature suffix from the source URL.





[jira] [Commented] (MESOS-2092) Make ACLs dynamic

2016-11-14 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664132#comment-15664132
 ] 

Alexander Rukletsov commented on MESOS-2092:


It does not look like it. [~gradywang]?

> Make ACLs dynamic
> -
>
> Key: MESOS-2092
> URL: https://issues.apache.org/jira/browse/MESOS-2092
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Alexander Rukletsov
>Assignee: Yongqiao Wang
>  Labels: mesosphere, newbie
>
> Master loads ACLs once during its launch and there is no way to update them 
> in a running master. Making them dynamic will allow for updating ACLs on the 
> fly, for example granting a new framework necessary rights.





[jira] [Updated] (MESOS-6587) Unable to cache certain archives with custom output_file

2016-11-14 Thread Gilbert Song (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gilbert Song updated MESOS-6587:

Description: 
When caching an archive that is retrieved from a signed URL from somewhere like 
Amazon S3 or Azure Blob, the archive is not decompressed properly even when a 
valid compression suffix is set on the output_file parameter.

An example log is shown below:

{noformat}
I1114 14:49:49.689990 39178 logging.cpp:194] INFO level logging started!
I1114 14:49:49.690237 39178 fetcher.cpp:498] Fetcher Info: 
{"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/68721b22-f102-443a-887c-b1df78f40bf5-S8\/root","items":[{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c4-docker.tar_5.255&sr=b","uri":{"cache":true,"executable":false,"extract":true,"output_file":"docker.tar.gz","value":"https:\/\/reportresources.blob.core.windows.net\/mesos\/docker.tar.gz?sig=thesignaturegoeshere"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/68721b22-f102-443a-887c-b1df78f40bf5-S8\/frameworks\/68721b22-f102-443a-887c-b1df78f40bf5-\/executors\/test.97c76288-aa79-11e6-9316-70b3d582\/runs\/a21ecf01-e80a-4d2b-b094-34d442081818","user":"root"}
I1114 14:49:49.692350 39178 fetcher.cpp:409] Fetching URI 
'https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere'
I1114 14:49:49.692369 39178 fetcher.cpp:306] Fetching from cache
W1114 14:49:49.692384 39178 fetcher.cpp:350] Copying instead of extracting 
resource from URI with 'extract' flag, because it does not seem to be an 
archive: 
https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere
I1114 14:49:49.692464 39178 fetcher.cpp:167] Copying resource with command:cp 
'/tmp/mesos/fetch/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/root/c4-docker.tar_5.255&sr=b'
 
'/var/lib/mesos/slave/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/frameworks/68721b22-f102-443a-887c-b1df78f40bf5-/executors/test.97c76288-aa79-11e6-9316-70b3d582/runs/a21ecf01-e80a-4d2b-b094-34d442081818/docker.tar.gz'
I1114 14:49:49.694368 39178 fetcher.cpp:547] Fetched 
'https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere'
 to 
'/var/lib/mesos/slave/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/frameworks/68721b22-f102-443a-887c-b1df78f40bf5-/executors/test.97c76288-aa79-11e6-9316-70b3d582/runs/a21ecf01-e80a-4d2b-b094-34d442081818/docker.tar.gz'
{noformat}

Even though the output_file is set to docker.tar.gz, the archive is copied 
instead of extracted because of the signature suffix from the source URL.

  was:
When caching an archive that is retrieved from a signed URL from somewhere like 
Amazon S3 or Azure Blob, the archive is not decompressed properly even when a 
valid compression suffix is set on the output_file parameter.

An example log is shown below:

I1114 14:49:49.689990 39178 logging.cpp:194] INFO level logging started!
I1114 14:49:49.690237 39178 fetcher.cpp:498] Fetcher Info: 
{"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/68721b22-f102-443a-887c-b1df78f40bf5-S8\/root","items":[{"action":"RETRIEVE_FROM_CACHE","cache_filename":"c4-docker.tar_5.255&sr=b","uri":{"cache":true,"executable":false,"extract":true,"output_file":"docker.tar.gz","value":"https:\/\/reportresources.blob.core.windows.net\/mesos\/docker.tar.gz?sig=thesignaturegoeshere"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/68721b22-f102-443a-887c-b1df78f40bf5-S8\/frameworks\/68721b22-f102-443a-887c-b1df78f40bf5-\/executors\/test.97c76288-aa79-11e6-9316-70b3d582\/runs\/a21ecf01-e80a-4d2b-b094-34d442081818","user":"root"}
I1114 14:49:49.692350 39178 fetcher.cpp:409] Fetching URI 
'https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere'
I1114 14:49:49.692369 39178 fetcher.cpp:306] Fetching from cache
W1114 14:49:49.692384 39178 fetcher.cpp:350] Copying instead of extracting 
resource from URI with 'extract' flag, because it does not seem to be an 
archive: 
https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere
I1114 14:49:49.692464 39178 fetcher.cpp:167] Copying resource with command:cp 
'/tmp/mesos/fetch/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/root/c4-docker.tar_5.255&sr=b'
 
'/var/lib/mesos/slave/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/frameworks/68721b22-f102-443a-887c-b1df78f40bf5-/executors/test.97c76288-aa79-11e6-9316-70b3d582/runs/a21ecf01-e80a-4d2b-b094-34d442081818/docker.tar.gz'
I1114 14:49:49.694368 39178 fetcher.cpp:547] Fetched 
'https://reportresources.blob.core.windows.net/mesos/docker.tar.gz?sig=thesignaturegoeshere'
 to 
'/var/lib/mesos/slave/slaves/68721b22-f102-443a-887c-b1df78f40bf5-S8/frameworks/68721b22-f102-443a-887c-b1df78f40bf5-/executors/test.97c76288-aa79-11e6-9316-70b3d582/runs/a21ecf01-e80a-4d2b-b094-34d442081818/docker.tar.gz'

Even though the output_file is set to d

[jira] [Created] (MESOS-6588) LinuxRoots misses required files

2016-11-14 Thread James Peach (JIRA)
James Peach created MESOS-6588:
--

 Summary: LinuxRoots misses required files
 Key: MESOS-6588
 URL: https://issues.apache.org/jira/browse/MESOS-6588
 Project: Mesos
  Issue Type: Bug
  Components: containerization, tests
Reporter: James Peach


The hard-coded list of required files in {{src/tests/containerizer/rootfs.hpp}} 
is out of date for Fedora 24. F24 now requires {{libtinfo.so.6}} and 
{{/lib64/libcrypto.so.10}}.







[jira] [Updated] (MESOS-6588) LinuxRootfs misses required files

2016-11-14 Thread James Peach (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach updated MESOS-6588:
---
Summary: LinuxRootfs misses required files  (was: LinuxRoots misses 
required files)

> LinuxRootfs misses required files
> -
>
> Key: MESOS-6588
> URL: https://issues.apache.org/jira/browse/MESOS-6588
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, tests
>Reporter: James Peach
>
> The hard-coded list of required files in 
> {{src/tests/containerizer/rootfs.hpp}} is out of date for Fedora 24. F24 now 
> requires {{libtinfo.so.6}} and {{/lib64/libcrypto.so.10}}.





[jira] [Assigned] (MESOS-6588) LinuxRootfs misses required files

2016-11-14 Thread James Peach (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach reassigned MESOS-6588:
--

Assignee: James Peach

> LinuxRootfs misses required files
> -
>
> Key: MESOS-6588
> URL: https://issues.apache.org/jira/browse/MESOS-6588
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, tests
>Reporter: James Peach
>Assignee: James Peach
>
> The hard-coded list of required files in 
> {{src/tests/containerizer/rootfs.hpp}} is out of date for Fedora 24. F24 now 
> requires {{libtinfo.so.6}} and {{/lib64/libcrypto.so.10}}.





[jira] [Commented] (MESOS-6588) LinuxRootfs misses required files

2016-11-14 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664889#comment-15664889
 ] 

James Peach commented on MESOS-6588:


I'm going to take a crack at using {{stout/elf.hpp}} to figure out the set of 
libraries we need to include to support the binaries that tests expect to be 
present in the test root.
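
A rough illustration of the idea of deriving the library list from the binaries instead of hard-coding it. This is only a sketch: it parses {{ldd}}-style text output rather than using {{stout/elf.hpp}}, and the function names {{parse_ldd}} and {{required_libraries}} are hypothetical, not part of Mesos.

```python
import re

# Matches "name => /path (0xADDR)" as well as a bare "/path (0xADDR)" line
# (the dynamic loader). Lines without an absolute path, such as the vdso
# entry or "=> not found", are ignored.
_LDD_LINE = re.compile(
    r"^\s*(?:\S+\s+=>\s+)?(/\S+)\s+\(0x[0-9a-fA-F]+\)\s*$"
)


def parse_ldd(output):
    """Return the absolute library paths listed in one ldd-style output."""
    paths = []
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


def required_libraries(ldd_outputs):
    """Union of library paths needed by several binaries, sorted."""
    needed = set()
    for output in ldd_outputs:
        needed.update(parse_ldd(output))
    return sorted(needed)
```

Computing the set per host would avoid updating a fixed list whenever a distribution changes a library version.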

> LinuxRootfs misses required files
> -
>
> Key: MESOS-6588
> URL: https://issues.apache.org/jira/browse/MESOS-6588
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, tests
>Reporter: James Peach
>Assignee: James Peach
>
> The hard-coded list of required files in 
> {{src/tests/containerizer/rootfs.hpp}} is out of date for Fedora 24. F24 now 
> requires {{libtinfo.so.6}} and {{/lib64/libcrypto.so.10}}.





[jira] [Comment Edited] (MESOS-6557) IPC namespace isolator

2016-11-14 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664897#comment-15664897
 ] 

James Peach edited comment on MESOS-6557 at 11/14/16 8:19 PM:
--

|Implement a namespace/ipc isolator. |https://reviews.apache.org/r/53688/ |
|Use a common fixture for the PID namespace test. |https://reviews.apache.org/r/53689/ |
|Add namespaces/ipc documentation. |https://reviews.apache.org/r/53690/ |


was (Author: jamespeach):
|Implement a namespace/ipc isolator. |https://reviews.apache.org/r/53688/ |
|Add namespaces/ipc documentation. |https://reviews.apache.org/r/53690/ |

> IPC namespace isolator
> --
>
> Key: MESOS-6557
> URL: https://issues.apache.org/jira/browse/MESOS-6557
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: James Peach
>Assignee: James Peach
>
> Add a {{namespace/ipc}} isolator for creating an IPC namespace.





[jira] [Commented] (MESOS-6577) Failed to run docker inspect

2016-11-14 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664910#comment-15664910
 ] 

Joseph Wu commented on MESOS-6577:
--

The flow of the DockerContainerizer is the following:

1. {{docker run --name  ...}}
2. {{docker inspect }}
3. Retry (2) until the {{StartedAt}} field is populated, or until it errors.

In your case, (1) is still in progress, but (2) immediately errors.  This means that 
the docker daemon is not processing the {{docker run}} until after the {{docker 
inspect}} has finished.  This is a tricky problem to solve, as the 
DockerContainerizer doesn't know when the effects of the {{docker run}} will 
become visible to {{docker inspect}}.
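
The polling flow above can be sketched as follows. This is a minimal illustration, not the DockerContainerizer code: {{wait_until_started}} and its {{inspect}} callable are hypothetical stand-ins for running {{docker inspect}}, and the callable is assumed to return {{None}} when Docker reports that the container does not exist and an empty {{StartedAt}} while the container has not started. Unlike the flow described above, the sketch treats a missing container as "not created yet" and keeps retrying, which is one possible mitigation rather than current Mesos behavior.

```python
import time
from typing import Callable, Optional


def wait_until_started(
    inspect: Callable[[], Optional[dict]],
    attempts: int = 10,
    delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `inspect` until the container state has a start time.

    `inspect` is a hypothetical wrapper around `docker inspect`; it returns
    the parsed container state, or None if the container is unknown.
    """
    for attempt in range(attempts):
        state = inspect()
        if state is not None and state.get("StartedAt"):
            return state
        # Missing container or empty StartedAt: wait and try again,
        # except after the final attempt.
        if attempt < attempts - 1:
            sleep(delay)
    raise TimeoutError("container did not start within the retry budget")
```

Bounding the number of attempts keeps a container that never appears from blocking the launch indefinitely.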

> Failed to run docker inspect
> 
>
> Key: MESOS-6577
> URL: https://issues.apache.org/jira/browse/MESOS-6577
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 1.0.1
> Environment: {code:none}
> core@kato-2 ~ $ cat /etc/kato.env 
>KATO_CLUSTER_ID=cell-1-dub
>KATO_QUORUM_COUNT=3
>KATO_ROLES='quorum master worker '
>KATO_HOST_NAME=kato
>KATO_HOST_ID=2
>KATO_ZK=quorum-1:2181,quorum-2:2181,quorum-3:2181
>
> KATO_ALERT_MANAGERS=http://master-1:9093,http://master-2:9093,http://master-3:9093
>KATO_DOMAIN=cell-1.dub.xnood.com
>KATO_MESOS_DOMAIN=cell-1.dub.mesos
>KATO_HOST_IP=10.136.64.12 
>KATO_QUORUM=2
>DOCKER_VERSION=1.12.3
> {code}
> {code:none}
> core@kato-2 ~ $ cat /etc/systemd/system/mesos-agent.service
> [Unit]
> Description=Mesos agent
> After=go-dnsmasq.service
> [Service]
> Slice=machine.slice
> Restart=always
> RestartSec=10
> TimeoutStartSec=0
> KillMode=mixed
> EnvironmentFile=/etc/kato.env
> ExecStartPre=/usr/bin/sh -c "[ -d /var/lib/mesos/agent ] || mkdir -p 
> /var/lib/mesos/agent"
> ExecStartPre=/usr/bin/sh -c "[ -d /etc/certs ] || mkdir -p /etc/certs"
> ExecStartPre=/usr/bin/sh -c "[ -d /etc/cni ] || mkdir -p /etc/cni"
> ExecStartPre=/opt/bin/zk-alive ${KATO_QUORUM_COUNT}
> ExecStartPre=/usr/bin/rkt fetch quay.io/kato/mesos:v1.0.1-${DOCKER_VERSION}-2
> ExecStartPre=/usr/bin/docker pull 
> quay.io/kato/mesos:v1.0.1-${DOCKER_VERSION}-2
> ExecStart=/usr/bin/rkt run \
>  --net=host \
>  --dns=host \
>  --hosts-entry=host \
>  --volume cni,kind=host,source=/etc/cni \
>  --mount volume=cni,target=/etc/cni \
>  --volume certs,kind=host,source=/etc/certs \
>  --mount volume=certs,target=/etc/certs \
>  --volume docker,kind=host,source=/var/run/docker.sock \
>  --mount volume=docker,target=/var/run/docker.sock \
>  --volume data,kind=host,source=/var/lib/mesos \
>  --mount volume=data,target=/var/lib/mesos \
>  --stage1-name=coreos.com/rkt/stage1-fly \
>  quay.io/kato/mesos:v1.0.1-${DOCKER_VERSION}-2 --exec /usr/sbin/mesos-agent 
> -- \
>  --no-systemd_enable_support \
>  --docker_mesos_image=quay.io/kato/mesos:v1.0.1-${DOCKER_VERSION}-2 \
>  --hostname=worker-${KATO_HOST_ID}.${KATO_DOMAIN} \
>  --ip=${KATO_HOST_IP} \
>  --containerizers=docker \
>  --executor_registration_timeout=2mins \
>  --master=zk://${KATO_ZK}/mesos \
>  --work_dir=/var/lib/mesos/agent \
>  --log_dir=/var/log/mesos/agent \
>  --network_cni_config_dir=/etc/cni \
>  --network_cni_plugins_dir=/var/lib/mesos/cni-plugins
> [Install]
> WantedBy=kato.target
> {code}
> {code:none}
> core@kato-2 ~ $ docker version
> Client:
>  Version:  1.12.3
>  API version:  1.24
>  Go version:   go1.6.3
>  Git commit:   34a2ead
>  Built:
>  OS/Arch:  linux/amd64
> Server:
>  Version:  1.12.3
>  API version:  1.24
>  Go version:   go1.6.3
>  Git commit:   34a2ead
>  Built:
>  OS/Arch:  linux/amd64
> {code}
>Reporter: Marc Villacorta
>
> I am running a _rocketized_ mesos agent.
> I am using the docker containerizer.
> My executors are _dockerized_.
> The very first time I deploy a sample platform I get some errors like the one 
> below:
> {code:none}
> Failed to launch container: Failed to run 'docker -H 
> unix:///var/run/docker.sock inspect 
> mesos-84a9df2b-be0e-459e-afc9-b95d4e8ced57-S0.0116a0a2-ccaf-4f1a-846c-361ec4e4a179':
>  exited with status 1; stderr='Error: No such image, container or task: 
> mesos-84a9df2b-be0e-459e-afc9-b95d4e8ced57-S0.0116a0a2-ccaf-4f1a-846c-361ec4e4a179
>  '
> {code}
> But when I check with {{docker ps}} I can see the supposedly missing 
> container and I can even successfully run {{docker inspect}} on it. Then 
> marathon reschedules and I get a duplicate. Neither mesos nor marathon lists 
> any duplicate (only docker does).
> Restarting the mesos-agent wipes out the reported missing container leaving 
> the other ones alive.
> When all my nodes have the docker image layers cached I can deploy the sample 
> platform smoothly and I don't get the previous errors.
> If a container needs a remote volume attached (EBS via REX-Ray) the error 
> happens all 

[jira] [Commented] (MESOS-6580) why not to set gpus for "set" type, I think it is import, if I use both mesos containers and docker daemon running docker

2016-11-14 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664918#comment-15664918
 ] 

Joseph Wu commented on MESOS-6580:
--

Can you clarify what you mean?  i.e. a meaningful summary, examples, 
descriptions, etc.

> why not to set gpus for "set" type, I think it is import, if I use both mesos 
> containers and docker daemon running docker
> -
>
> Key: MESOS-6580
> URL: https://issues.apache.org/jira/browse/MESOS-6580
> Project: Mesos
>  Issue Type: Bug
>Reporter: yongyu
>






[jira] [Updated] (MESOS-6443) Display maintenance information in the webui.

2016-11-14 Thread Tomasz Janiszewski (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Janiszewski updated MESOS-6443:
--
Attachment: mesos_webui_maintenance_schedule.png

https://reviews.apache.org/r/53741/


> Display maintenance information in the webui.
> -
>
> Key: MESOS-6443
> URL: https://issues.apache.org/jira/browse/MESOS-6443
> Project: Mesos
>  Issue Type: Improvement
>  Components: webui
>Reporter: Tomasz Janiszewski
>Assignee: Tomasz Janiszewski
>Priority: Minor
> Attachments: mesos_webui_maintenance_schedule.png
>
>
> Add new tab with Maintenance schedule.





[jira] [Updated] (MESOS-6588) LinuxRootfs misses required files

2016-11-14 Thread James Peach (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach updated MESOS-6588:
---
Shepherd: Yan Xu

> LinuxRootfs misses required files
> -
>
> Key: MESOS-6588
> URL: https://issues.apache.org/jira/browse/MESOS-6588
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, tests
>Reporter: James Peach
>Assignee: James Peach
>
> The hard-coded list of required files in 
> {{src/tests/containerizer/rootfs.hpp}} is out of date for Fedora 24. F24 now 
> requires {{libtinfo.so.6}} and {{/lib64/libcrypto.so.10}}.





[jira] [Commented] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665735#comment-15665735
 ] 

Joseph Wu commented on MESOS-6586:
--

Overall, this sounds like a reasonable thing for the master to do, and for the 
operator to expect.

Even without adding an additional {{Event}}, we could potentially implement 
this as an {{Event::ERROR}}.  The expected behavior of a scheduler when it 
receives an {{ERROR}} is to abort.  If we implement it this way, non HTTP-API 
frameworks would terminate too, as there is an existing {{error}} callback.

Note: There is an existing feature request (MESOS-6419) for {{/teardown}} to 
work with unregistered frameworks (i.e. orphans).
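
To illustrate the suggestion, here is a toy scheduler that aborts when it receives an error event. All names ({{EventType}}, {{Event}}, {{SketchScheduler}}) are hypothetical and only mirror the shape of the idea; this is not the Mesos scheduler API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    OFFERS = "OFFERS"
    ERROR = "ERROR"


@dataclass
class Event:
    type: EventType
    message: str = ""


class SketchScheduler:
    """Toy scheduler used only to show the abort-on-ERROR behavior."""

    def __init__(self) -> None:
        self.running = True
        self.abort_reason: Optional[str] = None

    def error(self, message: str) -> None:
        # Plays the role of an `error` callback: record why and stop.
        self.abort_reason = message
        self.running = False

    def on_event(self, event: Event) -> None:
        if event.type is EventType.ERROR:
            self.error(event.message)
        # Other event types would be handled here.
```

If the master sent such an event on teardown, a scheduler written this way would stop without any new callback being added.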

> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {{/teardown}} endpoint is:
> - Removing the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framework, e.g. by using the DC/OS UI.
> A possible solution would be to provide an additional callback {{teardown}} 
> at the scheduler API that will notify the framework that the mesos-master has 
> initiated a teardown. Mesos-master should only mark the framework as removed 
> if the framework has been successfully terminated, e.g. the framework could 
> send a message to mesos-master indicating that the termination was successful 
> / has been started.
> This change will also affect the {{dcos service shutdown}} command which uses 
> the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
> {{dcos service shutdown service-id}} command shuts down all components of the 
> framework, not only the executors and tasks.
> Tested on DC/OS with the frameworks conductr and elasticsearch.





[jira] [Commented] (MESOS-6567) Actively Scan for CNI Configurations

2016-11-14 Thread Dan Osborne (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665861#comment-15665861
 ] 

Dan Osborne commented on MESOS-6567:


I've reproduced this on 1.0.1 and can confirm that new CNI configurations are 
not scanned and picked up at runtime; picking them up requires a restart of the 
slave process.

I think it's because scanning of CNI networks happens in 
NetworkCniIsolatorProcess::create, which I believe runs at startup, and not in 
NetworkCniIsolatorProcess::Isolate, which runs when a container is launched.
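
The difference between scanning once at startup and scanning on every lookup can be sketched like this. It is an illustration only, not the isolator's implementation: {{load_cni_networks}}, {{StartupScan}} and {{PerTaskScan}} are hypothetical names, and the sketch assumes each CNI configuration is a JSON file with a top-level {{name}} field.

```python
import json
from pathlib import Path


def load_cni_networks(config_dir):
    """Return {network name: parsed config} for the JSON files in config_dir."""
    networks = {}
    for path in sorted(Path(config_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            config = json.loads(path.read_text())
        except (OSError, ValueError):
            continue  # Skip unreadable or malformed files.
        if isinstance(config, dict) and "name" in config:
            networks[config["name"]] = config
    return networks


class StartupScan:
    """Reads the directory once; later changes stay invisible."""

    def __init__(self, config_dir):
        self.networks = load_cni_networks(config_dir)

    def lookup(self, name):
        return self.networks.get(name)


class PerTaskScan:
    """Re-reads the directory on every lookup, so new networks appear."""

    def __init__(self, config_dir):
        self.config_dir = config_dir

    def lookup(self, name):
        return load_cni_networks(self.config_dir).get(name)
```

Rescanning costs a directory read per lookup; whether that is acceptable depends on how often tasks are networked.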

> Actively Scan for CNI Configurations
> 
>
> Key: MESOS-6567
> URL: https://issues.apache.org/jira/browse/MESOS-6567
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Dan Osborne
>
> Mesos-Agent currently loads the CNI configs into memory at startup. After 
> this point, new configurations that are added will remain unknown to the 
> Mesos Agent process until it is restarted.
> This ticket is to request that the Mesos Agent process scan the CNI config 
> directory each time it is networking a task, so that modifying, adding, and 
> removing networks will not require a slave reboot.





[jira] [Commented] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666324#comment-15666324
 ] 

Markus Jura commented on MESOS-6586:


The {{Event::ERROR}} would be a possibility. However, the event only has a 
string argument, which serves as the error message. To distinguish between 
various errors, it would be desirable to extend the error event with an error 
type, similar to how {{TaskStatus}} has a {{TaskStatus.Reason}} type.
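
A small sketch of what a typed error event could look like. The names {{ErrorReason}}, {{ErrorEvent}} and {{should_shut_down_cleanly}} are hypothetical and do not exist in Mesos; they only show how a reason field lets a framework react differently to a teardown than to other errors.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ErrorReason(Enum):
    UNKNOWN = auto()
    FRAMEWORK_TORN_DOWN = auto()


@dataclass(frozen=True)
class ErrorEvent:
    message: str
    # Defaults to UNKNOWN so senders that only set a message still work.
    reason: ErrorReason = ErrorReason.UNKNOWN


def should_shut_down_cleanly(event: ErrorEvent) -> bool:
    """True when the error signals an operator-initiated teardown."""
    return event.reason is ErrorReason.FRAMEWORK_TORN_DOWN
```

With only a message string, the framework would have to parse text to make the same decision.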

> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {{/teardown}} endpoint is:
> - Removing the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framework, e.g. by using the DC/OS UI.
> A possible solution would be to provide an additional callback {{teardown}} 
> at the scheduler API that will notify the framework that the mesos-master has 
> initiated a teardown. Mesos-master should only mark the framework as removed 
> if the framework has been successfully terminated, e.g. the framework could 
> send a message to mesos-master indicating that the termination was successful 
> / has been started.
> This change will also affect the {{dcos service shutdown}} command which uses 
> the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
> {{dcos service shutdown service-id}} command shuts down all components of the 
> framework, not only the executors and tasks.
> Tested on DC/OS with the frameworks conductr and elasticsearch.





[jira] [Commented] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666423#comment-15666423
 ] 

Markus Jura commented on MESOS-6586:


The issue https://issues.apache.org/jira/browse/MESOS-6136 relates to this 
issue. Generally, I'd expect that the {{/teardown}} endpoint removes all state 
and entities of the framework, so that it is possible to reuse the same 
framework id once a framework has been removed.

> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {{/teardown}} endpoint is:
> - Removing the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framework, e.g. by using the DC/OS UI.
> A possible solution would be to provide an additional callback {{teardown}} 
> at the scheduler API that will notify the framework that the mesos-master has 
> initiated a teardown. Mesos-master should only mark the framework as removed 
> if the framework has been successfully terminated, e.g. the framework could 
> send a message to mesos-master indicating that the termination was successful 
> / has been started.
> This change will also affect the {{dcos service shutdown}} command which uses 
> the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
> {{dcos service shutdown service-id}} command shuts down all components of the 
> framework, not only the executors and tasks.
> Tested on DC/OS with the frameworks conductr and elasticsearch.





[jira] [Updated] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jura updated MESOS-6586:
---
Description: 
The Mesos {{/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

This change will also affect the {{dcos service shutdown}} command which uses 
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
{{dcos service shutdown service-id}} command shuts down all components of the 
framework, not only the executors and tasks.

Also, for consistency reasons I'd expect that this shutdown action can also be 
taken by using the DC/OS UI. So far on DC/OS, you can only {{Suspend}} a 
service / framework which will stop the framework instances, but will not 
remove the framework from mesos-master and terminate its executors. 

Tested on DC/OS with the frameworks conductr and elasticsearch.

  was:
The Mesos {{/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

This change will also affect the {{dcos service shutdown}} command which uses 
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
{{dcos service shutdown service-id}} command shuts down all components of the 
framework, not only the executors and tasks.

Tested on DC/OS with the frameworks conductr and elasticsearch.


> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {{/teardown}} endpoint is:
> - Removing the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framework, e.g. by using the DC/OS UI.
> A possible solution would be to provide an additional callback {{teardown}} 
> at the scheduler API that will notify the framework that the mesos-master has 
> initiated a teardown. Mesos-master should only mark the framework as removed 
> if the framework has been successfully terminated, e.g. the framework could 
> send a message to mesos-master indicating that the termination was successful 
> / has been started.
> This change will also affect the {{dcos service shutdown}} command which uses 
> the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
> {{dcos service shutdown service-id}} command shuts down all components of the 
> framework, not only the executors and tasks.
> Also, for consistency reasons I'd expect that this shutdown action can also 
> be taken by using the DC/OS UI. So far on DC/OS, you can only {{Suspend}} a 
> service / framework which will stop the framework instances, but will not 
> remove the framework from mesos-master and terminate its executors. 
> Tested on DC/OS with the frameworks conductr an

[jira] [Updated] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jura updated MESOS-6586:
---
Description: 
The Mesos {{/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

This change will also affect the {{dcos service shutdown}} command which uses 
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
{{dcos service shutdown service-id}} command shuts down all components of the 
framework, not only the executors and tasks.

Also, for consistency reasons I'd expect that this shutdown action can also be 
taken by using the DC/OS UI. So far on DC/OS, you can only {{Suspend}} a 
service / framework which will stop the framework instances, but will not 
remove the framework from mesos-master and terminate its executors. As far as 
I am aware there is no documentation that explains in detail the difference 
between the {{shutdown}} command in the DC/OS CLI and the {{Suspend}} button on 
the DC/OS UI. A user should clearly understand what these actions do to 
the system, especially if they are not consistent. Again, I'd recommend 
adding a new button to the DC/OS UI that uses the {{/teardown}} endpoint.

Tested on DC/OS with the frameworks conductr and elasticsearch.

  was:
The Mesos {{/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

This change will also affect the {{dcos service shutdown}} command which uses 
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
{{dcos service shutdown service-id}} command shuts down all components of the 
framework, not only the executors and tasks.

Also, for consistency reasons I'd expect that this shutdown action can also be 
taken by using the DC/OS UI. So far on DC/OS, you can only {{Suspend}} a 
service / framework which will stop the framework instances, but will not 
remove the framework from mesos-master and terminate it's executors. As far as 
I am aware there is no documentation that explains in detail the difference 
between the {{shutdown}} command in the DC/OS CLI and the {{Suspend}} button on 
the DC/OS UI. A user should carefully understand what these actions are doing 
with the system, especially if they are not consistent. Again, I'd recommend to 
add a new button to the DC/OS UI that uses the {{/teardown}} endpoint.

Tested on DC/OS with the frameworks conductr and elasticsearch.


> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {{/teardown}} endpoint is:
> - Removing the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framewo

[jira] [Updated] (MESOS-6586) Teardown endpoint should remove framework

2016-11-14 Thread Markus Jura (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jura updated MESOS-6586:
---
Description: 
The Mesos {{/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

This change will also affect the {{dcos service shutdown}} command which uses 
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
{{dcos service shutdown service-id}} command shuts down all components of the 
framework, not only the executors and tasks.

Also, for consistency reasons I'd expect that this shutdown action can also be 
taken by using the DC/OS UI. So far on DC/OS, you can only {{Suspend}} a 
service / framework which will stop the framework instances, but will not 
remove the framework from mesos-master and terminate its executors. As far as 
I am aware there is no documentation that explains in detail the difference 
between the {{shutdown}} command in the DC/OS CLI and the {{Suspend}} button on 
the DC/OS UI. A user should clearly understand what these actions do to 
the system, especially if they are not consistent. Again, I'd recommend 
adding a new button to the DC/OS UI that uses the {{/teardown}} endpoint.

Tested on DC/OS with the frameworks conductr and elasticsearch.

  was:
The Mesos {{/teardown}} endpoint is:
- Removing the framework on the mesos-master. As a result, the framework is in 
state {{removed}}
- Shuts down all executors and tasks running on the Mesos agents

However, I'd also expect that a message from the mesos-master is sent to the 
framework (Scheduler API) so that the framework processes can initiate a 
shutdown as well. This is not the case. As a result, it is necessary to 
manually {{suspend}} the framework, e.g. by using the DC/OS UI.

A possible solution would be to provide an additional callback {{teardown}} at 
the scheduler API that will notify the framework that the mesos-master has 
initiated a teardown. Mesos-master should only mark the framework as removed if 
the framework has been successfully terminated, e.g. the framework could send a 
message to mesos-master indicating that the termination was successful / has 
been started.

This change will also affect the {{dcos service shutdown}} command which uses 
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the 
{{dcos service shutdown service-id}} command shuts down all components of the 
framework, not only the executors and tasks.

Also, for consistency reasons I'd expect that this shutdown action can also be 
taken by using the DC/OS UI. So far on DC/OS, you can only {{Suspend}} a 
service / framework which will stop the framework instances, but will not 
remove the framework from mesos-master and terminate its executors. 

Tested on DC/OS with the frameworks conductr and elasticsearch.


> Teardown endpoint should remove framework
> -
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli, framework api, HTTP API
>Affects Versions: 1.0.1
>Reporter: Markus Jura
>  Labels: features
>
> The Mesos {{/teardown}} endpoint is:
> - Removing the framework on the mesos-master. As a result, the framework is 
> in state {{removed}}
> - Shuts down all executors and tasks running on the Mesos agents
> However, I'd also expect that a message from the mesos-master is sent to the 
> framework (Scheduler API) so that the framework processes can initiate a 
> shutdown as well. This is not the case. As a result, it is necessary to 
> manually {{suspend}} the framework, e.g. by using the DC/OS UI.
> A possible solution would be to provide an additional callback {{teardown}} 
> at the scheduler API that will notify the framework that the mesos-master has 
> initiated a teardown. Mesos-master should only mark the framework as removed 
> if the framework has been successfully terminated, e.g. the framework could 
> send a message to mesos-master indicating that