[jira] [Updated] (MESOS-5960) Design doc for supporting seccomp in Mesos container

2016-08-01 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-5960:
---
   Assignee: Jay Guo
Component/s: containerization
 Issue Type: Task  (was: Bug)

> Design doc for supporting seccomp in Mesos container
> 
>
> Key: MESOS-5960
> URL: https://issues.apache.org/jira/browse/MESOS-5960
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jay Guo
>Assignee: Jay Guo
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5960) Design doc for supporting seccomp in Mesos container

2016-08-01 Thread Jay Guo (JIRA)
Jay Guo created MESOS-5960:
--

 Summary: Design doc for supporting seccomp in Mesos container
 Key: MESOS-5960
 URL: https://issues.apache.org/jira/browse/MESOS-5960
 Project: Mesos
  Issue Type: Bug
Reporter: Jay Guo








[jira] [Commented] (MESOS-5041) Add cgroups unified isolator

2016-08-01 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403377#comment-15403377
 ] 

Jie Yu commented on MESOS-5041:
---

commit 9567fb42062774ca841da8cbfb45119bc30a0a4d
Author: haosdent huang 
Date:   Mon Aug 1 22:03:41 2016 -0700

Implemented `CgroupsIsolatorProcess::cleanup`.

Review: https://reviews.apache.org/r/49827/

> Add cgroups unified isolator
> 
>
> Key: MESOS-5041
> URL: https://issues.apache.org/jira/browse/MESOS-5041
> Project: Mesos
>  Issue Type: Task
>  Components: cgroups, isolation
>Reporter: haosdent
>Assignee: haosdent
> Fix For: 1.1.0
>
>
> Implement the cgroups unified isolator for Mesos containerizer.





[jira] [Comment Edited] (MESOS-5923) Ubuntu 14.04 LTS GPU Isolator "/run" directory is noexec

2016-08-01 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402306#comment-15402306
 ] 

Jie Yu edited comment on MESOS-5923 at 8/2/16 4:51 AM:
---

commit 48a492cd9d7d0a194735b9b4107a35b489c596e1
Author: Kevin Klues 
Date:   Mon Aug 1 09:06:07 2016 -0700

Updated NvidiaVolume to mount as 'tmpfs' if parent fs is 'noexec'.

This patch is in response to an issue we ran into on Ubuntu 14.04,
where '/run' is being mounted as 'noexec' (MESOS-5923). Since our
NvidiaVolume is created below this mount point, we are unable to
execute any binaries we add to this volume. This causes problems, for
example, when trying to execute 'nvidia-smi' from within a container
that has this volume mounted in.

To work around this issue, we detect if any mount point above the path
where we create the volume is marked as 'noexec', and if so, we create
a new 'tmpfs' mount for the volume without 'noexec' set.

Review: https://reviews.apache.org/r/50592/
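The detection step the commit describes can be sketched as follows. This is a minimal Python illustration, assuming the mount table is supplied as (mount point, options) pairs such as those parsed from /proc/self/mounts; it is not the actual Mesos C++ code.

```python
import posixpath

def has_noexec_ancestor(path, mounts):
    """Return True if any mount point that is an ancestor of `path`
    (or `path` itself) carries the 'noexec' option.

    `mounts` is a list of (mount_point, options) tuples, mirroring
    entries from /proc/self/mounts."""
    path = posixpath.normpath(path)
    for mount_point, options in mounts:
        mp = posixpath.normpath(mount_point)
        # A mount point is an ancestor if the path equals it or sits below it.
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            if "noexec" in options.split(","):
                return True
    return False
```

With the mount table from the bug report, a volume under /run would be flagged, triggering the fresh 'tmpfs' mount without 'noexec'.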



was (Author: jieyu):
commit 48a492cd9d7d0a194735b9b4107a35b489c596e1
Author: Kevin Klues 
Date:   Mon Aug 1 09:06:07 2016 -0700

Updated NvidiaVolume to mount as 'tmpfs' if parent fs is 'noexec'.

This patch is in response to an issue we ran into on Ubuntu 14.04,
where '/run' is being mounted as 'noexec' (MESOS-5923). Since our
NvidiaVolume is created below this mount point, we are unable to
execute any binaries we add to this volume. This causes problems, for
example, when trying to execute 'nvidia-smi' from within a container
that has this volume mounted in.

To work around this issue, we detect if any mount point above the path
where we create the volume is marked as 'noexec', and if so, we create
a new 'tmpfs' mount for the volume without 'noexec' set.

Review: https://reviews.apache.org/r/50592/

commit ad1f610508ca669b32b1cb7a4d5baf5f3b337b70
Author: Kevin Klues 
Date:   Mon Aug 1 09:06:04 2016 -0700

Added check for root permissions to 'NvidiaVolume::create()'.

Review: https://reviews.apache.org/r/50644/

> Ubuntu 14.04 LTS GPU Isolator "/run" directory is noexec
> 
>
> Key: MESOS-5923
> URL: https://issues.apache.org/jira/browse/MESOS-5923
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
> Environment: Ubuntu 14.04 LTS
>Reporter: Bill Zhao
>Assignee: Kevin Klues
>  Labels: gpu, mesosphere
> Fix For: 1.0.1
>
>
> On Ubuntu 14.04 LTS, the /run directory is mounted noexec. This affects the 
> {{/var/run/mesos/isolators/gpu/nvidia_352.63/bin}} directory, which the Mesos 
> GPU isolator depends on.
> {{bill@billz:/var/run$ mount | grep noexec
> proc on /proc type proc (rw,noexec,nosuid,nodev)
> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
> tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)}}
> /var/run is a symlink to /run:
> {{bill@billz:/var$ ll
> total 52
> drwxr-xr-x 13 root root 4096 May  5 20:00 ./
> drwxr-xr-x 27 root root 4096 Jul 14 17:29 ../
> lrwxrwxrwx  1 root root9 May  5 19:50 lock -> /run/lock/
> drwxrwxr-x 19 root syslog   4096 Jul 28 08:00 log/
> drwxr-xr-x  2 root root 4096 Aug  4  2015 opt/
> lrwxrwxrwx  1 root root4 May  5 19:50 run -> /run/}}
> The current workaround is to remount without noexec:
> {{sudo mount -o remount,exec /run}}





[jira] [Commented] (MESOS-5388) MesosContainerizerLaunch flags execute arbitrary commands via shell

2016-08-01 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403069#comment-15403069
 ] 

Jie Yu commented on MESOS-5388:
---

commit 9c6097f063405279efc07eec22457c2059653f07
Author: Gilbert Song 
Date:   Mon Aug 1 17:07:00 2016 -0700

Updated filesystem linux isolator pre exec commands to be non-shell.

Review: https://reviews.apache.org/r/50216/

> MesosContainerizerLaunch flags execute arbitrary commands via shell
> ---
>
> Key: MESOS-5388
> URL: https://issues.apache.org/jira/browse/MESOS-5388
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: James DeFelice
>Assignee: Gilbert Song
>  Labels: mesosphere, security
>
> For example, the docker volume isolator's containerPath is appended (without 
> sanitation) to a command that's executed in this manner. As such, it's 
> possible to inject arbitrary shell commands to be executed by mesos.
> https://github.com/apache/mesos/blob/17260204c833c643adf3d8f36ad8a1a606ece809/src/slave/containerizer/mesos/launch.cpp#L206
> Perhaps instead of strings these commands could/should be sent as string 
> arrays that could be passed as argv arguments w/o shell interpretation?
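The argv-based approach proposed above can be illustrated with a small Python sketch, using subprocess as a stand-in for the launcher; the path and injected suffix are made-up values, not from the actual exploit.

```python
import subprocess

# Attacker-controlled value, e.g. a containerPath from a volume spec.
container_path = "/mnt/test; touch /tmp/pwned"

# Vulnerable pattern: interpolating the value into one shell string means
# the "; touch ..." suffix would run as a second command under `sh -c`.
shell_cmd = "echo mounting " + container_path  # do NOT run via a shell

# Safer pattern: pass the value as a single argv element; exec receives it
# verbatim, with no shell interpretation of ';', '&&', '$()', etc.
argv = ["echo", "mounting", container_path]
result = subprocess.run(argv, capture_output=True, text=True)
print(result.stdout)  # the ';' stays literal text; nothing extra executes
```

This is exactly the string-array-as-argv idea the reporter suggests: the kernel's exec never re-parses the arguments.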





[jira] [Updated] (MESOS-3548) Investigate federations of Mesos masters

2016-08-01 Thread Dhilip Kumar S (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dhilip Kumar S updated MESOS-3548:
--
Description: 
In a large Mesos installation, the operator might want to ensure that even if 
the Mesos masters are inaccessible or failed, new tasks can still be scheduled 
(across multiple different frameworks). HA masters are only a partial solution 
here: the masters might still be inaccessible due to a correlated failure 
(e.g., Zookeeper misconfiguration/human error).

To support this, we could support the notion of "hierarchies" or "federations" 
of Mesos masters. In a Mesos installation with 10k machines, the operator might 
configure 10 Mesos masters (each of which might be HA) to manage 1k machines 
each. Then an additional "meta-Master" would manage the allocation of cluster 
resources to the 10 masters. Hence, the failure of any individual master would 
impact 1k machines at most. The meta-master might not have a lot of work to do: 
e.g., it might be limited to occasionally reallocating cluster resources among 
the 10 masters, or ensuring that newly added cluster resources are allocated 
among the masters as appropriate. Hence, the failure of the meta-master would 
not prevent any of the individual masters from scheduling new tasks. A single 
framework instance probably wouldn't be able to use more resources than have 
been assigned to a single Master, but that seems like a reasonable restriction.

This feature might also be a good fit for a multi-datacenter deployment of 
Mesos: each Mesos master instance would manage a single DC. Naturally, reducing 
the traffic between frameworks and the meta-master would be important for 
performance reasons in a configuration like this.

Operationally, this might be simpler if Mesos processes were self-hosting 
([MESOS-3547]).

Initial design document: 
https://docs.google.com/document/d/1U4IY_ObAXUPhtTa-0Rw_5zQxHDRnJFe5uFNOQ0VUcLg/edit#
Initial survey: https://goo.gl/forms/DpVRV9Zh3kunhJkP2
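The meta-master's role sketched above (dividing cluster resources among sub-masters so a single failure touches at most 1k machines) can be modeled as a toy partition. The round-robin policy and names here are illustrative assumptions only, not part of the proposed design.

```python
def partition_agents(agents, masters):
    """Round-robin assignment of agents to masters: a toy model of the
    meta-master occasionally (re)allocating cluster resources among the
    sub-masters described above."""
    assignment = {m: [] for m in masters}
    for i, agent in enumerate(agents):
        assignment[masters[i % len(masters)]].append(agent)
    return assignment
```

With 10k agents and 10 masters, each master manages 1k agents, so losing one master affects at most 1k machines.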



> Investigate federations of Mesos masters
> 
>
> Key: MESOS-3548
> URL: https://issues.apache.org/jira/browse/MESOS-3548
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Neil Conway
>Assignee: Dhilip Kumar S
>  Labels: federation, mesosphere, multi-dc
>
> In a large Mesos installation, the operator might want to ensure that even if 
> the Mesos masters are inaccessible or failed, new tasks can still be 
> scheduled (across multiple different frameworks). HA masters are only a 
> partial solution here: the masters might still be inaccessible due to a 
> correlated failure (e.g., Zookeeper misconfiguration/human error).
> To support this, we could support the notion of "hierarchies" or 
> "federations" of Mesos masters. In a Mesos installation with 10k machines, 
> the operator might configure 10 Mesos masters (each of which might be HA) to 
> manage 1k machines each. Then an additional "meta-Master" would manage the 
> allocation of cluster resources to the 10 masters. Hence, the failure of any 
> individual master would impact 1k machines at most. The meta-master might not 
> have a lot of work to do.

[jira] [Commented] (MESOS-4862) Setting failover_timeout in FrameworkInfo to Double.MAX_VALUE causes it to be set to zero

2016-08-01 Thread Steven Schlansker (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403028#comment-15403028
 ] 

Steven Schlansker commented on MESOS-4862:
--

Is this a duplicate of https://issues.apache.org/jira/browse/MESOS-1575 ?

> Setting failover_timeout in FrameworkInfo to Double.MAX_VALUE causes it to be 
> set to zero
> -
>
> Key: MESOS-4862
> URL: https://issues.apache.org/jira/browse/MESOS-4862
> Project: Mesos
>  Issue Type: Bug
>  Components: master, stout
>Reporter: Timothy Chen
>
> Currently we expose the framework failover_timeout as a double in the proto, 
> and if users set failover_timeout to Double.MAX_VALUE, the master will 
> actually set it to zero, which is the complete opposite of the original 
> intent.
> The problem is that stout/duration.hpp stores durations as nanoseconds in an 
> int64_t, so Double.MAX_VALUE goes out of the int64_t bounds and produces an 
> error.
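The overflow can be sketched as a bounds check on the seconds-to-nanoseconds conversion. This is a Python illustration of the int64 limit involved, not stout's actual code.

```python
import sys

INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)

def seconds_to_ns(seconds):
    """Convert seconds (float) to integer nanoseconds, rejecting values
    outside the int64_t range instead of silently misbehaving."""
    ns = seconds * 1e9
    if ns > INT64_MAX or ns < INT64_MIN:
        raise OverflowError("duration out of int64 nanosecond range")
    return int(ns)
```

Double.MAX_VALUE seconds is about 1.8e308, vastly beyond the ~9.2e18 nanoseconds an int64_t can hold, so the conversion must be rejected rather than wrapped to zero.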





[jira] [Commented] (MESOS-4992) sandbox uri does not work outside mesos http server

2016-08-01 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403013#comment-15403013
 ] 

Benjamin Mahler commented on MESOS-4992:


It looks like this was done intentionally:

{code}
// When navigating directly to this page, e.g. pasting the URL into the
// browser, the previous page is not a page in Mesos. In that case, navigate
// home.
if (!$scope.agents) {
  $alert.danger({
message: "Navigate to the agent's sandbox via the Mesos UI.",
title: "Failed to find agents."
  });
  return $location.path('/').replace();
}
{code}

From here: 
https://github.com/apache/mesos/blob/8dc71da12c9b91edd2fa6c7b9a0a088b7dbb0ad3/src/webui/master/static/js/controllers.js#L751-L760

Looking at the 
[commit|https://github.com/apache/mesos/commit/270b7594c8eb3dd0d4db8461c3ee1108fa16b45d]
 and the code, it's not clear to me why this check was introduced since it is 
not being done in the other controllers.

> sandbox uri does not work outside mesos http server
> ---
>
> Key: MESOS-4992
> URL: https://issues.apache.org/jira/browse/MESOS-4992
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 0.27.1
>Reporter: Stavros Kontopoulos
>  Labels: mesosphere
>
> The sandbox URI of a framework does not work if it is copied and pasted 
> directly into the browser.
> For example the following sandbox uri:
> http://172.17.0.1:5050/#/slaves/50f87c73-79ef-4f2a-95f0-b2b4062b2de6-S0/frameworks/50f87c73-79ef-4f2a-95f0-b2b4062b2de6-0009/executors/driver-20160321155016-0001/browse
> should redirect to:
> http://172.17.0.1:5050/#/slaves/50f87c73-79ef-4f2a-95f0-b2b4062b2de6-S0/browse?path=%2Ftmp%2Fmesos%2Fslaves%2F50f87c73-79ef-4f2a-95f0-b2b4062b2de6-S0%2Fframeworks%2F50f87c73-79ef-4f2a-95f0-b2b4062b2de6-0009%2Fexecutors%2Fdriver-20160321155016-0001%2Fruns%2F60533483-31fb-4353-987d-f3393911cc80
> yet it fails with the message:
> "Failed to find slaves.
> Navigate to the slave's sandbox via the Mesos UI."
> and redirects to:
> http://172.17.0.1:5050/#/
> It is an issue for me because I am extending the Mesos Spark UI with sandbox 
> URIs. The alternative is to fetch the agent info, parse the JSON file there, 
> and extract the executor paths, which is neither straightforward nor elegant.
> Moreover, I don't see the runs/container_id in the Mesos proto API. I assume 
> this is hidden info; it is the piece of information needed to rewrite the URI 
> without redirection.
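The redirect target in the report can be illustrated with a small Python sketch of how the sandbox path gets percent-encoded into the browse URL. `sandbox_browse_url` is a hypothetical helper for illustration, not a Mesos API.

```python
from urllib.parse import quote

def sandbox_browse_url(master, agent_id, sandbox_path):
    """Build the webui browse URL for an agent sandbox path, with the
    path percent-encoded as a single query value (so '/' becomes %2F,
    as in the redirect URL quoted in the report)."""
    return "%s/#/slaves/%s/browse?path=%s" % (
        master, agent_id, quote(sandbox_path, safe=""))
```

This is essentially what the webui's redirect computes; the missing piece the reporter notes is the runs/container_id component of the path.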





[jira] [Commented] (MESOS-5959) All non-root tests fail on GPU machine

2016-08-01 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402848#comment-15402848
 ] 

Kevin Klues commented on MESOS-5959:


https://reviews.apache.org/r/50671/
https://reviews.apache.org/r/50672/

> All non-root tests fail on GPU machine
> --
>
> Key: MESOS-5959
> URL: https://issues.apache.org/jira/browse/MESOS-5959
> Project: Mesos
>  Issue Type: Bug
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: gpu, mesosphere
> Fix For: 1.0.1
>
>
> A recent addition to ensure that {{NvidiaVolume::create()}} ran as root broke 
> all non-root tests on GPU machines. The reason is that we unconditionally 
> create this volume so long as we detect {{nvml.isAvailable()}} which will 
> fail now that we are only allowed to create this volume if we have root 
> permissions.
> We should fix this by adding the proper conditions to determine when / if we 
> should create this volume based on some combination of {{\-\-containerizer}} 
> and {{\-\-isolation}} flags.
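The proposed guard could look roughly like this. The flag names and shapes are assumptions based on the ticket text ('gpu/nvidia' isolation under the Mesos containerizer), not the actual fix.

```python
def should_create_gpu_volume(is_root, nvml_available, containerizer, isolation):
    """Only create the NvidiaVolume when running as root AND the mesos
    containerizer is enabled with Nvidia GPU isolation, instead of
    unconditionally whenever nvml.isAvailable()."""
    return (is_root
            and nvml_available
            and "mesos" in containerizer.split(",")
            and "gpu/nvidia" in isolation.split(","))
```

Under this condition, non-root test runs never attempt the root-only volume creation, which is what broke.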





[jira] [Commented] (MESOS-5894) cpu share should be considered distinctly from cpu allocation

2016-08-01 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402846#comment-15402846
 ] 

Vinod Kone commented on MESOS-5894:
---

Currently, an executor and any of its tasks share the same cgroup limits. The 
cgroup limits are increased/decreased as tasks come and go from the executor. 
So yes, there is no isolation between the tasks of an executor and the 
executor itself. This will change in the future when we allow tasks to have 
their own resource limits. See https://issues.apache.org/jira/browse/MESOS-

> cpu share should be considered distinctly from cpu allocation
> -
>
> Key: MESOS-5894
> URL: https://issues.apache.org/jira/browse/MESOS-5894
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Affects Versions: 0.28.2
> Environment: Linux, cgroups, docker
>Reporter: Christopher Hunt
>  Labels: cgroups, cpu-usage, docker
>
> As a framework developer I wish to explicitly declare the cpu.share for a 
> task and its associated executor so that I may have a direct means of 
> controlling their respective cpu usage at runtime.
> With current behaviour, I've noticed that the cgroup cpu.share for a task 
> includes both the executor's cpu.share and also the cpu value specified as a 
> task's resources. The cpu.share value appears to be calculated as a multiple 
> of 1024, therefore 1 cpu == 1024, 0.1 cpu = 102 and so forth. I find this 
> behaviour to be unexpected, and also an overloading of the meaning of the 
> resource cpu type. My understanding of the resource cpu type is that it is 
> used primarily for decrementing from the total number of cpus available to a 
> node, and thereby influences the resource offers made to a given framework in 
> consideration of other frameworks. On the other hand, cpu shares limit the 
> amount of cpu used at runtime.
> By way of a solution, perhaps a new Resource type could be introduced named 
> "cpu-share" and optionally provided by a scheduler when constructing a 
> TaskInfo. The cpu share resource could also be optionally specified for the 
> associated executor. By not specifying the cpu share, the existing behaviour 
> is preserved thereby providing backward compatibility.
> Related issue: https://issues.apache.org/jira/browse/MESOS-1718
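The 1 cpu == 1024 shares convention described above can be sketched as a one-liner; the kernel's minimum of 2 shares is an assumption added here, not taken from the report.

```python
def cpu_shares(cpus, min_shares=2):
    """cgroup cpu.shares derived from a fractional cpu count using the
    1 cpu == 1024 shares convention; truncation gives 0.1 cpu -> 102."""
    return max(int(cpus * 1024), min_shares)
```

This is the calculation the reporter observed being applied to the combined executor-plus-task cpu value.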





[jira] [Updated] (MESOS-5959) All non-root tests fail on GPU machine

2016-08-01 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-5959:
---
Description: 
A recent addition to ensure that {{NvidiaVolume::create()}} ran as root broke 
all non-root tests on GPU machines. The reason is that we unconditionally 
create this volume so long as we detect {{nvml.isAvailable()}} which will fail 
now that we are only allowed to create this volume if we have root permissions.

We should fix this by adding the proper conditions to determine when / if we 
should create this volume based on some combination of {{\-\-containerizer}} 
and {{\-\-isolation}} flags.



> All non-root tests fail on GPU machine
> --
>
> Key: MESOS-5959
> URL: https://issues.apache.org/jira/browse/MESOS-5959
> Project: Mesos
>  Issue Type: Bug
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: gpu, mesosphere
> Fix For: 1.0.1
>
>
> A recent addition to ensure that {{NvidiaVolume::create()}} ran as root broke 
> all non-root tests on GPU machines. The reason is that we unconditionally 
> create this volume so long as we detect {{nvml.isAvailable()}} which will 
> fail now that we are only allowed to create this volume if we have root 
> permissions.
> We should fix this by adding the proper conditions to determine when / if we 
> should create this volume based on some combination of {{\-\-containerizer}} 
> and {{\-\-isolation}} flags.





[jira] [Created] (MESOS-5959) All non-root tests fail on GPU machine

2016-08-01 Thread Kevin Klues (JIRA)
Kevin Klues created MESOS-5959:
--

 Summary: All non-root tests fail on GPU machine
 Key: MESOS-5959
 URL: https://issues.apache.org/jira/browse/MESOS-5959
 Project: Mesos
  Issue Type: Bug
Reporter: Kevin Klues
Assignee: Kevin Klues
 Fix For: 1.0.1


A recent addition to ensure that {{NvidiaVolume::create() }} ran as root broke 
all non-root tests on GPU machines. The reason is that we unconditionally 
create this volume so long as we detect {{nvml.isAvailable()}} which will fail 
now that we are only allowed to create this volume if we have root permissions.

We should fix this by adding the proper conditions to determine when / if we 
should create this volume based on some combination of {{\-\-containerizer}} 
and {{\-\-isolation}} flags.





[jira] [Commented] (MESOS-5958) Reviewbot failing due to python files not being cleaned up after distclean

2016-08-01 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402809#comment-15402809
 ] 

Vinod Kone commented on MESOS-5958:
---

What's a bit surprising is that both the ReviewBot jenkins job and the main 
Mesos job run the same command (./support/docker_build.sh) and the latter 
doesn't seem to error out on this failure; although both the jobs show these 
errors during the cleanup phase.
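The cleanup the distclean rule intends (removing 'build' and 'dist' directories, '*.pyc' files, and '*.egg-info' entries under python/) can be sketched in Python. This is an illustration of the intended behavior under that assumption, not the Makefile fix.

```python
import fnmatch
import os
import shutil

def clean_python_artifacts(root):
    """Remove generated Python build artifacts under `root`: 'build' and
    'dist' directories, '*.pyc' files, and '*.egg-info' entries --
    the same patterns the distclean rule targets."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        for d in list(dirnames):
            if d in ("build", "dist") or fnmatch.fnmatch(d, "*.egg-info"):
                shutil.rmtree(os.path.join(dirpath, d))
                dirnames.remove(d)  # do not descend into a deleted tree
                removed.append(os.path.join(dirpath, d))
        for f in filenames:
            if fnmatch.fnmatch(f, "*.pyc") or fnmatch.fnmatch(f, "*.egg-info"):
                os.remove(os.path.join(dirpath, f))
                removed.append(os.path.join(dirpath, f))
    return removed
```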

Successful Mesos build: 
https://builds.apache.org/view/M-R/view/Mesos/job/Mesos/2570/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu:14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-6)/console
{code}
find python -name "build" -o -name "dist" -o -name "*.pyc"  \
  -o -name "*.egg-info" | xargs rm -rf
test -z "libmesos_no_3rdparty.la libbuild.la liblog.la libstate.la libjava.la 
libexamplemodule.la libtestallocator.la libtestanonymous.la 
libtestauthentication.la libtestauthorizer.la libtestcontainer_logger.la 
libtesthook.la libtesthttpauthenticator.la libtestisolator.la 
libtestmastercontender.la libtestmasterdetector.la libtestqos_controller.la 
libtestresource_estimator.la" || rm -f libmesos_no_3rdparty.la libbuild.la 
liblog.la libstate.la libjava.la libexamplemodule.la libtestallocator.la 
libtestanonymous.la libtestauthentication.la libtestauthorizer.la 
libtestcontainer_logger.la libtesthook.la libtesthttpauthenticator.la 
libtestisolator.la libtestmastercontender.la libtestmasterdetector.la 
libtestqos_controller.la libtestresource_estimator.la
test -z "liblogrotate_container_logger.la libfixed_resource_estimator.la 
libload_qos_controller.la " || rm -f liblogrotate_container_logger.la 
libfixed_resource_estimator.la libload_qos_controller.la 
 rm -f mesos-fetcher mesos-executor mesos-containerizer mesos-logrotate-logger 
mesos-health-check mesos-usage mesos-docker-executor
rm -f ./so_locations
 rm -f mesos-agent mesos-master mesos-slave
rm -f ./so_locations
rm -f *.o
rm -f *.lo
rm -f *.tab.c
rm -f ./so_locations
test -z "" || rm -f 
rm -f ../include/mesos/*.o
rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
test . = "../../src" || test -z "" || rm -f 
rm -f ../include/mesos/*.lo
rm -f ../include/mesos/.deps/.dirstamp
rm -f ../include/mesos/agent/*.o
rm -f ../include/mesos/agent/*.lo
rm -f ../include/mesos/.dirstamp
rm -f ../include/mesos/allocator/*.o
rm -f ../include/mesos/agent/.deps/.dirstamp
rm -f ../include/mesos/agent/.dirstamp
rm -f ../include/mesos/allocator/*.lo
rm -f ../include/mesos/allocator/.deps/.dirstamp
rm -f ../include/mesos/appc/*.o
rm -f ../include/mesos/allocator/.dirstamp
rm -f ../include/mesos/appc/*.lo
rm -f ../include/mesos/appc/.deps/.dirstamp
rm: cannot remove 'python/cli/build': Is a directory
rm: cannot remove 'python/executor/build': Is a directory
rm: cannot remove 'python/interface/build': Is a directory
rm: cannot remove 'python/native/build': Is a directory
rm: cannot remove 'python/scheduler/build': Is a directory
rm -f ../include/mesos/authentication/*.o
make[2]: [clean-generic] Error 1 (ignored)
rm -f ../include/mesos/appc/.dirstamp
rm -f ../include/mesos/authentication/*.lo
rm -f ../include/mesos/authentication/.deps/.dirstamp
rm -f ../include/mesos/authorizer/*.o
rm -f ../include/mesos/authentication/.dirstamp
rm -f ../include/mesos/authorizer/*.lo
rm -f ../include/mesos/authorizer/.deps/.dirstamp
rm -f ../include/mesos/containerizer/*.o
rm -f ../include/mesos/authorizer/.dirstamp
rm -f ../include/mesos/containerizer/*.lo
rm -f ../include/mesos/containerizer/.deps/.dirstamp
rm -f ../include/mesos/docker/*.o
rm -f ../include/mesos/containerizer/.dirstamp
rm -f ../include/mesos/docker/*.lo
rm -f ../include/mesos/docker/.deps/.dirstamp
rm -f ../include/mesos/docker/.dirstamp
rm -f ../include/mesos/executor/*.o
rm -f ../include/mesos/executor/.deps/.dirstamp
rm -f ../include/mesos/executor/*.lo
rm -f ../include/mesos/executor/.dirstamp
rm -f ../include/mesos/fetcher/*.o
rm -f ../include/mesos/fetcher/.deps/.dirstamp
rm -f ../include/mesos/fetcher/*.lo
rm -f ../include/mesos/fetcher/.dirstamp
rm -f ../include/mesos/maintenance/*.o
rm -f ../include/mesos/maintenance/.deps/.dirstamp
rm -f ../include/mesos/maintenance/*.lo
rm -f ../include/mesos/maintenance/.dirstamp
rm -f ../include/mesos/master/*.o
rm -f ../include/mesos/master/.deps/.dirstamp
rm -f ../include/mesos/master/*.lo
rm -f ../include/mesos/master/.dirstamp
rm -f ../include/mesos/module/*.o
rm -f ../include/mesos/module/.deps/.dirstamp
rm -f ../include/mesos/module/*.lo
rm -f ../include/mesos/module/.dirstamp
rm -f ../include/mesos/quota/*.o
rm -f ../include/mesos/quota/.deps/.dirstamp
rm -f ../include/mesos/quota/*.lo
rm -f ../include/mesos/quota/.dirstamp
rm -f ../include/mesos/scheduler/*.o
rm -f ../include/mesos/scheduler/.deps/.dirstamp
rm -f ../include/mesos/scheduler/*.lo
rm -f ../include/mesos/scheduler/.dirstamp
rm -f 

[jira] [Updated] (MESOS-5958) Reviewbot failing due to python files not being cleaned up after distclean

2016-08-01 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5958:
--
Description: 
This is on ASF CI. 
https://builds.apache.org/job/mesos-reviewbot/14573/consoleFull

{code}
find python -name "build" -o -name "dist" -o -name "*.pyc"  \
  -o -name "*.egg-info" -exec rm -rf '{}' \+
test -z "libmesos_no_3rdparty.la libbuild.la liblog.la libstate.la libjava.la 
libexamplemodule.la libtestallocator.la libtestanonymous.la 
libtestauthentication.la libtestauthorizer.la libtestcontainer_logger.la 
libtesthook.la libtesthttpauthenticator.la libtestisolator.la 
libtestmastercontender.la libtestmasterdetector.la libtestqos_controller.la 
libtestresource_estimator.la" || rm -f libmesos_no_3rdparty.la libbuild.la 
liblog.la libstate.la libjava.la libexamplemodule.la libtestallocator.la 
libtestanonymous.la libtestauthentication.la libtestauthorizer.la 
libtestcontainer_logger.la libtesthook.la libtesthttpauthenticator.la 
libtestisolator.la libtestmastercontender.la libtestmasterdetector.la 
libtestqos_controller.la libtestresource_estimator.la
test -z "liblogrotate_container_logger.la libfixed_resource_estimator.la 
libload_qos_controller.la " || rm -f liblogrotate_container_logger.la 
libfixed_resource_estimator.la libload_qos_controller.la 
 rm -f mesos-fetcher mesos-executor mesos-containerizer mesos-logrotate-logger 
mesos-health-check mesos-usage mesos-docker-executor
 rm -f mesos-agent mesos-master mesos-slave
rm -f ./so_locations
rm -f *.o
rm -f *.lo
rm -f ../include/mesos/*.o
rm -f ./so_locations
rm -f *.tab.c
test -z "" || rm -f 
rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
rm -f ./so_locations
rm -f ../include/mesos/*.lo
test . = "../../src" || test -z "" || rm -f 
rm -f ../include/mesos/.deps/.dirstamp
rm -f ../include/mesos/agent/*.o
rm -f ../include/mesos/agent/*.lo
rm -f ../include/mesos/.dirstamp
rm -f ../include/mesos/allocator/*.o
rm -f ../include/mesos/agent/.deps/.dirstamp
rm -f ../include/mesos/allocator/*.lo
rm -f ../include/mesos/agent/.dirstamp
rm -f ../include/mesos/appc/*.o
rm -f ../include/mesos/allocator/.deps/.dirstamp
rm -f ../include/mesos/appc/*.lo
rm -f ../include/mesos/allocator/.dirstamp
rm -f ../include/mesos/authentication/*.o
rm -f ../include/mesos/appc/.deps/.dirstamp
rm -f ../include/mesos/authentication/*.lo
rm -f ../include/mesos/appc/.dirstamp
rm -f ../include/mesos/authorizer/*.o
rm -f ../include/mesos/authentication/.deps/.dirstamp
rm -f ../include/mesos/authorizer/*.lo
rm -f ../include/mesos/authentication/.dirstamp
rm -f ../include/mesos/containerizer/*.o
rm -f ../include/mesos/authorizer/.deps/.dirstamp
rm -f ../include/mesos/containerizer/*.lo
rm -f ../include/mesos/authorizer/.dirstamp
rm -f ../include/mesos/docker/*.o
rm -f ../include/mesos/containerizer/.deps/.dirstamp
rm -f ../include/mesos/docker/*.lo
rm: cannot remove 'python/cli/build': Is a directory
rm: cannot remove 'python/executor/build': Is a directory
rm: cannot remove 'python/interface/build': Is a directory
rm: cannot remove 'python/native/build': Is a directory
rm: cannot remove 'python/scheduler/build': Is a directory
rm -f ../include/mesos/containerizer/.dirstamp
make[2]: [clean-generic] Error 1 (ignored)
rm -f ../include/mesos/executor/*.o
rm -f ../include/mesos/docker/.deps/.dirstamp
rm -f ../include/mesos/executor/*.lo
rm -f ../include/mesos/docker/.dirstamp
rm -f ../include/mesos/fetcher/*.o
rm -f ../include/mesos/executor/.deps/.dirstamp
rm -f ../include/mesos/fetcher/*.lo
rm -f ../include/mesos/executor/.dirstamp
rm -f ../include/mesos/maintenance/*.o
rm -f ../include/mesos/fetcher/.deps/.dirstamp
rm -f ../include/mesos/maintenance/*.lo
rm -f ../include/mesos/fetcher/.dirstamp
rm -f ../include/mesos/master/*.o
rm -f ../include/mesos/maintenance/.deps/.dirstamp
rm -f ../include/mesos/master/*.lo
rm -f ../include/mesos/module/*.o
rm -f ../include/mesos/maintenance/.dirstamp
rm -f ../include/mesos/module/*.lo
rm -f ../include/mesos/master/.deps/.dirstamp
rm -f ../include/mesos/master/.dirstamp
rm -f ../include/mesos/quota/*.o
rm -f ../include/mesos/module/.deps/.dirstamp
rm -f ../include/mesos/quota/*.lo
rm -f ../include/mesos/module/.dirstamp
rm -f ../include/mesos/scheduler/*.o
rm -f ../include/mesos/quota/.deps/.dirstamp
rm -f ../include/mesos/scheduler/*.lo
rm -f ../include/mesos/quota/.dirstamp
rm -f ../include/mesos/slave/*.o
rm -f ../include/mesos/scheduler/.deps/.dirstamp
rm -f ../include/mesos/scheduler/.dirstamp
rm -f ../include/mesos/slave/*.lo
rm -f ../include/mesos/slave/.deps/.dirstamp
rm -f ../include/mesos/state/*.o
rm -f ../include/mesos/slave/.dirstamp
rm -f ../include/mesos/state/*.lo
rm -f ../include/mesos/state/.deps/.dirstamp
rm -f ../include/mesos/uri/*.o
rm -f ../include/mesos/state/.dirstamp
rm -f ../include/mesos/uri/*.lo
rm -f ../include/mesos/uri/.deps/.dirstamp
rm -f ../include/mesos/v1/*.o
rm -f ../include/mesos/uri/.dirstamp
rm -f ../

[jira] [Assigned] (MESOS-5930) Orphan tasks shown as RUNNING have state TASK_FINISHED

2016-08-01 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar reassigned MESOS-5930:
-

Assignee: Anand Mazumdar

> Orphan tasks shown as RUNNING have state TASK_FINISHED
> --
>
> Key: MESOS-5930
> URL: https://issues.apache.org/jira/browse/MESOS-5930
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.0.0
>Reporter: Lukas Loesche
>Assignee: Anand Mazumdar
> Fix For: 1.0.1
>
> Attachments: Screen Shot 2016-07-29 at 19.23.49.png, Screen Shot 
> 2016-07-29 at 19.24.03.png, orphan-running.txt
>
>
> On my cluster I have 111 Orphan Tasks, of which some are RUNNING, some are 
> FINISHED and some are FAILED. When I open the task details for a FINISHED 
> task, the following page shows a state of TASK_FINISHED, and likewise when I 
> open a FAILED task the details page shows TASK_FAILED.
> However, when I open the details for the RUNNING tasks they all have a task 
> state of TASK_FINISHED. None of them is in state TASK_RUNNING.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402768#comment-15402768
 ] 

Gilbert Song commented on MESOS-5953:
-

A discussion is needed to decide which way we should go.

> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> /. https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Commented] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402767#comment-15402767
 ] 

Gilbert Song commented on MESOS-5953:
-

[~philwinder], I understand your concern: you want a docker image whose 
Dockerfile relies on the default working_dir to be runnable in the unified 
containerizer. We need to weigh the pros and cons:

1. Using the mesos container sandbox:
Pros: Keeps the semantics in mesos consistent. Any files/dirs under the sandbox 
will not be lost even if the container is killed (it is bind mounted to the host 
sandbox). Persistent volumes should be accessible in the container sandbox.
Cons: Operators may need to add one more layer to the image to specify the 
working_dir as "/".

2. Using "/" by default:
Pros: Will not break docker images whose default entrypoint/cmd relies on "/" 
as the working_dir.
Cons: The semantics in mesos become inconsistent, since they are not guaranteed 
to match docker's.
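For option 1, the extra layer mentioned in the cons could look like the 
following hypothetical Dockerfile addition (illustration only; the image name 
is made up):

```dockerfile
# Re-base the existing image and pin the working directory explicitly,
# so the image behaves the same under Docker and the unified containerizer.
FROM myorg/myimage:latest
WORKDIR /
```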

> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> /. https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Commented] (MESOS-5792) Add mesos tests to CMake (make check)

2016-08-01 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402763#comment-15402763
 ] 

Joseph Wu commented on MESOS-5792:
--

More progress:
{code}
commit 7dbc74efaea3e4ec185bfbd0c503a61ac2a5f1e1
Author: Srinivas Brahmaroutu 
Date:   Thu Jul 28 12:37:19 2016 -0700

CMake: Added `setns` and `active-user` test helper binaries.

These binaries are required for `NsTest.ROOT_setns` and
`SlaveTest.ROOT_RunTaskWithCommandInfoWithoutUser`.

Review: https://reviews.apache.org/r/50064/
{code}
{code}
commit 697b55a733d15a5bdc0a524f062f3dd93263a224
Author: Srinivas Brahmaroutu 
Date:   Thu Jul 28 14:49:41 2016 -0700

CMake: Added LogrotateContainerLogger companion executable.

This binary is required for the various `LOGROTATE_*` tests.
For now, this binary is not built on Windows due to some
optimizations made inside the executable.

Review: https://reviews.apache.org/r/50179/
{code}
{code}
commit dac771f1e2fafbf9ad8adfebc491933d64a21d66
Author: Srinivas Brahmaroutu 
Date:   Thu Jul 28 15:13:42 2016 -0700

CMake: Added build script for mesos-local executable.

This executable is used to run a local Mesos cluster
for testing purposes.

Review: https://reviews.apache.org/r/50323/
{code}
{code}
commit 0c2166c4e68748a285a680f32b1dbf51d865f245
Author: Srinivas Brahmaroutu 
Date:   Thu Jul 28 15:55:48 2016 -0700

CMake: Added script to build mesos-execute.

`mesos-execute` is a utility that can schedule and run
a single task.

Review: https://reviews.apache.org/r/50324/
{code}

> Add mesos tests to CMake (make check)
> -
>
> Key: MESOS-5792
> URL: https://issues.apache.org/jira/browse/MESOS-5792
> Project: Mesos
>  Issue Type: Improvement
>  Components: build
>Reporter: Srinivas
>Assignee: Srinivas
>  Labels: build, mesosphere
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Provide CMakeLists.txt and configuration files to build mesos tests using 
> CMake.





[jira] [Commented] (MESOS-5388) MesosContainerizerLaunch flags execute arbitrary commands via shell

2016-08-01 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402727#comment-15402727
 ] 

Jie Yu commented on MESOS-5388:
---

commit ca5eaad82f69309de427aab3ec2ed7976c9cc850
Author: Gilbert Song 
Date:   Mon Aug 1 13:05:53 2016 -0700

Updated docker volume isolator to return non-shell 'pre_exec_commands'.

Review: https://reviews.apache.org/r/50535/

commit 202e1933c592f456420ec1c85fd9a21d0df9
Author: Gilbert Song 
Date:   Mon Aug 1 13:03:16 2016 -0700

Updated mesos containerizer launch execute() to return 'EXIT_FAILURE'.

Review: https://reviews.apache.org/r/50534/

> MesosContainerizerLaunch flags execute arbitrary commands via shell
> ---
>
> Key: MESOS-5388
> URL: https://issues.apache.org/jira/browse/MESOS-5388
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: James DeFelice
>Assignee: Gilbert Song
>  Labels: mesosphere, security
>
> For example, the docker volume isolator's containerPath is appended (without 
> sanitation) to a command that's executed in this manner. As such, it's 
> possible to inject arbitrary shell commands to be executed by mesos.
> https://github.com/apache/mesos/blob/17260204c833c643adf3d8f36ad8a1a606ece809/src/slave/containerizer/mesos/launch.cpp#L206
> Perhaps instead of strings these commands could/should be sent as string 
> arrays that could be passed as argv arguments w/o shell interpretation?





[jira] [Commented] (MESOS-5933) Refactor the uri::Fetcher as a binary.

2016-08-01 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402653#comment-15402653
 ] 

Joseph Wu commented on MESOS-5933:
--

You can do that by scheduling a task :)

One of the motivations behind the URI fetcher is to get rid of the extra binary 
(harder to maintain and has some odd undesirable behavior in some 
configurations).  By running the fetcher separately, you'll end up testing a 
different code path.  This is especially true if the fetcher becomes pluggable.

> Refactor the uri::Fetcher as a binary.
> --
>
> Key: MESOS-5933
> URL: https://issues.apache.org/jira/browse/MESOS-5933
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Gilbert Song
>Assignee: Zhitao Li
>  Labels: fetcher, mesosphere
>
> By refactoring the uri::Fetcher as a binary, the fetcher can be used 
> independently. Not only mesos, but also new fetcher plugin testing, mesos cli 
> and many other new components in the future can re-use the binary to fetch 
> any URI with different schemes. Ideally, after this change, mesos cli is able 
> to re-use the uri::Fetcher binary to introduce new image pulling commands, 
> e.g., `mesos fetch -i `.





[jira] [Created] (MESOS-5958) Reviewbot failing due to python files not being cleaned up after distclean

2016-08-01 Thread Vinod Kone (JIRA)
Vinod Kone created MESOS-5958:
-

 Summary: Reviewbot failing due to python files not being cleaned 
up after distclean
 Key: MESOS-5958
 URL: https://issues.apache.org/jira/browse/MESOS-5958
 Project: Mesos
  Issue Type: Bug
 Environment: ASF CI
Reporter: Vinod Kone


This is on ASF CI.

{code}
rm -rf ../include/mesos/.deps ../include/mesos/agent/.deps 
../include/mesos/allocator/.deps ../include/mesos/appc/.deps 
../include/mesos/authentication/.deps ../include/mesos/authorizer/.deps 
../include/mesos/containerizer/.deps ../include/mesos/docker/.deps 
../include/mesos/executor/.deps ../include/mesos/fetcher/.deps 
../include/mesos/maintenance/.deps ../include/mesos/master/.deps 
../include/mesos/module/.deps ../include/mesos/quota/.deps 
../include/mesos/scheduler/.deps ../include/mesos/slave/.deps 
../include/mesos/state/.deps ../include/mesos/uri/.deps 
../include/mesos/v1/.deps ../include/mesos/v1/agent/.deps 
../include/mesos/v1/allocator/.deps ../include/mesos/v1/executor/.deps 
../include/mesos/v1/maintenance/.deps ../include/mesos/v1/master/.deps 
../include/mesos/v1/quota/.deps ../include/mesos/v1/scheduler/.deps appc/.deps 
authentication/cram_md5/.deps authentication/http/.deps authorizer/.deps 
authorizer/local/.deps cli/.deps common/.deps docker/.deps examples/.deps 
exec/.deps executor/.deps files/.deps hdfs/.deps health-check/.deps hook/.deps 
internal/.deps java/jni/.deps jvm/.deps jvm/org/apache/.deps launcher/.deps 
launcher/posix/.deps linux/.deps linux/routing/.deps 
linux/routing/diagnosis/.deps linux/routing/filter/.deps 
linux/routing/link/.deps linux/routing/queueing/.deps local/.deps log/.deps 
log/tool/.deps logging/.deps master/.deps master/allocator/.deps 
master/allocator/mesos/.deps master/allocator/sorter/drf/.deps 
master/contender/.deps master/detector/.deps messages/.deps module/.deps 
sched/.deps scheduler/.deps slave/.deps slave/container_loggers/.deps 
slave/containerizer/.deps slave/containerizer/mesos/.deps 
slave/containerizer/mesos/isolators/appc/.deps 
slave/containerizer/mesos/isolators/cgroups/.deps 
slave/containerizer/mesos/isolators/docker/.deps 
slave/containerizer/mesos/isolators/docker/volume/.deps 
slave/containerizer/mesos/isolators/filesystem/.deps 
slave/containerizer/mesos/isolators/gpu/.deps 
slave/containerizer/mesos/isolators/namespaces/.deps 
slave/containerizer/mesos/isolators/network/.deps 
slave/containerizer/mesos/isolators/network/cni/.deps 
slave/containerizer/mesos/isolators/posix/.deps 
slave/containerizer/mesos/isolators/xfs/.deps 
slave/containerizer/mesos/provisioner/.deps 
slave/containerizer/mesos/provisioner/appc/.deps 
slave/containerizer/mesos/provisioner/backends/.deps 
slave/containerizer/mesos/provisioner/docker/.deps slave/qos_controllers/.deps 
slave/resource_estimators/.deps state/.deps tests/.deps tests/common/.deps 
tests/containerizer/.deps uri/.deps uri/fetchers/.deps usage/.deps v1/.deps 
version/.deps watcher/.deps zookeeper/.deps
rm -f Makefile
make[2]: Leaving directory `/mesos/mesos-1.1.0/_build/src'
rm -f config.status config.cache config.log configure.lineno 
config.status.lineno
rm -f Makefile
ERROR: files left in build directory after distclean:
./src/python/executor/build/temp.linux-x86_64-2.7/src/mesos/executor/module.o
./src/python/executor/build/temp.linux-x86_64-2.7/src/mesos/executor/mesos_executor_driver_impl.o
./src/python/executor/build/temp.linux-x86_64-2.7/src/mesos/executor/proxy_executor.o
./src/python/executor/build/lib.linux-x86_64-2.7/mesos/executor/_executor.so
./src/python/executor/build/lib.linux-x86_64-2.7/mesos/executor/__init__.py
./src/python/executor/build/lib.linux-x86_64-2.7/mesos/__init__.py
./src/python/executor/ext_modules.pyc
./src/python/scheduler/build/temp.linux-x86_64-2.7/src/mesos/scheduler/module.o
./src/python/scheduler/build/temp.linux-x86_64-2.7/src/mesos/scheduler/mesos_scheduler_driver_impl.o
./src/python/scheduler/build/temp.linux-x86_64-2.7/src/mesos/scheduler/proxy_scheduler.o
./src/python/scheduler/build/lib.linux-x86_64-2.7/mesos/scheduler/_scheduler.so
./src/python/scheduler/build/lib.linux-x86_64-2.7/mesos/scheduler/__init__.py
./src/python/scheduler/build/lib.linux-x86_64-2.7/mesos/__init__.py
./src/python/scheduler/ext_modules.pyc
./src/python/build/lib.linux-x86_64-2.7/mesos/__init__.py
./src/python/cli/build/lib.linux-x86_64-2.7/mesos/http.py
./src/python/cli/build/lib.linux-x86_64-2.7/mesos/cli.py
./src/python/cli/build/lib.linux-x86_64-2.7/mesos/futures.py
./src/python/cli/build/lib.linux-x86_64-2.7/mesos/__init__.py
./src/python/interface/build/lib.linux-x86_64-2.7/mesos/__init__.py
./src/python/interface/build/lib.linux-x86_64-2.7/mesos/interface/containerizer_pb2.py
./src/python/interface/build/lib.linux-x86_64-2.7/mesos/interface/mesos_pb2.py
./src/python/interface/build/lib.linux-x86_64-2.7/mesos/interface/__init__.py
./src/python/native/bui

[jira] [Commented] (MESOS-5388) MesosContainerizerLaunch flags execute arbitrary commands via shell

2016-08-01 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402575#comment-15402575
 ] 

Jie Yu commented on MESOS-5388:
---

commit 25626fcf8f63875ed0ccfe2ddb67a9998e5ba934
Author: Gilbert Song 
Date:   Mon Aug 1 09:50:13 2016 -0700

Supported non-shell command in MesosLaunch to avoid arbitrary commands.

Currently all pre_exec_commands are executed as shell commands in Mesos
Launch. It is not safe because arbitrary shell command may be included
in some user facing api (e.g., container_path).  We should execute those
command as a subprocess to prevent arbitrary shell command injection.

Review: https://reviews.apache.org/r/50214/

> MesosContainerizerLaunch flags execute arbitrary commands via shell
> ---
>
> Key: MESOS-5388
> URL: https://issues.apache.org/jira/browse/MESOS-5388
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: James DeFelice
>Assignee: Gilbert Song
>  Labels: mesosphere, security
>
> For example, the docker volume isolator's containerPath is appended (without 
> sanitation) to a command that's executed in this manner. As such, it's 
> possible to inject arbitrary shell commands to be executed by mesos.
> https://github.com/apache/mesos/blob/17260204c833c643adf3d8f36ad8a1a606ece809/src/slave/containerizer/mesos/launch.cpp#L206
> Perhaps instead of strings these commands could/should be sent as string 
> arrays that could be passed as argv arguments w/o shell interpretation?
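The difference between shell and argv execution that the fix relies on can be 
sketched in a few lines of Python (illustration only; the actual change lives 
in Mesos's C++ launcher, and the `malicious` value here is a made-up stand-in 
for an unsanitized container_path):

```python
import subprocess

# A user-controlled value that embeds a shell payload.
malicious = "dir; echo INJECTED"

# Shell execution: the whole string is interpreted by /bin/sh,
# so the payload after ';' runs as a second command.
shell_out = subprocess.run(
    "echo " + malicious, shell=True,
    capture_output=True, text=True).stdout

# Argv execution: the value is passed as a single argument to echo
# and is never interpreted by a shell.
argv_out = subprocess.run(
    ["echo", malicious],
    capture_output=True, text=True).stdout

print(shell_out)  # two lines: "dir" then "INJECTED"
print(argv_out)   # one line: "dir; echo INJECTED"
```

This is why passing pre_exec_commands as argument arrays, rather than strings 
handed to a shell, closes the injection vector.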





[jira] [Commented] (MESOS-5933) Refactor the uri::Fetcher as a binary.

2016-08-01 Thread Zhitao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402569#comment-15402569
 ] 

Zhitao Li commented on MESOS-5933:
--

[~kaysoky], priming the cache is another good outcome, but what I want is a 
separate utility to quickly test different fetcher code/configurations against 
different registries (or whatever image store AppC uses) independently.

> Refactor the uri::Fetcher as a binary.
> --
>
> Key: MESOS-5933
> URL: https://issues.apache.org/jira/browse/MESOS-5933
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Gilbert Song
>Assignee: Zhitao Li
>  Labels: fetcher, mesosphere
>
> By refactoring the uri::Fetcher as a binary, the fetcher can be used 
> independently. Not only mesos, but also new fetcher plugin testing, mesos cli 
> and many other new components in the future can re-use the binary to fetch 
> any URI with different schemes. Ideally, after this change, mesos cli is able 
> to re-use the uri::Fetcher binary to introduce new image pulling commands, 
> e.g., `mesos fetch -i `.





[jira] [Commented] (MESOS-5933) Refactor the uri::Fetcher as a binary.

2016-08-01 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402565#comment-15402565
 ] 

Joseph Wu commented on MESOS-5933:
--

Another note:  If the purpose of this ticket is to add a way to "prime" the 
(fetcher) cache, there are other ways to achieve this.  You could schedule a 
task (i.e. a command task of {{exit 0}}) that fetches the object (into the 
cache).

> Refactor the uri::Fetcher as a binary.
> --
>
> Key: MESOS-5933
> URL: https://issues.apache.org/jira/browse/MESOS-5933
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Gilbert Song
>Assignee: Zhitao Li
>  Labels: fetcher, mesosphere
>
> By refactoring the uri::Fetcher as a binary, the fetcher can be used 
> independently. Not only mesos, but also new fetcher plugin testing, mesos cli 
> and many other new components in the future can re-use the binary to fetch 
> any URI with different schemes. Ideally, after this change, mesos cli is able 
> to re-use the uri::Fetcher binary to introduce new image pulling commands, 
> e.g., `mesos fetch -i `.





[jira] [Updated] (MESOS-5853) http v1 API should document behavior regarding generated content-type header in the presence of errors

2016-08-01 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-5853:
--
Assignee: Abhishek Dasgupta

> http v1 API should document behavior regarding generated content-type header 
> in the presence of errors
> --
>
> Key: MESOS-5853
> URL: https://issues.apache.org/jira/browse/MESOS-5853
> Project: Mesos
>  Issue Type: Improvement
>  Components: documentation
>Reporter: James DeFelice
>Assignee: Abhishek Dasgupta
>  Labels: mesosphere
>
> Changes made as part of https://issues.apache.org/jira/browse/MESOS-3739 set 
> a default Content-Type header. This should be documented in the Mesos v1 HTTP 
> API literature so that devs implementing against the spec know what to expect.





[jira] [Commented] (MESOS-5933) Refactor the uri::Fetcher as a binary.

2016-08-01 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402556#comment-15402556
 ] 

Gilbert Song commented on MESOS-5933:
-

[~haosd...@gmail.com], this is a separate issue. It is a little confusing 
because we have some tech debt in the fetchers. E.g., we have the mesos fetcher 
and the uri fetcher. MESOS-5259 should address this and refactor the mesos 
fetcher into the uri fetcher. Please note that most of the tickets in the Epic 
MESOS-3918 have dependencies.

I talked to [~klueska] about the Mesos CLI; it would be great and easy to 
re-use this binary in the CLI.

BTW [~zhitao], please note that we need MESOS-5254 before you start working on 
this issue. I have already linked it as a dependency. :)

> Refactor the uri::Fetcher as a binary.
> --
>
> Key: MESOS-5933
> URL: https://issues.apache.org/jira/browse/MESOS-5933
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Gilbert Song
>Assignee: Zhitao Li
>  Labels: fetcher, mesosphere
>
> By refactoring the uri::Fetcher as a binary, the fetcher can be used 
> independently. Not only mesos, but also new fetcher plugin testing, mesos cli 
> and many other new components in the future can re-use the binary to fetch 
> any URI with different schemes. Ideally, after this change, mesos cli is able 
> to re-use the uri::Fetcher binary to introduce new image pulling commands, 
> e.g., `mesos fetch -i `.





[jira] [Commented] (MESOS-5933) Refactor the uri::Fetcher as a binary.

2016-08-01 Thread Zhitao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402441#comment-15402441
 ] 

Zhitao Li commented on MESOS-5933:
--

Yes, in our conversation we agreed that ideally this should be invocable by the 
new mesos cli, or built as a subcomponent somehow.

I'm still getting familiar with that new architecture.

> Refactor the uri::Fetcher as a binary.
> --
>
> Key: MESOS-5933
> URL: https://issues.apache.org/jira/browse/MESOS-5933
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Gilbert Song
>Assignee: Zhitao Li
>  Labels: fetcher, mesosphere
>
> By refactoring the uri::Fetcher as a binary, the fetcher can be used 
> independently. Not only mesos, but also new fetcher plugin testing, mesos cli 
> and many other new components in the future can re-use the binary to fetch 
> any URI with different schemes. Ideally, after this change, mesos cli is able 
> to re-use the uri::Fetcher binary to introduce new image pulling commands, 
> e.g., `mesos fetch -i `.





[jira] [Commented] (MESOS-5933) Refactor the uri::Fetcher as a binary.

2016-08-01 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402429#comment-15402429
 ] 

haosdent commented on MESOS-5933:
-

Got your idea now. Then they should be different things. Additionally, because 
we are going to implement the Mesos CLI 
https://docs.google.com/document/d/1r6Iv4Efu8v8IBrcUTjgYkvZ32WVscgYqrD07OyIglsA/edit
 I think this ticket should be related to it as well.

> Refactor the uri::Fetcher as a binary.
> --
>
> Key: MESOS-5933
> URL: https://issues.apache.org/jira/browse/MESOS-5933
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Gilbert Song
>Assignee: Zhitao Li
>  Labels: fetcher, mesosphere
>
> By refactoring the uri::Fetcher as a binary, the fetcher can be used 
> independently. Not only mesos, but also new fetcher plugin testing, mesos cli 
> and many other new components in the future can re-use the binary to fetch 
> any URI with different schemes. Ideally, after this change, mesos cli is able 
> to re-use the uri::Fetcher binary to introduce new image pulling commands, 
> e.g., `mesos fetch -i `.





[jira] [Updated] (MESOS-5790) Ensure all examples in Scheduler HTTP API docs are valid JSON

2016-08-01 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-5790:
--
Shepherd: Anand Mazumdar

> Ensure all examples in Scheduler HTTP API docs are valid JSON
> -
>
> Key: MESOS-5790
> URL: https://issues.apache.org/jira/browse/MESOS-5790
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>Assignee: Abhishek Dasgupta
>  Labels: mesosphere, newbie
>
> Currently, there are a lot of JSON snippets in the [API Docs | 
> http://mesos.apache.org/documentation/latest/scheduler-http-api/ ] that are 
> not valid JSON i.e. have {{...}} to make the snippet succinct/easy to read. 
> e.g., 
> {code}
> {{"filters"   : {...}
> {code} 
> However, this is a problem for framework developers who are trying to use the 
> new API. Looking at the corresponding protobuf definitions can be a good 
> place to start but hardly ideal.
> It would be good to address the shortcomings and make the JSON snippets 
> complete.





[jira] [Commented] (MESOS-5933) Refactor the uri::Fetcher as a binary.

2016-08-01 Thread Zhitao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402390#comment-15402390
 ] 

Zhitao Li commented on MESOS-5933:
--

[~haosd...@gmail.com], from my perspective, what I want is to refactor the part 
that natively fetches images from a docker registry into a separate binary, so 
that we can easily test different registries and storage options independently.

It seems like mesos_fetcher is designed to fetch `CommandInfo.URI` into the 
sandbox. It remains a question to me whether these two things should live in 
the same binary. Usually, fetched images are stored into the image `Store`, 
which we probably don't want executors to play with.

> Refactor the uri::Fetcher as a binary.
> --
>
> Key: MESOS-5933
> URL: https://issues.apache.org/jira/browse/MESOS-5933
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Gilbert Song
>Assignee: Zhitao Li
>  Labels: fetcher, mesosphere
>
> By refactoring the uri::Fetcher as a binary, the fetcher can be used 
> independently. Not only mesos, but also new fetcher plugin testing, mesos cli 
> and many other new components in the future can re-use the binary to fetch 
> any URI with different schemes. Ideally, after this change, mesos cli is able 
> to re-use the uri::Fetcher binary to introduce new image pulling commands, 
> e.g., `mesos fetch -i `.





[jira] [Updated] (MESOS-5956) ABORT or report TASK_FAILED on health check creation failure

2016-08-01 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5956:
---
Description: Now when a task with health check fail to create 
{{HealthChecker}}, we just print a warning log and continue to start the task. 
We should consider aborting or sending a {{TASK_FAILED}} or another appropriate 
{{TaskStatus}} in this case.  (was: Now when a task with health check fail to 
create {{HealthChecker}}, we just print a warning log and continue to start the 
task. We should decide to abort or send a {{TASK_FAILED}} {{TaskStatus}}  in 
this case.)

> ABORT or report TASK_FAILED on health check creation failure
> 
>
> Key: MESOS-5956
> URL: https://issues.apache.org/jira/browse/MESOS-5956
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>  Labels: health-check, tech-debt
>
> Now when a task with health check fail to create {{HealthChecker}}, we just 
> print a warning log and continue to start the task. We should consider 
> aborting or sending a {{TASK_FAILED}} or another appropriate {{TaskStatus}} 
> in this case.





[jira] [Updated] (MESOS-5956) ABORT or report TASK_FAILED on health check creation failure

2016-08-01 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5956:
---
Description: Now when a task with health check fails to create a 
{{HealthChecker}} instance, we just log a warning and continue to start the 
task. We should consider aborting or sending a {{TASK_FAILED}} or another 
appropriate {{TaskStatus}} in this case.  (was: Now when a task with health 
check fail to create {{HealthChecker}}, we just print a warning log and 
continue to start the task. We should consider aborting or sending a 
{{TASK_FAILED}} or another appropriate {{TaskStatus}} in this case.)

> ABORT or report TASK_FAILED on health check creation failure
> 
>
> Key: MESOS-5956
> URL: https://issues.apache.org/jira/browse/MESOS-5956
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>  Labels: health-check, tech-debt
>
> Now when a task with health check fails to create a {{HealthChecker}} 
> instance, we just log a warning and continue to start the task. We should 
> consider aborting or sending a {{TASK_FAILED}} or another appropriate 
> {{TaskStatus}} in this case.





[jira] [Comment Edited] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402307#comment-15402307
 ] 

Avinash Sridharan edited comment on MESOS-5953 at 8/1/16 4:09 PM:
--

But I think that is the problem that [~philwinder] was alluding to: certain 
Dockerfiles assume that the working directory is `/` when WORKDIR is not 
specified.


was (Author: avin...@mesosphere.io):
But I think that is the problem that [~philwinder] was adhering to, that 
certain Dockerfile assume that the working directory is `/`, when the WORKDIR 
is not specified? 

> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> /. https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Commented] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402307#comment-15402307
 ] 

Avinash Sridharan commented on MESOS-5953:
--

But I think that is the problem that [~philwinder] was alluding to: certain 
Dockerfiles assume that the working directory is `/` when WORKDIR is not 
specified.

> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> /. https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Commented] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402294#comment-15402294
 ] 

Jie Yu commented on MESOS-5953:
---

I feel like we should not strictly follow Docker engine semantics here. Mesos 
has this notion of sandbox ($MESOS_SANDBOX). It makes more sense to set workdir 
to $MESOS_SANDBOX if it's not set in Dockerfile than setting to `/`.

> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> /. https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Commented] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402280#comment-15402280
 ] 

haosdent commented on MESOS-5953:
-

I think we need to update 
https://github.com/apache/mesos/blob/7864eb860cc5b6d12c4af968e85640613dc34f1d/src/slave/containerizer/mesos/isolators/docker/runtime.cpp#L384
 to set the work dir to {{/}} when {{WorkingDir}} is empty in the docker image 
manifest.
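A minimal sketch of the fallback being proposed, assuming Docker's documented semantics that an unset WORKDIR means the container starts at the root directory. The function name `effectiveWorkdir` is illustrative only, not the real runtime-isolator code:

```cpp
#include <string>

// Hypothetical simplification: if the image manifest's WorkingDir is
// empty, fall back to "/" (Docker's documented default) rather than
// the Mesos sandbox path.
std::string effectiveWorkdir(const std::string& manifestWorkingDir)
{
  return manifestWorkingDir.empty() ? "/" : manifestWorkingDir;
}
```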

> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> /. https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Updated] (MESOS-5957) Provide packages for Ubuntu 16.04

2016-08-01 Thread Andreas Streichardt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Streichardt updated MESOS-5957:
---
Description: 
As per dcos-community slack:

```
mop
5:23 PM for ubuntu vivid mesosphere was kind enough to host a package repo: 
https://open.mesosphere.com/getting-started/install/ ... will there be an 
update for 16.04? or did I miss something?
thomas.mesosphere
5:26 PM @mop: please file an issue and assign to Artem (issues.apache.org)
```

Ubuntu 16.04 was released a while ago.

This getting started tutorial

https://open.mesosphere.com/getting-started/install/

still lists 15.04 as the most up-to-date Ubuntu. Having updated packages would 
be great!

  was:
As per dcos-community slack:

```
mop
5:23 PM for ubuntu vivid mesosphere was kind enough to host a package repo: 
https://open.mesosphere.com/getting-started/install/ ... will there be an 
update for 16.04? or did I miss something?
thomas.mesosphere
5:26 PM @mop: please file an issue and assign to Artem (issues.apache.org)
```

Ubuntu 16.04 was released a while ago.

This getting started tutorial

https://open.mesosphere.com/getting-started/install/

still lists 15.04 as the most up-to-date Ubuntu. Having updated packages would 
be great!

There are multiple Artems in the list :S Not sure which Artem to assign this 
ticket to.


> Provide packages for Ubuntu 16.04
> -
>
> Key: MESOS-5957
> URL: https://issues.apache.org/jira/browse/MESOS-5957
> Project: Mesos
>  Issue Type: Wish
>  Components: release
>Reporter: Andreas Streichardt
>Priority: Minor
>
> As per dcos-community slack:
> ```
> mop
> 5:23 PM for ubuntu vivid mesosphere was kind enough to host a package repo: 
> https://open.mesosphere.com/getting-started/install/ ... will there be an 
> update for 16.04? or did I miss something?
> thomas.mesosphere
> 5:26 PM @mop: please file an issue and assign to Artem (issues.apache.org)
> ```
> Ubuntu 16.04 was released a while ago.
> This getting started tutorial
> https://open.mesosphere.com/getting-started/install/
> still lists 15.04 as the most up-to-date Ubuntu. Having updated packages 
> would be great!





[jira] [Created] (MESOS-5957) Provide packages for Ubuntu 16.04

2016-08-01 Thread Andreas Streichardt (JIRA)
Andreas Streichardt created MESOS-5957:
--

 Summary: Provide packages for Ubuntu 16.04
 Key: MESOS-5957
 URL: https://issues.apache.org/jira/browse/MESOS-5957
 Project: Mesos
  Issue Type: Wish
  Components: release
Reporter: Andreas Streichardt
Priority: Minor


As per dcos-community slack:

```
mop
5:23 PM for ubuntu vivid mesosphere was kind enough to host a package repo: 
https://open.mesosphere.com/getting-started/install/ ... will there be an 
update for 16.04? or did I miss something?
thomas.mesosphere
5:26 PM @mop: please file an issue and assign to Artem (issues.apache.org)
```

Ubuntu 16.04 was released a while ago.

This getting started tutorial

https://open.mesosphere.com/getting-started/install/

still lists 15.04 as the most up-to-date Ubuntu. Having updated packages would 
be great!

There are multiple Artems in the list :S Not sure which Artem to assign this 
ticket to.





[jira] [Updated] (MESOS-5956) ABORT or report TASK_FAILED on health check creation failure

2016-08-01 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5956:

Labels: health-check tech-debt  (was: health-check)

> ABORT or report TASK_FAILED on health check creation failure
> 
>
> Key: MESOS-5956
> URL: https://issues.apache.org/jira/browse/MESOS-5956
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>  Labels: health-check, tech-debt
>
> Now when a task with a health check fails to create a {{HealthChecker}}, we 
> just print a warning log and continue to start the task. We should decide 
> whether to abort or send a {{TASK_FAILED}} {{TaskStatus}} in this case.





[jira] [Created] (MESOS-5956) ABORT or report TASK_FAILED on health check creation failure

2016-08-01 Thread haosdent (JIRA)
haosdent created MESOS-5956:
---

 Summary: ABORT or report TASK_FAILED on health check creation 
failure
 Key: MESOS-5956
 URL: https://issues.apache.org/jira/browse/MESOS-5956
 Project: Mesos
  Issue Type: Improvement
Reporter: haosdent


Now when a task with a health check fails to create a {{HealthChecker}}, we just 
print a warning log and continue to start the task. We should decide whether to 
abort or send a {{TASK_FAILED}} {{TaskStatus}} in this case.
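A minimal sketch of the two options being weighed, assuming a simplified executor launch path. `TaskState` and `launchWithHealthCheck` are illustrative names, not the real executor code:

```cpp
#include <cstdlib>

// Hypothetical sketch: instead of only logging a warning, surface the
// health-checker creation failure to the scheduler.
enum class TaskState { TASK_RUNNING, TASK_FAILED };

// Option A: report TASK_FAILED so the scheduler learns about the broken
// health check immediately. Option B (commented out) would hard-abort
// the executor instead.
TaskState launchWithHealthCheck(bool healthCheckerCreated)
{
  if (!healthCheckerCreated) {
    // std::abort();  // option B: fail fast instead of reporting
    return TaskState::TASK_FAILED;
  }
  return TaskState::TASK_RUNNING;
}
```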





[jira] [Updated] (MESOS-5803) Command health checks do not survive after framework restart

2016-08-01 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5803:

Summary: Command health checks do not survive after framework restart  
(was: Command health checks do not survive after master restart)

> Command health checks do not survive after framework restart
> 
>
> Key: MESOS-5803
> URL: https://issues.apache.org/jira/browse/MESOS-5803
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>  Labels: health-check
>
> Reported in https://github.com/mesosphere/marathon/issues/916
> and https://github.com/apache/mesos/pull/118
> So far the health check only sends a healthy status if the previous status 
> failed or does not exist. So frameworks cannot know the health status of 
> tasks after a master restart.





[jira] [Commented] (MESOS-5955) The "mesos-health-check" binary is not used anymore.

2016-08-01 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402200#comment-15402200
 ] 

haosdent commented on MESOS-5955:
-

| Removed the binary code path of HealthCheck in libprocess. | 
https://reviews.apache.org/r/50657/ |
| Removed the binary code path of HealthCheck in src. | 
https://reviews.apache.org/r/49556/ |

> The "mesos-health-check" binary is not used anymore.
> 
>
> Key: MESOS-5955
> URL: https://issues.apache.org/jira/browse/MESOS-5955
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>  Labels: mesosphere
>
> MESOS-5727 and MESOS-5954 refactored the health check code into the 
> {{HealthChecker}} library, hence the "mesos-health-check" binary became 
> unused.
> While the command and docker executors could just use the library to avoid 
> the subprocess complexity, we may want to consider keeping a binary version 
> that ships with the installation, because the intention of the binary was to 
> allow other executors to re-use our implementation. On the other hand, this 
> binary is ill-suited to this since it uses libprocess message passing, so if 
> we do not have code that requires the binary it seems ok to remove it for 
> now. Custom executors may use the {{HealthChecker}} library directly, it is 
> not much more complex than using the binary.





[jira] [Commented] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402196#comment-15402196
 ] 

haosdent commented on MESOS-5953:
-

[~philwinder] Should /tmp/mesos/sandbox be /mnt/mesos/sandbox?

> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> /. https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Created] (MESOS-5955) The "mesos-health-check" binary is not used anymore.

2016-08-01 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-5955:
--

 Summary: The "mesos-health-check" binary is not used anymore.
 Key: MESOS-5955
 URL: https://issues.apache.org/jira/browse/MESOS-5955
 Project: Mesos
  Issue Type: Improvement
Reporter: Alexander Rukletsov
Assignee: haosdent


MESOS-5727 and MESOS-5954 refactored the health check code into the 
{{HealthChecker}} library, hence the "mesos-health-check" binary became unused.

While the command and docker executors could just use the library to avoid the 
subprocess complexity, we may want to consider keeping a binary version that 
ships with the installation, because the intention of the binary was to allow 
other executors to re-use our implementation. On the other hand, this binary is 
ill-suited to this since it uses libprocess message passing, so if we do not 
have code that requires the binary it seems ok to remove it for now. Custom 
executors may use the {{HealthChecker}} library directly, it is not much more 
complex than using the binary.





[jira] [Created] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread Philip Winder (JIRA)
Philip Winder created MESOS-5953:


 Summary: Default work dir is not root for unified containerizer 
and docker
 Key: MESOS-5953
 URL: https://issues.apache.org/jira/browse/MESOS-5953
 Project: Mesos
  Issue Type: Bug
  Components: containerization
Reporter: Philip Winder


According to the docker spec, the default working directory (WORKDIR) is root 
(/). https://docs.docker.com/engine/reference/run/#/workdir

The unified containerizer with the docker runtime isolator sets the default 
working directory to /tmp/mesos/sandbox.

Hence, dockerfiles that are relying on the default workdir will not work 
because the pwd is changed by mesos.





[jira] [Updated] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread Philip Winder (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Winder updated MESOS-5953:
-
Description: 
According to the docker spec, the default working directory (WORKDIR) is root 
/. https://docs.docker.com/engine/reference/run/#/workdir

The unified containerizer with the docker runtime isolator sets the default 
working directory to /tmp/mesos/sandbox.

Hence, dockerfiles that are relying on the default workdir will not work 
because the pwd is changed by mesos.

  was:
According to the docker spec, the default working directory (WORKDIR) is root 
(/). https://docs.docker.com/engine/reference/run/#/workdir

The unified containerizer with the docker runtime isolator sets the default 
working directory to /tmp/mesos/sandbox.

Hence, dockerfiles that are relying on the default workdir will not work 
because the pwd is changed by mesos.


> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> /. https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Created] (MESOS-5954) Docker executor does not use HealthChecker library.

2016-08-01 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-5954:
--

 Summary: Docker executor does not use HealthChecker library.
 Key: MESOS-5954
 URL: https://issues.apache.org/jira/browse/MESOS-5954
 Project: Mesos
  Issue Type: Improvement
Reporter: Alexander Rukletsov
Assignee: haosdent


https://github.com/apache/mesos/commit/1556d9a3a02de4e8a90b5b64d268754f95b12d77 
refactored health checks into a library. The command executor uses the library 
instead of the "mesos-health-check" binary; the docker executor should do the 
same for consistency.





[jira] [Updated] (MESOS-5953) Default work dir is not root for unified containerizer and docker

2016-08-01 Thread Philip Winder (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Winder updated MESOS-5953:
-
Affects Version/s: 1.0.0

> Default work dir is not root for unified containerizer and docker
> -
>
> Key: MESOS-5953
> URL: https://issues.apache.org/jira/browse/MESOS-5953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Philip Winder
>
> According to the docker spec, the default working directory (WORKDIR) is root 
> (/). https://docs.docker.com/engine/reference/run/#/workdir
> The unified containerizer with the docker runtime isolator sets the default 
> working directory to /tmp/mesos/sandbox.
> Hence, dockerfiles that are relying on the default workdir will not work 
> because the pwd is changed by mesos.





[jira] [Created] (MESOS-5952) Update docs for new slave removal behavior

2016-08-01 Thread Neil Conway (JIRA)
Neil Conway created MESOS-5952:
--

 Summary: Update docs for new slave removal behavior
 Key: MESOS-5952
 URL: https://issues.apache.org/jira/browse/MESOS-5952
 Project: Mesos
  Issue Type: Improvement
  Components: documentation
Reporter: Neil Conway
Assignee: Neil Conway








[jira] [Created] (MESOS-5951) Remove "strict registry" code

2016-08-01 Thread Neil Conway (JIRA)
Neil Conway created MESOS-5951:
--

 Summary: Remove "strict registry" code
 Key: MESOS-5951
 URL: https://issues.apache.org/jira/browse/MESOS-5951
 Project: Mesos
  Issue Type: Improvement
  Components: master
Reporter: Neil Conway
Assignee: Neil Conway


Once {{PARTITION_AWARE}} frameworks are supported, we should eventually remove 
the code that supports the "non-strict" semantics in the master. That is:

1. The master will be "strict" in Mesos 1.1, in the sense that master behavior 
will always reflect the content of the registry and will not change depending 
on whether the master has failed over. The exception here is that for 
non-PARTITION_AWARE frameworks, we will _only_ kill such tasks on a 
reregistering agent if the master hasn't failed over in the meantime. i.e., 
we'll remain backwards compatible with the previous "non-strict" semantics that 
old frameworks might depend on.
2. The "strict" semantics will be less problematic, because the master will no 
longer be killing tasks and shutting down agents.





[jira] [Created] (MESOS-5950) Consider request/response for reconciliation, bulk reconcile

2016-08-01 Thread Neil Conway (JIRA)
Neil Conway created MESOS-5950:
--

 Summary: Consider request/response for reconciliation, bulk 
reconcile
 Key: MESOS-5950
 URL: https://issues.apache.org/jira/browse/MESOS-5950
 Project: Mesos
  Issue Type: Improvement
  Components: framework api, master
Reporter: Neil Conway


The current task reconciliation API has a few quirks:

1. The master will sometimes use "send nothing" as a way to communicate 
information (MESOS-4050), which is very confusing in a distributed system that 
might drop messages for other reasons.
2. A framework has no way to determine when the reconciliation results for a 
given reconciliation request are "complete". That is, when a framework sends a 
reconciliation request, it starts to receive zero or more task status updates 
(with {{reason}} set to {{REASON_RECONCILIATION}}). The framework can't easily 
determine how many results it should expect to receive.
3. For efficiency (and perhaps to simplify framework logic), it might be easier 
to send a batch of task status updates together in a single message, rather 
than sending potentially tens of thousands of individual messages.

For #2, arguably a framework shouldn't _need_ to know when it has seen the 
"complete" set of results for a reconciliation request. However, supporting a 
"request/reply" structure for reconciliation can simplify framework logic, 
especially if a framework might have multiple timers/reasons to be doing 
reconciliation at the same time.
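A minimal sketch of the batching idea in point 3 above, assuming a simplified status-update type. `StatusUpdate` and `batchUpdates` are illustrative names, not a proposed Mesos API:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch: group reconciliation status updates into
// fixed-size batches so tens of thousands of per-task messages
// become a handful. batchSize must be > 0.
struct StatusUpdate { int taskId; };

std::vector<std::vector<StatusUpdate>> batchUpdates(
    const std::vector<StatusUpdate>& updates, std::size_t batchSize)
{
  std::vector<std::vector<StatusUpdate>> batches;
  for (std::size_t i = 0; i < updates.size(); i += batchSize) {
    const std::size_t end = std::min(i + batchSize, updates.size());
    // Each batch is one message on the wire in this sketch.
    batches.emplace_back(updates.begin() + i, updates.begin() + end);
  }
  return batches;
}
```

A fixed batch count also gives the framework a cheap completeness signal: once it has received the announced number of batches, the reconciliation response is complete.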





[jira] [Updated] (MESOS-4050) Change task reconciliation to not omit unknown tasks

2016-08-01 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4050:
---
Description: 
If the master fails over and a framework tries to do an explicit reconciliation 
for a task running on an agent that has not reregistered yet (and 
{{agent_reregister_timeout}} has not been exceeded), the master will _not_ send 
a reconciliation response for that task.

This is confusing for framework authors. It seems better for the master to 
announce all the information it has explicitly: e.g., to return "task X is in 
an unknown state", rather than not returning anything. Then as more information 
arrives (e.g., agent reregisters or task definitively dies), task state would 
transition appropriately. We might want to do this via a new task state, e.g., 
{{TASK_REREGISTER_PENDING}}.

This might be consistent with changing the task states so that we capture "task 
is partitioned" as an explicit task state ({{TASK_UNKNOWN}} or 
{{TASK_WANDERING}}) -- see MESOS-4049.

  was:
If a framework tries to reconcile the state of a task that is in an unknown 
state (because the agent running the task is partitioned from the master), the 
master will _not_ include any information about that task.

This is confusing for framework authors. It seems better for the master to 
announce all the information it has explicitly: e.g., to return "task X is in 
an unknown state", rather than not returning anything. Then as more information 
arrives (e.g., task returns or task definitively dies), task state would 
transition appropriately.

This might be consistent with changing the task states so that we capture "task 
is partitioned" as an explicit task state ({{TASK_UNKNOWN}} or 
{{TASK_WANDERING}}) -- see MESOS-4049.


> Change task reconciliation to not omit unknown tasks
> -
>
> Key: MESOS-4050
> URL: https://issues.apache.org/jira/browse/MESOS-4050
> Project: Mesos
>  Issue Type: Improvement
>  Components: framework, master
>Reporter: Neil Conway
>  Labels: mesosphere, reconciliation
>
> If the master fails over and a framework tries to do an explicit 
> reconciliation for a task running on an agent that has not reregistered yet 
> (and {{agent_reregister_timeout}} has not been exceeded), the master will 
> _not_ send a reconciliation response for that task.
> This is confusing for framework authors. It seems better for the master to 
> announce all the information it has explicitly: e.g., to return "task X is in 
> an unknown state", rather than not returning anything. Then as more 
> information arrives (e.g., agent reregisters or task definitively dies), task 
> state would transition appropriately. We might want to do this via a new task 
> state, e.g., {{TASK_REREGISTER_PENDING}}.
> This might be consistent with changing the task states so that we capture 
> "task is partitioned" as an explicit task state ({{TASK_UNKNOWN}} or 
> {{TASK_WANDERING}}) -- see MESOS-4049.





[jira] [Created] (MESOS-5949) Allow frameworks to learn the time when an agent became unreachable

2016-08-01 Thread Neil Conway (JIRA)
Neil Conway created MESOS-5949:
--

 Summary: Allow frameworks to learn the time when an agent became 
unreachable
 Key: MESOS-5949
 URL: https://issues.apache.org/jira/browse/MESOS-5949
 Project: Mesos
  Issue Type: Improvement
  Components: master
Reporter: Neil Conway
Assignee: Neil Conway


We currently store the time at which agents become unreachable in the registry, 
but we don't expose that information to frameworks yet. One mechanism would 
be via a new optional field in {{TaskStatus}}; other mechanisms would also be 
possible.





[jira] [Created] (MESOS-5948) Remove rate-limiting for agent removal

2016-08-01 Thread Neil Conway (JIRA)
Neil Conway created MESOS-5948:
--

 Summary: Remove rate-limiting for agent removal
 Key: MESOS-5948
 URL: https://issues.apache.org/jira/browse/MESOS-5948
 Project: Mesos
  Issue Type: Improvement
  Components: master
Reporter: Neil Conway
Assignee: Neil Conway


If we can assume that all frameworks are {{PARTITION_AWARE}} (e.g., for Mesos 
2), we can likely remove the code that applies a rate-limit to agent removal. 
This is because "agent removal" just means marking the agent as 
{{UNREACHABLE}}; because this is a non-destructive operation, we don't need to 
be as careful about the situations in which we do it. If a framework responds 
to {{UNREACHABLE}} by terminating and replacing tasks, they can (and often 
should) use their own safety mechanisms, whether a rate-limit or something else.





[jira] [Updated] (MESOS-5344) Partition-aware Mesos frameworks

2016-08-01 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-5344:
---
  Epic Name: PARTITION_AWARE  (was: New TaskStatuses)
Description: 
This epic covers three related tasks:
1. Allowing partitioned agents to reregister with the master. This allows 
frameworks to control how tasks running on partitioned agents should be dealt 
with.
2. Replacing the TASK_LOST task state with a set of more granular states with 
more precise semantics: UNREACHABLE, DROPPED, UNKNOWN, GONE, and 
GONE_BY_OPERATOR.
3. Allow frameworks to be informed when a task that was running on a 
partitioned agent has been terminated (GONE and GONE_BY_OPERATOR states).

These new behaviors will be guarded by the {{PARTITION_AWARE}} framework 
capability.

  was:
This epic covers three related tasks:
1. Allowing partitioned agents to reregister with the master. This allows 
frameworks to control how tasks running on partitioned agents should be dealt 
with.
2. Replacing the TASK_LOST task state with a set of more granular states with 
more precise semantics: UNREACHABLE, DROPPED, UNKNOWN, GONE, and 
GONE_BY_OPERATOR.
3. Allow frameworks to be informed when a task that was running on a 
partitioned agent has been terminated (GONE and GONE_BY_OPERATOR states).


> Partition-aware Mesos frameworks
> 
>
> Key: MESOS-5344
> URL: https://issues.apache.org/jira/browse/MESOS-5344
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> This epic covers three related tasks:
> 1. Allowing partitioned agents to reregister with the master. This allows 
> frameworks to control how tasks running on partitioned agents should be dealt 
> with.
> 2. Replacing the TASK_LOST task state with a set of more granular states with 
> more precise semantics: UNREACHABLE, DROPPED, UNKNOWN, GONE, and 
> GONE_BY_OPERATOR.
> 3. Allow frameworks to be informed when a task that was running on a 
> partitioned agent has been terminated (GONE and GONE_BY_OPERATOR states).
> These new behaviors will be guarded by the {{PARTITION_AWARE}} framework 
> capability.





[jira] [Updated] (MESOS-5344) Partition-aware Mesos frameworks

2016-08-01 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-5344:
---
Shepherd: Vinod Kone

> Partition-aware Mesos frameworks
> 
>
> Key: MESOS-5344
> URL: https://issues.apache.org/jira/browse/MESOS-5344
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> This epic covers three related tasks:
> 1. Allowing partitioned agents to reregister with the master. This allows 
> frameworks to control how tasks running on partitioned agents should be dealt 
> with.
> 2. Replacing the TASK_LOST task state with a set of more granular states with 
> more precise semantics: UNREACHABLE, DROPPED, UNKNOWN, GONE, and 
> GONE_BY_OPERATOR.
> 3. Allow frameworks to be informed when a task that was running on a 
> partitioned agent has been terminated (GONE and GONE_BY_OPERATOR states).





[jira] [Updated] (MESOS-5344) Partition-aware Mesos frameworks

2016-08-01 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-5344:
---
Description: 
This epic covers three related tasks:
1. Allowing partitioned agents to reregister with the master. This allows 
frameworks to control how tasks running on partitioned agents should be dealt 
with.
2. Replacing the TASK_LOST task state with a set of more granular states with 
more precise semantics: UNREACHABLE, DROPPED, UNKNOWN, GONE, and 
GONE_BY_OPERATOR.
3. Allow frameworks to be informed when a task that was running on a 
partitioned agent has been terminated (GONE and GONE_BY_OPERATOR states).

  was:
This epic covers two related tasks:
1. Clarifying the semantics of TASK_LOST, and allow frameworks to learn when a 
task is *truly* lost (i.e., not running), versus the current LOST semantics of 
"may or may not be running".
2. Allowing frameworks to control how partitioned tasks are handled.



> Partition-aware Mesos frameworks
> 
>
> Key: MESOS-5344
> URL: https://issues.apache.org/jira/browse/MESOS-5344
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> This epic covers three related tasks:
> 1. Allowing partitioned agents to reregister with the master. This allows 
> frameworks to control how tasks running on partitioned agents should be dealt 
> with.
> 2. Replacing the TASK_LOST task state with a set of more granular states with 
> more precise semantics: UNREACHABLE, DROPPED, UNKNOWN, GONE, and 
> GONE_BY_OPERATOR.
> 3. Allow frameworks to be informed when a task that was running on a 
> partitioned agent has been terminated (GONE and GONE_BY_OPERATOR states).





[jira] [Assigned] (MESOS-5344) Partition-aware Mesos frameworks

2016-08-01 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway reassigned MESOS-5344:
--

Assignee: Neil Conway

> Partition-aware Mesos frameworks
> 
>
> Key: MESOS-5344
> URL: https://issues.apache.org/jira/browse/MESOS-5344
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> This epic covers two related tasks:
> 1. Clarifying the semantics of TASK_LOST, and allow frameworks to learn when 
> a task is *truly* lost (i.e., not running), versus the current LOST semantics 
> of "may or may not be running".
> 2. Allowing frameworks to control how partitioned tasks are handled.





[jira] [Updated] (MESOS-5344) Partition-aware Mesos frameworks

2016-08-01 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-5344:
---
Summary: Partition-aware Mesos frameworks  (was: Revise TaskStatus 
semantics)

> Partition-aware Mesos frameworks
> 
>
> Key: MESOS-5344
> URL: https://issues.apache.org/jira/browse/MESOS-5344
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Neil Conway
>  Labels: mesosphere
>
> This epic covers two related tasks:
> 1. Clarifying the semantics of TASK_LOST, and allowing frameworks to learn 
> when a task is *truly* lost (i.e., not running), versus the current LOST 
> semantics of "may or may not be running".
> 2. Allowing frameworks to control how partitioned tasks are handled.





[jira] [Created] (MESOS-5947) Optimize registry to avoid unnecessary storage writes

2016-08-01 Thread Neil Conway (JIRA)
Neil Conway created MESOS-5947:
--

 Summary: Optimize registry to avoid unnecessary storage writes
 Key: MESOS-5947
 URL: https://issues.apache.org/jira/browse/MESOS-5947
 Project: Mesos
  Issue Type: Improvement
  Components: master
Reporter: Neil Conway
Assignee: Neil Conway


If we apply a sequence of registry operations and none of those operations 
actually modify the registry, we can skip writing out the new registry state 
variable to the replicated log. This can be an important optimization, e.g., 
when the master fails over and a large number of agents attempt to reregister 
simultaneously. Since those agents already appear in the "admitted" list of 
agents in the registry, we don't need to do any replicated log writes.
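The idea can be sketched as follows, with hypothetical names (the real implementation lives in the master's registrar and replicated log, not in Python): each operation reports whether it actually mutated the registry, and a storage write is issued only when at least one did.

```python
# Illustrative sketch of skipping no-op registry writes. Names are
# hypothetical; `storage_writes` stands in for replicated-log appends.
def apply_operations(registry, operations, storage_writes):
    """Apply a batch of operations; persist only if something changed."""
    mutated = False
    for op in operations:
        mutated |= op(registry)
    if mutated:
        # Stand-in for writing the new registry state to the replicated log.
        storage_writes.append({"admitted": set(registry["admitted"])})
    return mutated

def admit_agent(agent_id):
    """Operation that admits an agent, reporting whether anything changed."""
    def op(registry):
        if agent_id in registry["admitted"]:
            return False  # already admitted: nothing to write
        registry["admitted"].add(agent_id)
        return True
    return op
```

In the failover scenario described above, a storm of reregistrations from already-admitted agents would then produce zero log writes.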





[jira] [Commented] (MESOS-5896) When start Mesos container and docker images, it does not work.

2016-08-01 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401634#comment-15401634
 ] 

Gilbert Song commented on MESOS-5896:
-

[~Sunzhe], could you try using the default `--containerizers` agent flag, e.g. 
`--containerizers=mesos`?

BTW, are you testing with mesos-execute or with your own framework? Your 
ContainerInfo definition looks strange to me: `NetworkInfo` is not supposed to 
appear inside `ContainerInfo.MesosInfo`.
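For illustration, here is the layout being described, modeled as plain Python dicts rather than protobufs (field names follow mesos.proto of this era; treat this as a sketch, not an authoritative schema): `MesosInfo` carries only the image, while `NetworkInfo` hangs off `ContainerInfo` itself via `network_infos`.

```python
# Sketch of the ContainerInfo protobuf layout, as plain dicts.
# MesosInfo only carries the image; NetworkInfo belongs to the
# enclosing ContainerInfo (field `network_infos`), not to MesosInfo.
container_info = {
    "type": "MESOS",
    "mesos": {
        "image": {
            "type": "DOCKER",
            "docker": {"name": "ubuntu:14.04"},
        }
    },
    # NetworkInfo goes here, at the ContainerInfo level:
    "network_infos": [
        {"name": "example-network"}  # hypothetical network name
    ],
}
```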

> When start Mesos container and docker images, it does not work.
> ---
>
> Key: MESOS-5896
> URL: https://issues.apache.org/jira/browse/MESOS-5896
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 1.0.0
>Reporter: Sunzhe
>  Labels: containerizer
>
> When I create Mesos container with docker image, like this:
> {code:title=test.json|borderStyle=solid}
> {
>   "id": "test-mesos-container-docker-image",
>   "cmd": "while [ true ]; do uname -a; sleep 3; done",
>   "cpus": 0.5,
>   "mem": 32.0,
>   "container": {
>     "type": "MESOS",
>     "mesos": {
>       "image": {
>         "type": "DOCKER",
>         "docker": {
>           "name": "ubuntu:14.04"
>         }
>       },
>       "network": "BRIDGE",
>       "portMappings": [
>         {
>           "containerPort": 8080,
>           "hostPort": 0,
>           "servicePort": 10008,
>           "protocol": "tcp",
>           "labels": {}
>         }
>       ],
>       "privileged": false,
>       "parameters": [],
>       "forcePullImage": false
>     }
>   }
> }
> {code}
> It does not work! The Docker image seems to have no effect: the container 
> uses the host filesystem rather than the image's filesystem.


