[jira] [Commented] (MESOS-2930) Allow the Resource Estimator to express over-allocation of revocable resources.

2015-09-20 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900248#comment-14900248
 ] 

Klaus Ma commented on MESOS-2930:
-

Sure :). I'd like to confirm and collect the requirements before writing a 
patch; then we can follow up on these topics in this task or in a new one.

> Allow the Resource Estimator to express over-allocation of revocable 
> resources.
> ---
>
> Key: MESOS-2930
> URL: https://issues.apache.org/jira/browse/MESOS-2930
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: Benjamin Mahler
>Assignee: Klaus Ma
>
> Currently the resource estimator returns the amount of oversubscription 
> resources that are available. Since resources cannot be negative, this allows 
> the resource estimator to express the following:
> (1) Return empty resources: We are fully allocated for oversubscription 
> resources.
> (2) Return non-empty resources: We are under-allocated for oversubscription 
> resources. In other words, some are available.
> However, there is an additional situation that we cannot express:
> (3) Analogous to returning non-empty "negative" resources: We are 
> over-allocated for oversubscription resources. Do not re-offer any of the 
> over-allocated oversubscription resources that are recovered.
> Without (3), the slave can only shrink the total pool of oversubscription 
> resources by returning (1) as resources are recovered, until the pool has 
> shrunk to the desired size. However, this approach is only best-effort: a 
> framework can launch more tasks in the window between the slave's polls of 
> the estimator (every 15 seconds by default).
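
One way to picture case (3) is an estimate that carries a sign, interpreted relative to what is currently allocated: positive means more revocable resources can be offered, zero means the pool is exactly full, and negative means the pool should shrink below the current allocation. The sketch below only illustrates that idea for a single scalar (CPUs). {{SignedEstimate}}, {{desiredPool}} and {{reofferable}} are hypothetical names and are not part of the Mesos estimator interface.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical signed estimate for one scalar resource (e.g. CPUs),
// relative to the revocable amount that is currently allocated:
//   cpus > 0 : under-allocated, `cpus` more can be offered   (case 2)
//   cpus == 0: fully allocated                               (case 1)
//   cpus < 0 : over-allocated by `-cpus`                     (case 3)
struct SignedEstimate
{
  double cpus;
};

// Size the revocable pool should have, given the current allocation.
inline double desiredPool(double allocated, const SignedEstimate& estimate)
{
  return std::max(0.0, allocated + estimate.cpus);
}

// After `recovered` CPUs come back from finished tasks, return how much
// may be offered again. With a negative estimate, recovered resources
// are withheld until the allocation fits inside the desired pool.
inline double reofferable(
    double allocated,
    const SignedEstimate& estimate,
    double recovered)
{
  assert(recovered >= 0.0 && recovered <= allocated);

  const double stillAllocated = allocated - recovered;
  return std::max(0.0, desiredPool(allocated, estimate) - stillAllocated);
}
```

With 8 CPUs allocated and an estimate of -2, recovering 1 CPU yields nothing to re-offer, while recovering 3 CPUs yields 1. A non-negative estimate behaves like today's cases (1) and (2).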



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2728) Introduce concept of cluster wide resources.

2015-09-20 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-2728:

Epic Name: Global Resources.  (was: Clusterwide Resources.)

> Introduce concept of cluster wide resources.
> 
>
> Key: MESOS-2728
> URL: https://issues.apache.org/jira/browse/MESOS-2728
> Project: Mesos
>  Issue Type: Epic
>Reporter: Joerg Schad
>Assignee: Klaus Ma
>  Labels: mesosphere
>
> There are resources which are not provided by a single node. Consider, for 
> example, the external network bandwidth of a cluster. Because it is a limited 
> resource, it makes sense for Mesos to manage it, yet it is not a resource 
> offered by any single node. A cluster-wide resource is still consumed by a 
> task, and when that task completes, the resource becomes available to be 
> allocated to another framework/task.
> Use Cases:
> 1. Network Bandwidth
> 2. IP Addresses
> 3. Global Service Ports
> 4. Distributed File System Storage
> 5. Software Licences
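
The lifecycle described above (a task consumes part of a cluster-wide resource and gives it back on completion) can be pictured as a single pool that is not tied to any node. This is only an illustration of the concept. The {{GlobalResourcePool}} class and its methods are hypothetical and do not correspond to any existing Mesos interface.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical pool for one cluster-wide scalar resource, for example
// external network bandwidth in Mbit/s or a number of licences.
class GlobalResourcePool
{
public:
  explicit GlobalResourcePool(double capacity)
    : capacity_(capacity), used_(0.0)
  {
    assert(capacity >= 0.0);
  }

  // Reserve `amount` for `taskId`. Fails if the pool cannot cover the
  // request or if the task already holds a share.
  bool allocate(const std::string& taskId, double amount)
  {
    if (amount < 0.0 || amount > available() || held_.count(taskId) > 0) {
      return false;
    }
    held_[taskId] = amount;
    used_ += amount;
    return true;
  }

  // Return a task's share to the pool once the task completes.
  void release(const std::string& taskId)
  {
    auto it = held_.find(taskId);
    if (it == held_.end()) {
      return;
    }
    used_ -= it->second;
    held_.erase(it);
  }

  double available() const { return capacity_ - used_; }

private:
  double capacity_;
  double used_;
  std::map<std::string, double> held_;  // Share held by each task.
};
```

Nothing in the pool refers to a slave, which is the point of the epic: the resource is accounted for once per cluster rather than once per node.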





[jira] [Created] (MESOS-3478) Design document of Global Resources

2015-09-20 Thread Klaus Ma (JIRA)
Klaus Ma created MESOS-3478:
---

 Summary: Design document of Global Resources
 Key: MESOS-3478
 URL: https://issues.apache.org/jira/browse/MESOS-3478
 Project: Mesos
  Issue Type: Task
Reporter: Klaus Ma
Assignee: Klaus Ma








[jira] [Updated] (MESOS-3477) Add design doc for roles/weights configuration

2015-09-20 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3477:
--
Summary: Add design doc for roles/weights configuration  (was: Add design 
doc for roles/weights configuraiton)

> Add design doc for roles/weights configuration
> --
>
> Key: MESOS-3477
> URL: https://issues.apache.org/jira/browse/MESOS-3477
> Project: Mesos
>  Issue Type: Documentation
>  Components: master
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>






[jira] [Created] (MESOS-3477) Add design doc for roles/weights configuraiton

2015-09-20 Thread Yong Qiao Wang (JIRA)
Yong Qiao Wang created MESOS-3477:
-

 Summary: Add design doc for roles/weights configuraiton
 Key: MESOS-3477
 URL: https://issues.apache.org/jira/browse/MESOS-3477
 Project: Mesos
  Issue Type: Documentation
  Components: master
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang








[jira] [Assigned] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-20 Thread Yong Qiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-3177:
-

Assignee: Yong Qiao Wang

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (you can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Comment Edited] (MESOS-3413) Docker containerizer does not symlink persistent volumes into sandbox

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900025#comment-14900025
 ] 

haosdent edited comment on MESOS-3413 at 9/21/15 12:49 AM:
---

[~neunhoef] I'm not sure whether I understand your idea correctly. Let me show 
how I use persistent volumes in Docker below.

I need to set the volume info correctly in ContainerInfo so that the docker 
executor mounts the persistent volume. Suppose we have already set up a "path1" 
persistent volume correctly.
{code}
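  // Container description that is passed along with the task.
  ContainerInfo containerInfo;
  containerInfo.set_type(ContainerInfo::DOCKER);
  ContainerInfo::DockerInfo dockerInfo;
  dockerInfo.set_image("busybox");
  // Mount the "path1" persistent volume at "/path2" in the container.
  Volume* dockerVolume = containerInfo.add_volumes();
  dockerVolume->set_host_path("path1");
  dockerVolume->set_container_path("/path2");
  dockerVolume->set_mode(Volume::RW);
  containerInfo.mutable_docker()->CopyFrom(dockerInfo);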
{code}

We can see that the docker executor mounts "path1" at "/path2" in the docker 
container. I got the log below from the executor's stderr file.
{code}
docker -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906/path1:/path2:rw
 -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906:/mnt/mesos/sandbox
 --net host --entrypoint /bin/sh --name 
mesos-db206124-6d5f-493b-8b72-fdfbf65ed744-S0.88cc5c49-50bd-4bab-9e74-f23c43504906
 busybox -c ls /
{code}
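
Judging from the log above, a relative host path such as "path1" is resolved against the executor's sandbox directory before it is handed to {{docker -v}}. A minimal sketch of that mapping follows. {{volumeFlag}} and the {{Mode}} enum are hypothetical, written only for illustration, and the rule that absolute host paths are passed through unchanged is an assumption rather than something the log shows.

```cpp
#include <cassert>
#include <string>

enum class Mode { RW, RO };

// Hypothetical helper: build the argument that follows `docker -v`.
// A relative host path is joined to the sandbox directory. An absolute
// host path is used as-is (an assumption made for this sketch).
std::string volumeFlag(
    const std::string& sandbox,
    const std::string& hostPath,
    const std::string& containerPath,
    Mode mode)
{
  assert(!hostPath.empty() && !containerPath.empty());

  const std::string host =
    hostPath[0] == '/' ? hostPath : sandbox + "/" + hostPath;

  return host + ":" + containerPath + ":" +
         std::string(mode == Mode::RW ? "rw" : "ro");
}
```

For a sandbox ending in {{.../runs/88cc5c49-50bd-4bab-9e74-f23c43504906}}, the host path "path1", the container path "/path2" and mode RW, this gives a value of the same shape as the first {{-v}} argument in the log.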

From the executor's stdout file, because I ran the "ls /" command, we can also 
see that /path2 exists.
{code}
total 52
drwxrwxr-x    2 root root  4096 May 22  2014 bin
drwxr-xr-x    5 root root   360 Sep 21 00:48 dev
drwxr-xr-x    6 root root  4096 Sep 21 00:48 etc
drwxrwxr-x    4 root root  4096 May 22  2014 home
drwxrwxr-x    2 root root  4096 May 22  2014 lib
lrwxrwxrwx    1 root root     3 May 22  2014 lib64 -> lib
lrwxrwxrwx    1 root root    11 May 22  2014 linuxrc -> bin/busybox
drwxrwxr-x    2 root root  4096 Feb 27  2014 media
drwxrwxr-x    3 root root  4096 Sep 21 00:48 mnt
drwxrwxr-x    2 root root  4096 Feb 27  2014 opt
drwxr-xr-x    2 root root  4096 Sep 21 00:48 path1
dr-xr-xr-x  171 root root     0 Sep 21 00:48 proc
drwx------    2 root root  4096 Feb 27  2014 root
lrwxrwxrwx    1 root root     3 Feb 27  2014 run -> tmp
drwxr-xr-x    2 root root  4096 May 22  2014 sbin
dr-xr-xr-x   13 root root     0 Aug 12 09:16 sys
drwxrwxrwt    3 root root  4096 May 22  2014 tmp
drwxrwxr-x    6 root root  4096 May 22  2014 usr
drwxrwxr-x    4 root root  4096 May 22  2014 var
{code}


was (Author: haosd...@gmail.com):
[~neunhoef] I'm not sure whether I understand your idea correctly. Let me show 
how I use persistent volumes in Docker below.

I need to set the volume info correctly in ContainerInfo so that the docker 
executor mounts the persistent volume. Suppose we have already set up a "path1" 
persistent volume correctly.
{code}
  ContainerInfo::DockerInfo dockerInfo;
  dockerInfo.set_image("busybox");
  Volume* dockerVolume = containerInfo.add_volumes();
  dockerVolume->set_host_path("path1");
  dockerVolume->set_container_path("/path2");
  dockerVolume->set_mode(Volume::RW);
  containerInfo.mutable_docker()->CopyFrom(dockerInfo);
{code}

We can see that the docker executor mounts "path1" at "/path2" in the docker 
container. I got the log below from the executor's stderr file.
{code}
docker -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906/path1:/path2:rw
 -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906:/mnt/mesos/sandbox
 --net host --entrypoint /bin/sh --name 
mesos-db206124-6d5f-493b-8b72-fdfbf65ed744-S0.88cc5c49-50bd-4bab-9e74-f23c43504906
 busybox -c ls /
{code}

From the executor's stdout file, because I ran the "ls /" command, we can also 
see that /path2 exists.
{code}
Starting task 1
bin
dev
etc
home
lib
lib64
linuxrc
media
mnt
opt
path2
proc
root
run
sbin
sys
tmp
usr
var
{code}

> Docker containerizer does not symlink persistent volumes into sandbox
> -
>
> Key: MESOS-3413
> URL: https://issues.apache.org/jira/browse/MESOS-3413
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker, slave
>Affects Versions: 0.23.0
>Reporter: Max Neunhöffer
>Assignee: haosdent
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> For the Ara

[jira] [Comment Edited] (MESOS-3413) Docker containerizer does not symlink persistent volumes into sandbox

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900025#comment-14900025
 ] 

haosdent edited comment on MESOS-3413 at 9/21/15 12:49 AM:
---

[~neunhoef] I'm not sure whether I understand your idea correctly. Let me show 
how I use persistent volumes in Docker below.

I need to set the volume info correctly in ContainerInfo so that the docker 
executor mounts the persistent volume. Suppose we have already set up a "path1" 
persistent volume correctly.
{code}
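  // Container description that is passed along with the task.
  ContainerInfo containerInfo;
  containerInfo.set_type(ContainerInfo::DOCKER);
  ContainerInfo::DockerInfo dockerInfo;
  dockerInfo.set_image("busybox");
  // Mount the "path1" persistent volume at "/path2" in the container.
  Volume* dockerVolume = containerInfo.add_volumes();
  dockerVolume->set_host_path("path1");
  dockerVolume->set_container_path("/path2");
  dockerVolume->set_mode(Volume::RW);
  containerInfo.mutable_docker()->CopyFrom(dockerInfo);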
{code}

We can see that the docker executor mounts "path1" at "/path2" in the docker 
container. I got the log below from the executor's stderr file.
{code}
docker -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906/path1:/path2:rw
 -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906:/mnt/mesos/sandbox
 --net host --entrypoint /bin/sh --name 
mesos-db206124-6d5f-493b-8b72-fdfbf65ed744-S0.88cc5c49-50bd-4bab-9e74-f23c43504906
 busybox -c ls /
{code}

From the executor's stdout file, because I ran the "ls /" command, we can also 
see that /path2 exists.
{code}
total 52
drwxrwxr-x    2 root root  4096 May 22  2014 bin
drwxr-xr-x    5 root root   360 Sep 21 00:48 dev
drwxr-xr-x    6 root root  4096 Sep 21 00:48 etc
drwxrwxr-x    4 root root  4096 May 22  2014 home
drwxrwxr-x    2 root root  4096 May 22  2014 lib
lrwxrwxrwx    1 root root     3 May 22  2014 lib64 -> lib
lrwxrwxrwx    1 root root    11 May 22  2014 linuxrc -> bin/busybox
drwxrwxr-x    2 root root  4096 Feb 27  2014 media
drwxrwxr-x    3 root root  4096 Sep 21 00:48 mnt
drwxrwxr-x    2 root root  4096 Feb 27  2014 opt
drwxr-xr-x    2 root root  4096 Sep 21 00:48 path2
dr-xr-xr-x  171 root root     0 Sep 21 00:48 proc
drwx------    2 root root  4096 Feb 27  2014 root
lrwxrwxrwx    1 root root     3 Feb 27  2014 run -> tmp
drwxr-xr-x    2 root root  4096 May 22  2014 sbin
dr-xr-xr-x   13 root root     0 Aug 12 09:16 sys
drwxrwxrwt    3 root root  4096 May 22  2014 tmp
drwxrwxr-x    6 root root  4096 May 22  2014 usr
drwxrwxr-x    4 root root  4096 May 22  2014 var
{code}


was (Author: haosd...@gmail.com):
[~neunhoef] I'm not sure whether I understand your idea correctly. Let me show 
how I use persistent volumes in Docker below.

I need to set the volume info correctly in ContainerInfo so that the docker 
executor mounts the persistent volume. Suppose we have already set up a "path1" 
persistent volume correctly.
{code}
  ContainerInfo::DockerInfo dockerInfo;
  dockerInfo.set_image("busybox");
  Volume* dockerVolume = containerInfo.add_volumes();
  dockerVolume->set_host_path("path1");
  dockerVolume->set_container_path("/path2");
  dockerVolume->set_mode(Volume::RW);
  containerInfo.mutable_docker()->CopyFrom(dockerInfo);
{code}

We can see that the docker executor mounts "path1" at "/path2" in the docker 
container. I got the log below from the executor's stderr file.
{code}
docker -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906/path1:/path2:rw
 -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906:/mnt/mesos/sandbox
 --net host --entrypoint /bin/sh --name 
mesos-db206124-6d5f-493b-8b72-fdfbf65ed744-S0.88cc5c49-50bd-4bab-9e74-f23c43504906
 busybox -c ls /
{code}

From the executor's stdout file, because I ran the "ls /" command, we can also 
see that /path2 exists.
{code}
total 52
drwxrwxr-x    2 root root  4096 May 22  2014 bin
drwxr-xr-x    5 root root   360 Sep 21 00:48 dev
drwxr-xr-x    6 root root  4096 Sep 21 00:48 etc
drwxrwxr-x    4 root root  4096 May 22  2014 home
drwxrwxr-x    2 root root  4096 May 22  2014 lib
lrwxrwxrwx    1 root root     3 May 22  2014 lib64 -> lib
lrwxrwxrwx    1 root root    11 May 22  2014 linuxrc -> bin/busybox
drwxrwxr-x    2 root root  4096 Feb 27  2014 media
drwxrwxr-x    3 root root  4096 Sep 21 00:48 mnt
drwxrwxr-x    2 root root  4096 Feb 27  201

[jira] [Commented] (MESOS-3462) Containerization issues with mesos running on CoreOS

2015-09-20 Thread Francis Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900105#comment-14900105
 ] 

Francis Chuang commented on MESOS-3462:
---

1. I am setting the --prefix during ./configure to /tmp/mesos-build/mesos. When 
I run make install, it works. I have also tried not setting the --prefix flag 
and it also works when I run make install.

2. When I run the second command:
Traceback (most recent call last):
  File "app_main.py", line 75, in run_toplevel
  File "app_main.py", line 581, in run_it
  File "<string>", line 1, in <module>
ImportError: No module named mesos

Python was installed using 
https://github.com/defunctzombie/ansible-coreos-bootstrap

> Containerization issues with mesos running on CoreOS
> 
>
> Key: MESOS-3462
> URL: https://issues.apache.org/jira/browse/MESOS-3462
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 0.24.0
> Environment: CoreOS 801.0.0 64-bit
>Reporter: Francis Chuang
>Assignee: haosdent
>
> These are the steps I used to build mesos 0.24.0 on Ubuntu 15.04 64-bit:
> wget http://www.apache.org/dist/mesos/0.24.0/mesos-0.24.0.tar.gz
> wget http://mirror.ventraip.net.au/apache/apr/apr-1.5.2.tar.gz
> wget http://mirror.ventraip.net.au/apache/apr/apr-util-1.5.4.tar.gz
> wget http://mirror.ventraip.net.au/apache/subversion/subversion-1.9.0.tar.gz
> wget http://www.sqlite.org/sqlite-amalgamation-3071501.zip
> wget ftp://ftp.cyrusimap.org/cyrus-sasl/cyrus-sasl-2.1.26.tar.gz
> mkdir /tmp/mesos-build
> cd /tmp/mesos-build
> - Build apr
> tar zxf apr-$APR_VERSION.tar.gz
> cd apr-$APR_VERSION
> ./configure CC=gcc-4.8 --prefix=/tmp/mesos-build/apr
> make
> make install
> cd ..
> - Build apr-util
> tar zxf apr-util-$APR_UTIL_VERSION.tar.gz
> cd apr-util-$APR_UTIL_VERSION
> ./configure CC=gcc-4.8 --prefix=/tmp/mesos-build/apr-util 
> --with-apr=/tmp/mesos-build/apr
> make
> make install
> cd ..
> - Build libsasl2
> tar zxf cyrus-sasl-$SASL_VERSION.tar.gz
> cd cyrus-sasl-$SASL_VERSION
> ./configure CC=gcc-4.8 CPPFLAGS=-I/usr/include/openssl 
> --prefix=/tmp/mesos-build/sasl2 --enable-cram
> make
> make install
> cd ..
> - Build subversion
> tar zxf subversion-$SVN_VERSION.tar.gz
> unzip sqlite-amalgamation-$SQLITE_AMALGATION_VERSION.zip
> mv sqlite-amalgamation-$SQLITE_AMALGATION_VERSION/ 
> subversion-$SVN_VERSION/sqlite-amalgamation/
> cd subversion-$SVN_VERSION
> ./configure CC=gcc-4.8 CXX=g++-4.8 --prefix=/tmp/mesos-build/svn 
> --with-apr=/tmp/mesos-build/apr --with-apr-util=/tmp/mesos-build/apr-util 
> --with-sasl=/tmp/mesos-build/sasl2
> make
> make install
> cd ..
> - Build curl
> tar zxf curl-$CURL_VERSION.tar.gz
> cd curl-$CURL_VERSION
> ./configure CC=gcc-4.8 --prefix=/tmp/mesos-build/curl
> make
> make install
> cd ..
> - Build mesos
> tar zxf mesos-$MESOS_VERSION.tar.gz
> cd mesos-$MESOS_VERSION
> mkdir build
> cd build
> ../configure CC=gcc-4.8 CXX=g++-4.8 
> LD_LIBRARY_PATH=/tmp/mesos-build/sasl2/lib 
> SASL_PATH=/tmp/mesos-build/sasl2/lib/sasl2 --prefix=/tmp/mesos-build/mesos 
> --with-svn=/tmp/mesos-build/svn --with-apr=/tmp/mesos-build/apr 
> --with-sasl=/tmp/mesos-build/sasl2/ --with-curl=/tmp/mesos-build/curl
> make
> make install
> cd ..
> cd ..
> - Copy shared objects into mesos build
> cp apr/lib/libapr-1.so.0.5.2 mesos/lib/libapr-1.so.0
> cp apr-util/lib/libaprutil-1.so.0.5.4 mesos/lib/libaprutil-1.so.0
> cp sasl2/lib/libsasl2.so.3.0.0 mesos/lib/libsasl2.so.3
> cp svn/lib/libsvn_delta-1.so.0.0.0 mesos/lib/libsvn_delta-1.so.0
> cp svn/lib/libsvn_subr-1.so.0.0.0 mesos/lib/libsvn_subr-1.so.0
> I then compressed the build into an archive and distributed it onto my CoreOS 
> nodes.
> Once I have the archive extracted on each node, I start the master and slaves:
> /opt/mesos/sbin/mesos-master --zk=zk://192.168.33.10/mesos --quorum=1 
> --hostname=192.168.33.10 --ip=192.168.33.10 
> --webui_dir=/opt/mesos/share/mesos/webui --cluster=mesos
> /opt/mesos/sbin/mesos-slave --hostname=192.168.33.11 --ip=192.168.33.11 
> --master=zk://192.168.33.10/mesos 
> --executor_environment_variables='{"LD_LIBRARY_PATH": "/opt/mesos/lib", 
> "PATH": "/opt/java/bin:/usr/sbin:/usr/bin"}' --containerizers=docker,mesos 
> --executor_registration_timeout=60mins 
> --launcher_dir=/opt/mesos/libexec/mesos/
> In addition, the following environment variables are set:
> LD_LIBRARY_PATH=/opt/mesos/lib/
> JAVA_HOME=/opt/java
> MESOS_NATIVE_JAVA_LIBRARY=/opt/mesos/lib/libmesos.so
> I am finding that when I run meso-hdfs from 
> https://github.com/mesosphere/hdfs, the scheduler starts properly and 
> launches the executors. However, the executors will fail and terminate 
> without writing any error to stderr and stdout.
> I have reproduced the same problem with mesos 0.24, 0.23 and 0.22.1
> If I install mesos onto a Ubunt

[jira] [Commented] (MESOS-3448) Trying to Live Upgrade Mesos Slaves from 0.22.x to 0.23.x Causes New Executors to Fail

2015-09-20 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900077#comment-14900077
 ] 

Benjamin Mahler commented on MESOS-3448:


Agreed that it would be ideal, but it's nice to keep the backwards 
compatibility burden off of the developers. Definitely warrants documenting! 
Could you do me a huge favor and file a ticket for us to document this so 
others don't get tripped up and we can point folks to something?

> Trying to Live Upgrade Mesos Slaves from 0.22.x to 0.23.x Causes New 
> Executors to Fail
> --
>
> Key: MESOS-3448
> URL: https://issues.apache.org/jira/browse/MESOS-3448
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Reporter: Paul Cavallaro
>Priority: Minor
>
> When live upgrading from Mesos 0.22.x to 0.23.x, the old running mesos-0.22.x 
> slaves call out to the newly installed 0.23.x fetcher and fail with:
> F0914 11:43:16.355821 28833 fetcher.cpp:415] CHECK_SOME(fetcherInfo): Missing 
> required fields: sandbox_directory Failed to parse FetcherInfo: Missing 
> required fields: sandbox_directory
> This is apparently because the old 0.22.x slave does not include fields 
> that the 0.23.x fetcher requires.
> Here's another report of that happening: 
> http://wilderness.apache.org/channels/?f=mesos/2015-07-29
> Just wanted to point out that this makes live upgrades from 0.22.x to 0.23.x 
> annoying, when they would otherwise be smooth.





[jira] [Updated] (MESOS-3051) performance issues with port ranges comparison

2015-09-20 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3051:

Target Version/s: 0.25.0

> performance issues with port ranges comparison
> --
>
> Key: MESOS-3051
> URL: https://issues.apache.org/jira/browse/MESOS-3051
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Affects Versions: 0.22.1
>Reporter: James Peach
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> Testing in an environment with lots of frameworks (>200), where the 
> frameworks permanently decline resources they don't need. The allocator ends 
> up spending a lot of time figuring out whether offers are refused (the code 
> path through {{HierarchicalAllocatorProcess::isFiltered()}}).
> In profiling a synthetic benchmark, it turns out that comparing port ranges 
> is very expensive, involving many temporary allocations. 61% of 
> Resources::contains() run time is in operator -= (Resource). 35% of 
> Resources::contains() run time is in Resources::_contains().
> The heaviest call chain through {{Resources::_contains}} is:
> {code}
> Running Time       Self (ms)  Symbol Name
> 7237.0ms   35.5%     4.0  mesos::Resources::_contains(mesos::Resource const&) const
> 7200.0ms   35.3%     1.0  mesos::contains(mesos::Resource const&, mesos::Resource const&)
> 7133.0ms   35.0%   121.0  mesos::operator<=(mesos::Value_Ranges const&, mesos::Value_Ranges const&)
> 6319.0ms   31.0%     7.0  mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Ranges const&)
> 6240.0ms   30.6%   161.0  mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
> 1867.0ms    9.1%    25.0  mesos::Value_Ranges::add_range()
> 1694.0ms    8.3%     4.0  mesos::Value_Ranges::~Value_Ranges()
> 1495.0ms    7.3%    16.0  mesos::Value_Ranges::operator=(mesos::Value_Ranges const&)
>  445.0ms    2.1%    94.0  mesos::Value_Range::MergeFrom(mesos::Value_Range const&)
>  154.0ms    0.7%    24.0  mesos::Value_Ranges::range(int) const
>  103.0ms    0.5%    24.0  mesos::Value_Ranges::range_size() const
>   95.0ms    0.4%     2.0  mesos::Value_Range::Value_Range(mesos::Value_Range const&)
>   59.0ms    0.2%     4.0  mesos::Value_Ranges::Value_Ranges()
>   50.0ms    0.2%    50.0  mesos::Value_Range::begin() const
>   28.0ms    0.1%    28.0  mesos::Value_Range::end() const
>   26.0ms    0.1%     0.0  mesos::Value_Range::~Value_Range()
> {code}
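
The profile above suggests that most of the cost comes from building temporary coalesced {{Value_Ranges}} objects just to answer a containment question. As a sketch of an alternative, and assuming the outer list is already sorted and coalesced (non-overlapping and non-adjacent) and the inner list is sorted by its begin value, containment can be checked in one linear pass without allocating anything. The {{Range}} struct and this {{contains}} function are stand-ins written for the example, not the Mesos protobuf types or the existing implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a port range with inclusive bounds.
struct Range
{
  uint64_t begin;
  uint64_t end;
};

// Returns true if every range in `inner` lies inside one range of `outer`.
// Preconditions (assumed, not checked): `outer` is sorted, non-overlapping
// and non-adjacent; `inner` is sorted by `begin`.
bool contains(const std::vector<Range>& outer, const std::vector<Range>& inner)
{
  std::size_t i = 0;

  for (const Range& r : inner) {
    assert(r.begin <= r.end);

    // Skip outer ranges that end before this inner range starts. Since
    // `inner` is sorted, they cannot contain any later inner range either.
    while (i < outer.size() && outer[i].end < r.begin) {
      ++i;
    }

    // Because `outer` is coalesced, `r` must fit in this single range.
    if (i == outer.size() || outer[i].begin > r.begin || outer[i].end < r.end) {
      return false;
    }
  }

  return true;
}
```

The check touches each range at most once and creates no temporaries. Whether the preconditions hold for the ranges stored in {{Resources}} would need to be verified before relying on anything like this.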
> mesos::coalesce(Value_Ranges) gets done a lot and ends up being really 
> expensive. The heaviest parts of the inverted call chain are:
> {code}
> Running Time       Self (ms)  Symbol Name
> 3209.0ms   15.7%  3209.0  mesos::Value_Range::~Value_Range()
> 3209.0ms   15.7%     0.0  google::protobuf::internal::GenericTypeHandler::Delete(mesos::Value_Range*)
> 3209.0ms   15.7%     0.0  void google::protobuf::internal::RepeatedPtrFieldBase::Destroy::TypeHandler>()
> 3209.0ms   15.7%     0.0  google::protobuf::RepeatedPtrField::~RepeatedPtrField()
> 3209.0ms   15.7%     0.0  google::protobuf::RepeatedPtrField::~RepeatedPtrField()
> 3209.0ms   15.7%     0.0  mesos::Value_Ranges::~Value_Ranges()
> 3209.0ms   15.7%     0.0  mesos::Value_Ranges::~Value_Ranges()
> 2441.0ms   11.9%     0.0  mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
>  452.0ms    2.2%     0.0  mesos::remove(mesos::Value_Ranges*, mesos::Value_Range const&)
>  169.0ms    0.8%     0.0  mesos::operator<=(mesos::Value_Ranges const&, mesos::Value_Ranges const&)
>   82.0ms    0.4%     0.0  mesos::operator-=(mesos::Value_Ranges&, mesos::Value_Ranges const&)
>   65.0ms    0.3%     0.0  mesos::Value_Ranges::~Value_Ranges()
> 2541.0ms   12.4%  2541.0  google::protobuf::internal::GenericTypeHandler::New()
> 2541.0ms   12.4%     0.0  google::protobuf::RepeatedPtrField::TypeHandler::Type* google::protobuf::internal::RepeatedPtrFieldBase::Add::TypeHandler>()
> 2305.0ms   11.3%     0.0  google::protobuf::RepeatedPtrField::Add()
> 2305.0ms   11.3%     0.0  mesos::Value_Ranges::add_range()
> 1962.0ms    9.6%     0.0  mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
>  343.0ms    1.6%     0.0

[jira] [Updated] (MESOS-3476) Refactor Status Update method on Slave to handle HTTP based Executors

2015-09-20 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-3476:
--
Summary: Refactor Status Update method on Slave to handle HTTP based 
Executors  (was: Refactor Status Update method on Slave to handle HTTP 
Executors)

> Refactor Status Update method on Slave to handle HTTP based Executors
> -
>
> Key: MESOS-3476
> URL: https://issues.apache.org/jira/browse/MESOS-3476
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Currently, status updates that the slave sends to itself (from {{runTask}} 
> and {{killTask}}) and status updates from executors are all handled by the 
> {{Slave::statusUpdate}} method on Slave. The signature of the method is 
> {{void Slave::statusUpdate(StatusUpdate update, const UPID& pid)}}. 
> We need to add another overload that can also handle HTTP-based executors 
> and that the existing PID-based function can call into. The signature of 
> the new function could be:
> {{void Slave::statusUpdate(StatusUpdate update, Executor* executor)}}
> The HTTP executor would also call into this new function via 
> {{src/slave/http.cpp}}.
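
The proposed structure, where the PID-based entry point resolves an executor and then delegates to a shared overload, might look roughly like the sketch below. The types are simplified stand-ins so that the example compiles on its own: {{StatusUpdate}}, {{UPID}} and {{Executor}} are reduced to plain structs, and {{addExecutor}}, {{findExecutor}} and the {{handled}} log are hypothetical helpers that are not part of the real Slave class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins for the real Mesos types.
struct StatusUpdate { std::string taskId; std::string state; };
struct UPID { std::string id; };
struct Executor { std::string id; bool http; };

class Slave
{
public:
  // Hypothetical helper: remember which executor belongs to a PID.
  void addExecutor(const UPID& pid, const Executor& executor)
  {
    executors_[pid.id] = executor;
  }

  // Existing PID-based entry point: resolve the executor, then delegate.
  // A null executor stands for an update the slave sent to itself.
  void statusUpdate(StatusUpdate update, const UPID& pid)
  {
    statusUpdate(update, findExecutor(pid));
  }

  // Proposed overload: shared by PID-based and HTTP-based executors.
  void statusUpdate(StatusUpdate update, Executor* executor)
  {
    assert(!update.taskId.empty());

    const std::string source =
      executor == nullptr ? std::string("slave") : executor->id;
    handled.push_back(update.taskId + ":" + update.state + " from " + source);
  }

  std::vector<std::string> handled;  // Record of processed updates.

private:
  Executor* findExecutor(const UPID& pid)
  {
    auto it = executors_.find(pid.id);
    return it == executors_.end() ? nullptr : &it->second;
  }

  std::map<std::string, Executor> executors_;
};
```

An HTTP endpoint would skip the PID lookup and call the {{Executor*}} overload directly, so both paths end up in the same handling code.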





[jira] [Assigned] (MESOS-3476) Refactor Status Update method on Slave to handle HTTP Executors

2015-09-20 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar reassigned MESOS-3476:
-

Assignee: Anand Mazumdar

> Refactor Status Update method on Slave to handle HTTP Executors
> ---
>
> Key: MESOS-3476
> URL: https://issues.apache.org/jira/browse/MESOS-3476
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Currently, status updates that the slave sends to itself (from {{runTask}} 
> and {{killTask}}) and status updates from executors are all handled by the 
> {{Slave::statusUpdate}} method on Slave. The signature of the method is 
> {{void Slave::statusUpdate(StatusUpdate update, const UPID& pid)}}. 
> We need to add another overload that can also handle HTTP-based executors 
> and that the existing PID-based function can call into. The signature of 
> the new function could be:
> {{void Slave::statusUpdate(StatusUpdate update, Executor* executor)}}
> The HTTP executor would also call into this new function via 
> {{src/slave/http.cpp}}.





[jira] [Created] (MESOS-3476) Refactor Status Update method on Slave to handle HTTP Executors

2015-09-20 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-3476:
-

 Summary: Refactor Status Update method on Slave to handle HTTP 
Executors
 Key: MESOS-3476
 URL: https://issues.apache.org/jira/browse/MESOS-3476
 Project: Mesos
  Issue Type: Bug
Reporter: Anand Mazumdar


Currently, status updates that the slave sends to itself (from {{runTask}} and 
{{killTask}}) and status updates from executors are all handled by the 
{{Slave::statusUpdate}} method on Slave. The signature of the method is {{void 
Slave::statusUpdate(StatusUpdate update, const UPID& pid)}}. 

We need to add another overload that can also handle HTTP-based executors 
and that the existing PID-based function can call into. The signature of 
the new function could be:

{{void Slave::statusUpdate(StatusUpdate update, Executor* executor)}}

The HTTP executor would also call into this new function via 
{{src/slave/http.cpp}}.





[jira] [Commented] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900028#comment-14900028
 ] 

Joris Van Remoortere commented on MESOS-3475:
-

That's ok. You couldn't have known... I wanted to make the JIRA first so I 
could reference it in the commit ;-)

> TestContainerizer should not modify global environment variables
> 
>
> Key: MESOS-3475
> URL: https://issues.apache.org/jira/browse/MESOS-3475
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joris Van Remoortere
>Assignee: haosdent
>
> Currently the {{TestContainerizer}} modifies the environment variables. Since 
> these are global variables, this can cause other threads reading these 
> variables to get inconsistent results, or even segfault if they happen to 
> read while the environment is being changed.
> Synchronizing within the TestContainerizer is not sufficient. We should pass 
> the environment variables into a fork, or set them on the command line of an 
> execute.
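
As one possible shape for the suggested fix, the environment can be assembled as a private list of KEY=VALUE strings and handed to the child at exec time, so the parent's global environment is never touched. The sketch below uses POSIX {{fork}}, {{execve}} and {{waitpid}}. The helper names {{toEnvStrings}} and {{runWithEnvironment}} are made up for this example and do not come from the Mesos test code, and running the command through /bin/sh is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Render a map as "KEY=VALUE" strings suitable for execve. The parent's
// own environment is neither read nor modified.
std::vector<std::string> toEnvStrings(
    const std::map<std::string, std::string>& env)
{
  std::vector<std::string> result;
  for (const auto& entry : env) {
    result.push_back(entry.first + "=" + entry.second);
  }
  return result;
}

// Run `/bin/sh -c command` with exactly `env` as its environment.
// Returns the child's exit status, or -1 on failure.
int runWithEnvironment(
    const std::string& command,
    const std::map<std::string, std::string>& env)
{
  assert(!command.empty());

  // Everything is allocated before fork, so the child only calls
  // execve and _exit.
  std::vector<std::string> strings = toEnvStrings(env);
  std::vector<char*> envp;
  for (std::string& s : strings) {
    envp.push_back(&s[0]);
  }
  envp.push_back(nullptr);

  std::string sh = "/bin/sh";
  std::string flag = "-c";
  std::string cmd = command;
  char* argv[] = {&sh[0], &flag[0], &cmd[0], nullptr};

  const pid_t pid = fork();
  if (pid == -1) {
    return -1;
  }

  if (pid == 0) {
    execve("/bin/sh", argv, envp.data());
    _exit(127);  // Only reached if execve failed.
  }

  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the variables live only in the child's {{envp}}, other threads in the parent that read the environment at the same time see no change.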





[jira] [Commented] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900027#comment-14900027
 ] 

haosdent commented on MESOS-3475:
-

Oh, sorry.

> TestContainerizer should not modify global environment variables
> 
>
> Key: MESOS-3475
> URL: https://issues.apache.org/jira/browse/MESOS-3475
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joris Van Remoortere
>
> Currently the {{TestContainerizer}} modifies the environment variables. Since 
> these are global variables, this can cause other threads reading these 
> variables to get inconsistent results, or even segfault if they happen to 
> read while the environment is being changed.
> Synchronizing within the TestContainerizer is not sufficient. We should pass 
> the environment variables into a fork, or set them on the command line of an 
> execute.





[jira] [Assigned] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-3475:
---

Assignee: haosdent

> TestContainerizer should not modify global environment variables
> 
>
> Key: MESOS-3475
> URL: https://issues.apache.org/jira/browse/MESOS-3475
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joris Van Remoortere
>Assignee: haosdent
>
> Currently the {{TestContainerizer}} modifies the environment variables. Since 
> these are global variables, this can cause other threads reading these 
> variables to get inconsistent results, or even segfault if they happen to 
> read while the environment is being changed.
> Synchronizing within the TestContainerizer is not sufficient. We should pass 
> the environment variables into a fork, or set them on the command line of an 
> execute.





[jira] [Updated] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-3475:

Assignee: (was: haosdent)

> TestContainerizer should not modify global environment variables
> 
>
> Key: MESOS-3475
> URL: https://issues.apache.org/jira/browse/MESOS-3475
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joris Van Remoortere
>
> Currently the {{TestContainerizer}} modifies the environment variables. Since 
> these are global variables, this can cause other threads reading these 
> variables to get inconsistent results, or even segfault if they happen to 
> read while the environment is being changed.
> Synchronizing within the TestContainerizer is not sufficient. We should pass 
> the environment variables into a fork, or set them on the command line of an 
> execute.





[jira] [Commented] (MESOS-3413) Docker containerizer does not symlink persistent volumes into sandbox

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900025#comment-14900025
 ] 

haosdent commented on MESOS-3413:
-

[~neunhoef] I'm not sure I understand your idea correctly. Let me show how I 
use persistent volumes with Docker below.

I need to set the volume info correctly in ContainerInfo so that the docker 
executor mounts the persistent volume. Suppose we have already set up a "path1" 
persistent volume correctly.
{code}
  ContainerInfo::DockerInfo dockerInfo;
  dockerInfo.set_image("busybox");
  Volume* dockerVolume = containerInfo.add_volumes();
  dockerVolume->set_host_path("path1");
  dockerVolume->set_container_path("/path2");
  dockerVolume->set_mode(Volume::RW);
  containerInfo.mutable_docker()->CopyFrom(dockerInfo);
{code}

We can see that the docker executor mounts "path1" at "/path2" in the 
docker container. I got the log below from the executor's stderr file.
{code}
docker -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906/path1:/path2:rw
 -v 
/tmp//slaves/db206124-6d5f-493b-8b72-fdfbf65ed744-S0/frameworks/db206124-6d5f-493b-8b72-fdfbf65ed744-/executors/1/runs/88cc5c49-50bd-4bab-9e74-f23c43504906:/mnt/mesos/sandbox
 --net host --entrypoint /bin/sh --name 
mesos-db206124-6d5f-493b-8b72-fdfbf65ed744-S0.88cc5c49-50bd-4bab-9e74-f23c43504906
 busybox -c ls /
{code}
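
The mapping from the Volume fields to the {{-v}} argument in the log above can be sketched as follows. This is only an illustration, not the docker executor's actual code: the helper name {{dockerVolumeArg}} and its parameters are hypothetical, and resolving a relative host path against the sandbox directory is an assumption inferred from the logged command line.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Mesos): builds the value passed to
// "docker -v" from a host path, a container path and an access mode.
std::string dockerVolumeArg(
    const std::string& sandbox,
    const std::string& hostPath,
    const std::string& containerPath,
    bool readWrite)
{
  // The container path is expected to be absolute, e.g. "/path2".
  assert(!containerPath.empty() && containerPath[0] == '/');

  // Assumption taken from the log: a relative host path such as "path1"
  // is resolved against the sandbox directory.
  const bool absolute = !hostPath.empty() && hostPath[0] == '/';
  const std::string source = absolute ? hostPath : sandbox + "/" + hostPath;

  return source + ":" + containerPath + ":" + (readWrite ? "rw" : "ro");
}
```

With a sandbox of "/tmp/sandbox", the volume from the snippet above would come out as "/tmp/sandbox/path1:/path2:rw", which has the same shape as the first {{-v}} argument in the log.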

Because I ran the "ls /" command, the executor's stdout file also shows that 
/path2 exists.
{code}
Starting task 1
bin
dev
etc
home
lib
lib64
linuxrc
media
mnt
opt
path2
proc
root
run
sbin
sys
tmp
usr
var
{code}

> Docker containerizer does not symlink persistent volumes into sandbox
> -
>
> Key: MESOS-3413
> URL: https://issues.apache.org/jira/browse/MESOS-3413
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker, slave
>Affects Versions: 0.23.0
>Reporter: Max Neunhöffer
>Assignee: haosdent
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> For the ArangoDB framework I am trying to use the persistent primitives. 
> Nearly all is working, but I am missing a crucial piece at the end: I have 
> successfully created a persistent disk resource and have set the persistence 
> and volume information in the DiskInfo message. However, I do not see any way 
> to find out what directory on the host the mesos slave has reserved for us. I 
> know it is ${MESOS_SLAVE_WORKDIR}/volumes/roles//_ but we 
> have no way to query this information anywhere. The docker containerizer does 
> not automatically mount this directory into our docker container, or symlink 
> it into our sandbox. Therefore, I have essentially no access to it. Note that 
> the mesos containerizer (which I cannot use for other reasons) seems to 
> create a symlink in the sandbox to the actual path for the persistent volume. 
> With that, I could mount the volume into our docker container and all would 
> be well.





[jira] [Commented] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900022#comment-14900022
 ] 

Joris Van Remoortere commented on MESOS-3475:
-

Hi [~haosd...@gmail.com]!
Sort of. I am referring to my changes and comments in this commit:
https://github.com/apache/mesos/commit/a92ff3cd7388cfcf948e4ffa3dabcad98a29e3a8

> TestContainerizer should not modify global environment variables
> 
>
> Key: MESOS-3475
> URL: https://issues.apache.org/jira/browse/MESOS-3475
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joris Van Remoortere
>Assignee: haosdent
>
> Currently the {{TestContainerizer}} modifies the environment variables. Since 
> these are global variables, this can cause other threads reading these 
> variables to get inconsistent results, or even segfault if they happen to 
> read while the environment is being changed.
> Synchronizing within the TestContainerizer is not sufficient. We should pass 
> the environment variables into a fork, or set them on the command line of an 
> execute.





[jira] [Assigned] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-3475:
---

Assignee: haosdent

> TestContainerizer should not modify global environment variables
> 
>
> Key: MESOS-3475
> URL: https://issues.apache.org/jira/browse/MESOS-3475
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joris Van Remoortere
>Assignee: haosdent
>
> Currently the {{TestContainerizer}} modifies the environment variables. Since 
> these are global variables, this can cause other threads reading these 
> variables to get inconsistent results, or even segfault if they happen to 
> read while the environment is being changed.
> Synchronizing within the TestContainerizer is not sufficient. We should pass 
> the environment variables into a fork, or set them on the command line of an 
> execute.





[jira] [Commented] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900014#comment-14900014
 ] 

haosdent commented on MESOS-3475:
-

Hi [~jvanremoortere], are you referring to this code snippet?
{code}
  // TODO(benh): Can this be removed and done exclusively in the
  // 'executorEnvironment()' function? There are other places in the
  // code where we do this as well and it's likely we can do this once
  // in 'executorEnvironment()'.
  foreach (const Environment::Variable& variable,
           executorInfo.command().environment().variables()) {
    os::setenv(variable.name(), variable.value());
  }

  os::setenv("MESOS_LOCAL", "1");

  driver->start();

  os::unsetenv("MESOS_LOCAL");

  // Unset the environment variables we set by resetting them to their
  // original values and also removing any that were not part of the
  // original environment.
  foreachpair (const string& name, const string& value, original) {
    os::setenv(name, value);
  }

  foreachkey (const string& name, environment) {
    if (!original.contains(name)) {
      os::unsetenv(name);
    }
  }
{code}

> TestContainerizer should not modify global environment variables
> 
>
> Key: MESOS-3475
> URL: https://issues.apache.org/jira/browse/MESOS-3475
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joris Van Remoortere
>
> Currently the {{TestContainerizer}} modifies the environment variables. Since 
> these are global variables, this can cause other threads reading these 
> variables to get inconsistent results, or even segfault if they happen to 
> read while the environment is being changed.
> Synchronizing within the TestContainerizer is not sufficient. We should pass 
> the environment variables into a fork, or set them on the command line of an 
> execute.





[jira] [Updated] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3475:

Description: 
Currently the {{TestContainerizer}} modifies the environment variables. Since 
these are global variables, this can cause other threads reading these 
variables to get inconsistent results, or even segfault if they happen to read 
while the environment is being changed.
Synchronizing within the TestContainerizer is not sufficient. We should pass 
the environment variables into a fork, or set them on the command line of an 
execute.

  was:
Currently the `TestContainerizer` modifies the environment variables. Since 
these are global variables, this can cause other threads reading these 
variables to get inconsistent results, or even segfault if they happen to read 
while the environment is being changed.
Synchronizing within the TestContainerizer is not sufficient. We should pass 
the environment variables into a fork, or set them on the command line of an 
execute.


> TestContainerizer should not modify global environment variables
> 
>
> Key: MESOS-3475
> URL: https://issues.apache.org/jira/browse/MESOS-3475
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joris Van Remoortere
>
> Currently the {{TestContainerizer}} modifies the environment variables. Since 
> these are global variables, this can cause other threads reading these 
> variables to get inconsistent results, or even segfault if they happen to 
> read while the environment is being changed.
> Synchronizing within the TestContainerizer is not sufficient. We should pass 
> the environment variables into a fork, or set them on the command line of an 
> execute.





[jira] [Created] (MESOS-3475) TestContainerizer should not modify global environment variables

2015-09-20 Thread Joris Van Remoortere (JIRA)
Joris Van Remoortere created MESOS-3475:
---

 Summary: TestContainerizer should not modify global environment 
variables
 Key: MESOS-3475
 URL: https://issues.apache.org/jira/browse/MESOS-3475
 Project: Mesos
  Issue Type: Bug
Reporter: Joris Van Remoortere


Currently the `TestContainerizer` modifies the environment variables. Since 
these are global variables, this can cause other threads reading these 
variables to get inconsistent results, or even segfault if they happen to read 
while the environment is being changed.
Synchronizing within the TestContainerizer is not sufficient. We should pass 
the environment variables into a fork, or set them on the command line of an 
execute.
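
One way to read "pass the environment variables into a fork" is to hand the child an explicit environment through execve() and leave the parent's global environment untouched. The sketch below is a minimal illustration under that assumption, not the fix that was committed; the function name {{runWithEnvironment}} is hypothetical.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: runs argv[0] with exactly the given environment and
// returns its exit status (or -1 on failure). The parent never calls
// setenv()/unsetenv(), so other threads reading the environment are not
// affected.
int runWithEnvironment(
    const std::vector<std::string>& argv,
    const std::map<std::string, std::string>& environment)
{
  assert(!argv.empty());

  // Build "NAME=value" entries and both pointer arrays before fork(), so
  // the child does not need to allocate memory.
  std::vector<std::string> entries;
  for (const auto& variable : environment) {
    entries.push_back(variable.first + "=" + variable.second);
  }

  std::vector<char*> envp;
  for (std::string& entry : entries) {
    envp.push_back(&entry[0]);
  }
  envp.push_back(nullptr);

  std::vector<char*> args;
  for (const std::string& arg : argv) {
    args.push_back(const_cast<char*>(arg.c_str()));
  }
  args.push_back(nullptr);

  const pid_t pid = ::fork();
  if (pid == -1) {
    return -1;
  }

  if (pid == 0) {
    // Child: the environment is supplied here, not via global state.
    ::execve(args[0], args.data(), envp.data());
    ::_exit(127); // Only reached if execve() failed.
  }

  int status = 0;
  if (::waitpid(pid, &status, 0) != pid) {
    return -1;
  }

  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, {{runWithEnvironment({"/bin/sh", "-c", "echo $MESOS_LOCAL"}, {{"MESOS_LOCAL", "1"}})}} makes MESOS_LOCAL visible to the child only.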





[jira] [Updated] (MESOS-3454) Remove duplicated logic in Flags::load

2015-09-20 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-3454:

Assignee: (was: Klaus Ma)

> Remove duplicated logic in Flags::load
> --
>
> Key: MESOS-3454
> URL: https://issues.apache.org/jira/browse/MESOS-3454
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Klaus Ma
>Priority: Minor
>
> In {{flags.hpp}}, there are two functions with almost the same logic; this 
> ticket is used to merge the duplicated part.
> {code}
> inline Try FlagsBase::load(
> const Option& prefix,
> int* argc,
> char*** argv,
> bool unknowns,
> bool duplicates)
> ...
> inline Try FlagsBase::load(
> const Option& prefix,
> int argc,
> const char* const *argv,
> bool unknowns,
> bool duplicates)
> ...
> {code}
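
The merge proposed in the description could look roughly like the sketch below: both overloads delegate to one shared parsing routine, and the mutating overload additionally compacts argv. This is a simplified illustration, not stout's implementation. The names {{Values}} and {{parse}} are hypothetical, only "--name=value" arguments are handled, and the assumption that the pointer overload leaves only the non-flag arguments in argv is mine.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Values = std::map<std::string, std::string>;

// Shared logic (hypothetical helper): collects "--name=value" pairs and,
// if 'rest' is given, records the indices of arguments that are not flags.
inline Values parse(int argc, const char* const* argv, std::vector<int>* rest)
{
  Values values;
  for (int i = 1; i < argc; i++) {
    const std::string arg(argv[i]);
    const std::size_t eq = arg.find('=');
    if (arg.compare(0, 2, "--") == 0 && eq != std::string::npos) {
      values[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
    } else if (rest != nullptr) {
      rest->push_back(i);
    }
  }
  return values;
}

// Read-only overload: only parses.
inline Values load(int argc, const char* const* argv)
{
  return parse(argc, argv, nullptr);
}

// Mutating overload: reuses the same parsing, then rewrites argc/argv so
// that only the program name and the non-flag arguments remain.
inline Values load(int* argc, char*** argv)
{
  assert(argc != nullptr && argv != nullptr);

  std::vector<int> rest;
  Values values = parse(*argc, *argv, &rest);

  int next = 1;
  for (int index : rest) {
    (*argv)[next++] = (*argv)[index];
  }
  *argc = next;

  return values;
}
```

The point of the design is that the flag-matching loop exists once; the two public entry points differ only in whether they touch the caller's argument vector.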





[jira] [Commented] (MESOS-3462) Containerization issues with mesos running on CoreOS

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14899953#comment-14899953
 ] 

haosdent commented on MESOS-3462:
-

Hi [~francischuang], sorry for the delay. I have some questions to confirm with 
you.

1. If you build Mesos on Ubuntu and run "make install" on the same Ubuntu 
machine, does it succeed?
2. Does the following succeed on your CoreOS machine?
{code}
export LD_LIBRARY_PATH=/opt/mesos/lib
python -c "import mesos.native"
{code}

> Containerization issues with mesos running on CoreOS
> 
>
> Key: MESOS-3462
> URL: https://issues.apache.org/jira/browse/MESOS-3462
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 0.24.0
> Environment: CoreOS 801.0.0 64-bit
>Reporter: Francis Chuang
>Assignee: haosdent
>
> These are the steps I used to build mesos 0.24.0 on Ubuntu 15.04 64-bit:
> wget http://www.apache.org/dist/mesos/0.24.0/mesos-0.24.0.tar.gz
> wget http://mirror.ventraip.net.au/apache/apr/apr-1.5.2.tar.gz
> wget http://mirror.ventraip.net.au/apache/apr/apr-util-1.5.4.tar.gz
> wget http://mirror.ventraip.net.au/apache/subversion/subversion-1.9.0.tar.gz
> wget http://www.sqlite.org/sqlite-amalgamation-3071501.zip
> wget ftp://ftp.cyrusimap.org/cyrus-sasl/cyrus-sasl-2.1.26.tar.gz
> mkdir /tmp/mesos-build
> cd /tmp/mesos-build
> - Build apr
> tar zxf apr-$APR_VERSION.tar.gz
> cd apr-$APR_VERSION
> ./configure CC=gcc-4.8 --prefix=/tmp/mesos-build/apr
> make
> make install
> cd ..
> - Build apr-util
> tar zxf apr-util-$APR_UTIL_VERSION.tar.gz
> cd apr-util-$APR_UTIL_VERSION
> ./configure CC=gcc-4.8 --prefix=/tmp/mesos-build/apr-util 
> --with-apr=/tmp/mesos-build/apr
> make
> make install
> cd ..
> - Build libsasl2
> tar zxf cyrus-sasl-$SASL_VERSION.tar.gz
> cd cyrus-sasl-$SASL_VERSION
> ./configure CC=gcc-4.8 CPPFLAGS=-I/usr/include/openssl 
> --prefix=/tmp/mesos-build/sasl2 --enable-cram
> make
> make install
> cd ..
> - Build subversion
> tar zxf subversion-$SVN_VERSION.tar.gz
> unzip sqlite-amalgamation-$SQLITE_AMALGATION_VERSION.zip
> mv sqlite-amalgamation-$SQLITE_AMALGATION_VERSION/ 
> subversion-$SVN_VERSION/sqlite-amalgamation/
> cd subversion-$SVN_VERSION
> ./configure CC=gcc-4.8 CXX=g++-4.8 --prefix=/tmp/mesos-build/svn 
> --with-apr=/tmp/mesos-build/apr --with-apr-util=/tmp/mesos-build/apr-util 
> --with-sasl=/tmp/mesos-build/sasl2
> make
> make install
> cd ..
> - Build curl
> tar zxf curl-$CURL_VERSION.tar.gz
> cd curl-$CURL_VERSION
> ./configure CC=gcc-4.8 --prefix=/tmp/mesos-build/curl
> make
> make install
> cd ..
> - Build mesos
> tar zxf mesos-$MESOS_VERSION.tar.gz
> cd mesos-$MESOS_VERSION
> mkdir build
> cd build
> ../configure CC=gcc-4.8 CXX=g++-4.8 
> LD_LIBRARY_PATH=/tmp/mesos-build/sasl2/lib 
> SASL_PATH=/tmp/mesos-build/sasl2/lib/sasl2 --prefix=/tmp/mesos-build/mesos 
> --with-svn=/tmp/mesos-build/svn --with-apr=/tmp/mesos-build/apr 
> --with-sasl=/tmp/mesos-build/sasl2/ --with-curl=/tmp/mesos-build/curl
> make
> make install
> cd ..
> cd ..
> - Copy shared objects into mesos build
> cp apr/lib/libapr-1.so.0.5.2 mesos/lib/libapr-1.so.0
> cp apr-util/lib/libaprutil-1.so.0.5.4 mesos/lib/libaprutil-1.so.0
> cp sasl2/lib/libsasl2.so.3.0.0 mesos/lib/libsasl2.so.3
> cp svn/lib/libsvn_delta-1.so.0.0.0 mesos/lib/libsvn_delta-1.so.0
> cp svn/lib/libsvn_subr-1.so.0.0.0 mesos/lib/libsvn_subr-1.so.0
> I then compressed the build into an archive and distributed it onto my CoreOS 
> nodes.
> Once I have the archive extracted on each node, I start the master and slaves:
> /opt/mesos/sbin/mesos-master --zk=zk://192.168.33.10/mesos --quorum=1 
> --hostname=192.168.33.10 --ip=192.168.33.10 
> --webui_dir=/opt/mesos/share/mesos/webui --cluster=mesos
> /opt/mesos/sbin/mesos-slave --hostname=192.168.33.11 --ip=192.168.33.11 
> --master=zk://192.168.33.10/mesos 
> --executor_environment_variables='{"LD_LIBRARY_PATH": "/opt/mesos/lib", 
> "PATH": "/opt/java/bin:/usr/sbin:/usr/bin"}' --containerizers=docker,mesos 
> --executor_registration_timeout=60mins 
> --launcher_dir=/opt/mesos/libexec/mesos/
> In addition, the following environment variables are set:
> LD_LIBRARY_PATH=/opt/mesos/lib/
> JAVA_HOME=/opt/java
> MESOS_NATIVE_JAVA_LIBRARY=/opt/mesos/lib/libmesos.so
> I am finding that when I run meso-hdfs from 
> https://github.com/mesosphere/hdfs, the scheduler starts properly and 
> launches the executors. However, the executors will fail and terminate 
> without writing any error to stderr and stdout.
> I have reproduced the same problem with mesos 0.24, 0.23 and 0.22.1
> If I install mesos onto a Ubuntu machine (tried 14.04 and 15.04 64-bit) using 
> the apt-repositories, this problem does not happen.
> I am not well-versed with mesos internals, but it was pointed out that it's 
> most likely a con

[jira] [Commented] (MESOS-3451) Failing tests after changes to Isolator/MesosContainerizer API

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14899935#comment-14899935
 ] 

haosdent commented on MESOS-3451:
-

[~karya][~jieyu] I think this issue is similar to 
[MESOS-3474|https://issues.apache.org/jira/browse/MESOS-3474]

In LinuxLauncher, we use
{code}
   // - unsigned long long used for best alignment.
   // - static is ok because each child gets their own copy after the clone.
   // - 8 MiB appears to be the default for "ulimit -s" on OSX and Linux.
-  static unsigned long long stack[(8*1024*1024)/sizeof(unsigned long long)];
{code}

a shared static buffer as the child stack. Because we have multiple slaves, 
this shared memory can be overwritten by different threads, so all threads end 
up with the same "pipes" and the same execute "function" pointer. This causes 
the problems above.
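
A possible way around the shared buffer, sketched here under my own assumptions and not taken from the actual fix, is to allocate a separate stack for every clone() call so that concurrent callers never write to the same memory. The names {{cloneWithOwnStack}} and {{childMain}} are hypothetical, and the example is Linux-only.

```cpp
#include <sched.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <cassert>
#include <cstddef>
#include <memory>

namespace {

// 8 MiB, matching the size used by the static stack quoted above.
constexpr std::size_t kStackWords =
  (8 * 1024 * 1024) / sizeof(unsigned long long);

// Runs in the child; its return value becomes the child's exit status.
int childMain(void* arg)
{
  return *static_cast<int*>(arg);
}

} // namespace

// Hypothetical helper: clones a child on a stack owned by this call and
// returns the child's exit status (or -1 on failure).
int cloneWithOwnStack(int exitCode)
{
  assert(exitCode >= 0 && exitCode <= 255);

  // Per-call allocation instead of a function-level 'static' array.
  std::unique_ptr<unsigned long long[]> stack(
      new unsigned long long[kStackWords]);

  // The stack grows downwards on the common Linux architectures, so
  // clone() is given the address just past the end of the buffer.
  void* top = stack.get() + kStackWords;

  // Without CLONE_VM the child works on a copy of the address space, so
  // the parent may free the buffer as soon as clone() has returned.
  const pid_t pid = ::clone(childMain, top, SIGCHLD, &exitCode);
  if (pid == -1) {
    return -1;
  }

  int status = 0;
  if (::waitpid(pid, &status, 0) != pid) {
    return -1;
  }

  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Each caller passes its own function argument through its own buffer, so two threads cloning at the same time can no longer see each other's "pipes" or function pointer.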

> Failing tests after changes to Isolator/MesosContainerizer API
> --
>
> Key: MESOS-3451
> URL: https://issues.apache.org/jira/browse/MESOS-3451
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>Priority: Blocker
>
> The failures are related to the following recent commits :
> e047f7d69b5297cc787487b6093119a3be517e48
> fc541a9a97eb1d86c27452019ff217eed11ed5a3
> 6923bb3e8cfbddde9fbabc6ca4edc29d9fc96c06





[jira] [Commented] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-20 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14880720#comment-14880720
 ] 

Alexander Rukletsov commented on MESOS-3177:


I think this ticket is more about adding and removing roles dynamically, while 
quota is about resource guarantees associated with roles.

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Commented] (MESOS-3177) Make Mesos own configuration of roles/weights

2015-09-20 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14880719#comment-14880719
 ] 

Alexander Rukletsov commented on MESOS-3177:


I think this ticket is more about adding and removing roles dynamically, while 
quota is about resource guarantees associated with roles.

> Make Mesos own configuration of roles/weights
> -
>
> Key: MESOS-3177
> URL: https://issues.apache.org/jira/browse/MESOS-3177
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Reporter: Cody Maloney
>  Labels: mesosphere
>
> All roles and weights must currently be specified up-front when starting 
> Mesos masters. In addition, they should be consistent on every 
> master, otherwise unexpected behavior could occur (You can have them be 
> inconsistent for some upgrade paths / changing the set).
> This makes it hard to introduce new groups of machines under new roles 
> dynamically (Have to generate a new master configuration, deploy that, before 
> we can connect slaves with a new role to the cluster).
> Ideally an administrator can manually add / remove / edit roles and have the 
> settings replicated / passed to all masters in the cluster by Mesos. 
> Effectively Mesos takes ownership of the setting, rather than requiring it to 
> be done externally.
> In addition, if a new slave joins the cluster with an unexpected / new role 
> that should just work, making it much easier to introduce machines with new 
> roles. (Policy around whether or not a slave can cause creation of a new 
> role, a given slave can register with a given role, etc. is out of scope, and 
> would be controls in the general registration process).





[jira] [Comment Edited] (MESOS-3451) Failing tests after changes to Isolator/MesosContainerizer API

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14877470#comment-14877470
 ] 

haosdent edited comment on MESOS-3451 at 9/20/15 9:03 AM:
--

I think it is because we changed to always use LinuxLauncher when the user is 
root and the current system is Linux. But the confusing thing, which I could 
not understand, is why childSetup always gets the same pipes in LinuxLauncher. 


was (Author: haosd...@gmail.com):
I think it is because we changed to also use LinuxLauncher when the user is 
root and the current system is Linux. But the confusing thing, which I could 
not understand, is why childSetup always gets the same pipes in LinuxLauncher. 

> Failing tests after changes to Isolator/MesosContainerizer API
> --
>
> Key: MESOS-3451
> URL: https://issues.apache.org/jira/browse/MESOS-3451
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>Priority: Blocker
>
> The failures are related to the following recent commits :
> e047f7d69b5297cc787487b6093119a3be517e48
> fc541a9a97eb1d86c27452019ff217eed11ed5a3
> 6923bb3e8cfbddde9fbabc6ca4edc29d9fc96c06





[jira] [Commented] (MESOS-3451) Failing tests after changes to Isolator/MesosContainerizer API

2015-09-20 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14877470#comment-14877470
 ] 

haosdent commented on MESOS-3451:
-

I think it is because we changed to also use LinuxLauncher when the user is 
root and the current system is Linux. But the confusing thing, which I could 
not understand, is why childSetup always gets the same pipes in LinuxLauncher. 

> Failing tests after changes to Isolator/MesosContainerizer API
> --
>
> Key: MESOS-3451
> URL: https://issues.apache.org/jira/browse/MESOS-3451
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>Priority: Blocker
>
> The failures are related to the following recent commits :
> e047f7d69b5297cc787487b6093119a3be517e48
> fc541a9a97eb1d86c27452019ff217eed11ed5a3
> 6923bb3e8cfbddde9fbabc6ca4edc29d9fc96c06





[jira] [Updated] (MESOS-3253) Add pid to network helper error messages

2015-09-20 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-3253:

Fix Version/s: 0.25.0

> Add pid to network helper error messages
> 
>
> Key: MESOS-3253
> URL: https://issues.apache.org/jira/browse/MESOS-3253
> Project: Mesos
>  Issue Type: Bug
>Reporter: Paul Brett
>Assignee: Klaus Ma
> Fix For: 0.25.0
>
>
> Network helper logs errors to stderr without the associated namespace pid or 
> container id  which prevents the errors from being associated with the 
> appropriate container.


