[jira] [Commented] (MESOS-2972) Serialize Docker image spec as protobuf

2015-09-28 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934629#comment-14934629
 ] 

Marco Massenzio commented on MESOS-2972:


(We discussed this in person - recording here for future reference.)

It looks as if the structure of the Docker image spec should be simple enough 
and amenable to being represented directly in protobuf, without any 
conversion/adapter layer, just by using Mesos' {{JSON::protobuf}} functionality.

I am fine with trying this out, provided the resulting protobuf structure is 
not too gnarly.
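As a rough sketch of what JSON-to-typed-schema conversion buys (illustrative only: the field names below are assumptions loosely based on the Docker v1 image format, and the real change would define a protobuf message and go through {{JSON::protobuf}} rather than Python dataclasses):

```python
import json
from dataclasses import dataclass, fields
from typing import Optional

# Hypothetical typed schema mirroring a few Docker v1 image fields;
# in Mesos this role would be played by a generated protobuf message.
@dataclass
class DockerImageManifest:
    id: str                       # required
    parent: Optional[str] = None  # optional
    created: Optional[str] = None # optional

def parse_manifest(text: str) -> DockerImageManifest:
    """Parse and validate image metadata JSON into the typed schema."""
    data = json.loads(text)
    known = {f.name for f in fields(DockerImageManifest)}
    unknown = set(data) - known
    if unknown:
        # Unknown keys are rejected instead of silently ignored.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return DockerImageManifest(**data)

manifest = parse_manifest('{"id": "abc123", "parent": "def456"}')
```

The point is that parsing and validation collapse into one step, which is exactly what a protobuf-equivalent schema plus the existing JSON conversion would give the provisioner for free.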

> Serialize Docker image spec as protobuf
> ---
>
> Key: MESOS-2972
> URL: https://issues.apache.org/jira/browse/MESOS-2972
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> The Docker image specification defines a schema for the metadata JSON that it 
> puts into each image. Currently the Docker image provisioner needs to be able 
> to parse and understand this metadata JSON, and we should create an equivalent 
> protobuf schema so we can use the JSON-to-protobuf conversion to read and 
> validate the metadata.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3391) Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution

2015-09-28 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934581#comment-14934581
 ] 

James Peach commented on MESOS-3391:


If Mesos starts to depend on this behavior, what happens to people who are 
using unbundled ZK libraries? 

> Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
> --
>
> Key: MESOS-3391
> URL: https://issues.apache.org/jira/browse/MESOS-3391
> Project: Mesos
>  Issue Type: Bug
>  Components: general
> Environment: Linux, OS X
>Reporter: Chris Chen
>Assignee: Chris Chen
>
> The Zookeeper C client makes certain assertions about the ordering of 
> ping packets that the Java client does not. An alternate implementation of 
> the Zookeeper server could therefore break the C client while working 
> correctly with the Java client.
> A patch has been submitted to the Zookeeper project under ZOOKEEPER-2253. 
> This adds that patch to mesos 3rdparty.





[jira] [Commented] (MESOS-3533) Unable to find and run URIs files

2015-09-28 Thread Rafael Capucho (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934451#comment-14934451
 ] 

Rafael Capucho commented on MESOS-3533:
---

Of course, thank you...

root@li202-122:/# cat /proc/self/mountinfo
193 63 251:4 /rootfs / rw,relatime - ext4 
/dev/mapper/docker-8:0-57389-e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb
 rw,stripe=16,data=ordered
194 193 0:57 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
195 193 0:58 / /dev rw,nosuid - tmpfs tmpfs rw,mode=755
196 195 0:59 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts 
rw,gid=5,mode=620,ptmxmode=666
197 195 0:60 / /dev/shm rw,nosuid,nodev,noexec,relatime - tmpfs shm 
rw,size=65536k
198 195 0:56 / /dev/mqueue rw,nosuid,nodev,noexec,relatime - mqueue mqueue rw
199 193 0:61 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs rw
200 199 0:24 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup 
systemd rw,name=systemd
201 199 0:25 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
202 199 0:26 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
203 199 0:27 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
204 199 0:28 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
205 199 0:29 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,memory
206 199 0:30 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
207 199 0:31 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
208 199 0:32 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
209 199 0:33 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/perf_event rw,relatime - cgroup cgroup rw,perf_event
210 199 0:34 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/net_prio rw,relatime - cgroup cgroup rw,net_prio
211 199 0:35 / /sys/fs/cgroup/debug rw,relatime - cgroup cgroup rw,debug
212 193 0:16 / /sys rw,relatime - sysfs sysfs rw
213 212 0:18 / /sys/fs/cgroup rw,relatime - tmpfs none rw,size=4k,mode=755
214 213 0:24 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup 
systemd rw,name=systemd
215 213 0:25 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
216 213 0:26 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
217 213 0:27 / /sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
218 213 0:28 / /sys/fs/cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
219 213 0:29 / /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,memory
220 213 0:30 / /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
221 213 0:31 / /sys/fs/cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
222 213 0:32 / /sys/fs/cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
223 213 0:33 / /sys/fs/cgroup/perf_event rw,relatime - cgroup cgroup 
rw,perf_event
224 213 0:34 / /sys/fs/cgroup/net_prio rw,relatime - cgroup cgroup rw,net_prio
225 213 0:35 / /sys/fs/cgroup/debug rw,relatime - cgroup cgroup rw,debug
226 212 0:19 / /sys/fs/fuse/connections rw,relatime - fusectl none rw
227 212 0:8 / /sys/kernel/debug rw,relatime - debugfs none rw
228 193 8:0 /lib/x86_64-linux-gnu/libudev.so.1.3.5 /lib/libudev.so.1 ro,noatime 
- ext4 /dev/root rw,errors=remount-ro,data=ordered
229 193 8:0 /usr/bin/docker /bin/docker rw,noatime - ext4 /dev/root 
rw,errors=remount-ro,data=ordered
230 193 8:0 /lib/x86_64-linux-gnu/libpthread-2.19.so /lib/libpthread.so.0 
ro,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
231 193 8:0 /usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6 /lib/libsqlite3.so.0 
ro,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
258 193 0:20 /docker.sock /run/docker.sock rw,nosuid,noexec,relatime - tmpfs 
none rw,size=204708k,mode=755
259 193 8:0 /lib/x86_64-linux-gnu/libdevmapper.so.1.02.1 
/usr/lib/libdevmapper.so.1.02 ro,noatime - ext4 /dev/root 
rw,errors=remount-ro,data=ordered
260 193 8:0 
/var/lib/docker/containers/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb/resolv.conf
 /etc/resolv.conf rw,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
261 193 8:0 
/var/lib/docker/containers/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb/hostname
 /etc/hostname rw,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
262 193 8:0 
/var/lib/
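(The dump above is truncated by the mailer.) For reference, each {{mountinfo}} line follows the fixed layout from the kernel's proc documentation: six leading fields, then any optional tag fields, a literal "-" separator, then filesystem type, source, and super options. A minimal parsing sketch (illustrative; it ignores any optional fields before the separator):

```python
# Minimal parser for one /proc/self/mountinfo line, following the
# documented layout: mount ID, parent ID, major:minor, root, mount
# point, mount options, [optional fields...] "-" fstype source
# super-options.  Optional fields are simply not captured here.
def parse_mountinfo_line(line: str) -> dict:
    left, _, right = line.partition(" - ")
    lfields = left.split()
    rfields = right.split(None, 2)
    return {
        "mount_id": int(lfields[0]),
        "parent_id": int(lfields[1]),
        "major_minor": lfields[2],
        "root": lfields[3],
        "mount_point": lfields[4],
        "mount_options": lfields[5],
        "fs_type": rfields[0],
        "source": rfields[1],
        "super_options": rfields[2] if len(rfields) > 2 else "",
    }

# One line taken verbatim from the dump above:
entry = parse_mountinfo_line(
    "194 193 0:57 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw")
```

This is the kind of structure the Mesos isolation code has to recover when inspecting a container's mounts, which is why the full dump is useful for debugging.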

[jira] [Comment Edited] (MESOS-3533) Unable to find and run URIs files

2015-09-28 Thread Rafael Capucho (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934451#comment-14934451
 ] 

Rafael Capucho edited comment on MESOS-3533 at 9/29/15 1:15 AM:


Of course, from the Mesos slave I guess. Output follows, thank you...

root@li202-122:/# cat /proc/self/mountinfo
193 63 251:4 /rootfs / rw,relatime - ext4 
/dev/mapper/docker-8:0-57389-e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb
 rw,stripe=16,data=ordered
194 193 0:57 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
195 193 0:58 / /dev rw,nosuid - tmpfs tmpfs rw,mode=755
196 195 0:59 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts 
rw,gid=5,mode=620,ptmxmode=666
197 195 0:60 / /dev/shm rw,nosuid,nodev,noexec,relatime - tmpfs shm 
rw,size=65536k
198 195 0:56 / /dev/mqueue rw,nosuid,nodev,noexec,relatime - mqueue mqueue rw
199 193 0:61 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs rw
200 199 0:24 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup 
systemd rw,name=systemd
201 199 0:25 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
202 199 0:26 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
203 199 0:27 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
204 199 0:28 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
205 199 0:29 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,memory
206 199 0:30 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
207 199 0:31 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
208 199 0:32 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
209 199 0:33 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/perf_event rw,relatime - cgroup cgroup rw,perf_event
210 199 0:34 
/docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb 
/sys/fs/cgroup/net_prio rw,relatime - cgroup cgroup rw,net_prio
211 199 0:35 / /sys/fs/cgroup/debug rw,relatime - cgroup cgroup rw,debug
212 193 0:16 / /sys rw,relatime - sysfs sysfs rw
213 212 0:18 / /sys/fs/cgroup rw,relatime - tmpfs none rw,size=4k,mode=755
214 213 0:24 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup 
systemd rw,name=systemd
215 213 0:25 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
216 213 0:26 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
217 213 0:27 / /sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
218 213 0:28 / /sys/fs/cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
219 213 0:29 / /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,memory
220 213 0:30 / /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
221 213 0:31 / /sys/fs/cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
222 213 0:32 / /sys/fs/cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
223 213 0:33 / /sys/fs/cgroup/perf_event rw,relatime - cgroup cgroup 
rw,perf_event
224 213 0:34 / /sys/fs/cgroup/net_prio rw,relatime - cgroup cgroup rw,net_prio
225 213 0:35 / /sys/fs/cgroup/debug rw,relatime - cgroup cgroup rw,debug
226 212 0:19 / /sys/fs/fuse/connections rw,relatime - fusectl none rw
227 212 0:8 / /sys/kernel/debug rw,relatime - debugfs none rw
228 193 8:0 /lib/x86_64-linux-gnu/libudev.so.1.3.5 /lib/libudev.so.1 ro,noatime 
- ext4 /dev/root rw,errors=remount-ro,data=ordered
229 193 8:0 /usr/bin/docker /bin/docker rw,noatime - ext4 /dev/root 
rw,errors=remount-ro,data=ordered
230 193 8:0 /lib/x86_64-linux-gnu/libpthread-2.19.so /lib/libpthread.so.0 
ro,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
231 193 8:0 /usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6 /lib/libsqlite3.so.0 
ro,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
258 193 0:20 /docker.sock /run/docker.sock rw,nosuid,noexec,relatime - tmpfs 
none rw,size=204708k,mode=755
259 193 8:0 /lib/x86_64-linux-gnu/libdevmapper.so.1.02.1 
/usr/lib/libdevmapper.so.1.02 ro,noatime - ext4 /dev/root 
rw,errors=remount-ro,data=ordered
260 193 8:0 
/var/lib/docker/containers/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb/resolv.conf
 /etc/resolv.conf rw,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
261 193 8:0 
/var/lib/docker/containers/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb/hostname
 /etc/hostname 

[jira] [Updated] (MESOS-3540) Libevent termination triggers Broken Pipe

2015-09-28 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3540:

Sprint: Mesosphere Sprint 20

> Libevent termination triggers Broken Pipe
> -
>
> Key: MESOS-3540
> URL: https://issues.apache.org/jira/browse/MESOS-3540
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Joris Van Remoortere
>Assignee: Joris Van Remoortere
>  Labels: libevent, libprocess, mesosphere
>
> When the libevent loop terminates and we unblock the {{SIGPIPE}} signal, the 
> pending {{SIGPIPE}} instantly triggers and causes a broken pipe when the test 
> binary stops running.
> {code}
> Program received signal SIGPIPE, Broken pipe.
> [Switching to Thread 0x718b4700 (LWP 16270)]
> pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at 
> ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
> 53../sysdeps/unix/sysv/linux/pthread_sigmask.c: No such file or directory.
> (gdb) bt
> #0  pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) 
> at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
> #1  0x006fd9a4 in unblock () at 
> ../../../3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/signals.hpp:90
> #2  0x007d7915 in run () at 
> ../../../3rdparty/libprocess/src/libevent.cpp:125
> #3  0x007950cb in _M_invoke<>(void) () at 
> /usr/include/c++/4.9/functional:1700
> #4  0x00795000 in operator() () at 
> /usr/include/c++/4.9/functional:1688
> #5  0x00794f6e in _M_run () at /usr/include/c++/4.9/thread:115
> #6  0x7668de30 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #7  0x779a16aa in start_thread (arg=0x718b4700) at 
> pthread_create.c:333
> #8  0x75df1eed in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> {code}
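The failure mode quoted above, a signal raised while blocked stays pending and fires the instant the mask is lifted, can be reproduced with a small Python sketch (Unix-only; it uses {{SIGUSR1}} rather than {{SIGPIPE}} so the default disposition cannot kill the process):

```python
import os
import signal

delivered = []
signal.signal(signal.SIGUSR1, lambda signum, frame: delivered.append(signum))

# Block SIGUSR1, then send it to ourselves: it is queued as pending.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
os.kill(os.getpid(), signal.SIGUSR1)
assert delivered == []  # blocked, so not delivered yet

# Unblocking delivers the pending signal immediately -- the same
# mechanism by which the pending SIGPIPE fires when the loop exits.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
assert delivered == [signal.SIGUSR1]
```

In libprocess terms, any code path that unblocks {{SIGPIPE}} has to account for an instance that became pending while it was blocked.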





[jira] [Assigned] (MESOS-3540) Libevent termination triggers Broken Pipe

2015-09-28 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere reassigned MESOS-3540:
---

Assignee: Joris Van Remoortere

> Libevent termination triggers Broken Pipe
> -
>
> Key: MESOS-3540
> URL: https://issues.apache.org/jira/browse/MESOS-3540
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Joris Van Remoortere
>Assignee: Joris Van Remoortere
>  Labels: libevent, libprocess, mesosphere
>
> When the libevent loop terminates and we unblock the {{SIGPIPE}} signal, the 
> pending {{SIGPIPE}} instantly triggers and causes a broken pipe when the test 
> binary stops running.
> {code}
> Program received signal SIGPIPE, Broken pipe.
> [Switching to Thread 0x718b4700 (LWP 16270)]
> pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at 
> ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
> 53../sysdeps/unix/sysv/linux/pthread_sigmask.c: No such file or directory.
> (gdb) bt
> #0  pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) 
> at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
> #1  0x006fd9a4 in unblock () at 
> ../../../3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/signals.hpp:90
> #2  0x007d7915 in run () at 
> ../../../3rdparty/libprocess/src/libevent.cpp:125
> #3  0x007950cb in _M_invoke<>(void) () at 
> /usr/include/c++/4.9/functional:1700
> #4  0x00795000 in operator() () at 
> /usr/include/c++/4.9/functional:1688
> #5  0x00794f6e in _M_run () at /usr/include/c++/4.9/thread:115
> #6  0x7668de30 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #7  0x779a16aa in start_thread (arg=0x718b4700) at 
> pthread_create.c:333
> #8  0x75df1eed in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> {code}





[jira] [Updated] (MESOS-3540) Libevent termination triggers Broken Pipe

2015-09-28 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3540:

Story Points: 2

> Libevent termination triggers Broken Pipe
> -
>
> Key: MESOS-3540
> URL: https://issues.apache.org/jira/browse/MESOS-3540
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Joris Van Remoortere
>  Labels: libevent, libprocess, mesosphere
>
> When the libevent loop terminates and we unblock the {{SIGPIPE}} signal, the 
> pending {{SIGPIPE}} instantly triggers and causes a broken pipe when the test 
> binary stops running.
> {code}
> Program received signal SIGPIPE, Broken pipe.
> [Switching to Thread 0x718b4700 (LWP 16270)]
> pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at 
> ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
> 53../sysdeps/unix/sysv/linux/pthread_sigmask.c: No such file or directory.
> (gdb) bt
> #0  pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) 
> at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
> #1  0x006fd9a4 in unblock () at 
> ../../../3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/signals.hpp:90
> #2  0x007d7915 in run () at 
> ../../../3rdparty/libprocess/src/libevent.cpp:125
> #3  0x007950cb in _M_invoke<>(void) () at 
> /usr/include/c++/4.9/functional:1700
> #4  0x00795000 in operator() () at 
> /usr/include/c++/4.9/functional:1688
> #5  0x00794f6e in _M_run () at /usr/include/c++/4.9/thread:115
> #6  0x7668de30 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #7  0x779a16aa in start_thread (arg=0x718b4700) at 
> pthread_create.c:333
> #8  0x75df1eed in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> {code}





[jira] [Commented] (MESOS-3533) Unable to find and run URIs files

2015-09-28 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934437#comment-14934437
 ] 

haosdent commented on MESOS-3533:
-

Could you {{cat /proc/self/mountinfo}} and show the result here?

> Unable to find and run URIs files
> -
>
> Key: MESOS-3533
> URL: https://issues.apache.org/jira/browse/MESOS-3533
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, general
>Affects Versions: 0.25.0
> Environment: Linux li202-122 4.1.5-x86_64-linode61 #7 SMP Mon Aug 24 
> 13:46:31 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
> Ubuntu 14.04.1 LTS
> Docker Version: 1.8.2
> Docker API version: 1.20
> Go version: go1.4.2
>Reporter: Rafael Capucho
>Priority: Blocker
>
> Hello,
> Deploying a docker container using marathon 0.11 with the following structure 
> (just example, I had tried some variations with same result):
> {
>   "id": "testando-flask",
>   "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py",
>   "cpus": 0.5,
>   "mem": 20.0,
>   "container": {
> "type": "DOCKER",
> "docker": {
>   "image": "therealwardo/python-2.7-pip",
>   "network": "BRIDGE",
>   "privileged": true,
>   "portMappings": [
> { "containerPort": 31177, "hostPort": 0 }
>   ]
> }
>   },
>   "uris": [
> "http://blog.rafaelcapucho.com/app.zip"
>   ]
> }
> curl -X POST http://173.255.192.XXX:8080/v2/apps -d @flask.json -H 
> "Content-type: application/json"
> The task reaches the Mesos master properly, but then it fails. When I execute 
> the same structure without uris and with a simple "python -m SimpleHTTPServer" 
> it works! The Docker container is created and running.
> Analyzing the sandbox in the Mesos UI I can see that the URI files (the 
> project and the requirements.txt) are downloaded correctly; in stdout I got: 
> Archive:  
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.zip
>   inflating: 
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.py
>   
>  extracting: 
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/requeriments.txt
>   
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --stop_timeout="0ns"
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --stop_timeout="0ns"
> Registered docker executor on li202-122.members.linode.com
> Starting task testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb
> Could not open requirements file: [Errno 2] No such file or directory: 
> 'requeriments.txt'
> Storing complete log in /root/.pip/pip.log
> total 68
> drwxr-xr-x   2 root root  4096 Jan 15  2015 bin
> drwxr-xr-x   2 root root  4096 Apr 19  2012 boot
> drwxr-xr-x  10 root root 13740 Sep 28 12:44 dev
> drwxr-xr-x  46 root root  4096 Sep 28 12:44 etc
> drwxr-xr-x   2 root root  4096 Apr 19  2012 home
> drwxr-xr-x  11 root root  4096 Jan 15  2015 lib
> drwxr-xr-x   2 root root  4096 Jan 15  2015 lib64
> drwxr-xr-x   2 root root  4096 Jan 15  2015 media
> drwxr-xr-x   3 root root  4096 Sep 28 12:44 mnt
> drwxr-xr-x   2 root root  4096 Jan 15  2015 opt
> dr-xr-xr-x 170 root root 0 Sep 28 12:44 proc
> drwx--   3 root root  4096 Sep 28 12:44 root
> drwxr-xr-x   5 root root  4096 Jan 15  2015 run
> drwxr-xr-x   2 root root  4096 Jan 16  2015 sbin
> drwxr-xr-x   2 root root  4096 Mar  5  2012 selinux
> drwxr-xr-x   2 root root  4096 Jan 15  2015 srv
> dr-x

[jira] [Commented] (MESOS-1806) Substituting etcd or ReplicatedLog for Zookeeper

2015-09-28 Thread Shuai Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934435#comment-14934435
 ] 

Shuai Lin commented on MESOS-1806:
--

Checkout the etcd branch, build it, and run:

{code}
export MESOS_SOURCE_DIR=/path/to/mesos/
export MESOS_BUILD_DIR=/path/to/mesos/build
cd $MESOS_BUILD_DIR
$MESOS_SOURCE_DIR/src/tests/etcd_test.sh
{code}


> Substituting etcd or ReplicatedLog for Zookeeper
> 
>
> Key: MESOS-1806
> URL: https://issues.apache.org/jira/browse/MESOS-1806
> Project: Mesos
>  Issue Type: Task
>Reporter: Ed Ropple
>Assignee: Shuai Lin
>Priority: Minor
>
>eropple: Could you also file a new JIRA for Mesos to drop ZK 
> in favor of etcd or ReplicatedLog? Would love to get some momentum going on 
> that one.
> --
> Consider it filed. =)





[jira] [Commented] (MESOS-3123) DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails & crashes

2015-09-28 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934429#comment-14934429
 ] 

haosdent commented on MESOS-3123:
-

Do you know which IP the slave started on in this test? If the slave started on 
127.0.0.1, it would have this problem.

> DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails & crashes
> ---
>
> Key: MESOS-3123
> URL: https://issues.apache.org/jira/browse/MESOS-3123
> Project: Mesos
>  Issue Type: Bug
>  Components: docker, test
>Affects Versions: 0.23.0
> Environment: CentOS 7.1, or Ubuntu 14.04
> Mesos 0.23.0-rc4 or today's master
>Reporter: Adam B
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> Fails the test and then crashes while trying to shutdown the slaves.
> {code}
> [ RUN  ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged
> ../../src/tests/docker_containerizer_tests.cpp:618: Failure
> Value of: statusRunning.get().state()
>   Actual: TASK_LOST
> Expected: TASK_RUNNING
> ../../src/tests/docker_containerizer_tests.cpp:619: Failure
> Failed to wait 1mins for statusFinished
> ../../src/tests/docker_containerizer_tests.cpp:610: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, 
> statusUpdate(&driver, _))...
>  Expected: to be called twice
>Actual: called once - unsatisfied and active
> F0721 21:59:54.950773 30622 logging.cpp:57] RAW: Pure virtual method called
> @ 0x7f3915347a02  google::LogMessage::Fail()
> @ 0x7f391534cee4  google::RawLog__()
> @ 0x7f3914890312  __cxa_pure_virtual
> @   0x88c3ae  mesos::internal::tests::Cluster::Slaves::shutdown()
> @   0x88c176  mesos::internal::tests::Cluster::Slaves::~Slaves()
> @   0x88dc16  mesos::internal::tests::Cluster::~Cluster()
> @   0x88dc87  mesos::internal::tests::MesosTest::~MesosTest()
> @   0xa529ab  
> mesos::internal::tests::DockerContainerizerTest::~DockerContainerizerTest()
> @   0xa8125f  
> mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test()
> @   0xa8128e  
> mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test()
> @  0x1218b4e  testing::Test::DeleteSelf_()
> @  0x1221909  
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @  0x121cb38  
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @  0x1205713  testing::TestInfo::Run()
> @  0x1205c4e  testing::TestCase::Run()
> @  0x120a9ca  testing::internal::UnitTestImpl::RunAllTests()
> @  0x122277b  
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @  0x121d81b  
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @  0x120987a  testing::UnitTest::Run()
> @   0xcfbf0c  main
> @ 0x7f391097caf5  __libc_start_main
> @   0x882089  (unknown)
> make[3]: *** [check-local] Aborted (core dumped)
> make[3]: Leaving directory `/home/me/mesos/build/src'
> make[2]: *** [check-am] Error 2
> make[2]: Leaving directory `/home/me/mesos/build/src'
> make[1]: *** [check] Error 2
> make[1]: Leaving directory `/home/me/mesos/build/src'
> make: *** [check-recursive] Error 1
> {code}





[jira] [Commented] (MESOS-3519) Fix file descriptor leakage / double close in the code base

2015-09-28 Thread Chi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934417#comment-14934417
 ] 

Chi Zhang commented on MESOS-3519:
--

Hi [~tnachen],

I added you as a reviewer to this patch. Could you take a look at it please?

https://reviews.apache.org/r/38828/

> Fix file descriptor leakage / double close in the code base
> ---
>
> Key: MESOS-3519
> URL: https://issues.apache.org/jira/browse/MESOS-3519
> Project: Mesos
>  Issue Type: Bug
>Reporter: Chi Zhang
>Assignee: Chi Zhang
>
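For anyone skimming: the double-close half of this is subtle because descriptor numbers are recycled, so a stale second {{close()}} can silently tear down an unrelated descriptor rather than fail. A minimal sketch of the hazard (Unix-only):

```python
import os

# Open and close a descriptor, keeping the now-stale number around.
fd = os.open("/dev/null", os.O_RDONLY)
os.close(fd)

# POSIX hands out the lowest free descriptor number, so in a
# single-threaded program a fresh open reuses the number we released.
fd2 = os.open("/dev/null", os.O_RDONLY)
assert fd2 == fd

# The stale second close of `fd` now closes fd2's descriptor instead
# of raising an error -- this is the double-close bug class.
os.close(fd)
try:
    os.read(fd2, 1)   # fd2 is silently dead
    broken = False
except OSError:
    broken = True
assert broken
```

In a multithreaded process like a Mesos agent, the reuse can happen between the two closes on another thread, which is what makes these bugs so hard to reproduce.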






[jira] [Updated] (MESOS-3548) Investigate federations of Mesos masters

2015-09-28 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-3548:
---
Summary: Investigate federations of Mesos masters  (was: Support 
federations of Mesos masters)

> Investigate federations of Mesos masters
> 
>
> Key: MESOS-3548
> URL: https://issues.apache.org/jira/browse/MESOS-3548
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Neil Conway
>  Labels: federation, multi-dc
>
> In a large Mesos installation, the operator might want to ensure that even if 
> the Mesos masters are inaccessible or failed, new tasks can still be 
> scheduled (across multiple different frameworks). HA masters are only a 
> partial solution here: the masters might still be inaccessible due to a 
> correlated failure (e.g., Zookeeper misconfiguration/human error).
> To support this, we could support the notion of "hierarchies" or 
> "federations" of Mesos masters. In a Mesos installation with 10k machines, 
> the operator might configure 10 Mesos masters (each of which might be HA) to 
> manage 1k machines each. Then an additional "meta-Master" would manage the 
> allocation of cluster resources to the 10 masters. Hence, the failure of any 
> individual master would impact 1k machines at most. The meta-master might not 
> have a lot of work to do: e.g., it might be limited to occasionally 
> reallocating cluster resources among the 10 masters, or ensuring that newly 
> added cluster resources are allocated among the masters as appropriate. 
> Hence, the failure of the meta-master would not prevent any of the individual 
> masters from scheduling new tasks. A single framework instance probably 
> wouldn't be able to use more resources than have been assigned to a single 
> Master, but that seems like a reasonable restriction.
> This feature might also be a good fit for a multi-datacenter deployment of 
> Mesos: each Mesos master instance would manage a single DC. Naturally, 
> reducing the traffic between frameworks and the meta-master would be 
> important for performance reasons in a configuration like this.
> Operationally, this might be simpler if Mesos processes were self-hosting 
> ([MESOS-3547]).





[jira] [Created] (MESOS-3548) Support federations of Mesos masters

2015-09-28 Thread Neil Conway (JIRA)
Neil Conway created MESOS-3548:
--

 Summary: Support federations of Mesos masters
 Key: MESOS-3548
 URL: https://issues.apache.org/jira/browse/MESOS-3548
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway


In a large Mesos installation, the operator might want to ensure that even if 
the Mesos masters are inaccessible or failed, new tasks can still be scheduled 
(across multiple different frameworks). HA masters are only a partial solution 
here: the masters might still be inaccessible due to a correlated failure 
(e.g., Zookeeper misconfiguration/human error).

To support this, we could support the notion of "hierarchies" or "federations" 
of Mesos masters. In a Mesos installation with 10k machines, the operator might 
configure 10 Mesos masters (each of which might be HA) to manage 1k machines 
each. Then an additional "meta-Master" would manage the allocation of cluster 
resources to the 10 masters. Hence, the failure of any individual master would 
impact 1k machines at most. The meta-master might not have a lot of work to do: 
e.g., it might be limited to occasionally reallocating cluster resources among 
the 10 masters, or ensuring that newly added cluster resources are allocated 
among the masters as appropriate. Hence, the failure of the meta-master would 
not prevent any of the individual masters from scheduling new tasks. A single 
framework instance probably wouldn't be able to use more resources than have 
been assigned to a single Master, but that seems like a reasonable restriction.

This feature might also be a good fit for a multi-datacenter deployment of 
Mesos: each Mesos master instance would manage a single DC. Naturally, reducing 
the traffic between frameworks and the meta-master would be important for 
performance reasons in a configuration like this.

Operationally, this might be simpler if Mesos processes were self-hosting 
([MESOS-3547]).





[jira] [Commented] (MESOS-3547) Investigate self-hosting Mesos processes

2015-09-28 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934378#comment-14934378
 ] 

Jie Yu commented on MESOS-3547:
---

Flying by... This is really interesting! Wondering if the slave itself can be a 
"persistent task" or not? The cyclic dependency means that we need some sort of 
bootstrapping. Having bootstrapping might also let us do self-upgrading of the 
slave? Just some random thoughts :)

> Investigate self-hosting Mesos processes
> 
>
> Key: MESOS-3547
> URL: https://issues.apache.org/jira/browse/MESOS-3547
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Neil Conway
>
> Right now, Mesos master and slave nodes are managed differently: they use 
> different binaries and startup scripts and require different ops procedures. 
> Some of this asymmetry is essential, but perhaps not all of it is. If Mesos 
> supported a concept of "persistent tasks" (see [MESOS-3545]), it might be 
> possible to implement the Mesos master as such a task -- this might help 
> unify the ops procedures between a master and a slave.





[jira] [Commented] (MESOS-3545) Investigate restoring tasks/executors after machine reboot.

2015-09-28 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934371#comment-14934371
 ] 

Neil Conway commented on MESOS-3545:


This might take the form of supporting a notion of "persistent tasks" -- i.e., 
tasks that Mesos tries to keep running whenever possible (e.g., during a 
network partition and after a machine reboot, even in the absence of network 
connectivity).

> Investigate restoring tasks/executors after machine reboot.
> ---
>
> Key: MESOS-3545
> URL: https://issues.apache.org/jira/browse/MESOS-3545
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: Benjamin Hindman
>
> If a task/executor is restartable (see MESOS-3544) it might make sense to 
> force an agent to restart these tasks/executors after a machine 
> reboot in the event that the machine is network partitioned away from the 
> master (or the master has failed) but we'd like to get these services running 
> again. Assuming the agent(s) running on the machine have not been disconnected 
> from the master for longer than the master's agent re-registration timeout 
> the agent should be able to re-register (i.e., after a network partition is 
> resolved) without a problem. However, in the same way that a framework would 
> be interested in knowing that its tasks/executors were restarted we'd want 
> to send something like a TASK_RESTARTED status update.





[jira] [Created] (MESOS-3547) Investigate self-hosting Mesos processes

2015-09-28 Thread Neil Conway (JIRA)
Neil Conway created MESOS-3547:
--

 Summary: Investigate self-hosting Mesos processes
 Key: MESOS-3547
 URL: https://issues.apache.org/jira/browse/MESOS-3547
 Project: Mesos
  Issue Type: Improvement
  Components: master
Reporter: Neil Conway


Right now, Mesos master and slave nodes are managed differently: they use 
different binaries and startup scripts and require different ops procedures. 
Some of this asymmetry is essential, but perhaps not all of it is. If Mesos 
supported a concept of "persistent tasks" (see [MESOS-3545]), it might be 
possible to implement the Mesos master as such a task -- this might help unify 
the ops procedures between a master and a slave.





[jira] [Created] (MESOS-3546) Mesos scheduler driver python binding breaks when implicitAcknowledgements is not supplied.

2015-09-28 Thread Yan Xu (JIRA)
Yan Xu created MESOS-3546:
-

 Summary: Mesos scheduler driver python binding breaks when 
implicitAcknowledgements is not supplied.
 Key: MESOS-3546
 URL: https://issues.apache.org/jira/browse/MESOS-3546
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Yan Xu


The C++ driver has overloads that make `bool implicitAcknowledgements` optional, 
but the Python binding throws an error if the client code doesn't supply it.

{noformat:title=error}
TypeError: an integer is required
{noformat}





[jira] [Created] (MESOS-3545) Investigate restoring tasks/executors after machine reboot.

2015-09-28 Thread Benjamin Hindman (JIRA)
Benjamin Hindman created MESOS-3545:
---

 Summary: Investigate restoring tasks/executors after machine 
reboot.
 Key: MESOS-3545
 URL: https://issues.apache.org/jira/browse/MESOS-3545
 Project: Mesos
  Issue Type: Improvement
  Components: slave
Reporter: Benjamin Hindman


If a task/executor is restartable (see MESOS-3544) it might make sense to force 
an agent to restart these tasks/executors after a machine reboot in 
the event that the machine is network partitioned away from the master (or the 
master has failed) but we'd like to get these services running again. Assuming 
the agent(s) running on the machine have not been disconnected from the master 
for longer than the master's agent re-registration timeout the agent should be 
able to re-register (i.e., after a network partition is resolved) without a 
problem. However, in the same way that a framework would be interested in 
knowing that its tasks/executors were restarted we'd want to send something 
like a TASK_RESTARTED status update.





[jira] [Created] (MESOS-3544) Support task and/or executor restart on failure.

2015-09-28 Thread Benjamin Hindman (JIRA)
Benjamin Hindman created MESOS-3544:
---

 Summary: Support task and/or executor restart on failure.
 Key: MESOS-3544
 URL: https://issues.apache.org/jira/browse/MESOS-3544
 Project: Mesos
  Issue Type: Bug
  Components: HTTP API, master, slave
Reporter: Benjamin Hindman


In certain instances it might be preferable to restart a task/executor after it 
fails (i.e., non-zero exit code) rather than going through an entire status 
update -> offer -> accept (launch) cycle to restart the task/executor on the 
same machine. This is especially true if the resources are reserved 
(dynamically or statically).

Of course, we still want to highlight the restart to the framework, so 
introducing something like TASK_RESTARTED might be necessary (not sure what the 
analog would be for executors).

Finally, if the task/executor has a bug we don't want to sit in an infinite 
loop, so we'll likely want to introduce this functionality in such a way as to 
limit the total restart attempts (or force a framework to have the proper 
authority to restart forever).
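The bounded-restart idea above can be sketched as a small policy object. This is illustrative only (the class and status names are hypothetical, not Mesos API): restart a failed task in place, but cap total attempts so a buggy task does not loop forever.

```python
# Illustrative sketch of a bounded in-place restart policy, as described in
# the ticket. Names here are hypothetical, not actual Mesos API.
class RestartPolicy:
    def __init__(self, max_restarts):
        self.max_restarts = max_restarts
        self.attempts = 0

    def on_exit(self, exit_code):
        # Zero exit means normal completion: no restart needed.
        if exit_code == 0:
            return "TASK_FINISHED"
        # Restart in place while under the cap; the framework would still
        # be notified, e.g. via a TASK_RESTARTED status update.
        if self.attempts < self.max_restarts:
            self.attempts += 1
            return "TASK_RESTARTED"
        # Cap exhausted: surface the failure as usual.
        return "TASK_FAILED"
```

A framework with the proper authority could be allowed to set the cap to infinity, per the last paragraph above.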





[jira] [Updated] (MESOS-3544) Support task and/or executor restart on failure.

2015-09-28 Thread Benjamin Hindman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Hindman updated MESOS-3544:

Issue Type: Epic  (was: Bug)

> Support task and/or executor restart on failure.
> 
>
> Key: MESOS-3544
> URL: https://issues.apache.org/jira/browse/MESOS-3544
> Project: Mesos
>  Issue Type: Epic
>  Components: HTTP API, master, slave
>Reporter: Benjamin Hindman
>
> In certain instances it might be preferable to restart a task/executor after 
> it fails (i.e., non-zero exit code) rather than going through an entire 
> status update -> offer -> accept (launch) cycle to restart the task/executor 
> on the same machine. This is especially true if the resources are reserved 
> (dynamically or statically).
> Of course, we still want to highlight the restart to the framework, so 
> introducing something like TASK_RESTARTED might be necessary (not sure what 
> the analog would be for executors).
> Finally, if the task/executor has a bug we don't want to sit in an infinite 
> loop, so we'll likely want to introduce this functionality in such a way as 
> to limit the total restart attempts (or force a framework to have the proper 
> authority to restart forever).





[jira] [Updated] (MESOS-3516) Add user doc for networking support in Mesos 0.25.0

2015-09-28 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3516:
--
Story Points: 2

> Add user doc for networking support in Mesos 0.25.0
> ---
>
> Key: MESOS-3516
> URL: https://issues.apache.org/jira/browse/MESOS-3516
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Niklas Quarfot Nielsen
>Assignee: Niklas Quarfot Nielsen
>






[jira] [Commented] (MESOS-3507) As an operator, I want a way to inspect queued tasks in running schedulers

2015-09-28 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934287#comment-14934287
 ] 

Vinod Kone commented on MESOS-3507:
---

{quote}
there is no uniform way of getting a notion of 'awaiting' tasks i.e. expressing 
that a framework has more work to do.
{quote}

We do have an API call for frameworks to express this. requestResources(). 
Instead of having frameworks expose "queued work" endpoints, and having 
something on the master (module?) to interpret this data in a uniform way, why 
not just have frameworks explicitly and directly express the intent of needing 
more resources via the requestResources() call? Resources are a uniform 
abstraction that every framework already understands.

> As an operator, I want a way to inspect queued tasks in running schedulers
> --
>
> Key: MESOS-3507
> URL: https://issues.apache.org/jira/browse/MESOS-3507
> Project: Mesos
>  Issue Type: Story
>Reporter: Niklas Quarfot Nielsen
>
> Currently, there is no uniform way of getting a notion of 'awaiting' tasks 
> i.e. expressing that a framework has more work to do. This information is 
> useful for auto-scaling and anomaly detection systems. Schedulers tend to 
> expose this over their own HTTP endpoints, but the formats across schedulers 
> are most likely not compatible.





[jira] [Assigned] (MESOS-3516) Add user doc for networking support in Mesos 0.25.0

2015-09-28 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen reassigned MESOS-3516:
-

Assignee: Niklas Quarfot Nielsen  (was: Kapil Arya)

> Add user doc for networking support in Mesos 0.25.0
> ---
>
> Key: MESOS-3516
> URL: https://issues.apache.org/jira/browse/MESOS-3516
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Niklas Quarfot Nielsen
>Assignee: Niklas Quarfot Nielsen
>






[jira] [Created] (MESOS-3543) Add libevent support on Unix builds.

2015-09-28 Thread Alex Clemmer (JIRA)
Alex Clemmer created MESOS-3543:
---

 Summary: Add libevent support on Unix builds.
 Key: MESOS-3543
 URL: https://issues.apache.org/jira/browse/MESOS-3543
 Project: Mesos
  Issue Type: Task
  Components: build
Reporter: Alex Clemmer
Assignee: Alex Clemmer


Right now Unix builds will (intentionally) error out when we try to build them 
with libevent. We should add support for this.





[jira] [Created] (MESOS-3542) Separate libmesos into compiling from many binaries.

2015-09-28 Thread Alex Clemmer (JIRA)
Alex Clemmer created MESOS-3542:
---

 Summary: Separate libmesos into compiling from many binaries.
 Key: MESOS-3542
 URL: https://issues.apache.org/jira/browse/MESOS-3542
 Project: Mesos
  Issue Type: Task
Reporter: Alex Clemmer
Assignee: Alex Clemmer


Historically, libmesos has been built as one huge monolithic binary. Another idea 
would be to build it from a bunch of smaller libraries (_e.g._, libagent, _etc._).





[jira] [Commented] (MESOS-3519) Fix file descriptor leakage / double close in the code base

2015-09-28 Thread Chi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934255#comment-14934255
 ] 

Chi Zhang commented on MESOS-3519:
--

https://reviews.apache.org/r/38828/

> Fix file descriptor leakage / double close in the code base
> ---
>
> Key: MESOS-3519
> URL: https://issues.apache.org/jira/browse/MESOS-3519
> Project: Mesos
>  Issue Type: Bug
>Reporter: Chi Zhang
>Assignee: Chi Zhang
>






[jira] [Created] (MESOS-3541) Add CMakeLists that builds the Mesos master

2015-09-28 Thread Alex Clemmer (JIRA)
Alex Clemmer created MESOS-3541:
---

 Summary: Add CMakeLists that builds the Mesos master
 Key: MESOS-3541
 URL: https://issues.apache.org/jira/browse/MESOS-3541
 Project: Mesos
  Issue Type: Task
  Components: build
Reporter: Alex Clemmer
Assignee: Alex Clemmer


Right now CMake builds only the agent. We want it to also build the master as 
part of the libmesos binary.





[jira] [Created] (MESOS-3540) Libevent termination triggers Broken Pipe

2015-09-28 Thread Joris Van Remoortere (JIRA)
Joris Van Remoortere created MESOS-3540:
---

 Summary: Libevent termination triggers Broken Pipe
 Key: MESOS-3540
 URL: https://issues.apache.org/jira/browse/MESOS-3540
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Reporter: Joris Van Remoortere


When the libevent loop terminates and we unblock the {{SIGPIPE}} signal, the 
pending {{SIGPIPE}} instantly triggers and causes a broken pipe when the test 
binary stops running.
{code}
Program received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x718b4700 (LWP 16270)]
pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at 
../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
53  ../sysdeps/unix/sysv/linux/pthread_sigmask.c: No such file or directory.
(gdb) bt
#0  pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at 
../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
#1  0x006fd9a4 in unblock () at 
../../../3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/signals.hpp:90
#2  0x007d7915 in run () at 
../../../3rdparty/libprocess/src/libevent.cpp:125
#3  0x007950cb in _M_invoke<>(void) () at 
/usr/include/c++/4.9/functional:1700
#4  0x00795000 in operator() () at /usr/include/c++/4.9/functional:1688
#5  0x00794f6e in _M_run () at /usr/include/c++/4.9/thread:115
#6  0x7668de30 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x779a16aa in start_thread (arg=0x718b4700) at 
pthread_create.c:333
#8  0x75df1eed in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
{code}
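The mechanics behind this crash, a signal raised while blocked stays pending and is delivered the instant it is unblocked, can be demonstrated on any POSIX system. The sketch below uses {{SIGUSR1}} instead of {{SIGPIPE}} (CPython overrides SIGPIPE's default disposition), but the pending/unblock behavior is the same one that bites libevent here.

```python
import os
import signal

fired = []
signal.signal(signal.SIGUSR1, lambda signum, frame: fired.append(signum))

# Block the signal, then raise it against ourselves: delivery is deferred
# and the signal sits in the pending set.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
os.kill(os.getpid(), signal.SIGUSR1)
assert signal.SIGUSR1 in signal.sigpending()

# Unblocking delivers the pending signal immediately -- the same mechanics
# that make the pending SIGPIPE fire when libevent's loop unblocks it at
# shutdown.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
assert fired == [signal.SIGUSR1]
```

This suggests the fix needs to either consume (or discard) the pending {{SIGPIPE}} before unblocking, or leave it blocked until the process exits.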





[jira] [Updated] (MESOS-3539) Validate that slave's work_dir is a shared mount in its own peer group when LinuxFilesystemIsolator is used.

2015-09-28 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-3539:
--
Sprint: Twitter Mesos Q3 Sprint 6

> Validate that slave's work_dir is a shared mount in its own peer group when 
> LinuxFilesystemIsolator is used.
> 
>
> Key: MESOS-3539
> URL: https://issues.apache.org/jira/browse/MESOS-3539
> Project: Mesos
>  Issue Type: Bug
>Reporter: Jie Yu
>
> To address this TODO in the code:
> {noformat}
> src/slave/containerizer/isolators/filesystem/linux.cpp +122
> // TODO(jieyu): Currently, we don't check if the slave's work_dir
> // mount is a shared mount or not. We just assume it is. We cannot
> // simply mark the slave as shared again because that will create a
> // new peer group for the mounts. This is a temporary workaround for
> // now while we are thinking about fixes.
> {noformat}





[jira] [Created] (MESOS-3539) Validate that slave's work_dir is a shared mount in its own peer group when LinuxFilesystemIsolator is used.

2015-09-28 Thread Jie Yu (JIRA)
Jie Yu created MESOS-3539:
-

 Summary: Validate that slave's work_dir is a shared mount in its 
own peer group when LinuxFilesystemIsolator is used.
 Key: MESOS-3539
 URL: https://issues.apache.org/jira/browse/MESOS-3539
 Project: Mesos
  Issue Type: Bug
Reporter: Jie Yu


To address this TODO in the code:

{noformat}
src/slave/containerizer/isolators/filesystem/linux.cpp +122


// TODO(jieyu): Currently, we don't check if the slave's work_dir
// mount is a shared mount or not. We just assume it is. We cannot
// simply mark the slave as shared again because that will create a
// new peer group for the mounts. This is a temporary workaround for
// now while we are thinking about fixes.
{noformat}
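On Linux, whether a mount is shared (and which peer group it belongs to) is visible in {{/proc/self/mountinfo}}: the optional fields before the " - " separator carry a {{shared:N}} tag for peer group N. A check along the lines the TODO asks for could be sketched as below; this is illustrative under that assumption, not the actual Mesos fix.

```python
# Sketch of validating that a work_dir mount is shared, by parsing the
# /proc/self/mountinfo format (see proc(5)). Illustrative only.
def shared_peer_group(mountinfo_line):
    """Return the peer group id if the mount is shared, else None."""
    # Optional fields sit between field 5 (mount options) and " - ".
    fields = mountinfo_line.split(" - ")[0].split()
    for tag in fields[6:]:
        if tag.startswith("shared:"):
            return int(tag.split(":")[1])
    return None


def work_dir_is_shared(mountinfo_text, work_dir):
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Field 4 is the mount point.
        if len(fields) > 4 and fields[4] == work_dir:
            return shared_peer_group(line) is not None
    return False
```

On a real agent one would read {{/proc/self/mountinfo}} and additionally verify the mount is in its own peer group (no other mounts sharing its {{shared:N}} id), per the ticket's title.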





[jira] [Commented] (MESOS-3538) CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is flaky

2015-09-28 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934219#comment-14934219
 ] 

Niklas Quarfot Nielsen commented on MESOS-3538:
---

Thanks Jie! I will rerun the test and see if that solves the problem (and close 
the ticket if everything is OK)

> CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is 
> flaky
> ---
>
> Key: MESOS-3538
> URL: https://issues.apache.org/jira/browse/MESOS-3538
> Project: Mesos
>  Issue Type: Bug
>Reporter: Niklas Quarfot Nielsen
>Priority: Blocker
>
> {code}
> $ sudo ./bin/mesos-tests.sh 
> --gtest_filter="CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy"
>  
> Source directory: /home/vagrant/mesos
> Build directory: /home/vagrant/mesos-build
> -
> We cannot run any cgroups tests that require mounting
> hierarchies because you have the following hierarchies mounted:
> /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, 
> /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, 
> /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/perf_event, 
> /sys/fs/cgroup/systemd
> We'll disable the CgroupsNoHierarchyTest test fixture for now.
> -
> sh: 1: perf: not found
> -
> No 'perf' command found so no 'perf' tests will be run
> -
> /bin/nc
> Note: Google Test filter = 
> CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy-MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward:PerfEventIsolatorTest.ROOT_CGROUPS_Sample:UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup:CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf:PerfTest.ROOT_Events:PerfTest.ROOT_Sample:PerfTest.Parse:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/2:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/3:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/4:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/6:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/7:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/8:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/9:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/10:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/11:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/12:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/13:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/14:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/15:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/16:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/17:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/18:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/19:SlaveAndFrameworkCount/HierarchicalAllocator_BENCH
MARK_Test.AddAndUpdateSlave/20:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/21:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/22:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/23:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/24:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/25:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/26:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/27:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/28:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/29:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/30:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/31:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/32:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/33:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/34:Sl

[jira] [Commented] (MESOS-3538) CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is flaky

2015-09-28 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934212#comment-14934212
 ] 

Jie Yu commented on MESOS-3538:
---

That should be fixed by this commit:

commit 4635b66af7caf024695f69f4ca07a57f2876ad29
Author: Jie Yu 
Date:   Mon Sep 28 12:45:20 2015 -0700

Fixed a bug in cgroups test filter.

Review: https://reviews.apache.org/r/38819

> CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is 
> flaky
> ---
>
> Key: MESOS-3538
> URL: https://issues.apache.org/jira/browse/MESOS-3538
> Project: Mesos
>  Issue Type: Bug
>Reporter: Niklas Quarfot Nielsen
>Priority: Blocker
>
> {code}
> $ sudo ./bin/mesos-tests.sh 
> --gtest_filter="CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy"
>  
> Source directory: /home/vagrant/mesos
> Build directory: /home/vagrant/mesos-build
> -
> We cannot run any cgroups tests that require mounting
> hierarchies because you have the following hierarchies mounted:
> /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, 
> /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, 
> /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/perf_event, 
> /sys/fs/cgroup/systemd
> We'll disable the CgroupsNoHierarchyTest test fixture for now.
> -
> sh: 1: perf: not found
> -
> No 'perf' command found so no 'perf' tests will be run
> -
> /bin/nc
> Note: Google Test filter = 
> CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy-MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward:PerfEventIsolatorTest.ROOT_CGROUPS_Sample:UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup:CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf:PerfTest.ROOT_Events:PerfTest.ROOT_Sample:PerfTest.Parse:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/2:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/3:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/4:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/6:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/7:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/8:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/9:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/10:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/11:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/12:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/13:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/14:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/15:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/16:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/17:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/18:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/19:SlaveAndFrameworkCount/HierarchicalAllocator_BENCH
MARK_Test.AddAndUpdateSlave/20:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/21:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/22:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/23:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/24:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/25:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/26:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/27:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/28:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/29:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/30:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/31:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/32:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSla

[jira] [Created] (MESOS-3538) CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is flaky

2015-09-28 Thread Niklas Quarfot Nielsen (JIRA)
Niklas Quarfot Nielsen created MESOS-3538:
-

 Summary: 
CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is 
flaky
 Key: MESOS-3538
 URL: https://issues.apache.org/jira/browse/MESOS-3538
 Project: Mesos
  Issue Type: Bug
Reporter: Niklas Quarfot Nielsen
Priority: Blocker


{code}
$ sudo ./bin/mesos-tests.sh 
--gtest_filter="CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy"
 
Source directory: /home/vagrant/mesos
Build directory: /home/vagrant/mesos-build
-
We cannot run any cgroups tests that require mounting
hierarchies because you have the following hierarchies mounted:
/sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, 
/sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, 
/sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/perf_event, 
/sys/fs/cgroup/systemd
We'll disable the CgroupsNoHierarchyTest test fixture for now.
-
sh: 1: perf: not found
-
No 'perf' command found so no 'perf' tests will be run
-
/bin/nc
Note: Google Test filter = 
CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy-MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward:PerfEventIsolatorTest.ROOT_CGROUPS_Sample:UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup:CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf:PerfTest.ROOT_Events:PerfTest.ROOT_Sample:PerfTest.Parse:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/2:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/3:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/4:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/6:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/7:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/8:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/9:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/10:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/11:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/12:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/13:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/14:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/15:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/16:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/17:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/18:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/19:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMA
RK_Test.AddAndUpdateSlave/20:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/21:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/22:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/23:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/24:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/25:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/26:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/27:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/28:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/29:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/30:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/31:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/32:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/33:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/34:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/35:SlaveCount/Registrar_BENCHMARK_Test.Performance/0:SlaveCount/Registrar_BENCHMARK_Test.Performance/1:SlaveCount/Registrar_BENCHMARK_Test.Performance/2:SlaveCount/Registrar_BENCHMARK_Test.Performance/3
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from CgroupsNoHierarchyTe

[jira] [Comment Edited] (MESOS-3123) DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails & crashes

2015-09-28 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934166#comment-14934166
 ] 

Niklas Quarfot Nielsen edited comment on MESOS-3123 at 9/28/15 10:08 PM:
-

Just ran into this during testing of Mesos 0.25.0 rc1 on Ubuntu 14.04

{code}
[ RUN  ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor
2015-09-28 
22:00:14,166:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-09-28 
22:00:17,504:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-09-28 
22:00:20,841:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-09-28 
22:00:24,178:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
../../mesos/src/tests/containerizer/docker_containerizer_tests.cpp:254: Failure
Value of: statusRunning.get().state()
  Actual: TASK_FAILED
Expected: TASK_RUNNING
[... the same ZOO_ERROR "connection refused" message repeated every ~3 seconds from 22:00:27 through 22:01:24 ...]
../../mesos/src/tests/containerizer/docker_containerizer_tests.cpp:255: Failure

[jira] [Commented] (MESOS-3123) DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails & crashes

2015-09-28 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934166#comment-14934166
 ] 

Niklas Quarfot Nielsen commented on MESOS-3123:
---

{code}
[ RUN  ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor
2015-09-28 
22:00:14,166:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-09-28 
22:00:17,504:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-09-28 
22:00:20,841:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-09-28 
22:00:24,178:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
../../mesos/src/tests/containerizer/docker_containerizer_tests.cpp:254: Failure
Value of: statusRunning.get().state()
  Actual: TASK_FAILED
Expected: TASK_RUNNING
[... the same ZOO_ERROR "connection refused" message repeated every ~3 seconds from 22:00:27 through 22:01:24 ...]
../../mesos/src/tests/containerizer/docker_containerizer_tests.cpp:255: Failure
Failed to wait 1mins for statusFinished
../../mesos/src/tests/containerizer/docker_containerizer_tests.cpp:246: Failure
Ac

[jira] [Updated] (MESOS-3516) Add user doc for networking support in Mesos 0.25.0

2015-09-28 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3516:
--
Target Version/s: 0.25.0

> Add user doc for networking support in Mesos 0.25.0
> ---
>
> Key: MESOS-3516
> URL: https://issues.apache.org/jira/browse/MESOS-3516
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Niklas Quarfot Nielsen
>Assignee: Kapil Arya
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3519) Fix file descriptor leakage / double close in the code base

2015-09-28 Thread Chi Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934087#comment-14934087
 ] 

Chi Zhang commented on MESOS-3519:
--

https://reviews.apache.org/r/38823/

> Fix file descriptor leakage / double close in the code base
> ---
>
> Key: MESOS-3519
> URL: https://issues.apache.org/jira/browse/MESOS-3519
> Project: Mesos
>  Issue Type: Bug
>Reporter: Chi Zhang
>Assignee: Chi Zhang
>






[jira] [Created] (MESOS-3537) Allow the frameworks to specify filesystem perms for volumes they own.

2015-09-28 Thread Yan Xu (JIRA)
Yan Xu created MESOS-3537:
-

 Summary: Allow the frameworks to specify filesystem perms for 
volumes they own.
 Key: MESOS-3537
 URL: https://issues.apache.org/jira/browse/MESOS-3537
 Project: Mesos
  Issue Type: Task
Reporter: Yan Xu


This is applicable to persistent volumes as well as regular volumes with the 
host path under the sandbox.

Currently these volumes are created by the slave with perms derived from its own 
umask. In order to simulate system directories (e.g. {{/var/www-data}}) from the 
host sandbox, users may need to request specific perms.





[jira] [Commented] (MESOS-3533) Unable to find and run URIs files

2015-09-28 Thread Rafael Capucho (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933968#comment-14933968
 ] 

Rafael Capucho commented on MESOS-3533:
---

Looking into the Mesos slave log, I found a lot of lines like this:

W0928 20:54:24.81043113 slave.cpp:4452] Failed to get resource statistics 
for executor 'novo-teste.c0442998-661f-11e5-8b11-0242ac1101eb' of framework 
fe42c404-7266-462b-adf5-549311bfbf32-: Failed to collect cgroup stats: 
Failed to determine cgroup for the 'cpu' subsystem: Failed to read 
/proc/27138/cgroup: Failed to open file '/proc/27138/cgroup': No such file or 
directory

Could it be the source of the problem?

> Unable to find and run URIs files
> -
>
> Key: MESOS-3533
> URL: https://issues.apache.org/jira/browse/MESOS-3533
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, general
>Affects Versions: 0.25.0
> Environment: Linux li202-122 4.1.5-x86_64-linode61 #7 SMP Mon Aug 24 
> 13:46:31 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
> Ubuntu 14.04.1 LTS
> Docker Version: 1.8.2
> Docker API version: 1.20
> Go version: go1.4.2
>Reporter: Rafael Capucho
>Priority: Blocker
>
> Hello,
> Deploying a docker container using marathon 0.11 with the following structure 
> (just an example; I tried some variations with the same result):
> {
>   "id": "testando-flask",
>   "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py",
>   "cpus": 0.5,
>   "mem": 20.0,
>   "container": {
> "type": "DOCKER",
> "docker": {
>   "image": "therealwardo/python-2.7-pip",
>   "network": "BRIDGE",
>   "privileged": true,
>   "portMappings": [
> { "containerPort": 31177, "hostPort": 0 }
>   ]
> }
>   },
>   "uris": [
> "http://blog.rafaelcapucho.com/app.zip";
>   ]
> }
> curl -X POST http://173.255.192.XXX:8080/v2/apps -d @flask.json -H 
> "Content-type: application/json"
> The task reaches the mesos master properly but it fails. When I execute the 
> same structure without uris and with a simple "python -m SimpleHTTPServer" it 
> works! The docker container is created and running.
> Analyzing the sandbox in the Mesos UI I can see that the URI files (the 
> project and the requirements.txt) are downloaded correctly; in stdout I got: 
> Archive:  
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.zip
>   inflating: 
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.py
>   
>  extracting: 
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/requeriments.txt
>   
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --stop_timeout="0ns"
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --stop_timeout="0ns"
> Registered docker executor on li202-122.members.linode.com
> Starting task testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb
> Could not open requirements file: [Errno 2] No such file or directory: 
> 'requeriments.txt'
> Storing complete log in /root/.pip/pip.log
> total 68
> drwxr-xr-x   2 root root  4096 Jan 15  2015 bin
> drwxr-xr-x   2 root root  4096 Apr 19  2012 boot
> drwxr-xr-x  10 root root 13740 Sep 28 12:44 dev
> drwxr-xr-x  46 root root  4096 Sep 28 12:44 etc
> drwxr-xr-x   2 root root  4096 Apr 19  2012 home
> drwxr-xr-x  11 root root  4096 Jan 15  2015 lib
> drwxr-xr-x   2 root root  4096 Jan 15  2015 lib64
> drwxr-xr-x   2 root

[jira] [Updated] (MESOS-2467) Allow --resources flag to take JSON.

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2467:
---
Story Points: 3

> Allow --resources flag to take JSON.
> 
>
> Key: MESOS-2467
> URL: https://issues.apache.org/jira/browse/MESOS-2467
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Jie Yu
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Currently, we use a customized format for the --resources flag. As we introduce 
> more and more features (e.g., persistence, reservation) in the Resource object, 
> we need a more generic way to specify --resources.
> For backward compatibility, we can scan the first character. If it is '[', 
> then we invoke the JSON parser. Otherwise, we use the existing parser.





[jira] [Updated] (MESOS-2467) Allow --resources flag to take JSON.

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2467:
---
Sprint: Mesosphere Sprint 20

> Allow --resources flag to take JSON.
> 
>
> Key: MESOS-2467
> URL: https://issues.apache.org/jira/browse/MESOS-2467
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Jie Yu
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Currently, we use a customized format for the --resources flag. As we introduce 
> more and more features (e.g., persistence, reservation) in the Resource object, 
> we need a more generic way to specify --resources.
> For backward compatibility, we can scan the first character. If it is '[', 
> then we invoke the JSON parser. Otherwise, we use the existing parser.





[jira] [Updated] (MESOS-1607) Introduce optimistic offers.

2015-09-28 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-1607:
-
Labels: mesosphere  (was: )

> Introduce optimistic offers.
> 
>
> Key: MESOS-1607
> URL: https://issues.apache.org/jira/browse/MESOS-1607
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation, framework, master
>Reporter: Benjamin Hindman
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
> Attachments: optimisitic-offers.pdf
>
>
> The current implementation of resource offers only enable a single framework 
> scheduler to make scheduling decisions for some available resources at a 
> time. In some circumstances, this is good, i.e., when we don't want other 
> framework schedulers to have access to some resources. However, in other 
> circumstances, there are advantages to letting multiple framework schedulers 
> attempt to make scheduling decisions for the _same_ allocation of resources 
> in parallel.
> If you think about this from a "concurrency control" perspective, the current 
> implementation of resource offers is _pessimistic_, the resources contained 
> within an offer are _locked_ until the framework scheduler that they were 
> offered to launches tasks with them or declines them. In addition to making 
> pessimistic offers we'd like to give out _optimistic_ offers, where the same 
> resources are offered to multiple framework schedulers at the same time, and 
> framework schedulers "compete" for those resources on a 
> first-come-first-serve basis (i.e., the first to launch a task "wins"). We've 
> always reserved the right to rescind resource offers using the 'rescind' 
> primitive in the API, and a framework scheduler should be prepared to launch 
> a task and have that task be lost because another framework already started 
> to use those resources.
> Introducing optimistic offers will enable more sophisticated allocation 
> algorithms. For example, we can optimistically allocate resources that are 
> reserved for a particular framework (role) but are not being used. In 
> conjunction with revocable resources (the concept that using resources not 
> reserved for you means you might get those resources revoked) we can easily 
> create a "spot" market for unused resources, driving up utilization by 
> letting frameworks that are willing to use revocable resources run tasks.
> In the limit, one could imagine always making optimistic resource offers. 
> This bears a striking resemblance with the Google Omega model (an isomorphism 
> even). However, being able to configure what resources should be allocated 
> optimistically and what resources should be allocated pessimistically gives 
> even more control to a datacenter/cluster operator that might want to, for 
> example, never let multiple frameworks (roles) compete for some set of 
> resources.





[jira] [Updated] (MESOS-2972) Serialize Docker image spec as protobuf

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2972:
---
Story Points: 3

> Serialize Docker image spec as protobuf
> ---
>
> Key: MESOS-2972
> URL: https://issues.apache.org/jira/browse/MESOS-2972
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> The Docker image specification defines a schema for the metadata json that it 
> puts into each image. Currently the docker image provisioner needs to be able 
> to parse and understand this metadata json, and we should create a protobuf 
> equivalent schema so we can utilize the json to protobuf conversion to read 
> and validate the metadata.





[jira] [Updated] (MESOS-2972) Serialize Docker image spec as protobuf

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2972:
---
Sprint: Mesosphere Sprint 20

> Serialize Docker image spec as protobuf
> ---
>
> Key: MESOS-2972
> URL: https://issues.apache.org/jira/browse/MESOS-2972
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> The Docker image specification defines a schema for the metadata json that it 
> puts into each image. Currently the docker image provisioner needs to be able 
> to parse and understand this metadata json, and we should create a protobuf 
> equivalent schema so we can utilize the json to protobuf conversion to read 
> and validate the metadata.





[jira] [Updated] (MESOS-3099) Validation of Docker Image Manifests from Docker Registry

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3099:
---
Sprint: Mesosphere Sprint 20

> Validation of Docker Image Manifests from Docker Registry
> -
>
> Key: MESOS-3099
> URL: https://issues.apache.org/jira/browse/MESOS-3099
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Lily Chen
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> Docker image manifests pulled from remote Docker registries should be 
> verified against their signature digest before they are used. 





[jira] [Updated] (MESOS-3476) Refactor Status Update method on Slave to handle HTTP based Executors

2015-09-28 Thread Isabel Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-3476:
--
Shepherd: Vinod Kone
  Sprint: Mesosphere Sprint 20
Story Points: 8

> Refactor Status Update method on Slave to handle HTTP based Executors
> -
>
> Key: MESOS-3476
> URL: https://issues.apache.org/jira/browse/MESOS-3476
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Isabel Jimenez
>  Labels: mesosphere
>
> Currently, status updates sent from the slave to itself (in {{runTask}} 
> and {{killTask}}) and status updates from executors are all handled by the 
> {{Slave::statusUpdate}} method on Slave. The signature of the method is 
> {{void Slave::statusUpdate(StatusUpdate update, const UPID& pid)}}. 
> We need to create another overload that can also handle HTTP-based 
> executors, which the existing PID-based function can call into. The 
> signature of the new function could be:
> {{void Slave::statusUpdate(StatusUpdate update, Executor* executor)}}
> The HTTP Executor would also call into this new function via 
> {{src/slave/http.cpp}}





[jira] [Updated] (MESOS-3235) FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3235:
---
Story Points: 2

> FetcherCacheHttpTest.HttpCachedSerialized and 
> FetcherCacheHttpTest.HttpCachedConcurrent are flaky
> -
>
> Key: MESOS-3235
> URL: https://issues.apache.org/jira/browse/MESOS-3235
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Joseph Wu
>Assignee: Bernd Mathiske
>  Labels: mesosphere
>
> On OSX, {{make clean && make -j8 V=0 check}}:
> {code}
> [--] 3 tests from FetcherCacheHttpTest
> [ RUN  ] FetcherCacheHttpTest.HttpCachedSerialized
> HTTP/1.1 200 OK
> Date: Fri, 07 Aug 2015 17:23:05 GMT
> Content-Length: 30
> I0807 10:23:05.673596 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:05.675884 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:05.675897 182226944 exec.cpp:207] Executor registered on slave 
> 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:05.683980 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 0
> Forked command at 54363
> sh -c './mesos-fetcher-test-cmd 0'
> E0807 10:23:05.694953 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Command exited with status 0 (pid: 54363)
> E0807 10:23:05.793927 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:06.590008 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:06.592244 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:06.592243 353255424 exec.cpp:207] Executor registered on slave 
> 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:06.597995 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 1
> Forked command at 54411
> sh -c './mesos-fetcher-test-cmd 1'
> E0807 10:23:06.608708 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Command exited with status 0 (pid: 54411)
> E0807 10:23:06.707649 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> ../../src/tests/fetcher_cache_tests.cpp:860: Failure
> Failed to wait 15secs for awaitFinished(task.get())
> *** Aborted at 1438968214 (unix time) try "date -d @1438968214" if you are 
> using GNU date ***
> [  FAILED  ] FetcherCacheHttpTest.HttpCachedSerialized (28685 ms)
> [ RUN  ] FetcherCacheHttpTest.HttpCachedConcurrent
> PC: @0x113723618 process::Owned<>::get()
> *** SIGSEGV (@0x0) received by PID 52313 (TID 0x118d59000) stack trace: ***
> @ 0x7fff8fcacf1a _sigtramp
> @ 0x7f9bc3109710 (unknown)
> @0x1136f07e2 mesos::internal::slave::Fetcher::fetch()
> @0x113862f9d 
> mesos::internal::slave::MesosContainerizerProcess::fetch()
> @0x1138f1b5d 
> _ZZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS2_11ContainerIDERKNS2_11CommandInfoERKNSt3__112basic_stringIcNSC_11char_traitsIcEENSC_9allocatorIcRK6OptionISI_ERKNS2_7SlaveIDES6_S9_SI_SM_SP_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSW_FSU_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_ENKUlPNS_11ProcessBaseEE_clES1D_
> @0x1138f18cf 
> _ZNSt3__110__function6__funcIZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS5_11ContainerIDERKNS5_11CommandInfoERKNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcRK6OptionISK_ERKNS5_7SlaveIDES9_SC_SK_SO_SR_EENS2_6FutureIT_EERKNS2_3PIDIT0_EEMSY_FSW_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_EUlPNS2_11ProcessBaseEE_NSI_IS1G_EEFvS1F_EEclEOS1F_
> @0x1143768cf std::__1::function<>::operator()()
> @0x11435ca7f process::ProcessBase::visit()
> @0x1143ed6fe process::DispatchEvent::visit()
> @0x11271 process::ProcessBase::serve()
> @0x114343b4e process::ProcessManager::resume()
> @0x1143431ca process::internal::schedule()
> @0x1143da646 _ZNSt3__114__thread_proxyINS_5tupleIJPFvvEEPvS5_
> @ 0x7fff95090268 _pthread_body
> @ 0x7fff950901e5 _pthread_start
> @ 0x7fff9508e41d thread_start
> Failed to synchronize with slave (it's probably exited)
> make[3]: *** [check-local] Segmentation fault: 11
> make[2]: *** [check-am] Error 2
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> {code}
> This was encountered just once out of 3+ {{make check}}s.





[jira] [Updated] (MESOS-3291) Add docker exec command

2015-09-28 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-3291:

Shepherd: Timothy Chen

> Add docker exec command
> ---
>
> Key: MESOS-3291
> URL: https://issues.apache.org/jira/browse/MESOS-3291
> Project: Mesos
>  Issue Type: Task
>  Components: docker
>Reporter: haosdent
>Assignee: haosdent
>  Labels: docker, mesosphere
>
> To fix the problem [MESOS-3136 | 
> https://issues.apache.org/jira/browse/MESOS-3136], we need to run the health 
> check command in the docker container through "docker exec", so we need to 
> implement an exec command in docker/docker.cpp.





[jira] [Updated] (MESOS-3235) FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3235:
---
Sprint: Mesosphere Sprint 20

> FetcherCacheHttpTest.HttpCachedSerialized and 
> FetcherCacheHttpTest.HttpCachedConcurrent are flaky
> -
>
> Key: MESOS-3235
> URL: https://issues.apache.org/jira/browse/MESOS-3235
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Joseph Wu
>Assignee: Bernd Mathiske
>  Labels: mesosphere
>
> On OSX, {{make clean && make -j8 V=0 check}}:
> {code}
> [--] 3 tests from FetcherCacheHttpTest
> [ RUN  ] FetcherCacheHttpTest.HttpCachedSerialized
> HTTP/1.1 200 OK
> Date: Fri, 07 Aug 2015 17:23:05 GMT
> Content-Length: 30
> I0807 10:23:05.673596 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:05.675884 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:05.675897 182226944 exec.cpp:207] Executor registered on slave 
> 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:05.683980 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 0
> Forked command at 54363
> sh -c './mesos-fetcher-test-cmd 0'
> E0807 10:23:05.694953 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Command exited with status 0 (pid: 54363)
> E0807 10:23:05.793927 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:06.590008 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:06.592244 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:06.592243 353255424 exec.cpp:207] Executor registered on slave 
> 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:06.597995 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 1
> Forked command at 54411
> sh -c './mesos-fetcher-test-cmd 1'
> E0807 10:23:06.608708 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Command exited with status 0 (pid: 54411)
> E0807 10:23:06.707649 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> ../../src/tests/fetcher_cache_tests.cpp:860: Failure
> Failed to wait 15secs for awaitFinished(task.get())
> *** Aborted at 1438968214 (unix time) try "date -d @1438968214" if you are 
> using GNU date ***
> [  FAILED  ] FetcherCacheHttpTest.HttpCachedSerialized (28685 ms)
> [ RUN  ] FetcherCacheHttpTest.HttpCachedConcurrent
> PC: @0x113723618 process::Owned<>::get()
> *** SIGSEGV (@0x0) received by PID 52313 (TID 0x118d59000) stack trace: ***
> @ 0x7fff8fcacf1a _sigtramp
> @ 0x7f9bc3109710 (unknown)
> @0x1136f07e2 mesos::internal::slave::Fetcher::fetch()
> @0x113862f9d 
> mesos::internal::slave::MesosContainerizerProcess::fetch()
> @0x1138f1b5d 
> _ZZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS2_11ContainerIDERKNS2_11CommandInfoERKNSt3__112basic_stringIcNSC_11char_traitsIcEENSC_9allocatorIcRK6OptionISI_ERKNS2_7SlaveIDES6_S9_SI_SM_SP_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSW_FSU_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_ENKUlPNS_11ProcessBaseEE_clES1D_
> @0x1138f18cf 
> _ZNSt3__110__function6__funcIZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS5_11ContainerIDERKNS5_11CommandInfoERKNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcRK6OptionISK_ERKNS5_7SlaveIDES9_SC_SK_SO_SR_EENS2_6FutureIT_EERKNS2_3PIDIT0_EEMSY_FSW_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_EUlPNS2_11ProcessBaseEE_NSI_IS1G_EEFvS1F_EEclEOS1F_
> @0x1143768cf std::__1::function<>::operator()()
> @0x11435ca7f process::ProcessBase::visit()
> @0x1143ed6fe process::DispatchEvent::visit()
> @0x11271 process::ProcessBase::serve()
> @0x114343b4e process::ProcessManager::resume()
> @0x1143431ca process::internal::schedule()
> @0x1143da646 _ZNSt3__114__thread_proxyINS_5tupleIJPFvvEEPvS5_
> @ 0x7fff95090268 _pthread_body
> @ 0x7fff950901e5 _pthread_start
> @ 0x7fff9508e41d thread_start
> Failed to synchronize with slave (it's probably exited)
> make[3]: *** [check-local] Segmentation fault: 11
> make[2]: *** [check-am] Error 2
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> {code}
> This was encountered just once out of 3+ {{make check}}s.





[jira] [Commented] (MESOS-2972) Serialize Docker image spec as protobuf

2015-09-28 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933907#comment-14933907
 ] 

Timothy Chen commented on MESOS-2972:
-

I see, we don't use JsonSchema yet and I'm not sure what that integration 
would look like.
This approach was actually recommended by other committers from Twitter, as 
they already modeled the AppC json with Protobuf.
The conversion and modeling are actually quite straightforward using the JSON 
<-> Protobuf tools we have.
I still think we should conform to current practice for now, and look into 
JsonSchema later if it's a better alternative for converting all the others.
What I'd like to avoid is a complicated approach that's harder to maintain; at 
least this conforms to the best practice we have as of now.
Does that sound good [~marco-mesos]?

> Serialize Docker image spec as protobuf
> ---
>
> Key: MESOS-2972
> URL: https://issues.apache.org/jira/browse/MESOS-2972
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> The Docker image specification defines a schema for the metadata json that it 
> puts into each image. Currently the docker image provisioner needs to be able 
> to parse and understand this metadata json, and we should create a protobuf 
> equivalent schema so we can utilize the json to protobuf conversion to read 
> and validate the metadata.
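The "JSON -> typed message" validation idea behind this can be sketched as follows. A hedged illustration in Python rather than actual protobuf code; the field names and types below are stand-ins, not the real Docker image spec schema.

```python
# Illustrative schema: a typed field map standing in for a protobuf message.
DOCKER_IMAGE_SCHEMA = {
    "id": str,
    "parent": str,
    "created": str,
    "size": int,
}

def parse_image_metadata(json_dict):
    """Check a metadata dict against the schema, much as a JSON-to-protobuf
    conversion would reject unknown fields and type mismatches."""
    for field, value in json_dict.items():
        expected = DOCKER_IMAGE_SCHEMA.get(field)
        if expected is None:
            raise ValueError("unknown field: " + field)
        if not isinstance(value, expected):
            raise ValueError("bad type for field: " + field)
    return json_dict

metadata = parse_image_metadata({"id": "abc123", "size": 1024})
```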





[jira] [Updated] (MESOS-3183) Documentation images do not load

2015-09-28 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-3183:
-
Sprint: Mesosphere Sprint 20

> Documentation images do not load
> 
>
> Key: MESOS-3183
> URL: https://issues.apache.org/jira/browse/MESOS-3183
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 0.24.0
>Reporter: James Mulcahy
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: mesosphere
> Attachments: rake.patch
>
>
> Any images which are referenced from the generated docs ({{docs/*.md}}) do 
> not show up on the website.  For example:
> * [External 
> Containerizer|http://mesos.apache.org/documentation/latest/external-containerizer/]
> * [Fetcher Cache 
> Internals|http://mesos.apache.org/documentation/latest/fetcher-cache-internals/]
> * [Maintenance|http://mesos.apache.org/documentation/latest/maintenance/] 
> * 
> [Oversubscription|http://mesos.apache.org/documentation/latest/oversubscription/]





[jira] [Updated] (MESOS-3378) Document a test pattern for expediting event firing

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3378:
---
Sprint: Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20  
(was: Mesosphere Sprint 18, Mesosphere Sprint 19)

> Document a test pattern for expediting event firing
> ---
>
> Key: MESOS-3378
> URL: https://issues.apache.org/jira/browse/MESOS-3378
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation, test
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Minor
>  Labels: mesosphere
>
> We use {{Clock::advance()}} extensively in tests to expedite event firing and 
> minimize overall {{make check}} time. Document this pattern for posterity.
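The pattern to be documented can be illustrated with a fake clock. This is a Python analogue of the idea, not libprocess's actual `Clock` API; names here are illustrative.

```python
# A fake clock fires pending timers immediately when advanced, so a test
# never sleeps in real time (the idea behind Clock::advance()).
class FakeClock:
    def __init__(self):
        self.now = 0.0
        self.timers = []  # list of (deadline, callback)

    def schedule(self, delay, callback):
        self.timers.append((self.now + delay, callback))

    def advance(self, seconds):
        self.now += seconds
        due = [cb for deadline, cb in self.timers if deadline <= self.now]
        self.timers = [(d, cb) for d, cb in self.timers if d > self.now]
        for cb in due:
            cb()

fired = []
clock = FakeClock()
clock.schedule(60.0, lambda: fired.append("registration retry"))
clock.advance(60.0)  # the 60s timer fires without the test waiting 60s
```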





[jira] [Updated] (MESOS-3468) Improve apply_reviews.sh script to apply chain of reviews

2015-09-28 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3468:

Sprint: Mesosphere Sprint 20

> Improve apply_reviews.sh script to apply chain of reviews
> -
>
> Key: MESOS-3468
> URL: https://issues.apache.org/jira/browse/MESOS-3468
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Vinod Kone
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> Currently the support/apply-review.sh script allows a user (typically a 
> committer) to apply a single review on top of HEAD. Since Mesos contributors 
> typically submit a chain of reviews for a given issue, it makes sense for the 
> script to apply the whole chain recursively.





[jira] [Updated] (MESOS-3140) Implement Docker remote puller

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3140:
---
Sprint: Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20  
(was: Mesosphere Sprint 18, Mesosphere Sprint 19)

> Implement Docker remote puller
> --
>
> Key: MESOS-3140
> URL: https://issues.apache.org/jira/browse/MESOS-3140
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Lily Chen
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Given a Docker image name and registry host URL, the puller fetches the 
> image. If necessary, it will download the manifest and layers from the 
> registry host, placing them into the persistent store.
> Done when a Docker image can be successfully stored and retrieved using 'put' 
> and 'get' methods.
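The 'put'/'get' contract above can be sketched as follows. A hedged Python illustration; the class name, on-disk layout, and manifest-as-layer-list model are assumptions, not the actual Mesos provisioner store.

```python
import os
import shutil
import tempfile

class LayerStore:
    """Toy persistent store: 'put' records an image's ordered layer list,
    'get' reads it back (illustrative of the Done criterion above)."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, image_name, layer_ids):
        with open(os.path.join(self.root, image_name), "w") as f:
            f.write("\n".join(layer_ids))

    def get(self, image_name):
        with open(os.path.join(self.root, image_name)) as f:
            return f.read().splitlines()

root = tempfile.mkdtemp()
store = LayerStore(root)
store.put("busybox", ["layer1", "layer2"])
layers = store.get("busybox")
shutil.rmtree(root)  # clean up the temporary store
```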





[jira] [Updated] (MESOS-3074) Check satisfiability of quota requests in Master

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3074:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20  (was: 
Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere 
Sprint 18, Mesosphere Sprint 19)

> Check satisfiability of quota requests in Master
> 
>
> Key: MESOS-3074
> URL: https://issues.apache.org/jira/browse/MESOS-3074
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> We need to validate quota requests in the Mesos Master as outlined in
> the Design Doc: 
> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I
> This ticket aims to validate satisfiability (in terms of available resources) 
> of a quota request using a heuristic algorithm in the Mesos Master, rather 
> than validating the syntax of the request.
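One simple form such a heuristic could take is checking, per resource, that the request fits within the cluster total not already promised to other quotas. The function and field names below are illustrative assumptions, not the algorithm from the design doc.

```python
def quota_satisfiable(request, cluster_total, allocated_quota):
    """Heuristic satisfiability check: every requested resource amount must
    fit within (cluster total - amounts already promised via quota)."""
    for resource, amount in request.items():
        available = cluster_total.get(resource, 0) - allocated_quota.get(resource, 0)
        if amount > available:
            return False
    return True

# 16 cpus total, 8 already promised -> a request for 4 cpus fits.
ok = quota_satisfiable({"cpus": 4, "mem": 2048},
                       cluster_total={"cpus": 16, "mem": 65536},
                       allocated_quota={"cpus": 8, "mem": 4096})
```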





[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers

2015-09-28 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-1615:
-
Sprint: Mesosphere Sprint 20

> Create design document for Optimistic Offers
> 
>
> Key: MESOS-1615
> URL: https://issues.apache.org/jira/browse/MESOS-1615
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Dominic Hamon
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> As a first step toward Optimistic Offers, take the description from the epic 
> and build an implementation design doc that can be shared for comments.
> Note: the links to the working group notes and design doc are located in the 
> [JIRA Epic|MESOS-1607].





[jira] [Updated] (MESOS-3313) Rework Jenkins build script

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3313:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, 
Mesosphere Sprint 20  (was: Mesosphere Sprint 17, Mesosphere Sprint 18, 
Mesosphere Sprint 19)

> Rework Jenkins build script
> ---
>
> Key: MESOS-3313
> URL: https://issues.apache.org/jira/browse/MESOS-3313
> Project: Mesos
>  Issue Type: Task
>Reporter: Artem Harutyunyan
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> The Mesos Jenkins build script needs to be reworked to support the following:
> - Wider test coverage (libevent, libssl, root tests, Docker tests).
> - More OS/compiler Docker images for testing Mesos.
> - Excluding tests on per-image basis.
> - Reproducing the test image locally.





[jira] [Updated] (MESOS-3480) Refactor Executor struct in Slave to handle HTTP based executors

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3480:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20  (was: Mesosphere Sprint 
19)

> Refactor Executor struct in Slave to handle HTTP based executors
> 
>
> Key: MESOS-3480
> URL: https://issues.apache.org/jira/browse/MESOS-3480
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Currently, the {{struct Executor}} in the slave only supports executors 
> connected via message passing (driver). We should refactor it to add support 
> for HTTP-based executors, similar to what was done for the Scheduler API 
> {{struct Framework}} in {{src/master/master.hpp}}.





[jira] [Updated] (MESOS-3515) Support Subscribe Call for HTTP based Executors

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3515:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20  (was: Mesosphere Sprint 
19)

> Support Subscribe Call for HTTP based Executors
> ---
>
> Key: MESOS-3515
> URL: https://issues.apache.org/jira/browse/MESOS-3515
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> We need to add a {{subscribe(...)}} method in {{src/slave/slave.cpp}} to 
> introduce the ability for HTTP based executors to subscribe and then receive 
> events on the persistent HTTP connection. Most of the functionality needed 
> would be similar to {{Master::subscribe}} in {{src/master/master.cpp}}.
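The persistent connection carries a stream of events; the scheduler API frames each event in RecordIO style ("<length>\n<message>"), and an executor stream would presumably do the same. A simplified Python sketch of that framing, for illustration only:

```python
def recordio_encode(message):
    """Frame one message as '<length>\\n<payload>'."""
    data = message.encode("utf-8")
    return str(len(data)).encode("ascii") + b"\n" + data

def recordio_decode_one(stream):
    """Decode a single record off the front of `stream`; returns (msg, rest)."""
    newline = stream.index(b"\n")
    length = int(stream[:newline])
    start = newline + 1
    return stream[start:start + length].decode("utf-8"), stream[start + length:]

frame = recordio_encode('{"type":"SUBSCRIBED"}')
msg, rest = recordio_decode_one(frame)
```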





[jira] [Updated] (MESOS-2949) Design generalized Authorizer interface

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2949:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20  (was: 
Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere 
Sprint 18, Mesosphere Sprint 19)

> Design generalized Authorizer interface
> ---
>
> Key: MESOS-2949
> URL: https://issues.apache.org/jira/browse/MESOS-2949
> Project: Mesos
>  Issue Type: Task
>  Components: master, security
>Reporter: Alexander Rojas
>Assignee: Alexander Rojas
>  Labels: acl, mesosphere, security
>
> As mentioned in MESOS-2948 the current {{mesos::Authorizer}} interface is 
> rather inflexible if new _Actions_ or _Objects_ need to be added.
> A new API needs to be designed in a way that allows for arbitrary _Actions_ 
> and _Objects_ to be added to the authorization mechanism without having to 
> recompile Mesos.





[jira] [Updated] (MESOS-3164) Introduce QuotaInfo message

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3164:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20  (was: 
Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere 
Sprint 18, Mesosphere Sprint 19)

> Introduce QuotaInfo message
> ---
>
> Key: MESOS-3164
> URL: https://issues.apache.org/jira/browse/MESOS-3164
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> A {{QuotaInfo}} protobuf message is the internal representation for 
> quota-related information (e.g. for persisting quota). The protobuf message 
> should be extendable for future needs and allow for easy aggregation across 
> roles and operator principals. It may also be used to pass quota information 
> to allocators.
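A message along these lines would cover the requirements above. This is a hypothetical sketch only; the field names and numbering are guesses, not the definition that would land in Mesos.

```protobuf
// Hypothetical sketch; the actual QuotaInfo fields may differ.
message QuotaInfo {
  optional string role = 1;        // role the quota is set for
  optional string principal = 2;   // operator principal who requested it
  repeated Resource guarantee = 3; // resources guaranteed to the role
}
```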





[jira] [Updated] (MESOS-3428) Support running filesystem isolation with Command Executor in MesosContainerizer

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3428:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20  (was: Mesosphere Sprint 
19)

> Support running filesystem isolation with Command Executor in 
> MesosContainerizer
> 
>
> Key: MESOS-3428
> URL: https://issues.apache.org/jira/browse/MESOS-3428
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Timothy Chen
>Assignee: Timothy Chen
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-2906) Slave : Synchronous Validation for Calls

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2906:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20  (was: Mesosphere Sprint 
19)

> Slave : Synchronous Validation for Calls
> 
>
> Key: MESOS-2906
> URL: https://issues.apache.org/jira/browse/MESOS-2906
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Isabel Jimenez
>  Labels: HTTP, mesosphere
>
> The /call endpoint on the slave will return a 202 Accepted code but has to 
> perform some basic validation first. If validation fails it will return a 
> {{BadRequest}} back to the client.
> - We need to create the required infrastructure to validate the request and 
> then process it, similar to {{src/master/validation.cpp}} in the {{namespace 
> scheduler}}, i.e. check that the protobuf is properly initialized, has the 
> required attributes set pertaining to the call message, etc.
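The validation step can be sketched as follows. A hedged Python illustration of the shape of such a check; the call types and required fields listed are assumptions, not the actual slave API.

```python
# Required fields per call type: purely illustrative, not the real API.
REQUIRED_FIELDS = {
    "SUBSCRIBE": ["framework_id", "executor_id"],
    "UPDATE": ["framework_id", "executor_id", "update"],
}

def validate_call(call):
    """Return an error string (-> 400 BadRequest) or None (-> 202 Accepted)."""
    call_type = call.get("type")
    if call_type not in REQUIRED_FIELDS:
        return "Unsupported call type"
    for field in REQUIRED_FIELDS[call_type]:
        if field not in call:
            return "Missing required field: " + field
    return None  # valid; proceed to process the call

error = validate_call({"type": "SUBSCRIBE", "framework_id": "f1"})
```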





[jira] [Updated] (MESOS-3357) Update quota design doc based on user comments and offline syncs

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3357:
---
Sprint: Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20  
(was: Mesosphere Sprint 18, Mesosphere Sprint 19)

> Update quota design doc based on user comments and offline syncs
> 
>
> Key: MESOS-3357
> URL: https://issues.apache.org/jira/browse/MESOS-3357
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> We got plenty of feedback from different parties, which we would like to 
> persist in the design doc for posterity.





[jira] [Updated] (MESOS-3497) Add implementation for sha256 based file content verification.

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3497:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20  (was: Mesosphere Sprint 
19)

> Add implementation for sha256 based file content verification.
> --
>
> Key: MESOS-3497
> URL: https://issues.apache.org/jira/browse/MESOS-3497
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> https://reviews.apache.org/r/38747/





[jira] [Updated] (MESOS-3496) Create interface for digest verifier

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3496:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20  (was: Mesosphere Sprint 
19)

> Create interface for digest verifier
> 
>
> Key: MESOS-3496
> URL: https://issues.apache.org/jira/browse/MESOS-3496
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Add an interface for digest verifiers so that we can add implementations for 
> digest types like sha256, sha512, etc.
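The proposed interface can be sketched as follows: one verifier per algorithm behind a common interface, so sha512 and others slot in alongside sha256. Class and method names here are assumptions, shown in Python rather than the actual Mesos C++.

```python
import hashlib

class DigestVerifier:
    """Common interface: concrete verifiers implement one digest algorithm."""
    def verify(self, data, expected_digest):
        raise NotImplementedError

class Sha256Verifier(DigestVerifier):
    def verify(self, data, expected_digest):
        # Compare the hex digest of the content against the expected value.
        return hashlib.sha256(data).hexdigest() == expected_digest

verifier = Sha256Verifier()
ok = verifier.verify(b"layer-bytes",
                     hashlib.sha256(b"layer-bytes").hexdigest())
```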





[jira] [Updated] (MESOS-2879) Random recursive_mutex errors in when running make check

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2879:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 18, Mesosphere Sprint 19, 
Mesosphere Sprint 20  (was: Mesosphere Sprint 15, Mesosphere Sprint 18, 
Mesosphere Sprint 19)

> Random recursive_mutex errors in when running make check
> 
>
> Key: MESOS-2879
> URL: https://issues.apache.org/jira/browse/MESOS-2879
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Alexander Rojas
>Assignee: Greg Mann
>  Labels: mesosphere, tech-debt
>
> While running make check on OS X, from time to time {{recursive_mutex}} 
> errors appear after running all the tests successfully. Only one of the 
> observed messages actually stops {{make check}} with an error.
> The following error messages have been experienced:
> {code}
> libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: 
> libc++abi.dylib: libc++abi.dylib: terminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argument
> *** Aborted at 1434553937 (unix time) try "date -d @1434553937" if you are 
> using GNU date ***
> {code}
> {code}
> libc++abi.dylib: terminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid argument
> *** Aborted at 1434557001 (unix time) try "date -d @1434557001" if you are 
> using GNU date ***
> libc++abi.dylib: PC: @ 0x7fff93855286 __pthread_kill
> libc++abi.dylib: *** SIGABRT (@0x7fff93855286) received by PID 88060 (TID 
> 0x10fc4) stack trace: ***
> @ 0x7fff8e1d6f1a _sigtramp
> libc++abi.dylib: @0x10fc3f1a8 (unknown)
> libc++abi.dylib: @ 0x7fff979deb53 abort
> libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: terminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentMaking check in include
> {code}
> {code}
> Assertion failed: (e == 0), function ~recursive_mutex, file 
> /SourceCache/libcxx/libcxx-120/src/mutex.cpp, line 82.
> *** Aborted at 1434555685 (unix time) try "date -d @1434555685" if you are 
> using GNU date ***
> PC: @ 0x7fff93855286 __pthread_kill
> *** SIGABRT (@0x7fff93855286) received by PID 60235 (TID 0x7fff7ebdc300) 
> stack trace: ***
> @ 0x7fff8e1d6f1a _sigtramp
> @0x10b512350 google::CheckNotNull<>()
> @ 0x7fff979deb53 abort
> @ 0x7fff979a6c39 __assert_rtn
> @ 0x7fff9bffdcc9 std::__1::recursive_mutex::~recursive_mutex()
> @0x10b881928 process::ProcessManager::~ProcessManager()
> @0x10b874445 process::ProcessManager::~ProcessManager()
> @0x10b874418 process::finalize()
> @0x10b2f7aec main
> @ 0x7fff98edc5c9 start
> make[5]: *** [check-local] Abort trap: 6
> make[4]: *** [check-am] Error 2
> make[3]: *** [check-recursive] Error 1
> make[2]: *** [check-recursive] Error 1
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> {code}





[jira] [Updated] (MESOS-2708) Design doc for the Executor HTTP API

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2708:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, 
Mesosphere Sprint 20  (was: Mesosphere Sprint 17, Mesosphere Sprint 18, 
Mesosphere Sprint 19)

> Design doc for the Executor HTTP API
> 
>
> Key: MESOS-2708
> URL: https://issues.apache.org/jira/browse/MESOS-2708
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rojas
>Assignee: Isabel Jimenez
>  Labels: mesosphere
>
> This tracks the design of the Executor HTTP API.





[jira] [Updated] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3086:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20  (was: 
Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere 
Sprint 18, Mesosphere Sprint 19)

> Create cgroups TasksKiller for non freeze subsystems.
> -
>
> Key: MESOS-3086
> URL: https://issues.apache.org/jira/browse/MESOS-3086
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> We have a number of test issues when we cannot remove cgroups (because 
> related tasks are still running) in cases where the freezer subsystem is not 
> available.
> In the current code 
> (https://github.com/apache/mesos/blob/0.22.1/src/linux/cgroups.cpp#L1728) we 
> will fall back to a very simple mechanism of recursively trying to remove the 
> cgroups, which fails if there are still tasks running.
> Therefore we need an additional (NonFreeze)TasksKiller which doesn't rely 
> on the freezer subsystem.
> This problem caused issues when running 'sudo make check' during 0.23 release 
> testing, where BenH already provided a better error message with 
> b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604.





[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota

2015-09-28 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3073:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20  (was: 
Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere 
Sprint 18, Mesosphere Sprint 19)

> Introduce HTTP endpoints for Quota
> --
>
> Key: MESOS-3073
> URL: https://issues.apache.org/jira/browse/MESOS-3073
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> We need to implement the HTTP endpoints for Quota as outlined in the Design 
> Doc: 
> (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I).





[jira] [Created] (MESOS-3536) Error loading isolator module with 0.25.0-rc1

2015-09-28 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-3536:
-

 Summary: Error loading isolator module with 0.25.0-rc1
 Key: MESOS-3536
 URL: https://issues.apache.org/jira/browse/MESOS-3536
 Project: Mesos
  Issue Type: Bug
Reporter: Kapil Arya
Assignee: Kapil Arya
Priority: Blocker


When trying to load the network isolator module from 
https://github.com/djosborne/net-modules/tree/test-0.25.0/ in 0.25.0-rc1, we 
are seeing the following error:

{code}
Error loading modules: Error opening library: 
'/isolator/build/.libs/libmesos_network_isolator.so': Could not load library 
'/isolator/build/.libs/libmesos_network_isolator.so': 
/isolator/build/.libs/libmesos_network_isolator.so: undefined symbol: 
_ZNK8picojson5value2isIlEEbv
{code}





[jira] [Commented] (MESOS-1478) Replace Master/Slave terminology

2015-09-28 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933889#comment-14933889
 ] 

Vinod Kone commented on MESOS-1478:
---

[~benjaminhindman] Can you share the doc on the details please?

> Replace Master/Slave terminology
> 
>
> Key: MESOS-1478
> URL: https://issues.apache.org/jira/browse/MESOS-1478
> Project: Mesos
>  Issue Type: Wish
>Reporter: Clark Breyman
>Assignee: Benjamin Hindman
>Priority: Minor
>  Labels: mesosphere
>
> Inspired by the comments on this PR:
> https://github.com/django/django/pull/2692
> TL;DR - Computers sharing work should be a good thing. Using the language of 
> human bondage and suffering is inappropriate in this context. It also has the 
> potential to alienate users and community members. 





[jira] [Commented] (MESOS-2972) Serialize Docker image spec as protobuf

2015-09-28 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933873#comment-14933873
 ] 

Marco Massenzio commented on MESOS-2972:


Yes, I completely agree on wanting to avoid 'boilerplate' and ad-hoc schema 
checking.

However, I'd suggest looking into something like JsonSchema for that - I would 
assume that Docker (or someone else) has already done this?
Trying to model an arbitrary JSON schema in PB is likely to be *extremely* 
difficult - if not outright impossible.



> Serialize Docker image spec as protobuf
> ---
>
> Key: MESOS-2972
> URL: https://issues.apache.org/jira/browse/MESOS-2972
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> The Docker image specification defines a schema for the metadata json that it 
> puts into each image. Currently the docker image provisioner needs to be able 
> to parse and understand this metadata json, and we should create a protobuf 
> equivalent schema so we can utilize the json to protobuf conversion to read 
> and validate the metadata.





[jira] [Commented] (MESOS-1478) Replace Master/Slave terminology

2015-09-28 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933867#comment-14933867
 ] 

Adam B commented on MESOS-1478:
---

In the new HTTP API, we are referring to the Mesos "Slave" as the "Agent", and 
we will phase the term into the rest of the Mesos code/canon as we approach the 
Mesos 1.0 release. [~benjaminhindman] has more details on the plan.
We will keep the Mesos "Masters" (one of which is the "leading Master") 
terminology, to be less disruptive to the API.

> Replace Master/Slave terminology
> 
>
> Key: MESOS-1478
> URL: https://issues.apache.org/jira/browse/MESOS-1478
> Project: Mesos
>  Issue Type: Wish
>Reporter: Clark Breyman
>Assignee: Benjamin Hindman
>Priority: Minor
>  Labels: mesosphere
>
> Inspired by the comments on this PR:
> https://github.com/django/django/pull/2692
> TL;DR - Computers sharing work should be a good thing. Using the language of 
> human bondage and suffering is inappropriate in this context. It also has the 
> potential to alienate users and community members. 





[jira] [Created] (MESOS-3535) Expose info about the container image associated with each container through an HTTP endpoint.

2015-09-28 Thread Yan Xu (JIRA)
Yan Xu created MESOS-3535:
-

 Summary: Expose info about the container image associated with 
each container through an HTTP endpoint.
 Key: MESOS-3535
 URL: https://issues.apache.org/jira/browse/MESOS-3535
 Project: Mesos
  Issue Type: Task
Reporter: Yan Xu








[jira] [Commented] (MESOS-3534) add test cases for sha256/sha512 digest verifier

2015-09-28 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933826#comment-14933826
 ] 

Gilbert Song commented on MESOS-3534:
-

https://reviews.apache.org/r/38814/

> add test cases for sha256/sha512 digest verifier
> 
>
> Key: MESOS-3534
> URL: https://issues.apache.org/jira/browse/MESOS-3534
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>
> Add test cases for the sha256/sha512 digest verifier: read from a file path 
> and verify against the corresponding digest string.





[jira] [Created] (MESOS-3534) add test cases for sha256/sha512 digest verifier

2015-09-28 Thread Gilbert Song (JIRA)
Gilbert Song created MESOS-3534:
---

 Summary: add test cases for sha256/sha512 digest verifier
 Key: MESOS-3534
 URL: https://issues.apache.org/jira/browse/MESOS-3534
 Project: Mesos
  Issue Type: Improvement
Reporter: Gilbert Song
Assignee: Gilbert Song


Add test cases for the sha256/sha512 digest verifier: read from a file path 
and verify against the corresponding digest string.
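Such a test boils down to hashing a file and comparing the hex digest; a minimal sketch (function names are illustrative, not the actual Mesos helpers):

```python
# Hash a file and compare the result against an expected hex digest string.
# Names here are illustrative, not Mesos' actual verifier interface.
import hashlib, os, tempfile

def digest_of(path, algorithm="sha256"):
    """Stream the file through hashlib and return its hex digest."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path, expected_hex, algorithm="sha256"):
    return digest_of(path, algorithm) == expected_hex

# Self-contained demo: write a known payload, then verify it.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)
print(verify(path, hashlib.sha256(b"hello").hexdigest()))  # True
print(verify(path, "0" * 128, algorithm="sha512"))         # False
os.remove(path)
```

Streaming in chunks keeps memory flat for large layers, and the same code covers sha512 by switching the `algorithm` argument.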





[jira] [Updated] (MESOS-2035) Add reason to containerizer proto Termination

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2035:
--
Sprint: Twitter Mesos Q3 Sprint 5, Twitter Mesos Q3 Sprint 6  (was: Twitter 
Mesos Q3 Sprint 5)

> Add reason to containerizer proto Termination
> -
>
> Key: MESOS-2035
> URL: https://issues.apache.org/jira/browse/MESOS-2035
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Affects Versions: 0.21.0
>Reporter: Dominic Hamon
>Assignee: Jie Yu
>  Labels: mesosphere
>
> When an isolator kills a task, the reason is unknown. As part of MESOS-1830, 
> the reason is set to a general one but ideally we would have the termination 
> reason to pass through to the status update.





[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers

2015-09-28 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-1615:
-
Story Points: 8  (was: 5)

> Create design document for Optimistic Offers
> 
>
> Key: MESOS-1615
> URL: https://issues.apache.org/jira/browse/MESOS-1615
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Dominic Hamon
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> As a first step toward Optimistic Offers, take the description from the epic 
> and build an implementation design doc that can be shared for comments.
> Note: the links to the working group notes and design doc are located in the 
> [JIRA Epic|MESOS-1607].





[jira] [Assigned] (MESOS-1607) Introduce optimistic offers.

2015-09-28 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan reassigned MESOS-1607:


Assignee: Artem Harutyunyan

> Introduce optimistic offers.
> 
>
> Key: MESOS-1607
> URL: https://issues.apache.org/jira/browse/MESOS-1607
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation, framework, master
>Reporter: Benjamin Hindman
>Assignee: Artem Harutyunyan
> Attachments: optimisitic-offers.pdf
>
>
> The current implementation of resource offers only enable a single framework 
> scheduler to make scheduling decisions for some available resources at a 
> time. In some circumstances, this is good, i.e., when we don't want other 
> framework schedulers to have access to some resources. However, in other 
> circumstances, there are advantages to letting multiple framework schedulers 
> attempt to make scheduling decisions for the _same_ allocation of resources 
> in parallel.
> If you think about this from a "concurrency control" perspective, the current 
> implementation of resource offers is _pessimistic_, the resources contained 
> within an offer are _locked_ until the framework scheduler that they were 
> offered to launches tasks with them or declines them. In addition to making 
> pessimistic offers we'd like to give out _optimistic_ offers, where the same 
> resources are offered to multiple framework schedulers at the same time, and 
> framework schedulers "compete" for those resources on a 
> first-come-first-serve basis (i.e., the first to launch a task "wins"). We've 
> always reserved the right to rescind resource offers using the 'rescind' 
> primitive in the API, and a framework scheduler should be prepared to launch 
> a task and have those tasks go lost because another framework already started 
> to use those resources.
> Introducing optimistic offers will enable more sophisticated allocation 
> algorithms. For example, we can optimistically allocate resources that are 
> reserved for a particular framework (role) but are not being used. In 
> conjunction with revocable resources (the concept that using resources not 
> reserved for you means you might get those resources revoked) we can easily 
> create a "spot" market for unused resources, driving up utilization by 
> letting frameworks that are willing to use revocable resources run tasks.
> In the limit, one could imagine always making optimistic resource offers. 
> This bears a striking resemblance with the Google Omega model (an isomorphism 
> even). However, being able to configure what resources should be allocated 
> optimistically and what resources should be allocated pessimistically gives 
> even more control to a datacenter/cluster operator that might want to, for 
> example, never let multiple frameworks (roles) compete for some set of 
> resources.
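The first-come-first-serve semantics described above can be sketched as a toy model (illustrative only, not Mesos' allocator; the class and field names are invented):

```python
# Toy model of an optimistic offer: the same resources are offered to
# several schedulers, the first to launch a task claims them, and later
# launch attempts fail (that framework's task would be reported LOST).

class OptimisticOffer:
    def __init__(self, resources):
        self.resources = resources
        self.claimed_by = None       # None until some scheduler wins

    def launch(self, framework):
        """First-come-first-serve: only the first caller succeeds."""
        if self.claimed_by is None:
            self.claimed_by = framework
            return True
        return False                  # resources already taken -> task lost

offer = OptimisticOffer({"cpus": 4, "mem": 1024})
print(offer.launch("framework-A"))   # True: A wins the race
print(offer.launch("framework-B"))   # False: B's task would be LOST
print(offer.claimed_by)              # framework-A
```

A pessimistic offer is the degenerate case where only one framework ever sees the offer, so no race can occur.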





[jira] [Updated] (MESOS-3468) Improve apply_reviews.sh script to apply chain of reviews

2015-09-28 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-3468:
-
Labels: mesosphere  (was: )

> Improve apply_reviews.sh script to apply chain of reviews
> -
>
> Key: MESOS-3468
> URL: https://issues.apache.org/jira/browse/MESOS-3468
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Vinod Kone
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> Currently the support/apply-review.sh script allows a user (typically a 
> committer) to apply a single review on top of HEAD. Since Mesos contributors 
> typically submit a chain of reviews for a given issue, it makes sense for the 
> script to apply the whole chain recursively.
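The recursive application could be sketched as follows (the `depends_on` map and review ids are hypothetical; a real script would query ReviewBoard for each review's dependencies):

```python
# Toy model of applying a review chain: apply each review's dependencies
# (its parents) before the review itself. The `depends_on` map is
# hypothetical -- a real script would fetch it from the ReviewBoard API.

def apply_chain(review_id, depends_on, applied=None):
    """Return the order in which reviews get applied, parents first."""
    if applied is None:
        applied = []
    for parent in depends_on.get(review_id, []):
        apply_chain(parent, depends_on, applied)
    if review_id not in applied:      # don't apply the same review twice
        applied.append(review_id)
    return applied

# r3 depends on r2, which depends on r1.
depends_on = {"r3": ["r2"], "r2": ["r1"]}
print(apply_chain("r3", depends_on))   # ['r1', 'r2', 'r3']
```

The committer then only names the tip review; its whole ancestry is applied in dependency order.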





[jira] [Updated] (MESOS-3519) Fix file descriptor leakage / double close in the code base

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3519:
--
Story Points: 3

> Fix file descriptor leakage / double close in the code base
> ---
>
> Key: MESOS-3519
> URL: https://issues.apache.org/jira/browse/MESOS-3519
> Project: Mesos
>  Issue Type: Bug
>Reporter: Chi Zhang
>Assignee: Chi Zhang
>






[jira] [Updated] (MESOS-3520) Add an abstraction to manage the life cycle of file descriptors.

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3520:
--
Sprint: Twitter Mesos Q3 Sprint 6

> Add an abstraction to manage the life cycle of file descriptors.
> 
>
> Key: MESOS-3520
> URL: https://issues.apache.org/jira/browse/MESOS-3520
> Project: Mesos
>  Issue Type: Improvement
>  Components: stout
>Reporter: Chi Zhang
>Assignee: Chi Zhang
>
> In order to avoid missing {{close()}} calls on file descriptors, or 
> double-closing file descriptors, it would be nice to add a reference counted 
> {{FileDescriptor}} in a similar way to what we've done for Socket. This will 
> be closed automatically when the last reference goes away, and double closes 
> can be prevented via internal state.
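The idea reads roughly like the following sketch (the real abstraction described above would live in stout, in C++; this Python version with invented names just illustrates the lifecycle):

```python
# Sketch of a reference-counted file-descriptor wrapper: the underlying fd
# is closed exactly once, when the last reference is released, and a
# `closed` flag guards against double closes.
import os

class FileDescriptor:
    def __init__(self, fd):
        self._fd = fd
        self._refs = 1
        self._closed = False

    def ref(self):
        self._refs += 1
        return self

    def unref(self):
        self._refs -= 1
        if self._refs == 0 and not self._closed:
            os.close(self._fd)        # closed once, on the last release
            self._closed = True

    @property
    def closed(self):
        return self._closed

r, w = os.pipe()
fd = FileDescriptor(r)
alias = fd.ref()            # second reference to the same descriptor
fd.unref()
print(fd.closed)            # False: one reference still alive
alias.unref()
print(fd.closed)            # True: last reference released -> closed
os.close(w)
```

In C++ the ref/unref pair would of course be handled by a shared-pointer-style destructor rather than explicit calls.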





[jira] [Updated] (MESOS-3468) Improve apply_reviews.sh script to apply chain of reviews

2015-09-28 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-3468:
-
Story Points: 5

> Improve apply_reviews.sh script to apply chain of reviews
> -
>
> Key: MESOS-3468
> URL: https://issues.apache.org/jira/browse/MESOS-3468
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Vinod Kone
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> Currently the support/apply-review.sh script allows a user (typically a 
> committer) to apply a single review on top of HEAD. Since Mesos contributors 
> typically submit a chain of reviews for a given issue, it makes sense for the 
> script to apply the whole chain recursively.





[jira] [Updated] (MESOS-3519) Fix file descriptor leakage / double close in the code base

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3519:
--
Sprint: Twitter Mesos Q3 Sprint 6

> Fix file descriptor leakage / double close in the code base
> ---
>
> Key: MESOS-3519
> URL: https://issues.apache.org/jira/browse/MESOS-3519
> Project: Mesos
>  Issue Type: Bug
>Reporter: Chi Zhang
>Assignee: Chi Zhang
>






[jira] [Commented] (MESOS-3494) Add Test for Docker RemotePuller

2015-09-28 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933717#comment-14933717
 ] 

Gilbert Song commented on MESOS-3494:
-

https://reviews.apache.org/r/38816/

> Add Test for Docker RemotePuller
> 
>
> Key: MESOS-3494
> URL: https://issues.apache.org/jira/browse/MESOS-3494
> Project: Mesos
>  Issue Type: Task
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>
> Add unit test for Docker RemotePuller implementation. 





[jira] [Commented] (MESOS-3533) Unable to find and run URIs files

2015-09-28 Thread Rafael Capucho (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933713#comment-14933713
 ] 

Rafael Capucho commented on MESOS-3533:
---

Docker Version: 1.8.2

> Unable to find and run URIs files
> -
>
> Key: MESOS-3533
> URL: https://issues.apache.org/jira/browse/MESOS-3533
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, general
>Affects Versions: 0.25.0
> Environment: Linux li202-122 4.1.5-x86_64-linode61 #7 SMP Mon Aug 24 
> 13:46:31 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
> Ubuntu 14.04.1 LTS
> Docker Version: 1.8.2
> Docker API version: 1.20
> Go version: go1.4.2
>Reporter: Rafael Capucho
>Priority: Blocker
>
> Hello,
> Deploying a Docker container using Marathon 0.11 with the following structure 
> (just an example; I tried some variations with the same result):
> {
>   "id": "testando-flask",
>   "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py",
>   "cpus": 0.5,
>   "mem": 20.0,
>   "container": {
> "type": "DOCKER",
> "docker": {
>   "image": "therealwardo/python-2.7-pip",
>   "network": "BRIDGE",
>   "privileged": true,
>   "portMappings": [
> { "containerPort": 31177, "hostPort": 0 }
>   ]
> }
>   },
>   "uris": [
> "http://blog.rafaelcapucho.com/app.zip";
>   ]
> }
> curl -X POST http://173.255.192.XXX:8080/v2/apps -d @flask.json -H 
> "Content-type: application/json"
> The task reaches the Mesos master properly but it fails. When I execute the 
> same structure without uris and with a simple "python -m SimpleHTTPServer" it 
> works! The Docker container is created and running.
> Analyzing the sandbox in the Mesos UI I can see that the URI files are 
> downloaded correctly (the project and the requirements.txt). In stdout I got: 
> Archive:  
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.zip
>   inflating: 
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.py
>   
>  extracting: 
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/requeriments.txt
>   
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --stop_timeout="0ns"
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --stop_timeout="0ns"
> Registered docker executor on li202-122.members.linode.com
> Starting task testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb
> Could not open requirements file: [Errno 2] No such file or directory: 
> 'requeriments.txt'
> Storing complete log in /root/.pip/pip.log
> total 68
> drwxr-xr-x   2 root root  4096 Jan 15  2015 bin
> drwxr-xr-x   2 root root  4096 Apr 19  2012 boot
> drwxr-xr-x  10 root root 13740 Sep 28 12:44 dev
> drwxr-xr-x  46 root root  4096 Sep 28 12:44 etc
> drwxr-xr-x   2 root root  4096 Apr 19  2012 home
> drwxr-xr-x  11 root root  4096 Jan 15  2015 lib
> drwxr-xr-x   2 root root  4096 Jan 15  2015 lib64
> drwxr-xr-x   2 root root  4096 Jan 15  2015 media
> drwxr-xr-x   3 root root  4096 Sep 28 12:44 mnt
> drwxr-xr-x   2 root root  4096 Jan 15  2015 opt
> dr-xr-xr-x 170 root root 0 Sep 28 12:44 proc
> drwx--   3 root root  4096 Sep 28 12:44 root
> drwxr-xr-x   5 root root  4096 Jan 15  2015 run
> drwxr-xr-x   2 root root  4096 Jan 16  2015 sbin
> drwxr-xr-x   2 root root  4096 Mar  5  2012 selinux
> drwxr-xr-x   2 root root  4096 Jan 15  2015 srv
> dr-xr-xr-x  13 root root 0 

[jira] [Updated] (MESOS-3399) Rewrite perf events code

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3399:
--
Sprint: Twitter Mesos Q3 Sprint 5, Twitter Mesos Q3 Sprint 6  (was: Twitter 
Mesos Q3 Sprint 5)

> Rewrite perf events code
> 
>
> Key: MESOS-3399
> URL: https://issues.apache.org/jira/browse/MESOS-3399
> Project: Mesos
>  Issue Type: Task
>Reporter: Cong Wang
>Assignee: Cong Wang
>Priority: Minor
>  Labels: twitter
>
> Our current code base invokes and parses `perf stat`, which sucks, because 
> cmdline output is not a stable ABI at all, it can break our code at any time, 
> for example MESOS-2834.
> We should use the stable API perf_event_open(2). With this patch 
> https://reviews.apache.org/r/37540/, we already have the infrastructure for 
> the implementation, so it should not be hard to rewrite all the perf events 
> code.





[jira] [Updated] (MESOS-3365) Export per container SNMP statistics

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3365:
--
Sprint: Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5, Twitter Mesos 
Q3 Sprint 6  (was: Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5)

> Export per container SNMP statistics
> 
>
> Key: MESOS-3365
> URL: https://issues.apache.org/jira/browse/MESOS-3365
> Project: Mesos
>  Issue Type: Task
>Reporter: Cong Wang
>Assignee: Cong Wang
>Priority: Minor
>  Labels: twitter
>
> We need to export the per container SNMP statistics too, from its 
> /proc/net/snmp.





[jira] [Updated] (MESOS-2769) Metric for cpu scheduling latency from all components

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2769:
--
Sprint: Twitter Q2 Sprint 3, Twitter Mesos Q3 Sprint 3, Twitter Mesos Q3 
Sprint 4, Twitter Mesos Q3 Sprint 5, Twitter Mesos Q3 Sprint 6  (was: Twitter 
Q2 Sprint 3, Twitter Mesos Q3 Sprint 3, Twitter Mesos Q3 Sprint 4, Twitter 
Mesos Q3 Sprint 5)

> Metric for cpu scheduling latency from all components
> -
>
> Key: MESOS-2769
> URL: https://issues.apache.org/jira/browse/MESOS-2769
> Project: Mesos
>  Issue Type: Improvement
>  Components: isolation
>Affects Versions: 0.22.1
>Reporter: Ian Downes
>Assignee: Cong Wang
>  Labels: twitter
>
> The metric will provide statistics on the scheduling latency for 
> processes/threads in a container, i.e., statistics on the delay before 
> application code can run. This will be the aggregate effect of the normal 
> scheduling period, contention from other threads/processes, both in the 
> container and on the system, and any effects from the CFS bandwidth control 
> (if enabled) or other CPU isolation strategies.





[jira] [Updated] (MESOS-3332) Support HTTP Pipelining in libprocess (http::post)

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3332:
--
Sprint: Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5, Twitter Mesos 
Q3 Sprint 6  (was: Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5)

> Support HTTP Pipelining in libprocess (http::post)
> --
>
> Key: MESOS-3332
> URL: https://issues.apache.org/jira/browse/MESOS-3332
> Project: Mesos
>  Issue Type: Task
>  Components: libprocess
>Reporter: Anand Mazumdar
>Assignee: Benjamin Mahler
>  Labels: twitter
>
> Currently, {{http::post}} in libprocess does not support HTTP pipelining. 
> Each call as of now sends the {{Connection: close}} header, thereby 
> signaling the server to close the TCP socket after the response.
> We either need to create a new interface for supporting HTTP pipelining, or 
> modify the existing {{http::post}} to do so.
> This is needed for the Scheduler/Executor library implementations to make 
> sure "Calls" are sent in order to the master. Currently, in order to do so, 
> we send the next request only after we have received a response for the 
> earlier call, which results in degraded performance.
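At the wire level, pipelining amounts to writing requests back-to-back on one persistent connection; a minimal sketch (the scheduler path and headers are assumptions for illustration, not libprocess's actual API):

```python
# Sketch of what pipelining means at the wire level: several HTTP/1.1
# requests written back-to-back on one connection, without waiting for
# responses in between. No server here -- we just build the byte stream.

def request(method, path, host, body=b""):
    head = ("%s %s HTTP/1.1\r\n"
            "Host: %s\r\n"
            "Content-Length: %d\r\n"
            "\r\n") % (method, path, host, len(body))
    return head.encode("ascii") + body

# Two "Calls" queued in order on the same connection; HTTP/1.1 requires
# responses to come back in the same order, which preserves Call ordering.
stream = (request("POST", "/api/v1/scheduler", "master", b'{"type":"A"}') +
          request("POST", "/api/v1/scheduler", "master", b'{"type":"B"}'))

print(stream.count(b"POST /api/v1/scheduler"))   # 2: both requests queued
print(b"Connection: close" in stream)            # False: socket stays open
```

The absence of {{Connection: close}} is the key difference from the current behavior: the sender no longer has to wait for each response before issuing the next request.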





[jira] [Updated] (MESOS-3399) Rewrite perf events code

2015-09-28 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3399:
--
Story Points: 5

> Rewrite perf events code
> 
>
> Key: MESOS-3399
> URL: https://issues.apache.org/jira/browse/MESOS-3399
> Project: Mesos
>  Issue Type: Task
>Reporter: Cong Wang
>Assignee: Cong Wang
>Priority: Minor
>  Labels: twitter
>
> Our current code base invokes and parses `perf stat`, which sucks, because 
> cmdline output is not a stable ABI at all, it can break our code at any time, 
> for example MESOS-2834.
> We should use the stable API perf_event_open(2). With this patch 
> https://reviews.apache.org/r/37540/, we already have the infrastructure for 
> the implementation, so it should not be hard to rewrite all the perf events 
> code.





[jira] [Commented] (MESOS-3421) Support sharing persistent volumes across task instances

2015-09-28 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933692#comment-14933692
 ] 

Jie Yu commented on MESOS-3421:
---

Glad to see the discussion here! I agree with [~adam-mesos] that the concept of 
"shared" resources should be applicable to other types of resources as well. 
Here is a list of issues I can think of that we need to address:

1) Ownership? For instance, for a disk resource, which executor/task has write 
permission (i.e., 1 writer + multiple readers)?
2) What if the limit of the resource has been reached? Do we kill all tasks 
using it, or just the owner?
3) Reference counting? Do we need to track how many tasks/executors are still 
using the resource so that it cannot be released?
4) Permissions (e.g., group/owner)? This is specific to disk resources.

We definitely need to change the allocator accordingly so that 'shared' 
resources can be allocated to multiple frameworks concurrently.

> Support sharing persistent volumes across task instances
> 
>
> Key: MESOS-3421
> URL: https://issues.apache.org/jira/browse/MESOS-3421
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.23.0
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>
> A service that needs persistent volume needs to have access to the same 
> persistent volume (RW) from multiple task(s) instances on the same agent 
> node. Currently, a persistent volume once offered to the framework(s) can be 
> scheduled to a task and until that tasks terminates, that persistent volume 
> cannot be used by another task.
> Explore providing the capability of sharing persistent volumes across task 
> instances scheduled on a single agent node.





[jira] [Commented] (MESOS-3391) Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution

2015-09-28 Thread Chris Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933690#comment-14933690
 ] 

Chris Chen commented on MESOS-3391:
---

This will soon break in our production environment. This is an interesting 
point to make, since ZK 3.4.6 has been out for a little while and Mesos is 
still pinned to 3.4.5.

> Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
> --
>
> Key: MESOS-3391
> URL: https://issues.apache.org/jira/browse/MESOS-3391
> Project: Mesos
>  Issue Type: Bug
>  Components: general
> Environment: Linux, OS X
>Reporter: Chris Chen
>Assignee: Chris Chen
>
> The Zookeeper C client makes certain assertions about the ordering of 
> ping packets that the Java client does not. An alternate implementation of 
> the Zookeeper server would then break the C client while working correctly 
> with the Java client.
> A patch has been submitted to the Zookeeper project under ZOOKEEPER-2253. 
> This adds that patch to mesos 3rdparty.





[jira] [Commented] (MESOS-3391) Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution

2015-09-28 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933679#comment-14933679
 ] 

Neil Conway commented on MESOS-3391:


Is there a reason we need to apply this patch now, rather than just updating to 
the next upstream ZK release, which includes the change?

> Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
> --
>
> Key: MESOS-3391
> URL: https://issues.apache.org/jira/browse/MESOS-3391
> Project: Mesos
>  Issue Type: Bug
>  Components: general
> Environment: Linux, OS X
>Reporter: Chris Chen
>Assignee: Chris Chen
>
> The Zookeeper C client makes certain assertions about the ordering of 
> ping packets that the Java client does not. An alternate implementation of 
> the Zookeeper server would then break the C client while working correctly 
> with the Java client.
> A patch has been submitted to the Zookeeper project under ZOOKEEPER-2253. 
> This adds that patch to mesos 3rdparty.





[jira] [Commented] (MESOS-3533) Unable to find and run URIs files

2015-09-28 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933675#comment-14933675
 ] 

haosdent commented on MESOS-3533:
-

What Docker version are you using?

> Unable to find and run URIs files
> -
>
> Key: MESOS-3533
> URL: https://issues.apache.org/jira/browse/MESOS-3533
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, general
>Affects Versions: 0.25.0
> Environment: Linux li202-122 4.1.5-x86_64-linode61 #7 SMP Mon Aug 24 
> 13:46:31 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
> Ubuntu 14.04.1 LTS
> Docker Version: 1.8.2
> Docker API version: 1.20
> Go version: go1.4.2
>Reporter: Rafael Capucho
>Priority: Blocker
>
> Hello,
> Deploying a Docker container using Marathon 0.11 with the following structure 
> (just an example; I tried some variations with the same result):
> {
>   "id": "testando-flask",
>   "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py",
>   "cpus": 0.5,
>   "mem": 20.0,
>   "container": {
> "type": "DOCKER",
> "docker": {
>   "image": "therealwardo/python-2.7-pip",
>   "network": "BRIDGE",
>   "privileged": true,
>   "portMappings": [
> { "containerPort": 31177, "hostPort": 0 }
>   ]
> }
>   },
>   "uris": [
> "http://blog.rafaelcapucho.com/app.zip";
>   ]
> }
> curl -X POST http://173.255.192.XXX:8080/v2/apps -d @flask.json -H 
> "Content-type: application/json"
> The task reaches the Mesos master properly but it fails. When I execute the 
> same structure without uris and with a simple "python -m SimpleHTTPServer" it 
> works! The Docker container is created and running.
> Analyzing the sandbox in the Mesos UI I can see that the URI files are 
> downloaded correctly (the project and the requirements.txt). In stdout I got: 
> Archive:  
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.zip
>   inflating: 
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.py
>   
>  extracting: 
> /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/requeriments.txt
>   
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --stop_timeout="0ns"
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff"
>  --stop_timeout="0ns"
> Registered docker executor on li202-122.members.linode.com
> Starting task testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb
> Could not open requirements file: [Errno 2] No such file or directory: 
> 'requeriments.txt'
> Storing complete log in /root/.pip/pip.log
> total 68
> drwxr-xr-x   2 root root  4096 Jan 15  2015 bin
> drwxr-xr-x   2 root root  4096 Apr 19  2012 boot
> drwxr-xr-x  10 root root 13740 Sep 28 12:44 dev
> drwxr-xr-x  46 root root  4096 Sep 28 12:44 etc
> drwxr-xr-x   2 root root  4096 Apr 19  2012 home
> drwxr-xr-x  11 root root  4096 Jan 15  2015 lib
> drwxr-xr-x   2 root root  4096 Jan 15  2015 lib64
> drwxr-xr-x   2 root root  4096 Jan 15  2015 media
> drwxr-xr-x   3 root root  4096 Sep 28 12:44 mnt
> drwxr-xr-x   2 root root  4096 Jan 15  2015 opt
> dr-xr-xr-x 170 root root 0 Sep 28 12:44 proc
> drwx--   3 root root  4096 Sep 28 12:44 root
> drwxr-xr-x   5 root root  4096 Jan 15  2015 run
> drwxr-xr-x   2 root root  4096 Jan 16  2015 sbin
> drwxr-xr-x   2 root root  4096 Mar  5  2012 selinux
> drwxr-xr-x   2 root root  4096 Jan 15  2015 srv
> dr-xr-xr-x  13 root root 

[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers

2015-09-28 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-1615:
-
Description: 
As a first step toward Optimistic Offers, take the description from the epic 
and build an implementation design doc that can be shared for comments.

Note: the links to the working group notes and design doc are located in the 
[JIRA Epic|MESOS-1607].

  was:As a first step toward Optimistic Offers, take the description from the 
epic and build an implementation design doc that can be shared for comments.


> Create design document for Optimistic Offers
> 
>
> Key: MESOS-1615
> URL: https://issues.apache.org/jira/browse/MESOS-1615
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Dominic Hamon
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> As a first step toward Optimistic Offers, take the description from the epic 
> and build an implementation design doc that can be shared for comments.
> Note: the links to the working group notes and design doc are located in the 
> [JIRA Epic|MESOS-1607].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers

2015-09-28 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-1615:
-
Story Points: 5

> Create design document for Optimistic Offers
> 
>
> Key: MESOS-1615
> URL: https://issues.apache.org/jira/browse/MESOS-1615
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Dominic Hamon
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> As a first step toward Optimistic Offers, take the description from the epic 
> and build an implementation design doc that can be shared for comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers

2015-09-28 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-1615:
-
Labels: mesosphere  (was: )

> Create design document for Optimistic Offers
> 
>
> Key: MESOS-1615
> URL: https://issues.apache.org/jira/browse/MESOS-1615
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Dominic Hamon
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> As a first step toward Optimistic Offers, take the description from the epic 
> and build an implementation design doc that can be shared for comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-3391) Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution

2015-09-28 Thread Chris Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Chen reassigned MESOS-3391:
-

Assignee: Chris Chen

> Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
> --
>
> Key: MESOS-3391
> URL: https://issues.apache.org/jira/browse/MESOS-3391
> Project: Mesos
>  Issue Type: Bug
>  Components: general
> Environment: Linux, OS X
>Reporter: Chris Chen
>Assignee: Chris Chen
>
> The Zookeeper C client makes certain assertions about the ordering of 
> ping packets that the Java client does not. An alternate implementation of 
> the Zookeeper server can therefore break the C client while working 
> correctly with the Java client.
> A patch has been submitted to the Zookeeper project under ZOOKEEPER-2253. 
> This ticket adds that patch to the Mesos 3rdparty bundle.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

