[jira] [Commented] (MESOS-2972) Serialize Docker image spec as protobuf
[ https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934629#comment-14934629 ] Marco Massenzio commented on MESOS-2972:

(We discussed this in person - recording here for future reference.) It looks as if the structure of the Docker image spec should be simple enough and amenable to being "represented" in PB format without any conversion/adapter functionality, just using Mesos' {{JSON::protobuf}} functionality. I am fine with trying this out, provided the PB structure that comes out is not too gnarly.

> Serialize Docker image spec as protobuf
> ---------------------------------------
> Key: MESOS-2972
> URL: https://issues.apache.org/jira/browse/MESOS-2972
> Project: Mesos
> Issue Type: Improvement
> Reporter: Timothy Chen
> Assignee: Gilbert Song
> Labels: mesosphere
>
> The Docker image specification defines a schema for the metadata JSON that it
> puts into each image. Currently the Docker image provisioner needs to be able
> to parse and understand this metadata JSON, and we should create a protobuf
> equivalent schema so we can utilize the JSON-to-protobuf conversion to read
> and validate the metadata.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
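As a rough illustration of the JSON-to-typed-schema idea discussed above (this is not Mesos' actual C++ {{JSON::protobuf}} helper, and not the real Docker image-spec schema -- the type and field names below are invented for the sketch), parsing image metadata JSON into a validated typed structure looks like:

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical, trimmed-down model of a Docker image config message.
@dataclass
class DockerImageConfig:
    entrypoint: Optional[List[str]] = None
    env: List[str] = field(default_factory=list)

def parse_config(blob: str) -> DockerImageConfig:
    """Parse and validate image-metadata JSON into a typed structure,
    analogous to converting a JSON object into a protobuf message."""
    data = json.loads(blob)
    cfg = data.get("config", {})
    entrypoint = cfg.get("Entrypoint")
    # Validation happens at conversion time, as it would with protobuf.
    if entrypoint is not None and not all(isinstance(e, str) for e in entrypoint):
        raise ValueError("Entrypoint must be a list of strings")
    return DockerImageConfig(entrypoint=entrypoint, env=cfg.get("Env") or [])

cfg = parse_config(
    '{"config": {"Entrypoint": ["/bin/sh"], "Env": ["PATH=/usr/bin"]}}')
```

The appeal of the protobuf route is exactly this: the schema declaration does the parsing and validation, so the provisioner does not hand-walk the JSON.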
[jira] [Commented] (MESOS-3391) Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
[ https://issues.apache.org/jira/browse/MESOS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934581#comment-14934581 ] James Peach commented on MESOS-3391:

If Mesos starts to depend on this behavior, what happens to people who are using unbundled ZK libraries?

> Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
> --------------------------------------------------------------------------
> Key: MESOS-3391
> URL: https://issues.apache.org/jira/browse/MESOS-3391
> Project: Mesos
> Issue Type: Bug
> Components: general
> Environment: Linux, OS X
> Reporter: Chris Chen
> Assignee: Chris Chen
>
> The Zookeeper C client makes certain assertions about the ordering of
> ping packets that the Java client does not. An alternate implementation of
> the Zookeeper server would then break the C client while working correctly
> with the Java client.
> A patch has been submitted to the Zookeeper project under ZOOKEEPER-2253.
> This adds that patch to the Mesos 3rdparty directory.
[jira] [Commented] (MESOS-3533) Unable to find and run URIs files
[ https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934451#comment-14934451 ] Rafael Capucho commented on MESOS-3533:

Of course, thank you...

root@li202-122:/# cat /proc/self/mountinfo
193 63 251:4 /rootfs / rw,relatime - ext4 /dev/mapper/docker-8:0-57389-e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb rw,stripe=16,data=ordered
194 193 0:57 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
195 193 0:58 / /dev rw,nosuid - tmpfs tmpfs rw,mode=755
196 195 0:59 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=666
197 195 0:60 / /dev/shm rw,nosuid,nodev,noexec,relatime - tmpfs shm rw,size=65536k
198 195 0:56 / /dev/mqueue rw,nosuid,nodev,noexec,relatime - mqueue mqueue rw
199 193 0:61 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs rw
200 199 0:24 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup systemd rw,name=systemd
201 199 0:25 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
202 199 0:26 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
203 199 0:27 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
204 199 0:28 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
205 199 0:29 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,memory
206 199 0:30 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
207 199 0:31 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
208 199 0:32 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
209 199 0:33 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/perf_event rw,relatime - cgroup cgroup rw,perf_event
210 199 0:34 /docker/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb /sys/fs/cgroup/net_prio rw,relatime - cgroup cgroup rw,net_prio
211 199 0:35 / /sys/fs/cgroup/debug rw,relatime - cgroup cgroup rw,debug
212 193 0:16 / /sys rw,relatime - sysfs sysfs rw
213 212 0:18 / /sys/fs/cgroup rw,relatime - tmpfs none rw,size=4k,mode=755
214 213 0:24 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup systemd rw,name=systemd
215 213 0:25 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
216 213 0:26 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
217 213 0:27 / /sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
218 213 0:28 / /sys/fs/cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
219 213 0:29 / /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,memory
220 213 0:30 / /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
221 213 0:31 / /sys/fs/cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
222 213 0:32 / /sys/fs/cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
223 213 0:33 / /sys/fs/cgroup/perf_event rw,relatime - cgroup cgroup rw,perf_event
224 213 0:34 / /sys/fs/cgroup/net_prio rw,relatime - cgroup cgroup rw,net_prio
225 213 0:35 / /sys/fs/cgroup/debug rw,relatime - cgroup cgroup rw,debug
226 212 0:19 / /sys/fs/fuse/connections rw,relatime - fusectl none rw
227 212 0:8 / /sys/kernel/debug rw,relatime - debugfs none rw
228 193 8:0 /lib/x86_64-linux-gnu/libudev.so.1.3.5 /lib/libudev.so.1 ro,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
229 193 8:0 /usr/bin/docker /bin/docker rw,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
230 193 8:0 /lib/x86_64-linux-gnu/libpthread-2.19.so /lib/libpthread.so.0 ro,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
231 193 8:0 /usr/lib/x86_64-linux-gnu/libsqlite3.so.0.8.6 /lib/libsqlite3.so.0 ro,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
258 193 0:20 /docker.sock /run/docker.sock rw,nosuid,noexec,relatime - tmpfs none rw,size=204708k,mode=755
259 193 8:0 /lib/x86_64-linux-gnu/libdevmapper.so.1.02.1 /usr/lib/libdevmapper.so.1.02 ro,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
260 193 8:0 /var/lib/docker/containers/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb/resolv.conf /etc/resolv.conf rw,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
261 193 8:0 /var/lib/docker/containers/e446f2354b5e9cd7d20d5bfbf56b640a2090558c95caa3bd81a9debcb7756fcb/hostname /etc/hostname rw,noatime - ext4 /dev/root rw,errors=remount-ro,data=ordered
262 193 8:0 /var/lib/
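For readers unfamiliar with the dump above: each /proc/self/mountinfo line follows the fixed format described in proc(5) -- mount ID, parent ID, major:minor device, root, mount point, mount options, optional fields terminated by a lone "-", then filesystem type, source, and super-block options. A minimal stdlib-only sketch of splitting one entry (illustrative only; Mesos' own parser is C++ and handles more edge cases, such as escaped paths):

```python
def parse_mountinfo_line(line: str) -> dict:
    # The " - " separator divides the optional-fields part from the
    # filesystem part (see proc(5)).
    head, _, tail = line.partition(" - ")
    fields = head.split()
    fstype, source, super_opts = tail.split()[:3]
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "major_minor": fields[2],
        "root": fields[3],
        "mount_point": fields[4],
        "mount_options": fields[5],
        "optional_fields": fields[6:],   # e.g. shared:N peer-group tags
        "fstype": fstype,
        "source": source,
        "super_options": super_opts,
    }

entry = parse_mountinfo_line(
    "194 193 0:57 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw")
```

The parent-ID column is what lets tools (and the Mesos containerizer) reconstruct the mount tree, e.g. entries 201-210 above are all children of mount 199 (/sys/fs/cgroup).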
[jira] [Comment Edited] (MESOS-3533) Unable to find and run URIs files
[ https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934451#comment-14934451 ] Rafael Capucho edited comment on MESOS-3533 at 9/29/15 1:15 AM:

Of course, from the mesos slave I guess. It follows, thank you... (the /proc/self/mountinfo output is identical to the previous comment)
[jira] [Updated] (MESOS-3540) Libevent termination triggers Broken Pipe
[ https://issues.apache.org/jira/browse/MESOS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-3540:
Sprint: Mesosphere Sprint 20

> Libevent termination triggers Broken Pipe
> -----------------------------------------
> Key: MESOS-3540
> URL: https://issues.apache.org/jira/browse/MESOS-3540
> Project: Mesos
> Issue Type: Bug
> Components: libprocess
> Reporter: Joris Van Remoortere
> Assignee: Joris Van Remoortere
> Labels: libevent, libprocess, mesosphere
>
> When the libevent loop terminates and we unblock the {{SIGPIPE}} signal, the
> pending {{SIGPIPE}} instantly triggers and causes a broken pipe when the test
> binary stops running.
> {code}
> Program received signal SIGPIPE, Broken pipe.
> [Switching to Thread 0x718b4700 (LWP 16270)]
> pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
> 53  ../sysdeps/unix/sysv/linux/pthread_sigmask.c: No such file or directory.
> (gdb) bt
> #0  pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53
> #1  0x006fd9a4 in unblock () at ../../../3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/signals.hpp:90
> #2  0x007d7915 in run () at ../../../3rdparty/libprocess/src/libevent.cpp:125
> #3  0x007950cb in _M_invoke<>(void) () at /usr/include/c++/4.9/functional:1700
> #4  0x00795000 in operator() () at /usr/include/c++/4.9/functional:1688
> #5  0x00794f6e in _M_run () at /usr/include/c++/4.9/thread:115
> #6  0x7668de30 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #7  0x779a16aa in start_thread (arg=0x718b4700) at pthread_create.c:333
> #8  0x75df1eed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> {code}
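The mechanism behind this crash is standard POSIX behavior: a signal that arrives while blocked stays pending, and is delivered the moment the mask is cleared. The same behavior can be observed from Python's signal.pthread_sigmask (using SIGUSR1 with a handler here rather than SIGPIPE, since CPython installs its own SIGPIPE disposition); a POSIX-only sketch:

```python
import os
import signal

received = []
signal.signal(signal.SIGUSR1, lambda signum, frame: received.append(signum))

# Block SIGUSR1, then raise it: the signal is queued as "pending",
# not delivered -- like a SIGPIPE raised while the libevent loop runs.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
os.kill(os.getpid(), signal.SIGUSR1)
assert received == []
assert signal.SIGUSR1 in signal.sigpending()

# Unblocking delivers the pending signal immediately -- the moment
# MESOS-3540 describes, where the pending SIGPIPE fires on unblock.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
assert received == [signal.SIGUSR1]
```

With the default SIGPIPE disposition (terminate) and no handler installed, that delivery-on-unblock is exactly what kills the test binary in the backtrace above.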
[jira] [Assigned] (MESOS-3540) Libevent termination triggers Broken Pipe
[ https://issues.apache.org/jira/browse/MESOS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere reassigned MESOS-3540:
Assignee: Joris Van Remoortere
[jira] [Updated] (MESOS-3540) Libevent termination triggers Broken Pipe
[ https://issues.apache.org/jira/browse/MESOS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-3540:
Story Points: 2
[jira] [Commented] (MESOS-3533) Unable to find and run URIs files
[ https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934437#comment-14934437 ] haosdent commented on MESOS-3533:

Could you cat /proc/self/mountinfo and show the result here?

> Unable to find and run URIs files
> ---------------------------------
> Key: MESOS-3533
> URL: https://issues.apache.org/jira/browse/MESOS-3533
> Project: Mesos
> Issue Type: Bug
> Components: fetcher, general
> Affects Versions: 0.25.0
> Environment: Linux li202-122 4.1.5-x86_64-linode61 #7 SMP Mon Aug 24 13:46:31 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
> Ubuntu 14.04.1 LTS
> Docker Version: 1.8.2
> Docker API version: 1.20
> Go version: go1.4.2
> Reporter: Rafael Capucho
> Priority: Blocker
>
> Hello,
> Deploying a docker container using marathon 0.11 with the following structure (just an example; I have tried some variations with the same result):
> {
>   "id": "testando-flask",
>   "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py",
>   "cpus": 0.5,
>   "mem": 20.0,
>   "container": {
>     "type": "DOCKER",
>     "docker": {
>       "image": "therealwardo/python-2.7-pip",
>       "network": "BRIDGE",
>       "privileged": true,
>       "portMappings": [
>         { "containerPort": 31177, "hostPort": 0 }
>       ]
>     }
>   },
>   "uris": [
>     "http://blog.rafaelcapucho.com/app.zip"
>   ]
> }
>
> curl -X POST http://173.255.192.XXX:8080/v2/apps -d @flask.json -H "Content-type: application/json"
>
> The task reaches the mesos master properly, but it fails. When I execute the same structure without uris and with a simple "python -m SimpleHTTPServer" it works! The docker container is created and running.
> Analyzing the sandbox in the Mesos UI I can see that the URI files are downloaded correctly (the project and the requeriments.txt). In stdout I got:
> Archive: /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.zip
>   inflating: /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.py
>  extracting: /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/requeriments.txt
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff" --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff" --stop_timeout="0ns"
> --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff" --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff" --stop_timeout="0ns"
> Registered docker executor on li202-122.members.linode.com
> Starting task testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb
> Could not open requirements file: [Errno 2] No such file or directory: 'requeriments.txt'
> Storing complete log in /root/.pip/pip.log
> total 68
> drwxr-xr-x   2 root root  4096 Jan 15  2015 bin
> drwxr-xr-x   2 root root  4096 Apr 19  2012 boot
> drwxr-xr-x  10 root root 13740 Sep 28 12:44 dev
> drwxr-xr-x  46 root root  4096 Sep 28 12:44 etc
> drwxr-xr-x   2 root root  4096 Apr 19  2012 home
> drwxr-xr-x  11 root root  4096 Jan 15  2015 lib
> drwxr-xr-x   2 root root  4096 Jan 15  2015 lib64
> drwxr-xr-x   2 root root  4096 Jan 15  2015 media
> drwxr-xr-x   3 root root  4096 Sep 28 12:44 mnt
> drwxr-xr-x   2 root root  4096 Jan 15  2015 opt
> dr-xr-xr-x 170 root root     0 Sep 28 12:44 proc
> drwx--   3 root root  4096 Sep 28 12:44 root
> drwxr-xr-x   5 root root  4096 Jan 15  2015 run
> drwxr-xr-x   2 root root  4096 Jan 16  2015 sbin
> drwxr-xr-x   2 root root  4096 Mar  5  2012 selinux
> drwxr-xr-x   2 root root  4096 Jan 15  2015 srv
> dr-x
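The curl invocation above posts the JSON app definition to Marathon's /v2/apps endpoint. The same payload can be assembled with Python's standard library alone; a sketch (the "marathon:8080" host below is a placeholder, and the POST lines are left commented out since they need a running Marathon):

```python
import json
# from urllib.request import Request, urlopen   # to actually POST

app = {
    "id": "testando-flask",
    "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py",
    "cpus": 0.5,
    "mem": 20.0,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "therealwardo/python-2.7-pip",
            "network": "BRIDGE",
            "privileged": True,
            "portMappings": [{"containerPort": 31177, "hostPort": 0}],
        },
    },
    # Files listed in "uris" are fetched into the task sandbox, which the
    # Docker containerizer maps into the container at /mnt/mesos/sandbox
    # (per the --mapped_directory flag in the executor output above).
    "uris": ["http://blog.rafaelcapucho.com/app.zip"],
}

payload = json.dumps(app).encode()
# req = Request("http://marathon:8080/v2/apps", data=payload,
#               headers={"Content-Type": "application/json"})
# urlopen(req)
```

Note the fetched files land in the sandbox directory, not in the container's default working directory, which is consistent with pip failing to find 'requeriments.txt' when cmd runs elsewhere.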
[jira] [Commented] (MESOS-1806) Substituting etcd or ReplicatedLog for Zookeeper
[ https://issues.apache.org/jira/browse/MESOS-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934435#comment-14934435 ] Shuai Lin commented on MESOS-1806:

Check out the etcd branch, build it, and run:
{code}
export MESOS_SOURCE_DIR=/path/to/mesos/
export MESOS_BUILD_DIR=/path/to/mesos/build
cd $MESOS_BUILD_DIR
$MESOS_SOURCE_DIR/src/tests/etcd_test.sh
{code}

> Substituting etcd or ReplicatedLog for Zookeeper
> ------------------------------------------------
> Key: MESOS-1806
> URL: https://issues.apache.org/jira/browse/MESOS-1806
> Project: Mesos
> Issue Type: Task
> Reporter: Ed Ropple
> Assignee: Shuai Lin
> Priority: Minor
>
> eropple: Could you also file a new JIRA for Mesos to drop ZK in favor of etcd or ReplicatedLog? Would love to get some momentum going on that one.
> --
> Consider it filed. =)
[jira] [Commented] (MESOS-3123) DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails & crashes
[ https://issues.apache.org/jira/browse/MESOS-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934429#comment-14934429 ] haosdent commented on MESOS-3123:

Do you know which IP the slave starts on in this test? If the slave starts on 127.0.0.1, it would have this problem.

> DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails & crashes
> ---------------------------------------------------------------------------
> Key: MESOS-3123
> URL: https://issues.apache.org/jira/browse/MESOS-3123
> Project: Mesos
> Issue Type: Bug
> Components: docker, test
> Affects Versions: 0.23.0
> Environment: CentOS 7.1, or Ubuntu 14.04
> Mesos 0.23.0-rc4 or today's master
> Reporter: Adam B
> Assignee: Timothy Chen
> Labels: mesosphere
>
> Fails the test and then crashes while trying to shut down the slaves.
> {code}
> [ RUN ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged
> ../../src/tests/docker_containerizer_tests.cpp:618: Failure
> Value of: statusRunning.get().state()
>   Actual: TASK_LOST
> Expected: TASK_RUNNING
> ../../src/tests/docker_containerizer_tests.cpp:619: Failure
> Failed to wait 1mins for statusFinished
> ../../src/tests/docker_containerizer_tests.cpp:610: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, statusUpdate(&driver, _))...
> Expected: to be called twice
>   Actual: called once - unsatisfied and active
> F0721 21:59:54.950773 30622 logging.cpp:57] RAW: Pure virtual method called
> @ 0x7f3915347a02 google::LogMessage::Fail()
> @ 0x7f391534cee4 google::RawLog__()
> @ 0x7f3914890312 __cxa_pure_virtual
> @ 0x88c3ae mesos::internal::tests::Cluster::Slaves::shutdown()
> @ 0x88c176 mesos::internal::tests::Cluster::Slaves::~Slaves()
> @ 0x88dc16 mesos::internal::tests::Cluster::~Cluster()
> @ 0x88dc87 mesos::internal::tests::MesosTest::~MesosTest()
> @ 0xa529ab mesos::internal::tests::DockerContainerizerTest::~DockerContainerizerTest()
> @ 0xa8125f mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test()
> @ 0xa8128e mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test()
> @ 0x1218b4e testing::Test::DeleteSelf_()
> @ 0x1221909 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0x121cb38 testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0x1205713 testing::TestInfo::Run()
> @ 0x1205c4e testing::TestCase::Run()
> @ 0x120a9ca testing::internal::UnitTestImpl::RunAllTests()
> @ 0x122277b testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0x121d81b testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0x120987a testing::UnitTest::Run()
> @ 0xcfbf0c main
> @ 0x7f391097caf5 __libc_start_main
> @ 0x882089 (unknown)
> make[3]: *** [check-local] Aborted (core dumped)
> make[3]: Leaving directory `/home/me/mesos/build/src'
> make[2]: *** [check-am] Error 2
> make[2]: Leaving directory `/home/me/mesos/build/src'
> make[1]: *** [check] Error 2
> make[1]: Leaving directory `/home/me/mesos/build/src'
> make: *** [check-recursive] Error 1
> {code}
[jira] [Commented] (MESOS-3519) Fix file descriptor leakage / double close in the code base
[ https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934417#comment-14934417 ] Chi Zhang commented on MESOS-3519:

Hi [~tnachen], I added you as a reviewer to this patch. Could you take a look at it please? https://reviews.apache.org/r/38828/

> Fix file descriptor leakage / double close in the code base
> -----------------------------------------------------------
> Key: MESOS-3519
> URL: https://issues.apache.org/jira/browse/MESOS-3519
> Project: Mesos
> Issue Type: Bug
> Reporter: Chi Zhang
> Assignee: Chi Zhang
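Typical double-close bugs of the kind this ticket targets come from two code paths each calling close(2) on the same descriptor number, where the second close can hit an unrelated, freshly reused fd. One standard remedy is to give the descriptor a single owner that closes it exactly once; a stdlib-only sketch of the general technique (not the actual Mesos fix under review in r/38828, which is C++):

```python
import os

class OwnedFd:
    """Owns a file descriptor and closes it exactly once, even if
    close() is called repeatedly or the object is used as a
    context manager."""

    def __init__(self, fd: int):
        self.fd = fd

    def close(self) -> None:
        if self.fd >= 0:
            os.close(self.fd)
            self.fd = -1          # mark as closed; further close() is a no-op

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()

r, w = os.pipe()
with OwnedFd(r) as owned:
    owned.close()    # explicit close...
# ...and the context-manager exit does not close it a second time.
os.close(w)
```

The same idea in C++ is an RAII wrapper whose destructor closes the fd and whose copy constructor is deleted, making both the leak (never closed) and the double close (closed twice) unrepresentable.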
[jira] [Updated] (MESOS-3548) Investigate federations of Mesos masters
[ https://issues.apache.org/jira/browse/MESOS-3548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-3548:
Summary: Investigate federations of Mesos masters (was: Support federations of Mesos masters)

> Investigate federations of Mesos masters
> ----------------------------------------
> Key: MESOS-3548
> URL: https://issues.apache.org/jira/browse/MESOS-3548
> Project: Mesos
> Issue Type: Improvement
> Reporter: Neil Conway
> Labels: federation, multi-dc
>
> In a large Mesos installation, the operator might want to ensure that even if
> the Mesos masters are inaccessible or failed, new tasks can still be
> scheduled (across multiple different frameworks). HA masters are only a
> partial solution here: the masters might still be inaccessible due to a
> correlated failure (e.g., Zookeeper misconfiguration/human error).
> To support this, we could support the notion of "hierarchies" or
> "federations" of Mesos masters. In a Mesos installation with 10k machines,
> the operator might configure 10 Mesos masters (each of which might be HA) to
> manage 1k machines each. Then an additional "meta-master" would manage the
> allocation of cluster resources to the 10 masters. Hence, the failure of any
> individual master would impact 1k machines at most. The meta-master might not
> have a lot of work to do: e.g., it might be limited to occasionally
> reallocating cluster resources among the 10 masters, or ensuring that newly
> added cluster resources are allocated among the masters as appropriate.
> Hence, the failure of the meta-master would not prevent any of the individual
> masters from scheduling new tasks. A single framework instance probably
> wouldn't be able to use more resources than have been assigned to a single
> master, but that seems like a reasonable restriction.
> This feature might also be a good fit for a multi-datacenter deployment of
> Mesos: each Mesos master instance would manage a single DC. Naturally,
> reducing the traffic between frameworks and the meta-master would be
> important for performance reasons in a configuration like this.
> Operationally, this might be simpler if Mesos processes were self-hosting
> ([MESOS-3547]).
[jira] [Created] (MESOS-3548) Support federations of Mesos masters
Neil Conway created MESOS-3548:
Summary: Support federations of Mesos masters
Key: MESOS-3548
URL: https://issues.apache.org/jira/browse/MESOS-3548
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
[jira] [Commented] (MESOS-3547) Investigate self-hosting Mesos processes
[ https://issues.apache.org/jira/browse/MESOS-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934378#comment-14934378 ] Jie Yu commented on MESOS-3547: --- Flying by... This is really interesting! Wondering if slave itself can be a "persistent task" or not? The cyclic dependency means that we need some sort of bootstrapping. Having a bootstrapping means maybe we can do self slave upgrading? Just some random thoughts :) > Investigate self-hosting Mesos processes > > > Key: MESOS-3547 > URL: https://issues.apache.org/jira/browse/MESOS-3547 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Neil Conway > > Right now, Mesos master and slave nodes are managed differently: they use > different binaries and startup scripts and require different ops procedures. > Some of this asymmetry is essential, but perhaps not all of it is. If Mesos > supported a concept of "persistent tasks" (see [MESOS-3545]), it might be > possible to implement the Mesos master as such a task -- this might help > unify the ops procedures between a master and a slave. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3545) Investigate restoring tasks/executors after machine reboot.
[ https://issues.apache.org/jira/browse/MESOS-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934371#comment-14934371 ] Neil Conway commented on MESOS-3545: This might take the form of supporting a notion of "persistent tasks" -- i.e., tasks that Mesos tries to keep running whenever possible (e.g., during a network partition and after a machine reboot, even in the absence of network connectivity). > Investigate restoring tasks/executors after machine reboot. > --- > > Key: MESOS-3545 > URL: https://issues.apache.org/jira/browse/MESOS-3545 > Project: Mesos > Issue Type: Improvement > Components: slave >Reporter: Benjamin Hindman > > If a task/executor is restartable (see MESOS-3544) it might make sense to > force an agent to restart these tasks/executors after a machine > reboot in the event that the machine is network partitioned away from the > master (or the master has failed) but we'd like to get these services running > again. Assuming the agent(s) running on the machine have not been disconnected > from the master for longer than the master's agent re-registration timeout > the agent should be able to re-register (i.e., after a network partition is > resolved) without a problem. However, in the same way that a framework would > be interested in knowing that its tasks/executors were restarted we'd want > to send something like a TASK_RESTARTED status update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3547) Investigate self-hosting Mesos processes
Neil Conway created MESOS-3547: -- Summary: Investigate self-hosting Mesos processes Key: MESOS-3547 URL: https://issues.apache.org/jira/browse/MESOS-3547 Project: Mesos Issue Type: Improvement Components: master Reporter: Neil Conway Right now, Mesos master and slave nodes are managed differently: they use different binaries and startup scripts and require different ops procedures. Some of this asymmetry is essential, but perhaps not all of it is. If Mesos supported a concept of "persistent tasks" (see [MESOS-3545]), it might be possible to implement the Mesos master as such a task -- this might help unify the ops procedures between a master and a slave. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3546) Mesos scheduler driver python binding breaks when implicitAcknowledgements is not supplied.
Yan Xu created MESOS-3546: - Summary: Mesos scheduler driver python binding breaks when implicitAcknowledgements is not supplied. Key: MESOS-3546 URL: https://issues.apache.org/jira/browse/MESOS-3546 Project: Mesos Issue Type: Bug Affects Versions: 0.22.0 Reporter: Yan Xu The C++ driver has overloads that make `bool implicitAcknowledgements` optional, but the Python binding throws an error if the client code doesn't supply it. {noformat:title=error} TypeError: an integer is required {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
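A minimal reproduction of the failure mode, using {{ctypes.c_int}} as a stand-in for the extension module's strict integer conversion; {{simulated_binding}} is a hypothetical name, not the real driver constructor:

```python
import ctypes

# The C++ overloads default implicitAcknowledgements, but the Python
# extension converts the argument strictly to a C integer; when the client
# omits the flag the conversion sees None and raises the reported
# "TypeError: an integer is required".
def simulated_binding(implicit_acknowledgements=None):
    # Stand-in for the strict C-level conversion inside the extension.
    return ctypes.c_int(implicit_acknowledgements).value

try:
    simulated_binding()          # flag omitted, as in the bug report
    raised = None
except TypeError as e:
    raised = str(e)
assert raised is not None and "int" in raised.lower()

# Workaround until the binding gains a default: always pass the flag.
assert simulated_binding(1) == 1
```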
[jira] [Created] (MESOS-3545) Investigate restoring tasks/executors after machine reboot.
Benjamin Hindman created MESOS-3545: --- Summary: Investigate restoring tasks/executors after machine reboot. Key: MESOS-3545 URL: https://issues.apache.org/jira/browse/MESOS-3545 Project: Mesos Issue Type: Improvement Components: slave Reporter: Benjamin Hindman If a task/executor is restartable (see MESOS-3544) it might make sense to force an agent to restart these tasks/executors after a machine reboot in the event that the machine is network partitioned away from the master (or the master has failed) but we'd like to get these services running again. Assuming the agent(s) running on the machine have not been disconnected from the master for longer than the master's agent re-registration timeout the agent should be able to re-register (i.e., after a network partition is resolved) without a problem. However, in the same way that a framework would be interested in knowing that its tasks/executors were restarted we'd want to send something like a TASK_RESTARTED status update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3544) Support task and/or executor restart on failure.
Benjamin Hindman created MESOS-3544: --- Summary: Support task and/or executor restart on failure. Key: MESOS-3544 URL: https://issues.apache.org/jira/browse/MESOS-3544 Project: Mesos Issue Type: Bug Components: HTTP API, master, slave Reporter: Benjamin Hindman In certain instances it might be preferable to restart a task/executor after it fails (i.e., non-zero exit code) rather than going through an entire status update -> offer -> accept (launch) cycle to restart the task/executor on the same machine. This is especially true if the resources are reserved (dynamically or statically). Of course, we still want to highlight the restart to the framework, so introducing something like TASK_RESTARTED might be necessary (not sure what the analog would be for executors). Finally, if the task/executor has a bug we don't want to sit in an infinite loop, so we'll likely want to introduce this functionality in such a way as to limit the total restart attempts (or force a framework to have the proper authority to restart forever). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
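The bounded-restart policy described above can be sketched in plain Python (not Mesos code; {{task}} is any callable returning an exit code):

```python
# Sketch of the bounded-restart policy: restart a failed task in place,
# but cap total attempts so a buggy task cannot loop forever.
def run_with_restarts(task, max_restarts):
    """task() returns an exit code; restart on non-zero, up to max_restarts."""
    attempts = 0
    while True:
        if task() == 0:
            return ("finished", attempts)
        if attempts == max_restarts:
            return ("failed", attempts)   # terminal, analog of TASK_FAILED
        attempts += 1                     # analog of a TASK_RESTARTED update

# A task that fails twice and then succeeds finishes after two restarts.
codes = iter([1, 1, 0])
assert run_with_restarts(lambda: next(codes), max_restarts=5) == ("finished", 2)
# A task that always fails is given up on after max_restarts attempts.
assert run_with_restarts(lambda: 1, max_restarts=2) == ("failed", 2)
```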
[jira] [Updated] (MESOS-3544) Support task and/or executor restart on failure.
[ https://issues.apache.org/jira/browse/MESOS-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-3544: Issue Type: Epic (was: Bug) > Support task and/or executor restart on failure. > > > Key: MESOS-3544 > URL: https://issues.apache.org/jira/browse/MESOS-3544 > Project: Mesos > Issue Type: Epic > Components: HTTP API, master, slave >Reporter: Benjamin Hindman > > In certain instances it might be preferable to restart a task/executor after > it fails (i.e., non-zero exit code) rather than going through an entire > status update -> offer -> accept (launch) cycle to restart the task/executor > on the same machine. This is especially true if the resources are reserved > (dynamically or statically). > Of course, we still want to highlight the restart to the framework, so > introducing something like TASK_RESTARTED might be necessary (not sure what > the analog would be for executors). > Finally, if the task/executor has a bug we don't want to sit in an infinite > loop, so we'll likely want to introduce this functionality in such a way as > to limit the total restart attempts (or force a framework to have the proper > authority to restart forever). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3516) Add user doc for networking support in Mesos 0.25.0
[ https://issues.apache.org/jira/browse/MESOS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niklas Quarfot Nielsen updated MESOS-3516: -- Story Points: 2 > Add user doc for networking support in Mesos 0.25.0 > --- > > Key: MESOS-3516 > URL: https://issues.apache.org/jira/browse/MESOS-3516 > Project: Mesos > Issue Type: Documentation >Reporter: Niklas Quarfot Nielsen >Assignee: Niklas Quarfot Nielsen > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3507) As an operator, I want a way to inspect queued tasks in running schedulers
[ https://issues.apache.org/jira/browse/MESOS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934287#comment-14934287 ] Vinod Kone commented on MESOS-3507: --- {quote} there is no uniform way of getting a notion of 'awaiting' tasks i.e. expressing that a framework has more work to do. {quote} We do have an API call for frameworks to express this. requestResources(). Instead of having frameworks expose "queued work" endpoints, and having something on the master (module?) to interpret this data in a uniform way, why not just have frameworks explicitly and directly provide the intent of needing more resources via the requestResources() call? Resources is a uniform abstraction that every framework already understands. > As an operator, I want a way to inspect queued tasks in running schedulers > -- > > Key: MESOS-3507 > URL: https://issues.apache.org/jira/browse/MESOS-3507 > Project: Mesos > Issue Type: Story >Reporter: Niklas Quarfot Nielsen > > Currently, there is no uniform way of getting a notion of 'awaiting' tasks, > i.e. expressing that a framework has more work to do. This information is > useful for auto-scaling and anomaly detection systems. Schedulers tend to > expose this over their own http endpoints, but the formats across schedulers > are most likely not compatible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
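A sketch of how a scheduler could translate its queue of pending work into the aggregate intent that a {{requestResources()}} call would carry; plain dicts stand in for the Request/Resource protobuf messages, and the task shapes are made up:

```python
from collections import Counter

# Fold a scheduler's queue of pending tasks into a single aggregate
# resource request: the uniform "I need more resources" signal, instead of
# a scheduler-specific "queued work" HTTP endpoint.
def aggregate_request(queued_tasks):
    total = Counter()
    for task in queued_tasks:
        total.update(task["resources"])
    return dict(total)

queue = [
    {"name": "etl-1", "resources": {"cpus": 2, "mem": 4096}},
    {"name": "etl-2", "resources": {"cpus": 1, "mem": 1024}},
]
assert aggregate_request(queue) == {"cpus": 3, "mem": 5120}
```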
[jira] [Assigned] (MESOS-3516) Add user doc for networking support in Mesos 0.25.0
[ https://issues.apache.org/jira/browse/MESOS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niklas Quarfot Nielsen reassigned MESOS-3516: - Assignee: Niklas Quarfot Nielsen (was: Kapil Arya) > Add user doc for networking support in Mesos 0.25.0 > --- > > Key: MESOS-3516 > URL: https://issues.apache.org/jira/browse/MESOS-3516 > Project: Mesos > Issue Type: Documentation >Reporter: Niklas Quarfot Nielsen >Assignee: Niklas Quarfot Nielsen > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3543) Add libevent support on Unix builds.
Alex Clemmer created MESOS-3543: --- Summary: Add libevent support on Unix builds. Key: MESOS-3543 URL: https://issues.apache.org/jira/browse/MESOS-3543 Project: Mesos Issue Type: Task Components: build Reporter: Alex Clemmer Assignee: Alex Clemmer Right now Unix builds will (intentionally) error out when we try to build them with libevent. We should add support for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3542) Separate libmesos into compiling from many binaries.
Alex Clemmer created MESOS-3542: --- Summary: Separate libmesos into compiling from many binaries. Key: MESOS-3542 URL: https://issues.apache.org/jira/browse/MESOS-3542 Project: Mesos Issue Type: Task Reporter: Alex Clemmer Assignee: Alex Clemmer Historically, libmesos has been built as one huge monolithic binary. An alternative would be to build it from a number of smaller libraries (_e.g._, libagent, _etc_.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3519) Fix file descriptor leakage / double close in the code base
[ https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934255#comment-14934255 ] Chi Zhang commented on MESOS-3519: -- https://reviews.apache.org/r/38828/ > Fix file descriptor leakage / double close in the code base > --- > > Key: MESOS-3519 > URL: https://issues.apache.org/jira/browse/MESOS-3519 > Project: Mesos > Issue Type: Bug >Reporter: Chi Zhang >Assignee: Chi Zhang > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3541) Add CMakeLists that builds the Mesos master
Alex Clemmer created MESOS-3541: --- Summary: Add CMakeLists that builds the Mesos master Key: MESOS-3541 URL: https://issues.apache.org/jira/browse/MESOS-3541 Project: Mesos Issue Type: Task Components: build Reporter: Alex Clemmer Assignee: Alex Clemmer Right now CMake builds only the agent. We want it to also build the master as part of the libmesos binary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3540) Libevent termination triggers Broken Pipe
Joris Van Remoortere created MESOS-3540: --- Summary: Libevent termination triggers Broken Pipe Key: MESOS-3540 URL: https://issues.apache.org/jira/browse/MESOS-3540 Project: Mesos Issue Type: Bug Components: libprocess Reporter: Joris Van Remoortere When the libevent loop terminates and we unblock the {{SIGPIPE}} signal, the pending {{SIGPIPE}} instantly triggers and causes a broken pipe when the test binary stops running. {code} Program received signal SIGPIPE, Broken pipe. [Switching to Thread 0x718b4700 (LWP 16270)] pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53 53 ../sysdeps/unix/sysv/linux/pthread_sigmask.c: No such file or directory. (gdb) bt #0 pthread_sigmask (how=1, newmask=, oldmask=0x718b3d80) at ../sysdeps/unix/sysv/linux/pthread_sigmask.c:53 #1 0x006fd9a4 in unblock () at ../../../3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/signals.hpp:90 #2 0x007d7915 in run () at ../../../3rdparty/libprocess/src/libevent.cpp:125 #3 0x007950cb in _M_invoke<>(void) () at /usr/include/c++/4.9/functional:1700 #4 0x00795000 in operator() () at /usr/include/c++/4.9/functional:1688 #5 0x00794f6e in _M_run () at /usr/include/c++/4.9/thread:115 #6 0x7668de30 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #7 0x779a16aa in start_thread (arg=0x718b4700) at pthread_create.c:333 #8 0x75df1eed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
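The mechanics can be reproduced outside libprocess. This Unix-only Python sketch (a stand-in for the C++ {{signals.hpp}} code, not Mesos itself) blocks {{SIGPIPE}}, provokes a write to a dead pipe so the signal goes pending, then unblocks it, which is exactly the point where the test binary dies; here a handler absorbs the delivery instead of the default "Broken pipe" termination:

```python
import os
import signal

delivered = []
signal.signal(signal.SIGPIPE, lambda signum, frame: delivered.append(signum))

# Block SIGPIPE, as the thread running the libevent loop does.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGPIPE})

# Write into a pipe with no reader: write() fails with EPIPE and a SIGPIPE
# is generated, but because it is blocked it only becomes pending.
r, w = os.pipe()
os.close(r)
try:
    os.write(w, b"x")
except BrokenPipeError:
    pass
os.close(w)
assert signal.SIGPIPE in signal.sigpending()

# Unblocking now delivers the pending signal instantly: the situation the
# report describes at loop termination. With the default disposition this
# would kill the process with "Broken pipe".
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGPIPE})
assert delivered == [signal.SIGPIPE]
```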
[jira] [Updated] (MESOS-3539) Validate that slave's work_dir is a shared mount in its own peer group when LinuxFilesystemIsolator is used.
[ https://issues.apache.org/jira/browse/MESOS-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-3539: -- Sprint: Twitter Mesos Q3 Sprint 6 > Validate that slave's work_dir is a shared mount in its own peer group when > LinuxFilesystemIsolator is used. > > > Key: MESOS-3539 > URL: https://issues.apache.org/jira/browse/MESOS-3539 > Project: Mesos > Issue Type: Bug >Reporter: Jie Yu > > To address this TODO in the code: > {noformat} > src/slave/containerizer/isolators/filesystem/linux.cpp +122 > // TODO(jieyu): Currently, we don't check if the slave's work_dir > // mount is a shared mount or not. We just assume it is. We cannot > // simply mark the slave as shared again because that will create a > // new peer group for the mounts. This is a temporary workaround for > // now while we are thinking about fixes. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3539) Validate that slave's work_dir is a shared mount in its own peer group when LinuxFilesystemIsolator is used.
Jie Yu created MESOS-3539: - Summary: Validate that slave's work_dir is a shared mount in its own peer group when LinuxFilesystemIsolator is used. Key: MESOS-3539 URL: https://issues.apache.org/jira/browse/MESOS-3539 Project: Mesos Issue Type: Bug Reporter: Jie Yu To address this TODO in the code: {noformat} src/slave/containerizer/isolators/filesystem/linux.cpp +122 // TODO(jieyu): Currently, we don't check if the slave's work_dir // mount is a shared mount or not. We just assume it is. We cannot // simply mark the slave as shared again because that will create a // new peer group for the mounts. This is a temporary workaround for // now while we are thinking about fixes. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
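The missing validation amounts to parsing the optional-fields column of {{/proc/self/mountinfo}} for a {{shared:N}} peer-group tag. A sketch against canned mountinfo lines (Linux mountinfo format; the sample device and paths are made up):

```python
# A mount is "shared" iff the optional-fields column of its
# /proc/self/mountinfo entry carries a shared:N peer-group tag.
def is_shared_mount(mountinfo_line):
    fields = mountinfo_line.split()
    # Optional fields sit between the fixed columns and a lone "-" separator.
    optional = fields[6:fields.index("-")]
    return any(f.startswith("shared:") for f in optional)

shared = "36 25 0:31 / /var/lib/mesos rw,relatime shared:20 - ext4 /dev/sda1 rw"
private = "36 25 0:31 / /var/lib/mesos rw,relatime - ext4 /dev/sda1 rw"
assert is_shared_mount(shared) is True
assert is_shared_mount(private) is False
```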
[jira] [Commented] (MESOS-3538) CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is flaky
[ https://issues.apache.org/jira/browse/MESOS-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934219#comment-14934219 ] Niklas Quarfot Nielsen commented on MESOS-3538: --- Thanks Jie! I will rerun the test and see if that solves the problem (and close the ticket if everything is OK) > CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is > flaky > --- > > Key: MESOS-3538 > URL: https://issues.apache.org/jira/browse/MESOS-3538 > Project: Mesos > Issue Type: Bug >Reporter: Niklas Quarfot Nielsen >Priority: Blocker > > {code} > $ sudo ./bin/mesos-tests.sh > --gtest_filter="CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy" > > Source directory: /home/vagrant/mesos > Build directory: /home/vagrant/mesos-build > - > We cannot run any cgroups tests that require mounting > hierarchies because you have the following hierarchies mounted: > /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, > /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, > /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/perf_event, > /sys/fs/cgroup/systemd > We'll disable the CgroupsNoHierarchyTest test fixture for now. 
> - > sh: 1: perf: not found > - > No 'perf' command found so no 'perf' tests will be run > - > /bin/nc > Note: Google Test filter = > CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy-MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward:PerfEventIsolatorTest.ROOT_CGROUPS_Sample:UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup:CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf:PerfTest.ROOT_Events:PerfTest.ROOT_Sample:PerfTest.Parse:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/2:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/3:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/4:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/6:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/7:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/8:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/9:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/10:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/11:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/12:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/13:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/14:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/15:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/16:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/17:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1
8:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/19:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/20:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/21:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/22:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/23:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/24:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/25:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/26:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/27:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/28:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/29:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/30:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/31:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/32:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/33:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/34:Sl
[jira] [Commented] (MESOS-3538) CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is flaky
[ https://issues.apache.org/jira/browse/MESOS-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934212#comment-14934212 ] Jie Yu commented on MESOS-3538: --- That should be fixed by this commit: commit 4635b66af7caf024695f69f4ca07a57f2876ad29 Author: Jie Yu Date: Mon Sep 28 12:45:20 2015 -0700 Fixed a bug in cgroups test filter. Review: https://reviews.apache.org/r/38819 > CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is > flaky > --- > > Key: MESOS-3538 > URL: https://issues.apache.org/jira/browse/MESOS-3538 > Project: Mesos > Issue Type: Bug >Reporter: Niklas Quarfot Nielsen >Priority: Blocker > > {code} > $ sudo ./bin/mesos-tests.sh > --gtest_filter="CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy" > > Source directory: /home/vagrant/mesos > Build directory: /home/vagrant/mesos-build > - > We cannot run any cgroups tests that require mounting > hierarchies because you have the following hierarchies mounted: > /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, > /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, > /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/perf_event, > /sys/fs/cgroup/systemd > We'll disable the CgroupsNoHierarchyTest test fixture for now. 
> - > sh: 1: perf: not found > - > No 'perf' command found so no 'perf' tests will be run > - > /bin/nc > Note: Google Test filter = > CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy-MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward:PerfEventIsolatorTest.ROOT_CGROUPS_Sample:UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup:CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf:PerfTest.ROOT_Events:PerfTest.ROOT_Sample:PerfTest.Parse:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/2:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/3:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/4:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/6:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/7:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/8:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/9:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/10:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/11:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/12:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/13:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/14:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/15:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/16:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/17:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1
8:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/19:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/20:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/21:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/22:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/23:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/24:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/25:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/26:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/27:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/28:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/29:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/30:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/31:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/32:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSla
[jira] [Created] (MESOS-3538) CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is flaky
Niklas Quarfot Nielsen created MESOS-3538: - Summary: CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is flaky Key: MESOS-3538 URL: https://issues.apache.org/jira/browse/MESOS-3538 Project: Mesos Issue Type: Bug Reporter: Niklas Quarfot Nielsen Priority: Blocker {code} $ sudo ./bin/mesos-tests.sh --gtest_filter="CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy" Source directory: /home/vagrant/mesos Build directory: /home/vagrant/mesos-build - We cannot run any cgroups tests that require mounting hierarchies because you have the following hierarchies mounted: /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/perf_event, /sys/fs/cgroup/systemd We'll disable the CgroupsNoHierarchyTest test fixture for now. - sh: 1: perf: not found - No 'perf' command found so no 'perf' tests will be run - /bin/nc Note: Google Test filter = 
CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy-MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward:PerfEventIsolatorTest.ROOT_CGROUPS_Sample:UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup:CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf:PerfTest.ROOT_Events:PerfTest.ROOT_Sample:PerfTest.Parse:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/2:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/3:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/4:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/6:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/7:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/8:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/9:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/10:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/11:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/12:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/13:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/14:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/15:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/16:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/17:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/18:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/19:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMA
RK_Test.AddAndUpdateSlave/20:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/21:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/22:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/23:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/24:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/25:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/26:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/27:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/28:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/29:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/30:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/31:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/32:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/33:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/34:SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/35:SlaveCount/Registrar_BENCHMARK_Test.Performance/0:SlaveCount/Registrar_BENCHMARK_Test.Performance/1:SlaveCount/Registrar_BENCHMARK_Test.Performance/2:SlaveCount/Registrar_BENCHMARK_Test.Performance/3 [==] Running 1 test from 1 test case. [--] Global test environment set-up. [--] 1 test from CgroupsNoHierarchyTe
[jira] [Comment Edited] (MESOS-3123) DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails & crashes
[ https://issues.apache.org/jira/browse/MESOS-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934166#comment-14934166 ] Niklas Quarfot Nielsen edited comment on MESOS-3123 at 9/28/15 10:08 PM: - Just ran into this during testing of Mesos 0.25.0 rc1 on Ubuntu 14.04 {code} [ RUN ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor 2015-09-28 22:00:14,166:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:17,504:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:20,841:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:24,178:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client ../../mesos/src/tests/containerizer/docker_containerizer_tests.cpp:254: Failure Value of: statusRunning.get().state() Actual: TASK_FAILED Expected: TASK_RUNNING 2015-09-28 22:00:27,515:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:30,851:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:34,188:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:37,526:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk 
retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:40,863:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:44,208:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:47,546:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:50,884:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:54,222:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:00:57,560:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:01:00,899:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:01:04,238:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:01:07,575:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:01:10,912:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 
2015-09-28 22:01:14,249:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:01:17,587:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:01:20,925:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-09-28 22:01:24,264:7267(0x2ba9fb511700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:53630] zk retcode=-4, errno=111(Connection refused): server refused to accept the client ../../mesos/src/tests/containerizer/docker_containerizer_tests.cpp:255: Failure
[jira] [Updated] (MESOS-3516) Add user doc for networking support in Mesos 0.25.0
[ https://issues.apache.org/jira/browse/MESOS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niklas Quarfot Nielsen updated MESOS-3516: -- Target Version/s: 0.25.0 > Add user doc for networking support in Mesos 0.25.0 > --- > > Key: MESOS-3516 > URL: https://issues.apache.org/jira/browse/MESOS-3516 > Project: Mesos > Issue Type: Documentation >Reporter: Niklas Quarfot Nielsen >Assignee: Kapil Arya > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3519) Fix file descriptor leakage / double close in the code base
[ https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934087#comment-14934087 ] Chi Zhang commented on MESOS-3519: -- https://reviews.apache.org/r/38823/ > Fix file descriptor leakage / double close in the code base > --- > > Key: MESOS-3519 > URL: https://issues.apache.org/jira/browse/MESOS-3519 > Project: Mesos > Issue Type: Bug >Reporter: Chi Zhang >Assignee: Chi Zhang > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3537) Allow the frameworks to specify filesystem perms for volumes they own.
Yan Xu created MESOS-3537: - Summary: Allow the frameworks to specify filesystem perms for volumes they own. Key: MESOS-3537 URL: https://issues.apache.org/jira/browse/MESOS-3537 Project: Mesos Issue Type: Task Reporter: Yan Xu This is applicable to persistent volumes as well as regular volumes with the host path under the sandbox. Currently these volumes are created by the slave with permissions derived from its own umask. In order to simulate system directories (e.g. {{/var/www-data}}) from within the sandbox, users may need to request specific permissions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3533) Unable to find and run URIs files
[ https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933968#comment-14933968 ] Rafael Capucho commented on MESOS-3533: --- Looking into the Mesos slave log I found a lot of lines like this: W0928 20:54:24.81043113 slave.cpp:4452] Failed to get resource statistics for executor 'novo-teste.c0442998-661f-11e5-8b11-0242ac1101eb' of framework fe42c404-7266-462b-adf5-549311bfbf32-: Failed to collect cgroup stats: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/27138/cgroup: Failed to open file '/proc/27138/cgroup': No such file or directory Could it be the source of the problem? > Unable to find and run URIs files > - > > Key: MESOS-3533 > URL: https://issues.apache.org/jira/browse/MESOS-3533 > Project: Mesos > Issue Type: Bug > Components: fetcher, general >Affects Versions: 0.25.0 > Environment: Linux li202-122 4.1.5-x86_64-linode61 #7 SMP Mon Aug 24 > 13:46:31 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux > Ubuntu 14.04.1 LTS > Docker Version: 1.8.2 > Docker API version: 1.20 > Go version: go1.4.2 >Reporter: Rafael Capucho >Priority: Blocker > > Hello, > Deploying a docker container using marathon 0.11 with the following structure > (just an example; I tried some variations with the same result): > { > "id": "testando-flask", > "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py", > "cpus": 0.5, > "mem": 20.0, > "container": { > "type": "DOCKER", > "docker": { > "image": "therealwardo/python-2.7-pip", > "network": "BRIDGE", > "privileged": true, > "portMappings": [ > { "containerPort": 31177, "hostPort": 0 } > ] > } > }, > "uris": [ > "http://blog.rafaelcapucho.com/app.zip" > ] > } > curl -X POST http://173.255.192.XXX:8080/v2/apps -d @flask.json -H > "Content-type: application/json" > The task reaches the Mesos master properly but then fails. When I execute the > same structure without uris and with a simple "python -m SimpleHTTPServer" it > works!
The docker container is created and running. > Analyzing the sandbox in the Mesos UI I can see that the URI files are > downloaded correctly (the project and the requirements.txt). In stdout I got: > Archive: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.zip > inflating: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.py > > extracting: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/requeriments.txt > > --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff" > --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" > --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" > --mapped_directory="/mnt/mesos/sandbox" --quiet="false" > --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff" > --stop_timeout="0ns" > --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff" > --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" > --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" > --mapped_directory="/mnt/mesos/sandbox" --quiet="false" >
--sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff" > --stop_timeout="0ns" > Registered docker executor on li202-122.members.linode.com > Starting task testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb > Could not open requirements file: [Errno 2] No such file or directory: > 'requeriments.txt' > Storing complete log in /root/.pip/pip.log > total 68 > drwxr-xr-x 2 root root 4096 Jan 15 2015 bin > drwxr-xr-x 2 root root 4096 Apr 19 2012 boot > drwxr-xr-x 10 root root 13740 Sep 28 12:44 dev > drwxr-xr-x 46 root root 4096 Sep 28 12:44 etc > drwxr-xr-x 2 root root 4096 Apr 19 2012 home > drwxr-xr-x 11 root root 4096 Jan 15 2015 lib > drwxr-xr-x 2 root root 4096 Jan 15 2015 lib64 > drwxr-xr-x 2 root
[jira] [Updated] (MESOS-2467) Allow --resources flag to take JSON.
[ https://issues.apache.org/jira/browse/MESOS-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2467: --- Story Points: 3 > Allow --resources flag to take JSON. > > > Key: MESOS-2467 > URL: https://issues.apache.org/jira/browse/MESOS-2467 > Project: Mesos > Issue Type: Improvement >Reporter: Jie Yu >Assignee: Greg Mann > Labels: mesosphere > > Currently, we use a customized format for the --resources flag. As we introduce > more and more features (e.g., persistence, reservation) in the Resource object, we > need a more generic way to specify --resources. > For backward compatibility, we can scan the first character. If it is '[', > then we invoke the JSON parser. Otherwise, we use the existing parser. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2467) Allow --resources flag to take JSON.
[ https://issues.apache.org/jira/browse/MESOS-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2467: --- Sprint: Mesosphere Sprint 20 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
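The backward-compatible parsing described in MESOS-2467 (scan the first character; a leading '[' selects the JSON parser, anything else the legacy parser) can be sketched as follows. This is an illustrative sketch with invented names, not the actual Mesos flag-parsing code:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Illustrative sketch only: choose a parser for the --resources value by
// peeking at the first non-whitespace character. A leading '[' signals a
// JSON array of Resource objects; anything else falls back to the legacy
// "name:value;name:value" text format.
enum class ResourcesFormat { JSON, TEXT };

ResourcesFormat detectResourcesFormat(const std::string& value) {
  for (char c : value) {
    if (std::isspace(static_cast<unsigned char>(c))) {
      continue;  // ignore leading whitespace
    }
    return c == '[' ? ResourcesFormat::JSON : ResourcesFormat::TEXT;
  }
  return ResourcesFormat::TEXT;  // empty flag: use the existing parser
}
```

This keeps old deployments working unchanged while letting new configurations opt into the richer JSON form.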
[jira] [Updated] (MESOS-1607) Introduce optimistic offers.
[ https://issues.apache.org/jira/browse/MESOS-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-1607: - Labels: mesosphere (was: ) > Introduce optimistic offers. > > > Key: MESOS-1607 > URL: https://issues.apache.org/jira/browse/MESOS-1607 > Project: Mesos > Issue Type: Epic > Components: allocation, framework, master >Reporter: Benjamin Hindman >Assignee: Artem Harutyunyan > Labels: mesosphere > Attachments: optimisitic-offers.pdf > > > The current implementation of resource offers only enables a single framework > scheduler to make scheduling decisions for some available resources at a > time. In some circumstances, this is good, e.g., when we don't want other > framework schedulers to have access to some resources. However, in other > circumstances, there are advantages to letting multiple framework schedulers > attempt to make scheduling decisions for the _same_ allocation of resources > in parallel. > If you think about this from a "concurrency control" perspective, the current > implementation of resource offers is _pessimistic_: the resources contained > within an offer are _locked_ until the framework scheduler that they were > offered to launches tasks with them or declines them. In addition to making > pessimistic offers we'd like to give out _optimistic_ offers, where the same > resources are offered to multiple framework schedulers at the same time, and > framework schedulers "compete" for those resources on a > first-come-first-served basis (i.e., the first to launch a task "wins"). We've > always reserved the right to rescind resource offers using the 'rescind' > primitive in the API, and a framework scheduler should be prepared to launch > a task and have that task be lost because another framework already started > to use those resources. > Introducing optimistic offers will enable more sophisticated allocation > algorithms.
For example, we can optimistically allocate resources that are > reserved for a particular framework (role) but are not being used. In > conjunction with revocable resources (the concept that using resources not > reserved for you means you might get those resources revoked) we can easily > create a "spot" market for unused resources, driving up utilization by > letting frameworks that are willing to use revocable resources run tasks. > In the limit, one could imagine always making optimistic resource offers. > This bears a striking resemblance to the Google Omega model (an isomorphism > even). However, being able to configure what resources should be allocated > optimistically and what resources should be allocated pessimistically gives > even more control to a datacenter/cluster operator that might want to, for > example, never let multiple frameworks (roles) compete for some set of > resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2972) Serialize Docker image spec as protobuf
[ https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2972: --- Story Points: 3 > Serialize Docker image spec as protobuf > --- > > Key: MESOS-2972 > URL: https://issues.apache.org/jira/browse/MESOS-2972 > Project: Mesos > Issue Type: Improvement >Reporter: Timothy Chen >Assignee: Gilbert Song > Labels: mesosphere > > The Docker image specification defines a schema for the metadata json that it > puts into each image. Currently the docker image provisioner needs to be able > to parse and understand this metadata json, and we should create a protobuf > equivalent schema so we can utilize the json to protobuf conversion to read > and validate the metadata. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2972) Serialize Docker image spec as protobuf
[ https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2972: --- Sprint: Mesosphere Sprint 20 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3099) Validation of Docker Image Manifests from Docker Registry
[ https://issues.apache.org/jira/browse/MESOS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3099: --- Sprint: Mesosphere Sprint 20 > Validation of Docker Image Manifests from Docker Registry > - > > Key: MESOS-3099 > URL: https://issues.apache.org/jira/browse/MESOS-3099 > Project: Mesos > Issue Type: Improvement >Reporter: Lily Chen >Assignee: Gilbert Song > Labels: mesosphere > > Docker image manifests pulled from remote Docker registries should be > verified against their signature digest before they are used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3476) Refactor Status Update method on Slave to handle HTTP based Executors
[ https://issues.apache.org/jira/browse/MESOS-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isabel Jimenez updated MESOS-3476: -- Shepherd: Vinod Kone Sprint: Mesosphere Sprint 20 Story Points: 8 > Refactor Status Update method on Slave to handle HTTP based Executors > - > > Key: MESOS-3476 > URL: https://issues.apache.org/jira/browse/MESOS-3476 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar >Assignee: Isabel Jimenez > Labels: mesosphere > > Currently, a status update sent from the slave to itself ({{runTask}}, > {{killTask}}) and status updates from executors are handled by the > {{Slave::statusUpdate}} method on Slave. The signature of the method is > {{void Slave::statusUpdate(StatusUpdate update, const UPID& pid)}}. > We need to create another overload that can also handle HTTP-based > executors, which the previous PID-based function can also call into. The > signature of the new function could be: > {{void Slave::statusUpdate(StatusUpdate update, Executor* executor)}} > The HTTP executor would also call into this new function via > {{src/slave/http.cpp}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
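The overload split proposed in MESOS-3476 can be illustrated with stand-in types. Everything below (the simplified Executor, UPID, and lookup step) is a sketch of the shape the ticket describes, not Mesos' actual slave implementation:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Stand-in types; the real ones live in the Mesos source tree.
struct StatusUpdate { std::string taskId; std::string state; };
struct UPID { std::string address; };
struct Executor { std::string id; std::vector<StatusUpdate> handled; };

class Slave {
public:
  // Existing PID-based entry point: resolve the executor for the sender,
  // then delegate to the shared overload below.
  void statusUpdate(StatusUpdate update, const UPID& pid) {
    statusUpdate(std::move(update), lookup(pid));
  }

  // New overload: the common path that an HTTP-based executor handler
  // (e.g. in src/slave/http.cpp) could call into directly.
  void statusUpdate(StatusUpdate update, Executor* executor) {
    if (executor != nullptr) {
      executor->handled.push_back(std::move(update));
    }
  }

  Executor executor_;  // single executor, for illustration only

private:
  Executor* lookup(const UPID&) { return &executor_; }
};
```

The point of the refactor is that both transports converge on one code path, so update handling logic is written (and fixed) once.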
[jira] [Updated] (MESOS-3235) FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky
[ https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3235: --- Story Points: 2 > FetcherCacheHttpTest.HttpCachedSerialized and > FetcherCacheHttpTest.HttpCachedConcurrent are flaky > - > > Key: MESOS-3235 > URL: https://issues.apache.org/jira/browse/MESOS-3235 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0 >Reporter: Joseph Wu >Assignee: Bernd Mathiske > Labels: mesosphere > > On OSX, {{make clean && make -j8 V=0 check}}: > {code} > [--] 3 tests from FetcherCacheHttpTest > [ RUN ] FetcherCacheHttpTest.HttpCachedSerialized > HTTP/1.1 200 OK > Date: Fri, 07 Aug 2015 17:23:05 GMT > Content-Length: 30 > I0807 10:23:05.673596 2085372672 exec.cpp:133] Version: 0.24.0 > E0807 10:23:05.675884 184373248 socket.hpp:173] Shutdown failed on fd=18: > Socket is not connected [57] > I0807 10:23:05.675897 182226944 exec.cpp:207] Executor registered on slave > 20150807-102305-139395082-52338-52313-S0 > E0807 10:23:05.683980 184373248 socket.hpp:173] Shutdown failed on fd=18: > Socket is not connected [57] > Registered executor on 10.0.79.8 > Starting task 0 > Forked command at 54363 > sh -c './mesos-fetcher-test-cmd 0' > E0807 10:23:05.694953 184373248 socket.hpp:173] Shutdown failed on fd=18: > Socket is not connected [57] > Command exited with status 0 (pid: 54363) > E0807 10:23:05.793927 184373248 socket.hpp:173] Shutdown failed on fd=18: > Socket is not connected [57] > I0807 10:23:06.590008 2085372672 exec.cpp:133] Version: 0.24.0 > E0807 10:23:06.592244 355938304 socket.hpp:173] Shutdown failed on fd=18: > Socket is not connected [57] > I0807 10:23:06.592243 353255424 exec.cpp:207] Executor registered on slave > 20150807-102305-139395082-52338-52313-S0 > E0807 10:23:06.597995 355938304 socket.hpp:173] Shutdown failed on fd=18: > Socket is not connected [57] > Registered executor on 10.0.79.8 > Starting task 1 > Forked command at 54411 > sh -c './mesos-fetcher-test-cmd 
1' > E0807 10:23:06.608708 355938304 socket.hpp:173] Shutdown failed on fd=18: > Socket is not connected [57] > Command exited with status 0 (pid: 54411) > E0807 10:23:06.707649 355938304 socket.hpp:173] Shutdown failed on fd=18: > Socket is not connected [57] > ../../src/tests/fetcher_cache_tests.cpp:860: Failure > Failed to wait 15secs for awaitFinished(task.get()) > *** Aborted at 1438968214 (unix time) try "date -d @1438968214" if you are > using GNU date *** > [ FAILED ] FetcherCacheHttpTest.HttpCachedSerialized (28685 ms) > [ RUN ] FetcherCacheHttpTest.HttpCachedConcurrent > PC: @0x113723618 process::Owned<>::get() > *** SIGSEGV (@0x0) received by PID 52313 (TID 0x118d59000) stack trace: *** > @ 0x7fff8fcacf1a _sigtramp > @ 0x7f9bc3109710 (unknown) > @0x1136f07e2 mesos::internal::slave::Fetcher::fetch() > @0x113862f9d > mesos::internal::slave::MesosContainerizerProcess::fetch() > @0x1138f1b5d > _ZZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS2_11ContainerIDERKNS2_11CommandInfoERKNSt3__112basic_stringIcNSC_11char_traitsIcEENSC_9allocatorIcRK6OptionISI_ERKNS2_7SlaveIDES6_S9_SI_SM_SP_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSW_FSU_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_ENKUlPNS_11ProcessBaseEE_clES1D_ > @0x1138f18cf > _ZNSt3__110__function6__funcIZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS5_11ContainerIDERKNS5_11CommandInfoERKNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcRK6OptionISK_ERKNS5_7SlaveIDES9_SC_SK_SO_SR_EENS2_6FutureIT_EERKNS2_3PIDIT0_EEMSY_FSW_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_EUlPNS2_11ProcessBaseEE_NSI_IS1G_EEFvS1F_EEclEOS1F_ > @0x1143768cf std::__1::function<>::operator()() > @0x11435ca7f process::ProcessBase::visit() > @0x1143ed6fe process::DispatchEvent::visit() > @0x11271 process::ProcessBase::serve() > @0x114343b4e process::ProcessManager::resume() > @0x1143431ca process::internal::schedule() > @0x1143da646 _ZNSt3__114__thread_proxyINS_5tupleIJPFvvEEPvS5_ > @ 0x7fff95090268 
_pthread_body > @ 0x7fff950901e5 _pthread_start > @ 0x7fff9508e41d thread_start > Failed to synchronize with slave (it's probably exited) > make[3]: *** [check-local] Segmentation fault: 11 > make[2]: *** [check-am] Error 2 > make[1]: *** [check] Error 2 > make: *** [check-recursive] Error 1 > {code} > This was encountered just once out of 3+ {{make check}}s. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3291) Add docker exec command
[ https://issues.apache.org/jira/browse/MESOS-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-3291: Shepherd: Timothy Chen > Add docker exec command > --- > > Key: MESOS-3291 > URL: https://issues.apache.org/jira/browse/MESOS-3291 > Project: Mesos > Issue Type: Task > Components: docker >Reporter: haosdent >Assignee: haosdent > Labels: docker, mesosphere > > To fix [MESOS-3136 | > https://issues.apache.org/jira/browse/MESOS-3136], we need to run the health > check command in the docker container through "docker exec". So we need to implement > an exec command in docker/docker.cpp -- This message was sent by Atlassian JIRA (v6.3.4#6332)
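The shape of such a call can be sketched by composing the CLI invocation that docker.cpp would shell out to. The helper name and argument layout below are assumptions for illustration, not the actual Mesos Docker API:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative only: build the argv for running a health-check command
// inside an already-running container via `docker exec`.
std::vector<std::string> dockerExecArgv(
    const std::string& docker,                 // path to the docker CLI
    const std::string& container,              // target container name or id
    const std::vector<std::string>& command) { // command to run inside it
  std::vector<std::string> argv = {docker, "exec", container};
  argv.insert(argv.end(), command.begin(), command.end());
  return argv;
}
```

Running the health check through `docker exec` matters because it executes inside the container's namespaces, so the check sees the same filesystem and network the task does.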
[jira] [Updated] (MESOS-3235) FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky
[ https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3235: --- Sprint: Mesosphere Sprint 20 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2972) Serialize Docker image spec as protobuf
[ https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933907#comment-14933907 ] Timothy Chen commented on MESOS-2972: - I see, we don't adopt JsonSchema yet, and I'm not sure what that integration would look like. This was actually recommended by other committers from Twitter, as they already modeled the AppC JSON with Protobuf. The conversion and modeling is actually quite straightforward using the JSON <-> Protobuf tools we have. I still think we should conform to what's being practiced for now, and look into JsonSchema later if it's a better alternative for converting all the others. What I'd like to avoid is a complicated approach that's harder to maintain; at minimum we should conform to the best practice we have for now. Does that sound good [~marco-mesos]? > Serialize Docker image spec as protobuf > --- > > Key: MESOS-2972 > URL: https://issues.apache.org/jira/browse/MESOS-2972 > Project: Mesos > Issue Type: Improvement >Reporter: Timothy Chen >Assignee: Gilbert Song > Labels: mesosphere > > The Docker image specification defines a schema for the metadata json that it > puts into each image. Currently the docker image provisioner needs to be able > to parse and understand this metadata json, and we should create a protobuf > equivalent schema so we can utilize the json to protobuf conversion to read > and validate the metadata. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
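The JSON-to-typed-message modeling discussed in this thread can be sketched minimally. The sketch below is illustrative only: Python rather than Mesos' C++ {{JSON::protobuf}} helpers, and a hypothetical two-field manifest standing in for the much larger Docker image spec.

```python
from dataclasses import dataclass, fields

# Hypothetical stand-in for a generated protobuf message; the real Docker
# image spec has many more fields than these two.
@dataclass
class ImageManifest:
    name: str
    architecture: str

def parse_manifest(data: dict) -> ImageManifest:
    """Validate a JSON object against the typed schema, the way a
    JSON-to-protobuf conversion would: reject missing or mistyped fields."""
    kwargs = {}
    for f in fields(ImageManifest):
        if f.name not in data:
            raise ValueError("missing field: " + f.name)
        if not isinstance(data[f.name], f.type):
            raise ValueError("wrong type for field: " + f.name)
        kwargs[f.name] = data[f.name]
    return ImageManifest(**kwargs)
```

The point of the thread is that once the schema lives in a typed message definition, this validation comes for free from the conversion step instead of hand-written checks.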
[jira] [Updated] (MESOS-3183) Documentation images do not load
[ https://issues.apache.org/jira/browse/MESOS-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-3183: - Sprint: Mesosphere Sprint 20 > Documentation images do not load > > > Key: MESOS-3183 > URL: https://issues.apache.org/jira/browse/MESOS-3183 > Project: Mesos > Issue Type: Documentation > Components: documentation >Affects Versions: 0.24.0 >Reporter: James Mulcahy >Assignee: Joseph Wu >Priority: Minor > Labels: mesosphere > Attachments: rake.patch > > > Any images which are referenced from the generated docs ({{docs/*.md}}) do > not show up on the website. For example: > * [External > Containerizer|http://mesos.apache.org/documentation/latest/external-containerizer/] > * [Fetcher Cache > Internals|http://mesos.apache.org/documentation/latest/fetcher-cache-internals/] > * [Maintenance|http://mesos.apache.org/documentation/latest/maintenance/] > * > [Oversubscription|http://mesos.apache.org/documentation/latest/oversubscription/] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3378) Document a test pattern for expediting event firing
[ https://issues.apache.org/jira/browse/MESOS-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3378: --- Sprint: Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 18, Mesosphere Sprint 19) > Document a test pattern for expediting event firing > --- > > Key: MESOS-3378 > URL: https://issues.apache.org/jira/browse/MESOS-3378 > Project: Mesos > Issue Type: Documentation > Components: documentation, test >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov >Priority: Minor > Labels: mesosphere > > We use {{Clock::advance()}} extensively in tests to expedite event firing and > minimize overall {{make check}} time. Document this pattern for posterity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
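The {{Clock::advance()}} pattern this ticket wants documented can be modeled outside libprocess. The sketch below is purely illustrative (a toy pausable clock, not the libprocess API): tests fire a pending timer by moving the clock forward instead of sleeping through real time.

```python
# Minimal model of the Clock::advance() test pattern: timers are keyed by
# a controllable clock, so a test can expedite event firing deterministically.
class TestClock:
    def __init__(self):
        self.now = 0.0
        self.timers = []          # (deadline, callback) pairs

    def call_later(self, delay, callback):
        self.timers.append((self.now + delay, callback))

    def advance(self, seconds):
        """Move time forward and fire every timer whose deadline has passed."""
        self.now += seconds
        due = [t for t in self.timers if t[0] <= self.now]
        self.timers = [t for t in self.timers if t[0] > self.now]
        for _, callback in due:
            callback()

fired = []
clock = TestClock()
clock.call_later(15.0, lambda: fired.append("timeout"))
clock.advance(15.0)               # expedites the event; no real waiting
```

A test using this pattern asserts on `fired` immediately after `advance()`, which is what keeps overall `make check` time down.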
[jira] [Updated] (MESOS-3468) Improve apply_reviews.sh script to apply chain of reviews
[ https://issues.apache.org/jira/browse/MESOS-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-3468: Sprint: Mesosphere Sprint 20 > Improve apply_reviews.sh script to apply chain of reviews > - > > Key: MESOS-3468 > URL: https://issues.apache.org/jira/browse/MESOS-3468 > Project: Mesos > Issue Type: Improvement >Reporter: Vinod Kone >Assignee: Artem Harutyunyan > Labels: mesosphere > > Currently the support/apply-review.sh script allows a user (typically a > committer) to apply a single review on top of HEAD. Since Mesos contributors > typically submit a chain of reviews for a given issue, it makes sense for the > script to apply the whole chain recursively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
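The chain-application behavior requested here can be sketched as follows. The dependency data is hypothetical; the real script would query ReviewBoard for each review's "depends on" link rather than use a dict.

```python
# Sketch of resolving a review chain: walk "depends-on" links back to the
# root, then return the reviews oldest-first so they can be applied in order.
def review_chain(reviews, review_id):
    """reviews maps review id -> parent review id (None at the root).
    Returns the chain ending at review_id, root first."""
    chain = []
    current = review_id
    while current is not None:
        chain.append(current)
        current = reviews[current]    # follow the depends-on link
    return list(reversed(chain))

# Hypothetical chain: 103 depends on 102, which depends on 101.
reviews = {101: None, 102: 101, 103: 102}
```

Applying the reviews in the returned order (101, then 102, then 103) gives the recursive behavior the ticket asks for.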
[jira] [Updated] (MESOS-3140) Implement Docker remote puller
[ https://issues.apache.org/jira/browse/MESOS-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3140: --- Sprint: Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 18, Mesosphere Sprint 19) > Implement Docker remote puller > -- > > Key: MESOS-3140 > URL: https://issues.apache.org/jira/browse/MESOS-3140 > Project: Mesos > Issue Type: Improvement >Reporter: Lily Chen >Assignee: Jojy Varghese > Labels: mesosphere > > Given a Docker image name and registry host URL, the puller fetches the image. If > necessary, it will download the manifest and layers from the registry host. > It will place the layers and image manifest into the persistent store. > Done when a Docker image can be successfully stored and retrieved using the 'put' > and 'get' methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3074) Check satisfiability of quota requests in Master
[ https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3074: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19) > Check satisfiability of quota requests in Master > > > Key: MESOS-3074 > URL: https://issues.apache.org/jira/browse/MESOS-3074 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad >Assignee: Alexander Rukletsov > Labels: mesosphere > > We need to validate quota requests in the Mesos Master as outlined in > the Design Doc: > https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I > This ticket aims to validate satisfiability (in terms of available resources) > of a quota request using a heuristic algorithm in the Mesos Master, rather > than validating the syntax of the request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
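One plausible shape for the heuristic satisfiability check described in this ticket is sketched below. The resource maps and field names are illustrative assumptions, not the algorithm that was actually implemented in the master.

```python
# Heuristic satisfiability sketch: a quota request is accepted only if, per
# resource kind, already-granted quotas plus the new request fit within the
# cluster total. All data here is illustrative.
def quota_satisfiable(request, granted, cluster_total):
    for resource, amount in request.items():
        allocated = sum(q.get(resource, 0) for q in granted)
        if allocated + amount > cluster_total.get(resource, 0):
            return False
    return True

cluster = {"cpus": 16, "mem": 65536}
granted = [{"cpus": 4, "mem": 16384}]   # one quota already granted
```

For example, with 4 CPUs already promised out of 16, a request for 8 more CPUs is satisfiable while a request for 13 is not.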
[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers
[ https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-1615: - Sprint: Mesosphere Sprint 20 > Create design document for Optimistic Offers > > > Key: MESOS-1615 > URL: https://issues.apache.org/jira/browse/MESOS-1615 > Project: Mesos > Issue Type: Documentation >Reporter: Dominic Hamon >Assignee: Joseph Wu > Labels: mesosphere > > As a first step toward Optimistic Offers, take the description from the epic > and build an implementation design doc that can be shared for comments. > Note: the links to the working group notes and design doc are located in the > [JIRA Epic|MESOS-1607]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3313) Rework Jenkins build script
[ https://issues.apache.org/jira/browse/MESOS-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3313: --- Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19) > Rework Jenkins build script > --- > > Key: MESOS-3313 > URL: https://issues.apache.org/jira/browse/MESOS-3313 > Project: Mesos > Issue Type: Task >Reporter: Artem Harutyunyan >Assignee: Artem Harutyunyan > Labels: mesosphere > > Mesos Jenkins build script needs to be reworked to support the following: > - Wider test coverage (libevent, libssl, root tests, Docker tests). > - More OS/compiler Docker images for testing Mesos. > - Excluding tests on per-image basis. > - Reproducing the test image locally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3480) Refactor Executor struct in Slave to handle HTTP based executors
[ https://issues.apache.org/jira/browse/MESOS-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3480: --- Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 19) > Refactor Executor struct in Slave to handle HTTP based executors > > > Key: MESOS-3480 > URL: https://issues.apache.org/jira/browse/MESOS-3480 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: mesosphere > > Currently, the {{struct Executor}} in slave only supports executors connected > via message passing (driver). We should refactor it to add support for HTTP > based Executors similar to what was done for the Scheduler API {{struct > Framework}} in {{src/master/master.hpp}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3515) Support Subscribe Call for HTTP based Executors
[ https://issues.apache.org/jira/browse/MESOS-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3515: --- Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 19) > Support Subscribe Call for HTTP based Executors > --- > > Key: MESOS-3515 > URL: https://issues.apache.org/jira/browse/MESOS-3515 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: mesosphere > > We need to add a {{subscribe(...)}} method in {{src/slave/slave.cpp}} to > introduce the ability for HTTP based executors to subscribe and then receive > events on the persistent HTTP connection. Most of the functionality needed > would be similar to {{Master::subscribe}} in {{src/master/master.cpp}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2949) Design generalized Authorizer interface
[ https://issues.apache.org/jira/browse/MESOS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2949: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19) > Design generalized Authorizer interface > --- > > Key: MESOS-2949 > URL: https://issues.apache.org/jira/browse/MESOS-2949 > Project: Mesos > Issue Type: Task > Components: master, security >Reporter: Alexander Rojas >Assignee: Alexander Rojas > Labels: acl, mesosphere, security > > As mentioned in MESOS-2948 the current {{mesos::Authorizer}} interface is > rather inflexible if new _Actions_ or _Objects_ need to be added. > A new API needs to be designed in a way that allows for arbitrary _Actions_ > and _Objects_ to be added to the authorization mechanism without having to > recompile mesos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3164) Introduce QuotaInfo message
[ https://issues.apache.org/jira/browse/MESOS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3164: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19) > Introduce QuotaInfo message > --- > > Key: MESOS-3164 > URL: https://issues.apache.org/jira/browse/MESOS-3164 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Alexander Rukletsov >Assignee: Joerg Schad > Labels: mesosphere > > A {{QuotaInfo}} protobuf message is the internal representation for quota-related > information (e.g. for persisting quota). The protobuf message should be > extensible for future needs and allow for easy aggregation across roles and > operator principals. It may also be used to pass quota information to > allocators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
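As a rough illustration of the kind of message this ticket describes, a sketch in protobuf syntax follows. The field names and numbering are hypothetical, not the {{QuotaInfo}} message that was eventually committed.

```protobuf
// Hypothetical sketch only; actual fields were settled in review.
message QuotaInfo {
  // Role the quota applies to, enabling aggregation across roles.
  optional string role = 1;

  // Principal of the operator who requested the quota.
  optional string principal = 2;

  // Resources guaranteed to the role; repeated fields and new optional
  // fields can be added later without breaking existing readers.
  repeated Resource guarantee = 3;
}
```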
[jira] [Updated] (MESOS-3428) Support running filesystem isolation with Command Executor in MesosContainerizer
[ https://issues.apache.org/jira/browse/MESOS-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3428: --- Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 19) > Support running filesystem isolation with Command Executor in > MesosContainerizer > > > Key: MESOS-3428 > URL: https://issues.apache.org/jira/browse/MESOS-3428 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Timothy Chen >Assignee: Timothy Chen > Labels: mesosphere > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2906) Slave : Synchronous Validation for Calls
[ https://issues.apache.org/jira/browse/MESOS-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2906: --- Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 19) > Slave : Synchronous Validation for Calls > > > Key: MESOS-2906 > URL: https://issues.apache.org/jira/browse/MESOS-2906 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar >Assignee: Isabel Jimenez > Labels: HTTP, mesosphere > > The /call endpoint on the slave will return a 202 Accepted code but has to do > some basic validation first. If validation fails, it will return a > {{BadRequest}} back to the client. > - We need to create the required infrastructure to validate the request and > then process it, similar to {{src/master/validation.cpp}} in the {{namespace > scheduler}}, i.e. check whether the protobuf is properly initialized, has the > required attributes set pertaining to the call message, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
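The synchronous validation flow this ticket describes can be sketched as below. The required-field set is a made-up placeholder (the real checks come from the Call protobuf definition), but the 202/BadRequest behavior follows the ticket.

```python
# Sketch of synchronous /call validation on the slave: verify required
# fields are present before accepting, mirroring the master-side scheduler
# validation. REQUIRED_FIELDS is illustrative, not the real field set.
REQUIRED_FIELDS = {"type", "executor_id"}

def handle_call(call):
    """Return (status_code, body) for an incoming call dict."""
    missing = REQUIRED_FIELDS - call.keys()
    if missing:
        return 400, "BadRequest: missing " + ", ".join(sorted(missing))
    return 202, "Accepted"
```

The key property is that validation is synchronous: the client learns about a malformed call in the HTTP response itself, before any asynchronous processing starts.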
[jira] [Updated] (MESOS-3357) Update quota design doc based on user comments and offline syncs
[ https://issues.apache.org/jira/browse/MESOS-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3357: --- Sprint: Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 18, Mesosphere Sprint 19) > Update quota design doc based on user comments and offline syncs > > > Key: MESOS-3357 > URL: https://issues.apache.org/jira/browse/MESOS-3357 > Project: Mesos > Issue Type: Documentation >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: mesosphere > > We got plenty of feedback from different parties, which we would like to > persist in the design doc for posterity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3497) Add implementation for sha256 based file content verification.
[ https://issues.apache.org/jira/browse/MESOS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3497: --- Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 19) > Add implementation for sha256 based file content verification. > -- > > Key: MESOS-3497 > URL: https://issues.apache.org/jira/browse/MESOS-3497 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Jojy Varghese >Assignee: Jojy Varghese > Labels: mesosphere > > https://reviews.apache.org/r/38747/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3496) Create interface for digest verifier
[ https://issues.apache.org/jira/browse/MESOS-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3496: --- Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 19) > Create interface for digest verifier > > > Key: MESOS-3496 > URL: https://issues.apache.org/jira/browse/MESOS-3496 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Jojy Varghese >Assignee: Jojy Varghese > Labels: mesosphere > > Add interface for digest verifier so that we can add implementations for > digest types like sha256, sha512 etc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
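The digest-verifier interface requested in MESOS-3496 (with sha256/sha512 implementations per MESOS-3497) can be sketched as follows. This uses Python's hashlib in place of the C++ digest routines, and the class name is illustrative.

```python
import hashlib

# Sketch of a pluggable digest verifier: one interface, one implementation
# per digest type (sha256, sha512, ...), as the ticket suggests.
class DigestVerifier:
    SUPPORTED = ("sha256", "sha512")

    def __init__(self, algorithm):
        if algorithm not in self.SUPPORTED:
            raise ValueError("unsupported digest type: " + algorithm)
        self.algorithm = algorithm

    def verify(self, data: bytes, expected_hex: str) -> bool:
        """Hash the content and compare against the expected hex digest."""
        return hashlib.new(self.algorithm, data).hexdigest() == expected_hex
```

File-content verification (MESOS-3497) would read the file's bytes and pass them to `verify` along with the digest string from the image manifest.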
[jira] [Updated] (MESOS-2879) Random recursive_mutex errors when running make check
[ https://issues.apache.org/jira/browse/MESOS-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2879: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 15, Mesosphere Sprint 18, Mesosphere Sprint 19) > Random recursive_mutex errors in when running make check > > > Key: MESOS-2879 > URL: https://issues.apache.org/jira/browse/MESOS-2879 > Project: Mesos > Issue Type: Bug > Components: libprocess >Reporter: Alexander Rojas >Assignee: Greg Mann > Labels: mesosphere, tech-debt > > While running make check on OS X, from time to time {{recursive_mutex}} > errors appear after running all the test successfully. Just one of the > experience messages actually stops {{make check}} reporting an error. > The following error messages have been experienced: > {code} > libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: > libc++abi.dylib: libc++abi.dylib: terminating with uncaught exception of type > std::__1::system_error: recursive_mutex lock failed: Invalid > argumentterminating with uncaught exception of type std::__1::system_error: > recursive_mutex lock failed: Invalid argumentterminating with uncaught > exception of type std::__1::system_error: recursive_mutex lock failed: > Invalid argumentterminating with uncaught exception of type > std::__1::system_error: recursive_mutex lock failed: Invalid > argumentterminating with uncaught exception of type std::__1::system_error: > recursive_mutex lock failed: Invalid argumentterminating with uncaught > exception of type std::__1::system_error: recursive_mutex lock failed: > Invalid argument > *** Aborted at 1434553937 (unix time) try "date -d @1434553937" if you are > using GNU date *** > {code} > {code} > libc++abi.dylib: terminating with uncaught exception of type > std::__1::system_error: recursive_mutex lock failed: Invalid argument > *** Aborted at 1434557001 (unix time) try "date -d 
@1434557001" if you are > using GNU date *** > libc++abi.dylib: PC: @ 0x7fff93855286 __pthread_kill > libc++abi.dylib: *** SIGABRT (@0x7fff93855286) received by PID 88060 (TID > 0x10fc4) stack trace: *** > @ 0x7fff8e1d6f1a _sigtramp > libc++abi.dylib: @0x10fc3f1a8 (unknown) > libc++abi.dylib: @ 0x7fff979deb53 abort > libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: terminating with uncaught > exception of type std::__1::system_error: recursive_mutex lock failed: > Invalid argumentterminating with uncaught exception of type > std::__1::system_error: recursive_mutex lock failed: Invalid > argumentterminating with uncaught exception of type std::__1::system_error: > recursive_mutex lock failed: Invalid argumentterminating with uncaught > exception of type std::__1::system_error: recursive_mutex lock failed: > Invalid argumentterminating with uncaught exception of type > std::__1::system_error: recursive_mutex lock failed: Invalid > argumentterminating with uncaught exception of type std::__1::system_error: > recursive_mutex lock failed: Invalid argumentMaking check in include > {code} > {code} > Assertion failed: (e == 0), function ~recursive_mutex, file > /SourceCache/libcxx/libcxx-120/src/mutex.cpp, line 82. 
> *** Aborted at 1434555685 (unix time) try "date -d @1434555685" if you are > using GNU date *** > PC: @ 0x7fff93855286 __pthread_kill > *** SIGABRT (@0x7fff93855286) received by PID 60235 (TID 0x7fff7ebdc300) > stack trace: *** > @ 0x7fff8e1d6f1a _sigtramp > @0x10b512350 google::CheckNotNull<>() > @ 0x7fff979deb53 abort > @ 0x7fff979a6c39 __assert_rtn > @ 0x7fff9bffdcc9 std::__1::recursive_mutex::~recursive_mutex() > @0x10b881928 process::ProcessManager::~ProcessManager() > @0x10b874445 process::ProcessManager::~ProcessManager() > @0x10b874418 process::finalize() > @0x10b2f7aec main > @ 0x7fff98edc5c9 start > make[5]: *** [check-local] Abort trap: 6 > make[4]: *** [check-am] Error 2 > make[3]: *** [check-recursive] Error 1 > make[2]: *** [check-recursive] Error 1 > make[1]: *** [check] Error 2 > make: *** [check-recursive] Error 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2708) Design doc for the Executor HTTP API
[ https://issues.apache.org/jira/browse/MESOS-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2708: --- Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19) > Design doc for the Executor HTTP API > > > Key: MESOS-2708 > URL: https://issues.apache.org/jira/browse/MESOS-2708 > Project: Mesos > Issue Type: Bug >Reporter: Alexander Rojas >Assignee: Isabel Jimenez > Labels: mesosphere > > This tracks the design of the Executor HTTP API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3086: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19) > Create cgroups TasksKiller for non freeze subsystems. > - > > Key: MESOS-3086 > URL: https://issues.apache.org/jira/browse/MESOS-3086 > Project: Mesos > Issue Type: Bug >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: mesosphere > > We have a number of test issues when we cannot remove cgroups (in case there > are still related tasks running) in cases where the freezer subsystem is not > available. > In the current code > (https://github.com/apache/mesos/blob/0.22.1/src/linux/cgroups.cpp#L1728) we > will fall back to a very simple mechanism of recursively trying to remove the > cgroups, which fails if there are still tasks running. > Therefore we need an additional (NonFreeze)TasksKiller which doesn't rely > on the freezer subsystem. > This problem caused issues when running 'sudo make check' during 0.23 release > testing, where BenH already provided a better error message with > b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota
[ https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3073: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20 (was: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19) > Introduce HTTP endpoints for Quota > -- > > Key: MESOS-3073 > URL: https://issues.apache.org/jira/browse/MESOS-3073 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: mesosphere > > We need to implement the HTTP endpoints for Quota as outlined in the Design > Doc: > (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3536) Error loading isolator module with 0.25.0-rc1
Kapil Arya created MESOS-3536: - Summary: Error loading isolator module with 0.25.0-rc1 Key: MESOS-3536 URL: https://issues.apache.org/jira/browse/MESOS-3536 Project: Mesos Issue Type: Bug Reporter: Kapil Arya Assignee: Kapil Arya Priority: Blocker When trying to load the network isolator module from https://github.com/djosborne/net-modules/tree/test-0.25.0/ in 0.25.0-rc1, we are seeing the following error: {code} Error loading modules: Error opening library: '/isolator/build/.libs/libmesos_network_isolator.so': Could not load library '/isolator/build/.libs/libmesos_network_isolator.so': /isolator/build/.libs/libmesos_network_isolator.so: undefined symbol: _ZNK8picojson5value2isIlEEbv {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1478) Replace Master/Slave terminology
[ https://issues.apache.org/jira/browse/MESOS-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933889#comment-14933889 ] Vinod Kone commented on MESOS-1478: --- [~benjaminhindman] Can you share the doc on the details please? > Replace Master/Slave terminology > > > Key: MESOS-1478 > URL: https://issues.apache.org/jira/browse/MESOS-1478 > Project: Mesos > Issue Type: Wish >Reporter: Clark Breyman >Assignee: Benjamin Hindman >Priority: Minor > Labels: mesosphere > > Inspired by the comments on this PR: > https://github.com/django/django/pull/2692 > TL;DR - Computers sharing work should be a good thing. Using the language of > human bondage and suffering is inappropriate in this context. It also has the > potential to alienate users and community members. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2972) Serialize Docker image spec as protobuf
[ https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933873#comment-14933873 ] Marco Massenzio commented on MESOS-2972: Yes, I completely agree on wanting to avoid 'boilerplate' and ad-hoc schema checking. However, I'd suggest looking into something like JsonSchema for that - I would assume that Docker (or someone else) has already done this? Trying to model an arbitrary JSON schema in PB is likely to be *extremely* difficult - if not outright impossible. > Serialize Docker image spec as protobuf > --- > > Key: MESOS-2972 > URL: https://issues.apache.org/jira/browse/MESOS-2972 > Project: Mesos > Issue Type: Improvement >Reporter: Timothy Chen >Assignee: Gilbert Song > Labels: mesosphere > > The Docker image specification defines a schema for the metadata json that it > puts into each image. Currently the docker image provisioner needs to be able > to parse and understand this metadata json, and we should create a protobuf > equivalent schema so we can utilize the json to protobuf conversion to read > and validate the metadata. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1478) Replace Master/Slave terminology
[ https://issues.apache.org/jira/browse/MESOS-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933867#comment-14933867 ] Adam B commented on MESOS-1478: --- In the new HTTP API, we are referring to the Mesos "Slave" as the "Agent", and we will phase the term into the rest of the Mesos code/canon as we approach the Mesos 1.0 release. [~benjaminhindman] has more details on the plan. We will keep the Mesos "Masters" (one of which is the "leading Master") terminology, to be less disruptive to the API. > Replace Master/Slave terminology > > > Key: MESOS-1478 > URL: https://issues.apache.org/jira/browse/MESOS-1478 > Project: Mesos > Issue Type: Wish >Reporter: Clark Breyman >Assignee: Benjamin Hindman >Priority: Minor > Labels: mesosphere > > Inspired by the comments on this PR: > https://github.com/django/django/pull/2692 > TL;DR - Computers sharing work should be a good thing. Using the language of > human bondage and suffering is inappropriate in this context. It also has the > potential to alienate users and community members. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3535) Expose info about the container image associated with each container through an HTTP endpoint.
Yan Xu created MESOS-3535: - Summary: Expose info about the container image associated with each container through an HTTP endpoint. Key: MESOS-3535 URL: https://issues.apache.org/jira/browse/MESOS-3535 Project: Mesos Issue Type: Task Reporter: Yan Xu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3534) add test cases for sha256/sha512 digest verifier
[ https://issues.apache.org/jira/browse/MESOS-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933826#comment-14933826 ] Gilbert Song commented on MESOS-3534: - https://reviews.apache.org/r/38814/ > add test cases for sha256/sha512 digest verifier > > > Key: MESOS-3534 > URL: https://issues.apache.org/jira/browse/MESOS-3534 > Project: Mesos > Issue Type: Improvement >Reporter: Gilbert Song >Assignee: Gilbert Song > > Add test cases for the sha256/sha512 digest verifier that read from a file path > and verify against the corresponding digest string. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3534) add test cases for sha256/sha512 digest verifier
Gilbert Song created MESOS-3534: --- Summary: add test cases for sha256/sha512 digest verifier Key: MESOS-3534 URL: https://issues.apache.org/jira/browse/MESOS-3534 Project: Mesos Issue Type: Improvement Reporter: Gilbert Song Assignee: Gilbert Song Add test cases for the sha256/sha512 digest verifier that read from a file path and verify against the corresponding digest string. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2035) Add reason to containerizer proto Termination
[ https://issues.apache.org/jira/browse/MESOS-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2035: -- Sprint: Twitter Mesos Q3 Sprint 5, Twitter Mesos Q3 Sprint 6 (was: Twitter Mesos Q3 Sprint 5) > Add reason to containerizer proto Termination > - > > Key: MESOS-2035 > URL: https://issues.apache.org/jira/browse/MESOS-2035 > Project: Mesos > Issue Type: Improvement > Components: slave >Affects Versions: 0.21.0 >Reporter: Dominic Hamon >Assignee: Jie Yu > Labels: mesosphere > > When an isolator kills a task, the reason is unknown. As part of MESOS-1830, > the reason is set to a general one but ideally we would have the termination > reason to pass through to the status update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers
[ https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-1615: - Story Points: 8 (was: 5) > Create design document for Optimistic Offers > > > Key: MESOS-1615 > URL: https://issues.apache.org/jira/browse/MESOS-1615 > Project: Mesos > Issue Type: Documentation >Reporter: Dominic Hamon >Assignee: Joseph Wu > Labels: mesosphere > > As a first step toward Optimistic Offers, take the description from the epic > and build an implementation design doc that can be shared for comments. > Note: the links to the working group notes and design doc are located in the > [JIRA Epic|MESOS-1607]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-1607) Introduce optimistic offers.
[ https://issues.apache.org/jira/browse/MESOS-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan reassigned MESOS-1607: Assignee: Artem Harutyunyan > Introduce optimistic offers. > > > Key: MESOS-1607 > URL: https://issues.apache.org/jira/browse/MESOS-1607 > Project: Mesos > Issue Type: Epic > Components: allocation, framework, master >Reporter: Benjamin Hindman >Assignee: Artem Harutyunyan > Attachments: optimisitic-offers.pdf > > > The current implementation of resource offers only enables a single framework > scheduler to make scheduling decisions for some available resources at a > time. In some circumstances, this is good, i.e., when we don't want other > framework schedulers to have access to some resources. However, in other > circumstances, there are advantages to letting multiple framework schedulers > attempt to make scheduling decisions for the _same_ allocation of resources > in parallel. > If you think about this from a "concurrency control" perspective, the current > implementation of resource offers is _pessimistic_: the resources contained > within an offer are _locked_ until the framework scheduler that they were > offered to launches tasks with them or declines them. In addition to making > pessimistic offers, we'd like to give out _optimistic_ offers, where the same > resources are offered to multiple framework schedulers at the same time, and > framework schedulers "compete" for those resources on a > first-come-first-served basis (i.e., the first to launch a task "wins"). We've > always reserved the right to rescind resource offers using the 'rescind' > primitive in the API, and a framework scheduler should be prepared to launch > a task and have it go lost because another framework already started > to use those resources. > Introducing optimistic offers will enable more sophisticated allocation > algorithms. 
For example, we can optimistically allocate resources that are > reserved for a particular framework (role) but are not being used. In > conjunction with revocable resources (the concept that using resources not > reserved for you means you might get those resources revoked) we can easily > create a "spot" market for unused resources, driving up utilization by > letting frameworks that are willing to use revocable resources run tasks. > In the limit, one could imagine always making optimistic resource offers. > This bears a striking resemblance to the Google Omega model (an isomorphism > even). However, being able to configure which resources should be allocated > optimistically and which resources should be allocated pessimistically gives > even more control to a datacenter/cluster operator who might want to, for > example, never let multiple frameworks (roles) compete for some set of > resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3468) Improve apply_reviews.sh script to apply chain of reviews
[ https://issues.apache.org/jira/browse/MESOS-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-3468: - Labels: mesosphere (was: ) > Improve apply_reviews.sh script to apply chain of reviews > - > > Key: MESOS-3468 > URL: https://issues.apache.org/jira/browse/MESOS-3468 > Project: Mesos > Issue Type: Improvement >Reporter: Vinod Kone >Assignee: Artem Harutyunyan > Labels: mesosphere > > Currently the support/apply-review.sh script allows a user (typically a > committer) to apply a single review on top of HEAD. Since Mesos contributors > typically submit a chain of reviews for a given issue, it makes sense for the > script to apply the whole chain recursively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3519) Fix file descriptor leakage / double close in the code base
[ https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3519: -- Story Points: 3 > Fix file descriptor leakage / double close in the code base > --- > > Key: MESOS-3519 > URL: https://issues.apache.org/jira/browse/MESOS-3519 > Project: Mesos > Issue Type: Bug >Reporter: Chi Zhang >Assignee: Chi Zhang > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3520) Add an abstraction to manage the life cycle of file descriptors.
[ https://issues.apache.org/jira/browse/MESOS-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3520: -- Sprint: Twitter Mesos Q3 Sprint 6 > Add an abstraction to manage the life cycle of file descriptors. > > > Key: MESOS-3520 > URL: https://issues.apache.org/jira/browse/MESOS-3520 > Project: Mesos > Issue Type: Improvement > Components: stout >Reporter: Chi Zhang >Assignee: Chi Zhang > > In order to avoid missing {{close()}} calls on file descriptors, or > double-closing file descriptors, it would be nice to add a reference counted > {{FileDescriptor}} in a similar way to what we've done for Socket. This will > be closed automatically when the last reference goes away, and double closes > can be prevented via internal state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3468) Improve apply_reviews.sh script to apply chain of reviews
[ https://issues.apache.org/jira/browse/MESOS-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-3468: - Story Points: 5 > Improve apply_reviews.sh script to apply chain of reviews > - > > Key: MESOS-3468 > URL: https://issues.apache.org/jira/browse/MESOS-3468 > Project: Mesos > Issue Type: Improvement >Reporter: Vinod Kone >Assignee: Artem Harutyunyan > Labels: mesosphere > > Currently the support/apply-review.sh script allows a user (typically a > committer) to apply a single review on top of HEAD. Since Mesos contributors > typically submit a chain of reviews for a given issue, it makes sense for the > script to apply the whole chain recursively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3519) Fix file descriptor leakage / double close in the code base
[ https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3519: -- Sprint: Twitter Mesos Q3 Sprint 6 > Fix file descriptor leakage / double close in the code base > --- > > Key: MESOS-3519 > URL: https://issues.apache.org/jira/browse/MESOS-3519 > Project: Mesos > Issue Type: Bug >Reporter: Chi Zhang >Assignee: Chi Zhang > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3494) Add Test for Docker RemotePuller
[ https://issues.apache.org/jira/browse/MESOS-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933717#comment-14933717 ] Gilbert Song commented on MESOS-3494: - https://reviews.apache.org/r/38816/ > Add Test for Docker RemotePuller > > > Key: MESOS-3494 > URL: https://issues.apache.org/jira/browse/MESOS-3494 > Project: Mesos > Issue Type: Task >Reporter: Gilbert Song >Assignee: Gilbert Song > > Add unit test for Docker RemotePuller implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3533) Unable to find and run URIs files
[ https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933713#comment-14933713 ] Rafael Capucho commented on MESOS-3533: --- Docker Version: 1.8.2 > Unable to find and run URIs files > - > > Key: MESOS-3533 > URL: https://issues.apache.org/jira/browse/MESOS-3533 > Project: Mesos > Issue Type: Bug > Components: fetcher, general >Affects Versions: 0.25.0 > Environment: Linux li202-122 4.1.5-x86_64-linode61 #7 SMP Mon Aug 24 > 13:46:31 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux > Ubuntu 14.04.1 LTS > Docker Version: 1.8.2 > Docker API version: 1.20 > Go version: go1.4.2 >Reporter: Rafael Capucho >Priority: Blocker > > Hello, > Deploying a docker container using marathon 0.11 with the following structure > (just an example; I tried some variations with the same result): > { > "id": "testando-flask", > "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py", > "cpus": 0.5, > "mem": 20.0, > "container": { > "type": "DOCKER", > "docker": { > "image": "therealwardo/python-2.7-pip", > "network": "BRIDGE", > "privileged": true, > "portMappings": [ > { "containerPort": 31177, "hostPort": 0 } > ] > } > }, > "uris": [ > "http://blog.rafaelcapucho.com/app.zip" > ] > } > curl -X POST http://173.255.192.XXX:8080/v2/apps -d @flask.json -H > "Content-type: application/json" > The task reaches the mesos master properly but it fails. When I execute the > same structure without uris and with a simple "python -m SimpleHTTPServer" it > works! The docker container is created and running. 
> Analyzing the sandbox on Mesos UI I can see that the files of URIs are > download correctly, the project and the requirements.txt in stdout I got: > Archive: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.zip > inflating: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.py > > extracting: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/requeriments.txt > > --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff" > --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" > --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" > --mapped_directory="/mnt/mesos/sandbox" --quiet="false" > --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff" > --stop_timeout="0ns" > --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff" > --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" > --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" > --mapped_directory="/mnt/mesos/sandbox" --quiet="false" > --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff" > 
--stop_timeout="0ns" > Registered docker executor on li202-122.members.linode.com > Starting task testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb > Could not open requirements file: [Errno 2] No such file or directory: > 'requeriments.txt' > Storing complete log in /root/.pip/pip.log > total 68 > drwxr-xr-x 2 root root 4096 Jan 15 2015 bin > drwxr-xr-x 2 root root 4096 Apr 19 2012 boot > drwxr-xr-x 10 root root 13740 Sep 28 12:44 dev > drwxr-xr-x 46 root root 4096 Sep 28 12:44 etc > drwxr-xr-x 2 root root 4096 Apr 19 2012 home > drwxr-xr-x 11 root root 4096 Jan 15 2015 lib > drwxr-xr-x 2 root root 4096 Jan 15 2015 lib64 > drwxr-xr-x 2 root root 4096 Jan 15 2015 media > drwxr-xr-x 3 root root 4096 Sep 28 12:44 mnt > drwxr-xr-x 2 root root 4096 Jan 15 2015 opt > dr-xr-xr-x 170 root root 0 Sep 28 12:44 proc > drwx-- 3 root root 4096 Sep 28 12:44 root > drwxr-xr-x 5 root root 4096 Jan 15 2015 run > drwxr-xr-x 2 root root 4096 Jan 16 2015 sbin > drwxr-xr-x 2 root root 4096 Mar 5 2012 selinux > drwxr-xr-x 2 root root 4096 Jan 15 2015 srv > dr-xr-xr-x 13 root root 0
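The executor flags quoted above show the sandbox being mapped into the container at /mnt/mesos/sandbox, while the ls -l output lists the image's root directory, suggesting the cmd runs from / where the fetched files are not visible. A hedged workaround sketch (an assumption based only on the mapped_directory flag shown in this log, not a confirmed fix) would be to change into the mapped sandbox before using the fetched files:

```json
{
  "cmd": "cd /mnt/mesos/sandbox && pip install -r requeriments.txt && python app.py"
}
```

The filename requeriments.txt is kept as spelled in the original report, since that is the name the fetcher actually extracted.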
[jira] [Updated] (MESOS-3399) Rewrite perf events code
[ https://issues.apache.org/jira/browse/MESOS-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3399: -- Sprint: Twitter Mesos Q3 Sprint 5, Twitter Mesos Q3 Sprint 6 (was: Twitter Mesos Q3 Sprint 5) > Rewrite perf events code > > > Key: MESOS-3399 > URL: https://issues.apache.org/jira/browse/MESOS-3399 > Project: Mesos > Issue Type: Task >Reporter: Cong Wang >Assignee: Cong Wang >Priority: Minor > Labels: twitter > > Our current code base invokes and parses `perf stat`, which sucks, because > cmdline output is not a stable ABI at all, it can break our code at any time, > for example MESOS-2834. > We should use the stable API perf_event_open(2). With this patch > https://reviews.apache.org/r/37540/, we already have the infrastructure for > the implementation, so it should not be hard to rewrite all the perf events > code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3365) Export per container SNMP statistics
[ https://issues.apache.org/jira/browse/MESOS-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3365: -- Sprint: Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5, Twitter Mesos Q3 Sprint 6 (was: Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5) > Export per container SNMP statistics > > > Key: MESOS-3365 > URL: https://issues.apache.org/jira/browse/MESOS-3365 > Project: Mesos > Issue Type: Task >Reporter: Cong Wang >Assignee: Cong Wang >Priority: Minor > Labels: twitter > > We need to export the per container SNMP statistics too, from its > /proc/net/snmp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2769) Metric for cpu scheduling latency from all components
[ https://issues.apache.org/jira/browse/MESOS-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2769: -- Sprint: Twitter Q2 Sprint 3, Twitter Mesos Q3 Sprint 3, Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5, Twitter Mesos Q3 Sprint 6 (was: Twitter Q2 Sprint 3, Twitter Mesos Q3 Sprint 3, Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5) > Metric for cpu scheduling latency from all components > - > > Key: MESOS-2769 > URL: https://issues.apache.org/jira/browse/MESOS-2769 > Project: Mesos > Issue Type: Improvement > Components: isolation >Affects Versions: 0.22.1 >Reporter: Ian Downes >Assignee: Cong Wang > Labels: twitter > > The metric will provide statistics on the scheduling latency for > processes/threads in a container, i.e., statistics on the delay before > application code can run. This will be the aggregate effect of the normal > scheduling period, contention from other threads/processes, both in the > container and on the system, and any effects from the CFS bandwidth control > (if enabled) or other CPU isolation strategies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3332) Support HTTP Pipelining in libprocess (http::post)
[ https://issues.apache.org/jira/browse/MESOS-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3332: -- Sprint: Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5, Twitter Mesos Q3 Sprint 6 (was: Twitter Mesos Q3 Sprint 4, Twitter Mesos Q3 Sprint 5) > Support HTTP Pipelining in libprocess (http::post) > -- > > Key: MESOS-3332 > URL: https://issues.apache.org/jira/browse/MESOS-3332 > Project: Mesos > Issue Type: Task > Components: libprocess >Reporter: Anand Mazumdar >Assignee: Benjamin Mahler > Labels: twitter > > Currently, {{http::post}} in libprocess does not support HTTP pipelining. > Each call as of now sends the {{Connection: close}} header, thereby > signaling to the server to close the TCP socket after the response. > We either need to create a new interface for supporting HTTP pipelining, or > modify the existing {{http::post}} to do so. > This is needed for the Scheduler/Executor library implementations to make > sure "Calls" are sent in order to the master. Currently, in order to do so, > we send the next request only after we have received a response for an > earlier call, which results in degraded performance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3399) Rewrite perf events code
[ https://issues.apache.org/jira/browse/MESOS-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3399: -- Story Points: 5 > Rewrite perf events code > > > Key: MESOS-3399 > URL: https://issues.apache.org/jira/browse/MESOS-3399 > Project: Mesos > Issue Type: Task >Reporter: Cong Wang >Assignee: Cong Wang >Priority: Minor > Labels: twitter > > Our current code base invokes and parses `perf stat`, which sucks, because > cmdline output is not a stable ABI at all, it can break our code at any time, > for example MESOS-2834. > We should use the stable API perf_event_open(2). With this patch > https://reviews.apache.org/r/37540/, we already have the infrastructure for > the implementation, so it should not be hard to rewrite all the perf events > code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3421) Support sharing persistent volumes across task instances
[ https://issues.apache.org/jira/browse/MESOS-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933692#comment-14933692 ] Jie Yu commented on MESOS-3421: --- Glad to see the discussion here! I agree with [~adam-mesos] that the concept of "shared" resources should be applicable to other types of resources as well. Here is a list of issues I can think of that we need to address: 1) ownership? For instance, for a disk resource, which executor/task has the write permission (i.e., 1 writer + multiple readers)? 2) what if the limit of the resource has been reached? Do we kill all tasks using it? Or just the owner? 3) reference counting? Do we need to track how many tasks/executors are still using the resource so that it cannot be released? 4) permission (e.g., group/owner)? This is specific to disk resources. We definitely need to change the allocator accordingly so that 'shared' resources can be allocated to multiple frameworks concurrently. > Support sharing persistent volumes across task instances > > > Key: MESOS-3421 > URL: https://issues.apache.org/jira/browse/MESOS-3421 > Project: Mesos > Issue Type: Improvement > Components: general >Affects Versions: 0.23.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha > > A service that needs a persistent volume needs to have access to the same > persistent volume (RW) from multiple task instances on the same agent > node. Currently, a persistent volume, once offered to the framework(s), can be > scheduled to a task, and until that task terminates, that persistent volume > cannot be used by another task. > Explore providing the capability of sharing persistent volumes across task > instances scheduled on a single agent node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3391) Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
[ https://issues.apache.org/jira/browse/MESOS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933690#comment-14933690 ] Chris Chen commented on MESOS-3391: --- This will soon break in our production environment. This is an interesting point to make since ZK 3.4.6 has been out for a little while and mesos is still pinned to 3.4.5. > Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution > -- > > Key: MESOS-3391 > URL: https://issues.apache.org/jira/browse/MESOS-3391 > Project: Mesos > Issue Type: Bug > Components: general > Environment: Linux, OS X >Reporter: Chris Chen >Assignee: Chris Chen > > The Zookeeper C client makes certain assertions about the ordering of > ping packets that the Java client does not. An alternate implementation of > the Zookeeper server would then break the C client while working correctly > with the Java client. > A patch has been submitted to the Zookeeper project under ZOOKEEPER-2253. > This adds that patch to mesos 3rdparty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3391) Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
[ https://issues.apache.org/jira/browse/MESOS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933679#comment-14933679 ] Neil Conway commented on MESOS-3391: Is there a reason we need to apply this patch now, rather than just updating to the next upstream ZK release, which includes the change? > Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution > -- > > Key: MESOS-3391 > URL: https://issues.apache.org/jira/browse/MESOS-3391 > Project: Mesos > Issue Type: Bug > Components: general > Environment: Linux, OS X >Reporter: Chris Chen >Assignee: Chris Chen > > The Zookeeper C client makes certain assertions about the ordering of > ping packets that the Java client does not. An alternate implementation of > the Zookeeper server would then break the C client while working correctly > with the Java client. > A patch has been submitted to the Zookeeper project under ZOOKEEPER-2253. > This adds that patch to mesos 3rdparty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3533) Unable to find and run URIs files
[ https://issues.apache.org/jira/browse/MESOS-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933675#comment-14933675 ] haosdent commented on MESOS-3533: - What is the docker version you use? > Unable to find and run URIs files > - > > Key: MESOS-3533 > URL: https://issues.apache.org/jira/browse/MESOS-3533 > Project: Mesos > Issue Type: Bug > Components: fetcher, general >Affects Versions: 0.25.0 > Environment: Linux li202-122 4.1.5-x86_64-linode61 #7 SMP Mon Aug 24 > 13:46:31 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux > Ubuntu 14.04.1 LTS > Docker Version: 1.8.2 > Docker API version: 1.20 > Go version: go1.4.2 >Reporter: Rafael Capucho >Priority: Blocker > > Hello, > Deploying a docker container using marathon 0.11 with the following structure > (just an example; I tried some variations with the same result): > { > "id": "testando-flask", > "cmd": "ls -l; pip install -r requeriments.txt; ls -l; python app.py", > "cpus": 0.5, > "mem": 20.0, > "container": { > "type": "DOCKER", > "docker": { > "image": "therealwardo/python-2.7-pip", > "network": "BRIDGE", > "privileged": true, > "portMappings": [ > { "containerPort": 31177, "hostPort": 0 } > ] > } > }, > "uris": [ > "http://blog.rafaelcapucho.com/app.zip" > ] > } > curl -X POST http://173.255.192.XXX:8080/v2/apps -d @flask.json -H > "Content-type: application/json" > The task reaches the mesos master properly but it fails. When I execute the > same structure without uris and with a simple "python -m SimpleHTTPServer" it > works! The docker container is created and running. 
> Analyzing the sandbox on Mesos UI I can see that the files of URIs are > download correctly, the project and the requirements.txt in stdout I got: > Archive: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.zip > inflating: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/app.py > > extracting: > /tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff/requeriments.txt > > --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff" > --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" > --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" > --mapped_directory="/mnt/mesos/sandbox" --quiet="false" > --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff" > --stop_timeout="0ns" > --container="mesos-fe42c404-7266-462b-adf5-549311bfbf32-S37.28e2dbd9-fa10-4d96-baec-0c89868237ff" > --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" > --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" > --mapped_directory="/mnt/mesos/sandbox" --quiet="false" > --sandbox_directory="/tmp/mesos/slaves/fe42c404-7266-462b-adf5-549311bfbf32-S37/frameworks/fe42c404-7266-462b-adf5-549311bfbf32-/executors/testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb/runs/28e2dbd9-fa10-4d96-baec-0c89868237ff" > 
--stop_timeout="0ns" > Registered docker executor on li202-122.members.linode.com > Starting task testando-flask.a5ef5aad-65de-11e5-8b11-0242ac1101eb > Could not open requirements file: [Errno 2] No such file or directory: > 'requeriments.txt' > Storing complete log in /root/.pip/pip.log > total 68 > drwxr-xr-x 2 root root 4096 Jan 15 2015 bin > drwxr-xr-x 2 root root 4096 Apr 19 2012 boot > drwxr-xr-x 10 root root 13740 Sep 28 12:44 dev > drwxr-xr-x 46 root root 4096 Sep 28 12:44 etc > drwxr-xr-x 2 root root 4096 Apr 19 2012 home > drwxr-xr-x 11 root root 4096 Jan 15 2015 lib > drwxr-xr-x 2 root root 4096 Jan 15 2015 lib64 > drwxr-xr-x 2 root root 4096 Jan 15 2015 media > drwxr-xr-x 3 root root 4096 Sep 28 12:44 mnt > drwxr-xr-x 2 root root 4096 Jan 15 2015 opt > dr-xr-xr-x 170 root root 0 Sep 28 12:44 proc > drwx-- 3 root root 4096 Sep 28 12:44 root > drwxr-xr-x 5 root root 4096 Jan 15 2015 run > drwxr-xr-x 2 root root 4096 Jan 16 2015 sbin > drwxr-xr-x 2 root root 4096 Mar 5 2012 selinux > drwxr-xr-x 2 root root 4096 Jan 15 2015 srv > dr-xr-xr-x 13 root root
[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers
[ https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-1615: - Description: As a first step toward Optimistic Offers, take the description from the epic and build an implementation design doc that can be shared for comments. Note: the links to the working group notes and design doc are located in the [JIRA Epic|MESOS-1607]. was:As a first step toward Optimistic Offers, take the description from the epic and build an implementation design doc that can be shared for comments. > Create design document for Optimistic Offers > > > Key: MESOS-1615 > URL: https://issues.apache.org/jira/browse/MESOS-1615 > Project: Mesos > Issue Type: Documentation >Reporter: Dominic Hamon >Assignee: Joseph Wu > Labels: mesosphere > > As a first step toward Optimistic Offers, take the description from the epic > and build an implementation design doc that can be shared for comments. > Note: the links to the working group notes and design doc are located in the > [JIRA Epic|MESOS-1607]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers
[ https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-1615: - Story Points: 5 > Create design document for Optimistic Offers > > > Key: MESOS-1615 > URL: https://issues.apache.org/jira/browse/MESOS-1615 > Project: Mesos > Issue Type: Documentation >Reporter: Dominic Hamon >Assignee: Joseph Wu > Labels: mesosphere > > As a first step toward Optimistic Offers, take the description from the epic > and build an implementation design doc that can be shared for comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1615) Create design document for Optimistic Offers
[ https://issues.apache.org/jira/browse/MESOS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-1615: - Labels: mesosphere (was: ) > Create design document for Optimistic Offers > > > Key: MESOS-1615 > URL: https://issues.apache.org/jira/browse/MESOS-1615 > Project: Mesos > Issue Type: Documentation >Reporter: Dominic Hamon >Assignee: Joseph Wu > Labels: mesosphere > > As a first step toward Optimistic Offers, take the description from the epic > and build an implementation design doc that can be shared for comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3391) Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution
[ https://issues.apache.org/jira/browse/MESOS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Chen reassigned MESOS-3391: - Assignee: Chris Chen > Include patch for ZOOKEEPER-2253 for built-in Zookeeper 3.4.5 distribution > -- > > Key: MESOS-3391 > URL: https://issues.apache.org/jira/browse/MESOS-3391 > Project: Mesos > Issue Type: Bug > Components: general > Environment: Linux, OS X >Reporter: Chris Chen >Assignee: Chris Chen > > The Zookeeper C client makes certain assertions about the ordering of > ping packets that the Java client does not. An alternate implementation of > the Zookeeper server would then break the C client while working correctly > with the Java client. > A patch has been submitted to the Zookeeper project under ZOOKEEPER-2253. > This adds that patch to mesos 3rdparty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)