[jira] [Comment Edited] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233297#comment-15233297 ] wangqun edited comment on MESOS-5148 at 4/9/16 3:11 AM:

[~haosd...@gmail.com] I have pasted the slave and master logs; please check them. I am not using the Docker containerizer, so does that mean I won't be able to reach the container with the Docker client via the command "sudo docker run -ti --net=host redis redis-cli"? Thanks.

> Supporting Container Images in Mesos Containerizer doesn't work by using
> marathon api
> -
>
> Key: MESOS-5148
> URL: https://issues.apache.org/jira/browse/MESOS-5148
> Project: Mesos
> Issue Type: Bug
> Reporter: wangqun
>
> Hi
> I am using the Marathon API to create tasks to test container image support
> in the Mesos containerizer.
> My steps are the following:
> 1) Run the master process on the master node:
> sudo /usr/sbin/mesos-master --zk=zk://10.0.0.4:2181/mesos --port=5050
> --log_dir=/var/log/mesos --cluster=mesosbay --hostname=10.0.0.4 --ip=10.0.0.4
> --quorum=1 --work_dir=/var/lib/mesos
> 2) Run the slave process on the slave node:
> sudo /usr/sbin/mesos-slave --master=zk://10.0.0.4:2181/mesos
> --log_dir=/var/log/mesos --containerizers=docker,mesos
> --executor_registration_timeout=5mins --hostname=10.0.0.5 --ip=10.0.0.5
> --isolation=docker/runtime,filesystem/linux --work_dir=/tmp/mesos/slave
> --image_providers=docker --executor_environment_variables="{}"
> 3) Create a JSON file specifying the container to be managed by Mesos:
> sudo touch mesos.json
> sudo vim mesos.json
> {
>   "container": {
>     "type": "MESOS",
>     "docker": {
>       "image": "library/redis"
>     }
>   },
>   "id": "ubuntumesos",
>   "instances": 1,
>   "cpus": 0.5,
>   "mem": 512,
>   "uris": [],
>   "cmd": "ping 8.8.8.8"
> }
> 4) sudo curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d...@mesos.json
> 5) sudo curl http://localhost:8080/v2/tasks
> {"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]}
> 6) sudo docker run -ti --net=host redis redis-cli
> Could not connect to Redis at 127.0.0.1:6379: Connection refused
> not connected>
> 7)
> I0409 01:43:48.774868 3492 slave.cpp:3886] Executor 'ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' of framework ffb72d7c-dd63-4c30-abea-bb746ab2c326- exited with status 0
> I0409 01:43:48.781307 3492 slave.cpp:3990] Cleaning up executor 'ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' of framework ffb72d7c-dd63-4c30-abea-bb746ab2c326- at executor(1)@10.0.0.5:60134
> I0409 01:43:48.808364 3492 slave.cpp:4078] Cleaning up framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
> I0409 01:43:48.811336 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce/runs/24d0872d-1ba1-4384-be11-a20c82893ea4' for gc 6.9070953778days in the future
> I0409 01:43:48.817401 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' for gc 6.9065992889days in the future
> I0409 01:43:48.823158 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce/runs/24d0872d-1ba1-4384-be11-a20c82893ea4' for gc 6.9065273185days in the future
> I0409 01:43:48.826216 3491 status_update_manager.cpp:282] Closing status update streams for framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
> I0409 01:43:48.835602 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' for gc 6.9064716444days in the future
> I0409 01:43:48.838580 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' for gc 6.9041064889days in the future
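A task launched with the Mesos containerizer is not created through the Docker daemon, so the Docker client cannot see it; what can be inspected directly is the Marathon /v2/tasks payload from step 5. A minimal sketch of pulling the task id and agent host out of that response without jq (the /tmp paths are illustrative; in practice the file would come from `curl -s http://localhost:8080/v2/tasks`):

```shell
# The /v2/tasks response from step 5, saved locally for inspection.
cat > /tmp/tasks.json <<'EOF'
{"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]}
EOF

# Extract the agent host and task id with grep -o (no jq dependency).
task_host=$(grep -o '"host":"[^"]*"' /tmp/tasks.json | cut -d'"' -f4)
task_id=$(grep -o '"id":"[^"]*"' /tmp/tasks.json | cut -d'"' -f4)
echo "task $task_id is running on agent $task_host"
```

Whether the container is actually up then has to be checked on that agent (for example via the agent's web UI on its default port 5051, or by looking for the task's process in `ps` on 10.0.0.5), since `docker ps` only lists containers created by the Docker daemon.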
[jira] [Comment Edited] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233293#comment-15233293 ] wangqun edited comment on MESOS-5148 at 4/9/16 3:10 AM:

[~tanderegg] Thank you for pointing out the mistake. I have modified the mesos.json above as you suggested:

$ sudo vim mesos.json
{ "container": { "type": "MESOS", "docker": { "image": "library/redis" } }, "id": "ubuntumesos", "instances": 1, "cpus": 0.5, "mem": 512, "uris": [], "cmd": "ping 8.8.8.8" }

I tested again with the command "sudo docker run -ti --net=host redis redis-cli" and it still cannot connect. I want to know whether my test method is wrong; I don't know how to verify that the container was created successfully. I only ran that command because of https://github.com/apache/mesos/blob/master/docs/container-image.md#test-it-out. Since I am using the Mesos containerizer, should I expect not to be able to reach the container through the Docker client with that command? I have pasted the master and slave logs. Please check them. Thanks.
[jira] [Comment Edited] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233298#comment-15233298 ] wangqun edited comment on MESOS-5148 at 4/9/16 3:09 AM:

[~kaysoky] I have modified the mesos.json above:

$ sudo vim mesos.json
{ "container": { "type": "MESOS", "docker": { "image": "library/redis" } }, "id": "ubuntumesos", "instances": 1, "cpus": 0.5, "mem": 512, "uris": [], "cmd": "ping 8.8.8.8" }

I tested again with the command "sudo docker run -ti --net=host redis redis-cli" and it still cannot connect. I want to know whether my test method is wrong; I don't know how to verify that the container was created successfully. I only ran that command because of https://github.com/apache/mesos/blob/master/docs/container-image.md#test-it-out. Since I am using the Mesos containerizer, should I expect not to be able to reach the container through the Docker client with that command? I have pasted the master and slave logs. Please check them. Thanks.
[jira] [Commented] (MESOS-1837) failed to determine cgroup for the 'cpu' subsystem
[ https://issues.apache.org/jira/browse/MESOS-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233312#comment-15233312 ] Proton commented on MESOS-1837:
---
17 22 0:16 / /sys rw,nosuid,nodev,noexec,relatime - sysfs sysfs rw
18 22 0:4 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
19 22 0:6 / /dev rw,relatime - devtmpfs udev rw,size=32962416k,nr_inodes=8240604,mode=755
20 19 0:13 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=000
21 22 0:17 / /run rw,nosuid,noexec,relatime - tmpfs tmpfs rw,size=6594632k,mode=755
22 0 8:1 / / rw,relatime - ext4 /dev/sda1 rw,data=ordered
24 17 0:18 / /sys/fs/cgroup rw,relatime - tmpfs none rw,size=4k,mode=755
25 17 0:19 / /sys/fs/fuse/connections rw,relatime - fusectl none rw
26 17 0:7 / /sys/kernel/debug rw,relatime - debugfs none rw
27 17 0:11 / /sys/kernel/security rw,relatime - securityfs none rw
28 21 0:20 / /run/lock rw,nosuid,nodev,noexec,relatime - tmpfs none rw,size=5120k
29 21 0:21 / /run/shm rw,nosuid,nodev,relatime - tmpfs none rw
30 21 0:22 / /run/user rw,nosuid,nodev,noexec,relatime - tmpfs none rw,size=102400k,mode=755
31 17 0:23 / /sys/fs/pstore rw,relatime - pstore none rw
40 24 0:26 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
41 24 0:27 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
42 24 0:28 / /sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
43 24 0:29 / /sys/fs/cgroup/memory rw,relatime - cgroup cgroup rw,memory
44 24 0:30 / /sys/fs/cgroup/devices rw,relatime - cgroup cgroup rw,devices
45 24 0:31 / /sys/fs/cgroup/freezer rw,relatime - cgroup cgroup rw,freezer
46 24 0:32 / /sys/fs/cgroup/net_cls rw,relatime - cgroup cgroup rw,net_cls
47 24 0:33 / /sys/fs/cgroup/blkio rw,relatime - cgroup cgroup rw,blkio
48 24 0:34 / /sys/fs/cgroup/perf_event rw,relatime - cgroup cgroup rw,perf_event
49 24 0:35 / /sys/fs/cgroup/net_prio rw,relatime - cgroup cgroup rw,net_prio
50 24 0:36 / /sys/fs/cgroup/hugetlb rw,relatime - cgroup cgroup rw,hugetlb
51 21 0:37 / /run/rpc_pipefs rw,relatime - rpc_pipefs rpc_pipefs rw
53 24 0:38 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup systemd rw,name=systemd
56 22 8:1 /var/lib/docker/aufs /var/lib/docker/aufs rw,relatime - ext4 /dev/sda1 rw,data=ordered
32 22 8:16 / /mnt rw,relatime - ext4 /dev/sdb rw,data=ordered
79 32 0:70 / /mnt/avos/data/cloud-code-redis rw,relatime - nfs4 docker-test:/mnt/avos/data/cloud-code-redis rw,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=10.10.92.217,local_lock=none,addr=10.10.12.239
36 22 0:102 / /storage rw,noatime - fuse mfs#mfsmaster:9421 rw,user_id=0,group_id=0,default_permissions,allow_other
81 32 0:96 / /mnt/avos/hdfs_exporter rw,relatime - fuse.fuse_dfs fuse_dfs rw,user_id=0,group_id=0,default_permissions,allow_other
68 56 0:68 / /var/lib/docker/aufs/mnt/fd5ac39765d8c8241db36dfad9507232b9602ba6eb29aa99e8266ba0336096d9 rw,relatime - aufs none rw,si=8b7664bb2270a0ee
61 56 0:50 / /var/lib/docker/aufs/mnt/c604b9eace4f12044074e02049db2589f5fc3f603683ded21927ece3e6c54820 rw,relatime - aufs none rw,si=8b7664bb3cac90ee
77 56 0:71 / /var/lib/docker/aufs/mnt/4d9b6ce48fee75778a9e678dc86c6750d54d966ff7e6c2a3a3320ff81ef75ea9 rw,relatime - aufs none rw,si=8b7664b93e9570ee
69 56 0:82 / /var/lib/docker/aufs/mnt/b378cbca6f4da4bb35ea2924b4b56d1e465fc71dfa7ed44ce9485758eb926c0f rw,relatime - aufs none rw,si=8b7664bb2bb4a0ee
39 56 0:56 / /var/lib/docker/aufs/mnt/12e53119bcccba5c75dd2ccc856cd1e8b07f5cb8d69a40bced0eb2be094e1074 rw,relatime - aufs none rw,si=8b7664bb2653d0ee
33 56 0:25 / /var/lib/docker/aufs/mnt/d9b98ca0bf09d135975d80d8f219782a3c41f60e3c788e98741ca8606d816459 rw,relatime - aufs none rw,si=8b7664bb2ba6e0ee
35 56 0:44 / /var/lib/docker/aufs/mnt/8538c8409717ba9c7b740a9eb8a8b03b751827f02a93fab39702eaf1b85c66e0 rw,relatime - aufs none rw,si=8b7664bb2fb2f0ee
110 56 0:94 / /var/lib/docker/aufs/mnt/43314ccd1cab72b9343ab227aeaefcf13f62fab6c2ccff6328d66cf20bd1755c rw,relatime - aufs none rw,si=8b7664b26bc9a0ee
52 56 0:62 / /var/lib/docker/aufs/mnt/94c2cc960bb532ac1642f8e4c7c16073c009b0356a4c50185af295bde01bb0dd rw,relatime - aufs none rw,si=8b7664b26bc9f0ee
54 56 0:66 / /var/lib/docker/aufs/mnt/0bd4d7e3d905484fc90e0137449ed7bb7d5ee60a24b5c098387d49a9a1489e85 rw,relatime - aufs none rw,si=8b7664ba4715c0ee
85 56 0:67 / /var/lib/docker/aufs/mnt/19b2384edb77ed0ea660b0ed4b33d81662eeda6d6f06ca148c8bfbafe79a5815 rw,relatime - aufs none rw,si=8b7664ba1a9600ee
89 56 0:69 / /var/lib/docker/aufs/mnt/70684386a9a1c6a3fe4010d695b5441349380232d8946299f5363bd6616b6426 rw,relatime - aufs none rw,si=8b7664bb255820ee
103 56 0:76 / /var/lib/docker/aufs/mnt/933e54edf8761d64270f1ab4318d8b4aac83270a0156fed2139692f57a433e36 rw,relatime - aufs none rw,si=8b7664bb255850ee
154 56 0:79 / /var/lib/docker/aufs/mnt/db3a2624c1023bb4728ddb209f6bf128347f1b20b825dccfc172682a1f6c9e52
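The MESOS-1837 error ("failed to determine cgroup for the 'cpu' subsystem") amounts to locating the cgroup mount for a given subsystem in a mountinfo dump like the one above. A minimal sketch of that lookup over a few lines of the paste (the inlined /tmp file is illustrative; real code would read /proc/self/mountinfo, whose fields after the " - " separator are filesystem type, source, and super options):

```shell
# A subset of the mountinfo paste above, saved locally for the lookup.
cat > /tmp/mountinfo <<'EOF'
24 17 0:18 / /sys/fs/cgroup rw,relatime - tmpfs none rw,size=4k,mode=755
40 24 0:26 / /sys/fs/cgroup/cpuset rw,relatime - cgroup cgroup rw,cpuset
41 24 0:27 / /sys/fs/cgroup/cpu rw,relatime - cgroup cgroup rw,cpu
42 24 0:28 / /sys/fs/cgroup/cpuacct rw,relatime - cgroup cgroup rw,cpuacct
EOF

# Find the mount point (field 5) of a cgroup filesystem whose super
# options contain the bare "cpu" flag. Optional fields may appear
# before the " - " separator, so scan forward for it on each line.
cpu_mount=$(awk '{
  sep = 0
  for (i = 7; i <= NF; i++) if ($i == "-") { sep = i; break }
  if (sep == 0) next
  fstype = $(sep + 1); super_opts = $(sep + 3)
  if (fstype == "cgroup" && ("," super_opts ",") ~ /,cpu,/) print $5
}' /tmp/mountinfo)
echo "$cpu_mount"
```

Note how the comma-delimited match keeps `rw,cpuset` and `rw,cpuacct` from falsely matching the `cpu` subsystem; against the sample above this prints `/sys/fs/cgroup/cpu`.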
[jira] [Issue Comment Deleted] (MESOS-5131) DRF allocator crashes master with CHECK when resource is incorrect
[ https://issues.apache.org/jira/browse/MESOS-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guangya Liu updated MESOS-5131:
---
Comment: was deleted
(was: One question: how does the resource estimator report non-revocable resources to the master? It seems that at https://github.com/apache/mesos/blob/18f60da868b07885fc2c29b4494054dd9bc871a6/src/slave/slave.cpp#L4980 the agent only checks the revocable resources from the resource estimator.)

> DRF allocator crashes master with CHECK when resource is incorrect
> --
>
> Key: MESOS-5131
> URL: https://issues.apache.org/jira/browse/MESOS-5131
> Project: Mesos
> Issue Type: Bug
> Components: allocation, oversubscription
> Reporter: Zhitao Li
> Assignee: Zhitao Li
> Priority: Critical
>
> We were testing a custom resource estimator which broadcasts oversubscribed
> resources, but they are not marked as "revocable".
> This unfortunately triggered the following check in the hierarchical allocator:
> {quote}
> void HierarchicalAllocatorProcess::updateSlave(
> // Check that all the oversubscribed resources are revocable.
> CHECK_EQ(oversubscribed, oversubscribed.revocable());
> {quote}
> This definitely shouldn't happen in a production cluster. IMO, we should do
> both of the following:
> 1. Make sure an incorrect resource is not sent from the agent (even crashing
> the agent process would be better);
> 2. Decline agent registration if its resources are incorrect, or even tell it
> to shut down, and possibly remove this check.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5131) DRF allocator crashes master with CHECK when resource is incorrect
[ https://issues.apache.org/jira/browse/MESOS-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233311#comment-15233311 ] Guangya Liu commented on MESOS-5131:

One question: how does the resource estimator report non-revocable resources to the master? It seems that at https://github.com/apache/mesos/blob/18f60da868b07885fc2c29b4494054dd9bc871a6/src/slave/slave.cpp#L4980 the agent only checks the revocable resources from the resource estimator.
[jira] [Commented] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233306#comment-15233306 ] wangqun commented on MESOS-5148:

[~gyliu] I have pasted my logs. I want to know how to test the container to verify that it was created successfully. Since I am using the Mesos containerizer, should I expect not to be able to reach the container through the Docker client with the command "sudo docker run -ti --net=host redis redis-cli"? Please check the logs. Thanks.
[jira] [Commented] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233298#comment-15233298 ] wangqun commented on MESOS-5148:

@Joseph Wu I have modified the mesos.json above:

$ sudo vim mesos.json
{ "container": { "type": "MESOS", "docker": { "image": "library/redis" } }, "id": "ubuntumesos", "instances": 1, "cpus": 0.5, "mem": 512, "uris": [], "cmd": "ping 8.8.8.8" }

I tested again with the command "sudo docker run -ti --net=host redis redis-cli" and it still cannot connect. I want to know whether my test method is wrong; I don't know how to verify that the container was created successfully. I only ran that command because of https://github.com/apache/mesos/blob/master/docs/container-image.md#test-it-out. Since I am using the Mesos containerizer, should I expect not to be able to reach the container through the Docker client with that command? I have pasted the master and slave logs. Please check them. Thanks.
[jira] [Comment Edited] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233293#comment-15233293 ] wangqun edited comment on MESOS-5148 at 4/9/16 2:14 AM:

@Tim Anderegg Thank you for pointing out the mistake. I have modified mesos.json as you suggested:

$ sudo vim mesos.json
{
  "container": {
    "type": "MESOS",
    "docker": {
      "image": "library/redis"
    }
  },
  "id": "ubuntumesos",
  "instances": 1,
  "cpus": 0.5,
  "mem": 512,
  "uris": [],
  "cmd": "ping 8.8.8.8"
}

I tested again with "sudo docker run -ti --net=host redis redis-cli", but it still cannot connect. I want to know whether my test method is wrong; I don't know how to verify that the container was created successfully. I only ran that command because https://github.com/apache/mesos/blob/master/docs/container-image.md#test-it-out says to. Since I am using the Mesos containerizer, should I expect to be unable to reach the container through the Docker client with that command? I have pasted the master log and slave log. Please check them. Thanks.
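For readers following along, the corrected app definition can be sanity-checked locally before POSTing it to Marathon. This is a hypothetical lint sketch (the field names are copied from the mesos.json above; `lint_mesos_app` is not a Marathon API, and Marathon performs its own validation server-side):

```python
import json

# App definition copied from the corrected mesos.json above.
app = {
    "container": {
        "type": "MESOS",
        "docker": {"image": "library/redis"},
    },
    "id": "ubuntumesos",
    "instances": 1,
    "cpus": 0.5,
    "mem": 512,
    "uris": [],
    "cmd": "ping 8.8.8.8",
}

def lint_mesos_app(app):
    """Check the two fields this issue tripped over; return the JSON payload.

    Assumption from this thread: with the Mesos containerizer the container
    type must be "MESOS" and the image sits under container.docker.image
    (not container.mesos).
    """
    container = app["container"]
    if container.get("type") != "MESOS":
        raise ValueError('container.type must be "MESOS"')
    if "image" not in container.get("docker", {}):
        raise ValueError("expected an image under container.docker.image")
    return json.dumps(app)

payload = lint_mesos_app(app)
```

The resulting payload is what step 4's curl command sends to localhost:8080/v2/apps.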
[jira] [Updated] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangqun updated MESOS-5148:
---
Description:

Hi,
I use the Marathon API to create tasks to test Supporting Container Images in the Mesos Containerizer. My steps are the following:

1) Run the master process on the master node:
sudo /usr/sbin/mesos-master --zk=zk://10.0.0.4:2181/mesos --port=5050 --log_dir=/var/log/mesos --cluster=mesosbay --hostname=10.0.0.4 --ip=10.0.0.4 --quorum=1 --work_dir=/var/lib/mesos

2) Run the slave process on the slave node:
sudo /usr/sbin/mesos-slave --master=zk://10.0.0.4:2181/mesos --log_dir=/var/log/mesos --containerizers=docker,mesos --executor_registration_timeout=5mins --hostname=10.0.0.5 --ip=10.0.0.5 --isolation=docker/runtime,filesystem/linux --work_dir=/tmp/mesos/slave --image_providers=docker --executor_environment_variables="{}"

3) Create a JSON file to specify the container to be managed by Mesos:
sudo touch mesos.json
sudo vim mesos.json
{
  "container": {
    "type": "MESOS",
    "docker": {
      "image": "library/redis"
    }
  },
  "id": "ubuntumesos",
  "instances": 1,
  "cpus": 0.5,
  "mem": 512,
  "uris": [],
  "cmd": "ping 8.8.8.8"
}

4) sudo curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d...@mesos.json

5) sudo curl http://localhost:8080/v2/tasks
{"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]}

6) sudo docker run -ti --net=host redis redis-cli
Could not connect to Redis at 127.0.0.1:6379: Connection refused
not connected>

7) Slave log:
I0409 01:43:48.774868 3492 slave.cpp:3886] Executor 'ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' of framework ffb72d7c-dd63-4c30-abea-bb746ab2c326- exited with status 0
I0409 01:43:48.781307 3492 slave.cpp:3990] Cleaning up executor 'ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' of framework ffb72d7c-dd63-4c30-abea-bb746ab2c326- at executor(1)@10.0.0.5:60134
I0409 01:43:48.808364 3492 slave.cpp:4078] Cleaning up framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:43:48.811336 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce/runs/24d0872d-1ba1-4384-be11-a20c82893ea4' for gc 6.9070953778days in the future
I0409 01:43:48.817401 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' for gc 6.9065992889days in the future
I0409 01:43:48.823158 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce/runs/24d0872d-1ba1-4384-be11-a20c82893ea4' for gc 6.9065273185days in the future
I0409 01:43:48.826216 3491 status_update_manager.cpp:282] Closing status update streams for framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:43:48.835602 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' for gc 6.9064716444days in the future
I0409 01:43:48.838580 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' for gc 6.9041064889days in the future
I0409 01:43:48.844699 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' for gc 6.902654163days in the future
I0409 01:44:01.623440 3494 slave.cpp:4374] Current disk usage 27.10%. Max allowed age: 4.403153217546436days
I0409 01:44:32.339310 3494 slave.cpp:1361] Got assigned task ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce for framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:44:32.451300 3489 gc.cpp:83] Unscheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' from gc
I0409 01:44:32.459689 3491 gc.cpp:83] Unscheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' from gc
I0409 01:44:32.465939 3494 slave.cpp:1480] Launching task ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce for framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:44:32.508301 3494 paths.cpp:528] Trying to chown
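The /v2/tasks response in step 5 already says where the task landed. A small sketch (assuming only the response format shown above) that pulls out the host/port pairs a client would need instead of 127.0.0.1:6379:

```python
import json

# Response body from step 5 (`sudo curl http://localhost:8080/v2/tasks`).
response = (
    '{"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967",'
    '"host":"10.0.0.5","ipAddresses":[],"ports":[31597],'
    '"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z",'
    '"version":"2016-04-07T09:06:14.354Z",'
    '"slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1",'
    '"appId":"/ubuntumesos","servicePorts":[1]}]}'
)

def task_endpoints(raw):
    """Return (host, port) pairs for every task in a Marathon /v2/tasks reply."""
    return [
        (task["host"], port)
        for task in json.loads(raw)["tasks"]
        for port in task["ports"]
    ]

print(task_endpoints(response))  # [('10.0.0.5', 31597)]
```

Note that the port Marathon allocated (31597) lives on the slave 10.0.0.5, which is one reason a redis-cli aimed at 127.0.0.1:6379 on the test machine would not see the task.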
[jira] [Issue Comment Deleted] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangqun updated MESOS-5148:
---
Comment: was deleted

(was: @haosdent This is slave info. Please help to check it. Thanks.

I0409 01:43:48.774868 3492 slave.cpp:3886] Executor 'ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' of framework ffb72d7c-dd63-4c30-abea-bb746ab2c326- exited with status 0
I0409 01:43:48.781307 3492 slave.cpp:3990] Cleaning up executor 'ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' of framework ffb72d7c-dd63-4c30-abea-bb746ab2c326- at executor(1)@10.0.0.5:60134
I0409 01:43:48.808364 3492 slave.cpp:4078] Cleaning up framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:43:48.811336 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce/runs/24d0872d-1ba1-4384-be11-a20c82893ea4' for gc 6.9070953778days in the future
I0409 01:43:48.817401 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' for gc 6.9065992889days in the future
I0409 01:43:48.823158 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce/runs/24d0872d-1ba1-4384-be11-a20c82893ea4' for gc 6.9065273185days in the future
I0409 01:43:48.826216 3491 status_update_manager.cpp:282] Closing status update streams for framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:43:48.835602 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.a0b45838-fdf0-11e5-8b4b-0242e2dedfce' for gc 6.9064716444days in the future
I0409 01:43:48.838580 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' for gc 6.9041064889days in the future
I0409 01:43:48.844699 3493 gc.cpp:55] Scheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' for gc 6.902654163days in the future
I0409 01:44:01.623440 3494 slave.cpp:4374] Current disk usage 27.10%. Max allowed age: 4.403153217546436days
I0409 01:44:32.339310 3494 slave.cpp:1361] Got assigned task ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce for framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:44:32.451300 3489 gc.cpp:83] Unscheduling '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' from gc
I0409 01:44:32.459689 3491 gc.cpp:83] Unscheduling '/tmp/mesos/slave/meta/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-' from gc
I0409 01:44:32.465939 3494 slave.cpp:1480] Launching task ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce for framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:44:32.508301 3494 paths.cpp:528] Trying to chown '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce/runs/5d230e57-25be-4105-8725-f65e15ff' to user 'root'
I0409 01:44:33.795454 3494 slave.cpp:5367] Launching executor ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce of framework ffb72d7c-dd63-4c30-abea-bb746ab2c326- with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/mesos/slave/slaves/da0e09ff-d5b2-4680-bd7e-b58a2a206497-S0/frameworks/ffb72d7c-dd63-4c30-abea-bb746ab2c326-/executors/ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce/runs/5d230e57-25be-4105-8725-f65e15ff'
I0409 01:44:33.915488 3495 docker.cpp:1014] Skipping non-docker container
I0409 01:44:33.980628 3491 containerizer.cpp:666] Starting container '5d230e57-25be-4105-8725-f65e15ff' for executor 'ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce' of framework 'ffb72d7c-dd63-4c30-abea-bb746ab2c326-'
I0409 01:44:34.027020 3494 slave.cpp:1698] Queuing task 'ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce' for executor 'ubuntumesos.9ab04999-fdf4-11e5-8b4b-0242e2dedfce' of framework ffb72d7c-dd63-4c30-abea-bb746ab2c326-
I0409 01:44:34.292232 3492 linux_launcher.cpp:304] Cloning child process with flags = CLONE_NEWNS
I0409 01:44:34.453189 3492 containerizer.cpp:1118] Checkpointing executor's forked pid 3982 to
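The slave log lines above follow glog's prefix layout (severity letter plus MMDD, time, thread id, source file:line, then the message). A hypothetical helper sketch for splitting such lines, e.g. to filter out just the gc scheduling entries:

```python
import re

# glog prefix, e.g. "I0409 01:43:48.811336 3493 gc.cpp:55] Scheduling ..."
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>[0-9:.]+) "
    r"(?P<tid>\d+) (?P<source>[\w.]+:\d+)\] (?P<message>.*)$"
)

def parse_glog(line):
    """Split one glog line into its prefix fields and message, or None."""
    match = GLOG_LINE.match(line)
    return match.groupdict() if match else None

# Shortened sample taken from the slave log above (path elided).
sample = (
    "I0409 01:43:48.811336 3493 gc.cpp:55] Scheduling "
    "'/tmp/mesos/slave/slaves/...' for gc 6.9070953778days in the future"
)
fields = parse_glog(sample)
print(fields["source"])  # gc.cpp:55
```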
[jira] [Updated] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangqun updated MESOS-5148:
---
Description: updated; the steps are unchanged except that the container block now uses the "docker" key.

was (previous container block, with "mesos" where the current description has "docker"):

{
  "container": {
    "type": "MESOS",
    "mesos": {
      "image": "library/redis"
    }
  },
  "id": "ubuntumesos",
  "instances": 1,
  "cpus": 0.5,
  "mem": 512,
  "uris": [],
  "cmd": "ping 8.8.8.8"
}

The rest of the previous description matches the current one.
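The description update above amounts to renaming a single key in the container block. A quick sketch of that difference (both dicts transcribed from the update notice):

```python
# Container blocks from the old and new issue descriptions.
old_container = {"type": "MESOS", "mesos": {"image": "library/redis"}}
new_container = {"type": "MESOS", "docker": {"image": "library/redis"}}

# Top-level keys that differ between the two revisions.
changed = sorted(set(old_container) ^ set(new_container))
print(changed)  # ['docker', 'mesos']
```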
[jira] [Comment Edited] (MESOS-5101) Add CMake build to docker_build.sh
[ https://issues.apache.org/jira/browse/MESOS-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233269#comment-15233269 ] Vinod Kone edited comment on MESOS-5101 at 4/9/16 1:30 AM:
---
I see the following error when testing it on the ASF CI.

https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-CMake-Test/3/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos:7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)/console

{code}
Step 17 : CMD ./bootstrap && mkdir ./build && cd ./build/ && cmake ../ --verbose && make -j8 check
 ---> Running in cd62f5ca63fe
 ---> 90c475a824a5
Removing intermediate container cd62f5ca63fe
Successfully built 90c475a824a5
+ trap 'docker rmi mesos-1460161977-13796' EXIT
+ docker run --privileged --rm mesos-1460161977-13796
autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --warnings=all -I m4
autoreconf: configure.ac: tracing
configure.ac:47: warning: back quotes and double quotes must not be escaped in: unrecognized option: $[1]
configure.ac:47: Try \`$[0] --help' for more information.
aclocal.m4:625: LT_OUTPUT is expanded from...
configure.ac:47: the top level
configure.ac:47: warning: back quotes and double quotes must not be escaped in: unrecognized argument: $[1]
configure.ac:47: Try \`$[0] --help' for more information.
aclocal.m4:625: LT_OUTPUT is expanded from...
configure.ac:47: the top level
configure.ac:1552: warning: cannot check for file existence when cross compiling
../../lib/autoconf/general.m4:2777: AC_CHECK_FILE is expanded from...
configure.ac:1552: the top level
autoreconf: configure.ac: adding subdirectory 3rdparty/libprocess to autoreconf
autoreconf: Entering directory `3rdparty/libprocess'
configure.ac:42: warning: back quotes and double quotes must not be escaped in: unrecognized option: $[1]
configure.ac:42: Try \`$[0] --help' for more information.
aclocal.m4:625: LT_OUTPUT is expanded from...
configure.ac:42: the top level
configure.ac:42: warning: back quotes and double quotes must not be escaped in: unrecognized argument: $[1]
configure.ac:42: Try \`$[0] --help' for more information.
aclocal.m4:625: LT_OUTPUT is expanded from...
configure.ac:42: the top level
autoreconf: configure.ac: adding subdirectory 3rdparty/stout to autoreconf
autoreconf: Entering directory `3rdparty/stout'
autoreconf: running: aclocal --warnings=all
autoreconf: configure.ac: not using Libtool
autoreconf: running: /usr/bin/autoconf --warnings=all
autoreconf: configure.ac: not using Autoheader
autoreconf: running: automake --add-missing --copy --no-force --warnings=all
configure.ac:22: installing './missing'
autoreconf: Leaving directory `3rdparty/stout'
autoreconf: running: libtoolize --copy
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
configure.ac:42: warning: back quotes and double quotes must not be escaped in: unrecognized option: $[1]
configure.ac:42: Try \`$[0] --help' for more information.
m4/libtool.m4:609: LT_OUTPUT is expanded from...
configure.ac:42: the top level
configure.ac:42: warning: back quotes and double quotes must not be escaped in: unrecognized argument: $[1]
configure.ac:42: Try \`$[0] --help' for more information.
m4/libtool.m4:609: LT_OUTPUT is expanded from...
configure.ac:42: the top level
configure.ac:35: installing './ar-lib'
configure.ac:20: installing './config.guess'
configure.ac:20: installing './config.sub'
configure.ac:31: installing './missing'
3rdparty/Makefile.am: installing './depcomp'
3rdparty/Makefile.am:131: warning: variable 'GLOG_LDFLAGS' is defined but no program or
3rdparty/Makefile.am:131: library has 'GLOG' as canonical name (possible typo)
autoreconf: Leaving directory `3rdparty/libprocess'
libtoolize: putting auxiliary files in `.'.
libtoolize: copying file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
configure.ac:47: warning: back quotes and double quotes must not be escaped in: unrecognized option: $[1]
configure.ac:47: Try \`$[0] --help' for more information.
m4/libtool.m4:609: LT_OUTPUT is expanded from...
configure.ac:47: the top level
configure.ac:47: warning: back quotes and double quotes must not be escaped in: unrecognized argument: $[1]
configure.ac:47: Try \`$[0] --help' for more information.
m4/libtool.m4:609: LT_OUTPUT is expanded from...
configure.ac:47: the top level
[jira] [Commented] (MESOS-5101) Add CMake build to docker_build.sh
[ https://issues.apache.org/jira/browse/MESOS-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233269#comment-15233269 ] Vinod Kone commented on MESOS-5101: --- I see the following error when testing it on the ASF CI. https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-CMake-Test/3/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos:7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)/console {code} configure.ac:47: the top level configure.ac:47: warning: back quotes and double quotes must not be escaped in: unrecognized argument: $[1] configure.ac:47: Try \`$[0] --help' for more information. m4/libtool.m4:609: LT_OUTPUT is expanded from... configure.ac:47: the top level configure.ac:1552: warning: cannot check for file existence when cross compiling ../../lib/autoconf/general.m4:2777: AC_CHECK_FILE is expanded from... configure.ac:1552: the top level configure.ac:40: installing './ar-lib' configure.ac:24: installing './config.guess' configure.ac:24: installing './config.sub' configure.ac:36: installing './install-sh' configure.ac:36: installing './missing' src/Makefile.am: installing './depcomp' autoreconf: Leaving directory `.' CMake Error: The source directory "/mesos/build/--verbose" does not exist. {code} > Add CMake build to docker_build.sh > -- > > Key: MESOS-5101 > URL: https://issues.apache.org/jira/browse/MESOS-5101 > Project: Mesos > Issue Type: Improvement >Reporter: Juan Larriba >Assignee: Juan Larriba > > Add the CMake build system to docker_build.sh to automatically test the build > on Jenkins alongside gcc and clang. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5080) With NetworkManager off, can't seem to deploy docker containerized apps
[ https://issues.apache.org/jira/browse/MESOS-5080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233230#comment-15233230 ] Lax commented on MESOS-5080: I solved the issue. Since I did not have permission to add comments or edit, I left the ticket open. The issue was that my Marathon app definition happened to have overlay-network-specific settings, and Mesos, on seeing those settings, deployed my app with the NONE network mode instead of HOST networking, causing Docker not to copy in any of the network config files. Once I removed those attributes from my definition, my container deployed fine. Thanks for looking into the issue and your time. Lax > With NetworkManager off, can't seem to deploy docker containerized apps > --- > > Key: MESOS-5080 > URL: https://issues.apache.org/jira/browse/MESOS-5080 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.27.0 > Environment: Centos 7.1 >Reporter: Lax > > Mesos is unable to deploy docker container apps after switching the > NetworkManager service off. > > Noticed that none of the host network config files (like hosts, hostname, > resolv.conf, etc.) were seen under the docker container dir > (/var/lib/docker/containers/[container id]) if I provision the container through > mesos. The only files seen were the config.json and hostconfig.json files. > However, when I launch the same container directly via docker run, the issue is not > seen. The container dir has all the network config files. > Any reason why mesos is unable to push those to the container dir? Is there a > specific network setting that needs to be enabled now that I have NetworkManager > switched off? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4828) XFS disk quota isolator
[ https://issues.apache.org/jira/browse/MESOS-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233191#comment-15233191 ] Jie Yu commented on MESOS-4828: --- Glad to see this committed! I am curious: is there any reason the isolator is named "xfs/disk"? Would 'disk/xfs' be better? I imagine we could have 'disk/xfs', 'disk/du' (with 'posix/disk' as a backwards-compatible alias of that), 'disk/zfs'... > XFS disk quota isolator > --- > > Key: MESOS-4828 > URL: https://issues.apache.org/jira/browse/MESOS-4828 > Project: Mesos > Issue Type: Epic > Components: isolation >Reporter: James Peach >Assignee: James Peach > > Implement a disk resource isolator using XFS project quotas. Compared to the > {{posix/disk}} isolator, this doesn't need to scan the filesystem > periodically, and applications receive an {{EDQUOT}} error instead of being > summarily killed. > This initial implementation only isolates sandbox directory resources, since > isolation doesn't have any visibility into the lifecycle of volumes, > which is needed to assign and track project IDs. > The build dependencies for this are the XFS headers (from xfsprogs-devel) and > libblkid. We need libblkid or the equivalent to map filesystem paths to block > devices in order to apply quotas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4828) XFS disk quota isolator
[ https://issues.apache.org/jira/browse/MESOS-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233183#comment-15233183 ] Yan Xu commented on MESOS-4828: --- {noformat:title=} commit b900abff1648ae397d9819322de95ad99737ce4d Author: James Peach Date: Fri Apr 8 14:56:12 2016 -0700 Add XFS disk isolator documentation. Review: https://reviews.apache.org/r/44950/ commit 255710b7c95e578c873e1317e3705a55e81b1f61 Author: James Peach Date: Fri Apr 8 14:53:56 2016 -0700 Add XFS disk isolator tests. Review: https://reviews.apache.org/r/44949/ commit 04be1d03ca71513cc966a17f87cd10611d959ac9 Author: James Peach Date: Fri Apr 8 14:07:03 2016 -0700 Add tests for XFS project quota utilities. Review: https://reviews.apache.org/r/44947/ commit a0e96bd22da7a39086600c3186fbad61c554e262 Author: James Peach Date: Fri Apr 8 13:49:16 2016 -0700 Add utility functions to manipulate XFS project quotas. Review: https://reviews.apache.org/r/44946/ commit 031370725d05866f98016dfdba8ebf5448067a22 Author: James Peach Date: Fri Apr 8 13:48:36 2016 -0700 Add autoconf tests for XFS project quotas. Review: https://reviews.apache.org/r/44945/ commit 548da8ff3597935c618b43a82bd432482e5e5fed Author: James Peach Date: Fri Apr 8 14:00:10 2016 -0700 Make tests::cluster::Slave more tolerant of start failures. If cluster::Slave::start() fails, make sure we don't crash in the destructor. Review: https://reviews.apache.org/r/45689/ {noformat} > XFS disk quota isolator > --- > > Key: MESOS-4828 > URL: https://issues.apache.org/jira/browse/MESOS-4828 > Project: Mesos > Issue Type: Epic > Components: isolation >Reporter: James Peach >Assignee: James Peach > > Implement a disk resource isolator using XFS project quotas. Compared to the > {{posix/disk}} isolator, this doesn't need to scan the filesystem > periodically, and applications receive an {{EDQUOT}} error instead of being > summarily killed. 
> This initial implementation only isolates sandbox directory resources, since > isolation doesn't have any visibility into the lifecycle of volumes, > which is needed to assign and track project IDs. > The build dependencies for this are the XFS headers (from xfsprogs-devel) and > libblkid. We need libblkid or the equivalent to map filesystem paths to block > devices in order to apply quotas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5159) Add test to verify error when requesting fractional GPUs
Kevin Klues created MESOS-5159: -- Summary: Add test to verify error when requesting fractional GPUs Key: MESOS-5159 URL: https://issues.apache.org/jira/browse/MESOS-5159 Project: Mesos Issue Type: Task Reporter: Kevin Klues Assignee: Kevin Klues Fractional GPU requests should immediately cause a TASK_FAILED without ever launching the task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
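The behavior the ticket wants a test for can be sketched as a small up-front validation step. This is an illustrative Python sketch under stated assumptions, not Mesos code (names like validate_gpus are hypothetical):

```python
# Illustrative sketch (not Mesos code): a fractional "gpus" request is
# rejected before the task ever launches; in Mesos terms the rejection
# would surface as TASK_FAILED.
def validate_gpus(gpus):
    """Return the integral GPU count, or raise on a fractional request."""
    if gpus != int(gpus):
        raise ValueError("fractional GPU request not allowed: %s" % gpus)
    return int(gpus)
```

A test along these lines would assert that whole counts pass through and that any fractional value raises immediately.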
[jira] [Comment Edited] (MESOS-4325) Offer shareable resources to frameworks only if it is opted in
[ https://issues.apache.org/jira/browse/MESOS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233138#comment-15233138 ] Anindya Sinha edited comment on MESOS-4325 at 4/8/16 11:20 PM: --- Here are the RRs: https://reviews.apache.org/r/45966/ https://reviews.apache.org/r/45967/ was (Author: anindya.sinha): Here are the RRs: https://reviews.apache.org/r/45963/ https://reviews.apache.org/r/45964/ > Offer shareable resources to frameworks only if it is opted in > -- > > Key: MESOS-4325 > URL: https://issues.apache.org/jira/browse/MESOS-4325 > Project: Mesos > Issue Type: Improvement > Components: general >Affects Versions: 0.25.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha >Priority: Minor > Labels: external-volumes, persistent-volumes > > Added a new capability SHAREABLE_RESOURCES that frameworks need to opt in if > they are interested in receiving shared resources in their offers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4324) Allow access to shared persistent volumes as read only or read write by tasks
[ https://issues.apache.org/jira/browse/MESOS-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233139#comment-15233139 ] Anindya Sinha commented on MESOS-4324: -- Here are the RRs: https://reviews.apache.org/r/45963/ https://reviews.apache.org/r/45964/ > Allow access to shared persistent volumes as read only or read write by tasks > - > > Key: MESOS-4324 > URL: https://issues.apache.org/jira/browse/MESOS-4324 > Project: Mesos > Issue Type: Improvement > Components: general >Affects Versions: 0.25.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha >Priority: Minor > Labels: external-volumes, persistent-volumes > > Allow the task to specify the access to a shared persistent volume to be > read-only or read-write. Note that the persistent volume is always created as > read-write. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4325) Offer shareable resources to frameworks only if it is opted in
[ https://issues.apache.org/jira/browse/MESOS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233138#comment-15233138 ] Anindya Sinha commented on MESOS-4325: -- Here are the RRs: https://reviews.apache.org/r/45963/ https://reviews.apache.org/r/45964/ > Offer shareable resources to frameworks only if it is opted in > -- > > Key: MESOS-4325 > URL: https://issues.apache.org/jira/browse/MESOS-4325 > Project: Mesos > Issue Type: Improvement > Components: general >Affects Versions: 0.25.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha >Priority: Minor > Labels: external-volumes, persistent-volumes > > Added a new capability SHAREABLE_RESOURCES that frameworks need to opt in if > they are interested in receiving shared resources in their offers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4431) Sharing of persistent volumes via reference counting
[ https://issues.apache.org/jira/browse/MESOS-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233137#comment-15233137 ] Anindya Sinha commented on MESOS-4431: -- Here are the RRs: https://reviews.apache.org/r/45960/ https://reviews.apache.org/r/45961/ https://reviews.apache.org/r/45962/ > Sharing of persistent volumes via reference counting > > > Key: MESOS-4431 > URL: https://issues.apache.org/jira/browse/MESOS-4431 > Project: Mesos > Issue Type: Improvement > Components: general >Affects Versions: 0.25.0 >Reporter: Anindya Sinha >Assignee: Anindya Sinha > Labels: external-volumes, persistent-volumes > > Add capability for specific resources to be shared amongst tasks within or > across frameworks/roles. Enable this functionality for persistent volumes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4892) Support arithmetic operations for shared resources with consumer counts
[ https://issues.apache.org/jira/browse/MESOS-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233136#comment-15233136 ] Anindya Sinha commented on MESOS-4892: -- https://reviews.apache.org/r/45958/ https://reviews.apache.org/r/45959/ > Support arithmetic operations for shared resources with consumer counts > --- > > Key: MESOS-4892 > URL: https://issues.apache.org/jira/browse/MESOS-4892 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Anindya Sinha >Assignee: Anindya Sinha > Labels: external-volum, persistent-volumes, resource > > With the introduction of shared resources, we need to add support for > arithmetic operations on Resources which perform such operations on shared > resources. Shared resources need to be handled differently so as to account > for incrementing/decrementing consumer counts maintained by each Resources > object. > Case 1: > Resources total += shared_resource; > If shared_resource exists in total, this would imply that the consumer count > is incremented. If shared_resource does not exist in total, this would imply > we start tracking consumers for this shared resource initialized to 0 > consumers. > Case 2 > Resources total -= shared_resource; > If shared_resource exists in total, this would imply that the consumer count > is decremented. However, the shared_resource is removed from total if the > consumer count is originally 0 in total). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
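The two cases described above can be sketched with a toy consumer counter. This is an illustrative model of the semantics only; the class and method names are hypothetical and not the actual Mesos Resources API:

```python
# Toy sketch of the consumer-count semantics for shared resources
# described above (illustrative only; not the Mesos Resources class).
class SharedTotal:
    def __init__(self):
        self.consumers = {}  # shared resource id -> consumer count

    def __iadd__(self, resource):          # Case 1: total += shared_resource
        if resource in self.consumers:
            self.consumers[resource] += 1  # already tracked: bump the count
        else:
            self.consumers[resource] = 0   # new: start tracking at 0 consumers
        return self

    def __isub__(self, resource):          # Case 2: total -= shared_resource
        if self.consumers.get(resource, 0) == 0:
            self.consumers.pop(resource, None)  # count was 0: remove entirely
        else:
            self.consumers[resource] -= 1       # otherwise: decrement
        return self
```

Adding a resource already in the total bumps its count; subtracting when the count is already 0 removes it from the total, mirroring the two cases in the description.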
[jira] [Commented] (MESOS-4705) Slave failed to sample container with perf event
[ https://issues.apache.org/jira/browse/MESOS-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233098#comment-15233098 ] Benjamin Mahler commented on MESOS-4705: [~haosd...@gmail.com] thanks for investigating the kernel code and clarifying. [~fan.du] [~haosd...@gmail.com] This is the kind of information I'd like to see in the code so that our methodology is clear to the reader. If you'd like to remove kernel version checking as a part of this, that's fine as well so long as the explanation is clear and correct. > Slave failed to sample container with perf event > > > Key: MESOS-4705 > URL: https://issues.apache.org/jira/browse/MESOS-4705 > Project: Mesos > Issue Type: Bug > Components: cgroups, isolation >Affects Versions: 0.27.1 >Reporter: Fan Du >Assignee: Fan Du > > When sampling container with perf event on Centos7 with kernel > 3.10.0-123.el7.x86_64, slave complained with below error spew: > {code} > E0218 16:32:00.591181 8376 perf_event.cpp:408] Failed to get perf sample: > Failed to parse perf sample: Failed to parse perf sample line > '25871993253,,cycles,mesos/5f23ffca-87ed-4ff6-84f2-6ec3d4098ab8,10059827422,100.00': > Unexpected number of fields > {code} > it's caused by the current perf format [assumption | > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/linux/perf.cpp;h=1c113a2b3f57877e132bbd65e01fb2f045132128;hb=HEAD#l430] > with kernel version below 3.12 > On 3.10.0-123.el7.x86_64 kernel, the format is with 6 tokens as below: > value,unit,event,cgroup,running,ratio > A local modification fixed this error on my test bed, please review this > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
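The six-token layout quoted in the ticket (value,unit,event,cgroup,running,ratio) can be illustrated with a small parser. This is a hypothetical sketch, not the parser in src/linux/perf.cpp, and it only handles the six-field format shown in the error message:

```python
# Hypothetical sketch of parsing the perf CSV output quoted above.
# Assumes the 3.10.0-123.el7 six-field layout:
#   value,unit,event,cgroup,running,ratio
def parse_perf_sample(line):
    fields = line.split(',')
    if len(fields) != 6:
        # Mirrors the "Unexpected number of fields" failure in the ticket.
        raise ValueError("Unexpected number of fields: %d" % len(fields))
    value, unit, event, cgroup, running, ratio = fields
    return {"value": value, "unit": unit, "event": event,
            "cgroup": cgroup, "running": running, "ratio": ratio}
```

Running it on the sample line from the ticket yields an empty unit field, which is why a parser assuming five fields trips over this kernel's output.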
[jira] [Assigned] (MESOS-5101) Add CMake build to docker_build.sh
[ https://issues.apache.org/jira/browse/MESOS-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juan Larriba reassigned MESOS-5101: --- Assignee: Juan Larriba > Add CMake build to docker_build.sh > -- > > Key: MESOS-5101 > URL: https://issues.apache.org/jira/browse/MESOS-5101 > Project: Mesos > Issue Type: Improvement >Reporter: Juan Larriba >Assignee: Juan Larriba > > Add the CMake build system to docker_build.sh to automatically test the build > on Jenkins alongside gcc and clang. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5081) Posix disk isolator allows unrestricted sandbox disk usage if the executor/task doesn't specify disk resource
[ https://issues.apache.org/jira/browse/MESOS-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233060#comment-15233060 ] Jie Yu commented on MESOS-5081: --- Yeah, we should fix that. > Posix disk isolator allows unrestricted sandbox disk usage if the > executor/task doesn't specify disk resource > - > > Key: MESOS-5081 > URL: https://issues.apache.org/jira/browse/MESOS-5081 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Yan Xu > Labels: mesosphere > Fix For: 0.29.0 > > > This is the case even if {{flags.enforce_container_disk_quota}} is true. When > a task/executor doesn't specify a disk resource, it still gets to write to > the container sandbox. However the posix disk isolator doesn't limit it. > Even though tasks always have access to the sandbox, it should be able to > write zero bytes if it doesn't have any {{disk}} resource (it can still touch > files). This likely will cause tasks to immediately fail due to > stdout/stderr/executor download, etc. but should be the correct behavior > (when {{flags.enforce_container_disk_quota}} is true). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
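The intended behavior described in the ticket can be sketched as: with enforcement on, a task with no disk resource gets a zero-byte sandbox quota rather than no quota at all. This is an illustrative sketch of that rule, not the isolator's code:

```python
# Illustrative sketch (not Mesos code) of the quota rule described above.
# resources is a simple dict of resource name -> amount.
def sandbox_quota_bytes(resources, enforce_container_disk_quota):
    if not enforce_container_disk_quota:
        return None                       # enforcement off: no limit applied
    # A missing 'disk' resource should mean a 0-byte quota, not unlimited.
    return int(resources.get("disk", 0))
```

Under this rule a task without a disk resource would fail almost immediately (stdout/stderr writes alone exceed 0 bytes), which the ticket argues is the correct behavior when enforcement is enabled.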
[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role
[ https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233058#comment-15233058 ] Michael Park commented on MESOS-4744: - [~qiujian] Could you please rebase the patch on top of the current master? > mesos-execute should allow setting role > --- > > Key: MESOS-4744 > URL: https://issues.apache.org/jira/browse/MESOS-4744 > Project: Mesos > Issue Type: Bug > Components: cli >Reporter: Jian Qiu >Assignee: Jian Qiu >Priority: Minor > > It will be quite useful if we can set role when running mesos-execute -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5081) Posix disk isolator allows unrestricted sandbox disk usage if the executor/task doesn't specify disk resource
[ https://issues.apache.org/jira/browse/MESOS-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233055#comment-15233055 ] Greg Mann commented on MESOS-5081: -- Have you seen this [~jieyu]? Looks like something that should be addressed soon. > Posix disk isolator allows unrestricted sandbox disk usage if the > executor/task doesn't specify disk resource > - > > Key: MESOS-5081 > URL: https://issues.apache.org/jira/browse/MESOS-5081 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Yan Xu > Labels: mesosphere > Fix For: 0.29.0 > > > This is the case even if {{flags.enforce_container_disk_quota}} is true. When > a task/executor doesn't specify a disk resource, it still gets to write to > the container sandbox. However the posix disk isolator doesn't limit it. > Even though tasks always have access to the sandbox, it should be able to > write zero bytes if it doesn't have any {{disk}} resource (it can still touch > files). This likely will cause tasks to immediately fail due to > stdout/stderr/executor download, etc. but should be the correct behavior > (when {{flags.enforce_container_disk_quota}} is true). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5081) Posix disk isolator allows unrestricted sandbox disk usage if the executor/task doesn't specify disk resource
[ https://issues.apache.org/jira/browse/MESOS-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5081: - Labels: mesosphere (was: ) Fix Version/s: 0.29.0 > Posix disk isolator allows unrestricted sandbox disk usage if the > executor/task doesn't specify disk resource > - > > Key: MESOS-5081 > URL: https://issues.apache.org/jira/browse/MESOS-5081 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Yan Xu > Labels: mesosphere > Fix For: 0.29.0 > > > This is the case even if {{flags.enforce_container_disk_quota}} is true. When > a task/executor doesn't specify a disk resource, it still gets to write to > the container sandbox. However the posix disk isolator doesn't limit it. > Even though tasks always have access to the sandbox, it should be able to > write zero bytes if it doesn't have any {{disk}} resource (it can still touch > files). This likely will cause tasks to immediately fail due to > stdout/stderr/executor download, etc. but should be the correct behavior > (when {{flags.enforce_container_disk_quota}} is true). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-5139) ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar is flaky
[ https://issues.apache.org/jira/browse/MESOS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gilbert Song reassigned MESOS-5139: --- Assignee: Gilbert Song > ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar is flaky > -- > > Key: MESOS-5139 > URL: https://issues.apache.org/jira/browse/MESOS-5139 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.28.0 > Environment: Ubuntu14.04 >Reporter: Vinod Kone >Assignee: Gilbert Song > > Found this on ASF CI while testing 0.28.1-rc2 > {code} > [ RUN ] ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar > E0406 18:29:30.870481 520 shell.hpp:93] Command 'hadoop version 2>&1' > failed; this is the output: > sh: 1: hadoop: not found > E0406 18:29:30.870576 520 fetcher.cpp:59] Failed to create URI fetcher > plugin 'hadoop': Failed to create HDFS client: Failed to execute 'hadoop > version 2>&1'; the command was either not found or exited with a non-zero > exit status: 127 > I0406 18:29:30.871052 520 local_puller.cpp:90] Creating local puller with > docker registry '/tmp/3l8ZBv/images' > I0406 18:29:30.873325 539 metadata_manager.cpp:159] Looking for image 'abc' > I0406 18:29:30.874438 539 local_puller.cpp:142] Untarring image 'abc' from > '/tmp/3l8ZBv/images/abc.tar' to '/tmp/3l8ZBv/store/staging/5tw8bD' > I0406 18:29:30.901916 547 local_puller.cpp:162] The repositories JSON file > for image 'abc' is '{"abc":{"latest":"456"}}' > I0406 18:29:30.902304 547 local_puller.cpp:290] Extracting layer tar ball > '/tmp/3l8ZBv/store/staging/5tw8bD/123/layer.tar to rootfs > '/tmp/3l8ZBv/store/staging/5tw8bD/123/rootfs' > I0406 18:29:30.909144 547 local_puller.cpp:290] Extracting layer tar ball > '/tmp/3l8ZBv/store/staging/5tw8bD/456/layer.tar to rootfs > '/tmp/3l8ZBv/store/staging/5tw8bD/456/rootfs' > ../../src/tests/containerizer/provisioner_docker_tests.cpp:183: Failure > (imageInfo).failure(): Collect failed: Subprocess 'tar, tar, -x, -f, > /tmp/3l8ZBv/store/staging/5tw8bD/456/layer.tar, -C, > 
/tmp/3l8ZBv/store/staging/5tw8bD/456/rootfs' failed: tar: This does not look > like a tar archive > tar: Exiting with failure status due to previous errors > [ FAILED ] ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar (243 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5139) ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar is flaky
[ https://issues.apache.org/jira/browse/MESOS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gilbert Song updated MESOS-5139: Environment: Ubuntu14.04 > ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar is flaky > -- > > Key: MESOS-5139 > URL: https://issues.apache.org/jira/browse/MESOS-5139 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.28.0 > Environment: Ubuntu14.04 >Reporter: Vinod Kone > > Found this on ASF CI while testing 0.28.1-rc2 > {code} > [ RUN ] ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar > E0406 18:29:30.870481 520 shell.hpp:93] Command 'hadoop version 2>&1' > failed; this is the output: > sh: 1: hadoop: not found > E0406 18:29:30.870576 520 fetcher.cpp:59] Failed to create URI fetcher > plugin 'hadoop': Failed to create HDFS client: Failed to execute 'hadoop > version 2>&1'; the command was either not found or exited with a non-zero > exit status: 127 > I0406 18:29:30.871052 520 local_puller.cpp:90] Creating local puller with > docker registry '/tmp/3l8ZBv/images' > I0406 18:29:30.873325 539 metadata_manager.cpp:159] Looking for image 'abc' > I0406 18:29:30.874438 539 local_puller.cpp:142] Untarring image 'abc' from > '/tmp/3l8ZBv/images/abc.tar' to '/tmp/3l8ZBv/store/staging/5tw8bD' > I0406 18:29:30.901916 547 local_puller.cpp:162] The repositories JSON file > for image 'abc' is '{"abc":{"latest":"456"}}' > I0406 18:29:30.902304 547 local_puller.cpp:290] Extracting layer tar ball > '/tmp/3l8ZBv/store/staging/5tw8bD/123/layer.tar to rootfs > '/tmp/3l8ZBv/store/staging/5tw8bD/123/rootfs' > I0406 18:29:30.909144 547 local_puller.cpp:290] Extracting layer tar ball > '/tmp/3l8ZBv/store/staging/5tw8bD/456/layer.tar to rootfs > '/tmp/3l8ZBv/store/staging/5tw8bD/456/rootfs' > ../../src/tests/containerizer/provisioner_docker_tests.cpp:183: Failure > (imageInfo).failure(): Collect failed: Subprocess 'tar, tar, -x, -f, > /tmp/3l8ZBv/store/staging/5tw8bD/456/layer.tar, -C, > 
/tmp/3l8ZBv/store/staging/5tw8bD/456/rootfs' failed: tar: This does not look > like a tar archive > tar: Exiting with failure status due to previous errors > [ FAILED ] ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar (243 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5048) MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
[ https://issues.apache.org/jira/browse/MESOS-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233034#comment-15233034 ] Greg Mann commented on MESOS-5048: -- [~qiujian], thus far I've been unable to reproduce this failure. I've been running it on Ubuntu 14.04 - I see you specified the environment as Ubuntu 15.04, is that what you're running on your local machine? RB runs the reviewbot on Ubuntu 14.04, so if you saw the failure on RB, I would expect it to be reproducible on 14.04 as well. Did you use simply {{../configure}}, with no arguments, to configure your build? > MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky > --- > > Key: MESOS-5048 > URL: https://issues.apache.org/jira/browse/MESOS-5048 > Project: Mesos > Issue Type: Bug > Components: tests >Affects Versions: 0.28.0 > Environment: Ubuntu 15.04 >Reporter: Jian Qiu > Labels: flaky-test > > ./mesos-tests.sh > --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics > --gtest_repeat=100 --gtest_break_on_failure > This is found in rb, and reproduced in my local machine. There are two types > of failures. However, the failure does not appear when enabling verbose... > {code} > ../../src/tests/environment.cpp:790: Failure > Failed > Tests completed with child processes remaining: > -+- 1446 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-tests > \-+- 9171 sh -c /mesos/mesos-0.29.0/_build/src/mesos-executor >\--- 9185 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-executor > {code} > And > {code} > I0328 15:42:36.982471 5687 exec.cpp:150] Version: 0.29.0 > I0328 15:42:37.008765 5708 exec.cpp:225] Executor registered on slave > 731fb93b-26fe-4c7c-a543-fc76f106a62e-S0 > Registered executor on mesos > ../../src/tests/slave_recovery_tests.cpp:3506: Failure > Value of: containers.get().size() > Actual: 0 > Expected: 1u > Which is: 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-5061) process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is not connected
[ https://issues.apache.org/jira/browse/MESOS-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233008#comment-15233008 ] Zogg edited comment on MESOS-5061 at 4/8/16 9:43 PM: - Hello, thanks for taking a look at this issue. I've been trying to find the problem with no success, and we'd like to have mesos/calico running in the next Mantl release. The network isolator lib was compiled from https://github.com/mesosphere/net-modules (0.2.x branch) and used with Calico. {code} { "libraries": [ { "file": "/usr/local/lib/mesos/libmesos_network_isolator.so", "modules": [ { "name": "com_mesosphere_mesos_NetworkIsolator", "parameters": [ { "key": "isolator_command", "value": "/usr/local/bin/calico_mesos" }, { "key": "ipam_command", "value": "/usr/local/bin/calico_mesos" } ] }, { "name": "com_mesosphere_mesos_NetworkHook" } ] } ] } {code} was (Author: zogg): Hello, thanks for taking a look at this issue. I've been trying to find the problem with no success, and we'd like to have mesos/calico running in the next Mantl release. Compiled from https://github.com/mesosphere/net-modules (0.2.x branch) and used with Calico. 
{code} { "libraries": [ { "file": "/usr/local/lib/mesos/libmesos_network_isolator.so", "modules": [ { "name": "com_mesosphere_mesos_NetworkIsolator", "parameters": [ { "key": "isolator_command", "value": "/usr/local/bin/calico_mesos" }, { "key": "ipam_command", "value": "/usr/local/bin/calico_mesos" } ] }, { "name": "com_mesosphere_mesos_NetworkHook" } ] } ] } {code} > process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is > not connected > -- > > Key: MESOS-5061 > URL: https://issues.apache.org/jira/browse/MESOS-5061 > Project: Mesos > Issue Type: Bug > Components: containerization, modules >Affects Versions: 0.27.0, 0.27.1, 0.28.0, 0.27.2 > Environment: Centos 7.1 >Reporter: Zogg > Fix For: 0.29.0 > > > When launching a task through Marathon and asking the task to assign an IP > (using Calico networking): > {noformat} > { > "id":"/calico-apps", > "apps": [ > { > "id": "hello-world-1", > "cmd": "ip addr && sleep 3", > "cpus": 0.1, > "mem": 64.0, > "ipAddress": { > "groups": ["calico-k8s-network"] > } > } > ] > } > {noformat} > The Mesos slave fails to launch the task, which is stuck in the STAGING state forever, with > the error: > {noformat} > [centos@rtmi-worker-001 mesos]$ tail mesos-slave.INFO > I0325 20:35:43.420171 13495 slave.cpp:2642] Got registration for executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- from executor(1)@10.0.0.10:33443 > I0325 20:35:43.422652 13495 slave.cpp:1862] Sending queued task > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' to executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- at executor(1)@10.0.0.10:33443 > E0325 20:35:43.423159 13502 process.cpp:1966] Failed to shutdown socket with > fd 22: Transport endpoint is not connected > I0325 20:35:43.423316 13501 slave.cpp:3481] executor(1)@10.0.0.10:33443 exited > {noformat} > However, when deploying a task 
without ipAddress field, mesos slave launches > a task successfully. > Tested with various Mesos/Marathon/Calico versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-5061) process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is not connected
[ https://issues.apache.org/jira/browse/MESOS-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233008#comment-15233008 ] Zogg edited comment on MESOS-5061 at 4/8/16 9:42 PM: - Hello, thanks for taking a look at this issue. I'v been trying to find the problem with no success. And we'd like to have mesos/calico running in next Mantl's release. Compiled from https://github.com/mesosphere/net-modules (0.2.x branch). And used with calico. {code} { "libraries": [ { "file": "/usr/local/lib/mesos/libmesos_network_isolator.so", "modules": [ { "name": "com_mesosphere_mesos_NetworkIsolator", "parameters": [ { "key": "isolator_command", "value": "/usr/local/bin/calico_mesos" }, { "key": "ipam_command", "value": "/usr/local/bin/calico_mesos" } ] }, { "name": "com_mesosphere_mesos_NetworkHook" } ] } ] } {code} was (Author: zogg): Compiled from https://github.com/mesosphere/net-modules (0.2.x branch). And used with calico. {code} { "libraries": [ { "file": "/usr/local/lib/mesos/libmesos_network_isolator.so", "modules": [ { "name": "com_mesosphere_mesos_NetworkIsolator", "parameters": [ { "key": "isolator_command", "value": "/usr/local/bin/calico_mesos" }, { "key": "ipam_command", "value": "/usr/local/bin/calico_mesos" } ] }, { "name": "com_mesosphere_mesos_NetworkHook" } ] } ] } {code} > process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is > not connected > -- > > Key: MESOS-5061 > URL: https://issues.apache.org/jira/browse/MESOS-5061 > Project: Mesos > Issue Type: Bug > Components: containerization, modules >Affects Versions: 0.27.0, 0.27.1, 0.28.0, 0.27.2 > Environment: Centos 7.1 >Reporter: Zogg > Fix For: 0.29.0 > > > When launching a task through Marathon and asking the task to assign an IP > (using Calico networking): > {noformat} > { > "id":"/calico-apps", > "apps": [ > { > "id": "hello-world-1", > "cmd": "ip addr && sleep 3", > "cpus": 0.1, > "mem": 64.0, > "ipAddress": { > "groups": ["calico-k8s-network"] > } > } > ] > 
} > {noformat} > Mesos slave fails to launch a task, locking in STAGING state forewer, with > error: > {noformat} > [centos@rtmi-worker-001 mesos]$ tail mesos-slave.INFO > I0325 20:35:43.420171 13495 slave.cpp:2642] Got registration for executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- from executor(1)@10.0.0.10:33443 > I0325 20:35:43.422652 13495 slave.cpp:1862] Sending queued task > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' to executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- at executor(1)@10.0.0.10:33443 > E0325 20:35:43.423159 13502 process.cpp:1966] Failed to shutdown socket with > fd 22: Transport endpoint is not connected > I0325 20:35:43.423316 13501 slave.cpp:3481] executor(1)@10.0.0.10:33443 exited > {noformat} > However, when deploying a task without ipAddress field, mesos slave launches > a task successfully. > Tested with various Mesos/Marathon/Calico versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-5061) process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is not connected
[ https://issues.apache.org/jira/browse/MESOS-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233008#comment-15233008 ] Zogg edited comment on MESOS-5061 at 4/8/16 9:40 PM: - Compiled from https://github.com/mesosphere/net-modules (0.2.x branch). And used with calico. {code} { "libraries": [ { "file": "/usr/local/lib/mesos/libmesos_network_isolator.so", "modules": [ { "name": "com_mesosphere_mesos_NetworkIsolator", "parameters": [ { "key": "isolator_command", "value": "/usr/local/bin/calico_mesos" }, { "key": "ipam_command", "value": "/usr/local/bin/calico_mesos" } ] }, { "name": "com_mesosphere_mesos_NetworkHook" } ] } ] } {code} was (Author: zogg): Compiled from https://github.com/mesosphere/net-modules. And used with calico. ``` { "libraries": [ { "file": "/usr/local/lib/mesos/libmesos_network_isolator.so", "modules": [ { "name": "com_mesosphere_mesos_NetworkIsolator", "parameters": [ { "key": "isolator_command", "value": "/usr/local/bin/calico_mesos" }, { "key": "ipam_command", "value": "/usr/local/bin/calico_mesos" } ] }, { "name": "com_mesosphere_mesos_NetworkHook" } ] } ] } ``` > process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is > not connected > -- > > Key: MESOS-5061 > URL: https://issues.apache.org/jira/browse/MESOS-5061 > Project: Mesos > Issue Type: Bug > Components: containerization, modules >Affects Versions: 0.27.0, 0.27.1, 0.28.0, 0.27.2 > Environment: Centos 7.1 >Reporter: Zogg > Fix For: 0.29.0 > > > When launching a task through Marathon and asking the task to assign an IP > (using Calico networking): > {noformat} > { > "id":"/calico-apps", > "apps": [ > { > "id": "hello-world-1", > "cmd": "ip addr && sleep 3", > "cpus": 0.1, > "mem": 64.0, > "ipAddress": { > "groups": ["calico-k8s-network"] > } > } > ] > } > {noformat} > Mesos slave fails to launch a task, locking in STAGING state forewer, with > error: > {noformat} > [centos@rtmi-worker-001 mesos]$ tail mesos-slave.INFO > I0325 
20:35:43.420171 13495 slave.cpp:2642] Got registration for executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- from executor(1)@10.0.0.10:33443 > I0325 20:35:43.422652 13495 slave.cpp:1862] Sending queued task > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' to executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- at executor(1)@10.0.0.10:33443 > E0325 20:35:43.423159 13502 process.cpp:1966] Failed to shutdown socket with > fd 22: Transport endpoint is not connected > I0325 20:35:43.423316 13501 slave.cpp:3481] executor(1)@10.0.0.10:33443 exited > {noformat} > However, when deploying a task without ipAddress field, mesos slave launches > a task successfully. > Tested with various Mesos/Marathon/Calico versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5061) process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is not connected
[ https://issues.apache.org/jira/browse/MESOS-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233008#comment-15233008 ] Zogg commented on MESOS-5061: - Compiled from https://github.com/mesosphere/net-modules. And used with calico. ``` { "libraries": [ { "file": "/usr/local/lib/mesos/libmesos_network_isolator.so", "modules": [ { "name": "com_mesosphere_mesos_NetworkIsolator", "parameters": [ { "key": "isolator_command", "value": "/usr/local/bin/calico_mesos" }, { "key": "ipam_command", "value": "/usr/local/bin/calico_mesos" } ] }, { "name": "com_mesosphere_mesos_NetworkHook" } ] } ] } ``` > process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is > not connected > -- > > Key: MESOS-5061 > URL: https://issues.apache.org/jira/browse/MESOS-5061 > Project: Mesos > Issue Type: Bug > Components: containerization, modules >Affects Versions: 0.27.0, 0.27.1, 0.28.0, 0.27.2 > Environment: Centos 7.1 >Reporter: Zogg > Fix For: 0.29.0 > > > When launching a task through Marathon and asking the task to assign an IP > (using Calico networking): > {noformat} > { > "id":"/calico-apps", > "apps": [ > { > "id": "hello-world-1", > "cmd": "ip addr && sleep 3", > "cpus": 0.1, > "mem": 64.0, > "ipAddress": { > "groups": ["calico-k8s-network"] > } > } > ] > } > {noformat} > Mesos slave fails to launch a task, locking in STAGING state forewer, with > error: > {noformat} > [centos@rtmi-worker-001 mesos]$ tail mesos-slave.INFO > I0325 20:35:43.420171 13495 slave.cpp:2642] Got registration for executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- from executor(1)@10.0.0.10:33443 > I0325 20:35:43.422652 13495 slave.cpp:1862] Sending queued task > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' to executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- at executor(1)@10.0.0.10:33443 > 
E0325 20:35:43.423159 13502 process.cpp:1966] Failed to shutdown socket with > fd 22: Transport endpoint is not connected > I0325 20:35:43.423316 13501 slave.cpp:3481] executor(1)@10.0.0.10:33443 exited > {noformat} > However, when deploying a task without ipAddress field, mesos slave launches > a task successfully. > Tested with various Mesos/Marathon/Calico versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5135) Update existing documentation to Include references to GPUs as a first class resource.
[ https://issues.apache.org/jira/browse/MESOS-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5135: --- Issue Type: Task (was: Bug) Moving from a bug to a task. > Update existing documentation to Include references to GPUs as a first class > resource. > -- > > Key: MESOS-5135 > URL: https://issues.apache.org/jira/browse/MESOS-5135 > Project: Mesos > Issue Type: Task > Components: documentation >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: docs, gpu, mesosphere, resource > > Specifically, the documentation in the following files should be updated: > {noformat} > docs/attributes-resources.md > docs/monitoring.md > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5136) Update the default JSON representation of a Resource to include GPUs
[ https://issues.apache.org/jira/browse/MESOS-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5136: --- Issue Type: Task (was: Bug) Moved this from a bug to a task. > Update the default JSON representation of a Resource to include GPUs > > > Key: MESOS-5136 > URL: https://issues.apache.org/jira/browse/MESOS-5136 > Project: Mesos > Issue Type: Task >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: gpu, json, mesosphere, resource > Fix For: 0.29.0 > > > The default JSON representation of a Resource currently lists a value of "0" > if no value is set on a first class SCALAR resource (i.e. cpus, mem, disk). > We should add GPUs in here as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5158) Provide XFS quota support for persistent volumes.
Yan Xu created MESOS-5158: - Summary: Provide XFS quota support for persistent volumes. Key: MESOS-5158 URL: https://issues.apache.org/jira/browse/MESOS-5158 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Yan Xu Given that the lifecycle of persistent volumes is managed outside of the isolator, we may need to further abstract out the quota management functionality to do it outside the XFS isolator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5060) Requesting /files/read.json with a negative length value causes subsequent /files requests to 404.
[ https://issues.apache.org/jira/browse/MESOS-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232876#comment-15232876 ] Greg Mann commented on MESOS-5060: -- [~dongdong], thanks for taking on this ticket! Could you find a committer to shepherd this for you before you start work? > Requesting /files/read.json with a negative length value causes subsequent > /files requests to 404. > -- > > Key: MESOS-5060 > URL: https://issues.apache.org/jira/browse/MESOS-5060 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0 > Environment: Mesos 0.23.0 on CentOS 6, also Mesos 0.28.0 on OSX >Reporter: Tom Petr >Assignee: zhou xing >Priority: Minor > Fix For: 0.29.0 > > > I accidentally hit a slave's /files/read.json endpoint with a negative length > (ex. http://hostname:5051/files/read.json?path=XXX&offset=0&length=-100). The > HTTP request timed out after 30 seconds with nothing relevant in the slave > logs, and subsequent calls to any of the /files endpoints on that slave > immediately returned an HTTP 404 response. We ultimately got things working > again by restarting the mesos-slave process (checkpointing FTW!), but it'd be > wise to guard against negative lengths on the slave's end too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
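The guard suggested at the end of the ticket is simple to express. The helper below is a hypothetical sketch in Python (the actual fix would live in the agent's C++ `/files/read.json` handler): validate the `offset` and `length` query parameters before doing any file I/O, so a negative length is rejected immediately instead of wedging subsequent `/files` requests.

```python
def validate_read_params(offset, length):
    """Hypothetical guard for /files/read.json query parameters.

    Rejects negative values up front so the request fails fast with a
    client error rather than hanging and breaking later requests.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if length is not None and length < 0:
        raise ValueError("length must be non-negative")
    return offset, length

# The request from the ticket (length=-100) would now fail fast:
try:
    validate_read_params(0, -100)
except ValueError as e:
    print(e)  # length must be non-negative
```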
[jira] [Updated] (MESOS-5060) Requesting /files/read.json with a negative length value causes subsequent /files requests to 404.
[ https://issues.apache.org/jira/browse/MESOS-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5060: - Assignee: zhou xing (was: Greg Mann) > Requesting /files/read.json with a negative length value causes subsequent > /files requests to 404. > -- > > Key: MESOS-5060 > URL: https://issues.apache.org/jira/browse/MESOS-5060 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0 > Environment: Mesos 0.23.0 on CentOS 6, also Mesos 0.28.0 on OSX >Reporter: Tom Petr >Assignee: zhou xing >Priority: Minor > Fix For: 0.29.0 > > > I accidentally hit a slave's /files/read.json endpoint with a negative length > (ex. http://hostname:5051/files/read.json?path=XXX=0=-100). The > HTTP request timed out after 30 seconds with nothing relevant in the slave > logs, and subsequent calls to any of the /files endpoints on that slave > immediately returned a HTTP 404 response. We ultimately got things working > again by restarting the mesos-slave process (checkpointing FTW!), but it'd be > wise to guard against negative lengths on the slave's end too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-5060) Requesting /files/read.json with a negative length value causes subsequent /files requests to 404.
[ https://issues.apache.org/jira/browse/MESOS-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann reassigned MESOS-5060: Assignee: Greg Mann (was: zhou xing) > Requesting /files/read.json with a negative length value causes subsequent > /files requests to 404. > -- > > Key: MESOS-5060 > URL: https://issues.apache.org/jira/browse/MESOS-5060 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0 > Environment: Mesos 0.23.0 on CentOS 6, also Mesos 0.28.0 on OSX >Reporter: Tom Petr >Assignee: Greg Mann >Priority: Minor > Fix For: 0.29.0 > > > I accidentally hit a slave's /files/read.json endpoint with a negative length > (ex. http://hostname:5051/files/read.json?path=XXX=0=-100). The > HTTP request timed out after 30 seconds with nothing relevant in the slave > logs, and subsequent calls to any of the /files endpoints on that slave > immediately returned a HTTP 404 response. We ultimately got things working > again by restarting the mesos-slave process (checkpointing FTW!), but it'd be > wise to guard against negative lengths on the slave's end too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5157) Update webui for GPU metrics
Kevin Klues created MESOS-5157: -- Summary: Update webui for GPU metrics Key: MESOS-5157 URL: https://issues.apache.org/jira/browse/MESOS-5157 Project: Mesos Issue Type: Task Components: webui Reporter: Kevin Klues Assignee: Kevin Klues After adding the GPU metrics and updating the resources JSON to include GPU information, the webui should be updated accordingly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4962) Support for Mesos releases
[ https://issues.apache.org/jira/browse/MESOS-4962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232860#comment-15232860 ] Vinod Kone commented on MESOS-4962: --- commit 52b8cd9aaecea86c1915af5cae4a6cd78c4f3a23 Author: Vinod Kone Date: Thu Apr 7 13:18:07 2016 -0700 Updated versioning doc with release and support policy. > Support for Mesos releases > -- > > Key: MESOS-4962 > URL: https://issues.apache.org/jira/browse/MESOS-4962 > Project: Mesos > Issue Type: Task >Reporter: Vinod Kone >Assignee: Vinod Kone > Fix For: 0.29.0 > > > As part of Mesos reaching 1.0, we need to formalize the policy of supporting > Mesos releases. > Some specific questions we need to answer: > --> What fixes should we backport to older releases? > --> How many old releases should be supported? > --> Should we have an LTS version? > --> What is the cadence of major, minor and patch releases? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-5080) With NetworkManager off, can't seem to deploy docker containerized apps
[ https://issues.apache.org/jira/browse/MESOS-5080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232842#comment-15232842 ] Greg Mann edited comment on MESOS-5080 at 4/8/16 8:13 PM: -- [~lax77], can you tell me if the following description of your issue is correct? You were successfully able to launch this Docker task on an agent with NetworkManager running on it. Then, you turned NetworkManager off on the agent. Then, you tried launching the same Docker task on the agent and it failed. Is that right? How are you launching this Docker task? Is it via Marathon? Is your Dockerfile mounting the locations of the network config files into the container? If my understanding is correct, if you're using the filesystem/linux isolator on the agent, then the host filesystem will be unavailable and your container won't be able to access those files. What isolators are you loading on the agent? Perhaps [~jieyu] can provide some more thoughts on this? was (Author: greggomann): [~lax77], can you tell me if the following description of your issue is correct? You were successfully able to launch this Docker task on an agent with NetworkManager running on it. Then, you turned NetworkManager off on the agent. Then, you tried launching the same Docker task on the agent and it failed. Is that right? How are you launching this Docker task? Is it via Marathon? Is your Dockerfile mounting the locations of the network config files into the container? If my understanding is correct, if you're using the filesystem/linux isolator on the agent, then the host filesystem will be unavailable and your container won't be able to access those files. Perhaps [~jieyu] can provide some more thoughts on this? 
> With NetworkManager off, can't seem to deploy docker containerized apps > --- > > Key: MESOS-5080 > URL: https://issues.apache.org/jira/browse/MESOS-5080 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.27.0 > Environment: Centos 7.1 >Reporter: Lax > > Mesos is unable to deploy docker container apps after switching the > NetworkManager service off. > > Noticed none of the host network config files (like hosts, hostname, > resolv.conf, etc.) were seen under the docker container dir > (/var/lib/docker/containers/[container id]) if I provision the container through > mesos. The only files seen were config.json and hostconfig.json. > However, when I launch the same container directly via docker run the issue is not > seen. The container dir has all the network config files. > Any reason why Mesos is unable to push those to the container dir? Is there a > specific network setting that needs to be enabled now that I have NetworkManager > switched off? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5080) With NetworkManager off, can't seem to deploy docker containerized apps
[ https://issues.apache.org/jira/browse/MESOS-5080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232842#comment-15232842 ] Greg Mann commented on MESOS-5080: -- [~lax77], can you tell me if the following description of your issue is correct? You were successfully able to launch this Docker task on an agent with NetworkManager running on it. Then, you turned NetworkManager off on the agent. Then, you tried launching the same Docker task on the agent and it failed. Is that right? How are you launching this Docker task? Is it via Marathon? Is your Dockerfile mounting the locations of the network config files into the container? If my understanding is correct, if you're using the filesystem/linux isolator on the agent, then the host filesystem will be unavailable and your container won't be able to access those files. Perhaps [~jieyu] can provide some more thoughts on this? > With NetworkManager off, can't seem to deploy docker containerized apps > --- > > Key: MESOS-5080 > URL: https://issues.apache.org/jira/browse/MESOS-5080 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.27.0 > Environment: Centos 7.1 >Reporter: Lax > > Mesos is unable to deploy docker container apps after switching > NetworkManager service off. > > Noticed none of the Host network config files (like hosts, hostname, > resolv.conf, etc) were seen under the docker container dir > (/var/lib/docker/containers/[container id]) if I provision the container thru > mesos. Only files seen were config.json and hostconfig.json files. > However when I launch same container direct via docker run the issue is not > seen. Container dir has all the network config files. > Any reason why mesos unable to push those to the container dir? is there > specific network setting needs enabled now that I have NetworkManager > switched off? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5061) process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is not connected
[ https://issues.apache.org/jira/browse/MESOS-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232826#comment-15232826 ] Greg Mann commented on MESOS-5061: -- [~Zogg], thanks for the bug report. I'll have a crack at reproducing this; I wonder if [~karya] has any thoughts? > process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is > not connected > -- > > Key: MESOS-5061 > URL: https://issues.apache.org/jira/browse/MESOS-5061 > Project: Mesos > Issue Type: Bug > Components: containerization, modules >Affects Versions: 0.27.0, 0.27.1, 0.28.0, 0.27.2 > Environment: Centos 7.1 >Reporter: Zogg > Fix For: 0.29.0 > > > When launching a task through Marathon and asking the task to assign an IP > (using Calico networking): > {noformat} > { > "id":"/calico-apps", > "apps": [ > { > "id": "hello-world-1", > "cmd": "ip addr && sleep 3", > "cpus": 0.1, > "mem": 64.0, > "ipAddress": { > "groups": ["calico-k8s-network"] > } > } > ] > } > {noformat} > Mesos slave fails to launch a task, locking in STAGING state forewer, with > error: > {noformat} > [centos@rtmi-worker-001 mesos]$ tail mesos-slave.INFO > I0325 20:35:43.420171 13495 slave.cpp:2642] Got registration for executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- from executor(1)@10.0.0.10:33443 > I0325 20:35:43.422652 13495 slave.cpp:1862] Sending queued task > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' to executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- at executor(1)@10.0.0.10:33443 > E0325 20:35:43.423159 13502 process.cpp:1966] Failed to shutdown socket with > fd 22: Transport endpoint is not connected > I0325 20:35:43.423316 13501 slave.cpp:3481] executor(1)@10.0.0.10:33443 exited > {noformat} > However, when deploying a task without ipAddress field, mesos slave launches > a task successfully. 
> Tested with various Mesos/Marathon/Calico versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5151) Marathon Pass Dynamic Value with Parameters Resource in Docker Configuration
[ https://issues.apache.org/jira/browse/MESOS-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232833#comment-15232833 ] Greg Mann commented on MESOS-5151: -- [~jesada], could you elaborate a bit regarding what exactly your use case is? It's not clear to me from the ticket if this is really an improvement for Mesos, or for Marathon. > Marathon Pass Dynamic Value with Parameters Resource in Docker Configuration > > > Key: MESOS-5151 > URL: https://issues.apache.org/jira/browse/MESOS-5151 > Project: Mesos > Issue Type: Wish > Components: docker >Affects Versions: 0.28.0 > Environment: software >Reporter: Jesada Gonkratoke > > "parameters": [ >{ "key": "add-host", "value": "dockerhost:$(hostname -i)" } > ] > }, > # I want to add dynamic host ip -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5061) process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is not connected
[ https://issues.apache.org/jira/browse/MESOS-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232831#comment-15232831 ] Jie Yu commented on MESOS-5061: --- Which network isolator are you using? > process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is > not connected > -- > > Key: MESOS-5061 > URL: https://issues.apache.org/jira/browse/MESOS-5061 > Project: Mesos > Issue Type: Bug > Components: containerization, modules >Affects Versions: 0.27.0, 0.27.1, 0.28.0, 0.27.2 > Environment: Centos 7.1 >Reporter: Zogg > Fix For: 0.29.0 > > > When launching a task through Marathon and asking the task to assign an IP > (using Calico networking): > {noformat} > { > "id":"/calico-apps", > "apps": [ > { > "id": "hello-world-1", > "cmd": "ip addr && sleep 3", > "cpus": 0.1, > "mem": 64.0, > "ipAddress": { > "groups": ["calico-k8s-network"] > } > } > ] > } > {noformat} > Mesos slave fails to launch a task, locking in STAGING state forewer, with > error: > {noformat} > [centos@rtmi-worker-001 mesos]$ tail mesos-slave.INFO > I0325 20:35:43.420171 13495 slave.cpp:2642] Got registration for executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- from executor(1)@10.0.0.10:33443 > I0325 20:35:43.422652 13495 slave.cpp:1862] Sending queued task > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' to executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- at executor(1)@10.0.0.10:33443 > E0325 20:35:43.423159 13502 process.cpp:1966] Failed to shutdown socket with > fd 22: Transport endpoint is not connected > I0325 20:35:43.423316 13501 slave.cpp:3481] executor(1)@10.0.0.10:33443 exited > {noformat} > However, when deploying a task without ipAddress field, mesos slave launches > a task successfully. > Tested with various Mesos/Marathon/Calico versions. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5131) DRF allocator crashes master with CHECK when resource is incorrect
[ https://issues.apache.org/jira/browse/MESOS-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232827#comment-15232827 ] Greg Mann commented on MESOS-5131: -- [~alexr] do you have any thoughts on this? > DRF allocator crashes master with CHECK when resource is incorrect > -- > > Key: MESOS-5131 > URL: https://issues.apache.org/jira/browse/MESOS-5131 > Project: Mesos > Issue Type: Bug > Components: allocation, oversubscription >Reporter: Zhitao Li >Assignee: Zhitao Li >Priority: Critical > > We were testing a custom resource estimator which broadcasts oversubscribed > resources, but they are not marked as "revocable". > This unfortunately triggered the following check in the hierarchical allocator: > {quote} > void HierarchicalAllocatorProcess::updateSlave( > // Check that all the oversubscribed resources are revocable. > CHECK_EQ(oversubscribed, oversubscribed.revocable()); > {quote} > This definitely shouldn't happen in a production cluster. IMO, we should do > both of the following: > 1. Make sure an incorrect resource is not sent from the agent (even crashing the agent > process is better); > 2. Decline agent registration if its resources are incorrect, or even tell it > to shut down, and possibly remove this check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
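To make the failure mode concrete: the quoted `CHECK_EQ` asserts that everything reported as oversubscribed is revocable, and crashes the master otherwise. A hypothetical Python sketch (the real allocator is C++) of the graceful alternative proposed in the ticket, returning an error so the offending agent can be declined or told to shut down instead of crashing the master:

```python
def update_slave(oversubscribed):
    """Hypothetical validation of an agent's oversubscribed-resources update.

    Each resource is modeled as a dict with a boolean 'revocable' flag.
    Instead of a fatal CHECK, return an error message for invalid input
    (None means the update is accepted).
    """
    non_revocable = [r for r in oversubscribed if not r.get("revocable", False)]
    if non_revocable:
        return "oversubscribed resources must be revocable: %r" % non_revocable
    return None

print(update_slave([{"name": "cpus", "revocable": True}]))  # None
print(update_slave([{"name": "cpus"}]))  # prints an error naming the bad resource
```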
[jira] [Updated] (MESOS-5123) Docker task may fail if path to agent work_dir is relative.
[ https://issues.apache.org/jira/browse/MESOS-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5123: - Fix Version/s: 0.29.0 > Docker task may fail if path to agent work_dir is relative. > > > Key: MESOS-5123 > URL: https://issues.apache.org/jira/browse/MESOS-5123 > Project: Mesos > Issue Type: Improvement > Components: docker >Affects Versions: 0.28.0, 0.29.0 >Reporter: Alexander Rukletsov > Labels: docker, documentation, mesosphere > Fix For: 0.29.0 > > > When a local folder for agent’s {{\-\-work_dir}} is specified (e.g., > {{\-\-work_dir=w/s}}) docker complains that there are forbidden symbols in a > *local* volume name. Specifying an absolute path (e.g., > {{\-\-work_dir=/tmp}}) solves the problem. > Docker error observed: > {noformat} > docker: Error response from daemon: create > w/s/slaves/33b8fe47-e9e0-468a-83a6-98c1e3537e59-S1/frameworks/33b8fe47-e9e0-468a-83a6-98c1e3537e59-0001/executors/docker-test/runs/3cc5cb04-d0a9-490e-94d5-d446b66c97cc: > volume name invalid: > "w/s/slaves/33b8fe47-e9e0-468a-83a6-98c1e3537e59-S1/frameworks/33b8fe47-e9e0-468a-83a6-98c1e3537e59-0001/executors/docker-test/runs/3cc5cb04-d0a9-490e-94d5-d446b66c97cc" > includes invalid characters for a local volume name, only > "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. > {noformat} > First off, it is not obvious that Mesos always creates a volume for the > sandbox. We may want to document it. > Second, it's hard to understand that local {{work_dir}} can trigger forbidden > symbols error in docker. Does it make sense to check it during agent launch > if docker containerizer is enabled? Or reject docker tasks during task > validation? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
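The launch-time check floated at the end of MESOS-5123 is cheap to do. Below is a hypothetical Python sketch (the real agent is C++, and the flag handling lives there) of rejecting a relative `--work_dir` before Docker ever sees a sandbox path like `w/s/slaves/...`, which it parses as an invalid local volume name rather than a bind mount:

```python
import os

def check_work_dir(work_dir):
    """Hypothetical agent-launch validation of --work_dir (not Mesos code).

    A relative work_dir produces sandbox paths such as 'w/s/slaves/...';
    Docker treats a non-absolute volume source as a local volume *name*,
    which may only contain [a-zA-Z0-9][a-zA-Z0-9_.-], so '/' is rejected.
    Requiring an absolute path avoids the confusing Docker error entirely.
    """
    if not os.path.isabs(work_dir):
        raise ValueError("--work_dir must be an absolute path, got: %s" % work_dir)

check_work_dir("/tmp/mesos")   # fine: absolute path becomes a bind mount
# check_work_dir("w/s")        # would raise ValueError at agent launch
```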
[jira] [Assigned] (MESOS-5131) DRF allocator crashes master with CHECK when resource is incorrect
[ https://issues.apache.org/jira/browse/MESOS-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhitao Li reassigned MESOS-5131: Assignee: Zhitao Li > DRF allocator crashes master with CHECK when resource is incorrect > -- > > Key: MESOS-5131 > URL: https://issues.apache.org/jira/browse/MESOS-5131 > Project: Mesos > Issue Type: Bug > Components: allocation, oversubscription >Reporter: Zhitao Li >Assignee: Zhitao Li >Priority: Critical > > We were testing a custom resource estimator which broadcasts oversubscribed > resources, but they are not marked as "revocable". > This unfortunately triggered the following check in hierarchical allocator: > {quote} > void HierarchicalAllocatorProcess::updateSlave( > // Check that all the oversubscribed resources are revocable. > CHECK_EQ(oversubscribed, oversubscribed.revocable()); > {quote} > This definitely shouldn't happen in production cluster. IMO, we should do > both of following: > 1. Make sure incorrect resource is not sent from agent (even crash agent > process is better); > 2. Decline agent registration if it's resources is incorrect, or even tell it > to shutdown, and possibly remove this check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5061) process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is not connected
[ https://issues.apache.org/jira/browse/MESOS-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5061: - Fix Version/s: (was: 0.26.0) 0.29.0 > process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is > not connected > -- > > Key: MESOS-5061 > URL: https://issues.apache.org/jira/browse/MESOS-5061 > Project: Mesos > Issue Type: Bug > Components: containerization, modules >Affects Versions: 0.27.0, 0.27.1, 0.28.0, 0.27.2 > Environment: Centos 7.1 >Reporter: Zogg > Fix For: 0.29.0 > > > When launching a task through Marathon and asking the task to assign an IP > (using Calico networking): > {noformat} > { > "id":"/calico-apps", > "apps": [ > { > "id": "hello-world-1", > "cmd": "ip addr && sleep 3", > "cpus": 0.1, > "mem": 64.0, > "ipAddress": { > "groups": ["calico-k8s-network"] > } > } > ] > } > {noformat} > Mesos slave fails to launch a task, locking in STAGING state forewer, with > error: > {noformat} > [centos@rtmi-worker-001 mesos]$ tail mesos-slave.INFO > I0325 20:35:43.420171 13495 slave.cpp:2642] Got registration for executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- from executor(1)@10.0.0.10:33443 > I0325 20:35:43.422652 13495 slave.cpp:1862] Sending queued task > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' to executor > 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework > 23b404e4-700a-4348-a7c0-226239348981- at executor(1)@10.0.0.10:33443 > E0325 20:35:43.423159 13502 process.cpp:1966] Failed to shutdown socket with > fd 22: Transport endpoint is not connected > I0325 20:35:43.423316 13501 slave.cpp:3481] executor(1)@10.0.0.10:33443 exited > {noformat} > However, when deploying a task without ipAddress field, mesos slave launches > a task successfully. > Tested with various Mesos/Marathon/Calico versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5137) Remove 'dashboard.js' from the webui.
[ https://issues.apache.org/jira/browse/MESOS-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-5137: --- Issue Type: Task (was: Bug) Changed this from a Bug to a Task. > Remove 'dashboard.js' from the webui. > - > > Key: MESOS-5137 > URL: https://issues.apache.org/jira/browse/MESOS-5137 > Project: Mesos > Issue Type: Task >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: webui > > This file is no longer in use anywhere. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5154) Build the application using Visual C++ Build Tools 2015
[ https://issues.apache.org/jira/browse/MESOS-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232779#comment-15232779 ] Vinod Kone commented on MESOS-5154: --- Linking the related INFRA ticket https://issues.apache.org/jira/browse/INFRA-11625 [~jlarriba] and [~hausdorff] feel free to add yourselves as watchers to the INFRA ticket for updates and further discussion. > Build the application using Visual C++ Build Tools 2015 > --- > > Key: MESOS-5154 > URL: https://issues.apache.org/jira/browse/MESOS-5154 > Project: Mesos > Issue Type: Improvement >Reporter: Juan Larriba >Assignee: Juan Larriba > > As Microsoft is releasing the Visual C++ Build Tools as a downloadable bundle > without Visual Studio 2015, we should ensure that we can build Mesos without > the need for Visual Studio. > This will make building at CI servers easier, as well as reduce the download > needed to build for Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5060) Requesting /files/read.json with a negative length value causes subsequent /files requests to 404.
[ https://issues.apache.org/jira/browse/MESOS-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-5060: - Fix Version/s: 0.29.0 > Requesting /files/read.json with a negative length value causes subsequent > /files requests to 404. > -- > > Key: MESOS-5060 > URL: https://issues.apache.org/jira/browse/MESOS-5060 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.23.0 > Environment: Mesos 0.23.0 on CentOS 6, also Mesos 0.28.0 on OSX >Reporter: Tom Petr >Assignee: zhou xing >Priority: Minor > Fix For: 0.29.0 > > > I accidentally hit a slave's /files/read.json endpoint with a negative length > (ex. http://hostname:5051/files/read.json?path=XXX&offset=0&length=-100). The > HTTP request timed out after 30 seconds with nothing relevant in the slave > logs, and subsequent calls to any of the /files endpoints on that slave > immediately returned an HTTP 404 response. We ultimately got things working > again by restarting the mesos-slave process (checkpointing FTW!), but it'd be > wise to guard against negative lengths on the slave's end too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
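The guard suggested at the end of the report is easy to sketch. The helper below is hypothetical (not actual Mesos code, which is C++): it validates the offset/length query parameters up front so a bad request fails fast instead of hanging.

```python
# Hypothetical sketch of the suggested guard, not Mesos code:
# validate the offset/length query parameters before touching the
# file, and fail fast instead of hanging or breaking later requests.
def validate_read_params(params):
    """Return (offset, length) as ints, raising ValueError on bad input."""
    try:
        offset = int(params.get("offset", "0"))
        length = int(params.get("length", "0"))
    except ValueError:
        raise ValueError("offset and length must be integers")
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    return offset, length
```

A handler using this check would map the ValueError to an HTTP 400 response rather than timing out.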
[jira] [Created] (MESOS-5156) Run mesos builds on PowerPC platform in ASF CI
Vinod Kone created MESOS-5156: - Summary: Run mesos builds on PowerPC platform in ASF CI Key: MESOS-5156 URL: https://issues.apache.org/jira/browse/MESOS-5156 Project: Mesos Issue Type: Bug Reporter: Vinod Kone This is the last step to declare official support for PowerPC. This is currently blocked on ASF INFRA adding PowerPC based Jenkins machines to the ASF CI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4785) Reorganize ACL subject/object descriptions
[ https://issues.apache.org/jira/browse/MESOS-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232668#comment-15232668 ] Greg Mann commented on MESOS-4785: -- The current docs have the headings {{The currently supported Actions are:}}, {{The currently supported Subjects are:}}, and {{The currently supported Objects are:}}, with lists of Actions, Subjects, and Objects under each, and a long list of examples at the end. I think it would be easier to read if the doc enumerated the Actions, showing the relevant Subject/Object for each. I can imagine a couple ways of organizing things: * A heading for each Action, and below it a description of the ACL, with the relevant Subject/Object and a couple examples * A table showing which Subject and Object is associated with each Action, followed by the current list of examples > Reorganize ACL subject/object descriptions > -- > > Key: MESOS-4785 > URL: https://issues.apache.org/jira/browse/MESOS-4785 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Greg Mann >Assignee: Alexander Rojas > Labels: documentation, mesosphere, security > Fix For: 0.29.0 > > > The authorization documentation would benefit from a reorganization of the > ACL subject/object descriptions. Instead of simple lists of the available > subjects and objects, it would be nice to see a table showing which subject > and object is used with each action. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
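The "Subject/Object per Action" table Greg proposes could be captured roughly as below. This is an illustrative sketch only: the pairings approximate the Mesos authorization model of this era and are neither exhaustive nor an authoritative list of action names.

```python
# Illustrative sketch of the proposed per-action table; the
# pairings are approximations, not an authoritative ACL reference.
ACL_TABLE = {
    # action:              (subject,               object)
    "register_frameworks": ("framework principal", "role"),
    "run_tasks":           ("framework principal", "user the task runs as"),
    "teardown_frameworks": ("operator principal",  "framework principal"),
}

def describe(action):
    subject, obj = ACL_TABLE[action]
    return "%s: Subject = %s, Object = %s" % (action, subject, obj)
```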
[jira] [Created] (MESOS-5155) Consolidate authorization actions for quota.
Alexander Rukletsov created MESOS-5155: -- Summary: Consolidate authorization actions for quota. Key: MESOS-5155 URL: https://issues.apache.org/jira/browse/MESOS-5155 Project: Mesos Issue Type: Improvement Reporter: Alexander Rukletsov We should have just a single authz action: {{UPDATE_QUOTA_WITH_ROLE}}. In retrospect, it was a mistake to introduce multiple actions. The actions that are genuinely asymmetrical are register/teardown and dynamic reservations; they are implemented this way because the entities that perform one action differ from the entities that perform the other. For example, register framework is issued by a framework, while teardown is issued by an operator. What is a good way to identify a framework? Not the role it runs in, which may differ on each launch and makes no sense in a multi-role frameworks setup, but rather a sort of group id: its principal. Dynamic reservations and persistent volumes can be issued by both frameworks and operators, so similar reasoning applies. Quota, by contrast, is associated with a role and set only by operators. Do we need to care about the principals that set it? Not that much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1934) Tasks created with mesos-execute disappear from task list at termination
[ https://issues.apache.org/jira/browse/MESOS-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232569#comment-15232569 ] Ben Hardy commented on MESOS-1934: -- Fair enough, thanks guys. > Tasks created with mesos-execute disappear from task list at termination > > > Key: MESOS-1934 > URL: https://issues.apache.org/jira/browse/MESOS-1934 > Project: Mesos > Issue Type: Bug > Components: cli >Affects Versions: 0.20.1 > Environment: Linux 3.13.0-34-generic kernel, Ubuntu 14.04 >Reporter: Ben Hardy >Priority: Minor > > We are noticing that when tasks created with mesos-execute terminate, either > normally or abnormally, the task disappears from the task list. They do not > appear in the "Completed" section as you would expect, or anywhere else in > the page. > This makes problem diagnosis a bit inconvenient since one has to go digging > around in the logs on slave nodes to find out what went wrong, rather than > just being able to look at logs in the UI as you can with tasks submitted by > Singularity or Marathon. > Not a big deal but would be a nice time saver, and make things more > consistent. > Thanks, > B -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3402) mesos-execute does not support credentials
[ https://issues.apache.org/jira/browse/MESOS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232567#comment-15232567 ] Tim Anderegg commented on MESOS-3402: - [~anandmazumdar] Ah, I should have looked into that first :) I'll watch MESOS-3923 and work on this once that is merged, thanks! > mesos-execute does not support credentials > -- > > Key: MESOS-3402 > URL: https://issues.apache.org/jira/browse/MESOS-3402 > Project: Mesos > Issue Type: Bug >Reporter: Evan Krall >Assignee: Tim Anderegg > > mesos-execute does not appear to support passing credentials. This makes it > impossible to use on a cluster where framework authentication is required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api
[ https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232554#comment-15232554 ] Tim Anderegg commented on MESOS-5148: - According to the Marathon API docs (https://mesosphere.github.io/marathon/docs/generated/api.html) there is no "mesos" key in the "container" object. So if you specify "type": "MESOS", I believe you would still need to use the "docker" key to set your container image settings. The "type" parameter specifies which containerizer to use, not necessarily which container image type to use. "appc" would be the other container image type, but is not yet supported by Marathon. > Supporting Container Images in Mesos Containerizer doesn't work by using > marathon api > - > > Key: MESOS-5148 > URL: https://issues.apache.org/jira/browse/MESOS-5148 > Project: Mesos > Issue Type: Bug >Reporter: wangqun > > Hi > I use the marathon api to create tasks to test Supporting Container > Images in Mesos Containerizer . > My steps is the following: > 1) to run the process in master node. > sudo /usr/sbin/mesos-master --zk=zk://10.0.0.4:2181/mesos --port=5050 > --log_dir=/var/log/mesos --cluster=mesosbay --hostname=10.0.0.4 --ip=10.0.0.4 > --quorum=1 --work_dir=/var/lib/mesos > 2) to run the process in slave node. > sudo /usr/sbin/mesos-slave --master=zk://10.0.0.4:2181/mesos > --log_dir=/var/log/mesos --containerizers=docker,mesos > --executor_registration_timeout=5mins --hostname=10.0.0.5 --ip=10.0.0.5 > --isolation=docker/runtime,filesystem/linux --work_dir=/tmp/mesos/slave > --image_providers=docker --executor_environment_variables="{}" > 3) to create one json file to specify the container to be managed by mesos. 
> sudo touch mesos.json > sudo vim mesos.json > { > "container": { > "type": "MESOS", > "mesos": { > "image": "library/redis" > } > }, > "id": "ubuntumesos", > "instances": 1, > "cpus": 0.5, > "mem": 512, > "uris": [], > "cmd": "ping 8.8.8.8" > } > 4)sudo curl -X POST -H "Content-Type: application/json" > localhost:8080/v2/apps -d...@mesos.json > 5)sudo curl http://localhost:8080/v2/tasks > {"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]} > 6) sudo docker run -ti --net=host redis redis-cli > Could not connect to Redis at 127.0.0.1:6379: Connection refused > not connected> -- This message was sent by Atlassian JIRA (v6.3.4#6332)
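Following Tim's comment above, the app definition would keep "type": "MESOS" but place the image back under the "docker" key. A sketch of the corrected payload, with the field values taken from the report and assuming the Marathon API of this era:

```python
import json

# Corrected app definition per the comment above: the container
# "type" selects the Mesos containerizer, while the image is still
# set under the "docker" key (Marathon's container object has no
# "mesos" key in this API version).
app = {
    "id": "ubuntumesos",
    "instances": 1,
    "cpus": 0.5,
    "mem": 512,
    "uris": [],
    "cmd": "ping 8.8.8.8",
    "container": {
        "type": "MESOS",
        "docker": {"image": "library/redis"},
    },
}

payload = json.dumps(app)  # body for the POST to /v2/apps
```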
[jira] [Comment Edited] (MESOS-3402) mesos-execute does not support credentials
[ https://issues.apache.org/jira/browse/MESOS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232547#comment-15232547 ] Anand Mazumdar edited comment on MESOS-3402 at 4/8/16 5:40 PM: --- [~tanderegg] Thanks for taking this on. We recently moved {{mesos-execute}} to use the scheduler library that does not yet have AuthN support. But I intend to work on it soon, in ~2-3 days. I have linked the corresponding JIRA to this ticket. The next step would be finding a shepherd for this once MESOS-3923 is resolved. Sound reasonable? was (Author: anandmazumdar): [~tanderegg] Thanks for taking this on Tim. We recently moved {{mesos-execute}} to use the scheduler library that does not yet have AuthN support. But I intend to work on it soon, in ~2-3 days. I have linked the corresponding JIRA to this ticket. The next step would be finding a shepherd for this once MESOS-3923 is resolved. Sound reasonable? > mesos-execute does not support credentials > -- > > Key: MESOS-3402 > URL: https://issues.apache.org/jira/browse/MESOS-3402 > Project: Mesos > Issue Type: Bug >Reporter: Evan Krall >Assignee: Tim Anderegg > > mesos-execute does not appear to support passing credentials. This makes it > impossible to use on a cluster where framework authentication is required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3402) mesos-execute does not support credentials
[ https://issues.apache.org/jira/browse/MESOS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232547#comment-15232547 ] Anand Mazumdar commented on MESOS-3402: --- [~tanderegg] Thanks for taking this on Tim. We recently moved {{mesos-execute}} to use the scheduler library that does not yet have AuthN support. But I intend to work on it soon, in ~2-3 days. I have linked the corresponding JIRA to this ticket. The next step would be finding a shepherd for this once MESOS-3923 is resolved. Sound reasonable? > mesos-execute does not support credentials > -- > > Key: MESOS-3402 > URL: https://issues.apache.org/jira/browse/MESOS-3402 > Project: Mesos > Issue Type: Bug >Reporter: Evan Krall >Assignee: Tim Anderegg > > mesos-execute does not appear to support passing credentials. This makes it > impossible to use on a cluster where framework authentication is required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3402) mesos-execute does not support credentials
[ https://issues.apache.org/jira/browse/MESOS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232510#comment-15232510 ] Tim Anderegg commented on MESOS-3402: - I'd like to take this one on, if no one else is interested, since I've been using mesos-execute to test out the new appc functionality. In addition to generally supporting credentials, I was going to add a "user" flag to specify which user to run as, in order to properly test authorization and containers that have predefined users as well. I can create a separate issue for that if folks consider it to be out of the scope of this issue. > mesos-execute does not support credentials > -- > > Key: MESOS-3402 > URL: https://issues.apache.org/jira/browse/MESOS-3402 > Project: Mesos > Issue Type: Bug >Reporter: Evan Krall >Assignee: Tim Anderegg > > mesos-execute does not appear to support passing credentials. This makes it > impossible to use on a cluster where framework authentication is required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1934) Tasks created with mesos-execute disappear from task list at termination
[ https://issues.apache.org/jira/browse/MESOS-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232488#comment-15232488 ] Tim Anderegg commented on MESOS-1934: - FWIW, I ran into this issue too but once I figured out the method Adam described, that worked fine for me, so I'd +1 closing this. > Tasks created with mesos-execute disappear from task list at termination > > > Key: MESOS-1934 > URL: https://issues.apache.org/jira/browse/MESOS-1934 > Project: Mesos > Issue Type: Bug > Components: cli >Affects Versions: 0.20.1 > Environment: Linux 3.13.0-34-generic kernel, Ubuntu 14.04 >Reporter: Ben Hardy >Priority: Minor > > We are noticing that when tasks created with mesos-execute terminate, either > normally or abnormally, the task disappears from the task list. They do not > appear in the "Completed" section as you would expect, or anywhere else in > the page. > This makes problem diagnosis a bit inconvenient since one has to go digging > around in the logs on slave nodes to find out what went wrong, rather than > just being able to look at logs in the UI as you can with tasks submitted by > Singularity or Marathon. > Not a big deal but would be a nice time saver, and make things more > consistent. > Thanks, > B -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3402) mesos-execute does not support credentials
[ https://issues.apache.org/jira/browse/MESOS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Anderegg reassigned MESOS-3402: --- Assignee: Tim Anderegg > mesos-execute does not support credentials > -- > > Key: MESOS-3402 > URL: https://issues.apache.org/jira/browse/MESOS-3402 > Project: Mesos > Issue Type: Bug >Reporter: Evan Krall >Assignee: Tim Anderegg > > mesos-execute does not appear to support passing credentials. This makes it > impossible to use on a cluster where framework authentication is required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4175) ContentType/SchedulerTest.Decline is slow.
[ https://issues.apache.org/jira/browse/MESOS-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4175: --- Shepherd: Alexander Rukletsov Summary: ContentType/SchedulerTest.Decline is slow. (was: ContentType/SchedulerTest.Decline is slow) > ContentType/SchedulerTest.Decline is slow. > -- > > Key: MESOS-4175 > URL: https://issues.apache.org/jira/browse/MESOS-4175 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Shuai Lin >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{ContentType/SchedulerTest.Decline}} test takes more than {{1s}} to > finish on my Mac OS 10.10.4: > {code} > ContentType/SchedulerTest.Decline/0 (1022 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4162) SlaveTest.MetricsSlaveLaunchErrors is slow.
[ https://issues.apache.org/jira/browse/MESOS-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232243#comment-15232243 ] haosdent commented on MESOS-4162: - This has been fixed in [MESOS-4783 | https://issues.apache.org/jira/browse/MESOS-4783]. > SlaveTest.MetricsSlaveLaunchErrors is slow. > --- > > Key: MESOS-4162 > URL: https://issues.apache.org/jira/browse/MESOS-4162 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: haosdent >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > Fix For: 0.28.0 > > > The {{SlaveTest.MetricsSlaveLaunchErrors}} test takes around {{1s}} to finish > on my Mac OS 10.10.4: > {code} > SlaveTest.MetricsSlaveLaunchErrors (1009 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4165) MasterTest.MasterInfoOnReElection is slow.
[ https://issues.apache.org/jira/browse/MESOS-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4165: --- Shepherd: Alexander Rukletsov Summary: MasterTest.MasterInfoOnReElection is slow. (was: MasterTest.MasterInfoOnReElection is slow) > MasterTest.MasterInfoOnReElection is slow. > -- > > Key: MESOS-4165 > URL: https://issues.apache.org/jira/browse/MESOS-4165 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: haosdent >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{MasterTest.MasterInfoOnReElection}} test takes more than {{1s}} to > finish on my Mac OS 10.10.4: > {code} > MasterTest.MasterInfoOnReElection (1024 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4164) MasterTest.RecoverResources is slow.
[ https://issues.apache.org/jira/browse/MESOS-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4164: --- Shepherd: Alexander Rukletsov Summary: MasterTest.RecoverResources is slow. (was: MasterTest.RecoverResources is slow) > MasterTest.RecoverResources is slow. > > > Key: MESOS-4164 > URL: https://issues.apache.org/jira/browse/MESOS-4164 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: haosdent >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{MasterTest.RecoverResources}} test takes more than {{1s}} to finish on > my Mac OS 10.10.4: > {code} > MasterTest.RecoverResources (1018 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3775) MasterAllocatorTest.SlaveLost is slow.
[ https://issues.apache.org/jira/browse/MESOS-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3775: --- Summary: MasterAllocatorTest.SlaveLost is slow. (was: MasterAllocatorTest.SlaveLost is slow) > MasterAllocatorTest.SlaveLost is slow. > -- > > Key: MESOS-3775 > URL: https://issues.apache.org/jira/browse/MESOS-3775 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Shuai Lin >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{MasterAllocatorTest.SlaveLost}} test takes more than {{5s}} to complete. A > brief look into the code hints that the stopped agent does not quit > immediately (and hence its resources are not released by the allocator) > because [it waits for the executor to > terminate|https://github.com/apache/mesos/blob/master/src/tests/master_allocator_tests.cpp#L717]. > The {{5s}} timeout comes from the {{EXECUTOR_SHUTDOWN_GRACE_PERIOD}} agent constant. > Possible solutions: > * Do not wait until the stopped agent quits (can be flaky, needs deeper > analysis). > * Decrease the agent's {{executor_shutdown_grace_period}} flag. > * Terminate the executor faster (this may require some refactoring since the > executor driver is created in the {{TestContainerizer}} and we do not have > direct access to it). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4162) SlaveTest.MetricsSlaveLaunchErrors is slow.
[ https://issues.apache.org/jira/browse/MESOS-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4162: --- Summary: SlaveTest.MetricsSlaveLaunchErrors is slow. (was: SlaveTest.MetricsSlaveLaunchErrors is slow) > SlaveTest.MetricsSlaveLaunchErrors is slow. > --- > > Key: MESOS-4162 > URL: https://issues.apache.org/jira/browse/MESOS-4162 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: haosdent >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{SlaveTest.MetricsSlaveLaunchErrors}} test takes around {{1s}} to finish > on my Mac OS 10.10.4: > {code} > SlaveTest.MetricsSlaveLaunchErrors (1009 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3775) MasterAllocatorTest.SlaveLost is slow.
[ https://issues.apache.org/jira/browse/MESOS-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3775: --- Shepherd: Alexander Rukletsov > MasterAllocatorTest.SlaveLost is slow. > -- > > Key: MESOS-3775 > URL: https://issues.apache.org/jira/browse/MESOS-3775 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Shuai Lin >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > The {{MasterAllocatorTest.SlaveLost}} test takes more than {{5s}} to complete. A > brief look into the code hints that the stopped agent does not quit > immediately (and hence its resources are not released by the allocator) > because [it waits for the executor to > terminate|https://github.com/apache/mesos/blob/master/src/tests/master_allocator_tests.cpp#L717]. > The {{5s}} timeout comes from the {{EXECUTOR_SHUTDOWN_GRACE_PERIOD}} agent constant. > Possible solutions: > * Do not wait until the stopped agent quits (can be flaky, needs deeper > analysis). > * Decrease the agent's {{executor_shutdown_grace_period}} flag. > * Terminate the executor faster (this may require some refactoring since the > executor driver is created in the {{TestContainerizer}} and we do not have > direct access to it). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4160) Log recover tests are slow
[ https://issues.apache.org/jira/browse/MESOS-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4160: --- Shepherd: Alexander Rukletsov > Log recover tests are slow > -- > > Key: MESOS-4160 > URL: https://issues.apache.org/jira/browse/MESOS-4160 > Project: Mesos > Issue Type: Improvement > Components: technical debt, test >Reporter: Alexander Rukletsov >Assignee: Shuai Lin >Priority: Minor > Labels: mesosphere, newbie++, tech-debt > > On Mac OS 10.10.4, some tests take longer than {{1s}} to finish: > {code} > RecoverTest.AutoInitialization (1003 ms) > RecoverTest.AutoInitializationRetry (1000 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5124) TASK_KILLING is not supported by mesos-execute.
[ https://issues.apache.org/jira/browse/MESOS-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-5124: --- Description: Recently {{TASK_KILLING}} state (MESOS-4547) has been introduced to Mesos. We should add support for this feature to {{mesos-execute}}. (was: Recently {{KillPolicy}} protobuf (MESOS-4909) and {{TASK_KILLING}} state (MESOS-4547) have been introduced to Mesos. We should add support for these features to {{mesos-execute}}.) > TASK_KILLING is not supported by mesos-execute. > --- > > Key: MESOS-5124 > URL: https://issues.apache.org/jira/browse/MESOS-5124 > Project: Mesos > Issue Type: Improvement > Components: cli >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: mesosphere > > Recently {{TASK_KILLING}} state (MESOS-4547) has been introduced to Mesos. > We should add support for this feature to {{mesos-execute}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5124) TASK_KILLING is not supported by mesos-execute.
[ https://issues.apache.org/jira/browse/MESOS-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-5124: --- Summary: TASK_KILLING is not supported by mesos-execute. (was: Kill policies are not supported by mesos-execute.) > TASK_KILLING is not supported by mesos-execute. > --- > > Key: MESOS-5124 > URL: https://issues.apache.org/jira/browse/MESOS-5124 > Project: Mesos > Issue Type: Improvement > Components: cli >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: mesosphere > > Recently {{KillPolicy}} protobuf (MESOS-4909) and {{TASK_KILLING}} state > (MESOS-4547) have been introduced to Mesos. We should add support for these > features to {{mesos-execute}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5154) Build the application using Visual C++ Build Tools 2015
Juan Larriba created MESOS-5154: --- Summary: Build the application using Visual C++ Build Tools 2015 Key: MESOS-5154 URL: https://issues.apache.org/jira/browse/MESOS-5154 Project: Mesos Issue Type: Improvement Reporter: Juan Larriba Assignee: Juan Larriba As Microsoft is releasing the Visual C++ Build Tools as a downloadable bundle without Visual Studio 2015, we should ensure that we can build Mesos without the need for Visual Studio. This will make building at CI servers easier, as well as reduce the download needed to build for Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5153) Sandboxes contents should be protected from unauthorized users
[ https://issues.apache.org/jira/browse/MESOS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5153: -- Story Points: 8 > Sandboxes contents should be protected from unauthorized users > -- > > Key: MESOS-5153 > URL: https://issues.apache.org/jira/browse/MESOS-5153 > Project: Mesos > Issue Type: Bug > Components: security, slave >Reporter: Alexander Rojas >Assignee: Alexander Rojas > Labels: mesosphere, security > > MESOS-4956 introduced authentication support for the sandboxes. However, > authentication can only go as far as to tell whether a user is known to > Mesos or not. An additional step is necessary to verify whether the > known user is allowed to execute the requested operation on the sandbox > (browse, read, download, debug). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5153) Sandboxes contents should be protected from unauthorized users
Alexander Rojas created MESOS-5153: -- Summary: Sandboxes contents should be protected from unauthorized users Key: MESOS-5153 URL: https://issues.apache.org/jira/browse/MESOS-5153 Project: Mesos Issue Type: Bug Components: security, slave Reporter: Alexander Rojas Assignee: Alexander Rojas MESOS-4956 introduced authentication support for the sandboxes. However, authentication can only go as far as to tell whether a user is known to Mesos or not. An additional step is necessary to verify whether the known user is allowed to execute the requested operation on the sandbox (browse, read, download, debug). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4785) Reorganize ACL subject/object descriptions
[ https://issues.apache.org/jira/browse/MESOS-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rojas reassigned MESOS-4785: -- Assignee: Alexander Rojas > Reorganize ACL subject/object descriptions > -- > > Key: MESOS-4785 > URL: https://issues.apache.org/jira/browse/MESOS-4785 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Greg Mann >Assignee: Alexander Rojas > Labels: documentation, mesosphere, security > Fix For: 0.29.0 > > > The authorization documentation would benefit from a reorganization of the > ACL subject/object descriptions. Instead of simple lists of the available > subjects and objects, it would be nice to see a table showing which subject > and object is used with each action. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4902) Add authentication to libprocess endpoints
[ https://issues.apache.org/jira/browse/MESOS-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4902: -- Assignee: Greg Mann > Add authentication to libprocess endpoints > -- > > Key: MESOS-4902 > URL: https://issues.apache.org/jira/browse/MESOS-4902 > Project: Mesos > Issue Type: Improvement > Components: HTTP API >Reporter: Greg Mann >Assignee: Greg Mann > Labels: authentication, http, mesosphere, security > > In addition to the endpoints addressed by MESOS-4850 and MESOS-5152, the > following endpoints would also benefit from HTTP authentication: > * {{/profiler/*}} > * {{/logging/toggle}} > * {{/metrics/snapshot}} > * {{/system/stats.json}} > Adding HTTP authentication to these endpoints is a bit more complicated > because they are defined at the libprocess level. > While working on MESOS-4850, it became apparent that since our tests use the > same instance of libprocess for both master and agent, different default > authentication realms must be used for master/agent so that HTTP > authentication can be independently enabled/disabled for each. > We should establish a mechanism for making an endpoint authenticated that > allows us to: > 1) Install an endpoint like {{/files}}, whose code is shared by the master > and agent, with different authentication realms for the master and agent > 2) Avoid hard-coding a default authentication realm into libprocess, to > permit the use of different authentication realms for the master and agent > and to keep application-level concerns from leaking into libprocess > Another option would be to use a single default authentication realm and > always enable or disable HTTP authentication for *both* the master and agent > in tests. However, this wouldn't allow us to test scenarios where HTTP > authentication is enabled on one but disabled on the other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
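The requirement in point 1 of the ticket (one shared endpoint handler, a different authentication realm per process) can be illustrated abstractly. All names below are hypothetical and are not the libprocess API:

```python
# Abstract illustration of the requirement above: the same endpoint
# handler installed under different authentication realms for the
# master and the agent. Hypothetical names, not the libprocess API.
class Process:
    def __init__(self, name):
        self.name = name
        self.routes = {}  # path -> (handler, realm)

    def install(self, path, handler, realm):
        self.routes[path] = (handler, realm)

def files_handler(request):
    return "200 OK"

master = Process("master")
agent = Process("agent")

# Shared handler code, distinct realms: authentication can then be
# enabled or disabled independently for each process.
master.install("/files", files_handler, realm="master-realm")
agent.install("/files", files_handler, realm="agent-realm")
```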
[jira] [Commented] (MESOS-4902) Add authentication to libprocess endpoints
[ https://issues.apache.org/jira/browse/MESOS-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231950#comment-15231950 ] Adam B commented on MESOS-4902: --- I split out /monitor/statistics into MESOS-5152, so this ticket can now focus on the libprocess endpoints alone. > Add authentication to libprocess endpoints > -- > > Key: MESOS-4902 > URL: https://issues.apache.org/jira/browse/MESOS-4902 > Project: Mesos > Issue Type: Improvement > Components: HTTP API >Reporter: Greg Mann > Labels: authentication, http, mesosphere, security > > In addition to the endpoints addressed by MESOS-4850 and MESOS-5152, the > following endpoints would also benefit from HTTP authentication: > * {{/profiler/*}} > * {{/logging/toggle}} > * {{/metrics/snapshot}} > * {{/system/stats.json}} > Adding HTTP authentication to these endpoints is a bit more complicated > because they are defined at the libprocess level. > While working on MESOS-4850, it became apparent that since our tests use the > same instance of libprocess for both master and agent, different default > authentication realms must be used for master/agent so that HTTP > authentication can be independently enabled/disabled for each. > We should establish a mechanism for making an endpoint authenticated that > allows us to: > 1) Install an endpoint like {{/files}}, whose code is shared by the master > and agent, with different authentication realms for the master and agent > 2) Avoid hard-coding a default authentication realm into libprocess, to > permit the use of different authentication realms for the master and agent > and to keep application-level concerns from leaking into libprocess > Another option would be to use a single default authentication realm and > always enable or disable HTTP authentication for *both* the master and agent > in tests. However, this wouldn't allow us to test scenarios where HTTP > authentication is enabled on one but disabled on the other. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4902) Add authentication to libprocess endpoints
[ https://issues.apache.org/jira/browse/MESOS-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4902: -- Description: In addition to the endpoints addressed by MESOS-4850 and MESOS-5152, the following endpoints would also benefit from HTTP authentication: * {{/profiler/*}} * {{/logging/toggle}} * {{/metrics/snapshot}} * {{/system/stats.json}} Adding HTTP authentication to these endpoints is a bit more complicated because they are defined at the libprocess level. While working on MESOS-4850, it became apparent that since our tests use the same instance of libprocess for both master and agent, different default authentication realms must be used for master/agent so that HTTP authentication can be independently enabled/disabled for each. We should establish a mechanism for making an endpoint authenticated that allows us to: 1) Install an endpoint like {{/files}}, whose code is shared by the master and agent, with different authentication realms for the master and agent 2) Avoid hard-coding a default authentication realm into libprocess, to permit the use of different authentication realms for the master and agent and to keep application-level concerns from leaking into libprocess Another option would be to use a single default authentication realm and always enable or disable HTTP authentication for *both* the master and agent in tests. However, this wouldn't allow us to test scenarios where HTTP authentication is enabled on one but disabled on the other. was: In addition to the endpoints addressed by MESOS-4850 and MESOS-4951, the following endpoints would also benefit from HTTP authentication: * {{/profiler/*}} * {{/logging/toggle}} * {{/metrics/snapshot}} * {{/monitor/statistics}} * {{/system/stats.json}} Adding HTTP authentication to these endpoints is a bit more complicated: some endpoints are defined at the libprocess level, while others are defined in code that is shared by the master and agent. 
While working on MESOS-4850, it became apparent that since our tests use the same instance of libprocess for both master and agent, different default authentication realms must be used for master/agent so that HTTP authentication can be independently enabled/disabled for each. We should establish a mechanism for making an endpoint authenticated that allows us to: 1) Install an endpoint like {{/files}}, whose code is shared by the master and agent, with different authentication realms for the master and agent 2) Avoid hard-coding a default authentication realm into libprocess, to permit the use of different authentication realms for the master and agent and to keep application-level concerns from leaking into libprocess Another option would be to use a single default authentication realm and always enable or disable HTTP authentication for *both* the master and agent in tests. However, this wouldn't allow us to test scenarios where HTTP authentication is enabled on one but disabled on the other. > Add authentication to libprocess endpoints > -- > > Key: MESOS-4902 > URL: https://issues.apache.org/jira/browse/MESOS-4902 > Project: Mesos > Issue Type: Improvement > Components: HTTP API >Reporter: Greg Mann > Labels: authentication, http, mesosphere, security > > In addition to the endpoints addressed by MESOS-4850 and MESOS-5152, the > following endpoints would also benefit from HTTP authentication: > * {{/profiler/*}} > * {{/logging/toggle}} > * {{/metrics/snapshot}} > * {{/system/stats.json}} > Adding HTTP authentication to these endpoints is a bit more complicated > because they are defined at the libprocess level. > While working on MESOS-4850, it became apparent that since our tests use the > same instance of libprocess for both master and agent, different default > authentication realms must be used for master/agent so that HTTP > authentication can be independently enabled/disabled for each. 
> We should establish a mechanism for making an endpoint authenticated that > allows us to: > 1) Install an endpoint like {{/files}}, whose code is shared by the master > and agent, with different authentication realms for the master and agent > 2) Avoid hard-coding a default authentication realm into libprocess, to > permit the use of different authentication realms for the master and agent > and to keep application-level concerns from leaking into libprocess > Another option would be to use a single default authentication realm and > always enable or disable HTTP authentication for *both* the master and agent > in tests. However, this wouldn't allow us to test scenarios where HTTP > authentication is enabled on one but disabled on the other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4902) Add authentication to libprocess endpoints
[ https://issues.apache.org/jira/browse/MESOS-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4902: -- Summary: Add authentication to libprocess endpoints (was: Add authentication to remaining agent endpoints) > Add authentication to libprocess endpoints > -- > > Key: MESOS-4902 > URL: https://issues.apache.org/jira/browse/MESOS-4902 > Project: Mesos > Issue Type: Improvement > Components: HTTP API >Reporter: Greg Mann > Labels: authentication, http, mesosphere, security > > In addition to the endpoints addressed by MESOS-4850 and MESOS-4951, the > following endpoints would also benefit from HTTP authentication: > * {{/profiler/*}} > * {{/logging/toggle}} > * {{/metrics/snapshot}} > * {{/monitor/statistics}} > * {{/system/stats.json}} > Adding HTTP authentication to these endpoints is a bit more complicated: some > endpoints are defined at the libprocess level, while others are defined in > code that is shared by the master and agent. > While working on MESOS-4850, it became apparent that since our tests use the > same instance of libprocess for both master and agent, different default > authentication realms must be used for master/agent so that HTTP > authentication can be independently enabled/disabled for each. > We should establish a mechanism for making an endpoint authenticated that > allows us to: > 1) Install an endpoint like {{/files}}, whose code is shared by the master > and agent, with different authentication realms for the master and agent > 2) Avoid hard-coding a default authentication realm into libprocess, to > permit the use of different authentication realms for the master and agent > and to keep application-level concerns from leaking into libprocess > Another option would be to use a single default authentication realm and > always enable or disable HTTP authentication for *both* the master and agent > in tests. 
However, this wouldn't allow us to test scenarios where HTTP > authentication is enabled on one but disabled on the other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5152) Add authentication to agent's /monitor/statistics endpoint
Adam B created MESOS-5152: - Summary: Add authentication to agent's /monitor/statistics endpoint Key: MESOS-5152 URL: https://issues.apache.org/jira/browse/MESOS-5152 Project: Mesos Issue Type: Task Components: security, slave Reporter: Adam B Operators may want to enforce that only authenticated users (and subsequently only specific authorized users) be able to view per-executor resource usage statistics. Since this endpoint is handled by the ResourceMonitorProcess, I would expect the work necessary to be similar to what was done for /files or /registry endpoint authn. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5063) SSLTest.HTTPSPost and SSLTest.HTTPSGet tests fail
[ https://issues.apache.org/jira/browse/MESOS-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231923#comment-15231923 ] Chen Zhiwei commented on MESOS-5063: We should disable HTTP_PARSER_STRICT mode to get this to work. I don't know how to add the macro HTTP_PARSER_STRICT=0 to the CMake build, so I only updated the Makefile.am: https://reviews.apache.org/r/45917/ > SSLTest.HTTPSPost and SSLTest.HTTPSGet tests fail > - > > Key: MESOS-5063 > URL: https://issues.apache.org/jira/browse/MESOS-5063 > Project: Mesos > Issue Type: Bug > Components: libprocess >Affects Versions: 0.29.0 > Environment: Configured with SSL enabled >Reporter: Greg Mann >Assignee: Chen Zhiwei >Priority: Critical > Labels: mesosphere, ssl, tests > Fix For: 0.29.0 > > > These tests fail, with minimal logging output: > {code} > [ RUN ] SSLTest.HTTPSGet > ../../../3rdparty/libprocess/src/tests/ssl_tests.cpp:663: Failure > (response).failure(): Failed to decode response > [ FAILED ] SSLTest.HTTPSGet (137 ms) > [ RUN ] SSLTest.HTTPSPost > ../../../3rdparty/libprocess/src/tests/ssl_tests.cpp:704: Failure > (response).failure(): Failed to decode response > [ FAILED ] SSLTest.HTTPSPost (243 ms) > {code} > It's worth noting that the 3rdparty http-parser library was recently > upgraded: > https://github.com/apache/mesos/commit/94df63f72146501872a06c6487e94bdfd0f23025 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
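For context, the fix the comment above refers to amounts to passing `-DHTTP_PARSER_STRICT=0` to the compiler when building the bundled http-parser. A sketch of what such a Makefile.am fragment could look like — the variable name `libprocess_la_CPPFLAGS` is hypothetical here, and the exact target depends on the build setup (see the linked review for the actual change):

```
# Hypothetical Makefile.am fragment: compile the bundled http-parser
# with strict mode disabled, relaxing its URL/header validation.
libprocess_la_CPPFLAGS += -DHTTP_PARSER_STRICT=0
```

The CMake side of the build would presumably need an equivalent compile definition (e.g. via `add_definitions(-DHTTP_PARSER_STRICT=0)`), which is the part the commenter says is still missing.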
[jira] [Updated] (MESOS-4951) Enable actors to pass an authentication realm to libprocess
[ https://issues.apache.org/jira/browse/MESOS-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4951: -- Assignee: Greg Mann > Enable actors to pass an authentication realm to libprocess > --- > > Key: MESOS-4951 > URL: https://issues.apache.org/jira/browse/MESOS-4951 > Project: Mesos > Issue Type: Improvement > Components: libprocess, slave >Reporter: Greg Mann >Assignee: Greg Mann > Labels: authentication, http, mesosphere, security > > To prepare for MESOS-4902, the Mesos master and agent need a way to pass the > desired authentication realm to libprocess. Since some endpoints (like > {{/profiler/*}}) get installed in libprocess, the master/agent should be able > to specify during initialization what authentication realm the > libprocess-level endpoints will be authenticated under. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5151) Marathon Pass Dynamic Value with Parameters Resource in Docker Configuration
Jesada Gonkratoke created MESOS-5151: Summary: Marathon Pass Dynamic Value with Parameters Resource in Docker Configuration Key: MESOS-5151 URL: https://issues.apache.org/jira/browse/MESOS-5151 Project: Mesos Issue Type: Wish Components: docker Affects Versions: 0.28.0 Environment: software Reporter: Jesada Gonkratoke "parameters": [ { "key": "add-host", "value": "dockerhost:$(hostname -i)" } ] }, # I want to add dynamic host ip -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5142) Add agent flags for HTTP authorization
[ https://issues.apache.org/jira/browse/MESOS-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5142: -- Sprint: Mesosphere Sprint 32 > Add agent flags for HTTP authorization > -- > > Key: MESOS-5142 > URL: https://issues.apache.org/jira/browse/MESOS-5142 > Project: Mesos > Issue Type: Bug > Components: security, slave >Reporter: Jan Schlicht >Assignee: Jan Schlicht > Labels: mesosphere, security > > Flags should be added to the agent to: > 1. Enable authorization ({{--authorizers}}) > 2. Provide ACLs ({{--acls}}) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
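To make the two proposed flags concrete: {{--acls}} would point the agent at a JSON ACL document, as the master's flag of the same name already does. The fragment below is purely illustrative — the ACL action name `access_sandboxes` and the principals/users shown are assumptions, since the agent-side ACL schema was still being defined at the time of this ticket:

```json
{
  "access_sandboxes": [
    {
      "principals": { "values": ["ops"] },
      "users": { "type": "ANY" }
    }
  ]
}
```

With such a file in place, the agent would be started with something like `mesos-slave --authorizers=local --acls=file:///etc/mesos/acls.json` (paths and values again illustrative).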
[jira] [Created] (MESOS-5150) Authorize Agent HTTP Endpoints
Adam B created MESOS-5150: - Summary: Authorize Agent HTTP Endpoints Key: MESOS-5150 URL: https://issues.apache.org/jira/browse/MESOS-5150 Project: Mesos Issue Type: Epic Components: security, slave Reporter: Adam B Assignee: Alexander Rojas As we add authentication in agent http endpoint handlers in MESOS-4847, we now have the opportunity to perform ACL-based authorization on these endpoints. Most important is the authorization of the /files endpoints, as those allow access to executor sandboxes (and agent logs), and the operator may wish to control which users may access which sandboxes. Similarly, the operator may only want certain users to be able to view agent flags, change logging level, enable the profiler, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4785) Reorganize ACL subject/object descriptions
[ https://issues.apache.org/jira/browse/MESOS-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231825#comment-15231825 ] Adam B commented on MESOS-4785: --- [~arojas], can you look into this so we can close out this Epic? > Reorganize ACL subject/object descriptions > -- > > Key: MESOS-4785 > URL: https://issues.apache.org/jira/browse/MESOS-4785 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Greg Mann > Labels: documentation, mesosphere, security > Fix For: 0.29.0 > > > The authorization documentation would benefit from a reorganization of the > ACL subject/object descriptions. Instead of simple lists of the available > subjects and objects, it would be nice to see a table showing which subject > and object is used with each action. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4785) Reorganize ACL subject/object descriptions
[ https://issues.apache.org/jira/browse/MESOS-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4785: -- Fix Version/s: 0.29.0 > Reorganize ACL subject/object descriptions > -- > > Key: MESOS-4785 > URL: https://issues.apache.org/jira/browse/MESOS-4785 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Greg Mann > Labels: documentation, mesosphere, security > Fix For: 0.29.0 > > > The authorization documentation would benefit from a reorganization of the > ACL subject/object descriptions. Instead of simple lists of the available > subjects and objects, it would be nice to see a table showing which subject > and object is used with each action. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5146) MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky
[ https://issues.apache.org/jira/browse/MESOS-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5146: -- Fix Version/s: 0.29.0 > MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky > -- > > Key: MESOS-5146 > URL: https://issues.apache.org/jira/browse/MESOS-5146 > Project: Mesos > Issue Type: Bug > Components: allocation, tests >Affects Versions: 0.28.0 > Environment: Ubuntu 14.04 using clang, without libevent or SSL >Reporter: Greg Mann >Assignee: Yongqiao Wang > Labels: mesosphere > Fix For: 0.29.0 > > > Observed on the ASF CI: > {code} > [ RUN ] MasterAllocatorTest/1.RebalancedForUpdatedWeights > I0407 22:34:10.330394 29278 cluster.cpp:149] Creating default 'local' > authorizer > I0407 22:34:10.466182 29278 leveldb.cpp:174] Opened db in 135.608207ms > I0407 22:34:10.516398 29278 leveldb.cpp:181] Compacted db in 50.159558ms > I0407 22:34:10.516464 29278 leveldb.cpp:196] Created db iterator in 34959ns > I0407 22:34:10.516484 29278 leveldb.cpp:202] Seeked to beginning of db in > 10195ns > I0407 22:34:10.516496 29278 leveldb.cpp:271] Iterated through 0 keys in the > db in 7324ns > I0407 22:34:10.516547 29278 replica.cpp:779] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I0407 22:34:10.517277 29298 recover.cpp:447] Starting replica recovery > I0407 22:34:10.517693 29300 recover.cpp:473] Replica is in EMPTY status > I0407 22:34:10.520251 29310 replica.cpp:673] Replica in EMPTY status received > a broadcasted recover request from (4775)@172.17.0.3:35855 > I0407 22:34:10.520611 29311 recover.cpp:193] Received a recover response from > a replica in EMPTY status > I0407 22:34:10.521164 29299 recover.cpp:564] Updating replica status to > STARTING > I0407 22:34:10.523435 29298 master.cpp:382] Master > f59f9057-a5c7-43e1-b129-96862e640a12 (129e11060069) started on > 172.17.0.3:35855 > I0407 22:34:10.523473 29298 master.cpp:384] Flags at startup: --acls="" > --allocation_interval="1secs" 
--allocator="HierarchicalDRF" > --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/3rZY8C/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --http_authenticators="basic" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" > --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="100secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" > --work_dir="/tmp/3rZY8C/master" --zk_session_timeout="10secs" > I0407 22:34:10.523885 29298 master.cpp:433] Master only allowing > authenticated frameworks to register > I0407 22:34:10.523901 29298 master.cpp:438] Master only allowing > authenticated agents to register > I0407 22:34:10.523913 29298 credentials.hpp:37] Loading credentials for > authentication from '/tmp/3rZY8C/credentials' > I0407 22:34:10.524298 29298 master.cpp:480] Using default 'crammd5' > authenticator > I0407 22:34:10.524441 29298 master.cpp:551] Using default 'basic' HTTP > authenticator > I0407 22:34:10.524564 29298 master.cpp:589] Authorization enabled > I0407 22:34:10.525269 29305 hierarchical.cpp:145] Initialized hierarchical > allocator process > I0407 22:34:10.525333 29305 whitelist_watcher.cpp:77] No whitelist given > I0407 22:34:10.527331 29298 master.cpp:1832] The newly elected leader is > master@172.17.0.3:35855 with id f59f9057-a5c7-43e1-b129-96862e640a12 > I0407 22:34:10.527441 29298 master.cpp:1845] Elected as the leading master! 
> I0407 22:34:10.527545 29298 master.cpp:1532] Recovering from registrar > I0407 22:34:10.527889 29298 registrar.cpp:331] Recovering registrar > I0407 22:34:10.549734 29299 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 28.25177ms > I0407 22:34:10.549782 29299 replica.cpp:320] Persisted replica status to > STARTING > I0407 22:34:10.550010 29299 recover.cpp:473] Replica is in STARTING status > I0407 22:34:10.551352 29299 replica.cpp:673] Replica in STARTING status > received a broadcasted recover request from (4777)@172.17.0.3:35855 > I0407 22:34:10.551676 29299 recover.cpp:193] Received a recover response from > a replica in STARTING status > I0407 22:34:10.552315 29308 recover.cpp:564] Updating replica status to VOTING > I0407 22:34:10.574865 29308 leveldb.cpp:304] Persisting metadata (8 bytes) to
[jira] [Assigned] (MESOS-5027) Enable authenticated login in the webui
[ https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent reassigned MESOS-5027: --- Assignee: haosdent (was: Joerg Schad) > Enable authenticated login in the webui > --- > > Key: MESOS-5027 > URL: https://issues.apache.org/jira/browse/MESOS-5027 > Project: Mesos > Issue Type: Improvement > Components: master, security, webui >Reporter: Greg Mann >Assignee: haosdent > Labels: mesosphere, security > Attachments: Screen Shot 2016-04-07 at 21.02.45.png > > > The webui hits a number of endpoints to get the data that it displays: > {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and > maybe others? Once authentication is enabled on these endpoints, we need to > add a login prompt to the webui so that users can provide credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5027) Enable authenticated login in the webui
[ https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231817#comment-15231817 ] haosdent commented on MESOS-5027: - Thanks! Let me reopen it and assign it to myself. > Enable authenticated login in the webui > --- > > Key: MESOS-5027 > URL: https://issues.apache.org/jira/browse/MESOS-5027 > Project: Mesos > Issue Type: Improvement > Components: master, security, webui >Reporter: Greg Mann >Assignee: Joerg Schad > Labels: mesosphere, security > Attachments: Screen Shot 2016-04-07 at 21.02.45.png > > > The webui hits a number of endpoints to get the data that it displays: > {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and > maybe others? Once authentication is enabled on these endpoints, we need to > add a login prompt to the webui so that users can provide credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5027) Enable authenticated login in the webui
[ https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231816#comment-15231816 ] Adam B commented on MESOS-5027: --- Start a design doc and find a shepherd. I don't want you to write up a bunch of code that sits in ReviewBoard unshepherded. > Enable authenticated login in the webui > --- > > Key: MESOS-5027 > URL: https://issues.apache.org/jira/browse/MESOS-5027 > Project: Mesos > Issue Type: Improvement > Components: master, security, webui >Reporter: Greg Mann >Assignee: Joerg Schad > Labels: mesosphere, security > Attachments: Screen Shot 2016-04-07 at 21.02.45.png > > > The webui hits a number of endpoints to get the data that it displays: > {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and > maybe others? Once authentication is enabled on these endpoints, we need to > add a login prompt to the webui so that users can provide credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)