[jira] [Updated] (MESOS-6837) FaultToleranceTest.FrameworkReregister is flaky

2016-12-22 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-6837:
---
Description: 
Observed on internal CI:

{noformat}
[21:27:38] : [Step 11/11] 
/mnt/teamcity/work/4240ba9ddd0997c3/src/tests/fault_tolerance_tests.cpp:892: 
Failure
[21:27:38] : [Step 11/11] Value of: 
framework.values["registered_time"].as().as()
[21:27:38] : [Step 11/11]   Actual: 1482442093
[21:27:38] : [Step 11/11] Expected: static_cast(registerTime.secs())
[21:27:38] : [Step 11/11] Which is: 1482442094
{noformat}

Looks like another instance of MESOS-4695.

  was:
Observed on internal CI:

{noformat}
[21:27:38] : [Step 11/11] 
/mnt/teamcity/work/4240ba9ddd0997c3/src/tests/fault_tolerance_tests.cpp:892: 
Failure
[21:27:38] : [Step 11/11] Value of: 
framework.values["registered_time"].as().as()
[21:27:38] : [Step 11/11]   Actual: 1482442093
[21:27:38] : [Step 11/11] Expected: static_cast(registerTime.secs())
[21:27:38] : [Step 11/11] Which is: 1482442094
{noformat}


> FaultToleranceTest.FrameworkReregister is flaky
> ---
>
> Key: MESOS-6837
> URL: https://issues.apache.org/jira/browse/MESOS-6837
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> Observed on internal CI:
> {noformat}
> [21:27:38] : [Step 11/11] 
> /mnt/teamcity/work/4240ba9ddd0997c3/src/tests/fault_tolerance_tests.cpp:892: 
> Failure
> [21:27:38] : [Step 11/11] Value of: 
> framework.values["registered_time"].as().as()
> [21:27:38] : [Step 11/11]   Actual: 1482442093
> [21:27:38] : [Step 11/11] Expected: static_cast(registerTime.secs())
> [21:27:38] : [Step 11/11] Which is: 1482442094
> {noformat}
> Looks like another instance of MESOS-4695.
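
The ticket does not state a root cause, so the following is only a sketch of one common way an off-by-one-second mismatch like the one in the log can arise: two fractional timestamps taken a few milliseconds apart straddle a second boundary, and truncating each to an integer yields different values. The helper names (`truncateToSeconds`, `withinTolerance`) are hypothetical and are not part of the Mesos test suite.

```cpp
#include <cstdint>

// Truncates a fractional-second timestamp to whole seconds, the same
// way a static_cast from double to an integer type does.
inline int64_t truncateToSeconds(double secs)
{
  return static_cast<int64_t>(secs);
}

// Hypothetical helper: returns true if two timestamps differ by at
// most `toleranceSecs`. Comparing with a tolerance avoids depending on
// which side of a second boundary each timestamp happened to fall.
inline bool withinTolerance(double a, double b, double toleranceSecs)
{
  const double diff = (a > b) ? (a - b) : (b - a);
  return diff <= toleranceSecs;
}
```

With inputs such as 1482442093.999 and 1482442094.001, the truncated values differ by one even though the timestamps are only two milliseconds apart, while a tolerance-based comparison still treats them as equal.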



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-6837) FaultToleranceTest.FrameworkReregister is flaky

2016-12-22 Thread Neil Conway (JIRA)
Neil Conway created MESOS-6837:
--

 Summary: FaultToleranceTest.FrameworkReregister is flaky
 Key: MESOS-6837
 URL: https://issues.apache.org/jira/browse/MESOS-6837
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Neil Conway
Assignee: Neil Conway


Observed on internal CI:

{noformat}
[21:27:38] : [Step 11/11] 
/mnt/teamcity/work/4240ba9ddd0997c3/src/tests/fault_tolerance_tests.cpp:892: 
Failure
[21:27:38] : [Step 11/11] Value of: 
framework.values["registered_time"].as().as()
[21:27:38] : [Step 11/11]   Actual: 1482442093
[21:27:38] : [Step 11/11] Expected: static_cast(registerTime.secs())
[21:27:38] : [Step 11/11] Which is: 1482442094
{noformat}





[jira] [Commented] (MESOS-6828) Consider ways for frameworks to ignore offers with an Unavailability

2016-12-22 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771179#comment-15771179
 ] 

Joris Van Remoortere commented on MESOS-6828:
-

An updated proposal to improve flexibility while still being easily consumable:
# Allow operators to specify a separate start time for when offers should stop 
being sent prior to the actual maintenance window.
# Add an opt-in capability for frameworks to be able to see offers during the 
period described in point #1

By controlling the time period during which offers are not sent out we are able 
to stagger them out based on the maintenance schedule and prevent the stalling 
scenario described in the ticket description.
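
The two points above can be sketched as a single decision function. This is a minimal illustration, not Mesos code: `MaintenanceWindow`, `offerCutoffSecs`, `startSecs` and `shouldSendOffer` are hypothetical names, and the choice to send no offers once maintenance has started is an assumption of the sketch rather than something the proposal specifies.

```cpp
#include <cstdint>

// Hypothetical description of one agent's scheduled maintenance.
struct MaintenanceWindow
{
  int64_t offerCutoffSecs; // Operator-chosen time at which offers stop (point 1).
  int64_t startSecs;       // Start of the actual maintenance window.
};

// Decides whether an offer for the agent should go to a framework.
// `frameworkOptedIn` models the opt-in capability from point 2.
inline bool shouldSendOffer(
    int64_t nowSecs,
    const MaintenanceWindow& window,
    bool frameworkOptedIn)
{
  if (nowSecs < window.offerCutoffSecs) {
    return true; // Before the cutoff every framework still receives offers.
  }

  if (nowSecs < window.startSecs) {
    return frameworkOptedIn; // Only opted-in frameworks see these offers.
  }

  return false; // Assumption of this sketch: no offers during maintenance.
}
```

Because the cutoff is set per window, an operator could give each agent in a long schedule a cutoff shortly before its own maintenance, which is how the staggering described above would avoid stalling the whole cluster.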

> Consider ways for frameworks to ignore offers with an Unavailability
> 
>
> Key: MESOS-6828
> URL: https://issues.apache.org/jira/browse/MESOS-6828
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joris Van Remoortere
>Assignee: Artem Harutyunyan
>  Labels: maintenance
>
> Due to the opt-in nature of maintenance primitives in Mesos, there is a 
> deficiency for cluster administrators when frameworks have not opted in.
> An example case:
> - Cluster with reasonable churn (tasks terminate naturally)
> - Operator specifies maintenance schedule
> Ideally *even* in a world where none of the frameworks had opted in to 
> maintenance primitives the operator would have some way of preventing 
> frameworks from scheduling further work on agents in the schedule. The 
> natural termination of the tasks in the cluster would allow the nodes to 
> drain gracefully and the operator to then perform maintenance.
> 2 options that have been discussed so far:
> # Provide a capability for frameworks to automatically filter offers with an 
> {{Unavailability}} set.
> #* Pro: Finer grained control. Allows other frameworks to keep scheduling 
> short lived tasks that can complete before the Unavailability.
> #* Con: All frameworks have to be updated. Consider making this an 
> environment variable to the scheduler driver for legacy frameworks.
> # Provide a flag on the master to filter all offers with an 
> {{Unavailability}} set.
> #* Pro: Immediately actionable / usable.
> #* Con: Coarse grained. Some frameworks may suffer efficiency.
> #* Con: *Dangerous*: planning out a multi-day maintenance schedule for an 
> entire cluster will prevent any frameworks from scheduling further work, 
> potentially stalling the cluster.
> Action Items: Provide further context for each option and consider others. We 
> need to ensure we have something immediately consumable by users to fill the 
> gap until maintenance primitives are the norm. We also need to ensure we 
> prevent dangerous scenarios like the Con listed for option #2.





[jira] [Updated] (MESOS-6835) Fix SIGBUS on ARM64/AArch64

2016-12-22 Thread Aaron Wood (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Wood updated MESOS-6835:
--
Description: 
Currently, when the Linux launcher allocates and prepares the stack for a 
call to clone(), the stack is not properly aligned. This is not an issue on 
x86 or x64, but it is on ARM64/AArch64, which requires the stack to be 
aligned to a 16-byte boundary. x86 and x64 also expect a 16-byte aligned 
stack, but it is not enforced. The stack requirements for ARM64 are 
explained here: 
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf 
(specifically section 5.2.2.1, which says "SP mod 16 = 0. The stack must be 
quad-word aligned.").

Additionally, the way the stack is currently allocated and passed to 
clone() accidentally chops off one entry, making a stack overflow using those 
missing 8 bytes a possibility. Fixing this while aligning the memory will fix 
both the stack overflow issue and the SIGBUS crash.

https://reviews.apache.org/r/54996/

  was:
Currently in the Linux launcher when the stack is allocated and prepared for a 
call to clone() it is not properly aligned. This is not an issue for x86 or x64 
but for ARM64/AArch64 it is because of the requirement of having the stack 
aligned to a 16 byte boundary. While x86 and x64 also expect the stack to have 
a 16 byte aligned stack, it is not enforced.

Additionally, the way that the stack is currently allocated and passed to 
clone() accidentally chops off one entry, making a stack overflow using those 
missing 8 bytes a possibility. Fixing this while aligning the memory will fix 
both the issue of the stack overflow issue as well as the SIGBUS crash.

https://reviews.apache.org/r/54996/


> Fix SIGBUS on ARM64/AArch64
> ---
>
> Key: MESOS-6835
> URL: https://issues.apache.org/jira/browse/MESOS-6835
> Project: Mesos
>  Issue Type: Bug
>  Components: security, stout
>Reporter: Aaron Wood
>Assignee: Aaron Wood
>
> Currently, when the Linux launcher allocates and prepares the stack for a 
> call to clone(), the stack is not properly aligned. This is not an issue on 
> x86 or x64, but it is on ARM64/AArch64, which requires the stack to be 
> aligned to a 16-byte boundary. x86 and x64 also expect a 16-byte aligned 
> stack, but it is not enforced. The stack requirements for ARM64 are 
> explained here: 
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
> (specifically section 5.2.2.1, which says "SP mod 16 = 0. The stack must be 
> quad-word aligned.").
> Additionally, the way the stack is currently allocated and passed to 
> clone() accidentally chops off one entry, making a stack overflow using those 
> missing 8 bytes a possibility. Fixing this while aligning the memory will fix 
> both the stack overflow issue and the SIGBUS crash.
> https://reviews.apache.org/r/54996/





[jira] [Updated] (MESOS-6834) Allow Mesos to compile on ARM64/AArch64

2016-12-22 Thread Aaron Wood (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Wood updated MESOS-6834:
--
Description: 
Mesos will not compile on ARM64/AArch64 without a patch to the version of 
LevelDB that is used within Mesos. While the fix is already in newer versions 
of LevelDB it's not something that Mesos pulls down.

The main issue is that the AtomicPointer header needs to be aware of other 
architectures to provide its functionality to those architectures.

https://reviews.apache.org/r/54993/

  was:
Mesos will not compile on ARM64/AArch64 without a patch to the version of 
LevelDB that is used within Mesos. While the fix is already in newer versions 
of LevelDB it's not something that Mesos pulls down.

The main issue is that the AtomicPointer header needs to be aware of other 
architectures to provide its functionality to those architectures.


> Allow Mesos to compile on ARM64/AArch64
> ---
>
> Key: MESOS-6834
> URL: https://issues.apache.org/jira/browse/MESOS-6834
> Project: Mesos
>  Issue Type: Bug
>  Components: build, general
>Reporter: Aaron Wood
>Assignee: Aaron Wood
>
> Mesos will not compile on ARM64/AArch64 without a patch to the version of 
> LevelDB that is used within Mesos. While the fix is already in newer versions 
> of LevelDB it's not something that Mesos pulls down.
> The main issue is that the AtomicPointer header needs to be aware of other 
> architectures to provide its functionality to those architectures.
> https://reviews.apache.org/r/54993/





[jira] [Updated] (MESOS-6835) Fix SIGBUS on ARM64/AArch64

2016-12-22 Thread Aaron Wood (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Wood updated MESOS-6835:
--
Description: 
Currently, when the Linux launcher allocates and prepares the stack for a 
call to clone(), the stack is not properly aligned. This is not an issue on 
x86 or x64, but it is on ARM64/AArch64, which requires the stack to be 
aligned to a 16-byte boundary. x86 and x64 also expect a 16-byte aligned 
stack, but it is not enforced.

Additionally, the way the stack is currently allocated and passed to 
clone() accidentally chops off one entry, making a stack overflow using those 
missing 8 bytes a possibility. Fixing this while aligning the memory will fix 
both the stack overflow issue and the SIGBUS crash.

https://reviews.apache.org/r/54996/

  was:
Currently in the Linux launcher when the stack is allocated and prepared for a 
call to clone() it is not properly aligned. This is not an issue for x86 or x64 
but for ARM64/AArch64 it is because of the requirement of having the stack 
aligned to a 16 byte boundary. While x86 and x64 also expect the stack to have 
a 16 byte aligned stack, it is not enforced.

Additionally, the way that the stack is currently allocated and passed to 
clone() accidentally chops off one entry, making a stack overflow using those 
missing 8 bytes a possibility. Fixing this while aligning the memory will fix 
both the issue of the stack overflow issue as well as the SIGBUS crash.


> Fix SIGBUS on ARM64/AArch64
> ---
>
> Key: MESOS-6835
> URL: https://issues.apache.org/jira/browse/MESOS-6835
> Project: Mesos
>  Issue Type: Bug
>  Components: security, stout
>Reporter: Aaron Wood
>Assignee: Aaron Wood
>
> Currently, when the Linux launcher allocates and prepares the stack for a 
> call to clone(), the stack is not properly aligned. This is not an issue on 
> x86 or x64, but it is on ARM64/AArch64, which requires the stack to be 
> aligned to a 16-byte boundary. x86 and x64 also expect a 16-byte aligned 
> stack, but it is not enforced.
> Additionally, the way the stack is currently allocated and passed to 
> clone() accidentally chops off one entry, making a stack overflow using those 
> missing 8 bytes a possibility. Fixing this while aligning the memory will fix 
> both the stack overflow issue and the SIGBUS crash.
> https://reviews.apache.org/r/54996/





[jira] [Created] (MESOS-6836) Immediately failing tasks show incomplete logs in the sandbox

2016-12-22 Thread Joseph Wu (JIRA)
Joseph Wu created MESOS-6836:


 Summary: Immediately failing tasks show incomplete logs in the 
sandbox
 Key: MESOS-6836
 URL: https://issues.apache.org/jira/browse/MESOS-6836
 Project: Mesos
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Joseph Wu


I started a master with default settings:
{code}
src/mesos-master --work_dir=/tmp/master
{code}

And an agent with default settings (on OSX and CentOS 7):
{code}
sudo src/mesos-agent --work_dir=/tmp/agent --master=...
{code}

Then I ran a task which I expect to fail immediately:
{code}
src/mesos-execute --master=... --name=fail --command=asdf
{code}

When I look inside the sandbox, I see a {{stderr}} like this:
{code}
@   0x4156be _Abort()
@   0x4156fc _Abort()
{code}

The stack trace is apparently clipped.  I have a hunch (unsubstantiated) that 
this output clipping is due to the IO Switchboard.





[jira] [Commented] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770957#comment-15770957
 ] 

Avinash Sridharan commented on MESOS-6143:
--

Haven't seen a reproduction of this problem for a while, so going to close 
this as "Not reproducible" at this point. Please feel free to re-open if and 
when we have more data on this.

> resolv.conf is not copied when using the Mesos containerizer with a Docker 
> image
> 
>
> Key: MESOS-6143
> URL: https://issues.apache.org/jira/browse/MESOS-6143
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, isolation
>Affects Versions: 1.0.0
> Environment: OS: Debian Jessie
> Mesos version: 1.0.0
>Reporter: Justin Pinkul
>Assignee: Avinash Sridharan
>
> When using the Mesos containerizer, host networking and a Docker image, 
> {{resolv.conf}} is not copied from the host. The only piece of Mesos code 
> that copies this file is currently in the {{network/cni}} isolator, so I 
> tried turning this on by setting 
> {{isolation=network/cni,namespaces/pid,docker/runtime,cgroups/devices,gpu/nvidia,cgroups/cpu,disk/du,filesystem/linux}},
>  but the issue still remained. I suspect this might be related to not setting 
> {{network_cni_config_dir}} and {{network_cni_plugins_dir}} but it seems 
> incorrect that these flags would be required to use host networking.
> Here is how I am able to reproduce this issue:
> {code}
> mesos-execute --master=mesosmaster1:5050 \
>   --name=dns-test \
>   --docker_image=my-docker-image:1.1.3 \
>   --command="bash -c 'ping google.com; while ((1)); do date; 
> sleep 10; done'"
> # Find the PID of mesos-executor's child process and enter it
> nsenter -m -u -i -n -p -r -w -t $PID
> # This file will be empty
> cat /etc/resolv.conf
> {code}
> {code:title=Mesos agent log}
> I0908 17:39:24.599149 181564 slave.cpp:1688] Launching task dns-test for 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.599567 181564 paths.cpp:528] Trying to chown 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
>  to user 'root'
> I0908 17:39:24.603970 181564 slave.cpp:5748] Launching executor dns-test of 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 with resources 
> cpus(*):0.1; mem(*):32 in work directory 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
> I0908 17:39:24.604178 181564 slave.cpp:1914] Queuing task 'dns-test' for 
> executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.604284 181571 docker.cpp:1020] Skipping non-docker container
> I0908 17:39:24.604532 181578 containerizer.cpp:781] Starting container 
> '52bdce71-04b0-4440-bb71-cb826f0635c6' for executor 'dns-test' of framework 
> '51831498-0902-4ae9-a1ff-4396f8b8d823-0006'
> I0908 17:39:24.606972 181571 provisioner.cpp:294] Provisioning image rootfs 
> '/mnt/01/mesos_work/provisioner/containers/52bdce71-04b0-4440-bb71-cb826f0635c6/backends/copy/rootfses/db97ba50-c9f0-45e7-8a39-871e4038abf9'
>  for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.037472 181564 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.038415 181560 linux_launcher.cpp:281] Cloning child process 
> with flags = CLONE_NEWNS | CLONE_NEWPID
> I0908 17:39:30.040742 181560 systemd.cpp:96] Assigned child process '190563' 
> to 'mesos_executors.slice'
> I0908 17:39:30.161613 181576 slave.cpp:2902] Got registration for executor 
> 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.162148 181563 disk.cpp:171] Updating the disk resources for 
> container 52bdce71-04b0-4440-bb71-cb826f0635c6 to cpus(*):0.1; mem(*):32; 
> gpus(*):2
> I0908 17:39:30.162648 181566 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.162822 181574 slave.cpp:2079] Sending queued task 'dns-test' 
> to executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 
> at executor(1)@10.191.4.65:43707
> I0908 17:39:30.168383 181570 slave.cpp:3285] Handling status update 
> TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test 
> of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.169019 181577 status_update_manager.cpp:320] Received status 
> update TASK_RUNNING (UUID: 319e0235-01b9

[jira] [Comment Edited] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770957#comment-15770957
 ] 

Avinash Sridharan edited comment on MESOS-6143 at 12/22/16 7:59 PM:


Haven't seen a reproduction of this problem for a while, so going to close 
this as "Not reproducible" at this point. Please feel free to re-open if and 
when we have more data on this.


was (Author: avin...@mesosphere.io):
Have seen a reproduce of this problem for a while, so going to close this as 
"Not reproducible" at this point. Please feel free to re-open if and when we 
have more data on this.

> resolv.conf is not copied when using the Mesos containerizer with a Docker 
> image
> 
>
> Key: MESOS-6143
> URL: https://issues.apache.org/jira/browse/MESOS-6143
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, isolation
>Affects Versions: 1.0.0
> Environment: OS: Debian Jessie
> Mesos version: 1.0.0
>Reporter: Justin Pinkul
>Assignee: Avinash Sridharan
>
> When using the Mesos containerizer, host networking and a Docker image, 
> {{resolv.conf}} is not copied from the host. The only piece of Mesos code 
> that copies this file is currently in the {{network/cni}} isolator, so I 
> tried turning this on by setting 
> {{isolation=network/cni,namespaces/pid,docker/runtime,cgroups/devices,gpu/nvidia,cgroups/cpu,disk/du,filesystem/linux}},
>  but the issue still remained. I suspect this might be related to not setting 
> {{network_cni_config_dir}} and {{network_cni_plugins_dir}} but it seems 
> incorrect that these flags would be required to use host networking.
> Here is how I am able to reproduce this issue:
> {code}
> mesos-execute --master=mesosmaster1:5050 \
>   --name=dns-test \
>   --docker_image=my-docker-image:1.1.3 \
>   --command="bash -c 'ping google.com; while ((1)); do date; 
> sleep 10; done'"
> # Find the PID of mesos-executor's child process and enter it
> nsenter -m -u -i -n -p -r -w -t $PID
> # This file will be empty
> cat /etc/resolv.conf
> {code}
> {code:title=Mesos agent log}
> I0908 17:39:24.599149 181564 slave.cpp:1688] Launching task dns-test for 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.599567 181564 paths.cpp:528] Trying to chown 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
>  to user 'root'
> I0908 17:39:24.603970 181564 slave.cpp:5748] Launching executor dns-test of 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 with resources 
> cpus(*):0.1; mem(*):32 in work directory 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
> I0908 17:39:24.604178 181564 slave.cpp:1914] Queuing task 'dns-test' for 
> executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.604284 181571 docker.cpp:1020] Skipping non-docker container
> I0908 17:39:24.604532 181578 containerizer.cpp:781] Starting container 
> '52bdce71-04b0-4440-bb71-cb826f0635c6' for executor 'dns-test' of framework 
> '51831498-0902-4ae9-a1ff-4396f8b8d823-0006'
> I0908 17:39:24.606972 181571 provisioner.cpp:294] Provisioning image rootfs 
> '/mnt/01/mesos_work/provisioner/containers/52bdce71-04b0-4440-bb71-cb826f0635c6/backends/copy/rootfses/db97ba50-c9f0-45e7-8a39-871e4038abf9'
>  for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.037472 181564 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.038415 181560 linux_launcher.cpp:281] Cloning child process 
> with flags = CLONE_NEWNS | CLONE_NEWPID
> I0908 17:39:30.040742 181560 systemd.cpp:96] Assigned child process '190563' 
> to 'mesos_executors.slice'
> I0908 17:39:30.161613 181576 slave.cpp:2902] Got registration for executor 
> 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.162148 181563 disk.cpp:171] Updating the disk resources for 
> container 52bdce71-04b0-4440-bb71-cb826f0635c6 to cpus(*):0.1; mem(*):32; 
> gpus(*):2
> I0908 17:39:30.162648 181566 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.162822 181574 slave.cpp:2079] Sending queued task 'dns-test' 
> to executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 
> at executor(1)@10.191.4.65:43707
> I0908 17:39:30.168383 181570 slave.cpp:3285] Handling status update 
> TASK_RUNNING (UUID: 

[jira] [Commented] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770944#comment-15770944
 ] 

Avinash Sridharan commented on MESOS-6571:
--

[~qianzhang] can you backport these commits to the 1.1.1 branch? There seem to 
be conflicts when [~alexr] tried backporting:

commit bf52cc1342b89b7d6923d017ddda06b066d03082
Author: Qian Zhang 
Date: Tue Nov 22 09:31:47 2016 +0800

    Added parse function for v1::TaskInfo protobuf.

    Review: https://reviews.apache.org/r/53644/

[~alexr] ^^

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we 
> added the flag {{\--task_group}} to {{mesos-execute}} so that users can 
> specify a {{TaskGroupInfo}} JSON with that flag to launch a task group. In 
> this ticket, we'd like to add another flag, {{\--task}}, for users to specify 
> a {{TaskInfo}} JSON to launch a task.





[jira] [Commented] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770946#comment-15770946
 ] 

Avinash Sridharan commented on MESOS-6571:
--

[~qianzhang] just to give some context this is for the 1.1.1 release.

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we 
> added the flag {{\--task_group}} to {{mesos-execute}} so that users can 
> specify a {{TaskGroupInfo}} JSON with that flag to launch a task group. In 
> this ticket, we'd like to add another flag, {{\--task}}, for users to specify 
> a {{TaskInfo}} JSON to launch a task.





[jira] [Commented] (MESOS-6834) Allow Mesos to compile on ARM64/AArch64

2016-12-22 Thread Aaron Wood (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770931#comment-15770931
 ] 

Aaron Wood commented on MESOS-6834:
---

[~neilc] here it is https://reviews.apache.org/r/51053/

> Allow Mesos to compile on ARM64/AArch64
> ---
>
> Key: MESOS-6834
> URL: https://issues.apache.org/jira/browse/MESOS-6834
> Project: Mesos
>  Issue Type: Bug
>  Components: build, general
>Reporter: Aaron Wood
>Assignee: Aaron Wood
>
> Mesos will not compile on ARM64/AArch64 without a patch to the version of 
> LevelDB that is used within Mesos. While the fix is already in newer versions 
> of LevelDB it's not something that Mesos pulls down.
> The main issue is that the AtomicPointer header needs to be aware of other 
> architectures to provide its functionality to those architectures.





[jira] [Commented] (MESOS-6834) Allow Mesos to compile on ARM64/AArch64

2016-12-22 Thread Aaron Wood (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770900#comment-15770900
 ] 

Aaron Wood commented on MESOS-6834:
---

I had asked a few people about upgrading LevelDB a while back and I think they 
pointed me to an RR that was already open for this. It looked like some 
benchmarking was being done but that it was a very low priority for the 
project. FWIW this is the patch I made to get this to work without upgrading 
LevelDB. 
https://github.com/verizonlabs/mesos/commit/7ef493d51de5853e0c82471af36603f87146aec2

If LevelDB can be upgraded I can get rid of that :)

> Allow Mesos to compile on ARM64/AArch64
> ---
>
> Key: MESOS-6834
> URL: https://issues.apache.org/jira/browse/MESOS-6834
> Project: Mesos
>  Issue Type: Bug
>  Components: build, general
>Reporter: Aaron Wood
>Assignee: Aaron Wood
>
> Mesos will not compile on ARM64/AArch64 without a patch to the version of 
> LevelDB that is used within Mesos. While the fix is already in newer versions 
> of LevelDB it's not something that Mesos pulls down.
> The main issue is that the AtomicPointer header needs to be aware of other 
> architectures to provide its functionality to those architectures.





[jira] [Created] (MESOS-6835) Fix SIGBUS on ARM64/AArch64

2016-12-22 Thread Aaron Wood (JIRA)
Aaron Wood created MESOS-6835:
-

 Summary: Fix SIGBUS on ARM64/AArch64
 Key: MESOS-6835
 URL: https://issues.apache.org/jira/browse/MESOS-6835
 Project: Mesos
  Issue Type: Bug
  Components: security, stout
Reporter: Aaron Wood
Assignee: Aaron Wood


Currently, when the Linux launcher allocates and prepares the stack for a 
call to clone(), the stack is not properly aligned. This is not an issue on 
x86 or x64, but it is on ARM64/AArch64, which requires the stack to be 
aligned to a 16-byte boundary. x86 and x64 also expect a 16-byte aligned 
stack, but it is not enforced.

Additionally, the way the stack is currently allocated and passed to 
clone() accidentally chops off one entry, making a stack overflow using those 
missing 8 bytes a possibility. Fixing this while aligning the memory will fix 
both the stack overflow issue and the SIGBUS crash.
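
As a rough illustration of the alignment requirement, the sketch below computes a 16-byte aligned stack top from an allocated region. It is not the Mesos launcher code or the patch under review: `alignedStackTop` is a hypothetical helper, and it assumes a stack that grows downward, so the address handed to clone() is derived from one past the end of the region and then rounded down.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: given the base address and size in bytes of a
// stack region, returns the highest address at or below the end of the
// region that is a multiple of 16. Starting from one past the end
// (rather than from the last element) keeps the whole region usable;
// rounding down gives up at most 15 bytes.
inline void* alignedStackTop(void* base, std::size_t size)
{
  constexpr std::uintptr_t kAlignment = 16;

  const std::uintptr_t end = reinterpret_cast<std::uintptr_t>(base) + size;

  return reinterpret_cast<void*>(end & ~(kAlignment - 1));
}
```

The returned pointer always satisfies the "SP mod 16 = 0" style constraint regardless of how the underlying buffer was allocated, which is why rounding the top address is a simple alternative to requesting aligned memory from the allocator.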





[jira] [Commented] (MESOS-6834) Allow Mesos to compile on ARM64/AArch64

2016-12-22 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770896#comment-15770896
 ] 

Neil Conway commented on MESOS-6834:


The powerpc issue is here: https://issues.apache.org/jira/browse/MESOS-4802

I agree that upgrading leveldb is a better path.

> Allow Mesos to compile on ARM64/AArch64
> ---
>
> Key: MESOS-6834
> URL: https://issues.apache.org/jira/browse/MESOS-6834
> Project: Mesos
>  Issue Type: Bug
>  Components: build, general
>Reporter: Aaron Wood
>Assignee: Aaron Wood
>
> Mesos will not compile on ARM64/AArch64 without a patch to the version of 
> LevelDB that is used within Mesos. While the fix is already in newer versions 
> of LevelDB it's not something that Mesos pulls down.
> The main issue is that the AtomicPointer header needs to be aware of other 
> architectures to provide its functionality to those architectures.





[jira] [Commented] (MESOS-6834) Allow Mesos to compile on ARM64/AArch64

2016-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770884#comment-15770884
 ] 

Jie Yu commented on MESOS-6834:
---

I am wondering if we can just upgrade leveldb?

cc [~vinodkone], [~neilc], [~chenzhiwei]

I remember you guys patched leveldb for powerpc. Is there any specific reason 
for not upgrading leveldb? Thanks!

> Allow Mesos to compile on ARM64/AArch64
> ---
>
> Key: MESOS-6834
> URL: https://issues.apache.org/jira/browse/MESOS-6834
> Project: Mesos
>  Issue Type: Bug
>  Components: build, general
>Reporter: Aaron Wood
>Assignee: Aaron Wood
>
> Mesos will not compile on ARM64/AArch64 without a patch to the version of 
> LevelDB that is used within Mesos. While the fix is already in newer versions 
> of LevelDB it's not something that Mesos pulls down.
> The main issue is that the AtomicPointer header needs to be aware of other 
> architectures to provide its functionality to those architectures.





[jira] [Created] (MESOS-6834) Allow Mesos to compile on ARM64/AArch64

2016-12-22 Thread Aaron Wood (JIRA)
Aaron Wood created MESOS-6834:
-

 Summary: Allow Mesos to compile on ARM64/AArch64
 Key: MESOS-6834
 URL: https://issues.apache.org/jira/browse/MESOS-6834
 Project: Mesos
  Issue Type: Bug
  Components: build, general
Reporter: Aaron Wood
Assignee: Aaron Wood


Mesos will not compile on ARM64/AArch64 without a patch to the version of 
LevelDB that is used within Mesos. While the fix is already in newer versions 
of LevelDB it's not something that Mesos pulls down.

The main issue is that the AtomicPointer header needs to be aware of other 
architectures to provide its functionality to those architectures.
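
As a rough illustration of the AtomicPointer point above, here is a minimal sketch of an architecture-neutral pointer wrapper built on C++11 {{std::atomic}}. This is not the LevelDB patch and not the LevelDB header; the names {{PortableAtomicPointer}} and {{publishAndRead}} are hypothetical, and it assumes a C++11 toolchain is available on the target architecture.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an AtomicPointer-style class. It relies on
// std::atomic instead of per-architecture memory barriers, so the same
// code compiles on x86, PowerPC and AArch64 alike.
class PortableAtomicPointer {
 public:
  PortableAtomicPointer() : rep_(nullptr) {}
  explicit PortableAtomicPointer(void* p) : rep_(p) {}

  // Load with acquire ordering: later reads cannot move before this load.
  void* AcquireLoad() const { return rep_.load(std::memory_order_acquire); }

  // Store with release ordering: earlier writes cannot move after this store.
  void ReleaseStore(void* p) { rep_.store(p, std::memory_order_release); }

  // Variants without ordering guarantees, for callers that synchronize
  // by other means.
  void* RelaxedLoad() const { return rep_.load(std::memory_order_relaxed); }
  void RelaxedStore(void* p) { rep_.store(p, std::memory_order_relaxed); }

 private:
  std::atomic<void*> rep_;
};

// Publishes a non-null pointer and reads it back through the wrapper.
inline void* publishAndRead(PortableAtomicPointer& slot, void* p) {
  assert(p != nullptr);
  slot.ReleaseStore(p);
  return slot.AcquireLoad();
}
```

The design choice here is to let the compiler pick the right barrier instructions per architecture rather than maintaining a list of supported CPUs in the header.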





[jira] [Commented] (MESOS-6817) Audit the use of UNICODE-related code paths

2016-12-22 Thread Andrew Schwartzmeyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770767#comment-15770767
 ] 

Andrew Schwartzmeyer commented on MESOS-6817:
-

In my opinion, we should explicitly use the {{W}} suffixed Windows APIs at all 
times, to guarantee we are given a) the type {{wchar_t*}} and b) data encoded 
with UTF-16, as we can easily and correctly convert this to {{std::string}} 
encoded with UTF-8 using C++'s {{}} library. We _must not_ use the 
non-explicit (non-suffixed) versions of Windows APIs, as we lose type safety 
(the point of this issue: they use the type {{TCHAR}} which is variable, it's 
either {{char}} or {{wchar_t}} depending on {{UNICODE}}, which may or may not 
be defined). We _should not_ use the {{A}} suffixed versions as they do not 
guarantee an encoding: instead of standard UTF-16 (or UTF-8 which would be 
reasonable), they encode with the system's current ANSI code page, leaving us 
with an unknown encoding (also, it was deprecated by Unicode; Windows 
development guidelines say to use Unicode).
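
To make the UTF-16 to UTF-8 step concrete, here is a minimal hand-written sketch of the conversion. It is only an illustration, not the Mesos implementation and not a standard library facility: the function name {{utf16ToUtf8}} is hypothetical, and it takes {{std::u16string}} rather than {{wchar_t*}} so that it behaves the same on platforms where {{wchar_t}} is not 16 bits wide. On Windows, the data returned by the {{W}} suffixed APIs would first need to be copied into such a string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: converts UTF-16 code units to a UTF-8 std::string.
// Unpaired surrogates are replaced with U+FFFD (the replacement character).
std::string utf16ToUtf8(const std::u16string& in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    std::uint32_t cp = in[i];

    if (cp >= 0xD800 && cp <= 0xDBFF) {
      // High surrogate: combine with the following low surrogate, if any.
      if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
        ++i;
      } else {
        cp = 0xFFFD;
      }
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      cp = 0xFFFD;  // Low surrogate without a preceding high surrogate.
    }

    assert(cp <= 0x10FFFF);

    // Emit 1 to 4 bytes depending on the code point's magnitude.
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

Because the input type fixes the encoding, a caller cannot accidentally pass ANSI code page data, which is the type-safety argument made above.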

> Audit the use of UNICODE-related code paths
> ---
>
> Key: MESOS-6817
> URL: https://issues.apache.org/jira/browse/MESOS-6817
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Alex Clemmer
>Assignee: Alex Clemmer
>
> Currently we are being kind of lazy about when we're using things like 
> `TCHAR`. Functions like `os::user` will fail when we do something like 
> `std::string` with a `TCHAR` string if the string happens to not be `char`.
> We need to go back to all of these things and audit them so that they don't 
> break if we turn `UNICODE` on.





[jira] [Commented] (MESOS-6411) Add documentation for CNI port-mapper plugin.

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770752#comment-15770752
 ] 

Avinash Sridharan commented on MESOS-6411:
--

Thanks [~alexr]. Yes, this is the documentation for the port-mapping CNI plugin. 
The Mesos website has already been refreshed with this documentation (the 
plugin has been available since 1.1.0), so it makes sense to have it in 1.1.1.

Thanks,
Avinash

> Add documentation for CNI port-mapper plugin.
> -
>
> Key: MESOS-6411
> URL: https://issues.apache.org/jira/browse/MESOS-6411
> Project: Mesos
>  Issue Type: Documentation
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
> Fix For: 1.1.1, 1.2.0
>
>
> Need to add the CNI port-mapper plugin to the CNI documentation within Mesos.





[jira] [Commented] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770699#comment-15770699
 ] 

Avinash Sridharan commented on MESOS-6571:
--

Yup, just set the target version.

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we have 
> added the flag {{\--task_group}} to {{mesos-execute}} such that user can 
> specify a {{TaskGroupInfo}} json with that flag to launch a task group. In 
> this ticket, we'd like to add another flag {{\--task}} for user to specify 
> {{TaskInfo}} json to launch a task.





[jira] [Commented] (MESOS-3011) Publish release documentation for major releases on website

2016-12-22 Thread Tim Anderegg (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770649#comment-15770649
 ] 

Tim Anderegg commented on MESOS-3011:
-

[~haosd...@gmail.com] [~vinodkone] Sorry for the delay in this, but I believe 
the patch is fully ready for merge, if you have a minute to give it another 
look.  Thanks!
https://reviews.apache.org/r/52064/


> Publish release documentation for major releases on website
> ---
>
> Key: MESOS-3011
> URL: https://issues.apache.org/jira/browse/MESOS-3011
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation, project website
>Reporter: Paul Brett
>Assignee: Tim Anderegg
>  Labels: documentation, mesosphere
>
> Currently, the website only provides a single version of the documentation.  
> We should publish documentation for each release on the website independently 
> (for example as https://mesos.apache.org/documentation/0.22/index.html, 
> https://mesos.apache.org/documentation/0.23/index.html) and make latest 
> redirect to the current version.





[jira] [Commented] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770630#comment-15770630
 ] 

Alexander Rukletsov commented on MESOS-6571:


Do we need to backport it to 1.1.1?

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we have 
> added the flag {{\--task_group}} to {{mesos-execute}} such that user can 
> specify a {{TaskGroupInfo}} json with that flag to launch a task group. In 
> this ticket, we'd like to add another flag {{\--task}} for user to specify 
> {{TaskInfo}} json to launch a task.





[jira] [Updated] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6571:
---
Target Version/s: 1.1.1, 1.2.0  (was: 1.1.0, 1.2.0)

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we have 
> added the flag {{\--task_group}} to {{mesos-execute}} such that user can 
> specify a {{TaskGroupInfo}} json with that flag to launch a task group. In 
> this ticket, we'd like to add another flag {{\--task}} for user to specify 
> {{TaskInfo}} json to launch a task.





[jira] [Comment Edited] (MESOS-6040) Add a CMake build for `mesos-port-mapper`

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770597#comment-15770597
 ] 

Avinash Sridharan edited comment on MESOS-6040 at 12/22/16 5:24 PM:


We can target 1.2.0. CMake builds shouldn't be a blocker for 1.1.1. This 
is specifically for Linux and not Windows.


was (Author: avin...@mesosphere.io):
We can target 1.2.0. CMake builds shouldn't be a blocker for 1.1.1. 

> Add a CMake build for `mesos-port-mapper`
> -
>
> Key: MESOS-6040
> URL: https://issues.apache.org/jira/browse/MESOS-6040
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Once the port-mapper binary compiles with GNU make, we need to modify the 
> CMake to build the port-mapper binary as well. 





[jira] [Updated] (MESOS-6040) Add a CMake build for `mesos-port-mapper`

2016-12-22 Thread Avinash Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Sridharan updated MESOS-6040:
-
Target Version/s: 1.2.0  (was: 1.1.1)

> Add a CMake build for `mesos-port-mapper`
> -
>
> Key: MESOS-6040
> URL: https://issues.apache.org/jira/browse/MESOS-6040
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Once the port-mapper binary compiles with GNU make, we need to modify the 
> CMake to build the port-mapper binary as well. 





[jira] [Commented] (MESOS-6040) Add a CMake build for `mesos-port-mapper`

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770597#comment-15770597
 ] 

Avinash Sridharan commented on MESOS-6040:
--

We can target 1.2.0. CMake builds shouldn't be a blocker for 1.1.1. 

> Add a CMake build for `mesos-port-mapper`
> -
>
> Key: MESOS-6040
> URL: https://issues.apache.org/jira/browse/MESOS-6040
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Once the port-mapper binary compiles with GNU make, we need to modify the 
> CMake to build the port-mapper binary as well. 





[jira] [Updated] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Avinash Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Sridharan updated MESOS-6571:
-
Target Version/s: 1.1.0, 1.2.0  (was: 1.2.0)
   Fix Version/s: (was: 1.1.1)
  1.2.0

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we have 
> added the flag {{\--task_group}} to {{mesos-execute}} such that user can 
> specify a {{TaskGroupInfo}} json with that flag to launch a task group. In 
> this ticket, we'd like to add another flag {{\--task}} for user to specify 
> {{TaskInfo}} json to launch a task.





[jira] [Updated] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Avinash Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Sridharan updated MESOS-6571:
-
Target Version/s: 1.2.0  (was: 1.1.1)

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.1.1
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we have 
> added the flag {{\--task_group}} to {{mesos-execute}} such that user can 
> specify a {{TaskGroupInfo}} json with that flag to launch a task group. In 
> this ticket, we'd like to add another flag {{\--task}} for user to specify 
> {{TaskInfo}} json to launch a task.





[jira] [Updated] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Avinash Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Sridharan updated MESOS-6571:
-
Target Version/s: 1.1.1
   Fix Version/s: (was: 1.2.0)
  1.1.1

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.1.1
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we have 
> added the flag {{\--task_group}} to {{mesos-execute}} such that user can 
> specify a {{TaskGroupInfo}} json with that flag to launch a task group. In 
> this ticket, we'd like to add another flag {{\--task}} for user to specify 
> {{TaskInfo}} json to launch a task.





[jira] [Commented] (MESOS-6431) Add support for port-mapping in `mesos-execute`

2016-12-22 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770574#comment-15770574
 ] 

Avinash Sridharan commented on MESOS-6431:
--

[~alexr] marked this as a duplicate of MESOS-6571. MESOS-6571 wasn't part of 
Mesos 1.1.0, so yes, MESOS-6571 should be cherry-picked into Mesos 1.1.1. I will 
set the target version for MESOS-6571.

> Add support for port-mapping in `mesos-execute`
> ---
>
> Key: MESOS-6431
> URL: https://issues.apache.org/jira/browse/MESOS-6431
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
> Fix For: 1.2.0
>
>
> Add support to specify port-mappings for a container in mesos-execute.





[jira] [Comment Edited] (MESOS-6820) FaultToleranceTest.FrameworkReregister is flaky.

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15764034#comment-15764034
 ] 

Alexander Rukletsov edited comment on MESOS-6820 at 12/22/16 3:09 PM:
--

This issue was introduced in https://reviews.apache.org/r/53887/.


was (Author: bbannier):
This was added in https://reviews.apache.org/r/53887/.

> FaultToleranceTest.FrameworkReregister is flaky.
> 
>
> Key: MESOS-6820
> URL: https://issues.apache.org/jira/browse/MESOS-6820
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.2.0
> Environment: Debian 8, OS X
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: flaky, flaky-test, mesosphere
>
> I just saw {{FaultToleranceTest.FrameworkReregister}} fail in internal CI on 
> a Debian 8 system. Running the test in repetition on my OS X machine I was 
> able to reproduce the issue on OS X as well.
> {noformat}
> [ RUN  ] FaultToleranceTest.FrameworkReregister
> I1219 23:04:12.914769 23530 cluster.cpp:160] Creating default 'local' 
> authorizer
> I1219 23:04:12.915388 23545 master.cpp:380] Master 
> 4daa3046-9990-49c7-b601-958964306799 (ip-172-16-10-223.mesosphere.io) started 
> on 172.16.10.223:52614
> I1219 23:04:12.915400 23545 master.cpp:382] Flags at startup: --acls="" 
> --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate_agents="true" --authenticate_frameworks="true" 
> --authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
> --authenticate_http_readwrite="true" --authenticators="crammd5" 
> --authorizers="local" 
> --credentials="/mnt/teamcity/temp/buildTmp/4KpUDy/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --http_authenticators="basic" --http_framework_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" 
> --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" 
> --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" 
> --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
> --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
> --registry_store_timeout="100secs" --registry_strict="false" 
> --root_submissions="true" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" 
> --work_dir="/mnt/teamcity/temp/buildTmp/4KpUDy/master" 
> --zk_session_timeout="10secs"
> I1219 23:04:12.915504 23545 master.cpp:432] Master only allowing 
> authenticated frameworks to register
> I1219 23:04:12.915509 23545 master.cpp:446] Master only allowing 
> authenticated agents to register
> I1219 23:04:12.915511 23545 master.cpp:459] Master only allowing 
> authenticated HTTP frameworks to register
> I1219 23:04:12.915514 23545 credentials.hpp:37] Loading credentials for 
> authentication from '/mnt/teamcity/temp/buildTmp/4KpUDy/credentials'
> I1219 23:04:12.915570 23545 master.cpp:504] Using default 'crammd5' 
> authenticator
> I1219 23:04:12.915597 23545 http.cpp:922] Using default 'basic' HTTP 
> authenticator for realm 'mesos-master-readonly'
> I1219 23:04:12.915617 23545 http.cpp:922] Using default 'basic' HTTP 
> authenticator for realm 'mesos-master-readwrite'
> I1219 23:04:12.915658 23545 http.cpp:922] Using default 'basic' HTTP 
> authenticator for realm 'mesos-master-scheduler'
> I1219 23:04:12.915688 23545 master.cpp:584] Authorization enabled
> I1219 23:04:12.915725 23546 whitelist_watcher.cpp:77] No whitelist given
> I1219 23:04:12.915737 23547 hierarchical.cpp:149] Initialized hierarchical 
> allocator process
> I1219 23:04:12.916110 23545 master.cpp:2046] Elected as the leading master!
> I1219 23:04:12.916118 23545 master.cpp:1568] Recovering from registrar
> I1219 23:04:12.916179 23548 registrar.cpp:329] Recovering registrar
> I1219 23:04:12.916311 23545 registrar.cpp:362] Successfully fetched the 
> registry (0B) in 115968ns
> I1219 23:04:12.916334 23545 registrar.cpp:461] Applied 1 operations in 
> 1982ns; attempting to update the registry
> I1219 23:04:12.916554 23547 registrar.cpp:506] Successfully updated the 
> registry in 208896ns
> I1219 23:04:12.916770 23547 registrar.cpp:392] Successfully recovered 
> registrar
> I1219 23:04:12.916853 23547 master.cpp:1684] Recovered 0 agents from the 
> registry (174B); allowing 10mins for agents to re-register
> I1219 23:04:12.916956 23544 hierarchical.cpp:176] Skipping recovery of 
> hierarchical allocator: nothing to recover
> I1219 23:04:12.918097 23530 containerizer.cpp:220] Using isolation: 
> posix/cpu,posix/mem,filesystem/posix,network/cni

[jira] [Updated] (MESOS-6184) Health checks should use a general mechanism to enter namespaces of the task.

2016-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6184:
---
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 46, Mesosphere Sprint 47  
(was: Mesosphere Sprint 44, Mesosphere Sprint 46, Mesosphere Sprint 47, 
Mesosphere Sprint 48)

> Health checks should use a general mechanism to enter namespaces of the task.
> -
>
> Key: MESOS-6184
> URL: https://issues.apache.org/jira/browse/MESOS-6184
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Blocker
>  Labels: health-check, mesosphere
>
> To perform health checks for tasks, we need to enter the corresponding 
> namespaces of the container. For now, health checks use a custom clone to 
> implement this:
> {code}
>   return process::defaultClone([=]() -> int {
>     if (taskPid.isSome()) {
>       foreach (const string& ns, namespaces) {
>         Try<Nothing> setns = ns::setns(taskPid.get(), ns);
>         if (setns.isError()) {
>           ...
>         }
>       }
>     }
>     return func();
>   });
> {code}
> Once the childHooks patches are merged, we could change the health check to use 
> childHooks to call {{setns}} and make {{process::defaultClone}} private 
> again.





[jira] [Updated] (MESOS-6184) Health checks should use a general mechanism to enter namespaces of the task.

2016-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6184:
---
Priority: Critical  (was: Blocker)

> Health checks should use a general mechanism to enter namespaces of the task.
> -
>
> Key: MESOS-6184
> URL: https://issues.apache.org/jira/browse/MESOS-6184
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Critical
>  Labels: health-check, mesosphere
>
> To perform health checks for tasks, we need to enter the corresponding 
> namespaces of the container. For now, health checks use a custom clone to 
> implement this:
> {code}
>   return process::defaultClone([=]() -> int {
>     if (taskPid.isSome()) {
>       foreach (const string& ns, namespaces) {
>         Try<Nothing> setns = ns::setns(taskPid.get(), ns);
>         if (setns.isError()) {
>           ...
>         }
>       }
>     }
>     return func();
>   });
> {code}
> Once the childHooks patches are merged, we could change the health check to use 
> childHooks to call {{setns}} and make {{process::defaultClone}} private 
> again.





[jira] [Updated] (MESOS-6833) consecutive_failures 0 == 1 in HealthCheck.

2016-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6833:
---
 Shepherd: Alexander Rukletsov
Affects Version/s: 0.28.0
   1.0.0
 Story Points: 3
   Labels: health-check mesosphere  (was: )
  Summary: consecutive_failures 0 == 1 in HealthCheck.  (was: 
consecutive_failures 0 == 1 in HealthCheck)

> consecutive_failures 0 == 1 in HealthCheck.
> ---
>
> Key: MESOS-6833
> URL: https://issues.apache.org/jira/browse/MESOS-6833
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Affects Versions: 0.28.0, 1.0.0, 1.1.0
>Reporter: Lukas Loesche
>  Labels: health-check, mesosphere
>
> When defining a HealthCheck with consecutive_failures=0 one would expect 
> Mesos to never kill the task and only notify about the failure.
> What seems to happen instead is Mesos handles consecutive_failures=0 as 
> consecutive_failures=1 and kills the task after 1 failure.
> Since 0 isn't the same as 1 this seems to be a bug and results in unexpected 
> behaviour.





[jira] [Created] (MESOS-6833) consecutive_failures 0 == 1 in HealthCheck

2016-12-22 Thread Lukas Loesche (JIRA)
Lukas Loesche created MESOS-6833:


 Summary: consecutive_failures 0 == 1 in HealthCheck
 Key: MESOS-6833
 URL: https://issues.apache.org/jira/browse/MESOS-6833
 Project: Mesos
  Issue Type: Bug
  Components: agent
Affects Versions: 1.1.0
Reporter: Lukas Loesche


When defining a HealthCheck with consecutive_failures=0 one would expect Mesos 
to never kill the task and only notify about the failure.

What seems to happen instead is Mesos handles consecutive_failures=0 as 
consecutive_failures=1 and kills the task after 1 failure.

Since 0 isn't the same as 1 this seems to be a bug and results in unexpected 
behaviour.
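
To spell out the expected semantics, here is a minimal sketch of the decision as described above. It is not the Mesos health checker code; the function name {{shouldKillTask}} and its parameters are hypothetical and only encode the expectation in this report, namely that a limit of 0 means "notify, never kill".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decision helper encoding the behaviour expected in this
// report: a limit of 0 disables killing, any other limit kills the task
// once that many consecutive failures have been observed.
bool shouldKillTask(std::uint32_t limit, std::uint32_t observedFailures) {
  if (limit == 0) {
    return false;  // Only report failures, never kill.
  }
  return observedFailures >= limit;
}
```

Under this reading, {{shouldKillTask(0, 1)}} is false while {{shouldKillTask(1, 1)}} is true; the behaviour observed in this report corresponds to both being true.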






[jira] [Commented] (MESOS-6676) Always re-link with scheduler during re-registration.

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770153#comment-15770153
 ] 

Alexander Rukletsov commented on MESOS-6676:


I've backported this to 1.1.1. [~vinodkone] I believe you still might want to 
backport it to 1.0.x.

> Always re-link with scheduler during re-registration.
> -
>
> Key: MESOS-6676
> URL: https://issues.apache.org/jira/browse/MESOS-6676
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
> Fix For: 1.1.1, 1.2.0
>
>
> Scenario:
> # Framework registers with master using a non-zero {{failover_timeout}} and 
> is assigned a FrameworkID.
> # The master sees an {{ExitedEvent}} for the master->scheduler link. This 
> could happen due to some transient network error, e.g., 1-way partition. The 
> master sends a {{FrameworkErrorMessage}} to the framework. The master marks 
> the framework as disconnected, but keeps the {{Framework*}} for it around in 
> {{frameworks.registered}}.
> # The framework doesn't receive the {{FrameworkErrorMessage}} because it is 
> dropped by the network.
> # The scheduler might receive an {{ExitedEvent}} for the scheduler -> master 
> link, but it ignores this anyway (see MESOS-887).
> # The scheduler sees a new-master-detected event and re-registers with the 
> master. It does _not_ set the {{force}} flag. This means we follow [this 
> code 
> path|https://github.com/apache/mesos/blob/a6bab9015cd63121081495b8291635f386b95a92/src/master/master.cpp#L2771]
>  in the master, which does _not_ relink with the scheduler.
> The result is that scheduler re-registration succeeds, but the master -> 
> scheduler link is never re-established.
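
The scenario above can be sketched as a toy state model. This is not the master's code: {{FrameworkLink}}, {{onExited}}, {{reregisterWithoutRelink}} and {{reregisterAlwaysRelink}} are hypothetical names, and the sketch assumes, as the description implies, that only the {{force}} path re-establishes the link.

```cpp
#include <cassert>

// Toy model of the master's view of a single framework.
struct FrameworkLink {
  bool connected = false;  // Master treats the framework as connected.
  bool linked = false;     // A master -> scheduler link is established.
};

// The master sees an ExitedEvent: the framework is marked disconnected
// and the link is gone.
void onExited(FrameworkLink& f) {
  f.connected = false;
  f.linked = false;
}

// Behaviour described in the scenario: re-registration succeeds, but the
// link is only re-established when `force` is set (assumption).
void reregisterWithoutRelink(FrameworkLink& f, bool force) {
  f.connected = true;
  if (force) {
    f.linked = true;
  }
}

// Behaviour proposed by the issue title: always re-link on re-registration.
void reregisterAlwaysRelink(FrameworkLink& f, bool force) {
  (void)force;  // The flag no longer affects linking in this sketch.
  f.connected = true;
  f.linked = true;
}
```

In the first variant a framework can end up connected but unlinked after a non-forced re-registration, which is the state the scenario ends in.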





[jira] [Updated] (MESOS-6676) Always re-link with scheduler during re-registration.

2016-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6676:
---
Fix Version/s: 1.1.1
  Summary: Always re-link with scheduler during re-registration.  (was: 
Always re-link with scheduler during re-registration)

> Always re-link with scheduler during re-registration.
> -
>
> Key: MESOS-6676
> URL: https://issues.apache.org/jira/browse/MESOS-6676
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
> Fix For: 1.1.1, 1.2.0
>
>
> Scenario:
> # Framework registers with master using a non-zero {{failover_timeout}} and 
> is assigned a FrameworkID.
> # The master sees an {{ExitedEvent}} for the master->scheduler link. This 
> could happen due to some transient network error, e.g., 1-way partition. The 
> master sends a {{FrameworkErrorMessage}} to the framework. The master marks 
> the framework as disconnected, but keeps the {{Framework*}} for it around in 
> {{frameworks.registered}}.
> # The framework doesn't receive the {{FrameworkErrorMessage}} because it is 
> dropped by the network.
> # The scheduler might receive an {{ExitedEvent}} for the scheduler -> master 
> link, but it ignores this anyway (see MESOS-887).
> # The scheduler sees a new-master-detected event and re-registers with the 
> master. It does _not_ set the {{force}} flag. This means we follow [this 
> code 
> path|https://github.com/apache/mesos/blob/a6bab9015cd63121081495b8291635f386b95a92/src/master/master.cpp#L2771]
>  in the master, which does _not_ relink with the scheduler.
> The result is that scheduler re-registration succeeds, but the master -> 
> scheduler link is never re-established.





[jira] [Commented] (MESOS-6597) Include missing Mesos Java classes for Protobuf files to support Operator HTTP V1 API

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770147#comment-15770147
 ] 

Alexander Rukletsov commented on MESOS-6597:


[~anandmazumdar] could you please backport it to 1.1.x?

> Include missing Mesos Java classes for Protobuf files to support Operator 
> HTTP V1 API
> -
>
> Key: MESOS-6597
> URL: https://issues.apache.org/jira/browse/MESOS-6597
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Reporter: Vijay Srinivasaraghavan
>Assignee: Vijay Srinivasaraghavan
>Priority: Blocker
>
> For V1 API support, the build file that generates the Java protobuf wrappers 
> currently includes only executor and scheduler 
> (https://github.com/apache/mesos/blob/master/src/Makefile.am#L334). 
> To support the operator HTTP API, we also need to generate Java protos for 
> additional proto definitions such as quota and maintenance. These Java 
> definition files will be used by a standard REST client when using the 
> HTTP API directly.





[jira] [Commented] (MESOS-6419) The 'master/teardown' endpoint should support tearing down 'unregistered_frameworks'.

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770145#comment-15770145
 ] 

Alexander Rukletsov commented on MESOS-6419:


[~neilc], [~vinodkone]: Do you still want it in 1.1.1?

> The 'master/teardown' endpoint should support tearing down 
> 'unregistered_frameworks'.
> -
>
> Key: MESOS-6419
> URL: https://issues.apache.org/jira/browse/MESOS-6419
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.26.2, 0.27.3, 0.28.2, 1.0.1
>Reporter: Gilbert Song
>Assignee: Neil Conway
>Priority: Critical
>  Labels: endpoint, master
>
> This issue is exposed from 
> [MESOS-6400](https://issues.apache.org/jira/browse/MESOS-6400). When a user 
> is trying to tear down an 'unregistered_framework' from the 'master/teardown' 
> endpoint, a bad request will be returned: `No framework found with specified 
> ID`.
> Ideally, we should support tearing down an unregistered framework, since 
> those frameworks may occur due to network partition, then all the orphan 
> tasks still occupy the resources. It would be a nightmare if a user has to 
> wait until the unregistered framework to get those resources back.
> This may be the initial implementation: 
> https://github.com/apache/mesos/commit/bb8375975e92ee722befb478ddc3b2541d1ccaa9





[jira] [Commented] (MESOS-6411) Add documentation for CNI port-mapper plugin.

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770133#comment-15770133
 ] 

Alexander Rukletsov commented on MESOS-6411:


[~avin...@mesosphere.io], [~jieyu]: I've backported this to 1.1.1. This was the 
original intention, right?

> Add documentation for CNI port-mapper plugin.
> -
>
> Key: MESOS-6411
> URL: https://issues.apache.org/jira/browse/MESOS-6411
> Project: Mesos
>  Issue Type: Documentation
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
> Fix For: 1.1.1, 1.2.0
>
>
> Need to add the CNI port-mapper plugin to the CNI documentation within Mesos.





[jira] [Updated] (MESOS-6411) Add documentation for CNI port-mapper plugin.

2016-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6411:
---
Target Version/s: 1.1.1, 1.2.0  (was: 1.1.1)
   Fix Version/s: 1.1.1

> Add documentation for CNI port-mapper plugin.
> -
>
> Key: MESOS-6411
> URL: https://issues.apache.org/jira/browse/MESOS-6411
> Project: Mesos
>  Issue Type: Documentation
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
> Fix For: 1.1.1, 1.2.0
>
>
> Need to add the CNI port-mapper plugin to the CNI documentation within Mesos.





[jira] [Commented] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770098#comment-15770098
 ] 

Qian Zhang commented on MESOS-6571:
---

Sorry about that. [~alexr], you are correct, thanks for updating it.

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we 
> added the flag {{\--task_group}} to {{mesos-execute}} so that users can 
> specify a {{TaskGroupInfo}} JSON with that flag to launch a task group. In 
> this ticket, we'd like to add another flag, {{\--task}}, so that users can 
> specify a {{TaskInfo}} JSON to launch a single task.





[jira] [Commented] (MESOS-6002) The whiteout file cannot be removed correctly using aufs backend.

2016-12-22 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770093#comment-15770093
 ] 

Qian Zhang commented on MESOS-6002:
---

Yes [~alexr], you are correct, thanks for adding 1.2.0 as fix version.

> The whiteout file cannot be removed correctly using aufs backend.
> -
>
> Key: MESOS-6002
> URL: https://issues.apache.org/jira/browse/MESOS-6002
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: Ubuntu 14, Ubuntu 12
> Or any os with aufs module
>Reporter: Gilbert Song
>Assignee: Qian Zhang
>  Labels: aufs, backend, containerizer
> Fix For: 1.1.1, 1.2.0
>
> Attachments: whiteout.diff
>
>
> The whiteout file is not removed correctly when using the aufs backend in 
> unified containerizer. It can be verified by this unit test with the aufs 
> manually specified.
> {noformat}
> [20:11:24] :   [Step 10/10] [ RUN  ] 
> ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_Whiteout
> [20:11:24]W:   [Step 10/10] I0805 20:11:24.986734 24295 cluster.cpp:155] 
> Creating default 'local' authorizer
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.001153 24295 leveldb.cpp:174] 
> Opened db in 14.308627ms
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.003731 24295 leveldb.cpp:181] 
> Compacted db in 2.558329ms
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.003749 24295 leveldb.cpp:196] 
> Created db iterator in 3086ns
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.003754 24295 leveldb.cpp:202] 
> Seeked to beginning of db in 595ns
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.003758 24295 leveldb.cpp:271] 
> Iterated through 0 keys in the db in 314ns
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.003769 24295 replica.cpp:776] 
> Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004086 24315 recover.cpp:451] 
> Starting replica recovery
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004251 24312 recover.cpp:477] 
> Replica is in EMPTY status
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004546 24314 replica.cpp:673] 
> Replica in EMPTY status received a broadcasted recover request from 
> __req_res__(5640)@172.30.2.105:36006
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004607 24312 recover.cpp:197] 
> Received a recover response from a replica in EMPTY status
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004762 24313 recover.cpp:568] 
> Updating replica status to STARTING
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004776 24314 master.cpp:375] 
> Master 21665992-d47e-402f-a00c-6f8fab613019 (ip-172-30-2-105.mesosphere.io) 
> started on 172.30.2.105:36006
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004787 24314 master.cpp:377] Flags 
> at startup: --acls="" --agent_ping_timeout="15secs" 
> --agent_reregister_timeout="10mins" --allocation_interval="1secs" 
> --allocator="HierarchicalDRF" --authenticate_agents="true" 
> --authenticate_frameworks="true" --authenticate_http_frameworks="true" 
> --authenticate_http_readonly="true" --authenticate_http_readwrite="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/0z753P/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --http_authenticators="basic" 
> --http_framework_authenticators="basic" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" --quiet="false" 
> --recovery_agent_removal_limit="100%" --registry="replicated_log" 
> --registry_fetch_timeout="1mins" --registry_store_timeout="100secs" 
> --registry_strict="true" --root_submissions="true" --user_sorter="drf" 
> --version="false" --webui_dir="/usr/local/share/mesos/webui" 
> --work_dir="/tmp/0z753P/master" --zk_session_timeout="10secs"
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004920 24314 master.cpp:427] 
> Master only allowing authenticated frameworks to register
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004930 24314 master.cpp:441] 
> Master only allowing authenticated agents to register
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004935 24314 master.cpp:454] 
> Master only allowing authenticated HTTP frameworks to register
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.004942 24314 credentials.hpp:37] 
> Loading credentials for authentication from '/tmp/0z753P/credentials'
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.005018 24314 master.cpp:499] Using 
> default 'crammd5' authenticator
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.005101 24314 http.cpp:883] Using 
> default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
> [20:11:25]W:   [Step 10/10] I0805 20:11:25.005152 24314 http.cpp:883] Usin

[jira] [Commented] (MESOS-6357) `NestedMesosContainerizerTest.ROOT_CGROUPS_ParentExit` is flaky in Debian 8.

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769918#comment-15769918
 ] 

Alexander Rukletsov commented on MESOS-6357:


[~gilbert], [~klueska]: I'm inclined to retarget it to 1.2.0 and not for 1.1.1. 
Agreed?

> `NestedMesosContainerizerTest.ROOT_CGROUPS_ParentExit` is flaky in Debian 8.
> 
>
> Key: MESOS-6357
> URL: https://issues.apache.org/jira/browse/MESOS-6357
> Project: Mesos
>  Issue Type: Bug
>  Components: tests
>Affects Versions: 1.1.0
> Environment: Debian 8 with SSL enabled
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: flaky-test
>
> {noformat}
> [00:21:51] :   [Step 10/10] [ RUN  ] 
> NestedMesosContainerizerTest.ROOT_CGROUPS_ParentExit
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.357839 23530 
> containerizer.cpp:202] Using isolation: 
> cgroups/cpu,filesystem/linux,namespaces/pid,network/cni,volume/image
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.361143 23530 
> linux_launcher.cpp:150] Using /sys/fs/cgroup/freezer as the freezer hierarchy 
> for the Linux launcher
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.366930 23547 
> containerizer.cpp:557] Recovering containerizer
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.367962 23551 provisioner.cpp:253] 
> Provisioner recovery complete
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.368253 23549 
> containerizer.cpp:954] Starting container 
> 42589936-56b2-4e41-86d8-447bfaba4666 for executor 'executor' of framework 
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.368577 23548 cgroups.cpp:404] 
> Creating cgroup at 
> '/sys/fs/cgroup/cpu,cpuacct/mesos_test_458f8018-67e7-4cc6-8126-a535974db35d/42589936-56b2-4e41-86d8-447bfaba4666'
>  for container 42589936-56b2-4e41-86d8-447bfaba4666
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.369863 23544 cpu.cpp:103] Updated 
> 'cpu.shares' to 1024 (cpus 1) for container 
> 42589936-56b2-4e41-86d8-447bfaba4666
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.370384 23545 
> containerizer.cpp:1443] Launching 'mesos-containerizer' with flags 
> '--command="{"shell":true,"value":"read key <&30"}" --help="false" 
> --pipe_read="30" --pipe_write="34" 
> --pre_exec_commands="[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/mnt\/teamcity\/work\/4240ba9ddd0997c3\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount
>  -n -t proc proc \/proc -o nosuid,noexec,nodev"}]" 
> --runtime_directory="/mnt/teamcity/temp/buildTmp/NestedMesosContainerizerTest_ROOT_CGROUPS_ParentExit_sEbtvQ/containers/42589936-56b2-4e41-86d8-447bfaba4666"
>  --unshare_namespace_mnt="false" 
> --working_directory="/mnt/teamcity/temp/buildTmp/NestedMesosContainerizerTest_ROOT_CGROUPS_ParentExit_MqjHi0"'
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.370483 23544 
> linux_launcher.cpp:421] Launching container 
> 42589936-56b2-4e41-86d8-447bfaba4666 and cloning with namespaces CLONE_NEWNS 
> | CLONE_NEWPID
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.374867 23545 
> containerizer.cpp:1480] Checkpointing container's forked pid 14139 to 
> '/mnt/teamcity/temp/buildTmp/NestedMesosContainerizerTest_ROOT_CGROUPS_ParentExit_gzjeKG/meta/slaves/frameworks/executors/executor/runs/42589936-56b2-4e41-86d8-447bfaba4666/pids/forked.pid'
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.376519 23551 
> containerizer.cpp:1648] Starting nested container 
> 42589936-56b2-4e41-86d8-447bfaba4666.a5bc9913-c32c-40c6-ab78-2b08910847f8
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.377296 23549 
> containerizer.cpp:1443] Launching 'mesos-containerizer' with flags 
> '--command="{"shell":true,"value":"sleep 1000"}" --help="false" 
> --pipe_read="30" --pipe_write="34" 
> --pre_exec_commands="[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/mnt\/teamcity\/work\/4240ba9ddd0997c3\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount
>  -n -t proc proc \/proc -o nosuid,noexec,nodev"}]" 
> --runtime_directory="/mnt/teamcity/temp/buildTmp/NestedMesosContainerizerTest_ROOT_CGROUPS_ParentExit_sEbtvQ/containers/42589936-56b2-4e41-86d8-447bfaba4666/containers/a5bc9913-c32c-40c6-ab78-2b08910847f8"
>  --unshare_namespace_mnt="false" 
> --working_directory="/mnt/teamcity/temp/buildTmp/NestedMesosContainerizerTest_ROOT_CGROUPS_ParentExit_MqjHi0/containers/a5bc9913-c32c-40c6-ab78-2b08910847f8"'
> [00:21:51]W:   [Step 10/10] I1008 00:21:51.377424 23548 
> linux_launcher.cpp:421] Launching nested container 
> 42589936-56b2-4e41-86d8-447bfaba4666.a5bc9913-c32c-40c6-ab78-2b08910847f8 and 
> cloning with namespaces CLONE_NEWNS | CLONE_NEWPID
> [00:21:51] :   [Step 10/10] Executing pre-exec command 
> '{"argu

[jira] [Commented] (MESOS-6431) Add support for port-mapping in `mesos-execute`

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769914#comment-15769914
 ] 

Alexander Rukletsov commented on MESOS-6431:


[~avin...@mesosphere.io], [~jieyu]: Does this need to be backported to 1.1.1?

> Add support for port-mapping in `mesos-execute`
> ---
>
> Key: MESOS-6431
> URL: https://issues.apache.org/jira/browse/MESOS-6431
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
> Fix For: 1.2.0
>
>
> Add support to specify port-mappings for a container in mesos-execute.





[jira] [Commented] (MESOS-6040) Add a CMake build for `mesos-port-mapper`

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769912#comment-15769912
 ] 

Alexander Rukletsov commented on MESOS-6040:


[~avin...@mesosphere.io], [~jieyu]: Is this a blocker for 1.1.1? Given that no 
progress has been made in the last several weeks, shall we retarget it to 1.2.0?

> Add a CMake build for `mesos-port-mapper`
> -
>
> Key: MESOS-6040
> URL: https://issues.apache.org/jira/browse/MESOS-6040
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Once the port-mapper binary compiles with GNU make, we need to modify the 
> CMake build to build the port-mapper binary as well.





[jira] [Commented] (MESOS-6002) The whiteout file cannot be removed correctly using aufs backend.

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769906#comment-15769906
 ] 

Alexander Rukletsov commented on MESOS-6002:


[~qianzhang], [~jieyu]: This one looks like it was backported to 1.1.1 but 
apparently is missing 1.2.0 as fix version, correct?

> The whiteout file cannot be removed correctly using aufs backend.
> -
>
> Key: MESOS-6002
> URL: https://issues.apache.org/jira/browse/MESOS-6002
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: Ubuntu 14, Ubuntu 12
> Or any os with aufs module
>Reporter: Gilbert Song
>Assignee: Qian Zhang
>  Labels: aufs, backend, containerizer
> Fix For: 1.1.1, 1.2.0
>
> Attachments: whiteout.diff
>
>
> The whiteout file is not removed correctly when using the aufs backend in 
> unified containerizer. It can be verified by this unit test with the aufs 
> manually specified.
> {noformat}
> [20:11:24] :   [Step 10/10] [ RUN  ] 
> ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_Whiteout

[jira] [Updated] (MESOS-6002) The whiteout file cannot be removed correctly using aufs backend.

2016-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6002:
---
Fix Version/s: 1.2.0

> The whiteout file cannot be removed correctly using aufs backend.
> -
>
> Key: MESOS-6002
> URL: https://issues.apache.org/jira/browse/MESOS-6002
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: Ubuntu 14, Ubuntu 12
> Or any os with aufs module
>Reporter: Gilbert Song
>Assignee: Qian Zhang
>  Labels: aufs, backend, containerizer
> Fix For: 1.1.1, 1.2.0
>
> Attachments: whiteout.diff
>
>
> The whiteout file is not removed correctly when using the aufs backend in 
> unified containerizer. It can be verified by this unit test with the aufs 
> manually specified.
> {noformat}
> [20:11:24] :   [Step 10/10] [ RUN  ] 
> ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_Whiteout

[jira] [Commented] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769896#comment-15769896
 ] 

Alexander Rukletsov commented on MESOS-6571:


[~qianzhang], [~jieyu]: This ticket was marked as fixed for 1.1.1 release, but 
it is clearly not there. I've changed the fix version to 1.2.0, please update 
the ticket appropriately and let me know if this is not correct.

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we 
> added the flag {{\--task_group}} to {{mesos-execute}} so that users can 
> specify a {{TaskGroupInfo}} JSON with that flag to launch a task group. In 
> this ticket, we'd like to add another flag, {{\--task}}, so that users can 
> specify a {{TaskInfo}} JSON to launch a single task.





[jira] [Updated] (MESOS-6571) Add "--task" flag to mesos-execute

2016-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6571:
---
Fix Version/s: 1.2.0  (was: 1.1.1)

> Add "--task" flag to mesos-execute
> --
>
> Key: MESOS-6571
> URL: https://issues.apache.org/jira/browse/MESOS-6571
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 1.2.0
>
>
> In [MESOS-6096 | https://issues.apache.org/jira/browse/MESOS-6096], we 
> added the flag {{\--task_group}} to {{mesos-execute}} so that users can 
> specify a {{TaskGroupInfo}} JSON with that flag to launch a task group. In 
> this ticket, we'd like to add another flag, {{\--task}}, so that users can 
> specify a {{TaskInfo}} JSON to launch a single task.





[jira] [Commented] (MESOS-6821) Override of automatic resources should be by exact match not substring

2016-12-22 Thread Bruce Merry (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769770#comment-15769770
 ] 

Bruce Merry commented on MESOS-6821:


I may look into doing a patch when I'm back from holiday next year, although 
it's not top of my priorities so if someone else wants to take it I'm quite 
happy for them to do so.

> Override of automatic resources should be by exact match not substring
> --
>
> Key: MESOS-6821
> URL: https://issues.apache.org/jira/browse/MESOS-6821
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Affects Versions: 1.1.0
> Environment: Ubuntu 16.04 x86_64
>Reporter: Bruce Merry
>Priority: Minor
>  Labels: newbie
>
> The agent code for auto-detecting resources (cpus, mem, disk) assumes that, 
> say, "cpus" has been specified if the string "cpus" appears anywhere in the 
> resource string (see 
> [here](https://github.com/apache/mesos/blob/1.1.0/src/slave/containerizer/containerizer.cpp#L79)).
> This means that using a custom resource called, say, "members" will disable 
> auto-detection of the "mem" resource.

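To make the difference described in this ticket concrete, here is a minimal sketch contrasting a substring check with an exact-name check on a semicolon-separated resource string such as `cpus:2;members:4`. This is not the agent's code: the helper names `trim`, `mentionsBySubstring`, and `specifiesExactly` are hypothetical, and the sketch assumes only the simple `name:value;name(role):value` text form, not JSON-formatted resources.

```cpp
#include <sstream>
#include <string>

// Strip leading and trailing whitespace from a resource entry's name.
std::string trim(const std::string& s) {
  const char* whitespace = " \t\n";
  const std::string::size_type begin = s.find_first_not_of(whitespace);
  if (begin == std::string::npos) {
    return "";
  }
  const std::string::size_type end = s.find_last_not_of(whitespace);
  return s.substr(begin, end - begin + 1);
}

// Substring check, matching the behavior the ticket describes: any
// occurrence of `name` anywhere in the string counts as "specified".
bool mentionsBySubstring(const std::string& resources,
                         const std::string& name) {
  return resources.find(name) != std::string::npos;
}

// Exact-match check (hypothetical): split on ';', treat the text before the
// first ':' or '(' (an optional role) as the resource name, and compare the
// whole name.
bool specifiesExactly(const std::string& resources,
                      const std::string& name) {
  std::istringstream stream(resources);
  std::string entry;
  while (std::getline(stream, entry, ';')) {
    const std::string::size_type end = entry.find_first_of(":(");
    if (trim(entry.substr(0, end)) == name) {
      return true;
    }
  }
  return false;
}
```

With the input `cpus:2;members:4`, the substring check reports "mem" as specified, while the exact-match check does not, so auto-detection of "mem" would still run.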




[jira] [Created] (MESOS-6832) Review how Mesos handles loading and unloading of dynamic libraries

2016-12-22 Thread Joseph Wu (JIRA)
Joseph Wu created MESOS-6832:


 Summary: Review how Mesos handles loading and unloading of dynamic 
libraries
 Key: MESOS-6832
 URL: https://issues.apache.org/jira/browse/MESOS-6832
 Project: Mesos
  Issue Type: Improvement
  Components: gpu, cmake, java api, modules
Affects Versions: 1.2.0
Reporter: Joseph Wu


There are three instances in the codebase where we load a dynamic library into 
a static variable and leak said variable on purpose:

* https://github.com/apache/mesos/blob/1.1.x/src/jvm/jvm.cpp#L83
* 
https://github.com/apache/mesos/blob/1.1.x/src/slave/containerizer/mesos/isolators/gpu/nvml.cpp#L78
* https://github.com/apache/mesos/blob/1.1.x/src/module/manager.hpp#L181
^ This last one will be changed to leak as part of MESOS-6658

Since the dynamic libraries are loaded into static variables, they will only be 
destructed when the library (i.e. libmesos) gets unloaded. This might lead to 
inconsistencies when libmesos's own destruction unloads e.g., a dynamic 
libprocess, which might be opened by a {{dlopen}} of a module.  The module's 
cleanup would not find libprocess anymore and potentially crash during 
unloading.

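The "leak on purpose" pattern mentioned above can be sketched roughly as follows. This is an illustration only, not the code at the linked locations: `Library`, `leakedLibrary`, and the path `libexample.so` are hypothetical, and the class merely stands in for a wrapper that would call `dlopen` in its constructor and `dlclose` in its destructor.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded dynamic library. A real wrapper would
// call dlopen() in the constructor and dlclose() in the destructor.
class Library {
public:
  explicit Library(std::string path) : path_(std::move(path)) {}
  ~Library() { ++destroyed; }

  const std::string& path() const { return path_; }

  // Counts destructor runs so the effect of the leak is observable.
  static int destroyed;

private:
  std::string path_;
};

int Library::destroyed = 0;

// Intentionally leaked: allocated once on first use and never deleted, so
// ~Library() does not run during static destruction or library unload.
Library* leakedLibrary() {
  static Library* library = new Library("libexample.so");
  return library;
}
```

Because only the pointer is static, no destructor is registered for the `Library` object itself. A function-local `static Library library(...)` would instead be destroyed during static destruction, which is where ordering against other unloaded libraries becomes a concern.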

