date:20160523

[jira] [Issue Comment Deleted] (MESOS-5425) Consider using IntervalSet for Port range resource math

2016-05-23 Thread Yanyan Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanyan Hu updated MESOS-5425:
-
Comment: was deleted

(was: Hi Joseph, I just ran a quick test using the "IntervalSet" data type: I
first converted two "Ranges" values to "IntervalSet" values and subtracted one
from the other. Then I converted the resulting "IntervalSet" back to a "Ranges"
value. The results show that performance is much better when there are 1600
sub-ranges in res2. The test results are as follows:

res2 range_size    execution time (seconds)
1                  0.010
100                0.028
200                0.030
400                0.035
800                0.044
1600               0.061

So, just as you suggested, using IntervalSet in Port range resource math can
resolve this issue effectively.)

> Consider using IntervalSet for Port range resource math
> ---
>
> Key: MESOS-5425
> URL: https://issues.apache.org/jira/browse/MESOS-5425
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Joseph Wu
>  Labels: mesosphere
>
> Follow-up JIRA for comments raised in MESOS-3051 (see comments there).
> We should consider utilizing 
> [{{IntervalSet}}|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/3rdparty/stout/include/stout/interval.hpp]
>  in [Port range resource 
> math|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/src/common/values.cpp#L143].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MESOS-5425) Consider using IntervalSet for Port range resource math

2016-05-23 Thread Yanyan Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296038#comment-15296038
 ] 

Yanyan Hu commented on MESOS-5425:
--

Hi Joseph, I just ran a quick test using the "IntervalSet" data type: I first
converted two "Ranges" values to "IntervalSet" values and subtracted one from
the other. Then I converted the resulting "IntervalSet" back to a "Ranges"
value. The results show that performance is much better when there are 1600
sub-ranges in res2. The test results are as follows:

res2 range_size    execution time (seconds)
1                  0.010
100                0.028
200                0.030
400                0.035
800                0.044
1600               0.061

So, just as you suggested, using IntervalSet in Port range resource math can
resolve this issue effectively.

> Consider using IntervalSet for Port range resource math
> ---
>
> Key: MESOS-5425
> URL: https://issues.apache.org/jira/browse/MESOS-5425
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Joseph Wu
>  Labels: mesosphere
>
> Follow-up JIRA for comments raised in MESOS-3051 (see comments there).
> We should consider utilizing 
> [{{IntervalSet}}|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/3rdparty/stout/include/stout/interval.hpp]
>  in [Port range resource 
> math|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/src/common/values.cpp#L143].




[jira] [Commented] (MESOS-5425) Consider using IntervalSet for Port range resource math

2016-05-23 Thread Yanyan Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296040#comment-15296040
 ] 

Yanyan Hu commented on MESOS-5425:
--

Hi Joseph, I just ran a quick test using the "IntervalSet" data type: I first
converted two "Ranges" values to "IntervalSet" values and subtracted one from
the other. Then I converted the resulting "IntervalSet" back to a "Ranges"
value. The results show that performance is much better when there are 1600
sub-ranges in res2. The test results are as follows:

res2 range_size    execution time (seconds)
1                  0.010
100                0.028
200                0.030
400                0.035
800                0.044
1600               0.061

So, just as you suggested, using IntervalSet in Port range resource math
should be able to resolve this issue effectively.
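
As a rough illustration of the kind of subtraction being measured, here is a
minimal Python sketch that subtracts one sorted list of inclusive port ranges
from another in a single pass. It only shows the general idea of interval-based
subtraction under stated assumptions: it does not use stout's IntervalSet or
the Mesos Ranges type, and the name subtract_ranges is hypothetical.

```python
def subtract_ranges(a, b):
    """Return the parts of `a` not covered by `b`.

    Assumption: both inputs are lists of inclusive (begin, end) tuples,
    sorted by `begin` and non-overlapping within each list.
    """
    result = []
    j = 0  # index of the first range in b that can still affect a
    for begin, end in a:
        cur = begin
        # Skip ranges of b that end before the current range of a starts.
        while j < len(b) and b[j][1] < cur:
            j += 1
        k = j
        # Cut every overlapping range of b out of [cur, end].
        while k < len(b) and b[k][0] <= end:
            if b[k][0] > cur:
                result.append((cur, b[k][0] - 1))
            cur = max(cur, b[k][1] + 1)
            k += 1
        if cur <= end:
            result.append((cur, end))
    return result


print(subtract_ranges([(1, 10)], [(3, 4), (8, 12)]))
```

Because both lists are kept sorted, each range is visited a small number of
times instead of being compared against every other range.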


> Consider using IntervalSet for Port range resource math
> ---
>
> Key: MESOS-5425
> URL: https://issues.apache.org/jira/browse/MESOS-5425
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Joseph Wu
>  Labels: mesosphere
>
> Follow-up JIRA for comments raised in MESOS-3051 (see comments there).
> We should consider utilizing 
> [{{IntervalSet}}|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/3rdparty/stout/include/stout/interval.hpp]
>  in [Port range resource 
> math|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/src/common/values.cpp#L143].




[jira] [Commented] (MESOS-5425) Consider using IntervalSet for Port range resource math

2016-05-23 Thread Yanyan Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296045#comment-15296045
 ] 

Yanyan Hu commented on MESOS-5425:
--

I will run more tests to see whether we can make the Mesos allocator work more
efficiently with this optimization. Thanks.

> Consider using IntervalSet for Port range resource math
> ---
>
> Key: MESOS-5425
> URL: https://issues.apache.org/jira/browse/MESOS-5425
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Joseph Wu
>  Labels: mesosphere
>
> Follow-up JIRA for comments raised in MESOS-3051 (see comments there).
> We should consider utilizing 
> [{{IntervalSet}}|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/3rdparty/stout/include/stout/interval.hpp]
>  in [Port range resource 
> math|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/src/common/values.cpp#L143].




[jira] [Commented] (MESOS-4279) Docker executor truncates task's output when the task is killed.

2016-05-23 Thread Martin Bydzovsky (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296080#comment-15296080
 ] 

Martin Bydzovsky commented on MESOS-4279:
-

Btw, see https://github.com/mesosphere/marathon/issues/2707. This bug (which we
see as well) is EXACTLY the same problem. You are killing the parent process
({code}mesos/build/src/.libs/lt-mesos-docker-executor{code}) too early (and
docker stop doesn't get called at all, so the \-\-rm flag doesn't get
respected). We waste our slaves' disk space about once a week because the
containers simply don't get removed.

Hmm, now I'm kind of curious why this flag doesn't take effect at all:
{code}--docker_remove_delay=VALUE   The amount of time to wait before removing
docker containers (e.g., 3days, 2weeks, etc). (default: 6hrs){code}

But I don't want to test it until this is fixed.

> Docker executor truncates task's output when the task is killed.
> 
>
> Key: MESOS-4279
> URL: https://issues.apache.org/jira/browse/MESOS-4279
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.25.0, 0.26.0, 0.27.2, 0.28.1
>Reporter: Martin Bydzovsky
>Assignee: Martin Bydzovsky
>Priority: Blocker
>  Labels: docker, mesosphere
> Fix For: 0.29.0
>
>
> I'm implementing graceful restarts of our mesos-marathon-docker setup and I
> ran into the following issue:
> (it was already discussed on
> https://github.com/mesosphere/marathon/issues/2876 and the guys from mesosphere
> concluded that it's probably a docker containerizer problem...)
> To sum it up:
> When I deploy a simple python script to all mesos-slaves:
> {code}
> #!/usr/bin/python
> from time import sleep
> import signal
> import sys
> import datetime
> def sigterm_handler(_signo, _stack_frame):
>     print "got %i" % _signo
>     print datetime.datetime.now().time()
>     sys.stdout.flush()
>     sleep(2)
>     print datetime.datetime.now().time()
>     print "ending"
>     sys.stdout.flush()
>     sys.exit(0)
> signal.signal(signal.SIGTERM, sigterm_handler)
> signal.signal(signal.SIGINT, sigterm_handler)
> try:
>     print "Hello"
>     i = 0
>     while True:
>         i += 1
>         print datetime.datetime.now().time()
>         print "Iteration #%i" % i
>         sys.stdout.flush()
>         sleep(1)
> finally:
>     print "Goodbye"
> {code}
> and I run it through Marathon like
> {code:javascript}
> data = {
>   args: ["/tmp/script.py"],
>   instances: 1,
>   cpus: 0.1,
>   mem: 256,
>   id: "marathon-test-api"
> }
> {code}
> During the app restart I get the expected result: the task receives SIGTERM and
> dies peacefully (within the 2-second period specified in my script).
> But when I wrap this python script in a docker image:
> {code}
> FROM node:4.2
> RUN mkdir /app
> ADD . /app
> WORKDIR /app
> ENTRYPOINT []
> {code}
> and run the corresponding application through Marathon:
> {code:javascript}
> data = {
>   args: ["./script.py"],
>   container: {
>     type: "DOCKER",
>     docker: {
>       image: "bydga/marathon-test-api"
>     },
>     forcePullImage: yes
>   },
>   cpus: 0.1,
>   mem: 256,
>   instances: 1,
>   id: "marathon-test-api"
> }
> {code}
> During a restart (issued from Marathon), the task dies immediately without
> having a chance to do any cleanup.




[jira] [Commented] (MESOS-5436) GPU resource broke framework data table

2016-05-23 Thread Kevin Klues (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296094#comment-15296094
 ] 

Kevin Klues commented on MESOS-5436:


I think it's OK to just put 0 or "N/A" in cases where we don't yet have the 
proper statistics. We can fill them in once we have them.

> GPU resource broke framework data table
> ---
>
> Key: MESOS-5436
> URL: https://issues.apache.org/jira/browse/MESOS-5436
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: gpu
> Attachments: incorrect_agent_framework_page.png, 
> incorrect_agent_page.png
>
>
> In agent_framework.html and master/static/agent.html, we added {{GPUs (Used /
> Allocated)}} to the table header, but we didn't add the corresponding column to
> the table body.
> In addition, we don't provide statistics for GPUs on the monitor endpoints.
> To show that data in the webui, we first need to implement GPU statistics in
> the monitor endpoints.




[jira] [Commented] (MESOS-5255) Add GPUs to container resource consumption metrics.

2016-05-23 Thread Kevin Klues (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296096#comment-15296096
 ] 

Kevin Klues commented on MESOS-5255:


It does. Can you point me at the lines of code that are causing the errors you 
see in the other bug? Where did I miss adding a column, etc.

> Add GPUs to container resource consumption metrics.
> ---
>
> Key: MESOS-5255
> URL: https://issues.apache.org/jira/browse/MESOS-5255
> Project: Mesos
>  Issue Type: Task
>Reporter: Kevin Klues
>  Labels: gpu
>
> Currently the usage callback in the Nvidia GPU isolator is unimplemented:
> {noformat}
> src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp
> {noformat}
> It should use functionality from NVML to gather the current GPU usage and add 
> it to a ResourceStatistics object. It is still an open question as to exactly 
> what information we want to expose here (power, memory consumption, current 
> load, etc.). Whatever we decide on should be standard across different GPU 
> types, different GPU vendors, etc.




[jira] [Created] (MESOS-5439) Got registration problem

2016-05-23 Thread kimjoohwan (JIRA)

kimjoohwan created MESOS-5439:
-

 Summary: Got registration problem
 Key: MESOS-5439
 URL: https://issues.apache.org/jira/browse/MESOS-5439
 Project: Mesos
  Issue Type: Bug
  Components: c++ api, slave
Affects Versions: 0.27.0
Reporter: kimjoohwan


Currently, we are using Mesos 0.27.0. The master has an Intel(R) Core(TM)
i5-3470 CPU @ 3.20GHz and 4GB of RAM. The slave (Banana PI) has a Cortex-A7
dual-core CPU and 1GB of RAM.

Using the Mesos API, we have developed a Python-based framework and run it to
completion.

However, we found that too much time (about 5 seconds) passes between the
messages 'Forked child with pid' and 'Got registration for executor' in the
slave log.

If you know how to deal with this problem, please let us know.

I0523 17:38:16.264289  1787 slave.cpp:5208] Launching executor default of 
framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 with resources  in work 
directory 
'/tmp/mesos/slaves/3fb86eea-96c4-4b07-aaa2-caf071275bdf-S2/frameworks/3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010/executors/default/runs/1c830c9a-4120-4ef0-af80-49a52d307539'
I0523 17:38:16.290601  1789 containerizer.cpp:616] Starting container 
'1c830c9a-4120-4ef0-af80-49a52d307539' for executor 'default' of framework 
'3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010'
I0523 17:38:16.293285  1787 slave.cpp:1626] Queuing task '0' for executor 
'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010
I0523 17:38:16.297369  1787 slave.cpp:4233] Current disk usage 2.14%. Max 
allowed age: 6.150293798159722days
I0523 17:38:16.504043  1789 launcher.cpp:132] Forked child with pid '1837' for 
container '1c830c9a-4120-4ef0-af80-49a52d307539'
I0523 17:38:21.510535  1785 slave.cpp:2573] Got registration for executor 
'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
executor(1)@192.168.0.8:56508
I0523 17:38:21.554608  1785 slave.cpp:1791] Sending queued task '0' to executor 
'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 at 
executor(1)@192.168.0.8:56508
I0523 17:38:21.594511  1789 slave.cpp:2932] Handling status update TASK_RUNNING 
(UUID: cd04ec2a-0e68-460a-ad2e-e4f504f3b032) for task 0 of framework 
3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from executor(1)@192.168.0.8:56508
I0523 17:38:21.600050  1789 slave.cpp:2932] Handling status update 
TASK_FINISHED (UUID: 46e110c8-4078-4f98-ae30-30b3a1376034) for task 0 of 
framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
executor(1)@192.168.0.8:56508
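
The delay between two log messages can be measured directly from the
timestamps instead of being read off by eye. The following Python sketch is
only an illustration: the helper names are hypothetical, the two sample lines
are abbreviated copies of the log entries above, and it assumes both entries
were written on the same day.

```python
import re
from datetime import datetime

# Time-of-day field of a log line such as "I0523 17:38:16.504043 ...".
TIME_RE = re.compile(r"^[IWEF]\d{4} (\d{2}:\d{2}:\d{2}\.\d{6})")


def log_time(line):
    """Return the time of day of one log line (the date part is ignored)."""
    match = TIME_RE.match(line)
    if match is None:
        raise ValueError("unexpected log line format: %r" % line)
    return datetime.strptime(match.group(1), "%H:%M:%S.%f")


def seconds_between(lines, first_marker, second_marker):
    """Seconds from the first line containing first_marker to the first
    later line containing second_marker."""
    start = None
    for line in lines:
        if start is None and first_marker in line:
            start = log_time(line)
        elif start is not None and second_marker in line:
            return (log_time(line) - start).total_seconds()
    raise ValueError("markers not found in order")


# Abbreviated copies of two entries from the slave log above.
SAMPLE = [
    "I0523 17:38:16.504043  1789 launcher.cpp:132] Forked child with pid '1837'",
    "I0523 17:38:21.510535  1785 slave.cpp:2573] Got registration for executor 'default'",
]

delay = seconds_between(
    SAMPLE, "Forked child with pid", "Got registration for executor")
print("%.3f" % delay)  # prints 5.006
```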




[jira] [Commented] (MESOS-5255) Add GPUs to container resource consumption metrics.

2016-05-23 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296107#comment-15296107
 ] 

haosdent commented on MESOS-5255:
-

Got it, thank you for checking. It is here: https://reviews.apache.org/r/47719/

> Add GPUs to container resource consumption metrics.
> ---
>
> Key: MESOS-5255
> URL: https://issues.apache.org/jira/browse/MESOS-5255
> Project: Mesos
>  Issue Type: Task
>Reporter: Kevin Klues
>  Labels: gpu
>
> Currently the usage callback in the Nvidia GPU isolator is unimplemented:
> {noformat}
> src/slave/containerizer/mesos/isolators/cgroups/devices/gpus/nvidia.cpp
> {noformat}
> It should use functionality from NVML to gather the current GPU usage and add 
> it to a ResourceStatistics object. It is still an open question as to exactly 
> what information we want to expose here (power, memory consumption, current 
> load, etc.). Whatever we decide on should be standard across different GPU 
> types, different GPU vendors, etc.




[jira] [Updated] (MESOS-5439) registerExecutor problem

2016-05-23 Thread kimjoohwan (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kimjoohwan updated MESOS-5439:
--
Summary: registerExecutor problem  (was: Got registration problem)

> registerExecutor problem
> 
>
> Key: MESOS-5439
> URL: https://issues.apache.org/jira/browse/MESOS-5439
> Project: Mesos
>  Issue Type: Bug
>  Components: c++ api, slave
>Affects Versions: 0.27.0
>Reporter: kimjoohwan
>
> Currently, we are using Mesos 0.27.0. The master has an Intel(R) Core(TM)
> i5-3470 CPU @ 3.20GHz and 4GB of RAM. The slave (Banana PI) has a Cortex-A7
> dual-core CPU and 1GB of RAM.
> Using the Mesos API, we have developed a Python-based framework and run it to
> completion.
> However, we found that too much time (about 5 seconds) passes between the
> messages 'Forked child with pid' and 'Got registration for executor' in the
> slave log.
> If you know how to deal with this problem, please let us know.
> I0523 17:38:16.264289  1787 slave.cpp:5208] Launching executor default of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 with resources  in work 
> directory 
> '/tmp/mesos/slaves/3fb86eea-96c4-4b07-aaa2-caf071275bdf-S2/frameworks/3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010/executors/default/runs/1c830c9a-4120-4ef0-af80-49a52d307539'
> I0523 17:38:16.290601  1789 containerizer.cpp:616] Starting container 
> '1c830c9a-4120-4ef0-af80-49a52d307539' for executor 'default' of framework 
> '3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010'
> I0523 17:38:16.293285  1787 slave.cpp:1626] Queuing task '0' for executor 
> 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010
> I0523 17:38:16.297369  1787 slave.cpp:4233] Current disk usage 2.14%. Max 
> allowed age: 6.150293798159722days
> I0523 17:38:16.504043  1789 launcher.cpp:132] Forked child with pid '1837' 
> for container '1c830c9a-4120-4ef0-af80-49a52d307539'
> I0523 17:38:21.510535  1785 slave.cpp:2573] Got registration for executor 
> 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.554608  1785 slave.cpp:1791] Sending queued task '0' to 
> executor 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 at 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.594511  1789 slave.cpp:2932] Handling status update 
> TASK_RUNNING (UUID: cd04ec2a-0e68-460a-ad2e-e4f504f3b032) for task 0 of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.600050  1789 slave.cpp:2932] Handling status update 
> TASK_FINISHED (UUID: 46e110c8-4078-4f98-ae30-30b3a1376034) for task 0 of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508




[jira] [Created] (MESOS-5440) There is a misspelling in some markdown files

2016-05-23 Thread GyeongWon, Do (JIRA)

GyeongWon, Do created MESOS-5440:


 Summary: There is a misspelling in some markdown files
 Key: MESOS-5440
 URL: https://issues.apache.org/jira/browse/MESOS-5440
 Project: Mesos
  Issue Type: Documentation
Reporter: GyeongWon, Do
Priority: Trivial


"This endpoint requires authentication {color:red}iff{color} HTTP 
authentication is enabled."
I think "iff" is a misspelling of "if". Is that right?

That statement occurs in many markdown files.







[jira] [Commented] (MESOS-3139) Incorporate CMake into standard documentation

2016-05-23 Thread Frank Scholten (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296255#comment-15296255
 ] 

Frank Scholten commented on MESOS-3139:
---

I'm trying to post a review, but it fails:

{code}
frank@franktop:~/src/mesos$ ./support/post-reviews.py 
--server=https://reviews.apache.org --tracking-branch=origin/master 
--target-groups=mesos --open
Running 'rbt post' across all of ...
0949e6be6a4260933172ea93acc4bc0592c1e2f1 - (HEAD -> MESOS-3139) Added first 
draft CMake build docs. (4 minutes ago)

Creating diff of:
0949e6be6a4260933172ea93acc4bc0592c1e2f1 - (HEAD -> MESOS-3139) Added first 
draft CMake build docs.

Press enter to continue or 'Ctrl-C' to skip.

Review request #47723 posted.

https://reviews.apache.org/r/47723/
https://reviews.apache.org/r/47723/diff/
[10746:10777:0523/133438:ERROR:nss_util.cc(839)] After loading Root Certs, 
loaded==false: NSS error code: -8018
Created new window in existing browser session.
Failed to execute: 'git commit --amend -m Added first draft CMake build docs.


Review: [10746:10777:0523/133438:ERROR:nss_util.cc(839)] After loading Root 
Certs, loaded==false: NSS error code: -8018
':
Usage: ./mesos-split.py ...
Error: No line in the commit message summary may exceed 72 characters.
{code}

> Incorporate CMake into standard documentation
> -
>
> Key: MESOS-3139
> URL: https://issues.apache.org/jira/browse/MESOS-3139
> Project: Mesos
>  Issue Type: Task
>  Components: cmake
>Reporter: Alex Clemmer
>Assignee: Alex Clemmer
>  Labels: build, cmake, mesosphere
>
> Right now it's anyone's guess how to build with CMake. If we want people to 
> use it, we should put up documentation. The central challenge is that the 
> CMake instructions will be slightly different for different platforms.
> For example, on Linux, the gist of the build is basically the same as 
> autotools; you pull down the system dependencies (like APR, _etc_.), and then:
> ```
> ./bootstrap
> mkdir build-cmake && cd build-cmake
> cmake ..
> make
> ```
> But, on Windows, it will be somewhat more complicated. There is no bootstrap 
> step, for example, because Windows doesn't have bash natively. And even when 
> we put that in, you'll still have to build the glog stuff out-of-band because 
> CMake has no way of booting up Visual Studio and calling "build."
> So practically, we need to figure out:
> * What our build story is for different platforms
> * Write specific instructions for our "core" target platforms.




[jira] [Created] (MESOS-5441) Tests fail to use mounted cgroups on Ubuntu 16.04

2016-05-23 Thread Jan Schlicht (JIRA)

Jan Schlicht created MESOS-5441:
---

 Summary: Tests fail to use mounted cgroups on Ubuntu 16.04
 Key: MESOS-5441
 URL: https://issues.apache.org/jira/browse/MESOS-5441
 Project: Mesos
  Issue Type: Bug
  Components: cgroups, tests
 Environment: Ubuntu 16.04
Reporter: Jan Schlicht


Test fixtures inheriting from {{mesos::internal::tests::ContainerizerTest}} 
fail if {{sudo ./bin/mesos-tests.sh}} is run. Here's an example from our 
internal CI:
{noformat}

[23:49:18] : [Step 10/10] [ RUN  ] SlaveRecoveryTest/0.RecoverSlaveState
[23:49:18] : [Step 10/10] ../../src/tests/mesos.cpp:864: Failure
[23:49:18] : [Step 10/10] cgroups::mount(hierarchy, subsystem): 'cpu' is 
already attached to another hierarchy
[23:49:18] : [Step 10/10] 
-
[23:49:18] : [Step 10/10] We cannot run any cgroups tests that require
[23:49:18] : [Step 10/10] a hierarchy with subsystem 'cpu'
[23:49:18] : [Step 10/10] because we failed to find an existing hierarchy
[23:49:18] : [Step 10/10] or create a new one (tried 
'/run/lxcfs/controllers/cpu').
[23:49:18] : [Step 10/10] You can either remove all existing
[23:49:18] : [Step 10/10] hierarchies, or disable this test case
[23:49:18] : [Step 10/10] (i.e., --gtest_filter=-SlaveRecoveryTest/0.*).
[23:49:18] : [Step 10/10] 
-
[23:49:18] : [Step 10/10] ../../src/tests/mesos.cpp:918: Failure
[23:49:18] : [Step 10/10] cgroups: '/run/lxcfs/controllers/cpu' is not a 
valid hierarchy
[23:49:18] : [Step 10/10] [  FAILED  ] 
SlaveRecoveryTest/0.RecoverSlaveState, where TypeParam = 
mesos::internal::slave::MesosContainerizer (11 ms)
{noformat}
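
Since the error above says that 'cpu' is already attached to another hierarchy,
one way to investigate is to list, for each cgroup subsystem, every mount point
where it appears in /proc/mounts. The Python sketch below is only an
illustration: the function name is hypothetical, the subsystem list is an
assumption based on the controllers that show up in this report, and the sample
text is a small excerpt. It reports mount points only and does not by itself
determine the cause of the test failure.

```python
from collections import defaultdict

# Assumed set of cgroup subsystem names; other mount options (rw, relatime,
# nsroot=/, ...) are ignored.
SUBSYSTEMS = {
    "blkio", "cpu", "cpuacct", "cpuset", "devices", "freezer",
    "hugetlb", "memory", "net_cls", "net_prio", "perf_event", "pids",
}


def cgroup_mount_points(mounts_text):
    """Map each cgroup subsystem to the list of mount points it is mounted at.

    mounts_text is the content of /proc/mounts, one mount per line:
    device, mount point, filesystem type, options, dump, pass.
    """
    by_subsystem = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        mount_point = fields[1]
        for option in fields[3].split(","):
            if option in SUBSYSTEMS:
                by_subsystem[option].append(mount_point)
    return dict(by_subsystem)


SAMPLE = """\
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory,nsroot=/ 0 0
memory /run/lxcfs/controllers/memory cgroup rw,relatime,memory,nsroot=/ 0 0
"""

for subsystem, points in sorted(cgroup_mount_points(SAMPLE).items()):
    print(subsystem, points)
```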
It seems that {{lxcfs}} on Ubuntu 16.04 might be causing this. {{/proc/mounts}}
looks like this:
{noformat}
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,nosuid,relatime,size=3809788k,nr_inodes=952447,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs 
rw,nosuid,noexec,relatime,size=765776k,nr_inodes=957217,mode=755 0 0
/dev/xvda1 / ext4 rw,relatime,discard,data=ordered 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,size=3828868k,nr_inodes=957217 0 0
tmpfs /run/lock tmpfs 
rw,nosuid,nodev,noexec,relatime,size=5120k,nr_inodes=957217 0 0
tmpfs /sys/fs/cgroup tmpfs 
ro,nosuid,nodev,noexec,size=3828868k,nr_inodes=957217,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup 
rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd,nsroot=/
 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/blkio cgroup 
rw,nosuid,nodev,noexec,relatime,blkio,nsroot=/ 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup 
rw,nosuid,nodev,noexec,relatime,cpu,cpuacct,nsroot=/ 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup 
rw,nosuid,nodev,noexec,relatime,net_cls,net_prio,nsroot=/ 0 0
cgroup /sys/fs/cgroup/freezer cgroup 
rw,nosuid,nodev,noexec,relatime,freezer,nsroot=/ 0 0
cgroup /sys/fs/cgroup/cpuset cgroup 
rw,nosuid,nodev,noexec,relatime,cpuset,nsroot=/ 0 0
cgroup /sys/fs/cgroup/memory cgroup 
rw,nosuid,nodev,noexec,relatime,memory,nsroot=/ 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup 
rw,nosuid,nodev,noexec,relatime,hugetlb,nsroot=/ 0 0
cgroup /sys/fs/cgroup/perf_event cgroup 
rw,nosuid,nodev,noexec,relatime,perf_event,nsroot=/ 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids,nsroot=/ 
0 0
cgroup /sys/fs/cgroup/devices cgroup 
rw,nosuid,nodev,noexec,relatime,devices,nsroot=/ 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs 
rw,relatime,fd=25,pgrp=1,timeout=0,minproto=5,maxproto=5,direct 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
/dev/xvdb /mnt ext3 rw,relatime,data=ordered 0 0
tmpfs /run/user/1000 tmpfs 
rw,nosuid,nodev,relatime,size=765780k,mode=700,uid=1000,gid=1000 0 0
tmpfs /run/lxcfs/controllers tmpfs rw,relatime,size=100k,mode=700 0 0
devices /run/lxcfs/controllers/devices cgroup rw,relatime,devices,nsroot=/ 0 0
pids /run/lxcfs/controllers/pids cgroup rw,relatime,pids,nsroot=/ 0 0
perf_event /run/lxcfs/controllers/perf_event cgroup 
rw,relatime,perf_event,nsroot=/ 0 0
hugetlb /run/lxcfs/controllers/hugetlb cgroup rw,relatime,hugetlb,nsroot=/ 0 0
memory /run/lxcfs/controllers/memory cgroup rw,relatime,memory,nsroot=/ 0 0
cpuset /run/lxcfs/controllers/cpuset cgroup rw,relatime,cpuset,nsroot=/ 0 0
freezer /run/lxcfs/controllers/freezer cgroup rw,relatime,freezer,nsroot=/ 0 0
net_cls,net_prio /run/lxcfs/controllers/net_cls

[jira] [Created] (MESOS-5442) Stuck when extracting two archive contains overlapped file structure

2016-05-23 Thread Timon Wong (JIRA)

Timon Wong created MESOS-5442:
-

 Summary: Stuck when extracting two archive contains overlapped 
file structure
 Key: MESOS-5442
 URL: https://issues.apache.org/jira/browse/MESOS-5442
 Project: Mesos
  Issue Type: Bug
  Components: fetcher
Affects Versions: 0.28.1
Reporter: Timon Wong
Priority: Minor


Provided we have two zip files:

{code}
aaa.zip:
  - conf/aaa.conf  # Overlapped file structure
  - aaa

aaa-patch.zip:
  - conf/aaa.conf  # Overlapped file structure
{code}

Then we create a marathon task for it:
{code:javascript}
{
  // ...
  "uris": [
    "http://X/aaa.zip",
    "http://X/aaa-patch.zip"
  ]
}
{code}

After `aaa.zip` is extracted, the fetcher gets stuck while trying to extract
`aaa-patch.zip`. The log ends like this:

{code}
I0522 01:23:05.618922 25041 fetcher.cpp:134] Downloading resource from 
'http://X/-patch.zip' to '/var/lib//-patch.zip'
I0522 01:23:05.624514 25041 fetcher.cpp:84] Extracting with command: unzip -d 
'/var/lib/' '/var/lib//aaa-patch.zip'
replace /var/lib//conf/aaa.conf? [y]es, [n]o, [A]ll, [N]one, 
[r]ename: 
{code}
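
The log shows the extraction waiting at an interactive overwrite prompt. As a
point of comparison, the Python sketch below extracts two archives with the
same overlapping layout without ever prompting, so that the later archive
overwrites files from the earlier one. It is only an illustration of the
expected outcome under stated assumptions: it uses Python's zipfile module
rather than the fetcher's unzip command, and the helper names and file
contents are made up.

```python
import os
import tempfile
import zipfile


def extract_all(archives, dest):
    """Extract archives in order; later archives overwrite earlier files."""
    for path in archives:
        with zipfile.ZipFile(path) as zf:
            zf.extractall(dest)


def demo():
    """Build aaa.zip and aaa-patch.zip, extract both, return conf/aaa.conf."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "aaa.zip")
        patch = os.path.join(tmp, "aaa-patch.zip")
        with zipfile.ZipFile(base, "w") as zf:
            zf.writestr("conf/aaa.conf", "original\n")
            zf.writestr("aaa", "payload\n")
        with zipfile.ZipFile(patch, "w") as zf:
            zf.writestr("conf/aaa.conf", "patched\n")
        dest = os.path.join(tmp, "sandbox")
        extract_all([base, patch], dest)
        with open(os.path.join(dest, "conf", "aaa.conf")) as f:
            return f.read()


print(repr(demo()))  # prints 'patched\n'
```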




[jira] [Updated] (MESOS-5437) AppC appc_simple_discovery_uri_prefix is lost in configuration.md

2016-05-23 Thread Jie Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5437:
--
Shepherd: Jie Yu
Story Points: 1

> AppC  appc_simple_discovery_uri_prefix is lost in configuration.md
> --
>
> Key: MESOS-5437
> URL: https://issues.apache.org/jira/browse/MESOS-5437
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.29.0
>Reporter: Guangya Liu
>Assignee: Guangya Liu
> Fix For: 0.29.0
>
>
> AppC  appc_simple_discovery_uri_prefix is lost in configuration.md




[jira] [Updated] (MESOS-5436) GPU resource broke framework data table

2016-05-23 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5436:

Attachment: after_agent_page.png
after_agent_framework_page.png

> GPU resource broke framework data table
> ---
>
> Key: MESOS-5436
> URL: https://issues.apache.org/jira/browse/MESOS-5436
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: gpu
> Attachments: after_agent_framework_page.png, after_agent_page.png, 
> incorrect_agent_framework_page.png, incorrect_agent_page.png
>
>
> In agent_framework.html and master/static/agent.html, we added {{GPUs (Used /
> Allocated)}} to the table header, but we didn't add the corresponding column to
> the table body.
> In addition, we don't provide statistics for GPUs on the monitor endpoints.
> To show that data in the webui, we first need to implement GPU statistics in
> the monitor endpoints.




[jira] [Updated] (MESOS-5413) `network/cni` isolator should skip the bind mounting of the CNI network information root directory if possible

2016-05-23 Thread Jie Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5413:
--
Sprint: Mesosphere Sprint 35

> `network/cni` isolator should skip the bind mounting of the CNI network 
> information root directory if possible
> --
>
> Key: MESOS-5413
> URL: https://issues.apache.org/jira/browse/MESOS-5413
> Project: Mesos
>  Issue Type: Bug
>Reporter: Qian Zhang
>Assignee: Qian Zhang
> Fix For: 0.29.0
>
>
> Currently, in the create() method of the `network/cni` isolator, we do a self
> bind mount of the CNI network information root directory (i.e.,
> {{/var/run/mesos/isolators/network/cni}}) and make sure it is a shared mount
> in its own peer group. However, we should not do a self bind mount if the
> mount containing the CNI network information root directory is already a
> shared mount in its own shared peer group, just like what we did for the
> `filesystem/linux` isolator in
> [MESOS-5239 | https://issues.apache.org/jira/browse/MESOS-5239].
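
The check described above (is the mount containing a directory already a shared
mount?) can be sketched by reading the optional "shared:N" field of
/proc/self/mountinfo. The Python below is a minimal illustration under stated
assumptions, not the isolator's implementation: the function names are
hypothetical, the sample mountinfo text is made up, and it does not check which
peer group the mount belongs to.

```python
def parse_mountinfo(text):
    """Return (mount_point, is_shared) for each line of mountinfo text.

    Each line has the form:
    ID parent major:minor root mount_point options [optional...] - type src opts
    where the optional fields may include "shared:N".
    """
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue
        separator = fields.index("-")
        optional = fields[6:separator]
        shared = any(field.startswith("shared:") for field in optional)
        mounts.append((fields[4], shared))
    return mounts


def containing_mount_is_shared(path, mounts):
    """Find the longest mount point that contains path; report if it is shared."""
    best = None
    for mount_point, shared in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # ">=" lets a later mount on the same mount point take precedence.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, shared)
    if best is None:
        raise ValueError("no mount contains %r" % path)
    return best[1]


# Made-up sample: "/" is a shared mount, "/run" is private.
SAMPLE = """\
22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
23 22 0:20 / /run rw,nosuid - tmpfs tmpfs rw
"""

MOUNTS = parse_mountinfo(SAMPLE)
print(containing_mount_is_shared("/var/run/mesos/isolators/network/cni", MOUNTS))
print(containing_mount_is_shared("/run/mesos", MOUNTS))
```

With the sample above, the first path falls under the shared "/" mount, so a
self bind mount would be unnecessary, while the second falls under the private
"/run" mount.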




[jira] [Updated] (MESOS-5413) `network/cni` isolator should skip the bind mounting of the CNI network information root directory if possible

2016-05-23 Thread Jie Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5413:
--
Story Points: 3


[jira] [Updated] (MESOS-5436) GPU resource broke framework data table

2016-05-23 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5436:

Attachment: (was: after_agent_page.png)

> GPU resource broke framework data table
> ---
>
> Key: MESOS-5436
> URL: https://issues.apache.org/jira/browse/MESOS-5436
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: gpu
> Attachments: incorrect_agent_page.png
>
>
> In agent_framework.html and master/static/agent.html, we added {{GPUs (Used / 
> Allocated)}} to the table header, but we didn't add the corresponding column 
> to the table body.
> In addition, we don't provide statistics for GPUs on the monitor endpoints.
> To provide this data in the web UI, we first need to implement GPU statistics 
> in the monitor endpoints.




[jira] [Updated] (MESOS-5436) GPU resource broke framework data table

2016-05-23 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5436:

Attachment: (was: incorrect_agent_framework_page.png)


[jira] [Updated] (MESOS-5436) GPU resource broke framework data table

2016-05-23 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5436:

Attachment: (was: after_agent_framework_page.png)


[jira] [Updated] (MESOS-5436) GPU resource broke framework data table

2016-05-23 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5436:

Attachment: incorrect_agent_page.png
incorrect_agent_framework_page.png
after_agent_page.png
after_agent_framework_page.png

> GPU resource broke framework data table
> ---
>
> Key: MESOS-5436
> URL: https://issues.apache.org/jira/browse/MESOS-5436
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: gpu
> Attachments: after_agent_framework_page.png, after_agent_page.png, 
> incorrect_agent_framework_page.png, incorrect_agent_page.png
>
>
> In agent_framework.html and master/static/agent.html, we added {{GPUs (Used / 
> Allocated)}} to the table header, but we didn't add the corresponding column 
> to the table body.
> In addition, we don't provide statistics for GPUs on the monitor endpoints.
> To provide this data in the web UI, we first need to implement GPU statistics 
> in the monitor endpoints.




[jira] [Updated] (MESOS-5436) GPU resource broke framework data table

2016-05-23 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5436:

Attachment: (was: incorrect_agent_page.png)


[jira] [Commented] (MESOS-5442) Stuck when extracting two archive contains overlapped file structure

2016-05-23 Thread Jie Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296651#comment-15296651
 ] 

Jie Yu commented on MESOS-5442:
---

I just backported the patch to 0.28.x branch.

> Stuck when extracting two archive contains overlapped file structure
> 
>
> Key: MESOS-5442
> URL: https://issues.apache.org/jira/browse/MESOS-5442
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Affects Versions: 0.28.1
>Reporter: Timon Wong
>Priority: Minor
>
> Provided we have two zip files:
> {code}
> aaa.zip:
>   - conf/aaa.conf  # Overlapped file structure
>   - aaa
> aaa-patch.zip:
>   - conf/aaa.conf  # Overlapped file structure
> {code}
> Then we create a marathon task for it:
> {code:javascript}
> {
>   // ...
>   "uris": [
>     "http://X/aaa.zip",
>     "http://X/aaa-patch.zip"
>   ]
> }
> {code}
> After `aaa.zip` is extracted, the fetcher gets stuck while trying to extract 
> `aaa-patch.zip`; the log ends like this:
> {code}
> I0522 01:23:05.618922 25041 fetcher.cpp:134] Downloading resource from 
> 'http://X/-patch.zip' to '/var/lib//-patch.zip'
> I0522 01:23:05.624514 25041 fetcher.cpp:84] Extracting with command: unzip -d 
> '/var/lib/' '/var/lib//aaa-patch.zip'
> replace /var/lib//conf/aaa.conf? [y]es, [n]o, [A]ll, [N]one, 
> [r]ename: 
> {code}




[jira] [Updated] (MESOS-4885) Unzip should force overwrite

2016-05-23 Thread Jie Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4885:
--
Fix Version/s: 0.28.2

> Unzip should force overwrite
> 
>
> Key: MESOS-4885
> URL: https://issues.apache.org/jira/browse/MESOS-4885
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Reporter: Tomasz Janiszewski
>Assignee: Tomasz Janiszewski
>Priority: Trivial
> Fix For: 0.29.0, 0.28.2
>
>
> Consider the situation where a zip file is malformed and contains duplicated 
> files. When the fetcher downloads such a zip file (e.g., dist zips generated 
> by gradle could have duplicated files in the libs dir) and tries to 
> uncompress it, the deployment hangs in the staged phase because unzip prompts 
> whether the file should be replaced. unzip should overwrite this file or 
> fail with an error.
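
The behaviour requested above can be sketched as follows. This is only an illustration in Python, not the Mesos fetcher code (which is C++ and whose actual fix may differ); the function names `build_unzip_command` and `extract` are hypothetical. The sketch relies on the Info-ZIP `unzip` option `-o`, which overwrites existing files without prompting, and it closes stdin so that an unexpected prompt fails instead of hanging.

```python
import subprocess
from typing import List


def build_unzip_command(archive: str, destination: str) -> List[str]:
    """Build an unzip invocation that never prompts (hypothetical helper).

    -o: overwrite existing files without asking.
    -d: directory to extract into.
    """
    return ["unzip", "-o", "-d", destination, archive]


def extract(archive: str, destination: str) -> None:
    """Run unzip non-interactively; raises CalledProcessError on failure."""
    subprocess.run(
        build_unzip_command(archive, destination),
        stdin=subprocess.DEVNULL,  # no terminal input, so a prompt cannot block
        check=True,
    )
```

With this shape, extracting two archives that share a path leaves the copy from the later archive in place, and a failing unzip surfaces as an error rather than a stuck task.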




[jira] [Updated] (MESOS-5431) Update the website generation and development workflows with docker.

2016-05-23 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-5431:

Attachment: website.gif

> Update the website generation and development workflows with docker.
> 
>
> Key: MESOS-5431
> URL: https://issues.apache.org/jira/browse/MESOS-5431
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
> Attachments: website.gif
>
>
> This follows the discussion thread [Readme update | 
> http://search-hadoop.com/m/0Vlr6JIzkd2QAk85&subj=Re+WEBSITE+Readme+update].
> From [~vinodkone]'s and [~klueska]'s comments:
> {quote}
> On Fri, May 20, 2016 at 9:00 AM, haosdent <[EMAIL PROTECTED]> wrote:
> yes. maybe update the rake target ":default" target to also do doxygen and
> javadoc tasks?
> {quote}
> {quote}
> While we are fixing the dockerfile for the website, can I also request that
> we update the docker file to not muck up the mesos source directory that
> gets mounted in? Right now, the end result of 'docker run'  is a "publish"
> directory and a "documentation" directory inside "mesos" directory, which
> means I need to clean those up manually later.
> For "publish", I would like us to mount a source directory (mesos) and a
> publish directory, like so:
> sudo docker run -it --rm -p 4567:4567 -v :/mesos -v
> :/publish mesos/website
> Then as a committer, I would set  to the publish folder of
> my svn clone of the site (e.g., ~/workspace/site/publish).
> For "documentation", I would like the Dockerfile to delete it during exit.
> {quote}
> We need to implement the things above to make it more convenient to develop 
> and generate the website.




[jira] [Commented] (MESOS-5427) Mesos master locks up after slave fails to authenticate

2016-05-23 Thread analogue (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296744#comment-15296744
 ] 

analogue commented on MESOS-5427:
-

Yes, this is running on Ubuntu Lucid 10.04 LTS :(  Planning to upgrade, but 
just wanted to leave some breadcrumbs regarding this failure if others happen 
to run into something similar.

> Mesos master locks up after slave fails to authenticate
> ---
>
> Key: MESOS-5427
> URL: https://issues.apache.org/jira/browse/MESOS-5427
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.20.1
> Environment: Linux XX-X 3.13.0-49-generic #81-Ubuntu SMP 
> Tue Mar 24 19:29:48 UTC 2015 x86_64 GNU/Linux
> Ubuntu 10.04.1 LTS
> AWS/8cores/16GB
>Reporter: analogue
>Priority: Minor
>
> In a mesos master cluster with one leader and two backups, a single slave 
> attempting to authenticate with the leader locked up the master and resulted 
> in 2 CPU cores pegged at 100% CPU usage until restarted.
> master
> {noformat}
> I0516 02:55:39.945566 32126 master.cpp:3612] Authenticating 
> slave(1)@10.85.20.76:5051
> I0516 02:55:39.945757 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.945802 32123 authenticator.hpp:156] Creating new server SASL 
> connection
> I0516 02:55:39.945991 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946030 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946063 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946095 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946126 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946158 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946189 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946221 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946252 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946285 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946316 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946347 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> I0516 02:55:39.946379 32126 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> ...
> W0516 02:55:44.945811 32124 master.cpp:3670] Authentication timed out
> I0516 02:55:49.290623 32121 master.cpp:3598] Queuing up authentication 
> request from slave(1)@10.85.20.76:5051 because authentication is still in 
> progress
> (last long line repeats until mesos-master restarted)
> {noformat}
> slave
> {noformat}
> Log file created at: 2016/05/16 02:37:52
> Running on machine: 10-85-20-76-uswest2btestopia
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> I0516 02:37:52.112509 10198 logging.cpp:142] INFO level logging started!
> I0516 02:37:52.112761 10198 main.cpp:126] Build: 2014-12-12 00:52:32 by
> I0516 02:37:52.112772 10198 main.cpp:128] Version: 0.20.1
> I0516 02:37:52.112778 10198 main.cpp:131] Git tag: 0.20.1
> I0516 02:37:52.112783 10198 main.cpp:135] Git SHA: 
> fe0a39112f3304283f970f1b08b322b1e970829d
> I0516 02:37:52.112793 10198 containerizer.cpp:89] Using isolation: 
> cgroups/cpu,cgroups/mem
> I0516 02:37:52.125773 10198 linux_launcher.cpp:78] Using 
> /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> I0516 02:37:52.126652 10198 main.cpp:149] Starting Mesos slave
> I0516 02:37:52.128687 10246 slave.cpp:167] Slave started on 
> 1)@10.85.20.76:5051
> I0516 02:37:52.128708 10246 credentials.hpp:84] Loading credential

[jira] [Commented] (MESOS-5439) registerExecutor problem

2016-05-23 Thread Joseph Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296759#comment-15296759
 ] 

Joseph Wu commented on MESOS-5439:
--

A couple questions:
* How many tasks are you launching at once?  (i.e. from a single offer)  And 
how many over a given time?
* Are you using the default command executor?  Or are you launching a custom 
executor?
* What flags are you using to launch the agent?
* What do the executor's stdout/stderr files (in the sandbox) say?  There 
should be glog logs in there too.

> registerExecutor problem
> 
>
> Key: MESOS-5439
> URL: https://issues.apache.org/jira/browse/MESOS-5439
> Project: Mesos
>  Issue Type: Bug
>  Components: c++ api, slave
>Affects Versions: 0.27.0
>Reporter: kimjoohwan
>
> Currently, we are using Mesos 0.27.0. The master is built with an Intel(R) 
> Core(TM) i5-3470 CPU @ 3.20GHz and 4GB of RAM. The slave (Banana PI) is built 
> with a Cortex-A7 dual-core CPU and 1GB of RAM.
> Using the Mesos API, we have developed and run a framework written in Python.
> However, we found that too much time (about 5 seconds) passes between the 
> messages 'Forked child with pid' and 'Got registration for executor' in the 
> slave log.
> If you know how to deal with this problem, please let us know.
> I0523 17:38:16.264289  1787 slave.cpp:5208] Launching executor default of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 with resources  in work 
> directory 
> '/tmp/mesos/slaves/3fb86eea-96c4-4b07-aaa2-caf071275bdf-S2/frameworks/3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010/executors/default/runs/1c830c9a-4120-4ef0-af80-49a52d307539'
> I0523 17:38:16.290601  1789 containerizer.cpp:616] Starting container 
> '1c830c9a-4120-4ef0-af80-49a52d307539' for executor 'default' of framework 
> '3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010'
> I0523 17:38:16.293285  1787 slave.cpp:1626] Queuing task '0' for executor 
> 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010
> I0523 17:38:16.297369  1787 slave.cpp:4233] Current disk usage 2.14%. Max 
> allowed age: 6.150293798159722days
> I0523 17:38:16.504043  1789 launcher.cpp:132] Forked child with pid '1837' 
> for container '1c830c9a-4120-4ef0-af80-49a52d307539'
> I0523 17:38:21.510535  1785 slave.cpp:2573] Got registration for executor 
> 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.554608  1785 slave.cpp:1791] Sending queued task '0' to 
> executor 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 at 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.594511  1789 slave.cpp:2932] Handling status update 
> TASK_RUNNING (UUID: cd04ec2a-0e68-460a-ad2e-e4f504f3b032) for task 0 of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.600050  1789 slave.cpp:2932] Handling status update 
> TASK_FINISHED (UUID: 46e110c8-4078-4f98-ae30-30b3a1376034) for task 0 of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508
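
The delay described in this report can be measured directly from the two log lines. The following is a small Python sketch, not part of Mesos; the helper name `glog_time` is hypothetical, and the two sample lines are shortened copies of the slave log above. It assumes the usual glog prefix of a severity letter followed by `mmdd hh:mm:ss.uuuuuu`.

```python
from datetime import datetime


def glog_time(line: str) -> datetime:
    """Parse the timestamp at the start of a glog line (hypothetical helper).

    glog lines carry no year, so strptime falls back to 1900; that is fine
    for the difference between two lines logged on the same day.
    """
    severity_and_date, clock = line.split()[:2]
    stamp = severity_and_date[1:] + " " + clock  # drop the severity letter
    return datetime.strptime(stamp, "%m%d %H:%M:%S.%f")


forked = "I0523 17:38:16.504043  1789 launcher.cpp:132] Forked child with pid '1837'"
registered = "I0523 17:38:21.510535  1785 slave.cpp:2573] Got registration for executor"

# Seconds between forking the executor and receiving its registration.
delay = (glog_time(registered) - glog_time(forked)).total_seconds()
print(f"{delay:.6f} s")
```

For the lines quoted in this issue the difference is a little over five seconds, which matches the reporter's estimate.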




[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-23 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296781#comment-15296781
 ] 

haosdent commented on MESOS-5430:
-

[~jmanalus] Thank you very much for your wonderful design! The current 
[website | http://mesos.apache.org/] has a news section. This has been 
removed in the new design, right?

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org




[jira] [Updated] (MESOS-5436) GPU resource broke framework data table

2016-05-23 Thread Kevin Klues (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-5436:
---
Shepherd: Benjamin Mahler
  Sprint: Mesosphere Sprint 35
Story Points: 1


[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-23 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296812#comment-15296812
 ] 

Vinod Kone commented on MESOS-5430:
---

+1 love the design.

[~haosd...@gmail.com] I think the idea is to move news to the "Blog" section.


[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-23 Thread Jonathan Manalus (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296905#comment-15296905
 ] 

Jonathan Manalus commented on MESOS-5430:
-

[~haosd...@gmail.com]  - [~vinod] is exactly right about moving everything to the 
blog section that now lives in the navbar. 

Currently, the news posts from the last year are mostly changelogs. You will 
be able to see the most recent changelog post listed under the download section. 
But to answer your question: yes, it has been removed from the homepage. 


[jira] [Updated] (MESOS-5428) Update the mechanism to define flags in FlagsBase derived clases

2016-05-23 Thread Daniel Pravat (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Pravat updated MESOS-5428:
-
Description: 
If a program exposes flags, the recommendation from Mesos has been to derive a 
class from FlagsBase and add the new flags in its constructor.
As a benefit, the new `Flags` class `inherits` all the flags from the derived 
classes.
Each derived class calls the method `add` implemented in `FlagsBase`, which uses 
`dynamic_cast` to set the default value and other things.

To use `FlagsBase` in Visual Studio, we should disable construction 
displacements using the `/vd2` compile option. 
More info: https://msdn.microsoft.com/en-us/library/7sf3txa8.aspx

  was:
If a program exposes flags, the recommendation from Mesos has been to derive a 
class from FlagsBase and add the new flags in its constructor.
As a benefit, the new `Flags` class `inherits` all the flags from the derived 
classes.
Each derived class calls the method `add` implemented in `FlagsBase`, which uses 
`dynamic_cast` to set the default value and other things.

Since the class is not fully constructed while its constructor is still 
running (in Visual Studio the vtable is not correct at that time), the code 
does not work on Windows.
We would have to call a separate method on Windows.


> Update the mechanism to define flags in FlagsBase derived clases
> 
>
> Key: MESOS-5428
> URL: https://issues.apache.org/jira/browse/MESOS-5428
> Project: Mesos
>  Issue Type: Bug
>Reporter: Daniel Pravat
>
> If a program exposes flags, the recommendation from Mesos has been to derive 
> a class from FlagsBase and add the new flags in its constructor.
> As a benefit, the new `Flags` class `inherits` all the flags from the derived 
> classes.
> Each derived class calls the method `add` implemented in `FlagsBase`, which 
> uses `dynamic_cast` to set the default value and other things.
> To use `FlagsBase` in Visual Studio, we should disable construction 
> displacements using the `/vd2` compile option. 
> More info: https://msdn.microsoft.com/en-us/library/7sf3txa8.aspx




[jira] [Updated] (MESOS-5428) Update the mechanism to define flags in FlagsBase derived clases

2016-05-23 Thread Daniel Pravat (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Pravat updated MESOS-5428:
-
Description: 
If a program exposes flags, the recommendation from Mesos has been to derive a 
class from FlagsBase and add the new flags in its constructor.
As a benefit, the new `Flags` class `inherits` all the flags from the derived 
classes.
Each derived class calls the method `add` implemented in `FlagsBase`, which uses 
`dynamic_cast` to set the default value and other things.

To use `FlagsBase` derived classes in Visual Studio, we should disable 
construction displacements using the `/vd2` compile option. 
More info: https://msdn.microsoft.com/en-us/library/7sf3txa8.aspx

  was:
If a program exposes flags, the recommendation from Mesos has been to derive a 
class from FlagsBase and add the new flags in its constructor.
As a benefit, the new `Flags` class `inherits` all the flags from the derived 
classes.
Each derived class calls the method `add` implemented in `FlagsBase`, which uses 
`dynamic_cast` to set the default value and other things.

To use `FlagsBase` in Visual Studio, we should disable construction 
displacements using the `/vd2` compile option. 
More info: https://msdn.microsoft.com/en-us/library/7sf3txa8.aspx



[jira] [Updated] (MESOS-5420) Implement os::exists for processes

2016-05-23 Thread Daniel Pravat (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Pravat updated MESOS-5420:
-
Description: os::exists returns true if the process identified by the 
parameter is still running, or was running and we are able to get information 
about it, such as the exit code. On Windows, after obtaining a handle to the 
process, it is possible to perform those operations.   (was: os::exists returns 
true if the process identified by the parameter is still running or was 
running. On Windows, the subprocess class keeps an open handle to the process, 
allowing ReaperProcess::reap to get the exit code even if the process is 
terminated.)

> Implement os::exists for processes
> --
>
> Key: MESOS-5420
> URL: https://issues.apache.org/jira/browse/MESOS-5420
> Project: Mesos
>  Issue Type: Improvement
> Environment: Windows
>Reporter: Daniel Pravat
>Assignee: Daniel Pravat
>
> os::exists returns true if the process identified by the parameter is still 
> running, or was running and we are able to get information about it, such as 
> the exit code. On Windows, after obtaining a handle to the process, it is 
> possible to perform those operations. 




[jira] [Updated] (MESOS-5362) Add authentication to example frameworks

2016-05-23 Thread Greg Mann (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-5362:
-
Sprint:   (was: Mesosphere Sprint 35)

> Add authentication to example frameworks
> 
>
> Key: MESOS-5362
> URL: https://issues.apache.org/jira/browse/MESOS-5362
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authentication, mesosphere, security
>
> Some example frameworks do not have the ability to authenticate with the 
> master. Adding authentication to the example frameworks that don't already 
> have it implemented would allow us to use these frameworks for testing in 
> authenticated/authorized scenarios.




[jira] [Commented] (MESOS-2013) Slave read endpoint doesn't encode non-ascii characters correctly

2016-05-23 Thread Whitney Sorenson (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297085#comment-15297085
 ] 

Whitney Sorenson commented on MESOS-2013:
-

We have just deployed 0.28.1 and I can report, that yes, this does help our 
case. Thank you. We have to do some processing ourselves to handle utf-8 
characters which have been chopped, but it is something we can work around.

As a note, earlier, when I asked if we should find a competent C++ developer I 
meant to imply someone besides myself - not that anyone else was incompetent ;) 

> Slave read endpoint doesn't encode non-ascii characters correctly
> -
>
> Key: MESOS-2013
> URL: https://issues.apache.org/jira/browse/MESOS-2013
> Project: Mesos
>  Issue Type: Bug
>  Components: json api
>Reporter: Whitney Sorenson
>Assignee: Anand Mazumdar
>
> Create a file in a sandbox with a non-ascii character, like this one: 
> http://www.fileformat.info/info/unicode/char/2018/index.htm
> Hit the read endpoint for that file.
> The response will have something like: 
> data: "\u00E2\u0080\u0098"
> It should actually be:
> data: "\u2018"
> If you put either into JSON.parse() in the browser you will see the first 
> does not render correctly but the second does.




[jira] [Created] (MESOS-5443) Remove "const" for some primitive types in function parameters

2016-05-23 Thread Guangya Liu (JIRA)

Guangya Liu created MESOS-5443:
--

 Summary: Remove "const" for some primitive types in function 
parameters
 Key: MESOS-5443
 URL: https://issues.apache.org/jira/browse/MESOS-5443
 Project: Mesos
  Issue Type: Bug
Reporter: Guangya Liu
Priority: Minor


It is not recommended to use `const` for a primitive type in a function 
parameter.

There are some cases of this, such as the `bool` here; we should scan for and 
remove those uses.

https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/isolators/cgroups/mem.cpp#L75




[jira] [Commented] (MESOS-4248) mesos slave can't start in CentOS-7 docker container

2016-05-23 Thread Shane da Silva (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297391#comment-15297391
 ] 

Shane da Silva commented on MESOS-4248:
---

FWIW, we can't reproduce this issue on Mesos 0.26.0, but we do hit it on 0.27.2 
and 0.28.1.

It would be great for someone to review Yubao's patch and consider merging it, 
as it's convenient to be able to run Mesos in a container for integration 
testing. For example, in the Chef ecosystem using test-kitchen with 
kitchen-docker to quickly spin up pseudo-"VMs" is common practice.

 [~liuyb]: did you by chance ever find a workaround for this issue?

> mesos slave can't start in CentOS-7 docker container
> 
>
> Key: MESOS-4248
> URL: https://issues.apache.org/jira/browse/MESOS-4248
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.26.0
> Environment: My host OS is Debian Jessie,  the container OS is CentOS 
> 7.2.
> {code}
> # cat /etc/system-release
> CentOS Linux release 7.2.1511 (Core) 
> # rpm -qa |grep mesos
> mesosphere-zookeeper-3.4.6-0.1.20141204175332.centos7.x86_64
> mesosphere-el-repo-7-1.noarch
> mesos-0.26.0-0.2.145.centos701406.x86_64
> $ docker version
> Client:
>  Version:  1.9.1
>  API version:  1.21
>  Go version:   go1.4.2
>  Git commit:   a34a1d5
>  Built:Fri Nov 20 12:59:02 UTC 2015
>  OS/Arch:  linux/amd64
> Server:
>  Version:  1.9.1
>  API version:  1.21
>  Go version:   go1.4.2
>  Git commit:   a34a1d5
>  Built:Fri Nov 20 12:59:02 UTC 2015
>  OS/Arch:  linux/amd64
> {code}
>Reporter: Yubao Liu
>
> // Check the "Environment" label above for kinds of software versions.
> "systemctl start mesos-slave" can't start mesos-slave:
> {code}
> # journalctl -u mesos-slave
> 
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Started Mesos Slave.
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Starting Mesos Slave...
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210180 12838 
> logging.cpp:172] INFO level logging started!
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210603 12838 
> main.cpp:190] Build: 2015-12-16 23:06:16 by root
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210625 12838 
> main.cpp:192] Version: 0.26.0
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210634 12838 
> main.cpp:195] Git tag: 0.26.0
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210644 12838 
> main.cpp:199] Git SHA: d3717e5c4d1bf4fca5c41cd7ea54fae489028faa
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210765 12838 
> containerizer.cpp:142] Using isolation: posix/cpu,posix/mem,filesystem/posix
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.215638 12838 
> linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy 
> for the Linux launcher
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.220279 12838 
> systemd.cpp:128] systemd version `219` detected
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.227017 12838 
> systemd.cpp:210] Started systemd slice `mesos_executors.slice`
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: Failed to create a 
> containerizer: Could not create MesosContainerizer: Failed to create 
> launcher: Failed to locate systemd cgroups hierarchy: does not exist
> Dec 24 10:35:25 mesos-slave1 systemd[1]: mesos-slave.service: main process 
> exited, code=exited, status=1/FAILURE
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Unit mesos-slave.service entered 
> failed state.
> Dec 24 10:35:25 mesos-slave1 systemd[1]: mesos-slave.service failed.
> {code}
> I used strace to debug it: mesos-slave tried to access 
> "/sys/fs/cgroup/systemd/mesos_executors.slice", but the directory is actually at 
> "/sys/fs/cgroup/systemd/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope/mesos_executors.slice/".
> mesos-slave should check "/proc/self/cgroup" to find those intermediate 
> directories:
> {code}
> # cat /proc/self/cgroup 
> 8:perf_event:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 7:blkio:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 6:net_cls,net_prio:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 5:freezer:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 4:devices:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 3:cpu,cpuacct:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 2:cpuset:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
>
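
The lookup suggested in the report could be sketched roughly as follows. The function parses text in the `/proc/self/cgroup` format shown above (`<id>:<subsystems>:<path>`) and returns the path for one subsystem. `cgroupPathFor` is a hypothetical helper, not the Mesos API. On cgroup v1 the systemd hierarchy is normally listed under the name `name=systemd`; that line is cut off in the excerpt above, so treat it as an assumption.

```cpp
#include <sstream>
#include <string>

// Returns the cgroup path listed for `subsystem`, or an empty string if the
// subsystem does not appear in `contents`.
std::string cgroupPathFor(const std::string& contents,
                          const std::string& subsystem)
{
  std::istringstream lines(contents);
  std::string line;
  while (std::getline(lines, line)) {
    // Each line has the form "<id>:<subsystems>:<path>".
    const std::string::size_type first = line.find(':');
    if (first == std::string::npos) continue;
    const std::string::size_type second = line.find(':', first + 1);
    if (second == std::string::npos) continue;

    // <subsystems> is a comma-separated list, e.g. "cpu,cpuacct".
    std::istringstream names(line.substr(first + 1, second - first - 1));
    std::string name;
    while (std::getline(names, name, ',')) {
      if (name == subsystem) {
        return line.substr(second + 1);
      }
    }
  }
  return "";
}
```

The returned path would then be appended to the hierarchy's mount point before looking for `mesos_executors.slice`.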

[jira] [Commented] (MESOS-2346) Docker tasks exiting normally, but returning TASK_FAILED

2016-05-23 Thread Eran Withana (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297474#comment-15297474
 ] 

Eran Withana commented on MESOS-2346:
-

I also experienced the same issue over the weekend.

```
I0522 18:02:54.606420  2848 slave.cpp:3002] Handling status update 
TASK_FINISHED (UUID: fe58f9f9-830f-42b9-b0d1-4a1c14fb5997) for task 
ct:146394000:0:Victimized Job: of framework 
20150526-223237-3758129930-5050-6543-0001 from executor(1)@x.x.x.x:52669
I0522 18:02:54.606534  2848 slave.cpp:3528] executor(1)@x.x.x.x:52669 exited
I0522 18:02:54.606561  2848 slave.cpp:3886] Executor 
'ct:146394000:0:Victimized Job:' of framework 
20150526-223237-3758129930-5050-6543-0001 exited with status 0
I0522 18:02:54.606608  2848 slave.cpp:3002] Handling status update TASK_FAILED 
(UUID: 4f5f97d7-0134-426a-a77a-ba1042dfa0cc) for task 
ct:146394000:0:Victimized Job: of framework 
20150526-223237-3758129930-5050-6543-0001 from @0.0.0.0:0
```

OS: Ubuntu 14.04
Mesos: 0.28.0-2.0.16.ubuntu1404
Docker: 1.8.3-0~trusty

We didn't have this issue before, but it started happening all of a sudden over 
the weekend. The issue seems to be gone for now, but we would like to know what 
caused it.

> Docker tasks exiting normally, but returning TASK_FAILED
> 
>
> Key: MESOS-2346
> URL: https://issues.apache.org/jira/browse/MESOS-2346
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.22.0
>Reporter: Brenden Matthews
>Priority: Critical
>
> Docker tasks which exit normally will return TASK_FAILED, as opposed to 
> TASK_FINISHED. This problem seems to occur only after `mesos-slave` has been 
> running for some time. If the slave is restarted, it will begin returning 
> TASK_FINISHED correctly.
> Sample slave log:
> {noformat}
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.483464   798 slave.cpp:1138] Got assigned task 
> ct:1423696932164:2:canary: for framework 
> 20150211-045421-1401302794-5050-714-0001
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.483667   798 slave.cpp:3854] Checkpointing FrameworkInfo to 
> '/tmp/mesos/meta/slaves/20150211-045421-1401302794-5050-714-S0/frameworks/20150211-045421-1401302794-5050-714-0001/framework.info'
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.483894   798 slave.cpp:3861] Checkpointing framework pid 
> 'scheduler-f4679749-d7ad-4d8c-b610-f7043332d243@10.102.188.213:56385' to 
> '/tmp/mesos/meta/slaves/20150211-045421-1401302794-5050-714-S0/frameworks/20150211-045421-1401302794-5050-714-0001/framework.pid'
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.484426   798 gc.cpp:84] Unscheduling 
> '/tmp/mesos/slaves/20150211-045421-1401302794-5050-714-S0/frameworks/20150211-045421-1401302794-5050-714-0001'
>  from gc
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.484648   797 gc.cpp:84] Unscheduling 
> '/tmp/mesos/meta/slaves/20150211-045421-1401302794-5050-714-S0/frameworks/20150211-045421-1401302794-5050-714-0001'
>  from gc
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.484748   797 slave.cpp:1253] Launching task 
> ct:1423696932164:2:canary: for framework 
> 20150211-045421-1401302794-5050-714-0001
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.485697   797 slave.cpp:4297] Checkpointing ExecutorInfo to 
> '/tmp/mesos/meta/slaves/20150211-045421-1401302794-5050-714-S0/frameworks/20150211-045421-1401302794-5050-714-0001/executors/ct:1423696932164:2:canary:/executor.info'
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.485999   797 slave.cpp:3929] Launching executor 
> ct:1423696932164:2:canary: of framework 
> 20150211-045421-1401302794-5050-714-0001 in work directory 
> '/tmp/mesos/slaves/20150211-045421-1401302794-5050-714-S0/frameworks/20150211-045421-1401302794-5050-714-0001/executors/ct:1423696932164:2:canary:/runs/5395b133-d10d-4204-999e-4a38c03c55f5'
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.486212   797 slave.cpp:4320] Checkpointing TaskInfo to 
> '/tmp/mesos/meta/slaves/20150211-045421-1401302794-5050-714-S0/frameworks/20150211-045421-1401302794-5050-714-0001/executors/ct:1423696932164:2:canary:/runs/5395b133-d10d-4204-999e-4a38c03c55f5/tasks/ct:1423696932164:2:canary:/task.info'
> Feb 11 23:22:13 ip-10-102-188-213.ec2.internal mesos-slave[793]: I0211 
> 23:22:13.509457   797 slave.cpp:1376] Queuing task 
> 'ct:1423696932164:2:canary:' for executor ct:1423696932164:2:canary: of 
> framework '20150211-045421-1401302794-5050-714-0001
> Feb 11 23:22:13 ip-10-102-188-213.ec2.

[jira] [Created] (MESOS-5444) agent state endpoint misses framework principal field

2016-05-23 Thread haosdent (JIRA)

haosdent created MESOS-5444:
---

 Summary: agent state endpoint misses framework principal field
 Key: MESOS-5444
 URL: https://issues.apache.org/jira/browse/MESOS-5444
 Project: Mesos
  Issue Type: Bug
Reporter: haosdent
Assignee: haosdent


Found by [~deshna] in https://reviews.apache.org/r/47702/ 

When a framework is launched with a principal, the agent's state endpoint does 
not show the framework's principal.




[jira] [Commented] (MESOS-5440) There is a misspelling in some markdown files

2016-05-23 Thread GyeongWon, Do (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297559#comment-15297559
 ] 

GyeongWon, Do commented on MESOS-5440:
--

Thank you!

> There is a misspelling in some markdown files
> -
>
> Key: MESOS-5440
> URL: https://issues.apache.org/jira/browse/MESOS-5440
> Project: Mesos
>  Issue Type: Documentation
>Reporter: GyeongWon, Do
>Priority: Trivial
>
> "This endpoint requires authentication {color:red}iff{color} HTTP 
> authentication is enabled."
> I think "iff" is a misspelling of "if"; is that right?
> That statement occurs in many markdown files.




[jira] [Created] (MESOS-5445) Allow libprocess/stout to build without first doing `make` in 3rdparty.

2016-05-23 Thread Kapil Arya (JIRA)

Kapil Arya created MESOS-5445:
-

 Summary: Allow libprocess/stout to build without first doing 
`make` in 3rdparty.
 Key: MESOS-5445
 URL: https://issues.apache.org/jira/browse/MESOS-5445
 Project: Mesos
  Issue Type: Bug
  Components: build
Reporter: Kapil Arya
Assignee: Kapil Arya
 Fix For: 0.29.0


After the 3rdparty reorg, libprocess/stout are unable to build their 
dependencies, so one has to run `make` in 3rdparty/ before building 
libprocess/stout.




[jira] [Commented] (MESOS-5359) The scheduler library should have a delay before initiating a connection with master.

2016-05-23 Thread JIRA


[ 
https://issues.apache.org/jira/browse/MESOS-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297581#comment-15297581
 ] 

José Guilherme Vanz commented on MESOS-5359:


Cool! Thanks [~anandmazumdar] for the code pointer.

Should this delay be configurable by some flag?

> The scheduler library should have a delay before initiating a connection with 
> master.
> -
>
> Key: MESOS-5359
> URL: https://issues.apache.org/jira/browse/MESOS-5359
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.29.0
>Reporter: Anand Mazumdar
>Assignee: José Guilherme Vanz
>  Labels: mesosphere
>
> Currently, the scheduler library {{src/scheduler/scheduler.cpp}} does not have 
> an artificially induced delay when trying to initially establish a connection 
> with the master. In the event of a master failover or ZK disconnect, a large 
> number of frameworks can get disconnected and thereby overwhelm the master 
> with TCP SYN requests. 
> On a large cluster with many agents, the master is already overwhelmed with 
> handling connection requests from the agents. This compounds the issue 
> further on the master.
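
A delay of the kind the issue title asks for could be sketched as a random wait chosen per scheduler, so that reconnect attempts spread out instead of arriving at once. The function name and the idea of passing the upper bound as a parameter (for example from a flag) are assumptions for illustration, not the scheduler library's actual interface.

```cpp
#include <chrono>
#include <random>

// Picks a uniformly random delay in [0, maxDelay]. A non-positive bound
// disables the delay.
std::chrono::milliseconds initialConnectDelay(
    std::chrono::milliseconds maxDelay,
    std::mt19937& generator)
{
  if (maxDelay.count() <= 0) {
    return std::chrono::milliseconds(0);
  }

  std::uniform_int_distribution<std::chrono::milliseconds::rep> distribution(
      0, maxDelay.count());

  return std::chrono::milliseconds(distribution(generator));
}
```

Seeding the generator differently in each process (for example from `std::random_device`) is what keeps the schedulers from all picking the same delay.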




[jira] [Commented] (MESOS-4565) slave recovers and attempt to destroy executor's child containers, then begins rejecting task status updates

2016-05-23 Thread Chanh Le (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297666#comment-15297666
 ] 

Chanh Le commented on MESOS-4565:
-

Any update on this?
I am still hitting this issue.

> slave recovers and attempt to destroy executor's child containers, then 
> begins rejecting task status updates
> 
>
> Key: MESOS-4565
> URL: https://issues.apache.org/jira/browse/MESOS-4565
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
>Reporter: James DeFelice
>  Labels: mesosphere
>
> AFAICT the slave is doing this:
> 1) recovering from some kind of failure
> 2) checking the containers that it pulled from its state store
> 3) complaining about cgroup children hanging off of executor containers
> 4) rejecting task status updates related to the executor container, the first 
> of which in the logs is:
> {code}
> E0130 02:22:21.979852 12683 slave.cpp:2963] Failed to update resources for 
> container 1d965a20-849c-40d8-9446-27cb723220a9 of executor 
> 'd701ab48a0c0f13_k8sm-executor' running task 
> pod.f2dc2c43-c6f7-11e5-ad28-0ad18c5e6c7f on status update for terminal task, 
> destroying container: Container '1d965a20-849c-40d8-9446-27cb723220a9' not 
> found
> {code}
> To be fair, I don't believe that my custom executor is re-registering 
> properly with the slave prior to attempting to send these (failing) status 
> updates. But the slave doesn't complain about that .. it complains that it 
> can't find the **container**.
> slave log here:
> https://gist.github.com/jdef/265663461156b7a7ed4e




[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container

2016-05-23 Thread Nirav (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297704#comment-15297704
 ] 

Nirav commented on MESOS-5410:
--

Currently, because the "cgroup" namespace is not supported, the following two 
test cases are failing:

1. NsTest.ROOT_setns
2. NsTest.ROOT_getns

The error observed is: "nstype: Unknown namespace 'cgroup'"
This is because the contents of the directory "/proc/self/ns" changed in 
kernel version 4.6 (a cgroup entry was added).


> Support cgroup namespace in unified container
> -
>
> Key: MESOS-5410
> URL: https://issues.apache.org/jira/browse/MESOS-5410
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> In the Linux 4.6 kernel, a new namespace (the cgroup namespace) was 
> introduced so that a process can be created in its own cgroup namespace and 
> the global cgroup hierarchy is not leaked to the process. See the following link 
> for more details about this namespace:
> http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html
> We need to support this namespace in unified container to provide better 
> isolation for the containers created by Mesos.




[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-23 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297735#comment-15297735
 ] 

haosdent commented on MESOS-5430:
-

[~jmanalus] [~vinodkone] Thanks a lot for your replies. I posted a quick demo at 
http://blog.haosdent.me/mesos-site-demo/source/

There are some minor mismatches between the demo page above and [~jmanalus]'s 
design. [~jmanalus], if you use Sketch or Photoshop, could you send the file to 
my email (haosd...@gmail.com) or upload it to JIRA, so that I can adjust my demo 
to match your design more exactly?

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org




[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container

2016-05-23 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297740#comment-15297740
 ] 

haosdent commented on MESOS-5410:
-

I think we could add
{code}
namespaces.erase("cgroup");
{code}
as a workaround. Let me file a jira for this.
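
For context, a rough sketch of what that workaround amounts to, assuming the namespace names are held in a `std::set<std::string>` (the container type and the helper name are assumptions, not taken from the Mesos test code):

```cpp
#include <set>
#include <string>

// Drops the namespace name that the surrounding code cannot map to a clone
// flag yet. On kernels older than 4.6 the entry is absent and erase() is a
// no-op.
std::set<std::string> withoutUnsupported(std::set<std::string> namespaces)
{
  namespaces.erase("cgroup");
  return namespaces;
}
```

This only hides the new entry from the tests; it does not add support for the namespace itself.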

> Support cgroup namespace in unified container
> -
>
> Key: MESOS-5410
> URL: https://issues.apache.org/jira/browse/MESOS-5410
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> In the Linux 4.6 kernel, a new namespace (the cgroup namespace) was 
> introduced so that a process can be created in its own cgroup namespace and 
> the global cgroup hierarchy is not leaked to the process. See the following link 
> for more details about this namespace:
> http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html
> We need to support this namespace in unified container to provide better 
> isolation for the containers created by Mesos.




[jira] [Created] (MESOS-5446) NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6

2016-05-23 Thread haosdent (JIRA)

haosdent created MESOS-5446:
---

 Summary: NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 
4.6
 Key: MESOS-5446
 URL: https://issues.apache.org/jira/browse/MESOS-5446
 Project: Mesos
  Issue Type: Bug
Reporter: haosdent
Priority: Minor


From [~nthakkar%40us.ibm.com]
{quote}
Currently, because the "cgroup" namespace is not supported, the following two 
test cases are failing:
1. NsTest.ROOT_setns
2. NsTest.ROOT_getns
The error observed is: "nstype: Unknown namespace 'cgroup'"
This is because the contents of the directory "/proc/self/ns" changed in 
kernel version 4.6 (a cgroup entry was added).
{quote}




[jira] [Assigned] (MESOS-5446) NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6

2016-05-23 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-5446:
---

Assignee: haosdent

> NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6
> ---
>
> Key: MESOS-5446
> URL: https://issues.apache.org/jira/browse/MESOS-5446
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
> From [~nthakkar%40us.ibm.com]
> {quote}
> Currently, because the "cgroup" namespace is not supported, the following two 
> test cases are failing:
> 1. NsTest.ROOT_setns
> 2. NsTest.ROOT_getns
> The error observed is: "nstype: Unknown namespace 'cgroup'"
> This is because the contents of the directory "/proc/self/ns" changed in 
> kernel version 4.6 (a cgroup entry was added).
> {quote}




[jira] [Commented] (MESOS-4565) slave recovers and attempt to destroy executor's child containers, then begins rejecting task status updates

2016-05-23 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297752#comment-15297752
 ] 

haosdent commented on MESOS-4565:
-

[~giaosuddau] Do you also see the following error?
{code}
E0130 02:22:21.009094 12686 containerizer.cpp:553] Failed to clean up an 
isolator when destroying orphan container kube-proxy: Failed to remove cgroup 
'/sys/fs/cgroup/memory/mesos/1d965a20-849c-40d8-9446-27cb723220a9/kube-proxy': 
Device or resource busy
{code}

A quick workaround is to unmount it manually so that the agent can recover successfully. 

> slave recovers and attempt to destroy executor's child containers, then 
> begins rejecting task status updates
> 
>
> Key: MESOS-4565
> URL: https://issues.apache.org/jira/browse/MESOS-4565
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
>Reporter: James DeFelice
>  Labels: mesosphere
>
> AFAICT the slave is doing this:
> 1) recovering from some kind of failure
> 2) checking the containers that it pulled from its state store
> 3) complaining about cgroup children hanging off of executor containers
> 4) rejecting task status updates related to the executor container, the first 
> of which in the logs is:
> {code}
> E0130 02:22:21.979852 12683 slave.cpp:2963] Failed to update resources for 
> container 1d965a20-849c-40d8-9446-27cb723220a9 of executor 
> 'd701ab48a0c0f13_k8sm-executor' running task 
> pod.f2dc2c43-c6f7-11e5-ad28-0ad18c5e6c7f on status update for terminal task, 
> destroying container: Container '1d965a20-849c-40d8-9446-27cb723220a9' not 
> found
> {code}
> To be fair, I don't believe that my custom executor is re-registering 
> properly with the slave prior to attempting to send these (failing) status 
> updates. But the slave doesn't complain about that .. it complains that it 
> can't find the **container**.
> slave log here:
> https://gist.github.com/jdef/265663461156b7a7ed4e




[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container

2016-05-23 Thread Nirav (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297764#comment-15297764
 ] 

Nirav commented on MESOS-5410:
--

Hi,
Alternatively, we can add a macro in the file. I tried adding it, and it worked 
well. It would also help in the future.

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif

and

nstypes["cgroup"] = CLONE_NEWCGROUP;

I can submit the required patch.
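
Put together, the proposal might look like the sketch below: a fallback definition for build hosts whose headers predate Linux 4.6, plus a name-to-flag lookup. The `nstypeFor` helper and the reduced three-entry map are illustrative assumptions, not the actual Mesos code; the flag values are the ones defined by the Linux kernel headers.

```cpp
#include <map>
#include <string>

// Fallbacks for headers that do not define these clone(2) flags.
#ifndef CLONE_NEWNS
#define CLONE_NEWNS 0x00020000
#endif
#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif
#ifndef CLONE_NEWNET
#define CLONE_NEWNET 0x40000000
#endif

// Returns the clone(2) flag for a namespace name as it appears under
// /proc/self/ns, or -1 if the name is not recognized.
int nstypeFor(const std::string& name)
{
  static const std::map<std::string, int> nstypes = {
    {"mnt", CLONE_NEWNS},
    {"net", CLONE_NEWNET},
    {"cgroup", CLONE_NEWCGROUP},
  };

  const auto it = nstypes.find(name);
  return it == nstypes.end() ? -1 : it->second;
}
```

With the `cgroup` entry present, a lookup for that name no longer falls through to the "unknown namespace" case.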

> Support cgroup namespace in unified container
> -
>
> Key: MESOS-5410
> URL: https://issues.apache.org/jira/browse/MESOS-5410
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> In the Linux 4.6 kernel, a new namespace (the cgroup namespace) was 
> introduced so that a process can be created in its own cgroup namespace and 
> the global cgroup hierarchy is not leaked to the process. See the following link 
> for more details about this namespace:
> http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html
> We need to support this namespace in unified container to provide better 
> isolation for the containers created by Mesos.




[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container

2016-05-23 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297786#comment-15297786
 ] 

haosdent commented on MESOS-5410:
-

Cool! Could you send an email to the dev mailing list to become a contributor in 
JIRA, so that I can change the assignee of MESOS-5446 to you?

> Support cgroup namespace in unified container
> -
>
> Key: MESOS-5410
> URL: https://issues.apache.org/jira/browse/MESOS-5410
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> In the Linux 4.6 kernel, a new namespace (the cgroup namespace) was 
> introduced so that a process can be created in its own cgroup namespace and 
> the global cgroup hierarchy is not leaked to the process. See the following link 
> for more details about this namespace:
> http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html
> We need to support this namespace in unified container to provide better 
> isolation for the containers created by Mesos.



