[jira] [Created] (MESOS-2999) Implement a linux/iptables isolator

2015-07-07 Thread Stephan Erb (JIRA)
Stephan Erb created MESOS-2999:
--

 Summary: Implement a linux/iptables isolator 
 Key: MESOS-2999
 URL: https://issues.apache.org/jira/browse/MESOS-2999
 Project: Mesos
  Issue Type: Story
  Components: containerization, isolation
Reporter: Stephan Erb


As a user of Mesos, I would like to have control over inbound and outbound 
network communication of a launched Mesos container. The intention is to gain 
improved security and isolation of user processes on the network level.

*Example Use Cases*:

* Prevent outgoing connections to external endpoints which have not been 
whitelisted (e.g., deny internet connections, or allow connections to one 
production database but not the others).
* Prevent incoming connections from external systems or containers which have 
not been whitelisted (e.g., don't allow a rogue or even hijacked service to 
interfere with another service).

The last use case is somewhat tricky due to the dynamic nature of a Mesos 
cluster, but it might be achieved using the available 
[DiscoveryInfo|https://github.com/apache/mesos/blob/master/docs/app-framework-development-guide.md#service-discovery]
 (e.g., block all connections from foreign environments).
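
A minimal sketch of the outgoing-connection use case, assuming the isolator would express a whitelist as ordinary iptables rules: accept the listed endpoints, then drop everything else. The function name, the chain name MESOS-EGRESS, and the endpoint are hypothetical, and the sketch only builds the command lines without running them.

```python
def build_egress_rules(chain, allowed):
    """Build iptables argument lists: accept whitelisted endpoints, drop the rest.

    `chain` and `allowed` (a list of (ip, port) pairs) are hypothetical inputs.
    """
    rules = [
        ["iptables", "-A", chain, "-d", ip, "-p", "tcp",
         "--dport", str(port), "-j", "ACCEPT"]
        for ip, port in allowed
    ]
    # Final catch-all rule: anything not accepted above is dropped.
    rules.append(["iptables", "-A", chain, "-j", "DROP"])
    return rules


# Hypothetical whitelist: a single production database endpoint.
rules = build_egress_rules("MESOS-EGRESS", [("10.0.0.5", 5432)])
for rule in rules:
    print(" ".join(rule))
```

Rule order matters here, since iptables evaluates a chain from top to bottom. A real isolator would also have to create the chain and attach it per container, which this sketch leaves out.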



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3001) Create a demo HTTP API client

2015-07-07 Thread Marco Massenzio (JIRA)
Marco Massenzio created MESOS-3001:
--

 Summary: Create a demo HTTP API client
 Key: MESOS-3001
 URL: https://issues.apache.org/jira/browse/MESOS-3001
 Project: Mesos
  Issue Type: Bug
  Components: framework
Reporter: Marco Massenzio
Assignee: Marco Massenzio


We want to create a simple demo HTTP API client (in Java or Python) that can 
serve as an example framework for people who want to use the new API for 
their frameworks.

The scope should be fairly limited (e.g., launching a simple container task) 
but sufficient to exercise most of the new API endpoint messages/capabilities.

Scope: TBD

Non-Goals: 

- create a best-of-breed Framework to deliver any specific functionality;
- create an Integration Test for the HTTP API.





[jira] [Commented] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617157#comment-14617157
 ] 

haosdent commented on MESOS-2199:
-

Hi [~idownes], here are my test steps:
{code}
cd mesos/
rm -rf build
mkdir -p build && ./bootstrap && cd build && ../configure
make check -j8 GTEST_FILTER=-*
sudo ./bin/mesos-tests.sh --verbose --gtest_filter=SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
{code}
The test passes in my environment.

My test environment:
{code}
CentOS release 6.5 (Final)
Linux test-2 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
{code}

Could you share your test environment, or enable verbose logging to show the 
log? Thank you in advance.

 Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ---

 Key: MESOS-2199
 URL: https://issues.apache.org/jira/browse/MESOS-2199
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Ian Downes
Assignee: haosdent
  Labels: mesosphere

 Appears that running the executor as {{nobody}} is not supported.
 [~nnielsen] can you take a look?
 Executor log:
 {noformat}
 [root@hostname build]# cat /tmp/SlaveTest_ROOT_RunTaskWithCommandInfoWithUser_cxF1dY/slaves/20141219-005206-2081170186-60487-11862-S0/frameworks/20141219-005206-2081170186-60487-11862-/executors/1/runs/latest/std*
 sh: /home/idownes/workspace/mesos/build/src/mesos-executor: Permission denied
 {noformat}
 Test output:
 {noformat}
 [==========] Running 1 test from 1 test case.
 [----------] Global test environment set-up.
 [----------] 1 test from SlaveTest
 [ RUN      ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ../../src/tests/slave_tests.cpp:680: Failure
 Value of: statusRunning.get().state()
   Actual: TASK_FAILED
 Expected: TASK_RUNNING
 ../../src/tests/slave_tests.cpp:682: Failure
 Failed to wait 10secs for statusFinished
 ../../src/tests/slave_tests.cpp:673: Failure
 Actual function call count doesn't match EXPECT_CALL(sched, statusUpdate(&driver, _))...
   Expected: to be called twice
     Actual: called once - unsatisfied and active
 [  FAILED  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser (10641 ms)
 [----------] 1 test from SlaveTest (10641 ms total)
 [----------] Global test environment tear-down
 [==========] 1 test from 1 test case ran. (10658 ms total)
 {noformat}





[jira] [Updated] (MESOS-2332) Report per-container metrics for network bandwidth throttling

2015-07-07 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-2332:
--
Shepherd: Jie Yu  (was: Ian Downes)

 Report per-container metrics for network bandwidth throttling
 -

 Key: MESOS-2332
 URL: https://issues.apache.org/jira/browse/MESOS-2332
 Project: Mesos
  Issue Type: Improvement
  Components: isolation
Reporter: Paul Brett
Assignee: Paul Brett
  Labels: features, twitter
 Fix For: 0.23.0


 Export metrics from the network isolator to identify the scope and duration 
 of container throttling.
 Packet loss can be identified from the overlimits and requeues fields of the 
 htb qdisc report for the virtual interface, e.g.
 {noformat}
 $ tc -s -d qdisc show dev mesos19223
 qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
  Sent 158213287452 bytes 1030876393 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
 qdisc ingress : parent :fff1 
  Sent 119381747824 bytes 1144549901 pkt (dropped 2044879, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0
 {noformat}
 Note that since a packet can be examined multiple times before transmission, 
 overlimits can exceed the total number of packets sent.
 Add these to the port_mapping isolator usage() and the container statistics 
 protobuf. Carefully consider the naming (esp. tx/rx) and commenting of the 
 protobuf fields so it's clear what they represent and how they differ from 
 the existing dropped packet counts from the network stack.
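
A rough sketch of how the counters could be pulled out of the tc report quoted above before being copied into the container statistics. It assumes each statistics line is unwrapped, as tc prints it. The name parse_qdisc_stats and the dictionary keys are illustrative only, not existing Mesos or protobuf fields.

```python
import re

# Matches the per-qdisc statistics line printed by `tc -s qdisc show`.
STATS = re.compile(
    r"Sent (?P<bytes>\d+) bytes (?P<packets>\d+) pkt "
    r"\(dropped (?P<dropped>\d+), overlimits (?P<overlimits>\d+) "
    r"requeues (?P<requeues>\d+)\)"
)


def parse_qdisc_stats(text):
    """Return a list of (qdisc_kind, counters) pairs in input order."""
    results = []
    kind = None
    for line in text.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "qdisc":
            # Header line, e.g. "qdisc pfifo_fast 0: root ...".
            kind = words[1]
            continue
        match = STATS.search(line)
        if match and kind is not None:
            counters = {k: int(v) for k, v in match.groupdict().items()}
            results.append((kind, counters))
            kind = None  # ignore further lines until the next header
    return results


# Sample input copied from the report in this issue.
SAMPLE = """\
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 158213287452 bytes 1030876393 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc ingress : parent :fff1
 Sent 119381747824 bytes 1144549901 pkt (dropped 2044879, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
"""

stats = parse_qdisc_stats(SAMPLE)
for kind, counters in stats:
    print(kind, counters["dropped"], counters["overlimits"], counters["requeues"])
```

Keeping the qdisc kind next to each set of counters is one way to tell the egress and ingress figures apart when naming the tx/rx fields.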





[jira] [Created] (MESOS-3007) Support systemd with Mesos containerizer

2015-07-07 Thread Artem Harutyunyan (JIRA)
Artem Harutyunyan created MESOS-3007:


 Summary: Support systemd with Mesos containerizer
 Key: MESOS-3007
 URL: https://issues.apache.org/jira/browse/MESOS-3007
 Project: Mesos
  Issue Type: Epic
Reporter: Artem Harutyunyan
 Fix For: 0.24.0








[jira] [Created] (MESOS-3003) Support mounting in default configuration files/volumes into every new container

2015-07-07 Thread Timothy Chen (JIRA)
Timothy Chen created MESOS-3003:
---

 Summary: Support mounting in default configuration files/volumes 
into every new container
 Key: MESOS-3003
 URL: https://issues.apache.org/jira/browse/MESOS-3003
 Project: Mesos
  Issue Type: Improvement
Reporter: Timothy Chen


Most container images leave out system configuration (e.g., /etc/*) and expect 
the container runtime to mount specific configuration files, such as 
/etc/resolv.conf, from the host into the container as needed.

We need to support mounting specific configuration files for the command 
executor to work, and also allow the user to optionally specify additional 
configuration files to mount via flags.






[jira] [Created] (MESOS-3000) Failing test - NsTest.ROOT_setns

2015-07-07 Thread Ian Downes (JIRA)
Ian Downes created MESOS-3000:
-

 Summary: Failing test - NsTest.ROOT_setns
 Key: MESOS-3000
 URL: https://issues.apache.org/jira/browse/MESOS-3000
 Project: Mesos
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Ian Downes
Priority: Blocker


Appears to be the same issue plaguing MESOS-2199

{noformat}
[root@hostname build]# MESOS_VERBOSE=1 ./bin/mesos-tests.sh --gtest_filter=NsTest.ROOT_setns
...
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from NsTest
[ RUN      ] NsTest.ROOT_setns
ABORT: (../../../3rdparty/libprocess/src/subprocess.cpp:163): Failed to os::execvpe in childMain: Permission denied*** Aborted at 1436292540 (unix time) try "date -d @1436292540" if you are using GNU date ***
PC: @ 0x7f7a1229e625 __GI_raise
*** SIGABRT (@0xfffe0001) received by PID 1 (TID 0x7f7a19afc820) from PID 1; stack trace: ***
@ 0x7f7a13421710 (unknown)
@ 0x7f7a1229e625 __GI_raise
@ 0x7f7a1229fe05 __GI_abort
@   0x860ba1 (unknown)
@   0x860bcf (unknown)
@ 0x7f7a1826f118 (unknown)
@ 0x7f7a18274594 (unknown)
@ 0x7f7a18273b88 (unknown)
@ 0x7f7a18273098 (unknown)
@  0x1180720 (unknown)
@  0x117a5d7 (unknown)
@ 0x7f7a123548fd clone
../../src/tests/ns_tests.cpp:121: Failure
Failed to wait 15secs for status
[  FAILED  ] NsTest.ROOT_setns (15004 ms)
[----------] 1 test from NsTest (15004 ms total)

[----------] Global test environment tear-down
../../src/tests/environment.cpp:441: Failure
Failed
Tests completed with child processes remaining:
-+- 40531 /home/idownes/workspace/mesos/build/src/.libs/lt-mesos-tests --gtest_filter=NsTest.ROOT_setns
 \--- 40565 /home/idownes/workspace/mesos/build/src/.libs/lt-mesos-tests --gtest_filter=NsTest.ROOT_setns
[==========] 1 test from 1 test case ran. (15034 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] NsTest.ROOT_setns
{noformat}

Relevant strace for the forked child:
{noformat}
...
getpid() = 1
dup2(6, 0) = 0
dup2(7, 1) = 1
dup2(8, 2) = 2
close(6) = 0
close(7) = 0
close(8) = 0
execve("/home/idownes/workspace/mesos/build/src/setns-test-helper", ["setns-test-helper", "SetnsTestHelper"], [/* 24 vars */]) = -1 EACCES (Permission denied)
write(2, "ABORT: (../../../3rdparty/libpro"..., 62) = 62
write(2, "Failed to os::execvpe in childMa"..., 53) = 53
...
{noformat}

Binary that it's trying to exec:
{noformat}
[root@hostname build]# stat /home/idownes/workspace/mesos/build/src/setns-test-helper
  File: `/home/idownes/workspace/mesos/build/src/setns-test-helper'
  Size: 7948Blocks: 16 IO Block: 4096   regular file
Device: 801h/2049d  Inode: 22949249Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (13118/ idownes)   Gid: ( 1500/employee)
Access: 2015-07-07 17:58:09.569861237 +
Modify: 2015-07-07 17:58:09.573861290 +
Change: 2015-07-07 17:58:09.573861290 +
[root@hostname build]# /home/idownes/workspace/mesos/build/src/setns-test-helper
Usage: /home/idownes/workspace/mesos/build/src/.libs/lt-setns-test-helper subcommand [OPTIONS]

Available subcommands:
help
SetnsTestHelper
{noformat}





[jira] [Created] (MESOS-3004) Support running the command executor with provisioned image for running a task in a container

2015-07-07 Thread Timothy Chen (JIRA)
Timothy Chen created MESOS-3004:
---

 Summary: Support running the command executor with provisioned 
image for running a task in a container
 Key: MESOS-3004
 URL: https://issues.apache.org/jira/browse/MESOS-3004
 Project: Mesos
  Issue Type: Improvement
Reporter: Timothy Chen


The Mesos containerizer uses the command executor to launch the user-defined 
command; the command executor then communicates with the slave about the 
process lifecycle.
When we provision a new container with a user-specified image, we also need 
to be able to run the command executor in the container to support the same 
semantics.
One approach is to dynamically mount a static binary of the command executor, 
together with all its dependencies, into a special directory so that it 
doesn't interfere with the provisioned root filesystem, and to configure the 
Mesos containerizer to run the command executor from that directory.





[jira] [Assigned] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere reassigned MESOS-3002:
---

Assignee: Joris Van Remoortere  (was: Mark Wang)

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Joris Van Remoortere
Priority: Blocker



[jira] [Commented] (MESOS-2993) Document per container unique egress flow and network queueing statistics

2015-07-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617110#comment-14617110
 ] 

Adam B commented on MESOS-2993:
---

[~pbrett] Is there a draft/review for this yet? Wondering if we can get this in 
for the next release candidate (rc2).

 Document  per container unique egress flow and network queueing statistics
 --

 Key: MESOS-2993
 URL: https://issues.apache.org/jira/browse/MESOS-2993
 Project: Mesos
  Issue Type: Bug
  Components: documentation, isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Paul Brett
  Labels: twitter

 Document new network isolation capabilities in 0.23





[jira] [Commented] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617163#comment-14617163
 ] 

haosdent commented on MESOS-2199:
-

The {{nobody}} user on my machine is the same as yours:
{code}
nobody:x:99:99:Nobody:/:/sbin/nologin
{code}



[jira] [Updated] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-3002:
--
Target Version/s: 0.23.0

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Mark Wang
Priority: Blocker



[jira] [Created] (MESOS-3005) SSL tests can fail depending on hostname configuration

2015-07-07 Thread Joris Van Remoortere (JIRA)
Joris Van Remoortere created MESOS-3005:
---

 Summary: SSL tests can fail depending on hostname configuration
 Key: MESOS-3005
 URL: https://issues.apache.org/jira/browse/MESOS-3005
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Reporter: Joris Van Remoortere
Assignee: Joris Van Remoortere
Priority: Blocker


Depending on how /etc/hosts is configured, the SSL tests can fail with a bad 
hostname match for the certificate.
We can avoid this by explicitly matching the hostname for the certificate to 
the IP that will be used during the test.





[jira] [Commented] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617240#comment-14617240
 ] 

Ian Downes commented on MESOS-2199:
---

I have my build directory under my home directory and I do not want {{nobody}} 
(or anybody else) to have access to it.
{noformat}
[root@hostname build]# stat /home/idownes
  File: `/home/idownes'
  Size: 4096Blocks: 8  IO Block: 4096   directory
Device: 801h/2049d  Inode: 22807083Links: 11
Access: (0700/drwx------)  Uid: (13118/ idownes)   Gid: ( 1500/employee)
Access: 2015-07-06 22:51:35.829848943 +
Modify: 2015-07-06 21:58:32.348041134 +
Change: 2015-07-06 21:58:32.348041134 +
{noformat}
My home directory is {{0700}} so naturally {{nobody}} does not have access:
{noformat}
[root@hostname build]# su -s /bin/sh nobody -c "ls /home/idownes"
ls: cannot open directory /home/idownes: Permission denied
{noformat}

I think it's flawed to require global read access for the build directory...
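
To make the failure mode concrete: running a binary fails with "Permission denied" as soon as any directory on its path lacks search (execute) permission for the calling user, whatever the mode of the binary itself. The sketch below is a simplified check that only looks at the "others" permission bits, as a stand-in for an unrelated user such as {{nobody}}. The name first_blocked_directory and the temporary directory layout are made up for illustration.

```python
import os
import stat
import tempfile


def first_blocked_directory(path, root=os.sep):
    """Return the topmost directory between `root` and `path` that lacks the
    "others" execute (search) bit, or None if every directory has it.

    Simplification: owner/group bits and ACLs are ignored.
    """
    path = os.path.abspath(path)
    root = os.path.abspath(root)
    chain = []
    current = os.path.dirname(path)
    while True:
        chain.append(current)
        if current == root or current == os.path.dirname(current):
            break
        current = os.path.dirname(current)
    for directory in reversed(chain):  # walk from root towards the file
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            return directory
    return None


# Hypothetical layout mirroring the report: a 0700 home directory that
# contains a world-readable build directory and helper binary.
base = tempfile.mkdtemp()
os.chmod(base, 0o755)
home = os.path.join(base, "home")
build = os.path.join(home, "build")
os.makedirs(build)
os.chmod(home, 0o700)
os.chmod(build, 0o755)
helper = os.path.join(build, "helper")
open(helper, "w").close()

blocked = first_blocked_directory(helper, root=base)
print("blocked by:", blocked)
```

Note that traversal needs the search bit rather than read access, so a mode such as 0711 on the home directory would let other users reach the build directory without being able to list the home directory.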



[jira] [Commented] (MESOS-1457) Process IDs should be required to be human-readable

2015-07-07 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617296#comment-14617296
 ] 

Till Toenshoff commented on MESOS-1457:
---

I got pointed to this older issue as the patches did not get committed.

Seems Palak's solution is acceptable. It would be great if we could indeed get 
a comment into the ProcessBase constructor stating something like the proposed 
{noformat}
// Please provide a process ID prefix to ease debugging (See MESOS-1457).
{noformat}

[~PalakPC] could you possibly propose the above in a review-request and rebase 
those other two patches so we can get them committed?

 Process IDs should be required to be human-readable 
 

 Key: MESOS-1457
 URL: https://issues.apache.org/jira/browse/MESOS-1457
 Project: Mesos
  Issue Type: Improvement
  Components: libprocess
Reporter: Dominic Hamon
Assignee: Palak Choudhary
Priority: Minor

 When debugging, it's very useful to understand which processes are getting 
 timeslices. As such, the human-readable names that can be passed to 
 {{ProcessBase}} are incredibly valuable, however they are currently optional.
 If the constructor of {{ProcessBase}} took a mandatory string, every process 
 would get a human-readable name and debugging would be much easier.





[jira] [Created] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Paul Brett (JIRA)
Paul Brett created MESOS-3002:
-

 Summary: Rename Option<T>::get(const T& _t) to getOrElse() broke 
network isolator
 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.23.0
Reporter: Paul Brett


Change to Option from get() to getOrElse() breaks network isolator.  Building 
with '../configure --with-network-isolator' generates the following error:

../../src/slave/containerizer/isolators/network/port_mapping.cpp: In static 
member function 'static Try<mesos::slave::Isolator*> 
mesos::internal::slave::PortMappingIsolatorProcess::create(const 
mesos::internal::slave::Flags&)':
../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
error: no matching function for call to 'Option<std::basic_string<char> 
>::get(const char [1]) const'
   flags.resources.get(""),
 ^
../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: note: 
candidates are:
In file included from 
../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp:26:0,
 from ../../3rdparty/libprocess/include/process/check.hpp:19,
 from ../../3rdparty/libprocess/include/process/collect.hpp:7,
 from 
../../src/slave/containerizer/isolators/network/port_mapping.cpp:30:
../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: note: 
const T& Option<T>::get() const [with T = std::basic_string<char>]
   const T& get() const { assert(isSome()); return t; }
^
../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: note: 
  candidate expects 0 arguments, 1 provided
../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: note: 
T& Option<T>::get() [with T = std::basic_string<char>]
   T& get() { assert(isSome()); return t; }
  ^
../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: note:  
 candidate expects 0 arguments, 1 provided
make[2]: *** 
[slave/containerizer/isolators/network/libmesos_no_3rdparty_la-port_mapping.lo] 
Error 1
make[2]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
make[1]: *** [check] Error 2
make[1]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
make: *** [check-recursive] Error 1






[jira] [Updated] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Paul Brett (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Brett updated MESOS-3002:
--
Assignee: Mark Wang

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Mark Wang



[jira] [Commented] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617180#comment-14617180
 ] 

haosdent commented on MESOS-2199:
-

Could you show the permissions of your build directory? Your build path should 
be readable by other users. You can check the permissions of your build 
directory with
{code}
su -s /bin/sh nobody -c "ls your_build_dir_absolute_path"
{code}

 Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ---

 Key: MESOS-2199
 URL: https://issues.apache.org/jira/browse/MESOS-2199
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Ian Downes
Assignee: haosdent
  Labels: mesosphere

 Appears that running the executor as {{nobody}} is not supported.
 [~nnielsen] can you take a look?
 Executor log:
 {noformat}
 [root@hostname build]# cat 
 /tmp/SlaveTest_ROOT_RunTaskWithCommandInfoWithUser_cxF1dY/slaves/20141219-005206-2081170186-60487-11862-S0/frameworks/20141219-005206-2081170186-60
 487-11862-/executors/1/runs/latest/std*
 sh: /home/idownes/workspace/mesos/build/src/mesos-executor: Permission denied
 {noformat}
 Test output:
 {noformat}
 [==========] Running 1 test from 1 test case.
 [----------] Global test environment set-up.
 [----------] 1 test from SlaveTest
 [ RUN      ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ../../src/tests/slave_tests.cpp:680: Failure
 Value of: statusRunning.get().state()
   Actual: TASK_FAILED
 Expected: TASK_RUNNING
 ../../src/tests/slave_tests.cpp:682: Failure
 Failed to wait 10secs for statusFinished
 ../../src/tests/slave_tests.cpp:673: Failure
 Actual function call count doesn't match EXPECT_CALL(sched, 
 statusUpdate(&driver, _))...
          Expected: to be called twice
            Actual: called once - unsatisfied and active
 [  FAILED  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser (10641 ms)
 [----------] 1 test from SlaveTest (10641 ms total)
 [----------] Global test environment tear-down
 [==========] 1 test from 1 test case ran. (10658 ms total)
 {noformat}





[jira] [Updated] (MESOS-2972) Serialize Docker image spec as protobuf

2015-07-07 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-2972:

Description: The Docker image specification defines a schema for the 
metadata JSON that it puts into each image. Currently the Docker image 
provisioner needs to parse and understand this metadata JSON, so we should 
create an equivalent protobuf schema and use the JSON-to-protobuf conversion 
to read and validate the metadata.

 Serialize Docker image spec as protobuf
 ---

 Key: MESOS-2972
 URL: https://issues.apache.org/jira/browse/MESOS-2972
 Project: Mesos
  Issue Type: Improvement
Reporter: Timothy Chen
  Labels: mesosphere

 The Docker image specification defines a schema for the metadata JSON that it 
 puts into each image. Currently the Docker image provisioner needs to parse 
 and understand this metadata JSON, so we should create an equivalent protobuf 
 schema and use the JSON-to-protobuf conversion to read and validate the 
 metadata.





[jira] [Commented] (MESOS-2972) Serialize Docker image spec as protobuf

2015-07-07 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617190#comment-14617190
 ] 

Timothy Chen commented on MESOS-2972:
-

Just updated the description.

 Serialize Docker image spec as protobuf
 ---

 Key: MESOS-2972
 URL: https://issues.apache.org/jira/browse/MESOS-2972
 Project: Mesos
  Issue Type: Improvement
Reporter: Timothy Chen
  Labels: mesosphere

 The Docker image specification defines a schema for the metadata JSON that it 
 puts into each image. Currently the Docker image provisioner needs to parse 
 and understand this metadata JSON, so we should create an equivalent protobuf 
 schema and use the JSON-to-protobuf conversion to read and validate the 
 metadata.





[jira] [Updated] (MESOS-2972) Serialize Docker image spec as protobuf

2015-07-07 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-2972:

Assignee: Lily Chen

 Serialize Docker image spec as protobuf
 ---

 Key: MESOS-2972
 URL: https://issues.apache.org/jira/browse/MESOS-2972
 Project: Mesos
  Issue Type: Improvement
Reporter: Timothy Chen
Assignee: Lily Chen
  Labels: mesosphere

 The Docker image specification defines a schema for the metadata JSON that it 
 puts into each image. Currently the Docker image provisioner needs to parse 
 and understand this metadata JSON, so we should create an equivalent protobuf 
 schema and use the JSON-to-protobuf conversion to read and validate the 
 metadata.





[jira] [Updated] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Ian Downes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Downes updated MESOS-3002:
--
Priority: Blocker  (was: Major)

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Mark Wang
Priority: Blocker

 Change to Option from get() to getOrElse() breaks network isolator.  Building 
 with '../configure --with-network-isolator' generates the following error:
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp: In static 
 member function 'static Try<mesos::slave::Isolator*> 
 mesos::internal::slave::PortMappingIsolatorProcess::create(const 
 mesos::internal::slave::Flags&)':
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
 error: no matching function for call to 'Option<std::basic_string<char> 
 >::get(const char [1]) const'
        flags.resources.get(""),
                              ^
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
 note: candidates are:
 In file included from 
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp:26:0,
                  from ../../3rdparty/libprocess/include/process/check.hpp:19,
                  from ../../3rdparty/libprocess/include/process/collect.hpp:7,
                  from 
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:30:
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: 
 note: const T& Option<T>::get() const [with T = std::basic_string<char>]
    const T& get() const { assert(isSome()); return t; }
             ^
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: 
 note:   candidate expects 0 arguments, 1 provided
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: 
 note: T& Option<T>::get() [with T = std::basic_string<char>]
    T& get() { assert(isSome()); return t; }
       ^
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: 
 note:   candidate expects 0 arguments, 1 provided
 make[2]: *** 
 [slave/containerizer/isolators/network/libmesos_no_3rdparty_la-port_mapping.lo]
  Error 1
 make[2]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
 make[1]: *** [check] Error 2
 make[1]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
 make: *** [check-recursive] Error 1
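
 For context, the two accessors involved in the error above can be sketched as follows. The {{Option}} class here is a simplified stand-in and not stout's actual implementation, and {{resourcesOrEmpty}} is a hypothetical helper that mirrors the failing call site. The sketch assumes the fix is to call {{getOrElse("")}} where the one-argument {{get("")}} used to be.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for stout's Option<T> (illustration only).
template <typename T>
class Option
{
public:
  Option() : some(false), t() {}
  Option(const T& _t) : some(true), t(_t) {}

  bool isSome() const { return some; }
  bool isNone() const { return !some; }

  // Zero-argument accessor: only valid when a value is present.
  const T& get() const { assert(isSome()); return t; }

  // Takes over the role of the old one-argument get():
  // falls back to `_t` when no value is present.
  T getOrElse(const T& _t) const { return isNone() ? _t : t; }

private:
  bool some;
  T t;
};

// Hypothetical helper mirroring the call in port_mapping.cpp:
// an unset resources flag is treated as the empty string.
std::string resourcesOrEmpty(const Option<std::string>& resources)
{
  return resources.getOrElse("");
}
```

 With only the zero-argument {{get()}} overloads left, a call such as {{get("")}} no longer matches any candidate, which is what the compiler reports.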





[jira] [Updated] (MESOS-2640) Remove old frameworks and ec2 scripts from core Mesos repository

2015-07-07 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-2640:
--
Fix Version/s: 0.24.0

 Remove old frameworks and ec2 scripts from core Mesos repository
 

 Key: MESOS-2640
 URL: https://issues.apache.org/jira/browse/MESOS-2640
 Project: Mesos
  Issue Type: Task
Reporter: Yan Xu
Assignee: Yan Xu
 Fix For: 0.24.0


 As per discussion [on the dev 
 list|http://www.mail-archive.com/dev@mesos.apache.org/msg31587.html] we'll 
 remove the old and unmaintained frameworks code from the repo and move it 
 to https://github.com/mesos/framework.





[jira] [Commented] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Paul Brett (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617165#comment-14617165
 ] 

Paul Brett commented on MESOS-3002:
---

Mark - can you take a look at this.  Thanks

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Mark Wang



[jira] [Created] (MESOS-3006) Add cgroups memory stats API

2015-07-07 Thread Jojy Varghese (JIRA)
Jojy Varghese created MESOS-3006:


 Summary: Add cgroups memory stats API
 Key: MESOS-3006
 URL: https://issues.apache.org/jira/browse/MESOS-3006
 Project: Mesos
  Issue Type: Task
  Components: containerization, docker
 Environment: linux
Reporter: Jojy Varghese
Assignee: Jojy Varghese


The cgroups API currently does not expose stats from the memory subsystem. 
Having this API would enable isolators to use its various fields (e.g., rss, 
rss_huge, writeback) in use cases like usage metrics.
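
As a rough illustration of what such an API could return, the sketch below parses text in the one-pair-per-line "key value" layout used by the cgroup v1 memory.stat file into a map. The function name {{parseMemoryStat}} is hypothetical and not part of the Mesos cgroups API, and reading the file from the cgroup hierarchy is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Parses "key value" pairs, one per line, as found in a cgroup v1
// memory.stat file. Parsing stops at the first malformed pair.
std::map<std::string, uint64_t> parseMemoryStat(const std::string& contents)
{
  std::map<std::string, uint64_t> stats;

  std::istringstream in(contents);
  std::string key;
  uint64_t value;

  while (in >> key >> value) {
    stats[key] = value;
  }

  return stats;
}
```

An isolator could then look up fields such as {{rss}} or {{writeback}} by name when building usage metrics.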





[jira] [Updated] (MESOS-2119) Add Socket tests

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2119:
-
Sprint: Mesosphere Q4 Sprint 3 - 12/7, Mesosphere Q1 Sprint 1 - 1/23, 
Mesosphere Q1 Sprint 2 - 2/6, Mesosphere Q1 Sprint 3 - 2/20, Mesosphere Q1 
Sprint 4 - 3/6, Mesosphere Q1 Sprint 5 - 3/20, Mesosphere Q1 Sprint 6 - 4/3, 
Mesosphere Q1 Sprint 7 - 4/17, Mesosphere Q2 Sprint 8 - 5/1, Mesosphere Sprint 
10, Mesosphere Sprint 11, Mesosphere Sprint 14  (was: Mesosphere Q4 Sprint 3 - 
12/7, Mesosphere Q1 Sprint 1 - 1/23, Mesosphere Q1 Sprint 2 - 2/6, Mesosphere 
Q1 Sprint 3 - 2/20, Mesosphere Q1 Sprint 4 - 3/6, Mesosphere Q1 Sprint 5 - 
3/20, Mesosphere Q1 Sprint 6 - 4/3, Mesosphere Q1 Sprint 7 - 4/17, Mesosphere 
Q2 Sprint 8 - 5/1, Mesosphere Sprint 10, Mesosphere Sprint 11)

 Add Socket tests
 

 Key: MESOS-2119
 URL: https://issues.apache.org/jira/browse/MESOS-2119
 Project: Mesos
  Issue Type: Task
  Components: libprocess
Reporter: Niklas Quarfot Nielsen
Assignee: Joris Van Remoortere
  Labels: mesosphere

 Add more Socket-specific tests to get coverage during the move from libev to 
 libevent (with and without SSL).





[jira] [Updated] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3002:

Shepherd: Benjamin Hindman
  Sprint: Mesosphere Sprint 14
Story Points: 1

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Joris Van Remoortere
Priority: Blocker



[jira] [Updated] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-3002:
--
Affects Version/s: (was: 0.23.0)
   0.24.0

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.24.0
Reporter: Paul Brett
Assignee: Joris Van Remoortere
Priority: Blocker



[jira] [Updated] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-3002:
--
Target Version/s: 0.24.0  (was: 0.23.0)

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.24.0
Reporter: Paul Brett
Assignee: Joris Van Remoortere
Priority: Blocker



[jira] [Updated] (MESOS-2075) Add maintenance information to the replicated registry.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2075:
-
   Sprint: Mesosphere Sprint 14
   Labels: mesosphere twitter  (was: twitter)
Fix Version/s: 0.24.0

 Add maintenance information to the replicated registry.
 ---

 Key: MESOS-2075
 URL: https://issues.apache.org/jira/browse/MESOS-2075
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter
 Fix For: 0.24.0


 To achieve fault-tolerance for the maintenance primitives, we will need to 
 add the maintenance information to the registry.
 The registry currently stores all of the slave information, which is quite 
 large (~ 17MB for 50,000 slaves from my testing), which results in a protobuf 
 object that is extremely expensive to copy.
 As far as I can tell, reads / writes to maintenance information are 
 independent of reads / writes to the existing 'registry' information. So 
 there are two approaches here:
 h4. Add maintenance information to 'maintenance' key:
 # The advantage of this approach is that we don't further grow the large 
 Registry object.
 # This approach assumes that writes to 'maintenance' are independent of 
 writes to the 'registry'. If these writes are not independent, this approach 
 requires that we add transactional support to the State abstraction.
 # This approach requires adding compaction to LogStorage.
 # This approach likely requires some refactoring to the Registrar.
 h4. Add maintenance information to 'registry' key:
 # The advantage of this approach is that it's the easiest to implement.
 # This will further grow the single 'registry' object, but doesn't preclude 
 it being split apart in the future.
 # This approach may require using the diff support in LogStorage and/or 
 adding compression support to LogStorage snapshots to deal with the increased 
 size of the registry.





[jira] [Commented] (MESOS-2996) Failing Docker tests on CentOS Linux release 7.1.1503.

2015-07-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617429#comment-14617429
 ] 

Adam B commented on MESOS-2996:
---

First two patches committed, but there's still a lingering issue, as mentioned 
in the above comment.

commit a925b77d53fabcc22e4b4988e18b40387e17b0ab
Author: Timothy Chen tnac...@apache.org
Date:   Tue Jul 7 11:51:36 2015 -0700

Only run netcat tests when nc is available.

Review: https://reviews.apache.org/r/36216

commit eecf0d4a2a31506878c98c9dd562175816efdcbf
Author: Timothy Chen tnac...@apache.org
Date:   Tue Jul 7 11:50:21 2015 -0700

Fix running docker executor tests.

Review: https://reviews.apache.org/r/36214

 Failing Docker tests on CentOS Linux release 7.1.1503.
 --

 Key: MESOS-2996
 URL: https://issues.apache.org/jira/browse/MESOS-2996
 Project: Mesos
  Issue Type: Bug
Reporter: Joerg Schad
Assignee: Timothy Chen
Priority: Critical
  Labels: mesosphere

 With Mesos 0.23 rc1 several tests fail on CentOS Linux release 7.1 (will add 
 more detail shortly).





[jira] [Updated] (MESOS-2996) Failing Docker tests on CentOS Linux release 7.1.1503.

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2996:
--
Priority: Blocker  (was: Critical)

 Failing Docker tests on CentOS Linux release 7.1.1503.
 --

 Key: MESOS-2996
 URL: https://issues.apache.org/jira/browse/MESOS-2996
 Project: Mesos
  Issue Type: Bug
Reporter: Joerg Schad
Assignee: Timothy Chen
Priority: Blocker
  Labels: mesosphere

 With Mesos 0.23 rc1 several tests fail on CentOS Linux release 7.1 (will add 
 more detail shortly).





[jira] [Comment Edited] (MESOS-2946) Authorizer Module: Interface design

2015-07-07 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613069#comment-14613069
 ] 

Till Toenshoff edited comment on MESOS-2946 at 7/7/15 8:27 PM:
---

h4.Status Quo
As the current design stands, {{Authorizer}} is indeed an interface, but its 
default implementation is declared in the same header. Moreover, if one decides 
to create an alternative implementation for authorization, Mesos needs to be 
recompiled and all the places where the authorizer gets instantiated need to be 
updated.

h4.Design
Under the modularized version, the MVP for the {{Authorizer}} interface will 
look like:

{code}
class Authorizer
{
public:
  static Try<Authorizer*> create(const std::string& name);

  virtual ~Authorizer() {}

  virtual Try<Nothing> initialize(const Option<ACLs>& acls) = 0;

  virtual process::Future<bool> authorize(
      const ACL::RegisterFramework& request) = 0;
  virtual process::Future<bool> authorize(
      const ACL::RunTask& request) = 0;
  virtual process::Future<bool> authorize(
      const ACL::ShutdownFramework& request) = 0;

protected:
  Authorizer() {}
};
{code}

Here {{Authorizer::create(const std::string&)}} is the factory function. It 
constructs the default {{LocalAuthorizer}} if local is selected, and in any 
other case uses the existing facilities within {{ModuleManager}} to load the 
appropriate module.

In order to allow the {{LocalAuthorizer}} to play nicely with the general 
modules design, it needs a default constructor. This constraint leads to the 
existence of {{Authorizer::initialize(const Option<ACLs>&)}}, which is needed 
to pass initialization parameters to the {{LocalAuthorizer}}. Note that all 
other authorizers will use the {{ModuleManager}} mechanisms to pass 
initialization parameters. This follows the pattern used in the 
{{Authenticator}} module. The method 
{{Authorizer::initialize(const Option<ACLs>&)}} can be removed when we go to a 
modules-only implementation.

All other methods remain unchanged from the original {{Authorizer}} interface.
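
The construction pattern described above (default constructor, then a separate {{initialize}} call, behind a name-based factory) can be sketched in a self-contained form. This is an illustration under stated assumptions and not the proposed Mesos code: {{ACLs}} is a placeholder struct, a plain {{bool}} stands in for {{Try<Nothing>}} and {{process::Future<bool>}}, a pointer stands in for {{Option<ACLs>}}, and {{authorizeRunTask}} is a hypothetical simplification of the {{authorize}} overloads.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Placeholder for the ACLs protobuf (assumption: a single flag).
struct ACLs { bool permissive; };

class Authorizer
{
public:
  // Returns nullptr for any name other than "local"; in the design above
  // those names would be resolved through ModuleManager instead.
  static std::unique_ptr<Authorizer> create(const std::string& name);

  virtual ~Authorizer() {}

  virtual bool initialize(const ACLs* acls) = 0;
  virtual bool authorizeRunTask(const std::string& principal) = 0;

protected:
  Authorizer() {}
};

class LocalAuthorizer : public Authorizer
{
public:
  // Default constructor, so it can be created like any module.
  LocalAuthorizer() : initialized(false), permissive(true) {}

  // Receives the parameters a constructor would otherwise take.
  virtual bool initialize(const ACLs* acls)
  {
    if (initialized) {
      return false; // Reject double initialization.
    }
    if (acls != nullptr) {
      permissive = acls->permissive;
    }
    initialized = true;
    return true;
  }

  virtual bool authorizeRunTask(const std::string&)
  {
    assert(initialized);
    return permissive;
  }

private:
  bool initialized;
  bool permissive;
};

std::unique_ptr<Authorizer> Authorizer::create(const std::string& name)
{
  if (name == "local") {
    return std::unique_ptr<Authorizer>(new LocalAuthorizer());
  }
  return nullptr;
}
```

A caller would invoke {{Authorizer::create("local")}} and then {{initialize}} with the configured ACLs before the first authorization request.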


was (Author: arojas):
h4.Status Quo
As the current design stands, {{Authorizer}} is indeed an interface, but its 
default implementation is declared in the same header. Moreover, if one decides 
to create an alternative implementation for authorization, Mesos needs to be 
recompiled and all the places where the authorizer gets instantiated need to be 
updated.

h4.Design
Under the modularize version, the MVP for the {{Authorizer}} interface will 
look like:

{code}
class Authorizer
{
public:
  static Try<Authorizer*> create(const std::string& name);

  virtual ~Authorizer() {}

  virtual Try<Nothing> initialize(const Option<ACLs>& acls) = 0;

  virtual process::Future<bool> authorize(
      const ACL::RegisterFramework& request) = 0;
  virtual process::Future<bool> authorize(
      const ACL::RunTask& request) = 0;
  virtual process::Future<bool> authorize(
      const ACL::ShutdownFramework& request) = 0;

protected:
  Authorizer() {}
};
{code}

Where {{Authorizer::create(const std::string&)}} is the factory function which 
will construct the default {{LocalAuthorizer}} if local is selected and will 
use the existing facilities within {{ModuleManager}} to load the appropriate 
module in any other case.

In order to allow the {{LocalAuthorizer}} to play nicely with the general 
modules design, it needs a default constructor. This constraint leads to the 
existence of {{Authorizer::initialize(const Option<ACLs>&)}} which is needed to 
pass initialization parameters to the {{LocalAuthorizer}}. Note that all other 
authorizers will use the {{ModuleManager}} mechanisms to pass initialization 
parameters. This follows the pattern used in the {{Authorizator}} module. The 
method {{Authorizer::initialize(const Option<ACLs>&)}} can be removed when we 
go to a modules only implementation.

All other methods remain unchanged from the original {{Authorizer}} interface.

 Authorizer Module: Interface design
 ---

 Key: MESOS-2946
 URL: https://issues.apache.org/jira/browse/MESOS-2946
 Project: Mesos
  Issue Type: Improvement
Reporter: Till Toenshoff
Assignee: Till Toenshoff
  Labels: mesosphere, module, security

 h4.Motivation
 Design an interface covering authorizer modules while staying minimally 
 invasive in regards to changes to the existing {{LocalAuthorizer}} 
 implementation.





[jira] [Commented] (MESOS-2993) Document per container unique egress flow and network queueing statistics

2015-07-07 Thread Paul Brett (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617498#comment-14617498
 ] 

Paul Brett commented on MESOS-2993:
---

Review draft available at https://reviews.apache.org/r/36281/

 Document  per container unique egress flow and network queueing statistics
 --

 Key: MESOS-2993
 URL: https://issues.apache.org/jira/browse/MESOS-2993
 Project: Mesos
  Issue Type: Bug
  Components: documentation, isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Paul Brett
  Labels: twitter

 Document new network isolation capabilities in 0.23





[jira] [Created] (MESOS-3008) Libevent SSL doesn't use EPOLL

2015-07-07 Thread Joris Van Remoortere (JIRA)
Joris Van Remoortere created MESOS-3008:
---

 Summary: Libevent SSL doesn't use EPOLL
 Key: MESOS-3008
 URL: https://issues.apache.org/jira/browse/MESOS-3008
 Project: Mesos
  Issue Type: Improvement
  Components: libprocess
Affects Versions: 0.23.0
Reporter: Joris Van Remoortere
Assignee: Joris Van Remoortere


We currently disable epoll in libevent to allow SSL to work.
It would be more scalable if we didn't have to do that.





[jira] [Updated] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-3002:
--
Fix Version/s: (was: 0.23.0)

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.24.0
Reporter: Paul Brett
Assignee: Joris Van Remoortere
Priority: Blocker
 Fix For: 0.24.0


 Change to Option from get() to getOrElse() breaks network isolator.  Building 
 with '../configure --with-network-isolator' generates the following error:
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp: In static 
 member function 'static Trymesos::slave::Isolator* 
 mesos::internal::slave::PortMappingIsolatorProcess::create(const 
 mesos::internal::slave::Flags)':
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
 error: no matching function for call to 'Optionstd::basic_stringchar 
 ::get(const char [1]) const'
flags.resources.get(),
  ^
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
 note: candidates are:
 In file included from 
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp:26:0,
  from ../../3rdparty/libprocess/include/process/check.hpp:19,
  from ../../3rdparty/libprocess/include/process/collect.hpp:7,
  from 
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:30:
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: 
 note: const T OptionT::get() const [with T = std::basic_stringchar]
const T get() const { assert(isSome()); return t; }
 ^
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: 
 note:   candidate expects 0 arguments, 1 provided
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: 
 note: T OptionT::get() [with T = std::basic_stringchar]
T get() { assert(isSome()); return t; }
   ^
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: 
 note:   candidate expects 0 arguments, 1 provided
 make[2]: *** 
 [slave/containerizer/isolators/network/libmesos_no_3rdparty_la-port_mapping.lo]
  Error 1
 make[2]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
 make[1]: *** [check] Error 2
 make[1]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
 make: *** [check-recursive] Error 1
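
 The failure mode above can be illustrated with a minimal sketch. The Option 
 class below is a simplified stand-in, not stout's actual implementation, and 
 resourcesOrEmpty() is a hypothetical helper that mirrors the failing call 
 site. It assumes that the one-argument get() was the "value or fallback" 
 accessor and that call sites are expected to switch to getOrElse().

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for stout's Option<T> (an assumption for illustration;
// the real class lives in stout/option.hpp and differs in detail).
template <typename T>
class Option
{
public:
  static Option<T> none() { return Option<T>(); }
  static Option<T> some(const T& t) { return Option<T>(t); }

  bool isSome() const { return set_; }
  bool isNone() const { return !set_; }

  // Zero-argument accessor: only valid when a value is present.
  const T& get() const { assert(isSome()); return t_; }

  // Replaces the former one-argument get(const T&): returns the stored
  // value, or the supplied fallback when the option is empty.
  T getOrElse(const T& fallback) const { return isSome() ? t_ : fallback; }

private:
  Option() : set_(false), t_() {}
  explicit Option(const T& t) : set_(true), t_(t) {}

  bool set_;
  T t_;
};

// Hypothetical helper mirroring the call site in port_mapping.cpp: after
// the rename, get("") no longer compiles, while getOrElse("") does.
std::string resourcesOrEmpty(const Option<std::string>& resources)
{
  return resources.getOrElse("");
}
```

 With this shape, a call such as resources.get("") is rejected because only 
 the zero-argument get() overloads remain, which matches the "candidate 
 expects 0 arguments, 1 provided" notes in the log.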



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2076) Implement maintenance primitives in the Master.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2076:
-
Sprint: Mesosphere Sprint 14
Labels: mesosphere twitter  (was: twitter)

 Implement maintenance primitives in the Master.
 ---

 Key: MESOS-2076
 URL: https://issues.apache.org/jira/browse/MESOS-2076
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter

 The master will need to do a number of things to implement the maintenance 
 primitives:
 # For slaves that have a maintenance window:
 #* For unused resources, offers must be augmented with an Unavailability.
 #* For used resources, inverse offers must be sent.
 # For inverse offers that are declined, we must filter these before sending 
 them again. We must also store the decline reason, while guarding against OOMing. 
 #* My hunch is that we'll not want to persist the reasons in the initial 
 approach.
 # When the drain window is reached, we'll make a binary decision as to 
 whether the slave was drained, based on whether it was empty.
 #* If drained, we deactivate this slave and store the fact that it was 
 drained.
 #* If not drained, we leave this slave activated.
 # Recover the maintenance information upon failover.
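
 The notification and drain-window rules in the list above can be sketched as 
 follows. SlaveState, Notice, noticeFor() and onDrainWindowReached() are 
 hypothetical names chosen for illustration; the actual master data 
 structures are not described in this ticket.

```cpp
#include <cstddef>

// Hypothetical, simplified view of a slave as seen by the master.
struct SlaveState
{
  std::size_t runningTasks;  // Number of tasks still running on the slave.
  bool active;               // Whether the slave is currently activated.
  bool drained;              // Recorded outcome of the drain decision.
};

// What the master sends for a resource on a slave with a maintenance window.
enum Notice
{
  OFFER_WITH_UNAVAILABILITY,  // Unused resources: augmented offer.
  INVERSE_OFFER               // Used resources: inverse offer.
};

Notice noticeFor(bool resourceInUse)
{
  return resourceInUse ? INVERSE_OFFER : OFFER_WITH_UNAVAILABILITY;
}

// Binary decision taken when the drain window is reached: an empty slave
// counts as drained and is deactivated; otherwise it is left activated.
SlaveState onDrainWindowReached(SlaveState slave)
{
  if (slave.runningTasks == 0) {
    slave.active = false;
    slave.drained = true;
  } else {
    slave.drained = false;
  }
  return slave;
}
```

 Persisting the decision and recovering it on failover are deliberately left 
 out of this sketch.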





[jira] [Updated] (MESOS-3003) Support mounting in default configuration files/volumes into every new container

2015-07-07 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-3003:

Component/s: containerization

 Support mounting in default configuration files/volumes into every new 
 container
 

 Key: MESOS-3003
 URL: https://issues.apache.org/jira/browse/MESOS-3003
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Timothy Chen
  Labels: mesosphere

 Most container images leave out system configuration (e.g., /etc/*) and expect 
 the container runtime to mount in specific configuration files, such as 
 /etc/resolv.conf, from the host into the container as needed.
 We need to support mounting in specific configuration files for the command 
 executor to work, and also allow the user to optionally define other 
 configuration files to mount in via flags.





[jira] [Updated] (MESOS-3004) Support running the command executor with provisioned image for running a task in a container

2015-07-07 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-3004:

Component/s: containerization

 Support running the command executor with provisioned image for running a 
 task in a container
 -

 Key: MESOS-3004
 URL: https://issues.apache.org/jira/browse/MESOS-3004
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Timothy Chen
  Labels: mesosphere

 The Mesos Containerizer uses the command executor to actually launch the 
 user-defined command, and the command executor can then communicate with the 
 slave about the process lifecycle.
 When we provision a new container with the user-specified image, we also need 
 to be able to run the command executor in the container to support the same 
 semantics.
 One approach is to dynamically mount in a static binary of the command 
 executor with all its dependencies in a special directory, so it doesn't 
 interfere with the provisioned root filesystem, and configure the Mesos 
 Containerizer to run the command executor in that directory.





[jira] [Created] (MESOS-3009) Reproduce systemd cgroup behavior

2015-07-07 Thread Artem Harutyunyan (JIRA)
Artem Harutyunyan created MESOS-3009:


 Summary: Reproduce systemd cgroup behavior 
 Key: MESOS-3009
 URL: https://issues.apache.org/jira/browse/MESOS-3009
 Project: Mesos
  Issue Type: Task
Reporter: Artem Harutyunyan
Assignee: Joris Van Remoortere


It has been noticed before that systemd reorganizes the cgroup hierarchy created by 
the mesos slave. Because of this, mesos is no longer able to find the cgroup, and 
there is also a chance of undoing the isolation that the mesos slave puts in place. 





[jira] [Updated] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3002:

Labels: mesosphere  (was: )

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.24.0
Reporter: Paul Brett
Assignee: Joris Van Remoortere
Priority: Blocker
  Labels: mesosphere
 Fix For: 0.24.0


 Renaming Option<T>::get(const T& _t) to getOrElse() breaks the network 
 isolator. Building with '../configure --with-network-isolator' generates the 
 following error:
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp: In static 
 member function 'static Try<mesos::slave::Isolator*> 
 mesos::internal::slave::PortMappingIsolatorProcess::create(const 
 mesos::internal::slave::Flags&)':
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
 error: no matching function for call to 'Option<std::basic_string<char> 
 >::get(const char [1]) const'
        flags.resources.get(""),
                             ^
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
 note: candidates are:
 In file included from 
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp:26:0,
                  from ../../3rdparty/libprocess/include/process/check.hpp:19,
                  from ../../3rdparty/libprocess/include/process/collect.hpp:7,
                  from 
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:30:
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: 
 note: const T& Option<T>::get() const [with T = std::basic_string<char>]
    const T& get() const { assert(isSome()); return t; }
             ^
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: 
 note:   candidate expects 0 arguments, 1 provided
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: 
 note: T& Option<T>::get() [with T = std::basic_string<char>]
    T& get() { assert(isSome()); return t; }
       ^
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: 
 note:   candidate expects 0 arguments, 1 provided
 make[2]: *** 
 [slave/containerizer/isolators/network/libmesos_no_3rdparty_la-port_mapping.lo]
  Error 1
 make[2]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
 make[1]: *** [check] Error 2
 make[1]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
 make: *** [check-recursive] Error 1





[jira] [Commented] (MESOS-2943) mesos fails to compile under mac when libssl and libevent are enabled

2015-07-07 Thread Benjamin Hindman (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617501#comment-14617501
 ] 

Benjamin Hindman commented on MESOS-2943:
-

commit 971583522b3ada19f91d58fd89c0d3d17f5fef34
Author: Joris Van Remoortere joris.van.remoort...@gmail.com
Date:   Tue Jul 7 14:54:56 2015 -0700

MESOS-2943: Add comment for explicit return type.

Review: https://reviews.apache.org/r/36267

 mesos fails to compile under mac when libssl and libevent are enabled
 -

 Key: MESOS-2943
 URL: https://issues.apache.org/jira/browse/MESOS-2943
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Affects Versions: 0.23.0
Reporter: Artem Harutyunyan
Assignee: Joris Van Remoortere
Priority: Blocker
  Labels: mesosphere
 Fix For: 0.23.0


 ../configure --enable-debug --enable-libevent --enable-ssl && make
 produces the following error:
 poll.cpp' || echo '../../../3rdparty/libprocess/'`src/libevent_poll.cpp
 libtool: compile:  g++ -DPACKAGE_NAME=\"libprocess\" 
 -DPACKAGE_TARNAME=\"libprocess\" -DPACKAGE_VERSION=\"0.0.1\" 
 -DPACKAGE_STRING=\"libprocess 0.0.1\" -DPACKAGE_BUGREPORT=\"\" 
 -DPACKAGE_URL=\"\" -DPACKAGE=\"libprocess\" -DVERSION=\"0.0.1\" 
 -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 
 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 
 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" 
 -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 
 -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 
 -DHAVE_LIBCURL=1 -DHAVE_EVENT2_EVENT_H=1 -DHAVE_LIBEVENT=1 
 -DHAVE_EVENT2_THREAD_H=1 -DHAVE_LIBEVENT_PTHREADS=1 -DHAVE_OPENSSL_SSL_H=1 
 -DHAVE_LIBSSL=1 -DHAVE_LIBCRYPTO=1 -DHAVE_EVENT2_BUFFEREVENT_SSL_H=1 
 -DHAVE_LIBEVENT_OPENSSL=1 -DUSE_SSL_SOCKET=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 
 -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBDL=1 -I. 
 -I../../../3rdparty/libprocess -I../../../3rdparty/libprocess/include 
 -I../../../3rdparty/libprocess/3rdparty/stout/include -I3rdparty/boost-1.53.0 
 -I3rdparty/libev-4.15 -I3rdparty/picojson-4f93734 -I3rdparty/glog-0.3.3/src 
 -I3rdparty/ry-http-parser-1c3624a -I/usr/local/opt/openssl/include 
 -I/usr/local/opt/libevent/include 
 -I/usr/local/opt/subversion/include/subversion-1 -I/usr/include/apr-1 
 -I/usr/include/apr-1.0 -g1 -O0 -std=c++11 -stdlib=libc++ 
 -DGTEST_USE_OWN_TR1_TUPLE=1 -MT libprocess_la-libevent_poll.lo -MD -MP -MF 
 .deps/libprocess_la-libevent_poll.Tpo -c 
 ../../../3rdparty/libprocess/src/libevent_poll.cpp  -fno-common -DPIC -o 
 libprocess_la-libevent_poll.o
 mv -f .deps/libprocess_la-socket.Tpo .deps/libprocess_la-socket.Plo
 mv -f .deps/libprocess_la-subprocess.Tpo .deps/libprocess_la-subprocess.Plo
 mv -f .deps/libprocess_la-libevent.Tpo .deps/libprocess_la-libevent.Plo
 mv -f .deps/libprocess_la-metrics.Tpo .deps/libprocess_la-metrics.Plo
 In file included from 
 ../../../3rdparty/libprocess/src/libevent_ssl_socket.cpp:11:
 In file included from 
 ../../../3rdparty/libprocess/include/process/queue.hpp:9:
 ../../../3rdparty/libprocess/include/process/future.hpp:849:7: error: no 
 viable conversion from 'const process::Future<const 
 process::Future<process::network::Socket> >' to 'const 
 process::network::Socket'
       set(u);
       ^
 ../../../3rdparty/libprocess/src/libevent_ssl_socket.cpp:769:10: note: in 
 instantiation of function template specialization 
 'process::Future<process::network::Socket>::Future<process::Future<const 
 process::Future<process::network::Socket> > >' requested here
   return accept_queue.get()
          ^
 ../../../3rdparty/libprocess/include/process/socket.hpp:21:7: note: candidate 
 constructor (the implicit move constructor) not viable: no known conversion 
 from 'const process::Future<const process::Future<process::network::Socket> 
 >' to
       'process::network::Socket &&' for 1st argument
 class Socket
       ^
 ../../../3rdparty/libprocess/include/process/socket.hpp:21:7: note: candidate 
 constructor (the implicit copy constructor) not viable: no known conversion 
 from 'const process::Future<const process::Future<process::network::Socket> 
 >' to
       'const process::network::Socket &' for 1st argument
 class Socket
       ^
 ../../../3rdparty/libprocess/include/process/future.hpp:411:21: note: passing 
 argument to parameter '_t' here
   bool set(const T& _t);
                     ^
 1 error generated.
 make[4]: *** [libprocess_la-libevent_ssl_socket.lo] Error 1
 make[4]: *** Waiting for unfinished jobs
 mv -f .deps/libprocess_la-libevent_poll.Tpo 
 .deps/libprocess_la-libevent_poll.Plo
 mv -f .deps/libprocess_la-openssl.Tpo .deps/libprocess_la-openssl.Plo
 mv -f .deps/libprocess_la-process.Tpo .deps/libprocess_la-process.Plo
 make[3]: *** 

[jira] [Updated] (MESOS-2061) Add InverseOffer protobuf message.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2061:
-
Sprint: Mesosphere Sprint 14
Labels: mesosphere twitter  (was: twitter)

 Add InverseOffer protobuf message.
 --

 Key: MESOS-2061
 URL: https://issues.apache.org/jira/browse/MESOS-2061
 Project: Mesos
  Issue Type: Task
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter

 InverseOffer was defined as part of the maintenance work in MESOS-1474, 
 design doc here: 
 https://docs.google.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit?usp=sharing
 {code}
 // A request to deallocate or return any resources already
 // being consumed by the framework.
 message InverseOffer {
   required OfferID id = 1;
   required FrameworkID framework_id = 2;
   repeated Resource resources = 3;

   // The slave ID if the resources need to be released on a particular slave.
   optional SlaveID slave_id = 4;

   // The executor and task IDs if the resources need to be released on
   // specific executors and/or tasks.
   optional ExecutorID executor_id = 5;
   repeated TaskID task_ids = 6;

   // The resources specified in this offer will become unavailable
   // at the specified start time and for the specified duration. Any
   // tasks running using these resources might get killed when
   // these resources become unavailable.
   required Unavailability unavailability = 7;
 }
 {code}
 This ticket captures the addition of the InverseOffer protobuf to 
 mesos.proto; the necessary API changes for Event/Call and the language 
 bindings will be tracked separately.





[jira] [Updated] (MESOS-2076) Implement maintenance primitives in the Master.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2076:
-
Fix Version/s: 0.24.0

 Implement maintenance primitives in the Master.
 ---

 Key: MESOS-2076
 URL: https://issues.apache.org/jira/browse/MESOS-2076
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter
 Fix For: 0.24.0


 The master will need to do a number of things to implement the maintenance 
 primitives:
 # For slaves that have a maintenance window:
 #* For unused resources, offers must be augmented with an Unavailability.
 #* For used resources, inverse offers must be sent.
 # For inverse offers that are declined, we must filter these before sending 
 them again. We must also store the decline reason, while guarding against OOMing. 
 #* My hunch is that we'll not want to persist the reasons in the initial 
 approach.
 # When the drain window is reached, we'll make a binary decision as to 
 whether the slave was drained, based on whether it was empty.
 #* If drained, we deactivate this slave and store the fact that it was 
 drained.
 #* If not drained, we leave this slave activated.
 # Recover the maintenance information upon failover.





[jira] [Updated] (MESOS-2061) Add InverseOffer protobuf message.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2061:
-
Fix Version/s: 0.24.0

 Add InverseOffer protobuf message.
 --

 Key: MESOS-2061
 URL: https://issues.apache.org/jira/browse/MESOS-2061
 Project: Mesos
  Issue Type: Task
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter
 Fix For: 0.24.0


 InverseOffer was defined as part of the maintenance work in MESOS-1474, 
 design doc here: 
 https://docs.google.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit?usp=sharing
 {code}
 // A request to deallocate or return any resources already
 // being consumed by the framework.
 message InverseOffer {
   required OfferID id = 1;
   required FrameworkID framework_id = 2;
   repeated Resource resources = 3;

   // The slave ID if the resources need to be released on a particular slave.
   optional SlaveID slave_id = 4;

   // The executor and task IDs if the resources need to be released on
   // specific executors and/or tasks.
   optional ExecutorID executor_id = 5;
   repeated TaskID task_ids = 6;

   // The resources specified in this offer will become unavailable
   // at the specified start time and for the specified duration. Any
   // tasks running using these resources might get killed when
   // these resources become unavailable.
   required Unavailability unavailability = 7;
 }
 {code}
 This ticket captures the addition of the InverseOffer protobuf to 
 mesos.proto; the necessary API changes for Event/Call and the language 
 bindings will be tracked separately.





[jira] [Created] (MESOS-3010) Review design document for maintenance primitives

2015-07-07 Thread Artem Harutyunyan (JIRA)
Artem Harutyunyan created MESOS-3010:


 Summary: Review design document for maintenance primitives 
 Key: MESOS-3010
 URL: https://issues.apache.org/jira/browse/MESOS-3010
 Project: Mesos
  Issue Type: Task
Reporter: Artem Harutyunyan
Priority: Blocker


Following a suggestion from [~bmahler] we should review the design document [0] 
for maintenance primitives and consider adding support for explicit 
acknowledgement from frameworks. 

[0] - 
https://docs.google.com/document/d/1CIoOnBLFiEvmhOe-h_s8M4m9Qa7BLETuj_dSNJW959U





[jira] [Updated] (MESOS-1474) Provide cluster maintenance primitives for operators.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-1474:
-
Labels: mesosphere twitter  (was: twitter)

 Provide cluster maintenance primitives for operators.
 -

 Key: MESOS-1474
 URL: https://issues.apache.org/jira/browse/MESOS-1474
 Project: Mesos
  Issue Type: Epic
  Components: framework, master, slave
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter

 Sometimes operators need to perform maintenance on a mesos cluster; we define 
 maintenance here as anything that requires the tasks to be drained on the 
 slave(s). Most mesos upgrades can be done without affecting running tasks, 
 but there are situations where maintenance is task-affecting:
 * Host maintenance (e.g. hardware repair, kernel upgrades).
 * Non-recoverable slave upgrades (e.g. adjusting slave attributes).
 * etc
 In order to ensure operators don’t violate frameworks’ SLAs, schedulers need 
 to be aware of planned unavailability events.
 Maintenance awareness allows schedulers to avoid churn for long running tasks 
 by placing them on machines not undergoing maintenance. If all resources are 
 planned for maintenance, then the scheduler will prefer machines scheduled 
 for maintenance least imminently.
 Maintenance awareness is also crucial when a scheduler uses [persistent 
 disk|https://issues.apache.org/jira/browse/MESOS-1554] resources, to ensure 
 that the scheduler is aware of the expected duration of unavailability for a 
 persistent disk resource (e.g., with three 1TB replicas, there is no need to 
 replicate 1TB over the network when only one of the three replicas is going 
 to be unavailable for a reboot (< 1 hour)).
 There are a few primitives of interest here:
 * Provide a way for operators to [fully shut down a 
 slave|https://issues.apache.org/jira/browse/MESOS-1475] (killing all tasks 
 underneath it). This is colloquially known as a hard drain.
 * Provide a way for operators to mark specific slaves as scheduled for 
 maintenance. This will inform the scheduler about the scheduled 
 unavailability of the resources.
 * Provide a way for frameworks to be notified when resources are requested to 
 be relinquished. This gives the framework a chance to proactively move a task 
 before it may be forcibly killed by an operator. It also allows the automation 
 of operations like: please drain these slaves within 1 hour.
 See the [design 
 doc|https://docs.google.com/a/twitter.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit#]
  for the latest details.
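
 The placement preference described above (avoid machines undergoing 
 maintenance; if every machine has maintenance planned, prefer the one 
 scheduled least imminently) can be sketched as follows. Candidate and 
 pickCandidate() are hypothetical names for illustration only; a real 
 scheduler would weigh many other factors.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of one machine a scheduler could place a task on.
struct Candidate
{
  bool hasMaintenance;                // Is maintenance planned at all?
  std::int64_t maintenanceStartSecs;  // Start time; only meaningful if planned.
};

// Returns the index of the preferred candidate, or -1 if there are none.
// A machine without planned maintenance always wins (the first such machine
// is kept); otherwise the machine whose maintenance starts latest wins.
int pickCandidate(const std::vector<Candidate>& candidates)
{
  int best = -1;
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    const Candidate& c = candidates[i];
    if (best < 0) {
      best = static_cast<int>(i);
      continue;
    }
    const Candidate& b = candidates[static_cast<std::size_t>(best)];
    if (!b.hasMaintenance) {
      continue;  // Already holding a maintenance-free machine.
    }
    if (!c.hasMaintenance || c.maintenanceStartSecs > b.maintenanceStartSecs) {
      best = static_cast<int>(i);
    }
  }
  return best;
}
```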





[jira] [Updated] (MESOS-2061) Add InverseOffer protobuf message.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2061:
-
Target Version/s: 0.24.0
   Fix Version/s: (was: 0.24.0)

 Add InverseOffer protobuf message.
 --

 Key: MESOS-2061
 URL: https://issues.apache.org/jira/browse/MESOS-2061
 Project: Mesos
  Issue Type: Task
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter

 InverseOffer was defined as part of the maintenance work in MESOS-1474, 
 design doc here: 
 https://docs.google.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit?usp=sharing
 {code}
 // A request to deallocate or return any resources already
 // being consumed by the framework.
 message InverseOffer {
   required OfferID id = 1;
   required FrameworkID framework_id = 2;
   repeated Resource resources = 3;

   // The slave ID if the resources need to be released on a particular slave.
   optional SlaveID slave_id = 4;

   // The executor and task IDs if the resources need to be released on
   // specific executors and/or tasks.
   optional ExecutorID executor_id = 5;
   repeated TaskID task_ids = 6;

   // The resources specified in this offer will become unavailable
   // at the specified start time and for the specified duration. Any
   // tasks running using these resources might get killed when
   // these resources become unavailable.
   required Unavailability unavailability = 7;
 }
 {code}
 This ticket captures the addition of the InverseOffer protobuf to 
 mesos.proto; the necessary API changes for Event/Call and the language 
 bindings will be tracked separately.





[jira] [Updated] (MESOS-2075) Add maintenance information to the replicated registry.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2075:
-
Target Version/s: 0.24.0
   Fix Version/s: (was: 0.24.0)

 Add maintenance information to the replicated registry.
 ---

 Key: MESOS-2075
 URL: https://issues.apache.org/jira/browse/MESOS-2075
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter

 To achieve fault-tolerance for the maintenance primitives, we will need to 
 add the maintenance information to the registry.
 The registry currently stores all of the slave information, which is quite 
 large (~ 17MB for 50,000 slaves from my testing), which results in a protobuf 
 object that is extremely expensive to copy.
 As far as I can tell, reads / writes to maintenance information are 
 independent of reads / writes to the existing 'registry' information. So 
 there are two approaches here:
 h4. Add maintenance information to 'maintenance' key:
 # The advantage of this approach is that we don't further grow the large 
 Registry object.
 # This approach assumes that writes to 'maintenance' are independent of 
 writes to the 'registry'. If these writes are not independent, this approach 
 requires that we add transactional support to the State abstraction.
 # This approach requires adding compaction to LogStorage.
 # This approach likely requires some refactoring to the Registrar.
 h4. Add maintenance information to 'registry' key:
 # The advantage of this approach is that it's the easiest to implement.
 # This will further grow the single 'registry' object, but doesn't preclude 
 it being split apart in the future.
 # This approach may require using the diff support in LogStorage and/or 
 adding compression support to LogStorage snapshots to deal with the increased 
 size of the registry.





[jira] [Updated] (MESOS-2076) Implement maintenance primitives in the Master.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2076:
-
Target Version/s: 0.24.0
   Fix Version/s: (was: 0.24.0)

 Implement maintenance primitives in the Master.
 ---

 Key: MESOS-2076
 URL: https://issues.apache.org/jira/browse/MESOS-2076
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter

 The master will need to do a number of things to implement the maintenance 
 primitives:
 # For slaves that have a maintenance window:
 #* For unused resources, offers must be augmented with an Unavailability.
 #* For used resources, inverse offers must be sent.
 # For inverse offers that are declined, we must filter these before sending 
 them again. We must also store the decline reason, while guarding against OOMing. 
 #* My hunch is that we'll not want to persist the reasons in the initial 
 approach.
 # When the drain window is reached, we'll make a binary decision as to 
 whether the slave was drained, based on whether it was empty.
 #* If drained, we deactivate this slave and store the fact that it was 
 drained.
 #* If not drained, we leave this slave activated.
 # Recover the maintenance information upon failover.





[jira] [Updated] (MESOS-2061) Add InverseOffer protobuf message.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2061:
-
Assignee: Joseph Wu

 Add InverseOffer protobuf message.
 --

 Key: MESOS-2061
 URL: https://issues.apache.org/jira/browse/MESOS-2061
 Project: Mesos
  Issue Type: Task
Reporter: Benjamin Mahler
Assignee: Joseph Wu
  Labels: mesosphere, twitter

 InverseOffer was defined as part of the maintenance work in MESOS-1474, 
 design doc here: 
 https://docs.google.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit?usp=sharing
 {code}
 // A request to deallocate or return any resources already
 // being consumed by the framework.
 message InverseOffer {
   required OfferID id = 1;
   required FrameworkID framework_id = 2;
   repeated Resource resources = 3;

   // The slave ID if the resources need to be released on a particular slave.
   optional SlaveID slave_id = 4;

   // The executor and task IDs if the resources need to be released on
   // specific executors and/or tasks.
   optional ExecutorID executor_id = 5;
   repeated TaskID task_ids = 6;

   // The resources specified in this offer will become unavailable
   // at the specified start time and for the specified duration. Any
   // tasks running using these resources might get killed when
   // these resources become unavailable.
   required Unavailability unavailability = 7;
 }
 {code}
 This ticket captures the addition of the InverseOffer protobuf to 
 mesos.proto; the necessary API changes for Event/Call and the language 
 bindings will be tracked separately.





[jira] [Updated] (MESOS-2066) Add optional 'Unavailability' to resource offers to provide maintenance awareness.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2066:
-
Labels: mesosphere twitter  (was: twitter)

 Add optional 'Unavailability' to resource offers to provide maintenance 
 awareness.
 --

 Key: MESOS-2066
 URL: https://issues.apache.org/jira/browse/MESOS-2066
 Project: Mesos
  Issue Type: Task
Reporter: Benjamin Mahler
Assignee: Joseph Wu
  Labels: mesosphere, twitter

 In order to inform frameworks about upcoming maintenance on offered 
 resources, per MESOS-1474, we'd like to add optional 'Unavailability' 
 information to offers:
 {code}
 message Unavailability {
   required Time start = 1;
   // The approximate duration of the unavailability,
   // if this is a transient unavailability.
   optional Duration duration = 2;
 }
 message Offer {
   required OfferID id = 1;
   required FrameworkID framework_id = 2;
   required SlaveID slave_id = 3;
   required string hostname = 4;
   repeated Resource resources = 5;
   repeated Attribute attributes = 7;
   repeated ExecutorID executor_ids = 6;
  
   // The resources specified in this offer will become unavailable
   // at the specified start time and for the specified duration. Any
   // tasks launched using these resources might get killed when
   // these resources become unavailable.
   optional Unavailability unavailability = 8;
 }
 {code}
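
 A framework could use the proposed 'Unavailability' information roughly as 
 in the sketch below. The struct is a hand-written mirror of the message 
 above, not generated protobuf code, and times are plain seconds for 
 simplicity. Treating a missing duration as open-ended unavailability is an 
 assumption of this sketch, as are the helper names.

```cpp
#include <cstdint>

// Hand-written mirror of the proposed Unavailability message (assumption:
// times are expressed as seconds since some common epoch).
struct Unavailability
{
  std::int64_t startSecs;     // Mirrors 'required Time start'.
  bool hasDuration;           // Whether the optional duration is set.
  std::int64_t durationSecs;  // Mirrors 'optional Duration duration'.
};

// True if a task launched now is expected to finish before the resources
// become unavailable.
bool finishesBeforeUnavailability(
    std::int64_t nowSecs,
    std::int64_t expectedRuntimeSecs,
    const Unavailability& u)
{
  return nowSecs + expectedRuntimeSecs <= u.startSecs;
}

// True if the resources are expected to be unavailable at time 't'.
// Without a duration the unavailability is treated as open-ended.
bool isUnavailableAt(std::int64_t t, const Unavailability& u)
{
  if (t < u.startSecs) {
    return false;
  }
  if (!u.hasDuration) {
    return true;
  }
  return t < u.startSecs + u.durationSecs;
}
```

 A scheduler might accept such an offer for short-lived tasks while declining 
 it for long-running ones.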





[jira] [Updated] (MESOS-3010) Review design document for maintenance primitives

2015-07-07 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-3010:
---
Description: 
Following a suggestion from [~bmahler] we should review the design document [0] 
for maintenance primitives and consider adding support for explicit 
acknowledgement from frameworks. 

[0] - 
https://docs.google.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit

  was:
Following a suggestion from [~bmahler] we should review the design document [0] 
for maintenance primitives and consider adding support for explicit 
acknowledgement from frameworks. 

[0] - 
https://docs.google.com/document/d/1CIoOnBLFiEvmhOe-h_s8M4m9Qa7BLETuj_dSNJW959U


 Review design document for maintenance primitives 
 --

 Key: MESOS-3010
 URL: https://issues.apache.org/jira/browse/MESOS-3010
 Project: Mesos
  Issue Type: Task
Reporter: Artem Harutyunyan
Priority: Blocker

 Following a suggestion from [~bmahler] we should review the design document 
 [0] for maintenance primitives and consider adding support for explicit 
 acknowledgement from frameworks. 
 [0] - 
 https://docs.google.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit





[jira] [Commented] (MESOS-3005) SSL tests can fail depending on hostname configuration

2015-07-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617587#comment-14617587
 ] 

Adam B commented on MESOS-3005:
---

commit 13a4e81dfeb9ed5515a80c2071c7fcbb696d3450
Author: Joris Van Remoortere joris.van.remoort...@gmail.com
Date:   Tue Jul 7 15:53:40 2015 -0700

SSL: Fix connection issue on OSX.

Using the protocol based size for the `connect()` argument.

Review: https://reviews.apache.org/r/36246

 SSL tests can fail depending on hostname configuration
 --

 Key: MESOS-3005
 URL: https://issues.apache.org/jira/browse/MESOS-3005
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Reporter: Joris Van Remoortere
Assignee: Joris Van Remoortere
Priority: Blocker
  Labels: libevent, mesosphere, ssl, tests
 Fix For: 0.23.0


 Depending on how /etc/hosts is configured, the SSL tests can fail with a bad 
 hostname match for the certificate.
 We can avoid this by explicitly matching the hostname for the certificate to 
 the IP that will be used during the test.





[jira] [Commented] (MESOS-2996) Failing Docker tests on CentOS Linux release 7.1.1503.

2015-07-07 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617718#comment-14617718
 ] 

Timothy Chen commented on MESOS-2996:
-

commit d959ea4359f1105ad6ad6dc59f49bf0ed5a6bb56
Author: Timothy Chen tnac...@apache.org
Date:   Tue Jul 7 14:40:34 2015 -0700

Remove os environment for docker executor environment setup.

Review: https://reviews.apache.org/r/36282

 Failing Docker tests on CentOS Linux release 7.1.1503.
 --

 Key: MESOS-2996
 URL: https://issues.apache.org/jira/browse/MESOS-2996
 Project: Mesos
  Issue Type: Bug
Reporter: Joerg Schad
Assignee: Timothy Chen
Priority: Blocker
  Labels: mesosphere

 With Mesos 0.23 rc1 several tests fail on CentOS Linux release 7.1 (will add 
 more detail shortly).





[jira] [Updated] (MESOS-2076) Implement maintenance primitives in the Master.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2076:
-
Sprint:   (was: Mesosphere Sprint 14)

 Implement maintenance primitives in the Master.
 ---

 Key: MESOS-2076
 URL: https://issues.apache.org/jira/browse/MESOS-2076
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter

 The master will need to do a number of things to implement the maintenance 
 primitives:
 # For slaves that have a maintenance window:
 #* For unused resources, offers must be augmented with an Unavailability.
 #* For used resources, inverse offers must be sent.
 # For inverse offers that are declined, we must filter these before sending 
 them again. We must also store the decline reasons while guarding against 
 running out of memory (OOM). 
 #* My hunch is that we'll not want to persist the reasons in the initial 
 approach.
 # When the drain window is reached, we'll make a binary decision as to 
 whether the slave was drained, based on whether it was empty.
 #* If drained, we deactivate this slave and store the fact that it was 
 drained.
 #* If not drained, we leave this slave activated.
 # Recover the maintenance information upon failover.
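
A rough sketch of steps 1 and 3 above, written in Python purely for illustration. The `Slave` class, its fields, and the tuple-shaped messages are hypothetical and are not the master's actual data structures; decline filtering (step 2) and recovery (step 4) are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    slave_id: str
    window_start: float                          # hypothetical: start of the maintenance window
    used: dict = field(default_factory=dict)     # resources held by running tasks
    unused: dict = field(default_factory=dict)   # resources currently free
    active: bool = True
    drained: bool = False


def plan_messages(slave):
    """Step 1: unused resources get offers carrying an unavailability,
    used resources get inverse offers."""
    unavailability = {"start": slave.window_start}
    messages = []
    if slave.unused:
        messages.append(("offer", slave.slave_id, slave.unused, unavailability))
    if slave.used:
        messages.append(("inverse_offer", slave.slave_id, slave.used, unavailability))
    return messages


def on_drain_window(slave):
    """Step 3: binary decision, the slave counts as drained only if it is empty."""
    if not slave.used:
        slave.drained = True     # store the fact that it was drained
        slave.active = False     # deactivate the slave
    return slave.drained
```

A slave that still runs tasks when the window is reached stays activated, which mirrors the "if not drained" branch in the list.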





[jira] [Updated] (MESOS-3008) Libevent SSL doesn't use EPOLL

2015-07-07 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3008:

Labels: libevent libprocess mesosphere ssl  (was: libevent libprocess ssl)

 Libevent SSL doesn't use EPOLL
 --

 Key: MESOS-3008
 URL: https://issues.apache.org/jira/browse/MESOS-3008
 Project: Mesos
  Issue Type: Improvement
  Components: libprocess
Affects Versions: 0.23.0
Reporter: Joris Van Remoortere
Assignee: Joris Van Remoortere
  Labels: libevent, libprocess, mesosphere, ssl

 We currently disable epoll in libevent to allow SSL to work.
 It would be more scalable if we didn't have to do that.





[jira] [Updated] (MESOS-3009) Reproduce systemd cgroup behavior

2015-07-07 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3009:

Labels: mesosphere  (was: )

 Reproduce systemd cgroup behavior 
 --

 Key: MESOS-3009
 URL: https://issues.apache.org/jira/browse/MESOS-3009
 Project: Mesos
  Issue Type: Task
Reporter: Artem Harutyunyan
Assignee: Joris Van Remoortere
  Labels: mesosphere

 It has been noticed before that systemd reorganizes the cgroup hierarchy 
 created by the Mesos slave. Because of this, Mesos is no longer able to find 
 the cgroup, and there is also a chance of undoing the isolation that the 
 Mesos slave puts in place. 





[jira] [Updated] (MESOS-2066) Add optional 'Unavailability' to resource offers to provide maintenance awareness.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2066:
-
Assignee: Joseph Wu

 Add optional 'Unavailability' to resource offers to provide maintenance 
 awareness.
 --

 Key: MESOS-2066
 URL: https://issues.apache.org/jira/browse/MESOS-2066
 Project: Mesos
  Issue Type: Task
Reporter: Benjamin Mahler
Assignee: Joseph Wu
  Labels: mesosphere, twitter

 In order to inform frameworks about upcoming maintenance on offered 
 resources, per MESOS-1474, we'd like to add optional 'Unavailability' 
 information to offers:
 {code}
 message Unavailability {
   required Time start = 1;
   // The approximate duration of the unavailability,
   // if this is a transient unavailability.
   optional Duration duration = 2;
 }
 message Offer {
   required OfferID id = 1;
   required FrameworkID framework_id = 2;
   required SlaveID slave_id = 3;
   required string hostname = 4;
   repeated Resource resources = 5;
   repeated Attribute attributes = 7;
   repeated ExecutorID executor_ids = 6;
  
   // The resources specified in this offer will become unavailable
   // at the specified start time and for the specified duration. Any
   // tasks launched using these resources might get killed when
   // these resources become unavailable.
   optional Unavailability unavailability = 8;
 }
 {code}
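
As an illustration of how a framework might consume this field, here is a minimal Python sketch. It assumes the unavailability arrives as a plain dict with `start` and an optional `duration` in seconds; both helper names are hypothetical and nothing here is part of the Mesos API.

```python
def fits_before_unavailability(now, task_seconds, unavailability=None):
    """Return True if a task started at `now` would finish before the window starts."""
    if unavailability is None:          # the field is optional in the proposal
        return True
    return now + task_seconds <= unavailability["start"]


def available_again_at(unavailability):
    """Return when the resources come back, or None if no duration was given."""
    duration = unavailability.get("duration")
    if duration is None:                # not a transient unavailability
        return None
    return unavailability["start"] + duration
```

A scheduler could use the first helper to avoid placing long-running tasks on resources that are about to go away.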





[jira] [Updated] (MESOS-2997) SSL connection failure causes failed CHECK.

2015-07-07 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-2997:

Labels: libprocess mesosphere ssl  (was: libprocess ssl)

 SSL connection failure causes failed CHECK.
 ---

 Key: MESOS-2997
 URL: https://issues.apache.org/jira/browse/MESOS-2997
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Reporter: Joris Van Remoortere
Assignee: Joris Van Remoortere
Priority: Blocker
  Labels: libprocess, mesosphere, ssl

 {code}
 [ RUN  ] SSLTest.BasicSameProcess
 F0706 18:32:28.465451 238583808 libevent_ssl_socket.cpp:507] Check failed: 
 'self->bev' Must be non NULL
 {code}





[jira] [Updated] (MESOS-2998) Disable Persistent Volumes, Dynamic Reservations via master flags

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2998:
--
Target Version/s: 0.24.0  (was: 0.23.0)

 Disable Persistent Volumes, Dynamic Reservations via master flags
 -

 Key: MESOS-2998
 URL: https://issues.apache.org/jira/browse/MESOS-2998
 Project: Mesos
  Issue Type: Improvement
  Components: master
Affects Versions: 0.23.0
Reporter: Adam B
Assignee: Michael Park
  Labels: mesosphere, persistence, reservations, volumes

 As an operator, I might not want frameworks using the experimental dynamic 
 reservations/persistent volumes APIs in 0.23, since there are no ACLs or 
 operator endpoints for me to manage them. That means that a rogue framework 
 could start reserving resources and creating volumes with all resources 
 provided, and I would have no way to clean them up.
 Is it possible to disable these features from the master (flags, etc.)?





[jira] [Updated] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2199:
--
Target Version/s: 0.24.0  (was: 0.23.0)

 Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ---

 Key: MESOS-2199
 URL: https://issues.apache.org/jira/browse/MESOS-2199
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Ian Downes
Assignee: haosdent
  Labels: mesosphere

 Appears that running the executor as {{nobody}} is not supported.
 [~nnielsen] can you take a look?
 Executor log:
 {noformat}
 [root@hostname build]# cat 
 /tmp/SlaveTest_ROOT_RunTaskWithCommandInfoWithUser_cxF1dY/slaves/20141219-005206-2081170186-60487-11862-S0/frameworks/20141219-005206-2081170186-60
 487-11862-/executors/1/runs/latest/std*
 sh: /home/idownes/workspace/mesos/build/src/mesos-executor: Permission denied
 {noformat}
 Test output:
 {noformat}
 [==] Running 1 test from 1 test case.
 [--] Global test environment set-up.
 [--] 1 test from SlaveTest
 [ RUN  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ../../src/tests/slave_tests.cpp:680: Failure
 Value of: statusRunning.get().state()
   Actual: TASK_FAILED
 Expected: TASK_RUNNING
 ../../src/tests/slave_tests.cpp:682: Failure
 Failed to wait 10secs for statusFinished
 ../../src/tests/slave_tests.cpp:673: Failure
 Actual function call count doesn't match EXPECT_CALL(sched, 
 statusUpdate(driver, _))...
  Expected: to be called twice
Actual: called once - unsatisfied and active
 [  FAILED  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser (10641 ms)
 [--] 1 test from SlaveTest (10641 ms total)
 [--] Global test environment tear-down
 [==] 1 test from 1 test case ran. (10658 ms total)
 {noformat}





[jira] [Commented] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617519#comment-14617519
 ] 

Adam B commented on MESOS-2199:
---

commit de13d78b7c2a87162c77e7f296784913d90901fd
Author: Adam B a...@mesosphere.io
Date:   Tue Jul 7 14:35:39 2015 -0700

Disabled ROOT_RunTaskWithCommandInfoWithUser for MESOS-2199.

 Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ---

 Key: MESOS-2199
 URL: https://issues.apache.org/jira/browse/MESOS-2199
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Ian Downes
Assignee: haosdent
  Labels: mesosphere

 Appears that running the executor as {{nobody}} is not supported.
 [~nnielsen] can you take a look?
 Executor log:
 {noformat}
 [root@hostname build]# cat 
 /tmp/SlaveTest_ROOT_RunTaskWithCommandInfoWithUser_cxF1dY/slaves/20141219-005206-2081170186-60487-11862-S0/frameworks/20141219-005206-2081170186-60
 487-11862-/executors/1/runs/latest/std*
 sh: /home/idownes/workspace/mesos/build/src/mesos-executor: Permission denied
 {noformat}
 Test output:
 {noformat}
 [==] Running 1 test from 1 test case.
 [--] Global test environment set-up.
 [--] 1 test from SlaveTest
 [ RUN  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ../../src/tests/slave_tests.cpp:680: Failure
 Value of: statusRunning.get().state()
   Actual: TASK_FAILED
 Expected: TASK_RUNNING
 ../../src/tests/slave_tests.cpp:682: Failure
 Failed to wait 10secs for statusFinished
 ../../src/tests/slave_tests.cpp:673: Failure
 Actual function call count doesn't match EXPECT_CALL(sched, 
 statusUpdate(driver, _))...
  Expected: to be called twice
Actual: called once - unsatisfied and active
 [  FAILED  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser (10641 ms)
 [--] 1 test from SlaveTest (10641 ms total)
 [--] Global test environment tear-down
 [==] 1 test from 1 test case ran. (10658 ms total)
 {noformat}





[jira] [Updated] (MESOS-2075) Add maintenance information to the replicated registry.

2015-07-07 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2075:
-
Sprint:   (was: Mesosphere Sprint 14)

 Add maintenance information to the replicated registry.
 ---

 Key: MESOS-2075
 URL: https://issues.apache.org/jira/browse/MESOS-2075
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Benjamin Mahler
  Labels: mesosphere, twitter

 To achieve fault-tolerance for the maintenance primitives, we will need to 
 add the maintenance information to the registry.
 The registry currently stores all of the slave information, which is quite 
 large (~ 17MB for 50,000 slaves from my testing), which results in a protobuf 
 object that is extremely expensive to copy.
 As far as I can tell, reads / writes to maintenance information are 
 independent of reads / writes to the existing 'registry' information. So 
 there are two approaches here:
 h4. Add maintenance information to 'maintenance' key:
 # The advantage of this approach is that we don't further grow the large 
 Registry object.
 # This approach assumes that writes to 'maintenance' are independent of 
 writes to the 'registry'. If these writes are not independent, this approach 
 requires that we add transactional support to the State abstraction.
 # This approach requires adding compaction to LogStorage.
 # This approach likely requires some refactoring to the Registrar.
 h4. Add maintenance information to 'registry' key:
 # The advantage of this approach is that it's the easiest to implement.
 # This will further grow the single 'registry' object, but doesn't preclude 
 it being split apart in the future.
 # This approach may require using the diff support in LogStorage and/or 
 adding compression support to LogStorage snapshots to deal with the increased 
 size of the registry.
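
To make the first approach concrete, here is a toy Python sketch of a versioned key/value store in which 'registry' and 'maintenance' are written independently. The `State` class below is a stand-in invented for this example; it is not the Mesos State abstraction and offers no cross-key transactions, which is exactly the limitation noted in point 2 of the first approach.

```python
class State:
    """Toy key/value store with per-key versions (illustrative only)."""

    def __init__(self):
        self._data = {}                      # key -> (version, value)

    def fetch(self, key):
        return self._data.get(key, (0, None))

    def store(self, key, expected_version, value):
        """Write only if nobody else wrote this key since we fetched it."""
        version, _ = self.fetch(key)
        if version != expected_version:
            return False                     # stale writer loses
        self._data[key] = (version + 1, value)
        return True
```

Because versions are tracked per key, an update to 'maintenance' never has to copy or rewrite the large 'registry' value.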





[jira] [Updated] (MESOS-3005) SSL tests can fail depending on hostname configuration

2015-07-07 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3005:

Labels: libevent mesosphere ssl tests  (was: libevent ssl tests)

 SSL tests can fail depending on hostname configuration
 --

 Key: MESOS-3005
 URL: https://issues.apache.org/jira/browse/MESOS-3005
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Reporter: Joris Van Remoortere
Assignee: Joris Van Remoortere
Priority: Blocker
  Labels: libevent, mesosphere, ssl, tests
 Fix For: 0.23.0


 Depending on how /etc/hosts is configured, the SSL tests can fail with a bad 
 hostname match for the certificate.
 We can avoid this by explicitly matching the hostname for the certificate to 
 the IP that will be used during the test.





[jira] [Commented] (MESOS-2991) Compilation Error on Mac OS 10.10.4 with clang 3.5.0

2015-07-07 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616609#comment-14616609
 ] 

Alexander Rukletsov commented on MESOS-2991:


Contributor: [~alex-mesos]
Initial reviewer: [~mcypark]
Final reviewer: [~adam-mesos]

 Compilation Error on Mac OS 10.10.4 with clang 3.5.0
 

 Key: MESOS-2991
 URL: https://issues.apache.org/jira/browse/MESOS-2991
 Project: Mesos
  Issue Type: Bug
  Components: stout, test
Affects Versions: 0.23.0
Reporter: Alexander Rukletsov
Assignee: Michael Park
  Labels: mesosphere

 Compiling 0.23.0 (rc1) produces compilation errors on Mac OS 10.10.4 with 
 {{g++}} based on LLVM 3.5. It looks like the issue was introduced in 
 {{a5640ad813e6256b548fca068f04fd9fa3a03eda}}, 
 https://reviews.apache.org/r/32838. In contrast to the commit message, 
 compiling the rc with gcc4.4 on CentOS worked fine for me. 
 According to 0.23 release notes and MESOS-2604, we should support clang 3.5. 
 {code}
 ../../../../../3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp:543:25: 
 error: conversion from 'void ()' to 'const Option<void (*)()>' is ambiguous
Fork(dosetsid,  // Great-great-granchild.
 ^~~~
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:40:3:
  note: candidate constructor
   Option(const T& _t) : state(SOME), t(_t) {}
   ^
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:42:3:
  note: candidate constructor
   Option(T&& _t) : state(SOME), t(std::move(_t)) {}
   ^
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:45:3:
  note: candidate constructor [with U = void ()]
   Option(const U& u) : state(SOME), t(u) {}
   ^
 {code}
 Compiler version:
 {code}
 $ g++ --version
 Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr 
 --with-gxx-include-dir=/usr/include/c++/4.2.1
 Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
 Target: x86_64-apple-darwin14.4.0
 Thread model: posix
 {code}





[jira] [Updated] (MESOS-2991) Compilation Error on Mac OS 10.10.4 with clang 3.5.0

2015-07-07 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-2991:
---
Shepherd: Adam B

 Compilation Error on Mac OS 10.10.4 with clang 3.5.0
 

 Key: MESOS-2991
 URL: https://issues.apache.org/jira/browse/MESOS-2991
 Project: Mesos
  Issue Type: Bug
  Components: stout, test
Affects Versions: 0.23.0
Reporter: Alexander Rukletsov
Assignee: Alexander Rukletsov
  Labels: mesosphere

 Compiling 0.23.0 (rc1) produces compilation errors on Mac OS 10.10.4 with 
 {{g++}} based on LLVM 3.5. It looks like the issue was introduced in 
 {{a5640ad813e6256b548fca068f04fd9fa3a03eda}}, 
 https://reviews.apache.org/r/32838. In contrast to the commit message, 
 compiling the rc with gcc4.4 on CentOS worked fine for me. 
 According to 0.23 release notes and MESOS-2604, we should support clang 3.5. 
 {code}
 ../../../../../3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp:543:25: 
 error: conversion from 'void ()' to 'const Option<void (*)()>' is ambiguous
Fork(dosetsid,  // Great-great-granchild.
 ^~~~
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:40:3:
  note: candidate constructor
   Option(const T& _t) : state(SOME), t(_t) {}
   ^
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:42:3:
  note: candidate constructor
   Option(T&& _t) : state(SOME), t(std::move(_t)) {}
   ^
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:45:3:
  note: candidate constructor [with U = void ()]
   Option(const U& u) : state(SOME), t(u) {}
   ^
 {code}
 Compiler version:
 {code}
 $ g++ --version
 Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr 
 --with-gxx-include-dir=/usr/include/c++/4.2.1
 Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
 Target: x86_64-apple-darwin14.4.0
 Thread model: posix
 {code}





[jira] [Commented] (MESOS-2588) Create pre-create hook before a Docker container launches

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616551#comment-14616551
 ] 

haosdent commented on MESOS-2588:
-

Hi [~baotiao], I am not sure about this. Let's see what [~tnachen] thinks.

 Create pre-create hook before a Docker container launches
 -

 Key: MESOS-2588
 URL: https://issues.apache.org/jira/browse/MESOS-2588
 Project: Mesos
  Issue Type: Bug
  Components: docker
Reporter: Timothy Chen
Assignee: haosdent

 To support custom actions that run before a Docker container launches, we 
 should create an extensible hook that lets modules/hooks run before the 
 container is launched.





[jira] [Created] (MESOS-3011) Publish release documentation for major releases on website

2015-07-07 Thread Paul Brett (JIRA)
Paul Brett created MESOS-3011:
-

 Summary: Publish release documentation for major releases on 
website
 Key: MESOS-3011
 URL: https://issues.apache.org/jira/browse/MESOS-3011
 Project: Mesos
  Issue Type: Documentation
Reporter: Paul Brett


Currently, the website only provides a single version of the documentation.  We 
should publish documentation for each release on the website independently (for 
example as https://mesos.apache.org/documentation/0.22/index.html, 
https://mesos.apache.org/documentation/0.23/index.html) and make latest 
redirect to the current version.
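
One way to picture the proposed layout is a small path-rewriting rule. This Python sketch assumes a `/documentation/latest/` prefix for the redirect; that prefix and the function name are hypothetical, and the real website may implement the redirect differently (for example in the web server configuration).

```python
def resolve_doc_path(path, current_version):
    """Map /documentation/latest/... onto the current release's directory."""
    prefix = "/documentation/latest/"
    if path.startswith(prefix):
        return "/documentation/{}/{}".format(current_version, path[len(prefix):])
    return path   # versioned paths are served as-is
```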





[jira] [Updated] (MESOS-2997) SSL connection failure causes failed CHECK.

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2997:
--
Fix Version/s: 0.23.0

 SSL connection failure causes failed CHECK.
 ---

 Key: MESOS-2997
 URL: https://issues.apache.org/jira/browse/MESOS-2997
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Reporter: Joris Van Remoortere
Assignee: Joris Van Remoortere
Priority: Blocker
  Labels: libprocess, mesosphere, ssl
 Fix For: 0.23.0


 {code}
 [ RUN  ] SSLTest.BasicSameProcess
 F0706 18:32:28.465451 238583808 libevent_ssl_socket.cpp:507] Check failed: 
 'self->bev' Must be non NULL
 {code}





[jira] [Updated] (MESOS-2992) Improve attribute documentation to reflect current state

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2992:
--
Fix Version/s: 0.23.0

 Improve attribute documentation to reflect current state
 

 Key: MESOS-2992
 URL: https://issues.apache.org/jira/browse/MESOS-2992
 Project: Mesos
  Issue Type: Documentation
  Components: documentation
Reporter: Timothy Chen
Assignee: Timothy Chen
 Fix For: 0.23.0


 Currently the attributes doc is out of date and doesn't reflect all of the 
 attribute types we now support.





[jira] [Created] (MESOS-3017) Make container-IP available via Master endpoint

2015-07-07 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-3017:
-

 Summary: Make container-IP available via Master endpoint
 Key: MESOS-3017
 URL: https://issues.apache.org/jira/browse/MESOS-3017
 Project: Mesos
  Issue Type: Task
Reporter: Kapil Arya
Assignee: Kapil Arya








[jira] [Commented] (MESOS-3000) Failing test - NsTest.ROOT_setns

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617872#comment-14617872
 ] 

haosdent commented on MESOS-3000:
-

This also fails because other users cannot access 
/home/idownes/workspace/mesos/build/src/setns-test-helper
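
A quick way to check this kind of failure is to look for ancestor directories that lack the search (o+x) bit, since a binary with mode 0755 is still unreachable for other users if a parent directory is closed to them. The Python helper below is only a diagnostic sketch with a hypothetical name; it inspects the "other" permission bits and ignores owner/group membership, ACLs, and mount options.

```python
import os
import stat


def dirs_blocking_others(path):
    """Return ancestor directories of `path` that lack the o+x (search) bit."""
    blocked = []
    current = os.path.dirname(os.path.abspath(path))
    while True:
        mode = os.stat(current).st_mode
        if not mode & stat.S_IXOTH:
            blocked.append(current)
        parent = os.path.dirname(current)
        if parent == current:        # reached the filesystem root
            break
        current = parent
    return blocked
```

Running it on the helper path from the ticket would list any directory (such as a private home directory) that prevents a non-owner user from executing the binary.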

 Failing test - NsTest.ROOT_setns
 

 Key: MESOS-3000
 URL: https://issues.apache.org/jira/browse/MESOS-3000
 Project: Mesos
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Ian Downes
Priority: Blocker

 Appears to be the same issue plaguing MESOS-2199
 {noformat}
 [root@hostname build]# MESOS_VERBOSE=1 ./bin/mesos-tests.sh 
 --gtest_filter=NsTest.ROOT_setns
 ...
 [==] Running 1 test from 1 test case.
 [--] Global test environment set-up.
 [--] 1 test from NsTest
 [ RUN  ] NsTest.ROOT_setns
 ABORT: (../../../3rdparty/libprocess/src/subprocess.cpp:163): Failed to 
 os::execvpe in childMain: Permission denied*** Aborted at 1436292540 (unix 
 time) try date -d @1436292540 if you are using GNU date ***
 PC: @ 0x7f7a1229e625 __GI_raise
 *** SIGABRT (@0xfffe0001) received by PID 1 (TID 0x7f7a19afc820) from PID 
 1; stack trace: ***
 @ 0x7f7a13421710 (unknown)
 @ 0x7f7a1229e625 __GI_raise
 @ 0x7f7a1229fe05 __GI_abort
 @   0x860ba1 (unknown)
 @   0x860bcf (unknown)
 @ 0x7f7a1826f118 (unknown)
 @ 0x7f7a18274594 (unknown)
 @ 0x7f7a18273b88 (unknown)
 @ 0x7f7a18273098 (unknown)
 @  0x1180720 (unknown)
 @  0x117a5d7 (unknown)
 @ 0x7f7a123548fd clone
 ../../src/tests/ns_tests.cpp:121: Failure
 Failed to wait 15secs for status
 [  FAILED  ] NsTest.ROOT_setns (15004 ms)
 [--] 1 test from NsTest (15004 ms total)
 [--] Global test environment tear-down
 ../../src/tests/environment.cpp:441: Failure
 Failed
 Tests completed with child processes remaining:
 -+- 40531 /home/idownes/workspace/mesos/build/src/.libs/lt-mesos-tests 
 --gtest_filter=NsTest.ROOT_setns
  \--- 40565 /home/idownes/workspace/mesos/build/src/.libs/lt-mesos-tests 
 --gtest_filter=NsTest.ROOT_setns
 [==] 1 test from 1 test case ran. (15034 ms total)
 [  PASSED  ] 0 tests.
 [  FAILED  ] 1 test, listed below:
 [  FAILED  ] NsTest.ROOT_setns
 {noformat}
 Relevant strace for the forked child:
 {noformat}
 ...
 getpid()= 1
 dup2(6, 0) = 0
 dup2(7, 1) = 1
 dup2(8, 2) = 2
 close(6) = 0
 close(7) = 0
 close(8) = 0
 execve(/home/idownes/workspace/mesos/build/src/setns-test-helper, 
 [setns-test-helper, SetnsTestHelper], [/* 24 vars */]) = -1 EACCES 
 (Permission denied)
 write(2, ABORT: (../../../3rdparty/libpro..., 62) = 62
 write(2, Failed to os::execvpe in childMa..., 53) = 53
 ...
 {noformat}
 Binary that it's trying to exec:
 {noformat}
 [root@hostname build]# stat 
 /home/idownes/workspace/mesos/build/src/setns-test-helper
   File: `/home/idownes/workspace/mesos/build/src/setns-test-helper'
   Size: 7948Blocks: 16 IO Block: 4096   regular file
 Device: 801h/2049d  Inode: 22949249Links: 1
 Access: (0755/-rwxr-xr-x)  Uid: (13118/ idownes)   Gid: ( 1500/employee)
 Access: 2015-07-07 17:58:09.569861237 +
 Modify: 2015-07-07 17:58:09.573861290 +
 Change: 2015-07-07 17:58:09.573861290 +
 [root@hostname build]# 
 /home/idownes/workspace/mesos/build/src/setns-test-helper
 Usage: /home/idownes/workspace/mesos/build/src/.libs/lt-setns-test-helper 
 subcommand [OPTIONS]
 Available subcommands:
 help
 SetnsTestHelper
 {noformat}





[jira] [Commented] (MESOS-2715) Python egg build breakage

2015-07-07 Thread Greg Bowyer (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617879#comment-14617879
 ] 

Greg Bowyer commented on MESOS-2715:


I would leave out the Travis changes, I didn't get it to build.

 Python egg build breakage
 -

 Key: MESOS-2715
 URL: https://issues.apache.org/jira/browse/MESOS-2715
 Project: Mesos
  Issue Type: Bug
  Components: build, python api
Reporter: Greg Bowyer
Priority: Minor
  Labels: mesosphere

 Essentially a small build fix: the Python setup.py for the native code does 
 not add -std=c++11 to its compiler flags.
 This is probably a duplicate.
 The fix is available here:
 https://github.com/apache/mesos/pull/42
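
The kind of change described can be sketched as a small flag helper. This is not the content of the linked pull request; `with_cxx11` is a hypothetical name, and the assumption is that the resulting list would be passed as `extra_compile_args` to the extension definition in setup.py.

```python
def with_cxx11(flags):
    """Return a copy of `flags` with -std=c++11 added unless a -std= flag is already set."""
    if any(f.startswith("-std=") for f in flags):
        return list(flags)           # respect an explicit standard choice
    return list(flags) + ["-std=c++11"]
```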





[jira] [Created] (MESOS-3012) Support existing message passing optimization with Event/Call.

2015-07-07 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-3012:
--

 Summary: Support existing message passing optimization with 
Event/Call.
 Key: MESOS-3012
 URL: https://issues.apache.org/jira/browse/MESOS-3012
 Project: Mesos
  Issue Type: Task
Reporter: Benjamin Mahler


See the thread here:
http://markmail.org/thread/wvapc7vkbv7z6gbx

The scheduler driver currently sends framework messages directly to the slave, 
when possible:

{noformat}
  (through master)
Scheduler  — Master  —  Slave   Executor
 DriverDriver
   (skip master)
{noformat}

The slave always sends messages directly to the scheduler driver:
{noformat}
Scheduler Master  Slave   Executor
 DriverDriver
   (skip master)
{noformat}

In order for the scheduler driver to receive Events from the master, it needs 
enough information to continue directly sending messages to slaves. This was 
previously accomplished by sending the slave's pid inside the [offer 
message|https://github.com/apache/mesos/blob/0.23.0-rc1/src/messages/messages.proto#L168]:

{code}
message ResourceOffersMessage {
  repeated Offer offers = 1;
  repeated string pids = 2;
}
{code}

We could add an 'Address' to the Offer protobuf to provide the scheduler driver 
with the same information:

{code}
message Address {
  // Field numbers are illustrative; the proposal did not assign them.
  required string ip = 1;
  required string hostname = 2;
  required uint32 port = 3;

  // All HTTP requests to this address must begin with this prefix.
  required string path_prefix = 4;
}

message Offer {
  required OfferID id = 1;
  required FrameworkID framework_id = 2;
  required SlaveID slave_id = 3;
  required string hostname = 4;   // Deprecated in favor of 'address'.
  optional Address address = 8;  // Obviates 'hostname'.
  ...
}
{code}

The path prefix is required for testing purposes, where we can have multiple 
slaves within a process (e.g. {{localhost:5051/slave(1)/state.json}} vs. 
{{localhost:5051/slave(2)/state.json}}).

This provides enough information to allow the scheduler driver to continue to 
directly send messages to the slaves, which unblocks MESOS-2910.
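
To show how the path prefix would be used, here is a small Python sketch that builds an endpoint URL from the proposed Address fields. The dict representation, the `slave_url` name, and the `http` scheme are assumptions made for this example only.

```python
def slave_url(address, endpoint):
    """Build an HTTP URL for a slave endpoint from the proposed Address fields."""
    prefix = address["path_prefix"].strip("/")
    parts = [p for p in (prefix, endpoint.strip("/")) if p]
    return "http://{}:{}/{}".format(
        address["hostname"], address["port"], "/".join(parts))
```

With two slaves in one test process, the same host and port yield distinct URLs because only the prefix differs, e.g. `slave(1)` versus `slave(2)`.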





[jira] [Commented] (MESOS-2910) Add an Event message handler to scheduler driver

2015-07-07 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617784#comment-14617784
 ] 

Benjamin Mahler commented on MESOS-2910:


This was blocked because we don't have enough information in Event/Call to 
continue sending messages directly to the slaves. Linking in the blocking 
ticket.

 Add an Event message handler to scheduler driver
 

 Key: MESOS-2910
 URL: https://issues.apache.org/jira/browse/MESOS-2910
 Project: Mesos
  Issue Type: Task
Reporter: Vinod Kone
Assignee: Benjamin Mahler

 Adding this handler lets master send Event messages to the driver.
 See MESOS-2909 for additional context.
 This ticket only tracks the installation of the handler and maybe handling of 
 a single event for testing. Handling of additional events will be tracked in 
 separate tickets.





[jira] [Created] (MESOS-3013) Extend DiscoveryInfo to include NetworkRequirement message

2015-07-07 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-3013:
-

 Summary: Extend DiscoveryInfo to include NetworkRequirement 
message
 Key: MESOS-3013
 URL: https://issues.apache.org/jira/browse/MESOS-3013
 Project: Mesos
  Issue Type: Bug
Reporter: Kapil Arya
Assignee: Kapil Arya


As per the [design 
doc|https://docs.google.com/document/d/17mXtAmdAXcNBwp_JfrxmZcQrs7EO6ancSbejrqjLQ0g],
 we need to enable frameworks to specify network requirements. The proposed 
message could be along the lines of:

{code}
message NetworkRequirement {
  enum Protocol {
IPv4,
IPv6
  }
  required Protocol protocol;

  // A netgroup is the name given to a set of logically-related IPs that are
  // allowed to communicate within themselves. For example, one might want 
  // to create separate netgroups for dev, testing, qa and prod deployment 
  // environments.
  repeated string netgroups;

  // Sticky IPs allow a framework to re-launch a task with the same IP on a
  // different Slave/Node.
  optional bool sticky [default = false];

  // A unique id that the framework uses to tag the assigned IP. This tag
  // can be later used to reclaim IP while relaunching the task.
  optional string id;
};
{code}
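
To make the netgroup semantics concrete, here is a minimal Python sketch that mirrors the proposed message as a plain data structure. It is not generated protobuf code, and the helper may_communicate() is a hypothetical name used only for illustration; it assumes that two requirements may talk to each other when they share at least one netgroup, which is one reading of the comment in the proposal.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Protocol(Enum):
    IPv4 = 0
    IPv6 = 1


@dataclass
class NetworkRequirement:
    """Plain-Python mirror of the proposed message (illustration only)."""
    protocol: Protocol
    netgroups: List[str] = field(default_factory=list)
    sticky: bool = False
    id: Optional[str] = None


def may_communicate(a: NetworkRequirement, b: NetworkRequirement) -> bool:
    # Assumption: sharing at least one netgroup permits communication.
    return bool(set(a.netgroups) & set(b.netgroups))


web = NetworkRequirement(Protocol.IPv4, netgroups=["prod"], sticky=True, id="web-1")
db = NetworkRequirement(Protocol.IPv4, netgroups=["prod", "qa"])
ci = NetworkRequirement(Protocol.IPv4, netgroups=["dev"])
print(may_communicate(web, db), may_communicate(web, ci))
```

Under that assumption, web and db share the prod netgroup while web and ci share none.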





[jira] [Updated] (MESOS-3013) Extend DiscoveryInfo to include NetworkRequirement message

2015-07-07 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-3013:
--
Issue Type: Task  (was: Bug)

 Extend DiscoveryInfo to include NetworkRequirement message
 

 Key: MESOS-3013
 URL: https://issues.apache.org/jira/browse/MESOS-3013
 Project: Mesos
  Issue Type: Task
Reporter: Kapil Arya
Assignee: Kapil Arya
  Labels: mesosphere

 As per the [design 
 doc|https://docs.google.com/document/d/17mXtAmdAXcNBwp_JfrxmZcQrs7EO6ancSbejrqjLQ0g],
  we need to enable frameworks to specify network requirements. The proposed 
 message could be along the lines of:
 {code}
 // NOTE: enum values and field numbers below are illustrative; the proposal
 // did not assign them.
 message NetworkRequirement {
   enum Protocol {
     IPv4 = 0;
     IPv6 = 1;
   }
   required Protocol protocol = 1;
   // A netgroup is the name given to a set of logically related IPs that are
   // allowed to communicate among themselves. For example, one might want
   // to create separate netgroups for dev, testing, qa and prod deployment
   // environments.
   repeated string netgroups = 2;
   // Sticky IPs allow a framework to re-launch a task with the same IP on a
   // different Slave/Node.
   optional bool sticky = 3 [default = false];
   // A unique id that the framework uses to tag the assigned IP. This tag
   // can later be used to reclaim the IP while relaunching the task.
   optional string id = 4;
 }
 {code}





[jira] [Commented] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617869#comment-14617869
 ] 

haosdent commented on MESOS-2199:
-

{quote}
I think it's flawed to require global read access for the build directory...
{quote}

Hi [~idownes]. If the user {{nobody}} cannot access /home/idownes/build, it 
also cannot read or execute /home/idownes/build/src/.libs/lt-mesos-executor. 
So there is a contradiction:
1. If we want {{nobody}} to be able to execute 
/home/idownes/build/src/.libs/lt-mesos-executor, it needs at least execute 
(search) permission, e.g. r-x, on each of these directories:
{code}
/home
/home/idownes
/home/idownes/build
/home/idownes/build/src
/home/idownes/build/src/.libs
{code}
2. If we don't want {{nobody}} to access /home/idownes/, it cannot execute 
/home/idownes/build/src/.libs/lt-mesos-executor either, because 
lt-mesos-executor lives under /home/idownes/.

 Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ---

 Key: MESOS-2199
 URL: https://issues.apache.org/jira/browse/MESOS-2199
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Ian Downes
Assignee: haosdent
  Labels: mesosphere

 Appears that running the executor as {{nobody}} is not supported.
 [~nnielsen] can you take a look?
 Executor log:
 {noformat}
 [root@hostname build]# cat 
 /tmp/SlaveTest_ROOT_RunTaskWithCommandInfoWithUser_cxF1dY/slaves/20141219-005206-2081170186-60487-11862-S0/frameworks/20141219-005206-2081170186-60
 487-11862-/executors/1/runs/latest/std*
 sh: /home/idownes/workspace/mesos/build/src/mesos-executor: Permission denied
 {noformat}
 Test output:
 {noformat}
 [==] Running 1 test from 1 test case.
 [--] Global test environment set-up.
 [--] 1 test from SlaveTest
 [ RUN  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ../../src/tests/slave_tests.cpp:680: Failure
 Value of: statusRunning.get().state()
   Actual: TASK_FAILED
 Expected: TASK_RUNNING
 ../../src/tests/slave_tests.cpp:682: Failure
 Failed to wait 10secs for statusFinished
 ../../src/tests/slave_tests.cpp:673: Failure
 Actual function call count doesn't match EXPECT_CALL(sched, 
 statusUpdate(driver, _))...
  Expected: to be called twice
Actual: called once - unsatisfied and active
 [  FAILED  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser (10641 ms)
 [--] 1 test from SlaveTest (10641 ms total)
 [--] Global test environment tear-down
 [==] 1 test from 1 test case ran. (10658 ms total)
 {noformat}





[jira] [Comment Edited] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617869#comment-14617869
 ] 

haosdent edited comment on MESOS-2199 at 7/8/15 2:41 AM:
-

{quote}
I think it's flawed to require global read access for the build directory...
{quote}

Hi [~idownes]. If the user {{nobody}} cannot access /home/idownes/build, it 
also cannot read or execute /home/idownes/build/src/.libs/lt-mesos-executor. 
So there is a contradiction:
1. If we want {{nobody}} to be able to execute 
/home/idownes/build/src/.libs/lt-mesos-executor, it needs at least execute 
(search) permission, e.g. r-x, on each of these directories:
{code}
/home
/home/idownes
/home/idownes/build
/home/idownes/build/src
/home/idownes/build/src/.libs
{code}
2. If we don't want {{nobody}} to access /home/idownes/, it cannot execute 
/home/idownes/build/src/.libs/lt-mesos-executor either, because 
lt-mesos-executor lives under /home/idownes/.

More details are 
[here|http://unix.stackexchange.com/questions/13858/do-the-parent-directorys-permissions-matter-when-accessing-a-subdirectory].
 In this case, chmod o+x /home/idownes is needed.
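
The directory check described in this comment can be sketched in Python. This is a rough illustration only: the function name dirs_missing_other_execute and its stop_at parameter are made up for this example, and it assumes the accessing user (such as nobody) is neither the owner nor in the group of any directory, so only the "other" permission bits matter. ACLs and other access controls are ignored.

```python
import os
import stat


def dirs_missing_other_execute(path, stop_at="/"):
    """Return ancestor directories of `path`, strictly below `stop_at`,
    that lack the other-execute (search) bit, ordered from the top down."""
    path = os.path.abspath(path)
    stop_at = os.path.abspath(stop_at)
    missing = []
    current = os.path.dirname(path)
    # Walk upwards until we reach `stop_at` or the filesystem root.
    while current != stop_at and current != os.path.dirname(current):
        if not os.stat(current).st_mode & stat.S_IXOTH:
            missing.append(current)
        current = os.path.dirname(current)
    return list(reversed(missing))
```

Calling it on the executor path from the comment would list every directory that still needs chmod o+x before another user can reach the binary. An empty list means the directory chain is traversable, although the binary itself still needs its own execute bit.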


was (Author: haosd...@gmail.com):
{quote}
I think it's flawed to require global read access for the build directory...
{quote}

Hi [~idownes]. If the user {{nobody}} cannot access /home/idownes/build, it 
also cannot read or execute /home/idownes/build/src/.libs/lt-mesos-executor. 
So there is a contradiction:
1. If we want {{nobody}} to be able to execute 
/home/idownes/build/src/.libs/lt-mesos-executor, it needs at least execute 
(search) permission, e.g. r-x, on each of these directories:
{code}
/home
/home/idownes
/home/idownes/build
/home/idownes/build/src
/home/idownes/build/src/.libs
{code}
2. If we don't want {{nobody}} to access /home/idownes/, it cannot execute 
/home/idownes/build/src/.libs/lt-mesos-executor either, because 
lt-mesos-executor lives under /home/idownes/.

More details are here. In this case, chmod o+x /home/idownes is needed.

 Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ---

 Key: MESOS-2199
 URL: https://issues.apache.org/jira/browse/MESOS-2199
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Ian Downes
Assignee: haosdent
  Labels: mesosphere

 Appears that running the executor as {{nobody}} is not supported.
 [~nnielsen] can you take a look?
 Executor log:
 {noformat}
 [root@hostname build]# cat 
 /tmp/SlaveTest_ROOT_RunTaskWithCommandInfoWithUser_cxF1dY/slaves/20141219-005206-2081170186-60487-11862-S0/frameworks/20141219-005206-2081170186-60
 487-11862-/executors/1/runs/latest/std*
 sh: /home/idownes/workspace/mesos/build/src/mesos-executor: Permission denied
 {noformat}
 Test output:
 {noformat}
 [==] Running 1 test from 1 test case.
 [--] Global test environment set-up.
 [--] 1 test from SlaveTest
 [ RUN  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ../../src/tests/slave_tests.cpp:680: Failure
 Value of: statusRunning.get().state()
   Actual: TASK_FAILED
 Expected: TASK_RUNNING
 ../../src/tests/slave_tests.cpp:682: Failure
 Failed to wait 10secs for statusFinished
 ../../src/tests/slave_tests.cpp:673: Failure
 Actual function call count doesn't match EXPECT_CALL(sched, 
 statusUpdate(driver, _))...
  Expected: to be called twice
Actual: called once - unsatisfied and active
 [  FAILED  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser (10641 ms)
 [--] 1 test from SlaveTest (10641 ms total)
 [--] Global test environment tear-down
 [==] 1 test from 1 test case ran. (10658 ms total)
 {noformat}





[jira] [Updated] (MESOS-2996) Failing Docker tests on CentOS Linux release 7.1.1503.

2015-07-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2996:
--
Fix Version/s: 0.23.0

 Failing Docker tests on CentOS Linux release 7.1.1503.
 --

 Key: MESOS-2996
 URL: https://issues.apache.org/jira/browse/MESOS-2996
 Project: Mesos
  Issue Type: Bug
Reporter: Joerg Schad
Assignee: Timothy Chen
Priority: Blocker
  Labels: mesosphere
 Fix For: 0.23.0


 With Mesos 0.23 rc1 several tests fail on CentOS Linux release 7.1 (will add 
 more detail shortly).





[jira] [Created] (MESOS-3016) Add task status update hooks for Master/Slave

2015-07-07 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-3016:
-

 Summary: Add task status update hooks for Master/Slave
 Key: MESOS-3016
 URL: https://issues.apache.org/jira/browse/MESOS-3016
 Project: Mesos
  Issue Type: Task
Reporter: Kapil Arya
Assignee: Kapil Arya


The task termination hooks are needed for doing task-specific cleanup in 
Master/Slave.





[jira] [Issue Comment Deleted] (MESOS-3000) Failing test - NsTest.ROOT_setns

2015-07-07 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-3000:

Comment: was deleted

(was: This is also because other users could not access 
/home/idownes/workspace/mesos/build/src/setns-test-helper)

 Failing test - NsTest.ROOT_setns
 

 Key: MESOS-3000
 URL: https://issues.apache.org/jira/browse/MESOS-3000
 Project: Mesos
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Ian Downes
Priority: Blocker

 Appears to be the same issue plaguing MESOS-2199
 {noformat}
 [root@hostname build]# MESOS_VERBOSE=1 ./bin/mesos-tests.sh 
 --gtest_filter=NsTest.ROOT_setns
 ...
 [==] Running 1 test from 1 test case.
 [--] Global test environment set-up.
 [--] 1 test from NsTest
 [ RUN  ] NsTest.ROOT_setns
 ABORT: (../../../3rdparty/libprocess/src/subprocess.cpp:163): Failed to 
 os::execvpe in childMain: Permission denied*** Aborted at 1436292540 (unix 
 time) try date -d @1436292540 if you are using GNU date ***
 PC: @ 0x7f7a1229e625 __GI_raise
 *** SIGABRT (@0xfffe0001) received by PID 1 (TID 0x7f7a19afc820) from PID 
 1; stack trace: ***
 @ 0x7f7a13421710 (unknown)
 @ 0x7f7a1229e625 __GI_raise
 @ 0x7f7a1229fe05 __GI_abort
 @   0x860ba1 (unknown)
 @   0x860bcf (unknown)
 @ 0x7f7a1826f118 (unknown)
 @ 0x7f7a18274594 (unknown)
 @ 0x7f7a18273b88 (unknown)
 @ 0x7f7a18273098 (unknown)
 @  0x1180720 (unknown)
 @  0x117a5d7 (unknown)
 @ 0x7f7a123548fd clone
 ../../src/tests/ns_tests.cpp:121: Failure
 Failed to wait 15secs for status
 [  FAILED  ] NsTest.ROOT_setns (15004 ms)
 [--] 1 test from NsTest (15004 ms total)
 [--] Global test environment tear-down
 ../../src/tests/environment.cpp:441: Failure
 Failed
 Tests completed with child processes remaining:
 -+- 40531 /home/idownes/workspace/mesos/build/src/.libs/lt-mesos-tests 
 --gtest_filter=NsTest.ROOT_setns
  \--- 40565 /home/idownes/workspace/mesos/build/src/.libs/lt-mesos-tests 
 --gtest_filter=NsTest.ROOT_setns
 [==] 1 test from 1 test case ran. (15034 ms total)
 [  PASSED  ] 0 tests.
 [  FAILED  ] 1 test, listed below:
 [  FAILED  ] NsTest.ROOT_setns
 {noformat}
 Relevant strace for the forked child:
 {noformat}
 ...
 getpid()= 1
 dup2(6, 0) = 0
 dup2(7, 1) = 1
 dup2(8, 2) = 2
 close(6) = 0
 close(7) = 0
 close(8) = 0
 execve("/home/idownes/workspace/mesos/build/src/setns-test-helper", 
 ["setns-test-helper", "SetnsTestHelper"], [/* 24 vars */]) = -1 EACCES 
 (Permission denied)
 write(2, "ABORT: (../../../3rdparty/libpro"..., 62) = 62
 write(2, "Failed to os::execvpe in childMa"..., 53) = 53
 ...
 {noformat}
 Binary that it's trying to exec:
 {noformat}
 [root@hostname build]# stat 
 /home/idownes/workspace/mesos/build/src/setns-test-helper
   File: `/home/idownes/workspace/mesos/build/src/setns-test-helper'
   Size: 7948  Blocks: 16  IO Block: 4096  regular file
 Device: 801h/2049d  Inode: 22949249  Links: 1
 Access: (0755/-rwxr-xr-x)  Uid: (13118/ idownes)   Gid: ( 1500/employee)
 Access: 2015-07-07 17:58:09.569861237 +
 Modify: 2015-07-07 17:58:09.573861290 +
 Change: 2015-07-07 17:58:09.573861290 +
 [root@hostname build]# 
 /home/idownes/workspace/mesos/build/src/setns-test-helper
 Usage: /home/idownes/workspace/mesos/build/src/.libs/lt-setns-test-helper 
 subcommand [OPTIONS]
 Available subcommands:
 help
 SetnsTestHelper
 {noformat}





[jira] [Commented] (MESOS-2993) Document per container unique egress flow and network queueing statistics

2015-07-07 Thread Paul Brett (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617770#comment-14617770
 ] 

Paul Brett commented on MESOS-2993:
---

Update incorporating reviewer comments.

 Document  per container unique egress flow and network queueing statistics
 --

 Key: MESOS-2993
 URL: https://issues.apache.org/jira/browse/MESOS-2993
 Project: Mesos
  Issue Type: Bug
  Components: documentation, isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Paul Brett
  Labels: twitter

 Document new network isolation capabilities in 0.23





[jira] [Updated] (MESOS-3013) Extend DiscoveryInfo to include NetworkRequirement message

2015-07-07 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-3013:
--
Labels: mesosphere  (was: )

 Extend DiscoveryInfo to include NetworkRequirement message
 

 Key: MESOS-3013
 URL: https://issues.apache.org/jira/browse/MESOS-3013
 Project: Mesos
  Issue Type: Bug
Reporter: Kapil Arya
Assignee: Kapil Arya
  Labels: mesosphere

 As per the [design 
 doc|https://docs.google.com/document/d/17mXtAmdAXcNBwp_JfrxmZcQrs7EO6ancSbejrqjLQ0g],
  we need to enable frameworks to specify network requirements. The proposed 
 message could be along the lines of:
 {code}
 // NOTE: enum values and field numbers below are illustrative; the proposal
 // did not assign them.
 message NetworkRequirement {
   enum Protocol {
     IPv4 = 0;
     IPv6 = 1;
   }
   required Protocol protocol = 1;
   // A netgroup is the name given to a set of logically related IPs that are
   // allowed to communicate among themselves. For example, one might want
   // to create separate netgroups for dev, testing, qa and prod deployment
   // environments.
   repeated string netgroups = 2;
   // Sticky IPs allow a framework to re-launch a task with the same IP on a
   // different Slave/Node.
   optional bool sticky = 3 [default = false];
   // A unique id that the framework uses to tag the assigned IP. This tag
   // can later be used to reclaim the IP while relaunching the task.
   optional string id = 4;
 }
 {code}





[jira] [Commented] (MESOS-2588) Create pre-create hook before a Docker container launches

2015-07-07 Thread chenzongzhi (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616706#comment-14616706
 ] 

chenzongzhi commented on MESOS-2588:


If we want to set the container's cgroup limits from a script, we need to know 
the cgroup path; then we can change a value directly by writing it to the 
corresponding file. For example, we can run
echo 1 > /cgroup/cpu/docker/ce00c65f07924ab5225e655a4e2fc6e7f30e63e1ac7a49901463002946fd196f/cpu.cfs_period_us
to apply our limit.
Alternatively, we can set the cgroup limits like MesosContainer.
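
A rough Python sketch of that idea follows. The helper names cgroup_control_path and write_cgroup_value are hypothetical, and the /cgroup/<subsystem>/docker/<container id>/ layout is assumed from the path in this comment; other hosts may mount cgroups elsewhere. Note that the kernel documentation gives a valid range of 1000 to 1000000 microseconds for cpu.cfs_period_us, so a real script would write a value in that range.

```python
import os


def cgroup_control_path(container_id, control, subsystem="cpu",
                        cgroup_root="/cgroup", parent="docker"):
    """Build the path of a cgroup control file for a Docker container.

    The directory layout is an assumption taken from the comment above.
    """
    return os.path.join(cgroup_root, subsystem, parent, container_id, control)


def write_cgroup_value(path, value):
    """Write a single value to a cgroup control file (needs privileges
    on a real host)."""
    with open(path, "w") as control_file:
        control_file.write(str(value))
```

The container id must already be known to build the path, so a script like this can only run once Docker has created the container.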



 Create pre-create hook before a Docker container launches
 -

 Key: MESOS-2588
 URL: https://issues.apache.org/jira/browse/MESOS-2588
 Project: Mesos
  Issue Type: Bug
  Components: docker
Reporter: Timothy Chen
Assignee: haosdent

 To be able to support custom actions to be called before launching a docker 
 container, we should create an extensible hook that allows modules/hooks to 
 run before a docker container is launched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2991) Compilation Error on Mac OS 10.10.4 with clang 3.5.0

2015-07-07 Thread Benjamin Hindman (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616705#comment-14616705
 ] 

Benjamin Hindman commented on MESOS-2991:
-

Hey folks, [~mcypark] and I came across this issue in the past and the solution 
was as simple as adding `&` in front of `dosetsid`, to clearly inform the 
compiler that we want this to be treated as a function pointer. I think this 
was lost in one of the patch sets that [~mcypark] and I had been working on, 
unfortunately, but it has been added now. Can folks retry and see if this is 
still an issue?

 Compilation Error on Mac OS 10.10.4 with clang 3.5.0
 

 Key: MESOS-2991
 URL: https://issues.apache.org/jira/browse/MESOS-2991
 Project: Mesos
  Issue Type: Bug
  Components: stout, test
Affects Versions: 0.23.0
Reporter: Alexander Rukletsov
Assignee: Michael Park
  Labels: mesosphere

 Compiling 0.23.0 (rc1) produces compilation errors on Mac OS 10.10.4 with 
 {{g++}} based on LLVM 3.5. It looks like the issue was introduced in 
 {{a5640ad813e6256b548fca068f04fd9fa3a03eda}}, 
 https://reviews.apache.org/r/32838. In contrast to the commit message, 
 compiling the rc with gcc4.4 on CentOS worked fine for me. 
 According to 0.23 release notes and MESOS-2604, we should support clang 3.5. 
 {code}
 ../../../../../3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp:543:25: 
 error: conversion from 'void ()' to 'const Option<void (*)()>' is ambiguous
Fork(dosetsid,  // Great-great-granchild.
 ^~~~
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:40:3:
  note: candidate constructor
   Option(const T& _t) : state(SOME), t(_t) {}
   ^
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:42:3:
  note: candidate constructor
   Option(T&& _t) : state(SOME), t(std::move(_t)) {}
   ^
 ../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:45:3:
  note: candidate constructor [with U = void ()]
   Option(const U& u) : state(SOME), t(u) {}
   ^
 {code}
 Compiler version:
 {code}
 $ g++ --version
 Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr 
 --with-gxx-include-dir=/usr/include/c++/4.2.1
 Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
 Target: x86_64-apple-darwin14.4.0
 Thread model: posix
 {code}





[jira] [Commented] (MESOS-2588) Create pre-create hook before a Docker container launches

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617935#comment-14617935
 ] 

haosdent commented on MESOS-2588:
-

Hi [~baotiao], you cannot do that before docker launches. This hook is 
executed before the docker container is created, so it cannot get the docker 
container id. I think your requirement is covered by this issue: 
https://issues.apache.org/jira/browse/MESOS-2154

 Create pre-create hook before a Docker container launches
 -

 Key: MESOS-2588
 URL: https://issues.apache.org/jira/browse/MESOS-2588
 Project: Mesos
  Issue Type: Bug
  Components: docker
Reporter: Timothy Chen
Assignee: haosdent

 To be able to support custom actions to be called before launching a docker 
 container, we should create an extensible hook that allows modules/hooks to 
 run before a docker container is launched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (MESOS-2588) Create pre-create hook before a Docker container launches

2015-07-07 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-2588:

Comment: was deleted

(was: Hi [~baotiao], you cannot do that before docker launches. This hook is 
executed before the docker container is created, so it cannot get the docker 
container id. I think your requirement is covered by this issue: 
https://issues.apache.org/jira/browse/MESOS-2154)

 Create pre-create hook before a Docker container launches
 -

 Key: MESOS-2588
 URL: https://issues.apache.org/jira/browse/MESOS-2588
 Project: Mesos
  Issue Type: Bug
  Components: docker
Reporter: Timothy Chen
Assignee: haosdent

 To be able to support custom actions to be called before launching a docker 
 container, we should create an extensible hook that allows modules/hooks to 
 run before a docker container is launched.





[jira] [Commented] (MESOS-2588) Create pre-create hook before a Docker container launches

2015-07-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617936#comment-14617936
 ] 

haosdent commented on MESOS-2588:
-

Hi [~baotiao], you cannot do that before docker launches. This hook is 
executed before the docker container is created, so it cannot get the docker 
container id. I think your requirement is covered by this issue: 
https://issues.apache.org/jira/browse/MESOS-2154

 Create pre-create hook before a Docker container launches
 -

 Key: MESOS-2588
 URL: https://issues.apache.org/jira/browse/MESOS-2588
 Project: Mesos
  Issue Type: Bug
  Components: docker
Reporter: Timothy Chen
Assignee: haosdent

 To be able to support custom actions to be called before launching a docker 
 container, we should create an extensible hook that allows modules/hooks to 
 run before a docker container is launched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3002) Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator

2015-07-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617397#comment-14617397
 ] 

Adam B commented on MESOS-3002:
---

The changes for MESOS-2800 to rename Option<T>::get(const T& _t) to 
getOrElse() happened after the 0.23.0-rc1 cut and are not planned for 
cherry-picking into the release. The Fix Version of MESOS-2800 is 0.24.0, so 
the Affects Version of this ticket (MESOS-3002) is really 0.24.0, and hence its 
Target Version should also be 0.24.0.
Please let me know otherwise if you actually saw this build error when building 
from the 0.23.0-rc1 tag.

 Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
 

 Key: MESOS-3002
 URL: https://issues.apache.org/jira/browse/MESOS-3002
 Project: Mesos
  Issue Type: Bug
  Components: isolation
Affects Versions: 0.23.0
Reporter: Paul Brett
Assignee: Joris Van Remoortere
Priority: Blocker

 Change to Option from get() to getOrElse() breaks network isolator.  Building 
 with '../configure --with-network-isolator' generates the following error:
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp: In static 
 member function 'static Try<mesos::slave::Isolator*> 
 mesos::internal::slave::PortMappingIsolatorProcess::create(const 
 mesos::internal::slave::Flags&)':
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
 error: no matching function for call to 'Option<std::basic_string<char> 
 >::get(const char [1]) const'
flags.resources.get(""),
  ^
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:1103:29: 
 note: candidates are:
 In file included from 
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp:26:0,
  from ../../3rdparty/libprocess/include/process/check.hpp:19,
  from ../../3rdparty/libprocess/include/process/collect.hpp:7,
  from 
 ../../src/slave/containerizer/isolators/network/port_mapping.cpp:30:
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: 
 note: const T& Option<T>::get() const [with T = std::basic_string<char>]
const T& get() const { assert(isSome()); return t; }
 ^
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:130:12: 
 note:   candidate expects 0 arguments, 1 provided
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: 
 note: T& Option<T>::get() [with T = std::basic_string<char>]
T& get() { assert(isSome()); return t; }
   ^
 ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:131:6: 
 note:   candidate expects 0 arguments, 1 provided
 make[2]: *** 
 [slave/containerizer/isolators/network/libmesos_no_3rdparty_la-port_mapping.lo]
  Error 1
 make[2]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
 make[1]: *** [check] Error 2
 make[1]: Leaving directory `/home/pbrett/sandbox/mesos.master/build/src'
 make: *** [check-recursive] Error 1
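
 The failure above comes from a caller still passing a default value to get(), 
 which after the rename only accepts no arguments. The following is a small 
 Python analogue of that split, not the stout API: the class and the method 
 name get_or_else are illustrative stand-ins for Option and getOrElse().

```python
class Option:
    """Tiny optional-value holder used only to illustrate the rename."""

    _NONE = object()  # sentinel meaning "no value present"

    def __init__(self, value=_NONE):
        self._value = value

    def is_some(self):
        return self._value is not Option._NONE

    def get(self):
        # Like the quoted get(): takes no default and requires a value.
        if not self.is_some():
            raise ValueError("get() called on an empty Option")
        return self._value

    def get_or_else(self, default):
        # The renamed overload: return the value, or the default if empty.
        return self._value if self.is_some() else default


resources = Option()  # e.g. an unset flag
print(repr(resources.get_or_else("")))
```

 In this analogue, calling resources.get("") raises a TypeError because get() 
 takes no argument, which parallels the "candidate expects 0 arguments, 1 
 provided" notes in the compiler output.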





[jira] [Commented] (MESOS-2199) Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser

2015-07-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617463#comment-14617463
 ] 

Adam B commented on MESOS-2199:
---

Good point, [~idownes]. I'll disable this test for now (0.23), and we can 
revisit the proper fix in 0.24.

 Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ---

 Key: MESOS-2199
 URL: https://issues.apache.org/jira/browse/MESOS-2199
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Ian Downes
Assignee: haosdent
  Labels: mesosphere

 Appears that running the executor as {{nobody}} is not supported.
 [~nnielsen] can you take a look?
 Executor log:
 {noformat}
 [root@hostname build]# cat 
 /tmp/SlaveTest_ROOT_RunTaskWithCommandInfoWithUser_cxF1dY/slaves/20141219-005206-2081170186-60487-11862-S0/frameworks/20141219-005206-2081170186-60
 487-11862-/executors/1/runs/latest/std*
 sh: /home/idownes/workspace/mesos/build/src/mesos-executor: Permission denied
 {noformat}
 Test output:
 {noformat}
 [==] Running 1 test from 1 test case.
 [--] Global test environment set-up.
 [--] 1 test from SlaveTest
 [ RUN  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
 ../../src/tests/slave_tests.cpp:680: Failure
 Value of: statusRunning.get().state()
   Actual: TASK_FAILED
 Expected: TASK_RUNNING
 ../../src/tests/slave_tests.cpp:682: Failure
 Failed to wait 10secs for statusFinished
 ../../src/tests/slave_tests.cpp:673: Failure
 Actual function call count doesn't match EXPECT_CALL(sched, 
 statusUpdate(driver, _))...
  Expected: to be called twice
Actual: called once - unsatisfied and active
 [  FAILED  ] SlaveTest.ROOT_RunTaskWithCommandInfoWithUser (10641 ms)
 [--] 1 test from SlaveTest (10641 ms total)
 [--] Global test environment tear-down
 [==] 1 test from 1 test case ran. (10658 ms total)
 {noformat}


