Re: Review Request 25986: Added reconcileTasks to python scheduler.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25986/#review54395
---


Patch looks great!

Reviews applied: [25986]

All tests passed.

- Mesos ReviewBot


On Sept. 24, 2014, 6:02 a.m., Niklas Nielsen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25986/
> ---
> 
> (Updated Sept. 24, 2014, 6:02 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> The last step of wiring up reconcileTasks in the python bindings.
> 
> 
> Diffs
> -
> 
>   src/python/interface/src/mesos/interface/__init__.py 818f41b 
> 
> Diff: https://reviews.apache.org/r/25986/diff/
> 
> 
> Testing
> ---
> 
> Functional testing
> 
> 
> Thanks,
> 
> Niklas Nielsen
> 
>



Review Request 25986: Added reconcileTasks to python scheduler.

2014-09-23 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25986/
---

Review request for mesos and Ben Mahler.


Repository: mesos-git


Description
---

The last step of wiring up reconcileTasks in the python bindings.


Diffs
-

  src/python/interface/src/mesos/interface/__init__.py 818f41b 

Diff: https://reviews.apache.org/r/25986/diff/


Testing
---

Functional testing


Thanks,

Niklas Nielsen



Re: Review Request 25848: Introducing mesos modules.

2014-09-23 Thread Adam B

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25848/#review54391
---


Minor comments in passing.


src/examples/test_module.hpp


Not sure why these includes are needed here.



src/examples/test_module_impl.cpp


How about the module "interface" instead of "role"?


- Adam B


On Sept. 22, 2014, 9:05 p.m., Kapil Arya wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25848/
> ---
> 
> (Updated Sept. 22, 2014, 9:05 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Bernd Mathiske, Niklas Nielsen, 
> and Timothy St. Clair.
> 
> 
> Bugs: MESOS-1384
> https://issues.apache.org/jira/browse/MESOS-1384
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Adding a first class primitive, abstraction and process for dynamic library 
> writing and loading can make it easier to extend inner workings of Mesos. 
> Making it possible to have dynamic loadable resource allocators, isolators, 
> containerizes, authenticators and much more.
> 
> 
> Diffs
> -
> 
>   include/mesos/module.hpp PRE-CREATION 
>   src/Makefile.am 9b973e5503e30180045e270220987ba647da8038 
>   src/examples/test_module.hpp PRE-CREATION 
>   src/examples/test_module_impl.cpp PRE-CREATION 
>   src/module/manager.hpp PRE-CREATION 
>   src/module/manager.cpp PRE-CREATION 
>   src/tests/module_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/25848/diff/
> 
> 
> Testing
> ---
> 
> Ran make check with added tests for verifying library/module loading and 
> simple version check.
> 
> 
> Thanks,
> 
> Kapil Arya
> 
>



Build failed in Jenkins: Mesos-Ubuntu-distcheck #353

2014-09-23 Thread Apache Jenkins Server
See 

--
[...truncated 5280 lines...]
rm -f cscope.out cscope.in.out cscope.po.out cscope.files
make[4]: Leaving directory 
`
Making distclean in include
make[4]: Entering directory 
`
rm -rf .libs _libs
rm -f *.lo
test -z "" || rm -f 
test . = "../../../../3rdparty/libprocess/include" || test -z "" || rm -f 
rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
rm -f Makefile
make[4]: Leaving directory 
`
rm -f config.status config.cache config.log configure.lineno 
config.status.lineno
rm -rf ./.deps
rm -f Makefile
make[3]: Leaving directory 
`
make[3]: Entering directory 
`
rm -rf .libs _libs
rm -r -f distribute-0.6.26 leveldb zookeeper-3.4.5
rm -f *-stamp
rm -f *.lo
test -z "" || rm -f 
test . = "../../3rdparty" || test -z "" || rm -f 
rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
make[3]: Leaving directory 
`
rm -f Makefile
make[2]: Leaving directory 
`
Making distclean in src
make[2]: Entering directory 
`
 rm -f mesos-local mesos-log mesos mesos-execute mesos-resolve
 rm -f low-level-scheduler-libprocess low-level-scheduler-pthread 
test-framework test-executor long-lived-framework long-lived-executor 
no-executor-framework docker-no-executor-framework balloon-framework 
balloon-executor load-generator-framework mesos-tests
test -z "mesos.pb.cc ../include/mesos/mesos.pb.h 
containerizer/containerizer.pb.cc 
../include/mesos/containerizer/containerizer.pb.h scheduler/scheduler.pb.cc 
../include/mesos/scheduler/scheduler.pb.h 
java/generated/org/apache/mesos/Protos.java 
java/generated/org/apache/mesos/containerizer/Protos.java 
python/interface/src/mesos/interface/mesos_pb2.py 
python/interface/src/mesos/interface/containerizer_pb2.py 
messages/messages.pb.cc messages/messages.pb.h messages/log.pb.cc 
messages/log.pb.h messages/state.pb.cc messages/state.pb.h 
master/registry.pb.cc master/registry.pb.h examples.jar 
../3rdparty/libprocess/3rdparty/protobuf-2.5.0/python/dist/protobuf-2.5.0-py2.7.egg
 python/dist/mesos-0.21.0-py2.7.egg 
python/interface/dist/mesos.interface-0.21.0-py2.7.egg 
python/native/dist/mesos.native-0.21.0-py2.7-linux-x86_64.egg python/*/build 
python/*/dist python/src/mesos/__init__.py 
python/interface/src/mesos/__init__.py 
python/interface/src/mesos/interface/__init__.py 
python/native/src/mesos/__init__.py python/native/src/mesos/native/__init__.py 
python/native/src/mesos/native/mesos_executor_driver_impl.cpp 
python/native/src/mesos/native/mesos_executor_driver_impl.hpp 
python/native/src/mesos/native/mesos_scheduler_driver_impl.cpp 
python/native/src/mesos/native/mesos_scheduler_driver_impl.hpp 
python/native/src/mesos/native/module.cpp 
python/native/src/mesos/native/module.hpp 
python/native/src/mesos/native/proxy_executor.cpp 
python/native/src/mesos/native/proxy_executor.hpp 
python/native/src/mesos/native/proxy_scheduler.cpp 
python/native/src/mesos/native/proxy_scheduler.hpp" || rm -f mesos.pb.cc 
../include/mesos/mesos.pb.h containerizer/containerizer.pb.cc 
../include/mesos/containerizer/containerizer.pb.h scheduler/scheduler.pb.cc 
../include/mesos/scheduler/scheduler.pb.h 
java/generated/org/apache/mesos/Protos.java 
java/generated/org/apache/mesos/containerizer/Protos.java 
python/interface/src/mesos/interface/mesos_pb2.py 
python/interface/src/mesos/interface/containerizer_pb2.py 
messages/messages.pb.cc messages/messages.pb.h messages/log.pb.cc 
messages/log.pb.h messages/state.pb.cc messages/state.pb.h 
master/registry.pb.cc master/registry.pb.h examples.jar 
../3rdparty/libprocess/3rdparty/protobuf-2.5.0/python/dist/protobuf-2.5.0-py2.7.egg
 python/dist/mesos-0.21.0-py2.7.egg 
python/interface/dist/mesos.interface-0.21.0-py2.7.egg 
python/native/dist/mesos.native-0.21.0-py2.7-linux-x86_64.egg python/*/build 
python/*/dist python/src/mesos/__init__.py 
python/interface/src/mesos/__init__.py 
python/interface/src/mesos/interface/__init__.py 
python/native/src/mesos/__init__.py python/native/src/mesos/native/__init__.py 
python/native/src/mesos/native/mesos_executor_driver_impl.cpp 
python/native/src/mesos/native/mesos_executor_driver_impl.hpp 
python/native/src/mesos/native/mesos_scheduler

Re: Review Request 25967: Added task struct in master.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25967/#review54389
---


Patch looks great!

Reviews applied: [25967]

All tests passed.

- Mesos ReviewBot


On Sept. 24, 2014, 3:54 a.m., Niklas Nielsen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25967/
> ---
> 
> (Updated Sept. 24, 2014, 3:54 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> In order to hang additional state off tasks (as 'resources_recovered' in 
> https://reviews.apache.org/r/25911/), this patch introduce a task struct to 
> represent tasks within the master.
> 
> 
> Diffs
> -
> 
>   src/common/http.hpp afce7fe 
>   src/master/http.cpp 3f5a01d 
>   src/master/master.hpp f5d74ae 
>   src/master/master.cpp e5d30e9 
> 
> Diff: https://reviews.apache.org/r/25967/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Niklas Nielsen
> 
>



Jenkins build is back to normal : Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2391

2014-09-23 Thread Apache Jenkins Server
See 




Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/#review54387
---


Patch looks great!

Reviews applied: [25948]

All tests passed.

- Mesos ReviewBot


On Sept. 24, 2014, 2:51 a.m., Chi Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25948/
> ---
> 
> (Updated Sept. 24, 2014, 2:51 a.m.)
> 
> 
> Review request for mesos, Jie Yu and Cong Wang.
> 
> 
> Bugs: Mesos-1808
> https://issues.apache.org/jira/browse/Mesos-1808
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> see Summary.
> 
> 
> Diffs
> -
> 
>   configure.ac da61c29 
>   src/Makefile.am 9b973e5 
>   src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
>   src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
>   src/linux/routing/internal.hpp dca0dc5 
>   src/tests/routing_tests.cpp 10730ad 
> 
> Diff: https://reviews.apache.org/r/25948/diff/
> 
> 
> Testing
> ---
> 
> added a test to use the code path. 
> 
> 
> Thanks,
> 
> Chi Zhang
> 
>



Review Request 25967: Added task struct in master.

2014-09-23 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25967/
---

Review request for mesos and Ben Mahler.


Repository: mesos-git


Description
---

In order to hang additional state off tasks (as 'resources_recovered' in 
https://reviews.apache.org/r/25911/), this patch introduce a task struct to 
represent tasks within the master.


Diffs
-

  src/common/http.hpp afce7fe 
  src/master/http.cpp 3f5a01d 
  src/master/master.hpp f5d74ae 
  src/master/master.cpp e5d30e9 

Diff: https://reviews.apache.org/r/25967/diff/


Testing
---

make check


Thanks,

Niklas Nielsen



Build failed in Jenkins: Mesos-Ubuntu-distcheck #352

2014-09-23 Thread Apache Jenkins Server
See 

Changes:

[bmahler] Removed duplicate error messaging in Docker::create.

--
[...truncated 5283 lines...]
rm -f cscope.out cscope.in.out cscope.po.out cscope.files
make[4]: Leaving directory 
`
Making distclean in include
make[4]: Entering directory 
`
rm -rf .libs _libs
rm -f *.lo
test -z "" || rm -f 
test . = "../../../../3rdparty/libprocess/include" || test -z "" || rm -f 
rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
rm -f Makefile
make[4]: Leaving directory 
`
rm -f config.status config.cache config.log configure.lineno 
config.status.lineno
rm -rf ./.deps
rm -f Makefile
make[3]: Leaving directory 
`
make[3]: Entering directory 
`
rm -rf .libs _libs
rm -r -f distribute-0.6.26 leveldb zookeeper-3.4.5
rm -f *-stamp
rm -f *.lo
test -z "" || rm -f 
test . = "../../3rdparty" || test -z "" || rm -f 
rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
make[3]: Leaving directory 
`
rm -f Makefile
make[2]: Leaving directory 
`
Making distclean in src
make[2]: Entering directory 
`
 rm -f mesos-local mesos-log mesos mesos-execute mesos-resolve
 rm -f low-level-scheduler-libprocess low-level-scheduler-pthread 
test-framework test-executor long-lived-framework long-lived-executor 
no-executor-framework docker-no-executor-framework balloon-framework 
balloon-executor load-generator-framework mesos-tests
test -z "mesos.pb.cc ../include/mesos/mesos.pb.h 
containerizer/containerizer.pb.cc 
../include/mesos/containerizer/containerizer.pb.h scheduler/scheduler.pb.cc 
../include/mesos/scheduler/scheduler.pb.h 
java/generated/org/apache/mesos/Protos.java 
java/generated/org/apache/mesos/containerizer/Protos.java 
python/interface/src/mesos/interface/mesos_pb2.py 
python/interface/src/mesos/interface/containerizer_pb2.py 
messages/messages.pb.cc messages/messages.pb.h messages/log.pb.cc 
messages/log.pb.h messages/state.pb.cc messages/state.pb.h 
master/registry.pb.cc master/registry.pb.h examples.jar 
../3rdparty/libprocess/3rdparty/protobuf-2.5.0/python/dist/protobuf-2.5.0-py2.7.egg
 python/dist/mesos-0.21.0-py2.7.egg 
python/interface/dist/mesos.interface-0.21.0-py2.7.egg 
python/native/dist/mesos.native-0.21.0-py2.7-linux-x86_64.egg python/*/build 
python/*/dist python/src/mesos/__init__.py 
python/interface/src/mesos/__init__.py 
python/interface/src/mesos/interface/__init__.py 
python/native/src/mesos/__init__.py python/native/src/mesos/native/__init__.py 
python/native/src/mesos/native/mesos_executor_driver_impl.cpp 
python/native/src/mesos/native/mesos_executor_driver_impl.hpp 
python/native/src/mesos/native/mesos_scheduler_driver_impl.cpp 
python/native/src/mesos/native/mesos_scheduler_driver_impl.hpp 
python/native/src/mesos/native/module.cpp 
python/native/src/mesos/native/module.hpp 
python/native/src/mesos/native/proxy_executor.cpp 
python/native/src/mesos/native/proxy_executor.hpp 
python/native/src/mesos/native/proxy_scheduler.cpp 
python/native/src/mesos/native/proxy_scheduler.hpp" || rm -f mesos.pb.cc 
../include/mesos/mesos.pb.h containerizer/containerizer.pb.cc 
../include/mesos/containerizer/containerizer.pb.h scheduler/scheduler.pb.cc 
../include/mesos/scheduler/scheduler.pb.h 
java/generated/org/apache/mesos/Protos.java 
java/generated/org/apache/mesos/containerizer/Protos.java 
python/interface/src/mesos/interface/mesos_pb2.py 
python/interface/src/mesos/interface/containerizer_pb2.py 
messages/messages.pb.cc messages/messages.pb.h messages/log.pb.cc 
messages/log.pb.h messages/state.pb.cc messages/state.pb.h 
master/registry.pb.cc master/registry.pb.h examples.jar 
../3rdparty/libprocess/3rdparty/protobuf-2.5.0/python/dist/protobuf-2.5.0-py2.7.egg
 python/dist/mesos-0.21.0-py2.7.egg 
python/interface/dist/mesos.interface-0.21.0-py2.7.egg 
python/native/dist/mesos.native-0.21.0-py2.7-linux-x86_64.egg python/*/build 
python/*/dist python/src/mesos/__init__.py 
python/interface/src/mesos/__init__.py 
python/interface/src/mesos/interface/__init__.py 
python/native/src/mesos/__init__.py python/native/src/mesos/native/__init__.py 
python/native/src/mesos/native/mesos_executor_driver_impl.cpp 
python/native/src/mesos/nati

Re: Review Request 25947: Dynamically change reap poll interval.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25947/#review54385
---


Patch looks great!

Reviews applied: [25947]

All tests passed.

- Mesos ReviewBot


On Sept. 24, 2014, 12:26 a.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25947/
> ---
> 
> (Updated Sept. 24, 2014, 12:26 a.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Ben Mahler, Craig Hansen-Sturm, and 
> Jie Yu.
> 
> 
> Bugs: MESOS-1199
> https://issues.apache.org/jira/browse/MESOS-1199
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> If pid count <= 50 then use 100 ms (<= 0.5% core usage), if count >= 500 use 
> 1000 ms (<= 1% core usage at 500 pids), else interpolate.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/reap.cpp b350ee15bf232493be01506037524b7590c0d2f5 
> 
> Diff: https://reviews.apache.org/r/25947/diff/
> 
> 
> Testing
> ---
> 
> Wrote test program which looped, calling kill(0, pid) and waitpid(pid, 
> &status, WNOHANG) on N children (still running) each loop. These are the 
> system calls used by the reaper. Computed load as wall time / (user + system 
> time). Tested on Linux and OSX.
> 
> `sudo ./bin/mesos-tests.sh` time reduced 20% from 6:40 to 5:20 (not running 
> Docker tests).
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Chi Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/
---

(Updated Sept. 24, 2014, 2:51 a.m.)


Review request for mesos, Jie Yu and Cong Wang.


Changes
---

addressed comments.


Bugs: Mesos-1808
https://issues.apache.org/jira/browse/Mesos-1808


Repository: mesos-git


Description
---

see Summary.


Diffs (updated)
-

  configure.ac da61c29 
  src/Makefile.am 9b973e5 
  src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
  src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
  src/linux/routing/internal.hpp dca0dc5 
  src/tests/routing_tests.cpp 10730ad 

Diff: https://reviews.apache.org/r/25948/diff/


Testing
---

added a test to use the code path. 


Thanks,

Chi Zhang



Re: Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2390

2014-09-23 Thread Benjamin Mahler
Adding google test's xml output, sorry for the noise. Might be a bit more
noise coming.

On Tue, Sep 23, 2014 at 7:43 PM, Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See <
> https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2390/changes
> >
>
> Changes:
>
> [bmahler] Refactored the environment-based test filtering.
>
> [bmahler] Removed duplicate error messaging in Docker::create.
>
> --
> [...truncated 60857 lines...]
> Using temporary directory
> '/tmp/RateLimitingTest_RateLimitingEnabled_ORvIyl'
> I0924 02:43:18.405431 24457 leveldb.cpp:176] Opened db in 2.642525ms
> I0924 02:43:18.406282 24457 leveldb.cpp:183] Compacted db in 827988ns
> I0924 02:43:18.406306 24457 leveldb.cpp:198] Created db iterator in 2825ns
> I0924 02:43:18.406322 24457 leveldb.cpp:204] Seeked to beginning of db in
> 538ns
> I0924 02:43:18.406334 24457 leveldb.cpp:273] Iterated through 0 keys in
> the db in 195ns
> I0924 02:43:18.406349 24457 replica.cpp:741] Replica recovered with log
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0924 02:43:18.406544 24483 recover.cpp:425] Starting replica recovery
> I0924 02:43:18.406780 24479 recover.cpp:451] Replica is in EMPTY status
> I0924 02:43:18.407866 24473 replica.cpp:638] Replica in EMPTY status
> received a broadcasted recover request
> I0924 02:43:18.407968 24475 recover.cpp:188] Received a recover response
> from a replica in EMPTY status
> I0924 02:43:18.408144 24477 recover.cpp:542] Updating replica status to
> STARTING
> I0924 02:43:18.408303 24477 master.cpp:286] Master
> 20140924-024318-3125920579-60293-24457 (penates.apache.org) started on
> 67.195.81.186:60293
> I0924 02:43:18.408341 24477 master.cpp:332] Master only allowing
> authenticated frameworks to register
> I0924 02:43:18.408359 24477 master.cpp:337] Master only allowing
> authenticated slaves to register
> I0924 02:43:18.408376 24477 credentials.hpp:36] Loading credentials for
> authentication from
> '/tmp/RateLimitingTest_RateLimitingEnabled_ORvIyl/credentials'
> I0924 02:43:18.408514 24477 master.cpp:366] Authorization enabled
> I0924 02:43:18.408581 24477 master.cpp:413] Framework rate limiting enabled
> I0924 02:43:18.408844 24480 master.cpp:120] No whitelist given.
> Advertising offers for all slaves
> I0924 02:43:18.408854 24484 leveldb.cpp:306] Persisting metadata (8 bytes)
> to leveldb took 583507ns
> I0924 02:43:18.408885 24484 replica.cpp:320] Persisted replica status to
> STARTING
> I0924 02:43:18.408895 24487 hierarchical_allocator_process.hpp:299]
> Initializing hierarchical allocator process with master :
> master@67.195.81.186:60293
> I0924 02:43:18.409106 24483 recover.cpp:451] Replica is in STARTING status
> I0924 02:43:18.409343 24478 master.cpp:1212] The newly elected leader is
> master@67.195.81.186:60293 with id 20140924-024318-3125920579-60293-24457
> I0924 02:43:18.409368 24478 master.cpp:1225] Elected as the leading master!
> I0924 02:43:18.409376 24478 master.cpp:1043] Recovering from registrar
> I0924 02:43:18.409625 24472 registrar.cpp:313] Recovering registrar
> I0924 02:43:18.409731 24477 replica.cpp:638] Replica in STARTING status
> received a broadcasted recover request
> I0924 02:43:18.409920 24477 recover.cpp:188] Received a recover response
> from a replica in STARTING status
> I0924 02:43:18.410126 24485 recover.cpp:542] Updating replica status to
> VOTING
> I0924 02:43:18.410691 24486 leveldb.cpp:306] Persisting metadata (8 bytes)
> to leveldb took 306314ns
> I0924 02:43:18.410712 24486 replica.cpp:320] Persisted replica status to
> VOTING
> I0924 02:43:18.410773 24482 recover.cpp:556] Successfully joined the Paxos
> group
> I0924 02:43:18.410868 24482 recover.cpp:440] Recover process terminated
> I0924 02:43:18.411387 24479 log.cpp:656] Attempting to start the writer
> I0924 02:43:18.411983 24478 replica.cpp:474] Replica received implicit
> promise request with proposal 1
> I0924 02:43:18.412117 24478 leveldb.cpp:306] Persisting metadata (8 bytes)
> to leveldb took 109389ns
> I0924 02:43:18.412137 24478 replica.cpp:342] Persisted promised to 1
> I0924 02:43:18.412495 24487 coordinator.cpp:230] Coordinator attemping to
> fill missing position
> I0924 02:43:18.413272 24485 replica.cpp:375] Replica received explicit
> promise request for position 0 with proposal 2
> I0924 02:43:18.413558 24485 leveldb.cpp:343] Persisting action (8 bytes)
> to leveldb took 262201ns
> I0924 02:43:18.413578 24485 replica.cpp:676] Persisted action at 0
> I0924 02:43:18.414124 24483 replica.cpp:508] Replica received write
> request for position 0
> I0924 02:43:18.414162 24483 leveldb.cpp:438] Reading position from leveldb
> took 13689ns
> I0924 02:43:18.414293 24483 leveldb.cpp:343] Persisting action (14 bytes)
> to leveldb took 109748ns
> I0924 02:43:18.414310 24483 replica.cpp:676] Persisted action at 0
> I0924 02:43:18.414820 24482 replica.cpp:655] Replica received learned
> notice fo

Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2390

2014-09-23 Thread Apache Jenkins Server
See 


Changes:

[bmahler] Refactored the environment-based test filtering.

[bmahler] Removed duplicate error messaging in Docker::create.

--
[...truncated 60857 lines...]
Using temporary directory '/tmp/RateLimitingTest_RateLimitingEnabled_ORvIyl'
I0924 02:43:18.405431 24457 leveldb.cpp:176] Opened db in 2.642525ms
I0924 02:43:18.406282 24457 leveldb.cpp:183] Compacted db in 827988ns
I0924 02:43:18.406306 24457 leveldb.cpp:198] Created db iterator in 2825ns
I0924 02:43:18.406322 24457 leveldb.cpp:204] Seeked to beginning of db in 538ns
I0924 02:43:18.406334 24457 leveldb.cpp:273] Iterated through 0 keys in the db 
in 195ns
I0924 02:43:18.406349 24457 replica.cpp:741] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0924 02:43:18.406544 24483 recover.cpp:425] Starting replica recovery
I0924 02:43:18.406780 24479 recover.cpp:451] Replica is in EMPTY status
I0924 02:43:18.407866 24473 replica.cpp:638] Replica in EMPTY status received a 
broadcasted recover request
I0924 02:43:18.407968 24475 recover.cpp:188] Received a recover response from a 
replica in EMPTY status
I0924 02:43:18.408144 24477 recover.cpp:542] Updating replica status to STARTING
I0924 02:43:18.408303 24477 master.cpp:286] Master 
20140924-024318-3125920579-60293-24457 (penates.apache.org) started on 
67.195.81.186:60293
I0924 02:43:18.408341 24477 master.cpp:332] Master only allowing authenticated 
frameworks to register
I0924 02:43:18.408359 24477 master.cpp:337] Master only allowing authenticated 
slaves to register
I0924 02:43:18.408376 24477 credentials.hpp:36] Loading credentials for 
authentication from 
'/tmp/RateLimitingTest_RateLimitingEnabled_ORvIyl/credentials'
I0924 02:43:18.408514 24477 master.cpp:366] Authorization enabled
I0924 02:43:18.408581 24477 master.cpp:413] Framework rate limiting enabled
I0924 02:43:18.408844 24480 master.cpp:120] No whitelist given. Advertising 
offers for all slaves
I0924 02:43:18.408854 24484 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 583507ns
I0924 02:43:18.408885 24484 replica.cpp:320] Persisted replica status to 
STARTING
I0924 02:43:18.408895 24487 hierarchical_allocator_process.hpp:299] 
Initializing hierarchical allocator process with master : 
master@67.195.81.186:60293
I0924 02:43:18.409106 24483 recover.cpp:451] Replica is in STARTING status
I0924 02:43:18.409343 24478 master.cpp:1212] The newly elected leader is 
master@67.195.81.186:60293 with id 20140924-024318-3125920579-60293-24457
I0924 02:43:18.409368 24478 master.cpp:1225] Elected as the leading master!
I0924 02:43:18.409376 24478 master.cpp:1043] Recovering from registrar
I0924 02:43:18.409625 24472 registrar.cpp:313] Recovering registrar
I0924 02:43:18.409731 24477 replica.cpp:638] Replica in STARTING status 
received a broadcasted recover request
I0924 02:43:18.409920 24477 recover.cpp:188] Received a recover response from a 
replica in STARTING status
I0924 02:43:18.410126 24485 recover.cpp:542] Updating replica status to VOTING
I0924 02:43:18.410691 24486 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 306314ns
I0924 02:43:18.410712 24486 replica.cpp:320] Persisted replica status to VOTING
I0924 02:43:18.410773 24482 recover.cpp:556] Successfully joined the Paxos group
I0924 02:43:18.410868 24482 recover.cpp:440] Recover process terminated
I0924 02:43:18.411387 24479 log.cpp:656] Attempting to start the writer
I0924 02:43:18.411983 24478 replica.cpp:474] Replica received implicit promise 
request with proposal 1
I0924 02:43:18.412117 24478 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 109389ns
I0924 02:43:18.412137 24478 replica.cpp:342] Persisted promised to 1
I0924 02:43:18.412495 24487 coordinator.cpp:230] Coordinator attemping to fill 
missing position
I0924 02:43:18.413272 24485 replica.cpp:375] Replica received explicit promise 
request for position 0 with proposal 2
I0924 02:43:18.413558 24485 leveldb.cpp:343] Persisting action (8 bytes) to 
leveldb took 262201ns
I0924 02:43:18.413578 24485 replica.cpp:676] Persisted action at 0
I0924 02:43:18.414124 24483 replica.cpp:508] Replica received write request for 
position 0
I0924 02:43:18.414162 24483 leveldb.cpp:438] Reading position from leveldb took 
13689ns
I0924 02:43:18.414293 24483 leveldb.cpp:343] Persisting action (14 bytes) to 
leveldb took 109748ns
I0924 02:43:18.414310 24483 replica.cpp:676] Persisted action at 0
I0924 02:43:18.414820 24482 replica.cpp:655] Replica received learned notice 
for position 0
I0924 02:43:18.414968 24482 leveldb.cpp:343] Persisting action (16 bytes) to 
leveldb took 119664ns
I0924 02:43:18.414991 24482 replica.cpp:676] Persisted action at 0
I0924 02:43:18.415011 24482 replica.cpp:661] Replica learned NOP action at 
position 0
I0924 02:43:18.415287 24487 log.cpp:672] Writer started with ending p

Re: Review Request 25864: Add 'Future cgroups::empty()'.

2014-09-23 Thread Ben Mahler


> On Sept. 22, 2014, 7:12 p.m., Ben Mahler wrote:
> > Why do you need this?
> > 
> > Seems like one would only watch for cgroup emptiness when destroying a 
> > cgroup, in which case, why split emptiness waiting from a successful 
> > destroy?
> 
> Ian Downes wrote:
> This is for use in conjunction with pid namespaces: we kill the leading 
> ('init') process and then we can wait until all processes have been killed by 
> the kernel. The LinuxLauncher continues to use the freezer cgroup for this 
> purpose (and for forward/backward compatibility).

Ok, 'emptied' is definitely a better name.

Seems unfortunate that we're pushing "watcher" logic into the cgroups library. 
It's not obvious why the empty() watching logic couldn't be placed alongside 
the code that is responsible for destruction of cgroups (including pid 
namespaces), could you add a TODO to reflect future cleanup?


- Ben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25864/#review54170
---


On Sept. 23, 2014, 11:39 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25864/
> ---
> 
> (Updated Sept. 23, 2014, 11:39 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Polls cgroups.procs until no processes in the cgroup. Poll interval and 
> timeout can be specified.
> 
> 
> Diffs
> -
> 
>   src/linux/cgroups.hpp abf31df1b4dbf6f715f93256b83c9996a45099cf 
>   src/linux/cgroups.cpp 62df4b7645c6ab061a47634058d79ca849caa6b9 
> 
> Diff: https://reviews.apache.org/r/25864/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25966: Use pid namespace in LinuxLauncher::destroy().

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25966/#review54375
---


Bad patch!

Reviews applied: [25864, 25865]

Failed command: git apply --index 25865.patch

Error:
 error: patch failed: src/Makefile.am:339
error: src/Makefile.am: patch does not apply
error: src/slave/containerizer/isolators/filesystem/shared.cpp: does not exist 
in index
error: patch failed: src/slave/containerizer/linux_launcher.cpp:106
error: src/slave/containerizer/linux_launcher.cpp: patch does not apply
error: patch failed: src/slave/containerizer/mesos/containerizer.cpp:40
error: src/slave/containerizer/mesos/containerizer.cpp: patch does not apply

- Mesos ReviewBot


On Sept. 23, 2014, 11:41 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25966/
> ---
> 
> (Updated Sept. 23, 2014, 11:41 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Check if a container is running in a pid namespace and thus all processes can 
> be killed by the kernel, rather than using the freezer.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/linux_launcher.cpp 
> f7bc894830a7ca3f55465dacc7b653cdc2d7758b 
> 
> Diff: https://reviews.apache.org/r/25966/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25965: Update libprocess Makefile for setns namechange.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25965/#review54374
---


Bad patch!

Reviews applied: [25863, 25965]

Failed command: make -j3 distcheck

Error:
 make  dist-gzip am__post_remove_distdir='@:'
make[1]: Entering directory 
`/home/jenkins/jenkins-slave/workspace/mesos-reviewbot'
if test -d "mesos-0.21.0"; then find "mesos-0.21.0" -type d ! -perm -200 -exec 
chmod u+w {} ';' && rm -rf "mesos-0.21.0" || { sleep 5 && rm -rf 
"mesos-0.21.0"; }; else :; fi
test -d "mesos-0.21.0" || mkdir "mesos-0.21.0"
 (cd 3rdparty && make  top_distdir=../mesos-0.21.0 
distdir=../mesos-0.21.0/3rdparty \
 am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: distdir)
make[2]: Entering directory 
`/home/jenkins/jenkins-slave/workspace/mesos-reviewbot/3rdparty'
 (cd libprocess && make  top_distdir=../../mesos-0.21.0 
distdir=../../mesos-0.21.0/3rdparty/libprocess \
 am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: distdir)
make[3]: Entering directory 
`/home/jenkins/jenkins-slave/workspace/mesos-reviewbot/3rdparty/libprocess'
:
test -d "../../mesos-0.21.0/3rdparty/libprocess" || mkdir 
"../../mesos-0.21.0/3rdparty/libprocess"
 (cd 3rdparty && make  top_distdir=../../../mesos-0.21.0 
distdir=../../../mesos-0.21.0/3rdparty/libprocess/3rdparty \
 am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: distdir)
make[4]: Entering directory 
`/home/jenkins/jenkins-slave/workspace/mesos-reviewbot/3rdparty/libprocess/3rdparty'
make[4]: *** No rule to make target `stout/tests/os/setns_tests.cpp', needed by 
`distdir'.  Stop.
make[4]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/mesos-reviewbot/3rdparty/libprocess/3rdparty'
make[3]: *** [distdir] Error 1
make[3]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/mesos-reviewbot/3rdparty/libprocess'
make[2]: *** [distdir] Error 1
make[2]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/mesos-reviewbot/3rdparty'
make[1]: *** [distdir] Error 1
make[1]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/mesos-reviewbot'
make: *** [dist] Error 2

- Mesos ReviewBot


On Sept. 23, 2014, 11:40 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25965/
> ---
> 
> (Updated Sept. 23, 2014, 11:40 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Update libprocess Makefile for setns namechange.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/isolators/network/port_mapping.cpp 
> 2766a00ff81dc550c21387f920666f81705db4f0 
> 
> Diff: https://reviews.apache.org/r/25965/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25947: Dynamically change reap poll interval.

2014-09-23 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25947/#review54368
---

Ship it!


:D


3rdparty/libprocess/src/reap.cpp


Why did you use auto here? size_t is very straightforward.

Maybe call this s/size/count/ to better match your constants?



3rdparty/libprocess/src/reap.cpp


Signed vs unsigned comparison? Looks like LOW_PID_COUNT and HIGH_PID_COUNT 
should be size_t.



3rdparty/libprocess/src/reap.cpp


You can leverage the Duration operators instead of needing Duration::create 
+ error handling, this cleans up the code quite a lot:

```
} else {
  // Linear interpolation.
  double percent = ((double) (size - LOW_PID_COUNT)) / (HIGH_PID_COUNT - 
LOW_PID_COUNT);

  return LOW_INTERVAL + (HIGH_INTERVAL - LOW_INTERVAL) * percent;
}
```

Adjust the math as you think is clearest. It might be nice to have the 
linear interpolation logic in one place, instead of defining the "gain" 
constant up above.


- Ben Mahler


On Sept. 24, 2014, 12:26 a.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25947/
> ---
> 
> (Updated Sept. 24, 2014, 12:26 a.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Ben Mahler, Craig Hansen-Sturm, and 
> Jie Yu.
> 
> 
> Bugs: MESOS-1199
> https://issues.apache.org/jira/browse/MESOS-1199
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> If pid count <= 50 then use 100 ms (<= 0.5% core usage), if count >= 500 use 
> 1000 ms (<= 1% core usage at 500 pids), else interpolate.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/reap.cpp b350ee15bf232493be01506037524b7590c0d2f5 
> 
> Diff: https://reviews.apache.org/r/25947/diff/
> 
> 
> Testing
> ---
> 
> Wrote test program which looped, calling kill(0, pid) and waitpid(pid, 
> &status, WNOHANG) on N children (still running) each loop. These are the 
> system calls used by the reaper. Computed load as wall time / (user + system 
> time). Tested on Linux and OSX.
> 
> `sudo ./bin/mesos-tests.sh` time reduced 20% from 6:40 to 5:20 (not running 
> Docker tests).
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25964: Update libprocess Makefile for setns namechange.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25964/#review54369
---


Patch looks great!

Reviews applied: [25863, 25964]

All tests passed.

- Mesos ReviewBot


On Sept. 23, 2014, 11:40 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25964/
> ---
> 
> (Updated Sept. 23, 2014, 11:40 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Update libprocess Makefile for setns namechange.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/3rdparty/Makefile.am 
> bd1dc8df0259a318a9171a9c045a223800e64f47 
> 
> Diff: https://reviews.apache.org/r/25964/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25947: Dynamically change reap poll interval.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25947/
---

(Updated Sept. 23, 2014, 5:26 p.m.)


Review request for mesos, Bernd Mathiske, Ben Mahler, Craig Hansen-Sturm, and 
Jie Yu.


Bugs: MESOS-1199
https://issues.apache.org/jira/browse/MESOS-1199


Repository: mesos-git


Description
---

If pid count <= 50 then use 100 ms (<= 0.5% core usage), if count >= 500 use 
1000 ms (<= 1% core usage at 500 pids), else interpolate.


Diffs (updated)
-

  3rdparty/libprocess/src/reap.cpp b350ee15bf232493be01506037524b7590c0d2f5 

Diff: https://reviews.apache.org/r/25947/diff/


Testing
---

Wrote test program which looped, calling kill(0, pid) and waitpid(pid, &status, 
WNOHANG) on N children (still running) each loop. These are the system calls 
used by the reaper. Computed load as wall time / (user + system time). Tested 
on Linux and OSX.

`sudo ./bin/mesos-tests.sh` time reduced 20% from 6:40 to 5:20 (not running 
Docker tests).


Thanks,

Ian Downes



Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/#review54367
---

Ship it!



src/linux/routing/diagnosis/diagnosis.hpp


No need to include these two?



src/linux/routing/diagnosis/diagnosis.hpp


Can you make it const? Here and everywhere else.



src/linux/routing/diagnosis/diagnosis.cpp


Fix the indent.



src/linux/routing/diagnosis/diagnosis.cpp


No need for this helper.



src/linux/routing/diagnosis/diagnosis.cpp


(State) idia...


- Jie Yu


On Sept. 23, 2014, 11:18 p.m., Chi Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25948/
> ---
> 
> (Updated Sept. 23, 2014, 11:18 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Cong Wang.
> 
> 
> Bugs: Mesos-1808
> https://issues.apache.org/jira/browse/Mesos-1808
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> see Summary.
> 
> 
> Diffs
> -
> 
>   configure.ac da61c29 
>   src/Makefile.am 9b973e5 
>   src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
>   src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
>   src/linux/routing/internal.hpp dca0dc5 
>   src/tests/routing_tests.cpp 10730ad 
> 
> Diff: https://reviews.apache.org/r/25948/diff/
> 
> 
> Testing
> ---
> 
> added a test to use the code path. 
> 
> 
> Thanks,
> 
> Chi Zhang
> 
>



Re: Review Request 25569: Refactor test environment validations

2014-09-23 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25569/#review54351
---

Ship it!


Thanks so much Tim, this is a great cleanup!

I'll fix the logical bugs and land this for you shortly.


src/tests/environment.cpp


Missing the include for 



src/tests/environment.cpp


Looks like the double negatives were pretty tricky per the bug below, I'll 
rename this to 'disabled' to make it a bit clearer.



src/tests/environment.cpp


This is a bug! Should be ||, not &&



src/tests/environment.cpp


|| here as well


- Ben Mahler


On Sept. 23, 2014, 10:02 p.m., Timothy Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25569/
> ---
> 
> (Updated Sept. 23, 2014, 10:02 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Review: https://reviews.apache.org/r/25569
> 
> 
> Diffs
> -
> 
>   src/tests/environment.cpp 2274251aaf653d83c2d03ef2186763978067a747 
> 
> Diff: https://reviews.apache.org/r/25569/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>



Re: Review Request 24177: Pass executor directory to Isolator::prepare().

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24177/
---

(Updated Sept. 23, 2014, 4:42 p.m.)


Review request for mesos, Ben Mahler and Jie Yu.


Repository: mesos-git


Description
---

Pass executor directory to Isolator::prepare().

Will be used for FilesystemIsolator.


Diffs (updated)
-

  src/slave/containerizer/isolator.hpp e52e8b15c740c62ef64b49897d3d6ae5179d4719 
  src/slave/containerizer/isolator.cpp 5e61bf2e3cf14be53d41aa657b4a78ab2dd6ecb0 
  src/slave/containerizer/isolators/cgroups/cpushare.hpp 
2187c296ea9b1a7de9ae3f09fdf1983f98a3d01b 
  src/slave/containerizer/isolators/cgroups/cpushare.cpp 
7164ecc0f068d4a72248521e3cbd345958efa880 
  src/slave/containerizer/isolators/cgroups/mem.hpp 
b1b4f5a2bd9e01b03fdfa74f187f7dee8119b812 
  src/slave/containerizer/isolators/cgroups/mem.cpp 
b3d4a5daa90a842e501bc6be2f0cf20fe22906ac 
  src/slave/containerizer/isolators/cgroups/perf_event.hpp 
f7283d830cd6af7b3c9006c098de0a6ad48b7c82 
  src/slave/containerizer/isolators/cgroups/perf_event.cpp 
4ced508e600e13f3e5ae9d12ea199de743def652 
  src/slave/containerizer/isolators/posix.hpp 
f120aafef96343d84f93c5636484509dc972a0a8 
  src/tests/isolator.hpp 89df4c4959c680354b002fa12e3a270a358087af 
  src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 

Diff: https://reviews.apache.org/r/24177/diff/


Testing
---

make check


Thanks,

Ian Downes



Re: Review Request 25549: Basic filesystem isolator for Linux.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25549/
---

(Updated Sept. 23, 2014, 4:42 p.m.)


Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone.


Bugs: MESOS-1586
https://issues.apache.org/jira/browse/MESOS-1586


Repository: mesos-git


Description
---

Does not report usage or enforce quota but can create 'private' directories for 
each container which mask parts of the shared host filesystem.

This review replaces https://reviews.apache.org/r/24178/ because of some file 
renaming. I addressed all comments from earlier reviews.


Diffs (updated)
-

  include/mesos/mesos.proto be45494b2c2f5c1295409889b70004462c6eba49 
  src/Makefile.am 9b973e5503e30180045e270220987ba647da8038 
  src/common/parse.hpp e6153d8a1f25bc9ddbe1e391306beeacfc8d5ff6 
  src/common/type_utils.hpp 480c0883fe6ed7f6a9daf77d83ebb077da2e66ee 
  src/slave/containerizer/isolators/filesystem/shared.hpp PRE-CREATION 
  src/slave/containerizer/isolators/filesystem/shared.cpp PRE-CREATION 
  src/slave/containerizer/isolators/network/port_mapping.cpp 
2766a00ff81dc550c21387f920666f81705db4f0 
  src/slave/containerizer/linux_launcher.cpp 
f7bc894830a7ca3f55465dacc7b653cdc2d7758b 
  src/slave/containerizer/mesos/containerizer.cpp 
9d083294caa5c5a47ba3ceaa1b57346144cb795c 
  src/slave/flags.hpp 32e51d214b0dbbb2f106236c6fa42ddec9774585 
  src/slave/slave.cpp 9a6646f0249fd43ae5d13bd9ee3b5da08412 
  src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 
  src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 

Diff: https://reviews.apache.org/r/25549/diff/


Testing
---

make check # added a test


Thanks,

Ian Downes



Review Request 25966: Use pid namespace in LinuxLauncher::destroy().

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25966/
---

Review request for mesos, Jie Yu and Vinod Kone.


Repository: mesos-git


Description
---

Check if a container is running in a pid namespace and thus all processes can 
be killed by the kernel, rather than using the freezer.


Diffs
-

  src/slave/containerizer/linux_launcher.cpp 
f7bc894830a7ca3f55465dacc7b653cdc2d7758b 

Diff: https://reviews.apache.org/r/25966/diff/


Testing
---


Thanks,

Ian Downes



Re: Review Request 25864: Add 'Future cgroups::empty()'.

2014-09-23 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25864/#review54362
---

Ship it!


Ship It!

- Timothy Chen


On Sept. 23, 2014, 11:39 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25864/
> ---
> 
> (Updated Sept. 23, 2014, 11:39 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Polls cgroups.procs until no processes in the cgroup. Poll interval and 
> timeout can be specified.
> 
> 
> Diffs
> -
> 
>   src/linux/cgroups.hpp abf31df1b4dbf6f715f93256b83c9996a45099cf 
>   src/linux/cgroups.cpp 62df4b7645c6ab061a47634058d79ca849caa6b9 
> 
> Diff: https://reviews.apache.org/r/25864/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/#review54361
---


Patch looks great!

Reviews applied: [25948]

All tests passed.

- Mesos ReviewBot


On Sept. 23, 2014, 11:18 p.m., Chi Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25948/
> ---
> 
> (Updated Sept. 23, 2014, 11:18 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Cong Wang.
> 
> 
> Bugs: Mesos-1808
> https://issues.apache.org/jira/browse/Mesos-1808
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> see Summary.
> 
> 
> Diffs
> -
> 
>   configure.ac da61c29 
>   src/Makefile.am 9b973e5 
>   src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
>   src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
>   src/linux/routing/internal.hpp dca0dc5 
>   src/tests/routing_tests.cpp 10730ad 
> 
> Diff: https://reviews.apache.org/r/25948/diff/
> 
> 
> Testing
> ---
> 
> added a test to use the code path. 
> 
> 
> Thanks,
> 
> Chi Zhang
> 
>



Review Request 25965: Update libprocess Makefile for setns namechange.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25965/
---

Review request for mesos and Ben Mahler.


Repository: mesos-git


Description
---

Update libprocess Makefile for setns namechange.


Diffs
-

  src/slave/containerizer/isolators/network/port_mapping.cpp 
2766a00ff81dc550c21387f920666f81705db4f0 

Diff: https://reviews.apache.org/r/25965/diff/


Testing
---


Thanks,

Ian Downes



Review Request 25964: Update libprocess Makefile for setns namechange.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25964/
---

Review request for mesos and Ben Mahler.


Repository: mesos-git


Description
---

Update libprocess Makefile for setns namechange.


Diffs
-

  3rdparty/libprocess/3rdparty/Makefile.am 
bd1dc8df0259a318a9171a9c045a223800e64f47 

Diff: https://reviews.apache.org/r/25964/diff/


Testing
---


Thanks,

Ian Downes



Re: Review Request 25865: Pid namespace isolator for the MesosContainerizer.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25865/
---

(Updated Sept. 23, 2014, 4:39 p.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Repository: mesos-git


Description
---

Add namespaces/pid to --isolation slave flag. Places executor into a pid 
namespace so it and all descendants will be contained in the namespace. 
Requires the filesystem/shared isolator so /proc and /sys are remounted to 
reflect the different namespace.


Diffs (updated)
-

  src/Makefile.am 9b973e5503e30180045e270220987ba647da8038 
  src/slave/containerizer/isolators/filesystem/shared.cpp PRE-CREATION 
  src/slave/containerizer/isolators/namespaces/pid.hpp PRE-CREATION 
  src/slave/containerizer/isolators/namespaces/pid.cpp PRE-CREATION 
  src/slave/containerizer/linux_launcher.cpp 
f7bc894830a7ca3f55465dacc7b653cdc2d7758b 
  src/slave/containerizer/mesos/containerizer.cpp 
9d083294caa5c5a47ba3ceaa1b57346144cb795c 
  src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 

Diff: https://reviews.apache.org/r/25865/diff/


Testing
---

Added test that command in pid namespaced container is in a different namespace 
and that the command is 'init' (verifies remount of /proc).


Thanks,

Ian Downes



Re: Review Request 25863: Rename stout/os/setns.hpp to namespaces.hpp.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25863/
---

(Updated Sept. 23, 2014, 4:39 p.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Repository: mesos-git


Description
---

Also, added os::getns(pid, ns) to get the namespace inode for comparisons 
between pids' namespaces.


Diffs (updated)
-

  3rdparty/libprocess/3rdparty/stout/Makefile.am 
2ee5a0bcc8bef0a5769dafc8ae54aea284993d6e 
  3rdparty/libprocess/3rdparty/stout/include/Makefile.am 
d4a8ad4e776bcfe1f008e561b5a92340f4d84bd9 
  3rdparty/libprocess/3rdparty/stout/include/stout/os/setns.hpp 
5278996f201a4a3d69282c1bd7b0d230d0f6cd39 
  3rdparty/libprocess/3rdparty/stout/tests/os/namespaces_tests.cpp PRE-CREATION 
  3rdparty/libprocess/3rdparty/stout/tests/os/setns_tests.cpp 
ad8e37aa2f5a29f8b421dde6b7cd5dfe241eabb5 

Diff: https://reviews.apache.org/r/25863/diff/


Testing
---

Added test to check a clone(NEW_PIDNS) results in a new pid namespace.


Thanks,

Ian Downes



Re: Review Request 25864: Add 'Future cgroups::empty()'.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25864/
---

(Updated Sept. 23, 2014, 4:39 p.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Changes
---

Addressed comments.


Repository: mesos-git


Description
---

Polls cgroups.procs until no processes in the cgroup. Poll interval and timeout 
can be specified.


Diffs (updated)
-

  src/linux/cgroups.hpp abf31df1b4dbf6f715f93256b83c9996a45099cf 
  src/linux/cgroups.cpp 62df4b7645c6ab061a47634058d79ca849caa6b9 

Diff: https://reviews.apache.org/r/25864/diff/


Testing
---


Thanks,

Ian Downes



Re: Mesos Modules Design

2014-09-23 Thread Niklas Nielsen
(Inlined) Thanks for chiming in!

On 23 September 2014 11:59, Vinod Kone  wrote:

> Ok. I finally had a chance to read the design doc, go through the comments
> on this thread and glance at the review. Here are my comments.
>
> I like the concept of dynamic loading of libraries when it comes to making
> the lives of end users/operators easy, esp when they want to mix and match
> the modules.
>
> I agree with Dominic though that we should make sure to minimize the
> burden/overhead of core developers when it comes to dealing with modules.
> FWIW, the implementation that I've seen seems easy enough to deal with and
> doesn't look like it adds much of a performance overhead.
>
> That said, the versioning part seems a bit complicated, as BenH alluded to.
> Can we come up with something that's simple as the first cut but still
> extensible? Also, it would be great if interface breakage could be
> automatically deduced at load time. But I guess we still need a module
> version to deal with protocol changes without interface changes. To ensure
> the devs update the versions in module manager, I suggest adding a comment
> on top of the current components (Allocator, Isolator etc) mentioning that
> there is a corresponding module that needs to be updated. Better yet, maybe
> we could use a naming convention (e.g., s/Allocator/AllocatorModule/) for
> easy visual cue.
>

Good point on naming explicitly. We can do that incrementally when we start
wiring modules up.
The easiest path forward (but a bit oversimplified and non-backwards
compatible) is to enforce module X to be built against the same version of
mesos as the one being loaded into. It is a minor change to the current
patch and per-subsystem versioning can be reenabled later on, when we gain
more confidence in the API stability of individual subsystems. I am not
religious there; we can start out strict.


>
> More importantly, I would like to understand how we are going to ensure the
> quality of modules being written. In other words how do we, as a community,
> ensure that a badly written module doesn't leave a bad taste in users
> adopting Mesos. Should we come up with a test suite (functional and
> performance) that runs on CI that module writers are required to test
> against? What is the contract for support when something in modules break?
> Should the users/operators ask on #mesos and dev lists or module writer's
> lists? Would the users even know where the problem lies? Would Mesos devs
> always know? I think this is where most of the burden/overhead lies. Of
> course, this is the same problem with frameworks, but I think this is much
> more relevant for modules because they (could) fundamentally change the
> behavior of Mesos.
>

The first module we planned to wire up would be isolator modules. The patch
is going to include a modularized isolator, tested against our existing
isolator unit tests. We can work on getting the necessary build tooling
setup so module implementors can run relevant unit tests (and wire up their
own).

I like the idea of mesos modules mailing list, as it (like with other
module / plugin systems) is on the users / module writers own
responsibility to be compatible, hardened and tested. The is a disclaimer
that is important to emphasize.


>
> Even more fundamentally, how are we going to ensure that Mesos core gets
> better features vs having the cool features developed as (possibly paid)
> modules? For example, should Kerberos auth and SSL transport be modules or
> should they be integrated into the core? As an open source project, how do
> we ensure that the community resources are properly utilized in a fair and
> neutral manner to help Mesos core grow? Are we going to have
> guidelines/opinions on what should/could be modularized or is everything
> fair game? It would be great to understand how other successful open source
> projects toe this line. Has anyone done any research regarding this?
>

I agree and I think we should document this in-depth. Like I mentioned
above, using 3rd party modules are without _any_ guarantee or support from
the Mesos developers. If we end up embracing / blessing modules, we can
pull those into the source base and test it in our CI. Apache Webserver use
this scheme and has graced modules (
http://httpd.apache.org/docs/current/mod/) and 3rd party / non-graced
modules (https://modules.apache.org)

The most commonly used (which I have been referring to): Apache Web Server.

Apart from the Apache web server, the Linux loadable modules is a great
example of big system that uses modules. See section 2.3 at
http://www.tldp.org/HOWTO/Module-HOWTO/x73.html for the case for modules
for linux. Lots of the advantages apply here as well (multiple ways to do
something don't want to rebuilt the kernel all the time)

As we go on and grow the capabilities of Mesos, I suspect we will need to
be able to address specialized setups: custom fitted isolation mechanisms
(which are going to be vastly different for clo

Re: Review Request 25549: Basic filesystem isolator for Linux.

2014-09-23 Thread Ian Downes


> On Sept. 23, 2014, 1 p.m., Vinod Kone wrote:
> > src/slave/containerizer/isolators/filesystem/shared.cpp, lines 92-99
> > 
> >
> > Hmm. How is this possible given you return None() on line #75?

Sorry, changes split across two reviews. Will correct.


> On Sept. 23, 2014, 1 p.m., Vinod Kone wrote:
> > src/slave/containerizer/isolators/filesystem/shared.cpp, line 200
> > 
> >
> > include error?

The error is simply that the directory doesn't exist, as detailed in the 
message.


> On Sept. 23, 2014, 1 p.m., Vinod Kone wrote:
> > src/slave/containerizer/isolators/filesystem/shared.cpp, line 211
> > 
> >
> > Where is "set -x" being added to commands? Should this if block be 
> > killed altogether?

Removed in subsequent commit (which will be merged back).


> On Sept. 23, 2014, 1 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp, lines 2486-2490
> > 
> >
> > Is there any particular reason the container was added to ExecutorInfo 
> > instead of TaskInfo?

Where would this be done? At the moment both Containerizer::launch() methods 
take an ExecutorInfo so I'd suggest we defer adding it to the TaskInfo until 
this behavior is changed.


- Ian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25549/#review54310
---


On Sept. 22, 2014, 11:45 a.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25549/
> ---
> 
> (Updated Sept. 22, 2014, 11:45 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone.
> 
> 
> Bugs: MESOS-1586
> https://issues.apache.org/jira/browse/MESOS-1586
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Does not report usage or enforce quota but can create 'private' directories 
> for each container which mask parts of the shared host filesystem.
> 
> This review replaces https://reviews.apache.org/r/24178/ because of some file 
> renaming. I addressed all comments from earlier reviews.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto dea51f94d130c131421c43e7fd774ceb8941f501 
>   src/Makefile.am 9b973e5503e30180045e270220987ba647da8038 
>   src/common/parse.hpp e6153d8a1f25bc9ddbe1e391306beeacfc8d5ff6 
>   src/common/type_utils.hpp 480c0883fe6ed7f6a9daf77d83ebb077da2e66ee 
>   src/slave/containerizer/isolators/filesystem/shared.hpp PRE-CREATION 
>   src/slave/containerizer/isolators/filesystem/shared.cpp PRE-CREATION 
>   src/slave/containerizer/linux_launcher.cpp 
> d5ef1d6aa762cf81a3e8384552d97fe95b9cbd95 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 9d083294caa5c5a47ba3ceaa1b57346144cb795c 
>   src/slave/flags.hpp 21e00212bc402674eaea73b44b3f91df477a7213 
>   src/slave/slave.cpp 1b3dc7370a2441e4159aa5ee552b64ca5e511e96 
>   src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 
>   src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 
> 
> Diff: https://reviews.apache.org/r/25549/diff/
> 
> 
> Testing
> ---
> 
> make check # added a test
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Chi Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/
---

(Updated Sept. 23, 2014, 11:18 p.m.)


Review request for mesos, Jie Yu and Cong Wang.


Changes
---

the members in struct Info can't be defined as const because:

http://stackoverflow.com/questions/634662/non-static-const-member-cant-use-default-assignment-operator


Bugs: Mesos-1808
https://issues.apache.org/jira/browse/Mesos-1808


Repository: mesos-git


Description
---

see Summary.


Diffs (updated)
-

  configure.ac da61c29 
  src/Makefile.am 9b973e5 
  src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
  src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
  src/linux/routing/internal.hpp dca0dc5 
  src/tests/routing_tests.cpp 10730ad 

Diff: https://reviews.apache.org/r/25948/diff/


Testing
---

added a test to use the code path. 


Thanks,

Chi Zhang



Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Chi Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/
---

(Updated Sept. 23, 2014, 10:20 p.m.)


Review request for mesos, Jie Yu and Cong Wang.


Changes
---

addressed comments.


Bugs: Mesos-1808
https://issues.apache.org/jira/browse/Mesos-1808


Repository: mesos-git


Description
---

see Summary.


Diffs (updated)
-

  configure.ac da61c29 
  src/Makefile.am 9b973e5 
  src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
  src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
  src/linux/routing/internal.hpp dca0dc5 
  src/tests/routing_tests.cpp 10730ad 

Diff: https://reviews.apache.org/r/25948/diff/


Testing
---

added a test to use the code path. 


Thanks,

Chi Zhang



Re: Review Request 25922: Explore disk io isolation in cgroups

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25922/#review54343
---


This work should be split into:
1. A blkio isolator that only provides block device usage statistics and does 
*not* enforce any limits. This is independently useful to understand the usage 
of tasks, and to guide decisions on what limits would be set.
2. Enhance the isolator to limit block device usage. This requires not only 
support from the isolator (as done in this review) but also elevating block 
device capabilities to a first class resource, i.e., specifying in each slave's 
resources, passing this to the master, including it in the allocator, and 
making it accessible to frameworks. This latter functionality is not addressed 
in this review but is necessary for limits to be used.

This would benefit from some discussion; could you please write up a design doc 
on the blkio isolator and make it accessible from the JIRA ticket?

- Ian Downes


On Sept. 22, 2014, 7:50 p.m., Joris Van Remoortere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25922/
> ---
> 
> (Updated Sept. 22, 2014, 7:50 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Following from: r25105
> Currently there is no disk isolation in place and this affects an executor to 
> be starved of disk when another disk heavy operation such as copying a multi 
> gigabyte file is being performed by another executor.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto be45494 
>   include/mesos/resources.hpp 0e37170 
>   src/Makefile.am 9b973e5 
>   src/common/resources.cpp e9a0c85 
>   src/linux/cgroups.hpp abf31df 
>   src/linux/cgroups.cpp 5093b4c 
>   src/slave/containerizer/isolators/cgroups/blkio.hpp PRE-CREATION 
>   src/slave/containerizer/isolators/cgroups/blkio.cpp PRE-CREATION 
>   src/slave/containerizer/mesos/containerizer.cpp 9d08329 
>   src/tests/isolator_tests.cpp c38f876 
> 
> Diff: https://reviews.apache.org/r/25922/diff/
> 
> 
> Testing
> ---
> 
> make check
> sudo bin/mesos-tests.sh --gtest_filter="*BlkIO*"
> support/mesos-style.py
> 
> 
> Thanks,
> 
> Joris Van Remoortere
> 
>



Re: Review Request 25569: Refactor test environment validations

2014-09-23 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25569/
---

(Updated Sept. 23, 2014, 10:02 p.m.)


Review request for mesos and Ben Mahler.


Repository: mesos-git


Description
---

Review: https://reviews.apache.org/r/25569


Diffs (updated)
-

  src/tests/environment.cpp 2274251aaf653d83c2d03ef2186763978067a747 

Diff: https://reviews.apache.org/r/25569/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 25865: Pid namespace isolator for the MesosContainerizer.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25865/#review54338
---


Bad patch!

Reviews applied: [25864, 25865]

Failed command: git apply --index 25865.patch

Error:
 error: patch failed: src/Makefile.am:339
error: src/Makefile.am: patch does not apply
error: patch failed: src/slave/containerizer/linux_launcher.cpp:95
error: src/slave/containerizer/linux_launcher.cpp: patch does not apply
error: patch failed: src/slave/containerizer/mesos/containerizer.cpp:40
error: src/slave/containerizer/mesos/containerizer.cpp: patch does not apply

- Mesos ReviewBot


On Sept. 23, 2014, 7:35 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25865/
> ---
> 
> (Updated Sept. 23, 2014, 7:35 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Add namespaces/pid to --isolation slave flag. Places executor into a pid 
> namespace so it and all descendants will be contained in the namespace. 
> Requires the filesystem/shared isolator so /proc and /sys are remounted to 
> reflect the different namespace.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 9b973e5503e30180045e270220987ba647da8038 
>   src/slave/containerizer/isolators/namespaces/pid.hpp PRE-CREATION 
>   src/slave/containerizer/isolators/namespaces/pid.cpp PRE-CREATION 
>   src/slave/containerizer/linux_launcher.cpp 
> d5ef1d6aa762cf81a3e8384552d97fe95b9cbd95 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 9d083294caa5c5a47ba3ceaa1b57346144cb795c 
>   src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 
> 
> Diff: https://reviews.apache.org/r/25865/diff/
> 
> 
> Testing
> ---
> 
> Added test that command in pid namespaced container is in a different 
> namespace and that the command is 'init' (verifies remount of /proc).
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25569: Refactor test environment validations

2014-09-23 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25569/
---

(Updated Sept. 23, 2014, 9:24 p.m.)


Review request for mesos and Ben Mahler.


Repository: mesos-git


Description
---

Review: https://reviews.apache.org/r/25569


Diffs (updated)
-

  src/tests/environment.cpp 2274251aaf653d83c2d03ef2186763978067a747 

Diff: https://reviews.apache.org/r/25569/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Cong Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/#review54336
---


Just FYI, libnl still only supports v1 idiag, which is not as flexible as v2, 
this is probably why Chi still got both IPv4 and IPv6 connections even if he 
specifies AF_INET. Anyway, we can always filter the dump by ourself, so not a 
big deal.

- Cong Wang


On Sept. 23, 2014, 6:57 p.m., Chi Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25948/
> ---
> 
> (Updated Sept. 23, 2014, 6:57 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Cong Wang.
> 
> 
> Bugs: Mesos-1808
> https://issues.apache.org/jira/browse/Mesos-1808
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> see Summary.
> 
> 
> Diffs
> -
> 
>   configure.ac da61c29 
>   src/Makefile.am 9b973e5 
>   src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
>   src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
>   src/linux/routing/diagnosis/internal.hpp PRE-CREATION 
>   src/linux/routing/internal.hpp dca0dc5 
>   src/tests/routing_tests.cpp 10730ad 
> 
> Diff: https://reviews.apache.org/r/25948/diff/
> 
> 
> Testing
> ---
> 
> added a test to use the code path. 
> 
> 
> Thanks,
> 
> Chi Zhang
> 
>



Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/#review54331
---


Patch looks great!

Reviews applied: [25948]

All tests passed.

- Mesos ReviewBot


On Sept. 23, 2014, 6:57 p.m., Chi Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25948/
> ---
> 
> (Updated Sept. 23, 2014, 6:57 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Cong Wang.
> 
> 
> Bugs: Mesos-1808
> https://issues.apache.org/jira/browse/Mesos-1808
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> see Summary.
> 
> 
> Diffs
> -
> 
>   configure.ac da61c29 
>   src/Makefile.am 9b973e5 
>   src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
>   src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
>   src/linux/routing/diagnosis/internal.hpp PRE-CREATION 
>   src/linux/routing/internal.hpp dca0dc5 
>   src/tests/routing_tests.cpp 10730ad 
> 
> Diff: https://reviews.apache.org/r/25948/diff/
> 
> 
> Testing
> ---
> 
> added a test to use the code path. 
> 
> 
> Thanks,
> 
> Chi Zhang
> 
>



Re: Review Request 25569: Refactor test environment validations

2014-09-23 Thread Timothy Chen


> On Sept. 23, 2014, 7:10 p.m., Ben Mahler wrote:
> > src/tests/environment.cpp, lines 57-59
> > 
> >
> > Missing includes for these?
> > 
> > #include 
> > #include 

I think it's already included, but I'll add it in case.


> On Sept. 23, 2014, 7:10 p.m., Ben Mahler wrote:
> > src/tests/environment.cpp, lines 152-160
> > 
> >
> > The other CgroupsParamTypeFilter checks for "root" user, but this one 
> > doesn't?

That's identical to what the original environment validation does.


> On Sept. 23, 2014, 7:10 p.m., Ben Mahler wrote:
> > src/tests/environment.cpp, line 210
> > 
> >
> > Any reason to for the conversion to an Option?
> > 
> > If possible, we can just keep it as a Try:
> > s/Option/Try/
> > 
> > That way, the check read a bit more directly:
> > 
> > ```
> > return matches(test, "DOCKER_") && docker.isError();
> > ```
> > vs. existing patch:
> > ```
> > return matches(test, "DOCKER_") && dockerError.isSome();
> > ```

I tried that, but I can't compile since Try has no default constructor, 
and I don't think I have anything useful to initialize it with. You have 
suggestions?


> On Sept. 23, 2014, 7:10 p.m., Ben Mahler wrote:
> > src/tests/environment.cpp, lines 224-241
> > 
> >
> > Can this one be collapsed into the CgropusFilter? Seems a bit confusing 
> > to split the cgroup filter logic into two separate filters.

For some reason the original validation was testing both isRoot and Cgroups 
Params the same type, therefore I split it into two filters.


> On Sept. 23, 2014, 7:10 p.m., Ben Mahler wrote:
> > src/tests/environment.cpp, lines 249-269
> > 
> >
> > Does the following seem a bit easier to read? Trying to keep it as only 
> > one logical structure (only 1 "matches" expression, but what happens is 
> > controlled by the #ifdef):
> > 
> > 
> > ```
> > #ifdef WITH_NETWORK_ISOLATOR
> > if (matches(test, "MultipleSlaves")) {
> >   return !routing::check().isError();
> > }
> > #endif
> > 
> > if (matches(test, "PortMappingIsolatorTest") ||
> > matches(test, "PortMappingMesosTest")) {
> > #ifdef WITH_NETWORK_ISOLATOR
> >   return routing::check().isError();
> > #else
> >   return true;
> > #endif
> > }
> > 
> > return false;
> > ```
> > 
> > vs. current patch:
> > 
> > 
> > ```
> > #ifdef WITH_NETWORK_ISOLATOR
> > if (matches(test, "MultipleSlaves")) {
> >   return !routing::check().isError();
> > }
> > 
> > if (matches(test, "PortMappingIsolatorTest") ||
> > matches(test, "PortMappingMesosTest")) {
> >   return routing::check().isError();
> > }
> > 
> > return false;
> > #else
> > return matches(test, "PortMappingIsolatorTest") ||
> >   matches(test, "PortMappingMesosTest");
> > #endif // WITH_NETWORK_ISOLATOR 
> >   }
> > ```
> > 
> > FWIW, this is what you did for CgroupsFilter as well :)

:) thx, I'll be consistent in my style then!


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25569/#review54300
---


On Sept. 23, 2014, 5:28 a.m., Timothy Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25569/
> ---
> 
> (Updated Sept. 23, 2014, 5:28 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Review: https://reviews.apache.org/r/25569
> 
> 
> Diffs
> -
> 
>   src/tests/environment.cpp 2274251aaf653d83c2d03ef2186763978067a747 
> 
> Diff: https://reviews.apache.org/r/25569/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>



Re: Review Request 25549: Basic filesystem isolator for Linux.

2014-09-23 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25549/#review54310
---



src/slave/containerizer/isolators/filesystem/shared.cpp


Should "namespaces/pid" check be added as part of the pid namespace patch? 
Did you add it here because the pid patch is independent from this and either 
could land first?



src/slave/containerizer/isolators/filesystem/shared.cpp


Can you add some comments on why we need to mount /proc and /sys?



src/slave/containerizer/isolators/filesystem/shared.cpp


Hmm. How is this possible given you return None() on line #75?



src/slave/containerizer/isolators/filesystem/shared.cpp


s/container/Container/



src/slave/containerizer/isolators/filesystem/shared.cpp


s/host/Host/



src/slave/containerizer/isolators/filesystem/shared.cpp


s/host_ ath/host_path/



src/slave/containerizer/isolators/filesystem/shared.cpp


s/container_path/container path/



src/slave/containerizer/isolators/filesystem/shared.cpp


include error?



src/slave/containerizer/isolators/filesystem/shared.cpp


include error?



src/slave/containerizer/isolators/filesystem/shared.cpp


Where is "set -x" being added to commands? Should this if block be killed 
altogether?



src/slave/containerizer/isolators/filesystem/shared.cpp


s/mounts done/mounts is/



src/slave/slave.cpp


Is there any particular reason the container was added to ExecutorInfo 
instead of TaskInfo?


- Vinod Kone


On Sept. 22, 2014, 6:45 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25549/
> ---
> 
> (Updated Sept. 22, 2014, 6:45 p.m.)
> 
> 
> Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone.
> 
> 
> Bugs: MESOS-1586
> https://issues.apache.org/jira/browse/MESOS-1586
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Does not report usage or enforce quota but can create 'private' directories 
> for each container which mask parts of the shared host filesystem.
> 
> This review replaces https://reviews.apache.org/r/24178/ because of some file 
> renaming. I addressed all comments from earlier reviews.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto dea51f94d130c131421c43e7fd774ceb8941f501 
>   src/Makefile.am 9b973e5503e30180045e270220987ba647da8038 
>   src/common/parse.hpp e6153d8a1f25bc9ddbe1e391306beeacfc8d5ff6 
>   src/common/type_utils.hpp 480c0883fe6ed7f6a9daf77d83ebb077da2e66ee 
>   src/slave/containerizer/isolators/filesystem/shared.hpp PRE-CREATION 
>   src/slave/containerizer/isolators/filesystem/shared.cpp PRE-CREATION 
>   src/slave/containerizer/linux_launcher.cpp 
> d5ef1d6aa762cf81a3e8384552d97fe95b9cbd95 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 9d083294caa5c5a47ba3ceaa1b57346144cb795c 
>   src/slave/flags.hpp 21e00212bc402674eaea73b44b3f91df477a7213 
>   src/slave/slave.cpp 1b3dc7370a2441e4159aa5ee552b64ca5e511e96 
>   src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 
>   src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 
> 
> Diff: https://reviews.apache.org/r/25549/diff/
> 
> 
> Testing
> ---
> 
> make check # added a test
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25622: Update the Mesos Style Guide with C++11 and naming notes.

2014-09-23 Thread Ben Mahler


> On Sept. 22, 2014, 7:19 p.m., Ben Mahler wrote:
> > docs/mesos-c++-style-guide.md, lines 96-99
> > 
> >
> > Why would the iterator be called `containerizer`?
> > 
> > s/containerizer/iterator/ ?
> 
> Dominic Hamon wrote:
> -1
> 
> naming a variable after a type is never a good idea. in this case, you're 
> getting a containerizer (iterator) from the container of containerizers so 
> the name 'containerizer' makes sense.
> 
> Ben Mahler wrote:
> Sounds confusing.
> 
> Ben Mahler wrote:
> If 'auto' was not used here, would we call this 'containerizer'? In a 
> loop, this would typically be called `iterator`, no?
> 
> ```
> for (auto iterator = containerizers.begin(); iterator != 
> containerizers.end(); ++iterator) {
>   Containerizer* containerizer = *iterator;
> }
> ```
> 
> Why do something differently when auto is used?
> 
> If the iterator was being "de-referenced" then `containerizer` makes 
> sense:
> 
> ```
>   Containerizer* containerizer = *(containerizers.begin());
> ```
> 
> Alexander Rukletsov wrote:
> I agree with Dominic: it's more important what is stored in the container 
> and not how we access it (iterator, reference, etc.). Actually, the example 
> is taken from our code base, see `src/slave/containerizer/composing.cpp:394`

Ok, since this example uses `containerizer` as a reference to the 
`Containerizer*`, as opposed to an iterator, your points make sense. But in 
general I don't think this is a pattern we'll want in our code because of the 
masquerading types now hidden with 'auto'.

Iterators are not as simple as a pointer or reference. What I find unfortunate 
is that we wouldn't apply the same naming scheme as soon as we change the 
container type, which affects the iterator type:

```
map containerizers;
auto containerizer = containerizers.begin(); // Wouldn't do this.
```

How about a different example here with .find() as opposed to .begin()? Take a 
look at cache.hpp as an example:

```
  // Evict the least-recently used element from the cache.
  void evict()
  {
const typename map::iterator& i = values.find(keys.front());
CHECK(i != values.end());
values.erase(i);
keys.pop_front();
  }
```

Here we definitely care about the iterator, as opposed to the value, and would 
name 'i' accordingly.


- Ben


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25622/#review54173
---


On Sept. 22, 2014, 1:10 p.m., Alexander Rukletsov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25622/
> ---
> 
> (Updated Sept. 22, 2014, 1:10 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, Dominic Hamon, and 
> Till Toenshoff.
> 
> 
> Bugs: MESOS-1793
> https://issues.apache.org/jira/browse/MESOS-1793
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Explicitly prohibit the use of namespace aliases. The discussion about using 
> namespace aliases took place in [the other 
> review](https://reviews.apache.org/r/25434/#comment91754). The majority 
> agreed not to introduce them in code.
> 
> Give examples of preferable way of using auto.
> 
> 
> Diffs
> -
> 
>   docs/mesos-c++-style-guide.md 59a39df 
> 
> Diff: https://reviews.apache.org/r/25622/diff/
> 
> 
> Testing
> ---
> 
> Documentation change, no `make check` needed.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>



Re: Review Request 25865: Pid namespace isolator for the MesosContainerizer.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25865/
---

(Updated Sept. 23, 2014, 12:35 p.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Repository: mesos-git


Description
---

Add namespaces/pid to --isolation slave flag. Places executor into a pid 
namespace so it and all descendants will be contained in the namespace. 
Requires the filesystem/shared isolator so /proc and /sys are remounted to 
reflect the different namespace.


Diffs
-

  src/Makefile.am 9b973e5503e30180045e270220987ba647da8038 
  src/slave/containerizer/isolators/namespaces/pid.hpp PRE-CREATION 
  src/slave/containerizer/isolators/namespaces/pid.cpp PRE-CREATION 
  src/slave/containerizer/linux_launcher.cpp 
d5ef1d6aa762cf81a3e8384552d97fe95b9cbd95 
  src/slave/containerizer/mesos/containerizer.cpp 
9d083294caa5c5a47ba3ceaa1b57346144cb795c 
  src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 

Diff: https://reviews.apache.org/r/25865/diff/


Testing
---

Added test that command in pid namespaced container is in a different namespace 
and that the command is 'init' (verifies remount of /proc).


Thanks,

Ian Downes



Re: Review Request 25864: Add 'Future cgroups::empty()'.

2014-09-23 Thread Ian Downes


> On Sept. 22, 2014, 12:12 p.m., Ben Mahler wrote:
> > Why do you need this?
> > 
> > Seems like one would only watch for cgroup emptiness when destroying a 
> > cgroup, in which case, why split emptiness waiting from a successful 
> > destroy?

This is for use in conjunction with pid namespaces: we kill the leading 
('init') process and then we can wait until all processes have been killed by 
the kernel. The LinuxLauncher continues to use the freezer cgroup for this 
purpose (and for forward/backward compatibility).


- Ian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25864/#review54170
---


On Sept. 22, 2014, 11:46 a.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25864/
> ---
> 
> (Updated Sept. 22, 2014, 11:46 a.m.)
> 
> 
> Review request for mesos, Jie Yu and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Polls cgroups.procs until no processes in the cgroup. Poll interval and 
> timeout can be specified.
> 
> 
> Diffs
> -
> 
>   src/linux/cgroups.hpp abf31df1b4dbf6f715f93256b83c9996a45099cf 
>   src/linux/cgroups.cpp 5093b4ca1ac17238234d96613b7f4ceab4373c48 
> 
> Diff: https://reviews.apache.org/r/25864/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25865: Pid namespace isolator for the MesosContainerizer.

2014-09-23 Thread Ian Downes


> On Sept. 22, 2014, 12:04 p.m., Timothy Chen wrote:
> > src/slave/containerizer/isolators/namespaces/pid.cpp, line 66
> > 
> >
> > Might be missing something of how pid namespaces work, but I don't see 
> > where we clean up the BIND_MOUNT_ROOT. Why do we have to (or can) assume 
> > this holder doesn't exist on launch?

The BIND_MOUNT_ROOT dir contains files to which we bind mount from each pid 
namespace. This is not cleaned up. The contents are cleaned up in cleanup() 
with a lazy umount and rm.


- Ian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25865/#review54169
---


On Sept. 22, 2014, 11:46 a.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25865/
> ---
> 
> (Updated Sept. 22, 2014, 11:46 a.m.)
> 
> 
> Review request for mesos, Jie Yu and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Add namespaces/pid to --isolation slave flag. Places executor into a pid 
> namespace so it and all descendants will be contained in the namespace. 
> Requires the filesystem/shared isolator so /proc and /sys are remounted to 
> reflect the different namespace.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 9b973e5503e30180045e270220987ba647da8038 
>   src/slave/containerizer/isolators/namespaces/pid.hpp PRE-CREATION 
>   src/slave/containerizer/isolators/namespaces/pid.cpp PRE-CREATION 
>   src/slave/containerizer/linux_launcher.cpp 
> d5ef1d6aa762cf81a3e8384552d97fe95b9cbd95 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 9d083294caa5c5a47ba3ceaa1b57346144cb795c 
>   src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 
> 
> Diff: https://reviews.apache.org/r/25865/diff/
> 
> 
> Testing
> ---
> 
> Added test that command in pid namespaced container is in a different 
> namespace and that the command is 'init' (verifies remount of /proc).
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25947: Dynamically change reap poll interval.

2014-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25947/#review54309
---


Patch looks great!

Reviews applied: [25947]

All tests passed.

- Mesos ReviewBot


On Sept. 23, 2014, 6:05 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25947/
> ---
> 
> (Updated Sept. 23, 2014, 6:05 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Ben Mahler, Craig Hansen-Sturm, and 
> Jie Yu.
> 
> 
> Bugs: MESOS-1199
> https://issues.apache.org/jira/browse/MESOS-1199
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> If pid count <= 50 then use 100 ms (<= 0.5% core usage), if count >= 500 use 
> 1000 ms (<= 1% core usage at 500 pids), else interpolate.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/reap.cpp b350ee15bf232493be01506037524b7590c0d2f5 
> 
> Diff: https://reviews.apache.org/r/25947/diff/
> 
> 
> Testing
> ---
> 
> Wrote test program which looped, calling kill(0, pid) and waitpid(pid, 
> &status, WNOHANG) on N children (still running) each loop. These are the 
> system calls used by the reaper. Computed load as wall time / (user + system 
> time). Tested on Linux and OSX.
> 
> `sudo ./bin/mesos-tests.sh` time reduced 20% from 6:40 to 5:20 (not running 
> Docker tests).
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25947: Dynamically change reap poll interval.

2014-09-23 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25947/#review54308
---



3rdparty/libprocess/src/reap.cpp


const?



3rdparty/libprocess/src/reap.cpp


Just curious, what is the guideline with autos now?


- Timothy Chen


On Sept. 23, 2014, 6:05 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25947/
> ---
> 
> (Updated Sept. 23, 2014, 6:05 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Ben Mahler, Craig Hansen-Sturm, and 
> Jie Yu.
> 
> 
> Bugs: MESOS-1199
> https://issues.apache.org/jira/browse/MESOS-1199
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> If pid count <= 50 then use 100 ms (<= 0.5% core usage), if count >= 500 use 
> 1000 ms (<= 1% core usage at 500 pids), else interpolate.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/reap.cpp b350ee15bf232493be01506037524b7590c0d2f5 
> 
> Diff: https://reviews.apache.org/r/25947/diff/
> 
> 
> Testing
> ---
> 
> Wrote test program which looped, calling kill(0, pid) and waitpid(pid, 
> &status, WNOHANG) on N children (still running) each loop. These are the 
> system calls used by the reaper. Computed load as wall time / (user + system 
> time). Tested on Linux and OSX.
> 
> `sudo ./bin/mesos-tests.sh` time reduced 20% from 6:40 to 5:20 (not running 
> Docker tests).
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Review Request 25551: Add standard versioning to shared libmesos.so

2014-09-23 Thread Timothy St. Clair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25551/
---

(Updated Sept. 23, 2014, 7:16 p.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Bugs: MESOS-1675
https://issues.apache.org/jira/browse/MESOS-1675


Repository: mesos-git


Description
---

Add standard -version-info to shared libmesos, it will need to be updated on 
major modifications.


Diffs
-

  src/Makefile.am 9b973e5 

Diff: https://reviews.apache.org/r/25551/diff/


Testing
---

make check 


Thanks,

Timothy St. Clair



Re: Review Request 25922: Explore disk io isolation in cgroups

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25922/#review54294
---



include/mesos/mesos.proto


I think we agreed to call these blk_read_total_iops, etc., to match the 
cgroup statistic and other naming?

For cpus and memory we also include the limits that are in place, e.g., 
cpus_limit; add these for blk too.



src/linux/cgroups.hpp


This seems generally useful, probably belongs in linux/fs.hpp



src/linux/cgroups.hpp


s/aggregated stat information/aggregate statistics/



src/linux/cgroups.hpp


aggregateStatistics?



src/linux/cgroups.hpp


no snake case, unless it's matching to a cgroup special file.

put in linux/fs.hpp



src/linux/cgroups.hpp


why a char * here: what about separate read_limit_in_bytes and 
write_limit_in_bytes? or an enum?



src/linux/cgroups.hpp


s/bytes/iops/



src/linux/cgroups.cpp


open brace on new line



src/linux/cgroups.cpp


new line before return



src/linux/cgroups.cpp


linux/fs.hpp



src/linux/cgroups.cpp


new line before return



src/linux/cgroups.cpp


open brace on new line



src/linux/cgroups.cpp


strcmp is pretty nasty...



src/linux/cgroups.cpp


new line



src/linux/cgroups.cpp


s/error/Error/



src/linux/cgroups.cpp


new line



src/linux/cgroups.cpp


can avoid this if separate read write functions



src/linux/cgroups.cpp


s/part/partition/



src/linux/cgroups.cpp


s/value/limit/



src/linux/cgroups.cpp


new line



src/linux/cgroups.cpp


s/error/Error/



src/linux/cgroups.cpp


new line



src/linux/cgroups.cpp


new line



src/linux/cgroups.cpp


s/error/Error/



src/linux/cgroups.cpp


new line



src/linux/cgroups.cpp


s/part/partition/



src/linux/cgroups.cpp


new line 
s/value/limit/



src/linux/cgroups.cpp


s/error/Error/



src/linux/cgroups.cpp


new line



src/linux/cgroups.cpp


s/  / /



src/slave/containerizer/isolators/cgroups/blkio.hpp


not needed here?



src/slave/containerizer/isolators/cgroups/blkio.cpp


not needed?



src/slave/containerizer/isolators/cgroups/blkio.cpp


new line



src/slave/containerizer/isolators/cgroups/blkio.cpp


single line or correct indentation



src/slave/containerizer/isolators/cgroups/blkio.cpp


This will cause container launch to fail. what about logging that this is 
not implemented and returning a Future() ?



src/slave/containerizer/isolators/cgroups/blkio.cpp


s/r_bps/read_bps/



src/slave/containerizer/isolators/cgroups/blkio.cpp


ditto, and others



src/slave/containerizer/isolators/cgroups/blkio.cpp


fix indentation, here and others, see style guide, e.g.

Try write = cgroups::blkio::limit_in_bytes(
  hierarchy,
  info->cgroup,
  "read",
  r_bps.get());



src/slave/containerizer/isolators/cgroups/blkio.cpp


Re: Review Request 25569: Refactor test environment validations

2014-09-23 Thread Ben Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25569/#review54300
---


This is coming together really nicely Tim! Just some minor cleanups left at 
this point, and then we can get this committed! :)


src/tests/environment.cpp


Missing includes for these?

#include 
#include 



src/tests/environment.cpp


At this point I think these methods are self-evident, but I will leave it 
to you as to whether you think these comments are still beneficial. :)



src/tests/environment.cpp


const string& pattern



src/tests/environment.cpp


The other CgroupsParamTypeFilter checks for "root" user, but this one 
doesn't?



src/tests/environment.cpp


Any reason to for the conversion to an Option?

If possible, we can just keep it as a Try:
s/Option/Try/

That way, the check read a bit more directly:

```
return matches(test, "DOCKER_") && docker.isError();
```
vs. existing patch:
```
return matches(test, "DOCKER_") && dockerError.isSome();
```



src/tests/environment.cpp


Can this one be collapsed into the CgropusFilter? Seems a bit confusing to 
split the cgroup filter logic into two separate filters.



src/tests/environment.cpp


Does the following seem a bit easier to read? Trying to keep it as only one 
logical structure (only 1 "matches" expression, but what happens is controlled 
by the #ifdef):

```
#ifdef WITH_NETWORK_ISOLATOR
if (matches(test, "MultipleSlaves")) {
  return !routing::check().isError();
}
#endif

if (matches(test, "PortMappingIsolatorTest") ||
matches(test, "PortMappingMesosTest")) {
#ifdef WITH_NETWORK_ISOLATOR
  return routing::check().isError();
#else
  return true;
#endif
}

return false;
```

vs. current patch:

```
#ifdef WITH_NETWORK_ISOLATOR
if (matches(test, "MultipleSlaves")) {
  return !routing::check().isError();
}

if (matches(test, "PortMappingIsolatorTest") ||
matches(test, "PortMappingMesosTest")) {
  return routing::check().isError();
}

return false;
#else
return matches(test, "PortMappingIsolatorTest") ||
  matches(test, "PortMappingMesosTest");
#endif // WITH_NETWORK_ISOLATOR 
  }
```

FWIW, this is what you did for CgroupsFilter as well :)



src/tests/environment.cpp


How about s/disabledTests/disabled/ ?



src/tests/environment.cpp


Looks like you're always adding two colons because of the outer 
strings::join(). You shouldn't need the ":" here anymore, right?

How about the following to be consistent with the rest of our code 
formatting:

```
disabled.push_back(
test->test_case_name() + string(".") + test->name());
```


- Ben Mahler


On Sept. 23, 2014, 5:28 a.m., Timothy Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25569/
> ---
> 
> (Updated Sept. 23, 2014, 5:28 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Review: https://reviews.apache.org/r/25569
> 
> 
> Diffs
> -
> 
>   src/tests/environment.cpp 2274251aaf653d83c2d03ef2186763978067a747 
> 
> Diff: https://reviews.apache.org/r/25569/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>



Re: Review Request 25863: Rename stout/os/setns.hpp to namespaces.hpp.

2014-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25863/#review54306
---

Ship it!



3rdparty/libprocess/3rdparty/stout/tests/os/namespaces_tests.cpp


Add a new line above?


- Jie Yu


On Sept. 22, 2014, 6:45 p.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25863/
> ---
> 
> (Updated Sept. 22, 2014, 6:45 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Also, added os::getns(pid, ns) to get the namespace inode for comparisons 
> between pids' namespaces.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/3rdparty/Makefile.am 
> db9766d70adb9076946cd2b467c55636fe5f7235 
>   3rdparty/libprocess/3rdparty/stout/Makefile.am 
> b6464de53c3873ecd0b62a08ca9aac530043ffb9 
>   3rdparty/libprocess/3rdparty/stout/include/Makefile.am 
> 6fa5b741bdd7f089ba93bf6fea43b9f39f8f0edb 
>   3rdparty/libprocess/3rdparty/stout/include/stout/os/setns.hpp 
> 5278996f201a4a3d69282c1bd7b0d230d0f6cd39 
>   3rdparty/libprocess/3rdparty/stout/tests/os/namespaces_tests.cpp 
> PRE-CREATION 
>   3rdparty/libprocess/3rdparty/stout/tests/os/setns_tests.cpp 
> ad8e37aa2f5a29f8b421dde6b7cd5dfe241eabb5 
>   src/slave/containerizer/isolators/network/port_mapping.cpp 
> 9248460caa558e43a2a7d237f6a37d43f0eeacb5 
> 
> Diff: https://reviews.apache.org/r/25863/diff/
> 
> 
> Testing
> ---
> 
> Added test to check a clone(NEW_PIDNS) results in a new pid namespace.
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Re: Mesos Modules Design

2014-09-23 Thread Vinod Kone
Ok. I finally had a chance to read the design doc, go through the comments
on this thread and glance at the review. Here are my comments.

I like the concept of dynamic loading of libraries when it comes to making
the lives of end users/operators easy, esp when they want to mix and match
the modules.

I agree with Dominic though that we should make sure to minimize the
burden/overhead of core developers when it comes to dealing with modules.
FWIW, the implementation that I've seen seems easy enough to deal with and
doesn't look like it adds much of a performance overhead.

That said, the versioning part seems a bit complicated, as BenH alluded to.
Can we come up with something that's simple as the first cut but still
extensible? Also, it would be great if interface breakage could be
automatically deduced at load time. But I guess we still need a module
version to deal with protocol changes without interface changes. To ensure
the devs update the versions in module manager, I suggest adding a comment
on top of the current components (Allocator, Isolator etc) mentioning that
there is a corresponding module that needs to be updated. Better yet, maybe
we could use a naming convention (e.g., s/Allocator/AllocatorModule/) for
easy visual cue.

More importantly, I would like to understand how we are going to ensure the
quality of modules being written. In other words how do we, as a community,
ensure that a badly written module doesn't leave a bad taste in users
adopting Mesos. Should we come up with a test suite (functional and
performance) that runs on CI that module writers are required to test
against? What is the contract for support when something in modules break?
Should the users/operators ask on #mesos and dev lists or module writer's
lists? Would the users even know where the problem lies? Would Mesos devs
always know? I think this is where most of the burden/overhead lies. Of
course, this is the same problem with frameworks, but I think this is much
more relevant for modules because they (could) fundamentally change the
behavior of Mesos.

Even more fundamentally, how are we going to ensure that Mesos core gets
better features vs having the cool features developed as (possibly paid)
modules? For example, should Kerberos auth and SSL transport be modules or
should they be integrated into the core? As an open source project, how do
we ensure that the community resources are properly utilized in a fair and
neutral manner to help Mesos core grow? Are we going to have
guidelines/opinions on what should/could be modularized or is everything
fair game? It would be great to understand how other successful open source
projects toe this line. Has anyone done any research regarding this?


On Tue, Sep 23, 2014 at 9:27 AM, George Sudarkoff 
wrote:

> Hello everybody!
>
> I'm new here. But why not jump in in the middle of the conversation and
> voice an opinion anyway, right? :)
>
> On 23 Sep 2014, at 09:17, Dominic Hamon  wrote:
>
> > On Tue, Sep 23, 2014 at 5:57 AM, Tim St Clair 
> wrote:
> >
> >>
> >>
> >> - Original Message -
> >>> From: "Benjamin Hindman" 
> >>> To: "dev" 
> >>> Sent: Tuesday, September 23, 2014 3:14:31 AM
> >>> Subject: Re: Mesos Modules Design
> >>>
> 
>  - create abstract classes to define interfaces to objects that should
> >> be
>  modular
> 
> >>>
> >>> We're all in agreement here!
> >>>
> >>> - build modules as static libraries that can be assembled at link time
> to
>  create custom Mesos builds
> 
> >>>
> >>> Okay, but unless I'm missing something here we'll still need a level of
> >>> indirection to wire everything together. What would that look like?
> >>>
> >>> Also, why ask an operator to go through the extra step of relinking
> >> Mesos?
> >>> Asking the operator to relink means they'll need a Mesos build
> >> environment,
> >>> while most folks will likely just have Mesos installed via an RPM (or
> >>> similar). I'm not convinced that getting a link error will be a better
> >> user
> >>> experience then getting a runtime error that cleanly prints out
> something
> >>> along the lines of "Version mismatch: the XXYYZZ module is not
> compatible
> >>> with this version of Mesos".
> >>
> >> To ask service operators to re-link and possibly re-deploy mesos is a
> >> non-starter imho.  One of the goals of enabling "plugins" around key
> >> interfaces is to avoid this type of operation.
> >>
> >
> > ​What, concretely, does a service operator do if they have a bunch of
> > modules that give runtime version errors? What are there options to get a
> > running version?
>
> "Runtime" doesn't necessarily mean "four days after you start it". What
> I'd expect from a piece of software with plugins is to load the plugins and
> verify the versions/compatibility at launch time. If I drop a new plugin in
> and restart a client, I know what to do when the client complains about the
> incompatible versions.
>
>
> -- George
>
>
> >
> >
> >
> >>
> >> --
> >> Cheers,
> >>

Re: Review Request 25948: Routing: added support to obtain basic socket diagnosis information, using the inet-diag module from libnl3.

2014-09-23 Thread Chi Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25948/
---

(Updated Sept. 23, 2014, 6:57 p.m.)


Review request for mesos, Jie Yu and Cong Wang.


Changes
---

+ mesos group


Bugs: Mesos-1808
https://issues.apache.org/jira/browse/Mesos-1808


Repository: mesos-git


Description
---

see Summary.


Diffs
-

  configure.ac da61c29 
  src/Makefile.am 9b973e5 
  src/linux/routing/diagnosis/diagnosis.hpp PRE-CREATION 
  src/linux/routing/diagnosis/diagnosis.cpp PRE-CREATION 
  src/linux/routing/diagnosis/internal.hpp PRE-CREATION 
  src/linux/routing/internal.hpp dca0dc5 
  src/tests/routing_tests.cpp 10730ad 

Diff: https://reviews.apache.org/r/25948/diff/


Testing
---

added a test to use the code path. 


Thanks,

Chi Zhang



Re: Review Request 25947: Dynamically change reap poll interval.

2014-09-23 Thread Dominic Hamon

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25947/#review54293
---



3rdparty/libprocess/src/reap.cpp


nit: it's easy for someone to change the constants in the rest of this 
calculation but not reconfigure this linear relationship. It would be better to 
pull out 50, Milliseconds(100), 500, and Seconds(1) as constants then calculate 
this linear relationship using those.


- Dominic Hamon


On Sept. 23, 2014, 11:05 a.m., Ian Downes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25947/
> ---
> 
> (Updated Sept. 23, 2014, 11:05 a.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Ben Mahler, Craig Hansen-Sturm, and 
> Jie Yu.
> 
> 
> Bugs: MESOS-1199
> https://issues.apache.org/jira/browse/MESOS-1199
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> If pid count <= 50 then use 100 ms (<= 0.5% core usage), if count >= 500 use 
> 1000 ms (<= 1% core usage at 500 pids), else interpolate.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/reap.cpp b350ee15bf232493be01506037524b7590c0d2f5 
> 
> Diff: https://reviews.apache.org/r/25947/diff/
> 
> 
> Testing
> ---
> 
> Wrote test program which looped, calling kill(0, pid) and waitpid(pid, 
> &status, WNOHANG) on N children (still running) each loop. These are the 
> system calls used by the reaper. Computed load as wall time / (user + system 
> time). Tested on Linux and OSX.
> 
> `sudo ./bin/mesos-tests.sh` time reduced 20% from 6:40 to 5:20 (not running 
> Docker tests).
> 
> 
> Thanks,
> 
> Ian Downes
> 
>



Review Request 25947: Dynamically change reap poll interval.

2014-09-23 Thread Ian Downes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25947/
---

Review request for mesos, Bernd Mathiske, Ben Mahler, Craig Hansen-Sturm, and 
Jie Yu.


Bugs: MESOS-1199
https://issues.apache.org/jira/browse/MESOS-1199


Repository: mesos-git


Description
---

If pid count <= 50 then use 100 ms (<= 0.5% core usage), if count >= 500 use 
1000 ms (<= 1% core usage at 500 pids), else interpolate.


Diffs
-

  3rdparty/libprocess/src/reap.cpp b350ee15bf232493be01506037524b7590c0d2f5 

Diff: https://reviews.apache.org/r/25947/diff/


Testing
---

Wrote test program which looped, calling kill(0, pid) and waitpid(pid, &status, 
WNOHANG) on N children (still running) each loop. These are the system calls 
used by the reaper. Computed load as wall time / (user + system time). Tested 
on Linux and OSX.

`sudo ./bin/mesos-tests.sh` time reduced 20% from 6:40 to 5:20 (not running 
Docker tests).


Thanks,

Ian Downes



Re: Review Request 25789: Variadic strings join

2014-09-23 Thread Joris Van Remoortere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25789/
---

(Updated Sept. 23, 2014, 5:30 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
---

fix style issues.


Repository: mesos-git


Description
---

Add Variadic strings join.
There is a second version of the variadic join which takes a reference to a 
stringstream as a parameter. This is handy when strings::join is just a part of 
a larger string manipulation.


Diffs (updated)
-

  3rdparty/libprocess/3rdparty/stout/include/stout/strings.hpp a1702cd 
  3rdparty/libprocess/3rdparty/stout/tests/strings_tests.cpp 51008e5 

Diff: https://reviews.apache.org/r/25789/diff/


Testing
---

Ran make check for stout. Added test cases for join as these were missing.


Thanks,

Joris Van Remoortere



Re: Review Request 25945: Pass Promise to DispatchEvent to correctly discard on rejection.

2014-09-23 Thread Dominic Hamon

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25945/
---

(Updated Sept. 23, 2014, 10:24 a.m.)


Review request for mesos and Ben Mahler.


Changes
---

added bug


Bugs: MESOS-1456 and MESOS-1751
https://issues.apache.org/jira/browse/MESOS-1456
https://issues.apache.org/jira/browse/MESOS-1751


Repository: mesos-git


Description
---

Create a PromiseDispatchEvent to allow discarding of promises if the 
DispatchEvent can't be enqueued.

Also converted some shared_ptr to Owned to correctly track ownership.


Diffs
-

  3rdparty/libprocess/include/process/c++11/dispatch.hpp 
76da2828cf36b6143d627dac66f3e0cc4416bae4 
  3rdparty/libprocess/include/process/defer.hpp 
ebe6f2db47cab2a3306288d8ebabb720e7cd8976 
  3rdparty/libprocess/include/process/delay.hpp 
487f652c9e9b7c8c3aa8b4edc9e59789cffd8da9 
  3rdparty/libprocess/include/process/dispatch.hpp 
88570f7c078faa7d79b2c187aa6a15e4e939878c 
  3rdparty/libprocess/include/process/event.hpp 
bf689d7270df2c8f1f5c9165d2bbcfdca06e15a8 
  3rdparty/libprocess/src/process.cpp 8adc8093ab08a51c437b18dc37a7e0ae3f56870b 
  src/tests/master_tests.cpp 8e4ec1d85c4530b5421387de55036f7d40ee3180 

Diff: https://reviews.apache.org/r/25945/diff/


Testing
---

Added test that fails without the patch.

Ran all tests (make check) with g++-4.4 and clang-3.5.


Thanks,

Dominic Hamon



Review Request 25945: Pass Promise to DispatchEvent to correctly discard on rejection.

2014-09-23 Thread Dominic Hamon

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25945/
---

Review request for mesos and Ben Mahler.


Bugs: MESOS-1751
https://issues.apache.org/jira/browse/MESOS-1751


Repository: mesos-git


Description
---

Create a PromiseDispatchEvent to allow discarding of promises if the 
DispatchEvent can't be enqueued.

Also converted some shared_ptr to Owned to correctly track ownership.


Diffs
-

  3rdparty/libprocess/include/process/c++11/dispatch.hpp 
76da2828cf36b6143d627dac66f3e0cc4416bae4 
  3rdparty/libprocess/include/process/defer.hpp 
ebe6f2db47cab2a3306288d8ebabb720e7cd8976 
  3rdparty/libprocess/include/process/delay.hpp 
487f652c9e9b7c8c3aa8b4edc9e59789cffd8da9 
  3rdparty/libprocess/include/process/dispatch.hpp 
88570f7c078faa7d79b2c187aa6a15e4e939878c 
  3rdparty/libprocess/include/process/event.hpp 
bf689d7270df2c8f1f5c9165d2bbcfdca06e15a8 
  3rdparty/libprocess/src/process.cpp 8adc8093ab08a51c437b18dc37a7e0ae3f56870b 
  src/tests/master_tests.cpp 8e4ec1d85c4530b5421387de55036f7d40ee3180 

Diff: https://reviews.apache.org/r/25945/diff/


Testing
---

Added test that fails without the patch.

Ran all tests (make check) with g++-4.4 and clang-3.5.


Thanks,

Dominic Hamon



Re: Mesos Modules Design

2014-09-23 Thread George Sudarkoff
Hello everybody!

I'm new here. But why not jump in in the middle of the conversation and voice 
an opinion anyway, right? :)

On 23 Sep 2014, at 09:17, Dominic Hamon  wrote:

> On Tue, Sep 23, 2014 at 5:57 AM, Tim St Clair  wrote:
> 
>> 
>> 
>> - Original Message -
>>> From: "Benjamin Hindman" 
>>> To: "dev" 
>>> Sent: Tuesday, September 23, 2014 3:14:31 AM
>>> Subject: Re: Mesos Modules Design
>>> 
 
 - create abstract classes to define interfaces to objects that should
>> be
 modular
 
>>> 
>>> We're all in agreement here!
>>> 
>>> - build modules as static libraries that can be assembled at link time to
 create custom Mesos builds
 
>>> 
>>> Okay, but unless I'm missing something here we'll still need a level of
>>> indirection to wire everything together. What would that look like?
>>> 
>>> Also, why ask an operator to go through the extra step of relinking
>> Mesos?
>>> Asking the operator to relink means they'll need a Mesos build
>> environment,
>>> while most folks will likely just have Mesos installed via an RPM (or
>>> similar). I'm not convinced that getting a link error will be a better
>> user
>>> experience then getting a runtime error that cleanly prints out something
>>> along the lines of "Version mismatch: the XXYYZZ module is not compatible
>>> with this version of Mesos".
>> 
>> To ask service operators to re-link and possibly re-deploy mesos is a
>> non-starter imho.  One of the goals of enabling "plugins" around key
>> interfaces is to avoid this type of operation.
>> 
> 
> ​What, concretely, does a service operator do if they have a bunch of
> modules that give runtime version errors? What are there options to get a
> running version?

"Runtime" doesn't necessarily mean "four days after you start it". What I'd 
expect from a piece of software with plugins is to load the plugins and verify 
the versions/compatibility at launch time. If I drop a new plugin in and 
restart a client, I know what to do when the client complains about the 
incompatible versions.


-- George


> 
> 
> 
>> 
>> --
>> Cheers,
>> Timothy St. Clair
>> Red Hat Inc.
>> 
> 
> 
> 
> -- 
> Dominic Hamon | @mrdo | Twitter
> *There are no bad ideas; only good ideas that go horribly wrong.*



Re: Review Request 25866: Updated semantics of disconnected/deactivated slaves/frameworks in master.

2014-09-23 Thread Dominic Hamon

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25866/#review54278
---



src/master/master.cpp


duplicate of 3977?


- Dominic Hamon


On Sept. 22, 2014, 4:50 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25866/
> ---
> 
> (Updated Sept. 22, 2014, 4:50 p.m.)
> 
> 
> Review request for mesos, Adam B and Ben Mahler.
> 
> 
> Bugs: MESOS-1081 and MESOS-1811
> https://issues.apache.org/jira/browse/MESOS-1081
> https://issues.apache.org/jira/browse/MESOS-1811
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Made consistent what connected and active frameworks/slaves means.
> 
> Fixed MESOS-1811 along the way.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 3f5a01dfddca9cea73563100d88e0c03f600d6b1 
>   src/master/master.hpp f5d74aef185fad861139186be1cab089f8005a94 
>   src/master/master.cpp e5d30e9c7ba1ec0cdd640c81610790f3397f3062 
>   src/tests/fault_tolerance_tests.cpp 
> 154386044d0247b39d84719d7ff14250682a0695 
>   src/tests/master_tests.cpp 8e4ec1d85c4530b5421387de55036f7d40ee3180 
> 
> Diff: https://reviews.apache.org/r/25866/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Mesos Modules Design

2014-09-23 Thread Dominic Hamon
On Tue, Sep 23, 2014 at 5:57 AM, Tim St Clair  wrote:

>
>
> - Original Message -
> > From: "Benjamin Hindman" 
> > To: "dev" 
> > Sent: Tuesday, September 23, 2014 3:14:31 AM
> > Subject: Re: Mesos Modules Design
> >
> > >
> > > - create abstract classes to define interfaces to objects that should
> be
> > > modular
> > >
> >
> > We're all in agreement here!
> >
> > - build modules as static libraries that can be assembled at link time to
> > > create custom Mesos builds
> > >
> >
> > Okay, but unless I'm missing something here we'll still need a level of
> > indirection to wire everything together. What would that look like?
> >
> > Also, why ask an operator to go through the extra step of relinking
> Mesos?
> > Asking the operator to relink means they'll need a Mesos build
> environment,
> > while most folks will likely just have Mesos installed via an RPM (or
> > similar). I'm not convinced that getting a link error will be a better
> user
> > experience then getting a runtime error that cleanly prints out something
> > along the lines of "Version mismatch: the XXYYZZ module is not compatible
> > with this version of Mesos".
>
> To ask service operators to re-link and possibly re-deploy mesos is a
> non-starter imho.  One of the goals of enabling "plugins" around key
> interfaces is to avoid this type of operation.
>

​What, concretely, does a service operator do if they have a bunch of
modules that give runtime version errors? What are there options to get a
running version?



>
> --
> Cheers,
> Timothy St. Clair
> Red Hat Inc.
>



-- 
Dominic Hamon | @mrdo | Twitter
*There are no bad ideas; only good ideas that go horribly wrong.*


Re: Mesos Modules Design

2014-09-23 Thread Dominic Hamon
On Tue, Sep 23, 2014 at 1:14 AM, Benjamin Hindman 
wrote:

> >
> > - create abstract classes to define interfaces to objects that should be
> > modular
> >
>
> We're all in agreement here!
>
> - build modules as static libraries that can be assembled at link time to
> > create custom Mesos builds
> >
>
> Okay, but unless I'm missing something here we'll still need a level of
> indirection to wire everything together. What would that look like?
>

​I think I may be missing something too, then, because I don't see why
you'd need any more indirection.

call site:

  allocator = new Allocator();

depending on the linked symbols, Allocator might be from mesos core, or
from some other replacement library.​



>
> Also, why ask an operator to go through the extra step of relinking Mesos?
> Asking the operator to relink means they'll need a Mesos build environment,
> while most folks will likely just have Mesos installed via an RPM (or
> similar). I'm not convinced that getting a link error will be a better user
> experience then getting a runtime error that cleanly prints out something
> along the lines of "Version mismatch: the XXYYZZ module is not compatible
> with this version of Mesos".
>


​This may be where my assertion is differing from expectation: I'm
expecting the use case to be packaged delivery of custom Mesos builds. Ie,
if you want Mesos with AllocatorX, there's a package that you can request
that has those pieces in there. I can see that this could get complicated
if you want modules from different vendors, and then you'd need to link.

I think, in almost all cases, getting an error at link time is going to be
better than runtime. At least at link time there's something you can fix;
at runtime it's too late. You need to go back to the vendors to figure out
what went wrong, and maybe get them to rebuild to the new API, etc. At
least at link time you are in control of which versions you're putting
together.

I concede that expecting end users to have a build environment is
difficult, but maybe there's something we can put together as a service to
mix and match modules and provide a custom package. I'm concerned about the
overhead (complexity, mostly) involved in having this done dynamically.​




-- 
Dominic Hamon | @mrdo | Twitter
*There are no bad ideas; only good ideas that go horribly wrong.*


Re: Review Request 25622: Update the Mesos Style Guide with C++11 and naming notes.

2014-09-23 Thread Alexander Rukletsov


> On Sept. 22, 2014, 7:19 p.m., Ben Mahler wrote:
> > docs/mesos-c++-style-guide.md, lines 96-99
> > 
> >
> > Why would the iterator be called `containerizer`?
> > 
> > s/containerizer/iterator/ ?
> 
> Dominic Hamon wrote:
> -1
> 
> naming a variable after a type is never a good idea. in this case, you're 
> getting a containerizer (iterator) from the container of containerizers so 
> the name 'containerizer' makes sense.
> 
> Ben Mahler wrote:
> Sounds confusing.
> 
> Ben Mahler wrote:
> If 'auto' was not used here, would we call this 'containerizer'? In a 
> loop, this would typically be called `iterator`, no?
> 
> ```
> for (auto iterator = containerizers.begin(); iterator != 
> containerizers.end(); ++iterator) {
>   Containerizer* containerizer = *iterator;
> }
> ```
> 
> Why do something differently when auto is used?
> 
> If the iterator was being "de-referenced" then `containerizer` makes 
> sense:
> 
> ```
>   Containerizer* containerizer = *(containerizers.begin());
> ```

I agree with Dominic: it's more important what is stored in the container and 
not how we access it (iterator, reference, etc.). Actually, the example is 
taken from our code base, see `src/slave/containerizer/composing.cpp:394`


- Alexander


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25622/#review54173
---


On Sept. 22, 2014, 1:10 p.m., Alexander Rukletsov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25622/
> ---
> 
> (Updated Sept. 22, 2014, 1:10 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, Dominic Hamon, and 
> Till Toenshoff.
> 
> 
> Bugs: MESOS-1793
> https://issues.apache.org/jira/browse/MESOS-1793
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> Explicitly prohibit the use of namespace aliases. The discussion about using 
> namespace aliases took place in [the other 
> review](https://reviews.apache.org/r/25434/#comment91754). The majority 
> agreed not to introduce them in code.
> 
> Give examples of preferable way of using auto.
> 
> 
> Diffs
> -
> 
>   docs/mesos-c++-style-guide.md 59a39df 
> 
> Diff: https://reviews.apache.org/r/25622/diff/
> 
> 
> Testing
> ---
> 
> Documentation change, no `make check` needed.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>



Re: Review Request 25911: Changed master to free up resources for completed tasks when framework is disconnected.

2014-09-23 Thread Niklas Nielsen


> On Sept. 22, 2014, 5:42 p.m., Timothy Chen wrote:
> > src/master/master.cpp, line 4401
> > 
> >
> > Seems like resources recovered is only used internally for the master, 
> > any reason why introducing a new protobuf field instead of just storing it 
> > locally?

Thanks for bringing this up - Ben M reached out on IRC and I am working on a 
refactor which introduce a task struct where we can hang this off.


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25911/#review54221
---


On Sept. 22, 2014, 3:30 p.m., Niklas Nielsen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25911/
> ---
> 
> (Updated Sept. 22, 2014, 3:30 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1817
> https://issues.apache.org/jira/browse/MESOS-1817
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> We have run into a problem that cause tasks which completes, when a
> framework is disconnected and has a fail-over time, to remain in a
> running state even though the tasks actually finishes. This hogs the
> cluster and gives users a inconsistent view of the cluster state.
> 
> The problem turn out to be an issue with the ack-cycle of status
> updates: If the framework disconnects (with a failover timeout set), the
> status update manage on the slaves will keep trying to send the front of
> status update stream to the master (which in turn forwards it to the
> framework). If the first status update after the disconnect is terminal,
> things work out fine; the master picks the terminal state up, removes
> the task and release the resources. If, on the other hand, one
> non-terminal status is in the stream. The master will never know that
> the task finished (or failed) before the framework reconnects.
> 
> As a first pass, this patch makes the status update manager inform the
> master if a terminal state was found in the pending stream of a task.
> If so, the master will recover the resources but will still wait the
> updates to arrive before updating the task state and statuses.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp f5d74ae 
>   src/master/master.cpp e5d30e9 
>   src/messages/messages.proto 7cb3ce6 
>   src/slave/status_update_manager.hpp 24e3882 
>   src/slave/status_update_manager.cpp 5d5cf23 
>   src/tests/fault_tolerance_tests.cpp 1543860 
> 
> Diff: https://reviews.apache.org/r/25911/diff/
> 
> 
> Testing
> ---
> 
> Added a new test: 
> FaultToleranceTest.RecoverResourcesDuringSchedulerDisconnect which exercise 
> the new code path.
> 
> make check
> 
> 
> Thanks,
> 
> Niklas Nielsen
> 
>



Fwd: Running mesos-slave in Docker container

2014-09-23 Thread Tim St Clair
Hey folks - 

I hijacked some of the previous conversation to talk with our Docker folks, 
here is the response below: 

Cheers, 
Tim 

--
 
From: "Daniel J Walsh"  
To: "Fedora Cloud SIG" , "Scott Collier" 
 
Sent: Tuesday, September 23, 2014 9:26:45 AM 
Subject: Re: Fwd: Running mesos-slave in Docker container (Atomic Discussion) 

docker run --privileged 

Turns off all of the docker security. 

Has anyone tried to run a container for something like mesos that execs docker 
commands, to maybe look like 

docker run --privileged -v /:/host -v /run:/run -ti -net=host mesos /bin/sh 

This would cause all of / to be mounted in /host and then you could execute 

/host/usr/bin/docker for example. Not sure why you would want /var/lib/docker 
mounted into the mesos container. 

- Original Message -

> From: "Scott Collier" 
> To: "Tim St Clair" 
> Cc: "Fedora Cloud SIG" 
> Sent: Tuesday, September 23, 2014 9:37:21 AM
> Subject: Re: Fwd: Running mesos-slave in Docker container

> On 09/23/2014 08:18 AM, Tim St Clair wrote:

> > Scott -
> 

> > When you mentioned running in "privileged mode" mode, what does that mean?
> > Could you provide more details.
> 

> Sure. I figured that the Mesos process / service might need to check metrics
> on the host - which a container probably wouldn't have access to. By default
> when you run a container, it drops all Linux capabilities (see "man
> capabilities" for complete list). As of Docker 1.2, they added --cap-add and
> --cap-drop ( https://blog.docker.com/2014/08/announcing-docker-1-2-0/ ) so
> you can be specific about which ones you want. When you run a container in
> --priviledged mode, you add all capabilities back. Now, instead of adding
> all capabilities back, you could pick the ones you need. for example:

> $ sudo docker run --cap-add=NET_ADMIN -dt fedora sleep 3000
> aa934b0e0c8f4a0202d2278a4884c136d7fd985f827e057a20617dcc67ff59db

> $ sudo docker inspect --format '{{ .HostConfig.CapAdd }}' aa9
> [NET_ADMIN]

> And you can see that the NET_ADMIN capability was added to the container.

> HTH.

> > Cheers,
> 
> > Tim
> 

> > - Original Message -
> 

> > > From: "Tim Chen" 
> > 
> 
> > > To: u...@mesos.apache.org , "Gabriel Monroy" 
> > 
> 
> > > Sent: Tuesday, September 23, 2014 2:41:17 AM
> > 
> 
> > > Subject: Re: Running mesos-slave in Docker container
> > 
> 

> > > Hi Grzegorz,
> > 
> 

> > > To run Mesos master|slave in a docker container is not straight forward
> > > because we utilize kernel features therefore you need to explicitly test
> > > out
> > > the features you like to use with Mesos with slave/master in Docker.
> > 
> 

> > > Gabriel during the Mesosphere hackathon has got master and slave running
> > > in
> > > docker containers, and he can probably share his Dockerfile and run
> > > command.
> > 
> 

> > > I believe one work around to get cgroups working with Docker run is to
> > > mount
> > > /sys into the container (mount -v /sys:/sys).
> > 
> 

> > > Gabriel do you still have the command you used to run slave/master with
> > > Docker?
> > 
> 

> > > Tim
> > 
> 

> > > On Tue, Sep 23, 2014 at 12:24 AM, Grzegorz Graczyk < gregor...@gmail.com
> > > >
> > > wrote:
> > 
> 

> > > > I'm trying to run mesos-slave inside Docker container, but it can't
> > > > start
> > > > due
> > > > to problem with mounting cgroups.
> > > 
> > 
> 

> > > > I'm using:
> > > 
> > 
> 
> > > > Kernel Version: 3.13.0-32-generic
> > > 
> > 
> 
> > > > Operating System: Ubuntu 14.04.1 LTS
> > > 
> > 
> 
> > > > Docker: 1.2.0(commit fa7b24f)
> > > 
> > 
> 
> > > > Mesos: 0.20.0
> > > 
> > 
> 

> > > > Following error appears:
> > > 
> > 
> 
> > > > I0923 07:11:20.921475 19 main.cpp:126] Build: 2014-08-22 05:04:26 by
> > > > root
> > > 
> > 
> 
> > > > I0923 07:11:20.921608 19 main.cpp:128] Version: 0.20.0
> > > 
> > 
> 
> > > > I0923 07:11:20.921620 19 main.cpp:131] Git tag: 0.20.0
> > > 
> > 
> 
> > > > I0923 07:11:20.921628 19 main.cpp:135] Git SHA:
> > > > f421ffdf8d32a8834b3a6ee483b5b59f65956497
> > > 
> > 
> 
> > > > Failed to create a containerizer: Could not create DockerContainerizer:
> > > > Failed to find a mounted cgroups hierarchy for the 'cpu' subsystem; you
> > > > probably need to mount cgroups manually!
> > > 
> > 
> 

> > > > I'm running docker container with command:
> > > 
> > 
> 
> > > > docker run --name mesos-slave --privileged --net=host -v
> > > > /var/run/docker.sock:/var/run/docker.sock -v
> > > > /var/lib/docker:/var/lib/docker
> > > > -v /usr/local/bin/docker:/usr/local/bin/docker gregory90/mesos-slave
> > > > --containerizers=docker,mesos --master=zk://localhost:2181/mesos
> > > > --ip=127.0.0.1
> > > 
> > 
> 

> > > > Everything is running on single machine.
> > > 
> > 
> 
> > > > Everything works as expected when mesos-slave is run outside docker
> > > > container.
> > > 
> > 
> 

> > > > I'd appreciate some help.
> > > 
> > 
> 

> > -

Re: Mesos Modules Design

2014-09-23 Thread Tim St Clair


- Original Message -
> From: "Benjamin Hindman" 
> To: "dev" 
> Sent: Tuesday, September 23, 2014 3:14:31 AM
> Subject: Re: Mesos Modules Design
> 
> >
> > - create abstract classes to define interfaces to objects that should be
> > modular
> >
> 
> We're all in agreement here!
> 
> - build modules as static libraries that can be assembled at link time to
> > create custom Mesos builds
> >
> 
> Okay, but unless I'm missing something here we'll still need a level of
> indirection to wire everything together. What would that look like?
> 
> Also, why ask an operator to go through the extra step of relinking Mesos?
> Asking the operator to relink means they'll need a Mesos build environment,
> while most folks will likely just have Mesos installed via an RPM (or
> similar). I'm not convinced that getting a link error will be a better user
> experience then getting a runtime error that cleanly prints out something
> along the lines of "Version mismatch: the XXYYZZ module is not compatible
> with this version of Mesos".

To ask service operators to re-link and possibly re-deploy mesos is a 
non-starter imho.  One of the goals of enabling "plugins" around key interfaces 
is to avoid this type of operation.  

-- 
Cheers,
Timothy St. Clair
Red Hat Inc.


Re: Mesos Modules Design

2014-09-23 Thread Benjamin Hindman
>
> - create abstract classes to define interfaces to objects that should be
> modular
>

We're all in agreement here!

- build modules as static libraries that can be assembled at link time to
> create custom Mesos builds
>

Okay, but unless I'm missing something here we'll still need a level of
indirection to wire everything together. What would that look like?

Also, why ask an operator to go through the extra step of relinking Mesos?
Asking the operator to relink means they'll need a Mesos build environment,
while most folks will likely just have Mesos installed via an RPM (or
similar). I'm not convinced that getting a link error will be a better user
experience then getting a runtime error that cleanly prints out something
along the lines of "Version mismatch: the XXYYZZ module is not compatible
with this version of Mesos".