date:20151125

[jira] [Updated] (MESOS-3795) process::io::write takes parameter as void* which could be const

2015-11-25 Thread Benjamin Bannier (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-3795:

Attachment: (was: ubuntu14_clang-3.6_FAILED.log)

> process::io::write takes parameter as void* which could be const
> 
>
> Key: MESOS-3795
> URL: https://issues.apache.org/jira/browse/MESOS-3795
> Project: Mesos
> Issue Type: Improvement
> Components: libprocess
> Reporter: Benjamin Bannier
> Labels: mesosphere, tech-debt
>
> In libprocess we have
> {code}
> Future write(int fd, void* data, size_t size);
> {code}
> which expects a non-{{const}} {{void*}} for its {{data}} parameter. Under the 
> covers {{data}} appears to be handled as {{const}} (as one would expect 
> from the signature of its inspiration, {{::write}}).
> This function is not used too often, but since it expects a non-{{const}} 
> value for {{data}}, automatic conversions to {{void*}} from pointers to 
> {{const}} data are disabled; instead callers seem to cast manually to 
> {{void*}}, often with C-style casts.
> We should sync this method's signature with that of {{::write}}.
> In addition to following the expected semantics of {{::write}}, having this 
> work without casts for any pointer value {{data}} would make it easier to 
> interface this with string literals or raw data pointers from STL containers 
> (e.g. {{Container::data}}). It would probably also indirectly eliminate the 
> temptation to use C-style casts.
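
The effect of the parameter type on callers can be illustrated with a minimal sketch. The functions `write_mutable` and `write_const` below are hypothetical stand-ins and are not the libprocess functions; they perform no I/O and only return the byte count they were given, so the focus stays on which calls compile without a cast.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the libprocess
// functions. They do no I/O and simply return the requested byte count.
namespace sketch {

// Mirrors the shape described in the issue: a non-const `void*`.
inline std::size_t write_mutable(int /*fd*/, void* data, std::size_t size)
{
  assert(data != nullptr || size == 0);
  return size;
}

// Mirrors the shape of ::write: a `const void*`.
inline std::size_t write_const(int /*fd*/, const void* data, std::size_t size)
{
  assert(data != nullptr || size == 0);
  return size;
}

// Calls both variants and returns the total byte count handed to them.
inline std::size_t demo()
{
  const std::string text = "hello";
  const std::vector<char> bytes = {'a', 'b', 'c'};

  std::size_t total = 0;

  // With `const void*`, pointers to const data convert implicitly.
  total += write_const(1, "literal", 7);
  total += write_const(1, text.data(), text.size());
  total += write_const(1, bytes.data(), bytes.size());

  // With `void*`, the same data needs a cast that discards constness.
  total += write_mutable(1, const_cast<char*>(text.data()), text.size());

  return total;
}

} // namespace sketch
```

Passing `text.data()` to `write_mutable` without the `const_cast` would be rejected by the compiler, which is the situation that pushes callers toward manual casts.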



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3793) Cannot start mesos local on a Debian GNU/Linux 8 docker machine

2015-11-25 Thread Till Toenshoff (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-3793:
--
Shepherd: Till Toenshoff

> Cannot start mesos local on a Debian GNU/Linux 8 docker machine
> ---
>
> Key: MESOS-3793
> URL: https://issues.apache.org/jira/browse/MESOS-3793
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.25.0
> Environment: Debian GNU/Linux 8 docker machine
> Reporter: Matthias Veit
> Assignee: Jojy Varghese
> Labels: mesosphere
>
> We updated the Mesos version to 0.25.0 in our Marathon Docker image, which 
> runs our integration tests.
> We use mesos local for those tests. It fails with this message:
> {noformat}
> root@a06e4b4eb776:/marathon# mesos local
> I1022 18:42:26.852485   136 leveldb.cpp:176] Opened db in 6.103258ms
> I1022 18:42:26.853302   136 leveldb.cpp:183] Compacted db in 765740ns
> I1022 18:42:26.853343   136 leveldb.cpp:198] Created db iterator in 9001ns
> I1022 18:42:26.853355   136 leveldb.cpp:204] Seeked to beginning of db in 
> 1287ns
> I1022 18:42:26.853366   136 leveldb.cpp:273] Iterated through 0 keys in the 
> db in ns
> I1022 18:42:26.853406   136 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1022 18:42:26.853775   141 recover.cpp:449] Starting replica recovery
> I1022 18:42:26.853862   141 recover.cpp:475] Replica is in EMPTY status
> I1022 18:42:26.854751   138 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I1022 18:42:26.854856   140 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1022 18:42:26.855002   140 recover.cpp:566] Updating replica status to 
> STARTING
> I1022 18:42:26.855655   138 master.cpp:376] Master 
> a3f39818-1bda-4710-b96b-2a60ed4d12b8 (a06e4b4eb776) started on 
> 172.17.0.14:5050
> I1022 18:42:26.855680   138 master.cpp:378] Flags at startup: 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_slaves="false" 
> --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_slave_ping_timeouts="5" --quiet="false" 
> --recovery_slave_removal_limit="100%" --registry="replicated_log" 
> --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" 
> --registry_strict="false" --root_submissions="true" 
> --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
> --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui" 
> --work_dir="/tmp/mesos/local/AK0XpG" --zk_session_timeout="10secs"
> I1022 18:42:26.855790   138 master.cpp:425] Master allowing unauthenticated 
> frameworks to register
> I1022 18:42:26.855803   138 master.cpp:430] Master allowing unauthenticated 
> slaves to register
> I1022 18:42:26.855815   138 master.cpp:467] Using default 'crammd5' 
> authenticator
> W1022 18:42:26.855829   138 authenticator.cpp:505] No credentials provided, 
> authentication requests will be refused
> I1022 18:42:26.855840   138 authenticator.cpp:512] Initializing server SASL
> I1022 18:42:26.856442   136 containerizer.cpp:143] Using isolation: 
> posix/cpu,posix/mem,filesystem/posix
> I1022 18:42:26.856943   140 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.888185ms
> I1022 18:42:26.856987   140 replica.cpp:323] Persisted replica status to 
> STARTING
> I1022 18:42:26.857115   140 recover.cpp:475] Replica is in STARTING status
> I1022 18:42:26.857270   140 replica.cpp:641] Replica in STARTING status 
> received a broadcasted recover request
> I1022 18:42:26.857312   140 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1022 18:42:26.857368   140 recover.cpp:566] Updating replica status to VOTING
> I1022 18:42:26.857781   140 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 371121ns
> I1022 18:42:26.857841   140 replica.cpp:323] Persisted replica status to 
> VOTING
> I1022 18:42:26.857895   140 recover.cpp:580] Successfully joined the Paxos 
> group
> I1022 18:42:26.857928   140 recover.cpp:464] Recover process terminated
> I1022 18:42:26.862455   137 master.cpp:1603] The newly elected leader is 
> master@172.17.0.14:5050 with id a3f39818-1bda-4710-b96b-2a60ed4d12b8
> I1022 18:42:26.862498   137 master.cpp:1616] Elected as the leading master!
> I1022 18:42:26.862511   137 master.cpp:1376] Recovering from registrar
> I1022 18:42:26.862560   137 registrar.cpp:309] Recovering registrar
> Failed to create a containerizer: Could not create MesosContainerizer: Failed 
> to create launcher: Failed to create Linux launcher: Failed to mount cgroups 
>

[jira] [Updated] (MESOS-4014) Introduce remove endpoint for quota

2015-11-25 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4014:
---
Summary: Introduce remove endpoint for quota  (was: Introduce delete/remove 
endpoint for quota)

> Introduce remove endpoint for quota
> ---
>
> Key: MESOS-4014
> URL: https://issues.apache.org/jira/browse/MESOS-4014
> Project: Mesos
> Issue Type: Task
> Components: master
> Reporter: Alexander Rukletsov
> Assignee: Joerg Schad
> Labels: mesosphere
>
> This endpoint is for removing quotas via the DELETE method.




[jira] [Updated] (MESOS-4013) Introduce status endpoint for quota

2015-11-25 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4013:
---
Summary: Introduce status endpoint for quota  (was: Introduce GET/status 
endpoint for quota)

> Introduce status endpoint for quota
> ---
>
> Key: MESOS-4013
> URL: https://issues.apache.org/jira/browse/MESOS-4013
> Project: Mesos
> Issue Type: Task
> Components: master
> Reporter: Alexander Rukletsov
> Assignee: Joerg Schad
> Labels: mesosphere
>
> The endpoint should provide quota status.




[jira] [Updated] (MESOS-3973) Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11.

2015-11-25 Thread Till Toenshoff (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-3973:
--
Shepherd: Till Toenshoff

> Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11.
> -
>
> Key: MESOS-3973
> URL: https://issues.apache.org/jira/browse/MESOS-3973
> Project: Mesos
> Issue Type: Bug
> Components: build
> Affects Versions: 0.26.0
> Environment: Mac OS X 10.10.5, Clang 7.0.0.
> Reporter: Bernd Mathiske
> Assignee: Gilbert Song
> Labels: build, build-failure, mesosphere
>
> Non-root 'make distcheck'.
> {noformat}
> ...
> [--] Global test environment tear-down
> [==] 826 tests from 113 test cases ran. (276624 ms total)
> [  PASSED  ] 826 tests.
>   YOU HAVE 6 DISABLED TESTS
> Making install in .
> make[3]: Nothing to be done for `install-exec-am'.
>  ../install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig'
>  /usr/bin/install -c -m 644 mesos.pc 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig'
> Making install in 3rdparty
> /Applications/Xcode.app/Contents/Developer/usr/bin/make  install-recursive
> Making install in libprocess
> Making install in 3rdparty
> /Applications/Xcode.app/Contents/Developer/usr/bin/make  install-recursive
> Making install in stout
> Making install in .
> make[9]: Nothing to be done for `install-exec-am'.
> make[9]: Nothing to be done for `install-data-am'.
> Making install in include
> make[9]: Nothing to be done for `install-exec-am'.
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include'
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include/stout'
>  /usr/bin/install -c -m 644  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/abort.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/attributes.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/base64.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bits.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bytes.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/dynamiclibrary.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/error.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/exit.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/flags.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/foreach.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/format.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/fs.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gtest.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gzip.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashmap.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashset.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/ip.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/lambda.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/linkedhashmap.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/list.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/mac.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multihashmap.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multimap.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/none.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/nothing.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/numify.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/path.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/preprocessor.hpp
>

[jira] [Comment Edited] (MESOS-3975) SSL build of mesos causes flaky testsuite.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026938#comment-15026938
 ] 

Till Toenshoff edited comment on MESOS-3975 at 11/25/15 3:38 PM:
-

I can still see tests failing using the above Vagrantfile generator on both 
VMware Fusion and VirtualBox, hosted on OS X and Linux.

I just ran the test suite again with a repeat counter enabled, and it stopped on 
the first iteration:

{noformat}
[ RUN  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
2015-11-25 
15:26:33,873:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:26:37,209:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:26:40,546:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:26:43,883:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
+ /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false 
--operation=make-rslave --path=/
+ grep -E /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_FrPTNg/.+ 
/proc/self/mountinfo
+ grep -v 722234da-f06d-4c9c-95d9-9be998e69d5c
+ cut '-d ' -f5
+ xargs --no-run-if-empty umount -l
Changing root to 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_FrPTNg/provisioner/containers/722234da-f06d-4c9c-95d9-9be998e69d5c/backends/copy/rootfses/928eb0dc-228b-4e9a-80d4-de8fb86ff6ea
2015-11-25 
15:26:47,221:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
[   OK ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem (16903 ms)
[ RUN  ] 
LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutor
2015-11-25 
15:26:50,558:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:26:53,894:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
+ /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false 
--operation=make-rslave --path=/
+ grep -E 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/.+
 /proc/self/mountinfo
+ grep -v 39ddf64a-d74e-44c9-a237-2d130c95e72d
+ cut '-d ' -f5
+ xargs --no-run-if-empty umount -l
+ mount -n --rbind 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/provisioner/containers/39ddf64a-d74e-44c9-a237-2d130c95e72d/backends/copy/rootfses/4eac79ca-c89f-4a1d-b190-9e11cb43ca15
 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/slaves/f3615745-e347-4ffe-ba44-30cb0c245d76-S0/frameworks/f3615745-e347-4ffe-ba44-30cb0c245d76-/executors/226484c0-8df5-43fd-a62f-39b3b7bc4824/runs/39ddf64a-d74e-44c9-a237-2d130c95e72d/.rootfs
Could not load cert file
2015-11-25 
15:26:57,231:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
../../src/tests/containerizer/filesystem_isolator_tests.cpp:354: Failure
Value of: statusRunning.get().state()
  Actual: TASK_FAILED
Expected: TASK_RUNNING
2015-11-25 
15:27:00,568:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:27:03,906:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:27:07,243:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:27:10,580:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:27:13,916:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
../../src/tests/containerizer/filesystem_isolator_tests.cpp:355: Failure
Failed to wait 15secs for statusFinished

[jira] [Commented] (MESOS-4014) Introduce remove endpoint for quota

2015-11-25 Thread Alexander Rukletsov (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026965#comment-15026965
 ] 

Alexander Rukletsov commented on MESOS-4014:


https://reviews.apache.org/r/40580/

> Introduce remove endpoint for quota
> ---
>
> Key: MESOS-4014
> URL: https://issues.apache.org/jira/browse/MESOS-4014
> Project: Mesos
> Issue Type: Task
> Components: master
> Reporter: Alexander Rukletsov
> Assignee: Joerg Schad
> Labels: mesosphere
>
> This endpoint is for removing quotas via the DELETE method.




[jira] [Commented] (MESOS-3916) MasterMaintenanceTest.InverseOffersFilters is flaky

2015-11-25 Thread Bernd Mathiske (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027007#comment-15027007
 ] 

Bernd Mathiske commented on MESOS-3916:
---

Thank you! Will update the target release.

> MasterMaintenanceTest.InverseOffersFilters is flaky
> ---
>
> Key: MESOS-3916
> URL: https://issues.apache.org/jira/browse/MESOS-3916
> Project: Mesos
> Issue Type: Bug
> Environment: Ubuntu Wily 64 bit
> Reporter: Neil Conway
> Assignee: Neil Conway
> Labels: flaky-test, maintenance, mesosphere
> Attachments: wily_maintenance_test_verbose.txt
>
>
> Verbose Logs:
> {code}
> [ RUN  ] MasterMaintenanceTest.InverseOffersFilters
> I1113 16:43:58.486469  8728 leveldb.cpp:176] Opened db in 2.360405ms
> I1113 16:43:58.486935  8728 leveldb.cpp:183] Compacted db in 407105ns
> I1113 16:43:58.486995  8728 leveldb.cpp:198] Created db iterator in 16221ns
> I1113 16:43:58.487030  8728 leveldb.cpp:204] Seeked to beginning of db in 
> 10935ns
> I1113 16:43:58.487046  8728 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 999ns
> I1113 16:43:58.487090  8728 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1113 16:43:58.487735  8747 recover.cpp:449] Starting replica recovery
> I1113 16:43:58.488047  8747 recover.cpp:475] Replica is in EMPTY status
> I1113 16:43:58.488977  8745 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (58)@10.0.2.15:45384
> I1113 16:43:58.489452  8746 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1113 16:43:58.489712  8747 recover.cpp:566] Updating replica status to 
> STARTING
> I1113 16:43:58.490706  8742 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 745443ns
> I1113 16:43:58.490739  8742 replica.cpp:323] Persisted replica status to 
> STARTING
> I1113 16:43:58.490859  8742 recover.cpp:475] Replica is in STARTING status
> I1113 16:43:58.491786  8747 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (59)@10.0.2.15:45384
> I1113 16:43:58.492542  8749 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1113 16:43:58.493221  8743 recover.cpp:566] Updating replica status to VOTING
> I1113 16:43:58.493710  8743 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 331874ns
> I1113 16:43:58.493767  8743 replica.cpp:323] Persisted replica status to 
> VOTING
> I1113 16:43:58.493868  8743 recover.cpp:580] Successfully joined the Paxos 
> group
> I1113 16:43:58.494119  8743 recover.cpp:464] Recover process terminated
> I1113 16:43:58.504369  8749 master.cpp:367] Master 
> d59449fc-5462-43c5-b935-e05563fdd4b6 (vagrant-ubuntu-wily-64) started on 
> 10.0.2.15:45384
> I1113 16:43:58.504438  8749 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_slaves="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/ZB7csS/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_slave_ping_timeouts="5" --quiet="false" 
> --recovery_slave_removal_limit="100%" --registry="replicated_log" 
> --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" 
> --registry_strict="true" --root_submissions="true" 
> --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
> --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/ZB7csS/master" 
> --zk_session_timeout="10secs"
> I1113 16:43:58.504717  8749 master.cpp:416] Master allowing unauthenticated 
> frameworks to register
> I1113 16:43:58.504889  8749 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1113 16:43:58.504922  8749 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/ZB7csS/credentials'
> I1113 16:43:58.505497  8749 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1113 16:43:58.505759  8749 master.cpp:495] Authorization enabled
> I1113 16:43:58.507638  8746 master.cpp:1606] The newly elected leader is 
> master@10.0.2.15:45384 with id d59449fc-5462-43c5-b935-e05563fdd4b6
> I1113 16:43:58.507693  8746 master.cpp:1619] Elected as the leading master!
> I1113 16:43:58.507720  8746 master.cpp:1379] Recovering from registrar
> I1113 16:43:58.507946  8749 registrar.cpp:309] Recovering registrar
> I1113 16:43:58.508561  8749 log.cpp:661] Attempting to start the writer
> I1113 16:43:58.510282  8747 replica.cpp:496] Replica received implicit 
> promise request from (60)@10.0.2.15:45384 with proposal 1
> I1113 16:43:58.510867  8747

[jira] [Updated] (MESOS-3975) SSL build of mesos causes flaky testsuite.

2015-11-25 Thread Joris Van Remoortere (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-3975:

Assignee: Joseph Wu  (was: Joris Van Remoortere)

> SSL build of mesos causes flaky testsuite.
> --
>
> Key: MESOS-3975
> URL: https://issues.apache.org/jira/browse/MESOS-3975
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.26.0
> Environment: CentOS 7.1, Kernel 3.10.0-229.20.1.el7.x86_64, gcc 
> 4.8.3, Docker 1.9
> Reporter: Till Toenshoff
> Assignee: Joseph Wu
> Labels: mesosphere
>
> When running the tests of an SSL build of Mesos on CentOS 7.1, I see spurious 
> test failures that are, so far, not reproducible.
> The following tests failed for me in complete runs but seemed fine when 
> run individually, in repetition.
> {noformat}
> DockerTest.ROOT_DOCKER_CheckPortResource
> {noformat}
> {noformat}
> ContainerizerTest.ROOT_CGROUPS_BalloonFramework
> {noformat}
> {noformat}
> [ RUN  ] 
> LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutor
> 2015-11-20 
> 19:08:38,826:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> + /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false 
> --operation=make-rslave --path=/
> + grep -E 
> /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/.+
>  /proc/self/mountinfo
> + grep -v 2b98025c-74f1-41d2-b35a-ce2cdfae347e
> + cut '-d ' -f5
> + xargs --no-run-if-empty umount -l
> + mount -n --rbind 
> /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/provisioner/containers/2b98025c-74f1-41d2-b35a-ce2cdfae347e/backends/copy/rootfses/bed11080-474b-4c69-8e7f-0ab85e895b0d
>  
> /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/slaves/830e842e-c36a-4e4c-bff4-5b9568d7df12-S0/frameworks/830e842e-c36a-4e4c-bff4-5b9568d7df12-/executors/c735be54-c47f-4645-bfc1-2f4647e2cddb/runs/2b98025c-74f1-41d2-b35a-ce2cdfae347e/.rootfs
> Could not load cert file
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:354: Failure
> Value of: statusRunning.get().state()
>   Actual: TASK_FAILED
> Expected: TASK_RUNNING
> 2015-11-20 
> 19:08:42,164:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> 2015-11-20 
> 19:08:45,501:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> 2015-11-20 
> 19:08:48,837:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> 2015-11-20 
> 19:08:52,174:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:355: Failure
> Failed to wait 15secs for statusFinished
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:349: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, 
> statusUpdate(, _))...
>  Expected: to be called twice
>Actual: called once - unsatisfied and active
> 2015-11-20 
> 19:08:55,511:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> *** Aborted at 1448046536 (unix time) try "date -d @1448046536" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGSEGV (@0x0) received by PID 21380 (TID 0x7fa1549e68c0) from PID 0; 
> stack trace: ***
> @ 0x7fa141796fbb (unknown)
> @ 0x7fa14179b341 (unknown)
> @ 0x7fa14f096130 (unknown)
> {noformat}
> Vagrantfile generator:
> {noformat}
> cat << EOF > Vagrantfile
> # -*- mode: ruby -*-
> # vi: set ft=ruby :
> Vagrant.configure(2) do |config|
>   # Disable shared folder to prevent certain kernel module dependencies.
>   config.vm.synced_folder ".", "/vagrant", disabled: true
>   config.vm.hostname = "centos71"
>   config.vm.box = "bento/centos-7.1"
>   config.vm.provider "virtualbox" do |vb|
>     vb.memory = 16384
>     vb.cpus = 8
>   end
>   config.vm.provider "vmware_fusion" do |vb|
>     vb.memory = 9216
>     vb.cpus = 4
>   end
>   config.vm.provision "shell", inline: <<-SHELL
>  sudo yum -y update systemd
>  sudo yum install -y tar wget
>  sudo wget 
>

[jira] [Assigned] (MESOS-2948) Generalize authorizer interface in order to allow for arbitrary Subjects, Actions and Objects

2015-11-25 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-2948:
--

Assignee: Marco Massenzio

> Generalize authorizer interface in order to allow for arbitrary Subjects, 
> Actions and Objects
> -
>
> Key: MESOS-2948
> URL: https://issues.apache.org/jira/browse/MESOS-2948
> Project: Mesos
> Issue Type: Epic
> Components: master, security
> Reporter: Alexander Rojas
> Assignee: Marco Massenzio
> Labels: acl, mesosphere, security
>
> The current 
> [{{mesos::Authorizer}}|https://github.com/apache/mesos/blob/40b596402521be25b93b9ef4edd8f5c727c9d20e/src/authorizer/authorizer.hpp]
>  API has one method for each of the supported _actions_ (Register Framework, 
> Launch Task and Shutdown Framework), and each of these _actions_ itself 
> defines the _objects_ on which it operates.
> Currently, when a new action needs to be authorized, it is necessary to 
> modify the {{mesos::Authorizer}} interface and all its implementations 
> (currently only {{mesos::LocalAuthorizer}}), and to add a new nested message to 
> the {{ACL}} message in {{mesos.proto}}.
> An update to the API should allow new _actions_ and _objects_ to be added 
> without changing the {{mesos::Authorizer}} interface, while 
> encapsulating the implementation details of how authorization is performed.
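
One possible shape for such a generalized interface is sketched below. This is an illustration under stated assumptions, not the design adopted for this epic: the `Request` struct, the single `authorized` method, and the allow-list implementation are all hypothetical, and the call is synchronous purely to keep the example short.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical sketch; none of these names come from the Mesos code base.
namespace sketch {

// Subject, action and object are plain strings, so adding a new action or
// object requires no change to the interface below.
struct Request
{
  std::string subject;
  std::string action;
  std::string object;
};

// A single entry point replaces one method per action.
class Authorizer
{
public:
  virtual ~Authorizer() = default;
  virtual bool authorized(const Request& request) const = 0;
};

// Allow-list implementation: a request is authorized only if the exact
// (subject, action, object) triple was registered beforehand.
class LocalAuthorizer : public Authorizer
{
public:
  void allow(std::string subject, std::string action, std::string object)
  {
    assert(!action.empty());
    rules.emplace(std::move(subject), std::move(action), std::move(object));
  }

  bool authorized(const Request& request) const override
  {
    return rules.count(std::make_tuple(
        request.subject, request.action, request.object)) > 0;
  }

private:
  std::set<std::tuple<std::string, std::string, std::string>> rules;
};

} // namespace sketch
```

With this shape, supporting a new action amounts to registering new rules; how rules are stored and matched stays hidden behind `authorized`.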




[jira] [Assigned] (MESOS-2297) Add authentication support for HTTP API

2015-11-25 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-2297:
--

Assignee: Marco Massenzio  (was: Alexander Rojas)

> Add authentication support for HTTP API
> ---
>
> Key: MESOS-2297
> URL: https://issues.apache.org/jira/browse/MESOS-2297
> Project: Mesos
> Issue Type: Epic
> Reporter: Vinod Kone
> Assignee: Marco Massenzio
> Labels: mesosphere, security
>
> Since most of the communication between Mesos components will happen through 
> HTTP with the arrival of the [HTTP 
> API|https://issues.apache.org/jira/browse/MESOS-2288], it makes sense to use 
> standard HTTP mechanisms to authenticate this communication.




[jira] [Commented] (MESOS-3975) SSL build of mesos causes flaky testsuite.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026938#comment-15026938
 ] 

Till Toenshoff commented on MESOS-3975:
---

I can still see tests failing using the above Vagrantfile generator on both 
VMware Fusion and VirtualBox, hosted on OS X and Linux.

I just ran the test suite again with a repeat counter enabled, and it stopped on 
the first iteration:
```
[ RUN  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
2015-11-25 
15:26:33,873:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:26:37,209:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:26:40,546:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:26:43,883:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
+ /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false 
--operation=make-rslave --path=/
+ grep -E /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_FrPTNg/.+ 
/proc/self/mountinfo
+ grep -v 722234da-f06d-4c9c-95d9-9be998e69d5c
+ cut '-d ' -f5
+ xargs --no-run-if-empty umount -l
Changing root to 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_FrPTNg/provisioner/containers/722234da-f06d-4c9c-95d9-9be998e69d5c/backends/copy/rootfses/928eb0dc-228b-4e9a-80d4-de8fb86ff6ea
2015-11-25 
15:26:47,221:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
[   OK ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem (16903 ms)
[ RUN  ] 
LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutor
2015-11-25 
15:26:50,558:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:26:53,894:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
+ /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false 
--operation=make-rslave --path=/
+ grep -E 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/.+
 /proc/self/mountinfo
+ grep -v 39ddf64a-d74e-44c9-a237-2d130c95e72d
+ cut '-d ' -f5
+ xargs --no-run-if-empty umount -l
+ mount -n --rbind 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/provisioner/containers/39ddf64a-d74e-44c9-a237-2d130c95e72d/backends/copy/rootfses/4eac79ca-c89f-4a1d-b190-9e11cb43ca15
 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/slaves/f3615745-e347-4ffe-ba44-30cb0c245d76-S0/frameworks/f3615745-e347-4ffe-ba44-30cb0c245d76-/executors/226484c0-8df5-43fd-a62f-39b3b7bc4824/runs/39ddf64a-d74e-44c9-a237-2d130c95e72d/.rootfs
Could not load cert file
2015-11-25 
15:26:57,231:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
../../src/tests/containerizer/filesystem_isolator_tests.cpp:354: Failure
Value of: statusRunning.get().state()
  Actual: TASK_FAILED
Expected: TASK_RUNNING
2015-11-25 
15:27:00,568:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:27:03,906:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:27:07,243:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:27:10,580:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2015-11-25 
15:27:13,916:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
../../src/tests/containerizer/filesystem_isolator_tests.cpp:355: Failure
Failed to wait 15secs for statusFinished
../../src/tests/containerizer/filesystem_isolator_tests.cpp:349: Failure
Actual function call count doesn't

[jira] [Updated] (MESOS-4013) Introduce status endpoint for quota

2015-11-25 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4013:
---
Description: This endpoint is for querying quota status via the GET method. 
 (was: The endpoint should provide quota status.)

> Introduce status endpoint for quota
> ---
>
> Key: MESOS-4013
> URL: https://issues.apache.org/jira/browse/MESOS-4013
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> This endpoint is for querying quota status via the GET method.




[jira] [Comment Edited] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027745#comment-15027745
 ] 

Till Toenshoff edited comment on MESOS-3937 at 11/25/15 10:54 PM:
--

I also tried a different image first -- most other images do not seem to have 
this issue, as they use a different approach for binding the hostname to an IP.

{noformat}
$ hostname -f
vagrant.vm
{noformat}

{noformat}
$ cat /etc/hosts
127.0.0.1   localhost
127.0.1.1   vagrant.vm  vagrant

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
{noformat}

The above is produced by a {{bento/ubuntu-14.04}} base image.
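
To make the binding above explicit, here is a minimal sketch that parses the quoted {{/etc/hosts}} content and reports which address the fully qualified hostname is bound to. The helper name `parse_hosts` is made up for this illustration; whether the loopback binding is what makes the test behave differently on this image is still only a hypothesis.

```python
import ipaddress

# Content of /etc/hosts as quoted above.
HOSTS = """\
127.0.0.1   localhost
127.0.1.1   vagrant.vm  vagrant

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
"""


def parse_hosts(text: str) -> dict:
    """Map each name in hosts-file text to the first address listed for it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)
    return table


table = parse_hosts(HOSTS)
address = ipaddress.ip_address(table["vagrant.vm"])
print(address, address.is_loopback)  # prints "127.0.1.1 True"
```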


was (Author: tillt):
I also tried a different image first -- most other images do not seem to have 
this issue, as they use a different approach for binding the hostname to an IP.

{noformat}
$ hostname -f
vagrant.vm
{noformat}

{noformat}
$ cat /etc/hosts
127.0.0.1   localhost
127.0.1.1   vagrant.vm  vagrant

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
{noformat}


> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Till Toenshoff
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials

[jira] [Assigned] (MESOS-4002) ReservationEndpointsTest.UnreserveAvailableAndOfferedResources is flaky

2015-11-25 Thread Michael Park (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park reassigned MESOS-4002:
---

Assignee: Michael Park  (was: Anand Mazumdar)

> ReservationEndpointsTest.UnreserveAvailableAndOfferedResources is flaky
> ---
>
> Key: MESOS-4002
> URL: https://issues.apache.org/jira/browse/MESOS-4002
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Michael Park
>  Labels: flaky-test, mesosphere, reservations
>
> Showed up on ASF CI (the test kept looping and ultimately failed the 
> build after 300 minutes):
> https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose,OS=ubuntu%3A14.04,label_exp=docker%7C%7CHadoop/1269/changes
> {code}
> [ RUN  ] ReservationEndpointsTest.UnreserveAvailableAndOfferedResources
> I1124 01:07:20.050729 30260 leveldb.cpp:174] Opened db in 107.434842ms
> I1124 01:07:20.099630 30260 leveldb.cpp:181] Compacted db in 48.82312ms
> I1124 01:07:20.099722 30260 leveldb.cpp:196] Created db iterator in 29905ns
> I1124 01:07:20.099738 30260 leveldb.cpp:202] Seeked to beginning of db in 
> 3145ns
> I1124 01:07:20.099750 30260 leveldb.cpp:271] Iterated through 0 keys in the 
> db in 279ns
> I1124 01:07:20.099804 30260 replica.cpp:778] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1124 01:07:20.100637 30292 recover.cpp:447] Starting replica recovery
> I1124 01:07:20.100934 30292 recover.cpp:473] Replica is in EMPTY status
> I1124 01:07:20.103240 30288 replica.cpp:674] Replica in EMPTY status received 
> a broadcasted recover request from (6305)@172.17.18.107:37993
> I1124 01:07:20.103672 30292 recover.cpp:193] Received a recover response from 
> a replica in EMPTY status
> I1124 01:07:20.104142 30292 recover.cpp:564] Updating replica status to 
> STARTING
> I1124 01:07:20.114534 30284 master.cpp:365] Master 
> ad27bc60-16d1-4239-9a65-235a991f9600 (9f2f81738d5e) started on 
> 172.17.18.107:37993
> I1124 01:07:20.114558 30284 master.cpp:367] Flags at startup: --acls="" 
> --allocation_interval="1000secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/I60I5f/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" --roles="role" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/mesos/mesos-0.26.0/_inst/share/mesos/webui" 
> --work_dir="/tmp/I60I5f/master" --zk_session_timeout="10secs"
> I1124 01:07:20.114809 30284 master.cpp:412] Master only allowing 
> authenticated frameworks to register
> I1124 01:07:20.114820 30284 master.cpp:417] Master only allowing 
> authenticated slaves to register
> I1124 01:07:20.114825 30284 credentials.hpp:35] Loading credentials for 
> authentication from '/tmp/I60I5f/credentials'
> I1124 01:07:20.115067 30284 master.cpp:456] Using default 'crammd5' 
> authenticator
> I1124 01:07:20.115320 30284 master.cpp:493] Authorization enabled
> I1124 01:07:20.115792 30285 hierarchical.cpp:162] Initialized hierarchical 
> allocator process
> I1124 01:07:20.115855 30285 whitelist_watcher.cpp:77] No whitelist given
> I1124 01:07:20.118755 30285 master.cpp:1625] The newly elected leader is 
> master@172.17.18.107:37993 with id ad27bc60-16d1-4239-9a65-235a991f9600
> I1124 01:07:20.118788 30285 master.cpp:1638] Elected as the leading master!
> I1124 01:07:20.118809 30285 master.cpp:1383] Recovering from registrar
> I1124 01:07:20.119078 30285 registrar.cpp:307] Recovering registrar
> I1124 01:07:20.143256 30292 leveldb.cpp:304] Persisting metadata (8 bytes) to 
> leveldb took 38.787419ms
> I1124 01:07:20.143347 30292 replica.cpp:321] Persisted replica status to 
> STARTING
> I1124 01:07:20.143717 30292 recover.cpp:473] Replica is in STARTING status
> I1124 01:07:20.145454 30286 replica.cpp:674] Replica in STARTING status 
> received a broadcasted recover request from (6307)@172.17.18.107:37993
> I1124 01:07:20.145979 30292 recover.cpp:193] Received a recover response from 
> a replica in STARTING status
> I1124 01:07:20.146654 30292 recover.cpp:564] Updating replica status to VOTING
> I1124 01:07:20.182672 30286 leveldb.cpp:304] Persisting metadata (8 bytes) to 
> leveldb took 35.422256ms
> I1124 01:07:20.182747 30286 replica.cpp:321] Persisted replica status to 
> VOTING
> I1124

[jira] [Comment Edited] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027745#comment-15027745
 ] 

Till Toenshoff edited comment on MESOS-3937 at 11/25/15 10:53 PM:
--

I also tried a different image first -- most other images do not seem to have 
this issue, as they use a different approach for binding the hostname to an IP.

{noformat}
$ hostname -f
vagrant.vm
{noformat}

{noformat}
$ cat /etc/hosts
127.0.0.1   localhost
127.0.1.1   vagrant.vm  vagrant

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
{noformat}



was (Author: tillt):
I also tried a different image first -- most other images do not seem to have 
this issue, as they use a different approach for binding the hostname to an IP.

{{/etc/hosts}} from {{bento/ubuntu-14.04}}

{noformat}
$ hostname -f
vagrant.vm
{noformat}

{noformat}
$ cat /etc/hosts
127.0.0.1   localhost
127.0.1.1   vagrant.vm  vagrant

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
{noformat}


> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Till Toenshoff
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for 
>

[jira] [Commented] (MESOS-4000) Implicit roles: Design Doc

2015-11-25 Thread Neil Conway (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027784#comment-15027784
 ] 

Neil Conway commented on MESOS-4000:


The design doc for implicit roles can be found here: 
https://docs.google.com/document/d/1SCFfrBd4edSY3bVCMrNJYMxIVllD0bHJuGmgG-4vCXA/edit?usp=sharing

> Implicit roles: Design Doc
> --
>
> Key: MESOS-4000
> URL: https://issues.apache.org/jira/browse/MESOS-4000
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Neil Conway
>





[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027922#comment-15027922
 ] 

Till Toenshoff commented on MESOS-3937:
---

This boils down to the following line being triggered within the Docker image of this test:
https://github.com/mesos/mesos-go/blob/068d5470506e3780189fe607af40892814197c5e/mesosutil/node.go#L18

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Till Toenshoff
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/40AlT8/credentials'
> I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING
> I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.075466ms
> I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to 
> VOTING
> I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos 
> group
> I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated
> I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL
> I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled
> I1117 15:08:09.296018 26399 master.cpp:1606] The

[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Marco Massenzio (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027271#comment-15027271
 ] 

Marco Massenzio commented on MESOS-3937:


So I read the thread and, honestly, it looks like we're making all this song 
and dance just to make a test pass? Who cares?
The question, with a failing test, is always the same:
{quote}
Is the test buggy, or are we uncovering a genuine issue in the code?
{quote}

It seems to me that this test does not identify an issue in the code; at best, 
it has highlighted a combination of Ubuntu / kernel / Docker 
versions/configurations that *may* cause an Executor launched inside a Docker 
container to fail (and, even there, I'm not so sure).

Also, let's remind ourselves that tests are useful so that, when introducing 
code changes, refactorings, or new features, we can be assured that we haven't 
broken something that was working before. I'm not even sure this test achieves 
that.
(This may be a harsh statement born out of my ignorance; please correct me 
if I'm wrong on this one.)

Here is my suggestion as to how to solve this issue:

- short-term: we disable this test and remove it as a {{0.26}} blocker (it 
doesn't seem to me that the failure highlights a regression in the code; 
again, correct me if I'm wrong);
- short-term: document the issue and possible workarounds for folks who may 
need to run Docker executors on Ubuntu;
- medium-term: if at all possible, find a way for the test to detect the 
conditions under which it is supposed to pass; if they are met on the given 
platform, the test runs, and if not, a warning is emitted but no failure (or 
something similar);
- long-term: decide whether to keep the test (possibly modified) or discard it.

What do people think?
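
To illustrate the medium-term idea, here is a minimal sketch of a test that checks a precondition and is skipped with a message, rather than failing, when the platform does not meet it. It uses Python's unittest instead of the gtest framework the Mesos tests are written in, and the precondition `hostname_resolves` is only a stand-in; the actual conditions under which this test should pass have not been established in this thread.

```python
import socket
import unittest


def hostname_resolves() -> bool:
    """Stand-in precondition (an assumption): the local hostname can be resolved."""
    try:
        socket.gethostbyname(socket.gethostname())
    except OSError:
        return False
    return True


class LaunchExecutorTest(unittest.TestCase):
    # The precondition is evaluated once, when the class is defined.
    @unittest.skipUnless(hostname_resolves(),
                         "precondition not met: local hostname does not resolve")
    def test_launch_executor(self):
        # Placeholder body; the real test logic is not reproduced here.
        self.assertTrue(hostname_resolves())
```

A skipped test shows up in the report together with its reason but does not fail the run, which is roughly the "warning, but no failure" behaviour described in the list.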

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs"

[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027735#comment-15027735
 ] 

Till Toenshoff commented on MESOS-3937:
---

When using the above image, the following is true for me:

The test breaks as described by Bernd.

{noformat}
$ hostname -f
vagrant-ubuntu-trusty-64
{noformat}

{noformat}
$ ifconfig
docker0   Link encap:Ethernet  HWaddr 56:84:7a:fe:97:99
  inet addr:172.17.42.1  Bcast:0.0.0.0  Mask:255.255.0.0
  UP BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth0  Link encap:Ethernet  HWaddr 08:00:27:70:2a:9d
  inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
  inet6 addr: fe80::a00:27ff:fe70:2a9d/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:246548 errors:0 dropped:0 overruns:0 frame:0
  TX packets:65399 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:298078841 (298.0 MB)  TX bytes:6093076 (6.0 MB)

lo        Link encap:Local Loopback
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:65536  Metric:1
  RX packets:48338 errors:0 dropped:0 overruns:0 frame:0
  TX packets:48338 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:5936578 (5.9 MB)  TX bytes:5936578 (5.9 MB)
{noformat}

{noformat}
$ ping vagrant-ubuntu-trusty-64
PING vagrant-ubuntu-trusty-64 (10.0.2.15) 56(84) bytes of data.
64 bytes from vagrant-ubuntu-trusty-64 (10.0.2.15): icmp_seq=1 ttl=64 
time=0.026 ms
{noformat}

So apparently, the hostname resolves to a valid, non-loopback IP (the one 
used by eth0).

{noformat}
$ cat /etc/hosts
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
{noformat}

Adding this hostname to {{/etc/hosts}} fixes this test, but why would that be 
necessary?
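
The resolution check above can be reproduced with a few lines of Python. This is a minimal sketch, not Mesos code, and the helper names {{is_loopback}} and {{resolve_ipv4}} are made up for illustration. It resolves a hostname through the system resolver and classifies the resulting address:

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the textual IP address lies in a loopback range."""
    return ipaddress.ip_address(address).is_loopback


def resolve_ipv4(hostname: str) -> str:
    """Resolve a hostname to one IPv4 address using the system resolver."""
    return socket.gethostbyname(hostname)


if __name__ == "__main__":
    host = socket.getfqdn()
    try:
        address = resolve_ipv4(host)
        kind = "loopback" if is_loopback(address) else "non-loopback"
        print(host, address, kind)
    except socket.gaierror as error:
        # Resolution can fail when the hostname is known neither to DNS
        # nor to /etc/hosts.
        print("could not resolve", host, error)
```

With the addresses from the output above, {{10.0.2.15}} is classified as non-loopback, while anything in {{127.0.0.0/8}} counts as loopback.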

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Till Toenshoff
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs"

[jira] [Issue Comment Deleted] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Till Toenshoff (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-3937:
--
Comment: was deleted

(was: Boils down to this line being triggered within the docker image of this 
test:
https://github.com/mesos/mesos-go/blob/068d5470506e3780189fe607af40892814197c5e/mesosutil/node.go#L18)

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Till Toenshoff
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/40AlT8/credentials'
> I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING
> I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.075466ms
> I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to 
> VOTING
> I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos 
> group
> I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated
> I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL
> I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled
> I1117 15:08:09.296018 26399 master.cpp:1606] The newly

[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027927#comment-15027927
 ] 

Till Toenshoff commented on MESOS-3937:
---

Boils down to this line being triggered within the docker image of this test:
https://github.com/mesos/mesos-go/blob/068d5470506e3780189fe607af40892814197c5e/mesosutil/node.go#L18


[jira] [Updated] (MESOS-4000) Implicit roles: Design Doc

2015-11-25 Thread Neil Conway (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4000:
---
Labels: mesosphere roles  (was: )

> Implicit roles: Design Doc
> --
>
> Key: MESOS-4000
> URL: https://issues.apache.org/jira/browse/MESOS-4000
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Neil Conway
>  Labels: mesosphere, roles
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (MESOS-4015) Expose task / executor health in master & slave state.json

2015-11-25 Thread Sargun Dhillon (JIRA)

Sargun Dhillon created MESOS-4015:
-

 Summary: Expose task / executor health in master & slave state.json
 Key: MESOS-4015
 URL: https://issues.apache.org/jira/browse/MESOS-4015
 Project: Mesos
  Issue Type: Improvement
Affects Versions: 0.25.0
Reporter: Sargun Dhillon
Priority: Trivial


Right now, if I specify a health check for a task, the only way to get its 
result is via the task status updates that are sent to the framework. 
Unfortunately, this information isn't exposed in state.json on either the slave 
or the master. It'd be ideal to have that information there to enable tools 
like Mesos-DNS to be health-aware.




[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027745#comment-15027745
 ] 

Till Toenshoff commented on MESOS-3937:
---

I also tried a different image first. Most other images do not seem to have 
this issue, as they use a different approach for binding the hostname to an IP.

{{/etc/hosts}} from {{bento/ubuntu-14.04}}

{noformat}
$ hostname -f
vagrant.vm
{noformat}

{noformat}
$ cat /etc/hosts
127.0.0.1   localhost
127.0.1.1   vagrant.vm  vagrant

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
{noformat}
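
To make the difference between the two images concrete, here is a small Python sketch (illustrative only; {{parse_hosts}} is a hypothetical helper) that parses hosts-file text and looks up a name. Fed with the {{bento/ubuntu-14.04}} file above, the hostname maps to {{127.0.1.1}}. A name that has no entry in the file, such as {{vagrant-ubuntu-trusty-64}}, is simply missing from the mapping.

```python
def parse_hosts(text: str) -> dict:
    """Map every hostname and alias in hosts-file text to its first address."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first address listed for a name.
            mapping.setdefault(name, address)
    return mapping


BENTO_HOSTS = """\
127.0.0.1   localhost
127.0.1.1   vagrant.vm  vagrant

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
"""

if __name__ == "__main__":
    hosts = parse_hosts(BENTO_HOSTS)
    print(hosts.get("vagrant.vm"))
    print(hosts.get("vagrant-ubuntu-trusty-64"))
```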



[jira] [Created] (MESOS-4016) Agent allows creation of persistent volume with absolute container_path

2015-11-25 Thread Greg Mann (JIRA)

Greg Mann created MESOS-4016:


 Summary: Agent allows creation of persistent volume with absolute 
container_path
 Key: MESOS-4016
 URL: https://issues.apache.org/jira/browse/MESOS-4016
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.25.0
Reporter: Greg Mann


When creating persistent volumes, [~gabriel.hartm...@gmail.com] saw that he 
could specify an absolute {{container_path}} in the {{CREATE}} operation and 
his framework would receive a subsequent offer containing that volume, 
indicating a successful operation. However, the directory was not found on the 
agent. Such an operation should in fact fail: 
{{/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp}} enforces that 
if an absolute {{container_path}} is specified, the directory must already 
exist, and in this case it did not.

The {{CREATE}} operation should not appear to succeed if an invalid 
{{container_path}} is provided.
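
The rule described above could be sketched as follows. This is a hypothetical Python illustration of the check, not the code in {{linux.cpp}}. The function name {{validate_container_path}} is an assumption, and so is the choice to accept relative paths unchanged.

```python
import os


def validate_container_path(container_path: str) -> None:
    """Raise ValueError for an absolute container_path that does not exist.

    Hypothetical check modelled on the rule quoted in the ticket; relative
    paths are accepted unchanged.
    """
    if os.path.isabs(container_path) and not os.path.isdir(container_path):
        raise ValueError(
            "absolute container_path %r does not exist" % container_path)
```

Running such a check before the {{CREATE}} operation is acknowledged would let the operation fail visibly instead of appearing to succeed.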




[jira] [Created] (MESOS-4017) Executor killed when multiple persistent volumes specify same container_path

2015-11-25 Thread Greg Mann (JIRA)

Greg Mann created MESOS-4017:


 Summary: Executor killed when multiple persistent volumes specify 
same container_path
 Key: MESOS-4017
 URL: https://issues.apache.org/jira/browse/MESOS-4017
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.25.0
Reporter: Greg Mann


[~gabriel.hartm...@gmail.com] recently noticed that his custom executor was 
getting killed by the master when multiple tasks attempted to use persistent volumes 
with the same {{container_path}}. A {{CREATE}} operation that created two 
persistent volumes with the same {{container_path}} succeeded, and a subsequent 
offer included those persistent volumes. Then tasks were launched on a single 
executor that used these volumes, and at that point the master killed the 
executor. Better behavior might be for the first task to launch successfully, 
with the second task returning {{TASK_FAILED}} with an appropriate reason and 
message.
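
The suggested behavior can be modelled in a few lines of Python. This is a sketch under stated assumptions: tasks are reduced to hypothetical {{(task_id, container_path)}} pairs, and {{launch_tasks}} is an illustrative name, not a Mesos API.

```python
def launch_tasks(tasks):
    """Return (task_id, state) pairs for tasks launched on one executor.

    A task whose container_path is already in use is reported as
    TASK_FAILED; the executor itself keeps running.
    """
    in_use = set()
    results = []
    for task_id, container_path in tasks:
        if container_path in in_use:
            results.append((task_id, "TASK_FAILED"))
        else:
            in_use.add(container_path)
            results.append((task_id, "TASK_RUNNING"))
    return results
```

A real implementation would also attach a reason and message to the failed status, as the ticket asks.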




[jira] [Commented] (MESOS-4018) Enhance float-point operation in Mesos

2015-11-25 Thread Klaus Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028021#comment-15028021
 ] 

Klaus Ma commented on MESOS-4018:
-

MESOS-3997 will replace floating point with fixed point for resources.

> Enhance float-point operation in Mesos
> --
>
> Key: MESOS-4018
> URL: https://issues.apache.org/jira/browse/MESOS-4018
> Project: Mesos
>  Issue Type: Epic
>  Components: stout
>Reporter: Klaus Ma
>Assignee: Klaus Ma
>
> For now, there are several defects about float-point equal checking. This 
> EPIC is used to build float-point operation in {{stout}} for other 
> components. The major operation will be:
> 1. {{bool almostEqual(double left, double right)}} for Scalar {{operator==}}
> 2. {{CHECK_DOUBLE_EQ(left, right)}} for assert in components




[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-25 Thread Klaus Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028035#comment-15028035
 ] 

Klaus Ma commented on MESOS-3552:
-

[~marco-mesos], I created an epic (MESOS-4018) for issues related to 
floating-point operations. The main task of that epic is to build 
floating-point helpers in {{stout}}, e.g. {{almostEqual}} and 
{{CHECK_DOUBLE_EQ}}. MESOS-1187 will then use {{almostEqual}} for the Scalar 
check, and this ticket (MESOS-3552) will use {{CHECK_DOUBLE_EQ}}. Both tickets 
are sub-tasks of MESOS-4018.

Any more comments?

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt
>
> result.cpus() == cpus() check is failing due to ( double == double ) 
> comparison problem. 
> Root Cause : 
> Framework requested 0.1 cpu reservation for the first task. So far so good. 
> Next Reserve operation — lead to double operations resulting in following 
> double values :
>  results.cpus() : 23.9964472863211995 cpus() : 24
> And the check ( result.cpus() == cpus() ) failed. 
>  The double arithmetic operations caused results.cpus() value to be :  
> 23.9964472863211995 and hence ( 23.9964472863211995 
> == 24 ) failed.




[jira] [Updated] (MESOS-4018) Enhance float-point operation in Mesos

2015-11-25 Thread Klaus Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-4018:

Description: 
For now, there are several defects about float-point equal checking. This EPIC 
is used to build float-point operation in {{stout}} for other components. The 
major operation will be:
1. {{bool almostEqual(double left, double right)}} for Scalar {{operator==}}
2. {{CHECK_DOUBLE_EQ(left, right)}} for assert in components

  was:For now, there are several defects about float-point equal checking. This 
EPIC is used to build float-point operation in {{stout}} for other components. 
The major operation will be: 1.) {{bool almostEqual(double left, double 
right)}} for Scalar {{operator==}}, 2.) {{CHECK_DOUBLE_EQ(left, right)}} for 
assert in components






[jira] [Created] (MESOS-4018) Enhance float-point operation in Mesos

2015-11-25 Thread Klaus Ma (JIRA)

Klaus Ma created MESOS-4018:
---

 Summary: Enhance float-point operation in Mesos
 Key: MESOS-4018
 URL: https://issues.apache.org/jira/browse/MESOS-4018
 Project: Mesos
  Issue Type: Epic
  Components: stout
Reporter: Klaus Ma
Assignee: Klaus Ma


For now, there are several defects about float-point equal checking. This EPIC 
is used to build float-point operation in {{stout}} for other components. The 
major operation will be: 1.) {{bool almostEqual(double left, double right)}} 
for Scalar {{operator==}}, 2.) {{CHECK_DOUBLE_EQ(left, right)}} for assert in 
components
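
For illustration, the two proposed operations might behave like the following Python sketch. The actual helpers would be C++ in {{stout}}. The tolerance {{EPSILON}} and the Python names {{almost_equal}} and {{check_double_eq}} are assumptions, since the ticket does not specify a tolerance or an implementation.

```python
import math

EPSILON = 1e-9  # assumed tolerance; the ticket does not specify one


def almost_equal(left: float, right: float) -> bool:
    """Compare two doubles within a relative and absolute tolerance."""
    return math.isclose(left, right, rel_tol=EPSILON, abs_tol=EPSILON)


def check_double_eq(left: float, right: float) -> None:
    """Assertion-style counterpart: raise if the values differ noticeably."""
    if not almost_equal(left, right):
        raise AssertionError(
            "CHECK_DOUBLE_EQ failed: %r vs %r" % (left, right))


if __name__ == "__main__":
    total = 0.1 + 0.2
    print(total == 0.3)              # exact comparison of doubles
    print(almost_equal(total, 0.3))  # tolerant comparison
```

The exact comparison fails for values such as {{0.1 + 0.2}} versus {{0.3}}, while the tolerant comparison accepts them.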




[jira] [Updated] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-25 Thread Klaus Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-3552:

Issue Type: Task  (was: Improvement)





[jira] [Updated] (MESOS-3990) Unexpected reservation results due to floating point error

2015-11-25 Thread Neil Conway (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-3990:
---
 Labels: mesosphere reservations tech-debt  (was: )
   Priority: Major  (was: Critical)
Component/s: master
Summary: Unexpected reservation results due to floating point error  
(was: Doubles Don't Work for Resource Reservation)

On reflection, reopening because this is a distinct issue: MESOS-3552 and 
MESOS-1187 are about a crashing bug, whereas this talks about unexpected 
user-visible behavior.

In the short term, the workaround is for frameworks to compare reserved 
resources within an epsilon. The long-term fix is to switch to a fixed-point 
representation (MESOS-3997).
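
To show why a fixed-point representation avoids this class of problem, here is a minimal Python sketch. The scale of three decimal places and the names {{SCALE}}, {{to_fixed}} and {{from_fixed}} are assumptions for illustration; MESOS-3997 may choose a different precision and design.

```python
SCALE = 1000  # assumed precision: three decimal places


def to_fixed(value: float) -> int:
    """Convert a resource quantity to an integer number of 1/SCALE units."""
    return round(value * SCALE)


def from_fixed(units: int) -> float:
    """Convert integer units back to a floating-point quantity for display."""
    return units / SCALE


if __name__ == "__main__":
    total = to_fixed(24.0)
    reserved = to_fixed(0.1)
    # Integer arithmetic is exact, so reserving and unreserving round-trips.
    print(total - reserved + reserved == total)
```

Once quantities are integers, reservation arithmetic and equality checks are exact, so no epsilon is needed.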

> Unexpected reservation results due to floating point error
> --
>
> Key: MESOS-3990
> URL: https://issues.apache.org/jira/browse/MESOS-3990
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Gabriel Hartmann
>  Labels: mesosphere, reservations, tech-debt
>
> When issuing a RESERVE operation requesting the below, I received a 
> reservation with the wrong value (6566.4002):
>   resources {
> name: "mem"
> type: SCALAR
> scalar {
>   value: 6566.4001
> }
> role: "role1"
> reservation {
>   principal: "default-principal"
> }
>   }




[jira] [Commented] (MESOS-3916) MasterMaintenanceTest.InverseOffersFilters is flaky

2015-11-25 Thread Joseph Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027478#comment-15027478
 ] 

Joseph Wu commented on MESOS-3916:
--

That's a very odd failure.

The first batch of inverse offers are both received by the master:
{code:title=inverseOffer2}
I1125 10:05:53.152995 29359 master.cpp:3316] Processing DECLINE call for 
offers: [ 932f7d7b-f2d4-42c7-9391-222c19b9d35b-O3 ] for framework 
932f7d7b-f2d4-42c7-9391-222c19b9d35b- (default)
{code}

Note: This message shows up regardless, since {{Master::GetOffer}} does not 
search for inverse offers.  We might want to silence this incorrect warning.
{code:title=inverseOffer1}
W1125 10:05:53.155109 29362 master.cpp:2897] ACCEPT call used invalid offers '[ 
932f7d7b-f2d4-42c7-9391-222c19b9d35b-O2 ]': Offer 
932f7d7b-f2d4-42c7-9391-222c19b9d35b-O2 is no longer valid
{code}

Somehow, the allocation was not triggered by the subsequent clock advancement 
in the test.  I'm guessing:
# The clock was settled while the ACCEPT call was still in flight.
# The clock was then advanced before the ACCEPT call reached the master.  
[This comment seems relevant](https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/process.cpp#L2845-L2856)
# The allocation went ahead, meaning the inverse offer had not been filtered to 
0 seconds yet.
# Clock is paused, so we don't allocate -> test times out.

> MasterMaintenanceTest.InverseOffersFilters is flaky
> ---
>
> Key: MESOS-3916
> URL: https://issues.apache.org/jira/browse/MESOS-3916
> Project: Mesos
>  Issue Type: Bug
> Environment: Ubuntu Wily 64 bit
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: flaky-test, maintenance, mesosphere
> Attachments: wily_maintenance_test_verbose.txt
>
>
> Verbose Logs:
> {code}
> [ RUN  ] MasterMaintenanceTest.InverseOffersFilters
> I1113 16:43:58.486469  8728 leveldb.cpp:176] Opened db in 2.360405ms
> I1113 16:43:58.486935  8728 leveldb.cpp:183] Compacted db in 407105ns
> I1113 16:43:58.486995  8728 leveldb.cpp:198] Created db iterator in 16221ns
> I1113 16:43:58.487030  8728 leveldb.cpp:204] Seeked to beginning of db in 
> 10935ns
> I1113 16:43:58.487046  8728 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 999ns
> I1113 16:43:58.487090  8728 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1113 16:43:58.487735  8747 recover.cpp:449] Starting replica recovery
> I1113 16:43:58.488047  8747 recover.cpp:475] Replica is in EMPTY status
> I1113 16:43:58.488977  8745 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (58)@10.0.2.15:45384
> I1113 16:43:58.489452  8746 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1113 16:43:58.489712  8747 recover.cpp:566] Updating replica status to 
> STARTING
> I1113 16:43:58.490706  8742 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 745443ns
> I1113 16:43:58.490739  8742 replica.cpp:323] Persisted replica status to 
> STARTING
> I1113 16:43:58.490859  8742 recover.cpp:475] Replica is in STARTING status
> I1113 16:43:58.491786  8747 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (59)@10.0.2.15:45384
> I1113 16:43:58.492542  8749 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1113 16:43:58.493221  8743 recover.cpp:566] Updating replica status to VOTING
> I1113 16:43:58.493710  8743 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 331874ns
> I1113 16:43:58.493767  8743 replica.cpp:323] Persisted replica status to 
> VOTING
> I1113 16:43:58.493868  8743 recover.cpp:580] Successfully joined the Paxos 
> group
> I1113 16:43:58.494119  8743 recover.cpp:464] Recover process terminated
> I1113 16:43:58.504369  8749 master.cpp:367] Master 
> d59449fc-5462-43c5-b935-e05563fdd4b6 (vagrant-ubuntu-wily-64) started on 
> 10.0.2.15:45384
> I1113 16:43:58.504438  8749 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_slaves="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/ZB7csS/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_slave_ping_timeouts="5" --quiet="false" 
> --recovery_slave_removal_limit="100%" --registry="replicated_log" 
> --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" 
> --registry_strict="true" --root_submissions="true" 
> --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
> --user_sorter="drf" --version="false" 
>

[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-25 Thread Avinash Sridharan (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027366#comment-15027366
 ] 

Avinash Sridharan commented on MESOS-3552:
--

Instead of the explicit epsilon check proposed here, we should use the 
CHECK_DOUBLE_EQ macro for CPUs. CPUs appear to be the only resource stored as 
a double, so they are the only one that can run into this precision error. 
Something like this might work better:
CHECK(result.mem() == mem() &&
      result.disk() == disk() &&
      result.ports() == ports());

CHECK_DOUBLE_EQ(result.cpus().get(), cpus().get());
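
To illustrate why an exact comparison of doubles can fail after repeated arithmetic, here is a minimal, self-contained sketch. The names {{almostEqual}} and {{accumulate}} are hypothetical helpers written for this example and are not part of Mesos or glog; the tolerance of 1e-9 is likewise an assumption, not a value taken from the ticket.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Mesos API): true if `a` and `b` differ by at
// most `epsilon`. An absolute tolerance is assumed to be adequate for
// small, bounded quantities such as CPU shares.
bool almostEqual(double a, double b, double epsilon = 1e-9)
{
  return std::fabs(a - b) <= epsilon;
}

// Adds `step` to zero `count` times, mimicking a sequence of small
// reservations applied one after another.
double accumulate(double step, int count)
{
  double total = 0.0;
  for (int i = 0; i < count; ++i) {
    total += step;
  }
  return total;
}
```

With IEEE 754 doubles, adding 0.1 ten times does not yield exactly 1.0, so {{accumulate(0.1, 10) == 1.0}} is false while {{almostEqual(accumulate(0.1, 10), 1.0)}} is true. A tolerance-based check such as CHECK_DOUBLE_EQ or CHECK_NEAR follows the same idea.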



> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt
>
> The result.cpus() == cpus() check is failing due to a ( double == double ) 
> comparison problem. 
> Root cause: 
> The framework requested a 0.1 cpu reservation for the first task. So far so good. 
> The next Reserve operation led to double arithmetic producing the following 
> values:
>  result.cpus() : 23.9964472863211995 cpus() : 24
> And the check ( result.cpus() == cpus() ) failed. 
>  The double arithmetic caused the result.cpus() value to be 
> 23.9964472863211995, and hence ( 23.9964472863211995 == 24 ) failed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Till Toenshoff (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff reassigned MESOS-3937:
-

Assignee: Till Toenshoff  (was: Timothy Chen)

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Till Toenshoff
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/40AlT8/credentials'
> I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING
> I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.075466ms
> I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to 
> VOTING
> I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos 
> group
> I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated
> I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL
> I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled
> I1117 15:08:09.296018 26399 master.cpp:1606] The newly elected leader is 
> master@10.0.2.15:50088 with id 59c600f1-92ff-4926-9c84-073d9b81f68a
> I1117 15:08:09.296115 26399 master.cpp:1619] Elected as the leading

[jira] [Updated] (MESOS-4014) Introduce remove endpoint for quota

2015-11-25 Thread Joris Van Remoortere (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-4014:

Sprint: Mesosphere Sprint 23

> Introduce remove endpoint for quota
> ---
>
> Key: MESOS-4014
> URL: https://issues.apache.org/jira/browse/MESOS-4014
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Joerg Schad
>  Labels: mesosphere
> Fix For: 0.27.0
>
>
> This endpoint is for removing quotas via the DELETE method.




[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-25 Thread Klaus Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028224#comment-15028224
 ] 

Klaus Ma commented on MESOS-3552:
-

[~avin...@mesosphere.io], thanks for the reminder :); I saw this message in 
the history a few days ago. I think there's a gap in the scope of the 
floating-point work: whether {{bool almostEqual()}} should be included. I think 
we're on the same page that CHECK_NEAR/CHECK_DOUBLE_EQ should be included.

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt
>
> The result.cpus() == cpus() check is failing due to a ( double == double ) 
> comparison problem. 
> Root cause: 
> The framework requested a 0.1 cpu reservation for the first task. So far so good. 
> The next Reserve operation led to double arithmetic producing the following 
> values:
>  result.cpus() : 23.9964472863211995 cpus() : 24
> And the check ( result.cpus() == cpus() ) failed. 
>  The double arithmetic caused the result.cpus() value to be 
> 23.9964472863211995, and hence ( 23.9964472863211995 == 24 ) failed.




[jira] [Commented] (MESOS-4012) Update documentation to reflect the addition of installable tests.

2015-11-25 Thread Benjamin Bannier (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026880#comment-15026880
 ] 

Benjamin Bannier commented on MESOS-4012:
-

In addition to adding information on how a user can check the conformance of a 
machine, this would also give us the opportunity to cleanly separate what is 
_needed to build Mesos_ from what is _needed to run it_.

> Update documentation to reflect the addition of installable tests.  
> 
>
> Key: MESOS-4012
> URL: https://issues.apache.org/jira/browse/MESOS-4012
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Till Toenshoff
>
> We may want to add the needed steps for administrators to create and run the 
> test-suite on anything other than the build machine. 
> One possible location could be {{docs/getting-started.md}} for validating 
> the prerequisites as described in that document. 




[jira] [Updated] (MESOS-3974) CgroupsAnyHierarchyMemoryPressureTest tests fail on CentOS 6.7.

2015-11-25 Thread Till Toenshoff (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-3974:
--
Shepherd: Till Toenshoff

> CgroupsAnyHierarchyMemoryPressureTest tests fail on CentOS 6.7.
> ---
>
> Key: MESOS-3974
> URL: https://issues.apache.org/jira/browse/MESOS-3974
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.26.0
> Environment: CentOS 6.7, kernel 2.6.32-573.el6.x86_64, gcc 4.8.2, 
> docker 1.7.1
>Reporter: Till Toenshoff
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> {noformat}
> GLOG_v=2 sudo ./bin/mesos-tests.sh 
> --gtest_filter="CgroupsAnyHierarchyMemoryPressureTest.*" --verbose
> {noformat}
> {noformat}
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I1120 17:40:40.410383  2467 process.cpp:2426] Spawned process 
> __gc__@127.0.0.1:45300
> I1120 17:40:40.410909  2467 process.cpp:2426] Spawned process 
> help@127.0.0.1:45300
> I1120 17:40:40.410845  2483 process.cpp:2436] Resuming __gc__@127.0.0.1:45300 
> at 2015-11-20 17:40:40.410562048+00:00
> I1120 17:40:40.410970  2467 process.cpp:2426] Spawned process 
> logging@127.0.0.1:45300
> I1120 17:40:40.410995  2467 process.cpp:2426] Spawned process 
> profiler@127.0.0.1:45300
> I1120 17:40:40.411015  2482 process.cpp:2436] Resuming help@127.0.0.1:45300 
> at 2015-11-20 17:40:40.410989056+00:00
> I1120 17:40:40.411063  2467 process.cpp:2426] Spawned process 
> system@127.0.0.1:45300
> I1120 17:40:40.411160  2482 process.cpp:2436] Resuming 
> profiler@127.0.0.1:45300 at 2015-11-20 17:40:40.411155968+00:00
> I1120 17:40:40.411206  2467 process.cpp:2426] Spawned process 
> __limiter__(1)@127.0.0.1:45300
> I1120 17:40:40.411223  2467 process.cpp:2426] Spawned process 
> metrics@127.0.0.1:45300
> I1120 17:40:40.411268  2482 process.cpp:2436] Resuming system@127.0.0.1:45300 
> at 2015-11-20 17:40:40.411266048+00:00
> I1120 17:40:40.411378  2483 process.cpp:2436] Resuming 
> __limiter__(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.411374080+00:00
> I1120 17:40:40.411388  2467 process.cpp:2426] Spawned process 
> __processes__@127.0.0.1:45300
> I1120 17:40:40.411399  2483 process.cpp:2436] Resuming 
> __processes__@127.0.0.1:45300 at 2015-11-20 17:40:40.411397888+00:00
> I1120 17:40:40.411402  2467 process.cpp:965] libprocess is initialized on 
> 127.0.0.1:45300 for 8 cpus
> I1120 17:40:40.411415  2488 process.cpp:2436] Resuming help@127.0.0.1:45300 
> at 2015-11-20 17:40:40.411397888+00:00
> I1120 17:40:40.411432  2467 logging.cpp:177] Logging to STDERR
> I1120 17:40:40.411384  2482 process.cpp:2436] Resuming 
> metrics@127.0.0.1:45300 at 2015-11-20 17:40:40.411379200+00:00
> I1120 17:40:40.411717  2482 process.cpp:2436] Resuming help@127.0.0.1:45300 
> at 2015-11-20 17:40:40.411710976+00:00
> I1120 17:40:40.411813  2487 process.cpp:2436] Resuming 
> logging@127.0.0.1:45300 at 2015-11-20 17:40:40.411789056+00:00
> I1120 17:40:40.411989  2487 process.cpp:2436] Resuming help@127.0.0.1:45300 
> at 2015-11-20 17:40:40.411983872+00:00
> Source directory: /home/vagrant/mesos
> Build directory: /home/vagrant/mesos/build
> -
> We cannot run any cgroups tests that require mounting
> hierarchies because you have the following hierarchies mounted:
> /cgroup/blkio, /cgroup/cpu, /cgroup/cpuacct, /cgroup/cpuset, /cgroup/devices, 
> /cgroup/freezer, /cgroup/memory, /cgroup/net_cls
> We'll disable the CgroupsNoHierarchyTest test fixture for now.
> -
> I1120 17:40:40.414676  2467 process.cpp:2426] Spawned process 
> reaper(1)@127.0.0.1:45300
> I1120 17:40:40.414728  2482 process.cpp:2436] Resuming 
> reaper(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.414701824+00:00
> I1120 17:40:40.415870  2467 process.cpp:2426] Spawned process 
> __latch__(1)@127.0.0.1:45300
> I1120 17:40:40.415913  2483 process.cpp:2436] Resuming __gc__@127.0.0.1:45300 
> at 2015-11-20 17:40:40.415889920+00:00
> I1120 17:40:40.415966  2467 process.cpp:2426] Spawned process 
> __waiter__(1)@127.0.0.1:45300
> I1120 17:40:40.416054  2483 process.cpp:2436] Resuming 
> __latch__(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.416045056+00:00
> I1120 17:40:40.416070  2467 process.cpp:2734] Donating thread to 
> __waiter__(1)@127.0.0.1:45300 while waiting
> I1120 17:40:40.416093  2467 process.cpp:2436] Resuming 
> __waiter__(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.416083968+00:00
> I1120 17:40:40.517282  2483 process.cpp:2436] Resuming 
> reaper(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.517263872+00:00
> I1120 17:40:40.519779  2488 process.cpp:2436] Resuming 
> __latch__(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.519730176+00:00
> I1120 17:40:40.519865  2488 process.cpp:2541] Cleaning up 
>

[jira] [Commented] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1

2015-11-25 Thread Jan Schlicht (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026899#comment-15026899
 ] 

Jan Schlicht commented on MESOS-4009:
-

I'd appreciate it if {{-Wsign-compare}} were added to the clang compile flags.

> RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
> --
>
> Key: MESOS-4009
> URL: https://issues.apache.org/jira/browse/MESOS-4009
> Project: Mesos
>  Issue Type: Bug
>  Components: test
> Environment: Fedora 23, GCC 5.1.1
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>Priority: Trivial
>  Labels: easyfix
>
> GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a 
> comparison between signed and unsigned int in 
> {{provisioner_docker_tests.cpp}}.
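
For readers unfamiliar with this class of warning, the following is a minimal sketch of the pattern that {{-Wsign-compare}} flags and two common ways to avoid it. The functions {{countPositive}} and {{indexInRange}} are hypothetical illustrations; they are not the code from {{provisioner_docker_tests.cpp}}, which is not shown in this ticket.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The pattern that triggers -Wsign-compare: `i` is a signed int while
// `values.size()` is unsigned.
//
//   for (int i = 0; i < values.size(); ++i) { ... }

// Option 1: index with the container's own unsigned size type.
std::size_t countPositive(const std::vector<int>& values)
{
  std::size_t count = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i] > 0) {
      ++count;
    }
  }
  return count;
}

// Option 2: when a signed value has to be compared against a size,
// rule out negative values first and then cast explicitly.
bool indexInRange(int index, const std::vector<int>& values)
{
  return index >= 0 && static_cast<std::size_t>(index) < values.size();
}
```

Either approach keeps both operands of the comparison in the same signedness, so the build stays clean when the warning is promoted to an error.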




[jira] [Commented] (MESOS-3973) Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11.

2015-11-25 Thread Gilbert Song (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026476#comment-15026476
 ] 

Gilbert Song commented on MESOS-3973:
-

I tried updating both the distribute tarball and the pip tarball to the latest 
version (updating pip seems to have fixed the `make distcheck` failure on 
Debian 8), but neither works here. The solution may not lie in cleaning up 
those files in src/Makefile.am. Instead, we should figure out why 
mesos/mesos.cli/mesos.interface/mesos.native are shown as `not installed` in 
the debug log.

> Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11.
> -
>
> Key: MESOS-3973
> URL: https://issues.apache.org/jira/browse/MESOS-3973
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.26.0
> Environment: Mac OS X 10.10.5, Clang 7.0.0.
>Reporter: Bernd Mathiske
>Assignee: Gilbert Song
>  Labels: build, build-failure, mesosphere
>
> Non-root 'make distcheck'.
> {noformat}
> ...
> [--] Global test environment tear-down
> [==] 826 tests from 113 test cases ran. (276624 ms total)
> [  PASSED  ] 826 tests.
>   YOU HAVE 6 DISABLED TESTS
> Making install in .
> make[3]: Nothing to be done for `install-exec-am'.
>  ../install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig'
>  /usr/bin/install -c -m 644 mesos.pc 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig'
> Making install in 3rdparty
> /Applications/Xcode.app/Contents/Developer/usr/bin/make  install-recursive
> Making install in libprocess
> Making install in 3rdparty
> /Applications/Xcode.app/Contents/Developer/usr/bin/make  install-recursive
> Making install in stout
> Making install in .
> make[9]: Nothing to be done for `install-exec-am'.
> make[9]: Nothing to be done for `install-data-am'.
> Making install in include
> make[9]: Nothing to be done for `install-exec-am'.
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include'
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include/stout'
>  /usr/bin/install -c -m 644  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/abort.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/attributes.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/base64.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bits.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bytes.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/dynamiclibrary.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/error.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/exit.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/flags.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/foreach.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/format.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/fs.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gtest.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gzip.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashmap.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashset.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/ip.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/lambda.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/linkedhashmap.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/list.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/mac.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multihashmap.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multimap.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/none.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/nothing.hpp
>  
>

[jira] [Commented] (MESOS-3946) Test for role management

2015-11-25 Thread Yong Qiao Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026475#comment-15026475
 ] 

Yong Qiao Wang commented on MESOS-3946:
---

Good suggestion, we can consider this after the main tasks are done.

> Test for role management
> 
>
> Key: MESOS-3946
> URL: https://issues.apache.org/jira/browse/MESOS-3946
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Add tests for dynamic role configuration.




[jira] [Commented] (MESOS-3581) License headers show up all over doxygen documentation.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026471#comment-15026471
 ] 

Till Toenshoff commented on MESOS-3581:
---

{noformat}
commit 3539b7a0e15b594148308319bf052d28b1429b98
Author: Benjamin Bannier 
Date:   Mon Nov 23 06:53:38 2015 -0800

[libprocess]: Made license-headers doxygen-compatible.

This commit adjusts license headers of C++ source and header files.

Review: https://reviews.apache.org/r/39592
{noformat}

{noformat}
commit dc23756a5433d6f7fcd22d291babad14f6799233
Author: Benjamin Bannier 
Date:   Mon Nov 23 06:53:01 2015 -0800

[stout]: Made license-headers doxygen-compatible.

This commit adjusts license headers of C++ source and header files.

Review: https://reviews.apache.org/r/39591
{noformat}

{noformat}
commit fa36917dd142f66924c5f7ed689b87d5ceabbf79
Author: Benjamin Bannier 
Date:   Mon Nov 23 06:49:31 2015 -0800

Made license-headers doxygen-compatible.

This commit adjusts license headers of C++ source and header files,
and protobuf definitions.

Also, reflect the changed style in the C++ style guide.

Review: https://reviews.apache.org/r/39590
{noformat}

{noformat}
commit 384de473d9f388b84b77321c8a08e17efd558f10
Author: Benjamin Bannier 
Date:   Wed Nov 25 10:05:44 2015 +0100

[stout] Fixed two headers that got cut off in dc23756a.

Review: https://reviews.apache.org/r/40652/
{noformat}


> License headers show up all over doxygen documentation.
> ---
>
> Key: MESOS-3581
> URL: https://issues.apache.org/jira/browse/MESOS-3581
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 0.24.1
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Minor
>  Labels: mesosphere
>
> Currently license headers are commented in something resembling Javadoc style,
> {code}
> /**
> * Licensed ...
> {code}
> Since we use Javadoc-style comment blocks for doxygen documentation all 
> license headers appear in the generated documentation, potentially and likely 
> hiding the actual documentation.
> Using {{/*}} to start the comment blocks would be enough to hide them from 
> doxygen, but would likely also result in a largish (though mostly 
> uninteresting) patch.
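
A minimal sketch of the difference described above, assuming Doxygen's Javadoc-style comment handling. The function {{add}} is a hypothetical placeholder, and the license text is elided exactly as in the description.

```cpp
#include <cassert>

/*
 * Licensed ...
 *
 * A plain block comment: it starts with a single asterisk after the
 * slash, so Doxygen does not treat it as documentation.
 */

/**
 * Returns the sum of two integers.
 *
 * A Javadoc-style block: the extra asterisk makes Doxygen attach this
 * text to the declaration that follows.
 */
int add(int a, int b)
{
  return a + b;
}
```

If the license header above used the {{/**}} opener instead, Doxygen would pick it up as documentation text, which is the problem this ticket describes.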




[jira] [Updated] (MESOS-3947) Authenticate /roles request

2015-11-25 Thread Yong Qiao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3947:
--
Description: 
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.

  was:
/roles requests, except those using the GET method, need to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.


> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles endpoint needs to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.




[jira] [Updated] (MESOS-3947) Authenticate /roles request

2015-11-25 Thread Yong Qiao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3947:
--
Description: 
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.

In addition, query requests to the /roles endpoint will not be authenticated, 
both because they do not change the state of roles/weights in the Mesos master 
and to preserve backward compatibility.

  was:
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.

In addition, query requests to the /roles endpoint do not need to be 
authenticated, both because they do not change the state of roles/weights in 
the Mesos master and to preserve backward compatibility.


> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles endpoint needs to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.
> In addition, query requests to the /roles endpoint will not be authenticated, 
> both because they do not change the state of roles/weights in the Mesos master 
> and to preserve backward compatibility.
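
The policy described above can be sketched roughly as follows. This is only an illustration under stated assumptions: {{Request}}, {{requiresAuthentication}} and {{isAllowed}} are hypothetical names, not the libprocess or {{Master::Http}} API, and the credential check is a placeholder that only tests whether an `Authorization` value was supplied.

```cpp
#include <cassert>
#include <string>

// Hypothetical request type for illustration only.
struct Request
{
  std::string method;        // e.g. "GET", "POST", "DELETE"
  std::string authorization; // `Authorization` header value, empty if absent
};

// Queries (GET) do not modify roles or weights, so they are exempt from
// authentication; every other method requires it.
bool requiresAuthentication(const Request& request)
{
  return request.method != "GET";
}

// Placeholder decision: a real implementation would validate the supplied
// credentials; this sketch only checks that some were provided.
bool isAllowed(const Request& request)
{
  if (!requiresAuthentication(request)) {
    return true;
  }
  return !request.authorization.empty();
}
```

Under this sketch, a GET without credentials is allowed, while a POST without an `Authorization` header is rejected.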




[jira] [Commented] (MESOS-3608) optionally install test binaries

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026552#comment-15026552
 ] 

Till Toenshoff commented on MESOS-3608:
---

[~jamespeach] thanks a bunch - looking into it.

> optionally install test binaries
> 
>
> Key: MESOS-3608
> URL: https://issues.apache.org/jira/browse/MESOS-3608
> Project: Mesos
>  Issue Type: Improvement
>  Components: build, test
>Reporter: James Peach
>Assignee: James Peach
>Priority: Minor
>
> Many of the tests in Mesos could be described as integration tests, since 
> they have external dependencies on kernel features, installed tools, 
> permissions, etc. I'd like to be able to generate a {{mesos-tests}} RPM along 
> with my {{mesos}} RPM so that I can run the same tests in different 
> deployment environments.
> I propose a new configuration option named {{--enable-test-tools}} that will 
> install the tests into {{libexec/mesos/tests}}. I'll also need to make some 
> minor changes to tests so that helper tools can be found in this location as 
> well as in the build directory.




[jira] [Updated] (MESOS-3947) Authenticate /roles request

2015-11-25 Thread Yong Qiao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3947:
--
Description: 
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.

In addition, query requests to the /roles endpoint do not need to be 
authenticated, both because they do not change the state of roles/weights in 
the Mesos master and to preserve backward compatibility.

  was:
/roles endpoint needs to be authenticated.
This ticket will authenticate /roles requests using credentials provided by the 
`Authorization` field of the HTTP request. This is similar to how 
authentication is implemented in `Master::Http`.


> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles endpoint needs to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.
> In addition, query requests to the /roles endpoint do not need to be 
> authenticated, both because they do not change the state of roles/weights in 
> the Mesos master and to preserve backward compatibility.




[jira] [Commented] (MESOS-3916) MasterMaintenanceTest.InverseOffersFilters is flaky

2015-11-25 Thread Jan Schlicht (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026551#comment-15026551
 ] 

Jan Schlicht commented on MESOS-3916:
-

After applying the patch, {{./bin/mesos-tests.sh 
--gtest_filter=*.InverseOffersFilters --gtest_repeat=-1 
--gtest_break_on_failure}} fails after > 250 iterations. Before the patch, it 
failed after ~10 iterations.
Therefore, while the flakiness isn't gone for me, the situation improved 
significantly.

Here's a log of a failed test after applying the patch:
{noformat}
I1125 10:05:52.969558 29342 leveldb.cpp:174] Opened db in 384512ns
I1125 10:05:52.969916 29342 leveldb.cpp:181] Compacted db in 319730ns
I1125 10:05:52.969949 29342 leveldb.cpp:196] Created db iterator in 3457ns
I1125 10:05:52.969959 29342 leveldb.cpp:202] Seeked to beginning of db in 332ns
I1125 10:05:52.969965 29342 leveldb.cpp:271] Iterated through 0 keys in the db 
in 318ns
I1125 10:05:52.969979 29342 replica.cpp:778] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I1125 10:05:52.970711 29362 recover.cpp:447] Starting replica recovery
I1125 10:05:52.970767 29362 recover.cpp:473] Replica is in EMPTY status
I1125 10:05:52.970935 29362 replica.cpp:674] Replica in EMPTY status received a 
broadcasted recover request from (3109)@127.0.0.1:42692
I1125 10:05:52.970989 29362 recover.cpp:193] Received a recover response from a 
replica in EMPTY status
I1125 10:05:52.971045 29362 recover.cpp:564] Updating replica status to STARTING
I1125 10:05:52.971160 29362 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 22591ns
I1125 10:05:52.971174 29362 replica.cpp:321] Persisted replica status to 
STARTING
I1125 10:05:52.971195 29362 master.cpp:365] Master 
932f7d7b-f2d4-42c7-9391-222c19b9d35b (localhost) started on 127.0.0.1:42692
I1125 10:05:52.971204 29362 master.cpp:367] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="false" --authenticate_slaves="true" --authenticators="crammd5" 
--authorizers="local" --credentials="/tmp/EruGwl/credentials" 
--framework_sorter="drf" --help="false" --hostname_lookup="true" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
--quiet="false" --recovery_slave_removal_limit="100%" 
--registry="replicated_log" --registry_fetch_timeout="1mins" 
--registry_store_timeout="25secs" --registry_strict="true" 
--root_submissions="true" --slave_ping_timeout="15secs" 
--slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/EruGwl/master" 
--zk_session_timeout="10secs"
W1125 10:05:52.971287 29362 master.cpp:370]
**
Master bound to loopback interface! Cannot communicate with remote schedulers 
or slaves. You might want to set '--ip' flag to a routable IP address.
**
I1125 10:05:52.971364 29362 master.cpp:414] Master allowing unauthenticated 
frameworks to register
I1125 10:05:52.971372 29362 master.cpp:417] Master only allowing authenticated 
slaves to register
I1125 10:05:52.971298 29363 recover.cpp:473] Replica is in STARTING status
I1125 10:05:52.971379 29362 credentials.hpp:35] Loading credentials for 
authentication from '/tmp/EruGwl/credentials'
I1125 10:05:52.971544 29362 master.cpp:456] Using default 'crammd5' 
authenticator
I1125 10:05:52.971573 29363 replica.cpp:674] Replica in STARTING status 
received a broadcasted recover request from (3110)@127.0.0.1:42692
I1125 10:05:52.971587 29362 master.cpp:493] Authorization enabled
I1125 10:05:52.971709 29358 recover.cpp:193] Received a recover response from a 
replica in STARTING status
I1125 10:05:52.971807 29359 whitelist_watcher.cpp:77] No whitelist given
I1125 10:05:52.972726 29356 hierarchical.cpp:162] Initialized hierarchical 
allocator process
I1125 10:05:52.972959 29362 master.cpp:1625] The newly elected leader is 
master@127.0.0.1:42692 with id 932f7d7b-f2d4-42c7-9391-222c19b9d35b
I1125 10:05:52.972996 29362 master.cpp:1638] Elected as the leading master!
I1125 10:05:52.972998 29358 recover.cpp:564] Updating replica status to VOTING
I1125 10:05:52.973006 29362 master.cpp:1383] Recovering from registrar
I1125 10:05:52.973254 29359 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 13731ns
I1125 10:05:52.973319 29359 replica.cpp:321] Persisted replica status to VOTING
I1125 10:05:52.973356 29358 recover.cpp:578] Successfully joined the Paxos group
I1125 10:05:52.973412 29363 registrar.cpp:307] Recovering registrar
I1125 10:05:52.973435 29358 recover.cpp:462] Recover process terminated
I1125 10:05:52.973695 29358 log.cpp:659] Attempting to start the writer
I1125 10:05:52.973846 29358 replica.cpp:494] Replica received implicit promise 
request from (3111)@127.0.0.1:42692 with proposal 1

[jira] [Commented] (MESOS-3966) LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem fails on Centos 7.1

2015-11-25 Thread Jan Schlicht (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026462#comment-15026462
 ] 

Jan Schlicht commented on MESOS-3966:
-

Thanks for the patch!

> LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem fails on 
> Centos 7.1
> 
>
> Key: MESOS-3966
> URL: https://issues.apache.org/jira/browse/MESOS-3966
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.26.0
> Environment: centos 7.1, gcc 4.8.3, docker 1.8.2
>Reporter: Till Toenshoff
>Assignee: Jan Schlicht
>  Labels: mesosphere
>
> {noformat}
> [ RUN  ] LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem
> I1120 11:39:37.862926 29944 linux.cpp:82] Making 
> '/tmp/LinuxFilesystemIsolatorTest_ROOT_ImageInVolumeWithRootFilesystem_ZBw23E'
>  a shared mount
> I1120 11:39:37.876965 29944 linux_launcher.cpp:103] Using 
> /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> I1120 11:39:37.930881 29944 systemd.cpp:128] systemd version `208` detected
> W1120 11:39:37.930913 29944 systemd.cpp:136] Required functionality 
> `Delegate` was introduced in Version `218`. Your system may not function 
> properly; however since some distributions have patched systemd packages, 
> your system may still be functional. This is why we keep running. See 
> MESOS-3352 for more information
> I1120 11:39:37.938351 29944 systemd.cpp:210] Started systemd slice 
> `mesos_executors.slice`
> I1120 11:39:37.940218 29962 containerizer.cpp:618] Starting container 
> '1ea741a9-5edf-4910-ae64-f8d53f74e31e' for executor 'test_executor' of 
> framework ''
> I1120 11:39:37.943042 29959 provisioner.cpp:289] Provisioning image rootfs 
> '/tmp/LinuxFilesystemIsolatorTest_ROOT_ImageInVolumeWithRootFilesystem_ZBw23E/provisioner/containers/1ea741a9-5edf-4910-ae64-f8d53f74e31e/backends/copy/rootfses/7d97f8ac-ee57-4c83-b2d1-4332e25c89ae'
>  for container 1ea741a9-5edf-4910-ae64-f8d53f74e31e
> I1120 11:39:49.571781 29958 provisioner.cpp:289] Provisioning image rootfs 
> '/tmp/LinuxFilesystemIsolatorTest_ROOT_ImageInVolumeWithRootFilesystem_ZBw23E/provisioner/containers/1ea741a9-5edf-4910-ae64-f8d53f74e31e/backends/copy/rootfses/0256b892-e737-4d3d-89ea-74cf0e96eaf6'
>  for container 1ea741a9-5edf-4910-ae64-f8d53f74e31e
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:806: Failure
> Failed to wait 15secs for launch
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem 
> (55076 ms)
> [--] 1 test from LinuxFilesystemIsolatorTest (55076 ms total)
> {noformat}
> The following vagrant generator was used:
> {noformat}
> cat << EOF > Vagrantfile
> # -*- mode: ruby -*-
> # vi: set ft=ruby :
> Vagrant.configure(2) do |config|
>   # Disable shared folder to prevent certain kernel module dependencies.
>   config.vm.synced_folder ".", "/vagrant", disabled: true
>   config.vm.hostname = "centos71"
>   config.vm.box = "bento/centos-7.1"
>   config.vm.provider "virtualbox" do |vb|
>     vb.memory = 16384
>     vb.cpus = 8
>   end
>   config.vm.provider "vmware_fusion" do |vb|
>     vb.memory = 9216
>     vb.cpus = 4
>   end
>   config.vm.provision "shell", inline: <<-SHELL
>  sudo yum -y update systemd
>  sudo yum install -y tar wget
>  sudo wget 
> http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo 
> -O /etc/yum.repos.d/epel-apache-maven.repo
>  sudo yum groupinstall -y "Development Tools"
>  sudo yum install -y apache-maven python-devel java-1.7.0-openjdk-devel 
> zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5 
> apr-devel subversion-devel apr-util-devel
>  sudo yum install -y git
>  sudo yum install -y docker
>  sudo service docker start
>  sudo docker info
>  #sudo wget -qO- https://get.docker.com/ | sh
>   SHELL
> end
> EOF
> vagrant up
> vagrant reload
> vagrant ssh -c "
> git clone  https://github.com/apache/mesos.git mesos
> cd mesos
> git checkout -b 0.26.0-rc1 0.26.0-rc1
> ./bootstrap
> mkdir build
> cd build
> ../configure
> make -j4 check
> #make -j4 distcheck
> sudo ./bin/mesos-tests.sh
> #make clean
> #../configure --enable-libevent --enable-ssl
> #GTEST_FILTER="" make check
> #sudo ./bin/mesos-tests.sh
> "
> {noformat}
> Additionally, {{/etc/hosts}} was edited to contain hostname and IP (allowing 
> a pass of the bridged docker executor tests).
> {noformat}
> 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
> ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
> 192.168.218.135 centos71
> {noformat}




[jira] [Commented] (MESOS-3791) Enhance the existing HTTP endpoint /roles

2015-11-25 Thread Yong Qiao Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026469#comment-15026469
 ] 

Yong Qiao Wang commented on MESOS-3791:
---

This ticket changes the JSON response format of GET requests to the /roles 
endpoint. I am not sure whether we need to keep the previous format for 
backward compatibility. [~adam-mesos], any suggestions?

> Enhance the existing HTTP endpoint /roles
> -
>
> Key: MESOS-3791
> URL: https://issues.apache.org/jira/browse/MESOS-3791
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint to query roles as 
> outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#




[jira] [Commented] (MESOS-3947) Authenticate /roles request

2015-11-25 Thread Yong Qiao Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026499#comment-15026499
 ] 

Yong Qiao Wang commented on MESOS-3947:
---

Thanks for the information, [~marco-mesos]. I have updated the description of 
this ticket to address your concern.

> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles endpoint needs to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.
> In addition, because a query request to the /roles endpoint does not change 
> the state of roles/weights in the Mesos master, and for backward 
> compatibility, it will not be authenticated.
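
The rule described above (authenticate requests that modify roles, leave 
queries unauthenticated) can be sketched as follows. This is a minimal Python 
illustration, not the Mesos implementation: it assumes HTTP Basic credentials 
in the `Authorization` header and that query requests are GET requests. The 
names `CREDENTIALS`, `parse_basic_authorization` and `roles_status` are 
hypothetical.

```python
import base64

# Hypothetical credential store; the principal and secret are made up.
CREDENTIALS = {"operator": "secret"}


def parse_basic_authorization(header):
    """Return (principal, secret) from a Basic `Authorization` value, or None."""
    if header is None:
        return None
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers invalid base64 and invalid UTF-8
        return None
    principal, separator, secret = decoded.partition(":")
    if not separator:
        return None
    return principal, secret


def roles_status(method, headers):
    """Return the HTTP status a /roles handler might choose (assumed behavior)."""
    if method == "GET":
        # Queries do not change roles/weights, so they are not authenticated.
        return 200
    credentials = parse_basic_authorization(headers.get("Authorization"))
    if credentials is None:
        return 401
    principal, secret = credentials
    if CREDENTIALS.get(principal) != secret:
        return 401
    return 200
```

A real handler would likely compare secrets in constant time and delegate to 
the configured authenticator; the sketch only shows where the GET exemption 
sits.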




[jira] [Created] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1

2015-11-25 Thread Jan Schlicht (JIRA)

Jan Schlicht created MESOS-4009:
---

 Summary: RegistryClientTest.SimpleRegistryPuller doesn't compile 
with GCC 5.1.1
 Key: MESOS-4009
 URL: https://issues.apache.org/jira/browse/MESOS-4009
 Project: Mesos
  Issue Type: Bug
 Environment: Fedora 23
Reporter: Jan Schlicht
Assignee: Jan Schlicht
Priority: Trivial







[jira] [Updated] (MESOS-4007) Persist role information to registry

2015-11-25 Thread Yong Qiao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-4007:
--
Description: 
To handle Mesos master recovery and failover, the Mesos master needs to 
persist the roles and weights information in the registry: 
- On the first boot, the first leading master initializes the replicated log 
with the roles/weights specified by the command-line flags (--roles and 
--weights). The flag values are only used to bootstrap the cluster, after 
which point the registry becomes the source of truth.

- At runtime, the replicated log can only be updated (to add/remove/update 
entries) through the operator REST API.

- On Mesos master restart/failover, if the replicated log already contains 
roles/weights, the master uses the registry values, ignores the flags 
(--roles/--weights), and logs a warning that the flag values are being 
ignored.

- As future work, we can educate end users to create the replicated log to 
initialize the supported roles/weights before bootstrapping the Mesos cluster, 
and to reset the roles/weights configuration by updating the replicated log.


  was:Persist role information to registry across master recovery/failover.


> Persist role information to registry
> 
>
> Key: MESOS-4007
> URL: https://issues.apache.org/jira/browse/MESOS-4007
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> To handle Mesos master recovery and failover, the Mesos master needs to 
> persist the roles and weights information in the registry: 
> - On the first boot, the first leading master initializes the replicated log 
> with the roles/weights specified by the command-line flags (--roles and 
> --weights). The flag values are only used to bootstrap the cluster, after 
> which point the registry becomes the source of truth.
> - At runtime, the replicated log can only be updated (to add/remove/update 
> entries) through the operator REST API.
> - On Mesos master restart/failover, if the replicated log already contains 
> roles/weights, the master uses the registry values, ignores the flags 
> (--roles/--weights), and logs a warning that the flag values are being 
> ignored.
> - As future work, we can educate end users to create the replicated log to 
> initialize the supported roles/weights before bootstrapping the Mesos 
> cluster, and to reset the roles/weights configuration by updating the 
> replicated log.
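
The boot/failover rule in the description can be sketched as a small decision 
function. This is only a Python illustration under stated assumptions, not 
Mesos code: roles/weights are modeled as a dict of role name to weight, a 
registry without a roles/weights entry is modeled as `None`, and the name 
`resolve_roles` and the warning text are hypothetical.

```python
def resolve_roles(registry_roles, flag_roles):
    """Pick the effective roles/weights and an optional warning to log.

    registry_roles: dict of role -> weight read from the replicated log, or
                    None if the log holds no roles/weights yet (first boot).
    flag_roles:     dict of role -> weight parsed from --roles/--weights.
    """
    if registry_roles is None:
        # First boot: the flags bootstrap the registry.
        return dict(flag_roles), None

    # Restart/failover: the registry is the source of truth.
    warning = None
    if flag_roles:
        warning = "Ignoring --roles/--weights flags; using registry values"
    return dict(registry_roles), warning
```

For example, `resolve_roles(None, {"ads": 2.0})` returns the flag values with 
no warning, while `resolve_roles({"ads": 1.0}, {"ads": 2.0})` returns the 
registry values together with the warning.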




[jira] [Commented] (MESOS-3942) Enhance endpoint /roles for adding a new role

2015-11-25 Thread Yong Qiao Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026417#comment-15026417
 ] 

Yong Qiao Wang commented on MESOS-3942:
---

RR: https://reviews.apache.org/r/40697/

> Enhance endpoint /roles for adding a new role
> -
>
> Key: MESOS-3942
> URL: https://issues.apache.org/jira/browse/MESOS-3942
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> In this ticket, we will enhance the existing HTTP endpoint /roles so that a 
> new role can be added at runtime, as outlined in the Design Doc: 
> https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#




[jira] [Updated] (MESOS-4008) Master recovery with the persisted roles in registry

2015-11-25 Thread Yong Qiao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-4008:
--
Description: 
To handle Mesos master recovery and failover, the Mesos master needs to 
persist the roles and weights information in the registry: 
- On the first boot, the first leading master initializes the replicated log 
with the roles/weights specified by the command-line flags (--roles and 
--weights). The flag values are only used to bootstrap the cluster, after 
which point the registry becomes the source of truth.

- At runtime, the replicated log can only be updated (to add/remove/update 
entries) through the operator REST API.

- On Mesos master restart/failover, if the replicated log already contains 
roles/weights, the master uses the registry values, ignores the flags 
(--roles/--weights), and logs a warning that the flag values are being 
ignored.

- As future work, we can educate end users to create the replicated log to 
initialize the supported roles/weights before bootstrapping the Mesos cluster, 
and to reset the roles/weights configuration by updating the replicated log.


  was:
To handle Mesos master recovery and failover, the Mesos master needs to 
persist the roles and weights information in the registry: 
On the first boot, the first leading master initializes the replicated log 
with the roles/weights specified by the command-line flags (--roles and 
--weights). The flag values are only used to bootstrap the cluster, after 
which point the registry becomes the source of truth.
At runtime, the replicated log can only be updated (to add/remove/update 
entries) through the operator REST API.
On Mesos master restart/failover, if the replicated log already contains 
roles/weights, the master uses the registry values, ignores the flags 
(--roles/--weights), and logs a warning that the flag values are being 
ignored.
As future work, we can educate end users to create the replicated log to 
initialize the supported roles/weights before bootstrapping the Mesos cluster, 
and to reset the roles/weights configuration by updating the replicated log.



> Master recovery with the persisted roles in registry
> 
>
> Key: MESOS-4008
> URL: https://issues.apache.org/jira/browse/MESOS-4008
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> To handle Mesos master recovery and failover, the Mesos master needs to 
> persist the roles and weights information in the registry: 
> - On the first boot, the first leading master initializes the replicated log 
> with the roles/weights specified by the command-line flags (--roles and 
> --weights). The flag values are only used to bootstrap the cluster, after 
> which point the registry becomes the source of truth.
> - At runtime, the replicated log can only be updated (to add/remove/update 
> entries) through the operator REST API.
> - On Mesos master restart/failover, if the replicated log already contains 
> roles/weights, the master uses the registry values, ignores the flags 
> (--roles/--weights), and logs a warning that the flag values are being 
> ignored.
> - As future work, we can educate end users to create the replicated log to 
> initialize the supported roles/weights before bootstrapping the Mesos 
> cluster, and to reset the roles/weights configuration by updating the 
> replicated log.




[jira] [Created] (MESOS-4008) Master recovery with the persisted roles in registry

2015-11-25 Thread Yong Qiao Wang (JIRA)

Yong Qiao Wang created MESOS-4008:
-

 Summary: Master recovery with the persisted roles in registry
 Key: MESOS-4008
 URL: https://issues.apache.org/jira/browse/MESOS-4008
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


To handle Mesos master recovery and failover, the Mesos master needs to 
persist the roles and weights information in the registry: 
On the first boot, the first leading master initializes the replicated log 
with the roles/weights specified by the command-line flags (--roles and 
--weights). The flag values are only used to bootstrap the cluster, after 
which point the registry becomes the source of truth.
At runtime, the replicated log can only be updated (to add/remove/update 
entries) through the operator REST API.
On Mesos master restart/failover, if the replicated log already contains 
roles/weights, the master uses the registry values, ignores the flags 
(--roles/--weights), and logs a warning that the flag values are being 
ignored.
As future work, we can educate end users to create the replicated log to 
initialize the supported roles/weights before bootstrapping the Mesos cluster, 
and to reset the roles/weights configuration by updating the replicated log.





[jira] [Created] (MESOS-4007) Persist role information to registry

2015-11-25 Thread Yong Qiao Wang (JIRA)

Yong Qiao Wang created MESOS-4007:
-

 Summary: Persist role information to registry
 Key: MESOS-4007
 URL: https://issues.apache.org/jira/browse/MESOS-4007
 Project: Mesos
  Issue Type: Task
Reporter: Yong Qiao Wang
Assignee: Yong Qiao Wang


Persist role information to registry across master recovery/failover.




[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Bernd Mathiske (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026649#comment-15026649
 ] 

Bernd Mathiske commented on MESOS-3937:
---

My take is that to close this ticket we need to make sure we have viable 
instructions in the docs / on the web page.

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/40AlT8/credentials'
> I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING
> I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.075466ms
> I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to 
> VOTING
> I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos 
> group
> I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated
> I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL
> I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled
> I1117 15:08:09.296018 26399 master.cpp:1606] The newly elected leader is 
> master@10.0.2.15:50088 with id

[jira] [Created] (MESOS-4010) Initial leader election unstable

2015-11-25 Thread Guilherme Moro (JIRA)

Guilherme Moro created MESOS-4010:
-

 Summary: Initial leader election unstable
 Key: MESOS-4010
 URL: https://issues.apache.org/jira/browse/MESOS-4010
 Project: Mesos
  Issue Type: Bug
  Components: master
Affects Versions: 0.25.0
 Environment: RHEL 6.6
Reporter: Guilherme Moro
Priority: Critical


No leader is elected.
To start, let me explain my setup:
3 nodes
3 zookeepers
3 mesos-master services, configured as initctl services and controlled by 
puppet. The installed RPMs are from the RHEL repository at mesosphere 
(installed through puppet as well), running on RHEL 6.6.
Quorum is set to 2, as expected. All the remaining configs were double-checked 
and appear to be correct.
Most of the time I can get the cluster to bootstrap after rebooting the nodes 
(sometimes more than once).
The whole thing somewhat resembles 
https://issues.apache.org/jira/browse/MESOS-2148 and 
https://issues.apache.org/jira/browse/MESOS-2014
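
For reference, the quorum of 2 mentioned above matches a strict majority of 3 
masters. A small Python sketch of that arithmetic (the function name 
`majority_quorum` is only illustrative):

```python
def majority_quorum(masters):
    """Smallest number of masters that forms a strict majority."""
    if masters < 1:
        raise ValueError("need at least one master")
    return masters // 2 + 1


masters = 3
quorum = majority_quorum(masters)
# Number of masters that can be down while a majority is still reachable.
tolerated_failures = masters - quorum
print(quorum, tolerated_failures)
```

With 3 masters this gives a quorum of 2 and tolerates 1 unavailable master.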




[jira] [Updated] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1

2015-11-25 Thread Jan Schlicht (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Schlicht updated MESOS-4009:

Description: GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and 
stumbles over a comparison between signed and unsigned int in 
{{provisioner_docker_tests.cpp}}.
Component/s: test

> RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
> --
>
> Key: MESOS-4009
> URL: https://issues.apache.org/jira/browse/MESOS-4009
> Project: Mesos
>  Issue Type: Bug
>  Components: test
> Environment: Fedora 23
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>Priority: Trivial
>  Labels: easyfix
>
> GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a 
> comparison between signed and unsigned int in 
> {{provisioner_docker_tests.cpp}}.
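
As a worked example of what {{-Wsign-compare}} guards against: when a signed 
and an unsigned integer of the same width are compared in C++, the signed 
operand is first converted to unsigned (modulo 2^N). The Python sketch below 
reproduces that arithmetic for a 32-bit unsigned int. The values and the 
helper name `to_uint32` are made up for illustration and are not taken from 
{{provisioner_docker_tests.cpp}}.

```python
def to_uint32(value):
    """Model the C++ conversion of an integer to a 32-bit unsigned int."""
    return value % 2**32


signed_value = -1   # e.g. an int holding an error or sentinel value
unsigned_size = 3   # e.g. what a container's size() might return

# Mathematically, -1 is less than 3.
mathematical_result = signed_value < unsigned_size

# In a mixed signed/unsigned comparison, -1 is converted to unsigned first.
converted = to_uint32(signed_value)
mixed_result = converted < unsigned_size

print(converted)            # 4294967295
print(mathematical_result)  # True
print(mixed_result)         # False
```

The usual fix is to make both operands the same signedness, for example by 
using an unsigned literal or an explicit cast on the side whose range is 
known.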




[jira] [Commented] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026645#comment-15026645
 ] 

Till Toenshoff commented on MESOS-4009:
---

Why would clang and gcc < 5.1 not detect this?

> RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
> --
>
> Key: MESOS-4009
> URL: https://issues.apache.org/jira/browse/MESOS-4009
> Project: Mesos
>  Issue Type: Bug
>  Components: test
> Environment: Fedora 23
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>Priority: Trivial
>  Labels: easyfix
>
> GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a 
> comparison between signed and unsigned int in 
> {{provisioner_docker_tests.cpp}}.




[jira] [Updated] (MESOS-4010) Initial leader election unstable

2015-11-25 Thread Guilherme Moro (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guilherme Moro updated MESOS-4010:
--
Attachment: messages_node3.log
messages_node2.log
messages_node1.log

> Initial leader election unstable
> 
>
> Key: MESOS-4010
> URL: https://issues.apache.org/jira/browse/MESOS-4010
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.25.0
> Environment: RHEL 6.6
>Reporter: Guilherme Moro
>Priority: Critical
> Attachments: messages_node1.log, messages_node2.log, 
> messages_node3.log
>
>
> No leader is elected.
> To start, let me explain my setup:
> 3 nodes
> 3 zookeepers
> 3 mesos-master services, configured as initctl services and controlled by 
> puppet. The installed RPMs are from the RHEL repository at mesosphere 
> (installed through puppet as well), running on RHEL 6.6.
> Quorum is set to 2, as expected. All the remaining configs were 
> double-checked and appear to be correct.
> Most of the time I can get the cluster to bootstrap after rebooting the nodes 
> (sometimes more than once).
> The whole thing somewhat resembles 
> https://issues.apache.org/jira/browse/MESOS-2148 and 
> https://issues.apache.org/jira/browse/MESOS-2014




[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Timothy Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026628#comment-15026628
 ] 

Timothy Chen commented on MESOS-3937:
-

I can't reproduce this with the phusion/ubuntu-14.04-amd64 vagrant image.
example_executor.go is open source; it lives in the mesos/mesos-go repo.

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/40AlT8/credentials'
> I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING
> I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.075466ms
> I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to 
> VOTING
> I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos 
> group
> I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated
> I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL
> I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled
> I1117 15:08:09.296018 26399 master.cpp:1606] The newly elected leader is 
> master@10.0.2.15:50088

[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026638#comment-15026638
 ] 

haosdent commented on MESOS-3937:
-

LoL, I could not find it just now: 
https://github.com/mesos/mesos-go/search?utf8=%E2%9C%93=example_executor 
But I think it fails because the "hostname -f" command, which node.go depends 
on, failed. After updating /etc/hosts, it works.

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/40AlT8/credentials'
> I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING
> I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.075466ms
> I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to 
> VOTING
> I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos 
> group
> I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated
> I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL
> I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled
> I1117 15:08:09.296018

[jira] [Commented] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026764#comment-15026764
 ] 

Till Toenshoff commented on MESOS-4009:
---

Good to know - thanks. 

Maybe we should then additionally enable sign-compare for those other 
compilers?

> RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
> --
>
> Key: MESOS-4009
> URL: https://issues.apache.org/jira/browse/MESOS-4009
> Project: Mesos
>  Issue Type: Bug
>  Components: test
> Environment: Fedora 23
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>Priority: Trivial
>  Labels: easyfix
>
> GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a 
> comparison between signed and unsigned int in 
> {{provisioner_docker_tests.cpp}}.




[jira] [Updated] (MESOS-4012) Update documentation to reflect the addition of installable tests.

2015-11-25 Thread Till Toenshoff (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-4012:
--
Issue Type: Documentation  (was: Epic)

> Update documentation to reflect the addition of installable tests.  
> 
>
> Key: MESOS-4012
> URL: https://issues.apache.org/jira/browse/MESOS-4012
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Till Toenshoff
>
> We may want to add the needed steps for administrators to create and run the 
> test-suite on anything other than the build machine. 
> One possible location could be {{docs/getting-started.md}} for validating 
> the pre-requisites as described in that document. 




[jira] [Created] (MESOS-4012) Update documentation to reflect the addition of installable tests.

2015-11-25 Thread Till Toenshoff (JIRA)

Till Toenshoff created MESOS-4012:
-

 Summary: Update documentation to reflect the addition of 
installable tests.  
 Key: MESOS-4012
 URL: https://issues.apache.org/jira/browse/MESOS-4012
 Project: Mesos
  Issue Type: Epic
Reporter: Till Toenshoff


We may want to add the needed steps for administrators to create and run the 
test-suite on anything other than the build machine. 

One possible location could be {{docs/getting-started.md}} for validating the 
pre-requisites as described in that document. 




[jira] [Commented] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1

2015-11-25 Thread Jan Schlicht (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026715#comment-15026715
 ] 

Jan Schlicht commented on MESOS-4009:
-

Clang does not have {{-Wsign-compare}} in {{-Wall}}. I'm not sure, but GCC < 
5.1 seems to suffer from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59231
At least that would explain the behaviour, because ASSERT_EQ is using templates 
(see: 
https://github.com/google/googletest/blob/master/googletest/include/gtest/gtest.h#L1451)
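
To illustrate the kind of comparison that {{-Wsign-compare}} flags, here is a minimal sketch. It is not the code from {{provisioner_docker_tests.cpp}}; the helper name sameCount and its parameters are hypothetical. It avoids mixing signed and unsigned operands by rejecting negative values first and only then converting to the container's size type.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: compares a signed expected count against a container
// size without mixing signed and unsigned operands in one comparison.
bool sameCount(int expected, const std::vector<std::string>& items)
{
  // A negative expectation can never match a size, and checking it first
  // makes the cast below safe.
  if (expected < 0) {
    return false;
  }

  return static_cast<std::size_t>(expected) == items.size();
}
```

In a test, the same effect can usually be had by giving the expected value an unsigned type to begin with, so that both sides of the assertion already agree.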

> RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
> --
>
> Key: MESOS-4009
> URL: https://issues.apache.org/jira/browse/MESOS-4009
> Project: Mesos
>  Issue Type: Bug
>  Components: test
> Environment: Fedora 23
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>Priority: Trivial
>  Labels: easyfix
>
> GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a 
> comparison between signed and unsigned int in 
> {{provisioner_docker_tests.cpp}}.




[jira] [Updated] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1

2015-11-25 Thread Till Toenshoff (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-4009:
--
Environment: Fedora 23, GCC 5.1.1  (was: Fedora 23)

> RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
> --
>
> Key: MESOS-4009
> URL: https://issues.apache.org/jira/browse/MESOS-4009
> Project: Mesos
>  Issue Type: Bug
>  Components: test
> Environment: Fedora 23, GCC 5.1.1
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>Priority: Trivial
>  Labels: easyfix
>
> GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a 
> comparison between signed and unsigned int in 
> {{provisioner_docker_tests.cpp}}.




[jira] [Commented] (MESOS-3964) LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.

2015-11-25 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026771#comment-15026771
 ] 

Till Toenshoff commented on MESOS-3964:
---

Thanks for referencing this; it supports our results, good.


> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and 
> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.
> ---
>
> Key: MESOS-3964
> URL: https://issues.apache.org/jira/browse/MESOS-3964
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation, test
>Affects Versions: 0.26.0
> Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt
> Vagrantfile: see MESOS-3957
>Reporter: Bernd Mathiske
>Assignee: Greg Mann
>Priority: Blocker
>  Labels: mesosphere
>
> sudo ./bin/mesos-test.sh 
> --gtest_filter="LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs"
> {noformat}
> ...
> F1119 14:34:52.514742 30706 isolator_tests.cpp:455] CHECK_SOME(isolator): 
> Failed to find 'cpu.cfs_quota_us'. Your kernel might be too old to use the 
> CFS cgroups feature.
> {noformat}




[jira] [Created] (MESOS-4011) Allow build phase independent platform integration tests.

2015-11-25 Thread Till Toenshoff (JIRA)

Till Toenshoff created MESOS-4011:
-

 Summary: Allow build phase independent platform integration tests.
 Key: MESOS-4011
 URL: https://issues.apache.org/jira/browse/MESOS-4011
 Project: Mesos
  Issue Type: Epic
Reporter: Till Toenshoff


Many of the tests in Mesos could be described as integration tests, since they 
have external dependencies on kernel features, installed tools, permissions, 
etc. I'd like to be able to generate a mesos-tests RPM along with my mesos RPM 
so that I can run the same tests in different deployment environments.





[jira] [Commented] (MESOS-3964) LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.

2015-11-25 Thread haosdent (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026755#comment-15026755
 ] 

haosdent commented on MESOS-3964:
-

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=789019

> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and 
> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.
> ---
>
> Key: MESOS-3964
> URL: https://issues.apache.org/jira/browse/MESOS-3964
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation, test
>Affects Versions: 0.26.0
> Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt
> Vagrantfile: see MESOS-3957
>Reporter: Bernd Mathiske
>Assignee: Greg Mann
>Priority: Blocker
>  Labels: mesosphere
>
> sudo ./bin/mesos-test.sh 
> --gtest_filter="LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs"
> {noformat}
> ...
> F1119 14:34:52.514742 30706 isolator_tests.cpp:455] CHECK_SOME(isolator): 
> Failed to find 'cpu.cfs_quota_us'. Your kernel might be too old to use the 
> CFS cgroups feature.
> {noformat}




[jira] [Updated] (MESOS-3969) Failing 'make distcheck' on Debian 8, somehow SSL-related.

2015-11-25 Thread Till Toenshoff (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-3969:
--
Shepherd: Bernd Mathiske

> Failing 'make distcheck' on Debian 8, somehow SSL-related.
> --
>
> Key: MESOS-3969
> URL: https://issues.apache.org/jira/browse/MESOS-3969
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.26.0
> Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt
> Vagrantfile see MESOS-3957
>Reporter: Bernd Mathiske
>Assignee: Joseph Wu
>  Labels: build, build-failure, mesosphere
>
> As non-root: make distcheck.
> {noformat}
> /bin/mkdir -p '/home/vagrant/mesos/build/mesos-0.26.0/_inst/bin'
> /bin/bash ../libtool --mode=install /usr/bin/install -c mesos-local mesos-log 
> mesos mesos-execute mesos-resolve 
> '/home/vagrant/mesos/build/mesos-0.26.0/_inst/bin'
> libtool: install: /usr/bin/install -c .libs/mesos-local 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-local
> libtool: install: /usr/bin/install -c .libs/mesos-log 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-log
> libtool: install: /usr/bin/install -c .libs/mesos 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos
> libtool: install: /usr/bin/install -c .libs/mesos-execute 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-execute
> libtool: install: /usr/bin/install -c .libs/mesos-resolve 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-resolve
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/__init__.py", line 11, in <module>
>     from pip.vcs import git, mercurial, subversion, bazaar # noqa
>   File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/vcs/mercurial.py", line 9, in <module>
>     from pip.download import path_to_url
>   File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/download.py", line 22, in <module>
>     from pip._vendor import requests, six
>   File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/_vendor/requests/__init__.py", line 53, in <module>
>     from .packages.urllib3.contrib import pyopenssl
>   File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/_vendor/requests/packages/urllib3/contrib/pyopenssl.py", line 70, in <module>
>     ssl.PROTOCOL_SSLv3: OpenSSL.SSL.SSLv3_METHOD,
> AttributeError: 'module' object has no attribute 'PROTOCOL_SSLv3'
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
> File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rd
> {noformat}




[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota

2015-11-25 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-3073:
---
Issue Type: Epic  (was: Improvement)

> Introduce HTTP endpoints for Quota
> --
>
> Key: MESOS-3073
> URL: https://issues.apache.org/jira/browse/MESOS-3073
> Project: Mesos
>  Issue Type: Epic
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> We need to implement the HTTP endpoints for Quota as outlined in the Design 
> Doc: 
> (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I).




[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota

2015-11-25 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-3073:
---
   Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 
17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20, 
Mesosphere Sprint 21, Mesosphere Sprint 22  (was: Mesosphere Sprint 15, 
Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere 
Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere Sprint 22, 
Mesosphere Sprint 23)
Epic Name: Quota Endpoints

> Introduce HTTP endpoints for Quota
> --
>
> Key: MESOS-3073
> URL: https://issues.apache.org/jira/browse/MESOS-3073
> Project: Mesos
>  Issue Type: Epic
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> We need to implement the HTTP endpoints for Quota as outlined in the Design 
> Doc: 
> (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I).




[jira] [Created] (MESOS-4014) Introduce DELETE/remove endpoint for quota

2015-11-25 Thread Alexander Rukletsov (JIRA)

Alexander Rukletsov created MESOS-4014:
--

 Summary: Introduce DELETE/remove endpoint for quota
 Key: MESOS-4014
 URL: https://issues.apache.org/jira/browse/MESOS-4014
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Alexander Rukletsov
Assignee: Joerg Schad


This endpoint is for removing quotas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-4014) Introduce delete/remove endpoint for quota

2015-11-25 Thread Anand Mazumdar (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4014:
--
Summary: Introduce delete/remove endpoint for quota  (was: Introduce 
DELETE/remove endpoint for quota)

> Introduce delete/remove endpoint for quota
> --
>
> Key: MESOS-4014
> URL: https://issues.apache.org/jira/browse/MESOS-4014
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> This endpoint is for removing quotas.




[jira] [Updated] (MESOS-4014) Introduce delete/remove endpoint for quota

2015-11-25 Thread Anand Mazumdar (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4014:
--
Description: This endpoint is for removing quotas via the DELETE method.  
(was: This endpoint is for removing quotas.)

> Introduce delete/remove endpoint for quota
> --
>
> Key: MESOS-4014
> URL: https://issues.apache.org/jira/browse/MESOS-4014
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> This endpoint is for removing quotas via the DELETE method.




[jira] [Created] (MESOS-4013) Introduce GET/status endpoint for quota

2015-11-25 Thread Alexander Rukletsov (JIRA)

Alexander Rukletsov created MESOS-4013:
--

 Summary: Introduce GET/status endpoint for quota
 Key: MESOS-4013
 URL: https://issues.apache.org/jira/browse/MESOS-4013
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Alexander Rukletsov
Assignee: Joerg Schad


The endpoint should provide quota status.




[jira] [Updated] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-25 Thread Bernd Mathiske (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bernd Mathiske updated MESOS-3552:
--
Target Version/s: 0.26.0  (was: 0.27.0)

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt
>
> The check result.cpus() == cpus() fails because it compares two doubles 
> for exact equality. 
> Root cause: 
> The framework requested a 0.1 cpu reservation for the first task. So far so 
> good. 
> The next Reserve operation led to double arithmetic that produced the 
> following values:
>  results.cpus() : 23.9964472863211995 cpus() : 24
> Because the arithmetic left results.cpus() at 23.9964472863211995, the check 
> ( 23.9964472863211995 == 24 ) failed.




[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-25 Thread Neil Conway (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028154#comment-15028154
 ] 

Neil Conway commented on MESOS-3552:


There's also MESOS-3990, which wouldn't be handled by either almostEqual or 
CHECK_DOUBLE_EQ: the problem in MESOS-3990 is that we return unexpected results 
to the user.

Since the plan is to switch to fixed-point anyway, personally I think we should 
focus on (a) fixing the crashing / failing CHECKs, then (b) figuring out a 
migration plan toward fixed-point resource values.
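
As a rough illustration of (b), here is a minimal sketch of a fixed-point quantity. The FixedCpus type and the precision of one thousandth of a CPU are hypothetical choices for this example, not something taken from the Mesos code base or decided in this thread. Storing an integer count makes addition, subtraction, and equality exact.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point CPU quantity: stores thousandths of a CPU as an
// integer so that addition, subtraction, and equality are exact.
class FixedCpus
{
public:
  // Rounds the double to the nearest thousandth of a CPU.
  static FixedCpus fromDouble(double cpus)
  {
    return FixedCpus(std::llround(cpus * 1000.0));
  }

  double toDouble() const { return static_cast<double>(millis) / 1000.0; }

  FixedCpus operator+(const FixedCpus& that) const
  {
    return FixedCpus(millis + that.millis);
  }

  FixedCpus operator-(const FixedCpus& that) const
  {
    return FixedCpus(millis - that.millis);
  }

  bool operator==(const FixedCpus& that) const
  {
    return millis == that.millis;
  }

private:
  explicit FixedCpus(std::int64_t millis_) : millis(millis_) {}

  std::int64_t millis;
};
```

With such a type, reserving 0.1 cpus out of 24 and then unreserving it yields exactly the starting value, so an equality check on the result holds.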

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt
>
> The check result.cpus() == cpus() fails because it compares two doubles 
> for exact equality. 
> Root cause: 
> The framework requested a 0.1 cpu reservation for the first task. So far so 
> good. 
> The next Reserve operation led to double arithmetic that produced the 
> following values:
>  results.cpus() : 23.9964472863211995 cpus() : 24
> Because the arithmetic left results.cpus() at 23.9964472863211995, the check 
> ( 23.9964472863211995 == 24 ) failed.




[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-25 Thread Avinash Sridharan (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028206#comment-15028206
 ] 

Avinash Sridharan commented on MESOS-3552:
--

Just an update: I tried Mandeep's test case with CHECK_DOUBLE_EQ, and it still 
fails on the test Mandeep had submitted for review: 
https://reviews.apache.org/r/39056/

I am creating a patch that uses CHECK_NEAR with the margin set to MIN_CPUS, and 
adding Mandeep's test case to the test framework as well. 
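
For readers unfamiliar with margin-based comparison, here is a minimal sketch of the idea behind a CHECK_NEAR-style check. It is not the patch itself: the helper nearlyEqual and the constant kMargin are hypothetical, and the value 0.01 is only an assumed stand-in, since the actual value of MIN_CPUS is not given in this thread.

```cpp
#include <cmath>

// Assumed margin for illustration only; the comment above proposes MIN_CPUS,
// whose actual value is not stated in this thread.
constexpr double kMargin = 0.01;

// Returns true when the two values differ by at most `margin`.
bool nearlyEqual(double left, double right, double margin = kMargin)
{
  return std::fabs(left - right) <= margin;
}
```

The margin has to be chosen with care: any two values closer together than the margin compare as equal, so it should be smaller than the smallest difference that is meaningful for the resource.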

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt
>
> The check result.cpus() == cpus() fails because it compares two doubles 
> for exact equality. 
> Root cause: 
> The framework requested a 0.1 cpu reservation for the first task. So far so 
> good. 
> The next Reserve operation led to double arithmetic that produced the 
> following values:
>  results.cpus() : 23.9964472863211995 cpus() : 24
> Because the arithmetic left results.cpus() at 23.9964472863211995, the check 
> ( 23.9964472863211995 == 24 ) failed.




[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-25 Thread Klaus Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028201#comment-15028201
 ] 

Klaus Ma commented on MESOS-3552:
-

For MESOS-3990, I think it's because the framework cannot decide whether one 
set of resources contains another. If we only handle CHECK_DOUBLE_EQ, how do we 
handle {{resources.contains}}? Just ignore it for now?

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt
>
> The check result.cpus() == cpus() fails because it compares two doubles 
> for exact equality. 
> Root cause: 
> The framework requested a 0.1 cpu reservation for the first task. So far so 
> good. 
> The next Reserve operation led to double arithmetic that produced the 
> following values:
>  results.cpus() : 23.9964472863211995 cpus() : 24
> Because the arithmetic left results.cpus() at 23.9964472863211995, the check 
> ( 23.9964472863211995 == 24 ) failed.



