[jira] [Updated] (MESOS-3795) process::io::write takes parameter as void* which could be const
[ https://issues.apache.org/jira/browse/MESOS-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Bannier updated MESOS-3795:
Attachment: (was: ubuntu14_clang-3.6_FAILED.log)

> process::io::write takes parameter as void* which could be const
>
> Key: MESOS-3795
> URL: https://issues.apache.org/jira/browse/MESOS-3795
> Project: Mesos
> Issue Type: Improvement
> Components: libprocess
> Reporter: Benjamin Bannier
> Labels: mesosphere, tech-debt
>
> In libprocess we have
> {code}
> Future write(int fd, void* data, size_t size);
> {code}
> which expects a non-{{const}} {{void*}} for its {{data}} parameter. Under the
> covers {{data}} appears to be handled as {{const}} (as one would expect
> from the signature of its inspiration, {{::write}}).
> This function is not used very often, but since it expects a non-{{const}}
> value for {{data}}, implicit conversions from pointers to {{const}} data to
> {{void*}} are not available; instead, callers seem to cast manually to
> {{void*}}, often with C-style casts.
> We should sync this function's signature with that of {{::write}}.
> In addition to following the expected semantics of {{::write}}, having this
> work without casts for any pointer value {{data}} would make it easier to
> use with string literals or with raw data pointers from STL containers
> (e.g. {{Container::data}}). It would probably also indirectly remove the
> temptation to use C-style casts.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
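To illustrate the signature change proposed in MESOS-3795, here is a minimal sketch. It is not libprocess code: `writeAll` is a hypothetical synchronous helper that takes `const void*` like POSIX `::write`, and `roundTrip` is a hypothetical demo that pushes data through a pipe. The point is only that, with a `const` parameter, string literals and `std::string::data()` can be passed without any cast.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>

#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not part of libprocess): writes `size` bytes from
// `data` to `fd`, retrying on short writes and EINTR. Taking `const void*`
// mirrors the signature of POSIX ::write.
ssize_t writeAll(int fd, const void* data, size_t size)
{
  const char* bytes = static_cast<const char*>(data);
  size_t written = 0;
  while (written < size) {
    ssize_t n = ::write(fd, bytes + written, size - written);
    if (n < 0) {
      if (errno == EINTR) continue;  // Interrupted, try again.
      return -1;
    }
    written += static_cast<size_t>(n);
  }
  return static_cast<ssize_t>(written);
}

// Hypothetical demo: sends a literal prefix and `text` through a pipe and
// returns what comes out of the read end. Assumes the data is smaller than
// the pipe buffer, so the writes do not block.
std::string roundTrip(const std::string& text)
{
  int fds[2];
  if (::pipe(fds) != 0) return std::string();

  // Neither call needs a cast: a string literal and std::string::data()
  // both convert implicitly to `const void*`.
  ssize_t prefix = writeAll(fds[1], "> ", 2);
  ssize_t body = writeAll(fds[1], text.data(), text.size());
  ::close(fds[1]);

  std::string result;
  char buffer[256];
  ssize_t n = 0;
  while (prefix >= 0 && body >= 0 &&
         (n = ::read(fds[0], buffer, sizeof(buffer))) > 0) {
    result.append(buffer, static_cast<size_t>(n));
  }
  ::close(fds[0]);
  return result;
}
```

With a non-`const` `void*` parameter, both `writeAll` calls in `roundTrip` would fail to compile unless the caller cast away `const`, which is the situation the issue describes.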
[jira] [Updated] (MESOS-3793) Cannot start mesos local on a Debian GNU/Linux 8 docker machine
[ https://issues.apache.org/jira/browse/MESOS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-3793: -- Shepherd: Till Toenshoff > Cannot start mesos local on a Debian GNU/Linux 8 docker machine > --- > > Key: MESOS-3793 > URL: https://issues.apache.org/jira/browse/MESOS-3793 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.25.0 > Environment: Debian GNU/Linux 8 docker machine >Reporter: Matthias Veit >Assignee: Jojy Varghese > Labels: mesosphere > > We updated the mesos version to 0.25.0 in our Marathon docker image, that > runs our integration tests. > We use mesos local for those tests. This fails with this message: > {noformat} > root@a06e4b4eb776:/marathon# mesos local > I1022 18:42:26.852485 136 leveldb.cpp:176] Opened db in 6.103258ms > I1022 18:42:26.853302 136 leveldb.cpp:183] Compacted db in 765740ns > I1022 18:42:26.853343 136 leveldb.cpp:198] Created db iterator in 9001ns > I1022 18:42:26.853355 136 leveldb.cpp:204] Seeked to beginning of db in > 1287ns > I1022 18:42:26.853366 136 leveldb.cpp:273] Iterated through 0 keys in the > db in ns > I1022 18:42:26.853406 136 replica.cpp:744] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1022 18:42:26.853775 141 recover.cpp:449] Starting replica recovery > I1022 18:42:26.853862 141 recover.cpp:475] Replica is in EMPTY status > I1022 18:42:26.854751 138 replica.cpp:641] Replica in EMPTY status received > a broadcasted recover request > I1022 18:42:26.854856 140 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1022 18:42:26.855002 140 recover.cpp:566] Updating replica status to > STARTING > I1022 18:42:26.855655 138 master.cpp:376] Master > a3f39818-1bda-4710-b96b-2a60ed4d12b8 (a06e4b4eb776) started on > 172.17.0.14:5050 > I1022 18:42:26.855680 138 master.cpp:378] Flags at startup: > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="false" 
--authenticate_slaves="false" > --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --initialize_driver_logging="true" > --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" > --max_slave_ping_timeouts="5" --quiet="false" > --recovery_slave_removal_limit="100%" --registry="replicated_log" > --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" > --registry_strict="false" --root_submissions="true" > --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" > --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui" > --work_dir="/tmp/mesos/local/AK0XpG" --zk_session_timeout="10secs" > I1022 18:42:26.855790 138 master.cpp:425] Master allowing unauthenticated > frameworks to register > I1022 18:42:26.855803 138 master.cpp:430] Master allowing unauthenticated > slaves to register > I1022 18:42:26.855815 138 master.cpp:467] Using default 'crammd5' > authenticator > W1022 18:42:26.855829 138 authenticator.cpp:505] No credentials provided, > authentication requests will be refused > I1022 18:42:26.855840 138 authenticator.cpp:512] Initializing server SASL > I1022 18:42:26.856442 136 containerizer.cpp:143] Using isolation: > posix/cpu,posix/mem,filesystem/posix > I1022 18:42:26.856943 140 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.888185ms > I1022 18:42:26.856987 140 replica.cpp:323] Persisted replica status to > STARTING > I1022 18:42:26.857115 140 recover.cpp:475] Replica is in STARTING status > I1022 18:42:26.857270 140 replica.cpp:641] Replica in STARTING status > received a broadcasted recover request > I1022 18:42:26.857312 140 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1022 18:42:26.857368 140 recover.cpp:566] Updating replica status to VOTING > I1022 18:42:26.857781 140 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 371121ns > I1022 18:42:26.857841 140 replica.cpp:323] 
Persisted replica status to > VOTING > I1022 18:42:26.857895 140 recover.cpp:580] Successfully joined the Paxos > group > I1022 18:42:26.857928 140 recover.cpp:464] Recover process terminated > I1022 18:42:26.862455 137 master.cpp:1603] The newly elected leader is > master@172.17.0.14:5050 with id a3f39818-1bda-4710-b96b-2a60ed4d12b8 > I1022 18:42:26.862498 137 master.cpp:1616] Elected as the leading master! > I1022 18:42:26.862511 137 master.cpp:1376] Recovering from registrar > I1022 18:42:26.862560 137 registrar.cpp:309] Recovering registrar > Failed to create a containerizer: Could not create MesosContainerizer: Failed > to create launcher: Failed to create Linux launcher: Failed to mount cgroups >
[jira] [Updated] (MESOS-4014) Introduce remove endpoint for quota
[ https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rukletsov updated MESOS-4014:
---
Summary: Introduce remove endpoint for quota (was: Introduce delete/remove endpoint for quota)

> Introduce remove endpoint for quota
> ---
>
> Key: MESOS-4014
> URL: https://issues.apache.org/jira/browse/MESOS-4014
> Project: Mesos
> Issue Type: Task
> Components: master
> Reporter: Alexander Rukletsov
> Assignee: Joerg Schad
> Labels: mesosphere
>
> This endpoint is for removing quotas via the DELETE method.
[jira] [Updated] (MESOS-4013) Introduce status endpoint for quota
[ https://issues.apache.org/jira/browse/MESOS-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rukletsov updated MESOS-4013:
---
Summary: Introduce status endpoint for quota (was: Introduce GET/status endpoint for quota)

> Introduce status endpoint for quota
> ---
>
> Key: MESOS-4013
> URL: https://issues.apache.org/jira/browse/MESOS-4013
> Project: Mesos
> Issue Type: Task
> Components: master
> Reporter: Alexander Rukletsov
> Assignee: Joerg Schad
> Labels: mesosphere
>
> The endpoint should provide quota status.
[jira] [Updated] (MESOS-3973) Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11.
[ https://issues.apache.org/jira/browse/MESOS-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-3973: -- Shepherd: Till Toenshoff > Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11. > - > > Key: MESOS-3973 > URL: https://issues.apache.org/jira/browse/MESOS-3973 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 0.26.0 > Environment: Mac OS X 10.10.5, Clang 7.0.0. >Reporter: Bernd Mathiske >Assignee: Gilbert Song > Labels: build, build-failure, mesosphere > > Non-root 'make distcheck'. > {noformat} > ... > [--] Global test environment tear-down > [==] 826 tests from 113 test cases ran. (276624 ms total) > [ PASSED ] 826 tests. > YOU HAVE 6 DISABLED TESTS > Making install in . > make[3]: Nothing to be done for `install-exec-am'. > ../install-sh -c -d > '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig' > /usr/bin/install -c -m 644 mesos.pc > '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig' > Making install in 3rdparty > /Applications/Xcode.app/Contents/Developer/usr/bin/make install-recursive > Making install in libprocess > Making install in 3rdparty > /Applications/Xcode.app/Contents/Developer/usr/bin/make install-recursive > Making install in stout > Making install in . > make[9]: Nothing to be done for `install-exec-am'. > make[9]: Nothing to be done for `install-data-am'. > Making install in include > make[9]: Nothing to be done for `install-exec-am'. 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d > '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include' > ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d > '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include/stout' > /usr/bin/install -c -m 644 > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/abort.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/attributes.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/base64.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bits.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bytes.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/dynamiclibrary.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/error.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/exit.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/flags.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/foreach.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/format.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/fs.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gtest.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gzip.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashmap.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashset.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/ip.hpp > 
../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/lambda.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/linkedhashmap.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/list.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/mac.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multihashmap.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multimap.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/none.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/nothing.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/numify.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/path.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/preprocessor.hpp >
[jira] [Comment Edited] (MESOS-3975) SSL build of mesos causes flaky testsuite.
[ https://issues.apache.org/jira/browse/MESOS-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026938#comment-15026938 ] Till Toenshoff edited comment on MESOS-3975 at 11/25/15 3:38 PM: - I can still see tests failing using the above Vagrantfile generator on both VMware-Fusion as well as on VirtualBox -- hosted on OSX and Linux. Just ran the test-suite again with a repeat-counter enabled and it stopped on the first {noformat} [ RUN ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem 2015-11-25 15:26:33,873:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:26:37,209:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:26:40,546:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:26:43,883:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client + /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false --operation=make-rslave --path=/ + grep -E /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_FrPTNg/.+ /proc/self/mountinfo + grep -v 722234da-f06d-4c9c-95d9-9be998e69d5c + cut '-d ' -f5 + xargs --no-run-if-empty umount -l Changing root to /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_FrPTNg/provisioner/containers/722234da-f06d-4c9c-95d9-9be998e69d5c/backends/copy/rootfses/928eb0dc-228b-4e9a-80d4-de8fb86ff6ea 2015-11-25 15:26:47,221:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client [ OK ] 
LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem (16903 ms) [ RUN ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutor 2015-11-25 15:26:50,558:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:26:53,894:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client + /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false --operation=make-rslave --path=/ + grep -E /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/.+ /proc/self/mountinfo + grep -v 39ddf64a-d74e-44c9-a237-2d130c95e72d + cut '-d ' -f5 + xargs --no-run-if-empty umount -l + mount -n --rbind /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/provisioner/containers/39ddf64a-d74e-44c9-a237-2d130c95e72d/backends/copy/rootfses/4eac79ca-c89f-4a1d-b190-9e11cb43ca15 /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/slaves/f3615745-e347-4ffe-ba44-30cb0c245d76-S0/frameworks/f3615745-e347-4ffe-ba44-30cb0c245d76-/executors/226484c0-8df5-43fd-a62f-39b3b7bc4824/runs/39ddf64a-d74e-44c9-a237-2d130c95e72d/.rootfs Could not load cert file 2015-11-25 15:26:57,231:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client ../../src/tests/containerizer/filesystem_isolator_tests.cpp:354: Failure Value of: statusRunning.get().state() Actual: TASK_FAILED Expected: TASK_RUNNING 2015-11-25 15:27:00,568:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:27:03,906:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket 
[127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:27:07,243:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:27:10,580:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:27:13,916:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client ../../src/tests/containerizer/filesystem_isolator_tests.cpp:355: Failure Failed to wait 15secs for statusFinished
[jira] [Commented] (MESOS-4014) Introduce remove endpoint for quota
[ https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026965#comment-15026965 ]

Alexander Rukletsov commented on MESOS-4014:

https://reviews.apache.org/r/40580/

> Introduce remove endpoint for quota
> ---
>
> Key: MESOS-4014
> URL: https://issues.apache.org/jira/browse/MESOS-4014
> Project: Mesos
> Issue Type: Task
> Components: master
> Reporter: Alexander Rukletsov
> Assignee: Joerg Schad
> Labels: mesosphere
>
> This endpoint is for removing quotas via the DELETE method.
[jira] [Commented] (MESOS-3916) MasterMaintenanceTest.InverseOffersFilters is flaky
[ https://issues.apache.org/jira/browse/MESOS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027007#comment-15027007 ] Bernd Mathiske commented on MESOS-3916: --- Thank you! Will update the target release. > MasterMaintenanceTest.InverseOffersFilters is flaky > --- > > Key: MESOS-3916 > URL: https://issues.apache.org/jira/browse/MESOS-3916 > Project: Mesos > Issue Type: Bug > Environment: Ubuntu Wily 64 bit >Reporter: Neil Conway >Assignee: Neil Conway > Labels: flaky-test, maintenance, mesosphere > Attachments: wily_maintenance_test_verbose.txt > > > Verbose Logs: > {code} > [ RUN ] MasterMaintenanceTest.InverseOffersFilters > I1113 16:43:58.486469 8728 leveldb.cpp:176] Opened db in 2.360405ms > I1113 16:43:58.486935 8728 leveldb.cpp:183] Compacted db in 407105ns > I1113 16:43:58.486995 8728 leveldb.cpp:198] Created db iterator in 16221ns > I1113 16:43:58.487030 8728 leveldb.cpp:204] Seeked to beginning of db in > 10935ns > I1113 16:43:58.487046 8728 leveldb.cpp:273] Iterated through 0 keys in the > db in 999ns > I1113 16:43:58.487090 8728 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1113 16:43:58.487735 8747 recover.cpp:449] Starting replica recovery > I1113 16:43:58.488047 8747 recover.cpp:475] Replica is in EMPTY status > I1113 16:43:58.488977 8745 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (58)@10.0.2.15:45384 > I1113 16:43:58.489452 8746 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1113 16:43:58.489712 8747 recover.cpp:566] Updating replica status to > STARTING > I1113 16:43:58.490706 8742 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 745443ns > I1113 16:43:58.490739 8742 replica.cpp:323] Persisted replica status to > STARTING > I1113 16:43:58.490859 8742 recover.cpp:475] Replica is in STARTING status > I1113 16:43:58.491786 8747 replica.cpp:676] Replica in STARTING status > received 
a broadcasted recover request from (59)@10.0.2.15:45384 > I1113 16:43:58.492542 8749 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1113 16:43:58.493221 8743 recover.cpp:566] Updating replica status to VOTING > I1113 16:43:58.493710 8743 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 331874ns > I1113 16:43:58.493767 8743 replica.cpp:323] Persisted replica status to > VOTING > I1113 16:43:58.493868 8743 recover.cpp:580] Successfully joined the Paxos > group > I1113 16:43:58.494119 8743 recover.cpp:464] Recover process terminated > I1113 16:43:58.504369 8749 master.cpp:367] Master > d59449fc-5462-43c5-b935-e05563fdd4b6 (vagrant-ubuntu-wily-64) started on > 10.0.2.15:45384 > I1113 16:43:58.504438 8749 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="false" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/ZB7csS/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --initialize_driver_logging="true" > --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" > --max_slave_ping_timeouts="5" --quiet="false" > --recovery_slave_removal_limit="100%" --registry="replicated_log" > --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" > --registry_strict="true" --root_submissions="true" > --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" > --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/ZB7csS/master" > --zk_session_timeout="10secs" > I1113 16:43:58.504717 8749 master.cpp:416] Master allowing unauthenticated > frameworks to register > I1113 16:43:58.504889 8749 master.cpp:419] Master only allowing > authenticated slaves to register > I1113 16:43:58.504922 8749 credentials.hpp:37] Loading credentials for > authentication from '/tmp/ZB7csS/credentials' > I1113 16:43:58.505497 8749 
master.cpp:458] Using default 'crammd5' > authenticator > I1113 16:43:58.505759 8749 master.cpp:495] Authorization enabled > I1113 16:43:58.507638 8746 master.cpp:1606] The newly elected leader is > master@10.0.2.15:45384 with id d59449fc-5462-43c5-b935-e05563fdd4b6 > I1113 16:43:58.507693 8746 master.cpp:1619] Elected as the leading master! > I1113 16:43:58.507720 8746 master.cpp:1379] Recovering from registrar > I1113 16:43:58.507946 8749 registrar.cpp:309] Recovering registrar > I1113 16:43:58.508561 8749 log.cpp:661] Attempting to start the writer > I1113 16:43:58.510282 8747 replica.cpp:496] Replica received implicit > promise request from (60)@10.0.2.15:45384 with proposal 1 > I1113 16:43:58.510867 8747
[jira] [Updated] (MESOS-3975) SSL build of mesos causes flaky testsuite.
[ https://issues.apache.org/jira/browse/MESOS-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-3975: Assignee: Joseph Wu (was: Joris Van Remoortere) > SSL build of mesos causes flaky testsuite. > -- > > Key: MESOS-3975 > URL: https://issues.apache.org/jira/browse/MESOS-3975 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.26.0 > Environment: CentOS 7.1, Kernel 3.10.0-229.20.1.el7.x86_64, gcc > 4.8.3, Docker 1.9 >Reporter: Till Toenshoff >Assignee: Joseph Wu > Labels: mesosphere > > When running the tests of an SSL build of Mesos on CentOS 7.1, I see spurious > test failures that are, so far, not reproducible. > The following tests did fail for me in complete runs but did seem fine when > running them individually, in repetition. > {noformat} > DockerTest.ROOT_DOCKER_CheckPortResource > {noformat} > {noformat} > ContainerizerTest.ROOT_CGROUPS_BalloonFramework > {noformat} > {noformat} > [ RUN ] > LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutor > 2015-11-20 > 19:08:38,826:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > + /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false > --operation=make-rslave --path=/ > + grep -E > /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/.+ > /proc/self/mountinfo > + grep -v 2b98025c-74f1-41d2-b35a-ce2cdfae347e > + cut '-d ' -f5 > + xargs --no-run-if-empty umount -l > + mount -n --rbind > /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/provisioner/containers/2b98025c-74f1-41d2-b35a-ce2cdfae347e/backends/copy/rootfses/bed11080-474b-4c69-8e7f-0ab85e895b0d > > 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/slaves/830e842e-c36a-4e4c-bff4-5b9568d7df12-S0/frameworks/830e842e-c36a-4e4c-bff4-5b9568d7df12-/executors/c735be54-c47f-4645-bfc1-2f4647e2cddb/runs/2b98025c-74f1-41d2-b35a-ce2cdfae347e/.rootfs > Could not load cert file > ../../src/tests/containerizer/filesystem_isolator_tests.cpp:354: Failure > Value of: statusRunning.get().state() > Actual: TASK_FAILED > Expected: TASK_RUNNING > 2015-11-20 > 19:08:42,164:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > 2015-11-20 > 19:08:45,501:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > 2015-11-20 > 19:08:48,837:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > 2015-11-20 > 19:08:52,174:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > ../../src/tests/containerizer/filesystem_isolator_tests.cpp:355: Failure > Failed to wait 15secs for statusFinished > ../../src/tests/containerizer/filesystem_isolator_tests.cpp:349: Failure > Actual function call count doesn't match EXPECT_CALL(sched, > statusUpdate(, _))... 
> Expected: to be called twice >Actual: called once - unsatisfied and active > 2015-11-20 > 19:08:55,511:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: > Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server > refused to accept the client > *** Aborted at 1448046536 (unix time) try "date -d @1448046536" if you are > using GNU date *** > PC: @0x0 (unknown) > *** SIGSEGV (@0x0) received by PID 21380 (TID 0x7fa1549e68c0) from PID 0; > stack trace: *** > @ 0x7fa141796fbb (unknown) > @ 0x7fa14179b341 (unknown) > @ 0x7fa14f096130 (unknown) > {noformat} > Vagrantfile generator: > {noformat} > cat << EOF > Vagrantfile > # -*- mode: ruby -*-" > > # vi: set ft=ruby : > Vagrant.configure(2) do |config| > # Disable shared folder to prevent certain kernel module dependencies. > config.vm.synced_folder ".", "/vagrant", disabled: true > config.vm.hostname = "centos71" > config.vm.box = "bento/centos-7.1" > config.vm.provider "virtualbox" do |vb| > vb.memory = 16384 > vb.cpus = 8 > end > config.vm.provider "vmware_fusion" do |vb| > vb.memory = 9216 > vb.cpus = 4 > end > config.vm.provision "shell", inline: <<-SHELL > sudo yum -y update systemd > sudo yum install -y tar wget > sudo wget >
[jira] [Assigned] (MESOS-2948) Generalize authorizer interface in order to allow for arbitrary Subjects, Actions and Objects
[ https://issues.apache.org/jira/browse/MESOS-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marco Massenzio reassigned MESOS-2948:
--
Assignee: Marco Massenzio

> Generalize authorizer interface in order to allow for arbitrary Subjects,
> Actions and Objects
> -
>
> Key: MESOS-2948
> URL: https://issues.apache.org/jira/browse/MESOS-2948
> Project: Mesos
> Issue Type: Epic
> Components: master, security
> Reporter: Alexander Rojas
> Assignee: Marco Massenzio
> Labels: acl, mesosphere, security
>
> The current
> [{{mesos::Authorizer}}|https://github.com/apache/mesos/blob/40b596402521be25b93b9ef4edd8f5c727c9d20e/src/authorizer/authorizer.hpp]
> API has one method for each of the supported _actions_ (Register Framework,
> Launch Task and Shutdown Framework), and each of these _actions_ itself
> defines the _objects_ on which it operates.
> Currently, when a new action needs to be authorized, it is necessary to
> modify the {{mesos::Authorizer}} interface and all its implementations
> (currently only {{mesos::LocalAuthorizer}}), and to add a new nested message
> to the {{ACL}} message in {{mesos.proto}}.
> An update to the API should allow new _actions_ and _objects_ to be added
> without changing the {{mesos::Authorizer}} interface, while encapsulating
> the implementation details of how authorization is performed.
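One way to picture the generalization described in MESOS-2948 is a single authorization entry point that takes a (subject, action, object) triple, so that a new action becomes data instead of a new method on the interface. The following is a minimal sketch under that assumption only. `Request`, `Authorizer`, and `TableAuthorizer` are hypothetical names, not the Mesos API, and the call is synchronous for brevity.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>

// Hypothetical request type: who wants to do what to which object.
struct Request
{
  std::string subject;  // e.g. a principal
  std::string action;   // e.g. "launch_task"
  std::string object;   // e.g. a role or user the action applies to
};

// Hypothetical generalized interface: one method for every action, so
// adding an action does not require changing the interface.
class Authorizer
{
public:
  virtual ~Authorizer() = default;
  virtual bool authorized(const Request& request) const = 0;
};

// Hypothetical implementation backed by an explicit allow-list. How the
// decision is made stays hidden behind the interface.
class TableAuthorizer : public Authorizer
{
public:
  void allow(
      const std::string& subject,
      const std::string& action,
      const std::string& object)
  {
    rules.insert(std::make_tuple(subject, action, object));
  }

  bool authorized(const Request& request) const override
  {
    return rules.count(std::make_tuple(
        request.subject, request.action, request.object)) > 0;
  }

private:
  std::set<std::tuple<std::string, std::string, std::string>> rules;
};
```

In this sketch a new action such as `"set_quota"` is just another string passed to `allow`, so neither `Authorizer` nor its callers need to change when one is introduced.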
[jira] [Assigned] (MESOS-2297) Add authentication support for HTTP API
[ https://issues.apache.org/jira/browse/MESOS-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marco Massenzio reassigned MESOS-2297:
--
Assignee: Marco Massenzio (was: Alexander Rojas)

> Add authentication support for HTTP API
> ---
>
> Key: MESOS-2297
> URL: https://issues.apache.org/jira/browse/MESOS-2297
> Project: Mesos
> Issue Type: Epic
> Reporter: Vinod Kone
> Assignee: Marco Massenzio
> Labels: mesosphere, security
>
> Since most of the communication between Mesos components will happen over
> HTTP with the arrival of the [HTTP
> API|https://issues.apache.org/jira/browse/MESOS-2288], it makes sense to use
> standard HTTP mechanisms to authenticate this communication.
[jira] [Commented] (MESOS-3975) SSL build of mesos causes flaky testsuite.
[ https://issues.apache.org/jira/browse/MESOS-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026938#comment-15026938 ] Till Toenshoff commented on MESOS-3975: --- I can still see tests failing using the above Vagrantfile generator on both VMware-Fusion as well as on VirtualBox -- hosted on OSX and Linux. Just ran the test-suite again with a repeat-counter enabled and it stopped on the first iteration: ``` [ RUN ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem 2015-11-25 15:26:33,873:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:26:37,209:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:26:40,546:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:26:43,883:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client + /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false --operation=make-rslave --path=/ + grep -E /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_FrPTNg/.+ /proc/self/mountinfo + grep -v 722234da-f06d-4c9c-95d9-9be998e69d5c + cut '-d ' -f5 + xargs --no-run-if-empty umount -l Changing root to /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_FrPTNg/provisioner/containers/722234da-f06d-4c9c-95d9-9be998e69d5c/backends/copy/rootfses/928eb0dc-228b-4e9a-80d4-de8fb86ff6ea 2015-11-25 15:26:47,221:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client [ OK ] 
LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem (16903 ms) [ RUN ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutor 2015-11-25 15:26:50,558:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:26:53,894:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client + /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false --operation=make-rslave --path=/ + grep -E /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/.+ /proc/self/mountinfo + grep -v 39ddf64a-d74e-44c9-a237-2d130c95e72d + cut '-d ' -f5 + xargs --no-run-if-empty umount -l + mount -n --rbind /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/provisioner/containers/39ddf64a-d74e-44c9-a237-2d130c95e72d/backends/copy/rootfses/4eac79ca-c89f-4a1d-b190-9e11cb43ca15 /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_nEk9PC/slaves/f3615745-e347-4ffe-ba44-30cb0c245d76-S0/frameworks/f3615745-e347-4ffe-ba44-30cb0c245d76-/executors/226484c0-8df5-43fd-a62f-39b3b7bc4824/runs/39ddf64a-d74e-44c9-a237-2d130c95e72d/.rootfs Could not load cert file 2015-11-25 15:26:57,231:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client ../../src/tests/containerizer/filesystem_isolator_tests.cpp:354: Failure Value of: statusRunning.get().state() Actual: TASK_FAILED Expected: TASK_RUNNING 2015-11-25 15:27:00,568:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:27:03,906:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket 
[127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:27:07,243:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:27:10,580:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client 2015-11-25 15:27:13,916:22205(0x7f3435870700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:37707] zk retcode=-4, errno=111(Connection refused): server refused to accept the client ../../src/tests/containerizer/filesystem_isolator_tests.cpp:355: Failure Failed to wait 15secs for statusFinished ../../src/tests/containerizer/filesystem_isolator_tests.cpp:349: Failure Actual function call count doesn't
[jira] [Updated] (MESOS-4013) Introduce status endpoint for quota
[ https://issues.apache.org/jira/browse/MESOS-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-4013: --- Description: This endpoint is for querying quota status via the GET method. (was: The endpoint should provide quota status.) > Introduce status endpoint for quota > --- > > Key: MESOS-4013 > URL: https://issues.apache.org/jira/browse/MESOS-4013 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Alexander Rukletsov >Assignee: Joerg Schad > Labels: mesosphere > > This endpoint is for querying quota status via the GET method.
[jira] [Comment Edited] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027745#comment-15027745 ] Till Toenshoff edited comment on MESOS-3937 at 11/25/15 10:54 PM: -- I also tried a different image first -- most other images seem to not have this issue as they use a different approach for binding the hostname towards an IP. {noformat} $ hostname -f vagrant.vm {noformat} {noformat} $ cat /etc/hosts 127.0.0.1 localhost 127.0.1.1 vagrant.vm vagrant # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback ff02::1 ip6-allnodes ff02::2 ip6-allrouters {noformat} The above is produced by a {{bento/ubuntu-14.04}} base image. was (Author: tillt): I also tried a different image first -- most other images seem to not have this issue as they use a different approach for binding the hostname towards an IP. {noformat} $ hostname -f vagrant.vm {noformat} {noformat} $ cat /etc/hosts 127.0.0.1 localhost 127.0.1.1 vagrant.vm vagrant # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback ff02::1 ip6-allnodes ff02::2 ip6-allrouters {noformat} > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Till Toenshoff > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" 
--log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" > --zk_session_timeout="10secs" > I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing > authenticated frameworks to register > I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing > authenticated slaves to register > I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials
[jira] [Assigned] (MESOS-4002) ReservationEndpointsTest.UnreserveAvailableAndOfferedResources is flaky
[ https://issues.apache.org/jira/browse/MESOS-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park reassigned MESOS-4002: --- Assignee: Michael Park (was: Anand Mazumdar) > ReservationEndpointsTest.UnreserveAvailableAndOfferedResources is flaky > --- > > Key: MESOS-4002 > URL: https://issues.apache.org/jira/browse/MESOS-4002 > Project: Mesos > Issue Type: Bug >Reporter: Anand Mazumdar >Assignee: Michael Park > Labels: flaky-test, mesosphere, reservations > > Showed up on ASF CI: ( test kept looping on and on and ultimately failing the > build after 300 minutes ) > https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose,OS=ubuntu%3A14.04,label_exp=docker%7C%7CHadoop/1269/changes > {code} > [ RUN ] ReservationEndpointsTest.UnreserveAvailableAndOfferedResources > I1124 01:07:20.050729 30260 leveldb.cpp:174] Opened db in 107.434842ms > I1124 01:07:20.099630 30260 leveldb.cpp:181] Compacted db in 48.82312ms > I1124 01:07:20.099722 30260 leveldb.cpp:196] Created db iterator in 29905ns > I1124 01:07:20.099738 30260 leveldb.cpp:202] Seeked to beginning of db in > 3145ns > I1124 01:07:20.099750 30260 leveldb.cpp:271] Iterated through 0 keys in the > db in 279ns > I1124 01:07:20.099804 30260 replica.cpp:778] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1124 01:07:20.100637 30292 recover.cpp:447] Starting replica recovery > I1124 01:07:20.100934 30292 recover.cpp:473] Replica is in EMPTY status > I1124 01:07:20.103240 30288 replica.cpp:674] Replica in EMPTY status received > a broadcasted recover request from (6305)@172.17.18.107:37993 > I1124 01:07:20.103672 30292 recover.cpp:193] Received a recover response from > a replica in EMPTY status > I1124 01:07:20.104142 30292 recover.cpp:564] Updating replica status to > STARTING > I1124 01:07:20.114534 30284 master.cpp:365] Master > ad27bc60-16d1-4239-9a65-235a991f9600 (9f2f81738d5e) started on > 172.17.18.107:37993 > I1124 01:07:20.114558 30284 
master.cpp:367] Flags at startup: --acls="" > --allocation_interval="1000secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/I60I5f/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" --roles="role" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/mesos/mesos-0.26.0/_inst/share/mesos/webui" > --work_dir="/tmp/I60I5f/master" --zk_session_timeout="10secs" > I1124 01:07:20.114809 30284 master.cpp:412] Master only allowing > authenticated frameworks to register > I1124 01:07:20.114820 30284 master.cpp:417] Master only allowing > authenticated slaves to register > I1124 01:07:20.114825 30284 credentials.hpp:35] Loading credentials for > authentication from '/tmp/I60I5f/credentials' > I1124 01:07:20.115067 30284 master.cpp:456] Using default 'crammd5' > authenticator > I1124 01:07:20.115320 30284 master.cpp:493] Authorization enabled > I1124 01:07:20.115792 30285 hierarchical.cpp:162] Initialized hierarchical > allocator process > I1124 01:07:20.115855 30285 whitelist_watcher.cpp:77] No whitelist given > I1124 01:07:20.118755 30285 master.cpp:1625] The newly elected leader is > master@172.17.18.107:37993 with id ad27bc60-16d1-4239-9a65-235a991f9600 > I1124 01:07:20.118788 30285 master.cpp:1638] Elected as the leading master! 
> I1124 01:07:20.118809 30285 master.cpp:1383] Recovering from registrar > I1124 01:07:20.119078 30285 registrar.cpp:307] Recovering registrar > I1124 01:07:20.143256 30292 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 38.787419ms > I1124 01:07:20.143347 30292 replica.cpp:321] Persisted replica status to > STARTING > I1124 01:07:20.143717 30292 recover.cpp:473] Replica is in STARTING status > I1124 01:07:20.145454 30286 replica.cpp:674] Replica in STARTING status > received a broadcasted recover request from (6307)@172.17.18.107:37993 > I1124 01:07:20.145979 30292 recover.cpp:193] Received a recover response from > a replica in STARTING status > I1124 01:07:20.146654 30292 recover.cpp:564] Updating replica status to VOTING > I1124 01:07:20.182672 30286 leveldb.cpp:304] Persisting metadata (8 bytes) to > leveldb took 35.422256ms > I1124 01:07:20.182747 30286 replica.cpp:321] Persisted replica status to > VOTING > I1124
[jira] [Comment Edited] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027745#comment-15027745 ] Till Toenshoff edited comment on MESOS-3937 at 11/25/15 10:53 PM: -- I also tried a different image first -- most other images seem to not have this issue as they use a different approach for binding the hostname towards an IP. {noformat} $ hostname -f vagrant.vm {noformat} {noformat} $ cat /etc/hosts 127.0.0.1 localhost 127.0.1.1 vagrant.vm vagrant # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback ff02::1 ip6-allnodes ff02::2 ip6-allrouters {noformat} was (Author: tillt): I also tried a different image first -- most other images seem to not have this issue as they use a different approach for binding the hostname towards an IP. {{/etc/hosts}} from {{bento/ubuntu-14.04}} {noformat} $ hostname -f vagrant.vm {noformat} {noformat} $ cat /etc/hosts 127.0.0.1 localhost 127.0.1.1 vagrant.vm vagrant # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback ff02::1 ip6-allnodes ff02::2 ip6-allrouters {noformat} > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Till Toenshoff > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" 
--log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" > --zk_session_timeout="10secs" > I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing > authenticated frameworks to register > I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing > authenticated slaves to register > I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for >
[jira] [Commented] (MESOS-4000) Implicit roles: Design Doc
[ https://issues.apache.org/jira/browse/MESOS-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027784#comment-15027784 ] Neil Conway commented on MESOS-4000: The design doc for implicit roles can be found here: https://docs.google.com/document/d/1SCFfrBd4edSY3bVCMrNJYMxIVllD0bHJuGmgG-4vCXA/edit?usp=sharing > Implicit roles: Design Doc > -- > > Key: MESOS-4000 > URL: https://issues.apache.org/jira/browse/MESOS-4000 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Neil Conway >
[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027922#comment-15027922 ] Till Toenshoff commented on MESOS-3937: --- Boils down to this line being triggered within the docker image of this test: https://github.com/mesos/mesos-go/blob/068d5470506e3780189fe607af40892814197c5e/mesosutil/node.go#L18 > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Till Toenshoff > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" 
--log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" > --zk_session_timeout="10secs" > I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing > authenticated frameworks to register > I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing > authenticated slaves to register > I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for > authentication from '/tmp/40AlT8/credentials' > I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING > I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' > authenticator > I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.075466ms > I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to > VOTING > I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos > group > I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated > I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL > I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled > I1117 15:08:09.296018 26399 master.cpp:1606] The
[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027271#comment-15027271 ] Marco Massenzio commented on MESOS-3937:
So I read the thread and, honestly, it looks like we're making all this song and dance to make a test pass? Who cares? The question, with a failing test, is always the same: {quote} Is the test buggy, or are we uncovering a genuine issue in the code? {quote}
It seems to me that this test does not identify an issue in the code; at best, it has highlighted a combination of Ubuntu / Kernel / Docker versions/configurations that *may* cause an Executor launched inside a Docker container to fail (and, even there, I'm not so sure).
Also, please let's remind ourselves that tests are useful so that, when introducing code changes, refactorings, or new features, we can be assured that we haven't broken something that was working before: I'm not even sure this test achieves that? (this may be a harsh statement born out of my ignorance - please, correct me if I'm wrong on this one).
Here is my suggestion as to how to solve this issue:
- short-term: we disable this test and remove it as a {{0.26}} blocker (it doesn't seem to me that the failure highlights a regression in the code - again, correct me if I'm wrong);
- short-term: document the issue and possible workarounds for folks who may need to run Docker executors on Ubuntu;
- medium-term: if possible at all, let's find ways to identify in the test the conditions under which it's supposed to pass; if they are met on the given platform, the test is run; if not, a warning is emitted, but no failure (or something similar);
- long-run: decide whether to keep the test (modified, possibly) and / or discard it.
What do people think?
> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. 
> --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Timothy Chen > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. > [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING 
status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs"
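The medium-term suggestion in the comment above (run the test only where its preconditions hold; otherwise warn instead of failing) could be sketched roughly as follows. This is only an illustration in Python, not Mesos code: the Mesos tests are C++ and this does not use their test framework. The decorator name `requires` and its arguments are hypothetical, and what the actual preconditions for this test would be is not established in the thread.

```python
import functools
import unittest
import warnings


def requires(precondition, reason):
    """Run the decorated test only if precondition() returns true.

    Hypothetical helper: when the precondition does not hold, a warning
    is emitted and the test is reported as skipped rather than failed.
    """
    def decorate(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            if not precondition():
                # Make the skip visible in the output, then skip.
                warnings.warn("skipping %s: %s" % (test_func.__name__, reason))
                raise unittest.SkipTest(reason)
            return test_func(*args, **kwargs)
        return wrapper
    return decorate
```

Inside a `unittest.TestCase`, a method decorated this way shows up as skipped (with the given reason) on platforms where the precondition is false, and runs normally everywhere else.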
[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027735#comment-15027735 ] Till Toenshoff commented on MESOS-3937: --- When using the above image, the following is true for me: The test breaks as described by Bernd. {noformat} $ hostname -f vagrant-ubuntu-trusty-64 {noformat} {noformat} $ ifconfig docker0 Link encap:Ethernet HWaddr 56:84:7a:fe:97:99 inet addr:172.17.42.1 Bcast:0.0.0.0 Mask:255.255.0.0 UP BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) eth0 Link encap:Ethernet HWaddr 08:00:27:70:2a:9d inet addr:10.0.2.15 Bcast:10.0.2.255 Mask:255.255.255.0 inet6 addr: fe80::a00:27ff:fe70:2a9d/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:246548 errors:0 dropped:0 overruns:0 frame:0 TX packets:65399 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:298078841 (298.0 MB) TX bytes:6093076 (6.0 MB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:48338 errors:0 dropped:0 overruns:0 frame:0 TX packets:48338 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:5936578 (5.9 MB) TX bytes:5936578 (5.9 MB) {noformat} {noformat} $ ping vagrant-ubuntu-trusty-64 PING vagrant-ubuntu-trusty-64 (10.0.2.15) 56(84) bytes of data. 64 bytes from vagrant-ubuntu-trusty-64 (10.0.2.15): icmp_seq=1 ttl=64 time=0.026 ms {noformat} So apparently, the hostname resolves to a valid, non-loopback IP (the one used by eth0). 
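The manual check above (does the hostname resolve to a loopback address or to a routable one?) can also be done programmatically. The following is a minimal sketch in Python using only the standard-library `ipaddress` and `socket` modules; it is not taken from Mesos or mesos-go, and the helper names `is_loopback` and `resolve_ipv4` are made up for illustration.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if the textual IP address lies in a loopback range.

    For IPv4 this covers all of 127.0.0.0/8, so 127.0.1.1 counts too.
    """
    return ipaddress.ip_address(address).is_loopback


def resolve_ipv4(hostname):
    """Resolve hostname to a sorted list of IPv4 addresses.

    Uses the system resolver, so /etc/hosts entries are honoured in the
    usual lookup order. Raises socket.gaierror if resolution fails.
    """
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})
```

Calling `[is_loopback(a) for a in resolve_ipv4(socket.gethostname())]` on the machine under test would then show whether any address the hostname resolves to is a loopback one, which is the property checked by hand with `hostname -f` and `ping`.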
{noformat} $ cat /etc/hosts 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters ff02::3 ip6-allhosts {noformat} Adding this hostname to {{/etc/hosts}} does fix this test, but why would I need to add it? > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Till Toenshoff > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs"
[jira] [Issue Comment Deleted] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-3937: -- Comment: was deleted (was: Boils down to this line being triggered within the docker image of this test: https://github.com/mesos/mesos-go/blob/068d5470506e3780189fe607af40892814197c5e/mesosutil/node.go#L18) > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Till Toenshoff > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" 
--log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" > --zk_session_timeout="10secs" > I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing > authenticated frameworks to register > I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing > authenticated slaves to register > I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for > authentication from '/tmp/40AlT8/credentials' > I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING > I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' > authenticator > I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.075466ms > I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to > VOTING > I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos > group > I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated > I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL > I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled > I1117 15:08:09.296018 26399 master.cpp:1606] The newly
[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027927#comment-15027927 ] Till Toenshoff commented on MESOS-3937: --- Boils down to this line being triggered within the docker image of this test: https://github.com/mesos/mesos-go/blob/068d5470506e3780189fe607af40892814197c5e/mesosutil/node.go#L18 > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Till Toenshoff > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
[jira] [Updated] (MESOS-4000) Implicit roles: Design Doc
[ https://issues.apache.org/jira/browse/MESOS-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4000: --- Labels: mesosphere roles (was: ) > Implicit roles: Design Doc > -- > > Key: MESOS-4000 > URL: https://issues.apache.org/jira/browse/MESOS-4000 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Neil Conway > Labels: mesosphere, roles > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4015) Expose task / executor health in master & slave state.json
Sargun Dhillon created MESOS-4015: - Summary: Expose task / executor health in master & slave state.json Key: MESOS-4015 URL: https://issues.apache.org/jira/browse/MESOS-4015 Project: Mesos Issue Type: Improvement Affects Versions: 0.25.0 Reporter: Sargun Dhillon Priority: Trivial Right now, if I specify a healthcheck for a task, the only way to get to it is via the Task Status updates that come to the framework. Unfortunately, this information isn't exposed in the state.json either in the slave or master. It'd be ideal to have that information to enable tools like Mesos-DNS to be health-aware. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027745#comment-15027745 ] Till Toenshoff commented on MESOS-3937: --- I also tried a different image first -- most other images seem to not have this issue as they use a different approach for binding the hostname towards an IP. {{/etc/hosts}} from {{bento/ubuntu-14.04}} {noformat} $ hostname -f vagrant.vm {noformat} {noformat} $ cat /etc/hosts 127.0.0.1 localhost 127.0.1.1 vagrant.vm vagrant # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback ff02::1 ip6-allnodes ff02::2 ip6-allrouters {noformat} > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Till Toenshoff > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
[jira] [Created] (MESOS-4016) Agent allows creation of persistent volume with absolute container_path
Greg Mann created MESOS-4016: Summary: Agent allows creation of persistent volume with absolute container_path Key: MESOS-4016 URL: https://issues.apache.org/jira/browse/MESOS-4016 Project: Mesos Issue Type: Bug Affects Versions: 0.25.0 Reporter: Greg Mann When creating persistent volumes, [~gabriel.hartm...@gmail.com] saw that he could specify an absolute {{container_path}} in the {{CREATE}} operation and his framework would receive a subsequent offer containing that volume, indicating a successful operation. However, the directory was not found on the agent. Such an operation should indeed fail: {{/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp}} enforces that when an absolute {{container_path}} is specified, the directory must already exist, and in this case it did not. The {{CREATE}} operation should not appear to succeed if an invalid {{container_path}} is provided. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
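One way to make the {{CREATE}} operation fail early is to validate {{container_path}} before the volume is accepted. The following is only a minimal sketch under stated assumptions: the helper name validateContainerPath and its error strings are hypothetical and not taken from the Mesos source, and it assumes absolute paths are rejected outright because the directory on the agent cannot be checked at validation time.

```cpp
#include <string>

// Hypothetical validation helper; the name and the error strings are
// illustrative and are not taken from the Mesos source.
// Returns an empty string when `containerPath` is acceptable, otherwise a
// human-readable reason for rejecting the CREATE operation.
inline std::string validateContainerPath(const std::string& containerPath)
{
  if (containerPath.empty()) {
    return "container_path must not be empty";
  }

  // Assumption: absolute paths are rejected outright, because the validator
  // cannot verify that the directory already exists on the agent.
  if (containerPath.front() == '/') {
    return "container_path '" + containerPath + "' must be relative";
  }

  return "";
}
```

A caller would reject the operation whenever the returned string is non-empty and pass that string back to the framework as the failure message.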
[jira] [Created] (MESOS-4017) Executor killed when multiple persistent volumes specify same container_path
Greg Mann created MESOS-4017: Summary: Executor killed when multiple persistent volumes specify same container_path Key: MESOS-4017 URL: https://issues.apache.org/jira/browse/MESOS-4017 Project: Mesos Issue Type: Bug Affects Versions: 0.25.0 Reporter: Greg Mann [~gabriel.hartm...@gmail.com] recently noticed that his custom executor was getting killed by the master when multiple tasks attempted to use persistent volumes with the same {{container_path}}. A {{CREATE}} operation that created two persistent volumes with the same {{container_path}} succeeded, and a subsequent offer included those persistent volumes. Tasks that used these volumes were then launched on a single executor, and at that point the master killed the executor. Better behavior might be for the first task to launch successfully and for the second task to fail with {{TASK_FAILED}} and an appropriate reason and message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
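The conflict described above could be detected before any task is launched by scanning the volumes for a repeated {{container_path}}. This is a rough sketch, not Mesos code: the PersistentVolume struct and the findDuplicateContainerPath function are hypothetical stand-ins invented for illustration.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a persistent volume; the real Mesos resource
// type carries more fields than these two.
struct PersistentVolume
{
  std::string persistenceId;
  std::string containerPath;
};

// Returns the first container path claimed by more than one volume, or an
// empty string if all container paths are distinct.
inline std::string findDuplicateContainerPath(
    const std::vector<PersistentVolume>& volumes)
{
  std::unordered_set<std::string> seen;

  for (const PersistentVolume& volume : volumes) {
    // insert() reports through `.second` whether the path was newly added.
    if (!seen.insert(volume.containerPath).second) {
      return volume.containerPath;
    }
  }

  return "";
}
```

A non-empty result could be used to fail only the second task with a descriptive message instead of killing the executor.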
[jira] [Commented] (MESOS-4018) Enhance float-point operation in Mesos
[ https://issues.apache.org/jira/browse/MESOS-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028021#comment-15028021 ] Klaus Ma commented on MESOS-4018: - MESOS-3997 will replace floating point with fixed point for resources. > Enhance float-point operation in Mesos > -- > > Key: MESOS-4018 > URL: https://issues.apache.org/jira/browse/MESOS-4018 > Project: Mesos > Issue Type: Epic > Components: stout >Reporter: Klaus Ma >Assignee: Klaus Ma > > Currently there are several defects caused by floating-point equality checks. This > epic adds floating-point operations to {{stout}} for use by other > components. The main operations will be: > 1. {{bool almostEqual(double left, double right)}} for Scalar {{operator==}} > 2. {{CHECK_DOUBLE_EQ(left, right)}} for assertions in components -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request
[ https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028035#comment-15028035 ] Klaus Ma commented on MESOS-3552: - [~marco-mesos], I created an epic (MESOS-4018) for floating-point related issues. The main task of that epic is to build floating-point operations in {{stout}}, e.g. {{almostEqual}} and {{CHECK_DOUBLE_EQ}}. MESOS-1187 will use {{almostEqual}} for the Scalar check; this ticket (MESOS-3552) will use {{CHECK_DOUBLE_EQ}}. Both tickets are sub-tasks of MESOS-4018. Any further comments? > CHECK failure due to floating point precision on reservation request > > > Key: MESOS-3552 > URL: https://issues.apache.org/jira/browse/MESOS-3552 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Mandeep Chadha >Assignee: Mandeep Chadha > Labels: mesosphere, tech-debt > > The result.cpus() == cpus() check is failing because of a ( double == double ) > comparison problem. > Root cause: > The framework requested a 0.1 cpu reservation for the first task. So far so good. > The next Reserve operation led to double arithmetic that produced the following > values: > result.cpus() : 23.9964472863211995 cpus() : 24 > And the check ( result.cpus() == cpus() ) failed. > The double arithmetic caused result.cpus() to be > 23.9964472863211995, and hence ( 23.9964472863211995 > == 24 ) failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4018) Enhance float-point operation in Mesos
[ https://issues.apache.org/jira/browse/MESOS-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Klaus Ma updated MESOS-4018: Description: Currently there are several defects caused by floating-point equality checks. This epic adds floating-point operations to {{stout}} for use by other components. The main operations will be: 1. {{bool almostEqual(double left, double right)}} for Scalar {{operator==}} 2. {{CHECK_DOUBLE_EQ(left, right)}} for assertions in components was: Currently there are several defects caused by floating-point equality checks. This epic adds floating-point operations to {{stout}} for use by other components. The main operations will be: 1.) {{bool almostEqual(double left, double right)}} for Scalar {{operator==}}, 2.) {{CHECK_DOUBLE_EQ(left, right)}} for assertions in components > Enhance float-point operation in Mesos > -- > > Key: MESOS-4018 > URL: https://issues.apache.org/jira/browse/MESOS-4018 > Project: Mesos > Issue Type: Epic > Components: stout >Reporter: Klaus Ma >Assignee: Klaus Ma > > Currently there are several defects caused by floating-point equality checks. This > epic adds floating-point operations to {{stout}} for use by other > components. The main operations will be: > 1. {{bool almostEqual(double left, double right)}} for Scalar {{operator==}} > 2. {{CHECK_DOUBLE_EQ(left, right)}} for assertions in components -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4018) Enhance float-point operation in Mesos
Klaus Ma created MESOS-4018: --- Summary: Enhance float-point operation in Mesos Key: MESOS-4018 URL: https://issues.apache.org/jira/browse/MESOS-4018 Project: Mesos Issue Type: Epic Components: stout Reporter: Klaus Ma Assignee: Klaus Ma Currently there are several defects caused by floating-point equality checks. This epic adds floating-point operations to {{stout}} for use by other components. The main operations will be: 1.) {{bool almostEqual(double left, double right)}} for Scalar {{operator==}}, 2.) {{CHECK_DOUBLE_EQ(left, right)}} for assertions in components -- This message was sent by Atlassian JIRA (v6.3.4#6332)
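To make the proposal concrete, here is one possible shape for the {{almostEqual}} helper named in the epic. It is a sketch only and not the stout implementation: the third parameter, its default value, and the handling of NaN and infinity are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper, not the stout implementation. Returns true when
// `left` and `right` differ by no more than a scaled tolerance.
// The `relativeTolerance` parameter and its default are assumptions.
inline bool almostEqual(
    double left,
    double right,
    double relativeTolerance = 4 * std::numeric_limits<double>::epsilon())
{
  assert(relativeTolerance >= 0.0);

  // Exact equality also covers two infinities of the same sign.
  if (left == right) {
    return true;
  }

  // NaN never compares equal, and an infinity only equals itself.
  if (std::isnan(left) || std::isnan(right) ||
      std::isinf(left) || std::isinf(right)) {
    return false;
  }

  // Scale the tolerance by the larger magnitude, with a floor of 1.0 so
  // that values near zero are compared with an absolute tolerance.
  const double scale =
    std::max(1.0, std::max(std::fabs(left), std::fabs(right)));

  return std::fabs(left - right) <= relativeTolerance * scale;
}
```

With this shape, a Scalar {{operator==}} could delegate to almostEqual(left, right), while callers that need a looser comparison could pass a larger tolerance explicitly.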
[jira] [Updated] (MESOS-3552) CHECK failure due to floating point precision on reservation request
[ https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Klaus Ma updated MESOS-3552: Issue Type: Task (was: Improvement) > CHECK failure due to floating point precision on reservation request > > > Key: MESOS-3552 > URL: https://issues.apache.org/jira/browse/MESOS-3552 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Mandeep Chadha >Assignee: Mandeep Chadha > Labels: mesosphere, tech-debt > > The result.cpus() == cpus() check is failing because of a ( double == double ) > comparison problem. > Root cause: > The framework requested a 0.1 cpu reservation for the first task. So far so good. > The next Reserve operation led to double arithmetic that produced the following > values: > result.cpus() : 23.9964472863211995 cpus() : 24 > And the check ( result.cpus() == cpus() ) failed. > The double arithmetic caused result.cpus() to be > 23.9964472863211995, and hence ( 23.9964472863211995 > == 24 ) failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3990) Unexpected reservation results due to floating point error
[ https://issues.apache.org/jira/browse/MESOS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-3990: --- Labels: mesosphere reservations tech-debt (was: ) Priority: Major (was: Critical) Component/s: master Summary: Unexpected reservation results due to floating point error (was: Doubles Don't Work for Resource Reservation) On reflection, reopening because this is a distinct issue: MESOS-3552 and MESOS-1187 are about a crashing bug, whereas this one is about unexpected user-visible behavior. In the short term, the workaround is for frameworks to compare reserved resources within an epsilon. The long-term fix is to switch to a fixed-point representation (MESOS-3997). > Unexpected reservation results due to floating point error > -- > > Key: MESOS-3990 > URL: https://issues.apache.org/jira/browse/MESOS-3990 > Project: Mesos > Issue Type: Bug > Components: master >Reporter: Gabriel Hartmann > Labels: mesosphere, reservations, tech-debt > > When issuing a RESERVE operation requesting the below, I received a > reservation with the wrong value (6566.4002): > resources { > name: "mem" > type: SCALAR > scalar { > value: 6566.4001 > } > role: "role1" > reservation { > principal: "default-principal" > } > } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
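The long-term direction mentioned above, a fixed-point representation, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the FixedScalar type is hypothetical, and the choice of thousandths as the unit is not taken from MESOS-3997. Note that a value with more than three decimal places is rounded on conversion.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point scalar: the value is stored as a whole number of
// thousandths, so addition, subtraction, and equality are exact integer
// operations. The precision of three decimal places is an assumption.
struct FixedScalar
{
  std::int64_t thousandths;

  // Rounds to the nearest thousandth when converting from a double.
  static FixedScalar fromDouble(double value)
  {
    return FixedScalar{
      static_cast<std::int64_t>(std::llround(value * 1000.0))};
  }

  double toDouble() const
  {
    return static_cast<double>(thousandths) / 1000.0;
  }
};

inline FixedScalar operator+(FixedScalar a, FixedScalar b)
{
  return FixedScalar{a.thousandths + b.thousandths};
}

inline FixedScalar operator-(FixedScalar a, FixedScalar b)
{
  return FixedScalar{a.thousandths - b.thousandths};
}

inline bool operator==(FixedScalar a, FixedScalar b)
{
  return a.thousandths == b.thousandths;
}
```

Because the arithmetic is done on integers, reserving and then unreserving the same amount returns exactly the starting value, which is the property the double-based comparison lacks.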
[jira] [Commented] (MESOS-3916) MasterMaintenanceTest.InverseOffersFilters is flaky
[ https://issues.apache.org/jira/browse/MESOS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027478#comment-15027478 ] Joseph Wu commented on MESOS-3916: -- That's a very odd failure. The first batch of inverse offers are both received by the master: {code:title=inverseOffer2} I1125 10:05:53.152995 29359 master.cpp:3316] Processing DECLINE call for offers: [ 932f7d7b-f2d4-42c7-9391-222c19b9d35b-O3 ] for framework 932f7d7b-f2d4-42c7-9391-222c19b9d35b- (default) {code} Note: This message shows up regardless, since {{Master::GetOffer}} does not search for inverse offers. We might want to silence this incorrect warning. {code:title=inverseOffer1} W1125 10:05:53.155109 29362 master.cpp:2897] ACCEPT call used invalid offers '[ 932f7d7b-f2d4-42c7-9391-222c19b9d35b-O2 ]': Offer 932f7d7b-f2d4-42c7-9391-222c19b9d35b-O2 is no longer valid {code} Somehow, the allocation was not triggered by the subsequent clock advancement in the test. I'm guessing: # The clock was settled while the ACCEPT call was still in flight. # The clock was then advanced before the ACCEPT call reached the master. [This comment seems relevant](https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/process.cpp#L2845-L2856) # The allocation went ahead, meaning the inverse offer had not been filtered to 0 seconds yet. # Clock is paused, so we don't allocate -> test times out. 
> MasterMaintenanceTest.InverseOffersFilters is flaky > --- > > Key: MESOS-3916 > URL: https://issues.apache.org/jira/browse/MESOS-3916 > Project: Mesos > Issue Type: Bug > Environment: Ubuntu Wily 64 bit >Reporter: Neil Conway >Assignee: Neil Conway > Labels: flaky-test, maintenance, mesosphere > Attachments: wily_maintenance_test_verbose.txt > > > Verbose Logs: > {code} > [ RUN ] MasterMaintenanceTest.InverseOffersFilters > I1113 16:43:58.486469 8728 leveldb.cpp:176] Opened db in 2.360405ms > I1113 16:43:58.486935 8728 leveldb.cpp:183] Compacted db in 407105ns > I1113 16:43:58.486995 8728 leveldb.cpp:198] Created db iterator in 16221ns > I1113 16:43:58.487030 8728 leveldb.cpp:204] Seeked to beginning of db in > 10935ns > I1113 16:43:58.487046 8728 leveldb.cpp:273] Iterated through 0 keys in the > db in 999ns > I1113 16:43:58.487090 8728 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1113 16:43:58.487735 8747 recover.cpp:449] Starting replica recovery > I1113 16:43:58.488047 8747 recover.cpp:475] Replica is in EMPTY status > I1113 16:43:58.488977 8745 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (58)@10.0.2.15:45384 > I1113 16:43:58.489452 8746 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1113 16:43:58.489712 8747 recover.cpp:566] Updating replica status to > STARTING > I1113 16:43:58.490706 8742 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 745443ns > I1113 16:43:58.490739 8742 replica.cpp:323] Persisted replica status to > STARTING > I1113 16:43:58.490859 8742 recover.cpp:475] Replica is in STARTING status > I1113 16:43:58.491786 8747 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (59)@10.0.2.15:45384 > I1113 16:43:58.492542 8749 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1113 16:43:58.493221 8743 recover.cpp:566] Updating replica status 
to VOTING > I1113 16:43:58.493710 8743 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 331874ns > I1113 16:43:58.493767 8743 replica.cpp:323] Persisted replica status to > VOTING > I1113 16:43:58.493868 8743 recover.cpp:580] Successfully joined the Paxos > group > I1113 16:43:58.494119 8743 recover.cpp:464] Recover process terminated > I1113 16:43:58.504369 8749 master.cpp:367] Master > d59449fc-5462-43c5-b935-e05563fdd4b6 (vagrant-ubuntu-wily-64) started on > 10.0.2.15:45384 > I1113 16:43:58.504438 8749 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="false" --authenticate_slaves="true" > --authenticators="crammd5" --authorizers="local" > --credentials="/tmp/ZB7csS/credentials" --framework_sorter="drf" > --help="false" --hostname_lookup="true" --initialize_driver_logging="true" > --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" > --max_slave_ping_timeouts="5" --quiet="false" > --recovery_slave_removal_limit="100%" --registry="replicated_log" > --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" > --registry_strict="true" --root_submissions="true" > --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" > --user_sorter="drf" --version="false" >
[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request
[ https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027366#comment-15027366 ] Avinash Sridharan commented on MESOS-3552: -- Instead of using an explicit epsilon check, as has been proposed here, we should use the CHECK_DOUBLE_EQ macro for CPUs. It looks like CPUs are the only resources that are stored as doubles and might run into this double precision error. Something like this might work better: CHECK( result.mem() == mem() && result.disk() == disk() && result.ports() == ports()); CHECK_DOUBLE_EQ(result.cpus().get(), cpus().get()); > CHECK failure due to floating point precision on reservation request > > > Key: MESOS-3552 > URL: https://issues.apache.org/jira/browse/MESOS-3552 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Mandeep Chadha >Assignee: Mandeep Chadha > Labels: mesosphere, tech-debt > > The result.cpus() == cpus() check is failing because of a ( double == double ) > comparison problem. > Root cause: > The framework requested a 0.1 cpu reservation for the first task. So far so good. > The next Reserve operation led to double arithmetic that produced the following > values: > result.cpus() : 23.9964472863211995 cpus() : 24 > And the check ( result.cpus() == cpus() ) failed. > The double arithmetic caused result.cpus() to be > 23.9964472863211995, and hence ( 23.9964472863211995 > == 24 ) failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
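The failure mode behind this ticket can be reproduced in isolation: in IEEE 754 double arithmetic, ten additions of 0.1 do not produce exactly 1.0, so an exact comparison fails while a comparison with a small margin passes. The sketch below is illustrative only; repeatedSum and withinMargin are hypothetical names, and withinMargin is merely similar in spirit to a tolerance-based assertion macro, not its actual implementation.

```cpp
#include <cassert>
#include <cmath>

// Adds `step` to zero `count` times, the way repeated small reservations
// accumulate into a total.
inline double repeatedSum(double step, int count)
{
  assert(count >= 0);

  double total = 0.0;
  for (int i = 0; i < count; ++i) {
    total += step;
  }
  return total;
}

// Hypothetical margin check: true when the two values differ by at most
// `margin`. This is a stand-in for a tolerance-based assertion.
inline bool withinMargin(double left, double right, double margin)
{
  assert(margin >= 0.0);
  return std::fabs(left - right) <= margin;
}
```

Here repeatedSum(0.1, 10) == 1.0 evaluates to false, whereas withinMargin(repeatedSum(0.1, 10), 1.0, 1e-9) evaluates to true, which is why a tolerance-based check is suggested for the CPU comparison.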
[jira] [Assigned] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff reassigned MESOS-3937: - Assignee: Till Toenshoff (was: Timothy Chen) > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Till Toenshoff > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. > [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata 
(8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" > --zk_session_timeout="10secs" > I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing > authenticated frameworks to register > I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing > authenticated slaves to register > I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for > authentication from '/tmp/40AlT8/credentials' > I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING > I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' > 
authenticator > I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.075466ms > I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to > VOTING > I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos > group > I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated > I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL > I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled > I1117 15:08:09.296018 26399 master.cpp:1606] The newly elected leader is > master@10.0.2.15:50088 with id 59c600f1-92ff-4926-9c84-073d9b81f68a > I1117 15:08:09.296115 26399 master.cpp:1619] Elected as the leading
[jira] [Updated] (MESOS-4014) Introduce remove endpoint for quota
[ https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-4014: Sprint: Mesosphere Sprint 23 > Introduce remove endpoint for quota > --- > > Key: MESOS-4014 > URL: https://issues.apache.org/jira/browse/MESOS-4014 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Alexander Rukletsov >Assignee: Joerg Schad > Labels: mesosphere > Fix For: 0.27.0 > > > This endpoint is for removing quotas via the DELETE method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request
[ https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028224#comment-15028224 ] Klaus Ma commented on MESOS-3552: - [~avin...@mesosphere.io], thanks for the reminder :); I saw this message in the history a few days ago. I think there is a gap in the scope of the floating-point work: whether {{bool almostEqual()}} should be included. I think we are on the same page that CHECK_NEAR/CHECK_DOUBLE_EQ should be included. > CHECK failure due to floating point precision on reservation request > > > Key: MESOS-3552 > URL: https://issues.apache.org/jira/browse/MESOS-3552 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Mandeep Chadha >Assignee: Mandeep Chadha > Labels: mesosphere, tech-debt > > The result.cpus() == cpus() check is failing due to a ( double == double ) > comparison problem. > Root Cause : > The framework requested a 0.1 cpu reservation for the first task. So far so good. > The next Reserve operation led to double arithmetic resulting in the following > double values : > results.cpus() : 23.9964472863211995 cpus() : 24 > And the check ( result.cpus() == cpus() ) failed. > The double arithmetic operations caused the results.cpus() value to be : > 23.9964472863211995 and hence ( 23.9964472863211995 > == 24 ) failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
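The comparison problem described in MESOS-3552 can be illustrated with a minimal sketch. The names `almostEqual` (borrowed from the comment above, but with a signature assumed here) and `reserveAndUnreserve` are hypothetical helpers written for this illustration; they are not the Mesos implementation, and the default tolerance of 1e-9 is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper, not the Mesos implementation: returns true when
// `left` and `right` differ by no more than `epsilon`, scaled by the
// larger magnitude (with a floor of 1.0 so that values near zero are
// still compared against an absolute tolerance).
inline bool almostEqual(double left, double right, double epsilon = 1e-9)
{
  assert(epsilon >= 0.0);
  const double scale = std::max({1.0, std::fabs(left), std::fabs(right)});
  return std::fabs(left - right) <= epsilon * scale;
}

// Subtracts `step` from `total` `times` times and then adds it back the
// same number of times, loosely mimicking a series of reserve and
// unreserve operations on a scalar resource such as cpus.
inline double reserveAndUnreserve(double total, double step, int times)
{
  double value = total;
  for (int i = 0; i < times; ++i) {
    value -= step;
  }
  for (int i = 0; i < times; ++i) {
    value += step;
  }
  return value;
}
```

With such a helper, a check along the lines of `almostEqual(result, 24.0)` tolerates the rounding error accumulated by the arithmetic, whereas `result == 24.0` may not. Whether a relative or an absolute tolerance is the better fit for resource arithmetic is a design question the ticket leaves open.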
[jira] [Commented] (MESOS-4012) Update documentation to reflect the addition of installable tests.
[ https://issues.apache.org/jira/browse/MESOS-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026880#comment-15026880 ] Benjamin Bannier commented on MESOS-4012: - In addition to adding information on how a user can check the conformance of a machine, this would also give us the opportunity to cleanly separate what is _needed to build mesos_ from what is _needed to run it_. > Update documentation to reflect the addition of installable tests. > > > Key: MESOS-4012 > URL: https://issues.apache.org/jira/browse/MESOS-4012 > Project: Mesos > Issue Type: Documentation >Reporter: Till Toenshoff > > We may want to add the needed steps for administrators to create and run the > test-suite on anything other than the build machine. > One possible location could be {{docs/getting-started.md}} for validating > the pre-requisites as described in that document. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3974) CgroupsAnyHierarchyMemoryPressureTest tests fail on CentOS 6.7.
[ https://issues.apache.org/jira/browse/MESOS-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-3974: -- Shepherd: Till Toenshoff > CgroupsAnyHierarchyMemoryPressureTest tests fail on CentOS 6.7. > --- > > Key: MESOS-3974 > URL: https://issues.apache.org/jira/browse/MESOS-3974 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.26.0 > Environment: CentOS 6.7, kernel 2.6.32-573.el6.x86_64, gcc 4.8.2, > docker 1.7.1 >Reporter: Till Toenshoff >Assignee: Benjamin Bannier > Labels: mesosphere > > {noformat} > GLOG_v=2 sudo ./bin/mesos-tests.sh > --gtest_filter="CgroupsAnyHierarchyMemoryPressureTest.*" --verbose > {noformat} > {noformat} > WARNING: Logging before InitGoogleLogging() is written to STDERR > I1120 17:40:40.410383 2467 process.cpp:2426] Spawned process > __gc__@127.0.0.1:45300 > I1120 17:40:40.410909 2467 process.cpp:2426] Spawned process > help@127.0.0.1:45300 > I1120 17:40:40.410845 2483 process.cpp:2436] Resuming __gc__@127.0.0.1:45300 > at 2015-11-20 17:40:40.410562048+00:00 > I1120 17:40:40.410970 2467 process.cpp:2426] Spawned process > logging@127.0.0.1:45300 > I1120 17:40:40.410995 2467 process.cpp:2426] Spawned process > profiler@127.0.0.1:45300 > I1120 17:40:40.411015 2482 process.cpp:2436] Resuming help@127.0.0.1:45300 > at 2015-11-20 17:40:40.410989056+00:00 > I1120 17:40:40.411063 2467 process.cpp:2426] Spawned process > system@127.0.0.1:45300 > I1120 17:40:40.411160 2482 process.cpp:2436] Resuming > profiler@127.0.0.1:45300 at 2015-11-20 17:40:40.411155968+00:00 > I1120 17:40:40.411206 2467 process.cpp:2426] Spawned process > __limiter__(1)@127.0.0.1:45300 > I1120 17:40:40.411223 2467 process.cpp:2426] Spawned process > metrics@127.0.0.1:45300 > I1120 17:40:40.411268 2482 process.cpp:2436] Resuming system@127.0.0.1:45300 > at 2015-11-20 17:40:40.411266048+00:00 > I1120 17:40:40.411378 2483 process.cpp:2436] Resuming > __limiter__(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.411374080+00:00 > 
I1120 17:40:40.411388 2467 process.cpp:2426] Spawned process > __processes__@127.0.0.1:45300 > I1120 17:40:40.411399 2483 process.cpp:2436] Resuming > __processes__@127.0.0.1:45300 at 2015-11-20 17:40:40.411397888+00:00 > I1120 17:40:40.411402 2467 process.cpp:965] libprocess is initialized on > 127.0.0.1:45300 for 8 cpus > I1120 17:40:40.411415 2488 process.cpp:2436] Resuming help@127.0.0.1:45300 > at 2015-11-20 17:40:40.411397888+00:00 > I1120 17:40:40.411432 2467 logging.cpp:177] Logging to STDERR > I1120 17:40:40.411384 2482 process.cpp:2436] Resuming > metrics@127.0.0.1:45300 at 2015-11-20 17:40:40.411379200+00:00 > I1120 17:40:40.411717 2482 process.cpp:2436] Resuming help@127.0.0.1:45300 > at 2015-11-20 17:40:40.411710976+00:00 > I1120 17:40:40.411813 2487 process.cpp:2436] Resuming > logging@127.0.0.1:45300 at 2015-11-20 17:40:40.411789056+00:00 > I1120 17:40:40.411989 2487 process.cpp:2436] Resuming help@127.0.0.1:45300 > at 2015-11-20 17:40:40.411983872+00:00 > Source directory: /home/vagrant/mesos > Build directory: /home/vagrant/mesos/build > - > We cannot run any cgroups tests that require mounting > hierarchies because you have the following hierarchies mounted: > /cgroup/blkio, /cgroup/cpu, /cgroup/cpuacct, /cgroup/cpuset, /cgroup/devices, > /cgroup/freezer, /cgroup/memory, /cgroup/net_cls > We'll disable the CgroupsNoHierarchyTest test fixture for now. 
> - > I1120 17:40:40.414676 2467 process.cpp:2426] Spawned process > reaper(1)@127.0.0.1:45300 > I1120 17:40:40.414728 2482 process.cpp:2436] Resuming > reaper(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.414701824+00:00 > I1120 17:40:40.415870 2467 process.cpp:2426] Spawned process > __latch__(1)@127.0.0.1:45300 > I1120 17:40:40.415913 2483 process.cpp:2436] Resuming __gc__@127.0.0.1:45300 > at 2015-11-20 17:40:40.415889920+00:00 > I1120 17:40:40.415966 2467 process.cpp:2426] Spawned process > __waiter__(1)@127.0.0.1:45300 > I1120 17:40:40.416054 2483 process.cpp:2436] Resuming > __latch__(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.416045056+00:00 > I1120 17:40:40.416070 2467 process.cpp:2734] Donating thread to > __waiter__(1)@127.0.0.1:45300 while waiting > I1120 17:40:40.416093 2467 process.cpp:2436] Resuming > __waiter__(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.416083968+00:00 > I1120 17:40:40.517282 2483 process.cpp:2436] Resuming > reaper(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.517263872+00:00 > I1120 17:40:40.519779 2488 process.cpp:2436] Resuming > __latch__(1)@127.0.0.1:45300 at 2015-11-20 17:40:40.519730176+00:00 > I1120 17:40:40.519865 2488 process.cpp:2541] Cleaning up >
[jira] [Commented] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
[ https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026899#comment-15026899 ] Jan Schlicht commented on MESOS-4009: - I'd appreciate it if {{-Wsign-compare}} were added to the clang compile flags. > RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1 > -- > > Key: MESOS-4009 > URL: https://issues.apache.org/jira/browse/MESOS-4009 > Project: Mesos > Issue Type: Bug > Components: test > Environment: Fedora 23, GCC 5.1.1 >Reporter: Jan Schlicht >Assignee: Jan Schlicht >Priority: Trivial > Labels: easyfix > > GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a > comparison between signed and unsigned int in > {{provisioner_docker_tests.cpp}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
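For readers unfamiliar with the warning discussed in MESOS-4009, the following is a minimal sketch of the kind of signed/unsigned comparison that {{-Wsign-compare}} reports and two common ways to avoid it. The functions `countPositive` and `hasSize` are hypothetical and written for this illustration; this is not the code from {{provisioner_docker_tests.cpp}}.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example. A loop such as
//
//   for (int i = 0; i < values.size(); ++i) { ... }
//
// compares a signed `int` with the unsigned result of `size()`, which
// is what -Wsign-compare reports (and what fails the build when the
// warning is promoted to an error).

// Option 1: give the index the container's own unsigned size type, so
// both sides of the comparison are unsigned.
inline std::size_t countPositive(const std::vector<int>& values)
{
  std::size_t count = 0;
  for (std::vector<int>::size_type i = 0; i < values.size(); ++i) {
    if (values[i] > 0) {
      ++count;
    }
  }
  assert(count <= values.size());
  return count;
}

// Option 2: when a signed value really has to be compared with a size,
// rule out negative values first and then cast explicitly.
inline bool hasSize(const std::vector<int>& values, int expected)
{
  return expected >= 0 &&
         static_cast<std::vector<int>::size_type>(expected) == values.size();
}
```

Enabling the same warning for all compilers used by the project, as the comment suggests, would surface such comparisons before they break a build on a stricter toolchain.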
[jira] [Commented] (MESOS-3973) Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11.
[ https://issues.apache.org/jira/browse/MESOS-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026476#comment-15026476 ] Gilbert Song commented on MESOS-3973: - I tried updating the distribute tarball and the pip tarball to the latest version (because updating pip seems to have fixed the `make distcheck` failure on Debian 8), but neither works here. The solution may not lie in cleaning up those files in src/Makefile.am. Instead, we should figure out why mesos/mesos.cli/mesos.interface/mesos.native are shown as `not installed` in the debug log. > Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11. > - > > Key: MESOS-3973 > URL: https://issues.apache.org/jira/browse/MESOS-3973 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 0.26.0 > Environment: Mac OS X 10.10.5, Clang 7.0.0. >Reporter: Bernd Mathiske >Assignee: Gilbert Song > Labels: build, build-failure, mesosphere > > Non-root 'make distcheck'. > {noformat} > ... > [--] Global test environment tear-down > [==] 826 tests from 113 test cases ran. (276624 ms total) > [ PASSED ] 826 tests. > YOU HAVE 6 DISABLED TESTS > Making install in . > make[3]: Nothing to be done for `install-exec-am'. > ../install-sh -c -d > '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig' > /usr/bin/install -c -m 644 mesos.pc > '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig' > Making install in 3rdparty > /Applications/Xcode.app/Contents/Developer/usr/bin/make install-recursive > Making install in libprocess > Making install in 3rdparty > /Applications/Xcode.app/Contents/Developer/usr/bin/make install-recursive > Making install in stout > Making install in . > make[9]: Nothing to be done for `install-exec-am'. > make[9]: Nothing to be done for `install-data-am'. > Making install in include > make[9]: Nothing to be done for `install-exec-am'. 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d > '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include' > ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d > '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include/stout' > /usr/bin/install -c -m 644 > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/abort.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/attributes.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/base64.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bits.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bytes.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/dynamiclibrary.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/error.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/exit.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/flags.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/foreach.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/format.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/fs.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gtest.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gzip.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashmap.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashset.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/ip.hpp > 
../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/lambda.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/linkedhashmap.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/list.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/mac.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multihashmap.hpp > > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multimap.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/none.hpp > ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/nothing.hpp > >
[jira] [Commented] (MESOS-3946) Test for role management
[ https://issues.apache.org/jira/browse/MESOS-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026475#comment-15026475 ] Yong Qiao Wang commented on MESOS-3946: --- Good suggestion, we can consider this after the main tasks are done. > Test for role management > > > Key: MESOS-3946 > URL: https://issues.apache.org/jira/browse/MESOS-3946 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > Add tests for dynamic role configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3581) License headers show up all over doxygen documentation.
[ https://issues.apache.org/jira/browse/MESOS-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026471#comment-15026471 ] Till Toenshoff commented on MESOS-3581: --- {noformat} commit 3539b7a0e15b594148308319bf052d28b1429b98 Author: Benjamin Bannier Date: Mon Nov 23 06:53:38 2015 -0800 [libprocess]: Made license-headers doxygen-compatible. This commit adjusts license headers of C++ source and header files. Review: https://reviews.apache.org/r/39592 {noformat} {noformat} commit dc23756a5433d6f7fcd22d291babad14f6799233 Author: Benjamin Bannier Date: Mon Nov 23 06:53:01 2015 -0800 [stout]: Made license-headers doxygen-compatible. This commit adjusts license headers of C++ source and header files. Review: https://reviews.apache.org/r/39591 {noformat} {noformat} commit fa36917dd142f66924c5f7ed689b87d5ceabbf79 Author: Benjamin Bannier Date: Mon Nov 23 06:49:31 2015 -0800 Made license-headers doxygen-compatible. This commit adjusts license headers of C++ source and header files, and protobuf definitions. Also, reflect the changed style in the C++ style guide. Review: https://reviews.apache.org/r/39590 {noformat} {noformat} commit 384de473d9f388b84b77321c8a08e17efd558f10 Author: Benjamin Bannier Date: Wed Nov 25 10:05:44 2015 +0100 [stout] Fixed two headers that got cut off in dc23756a. Review: https://reviews.apache.org/r/40652/ {noformat} > License headers show up all over doxygen documentation. > --- > > Key: MESOS-3581 > URL: https://issues.apache.org/jira/browse/MESOS-3581 > Project: Mesos > Issue Type: Documentation > Components: documentation >Affects Versions: 0.24.1 >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Minor > Labels: mesosphere > > Currently license headers are commented in something resembling Javadoc style, > {code} > /** > * Licensed ... 
> {code} > Since we use Javadoc-style comment blocks for doxygen documentation all > license headers appear in the generated documentation, potentially and likely > hiding the actual documentation. > Using {{/*}} to start the comment blocks would be enough to hide them from > doxygen, but would likely also result in a largish (though mostly > uninteresting) patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
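The difference described in MESOS-3581 can be sketched as follows. The header uses the {{/*}} opener that the ticket proposes as sufficient; the license text stays elided as in the ticket, and the function `add` is a hypothetical placeholder that exists only so there is something to document. This is an illustration of the comment styles, not a copy of the headers that were actually committed.

```cpp
/*
 * Licensed ...
 *
 * Because this block opens with a single asterisk, doxygen treats it as
 * an ordinary comment and does not attach it to any documented entity.
 */

/**
 * Returns the sum of `left` and `right`.
 *
 * Because this block opens with two asterisks (Javadoc style), doxygen
 * treats it as the documentation of the declaration that follows.
 */
inline int add(int left, int right)
{
  return left + right;
}
```

With a license header written in the second style at the top of a file, doxygen may attach the license text to the first documented entity, which is the effect the ticket describes.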
[jira] [Updated] (MESOS-3947) Authenticate /roles request
[ https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Qiao Wang updated MESOS-3947: -- Description: /roles endpoint needs to be authenticated. This ticket will authenticate /roles requests using credentials provided by the `Authorization` field of the HTTP request. This is similar to how authentication is implemented in `Master::Http`. was: /roles requests except GET method need to be authenticated. This ticket will authenticate /roles requests using credentials provided by the `Authorization` field of the HTTP request. This is similar to how authentication is implemented in `Master::Http`. > Authenticate /roles request > --- > > Key: MESOS-3947 > URL: https://issues.apache.org/jira/browse/MESOS-3947 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > /roles endpoint needs to be authenticated. > This ticket will authenticate /roles requests using credentials provided by > the `Authorization` field of the HTTP request. This is similar to how > authentication is implemented in `Master::Http`. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3947) Authenticate /roles request
[ https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Qiao Wang updated MESOS-3947: -- Description: /roles endpoint needs to be authenticated. This ticket will authenticate /roles requests using credentials provided by the `Authorization` field of the HTTP request. This is similar to how authentication is implemented in `Master::Http`. In addition, query (GET) requests to the /roles endpoint will not be authenticated, because they do not change the state of roles/weights in the Mesos master and for backward compatibility. was: /roles endpoint needs to be authenticated. This ticket will authenticate /roles requests using credentials provided by the `Authorization` field of the HTTP request. This is similar to how authentication is implemented in `Master::Http`. In addition, query (GET) requests to the /roles endpoint do not need to be authenticated, because they do not change the state of roles/weights in the Mesos master and for backward compatibility. > Authenticate /roles request > --- > > Key: MESOS-3947 > URL: https://issues.apache.org/jira/browse/MESOS-3947 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > /roles endpoint needs to be authenticated. > This ticket will authenticate /roles requests using credentials provided by > the `Authorization` field of the HTTP request. This is similar to how > authentication is implemented in `Master::Http`. > In addition, query (GET) requests to the /roles endpoint will not be > authenticated, because they do not change the state of roles/weights in the > Mesos master and for backward compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3608) optionally install test binaries
[ https://issues.apache.org/jira/browse/MESOS-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026552#comment-15026552 ] Till Toenshoff commented on MESOS-3608: --- [~jamespeach] thanks a bunch - looking into it. > optionally install test binaries > > > Key: MESOS-3608 > URL: https://issues.apache.org/jira/browse/MESOS-3608 > Project: Mesos > Issue Type: Improvement > Components: build, test >Reporter: James Peach >Assignee: James Peach >Priority: Minor > > Many of the tests in Mesos could be described as integration tests, since > they have external dependencies on kernel features, installed tools, > permissions, etc. I'd like to be able to generate a {{mesos-tests}} RPM along > with my {{mesos}} RPM so that I can run the same tests in different > deployment environments. > I propose a new configuration option named {{--enable-test-tools}} that will > install the tests into {{libexec/mesos/tests}}. I'll also need to make some > minor changes to tests so that helper tools can be found in this location as > well as in the build directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
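The last point of MESOS-3608, letting tests find helper tools either in the build directory or in the install location, could look roughly like the sketch below. The function `findTool`, its parameters, and the idea of passing the existence check as a callback are assumptions made for this illustration; the ticket does not say how the lookup is implemented.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: returns the first "<directory>/<tool>" for which
// `exists` returns true, or an empty string if the tool is not found in
// any of the given directories. The existence check is injected so that
// the lookup order can be exercised without touching the filesystem.
inline std::string findTool(
    const std::string& tool,
    const std::vector<std::string>& directories,
    const std::function<bool(const std::string&)>& exists)
{
  assert(!tool.empty());

  for (const std::string& directory : directories) {
    const std::string candidate = directory + "/" + tool;
    if (exists(candidate)) {
      return candidate;
    }
  }

  return "";
}
```

A caller would list the build directory first and {{libexec/mesos/tests}} second (or the other way round, depending on which should take precedence), and pass a callback that checks whether the path exists.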
[jira] [Updated] (MESOS-3947) Authenticate /roles request
[ https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Qiao Wang updated MESOS-3947: -- Description: /roles endpoint needs to be authenticated. This ticket will authenticate /roles requests using credentials provided by the `Authorization` field of the HTTP request. This is similar to how authentication is implemented in `Master::Http`. In addition, query (GET) requests to the /roles endpoint do not need to be authenticated, because they do not change the state of roles/weights in the Mesos master and for backward compatibility. was: /roles endpoint needs to be authenticated. This ticket will authenticate /roles requests using credentials provided by the `Authorization` field of the HTTP request. This is similar to how authentication is implemented in `Master::Http`. > Authenticate /roles request > --- > > Key: MESOS-3947 > URL: https://issues.apache.org/jira/browse/MESOS-3947 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > /roles endpoint needs to be authenticated. > This ticket will authenticate /roles requests using credentials provided by > the `Authorization` field of the HTTP request. This is similar to how > authentication is implemented in `Master::Http`. > In addition, query (GET) requests to the /roles endpoint do not need to be > authenticated, because they do not change the state of roles/weights in the > Mesos master and for backward compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3916) MasterMaintenanceTest.InverseOffersFilters is flaky
[ https://issues.apache.org/jira/browse/MESOS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026551#comment-15026551 ] Jan Schlicht commented on MESOS-3916: - After applying the patch, {{./bin/mesos-tests.sh --gtest_filter=*.InverseOffersFilters --gtest_repeat=-1 --gtest_break_on_failure}} fails after > 250 iterations. Before that it failed after ~10 iterations. Therefore, while the flakiness isn't gone for me, the situation improved significantly. Here's a log of a failed test after applying the patch: {noformat} I1125 10:05:52.969558 29342 leveldb.cpp:174] Opened db in 384512ns I1125 10:05:52.969916 29342 leveldb.cpp:181] Compacted db in 319730ns I1125 10:05:52.969949 29342 leveldb.cpp:196] Created db iterator in 3457ns I1125 10:05:52.969959 29342 leveldb.cpp:202] Seeked to beginning of db in 332ns I1125 10:05:52.969965 29342 leveldb.cpp:271] Iterated through 0 keys in the db in 318ns I1125 10:05:52.969979 29342 replica.cpp:778] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned I1125 10:05:52.970711 29362 recover.cpp:447] Starting replica recovery I1125 10:05:52.970767 29362 recover.cpp:473] Replica is in EMPTY status I1125 10:05:52.970935 29362 replica.cpp:674] Replica in EMPTY status received a broadcasted recover request from (3109)@127.0.0.1:42692 I1125 10:05:52.970989 29362 recover.cpp:193] Received a recover response from a replica in EMPTY status I1125 10:05:52.971045 29362 recover.cpp:564] Updating replica status to STARTING I1125 10:05:52.971160 29362 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 22591ns I1125 10:05:52.971174 29362 replica.cpp:321] Persisted replica status to STARTING I1125 10:05:52.971195 29362 master.cpp:365] Master 932f7d7b-f2d4-42c7-9391-222c19b9d35b (localhost) started on 127.0.0.1:42692 I1125 10:05:52.971204 29362 master.cpp:367] Flags at startup: --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" 
--authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/EruGwl/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" --registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/EruGwl/master" --zk_session_timeout="10secs" W1125 10:05:52.971287 29362 master.cpp:370] ** Master bound to loopback interface! Cannot communicate with remote schedulers or slaves. You might want to set '--ip' flag to a routable IP address. ** I1125 10:05:52.971364 29362 master.cpp:414] Master allowing unauthenticated frameworks to register I1125 10:05:52.971372 29362 master.cpp:417] Master only allowing authenticated slaves to register I1125 10:05:52.971298 29363 recover.cpp:473] Replica is in STARTING status I1125 10:05:52.971379 29362 credentials.hpp:35] Loading credentials for authentication from '/tmp/EruGwl/credentials' I1125 10:05:52.971544 29362 master.cpp:456] Using default 'crammd5' authenticator I1125 10:05:52.971573 29363 replica.cpp:674] Replica in STARTING status received a broadcasted recover request from (3110)@127.0.0.1:42692 I1125 10:05:52.971587 29362 master.cpp:493] Authorization enabled I1125 10:05:52.971709 29358 recover.cpp:193] Received a recover response from a replica in STARTING status I1125 10:05:52.971807 29359 whitelist_watcher.cpp:77] No whitelist given I1125 10:05:52.972726 29356 hierarchical.cpp:162] Initialized hierarchical allocator process I1125 10:05:52.972959 29362 master.cpp:1625] The newly elected leader is master@127.0.0.1:42692 with id 
932f7d7b-f2d4-42c7-9391-222c19b9d35b I1125 10:05:52.972996 29362 master.cpp:1638] Elected as the leading master! I1125 10:05:52.972998 29358 recover.cpp:564] Updating replica status to VOTING I1125 10:05:52.973006 29362 master.cpp:1383] Recovering from registrar I1125 10:05:52.973254 29359 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 13731ns I1125 10:05:52.973319 29359 replica.cpp:321] Persisted replica status to VOTING I1125 10:05:52.973356 29358 recover.cpp:578] Successfully joined the Paxos group I1125 10:05:52.973412 29363 registrar.cpp:307] Recovering registrar I1125 10:05:52.973435 29358 recover.cpp:462] Recover process terminated I1125 10:05:52.973695 29358 log.cpp:659] Attempting to start the writer I1125 10:05:52.973846 29358 replica.cpp:494] Replica received implicit promise request from (3111)@127.0.0.1:42692 with proposal 1
[jira] [Commented] (MESOS-3966) LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem fails on Centos 7.1
[ https://issues.apache.org/jira/browse/MESOS-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026462#comment-15026462 ] Jan Schlicht commented on MESOS-3966: - Thanks for the patch! > LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem fails on > Centos 7.1 > > > Key: MESOS-3966 > URL: https://issues.apache.org/jira/browse/MESOS-3966 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.26.0 > Environment: centos 7.1, gcc 4.8.3, docker 1.8.2 >Reporter: Till Toenshoff >Assignee: Jan Schlicht > Labels: mesosphere > > {noformat} > [ RUN ] LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem > I1120 11:39:37.862926 29944 linux.cpp:82] Making > '/tmp/LinuxFilesystemIsolatorTest_ROOT_ImageInVolumeWithRootFilesystem_ZBw23E' > a shared mount > I1120 11:39:37.876965 29944 linux_launcher.cpp:103] Using > /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher > I1120 11:39:37.930881 29944 systemd.cpp:128] systemd version `208` detected > W1120 11:39:37.930913 29944 systemd.cpp:136] Required functionality > `Delegate` was introduced in Version `218`. Your system may not function > properly; however since some distributions have patched systemd packages, > your system may still be functional. This is why we keep running. 
See > MESOS-3352 for more information > I1120 11:39:37.938351 29944 systemd.cpp:210] Started systemd slice > `mesos_executors.slice` > I1120 11:39:37.940218 29962 containerizer.cpp:618] Starting container > '1ea741a9-5edf-4910-ae64-f8d53f74e31e' for executor 'test_executor' of > framework '' > I1120 11:39:37.943042 29959 provisioner.cpp:289] Provisioning image rootfs > '/tmp/LinuxFilesystemIsolatorTest_ROOT_ImageInVolumeWithRootFilesystem_ZBw23E/provisioner/containers/1ea741a9-5edf-4910-ae64-f8d53f74e31e/backends/copy/rootfses/7d97f8ac-ee57-4c83-b2d1-4332e25c89ae' > for container 1ea741a9-5edf-4910-ae64-f8d53f74e31e > I1120 11:39:49.571781 29958 provisioner.cpp:289] Provisioning image rootfs > '/tmp/LinuxFilesystemIsolatorTest_ROOT_ImageInVolumeWithRootFilesystem_ZBw23E/provisioner/containers/1ea741a9-5edf-4910-ae64-f8d53f74e31e/backends/copy/rootfses/0256b892-e737-4d3d-89ea-74cf0e96eaf6' > for container 1ea741a9-5edf-4910-ae64-f8d53f74e31e > ../../src/tests/containerizer/filesystem_isolator_tests.cpp:806: Failure > Failed to wait 15secs for launch > [ FAILED ] LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem > (55076 ms) > [--] 1 test from LinuxFilesystemIsolatorTest (55076 ms total) > {noformat} > The following vagrant generator was used: > {noformat} > cat << EOF > Vagrantfile > # -*- mode: ruby -*-" > > # vi: set ft=ruby : > Vagrant.configure(2) do |config| > # Disable shared folder to prevent certain kernel module dependencies. 
> config.vm.synced_folder ".", "/vagrant", disabled: true > config.vm.hostname = "centos71" > config.vm.box = "bento/centos-7.1" > config.vm.provider "virtualbox" do |vb| > vb.memory = 16384 > vb.cpus = 8 > end > config.vm.provider "vmware_fusion" do |vb| > vb.memory = 9216 > vb.cpus = 4 > end > config.vm.provision "shell", inline: <<-SHELL > sudo yum -y update systemd > sudo yum install -y tar wget > sudo wget > http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo > -O /etc/yum.repos.d/epel-apache-maven.repo > sudo yum groupinstall -y "Development Tools" > sudo yum install -y apache-maven python-devel java-1.7.0-openjdk-devel > zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5 > apr-devel subversion-devel apr-util-devel > sudo yum install -y git > sudo yum install -y docker > sudo service docker start > sudo docker info > #sudo wget -qO- https://get.docker.com/ | sh > SHELL > end > EOF > vagrant up > vagrant reload > vagrant ssh -c " > git clone https://github.com/apache/mesos.git mesos > cd mesos > git checkout -b 0.26.0-rc1 0.26.0-rc1 > ./bootstrap > mkdir build > cd build > ../configure > make -j4 check > #make -j4 distcheck > sudo ./bin/mesos-tests.sh > #make clean > #../configure --enable-libevent --enable-ssl > #GTEST_FILTER="" make check > #sudo ./bin/mesos-tests.sh > " > {noformat} > Additionally, {{/etc/hosts}} was edited to contain hostname and IP (allowing > a pass of the bridged docker executor tests). > {noformat} > 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 > ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 > 192.168.218.135 centos71 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3791) Enhance the existing HTTP endpoint /roles
[ https://issues.apache.org/jira/browse/MESOS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026469#comment-15026469 ] Yong Qiao Wang commented on MESOS-3791: --- This ticket changes the JSON format of the response to a GET request on the /roles endpoint. I am not sure whether we need to keep the previous format for backward compatibility. [~adam-mesos], any suggestions? > Enhance the existing HTTP endpoint /roles > - > > Key: MESOS-3791 > URL: https://issues.apache.org/jira/browse/MESOS-3791 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > In this ticket, we will enhance the existing HTTP endpoint to query roles as > outlined in the Design Doc: > https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#
[jira] [Commented] (MESOS-3947) Authenticate /roles request
[ https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026499#comment-15026499 ] Yong Qiao Wang commented on MESOS-3947: --- Thanks for the information, [~marco-mesos]. I have updated the description of this ticket to address your concern. > Authenticate /roles request > --- > > Key: MESOS-3947 > URL: https://issues.apache.org/jira/browse/MESOS-3947 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > The /roles endpoint needs to be authenticated. > This ticket will authenticate /roles requests using credentials provided by > the `Authorization` field of the HTTP request. This is similar to how > authentication is implemented in `Master::Http`. > Query requests to the /roles endpoint will not be authenticated, because they > do not change the state of roles/weights in the Mesos master and because this > preserves backward compatibility.
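The rule in the description above (queries pass through, state-changing requests need credentials) can be sketched as follows. This is only an illustration under the assumption that "query request" means an HTTP GET; `requiresAuthentication` is a hypothetical helper and not part of `Master::Http` or any Mesos API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not a Mesos API: decides whether a request to the
// /roles endpoint must carry credentials in its `Authorization` field.
bool requiresAuthentication(const std::string& method)
{
  // Assumption: read-only queries arrive as GET. They do not change
  // roles/weights, so they are let through without credentials.
  return method != "GET";
}
```

A handler could call this before inspecting the `Authorization` field, so that existing unauthenticated GET clients keep working.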
[jira] [Created] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
Jan Schlicht created MESOS-4009: --- Summary: RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1 Key: MESOS-4009 URL: https://issues.apache.org/jira/browse/MESOS-4009 Project: Mesos Issue Type: Bug Environment: Fedora 23 Reporter: Jan Schlicht Assignee: Jan Schlicht Priority: Trivial
[jira] [Updated] (MESOS-4007) Persist role information to registry
[ https://issues.apache.org/jira/browse/MESOS-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Qiao Wang updated MESOS-4007: -- Description: To handle Mesos master recovery and failover, the Mesos master needs to persist the roles and weights information in the registry: - On first boot, the first leading master initializes the replicated log with the roles/weights specified by the command-line flags (--roles and --weights). The flag values are only used to bootstrap the cluster, after which the registry becomes the source of truth. - At runtime, entries in the replicated log can only be added, removed, or updated through the operator REST API. - On Mesos master restart/failover, if the replicated log already contains roles/weights, the master uses the registry values, ignores the flags (--roles/--weights), and logs a warning that the flag values are being ignored. - As future work, we can educate end users to create the replicated log with the supported roles/weights before bootstrapping the Mesos cluster, and to reset the roles/weights configuration by updating the replicated log. was:Persist role information to registry across master recovery/failover. > Persist role information to registry > > > Key: MESOS-4007 > URL: https://issues.apache.org/jira/browse/MESOS-4007 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > To handle Mesos master recovery and failover, the Mesos master needs to persist the roles and weights information in the registry: > - On first boot, the first leading master initializes the replicated log with the roles/weights specified by the command-line flags (--roles and --weights). The flag values are only used to bootstrap the cluster, after which the registry becomes the source of truth. > - At runtime, entries in the replicated log can only be added, removed, or updated through the operator REST API. > - On Mesos master restart/failover, if the replicated log already contains roles/weights, the master uses the registry values, ignores the flags (--roles/--weights), and logs a warning that the flag values are being ignored. > - As future work, we can educate end users to create the replicated log with the supported roles/weights before bootstrapping the Mesos cluster, and to reset the roles/weights configuration by updating the replicated log.
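The bootstrap and recovery rules in the description above can be sketched roughly as below. This is a minimal illustration, not the Mesos implementation: `RoleWeights`, `Resolution`, and `resolveRoles` are hypothetical names, and a null pointer stands in for "the replicated log holds no roles/weights yet".

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical type: role name mapped to its weight.
using RoleWeights = std::map<std::string, double>;

// Hypothetical result of deciding which configuration a master should use.
struct Resolution
{
  RoleWeights roles;       // The configuration that takes effect.
  bool initializeRegistry; // True on first boot: the flags seed the registry.
  bool flagsIgnored;       // True when a warning about ignored flags is due.
};

// `registry` is null when nothing has been persisted yet (assumption made
// for this sketch); `flags` holds the values of --roles and --weights.
Resolution resolveRoles(const RoleWeights* registry, const RoleWeights& flags)
{
  if (registry != nullptr) {
    // Restart/failover: the registry is the source of truth. Warn only if
    // the operator actually passed flags that are now being ignored.
    return Resolution{*registry, false, !flags.empty()};
  }

  // First boot: bootstrap the registry from the command-line flags.
  return Resolution{flags, true, false};
}
```

Returning the two booleans instead of logging directly keeps the decision separate from its side effects, so the caller chooses how to write the registry and emit the warning.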
[jira] [Commented] (MESOS-3942) Enhance endpoint /roles for adding a new role
[ https://issues.apache.org/jira/browse/MESOS-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026417#comment-15026417 ] Yong Qiao Wang commented on MESOS-3942: --- RR: https://reviews.apache.org/r/40697/ > Enhance endpoint /roles for adding a new role > - > > Key: MESOS-3942 > URL: https://issues.apache.org/jira/browse/MESOS-3942 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > In this ticket, we will enhance the existing HTTP endpoint /roles so that a > new role can be added at runtime, as outlined in the Design Doc: > https://docs.google.com/document/d/1OIgceqpsjV3-_LGF83IMAFnrh1Ea3Zc16w9kWWPpUj4/edit#
[jira] [Updated] (MESOS-4008) Master recovery with the persisted roles in registry
[ https://issues.apache.org/jira/browse/MESOS-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Qiao Wang updated MESOS-4008: -- Description: To handle Mesos master recovery and failover, the Mesos master needs to persist the roles and weights information in the registry: - On first boot, the first leading master initializes the replicated log with the roles/weights specified by the command-line flags (--roles and --weights). The flag values are only used to bootstrap the cluster, after which the registry becomes the source of truth. - At runtime, entries in the replicated log can only be added, removed, or updated through the operator REST API. - On Mesos master restart/failover, if the replicated log already contains roles/weights, the master uses the registry values, ignores the flags (--roles/--weights), and logs a warning that the flag values are being ignored. - As future work, we can educate end users to create the replicated log with the supported roles/weights before bootstrapping the Mesos cluster, and to reset the roles/weights configuration by updating the replicated log. was: To handle Mesos master recovery and failover, the Mesos master needs to persist the roles and weights information in the registry: On first boot, the first leading master initializes the replicated log with the roles/weights specified by the command-line flags (--roles and --weights). The flag values are only used to bootstrap the cluster, after which the registry becomes the source of truth. At runtime, entries in the replicated log can only be added, removed, or updated through the operator REST API. On Mesos master restart/failover, if the replicated log already contains roles/weights, the master uses the registry values, ignores the flags (--roles/--weights), and logs a warning that the flag values are being ignored. As future work, we can educate end users to create the replicated log with the supported roles/weights before bootstrapping the Mesos cluster, and to reset the roles/weights configuration by updating the replicated log. > Master recovery with the persisted roles in registry > > > Key: MESOS-4008 > URL: https://issues.apache.org/jira/browse/MESOS-4008 > Project: Mesos > Issue Type: Task >Reporter: Yong Qiao Wang >Assignee: Yong Qiao Wang > > To handle Mesos master recovery and failover, the Mesos master needs to persist the roles and weights information in the registry: > - On first boot, the first leading master initializes the replicated log with the roles/weights specified by the command-line flags (--roles and --weights). The flag values are only used to bootstrap the cluster, after which the registry becomes the source of truth. > - At runtime, entries in the replicated log can only be added, removed, or updated through the operator REST API. > - On Mesos master restart/failover, if the replicated log already contains roles/weights, the master uses the registry values, ignores the flags (--roles/--weights), and logs a warning that the flag values are being ignored. > - As future work, we can educate end users to create the replicated log with the supported roles/weights before bootstrapping the Mesos cluster, and to reset the roles/weights configuration by updating the replicated log.
[jira] [Created] (MESOS-4008) Master recovery with the persisted roles in registry
Yong Qiao Wang created MESOS-4008: - Summary: Master recovery with the persisted roles in registry Key: MESOS-4008 URL: https://issues.apache.org/jira/browse/MESOS-4008 Project: Mesos Issue Type: Task Reporter: Yong Qiao Wang Assignee: Yong Qiao Wang To handle Mesos master recovery and failover, the Mesos master needs to persist the roles and weights information in the registry: On first boot, the first leading master initializes the replicated log with the roles/weights specified by the command-line flags (--roles and --weights). The flag values are only used to bootstrap the cluster, after which the registry becomes the source of truth. At runtime, entries in the replicated log can only be added, removed, or updated through the operator REST API. On Mesos master restart/failover, if the replicated log already contains roles/weights, the master uses the registry values, ignores the flags (--roles/--weights), and logs a warning that the flag values are being ignored. As future work, we can educate end users to create the replicated log with the supported roles/weights before bootstrapping the Mesos cluster, and to reset the roles/weights configuration by updating the replicated log.
[jira] [Created] (MESOS-4007) Persist role information to registry
Yong Qiao Wang created MESOS-4007: - Summary: Persist role information to registry Key: MESOS-4007 URL: https://issues.apache.org/jira/browse/MESOS-4007 Project: Mesos Issue Type: Task Reporter: Yong Qiao Wang Assignee: Yong Qiao Wang Persist role information to registry across master recovery/failover.
[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026649#comment-15026649 ] Bernd Mathiske commented on MESOS-3937: --- My take is that to close this ticket we need to make sure we have viable instructions in the docs / on the web page. > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Timothy Chen > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. > [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] 
Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" --log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" > --zk_session_timeout="10secs" > I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing > authenticated frameworks to register > I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing > authenticated slaves to register > I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for > authentication from '/tmp/40AlT8/credentials' > I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1117 15:08:09.284639 26401 recover.cpp:566] Updating 
replica status to VOTING > I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' > authenticator > I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.075466ms > I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to > VOTING > I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos > group > I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated > I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL > I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled > I1117 15:08:09.296018 26399 master.cpp:1606] The newly elected leader is > master@10.0.2.15:50088 with id
[jira] [Created] (MESOS-4010) Initial leader election unstable
Guilherme Moro created MESOS-4010: - Summary: Initial leader election unstable Key: MESOS-4010 URL: https://issues.apache.org/jira/browse/MESOS-4010 Project: Mesos Issue Type: Bug Components: master Affects Versions: 0.25.0 Environment: RHEL 6.6 Reporter: Guilherme Moro Priority: Critical No leader is elected. For a start, let me explain my setup: 3 nodes, 3 ZooKeepers, and 3 mesos-master services, configured as initctl services and controlled by Puppet. The RPMs installed are from the RHEL repository at Mesosphere (installed through Puppet as well), running on RHEL 6.6. Quorum is set to 2, as expected; all the remaining configs were double-checked and appear to be correct. Most of the time I can get the cluster to bootstrap after rebooting the nodes (sometimes more than once). The whole thing somewhat resembles https://issues.apache.org/jira/browse/MESOS-2148 and https://issues.apache.org/jira/browse/MESOS-2014
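The quorum of 2 for 3 masters reported above matches the usual strict-majority rule. A minimal sketch of that arithmetic, assuming the majority rule applies here; `majorityQuorum` is a hypothetical helper, not a Mesos API or flag.

```cpp
#include <cassert>

// Smallest number of masters that forms a strict majority of `masters`.
// Hypothetical helper for illustration only.
int majorityQuorum(int masters)
{
  assert(masters > 0);

  // Integer division rounds down, so adding 1 always yields more than half.
  return masters / 2 + 1;
}
```

For the setup described in the ticket, `majorityQuorum(3)` gives 2, which agrees with the configured quorum.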
[jira] [Updated] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
[ https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Schlicht updated MESOS-4009: Description: GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a comparison between signed and unsigned int in {{provisioner_docker_tests.cpp}}. Component/s: test > RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1 > -- > > Key: MESOS-4009 > URL: https://issues.apache.org/jira/browse/MESOS-4009 > Project: Mesos > Issue Type: Bug > Components: test > Environment: Fedora 23 >Reporter: Jan Schlicht >Assignee: Jan Schlicht >Priority: Trivial > Labels: easyfix > > GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a > comparison between signed and unsigned int in > {{provisioner_docker_tests.cpp}}.
[jira] [Commented] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
[ https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026645#comment-15026645 ] Till Toenshoff commented on MESOS-4009: --- Why would clang and gcc < 5.1 not detect this? > RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1 > -- > > Key: MESOS-4009 > URL: https://issues.apache.org/jira/browse/MESOS-4009 > Project: Mesos > Issue Type: Bug > Components: test > Environment: Fedora 23 >Reporter: Jan Schlicht >Assignee: Jan Schlicht >Priority: Trivial > Labels: easyfix > > GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a > comparison between signed and unsigned int in > {{provisioner_docker_tests.cpp}}.
[jira] [Updated] (MESOS-4010) Initial leader election unstable
[ https://issues.apache.org/jira/browse/MESOS-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guilherme Moro updated MESOS-4010: -- Attachment: messages_node3.log messages_node2.log messages_node1.log > Initial leader election unstable > > > Key: MESOS-4010 > URL: https://issues.apache.org/jira/browse/MESOS-4010 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 0.25.0 > Environment: RHEL 6.6 >Reporter: Guilherme Moro >Priority: Critical > Attachments: messages_node1.log, messages_node2.log, > messages_node3.log > > > No leader is elected. > For a start, let me explain my setup: > 3 nodes > 3 ZooKeepers > 3 mesos-master services, configured as initctl services and controlled by > Puppet. The RPMs installed are from the RHEL repository at Mesosphere (installed > through Puppet as well), running on RHEL 6.6. > Quorum is set to 2, as expected; all the remaining configs were double-checked > and appear to be correct. > Most of the time I can get the cluster to bootstrap after rebooting the nodes > (sometimes more than once). > The whole thing somewhat resembles > https://issues.apache.org/jira/browse/MESOS-2148 and > https://issues.apache.org/jira/browse/MESOS-2014
[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026628#comment-15026628 ] Timothy Chen commented on MESOS-3937: - I can't repro this with phusion/ubuntu-14.04-amd64 vagrant image? example_executor.go is open source in mesos-go repo in mesos/mesos-go > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Timothy Chen > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" 
--log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" > --zk_session_timeout="10secs" > I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing > authenticated frameworks to register > I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing > authenticated slaves to register > I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for > authentication from '/tmp/40AlT8/credentials' > I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING > I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' > authenticator > I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.075466ms > I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to > VOTING > I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos > group > I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated > I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL > I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled > I1117 15:08:09.296018 26399 master.cpp:1606] The newly elected leader is > master@10.0.2.15:50088
[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
[ https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026638#comment-15026638 ] haosdent commented on MESOS-3937: - LoL, I could not find it just now. https://github.com/mesos/mesos-go/search?utf8=%E2%9C%93=example_executor But I think the failure occurs because the "hostname -f" command, which node.go depends on, failed. After updating /etc/hosts, it works. > Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails. > --- > > Key: MESOS-3937 > URL: https://issues.apache.org/jira/browse/MESOS-3937 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 > Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2 > 8 CPUs, 16 GB memory > Vagrant, libvirt/Virtual Box or VMware >Reporter: Bernd Mathiske >Assignee: Timothy Chen > Labels: mesosphere > > {noformat} > ../configure > make check > sudo ./bin/mesos-tests.sh > --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose > {noformat} > {noformat} > [==] Running 1 test from 1 test case. > [--] Global test environment set-up. 
> [--] 1 test from DockerContainerizerTest > I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms > I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms > I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns > I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in > 4927ns > I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the > db in 1605ns > I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log > positions 0 -> 0 with 1 holes and 0 unlearned > I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery > I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status > I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received > a broadcasted recover request from (4)@10.0.2.15:50088 > I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from > a replica in EMPTY status > I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to > STARTING > I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.016098ms > I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to > STARTING > I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status > I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status > received a broadcasted recover request from (5)@10.0.2.15:50088 > I1117 15:08:09.282552 26400 master.cpp:367] Master > 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on > 10.0.2.15:50088 > I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" > --allocation_interval="1secs" --allocator="HierarchicalDRF" > --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" > --authorizers="local" --credentials="/tmp/40AlT8/credentials" > --framework_sorter="drf" --help="false" --hostname_lookup="true" > --initialize_driver_logging="true" 
--log_auto_initialize="true" > --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" > --quiet="false" --recovery_slave_removal_limit="100%" > --registry="replicated_log" --registry_fetch_timeout="1mins" > --registry_store_timeout="25secs" --registry_strict="true" > --root_submissions="true" --slave_ping_timeout="15secs" > --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" > --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" > --zk_session_timeout="10secs" > I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing > authenticated frameworks to register > I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing > authenticated slaves to register > I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for > authentication from '/tmp/40AlT8/credentials' > I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from > a replica in STARTING status > I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING > I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' > authenticator > I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to > leveldb took 1.075466ms > I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to > VOTING > I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos > group > I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated > I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL > I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled > I1117 15:08:09.296018
[jira] [Commented] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
[ https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026764#comment-15026764 ] Till Toenshoff commented on MESOS-4009: --- Good to know - thanks. Maybe we should then also enable sign-compare explicitly for those other compilers? > RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1 > -- > > Key: MESOS-4009 > URL: https://issues.apache.org/jira/browse/MESOS-4009 > Project: Mesos > Issue Type: Bug > Components: test > Environment: Fedora 23 >Reporter: Jan Schlicht >Assignee: Jan Schlicht >Priority: Trivial > Labels: easyfix > > GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a > comparison between signed and unsigned int in > {{provisioner_docker_tests.cpp}}.
[jira] [Updated] (MESOS-4012) Update documentation to reflect the addition of installable tests.
[ https://issues.apache.org/jira/browse/MESOS-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-4012: -- Issue Type: Documentation (was: Epic) > Update documentation to reflect the addition of installable tests. > > > Key: MESOS-4012 > URL: https://issues.apache.org/jira/browse/MESOS-4012 > Project: Mesos > Issue Type: Documentation >Reporter: Till Toenshoff > > We may want to add the steps administrators need to create and run the > test-suite on anything other than the build machine. > One possible location could be {{docs/getting-started.md}}, for validating > the pre-requisites as described in that document.
[jira] [Created] (MESOS-4012) Update documentation to reflect the addition of installable tests.
Till Toenshoff created MESOS-4012: - Summary: Update documentation to reflect the addition of installable tests. Key: MESOS-4012 URL: https://issues.apache.org/jira/browse/MESOS-4012 Project: Mesos Issue Type: Epic Reporter: Till Toenshoff We may want to add the steps administrators need to create and run the test-suite on anything other than the build machine. One possible location could be {{docs/getting-started.md}}, for validating the pre-requisites as described in that document.
[jira] [Commented] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
[ https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026715#comment-15026715 ] Jan Schlicht commented on MESOS-4009: - Clang does not have {{-Wsign-compare}} in {{-Wall}}. I'm not sure, but GCC < 5.1 seems to suffer from GCC bug 59231 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59231). At least that would explain the behaviour, because ASSERT_EQ uses templates (see: https://github.com/google/googletest/blob/master/googletest/include/gtest/gtest.h#L1451). > RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1 > -- > > Key: MESOS-4009 > URL: https://issues.apache.org/jira/browse/MESOS-4009 > Project: Mesos > Issue Type: Bug > Components: test > Environment: Fedora 23 >Reporter: Jan Schlicht >Assignee: Jan Schlicht >Priority: Trivial > Labels: easyfix > > GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a > comparison between signed and unsigned int in > {{provisioner_docker_tests.cpp}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
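The kind of comparison described in the comment above can be reproduced outside of gtest. The following is only a minimal sketch: {{equalHelper}} and {{hasTwoLayers}} are hypothetical names, not code from googletest or from {{provisioner_docker_tests.cpp}}. It shows a templated equality helper in the style of the one behind ASSERT_EQ, and one way to avoid the mixed-sign comparison by making the expected value unsigned.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a templated equality helper such as the one
// behind ASSERT_EQ; this is not the actual googletest implementation.
template <typename T1, typename T2>
bool equalHelper(const T1& expected, const T2& actual)
{
  // Instantiated with T1 = int and T2 = size_t, this line compares a signed
  // with an unsigned value, which is what -Wsign-compare reports.
  return expected == actual;
}

// Passing an unsigned literal (2u) keeps both operands unsigned, so the
// instantiated comparison no longer mixes signedness.
inline bool hasTwoLayers(const std::vector<int>& layers)
{
  return equalHelper(2u, layers.size());
}
```

Writing {{equalHelper(2, layers.size())}} instead would instantiate the helper with a signed first argument, which is the case a build with {{-Werror=sign-compare}} rejects.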
[jira] [Updated] (MESOS-4009) RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1
[ https://issues.apache.org/jira/browse/MESOS-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-4009: -- Environment: Fedora 23, GCC 5.1.1 (was: Fedora 23) > RegistryClientTest.SimpleRegistryPuller doesn't compile with GCC 5.1.1 > -- > > Key: MESOS-4009 > URL: https://issues.apache.org/jira/browse/MESOS-4009 > Project: Mesos > Issue Type: Bug > Components: test > Environment: Fedora 23, GCC 5.1.1 >Reporter: Jan Schlicht >Assignee: Jan Schlicht >Priority: Trivial > Labels: easyfix > > GCC 5.1.1 has {{-Werror=sign-compare}} in {{-Wall}} and stumbles over a > comparison between signed and unsigned int in > {{provisioner_docker_tests.cpp}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3964) LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.
[ https://issues.apache.org/jira/browse/MESOS-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026771#comment-15026771 ] Till Toenshoff commented on MESOS-3964: --- Thanks for referencing this; it supports our results, which is good. > LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and > LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8. > --- > > Key: MESOS-3964 > URL: https://issues.apache.org/jira/browse/MESOS-3964 > Project: Mesos > Issue Type: Bug > Components: isolation, test >Affects Versions: 0.26.0 > Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt > Vagrantfile: see MESOS-3957 >Reporter: Bernd Mathiske >Assignee: Greg Mann >Priority: Blocker > Labels: mesosphere > > sudo ./bin/mesos-test.sh > --gtest_filter="LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs" > {noformat} > ... > F1119 14:34:52.514742 30706 isolator_tests.cpp:455] CHECK_SOME(isolator): > Failed to find 'cpu.cfs_quota_us'. Your kernel might be too old to use the > CFS cgroups feature. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4011) Allow build phase independent platform integration tests.
Till Toenshoff created MESOS-4011: - Summary: Allow build phase independent platform integration tests. Key: MESOS-4011 URL: https://issues.apache.org/jira/browse/MESOS-4011 Project: Mesos Issue Type: Epic Reporter: Till Toenshoff Many of the tests in Mesos could be described as integration tests, since they have external dependencies on kernel features, installed tools, permissions, etc. I'd like to be able to generate a mesos-tests RPM along with my mesos RPM so that I can run the same tests in different deployment environments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3964) LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.
[ https://issues.apache.org/jira/browse/MESOS-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026755#comment-15026755 ] haosdent commented on MESOS-3964: - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=789019 > LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and > LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8. > --- > > Key: MESOS-3964 > URL: https://issues.apache.org/jira/browse/MESOS-3964 > Project: Mesos > Issue Type: Bug > Components: isolation, test >Affects Versions: 0.26.0 > Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt > Vagrantfile: see MESOS-3957 >Reporter: Bernd Mathiske >Assignee: Greg Mann >Priority: Blocker > Labels: mesosphere > > sudo ./bin/mesos-test.sh > --gtest_filter="LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs" > {noformat} > ... > F1119 14:34:52.514742 30706 isolator_tests.cpp:455] CHECK_SOME(isolator): > Failed to find 'cpu.cfs_quota_us'. Your kernel might be too old to use the > CFS cgroups feature. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
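The failing CHECK above reports that {{cpu.cfs_quota_us}} could not be found. A test environment could probe for that control file before running the CFS tests. This is a rough sketch only: {{hasCfsQuota}} is a hypothetical helper, not a Mesos function, and the directory to pass in (for example a cpu cgroup mount such as {{/sys/fs/cgroup/cpu}}) is an assumption that depends on how cgroups are mounted on the machine.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of Mesos): returns true if the CFS quota
// control file can be opened inside the given cpu cgroup directory.
inline bool hasCfsQuota(const std::string& cpuCgroupDir)
{
  // The file name is taken from the error message in the log above.
  std::ifstream file(cpuCgroupDir + "/cpu.cfs_quota_us");
  return file.good();
}
```

A test runner could call this once with the cpu cgroup directory of the host and skip the CFS tests when it returns false, rather than aborting on the CHECK.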
[jira] [Updated] (MESOS-3969) Failing 'make distcheck' on Debian 8, somehow SSL-related.
[ https://issues.apache.org/jira/browse/MESOS-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-3969: -- Shepherd: Bernd Mathiske > Failing 'make distcheck' on Debian 8, somehow SSL-related. > -- > > Key: MESOS-3969 > URL: https://issues.apache.org/jira/browse/MESOS-3969 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.26.0 > Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt > Vagrantfile see MESOS-3957 >Reporter: Bernd Mathiske >Assignee: Joseph Wu > Labels: build, build-failure, mesosphere > > As non-root: make distcheck. > {noformat} > /bin/mkdir -p '/home/vagrant/mesos/build/mesos-0.26.0/_inst/bin' > /bin/bash ../libtool --mode=install /usr/bin/install -c mesos-local mesos-log > mesos mesos-execute mesos-resolve > '/home/vagrant/mesos/build/mesos-0.26.0/_inst/bin' > libtool: install: /usr/bin/install -c .libs/mesos-local > /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-local > libtool: install: /usr/bin/install -c .libs/mesos-log > /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-log > libtool: install: /usr/bin/install -c .libs/mesos > /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos > libtool: install: /usr/bin/install -c .libs/mesos-execute > /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-execute > libtool: install: /usr/bin/install -c .libs/mesos-resolve > /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-resolve > Traceback (most recent call last): > File "", line 1, in > File > "/home/vagrant/mesos/build/mesos-0.26.0/build/3rdparty/pip-1.5.6/pip/__init_.py", > line 11, in > from pip.vcs import git, mercurial, subversion, bazaar # noqa > File > "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/vcs/mercurial.py", > line 9, in > from pip.download import path_to_url > File > "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/download.py", > line 22, in > from pip._vendor import requests, six > File > 
"/home/vagrant/mesos/build/mesos-0.26.0/build/3rdparty/pip-1.5.6/pip/_vendor/requests/__init_.py", > line 53, in > from .packages.urllib3.contrib import pyopenssl > File > "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/_vendor/requests/packages/urllib3/contrib/pyopenssl.py", > line 70, in > ssl.PROTOCOL_SSLv3: OpenSSL.SSL.SSLv3_METHOD, > AttributeError: 'module' object has no attribute 'PROTOCOL_SSLv3' > Traceback (most recent call last): > File "", line 1, in > File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rd > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota
[ https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3073: --- Issue Type: Epic (was: Improvement) > Introduce HTTP endpoints for Quota > -- > > Key: MESOS-3073 > URL: https://issues.apache.org/jira/browse/MESOS-3073 > Project: Mesos > Issue Type: Epic >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: mesosphere > > We need to implement the HTTP endpoints for Quota as outlined in the Design > Doc: > (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota
[ https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3073: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere Sprint 22 (was: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23) Epic Name: Quota Endpoints > Introduce HTTP endpoints for Quota > -- > > Key: MESOS-3073 > URL: https://issues.apache.org/jira/browse/MESOS-3073 > Project: Mesos > Issue Type: Epic >Reporter: Joerg Schad >Assignee: Joerg Schad > Labels: mesosphere > > We need to implement the HTTP endpoints for Quota as outlined in the Design > Doc: > (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4014) Introduce DELETE/remove endpoint for quota
Alexander Rukletsov created MESOS-4014: -- Summary: Introduce DELETE/remove endpoint for quota Key: MESOS-4014 URL: https://issues.apache.org/jira/browse/MESOS-4014 Project: Mesos Issue Type: Task Components: master Reporter: Alexander Rukletsov Assignee: Joerg Schad This endpoint is for removing quotas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4014) Introduce delete/remove endpoint for quota
[ https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-4014: -- Summary: Introduce delete/remove endpoint for quota (was: Introduce DELETE/remove endpoint for quota) > Introduce delete/remove endpoint for quota > -- > > Key: MESOS-4014 > URL: https://issues.apache.org/jira/browse/MESOS-4014 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Alexander Rukletsov >Assignee: Joerg Schad > Labels: mesosphere > > This endpoint is for removing quotas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4014) Introduce delete/remove endpoint for quota
[ https://issues.apache.org/jira/browse/MESOS-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-4014: -- Description: This endpoint is for removing quotas via the DELETE method. (was: This endpoint is for removing quotas.) > Introduce delete/remove endpoint for quota > -- > > Key: MESOS-4014 > URL: https://issues.apache.org/jira/browse/MESOS-4014 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Alexander Rukletsov >Assignee: Joerg Schad > Labels: mesosphere > > This endpoint is for removing quotas via the DELETE method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4013) Introduce GET/status endpoint for quota
Alexander Rukletsov created MESOS-4013: -- Summary: Introduce GET/status endpoint for quota Key: MESOS-4013 URL: https://issues.apache.org/jira/browse/MESOS-4013 Project: Mesos Issue Type: Task Components: master Reporter: Alexander Rukletsov Assignee: Joerg Schad The endpoint should provide quota status. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3552) CHECK failure due to floating point precision on reservation request
[ https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-3552: -- Target Version/s: 0.26.0 (was: 0.27.0) > CHECK failure due to floating point precision on reservation request > > > Key: MESOS-3552 > URL: https://issues.apache.org/jira/browse/MESOS-3552 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Mandeep Chadha >Assignee: Mandeep Chadha > Labels: mesosphere, tech-debt > > result.cpus() == cpus() check is failing due to ( double == double ) > comparison problem. > Root Cause : > Framework requested 0.1 cpu reservation for the first task. So far so good. > Next Reserve operation — lead to double operations resulting in following > double values : > results.cpus() : 23.9964472863211995 cpus() : 24 > And the check ( result.cpus() == cpus() ) failed. > The double arithmetic operations caused results.cpus() value to be : > 23.9964472863211995 and hence ( 23.9964472863211995 > == 24 ) failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request
[ https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028154#comment-15028154 ] Neil Conway commented on MESOS-3552: There's also MESOS-3990, which wouldn't be handled by either almostEqual or CHECK_DOUBLE_EQ: the problem in MESOS-3990 is that we return unexpected results to the user. Since the plan is to switch to fixed-point anyway, personally I think we should focus on (a) fixing the crashing / failing CHECKs, then (b) figuring out a migration plan toward fixed-point resource values. > CHECK failure due to floating point precision on reservation request > > > Key: MESOS-3552 > URL: https://issues.apache.org/jira/browse/MESOS-3552 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Mandeep Chadha >Assignee: Mandeep Chadha > Labels: mesosphere, tech-debt > > result.cpus() == cpus() check is failing due to ( double == double ) > comparison problem. > Root Cause : > Framework requested 0.1 cpu reservation for the first task. So far so good. > Next Reserve operation — lead to double operations resulting in following > double values : > results.cpus() : 23.9964472863211995 cpus() : 24 > And the check ( result.cpus() == cpus() ) failed. > The double arithmetic operations caused results.cpus() value to be : > 23.9964472863211995 and hence ( 23.9964472863211995 > == 24 ) failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
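To illustrate the fixed-point direction mentioned in the comment above, here is a small sketch. Everything in it is an assumption made for illustration: the {{MilliCpus}} type, the three-decimal precision, and the helper names are hypothetical and do not describe the eventual Mesos design. Storing cpus as an integer number of thousandths makes repeated 0.1 cpu reservations add up exactly.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point representation: cpus stored as integer
// thousandths. The precision is an assumption, not the Mesos design.
struct MilliCpus
{
  std::int64_t value;
};

// Converts a double to the nearest thousandth of a cpu.
inline MilliCpus fromDouble(double cpus)
{
  return MilliCpus{static_cast<std::int64_t>(std::llround(cpus * 1000.0))};
}

// Integer addition is exact, so no rounding error accumulates.
inline MilliCpus add(MilliCpus a, MilliCpus b)
{
  return MilliCpus{a.value + b.value};
}

// Exact equality is safe on the integer representation.
inline bool equal(MilliCpus a, MilliCpus b)
{
  return a.value == b.value;
}
```

With this representation, adding {{fromDouble(0.1)}} 240 times yields exactly the same value as {{fromDouble(24.0)}}, so an equality check of the kind that fails in this issue would hold.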
[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request
[ https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028206#comment-15028206 ] Avinash Sridharan commented on MESOS-3552: -- Just an update: I tried Mandeep's test case with CHECK_DOUBLE_EQ, and it fails on the test Mandeep had submitted for review (https://reviews.apache.org/r/39056/). I am creating a patch that uses CHECK_NEAR with the margin set to MIN_CPUS and adds Mandeep's test case to the test framework as well. > CHECK failure due to floating point precision on reservation request > > > Key: MESOS-3552 > URL: https://issues.apache.org/jira/browse/MESOS-3552 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Mandeep Chadha >Assignee: Mandeep Chadha > Labels: mesosphere, tech-debt > > result.cpus() == cpus() check is failing due to ( double == double ) > comparison problem. > Root Cause : > Framework requested 0.1 cpu reservation for the first task. So far so good. > Next Reserve operation — lead to double operations resulting in following > double values : > results.cpus() : 23.9964472863211995 cpus() : 24 > And the check ( result.cpus() == cpus() ) failed. > The double arithmetic operations caused results.cpus() value to be : > 23.9964472863211995 and hence ( 23.9964472863211995 > == 24 ) failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
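The margin-based comparison proposed in the comment above can be sketched as follows. This is not the actual patch: {{nearlyEqual}} and {{reserveRepeatedly}} are hypothetical helpers, and the margin of 0.01 is an assumed stand-in for MIN_CPUS, whose value is not stated in this thread.

```cpp
#include <cmath>

// Assumed stand-in for MIN_CPUS; the real constant is not quoted in this
// thread, so the value here is only illustrative.
constexpr double kMargin = 0.01;

// Hypothetical helper mirroring a CHECK_NEAR-style test: two values are
// treated as equal when they differ by at most `margin`.
inline bool nearlyEqual(double a, double b, double margin = kMargin)
{
  return std::fabs(a - b) <= margin;
}

// Adds `step` cpus `count` times, the way repeated reservations would
// accumulate rounding error in a double.
inline double reserveRepeatedly(double step, int count)
{
  double total = 0.0;
  for (int i = 0; i < count; ++i) {
    total += step;
  }
  return total;
}
```

Summing 0.1 repeatedly in a double typically does not land exactly on the expected whole number, so a plain {{==}} can fail, while {{nearlyEqual(reserveRepeatedly(0.1, 240), 24.0)}} holds. The trade-off is that values closer together than the margin become indistinguishable.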
[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request
[ https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028201#comment-15028201 ] Klaus Ma commented on MESOS-3552: - Regarding MESOS-3990, I think it happens because a framework cannot decide whether one set of resources contains another. If we only handle CHECK_DOUBLE_EQ, how do we handle {{resources.contains}}? Just ignore it for now? > CHECK failure due to floating point precision on reservation request > > > Key: MESOS-3552 > URL: https://issues.apache.org/jira/browse/MESOS-3552 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Mandeep Chadha >Assignee: Mandeep Chadha > Labels: mesosphere, tech-debt > > result.cpus() == cpus() check is failing due to ( double == double ) > comparison problem. > Root Cause : > Framework requested 0.1 cpu reservation for the first task. So far so good. > Next Reserve operation — lead to double operations resulting in following > double values : > results.cpus() : 23.9964472863211995 cpus() : 24 > And the check ( result.cpus() == cpus() ) failed. > The double arithmetic operations caused results.cpus() value to be : > 23.9964472863211995 and hence ( 23.9964472863211995 > == 24 ) failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)