Re: Review Request 26133: Modify configure.ac to fix --with-sasl.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26133/#review56492 --- Ship it! Ship It! - Benjamin Hindman On Sept. 29, 2014, 9:22 p.m., Michael Park wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26133/ --- (Updated Sept. 29, 2014, 9:22 p.m.) Review request for mesos, Cody Maloney, Timothy Chen, and Vinod Kone. Bugs: MESOS-1839 https://issues.apache.org/jira/browse/MESOS-1839 Repository: mesos-git Description --- Fixes MESOS-1839. Diffs - configure.ac 86d448c3ad00ad01d3d069c1039dc7ad524af567 Diff: https://reviews.apache.org/r/26133/diff/ Testing --- Built `mesos` with a custom `libsasl2` directory. Thanks, Michael Park
Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2165
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2165/changes Changes: [adam] Enabled whitespace/semicolon rule for cpplint. -- [...truncated 14642 lines...] I1014 06:10:07.862639 25664 status_update_manager.cpp:167] New master detected at master@67.195.81.187:33243 I1014 06:10:07.862673 25671 slave.cpp:647] Detecting new master I1014 06:10:07.862689 25663 authenticatee.hpp:133] Creating new client SASL connection I1014 06:10:07.862814 25669 master.cpp:3787] Authenticating slave(48)@67.195.81.187:33243 I1014 06:10:07.862895 25663 authenticator.hpp:161] Creating new server SASL connection I1014 06:10:07.862977 25657 authenticatee.hpp:224] Received SASL authentication mechanisms: CRAM-MD5 I1014 06:10:07.862997 25657 authenticatee.hpp:250] Attempting to authenticate with mechanism 'CRAM-MD5' I1014 06:10:07.863054 25660 authenticator.hpp:267] Received SASL authentication start I1014 06:10:07.863095 25660 authenticator.hpp:389] Authentication requires more steps I1014 06:10:07.863138 25660 authenticatee.hpp:270] Received SASL authentication step I1014 06:10:07.863205 25658 authenticator.hpp:295] Received SASL authentication step I1014 06:10:07.863230 25658 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'pomona.apache.org' server FQDN: 'pomona.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false I1014 06:10:07.863239 25658 auxprop.cpp:153] Looking up auxiliary property '*userPassword' I1014 06:10:07.863248 25658 auxprop.cpp:153] Looking up auxiliary property '*cmusaslsecretCRAM-MD5' I1014 06:10:07.863258 25658 auxprop.cpp:81] Request to lookup properties for user: 'test-principal' realm: 'pomona.apache.org' server FQDN: 'pomona.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true I1014 06:10:07.863265 25658 auxprop.cpp:103] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true I1014 06:10:07.863270 25658 auxprop.cpp:103] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true I1014 06:10:07.863301 25658 authenticator.hpp:381] Authentication success I1014 06:10:07.863358 25663 authenticatee.hpp:310] Authentication success I1014 06:10:07.863401 25658 master.cpp:3827] Successfully authenticated principal 'test-principal' at slave(48)@67.195.81.187:33243 I1014 06:10:07.863495 25663 slave.cpp:731] Successfully authenticated with master master@67.195.81.187:33243 I1014 06:10:07.863564 25663 slave.cpp:1048] Will retry registration in 6.990899ms if necessary I1014 06:10:07.863620 25671 master.cpp:2968] Registering slave at slave(48)@67.195.81.187:33243 (pomona.apache.org) with id 20141014-061007-3142697795-33243-25643-1 I1014 06:10:07.863755 25671 registrar.cpp:445] Applied 1 operations in 16652ns; attempting to update the 'registry' I1014 06:10:07.865200 25673 log.cpp:680] Attempting to append 487 bytes to the log I1014 06:10:07.865267 25663 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 5 I1014 06:10:07.865694 25660 replica.cpp:508] Replica received write request for position 5 I1014 06:10:07.865845 25660 leveldb.cpp:343] Persisting action (506 bytes) to leveldb took 128557ns I1014 06:10:07.865862 25660 replica.cpp:676] Persisted action at 5 I1014 06:10:07.866153 25665 replica.cpp:655] Replica received learned notice for position 5 I1014 06:10:07.85 25665 leveldb.cpp:343] Persisting action (508 bytes) to leveldb took 490577ns I1014 06:10:07.866682 25665 replica.cpp:676] Persisted action at 5 I1014 06:10:07.866688 25665 replica.cpp:661] Replica learned APPEND action at position 5 I1014 06:10:07.867045 25673 registrar.cpp:490] Successfully updated the 'registry' in 3.266816ms I1014 06:10:07.867125 25668 log.cpp:699] Attempting to truncate the log to 5 I1014 06:10:07.867202 25667 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 6 I1014 06:10:07.867260 25672 slave.cpp:2403] Received ping from slave-observer(42)@67.195.81.187:33243 I1014 06:10:07.867429 25672 slave.cpp:765] Registered with master master@67.195.81.187:33243; given slave ID 20141014-061007-3142697795-33243-25643-1 I1014 06:10:07.867254 25671 master.cpp:3022] Registered slave 20141014-061007-3142697795-33243-25643-1 at slave(48)@67.195.81.187:33243 (pomona.apache.org) with cpus(*):1; mem(*):512; disk(*):0; ports(*):[31000-32000] I1014 06:10:07.867511 25660 hierarchical_allocator_process.hpp:442] Added slave 20141014-061007-3142697795-33243-25643-1 (pomona.apache.org) with cpus(*):1; mem(*):512; disk(*):0; ports(*):[31000-32000] (and cpus(*):1; mem(*):512; disk(*):0; ports(*):[31000-32000] available) I1014 06:10:07.867588 25660 hierarchical_allocator_process.hpp:734] Offering cpus(*):1; mem(*):512; disk(*):0; ports(*):[31000-32000] on slave 20141014-061007-3142697795-33243
Re: Review Request 25184: Delete framework data in TaskStatus to avoid OOM
On Sept. 26, 2014, 9:47 a.m., Timothy Chen wrote: src/master/master.cpp, line 3181 https://reviews.apache.org/r/25184/diff/2/?file=681985#file681985line3181 Period in the end of the comment. Chengwei Yang wrote: I'm not sure if I understand you correctly, if not please correct me. Did you mean that it's better if some comments about how often, how long mesos-master will be killed by OOM killer? If so, the answer as we observed is that when we running spark jobs, every task stored about 17MB data in TaskStatus and a small spark job consists of several thousands of tasks, so it can not finish the job if the leader mesos-master running on a machine with memory small than 10GB. I'll give a common example OOM scenario in comment. To be honest, I think Tim just wanted you to add a '.' character at the end of the comment, to satisfy our style guidelines. - Adam --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25184/#review54695 --- On Oct. 9, 2014, 7 a.m., Chengwei Yang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25184/ --- (Updated Oct. 9, 2014, 7 a.m.) Review request for mesos, Adam B and Timothy St. Clair. Bugs: MESOS-1746 https://issues.apache.org/jira/browse/MESOS-1746 Repository: mesos-git Description --- There was a bug found that Spark use TaskStatus.data to transfer computed result and mesos-master RES memory keeps increasing fast and finally will be killed by OOM killer. Diffs - src/master/master.cpp cb46cec0674b3aa031450c5b4f48f4f8bb92767d Diff: https://reviews.apache.org/r/25184/diff/ Testing --- tested with spark Thanks, Chengwei Yang
Re: Review Request 26473: libprocess: Always use stout ABORT() rather than system abort()
On Oct. 9, 2014, 2:14 a.m., Adam B wrote: 3rdparty/libprocess/src/httpd.cpp, lines 32-37 https://reviews.apache.org/r/26473/diff/1/?file=716312#file716312line32 Why are you removing this here? Doesn't seem related to the abort change. Or did you just notice that these are completely unused? Cody Maloney wrote: They were using abort(), so they fell into the category of upgrade or remove. They weren't used at all, so I just removed them rather than maintain the old cruft. Would you prefer I pull it out into a seperate patch next time? Reasonable explanation. I think it should be fine to include in this patch, but others might have preferred a separate patch. - Adam --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26473/#review55963 --- On Oct. 9, 2014, 9:38 a.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26473/ --- (Updated Oct. 9, 2014, 9:38 a.m.) Review request for mesos, Adam B and Dominic Hamon. Bugs: MESOS-1870 https://issues.apache.org/jira/browse/MESOS-1870 Repository: mesos-git Description --- This makes it so any time there is an abort, we get a line number and at least a basic message as to why there was an abort. If you want a clean(er) exit, use stout/exit. Diffs - 3rdparty/libprocess/include/process/event.hpp bf689d7270df2c8f1f5c9165d2bbcfdca06e15a8 3rdparty/libprocess/include/process/http.hpp d5407755a51a6edf779b2d219b4d81a90c3af2f8 3rdparty/libprocess/include/process/socket.hpp dbcb4f4c2eb12663158057a844b4511d6dde0508 3rdparty/libprocess/src/httpd.cpp eab3aa5f1c74cfc211b0efcc40f984222c85785c 3rdparty/libprocess/src/synchronized.hpp 70f6cd06825ac7bde5e45f2a900d2b2659e02b6e Diff: https://reviews.apache.org/r/26473/diff/ Testing --- make distcheck Thanks, Cody Maloney
Re: Review Request 26472: stout: Always use stout ABORT() rather than system abort()
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26472/#review56498 --- Ship it! Minor consistency nit, but otherwise quite shippable. 3rdparty/libprocess/3rdparty/stout/include/stout/thread.hpp https://reviews.apache.org/r/26472/#comment96869 Not sure why you didn't wrap strerror with std::string() in all these cases. - Adam B On Oct. 9, 2014, 5:46 p.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26472/ --- (Updated Oct. 9, 2014, 5:46 p.m.) Review request for mesos, Adam B and Dominic Hamon. Bugs: MESOS-1870 https://issues.apache.org/jira/browse/MESOS-1870 Repository: mesos-git Description --- This makes it so any time there is an abort, we get a line number and at least a basic message as to why there was an abort. If you want a clean(er) exit, use stout/exit. Also adds an overload which takes a standard string and unwraps it to a const char * automatically, since a lot of the time we are building strings to pass them to abort. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/abort.hpp 6b5b5d1faa488cb3048f6d59bae17123b1b05a33 3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp 9d244b2275f7ef84e8c486f2090b67a57ca8c670 3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 7138bc24912f35bfd04d2341b2218ea349ab40b8 3rdparty/libprocess/3rdparty/stout/include/stout/os/fork.hpp 8aa21ed10293c3dd7f55b7d30f6014bd05fd7455 3rdparty/libprocess/3rdparty/stout/include/stout/protobuf.hpp ccf80a771c59092cb72d4dac0d868eec7df2f088 3rdparty/libprocess/3rdparty/stout/include/stout/result.hpp ce8dd9b6468fb36daceba60d8b43bf2fbfceab6c 3rdparty/libprocess/3rdparty/stout/include/stout/stringify.hpp ed0a1ef2b617bd87261bf4836706cedb80fdf043 3rdparty/libprocess/3rdparty/stout/include/stout/thread.hpp b1af74f517da75525d0dedacaf4a1c6f8cbf55f4 3rdparty/libprocess/3rdparty/stout/include/stout/try.hpp 87c5fc8391d5213f0d76fa5980176726a0366f56 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 9207c551170b747d25207efe7d3c03675b184953 Diff: https://reviews.apache.org/r/26472/diff/ Testing --- make distcheck Thanks, Cody Maloney
Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2439
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2439/changes Changes: [adam] NOLINT a few operator overloads, to enable whitespace/operators [adam] Enabled whitespace/operators rule for cpplint. [benjamin.hindman] Modify configure.ac to fix --with-sasl. -- [...truncated 60828 lines...] I1014 07:45:16.772703 29155 gc.cpp:56] Scheduling '/tmp/StatusUpdateManagerTest_DuplicateTerminalUpdateAfterAck_Ptf7B0/meta/slaves/20141014-074516-3142697795-60149-29137-0/frameworks/20141014-074516-3142697795-60149-29137-' for gc 6.9105812444days in the future [ OK ] StatusUpdateManagerTest.DuplicateTerminalUpdateAfterAck (743 ms) [ RUN ] StatusUpdateManagerTest.CheckpointStatusUpdate Using temporary directory '/tmp/StatusUpdateManagerTest_CheckpointStatusUpdate_CBiQ2x' I1014 07:45:16.871641 29137 leveldb.cpp:176] Opened db in 2.303204ms I1014 07:45:16.872565 29137 leveldb.cpp:183] Compacted db in 896783ns I1014 07:45:16.872586 29137 leveldb.cpp:198] Created db iterator in 4827ns I1014 07:45:16.872596 29137 leveldb.cpp:204] Seeked to beginning of db in 610ns I1014 07:45:16.872609 29137 leveldb.cpp:273] Iterated through 0 keys in the db in 362ns I1014 07:45:16.872632 29137 replica.cpp:741] Replica recovered with log positions 0 - 0 with 1 holes and 0 unlearned I1014 07:45:16.872802 29158 recover.cpp:425] Starting replica recovery I1014 07:45:16.872874 29158 recover.cpp:451] Replica is in EMPTY status I1014 07:45:16.873253 29160 replica.cpp:638] Replica in EMPTY status received a broadcasted recover request I1014 07:45:16.873345 29152 recover.cpp:188] Received a recover response from a replica in EMPTY status I1014 07:45:16.873622 29163 recover.cpp:542] Updating replica status to STARTING I1014 07:45:16.874006 29152 master.cpp:312] Master 20141014-074516-3142697795-60149-29137 (pomona.apache.org) started on 67.195.81.187:60149 I1014 07:45:16.874037 29152 master.cpp:358] Master only allowing authenticated frameworks to register I1014 07:45:16.874044 29152 master.cpp:363] Master only allowing authenticated slaves to register I1014 07:45:16.874055 29152 credentials.hpp:36] Loading credentials for authentication from '/tmp/StatusUpdateManagerTest_CheckpointStatusUpdate_CBiQ2x/credentials' I1014 07:45:16.874155 29154 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 452350ns I1014 07:45:16.874168 29154 replica.cpp:320] Persisted replica status to STARTING I1014 07:45:16.874172 29152 master.cpp:392] Authorization enabled I1014 07:45:16.874248 29156 recover.cpp:451] Replica is in STARTING status I1014 07:45:16.874403 29156 master.cpp:120] No whitelist given. Advertising offers for all slaves I1014 07:45:16.874400 29154 hierarchical_allocator_process.hpp:299] Initializing hierarchical allocator process with master : master@67.195.81.187:60149 I1014 07:45:16.874639 29164 replica.cpp:638] Replica in STARTING status received a broadcasted recover request I1014 07:45:16.874743 29152 master.cpp:1242] The newly elected leader is master@67.195.81.187:60149 with id 20141014-074516-3142697795-60149-29137 I1014 07:45:16.874761 29152 master.cpp:1255] Elected as the leading master! I1014 07:45:16.874770 29152 master.cpp:1073] Recovering from registrar I1014 07:45:16.874790 29161 recover.cpp:188] Received a recover response from a replica in STARTING status I1014 07:45:16.874898 29152 recover.cpp:542] Updating replica status to VOTING I1014 07:45:16.874954 29160 registrar.cpp:313] Recovering registrar I1014 07:45:16.875218 29153 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 235769ns I1014 07:45:16.875231 29153 replica.cpp:320] Persisted replica status to VOTING I1014 07:45:16.875267 29164 recover.cpp:556] Successfully joined the Paxos group I1014 07:45:16.875347 29164 recover.cpp:440] Recover process terminated I1014 07:45:16.875480 29166 log.cpp:656] Attempting to start the writer I1014 07:45:16.875936 29161 replica.cpp:474] Replica received implicit promise request with proposal 1 I1014 07:45:16.876178 29161 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 228853ns I1014 07:45:16.876191 29161 replica.cpp:342] Persisted promised to 1 I1014 07:45:16.876405 29160 coordinator.cpp:230] Coordinator attemping to fill missing position I1014 07:45:16.876824 29152 replica.cpp:375] Replica received explicit promise request for position 0 with proposal 2 I1014 07:45:16.877068 29152 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 226122ns I1014 07:45:16.877080 29152 replica.cpp:676] Persisted action at 0 I1014 07:45:16.877507 29161 replica.cpp:508] Replica received write request for position 0 I1014 07:45:16.877531 29161 leveldb.cpp:438] Reading position from leveldb took 10921ns I1014 07:45:16.877768 29161 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 222715ns I1014 07:45:16.89 29161 replica.cpp:676] Persisted action at 0
Jenkins build is back to normal : Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2166
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2166/changes
Jenkins build is back to normal : Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2440
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2440/changes
Re: Review Request 26605: Cleanup right angle bracket in the code base.
On Oct. 13, 2014, 9:17 a.m., Till Toenshoff wrote: Thanks for proposing this patch Evelina. Could you please split this into seperate patches for stout, libprocess and mesos itself? What do you mean by three separate patches? I suppose three different commits ... It is not very clear to me how to organize them. As far as I can see stout is included in libprocess. - Evelina --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26605/#review56383 --- On Oct. 11, 2014, 6:31 p.m., Evelina Dumitrescu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26605/ --- (Updated Oct. 11, 2014, 6:31 p.m.) Review request for mesos and Jie Yu. Bugs: MESOS-1872 https://issues.apache.org/jira/browse/MESOS-1872 Repository: mesos-git Description --- Replaced with to be consistent with the C++ 11 standarad. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp 37e007e9ee3010d9aacf2de6254abb75b716412d 3rdparty/libprocess/3rdparty/stout/include/stout/flags.hpp ab609a0d7922298eaf89648200fabd843f8201e0 3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp 9d244b2275f7ef84e8c486f2090b67a57ca8c670 3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp 89e77bf75c642c00ac9e0d36a3dbe71838e0efab 3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 719aa964a536cf02dbd2de440157de487ec703b1 3rdparty/libprocess/3rdparty/stout/include/stout/linkedhashmap.hpp 89340fd79167a311c698740454db19e83c286952 3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 7138bc24912f35bfd04d2341b2218ea349ab40b8 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp ec41fe412290a152a15a49e091f0bc6436e42cfd 3rdparty/libprocess/3rdparty/stout/include/stout/os/killtree.hpp fa1950cc8849fcd81d425c651848e0566078 3rdparty/libprocess/3rdparty/stout/include/stout/os/linux.hpp 7a5573be8d8be537e5fd85b86e322fd5765c54ab 3rdparty/libprocess/3rdparty/stout/include/stout/os/ls.hpp 7cd4a40f57a661d83d1726c65fd8c898072e809b 3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp 7d17249cf1d14e9344112dd7fef7230c2ad8 3rdparty/libprocess/3rdparty/stout/include/stout/os/pstree.hpp e7fe077f763518680ba13d609dba9ea26950210d 3rdparty/libprocess/3rdparty/stout/include/stout/os/setns.hpp 5278996f201a4a3d69282c1bd7b0d230d0f6cd39 3rdparty/libprocess/3rdparty/stout/include/stout/os/sysctl.hpp a1e6e4d131e15c31135d72399e5e6825e7dd0141 3rdparty/libprocess/3rdparty/stout/include/stout/proc.hpp 0004fa5cd249269504155e51e52cd802c0916851 3rdparty/libprocess/3rdparty/stout/include/stout/protobuf.hpp ccf80a771c59092cb72d4dac0d868eec7df2f088 3rdparty/libprocess/3rdparty/stout/include/stout/strings.hpp 01e47ff21574d22f9469ab3bafd458322f6849c9 3rdparty/libprocess/3rdparty/stout/tests/flags_tests.cpp 3b60ff824bada441cd657e10ca01762085a6d099 3rdparty/libprocess/3rdparty/stout/tests/multimap_tests.cpp 79e720040e55de5aa68511d65cacf3952a57081a 3rdparty/libprocess/3rdparty/stout/tests/net_tests.cpp 425132e5d7c3770be4a5a39feea5a2f22179b871 3rdparty/libprocess/3rdparty/stout/tests/option_tests.cpp 7ae3b8ffc5df7f8442e72b1a10d50c3f5c373d8c 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 9207c551170b747d25207efe7d3c03675b184953 3rdparty/libprocess/3rdparty/stout/tests/proc_tests.cpp 56a63e8f5d809f03c70c60d4f02cc500d06ff96e 3rdparty/libprocess/3rdparty/stout/tests/some_tests.cpp 4041dc47a12c9540dc8d23f083e783e68f0a1af8 3rdparty/libprocess/3rdparty/stout/tests/strings_tests.cpp b90057972d22123d655697425581cc2767ec825d 3rdparty/libprocess/include/process/async.hpp 9af3cc07334eb7dc26b73964c447cc9b9799396c 3rdparty/libprocess/include/process/collect.hpp 3bee8a6a0b7b48627ebc11c8b950b8582bba2c06 3rdparty/libprocess/include/process/defer.hpp ebe6f2db47cab2a3306288d8ebabb720e7cd8976 3rdparty/libprocess/include/process/delay.hpp 487f652c9e9b7c8c3aa8b4edc9e59789cffd8da9 3rdparty/libprocess/include/process/dispatch.hpp bceda2a2ae8963921e8e19660243a8644feab227 3rdparty/libprocess/include/process/event.hpp bf689d7270df2c8f1f5c9165d2bbcfdca06e15a8 3rdparty/libprocess/include/process/future.hpp 46ae16b0bbce79005f5ed8711be663eeeb8f4bcf 3rdparty/libprocess/include/process/help.hpp 07e99f1124c60003f4c8b8e6cdcb193bb3577496 3rdparty/libprocess/include/process/http.hpp d5407755a51a6edf779b2d219b4d81a90c3af2f8 3rdparty/libprocess/include/process/metrics/metric.hpp 00ace49bc5ef3914704c6247ece9947960c9 3rdparty/libprocess/include/process/metrics/metrics.hpp
Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2167
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2167/changes Changes: [adam] stout: Always use stout ABORT() rather than system abort() [adam] libprocess: Always use stout ABORT() rather than system abort() -- [...truncated 66370 lines...] 2014-10-14 10:03:30,154:24475(0x2b98ed832700):ZOO_INFO@check_events@1703: initiated connection to server [127.0.0.1:36625] 2014-10-14 10:03:30,157:24475(0x2b98ed832700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:36625], sessionId=0x1490e1cd1e7, negotiated timeout=1 I1014 10:03:30.157840 24493 group.cpp:313] Group process (group(53)@67.195.81.187:43531) connected to ZooKeeper I1014 10:03:30.157867 24493 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (1, 0, 0) I1014 10:03:30.157874 24493 group.cpp:385] Trying to create path '/test' in ZooKeeper I1014 10:03:30.163807 24496 group.cpp:659] Trying to get '/test/00' in ZooKeeper I1014 10:03:30.164966 24475 zookeeper_test_server.cpp:118] Shutting down ZooKeeperTestServer on port 36625 2014-10-14 10:03:30,165:24475(0x2b98ed832700):ZOO_ERROR@handle_socket_error_msg@1721: Socket [127.0.0.1:36625] zk retcode=-4, errno=112(Host is down): failed while receiving a server response I1014 10:03:30.166270 24502 group.cpp:418] Lost connection to ZooKeeper, attempting to reconnect ... I1014 10:03:30.167207 24493 group.cpp:619] Trying to remove '/test/00' in ZooKeeper 2014-10-14 10:03:30,167:24475(0x2b98ed832700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:36625] zk retcode=-4, errno=111(Connection refused): server refused to accept the client I1014 10:03:30.170656 24475 zookeeper_test_server.cpp:158] Started ZooKeeperTestServer on port 36625 2014-10-14 10:03:31,150:24475(0x2b98ece2d700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:32834] zk retcode=-4, errno=111(Connection refused): server refused to accept the client I1014 10:03:32.168455 24502 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 1, 0) I1014 10:03:32.168524 24502 group.cpp:619] Trying to remove '/test/00' in ZooKeeper 2014-10-14 10:03:32,168:24475(0x2b98ed832700):ZOO_INFO@check_events@1703: initiated connection to server [127.0.0.1:36625] 2014-10-14 10:03:32,170:24475(0x2b98ed832700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:36625], sessionId=0x1490e1cd1e7, negotiated timeout=1 I1014 10:03:32.175796 24502 group.cpp:313] Group process (group(53)@67.195.81.187:43531) reconnected to ZooKeeper I1014 10:03:32.175847 24502 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) 2014-10-14 10:03:32,177:24475(0x2b9560f0ff80):ZOO_INFO@zookeeper_close@2505: Closing zookeeper sessionId=0x1490e1cd1e7 to [127.0.0.1:36625] I1014 10:03:32.177876 24475 zookeeper_test_server.cpp:118] Shutting down ZooKeeperTestServer on port 36625 [ OK ] GroupTest.GroupCancelWithDisconnect (2029 ms) [ RUN ] GroupTest.GroupDataWithRemovedMembership I1014 10:03:32.181936 24475 zookeeper_test_server.cpp:158] Started ZooKeeperTestServer on port 39430 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@log_env@716: Client environment:host.name=pomona.apache.org 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@log_env@723: Client environment:os.name=Linux 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@log_env@724: Client environment:os.arch=3.13.0-36-lowlatency 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@log_env@725: Client environment:os.version=#63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@log_env@733: Client environment:user.name=jenkins 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@log_env@741: Client environment:user.home=/home/jenkins 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@log_env@753: Client environment:user.dir=https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/ws/src 2014-10-14 10:03:32,182:24475(0x2b9564dc3700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=127.0.0.1:39430 sessionTimeout=1 watcher=0x2b9561a36550 sessionId=0 sessionPasswd=null context=0x2b99180a2ff0 flags=0 2014-10-14 10:03:32,182:24475(0x2b98ed832700):ZOO_INFO@check_events@1703: initiated connection to server [127.0.0.1:39430] 2014-10-14 10:03:32,185:24475(0x2b98ed832700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:39430], sessionId=0x1490e1cd9d5, negotiated timeout=1 I1014 10:03:32.186179 24496 group.cpp:313] Group process (group(54)@67.195.81.187:43531) connected to ZooKeeper I1014 10:03:32.186216 24496 group.cpp:790] Syncing group
Re: Review Request 26634: Implemented array subscript lookup in JSON::Object::find.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26634/#review56504 --- It looks to me that adding this functionality imposes a restriction on key names in JSON. We won't be able to do something like this: object.values[name_[en]_full] = Name; auto result = object.findJSON::String(name_[en]_full); I don't think it is a big deal or we would need it, but maybe let's make this restriction explicit by introducing a vaidation function. 3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp https://reviews.apache.org/r/26634/#comment96872 How about a one-liner with one intermediate object less? `std::string subscript = name.substr(index + 1, name.length() - index - 2);` - Alexander Rukletsov On Oct. 13, 2014, 4:54 a.m., Benjamin Hindman wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26634/ --- (Updated Oct. 13, 2014, 4:54 a.m.) Review request for mesos, Alexander Rukletsov and Ben Mahler. Repository: mesos-git Description --- See summary. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 719aa964a536cf02dbd2de440157de487ec703b1 3rdparty/libprocess/3rdparty/stout/tests/json_tests.cpp 3bfc8e639185323a3809ca82c09428b3b1b8afe5 Diff: https://reviews.apache.org/r/26634/diff/ Testing --- make check Thanks, Benjamin Hindman
Re: Review Request 26605: Cleanup right angle bracket in the code base.
On Oct. 13, 2014, 9:17 a.m., Till Toenshoff wrote: Thanks for proposing this patch Evelina. Could you please split this into seperate patches for stout, libprocess and mesos itself? Evelina Dumitrescu wrote: What do you mean by three separate patches? I suppose three different commits ... It is not very clear to me how to organize them. As far as I can see stout is included in libprocess. Yes, three different review proposals / aka patches. - Till --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26605/#review56383 --- On Oct. 11, 2014, 6:31 p.m., Evelina Dumitrescu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26605/ --- (Updated Oct. 11, 2014, 6:31 p.m.) Review request for mesos and Jie Yu. Bugs: MESOS-1872 https://issues.apache.org/jira/browse/MESOS-1872 Repository: mesos-git Description --- Replaced with to be consistent with the C++ 11 standarad. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp 37e007e9ee3010d9aacf2de6254abb75b716412d 3rdparty/libprocess/3rdparty/stout/include/stout/flags.hpp ab609a0d7922298eaf89648200fabd843f8201e0 3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp 9d244b2275f7ef84e8c486f2090b67a57ca8c670 3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp 89e77bf75c642c00ac9e0d36a3dbe71838e0efab 3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 719aa964a536cf02dbd2de440157de487ec703b1 3rdparty/libprocess/3rdparty/stout/include/stout/linkedhashmap.hpp 89340fd79167a311c698740454db19e83c286952 3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 7138bc24912f35bfd04d2341b2218ea349ab40b8 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp ec41fe412290a152a15a49e091f0bc6436e42cfd 3rdparty/libprocess/3rdparty/stout/include/stout/os/killtree.hpp fa1950cc8849fcd81d425c651848e0566078 3rdparty/libprocess/3rdparty/stout/include/stout/os/linux.hpp 7a5573be8d8be537e5fd85b86e322fd5765c54ab 3rdparty/libprocess/3rdparty/stout/include/stout/os/ls.hpp 7cd4a40f57a661d83d1726c65fd8c898072e809b 3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp 7d17249cf1d14e9344112dd7fef7230c2ad8 3rdparty/libprocess/3rdparty/stout/include/stout/os/pstree.hpp e7fe077f763518680ba13d609dba9ea26950210d 3rdparty/libprocess/3rdparty/stout/include/stout/os/setns.hpp 5278996f201a4a3d69282c1bd7b0d230d0f6cd39 3rdparty/libprocess/3rdparty/stout/include/stout/os/sysctl.hpp a1e6e4d131e15c31135d72399e5e6825e7dd0141 3rdparty/libprocess/3rdparty/stout/include/stout/proc.hpp 0004fa5cd249269504155e51e52cd802c0916851 3rdparty/libprocess/3rdparty/stout/include/stout/protobuf.hpp ccf80a771c59092cb72d4dac0d868eec7df2f088 3rdparty/libprocess/3rdparty/stout/include/stout/strings.hpp 01e47ff21574d22f9469ab3bafd458322f6849c9 3rdparty/libprocess/3rdparty/stout/tests/flags_tests.cpp 3b60ff824bada441cd657e10ca01762085a6d099 3rdparty/libprocess/3rdparty/stout/tests/multimap_tests.cpp 79e720040e55de5aa68511d65cacf3952a57081a 3rdparty/libprocess/3rdparty/stout/tests/net_tests.cpp 425132e5d7c3770be4a5a39feea5a2f22179b871 3rdparty/libprocess/3rdparty/stout/tests/option_tests.cpp 7ae3b8ffc5df7f8442e72b1a10d50c3f5c373d8c 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 9207c551170b747d25207efe7d3c03675b184953 3rdparty/libprocess/3rdparty/stout/tests/proc_tests.cpp 56a63e8f5d809f03c70c60d4f02cc500d06ff96e 3rdparty/libprocess/3rdparty/stout/tests/some_tests.cpp 4041dc47a12c9540dc8d23f083e783e68f0a1af8 3rdparty/libprocess/3rdparty/stout/tests/strings_tests.cpp b90057972d22123d655697425581cc2767ec825d 3rdparty/libprocess/include/process/async.hpp 9af3cc07334eb7dc26b73964c447cc9b9799396c 3rdparty/libprocess/include/process/collect.hpp 3bee8a6a0b7b48627ebc11c8b950b8582bba2c06 3rdparty/libprocess/include/process/defer.hpp ebe6f2db47cab2a3306288d8ebabb720e7cd8976 3rdparty/libprocess/include/process/delay.hpp 487f652c9e9b7c8c3aa8b4edc9e59789cffd8da9 3rdparty/libprocess/include/process/dispatch.hpp bceda2a2ae8963921e8e19660243a8644feab227 3rdparty/libprocess/include/process/event.hpp bf689d7270df2c8f1f5c9165d2bbcfdca06e15a8 3rdparty/libprocess/include/process/future.hpp 46ae16b0bbce79005f5ed8711be663eeeb8f4bcf 3rdparty/libprocess/include/process/help.hpp 07e99f1124c60003f4c8b8e6cdcb193bb3577496 3rdparty/libprocess/include/process/http.hpp d5407755a51a6edf779b2d219b4d81a90c3af2f8 3rdparty/libprocess/include/process/metrics/metric.hpp
Re: Review Request 25184: Delete framework data in TaskStatus to avoid OOM
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25184/#review56526 --- Ship it! I tested locally but not to any great extent, and it passed my make check. Could you please elaborate on your testing in the review. - Timothy St. Clair On Oct. 9, 2014, 2 p.m., Chengwei Yang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25184/ --- (Updated Oct. 9, 2014, 2 p.m.) Review request for mesos, Adam B and Timothy St. Clair. Bugs: MESOS-1746 https://issues.apache.org/jira/browse/MESOS-1746 Repository: mesos-git Description --- There was a bug found that Spark use TaskStatus.data to transfer computed result and mesos-master RES memory keeps increasing fast and finally will be killed by OOM killer. Diffs - src/master/master.cpp cb46cec0674b3aa031450c5b4f48f4f8bb92767d Diff: https://reviews.apache.org/r/25184/diff/ Testing --- tested with spark Thanks, Chengwei Yang
Re: Review Request 25184: Delete framework data in TaskStatus to avoid OOM
On Oct. 14, 2014, 2:21 p.m., Timothy St. Clair wrote: I tested locally but not to any great extent, and it passed my make check. Could you please elaborate on your testing in the review. On second ship-it + updated test comment I'll push post haste. - Timothy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25184/#review56526 --- On Oct. 9, 2014, 2 p.m., Chengwei Yang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25184/ --- (Updated Oct. 9, 2014, 2 p.m.) Review request for mesos, Adam B and Timothy St. Clair. Bugs: MESOS-1746 https://issues.apache.org/jira/browse/MESOS-1746 Repository: mesos-git Description --- There was a bug found that Spark use TaskStatus.data to transfer computed result and mesos-master RES memory keeps increasing fast and finally will be killed by OOM killer. Diffs - src/master/master.cpp cb46cec0674b3aa031450c5b4f48f4f8bb92767d Diff: https://reviews.apache.org/r/25184/diff/ Testing --- tested with spark Thanks, Chengwei Yang
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review56533 --- The latest diff looks broken - Thomas, mind updating it (if you still want this to go in)? - Niklas Nielsen On Jan. 29, 2014, 12:51 p.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 29, 2014, 12:51 p.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - CHANGELOG e75a3411f865cb7f3768df1299f469f49c3a0009 bin/lldb-mesos-local.sh.in 35011fd483e477701efd7e204b514bb362713ccb bin/lldb-mesos-master.sh.in b1c7f9f1b98b5f410729f5a7e7a1729709f7e744 bin/lldb-mesos-slave.sh.in 896c411b2b05d3c4a14288002520a5391a88d955 bin/lldb-mesos-tests.sh.in f001b0b7f35839a101a86cd7df86fb7ebfc1c47e configure.ac 18bf4bfb345bdd443defccc4e53d357b35c7b533 docs/upgrades.md fe8b60470f7431accef44977e7036a2688289037 src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/log/catchup.cpp 4ee32f285f77eb2de661e22a301b743bb8a06f9c src/master/http.cpp fb15483953a593bfec4e60884219dc8c4e8d565c src/slave/http.cpp c4f598faf6807214608cc89a6d9cf665133f95f3 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Connection Issue Alert https://reviews.apache.org/media/uploaded/files/2014/01/28/dee8df12-0bae-48b5-a7ce-c07e0266c790__Screenshot_2014-01-28_12.44.53.png Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review56536 --- Bad patch! Reviews applied: [17431] Failed command: git apply --index 17431.patch Error: error: patch failed: CHANGELOG:1 error: CHANGELOG: patch does not apply error: patch failed: configure.ac:18 error: configure.ac: patch does not apply error: patch failed: docs/upgrades.md:5 error: docs/upgrades.md: patch does not apply error: patch failed: src/log/catchup.cpp:207 error: src/log/catchup.cpp: patch does not apply error: patch failed: src/master/http.cpp:325 error: src/master/http.cpp: patch does not apply error: patch failed: src/slave/http.cpp:285 error: src/slave/http.cpp: patch does not apply 17431.patch:456: new blank line at EOF. + - Mesos ReviewBot On Jan. 29, 2014, 8:51 p.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 29, 2014, 8:51 p.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - CHANGELOG e75a3411f865cb7f3768df1299f469f49c3a0009 bin/lldb-mesos-local.sh.in 35011fd483e477701efd7e204b514bb362713ccb bin/lldb-mesos-master.sh.in b1c7f9f1b98b5f410729f5a7e7a1729709f7e744 bin/lldb-mesos-slave.sh.in 896c411b2b05d3c4a14288002520a5391a88d955 bin/lldb-mesos-tests.sh.in f001b0b7f35839a101a86cd7df86fb7ebfc1c47e configure.ac 18bf4bfb345bdd443defccc4e53d357b35c7b533 docs/upgrades.md fe8b60470f7431accef44977e7036a2688289037 src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/log/catchup.cpp 4ee32f285f77eb2de661e22a301b743bb8a06f9c src/master/http.cpp fb15483953a593bfec4e60884219dc8c4e8d565c src/slave/http.cpp c4f598faf6807214608cc89a6d9cf665133f95f3 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Connection Issue Alert https://reviews.apache.org/media/uploaded/files/2014/01/28/dee8df12-0bae-48b5-a7ce-c07e0266c790__Screenshot_2014-01-28_12.44.53.png Thanks, Thomas Rampelberg
Re: Review Request 26670: Added reconciliation to the documentation home.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26670/#review56537 --- Ship it! Sounds great! - Dominic Hamon On Oct. 13, 2014, 5:36 p.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26670/ --- (Updated Oct. 13, 2014, 5:36 p.m.) Review request for mesos, Benjamin Hindman, Niklas Nielsen, and Vinod Kone. Bugs: MESOS-681 https://issues.apache.org/jira/browse/MESOS-681 Repository: mesos-git Description --- Hoping to have a section called Mesos Fundamentals, where we can place other fundamental documents, for example: Background / Idea Resource Offers Executors / Tasks Status Updates Thoughts? Diffs - docs/home.md 2efe37ed14c5fbc6dfc0d7a2587752493f157350 Diff: https://reviews.apache.org/r/26670/diff/ Testing --- N/A Thanks, Ben Mahler
Re: Framework testing in Mesos
Thanks to both of you. @David Idempotence (and functional style) will both mitigate the issue of testing. @Sharma #3 looks impressive and I hear the pain. Few questions: * Since you already have the state machine modeling, can't the scheduler actions also be modeled as a state machine transitions? * Having a spec for (in form of state machine or otherwise) scheduler looks important (and hard) goal. Mocking looks like a good thing. Is mocking general enough to become a library available to all, to enable *verifiably* correct scheduler behavior? Again thanks for sharing your thoughts. Thanks, Dharmesh On Mon, Oct 13, 2014 at 7:29 AM, David Greenberg dsg123456...@gmail.com wrote: Specifically with regards to the state of the framework due to callback ordering, we ensure that our framework is written in a functional style, so that all callbacks atomically transform the previous state to a new state. By doing this, we serialize all callbacks. At this point, you can do generative testing to create events and run them through your system. This, at least, makes #3 possible. For #4, we are pretty careful to choose idempotent writes into the DB and a DB that supports snapshot reads. This way, you can just use at-least-once semantics for easy-to-implement retries. If a write fails, you just crash, since that means your DB's completely down. Then we test by thinking through and discussing whether operations have this idempotency property and the simple retry logic independently. This starts to get at a way to manage #4 to avoid learning in production. On Sun, Oct 12, 2014 at 11:44 AM, Dharmesh Kakadia dhkaka...@gmail.com wrote: Thanks David. Taking state of the framework is an interesting design. I am assuming the scheduler is maintaining the state and then handing tasks on slaves. If that's the case, we can safely test executor (stateless - receiving event and returning appropriate status to the scheduler). You construct scheduler tests similarly by passing different states and event and observing the next state. This way you will be sure that your callbacks works fine in *isolation*. I would be concerned about the state of the framework in case of callback ordering (or re-execution) in *all possible scenarios*. Mocking is exactly what might uncover such bugs, but as you pointed out, I also think it would not be trivial for many frameworks. At a high-level it would be important to know for frameworks developers that, 1. executors are working fine in isolation on fresh start, implementing the feature. 2. executors are working fine when rescheduled/restarted/in presence of other executors. 3. scheduler is working fine in isolation. 4. scheduler is fine in the wild ( in presence of others/failures/checkpointing/...). 1 is easy to do traditionally. 2 is possible if your executors do not have side effect or using Docker etc. 3 and 4 are not easy to do. I think having support/library for testing scheduler is something all the framework writer would benefit from. Not having to think about communication between executors and scheduler is already a big plus, can we also make it easier for developers to test about their scheduler behaviour? Thoughts? I would love to hear thoughts from others. Thanks, Dharmesh On Sun, Oct 12, 2014 at 8:03 PM, David Greenberg dsg123456...@gmail.com wrote: For our frameworks, we don't tend to do much automated testing of the Mesos interface--instead, we construct the framework state, then send it a message, since our callbacks take the state of the framework + the event as the argument. This way, we don't need to have mesos running, and we can trim away large amounts of code necessary to connect to mesos but unnecessary for the actual feature under test. We've also been experimenting with simulation testing by mocking out the mesos APIs. These techniques are mostly effective when you can pretend that the executors you're using don't communicate much, or when they're trivial to mock. On Sun, Oct 12, 2014 at 9:42 AM, Dharmesh Kakadia dhkaka...@gmail.com wrote: Hi, I am working on a tiny experimental framework for Mesos. I was wondering what is the recommended way of writing testcases for framework testing. I looked at the several existing frameworks, but its still not clear to me. I understand that I might be able to test executor functionality in isolation through normal test cases, but testing as a whole framework is what I am unclear about. Suggestions? Is that a non-goal? How do other framework developers go about it? Also, on the related note, is there a way to debug frameworks in better way than sifting through logs? Thanks, Dharmesh
Re: Review Request 17082: Fixed display of table filter for empty tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17082/#review56538 --- Ship it! Ship It! - Tobi Knaup On Jan. 18, 2014, 1:05 a.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17082/ --- (Updated Jan. 18, 2014, 1:05 a.m.) Review request for mesos and Ross Allen. Repository: mesos-git Description --- Fixed display of table filter for empty tables Diffs - src/webui/master/static/directives/tableHeader.html 2acb6c8797e2aa8f8c2093f00ceaa52e22b497fd Diff: https://reviews.apache.org/r/17082/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17082: Fixed display of table filter for empty tables
On Oct. 14, 2014, 9:38 a.m., Tobi Knaup wrote: Ship It! Awesome - will land this today. - Niklas --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17082/#review56538 --- On Jan. 17, 2014, 5:05 p.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17082/ --- (Updated Jan. 17, 2014, 5:05 p.m.) Review request for mesos and Ross Allen. Repository: mesos-git Description --- Fixed display of table filter for empty tables Diffs - src/webui/master/static/directives/tableHeader.html 2acb6c8797e2aa8f8c2093f00ceaa52e22b497fd Diff: https://reviews.apache.org/r/17082/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 26669: Added a document for reconciliation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26669/#review56539 --- docs/reconciliation.md https://reviews.apache.org/r/26669/#comment96894 is there a link to other documentation that could come in here to describe the programming model in more depth? docs/reconciliation.md https://reviews.apache.org/r/26669/#comment96893 can we say here that it's the responsibility of the master and work toward that goal? docs/reconciliation.md https://reviews.apache.org/r/26669/#comment96895 if there's a jira ticket for this, please link it. docs/reconciliation.md https://reviews.apache.org/r/26669/#comment96896 it's unfortunate that we put the onus on framework developers to manage this algorithm and the exponential backoff. I wonder if we can move to a model where a reconcile request is made and then a stream of replies is sent including a terminal empty set. (this is speculative only, obviously.. i think the document is fine as it describes the current state). - Dominic Hamon On Oct. 13, 2014, 5:34 p.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26669/ --- (Updated Oct. 13, 2014, 5:34 p.m.) Review request for mesos, Benjamin Hindman, Niklas Nielsen, and Vinod Kone. Bugs: MESOS-681 https://issues.apache.org/jira/browse/MESOS-681 Repository: mesos-git Description --- Please see here for rendered markdown, will be easier to review: https://gist.github.com/bmahler/18409fc4f052df43f403 Please send your high level thoughts :) Diffs - docs/reconciliation.md PRE-CREATION Diff: https://reviews.apache.org/r/26669/diff/ Testing --- N/A Thanks, Ben Mahler
Re: Review Request 21277: Passed CommandInfo to mesos-fetcher as JSON.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21277/#review56549 --- LGTM but needs rebase. Would be sweet to get in: much more robust than magic encoding :-) - Niklas Nielsen On May 9, 2014, 12:05 p.m., Benjamin Hindman wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21277/ --- (Updated May 9, 2014, 12:05 p.m.) Review request for mesos, Ben Mahler, Dominic Hamon, and Tom Arnfeld. Bugs: MESOS-1248 https://issues.apache.org/jira/browse/MESOS-1248 Repository: mesos-git Description --- See summary (and bug). Diffs - src/launcher/fetcher.cpp 8c9e20da8f39eb5e90403a5093cbea7fb2680468 src/slave/fetcher.hpp PRE-CREATION src/tests/fetcher_tests.cpp PRE-CREATION Diff: https://reviews.apache.org/r/21277/diff/ Testing --- make check Thanks, Benjamin Hindman
Re: Review Request 17082: Fixed display of table filter for empty tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17082/#review56551 --- Patch looks great! Reviews applied: [17082] All tests passed. - Mesos ReviewBot On Jan. 18, 2014, 1:05 a.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17082/ --- (Updated Jan. 18, 2014, 1:05 a.m.) Review request for mesos and Ross Allen. Repository: mesos-git Description --- Fixed display of table filter for empty tables Diffs - src/webui/master/static/directives/tableHeader.html 2acb6c8797e2aa8f8c2093f00ceaa52e22b497fd Diff: https://reviews.apache.org/r/17082/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
On Oct. 14, 2014, 4:02 p.m., Niklas Nielsen wrote: The latest diff looks broken - Thomas, mind updating it (if you still want this to go in)? Why in the world didn't this get in. I'll fix it up =) - Thomas --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review56533 --- On Jan. 29, 2014, 8:51 p.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 29, 2014, 8:51 p.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - CHANGELOG e75a3411f865cb7f3768df1299f469f49c3a0009 bin/lldb-mesos-local.sh.in 35011fd483e477701efd7e204b514bb362713ccb bin/lldb-mesos-master.sh.in b1c7f9f1b98b5f410729f5a7e7a1729709f7e744 bin/lldb-mesos-slave.sh.in 896c411b2b05d3c4a14288002520a5391a88d955 bin/lldb-mesos-tests.sh.in f001b0b7f35839a101a86cd7df86fb7ebfc1c47e configure.ac 18bf4bfb345bdd443defccc4e53d357b35c7b533 docs/upgrades.md fe8b60470f7431accef44977e7036a2688289037 src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/log/catchup.cpp 4ee32f285f77eb2de661e22a301b743bb8a06f9c src/master/http.cpp fb15483953a593bfec4e60884219dc8c4e8d565c src/slave/http.cpp c4f598faf6807214608cc89a6d9cf665133f95f3 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Connection Issue Alert https://reviews.apache.org/media/uploaded/files/2014/01/28/dee8df12-0bae-48b5-a7ce-c07e0266c790__Screenshot_2014-01-28_12.44.53.png Thanks, Thomas Rampelberg
Jenkins build is back to normal : Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2168
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2168/changes
Review Request 26697: Added StartSlave() overload that takes status update manager.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26697/ --- Review request for mesos and Ben Mahler. Repository: mesos-git Description --- Added this overload to be able to use for testing status update manager. Used in the subsequent review. Diffs - src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 src/tests/mesos.cpp 3dcb2acd5ad4ab5e3a7b4fe524ee077558112773 Diff: https://reviews.apache.org/r/26697/diff/ Testing --- make check Thanks, Vinod Kone
Review Request 26699: Updated slave re-registration to send unacknowledged task states.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26699/ --- Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen. Bugs: MESOS-1799 and MESOS-1817 https://issues.apache.org/jira/browse/MESOS-1799 https://issues.apache.org/jira/browse/MESOS-1817 Repository: mesos-git Description --- Slave re-registration now sends both the latest state and unacknowledged state to the master. Diffs - src/slave/slave.hpp 342b09fc084c20d98d096bb129830440179c092c src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/fault_tolerance_tests.cpp a75910d4f486230ba3f1d8927e5f1e5fda6e287b src/tests/slave_tests.cpp f585bdd20ae1af466f2c1b4d85331ac67451552f Diff: https://reviews.apache.org/r/26699/diff/ Testing --- make check Ran new test 1000 times. Thanks, Vinod Kone
Review Request 26700: Updated StatusUpdateManager to send latest task state in update.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26700/ --- Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen. Bugs: MESOS-1799 and MESOS-1817 https://issues.apache.org/jira/browse/MESOS-1799 https://issues.apache.org/jira/browse/MESOS-1817 Repository: mesos-git Description --- Status update manager now sends both latest and unacknowledged states to the master. Diffs - src/messages/messages.proto 8de7f9699f43aa2780f4a39fed087abcf5e5af99 src/slave/status_update_manager.cpp 5d5cf234ef2dd47fa4b1f67be761dbca31659451 src/tests/status_update_manager_tests.cpp e9ef1e208cb01535e9366db7872b922c8f06ec40 Diff: https://reviews.apache.org/r/26700/diff/ Testing --- make check Ran new test 1000 times. Thanks, Vinod Kone
Review Request 26698: Added StatusUpdateManager::unacknowledged() API call.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26698/ --- Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen. Repository: mesos-git Description --- Added StatusUpdateManager::unacknowledged() API call so that slave can use this info during reregistration. Diffs - src/messages/messages.proto 8de7f9699f43aa2780f4a39fed087abcf5e5af99 src/slave/status_update_manager.hpp 24e3882ad1969c20a64daf90e408618c310e9a81 src/slave/status_update_manager.cpp 5d5cf234ef2dd47fa4b1f67be761dbca31659451 src/tests/status_update_manager_tests.cpp e9ef1e208cb01535e9366db7872b922c8f06ec40 Diff: https://reviews.apache.org/r/26698/diff/ Testing --- make check Ran new test 1000 times. Thanks, Vinod Kone
Review Request 26701: Updated master to update task unacknowledged state properly.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26701/ --- Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen. Bugs: MESOS-1799 and MESOS-1817 https://issues.apache.org/jira/browse/MESOS-1799 https://issues.apache.org/jira/browse/MESOS-1817 Repository: mesos-git Description --- Master now maintains the latest and unacknowledged states of the task. Diffs - src/master/master.hpp 14f1d0fd240c9cd0718d0238e1fbb9c733190205 src/master/master.cpp cb46cec0674b3aa031450c5b4f48f4f8bb92767d src/messages/messages.proto 8de7f9699f43aa2780f4a39fed087abcf5e5af99 src/tests/master_tests.cpp d9dc40c6f5aaa66e1f7a0e1b7e4d9cdc586ca0fd Diff: https://reviews.apache.org/r/26701/diff/ Testing --- make check Ran the new test 1000 times. Thanks, Vinod Kone
Review Request 26702: Updated reconciliation semantics to take the task's unacknowledged state into account.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26702/ --- Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen. Bugs: MESOS-1799 and MESOS-1817 https://issues.apache.org/jira/browse/MESOS-1799 https://issues.apache.org/jira/browse/MESOS-1817 Repository: mesos-git Description --- Master responds to reconciliation requests with unacknowledged state of task instead of latest state. Diffs - src/master/master.cpp cb46cec0674b3aa031450c5b4f48f4f8bb92767d src/tests/reconciliation_tests.cpp 400c5c035a2cadd408636021aafb3888546f0081 Diff: https://reviews.apache.org/r/26702/diff/ Testing --- make check Ran new test 1000 times. Thanks, Vinod Kone
Re: Review Request 26702: Updated reconciliation semantics to take the task's unacknowledged state into account.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26702/#review56559 --- Patch looks great! Reviews applied: [26697, 26698, 26699, 26700, 26701, 26702] All tests passed. - Mesos ReviewBot On Oct. 14, 2014, 6:07 p.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26702/ --- (Updated Oct. 14, 2014, 6:07 p.m.) Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen. Bugs: MESOS-1799 and MESOS-1817 https://issues.apache.org/jira/browse/MESOS-1799 https://issues.apache.org/jira/browse/MESOS-1817 Repository: mesos-git Description --- Master responds to reconciliation requests with unacknowledged state of task instead of latest state. Diffs - src/master/master.cpp cb46cec0674b3aa031450c5b4f48f4f8bb92767d src/tests/reconciliation_tests.cpp 400c5c035a2cadd408636021aafb3888546f0081 Diff: https://reviews.apache.org/r/26702/diff/ Testing --- make check Ran new test 1000 times. Thanks, Vinod Kone
Re: Review Request 26622: Don't pass task-related arguments to mesos-executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26622/ --- (Updated Oct. 14, 2014, 12:08 p.m.) Review request for mesos, Benjamin Hindman and Jie Yu. Changes --- + jieyu Bugs: MESOS-1873 https://issues.apache.org/jira/browse/MESOS-1873 Repository: mesos-git Description --- Don't pass task-related arguments to mesos-executor. The description on the linked jira ticket goes into more detail. Diffs - src/slave/slave.cpp 0e342ed src/tests/slave_tests.cpp f585bdd Diff: https://reviews.apache.org/r/26622/diff/ Testing --- make check added new test SlaveTest.GetExecutorInfo verifying the explicit desired behavior. Thanks, R.B. Boyer
Re: Review Request 26622: Don't pass task-related arguments to mesos-executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26622/ --- (Updated Oct. 14, 2014, 7:48 p.m.) Review request for mesos, Benjamin Hindman, Jie Yu, and Timothy Chen. Changes --- + tnachen as well Bugs: MESOS-1873 https://issues.apache.org/jira/browse/MESOS-1873 Repository: mesos-git Description --- Don't pass task-related arguments to mesos-executor. The description on the linked jira ticket goes into more detail. Diffs - src/slave/slave.cpp 0e342ed src/tests/slave_tests.cpp f585bdd Diff: https://reviews.apache.org/r/26622/diff/ Testing --- make check added new test SlaveTest.GetExecutorInfo verifying the explicit desired behavior. Thanks, R.B. Boyer
Re: Review Request 25945: Pass Promise to DispatchEvent to correctly fail on rejection.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25945/ --- (Updated Oct. 14, 2014, 1:09 p.m.) Review request for mesos, Benjamin Hindman and Ben Mahler. Changes --- rebased Bugs: MESOS-1456, MESOS-1751 and MESOS-1866 https://issues.apache.org/jira/browse/MESOS-1456 https://issues.apache.org/jira/browse/MESOS-1751 https://issues.apache.org/jira/browse/MESOS-1866 Repository: mesos-git Description --- Create a PromiseDispatchEvent to allow failing of promises if the DispatchEvent can't be enqueued. Also converted some shared_ptr to Owned to correctly track ownership. Diffs (updated) - 3rdparty/libprocess/include/process/c++11/dispatch.hpp 76da2828cf36b6143d627dac66f3e0cc4416bae4 3rdparty/libprocess/include/process/defer.hpp ebe6f2db47cab2a3306288d8ebabb720e7cd8976 3rdparty/libprocess/include/process/delay.hpp 487f652c9e9b7c8c3aa8b4edc9e59789cffd8da9 3rdparty/libprocess/include/process/dispatch.hpp bceda2a2ae8963921e8e19660243a8644feab227 3rdparty/libprocess/include/process/event.hpp bf689d7270df2c8f1f5c9165d2bbcfdca06e15a8 3rdparty/libprocess/m4/ax_cxx_compile_stdcxx_11.m4 07b298f151094e818287f741b3e0efd28374e82b 3rdparty/libprocess/src/process.cpp d30ed636ee24d0fe6e62f69a921307bde1f32765 3rdparty/libprocess/src/tests/process_tests.cpp b985fb77ea05fae5c0b144ea48814acc7bb5135b src/tests/master_tests.cpp d9dc40c6f5aaa66e1f7a0e1b7e4d9cdc586ca0fd Diff: https://reviews.apache.org/r/25945/diff/ Testing --- Added test that fails without the patch. Ran all tests (make check) with g++-4.4 and clang-3.5. Thanks, Dominic Hamon
Re: Review Request 25945: Pass Promise to DispatchEvent to correctly fail on rejection.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25945/#review56564 --- Bad patch! Reviews applied: [26289] Failed command: git apply --index 26289.patch Error: error: patch failed: 3rdparty/libprocess/include/process/event.hpp:1 error: 3rdparty/libprocess/include/process/event.hpp: patch does not apply - Mesos ReviewBot On Oct. 14, 2014, 8:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25945/ --- (Updated Oct. 14, 2014, 8:09 p.m.) Review request for mesos, Benjamin Hindman and Ben Mahler. Bugs: MESOS-1456, MESOS-1751 and MESOS-1866 https://issues.apache.org/jira/browse/MESOS-1456 https://issues.apache.org/jira/browse/MESOS-1751 https://issues.apache.org/jira/browse/MESOS-1866 Repository: mesos-git Description --- Create a PromiseDispatchEvent to allow failing of promises if the DispatchEvent can't be enqueued. Also converted some shared_ptr to Owned to correctly track ownership. Diffs - 3rdparty/libprocess/include/process/c++11/dispatch.hpp 76da2828cf36b6143d627dac66f3e0cc4416bae4 3rdparty/libprocess/include/process/defer.hpp ebe6f2db47cab2a3306288d8ebabb720e7cd8976 3rdparty/libprocess/include/process/delay.hpp 487f652c9e9b7c8c3aa8b4edc9e59789cffd8da9 3rdparty/libprocess/include/process/dispatch.hpp bceda2a2ae8963921e8e19660243a8644feab227 3rdparty/libprocess/include/process/event.hpp bf689d7270df2c8f1f5c9165d2bbcfdca06e15a8 3rdparty/libprocess/m4/ax_cxx_compile_stdcxx_11.m4 07b298f151094e818287f741b3e0efd28374e82b 3rdparty/libprocess/src/process.cpp d30ed636ee24d0fe6e62f69a921307bde1f32765 3rdparty/libprocess/src/tests/process_tests.cpp b985fb77ea05fae5c0b144ea48814acc7bb5135b src/tests/master_tests.cpp d9dc40c6f5aaa66e1f7a0e1b7e4d9cdc586ca0fd Diff: https://reviews.apache.org/r/25945/diff/ Testing --- Added test that fails without the patch. Ran all tests (make check) with g++-4.4 and clang-3.5. Thanks, Dominic Hamon
Re: Review Request 25549: Basic filesystem isolator for Linux.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25549/ --- (Updated Oct. 14, 2014, 1:31 p.m.) Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone. Changes --- Addressed some clean up comments. Bugs: MESOS-1586 https://issues.apache.org/jira/browse/MESOS-1586 Repository: mesos-git Description --- Does not report usage or enforce quota but can create 'private' directories for each container which mask parts of the shared host filesystem. This review replaces https://reviews.apache.org/r/24178/ because of some file renaming. I addressed all comments from earlier reviews. Diffs (updated) - include/mesos/mesos.proto 6b93e9000761857c4f335f2a8c8088e155078f54 src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/common/parse.hpp c9ca30fde5580d8e388f3c616e78df2b032ac0ad src/common/type_utils.hpp f10b2d31f2781bd056898f93eb8e8b6d9809e80b src/slave/containerizer/isolators/filesystem/shared.hpp PRE-CREATION src/slave/containerizer/isolators/filesystem/shared.cpp PRE-CREATION src/slave/containerizer/linux_launcher.cpp f7bc894830a7ca3f55465dacc7b653cdc2d7758b src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c src/slave/flags.hpp 16f0cc2ab5ba16a39499608174278b3082e0585d src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 Diff: https://reviews.apache.org/r/25549/diff/ Testing --- make check # added a test Thanks, Ian Downes
Re: Review Request 25864: Add 'FutureNothing cgroups::empty()'.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25864/ --- (Updated Oct. 14, 2014, 1:31 p.m.) Review request for mesos, Jie Yu and Vinod Kone. Changes --- Missing semicolon. Repository: mesos-git Description --- Polls cgroups.procs until no processes in the cgroup. Poll interval and timeout can be specified. Diffs (updated) - src/linux/cgroups.hpp abf31df1b4dbf6f715f93256b83c9996a45099cf src/linux/cgroups.cpp 62df4b7645c6ab061a47634058d79ca849caa6b9 Diff: https://reviews.apache.org/r/25864/diff/ Testing --- Thanks, Ian Downes
Re: Review Request 25865: Pid namespace isolator for the MesosContainerizer.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25865/ --- (Updated Oct. 14, 2014, 1:33 p.m.) Review request for mesos, Jie Yu and Vinod Kone. Changes --- Added comments. Moved some code into cpp. Removed some unnecessary isolator dependencies. Repository: mesos-git Description --- Add namespaces/pid to --isolation slave flag. Places executor into a pid namespace so it and all descendants will be contained in the namespace. Requires the filesystem/shared isolator so /proc and /sys are remounted to reflect the different namespace. Diffs (updated) - src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/slave/containerizer/isolators/namespaces/pid.hpp PRE-CREATION src/slave/containerizer/isolators/namespaces/pid.cpp PRE-CREATION src/slave/containerizer/linux_launcher.cpp f7bc894830a7ca3f55465dacc7b653cdc2d7758b src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 Diff: https://reviews.apache.org/r/25865/diff/ Testing --- Added test that command in pid namespaced container is in a different namespace and that the command is 'init' (verifies remount of /proc). Thanks, Ian Downes
Re: Review Request 26274: Remove /proc and /sys remounts from port_mapping isolator.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26274/ --- (Updated Oct. 14, 2014, 1:33 p.m.) Review request for mesos, Jie Yu and Vinod Kone. Bugs: MESOS-1853 https://issues.apache.org/jira/browse/MESOS-1853 Repository: mesos-git Description --- Remove /proc and /sys remounts from port_mapping isolator. See ticket for details. Diffs (updated) - src/slave/containerizer/isolators/network/port_mapping.cpp 62ca9bca08de55f19bfff7b8828dfa4fb6c51f30 src/slave/containerizer/linux_launcher.cpp f7bc894830a7ca3f55465dacc7b653cdc2d7758b Diff: https://reviews.apache.org/r/26274/diff/ Testing (updated) --- make check # Linux after `configure --with-network-isolator` Thanks, Ian Downes
Re: Review Request 25861: Serialize isolator prepare and cleanup (reversed).
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25861/ --- (Updated Oct. 14, 2014, 1:34 p.m.) Review request for mesos, Jie Yu and Vinod Kone. Changes --- Addressed comments. Repository: mesos-git Description --- Change from doing in parallel and collect()ing to serial according to the vector of isolators (reversed order for cleanup). Diffs (updated) - src/slave/containerizer/mesos/containerizer.hpp bf246ca649ca4a461cebf1aee6908a2d58eec362 src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c Diff: https://reviews.apache.org/r/25861/diff/ Testing --- make check Thanks, Ian Downes
Re: Review Request 25965: Update libprocess Makefile for setns namechange.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25965/ --- (Updated Oct. 14, 2014, 1:34 p.m.) Review request for mesos, Ben Mahler and Jie Yu. Repository: mesos-git Description --- Update libprocess Makefile for setns namechange. Diffs (updated) - src/slave/containerizer/isolators/network/port_mapping.cpp 62ca9bca08de55f19bfff7b8828dfa4fb6c51f30 Diff: https://reviews.apache.org/r/25965/diff/ Testing --- Thanks, Ian Downes
Re: Review Request 25966: Use pid namespace in LinuxLauncher::destroy().
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25966/ --- (Updated Oct. 14, 2014, 1:35 p.m.) Review request for mesos, Jie Yu and Vinod Kone. Changes --- Added roll forward and roll back tests. Repository: mesos-git Description --- Check if a container is running in a pid namespace and thus all processes can be killed by the kernel, rather than using the freezer. Diffs (updated) - src/slave/containerizer/linux_launcher.cpp f7bc894830a7ca3f55465dacc7b653cdc2d7758b src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c src/tests/slave_recovery_tests.cpp 4fb357bd55f69f71193e92fd03765b808f932d33 Diff: https://reviews.apache.org/r/25966/diff/ Testing --- Thanks, Ian Downes
Re: Review Request 25864: Add 'FutureNothing cgroups::empty()'.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25864/#review56567 --- src/linux/cgroups.cpp https://reviews.apache.org/r/25864/#comment96913 It will be nice if you also print the interval. - Timothy Chen On Oct. 14, 2014, 8:31 p.m., Ian Downes wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25864/ --- (Updated Oct. 14, 2014, 8:31 p.m.) Review request for mesos, Jie Yu and Vinod Kone. Repository: mesos-git Description --- Polls cgroups.procs until no processes in the cgroup. Poll interval and timeout can be specified. Diffs - src/linux/cgroups.hpp abf31df1b4dbf6f715f93256b83c9996a45099cf src/linux/cgroups.cpp 62df4b7645c6ab061a47634058d79ca849caa6b9 Diff: https://reviews.apache.org/r/25864/diff/ Testing --- Thanks, Ian Downes
Re: Review Request 26669: Added a document for reconciliation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26669/#review56566 --- Ship it! LGTM. docs/reconciliation.md https://reviews.apache.org/r/26669/#comment96911 Also, * slave fails before launching the task? docs/reconciliation.md https://reviews.apache.org/r/26669/#comment96914 Maybe to be explicit, say If `remaining` is non-empty, wait for some time (e.g., exponential backoff) then go to 3. - Vinod Kone On Oct. 14, 2014, 12:34 a.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26669/ --- (Updated Oct. 14, 2014, 12:34 a.m.) Review request for mesos, Benjamin Hindman, Niklas Nielsen, and Vinod Kone. Bugs: MESOS-681 https://issues.apache.org/jira/browse/MESOS-681 Repository: mesos-git Description --- Please see here for rendered markdown, will be easier to review: https://gist.github.com/bmahler/18409fc4f052df43f403 Please send your high level thoughts :) Diffs - docs/reconciliation.md PRE-CREATION Diff: https://reviews.apache.org/r/26669/diff/ Testing --- N/A Thanks, Ben Mahler
Re: Review Request 25549: Basic filesystem isolator for Linux.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25549/#review56568 --- src/slave/containerizer/linux_launcher.cpp https://reviews.apache.org/r/25549/#comment96915 print the contianer id for easier tracing in the logs. - Timothy Chen On Oct. 14, 2014, 8:31 p.m., Ian Downes wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25549/ --- (Updated Oct. 14, 2014, 8:31 p.m.) Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone. Bugs: MESOS-1586 https://issues.apache.org/jira/browse/MESOS-1586 Repository: mesos-git Description --- Does not report usage or enforce quota but can create 'private' directories for each container which mask parts of the shared host filesystem. This review replaces https://reviews.apache.org/r/24178/ because of some file renaming. I addressed all comments from earlier reviews. Diffs - include/mesos/mesos.proto 6b93e9000761857c4f335f2a8c8088e155078f54 src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/common/parse.hpp c9ca30fde5580d8e388f3c616e78df2b032ac0ad src/common/type_utils.hpp f10b2d31f2781bd056898f93eb8e8b6d9809e80b src/slave/containerizer/isolators/filesystem/shared.hpp PRE-CREATION src/slave/containerizer/isolators/filesystem/shared.cpp PRE-CREATION src/slave/containerizer/linux_launcher.cpp f7bc894830a7ca3f55465dacc7b653cdc2d7758b src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c src/slave/flags.hpp 16f0cc2ab5ba16a39499608174278b3082e0585d src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 Diff: https://reviews.apache.org/r/25549/diff/ Testing --- make check # added a test Thanks, Ian Downes
Re: Review Request 25861: Serialize isolator prepare and cleanup (reversed).
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25861/#review56569 --- src/slave/containerizer/mesos/containerizer.cpp https://reviews.apache.org/r/25861/#comment96917 Not sure if I'm missing something, but if you chain all lambdas with .then, only the last one will be the failed one right? - Timothy Chen On Oct. 14, 2014, 8:34 p.m., Ian Downes wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25861/ --- (Updated Oct. 14, 2014, 8:34 p.m.) Review request for mesos, Jie Yu and Vinod Kone. Repository: mesos-git Description --- Change from doing in parallel and collect()ing to serial according to the vector of isolators (reversed order for cleanup). Diffs - src/slave/containerizer/mesos/containerizer.hpp bf246ca649ca4a461cebf1aee6908a2d58eec362 src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c Diff: https://reviews.apache.org/r/25861/diff/ Testing --- make check Thanks, Ian Downes
Re: Review Request 26426: Add --enable-debug flag to ./configure for controlling emission of debug information
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/#review56573 --- configure.ac https://reviews.apache.org/r/26426/#comment96918 Typical pattern is yes/no vs. true/false. Not that it really matters, more for consistency. configure.ac https://reviews.apache.org/r/26426/#comment96919 Is there a reason you want to leave debug symbols out of optimized builds? cmake has the pattern correct imho: Release Debug ReleaseWithDebug A ReleaseWithDebug allows packagers, such as myself, to build w/debugsymbols that are stripped out into a .debuginfo package which can be used by developers for tracing When bears attack. Granted that it is tenuous debugging at best, but it's better then nothing. So I think we want all three modes, stripping all debug information is not really idea. - Timothy St. Clair On Oct. 7, 2014, 10:38 p.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/ --- (Updated Oct. 7, 2014, 10:38 p.m.) Review request for mesos, Benjamin Hindman and Timothy St. Clair. Repository: mesos-git Description --- Reworks building mesos in debug vs. release. By default, mesos is now built in release (no debug info, optimized build). If '--enable-debug' is specified to configure, than optimization will be turned off, and debug info will be turned on. This also adds a variable 'DEBUG' to the build environment, which people can use in code to see if mesos is built with debugging to enable extra assertions / checks. For release builds we may want to set 'NDEBUG' which removes assert()'s, but that is a seperate discussion. Main benefits: 1) Getting a build to include/exclude debug information at will is feasible. Before some things like using clang would forcibly enable debug info in all cases 2) libmesos.so and the other binaries which get packaged up for use in distributions shrink considerably without manually stripping post-build (Improves build time, makes packaging cleaner) Diffs - configure.ac da1c82db31583fc81de658574b9a95628cb84dbc Diff: https://reviews.apache.org/r/26426/diff/ Testing --- Built with both --enable-debug and without, checking that the flags get passed through correctly. Thanks, Cody Maloney
Re: Review Request 26150: Libprocess Benchmark
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26150/ --- (Updated Oct. 14, 2014, 9:27 p.m.) Review request for mesos and Niklas Nielsen. Changes --- Swap out std::thread, std::mutex, std::condition_variable from a simpler Actor / Latch approach. Fix style issues. Bugs: MESOS-1840 https://issues.apache.org/jira/browse/MESOS-1840 Repository: mesos-git Description --- A benchmark for libprocess. It forks num_proc times and launched num_threads libprocess processes in each child. They are aware of the master's (parent) address and all play ping pong with it. This benchmark measures throughput in terms of the number of RPCs handled per second using persistent (linked) connections. A new test file (benchmarks) is introduced because we want to fork before libprocess is initialized. This allows us to prevent short-circuiting of message passing between processes under the same ProcessManager. This way we force the execution path of the underlying event management system. Diffs (updated) - 3rdparty/libprocess/Makefile.am 616618e 3rdparty/libprocess/src/tests/benchmarks.cpp PRE-CREATION Diff: https://reviews.apache.org/r/26150/diff/ Testing --- make check in 3rdparty/libprocess support/mesos-style.py Thanks, Joris Van Remoortere
Re: Review Request 26150: Libprocess Benchmark
On Oct. 13, 2014, 9:47 p.m., Niklas Nielsen wrote: 3rdparty/libprocess/src/tests/benchmarks.cpp, line 25 https://reviews.apache.org/r/26150/diff/3/?file=719696#file719696line25 This is the first time we use std::thread - do you have some references that it is supported across our graced compilers? Same for mutex Removed use of thread. On Oct. 13, 2014, 9:47 p.m., Niklas Nielsen wrote: 3rdparty/libprocess/src/tests/benchmarks.cpp, line 95 https://reviews.apache.org/r/26150/diff/3/?file=719696#file719696line95 Was this the lock that depended on Dominic's configure patch? If so, can you add it as a dependency? Remove use of mutex / unique_lock; use Latch instead. On Oct. 13, 2014, 9:47 p.m., Niklas Nielsen wrote: 3rdparty/libprocess/src/tests/benchmarks.cpp, line 97 https://reviews.apache.org/r/26150/diff/3/?file=719696#file719696line97 Can we use the latch (latch.hpp) abstraction here? Yep. Done. - Joris --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26150/#review56442 --- On Oct. 14, 2014, 9:27 p.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26150/ --- (Updated Oct. 14, 2014, 9:27 p.m.) Review request for mesos and Niklas Nielsen. Bugs: MESOS-1840 https://issues.apache.org/jira/browse/MESOS-1840 Repository: mesos-git Description --- A benchmark for libprocess. It forks num_proc times and launched num_threads libprocess processes in each child. They are aware of the master's (parent) address and all play ping pong with it. This benchmark measures throughput in terms of the number of RPCs handled per second using persistent (linked) connections. A new test file (benchmarks) is introduced because we want to fork before libprocess is initialized. This allows us to prevent short-circuiting of message passing between processes under the same ProcessManager. This way we force the execution path of the underlying event management system. Diffs - 3rdparty/libprocess/Makefile.am 616618e 3rdparty/libprocess/src/tests/benchmarks.cpp PRE-CREATION Diff: https://reviews.apache.org/r/26150/diff/ Testing --- make check in 3rdparty/libprocess support/mesos-style.py Thanks, Joris Van Remoortere
Re: Review Request 26150: Libprocess Benchmark
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26150/ --- (Updated Oct. 14, 2014, 9:30 p.m.) Review request for mesos and Niklas Nielsen. Changes --- Remove unused code. Bugs: MESOS-1840 https://issues.apache.org/jira/browse/MESOS-1840 Repository: mesos-git Description --- A benchmark for libprocess. It forks num_proc times and launched num_threads libprocess processes in each child. They are aware of the master's (parent) address and all play ping pong with it. This benchmark measures throughput in terms of the number of RPCs handled per second using persistent (linked) connections. A new test file (benchmarks) is introduced because we want to fork before libprocess is initialized. This allows us to prevent short-circuiting of message passing between processes under the same ProcessManager. This way we force the execution path of the underlying event management system. Diffs (updated) - 3rdparty/libprocess/Makefile.am 616618e 3rdparty/libprocess/src/tests/benchmarks.cpp PRE-CREATION Diff: https://reviews.apache.org/r/26150/diff/ Testing --- make check in 3rdparty/libprocess support/mesos-style.py Thanks, Joris Van Remoortere
Re: Review Request 26426: Add --enable-debug flag to ./configure for controlling emission of debug information
On Oct. 14, 2014, 9:06 p.m., Timothy St. Clair wrote: configure.ac, line 281 https://reviews.apache.org/r/26426/diff/1/?file=714874#file714874line281 Is there a reason you want to leave debug symbols out of optimized builds? cmake has the pattern correct imho: Release Debug ReleaseWithDebug A ReleaseWithDebug allows packagers, such as myself, to build w/debugsymbols that are stripped out into a .debuginfo package which can be used by developers for tracing When bears attack. Granted that it is tenuous debugging at best, but it's better then nothing. So I think we want all three modes, stripping all debug information is not really idea. My main motivation is to shrink the size of libmesos. Yesterday sugis in #mesos had one which was 213M. For the buildbot internally, full debian packages (which are compressed) of mesos weign in at 165M a piece (Yes, stripping post-build would help a lot, but why build it to begin with?). Most of this is debug info. Also, we build a bunch of different ways, and when libmesos is as big as it is, a decent amount of time ends up being spent on disk I/O reading / writing all the debug info when we are really just trying to ensure it builds on all the different platforms (Not to mention storage and file size shipping things around the network to centralized repositories). The simple toggle between debug and release, removing the legacy logic gets us most of the benefit. If you have a good place to point me for what is needed to get a 'ReleaseWithDebug' info build up and running I can definitely work on adding that as well. - Cody --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/#review56573 --- On Oct. 7, 2014, 10:38 p.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/ --- (Updated Oct. 7, 2014, 10:38 p.m.) Review request for mesos, Benjamin Hindman and Timothy St. Clair. Repository: mesos-git Description --- Reworks building mesos in debug vs. release. By default, mesos is now built in release (no debug info, optimized build). If '--enable-debug' is specified to configure, than optimization will be turned off, and debug info will be turned on. This also adds a variable 'DEBUG' to the build environment, which people can use in code to see if mesos is built with debugging to enable extra assertions / checks. For release builds we may want to set 'NDEBUG' which removes assert()'s, but that is a seperate discussion. Main benefits: 1) Getting a build to include/exclude debug information at will is feasible. Before some things like using clang would forcibly enable debug info in all cases 2) libmesos.so and the other binaries which get packaged up for use in distributions shrink considerably without manually stripping post-build (Improves build time, makes packaging cleaner) Diffs - configure.ac da1c82db31583fc81de658574b9a95628cb84dbc Diff: https://reviews.apache.org/r/26426/diff/ Testing --- Built with both --enable-debug and without, checking that the flags get passed through correctly. Thanks, Cody Maloney
Re: Review Request 26580: Memory cleanup: libprocess join worker threads
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26580/ --- (Updated Oct. 14, 2014, 9:44 p.m.) Review request for mesos, Benjamin Hindman and Niklas Nielsen. Changes --- Address Dominic's issues. Repository: mesos-git Description --- Terminate the gc; Finalize the ProcessManager (picking up -idealy 0- left-over processes); Join on the worker threads that were dispatched. Diffs (updated) - 3rdparty/libprocess/src/process.cpp 85fb995 Diff: https://reviews.apache.org/r/26580/diff/ Testing --- make check support/mesos-style.py Thanks, Joris Van Remoortere
Re: Review Request 26583: Memory cleanup: libprocess delete garbage collector process
On Oct. 10, 2014, 5:15 p.m., Dominic Hamon wrote: 3rdparty/libprocess/src/process.cpp, line 1631 https://reviews.apache.org/r/26583/diff/1/?file=717970#file717970line1631 this can't land until https://reviews.apache.org/r/26289/ which contains the configure.ac check for std::unique_ptr support. Dominic, what is the expected time frame for landing r26289? It contains more than just the check and we have a couple of patches that would depend on it. Can we pull out the check in a new patch to get going? - Niklas --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26583/#review56262 --- On Oct. 10, 2014, 4:12 p.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26583/ --- (Updated Oct. 10, 2014, 4:12 p.m.) Review request for mesos, Benjamin Hindman and Niklas Nielsen. Repository: mesos-git Description --- Wrap the libprocess garbage collector process with a unique ptr. Reset the ptr during finalization. Diffs - 3rdparty/libprocess/src/process.cpp 85fb995 Diff: https://reviews.apache.org/r/26583/diff/ Testing --- make check Thanks, Joris Van Remoortere
Review Request 26709: Change from kill to stop for docker
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 26709: Change from kill to stop for docker
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56585 --- src/docker/docker.cpp https://reviews.apache.org/r/26709/#comment96932 Since this method doesn't simple kill anymore, can you also refactor the method name? Good to provide some comments as well in the header that it stops and kills after a period of time. 30 seconds also seems a bit long to me, but regardless we should at least make a constant that we can change later. - Timothy Chen On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 26709: Change from kill to stop for docker
On Oct. 14, 2014, 10:18 p.m., Timothy Chen wrote: src/docker/docker.cpp, line 472 https://reviews.apache.org/r/26709/diff/1/?file=721122#file721122line472 Since this method doesn't simple kill anymore, can you also refactor the method name? Good to provide some comments as well in the header that it stops and kills after a period of time. 30 seconds also seems a bit long to me, but regardless we should at least make a constant that we can change later. Also did you try to run all the docker unit tests? - Timothy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56585 --- On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 23912: Fix MESOS-947: Slave should properly handle a killTask() that arrives between runTask() and _runTask()
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23912/#review56584 --- src/tests/mesos.cpp https://reviews.apache.org/r/23912/#comment96930 Sorry, missed this earlier. We align constructor arguments as follows: MockSlave::MockSlave(const slave::Flags flags, MasterDetector* detector, slave::Containerizer* containerizer); - Vinod Kone On Oct. 14, 2014, 9:54 p.m., Bernd Mathiske wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23912/ --- (Updated Oct. 14, 2014, 9:54 p.m.) Review request for mesos and Vinod Kone. Bugs: MESOS-947 https://issues.apache.org/jira/browse/MESOS-947 Repository: mesos-git Description --- Fixes MESOS-947 Slave should properly handle a killTask() that arrives between runTask() and _runTask(). Slave::killTask() did not check for task in question combination to be pending (i.e. Slave::runTask had happened, but Slave::_runTask had not yet) and then erroneously assumed that Slave::runTask() had not been executed. The task was then marked LOST instead of KILLED. But Slave::runTask had already scheduled Slave::_runTask to follow. Now the entry for being pending is removed, and the task is marked KILLED, and _runTask gets informed about this. It checks whether the task in question is currently pending and if it is not, then it infers that the task has been killed and does not erroneously try to complete launching it. Diffs - src/slave/slave.hpp 342b09fc084c20d98d096bb129830440179c092c src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 src/tests/mesos.cpp 3dcb2acd5ad4ab5e3a7b4fe524ee077558112773 src/tests/slave_tests.cpp f585bdd20ae1af466f2c1b4d85331ac67451552f Diff: https://reviews.apache.org/r/23912/diff/ Testing --- Wrote a unit test that reliably created the situation described in the ticket. Observed that TASK_LOST and the listed log output occurred. This pointed directly to the lines in killTask() where the problem is rooted. Ran the test after fixing, it succeeded. Checked the log. It looks like a clean kill now :-) Thanks, Bernd Mathiske
Re: Review Request 23912: Fix MESOS-947: Slave should properly handle a killTask() that arrives between runTask() and _runTask()
On Oct. 10, 2014, 6:24 p.m., Vinod Kone wrote: src/slave/slave.cpp, line 1413 https://reviews.apache.org/r/23912/diff/8/?file=716689#file716689line1413 It is weird to me that you remove the task here but (potentially) remove the executor up in _runTask(). It's not obvious to me why you made that choice. If there is a good reason, please add a comment here. Bernd Mathiske wrote: The task is removed in killTask(), because later on the information that it should be removed is no longer easily at hand. I looked at lines 1194, 1195 in _runTask(): Framework* framework = getFramework(frameworkId); CHECK_NOTNULL(framework); If I removed the framework earlier than this, this check would fire. If I want to avoid that, I need to write extra code (e.g. prevent _runTask() from happening). Assembling all framework removals in the second half of _runTask() looks like good defensive practice to me. There is no harm in removing the framework in _runTask() vs. killTask() since _runTask() must eventually happen. But by waiting with removing the framework until _runTask(), we do not need to think about what implications concurrent starting and killing of multiple tasks with the same executor might have. I don't think I follow what you mean by The task is removed in killTask(), because later on the information that it should be removed is no longer easily at hand and ...we do not need to think about what implications concurrent starting and killing of multiple tasks with the same executor might have Anyway, here's why I think we should remove the framework in killTask(). It doesn't require non-local reasoning. Right now, someone reading the killTask() method should know that a pending task means a _runTask() dispatch is in the queue and that the executor and framework will be removed in '_runTask()'. Instead if killTask() can remove the framework right away, it's easy to understand. It is much simple for methods that take FrameworkID (instead of Framework*) to check if the framework exists. If the framework doesn't exist, for whatever reason, it could just bail. I think these semantics would make future contributors' life easy when making changes to these methods. Changing the CHECK in '_runTask()' to an if condition doesn't seem too bad to me. ``` _runTask() { if (framework == NULL) { LOG(WARNING) Ignoring run task task because the framework does not exist; return; } } ``` Makes sense? - Vinod --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23912/#review56016 --- On Oct. 14, 2014, 9:54 p.m., Bernd Mathiske wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23912/ --- (Updated Oct. 14, 2014, 9:54 p.m.) Review request for mesos and Vinod Kone. Bugs: MESOS-947 https://issues.apache.org/jira/browse/MESOS-947 Repository: mesos-git Description --- Fixes MESOS-947 Slave should properly handle a killTask() that arrives between runTask() and _runTask(). Slave::killTask() did not check for task in question combination to be pending (i.e. Slave::runTask had happened, but Slave::_runTask had not yet) and then erroneously assumed that Slave::runTask() had not been executed. The task was then marked LOST instead of KILLED. But Slave::runTask had already scheduled Slave::_runTask to follow. Now the entry for being pending is removed, and the task is marked KILLED, and _runTask gets informed about this. It checks whether the task in question is currently pending and if it is not, then it infers that the task has been killed and does not erroneously try to complete launching it. Diffs - src/slave/slave.hpp 342b09fc084c20d98d096bb129830440179c092c src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 src/tests/mesos.cpp 3dcb2acd5ad4ab5e3a7b4fe524ee077558112773 src/tests/slave_tests.cpp f585bdd20ae1af466f2c1b4d85331ac67451552f Diff: https://reviews.apache.org/r/23912/diff/ Testing --- Wrote a unit test that reliably created the situation described in the ticket. Observed that TASK_LOST and the listed log output occurred. This pointed directly to the lines in killTask() where the problem is rooted. Ran the test after fixing, it succeeded. Checked the log. It looks like a clean kill now :-) Thanks, Bernd Mathiske
Re: Review Request 25965: Update libprocess Makefile for setns namechange.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25965/#review56589 --- Patch looks great! Reviews applied: [25864, 25965] All tests passed. - Mesos ReviewBot On Oct. 14, 2014, 8:34 p.m., Ian Downes wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25965/ --- (Updated Oct. 14, 2014, 8:34 p.m.) Review request for mesos, Ben Mahler and Jie Yu. Repository: mesos-git Description --- Update libprocess Makefile for setns namechange. Diffs - src/slave/containerizer/isolators/network/port_mapping.cpp 62ca9bca08de55f19bfff7b8828dfa4fb6c51f30 Diff: https://reviews.apache.org/r/25965/diff/ Testing --- Thanks, Ian Downes
Re: Review Request 26698: Added StatusUpdateManager::unacknowledged() API call.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26698/#review56583 --- src/messages/messages.proto https://reviews.apache.org/r/26698/#comment96929 Maybe 'unacknowledged_status' would be more accurate, since the uuid is associated with the StatusUpdate message itself, and this could be one of many status updates with the TASK_RUNNING state, of which some may have already been acknowledged and others are waiting behind this one. In that case, the TASK_RUNNING state has already been received and acknowledged, but it's the other pending status updates that are unacknowledged. Maybe 'pending_state' or 'pending_status' would be better. src/messages/messages.proto https://reviews.apache.org/r/26698/#comment96928 This uuid corresponds to the uuid of a StatusUpdateMessage not the state itself. Why wouldn't you want to include the entire StatusUpdate or TaskStatus message instead of picking only the fields you think you need now? I'm sure the arbitrary size of the 'data' field comes into play, but I'd just like to know if you considered it and why you rejected it. Is uuid really always going to be required? Maybe it should be called 'status_uuid'? src/tests/status_update_manager_tests.cpp https://reviews.apache.org/r/26698/#comment96934 s/udpate/update/ src/tests/status_update_manager_tests.cpp https://reviews.apache.org/r/26698/#comment96935 state and uuid? src/tests/status_update_manager_tests.cpp https://reviews.apache.org/r/26698/#comment96940 Maybe you'd like to fill in some of these '_' blanks to at least guarantee the messages are coming from and going to the right pids? src/tests/status_update_manager_tests.cpp https://reviews.apache.org/r/26698/#comment96939 Times(1) is implicit, unneeded. If neither WillOnce() nor WillRepeatedly() is in the EXPECT_CALL(), the inferred cardinality is Times(1). - https://code.google.com/p/googlemock/wiki/ForDummies#Cardinalities:_How_Many_Times_Will_It_Be_Called? (No offense) src/tests/status_update_manager_tests.cpp https://reviews.apache.org/r/26698/#comment96941 Could just use the LaunchTasks(DEFAULT_EXECUTOR_INFO, 1, 1, 64, *) pattern and avoid tracking the offers or directly calling driver.launchTasks. src/tests/status_update_manager_tests.cpp https://reviews.apache.org/r/26698/#comment96938 Times(1) is implicit, unneeded. src/tests/status_update_manager_tests.cpp https://reviews.apache.org/r/26698/#comment96942 s/unacknowledged/unacknowledged state/ src/tests/status_update_manager_tests.cpp https://reviews.apache.org/r/26698/#comment96943 Also validate the uuid? - Adam B On Oct. 14, 2014, 11 a.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26698/ --- (Updated Oct. 14, 2014, 11 a.m.) Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen. Repository: mesos-git Description --- Added StatusUpdateManager::unacknowledged() API call so that slave can use this info during reregistration. Diffs - src/messages/messages.proto 8de7f9699f43aa2780f4a39fed087abcf5e5af99 src/slave/status_update_manager.hpp 24e3882ad1969c20a64daf90e408618c310e9a81 src/slave/status_update_manager.cpp 5d5cf234ef2dd47fa4b1f67be761dbca31659451 src/tests/status_update_manager_tests.cpp e9ef1e208cb01535e9366db7872b922c8f06ec40 Diff: https://reviews.apache.org/r/26698/diff/ Testing --- make check Ran new test 1000 times. Thanks, Vinod Kone
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Oct. 14, 2014, 10:52 p.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs (updated) - src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html 25caf530628ad3ac7f23ab5f014000aac8583da1 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js 41a70a80442501a2bf7b217939dbe504662941d2 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Connection Issue Alert https://reviews.apache.org/media/uploaded/files/2014/01/28/dee8df12-0bae-48b5-a7ce-c07e0266c790__Screenshot_2014-01-28_12.44.53.png Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Oct. 14, 2014, 10:53 p.m.) Review request for mesos and Michael Lunøe. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html 25caf530628ad3ac7f23ab5f014000aac8583da1 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js 41a70a80442501a2bf7b217939dbe504662941d2 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Connection Issue Alert https://reviews.apache.org/media/uploaded/files/2014/01/28/dee8df12-0bae-48b5-a7ce-c07e0266c790__Screenshot_2014-01-28_12.44.53.png Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Oct. 14, 2014, 10:54 p.m.) Review request for mesos, Michael Lunøe and Niklas Nielsen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html 25caf530628ad3ac7f23ab5f014000aac8583da1 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js 41a70a80442501a2bf7b217939dbe504662941d2 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Connection Issue Alert https://reviews.apache.org/media/uploaded/files/2014/01/28/dee8df12-0bae-48b5-a7ce-c07e0266c790__Screenshot_2014-01-28_12.44.53.png Thanks, Thomas Rampelberg
Re: Review Request 26697: Added StartSlave() overload that takes status update manager.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26697/#review56593 --- Ship it! Ship It! - Adam B On Oct. 14, 2014, 10:58 a.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26697/ --- (Updated Oct. 14, 2014, 10:58 a.m.) Review request for mesos and Ben Mahler. Repository: mesos-git Description --- Added this overload to be able to use for testing status update manager. Used in the subsequent review. Diffs - src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 src/tests/mesos.cpp 3dcb2acd5ad4ab5e3a7b4fe524ee077558112773 Diff: https://reviews.apache.org/r/26697/diff/ Testing --- make check Thanks, Vinod Kone
Re: Review Request 25966: Use pid namespace in LinuxLauncher::destroy().
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25966/#review56595 --- Bad patch! Reviews applied: [25864, 25865] Failed command: git apply --index 25865.patch Error: error: patch failed: src/Makefile.am:344 error: src/Makefile.am: patch does not apply error: patch failed: src/slave/containerizer/linux_launcher.cpp:105 error: src/slave/containerizer/linux_launcher.cpp: patch does not apply error: patch failed: src/slave/containerizer/mesos/containerizer.cpp:40 error: src/slave/containerizer/mesos/containerizer.cpp: patch does not apply - Mesos ReviewBot On Oct. 14, 2014, 8:35 p.m., Ian Downes wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25966/ --- (Updated Oct. 14, 2014, 8:35 p.m.) Review request for mesos, Jie Yu and Vinod Kone. Repository: mesos-git Description --- Check if a container is running in a pid namespace and thus all processes can be killed by the kernel, rather than using the freezer. Diffs - src/slave/containerizer/linux_launcher.cpp f7bc894830a7ca3f55465dacc7b653cdc2d7758b src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c src/tests/slave_recovery_tests.cpp 4fb357bd55f69f71193e92fd03765b808f932d33 Diff: https://reviews.apache.org/r/25966/diff/ Testing --- Thanks, Ian Downes
Review Request 26712: Add configuration check to libprocess for std::unique_ptr and std::move
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26712/ --- Review request for mesos and Niklas Nielsen. Repository: mesos-git Description --- As summary Diffs - 3rdparty/libprocess/m4/ax_cxx_compile_stdcxx_11.m4 07b298f151094e818287f741b3e0efd28374e82b Diff: https://reviews.apache.org/r/26712/diff/ Testing --- make g++-4.4, g++-4.6, g++-4.8, clang++ Thanks, Dominic Hamon
Re: Review Request 26583: Memory cleanup: libprocess delete garbage collector process
On Oct. 10, 2014, 5:15 p.m., Dominic Hamon wrote: 3rdparty/libprocess/src/process.cpp, line 1631 https://reviews.apache.org/r/26583/diff/1/?file=717970#file717970line1631 this can't land until https://reviews.apache.org/r/26289/ which contains the configure.ac check for std::unique_ptr support. Niklas Nielsen wrote: Dominic, what is the expected time frame for landing r26289? It contains more than just the check and we have a couple of patches that would depend on it. Can we pull out the check in a new patch to get going? just split it out: https://reviews.apache.org/r/26712/ - Dominic --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26583/#review56262 --- On Oct. 10, 2014, 4:12 p.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26583/ --- (Updated Oct. 10, 2014, 4:12 p.m.) Review request for mesos, Benjamin Hindman and Niklas Nielsen. Repository: mesos-git Description --- Wrap the libprocess garbage collector process with a unique ptr. Reset the ptr during finalization. Diffs - 3rdparty/libprocess/src/process.cpp 85fb995 Diff: https://reviews.apache.org/r/26583/diff/ Testing --- make check Thanks, Joris Van Remoortere
Re: Review Request 26426: Add --enable-debug flag to ./configure for controlling emission of debug information
On Oct. 14, 2014, 9:06 p.m., Timothy St. Clair wrote: configure.ac, line 281 https://reviews.apache.org/r/26426/diff/1/?file=714874#file714874line281 Is there a reason you want to leave debug symbols out of optimized builds? cmake has the pattern correct imho: Release Debug ReleaseWithDebug A ReleaseWithDebug allows packagers, such as myself, to build w/debugsymbols that are stripped out into a .debuginfo package which can be used by developers for tracing When bears attack. Granted that it is tenuous debugging at best, but it's better then nothing. So I think we want all three modes, stripping all debug information is not really idea. Cody Maloney wrote: My main motivation is to shrink the size of libmesos. Yesterday sugis in #mesos had one which was 213M. For the buildbot internally, full debian packages (which are compressed) of mesos weign in at 165M a piece (Yes, stripping post-build would help a lot, but why build it to begin with?). Most of this is debug info. Also, we build a bunch of different ways, and when libmesos is as big as it is, a decent amount of time ends up being spent on disk I/O reading / writing all the debug info when we are really just trying to ensure it builds on all the different platforms (Not to mention storage and file size shipping things around the network to centralized repositories). The simple toggle between debug and release, removing the legacy logic gets us most of the benefit. If you have a good place to point me for what is needed to get a 'ReleaseWithDebug' info build up and running I can definitely work on adding that as well. It is also entirely possible to specify custom CXXFLAGS + CFLAGS with this patch to get the old optimized debug build. The flags set by --enable-debug (or not having it), to get back the optimized debug build from before. - Cody --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/#review56573 --- On Oct. 14, 2014, 11:07 p.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/ --- (Updated Oct. 14, 2014, 11:07 p.m.) Review request for mesos, Benjamin Hindman and Timothy St. Clair. Repository: mesos-git Description --- Reworks building mesos in debug vs. release. By default, mesos is now built in release (no debug info, optimized build). If '--enable-debug' is specified to configure, than optimization will be turned off, and debug info will be turned on. This also adds a variable 'DEBUG' to the build environment, which people can use in code to see if mesos is built with debugging to enable extra assertions / checks. For release builds we may want to set 'NDEBUG' which removes assert()'s, but that is a seperate discussion. Main benefits: 1) Getting a build to include/exclude debug information at will is feasible. Before some things like using clang would forcibly enable debug info in all cases 2) libmesos.so and the other binaries which get packaged up for use in distributions shrink considerably without manually stripping post-build (Improves build time, makes packaging cleaner) Diffs - configure.ac 2b372e06006250b5230956ef096473e98f3fa590 Diff: https://reviews.apache.org/r/26426/diff/ Testing --- Built with both --enable-debug and without, checking that the flags get passed through correctly. Thanks, Cody Maloney
Re: Review Request 26426: Add --enable-debug flag to ./configure for controlling emission of debug information
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/ --- (Updated Oct. 14, 2014, 11:07 p.m.) Review request for mesos, Benjamin Hindman and Timothy St. Clair. Changes --- Switch to yes/no Repository: mesos-git Description --- Reworks building mesos in debug vs. release. By default, mesos is now built in release (no debug info, optimized build). If '--enable-debug' is specified to configure, than optimization will be turned off, and debug info will be turned on. This also adds a variable 'DEBUG' to the build environment, which people can use in code to see if mesos is built with debugging to enable extra assertions / checks. For release builds we may want to set 'NDEBUG' which removes assert()'s, but that is a seperate discussion. Main benefits: 1) Getting a build to include/exclude debug information at will is feasible. Before some things like using clang would forcibly enable debug info in all cases 2) libmesos.so and the other binaries which get packaged up for use in distributions shrink considerably without manually stripping post-build (Improves build time, makes packaging cleaner) Diffs (updated) - configure.ac 2b372e06006250b5230956ef096473e98f3fa590 Diff: https://reviews.apache.org/r/26426/diff/ Testing --- Built with both --enable-debug and without, checking that the flags get passed through correctly. Thanks, Cody Maloney
Re: Review Request 26712: Add configuration check to libprocess for std::unique_ptr and std::move
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26712/#review56600 --- Ship it! Ship It! - Niklas Nielsen On Oct. 14, 2014, 4:01 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26712/ --- (Updated Oct. 14, 2014, 4:01 p.m.) Review request for mesos and Niklas Nielsen. Repository: mesos-git Description --- As summary Diffs - 3rdparty/libprocess/m4/ax_cxx_compile_stdcxx_11.m4 07b298f151094e818287f741b3e0efd28374e82b Diff: https://reviews.apache.org/r/26712/diff/ Testing --- make g++-4.4, g++-4.6, g++-4.8, clang++ Thanks, Dominic Hamon
Re: Review Request 26289: Replace some shared_ptr with unique_ptr/Owned to clarify ownership passing.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26289/ --- (Updated Oct. 14, 2014, 4:29 p.m.) Review request for mesos and Jie Yu. Changes --- rebased Repository: mesos-git Description --- replace shared_ptr with Owned where ownership is being passed. Diffs (updated) - 3rdparty/libprocess/include/process/c++11/dispatch.hpp 76da2828cf36b6143d627dac66f3e0cc4416bae4 3rdparty/libprocess/include/process/defer.hpp ebe6f2db47cab2a3306288d8ebabb720e7cd8976 3rdparty/libprocess/include/process/delay.hpp 487f652c9e9b7c8c3aa8b4edc9e59789cffd8da9 3rdparty/libprocess/include/process/dispatch.hpp bceda2a2ae8963921e8e19660243a8644feab227 3rdparty/libprocess/include/process/event.hpp 294e2158876b25b06456a3619e547082d71e31ed 3rdparty/libprocess/src/process.cpp 85fb9958342f0bcdde322d9c55333126e6f86668 Diff: https://reviews.apache.org/r/26289/diff/ Testing --- make g++-4.4 and clang++-3.5 make check clang++-3.5 Thanks, Dominic Hamon
Re: Review Request 26709: Change from kill to stop for docker
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56602 --- src/docker/docker.cpp https://reviews.apache.org/r/26709/#comment96985 A 30 second timeout seems pretty arbitrary. I would suggest allowing the user to specify this as a arg. Something on the lines of: 1. If a slave system environment value (eg: `DOCKER_STOP_TIMEOUT`)is set we use that. 2. If user provides a `-t` option, instead of using the environment variable. 3. If nothing is set, kill as usual. The only reason why i suggest these is beacuse a lot of people may already have the expectation that killing a task will kill the container. This way, the upgrade process is a smooth upgrade flow. - Ankur Chauhan On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 26709: Change from kill to stop for docker
On Oct. 14, 2014, 11:39 p.m., Ankur Chauhan wrote: src/docker/docker.cpp, line 472 https://reviews.apache.org/r/26709/diff/1/?file=721122#file721122line472 A 30 second timeout seems pretty arbitrary. I would suggest allowing the user to specify this as a arg. Something on the lines of: 1. If a slave system environment value (eg: `DOCKER_STOP_TIMEOUT`)is set we use that. 2. If user provides a `-t` option, instead of using the environment variable. 3. If nothing is set, kill as usual. The only reason why i suggest these is beacuse a lot of people may already have the expectation that killing a task will kill the container. This way, the upgrade process is a smooth upgrade flow. Do you think we should kill as usual, or just set a smaller default timeout - i.e. 1 second as the default and allow configuring via env or argument? I know it slightly changes the behaviour of previous versions, but it would simplify the code path a little by having the one container stop/kill command. - Ryan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56602 --- On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 26709: Change from kill to stop for docker
On Oct. 14, 2014, 11:39 p.m., Ankur Chauhan wrote: src/docker/docker.cpp, line 472 https://reviews.apache.org/r/26709/diff/1/?file=721122#file721122line472 A 30 second timeout seems pretty arbitrary. I would suggest allowing the user to specify this as a arg. Something on the lines of: 1. If a slave system environment value (eg: `DOCKER_STOP_TIMEOUT`)is set we use that. 2. If user provides a `-t` option, instead of using the environment variable. 3. If nothing is set, kill as usual. The only reason why i suggest these is beacuse a lot of people may already have the expectation that killing a task will kill the container. This way, the upgrade process is a smooth upgrade flow. Ryan Thomas wrote: Do you think we should kill as usual, or just set a smaller default timeout - i.e. 1 second as the default and allow configuring via env or argument? I know it slightly changes the behaviour of previous versions, but it would simplify the code path a little by having the one container stop/kill command. I think that's pretty agreeable but I don't know how docker deals with timeout set to something obscene like: 1. Negative timeout 2. Really large timeout 3. 0 timeout These may be trivial questions but I would want to be defensive about these cases. Probably expanding the test suite is in order. - Ankur --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56602 --- On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 26709: Change from kill to stop for docker
On Oct. 14, 2014, 11:39 p.m., Ankur Chauhan wrote: src/docker/docker.cpp, line 472 https://reviews.apache.org/r/26709/diff/1/?file=721122#file721122line472 A 30 second timeout seems pretty arbitrary. I would suggest allowing the user to specify this as a arg. Something on the lines of: 1. If a slave system environment value (eg: `DOCKER_STOP_TIMEOUT`)is set we use that. 2. If user provides a `-t` option, instead of using the environment variable. 3. If nothing is set, kill as usual. The only reason why i suggest these is beacuse a lot of people may already have the expectation that killing a task will kill the container. This way, the upgrade process is a smooth upgrade flow. Ryan Thomas wrote: Do you think we should kill as usual, or just set a smaller default timeout - i.e. 1 second as the default and allow configuring via env or argument? I know it slightly changes the behaviour of previous versions, but it would simplify the code path a little by having the one container stop/kill command. Ankur Chauhan wrote: I think that's pretty agreeable but I don't know how docker deals with timeout set to something obscene like: 1. Negative timeout 2. Really large timeout 3. 0 timeout These may be trivial questions but I would want to be defensive about these cases. Probably expanding the test suite is in order. I'll add to the test suite too with these changes. I think we should also enforce some bounds for the values, i.e. 0 and = (executor timeout) / 2 or something like that. - Ryan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56602 --- On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 26709: Change from kill to stop for docker
On Oct. 14, 2014, 11:39 p.m., Ankur Chauhan wrote: src/docker/docker.cpp, line 472 https://reviews.apache.org/r/26709/diff/1/?file=721122#file721122line472 A 30 second timeout seems pretty arbitrary. I would suggest allowing the user to specify this as a arg. Something on the lines of: 1. If a slave system environment value (eg: `DOCKER_STOP_TIMEOUT`)is set we use that. 2. If user provides a `-t` option, instead of using the environment variable. 3. If nothing is set, kill as usual. The only reason why i suggest these is beacuse a lot of people may already have the expectation that killing a task will kill the container. This way, the upgrade process is a smooth upgrade flow. Ryan Thomas wrote: Do you think we should kill as usual, or just set a smaller default timeout - i.e. 1 second as the default and allow configuring via env or argument? I know it slightly changes the behaviour of previous versions, but it would simplify the code path a little by having the one container stop/kill command. Ankur Chauhan wrote: I think that's pretty agreeable but I don't know how docker deals with timeout set to something obscene like: 1. Negative timeout 2. Really large timeout 3. 0 timeout These may be trivial questions but I would want to be defensive about these cases. Probably expanding the test suite is in order. Ryan Thomas wrote: I'll add to the test suite too with these changes. I think we should also enforce some bounds for the values, i.e. 0 and = (executor timeout) / 2 or something like that. ++ - Ankur --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56602 --- On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 25549: Basic filesystem isolator for Linux.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25549/#review56609 --- Patch looks great! Reviews applied: [26273, 25861, 24177, 25655, 25549] All tests passed. - Mesos ReviewBot On Oct. 14, 2014, 8:31 p.m., Ian Downes wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25549/ --- (Updated Oct. 14, 2014, 8:31 p.m.) Review request for mesos, Ben Mahler, Jie Yu, and Vinod Kone. Bugs: MESOS-1586 https://issues.apache.org/jira/browse/MESOS-1586 Repository: mesos-git Description --- Does not report usage or enforce quota but can create 'private' directories for each container which mask parts of the shared host filesystem. This review replaces https://reviews.apache.org/r/24178/ because of some file renaming. I addressed all comments from earlier reviews. Diffs - include/mesos/mesos.proto 6b93e9000761857c4f335f2a8c8088e155078f54 src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/common/parse.hpp c9ca30fde5580d8e388f3c616e78df2b032ac0ad src/common/type_utils.hpp f10b2d31f2781bd056898f93eb8e8b6d9809e80b src/slave/containerizer/isolators/filesystem/shared.hpp PRE-CREATION src/slave/containerizer/isolators/filesystem/shared.cpp PRE-CREATION src/slave/containerizer/linux_launcher.cpp f7bc894830a7ca3f55465dacc7b653cdc2d7758b src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c src/slave/flags.hpp 16f0cc2ab5ba16a39499608174278b3082e0585d src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/isolator_tests.cpp c38f87632cb6984543cb3767dbd656cde7459610 src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 Diff: https://reviews.apache.org/r/25549/diff/ Testing --- make check # added a test Thanks, Ian Downes
Re: Review Request 26709: Change from kill to stop for docker
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56611 --- src/docker/docker.cpp https://reviews.apache.org/r/26709/#comment96993 I'm not sure what executor timeout you're referring to, but for example the executor signal escalation timeout although similiar is not really intended for docker containerizer shutdown. In all our containerizers we simply kill (SIGKILL) the tasks. Let's rename the method and also make the timeout be a method param that we can pass in. - Timothy Chen On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 26150: Libprocess Benchmark
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26150/#review56614 --- Bad patch! Reviews applied: [26150] Failed command: ./support/mesos-style.py Error: Checking 520 files using filter --filter=-,+build/class,+build/deprecated,+build/endif_comment,+readability/todo,+readability/namespace,+runtime/vlog,+whitespace/blank_line,+whitespace/comma,+whitespace/end_of_line,+whitespace/ending_newline,+whitespace/forcolon,+whitespace/indent,+whitespace/line_length,+whitespace/operators,+whitespace/semicolon,+whitespace/tab,+whitespace/todo 3rdparty/libprocess/src/tests/benchmarks.cpp:127: Missing space after ; [whitespace/semicolon] [3] Total errors found: 1 - Mesos ReviewBot On Oct. 14, 2014, 9:30 p.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26150/ --- (Updated Oct. 14, 2014, 9:30 p.m.) Review request for mesos and Niklas Nielsen. Bugs: MESOS-1840 https://issues.apache.org/jira/browse/MESOS-1840 Repository: mesos-git Description --- A benchmark for libprocess. It forks num_proc times and launched num_threads libprocess processes in each child. They are aware of the master's (parent) address and all play ping pong with it. This benchmark measures throughput in terms of the number of RPCs handled per second using persistent (linked) connections. A new test file (benchmarks) is introduced because we want to fork before libprocess is initialized. This allows us to prevent short-circuiting of message passing between processes under the same ProcessManager. This way we force the execution path of the underlying event management system. Diffs - 3rdparty/libprocess/Makefile.am 616618e 3rdparty/libprocess/src/tests/benchmarks.cpp PRE-CREATION Diff: https://reviews.apache.org/r/26150/diff/ Testing --- make check in 3rdparty/libprocess support/mesos-style.py Thanks, Joris Van Remoortere
Re: Review Request 26709: Change from kill to stop for docker
On Oct. 15, 2014, 12:15 a.m., Timothy Chen wrote: src/docker/docker.cpp, line 472 https://reviews.apache.org/r/26709/diff/1/?file=721122#file721122line472 I'm not sure what executor timeout you're referring to, but for example the executor signal escalation timeout although similiar is not really intended for docker containerizer shutdown. In all our containerizers we simply kill (SIGKILL) the tasks. Let's rename the method and also make the timeout be a method param that we can pass in. Ryan Thomas wrote: Ok will do. I think we do need some sanity checking on the param though, maybe we could just set a hard upper bounds on the value - 60 seconds? Yes, I basically had this in mind (mind the TODO) ``` string cmd = path + kill + container; if(os::hasenv(MESOS_DOCKER_STOP_TIMEOUT)) { const string timeout = os::getenv(MESOS_DOCKER_STOP_TIMEOUT, false); // check if it is there and is numeric if(string != NULL std::regex_match(timeout, std::regex([0-9]+))) { // it is numeric // TODO: Verify the value us something sane cmd = path + stop -t + timeout + + container; } } ``` I definitely agree with Timothy's comment to rename methods and such. I (personal opinion) wouldn't want to put in any hard coded values - but would definitely support 60 secs as a good default value for a timeout. - Ankur --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56611 --- On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Review Request 26715: Supported O_CLOEXEC for os::open on all platforms.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26715/ --- Review request for mesos, Ben Mahler and Vinod Kone. Repository: mesos-git Description --- This patch allows us to use O_CLOEXEC even if it's not available on some distributions. Will follow up with a patch to cleanup the code. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp ec41fe412290a152a15a49e091f0bc6436e42cfd 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 898d175980e397493270454aafdf340a96c9cf85 Diff: https://reviews.apache.org/r/26715/diff/ Testing --- make check both on mac and linux Thanks, Jie Yu
Re: Review Request 26709: Change from kill to stop for docker
On Oct. 15, 2014, 12:15 a.m., Timothy Chen wrote: src/docker/docker.cpp, line 472 https://reviews.apache.org/r/26709/diff/1/?file=721122#file721122line472 I'm not sure what executor timeout you're referring to, but for example the executor signal escalation timeout although similiar is not really intended for docker containerizer shutdown. In all our containerizers we simply kill (SIGKILL) the tasks. Let's rename the method and also make the timeout be a method param that we can pass in. Ryan Thomas wrote: Ok will do. I think we do need some sanity checking on the param though, maybe we could just set a hard upper bounds on the value - 60 seconds? Ankur Chauhan wrote: Yes, I basically had this in mind (mind the TODO) ``` string cmd = path + kill + container; if(os::hasenv(MESOS_DOCKER_STOP_TIMEOUT)) { const string timeout = os::getenv(MESOS_DOCKER_STOP_TIMEOUT, false); // check if it is there and is numeric if(string != NULL std::regex_match(timeout, std::regex([0-9]+))) { // it is numeric // TODO: Verify the value us something sane cmd = path + stop -t + timeout + + container; } } ``` I definitely agree with Timothy's comment to rename methods and such. I (personal opinion) wouldn't want to put in any hard coded values - but would definitely support 60 secs as a good default value for a timeout. Although it makes sense to not let a large timeout be specified, what large means seems subjective enough that I'm not sure what is a good upper bounds we can use. My intuition is that we have the following options: 1) We set a default timeout to be a relative small number (ie 10-15) for now (with a constant), since it's already a expectation that we kill tasks so we don't want to have a huge shutdown. 2) Allow the timeout to be configurable and let the user adjust the best configuration I'm not sure yet what's the best approach, but I think going for 2) will be the most flexible. As for the option, let's expose it as a slave flag instead of using envrionemt variables here. - Timothy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/#review56611 --- On Oct. 14, 2014, 9:56 p.m., Ryan Thomas wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26709/ --- (Updated Oct. 14, 2014, 9:56 p.m.) Review request for mesos. Bugs: MESOS-1925 https://issues.apache.org/jira/browse/MESOS-1925 Repository: mesos-git Description --- Change docker kill to docker stop to allow graceful shutdown of containers. Diffs - src/docker/docker.cpp e09b51c4f5101c3a8e77caf12b208c88f47fbcb2 Diff: https://reviews.apache.org/r/26709/diff/ Testing --- Thanks, Ryan Thomas
Re: Review Request 24535: Added functionality to create SVN based diffs of arbitrary strings.
On Oct. 15, 2014, 12:22 a.m., Ben Mahler wrote: Looks good modulo some issues below, would like to take a final pass when you update the APR() abstraction to be thread safe. Hm.. have you looked at whether the other calls here are thread safe? For example, svn_pool_create(). - Ben --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24535/#review56597 --- On Oct. 12, 2014, 4:14 a.m., Benjamin Hindman wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24535/ --- (Updated Oct. 12, 2014, 4:14 a.m.) Review request for mesos, Ben Mahler and Jie Yu. Repository: mesos-git Description --- See summary. Note that this hard codes the location of the subversion and Apache Portable Runtime (APR) headers. Diffs - 3rdparty/libprocess/3rdparty/Makefile.am 256df0bb5557ebe6c75099d35c284804c9e57253 3rdparty/libprocess/3rdparty/stout/include/stout/svn.hpp PRE-CREATION 3rdparty/libprocess/3rdparty/stout/tests/svn_tests.cpp PRE-CREATION Diff: https://reviews.apache.org/r/24535/diff/ Testing --- make check Thanks, Benjamin Hindman
Jenkins build is back to normal : Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2444
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2444/changes
Review Request 26723: Escape JSON object keys
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26723/ --- Review request for mesos and Adam B. Repository: mesos-git Description --- All strings in JSON need to be escaped, including the keys in dictionaries / objects. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 719aa964a536cf02dbd2de440157de487ec703b1 Diff: https://reviews.apache.org/r/26723/diff/ Testing --- make distcheck Thanks, Cody Maloney
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review56636 --- Patch looks great! Reviews applied: [17431] All tests passed. - Mesos ReviewBot On Oct. 14, 2014, 10:54 p.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Oct. 14, 2014, 10:54 p.m.) Review request for mesos, Michael Lunøe and Niklas Nielsen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html 25caf530628ad3ac7f23ab5f014000aac8583da1 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js 41a70a80442501a2bf7b217939dbe504662941d2 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Connection Issue Alert https://reviews.apache.org/media/uploaded/files/2014/01/28/dee8df12-0bae-48b5-a7ce-c07e0266c790__Screenshot_2014-01-28_12.44.53.png Thanks, Thomas Rampelberg
Re: Review Request 26426: Add --enable-debug flag to ./configure for controlling emission of debug information
On Oct. 14, 2014, 11:11 p.m., Dominic Hamon wrote: configure.ac, line 281 https://reviews.apache.org/r/26426/diff/2/?file=721143#file721143line281 aside: you might want to consider Os for release. It'll keep the size down and will often be as performant, even without rigourous optimizations, due to cache friendliness. You almost always want -O2 over -Os, only typically in embedded situations does that change. - Timothy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/#review56598 --- On Oct. 14, 2014, 11:07 p.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/ --- (Updated Oct. 14, 2014, 11:07 p.m.) Review request for mesos, Benjamin Hindman and Timothy St. Clair. Repository: mesos-git Description --- Reworks building mesos in debug vs. release. By default, mesos is now built in release (no debug info, optimized build). If '--enable-debug' is specified to configure, than optimization will be turned off, and debug info will be turned on. This also adds a variable 'DEBUG' to the build environment, which people can use in code to see if mesos is built with debugging to enable extra assertions / checks. For release builds we may want to set 'NDEBUG' which removes assert()'s, but that is a seperate discussion. Main benefits: 1) Getting a build to include/exclude debug information at will is feasible. Before some things like using clang would forcibly enable debug info in all cases 2) libmesos.so and the other binaries which get packaged up for use in distributions shrink considerably without manually stripping post-build (Improves build time, makes packaging cleaner) Diffs - configure.ac 2b372e06006250b5230956ef096473e98f3fa590 Diff: https://reviews.apache.org/r/26426/diff/ Testing --- Built with both --enable-debug and without, checking that the flags get passed through correctly. Thanks, Cody Maloney
Review Request 26724: Remove --without-cxx11 flag
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26724/ --- Review request for mesos and Dominic Hamon. Repository: mesos-git Description --- C++11 is required now. Specifying this flag makes no change in the actual compilation of mesos. Diffs - configure.ac 2b372e06006250b5230956ef096473e98f3fa590 Diff: https://reviews.apache.org/r/26724/diff/ Testing --- grep -r 'with_cxx11' in the mesos repository Thanks, Cody Maloney
Re: Review Request 26426: Add --enable-debug flag to ./configure for controlling emission of debug information
On Oct. 14, 2014, 9:06 p.m., Timothy St. Clair wrote: configure.ac, line 281 https://reviews.apache.org/r/26426/diff/1/?file=714874#file714874line281 Is there a reason you want to leave debug symbols out of optimized builds? cmake has the pattern correct imho: Release Debug ReleaseWithDebug A ReleaseWithDebug allows packagers, such as myself, to build w/debugsymbols that are stripped out into a .debuginfo package which can be used by developers for tracing When bears attack. Granted that it is tenuous debugging at best, but it's better then nothing. So I think we want all three modes, stripping all debug information is not really idea. Cody Maloney wrote: My main motivation is to shrink the size of libmesos. Yesterday sugis in #mesos had one which was 213M. For the buildbot internally, full debian packages (which are compressed) of mesos weign in at 165M a piece (Yes, stripping post-build would help a lot, but why build it to begin with?). Most of this is debug info. Also, we build a bunch of different ways, and when libmesos is as big as it is, a decent amount of time ends up being spent on disk I/O reading / writing all the debug info when we are really just trying to ensure it builds on all the different platforms (Not to mention storage and file size shipping things around the network to centralized repositories). The simple toggle between debug and release, removing the legacy logic gets us most of the benefit. If you have a good place to point me for what is needed to get a 'ReleaseWithDebug' info build up and running I can definitely work on adding that as well. Cody Maloney wrote: It is also entirely possible to specify custom CXXFLAGS + CFLAGS with this patch to get the old optimized debug build. The flags set by --enable-debug (or not having it), to get back the optimized debug build from before. I suppose rpm builds default CXXFLAGS to '-O2 -g ...', so I can buy the argument of just making it easier on the average person. - Timothy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/#review56573 --- On Oct. 14, 2014, 11:07 p.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26426/ --- (Updated Oct. 14, 2014, 11:07 p.m.) Review request for mesos, Benjamin Hindman and Timothy St. Clair. Repository: mesos-git Description --- Reworks building mesos in debug vs. release. By default, mesos is now built in release (no debug info, optimized build). If '--enable-debug' is specified to configure, than optimization will be turned off, and debug info will be turned on. This also adds a variable 'DEBUG' to the build environment, which people can use in code to see if mesos is built with debugging to enable extra assertions / checks. For release builds we may want to set 'NDEBUG' which removes assert()'s, but that is a seperate discussion. Main benefits: 1) Getting a build to include/exclude debug information at will is feasible. Before some things like using clang would forcibly enable debug info in all cases 2) libmesos.so and the other binaries which get packaged up for use in distributions shrink considerably without manually stripping post-build (Improves build time, makes packaging cleaner) Diffs - configure.ac 2b372e06006250b5230956ef096473e98f3fa590 Diff: https://reviews.apache.org/r/26426/diff/ Testing --- Built with both --enable-debug and without, checking that the flags get passed through correctly. Thanks, Cody Maloney
Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2170
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2170/changes Changes: [dhamon] Add configuration check to libprocess for std::unique_ptr and std::move -- [...truncated 29021 lines...] I1015 01:32:22.666582 20672 hierarchical_allocator_process.hpp:659] Performed allocation for 3 slaves in 228550ns I1015 01:32:23.473719 20663 monitor.cpp:140] Failed to collect resource usage for container '7fb85347-9101-411d-9311-22d20c9ef131' for executor 'default' of framework '20141015-010435-3142697795-50358-20631-': Unknown container: 7fb85347-9101-411d-9311-22d20c9ef131 I1015 01:32:23.473742 20663 monitor.cpp:140] Failed to collect resource usage for container '6febf425-aeea-4111-9403-580715eaa77e' for executor 'default' of framework '20141015-010435-3142697795-50358-20631-': Unknown container: 6febf425-aeea-4111-9403-580715eaa77e I1015 01:32:23.666925 20664 hierarchical_allocator_process.hpp:734] Offering mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2 on slave 20141015-010435-3142697795-50358-20631-2 to framework 20141015-010435-3142697795-50358-20631- I1015 01:32:23.667067 20664 hierarchical_allocator_process.hpp:734] Offering cpus(*):1; mem(*):10112; disk(*):3.70122e+06; ports(*):[31000-32000] on slave 20141015-010435-3142697795-50358-20631-1 to framework 20141015-010435-3142697795-50358-20631- I1015 01:32:23.667199 20664 hierarchical_allocator_process.hpp:734] Offering mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2 on slave 20141015-010435-3142697795-50358-20631-0 to framework 20141015-010435-3142697795-50358-20631- I1015 01:32:23.667438 20664 hierarchical_allocator_process.hpp:659] Performed allocation for 3 slaves in 559469ns I1015 01:32:23.667721 20663 master.cpp:3729] Sending 3 offers to framework 20141015-010435-3142697795-50358-20631- (Low-Level Scheduler using pthread (C++)) at scheduler(1)@67.195.81.187:50358 I1015 01:32:23.667865 20663 scheduler.cpp:582] Enqueuing event 3 from master@67.195.81.187:50358 Received an OFFERS event Received offer 20141015-010435-3142697795-50358-20631-998 with mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2 Received offer 20141015-010435-3142697795-50358-20631-999 with mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2 Received offer 20141015-010435-3142697795-50358-20631-1000 with cpus(*):1; mem(*):10112; disk(*):3.70122e+06; ports(*):[31000-32000] I1015 01:32:23.668670 20669 master.cpp:2315] Processing reply for offers: [ 20141015-010435-3142697795-50358-20631-998 ] on slave 20141015-010435-3142697795-50358-20631-0 at slave(3)@67.195.81.187:50358 (pomona.apache.org) for framework 20141015-010435-3142697795-50358-20631- (Low-Level Scheduler using pthread (C++)) at scheduler(1)@67.195.81.187:50358 I1015 01:32:23.668844 20669 master.cpp:2315] Processing reply for offers: [ 20141015-010435-3142697795-50358-20631-999 ] on slave 20141015-010435-3142697795-50358-20631-2 at slave(2)@67.195.81.187:50358 (pomona.apache.org) for framework 20141015-010435-3142697795-50358-20631- (Low-Level Scheduler using pthread (C++)) at scheduler(1)@67.195.81.187:50358 I1015 01:32:23.668985 20669 master.cpp:2315] Processing reply for offers: [ 20141015-010435-3142697795-50358-20631-1000 ] on slave 20141015-010435-3142697795-50358-20631-1 at slave(1)@67.195.81.187:50358 (pomona.apache.org) for framework 20141015-010435-3142697795-50358-20631- (Low-Level Scheduler using pthread (C++)) at scheduler(1)@67.195.81.187:50358 I1015 01:32:23.669246 20659 hierarchical_allocator_process.hpp:563] Recovered mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2 (total allocatable: mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2) on slave 20141015-010435-3142697795-50358-20631-0 from framework 20141015-010435-3142697795-50358-20631- I1015 01:32:23.669299 20659 hierarchical_allocator_process.hpp:599] Framework 20141015-010435-3142697795-50358-20631- filtered slave 20141015-010435-3142697795-50358-20631-0 for 5secs I1015 01:32:23.669394 20659 hierarchical_allocator_process.hpp:563] Recovered mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2 (total allocatable: mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2) on slave 20141015-010435-3142697795-50358-20631-2 from framework 20141015-010435-3142697795-50358-20631- I1015 01:32:23.669420 20659 hierarchical_allocator_process.hpp:599] Framework 20141015-010435-3142697795-50358-20631- filtered slave 20141015-010435-3142697795-50358-20631-2 for 5secs I1015 01:32:23.669499 20659 hierarchical_allocator_process.hpp:563] Recovered cpus(*):1; mem(*):10112; disk(*):3.70122e+06; ports(*):[31000-32000] (total allocatable: cpus(*):1; mem(*):10112; disk(*):3.70122e+06; ports(*):[31000-32000]) on slave 20141015-010435-3142697795-50358-20631-1 from
Review Request 26728: Fixed minor typos in configure.ac.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26728/ --- Review request for mesos, Cody Maloney and Niklas Nielsen. Repository: mesos-git Description --- Fixed minor typos in configure.ac. Diffs - configure.ac da1c82db31583fc81de658574b9a95628cb84dbc Diff: https://reviews.apache.org/r/26728/diff/ Testing --- Thanks, Kapil Arya
Review Request 26727: Hooked Isolator module to mesos containerizer.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26727/ --- Review request for mesos and Niklas Nielsen. Repository: mesos-git Description --- Hooked Isolator module to mesos containerizer. Diffs - src/module/manager.hpp 797728a8c8e88dd1a13142a355cbe0b1491fb7a2 src/slave/containerizer/mesos/containerizer.cpp 9d083294caa5c5a47ba3ceaa1b57346144cb795c Diff: https://reviews.apache.org/r/26727/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 26715: Supported O_CLOEXEC for os::open on all platforms.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26715/#review56642 --- Patch looks great! Reviews applied: [26715] All tests passed. - Mesos ReviewBot On Oct. 15, 2014, 12:24 a.m., Jie Yu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26715/ --- (Updated Oct. 15, 2014, 12:24 a.m.) Review request for mesos, Ben Mahler and Vinod Kone. Repository: mesos-git Description --- This patch allows us to use O_CLOEXEC even if it's not available on some distributions. Will follow up with a patch to cleanup the code. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp ec41fe412290a152a15a49e091f0bc6436e42cfd 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 898d175980e397493270454aafdf340a96c9cf85 Diff: https://reviews.apache.org/r/26715/diff/ Testing --- make check both on mac and linux Thanks, Jie Yu
Re: Review Request 25184: Delete framework data in TaskStatus to avoid OOM
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25184/ --- (Updated Oct. 15, 2014, 10:23 a.m.) Review request for mesos, Adam B and Timothy St. Clair. Bugs: MESOS-1746 https://issues.apache.org/jira/browse/MESOS-1746 Repository: mesos-git Description --- There was a bug found that Spark use TaskStatus.data to transfer computed result and mesos-master RES memory keeps increasing fast and finally will be killed by OOM killer. Diffs - src/master/master.cpp cb46cec0674b3aa031450c5b4f48f4f8bb92767d Diff: https://reviews.apache.org/r/25184/diff/ Testing (updated) --- tested with spark. It's very easy to reproduce this issue (100%) with spark, when spark use mesos as resource manager, its executor driver will put result into TaskStatus. For example, a result of a single task like below. 14/08/22 13:29:18 INFO Executor: Serialized size of result for 248 is 17573033 It's about 16MB large, and a stage of spark generally consist of maybe hundreds of task and finished in tens of seconds, this will put mesos get killed by OOM killer soon. Thanks, Chengwei Yang
Re: Review Request 25525: MESOS-1739: Allow slave reconfiguration on restart
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25525/ --- (Updated Oct. 15, 2014, 3:06 a.m.) Review request for mesos, Adam B and Vinod Kone. Changes --- Update to address review comments. Bugs: MESOS-1739 https://issues.apache.org/jira/browse/MESOS-1739 Repository: mesos-git Description --- Allows attributes and resources to be set to a superset of what they were previously on a slave restart. Incorporates all comments from: https://issues.apache.org/jira/browse/MESOS-1739 and the former review request: https://reviews.apache.org/r/25111/ Diffs (updated) - src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/common/attributes.hpp 0a043d5b5dca804c6dd215cabd2704f24df71a33 src/common/attributes.cpp aab114e1a5932e3f218b850e1afc7f2ef0f10e21 src/common/slaveinfo_utils.hpp PRE-CREATION src/common/slaveinfo_utils.cpp PRE-CREATION src/master/allocator.hpp 02d20d0cc802805bc702891306aa42894531b223 src/master/hierarchical_allocator_process.hpp 31dfb2cc86e2e74da43f05fbada10976ce65f3e4 src/master/master.hpp 14f1d0fd240c9cd0718d0238e1fbb9c733190205 src/master/master.cpp 1b1ce0de3065ced45890777d42c69ef1091f5699 src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/attributes_tests.cpp 240a8cac18ac12178cf73e8eeb88bd50e3fcc03b src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 src/tests/slave_tests.cpp f585bdd20ae1af466f2c1b4d85331ac67451552f Diff: https://reviews.apache.org/r/25525/diff/ Testing --- make check on localhost Thanks, Cody Maloney
Re: Review Request 25525: MESOS-1739: Allow slave reconfiguration on restart
On Oct. 15, 2014, 1:25 a.m., Vinod Kone wrote: src/master/master.hpp, line 1308 https://reviews.apache.org/r/25525/diff/8/?file=717604#file717604line1308 This should be CHECK_SOME(compatible). True, a incompatible change shouldn't get that far because there are advance checks, but practically the registrar will do the right thing / recover in the case of a compatibility error here. CHECK_SOME() means that we will always fatally exit and kill the master, which seems worse to me. On Oct. 15, 2014, 1:25 a.m., Vinod Kone wrote: src/master/master.cpp, lines 3100-3102 https://reviews.apache.org/r/25525/diff/8/?file=717605#file717605line3100 ``` LOG(WARNING) Slave *slave attempted to re-register with incompatible info: compatible.error() ; shutting it down ``` Also, maybe we don't need to explicitly 'removeSlave(slave)' here? If a compatible slave is not re-registered within timeout, master will automatically remove it. Updated the message. I would rather explicitly tell the slave to shutdown than wait, otherwise the slave will continue trying to reregister with the master, the master will keep saying no, until the timeout at which point we will tell the slave to shutdown when it tries to re-authenticate with the message Slave attempted to re-register after removal. Waiting gives the operator worse information, and makes it a longer loop for getting the slave back into the cluster, than giving the notification, and it really isn't much (or very complex) code. On Oct. 15, 2014, 1:25 a.m., Vinod Kone wrote: src/master/master.cpp, lines 3146-3149 https://reviews.apache.org/r/25525/diff/8/?file=717605#file717605line3146 Hmm. i don't think we want these semantics. Why not just fail if readmission failed? So if the registrar actually fails to readmit the slave, then the master will go down on a LOG(FATAL). In the case where we couldn't readmit the slave for some reason (Ex: the registrar thought the upgrade was incompatible, hit the fall-through case in master/master.hpp), then the slave will be sent a shutdown message and explicitly removed at that point. So I think the semantics you want are there / the comment can just be removed. On Oct. 15, 2014, 1:25 a.m., Vinod Kone wrote: src/master/master.cpp, lines 3169-3171 https://reviews.apache.org/r/25525/diff/8/?file=717605#file717605line3169 This seems very unintuitve for users/operators. Lets fix this in this patch rather than doing a TODO. Also, AFAICT, the changes are not lost because both the registrar and the allocator are informed about the new slave info. Am I missing something? Note this isn't a fall through from the above code. All the above code exits. At this point, we haven't talked to the registrar or allocator. All that has happened since reciept of the message is authentication, checking that the slave hasn't already been removed, and checking that the master doesn't already know about the slave (getSlave reutrns null). So this can be triggered by a slave reregistering, and that first registration being sent to the Registrar. While that registration is happening, the same slave sends another reregistration message, this time with updated SlaveInfo. The re-registration check will be hit, and the updated SlaveInfo will be discarded. per the check. The solutions I can see: 1) Send a message back to the slave that we already got a reregistration with the last registration's SlaveInfo, and if it doesn't want that one it needs to update it's registration after that one has completed 2) Make it so we can queue multiple in-flight reregistrations for a single slave (No idea if this will break anything / would work if you just removed the check). I worry a little with this option that there could be a lot of reregistrations in flight at the same time to the master and it could get overwhelmed doing extra work where before there was an early exit. On Oct. 15, 2014, 1:25 a.m., Vinod Kone wrote: src/common/slaveinfo_utils.cpp, lines 64-96 https://reviews.apache.org/r/25525/diff/8/?file=717601#file717601line64 Hmm. This is still hard to follow. Using variadic templated function here doesn't make it any simpler to read/follow. Infact it makes it harder because now I need to read understand what the template function does. I think explicitly checking each field (as suggested in the earlier comment) reads better. So I can do that, which gives: ```c++ if (!(newSlaveInfo.id() == oldSlaveInfo.id())) { return Error(id cannot be changed (old: + stringify(oldSlaveInfo.id()) + new: + stringify(newSlaveInfo.id()) + )); } if (newSlaveInfo.hostname() != oldSlaveInfo.hostname()) { return Error(hostname cannot be changed (old: + stringify(oldSlaveInfo.hostname()) + new: +
Re: Review Request 25525: MESOS-1739: Allow slave reconfiguration on restart
- Cody --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25525/#review56590 --- On Oct. 15, 2014, 3:13 a.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25525/ --- (Updated Oct. 15, 2014, 3:13 a.m.) Review request for mesos, Adam B and Vinod Kone. Bugs: MESOS-1739 https://issues.apache.org/jira/browse/MESOS-1739 Repository: mesos-git Description --- Allows attributes and resources to be set to a superset of what they were previously on a slave restart. Incorporates all comments from: https://issues.apache.org/jira/browse/MESOS-1739 and the former review request: https://reviews.apache.org/r/25111/ Diffs - src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/common/attributes.hpp 0a043d5b5dca804c6dd215cabd2704f24df71a33 src/common/attributes.cpp aab114e1a5932e3f218b850e1afc7f2ef0f10e21 src/common/slaveinfo_utils.hpp PRE-CREATION src/common/slaveinfo_utils.cpp PRE-CREATION src/master/allocator.hpp 02d20d0cc802805bc702891306aa42894531b223 src/master/hierarchical_allocator_process.hpp 31dfb2cc86e2e74da43f05fbada10976ce65f3e4 src/master/master.hpp 14f1d0fd240c9cd0718d0238e1fbb9c733190205 src/master/master.cpp 1b1ce0de3065ced45890777d42c69ef1091f5699 src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/attributes_tests.cpp 240a8cac18ac12178cf73e8eeb88bd50e3fcc03b src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 src/tests/slave_tests.cpp f585bdd20ae1af466f2c1b4d85331ac67451552f Diff: https://reviews.apache.org/r/25525/diff/ Testing --- make check on localhost Thanks, Cody Maloney
Re: Review Request 25525: MESOS-1739: Allow slave reconfiguration on restart
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25525/ --- (Updated Oct. 15, 2014, 3:13 a.m.) Review request for mesos, Adam B and Vinod Kone. Changes --- Missed a couple comments Bugs: MESOS-1739 https://issues.apache.org/jira/browse/MESOS-1739 Repository: mesos-git Description --- Allows attributes and resources to be set to a superset of what they were previously on a slave restart. Incorporates all comments from: https://issues.apache.org/jira/browse/MESOS-1739 and the former review request: https://reviews.apache.org/r/25111/ Diffs (updated) - src/Makefile.am d503c8df73cda15a9d59254e8265e4a5d0e003a4 src/common/attributes.hpp 0a043d5b5dca804c6dd215cabd2704f24df71a33 src/common/attributes.cpp aab114e1a5932e3f218b850e1afc7f2ef0f10e21 src/common/slaveinfo_utils.hpp PRE-CREATION src/common/slaveinfo_utils.cpp PRE-CREATION src/master/allocator.hpp 02d20d0cc802805bc702891306aa42894531b223 src/master/hierarchical_allocator_process.hpp 31dfb2cc86e2e74da43f05fbada10976ce65f3e4 src/master/master.hpp 14f1d0fd240c9cd0718d0238e1fbb9c733190205 src/master/master.cpp 1b1ce0de3065ced45890777d42c69ef1091f5699 src/slave/slave.cpp 0e342ed35e3db3b68f9f32b6cf4ace23e4a4db38 src/tests/attributes_tests.cpp 240a8cac18ac12178cf73e8eeb88bd50e3fcc03b src/tests/mesos.hpp 957e2233cc11c438fd80d3b6d1907a1223093104 src/tests/slave_tests.cpp f585bdd20ae1af466f2c1b4d85331ac67451552f Diff: https://reviews.apache.org/r/25525/diff/ Testing --- make check on localhost Thanks, Cody Maloney
Re: Review Request 26723: Escape JSON object keys
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26723/#review56644 --- LGTM, but can you include some motivation here, since there's no JIRA ticket to track the issue? How did you come across this? What kinds of issues could have arisen? - Adam B On Oct. 14, 2014, 6:12 p.m., Cody Maloney wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26723/ --- (Updated Oct. 14, 2014, 6:12 p.m.) Review request for mesos and Adam B. Repository: mesos-git Description --- All strings in JSON need to be escaped, including the keys in dictionaries / objects. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 719aa964a536cf02dbd2de440157de487ec703b1 Diff: https://reviews.apache.org/r/26723/diff/ Testing --- make distcheck Thanks, Cody Maloney