----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36389/#review92221 -----------------------------------------------------------
Apologies for the delay on the review. Let's definitely get a test for this stuff too! src/slave/flags.hpp (lines 118 - 119) <https://reviews.apache.org/r/36389/#comment146243> We name the flags to match how you'd see them on the command line and thus use snake case here (see above too). Thanks! Also, I'd like to suggest s/remote_execution_allowed/allow_remote_execution/ in order to be more of a verb here, i.e, --allow_remote_execution=true versus --remote_execution_allowed=true. src/slave/main.cpp (lines 261 - 265) <https://reviews.apache.org/r/36389/#comment146245> First, this shoudn't be needed, because the slave will get passed the flags and when it initializes can check to see if remote execution has been allowed. Second, we try and avoid at all costs synchronously calling a libprocess process. There is only one place I know of that does this in some tests, but there is a big TODO and warning of why we don't want to do this. Happy to discuss in more detail. In this case since we shouldn't need this block of code at all you can just kill it. src/slave/slave.hpp (lines 372 - 382) <https://reviews.apache.org/r/36389/#comment146246> These shouldn't be needed, the slave will have the flags and can just read the values from there. src/slave/slave.hpp (lines 571 - 575) <https://reviews.apache.org/r/36389/#comment146247> Not needed, can just read values from 'flags', no need to have two copies, then on wonders which is the source of truth? This is one of the main reasons we pass 'flags' throughout everywhere as much as possible. src/slave/slave.cpp (line 517) <https://reviews.apache.org/r/36389/#comment146250> Do you need to capture 'http'? src/slave/slave.cpp (line 519) <https://reviews.apache.org/r/36389/#comment146248> FYI, the 'this->' isn't needed here and we haven't used it consistently. src/slave/slave.cpp (line 4700) <https://reviews.apache.org/r/36389/#comment146263> Why `auto` here? Interestingly enough, you do `subprocess.get()` below and pass it to an `Option<Subprocess>`, which made me jump for a second thinking that we'd changed the return type of `process::subprocess()` to be `Try<Option<process::Subprocess>>` instead of just `Try<process::Subprocess>`. This is why I'm not convinced `auto` actually buys us anything here: the type provides documentation. src/slave/slave.cpp (line 4701) <https://reviews.apache.org/r/36389/#comment146258> What about if 'shell' is true? Seems like we should support both cases, or explicitly return a `BadRequest` or `Unsupported...` if we don't support it for now. src/slave/slave.cpp (lines 4703 - 4705) <https://reviews.apache.org/r/36389/#comment146260> One of these things is not like the others. ;-) Any reason you didn't use the 'process::' prefix on the third `Subprocess::PIPE()`? src/slave/slave.cpp (line 4712) <https://reviews.apache.org/r/36389/#comment146262> Why is this an `Option`? The return type of `process::subprocess(...)` above is `Try<process::Subprocess>`, so no need to put it into an option. In fact, why not just use `subprocess.get()` everywhere below instead of `cmd.get()`? src/slave/slave.cpp (line 4713) <https://reviews.apache.org/r/36389/#comment146261> If we don't use stderr my recommendation would be to just send it to '/dev/null' by using `process::Subprocess::PATH("/dev/null")` above instead of a `process::Subprocess::PIPE()`. src/slave/slave.cpp (line 4722) <https://reviews.apache.org/r/36389/#comment146264> Unused variable. src/slave/slave.cpp (line 4731) <https://reviews.apache.org/r/36389/#comment146265> We don't actually know if this exited unless we do `WIFEXITED` first, it might have terminated due to a signal (which we determine via `WIFSIGNALED`). src/slave/slave.cpp (line 4735) <https://reviews.apache.org/r/36389/#comment146266> Okay nevermind, it looks like you do use stderr? Either way, calling 'get()' here now is blocking, because technically even though the process is completed we might not yet have read all the data from the pipe. I think what you want here is to wait for all of 'status()', 'out', and 'err', e.g., something like: CHECK_SOME(subprocess.get().out()); CHECK_SOME(subprocess.get().err()); Future<string> out = subprocess.get().out().get(); Future<string> err = subprocess.get().err().get(); return await(subprocess.get().status(), out, err) .then(...); Note that 'await' will just mean the futures have completed, but they might have failed. You'd want 'collect' if you want to make sure they're successful. I think we'd need some overloads to make 'collect' work in this case too, but that's a quick fix, just let me know. src/slave/slave.cpp (line 4735) <https://reviews.apache.org/r/36389/#comment146332> Okay nevermind, it looks like you do use stderr? Either way, calling 'get()' here now is blocking, because technically even though the process is completed we might not yet have read all the data from the pipe. I think what you want here is to wait for all of 'status()', 'out', and 'err', e.g., something like: CHECK_SOME(subprocess.get().out()); CHECK_SOME(subprocess.get().err()); Future<string> out = subprocess.get().out().get(); Future<string> err = subprocess.get().err().get(); return await(subprocess.get().status(), out, err) .then(...); Note that 'await' will just mean the futures have completed, but they might have failed. You'd want 'collect' if you want to make sure they're successful. I think we'd need some overloads to make 'collect' work in this case too, but that's a quick fix, just let me know. src/slave/slave.cpp (line 4755) <https://reviews.apache.org/r/36389/#comment146267> Technically this can happen, because 'after' is going to get executed "after" the timeout but that doesn't mean that the lambda in the body of the 'then' is not still executing and couldn't complete before you check the status of 'response'. Note taht this is the caveat with 'after' that you have to look out for, but your code looks okay with respect to this (if, on the other hand, you were making some nonRegardless of the value of response, I'd just treat this as a timeout. It would probably be good, however, to 'os::kill(subprocess.get().pid())' so that we don't have a process that potentially runs forever. - Benjamin Hindman On July 14, 2015, 11:20 p.m., Marco Massenzio wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/36389/ > ----------------------------------------------------------- > > (Updated July 14, 2015, 11:20 p.m.) > > > Review request for mesos, Benjamin Hindman and Cody Maloney. > > > Bugs: MESOS-2830 > https://issues.apache.org/jira/browse/MESOS-2830 > > > Repository: mesos > > > Description > ------- > > Jira: MESOS-2830 > > Under certain (maintenance) circumnstance, it may be necessary > (or desirable) to execute arbitrary operator's commands on the > slave (the entire fleet or a subset thereof) bypassing the Mesos > Task execution mechanism; this may typically be necessary for > maintenance and/or emergency actions. > > This patch adds an HTTP endpoint (/execute) which accepts a > JSON-encoded CommandInfo structure and executes the given > command (with optional arguments). > > The output of the command (along with potentially any stderr > messages) is returned JSON-encoded in the Response. > > For more details, see the design doc at: > https://goo.gl/4npTMU > > > Diffs > ----- > > src/slave/flags.hpp 26c778db2303167369af8675fe0441a00a1e9151 > src/slave/flags.cpp 8632677ebbdbfef8ffa45204b6f63a700baff7f3 > src/slave/main.cpp 8008430e98773d8be9ba5ac6385cffb2e351932a > src/slave/slave.hpp dec4ca8323e151a6d0f9139214ff0ef6e3e3375a > src/slave/slave.cpp 2119b5176aa7cfb7b0b551d4d4f65ee12818b9e4 > > Diff: https://reviews.apache.org/r/36389/diff/ > > > Testing > ------- > > make check > > lots of manual testing (using Postman, REST client) > > > Thanks, > > Marco Massenzio > >