Review Request 27516: Rebased and re-edited patch for MESOS-1316.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27516/ --- Review request for mesos and Benjamin Hindman. Bugs: MESOS-1316 https://issues.apache.org/jira/browse/MESOS-1316 Repository: mesos-git Description --- Manually rebased and re-edited https://reviews.apache.org/r/21233/, which is supposed to be replaced now by this patch. Diffs - src/Makefile.am e6a07150c10b9fa040143e394b2f913a18eeebc1 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 src/slave/containerizer/fetcher.hpp PRE-CREATION src/slave/containerizer/fetcher.cpp PRE-CREATION src/slave/containerizer/mesos/containerizer.cpp d4b08f54d6feb453f3a9d27ca54c867176e62102 src/tests/containerizer_tests.cpp 2c90d2fc18a3268c55b6dfe98699bfb36d093983 src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa Diff: https://reviews.apache.org/r/27516/diff/ Testing --- make check on Mac OS 10.10 and Ubuntu 14.04. In total, 3 tests fail: ExamplesTest.NoExecutorFramework, ExamplesTest.JavaFramework, ExamplesTest.PythonFramework. It is strongly suspected that those are unrelated to this code change and just generally flaky. Thanks, Bernd Mathiske
Re: Review Request 27516: Rebased and re-edited patch for MESOS-1316: Abstracted out invoking 'mesos-fetcher'.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27516/ --- (Updated Nov. 3, 2014, 2:43 a.m.) Review request for mesos and Benjamin Hindman. Changes --- more commentary in the description Summary (updated) - Rebased and re-edited patch for MESOS-1316: Abstracted out invoking 'mesos-fetcher'. Bugs: MESOS-1316 https://issues.apache.org/jira/browse/MESOS-1316 Repository: mesos-git Description (updated) --- Manually rebasing and re-editing https://reviews.apache.org/r/21233/, which is supposed to be replaced now by this patch. Original description: To test the mesos-fetcher (and the setting of the environment) more cleanly I did some refactoring into a 'fetcher' namespace. Also moved fetcher environment tests to fetcher test file. Added two fetcher tests. Diffs - src/Makefile.am e6a07150c10b9fa040143e394b2f913a18eeebc1 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 src/slave/containerizer/fetcher.hpp PRE-CREATION src/slave/containerizer/fetcher.cpp PRE-CREATION src/slave/containerizer/mesos/containerizer.cpp d4b08f54d6feb453f3a9d27ca54c867176e62102 src/tests/containerizer_tests.cpp 2c90d2fc18a3268c55b6dfe98699bfb36d093983 src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa Diff: https://reviews.apache.org/r/27516/diff/ Testing --- make check on Mac OS 10.10 and Ubuntu 14.04. In total, 3 tests fail: ExamplesTest.NoExecutorFramework, ExamplesTest.JavaFramework, ExamplesTest.PythonFramework. It is strongly suspected that those are unrelated to this code change and just generally flaky. Thanks, Bernd Mathiske
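For readers who have not opened the diff: the description above says the environment setup for mesos-fetcher was moved into a 'fetcher' namespace so it can be tested on its own. What follows is only a rough sketch of that idea, not the code from the patch. The `Request` struct, the `environment` function signature, and the `FETCHER_*` variable names are all made up for illustration; only `HADOOP_HOME` is a conventional name, and the real patch works with Mesos types rather than plain strings.

```cpp
#include <map>
#include <string>
#include <vector>

namespace fetcher {

// Hypothetical input type for this sketch; the actual patch is
// expected to use Mesos protobuf messages and flags instead.
struct Request {
  std::vector<std::string> uris;
  std::string workDirectory;
  std::string user;        // Empty means "do not switch user".
  std::string hadoopHome;  // Empty means "not configured".
};

// Builds the environment for the fetcher subprocess as a pure
// function, so a test can inspect the result without launching
// anything. The FETCHER_* names are illustrative assumptions.
inline std::map<std::string, std::string> environment(const Request& request)
{
  // Join all URIs into one space-separated string.
  std::string joined;
  for (const std::string& uri : request.uris) {
    if (!joined.empty()) {
      joined += ' ';
    }
    joined += uri;
  }

  std::map<std::string, std::string> env;
  env["FETCHER_URIS"] = joined;
  env["FETCHER_WORK_DIRECTORY"] = request.workDirectory;

  // Optional settings are only exported when they are set.
  if (!request.user.empty()) {
    env["FETCHER_USER"] = request.user;
  }
  if (!request.hadoopHome.empty()) {
    env["HADOOP_HOME"] = request.hadoopHome;
  }

  return env;
}

} // namespace fetcher
```

The point of the split is that a unit test can call `fetcher::environment` directly and compare the returned map, while the code that actually spawns the subprocess stays small.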
Re: Review Request 27516: Rebased and re-edited patch for MESOS-1316: Abstracted out invoking 'mesos-fetcher'.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27516/ --- (Updated Nov. 3, 2014, 7:45 a.m.) Review request for mesos and Benjamin Hindman. Changes --- Fixed make distcheck issue. Bugs: MESOS-1316 https://issues.apache.org/jira/browse/MESOS-1316 Repository: mesos-git Description --- Manually rebasing and re-editing https://reviews.apache.org/r/21233/, which is supposed to be replaced now by this patch. Original description: To test the mesos-fetcher (and the setting of the environment) more cleanly I did some refactoring into a 'fetcher' namespace. Also moved fetcher environment tests to fetcher test file. Added two fetcher tests. Diffs (updated) - src/Makefile.am e6a07150c10b9fa040143e394b2f913a18eeebc1 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 src/slave/containerizer/fetcher.hpp PRE-CREATION src/slave/containerizer/fetcher.cpp PRE-CREATION src/slave/containerizer/mesos/containerizer.cpp d4b08f54d6feb453f3a9d27ca54c867176e62102 src/tests/containerizer_tests.cpp 2c90d2fc18a3268c55b6dfe98699bfb36d093983 src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa Diff: https://reviews.apache.org/r/27516/diff/ Testing --- make check on Mac OS 10.10 and Ubuntu 14.04. In total, 3 tests fail: ExamplesTest.NoExecutorFramework, ExamplesTest.JavaFramework, ExamplesTest.PythonFramework. It is strongly suspected that those are unrelated to this code change and just generally flaky. Thanks, Bernd Mathiske
Re: Review Request 27516: Rebased and re-edited patch for MESOS-1316: Abstracted out invoking 'mesos-fetcher'.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27516/ --- (Updated Nov. 3, 2014, 8:36 a.m.) Review request for mesos and Benjamin Hindman. Changes --- Another attempt to upload the correct version of Makefile.am :-) Bugs: MESOS-1316 https://issues.apache.org/jira/browse/MESOS-1316 Repository: mesos-git Description --- Manually rebasing and re-editing https://reviews.apache.org/r/21233/, which is supposed to be replaced now by this patch. Original description: To test the mesos-fetcher (and the setting of the environment) more cleanly I did some refactoring into a 'fetcher' namespace. Also moved fetcher environment tests to fetcher test file. Added two fetcher tests. Diffs (updated) - src/Makefile.am e6a07150c10b9fa040143e394b2f913a18eeebc1 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 src/slave/containerizer/fetcher.hpp PRE-CREATION src/slave/containerizer/fetcher.cpp PRE-CREATION src/slave/containerizer/mesos/containerizer.cpp d4b08f54d6feb453f3a9d27ca54c867176e62102 src/tests/containerizer_tests.cpp 2c90d2fc18a3268c55b6dfe98699bfb36d093983 src/tests/fetcher_tests.cpp d7754009a59fedb43e3422c56b3a786ce80164aa Diff: https://reviews.apache.org/r/27516/diff/ Testing --- make check on Mac OS 10.10 and Ubuntu 14.04. In total, 3 tests fail: ExamplesTest.NoExecutorFramework, ExamplesTest.JavaFramework, ExamplesTest.PythonFramework. It is strongly suspected that those are unrelated to this code change and just generally flaky. Thanks, Bernd Mathiske
Re: Why rely on url scheme for fetching?
Hi Ankur, I think this is a great approach. It makes the code much simpler, extensible, and more testable. Anyone that's heard me rant knows I am a big fan of unit tests over integration tests, so this shouldn't surprise anyone :) If you haven't already, please read the documentation on contributing to Mesos and the style guide to ensure all the naming is as expected, then you can push the patch to reviewboard to get it reviewed and committed. On Mon, Nov 3, 2014 at 12:49 AM, Ankur Chauhan an...@malloc64.com wrote: Hi, I did some learning today! This is pretty much a very rough draft of the tests/refactor of mesos-fetcher that I have come up with. Again, if there are some obvious mistakes, please let me know (this is my first pass after all). https://github.com/ankurcha/mesos/compare/prefer_2 My main intention is to break the logic of the fetcher into some very discrete components that I can write tests against. I am still re-learning cpp/mesos code styles etc so I may be a little slow to catch up but I would really appreciate any comments and/or suggestions. -- Ankur @ankurcha On 2 Nov 2014, at 18:17, Ankur Chauhan an...@malloc64.com wrote: Hi, I noticed that the current set of tests in `src/tests/fetcher_tests.cpp` is pretty coarse-grained and is more along the lines of a functional test. I was going to add some tests but it seems like if I am to do that I would need to add a test dependency on hadoop. As an alternative, I propose adding a good set of unit tests around the methods used by `src/launcher/fetcher.cpp` and `src/hdfs/hdfs.cpp`. This should be able to catch a good portion of cases while keeping the dependencies and runtime of tests low. What do you guys think about this? PS: I am pretty green in terms of gtest and the overall C++ testing methodology. Can someone give me pointers to good examples of tests in the codebase? -- Ankur On 1 Nov 2014, at 22:54, Adam Bordelon a...@mesosphere.io wrote: Thank you Ankur. At first glance, it looks great. 
We'll do a more thorough review of it very soon. I know Tim St. Clair had some ideas for fixing MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711; he may want to review too. On Sat, Nov 1, 2014 at 8:49 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I just created a review https://reviews.apache.org/r/27483/ It's my first stab at it and I will try to add more tests as I figure out how to do the hadoop mocking and stuff. Have a look and let me know what you think about it so far. -- Ankur On 1 Nov 2014, at 20:05, Ankur Chauhan an...@malloc64.com wrote: Yeah, I saw that the minute I pressed send. I'll start the review board so that people can have a look at the change. -- Ankur On 1 Nov 2014, at 20:01, Tim Chen t...@mesosphere.io wrote: Hi Ankur, There is a fetcher_tests.cpp in src/tests. Tim On Sat, Nov 1, 2014 at 7:27 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I am trying to find/write some test cases. I couldn't find a fetcher_tests.{cpp|hpp} so once I have something, I'll post on review board. I am new to gmock/gtest so bear with me while I get up to speed. -- Ankur On 1 Nov 2014, at 19:23, Timothy Chen t...@mesosphere.io wrote: Hi Ankur, Can you post on reviewboard? We can discuss more about the code there. Tim Sent from my iPhone On Nov 1, 2014, at 6:29 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I don't think there is an issue which is directly in line with what I wanted but the closest one that I could find in JIRA is https://issues.apache.org/jira/browse/MESOS-1711 I have a branch ( https://github.com/ankurcha/mesos/compare/prefer_hadoop_fetcher ) that has a change that would enable users to specify whatever HDFS-compatible URIs to the mesos-fetcher but maybe you can weigh in on it. Do you think this is the right track? If so, I would like to pick this issue and submit a patch for review. -- Ankur On 1 Nov 2014, at 04:32, Tom Arnfeld t...@duedil.com wrote: Completely +1 to this. 
There are now quite a lot of hadoop compatible filesystem wrappers out in the wild and this would certainly be very useful. I'm happy to contribute a patch. Here are a few related issues that might be of interest: - https://issues.apache.org/jira/browse/MESOS-1887 - https://issues.apache.org/jira/browse/MESOS-1316 - https://issues.apache.org/jira/browse/MESOS-336 - https://issues.apache.org/jira/browse/MESOS-1248 On 31 October 2014 22:39, Tim Chen t...@mesosphere.io wrote: I believe there is already a JIRA ticket for this, if you search for fetcher in Mesos JIRA I think you can find it. Tim On Fri, Oct 31, 2014 at 3:27 PM, Ankur Chauhan an...@malloc64.com wrote: Hi, I have been looking at some of the stuff around the fetcher and saw something interesting. The code for the fetcher::fetch method depends on a hard-coded list of URL schemes. No doubt this works, but it is very restrictive.
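To make the concern in this thread concrete, here is a small sketch of the alternative being discussed: instead of rejecting any scheme that is not on a fixed list, handle the schemes that can be served directly and hand everything else to the Hadoop client. This is not the code from either branch linked above. The `Strategy` enum, the `scheme` and `choose` helpers, and the particular set of directly handled schemes are assumptions made for illustration only.

```cpp
#include <string>

// Returns the scheme of `uri` (the text before "://"), or an empty
// string if there is none, as for a plain local path.
inline std::string scheme(const std::string& uri)
{
  const std::string::size_type pos = uri.find("://");
  return pos == std::string::npos ? std::string() : uri.substr(0, pos);
}

// Hypothetical set of ways a URI could be fetched.
enum class Strategy { LOCAL_COPY, NET_DOWNLOAD, HADOOP_CLIENT };

// Hypothetical selection rule: serve local paths and common network
// schemes directly, and delegate every other scheme to the Hadoop
// client, which can then decide whether a matching filesystem
// implementation is installed.
inline Strategy choose(const std::string& uri)
{
  const std::string s = scheme(uri);

  if (s.empty() || s == "file") {
    return Strategy::LOCAL_COPY;
  }

  if (s == "http" || s == "https" || s == "ftp" || s == "ftps") {
    return Strategy::NET_DOWNLOAD;
  }

  return Strategy::HADOOP_CLIENT;
}
```

Because `scheme` and `choose` are plain functions on strings, they can be covered by unit tests without a Hadoop installation, which is the kind of fine-grained testing proposed earlier in the thread.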
Re: Review Request 27516: Rebased and re-edited patch for MESOS-1316: Abstracted out invoking 'mesos-fetcher'.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27516/#review59587 --- Patch looks great! Reviews applied: [27516] All tests passed. - Mesos ReviewBot
Re: Why rely on url scheme for fetching?
+ Bernd, who has done some fetcher work, including additional testing, for MESOS-1316, MESOS-1945, and MESOS-336 On Mon, Nov 3, 2014 at 9:04 AM, Dominic Hamon dha...@twopensource.com wrote: Hi Ankur I think this is a great approach. It makes the code much simpler, extensible, and more testable. Anyone that's heard me rant knows I am a big fan of unit tests over integration tests, so this shouldn't surprise anyone :) If you haven't already, please read the documentation on contributing to Mesos and the style guide to ensure all the naming is as expected, then you can push the patch to reviewboard to get it reviewed and committed.
Re: Why rely on url scheme for fetching?
Yeah, I saw those this morning. I'll hold off a little; MESOS-336 changes a lot of stuff. Sent from my iPhone On Nov 3, 2014, at 9:18 AM, Adam Bordelon a...@mesosphere.io wrote: + Bernd, who has done some fetcher work, including additional testing, for MESOS-1316, MESOS-1945, and MESOS-336
Re: Review Request 27040: Replace hard-coded reap interval with a constant
On Oct. 22, 2014, 5:26 p.m., Ben Mahler wrote: There is already a constant, MESOS-1935 is to ensure that code that previously hardcoded Seconds(1) now uses MAX_REAP_INTERVAL from reap.hpp. Sorry, but I didn't find the constants in https://github.com/apache/mesos/blob/4693728e4f604a1cff6f01d5f2d8f0ec60eb2f96/3rdparty/libprocess/include/process/reap.hpp . - Aditi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27040/#review57826 --- On Oct. 22, 2014, 2:45 p.m., Aditi Bhatnagar wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27040/ --- (Updated Oct. 22, 2014, 2:45 p.m.) Review request for mesos and Dominic Hamon. Bugs: MESOS-1935 https://issues.apache.org/jira/browse/MESOS-1935 Repository: mesos-git Description --- Replace the hard-coded value for the maximal and minimum reap intervals with the constants from reap.hpp Diffs - 3rdparty/libprocess/include/process/reap.hpp 5e5051a 3rdparty/libprocess/include/process/reap.hpp~ PRE-CREATION 3rdparty/libprocess/src/reap.cpp afd956b 3rdparty/libprocess/src/reap.cpp~ PRE-CREATION Diff: https://reviews.apache.org/r/27040/diff/ Testing --- Thanks, Aditi Bhatnagar
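As a small illustration of what the ticket asks for, the sketch below contrasts a call site that repeats the literal with one that refers to a shared constant. It is not libprocess code. The name `MAX_REAP_INTERVAL` comes from the comment above, but the namespace, the two helper functions, the use of `std::chrono` in place of stout's `Duration`, and the value of one second (taken from the `Seconds(1)` mentioned above) are assumptions.

```cpp
#include <chrono>

namespace reap_sketch {

// Stand-in for the constant discussed in this review. The one second
// value mirrors the hard-coded Seconds(1) mentioned in the thread and
// is an assumption about the real definition.
constexpr std::chrono::seconds MAX_REAP_INTERVAL{1};

// Before: the interval is repeated as a literal at the call site, so
// a change to the reaper's interval would silently miss this code.
inline std::chrono::seconds pollDelayHardCoded()
{
  return std::chrono::seconds{1};
}

// After: the call site refers to the shared constant, so it follows
// any later change to the interval.
inline std::chrono::seconds pollDelay()
{
  return MAX_REAP_INTERVAL;
}

} // namespace reap_sketch
```

Both helpers return the same value today; the difference only shows up when the interval is changed in one place.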
Re: Cutting 0.21.0.
Hello all, We were waiting to get a number of fixes in on Friday; thanks for your patience. They've been committed so I'm tagging the release candidate today. I'll follow up shortly once the release candidate has been built. Thanks, Ian On Sat, Nov 1, 2014 at 9:14 AM, Tom Arnfeld t...@duedil.com wrote: Has there been any further discussion on getting a release candidate for 0.21.0 cut? On 22 October 2014 22:12, Ian Downes idow...@twitter.com.invalid wrote: Please note that we're targeting to cut this release next Wednesday, 29 October. Ian On Wed, Oct 22, 2014 at 1:33 PM, Vinod Kone vinodk...@gmail.com wrote: Can everyone who has ticket(s) that they absolutely want to get in 0.21.0 please mark them with target version as 0.21.0? That will make it easy to track the blockers. On Wed, Oct 22, 2014 at 11:44 AM, Adam Bordelon a...@mesosphere.io wrote: I'd also like to see more of the modules work land in 0.21, especially the Authenticator module (MESOS-1889). I expect it to land in less than a week, but I don't know what your timeframe is for 0.21. On Wed, Oct 22, 2014 at 11:22 AM, Ian Downes idow...@twitter.com.invalid wrote: Can someone please volunteer to shepherd this work and comment on the state of the review? On Tue, Oct 21, 2014 at 9:01 PM, R.B. Boyer are...@nexusvector.net wrote: Can someone see if MESOS-1873 https://issues.apache.org/jira/browse/MESOS-1873 is suitable for 0.21.0? The patch is super simple https://reviews.apache.org/r/26622/ and fixes a showstopper bug in the command executor. On Tue, Oct 21, 2014 at 10:52 PM, Benjamin Hindman b...@eecs.berkeley.edu wrote: Awesome, thanks Ben/Ian! We've got some Docker updates that we want to land in 0.21.0. My estimate is it will land sometime this week, or early next week. On Tue, Oct 21, 2014 at 6:51 PM, Benjamin Mahler benjamin.mah...@gmail.com wrote: Hi all, We would like to cut 0.21.0 very soon to release the task reconciliation work that has been completed recently. 
I spoke with Ian Downes who was willing to be the release manager. I will let him reply here with a ticket and with other features that have made it in the 0.21.0 release. Please reply to this thread if you have anything that you think needs to land in 0.21.0! Ben
Re: Why rely on url scheme for fetching?
Hi Tim/others, Is this to be included in the 0.21.0 release? If so, I don't know how to tag it etc. I would really (shamelessly) love it to be included as it would really simplify my intended use case of using snackfs (a Cassandra-backed filesystem). -- Ankur On 3 Nov 2014, at 09:28, Ankur Chauhan an...@malloc64.com wrote: Yeah, I saw those this morning. I'll hold off a little; MESOS-336 changes a lot of stuff. Sent from my iPhone
Review Request 27531: Update Master metrics to match task source and reason scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- Review request for mesos and Vinod Kone. Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. Diffs - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 Diff: https://reviews.apache.org/r/27531/diff/ Testing --- make check run master and check endpoint: { ... master/task_error/source_executor/reason_executor_terminated: 0, master/task_error/source_executor/reason_executor_unregistered: 0, master/task_error/source_executor/reason_framework_removed: 0, master/task_error/source_executor/reason_gc_error: 0, master/task_error/source_executor/reason_invalid_frameworkid: 0, master/task_error/source_executor/reason_invalid_offers: 0, master/task_error/source_executor/reason_master_disconnected: 0, master/task_error/source_executor/reason_reconciliation: 0, master/task_error/source_executor/reason_slave_disconnected: 0, master/task_error/source_executor/reason_slave_removed: 0, master/task_error/source_executor/reason_slave_restarted: 0, master/task_error/source_executor/reason_slave_unknown: 0, master/task_error/source_executor/reason_task_invalid: 0, master/task_error/source_executor/reason_task_unauthorized: 0, master/task_error/source_executor/reason_task_unknown: 0, master/task_error/source_master/reason_executor_terminated: 0, master/task_error/source_master/reason_executor_unregistered: 0, master/task_error/source_master/reason_framework_removed: 0, master/task_error/source_master/reason_gc_error: 0, master/task_error/source_master/reason_invalid_frameworkid: 0, master/task_error/source_master/reason_invalid_offers: 0, master/task_error/source_master/reason_master_disconnected: 0, 
master/task_error/source_master/reason_reconciliation: 0, master/task_error/source_master/reason_slave_disconnected: 0, master/task_error/source_master/reason_slave_removed: 0, master/task_error/source_master/reason_slave_restarted: 0, master/task_error/source_master/reason_slave_unknown: 0, master/task_error/source_master/reason_task_invalid: 0, master/task_error/source_master/reason_task_unauthorized: 0, master/task_error/source_master/reason_task_unknown: 0, master/task_error/source_slave/reason_executor_terminated: 0, master/task_error/source_slave/reason_executor_unregistered: 0, master/task_error/source_slave/reason_framework_removed: 0, master/task_error/source_slave/reason_gc_error: 0, master/task_error/source_slave/reason_invalid_frameworkid: 0, master/task_error/source_slave/reason_invalid_offers: 0, master/task_error/source_slave/reason_master_disconnected: 0, master/task_error/source_slave/reason_reconciliation: 0, master/task_error/source_slave/reason_slave_disconnected: 0, master/task_error/source_slave/reason_slave_removed: 0, master/task_error/source_slave/reason_slave_restarted: 0, master/task_error/source_slave/reason_slave_unknown: 0, master/task_error/source_slave/reason_task_invalid: 0, master/task_error/source_slave/reason_task_unauthorized: 0, master/task_error/source_slave/reason_task_unknown: 0, master/task_failed/source_executor/reason_executor_terminated: 0, master/task_failed/source_executor/reason_executor_unregistered: 0, master/task_failed/source_executor/reason_framework_removed: 0, master/task_failed/source_executor/reason_gc_error: 0, master/task_failed/source_executor/reason_invalid_frameworkid: 0, master/task_failed/source_executor/reason_invalid_offers: 0, master/task_failed/source_executor/reason_master_disconnected: 0, master/task_failed/source_executor/reason_reconciliation: 0, master/task_failed/source_executor/reason_slave_disconnected: 0, master/task_failed/source_executor/reason_slave_removed: 0, 
master/task_failed/source_executor/reason_slave_restarted: 0, master/task_failed/source_executor/reason_slave_unknown: 0, master/task_failed/source_executor/reason_task_invalid: 0, master/task_failed/source_executor/reason_task_unauthorized: 0, master/task_failed/source_executor/reason_task_unknown: 0, master/task_failed/source_master/reason_executor_terminated: 0, master/task_failed/source_master/reason_executor_unregistered: 0, master/task_failed/source_master/reason_framework_removed: 0, master/task_failed/source_master/reason_gc_error: 0, master/task_failed/source_master/reason_invalid_frameworkid: 0, master/task_failed/source_master/reason_invalid_offers: 0, master/task_failed/source_master/reason_master_disconnected: 0, master/task_failed/source_master/reason_reconciliation: 0, master/task_failed/source_master/reason_slave_disconnected: 0,
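The keys in the endpoint dump above appear to follow a single pattern: master/<task state>/source_<source>/reason_<reason>. Below is a minimal C++ sketch of how such keys could be composed. The function name `taskMetricNames` and the idea of enumerating every state/source/reason combination are assumptions made for illustration; this is not the code from src/master/master.cpp.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds one metric key per (state, source, reason)
// combination, using the naming pattern visible in the endpoint output.
std::vector<std::string> taskMetricNames(
    const std::vector<std::string>& states,
    const std::vector<std::string>& sources,
    const std::vector<std::string>& reasons)
{
  std::vector<std::string> names;
  for (const std::string& state : states) {
    for (const std::string& source : sources) {
      for (const std::string& reason : reasons) {
        names.push_back(
            "master/" + state + "/source_" + source + "/reason_" + reason);
      }
    }
  }
  return names;
}
```

Calling it with the states `task_error` and `task_failed`, the three sources seen above (`executor`, `master`, `slave`) and the fifteen reasons would give 90 keys, which is the shape of the listing in the Testing section.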
Re: Review Request 27531: Update Master metrics to match task source and reason scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- (Updated Nov. 3, 2014, 10:24 a.m.) Review request for mesos, Tobias Weingartner and Vinod Kone. Changes --- +toby for operational input Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. Diffs - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 Diff: https://reviews.apache.org/r/27531/diff/ Testing --- make check run master and check endpoint: { ... master/task_error/source_executor/reason_executor_terminated: 0, master/task_error/source_executor/reason_executor_unregistered: 0, master/task_error/source_executor/reason_framework_removed: 0, master/task_error/source_executor/reason_gc_error: 0, master/task_error/source_executor/reason_invalid_frameworkid: 0, master/task_error/source_executor/reason_invalid_offers: 0, master/task_error/source_executor/reason_master_disconnected: 0, master/task_error/source_executor/reason_reconciliation: 0, master/task_error/source_executor/reason_slave_disconnected: 0, master/task_error/source_executor/reason_slave_removed: 0, master/task_error/source_executor/reason_slave_restarted: 0, master/task_error/source_executor/reason_slave_unknown: 0, master/task_error/source_executor/reason_task_invalid: 0, master/task_error/source_executor/reason_task_unauthorized: 0, master/task_error/source_executor/reason_task_unknown: 0, master/task_error/source_master/reason_executor_terminated: 0, master/task_error/source_master/reason_executor_unregistered: 0, master/task_error/source_master/reason_framework_removed: 0, master/task_error/source_master/reason_gc_error: 0, master/task_error/source_master/reason_invalid_frameworkid: 0, 
master/task_error/source_master/reason_invalid_offers: 0, master/task_error/source_master/reason_master_disconnected: 0, master/task_error/source_master/reason_reconciliation: 0, master/task_error/source_master/reason_slave_disconnected: 0, master/task_error/source_master/reason_slave_removed: 0, master/task_error/source_master/reason_slave_restarted: 0, master/task_error/source_master/reason_slave_unknown: 0, master/task_error/source_master/reason_task_invalid: 0, master/task_error/source_master/reason_task_unauthorized: 0, master/task_error/source_master/reason_task_unknown: 0, master/task_error/source_slave/reason_executor_terminated: 0, master/task_error/source_slave/reason_executor_unregistered: 0, master/task_error/source_slave/reason_framework_removed: 0, master/task_error/source_slave/reason_gc_error: 0, master/task_error/source_slave/reason_invalid_frameworkid: 0, master/task_error/source_slave/reason_invalid_offers: 0, master/task_error/source_slave/reason_master_disconnected: 0, master/task_error/source_slave/reason_reconciliation: 0, master/task_error/source_slave/reason_slave_disconnected: 0, master/task_error/source_slave/reason_slave_removed: 0, master/task_error/source_slave/reason_slave_restarted: 0, master/task_error/source_slave/reason_slave_unknown: 0, master/task_error/source_slave/reason_task_invalid: 0, master/task_error/source_slave/reason_task_unauthorized: 0, master/task_error/source_slave/reason_task_unknown: 0, master/task_failed/source_executor/reason_executor_terminated: 0, master/task_failed/source_executor/reason_executor_unregistered: 0, master/task_failed/source_executor/reason_framework_removed: 0, master/task_failed/source_executor/reason_gc_error: 0, master/task_failed/source_executor/reason_invalid_frameworkid: 0, master/task_failed/source_executor/reason_invalid_offers: 0, master/task_failed/source_executor/reason_master_disconnected: 0, master/task_failed/source_executor/reason_reconciliation: 0, 
master/task_failed/source_executor/reason_slave_disconnected: 0, master/task_failed/source_executor/reason_slave_removed: 0, master/task_failed/source_executor/reason_slave_restarted: 0, master/task_failed/source_executor/reason_slave_unknown: 0, master/task_failed/source_executor/reason_task_invalid: 0, master/task_failed/source_executor/reason_task_unauthorized: 0, master/task_failed/source_executor/reason_task_unknown: 0, master/task_failed/source_master/reason_executor_terminated: 0, master/task_failed/source_master/reason_executor_unregistered: 0, master/task_failed/source_master/reason_framework_removed: 0, master/task_failed/source_master/reason_gc_error: 0, master/task_failed/source_master/reason_invalid_frameworkid: 0, master/task_failed/source_master/reason_invalid_offers: 0, master/task_failed/source_master/reason_master_disconnected: 0,
Re: Review Request 26382: Add source and reason to TaskStatus.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59607 --- Ship it! Ship It! - Vinod Kone On Oct. 31, 2014, 10:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Oct. 31, 2014, 10:09 p.m.) Review request for mesos, Vinod Kone and Bill Farner. Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. Diffs - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing --- make check Thanks, Dominic Hamon
Re: Review Request 27461: Change process reparent test to be recursive search for init
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/#review59608 --- Thinking a little, I believe walking up to pid 1 like this is unnecessary. The test is checking that the process has been reparented. I had incorrectly assumed that all systems would reparent to pid 1. I think we can test reparenting by simply checking that the parent is *no longer* the child that forked it, and that it is still running, i.e., not zombied. I don't actually care who the new parent is, just that it has been reparented. - Ian Downes On Oct. 31, 2014, 8:03 p.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Oct. 31, 2014, 8:03 p.m.) Review request for mesos and Ian Downes. Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check up the parent tree, and succeed if there is a path to pid 1 without zombies along the way. This is not the cleanest fix, but I'm having trouble finding a way to find the appropriate init to check for. Diffs - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
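The check Ian Downes suggests in the review above (the parent is no longer the process that forked it, and the process is still running) can be sketched with plain POSIX calls. This is only an illustration under stated assumptions: `grandchildIsReparented` is a hypothetical helper, it observes the parent pid from inside the grandchild rather than through the stout `os` helpers that os_tests.cpp uses, and the 5 second polling limit is arbitrary.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper, not taken from os_tests.cpp.
// Forks a child that forks a grandchild and exits immediately. The
// grandchild reports over a pipe whether its parent pid changed. It does
// not care who the new parent is (pid 1, a user init, a subreaper, ...).
bool grandchildIsReparented()
{
  int fds[2];
  if (pipe(fds) != 0) {
    return false;
  }

  pid_t child = fork();
  if (child < 0) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }

  if (child == 0) {
    // Intermediate child: remember our pid, fork, then exit right away.
    pid_t intermediate = getpid();
    pid_t grandchild = fork();
    if (grandchild == 0) {
      close(fds[0]);
      char result = 0;
      // Poll for up to roughly 5 seconds until the parent pid changes.
      for (int i = 0; i < 500; i++) {
        if (getppid() != intermediate) {
          result = 1;
          break;
        }
        usleep(10000);
      }
      ssize_t ignored = write(fds[1], &result, 1);
      (void) ignored;
      _exit(0);
    }
    _exit(0);
  }

  // Test process: reap the intermediate child, then read the report.
  // Receiving a byte at all shows the grandchild was still running.
  close(fds[1]);
  int status = 0;
  waitpid(child, &status, 0);
  char result = 0;
  ssize_t n = read(fds[0], &result, 1);
  close(fds[0]);
  return n == 1 && result == 1;
}
```

The point of the design is that the assertion never names the new parent, so it holds whether the system reparents to /sbin/init or to a user init such as `init --user`.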
Re: Why rely on url scheme for fetching?
Unfortunately, this will not get into 0.21.0 as we're tagging that today. Please tag the ticket(s) as Target Version = 0.22.0. Ian On Mon, Nov 3, 2014 at 10:22 AM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim/others, Is this to be included in the 0.21.0 release? If so, I don't know how to tag it etc. I would really (shamelessly) love it to be included as it would really simplify my intended use case of using snackfs (Cassandra-backed filesystem). -- Ankur On 3 Nov 2014, at 09:28, Ankur Chauhan an...@malloc64.com wrote: Yeah, I saw those this morning. I'll hold off a little; MESOS-336 changes a lot of stuff. Sent from my iPhone On Nov 3, 2014, at 9:18 AM, Adam Bordelon a...@mesosphere.io wrote: + Bernd, who has done some fetcher work, including additional testing, for MESOS-1316, MESOS-1945, and MESOS-336 On Mon, Nov 3, 2014 at 9:04 AM, Dominic Hamon dha...@twopensource.com wrote: Hi Ankur, I think this is a great approach. It makes the code much simpler, more extensible, and more testable. Anyone that's heard me rant knows I am a big fan of unit tests over integration tests, so this shouldn't surprise anyone :) If you haven't already, please read the documentation on contributing to Mesos and the style guide to ensure all the naming is as expected, then you can push the patch to reviewboard to get it reviewed and committed. On Mon, Nov 3, 2014 at 12:49 AM, Ankur Chauhan an...@malloc64.com wrote: Hi, I did some learning today! This is pretty much a very rough draft of the tests/refactor of mesos-fetcher that I have come up with. Again, if there are some obvious mistakes, please let me know (this is my first pass, after all). https://github.com/ankurcha/mesos/compare/prefer_2 My main intention is to break the logic of the fetcher into some very discrete components that I can write tests against. 
I am still re-learning cpp/mesos code styles etc., so I may be a little slow to catch up, but I would really appreciate any comments and/or suggestions. -- Ankur @ankurcha On 2 Nov 2014, at 18:17, Ankur Chauhan an...@malloc64.com wrote: Hi, I noticed that the current set of tests in `src/tests/fetcher_tests.cpp` is pretty coarse-grained and more along the lines of a functional test. I was going to add some tests, but it seems that to do so I would need to add a test dependency on hadoop. As an alternative, I propose adding a good set of unit tests around the methods used by `src/launcher/fetcher.cpp` and `src/hdfs/hdfs.cpp`. This should be able to catch a good portion of cases while keeping the dependencies and runtime of the tests low. What do you guys think about this? PS: I am pretty green in terms of gtest and the overall C++ testing methodology. Can someone give me pointers to good examples of tests in the codebase? -- Ankur On 1 Nov 2014, at 22:54, Adam Bordelon a...@mesosphere.io wrote: Thank you Ankur. At first glance, it looks great. We'll do a more thorough review of it very soon. I know Tim St. Clair had some ideas for fixing MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711; he may want to review too. On Sat, Nov 1, 2014 at 8:49 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I just created a review: https://reviews.apache.org/r/27483/ It's my first stab at it and I will try to add more tests as I figure out how to do the hadoop mocking and stuff. Have a look and let me know what you think about it so far. -- Ankur On 1 Nov 2014, at 20:05, Ankur Chauhan an...@malloc64.com wrote: Yeah, I saw that the minute I pressed send. I'll start the review board so that people can have a look at the change. 
-- Ankur On 1 Nov 2014, at 20:01, Tim Chen t...@mesosphere.io wrote: Hi Ankur, There is a fetcher_tests.cpp in src/tests. Tim On Sat, Nov 1, 2014 at 7:27 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I am trying to find/write some test cases. I couldn't find a fetcher_tests.{cpp|hpp} so once I have something, I'll post on review board. I am new to gmock/gtest so bear with me while I get up to speed. -- Ankur On 1 Nov 2014, at 19:23, Timothy Chen t...@mesosphere.io wrote: Hi Ankur, Can you post on reviewboard? We can discuss more about the code there. Tim Sent from my iPhone On Nov 1, 2014, at 6:29 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I don't think there is an issue which is directly in line with what I wanted but the closest one that I could find in JIRA is
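The proposal quoted in this thread is to unit test the pieces of the fetcher without requiring hadoop on the test machine. One way such a split could look is to keep the construction of the command line in a pure function, so a test can inspect it without running anything. The sketch below is an assumption made for illustration: the function name `copyToLocalCommand` and its parameters are hypothetical, and the thread does not show how the actual refactor is structured.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the argument vector for copying `from`
// (for example an hdfs:// URI) into the local path `to` with the
// `hadoop fs -copyToLocal` command. It only builds the arguments;
// running them is left to the caller, which keeps this part testable
// on a machine that has no hadoop client installed.
std::vector<std::string> copyToLocalCommand(
    const std::string& hadoop,
    const std::string& from,
    const std::string& to)
{
  std::vector<std::string> argv;
  argv.push_back(hadoop);
  argv.push_back("fs");
  argv.push_back("-copyToLocal");
  argv.push_back(from);
  argv.push_back(to);
  return argv;
}
```

A gtest case could then compare the returned vector against the expected arguments, leaving only the thin "execute this argv" step for a functional test.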
Re: Why rely on url scheme for fetching?
Hi, Okay, thanks Ian. What's the expected ETA on getting 0.21.0 out? JIRA didn't have a release date set. -- Ankur On 3 Nov 2014, at 10:37, Ian Downes idow...@twitter.com.INVALID wrote: Unfortunately, this will not get into 0.21.0 as we're tagging that today. Please tag the ticket(s) as Target Version = 0.22.0. Ian On Mon, Nov 3, 2014 at 10:22 AM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim/others, Is this to be included in the 0.21.0 release? If so, I don't know how to tag it etc. I would really (shamelessly) love it to be included as it would really simplify my intended use case of using snackfs (Cassandra-backed filesystem). -- Ankur On 3 Nov 2014, at 09:28, Ankur Chauhan an...@malloc64.com wrote: Yeah, I saw those this morning. I'll hold off a little; MESOS-336 changes a lot of stuff. Sent from my iPhone On Nov 3, 2014, at 9:18 AM, Adam Bordelon a...@mesosphere.io wrote: + Bernd, who has done some fetcher work, including additional testing, for MESOS-1316, MESOS-1945, and MESOS-336 On Mon, Nov 3, 2014 at 9:04 AM, Dominic Hamon dha...@twopensource.com wrote: Hi Ankur, I think this is a great approach. It makes the code much simpler, more extensible, and more testable. Anyone that's heard me rant knows I am a big fan of unit tests over integration tests, so this shouldn't surprise anyone :) If you haven't already, please read the documentation on contributing to Mesos and the style guide to ensure all the naming is as expected, then you can push the patch to reviewboard to get it reviewed and committed. On Mon, Nov 3, 2014 at 12:49 AM, Ankur Chauhan an...@malloc64.com wrote: Hi, I did some learning today! 
This is pretty much a very rough draft of the tests/refactor of mesos-fetcher that I have come up with. Again, if there are some obvious mistakes, please let me know (this is my first pass, after all). https://github.com/ankurcha/mesos/compare/prefer_2 My main intention is to break the logic of the fetcher into some very discrete components that I can write tests against. I am still re-learning cpp/mesos code styles etc., so I may be a little slow to catch up, but I would really appreciate any comments and/or suggestions. -- Ankur @ankurcha On 2 Nov 2014, at 18:17, Ankur Chauhan an...@malloc64.com wrote: Hi, I noticed that the current set of tests in `src/tests/fetcher_tests.cpp` is pretty coarse-grained and more along the lines of a functional test. I was going to add some tests, but it seems that to do so I would need to add a test dependency on hadoop. As an alternative, I propose adding a good set of unit tests around the methods used by `src/launcher/fetcher.cpp` and `src/hdfs/hdfs.cpp`. This should be able to catch a good portion of cases while keeping the dependencies and runtime of the tests low. What do you guys think about this? PS: I am pretty green in terms of gtest and the overall C++ testing methodology. Can someone give me pointers to good examples of tests in the codebase? -- Ankur On 1 Nov 2014, at 22:54, Adam Bordelon a...@mesosphere.io wrote: Thank you Ankur. At first glance, it looks great. We'll do a more thorough review of it very soon. I know Tim St. Clair had some ideas for fixing MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711; he may want to review too. 
On Sat, Nov 1, 2014 at 8:49 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I just created a review: https://reviews.apache.org/r/27483/ It's my first stab at it and I will try to add more tests as I figure out how to do the hadoop mocking and stuff. Have a look and let me know what you think about it so far. -- Ankur On 1 Nov 2014, at 20:05, Ankur Chauhan an...@malloc64.com wrote: Yeah, I saw that the minute I pressed send. I'll start the review board so that people can have a look at the change. -- Ankur On 1 Nov 2014, at 20:01, Tim Chen t...@mesosphere.io wrote:
Re: Review Request 27531: Update Master metrics to match task source and reason scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/#review59615 --- Patch looks great! Reviews applied: [26817, 26382, 27531] All tests passed. - Mesos ReviewBot On Nov. 3, 2014, 6:24 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- (Updated Nov. 3, 2014, 6:24 p.m.) Review request for mesos, Tobias Weingartner and Vinod Kone. Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. Diffs - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 Diff: https://reviews.apache.org/r/27531/diff/ Testing --- make check run master and check endpoint: { ... master/task_error/source_executor/reason_executor_terminated: 0, master/task_error/source_executor/reason_executor_unregistered: 0, master/task_error/source_executor/reason_framework_removed: 0, master/task_error/source_executor/reason_gc_error: 0, master/task_error/source_executor/reason_invalid_frameworkid: 0, master/task_error/source_executor/reason_invalid_offers: 0, master/task_error/source_executor/reason_master_disconnected: 0, master/task_error/source_executor/reason_reconciliation: 0, master/task_error/source_executor/reason_slave_disconnected: 0, master/task_error/source_executor/reason_slave_removed: 0, master/task_error/source_executor/reason_slave_restarted: 0, master/task_error/source_executor/reason_slave_unknown: 0, master/task_error/source_executor/reason_task_invalid: 0, master/task_error/source_executor/reason_task_unauthorized: 0, master/task_error/source_executor/reason_task_unknown: 0, master/task_error/source_master/reason_executor_terminated: 0, master/task_error/source_master/reason_executor_unregistered: 
0, master/task_error/source_master/reason_framework_removed: 0, master/task_error/source_master/reason_gc_error: 0, master/task_error/source_master/reason_invalid_frameworkid: 0, master/task_error/source_master/reason_invalid_offers: 0, master/task_error/source_master/reason_master_disconnected: 0, master/task_error/source_master/reason_reconciliation: 0, master/task_error/source_master/reason_slave_disconnected: 0, master/task_error/source_master/reason_slave_removed: 0, master/task_error/source_master/reason_slave_restarted: 0, master/task_error/source_master/reason_slave_unknown: 0, master/task_error/source_master/reason_task_invalid: 0, master/task_error/source_master/reason_task_unauthorized: 0, master/task_error/source_master/reason_task_unknown: 0, master/task_error/source_slave/reason_executor_terminated: 0, master/task_error/source_slave/reason_executor_unregistered: 0, master/task_error/source_slave/reason_framework_removed: 0, master/task_error/source_slave/reason_gc_error: 0, master/task_error/source_slave/reason_invalid_frameworkid: 0, master/task_error/source_slave/reason_invalid_offers: 0, master/task_error/source_slave/reason_master_disconnected: 0, master/task_error/source_slave/reason_reconciliation: 0, master/task_error/source_slave/reason_slave_disconnected: 0, master/task_error/source_slave/reason_slave_removed: 0, master/task_error/source_slave/reason_slave_restarted: 0, master/task_error/source_slave/reason_slave_unknown: 0, master/task_error/source_slave/reason_task_invalid: 0, master/task_error/source_slave/reason_task_unauthorized: 0, master/task_error/source_slave/reason_task_unknown: 0, master/task_failed/source_executor/reason_executor_terminated: 0, master/task_failed/source_executor/reason_executor_unregistered: 0, master/task_failed/source_executor/reason_framework_removed: 0, master/task_failed/source_executor/reason_gc_error: 0, master/task_failed/source_executor/reason_invalid_frameworkid: 0, 
master/task_failed/source_executor/reason_invalid_offers: 0, master/task_failed/source_executor/reason_master_disconnected: 0, master/task_failed/source_executor/reason_reconciliation: 0, master/task_failed/source_executor/reason_slave_disconnected: 0, master/task_failed/source_executor/reason_slave_removed: 0, master/task_failed/source_executor/reason_slave_restarted: 0, master/task_failed/source_executor/reason_slave_unknown: 0, master/task_failed/source_executor/reason_task_invalid: 0, master/task_failed/source_executor/reason_task_unauthorized: 0, master/task_failed/source_executor/reason_task_unknown: 0,
Re: Unable to install Mesos on Ubuntu 14.04. Error during 'make'
I use a campus proxy for internet access, but I am unable to run make when installing Mesos on my Ubuntu 14.04 machine; it ends with Error 1. After searching, I created a settings.xml file inside .m2 with the required proxy configuration, but the build failed again. Please help me out, as this is greatly hindering my work. On Sat Nov 01 2014 at 3:38:57 AM Joris Van Remoortere jo...@mesosphere.io wrote: I think this suggests you're fetching through a proxy: *Proxy request sent, awaiting response... 200 OK* When I wget this is my output: wget http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom --2014-10-31 15:05:32-- http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom Resolving repo.maven.apache.org (repo.maven.apache.org)... 23.235.47.215 Connecting to repo.maven.apache.org (repo.maven.apache.org)|23.235.47.215|:80... connected. *HTTP request sent, awaiting response... 200 OK* I think you might need to set up your proxy along these lines: http://maven.apache.org/guides/mini/guide-proxies.html ... but I am not familiar with Maven. Joris On Fri, Oct 31, 2014 at 2:36 PM, Sweta Rani swetarani3...@gmail.com wrote: Tried wget for all the given URLs; it was successful, but make still fails after that, and make check says YOU HAVE 3 DISABLED TESTS proxima@Centauri:~/mesos/build$ wget http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom --2014-11-01 02:55:51-- http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom Connecting to 10.3.100.207:8080... connected. Proxy request sent, awaiting response... 200 OK Length: 14811 (14K) [text/xml] Saving to: ‘apache-11.pom’ *Joris Van Remoortere* On Sat Nov 01 2014 at 2:36:27 AM Joris Van Remoortere jo...@mesosphere.io wrote: That seems like you might have a connectivity / firewall issue. Can you try just doing a wget on that artifact? 
On Fri, Oct 31, 2014 at 1:56 PM, Sweta Rani swetarani3...@gmail.com wrote: I tried it again, but it failed with the following errors: make[1]: Entering directory `/home/proxima/mesos/build/src' Building mesos-0.21.0.jar ... [INFO] Scanning for projects... Downloading: http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom [ERROR] The build could not read 1 project - [Help 1] [ERROR] [ERROR] The project org.apache.mesos:mesos:0.21.0 (/home/proxima/mesos/build/src/java/mesos.pom) has 1 error [ERROR] Non-resolvable parent POM: Could not transfer artifact org.apache:apache:pom:11 from/to central (http://repo.maven.apache.org/maven2): Connection to http://repo.maven.apache.org refused and 'parent.relativePath' points at wrong local POM @ line 18, column 11: Connection refused - [Help 2] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException make[1]: *** [java/target/mesos-0.21.0.jar] Error 1 make[1]: Leaving directory `/home/proxima/mesos/build/src' make: *** [all-recursive] Error 1 On Fri Oct 31 2014 at 1:09:15 PM Sweta Rani swetarani3...@gmail.com wrote: Yes, I followed the steps as directed in the documentation. *Sweta Rani* *3rd Year Undergraduate* *Department of Electrical Engineering* *IIT Kharagpur http://www.iitkgp.ac.in/* On Fri, Oct 31, 2014 at 3:14 AM, Dominic Hamon dha...@twopensource.com wrote: Let's tackle the easy stuff: did you run bootstrap and configure as per the getting started doc? 
On Oct 30, 2014 2:04 PM, Sweta Rani swetarani3...@gmail.com wrote: I tried all the suggestions and tips given in JIRA and various Google forums, but the error remains the same. Making all in . make[1]: Entering directory `/home/proxima/mesos/build' make[1]: Nothing to be done for `all-am'. make[1]: Leaving directory `/home/proxima/mesos/build' Making all in 3rdparty make[1]: Entering directory `/home/proxima/mesos/build/3rdparty' make all-recursive make[2]: Entering directory `/home/proxima/mesos/build/3rdparty' Making all in libprocess make[3]: Entering directory `/home/proxima/mesos/build/3rdparty/libprocess' Making all in 3rdparty make[4]: Entering directory `/home/proxima/mesos/build/3rdparty/libprocess/3rdparty' make all-recursive
Re: Why rely on url scheme for fetching?
I think it's too late to be included, since it's going to take some rounds of review, and Ian is cutting the release today. We'll have to tag this for the next release. Tim On Mon, Nov 3, 2014 at 10:22 AM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim/others, Is this to be included in the 0.21.0 release? If so, I don't know how to tag it etc. I would really (shamelessly) love it to be included as it would really simplify my intended use case of using snackfs (Cassandra-backed filesystem). -- Ankur On 3 Nov 2014, at 09:28, Ankur Chauhan an...@malloc64.com wrote: Yea, I saw those this morning. I'll hold off a little; MESOS-336 changes a lot of stuff. Sent from my iPhone On Nov 3, 2014, at 9:18 AM, Adam Bordelon a...@mesosphere.io wrote: + Bernd, who has done some fetcher work, including additional testing, for MESOS-1316, MESOS-1945, and MESOS-336 On Mon, Nov 3, 2014 at 9:04 AM, Dominic Hamon dha...@twopensource.com wrote: Hi Ankur I think this is a great approach. It makes the code much simpler, extensible, and more testable. Anyone that's heard me rant knows I am a big fan of unit tests over integration tests, so this shouldn't surprise anyone :) If you haven't already, please read the documentation on contributing to Mesos and the style guide to ensure all the naming is as expected, then you can push the patch to reviewboard to get it reviewed and committed. On Mon, Nov 3, 2014 at 12:49 AM, Ankur Chauhan an...@malloc64.com wrote: Hi, I did some learning today! This is pretty much a very rough draft of the tests/refactor of mesos-fetcher that I have come up with. Again, if there are some obvious mistakes, please let me know (this is my first pass after all). https://github.com/ankurcha/mesos/compare/prefer_2 My main intention is to break the logic of the fetcher into some very discrete components that I can write tests against.
I am still re-learning cpp/mesos code styles etc so I may be a little slow to catch up but I would really appreciate any comments and/or suggestions. -- Ankur @ankurcha On 2 Nov 2014, at 18:17, Ankur Chauhan an...@malloc64.com wrote: Hi, I noticed that the current set of tests in `src/tests/fetcher_tests.cpp` is pretty coarse grained and is more along the lines of a functional test. I was going to add some tests but it seems that to do so I would need to add a test dependency on hadoop. As an alternative, I propose adding a good set of unit tests around the methods used by `src/launcher/fetcher.cpp` and `src/hdfs/hdfs.cpp`. This should be able to catch a good portion of cases while keeping the dependencies and runtime of tests low. What do you guys think about this? PS: I am pretty green in terms of gtest and the overall C++ testing methodology. Can someone give me pointers to good examples of tests in the codebase? -- Ankur On 1 Nov 2014, at 22:54, Adam Bordelon a...@mesosphere.io wrote: Thank you Ankur. At first glance, it looks great. We'll do a more thorough review of it very soon. I know Tim St. Clair had some ideas for fixing MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711; he may want to review too. On Sat, Nov 1, 2014 at 8:49 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I just created a review https://reviews.apache.org/r/27483/ It's my first stab at it and I will try to add more tests as I figure out how to do the hadoop mocking and stuff. Have a look and let me know what you think about it so far. -- Ankur On 1 Nov 2014, at 20:05, Ankur Chauhan an...@malloc64.com wrote: Yea, I saw that the minute I pressed send. I'll start the review board so that people can have a look at the change.
-- Ankur On 1 Nov 2014, at 20:01, Tim Chen t...@mesosphere.io wrote: Hi Ankur, There is a fetcher_tests.cpp in src/tests. Tim On Sat, Nov 1, 2014 at 7:27 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I am trying to find/write some test cases. I couldn't find a fetcher_tests.{cpp|hpp} so once I have something, I'll post on review board. I am new to gmock/gtest so bear with me while I get up to speed. -- Ankur On 1 Nov 2014, at 19:23, Timothy Chen t...@mesosphere.io wrote: Hi Ankur, Can you post on reviewboard? We can discuss more about the code there. Tim Sent from my iPhone On Nov 1, 2014, at 6:29 PM, Ankur Chauhan an...@malloc64.com wrote: Hi Tim, I don't think there is an issue which is directly in line with what I wanted but the closest one that I could find in JIRA is
Re: Review Request 24776: Add docker containerizer destroy tests
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24776/ --- (Updated Nov. 3, 2014, 8:53 p.m.) Review request for mesos, Benjamin Hindman and Jie Yu. Summary (updated) - Add docker containerizer destroy tests Repository: mesos-git Description --- Review: https://reviews.apache.org/r/24776 Diffs - src/docker/docker.hpp 443db49318a5923b987c06cda8060ccfb3301a2f src/docker/docker.cpp 6063114e700d13b1cd5d59cff356518021eb3286 src/slave/containerizer/docker.hpp fbbd45d77e5f2f74ca893552f85eb893b3dd948f src/slave/containerizer/docker.cpp 9a2948951f57f3ab16291df51cd9f33e5e96add4 src/tests/docker_containerizer_tests.cpp 1981f49d228903ccaa094a9747dec49054c1e0f2 Diff: https://reviews.apache.org/r/24776/diff/ Testing --- make check Thanks, Timothy Chen
Re: Why rely on url scheme for fetching?
That's cool. I think if it gets reviews and gets an okay I'll modify my deployment and build myself some deb packages with these changes till 0.22.0 ships. Sent from my iPhone On Nov 3, 2014, at 12:49 PM, Timothy Chen tnac...@gmail.com wrote: I think it's too late to be included, since it's going to take some rounds of review, and Ian is cutting the release today. We'll have to tag this for the next release. Tim
Twitter Mesos sprint Q4.3
Here's the list of stuff we're working on this sprint. First few parts of persistence, maintenance design, and a chunk of tech debt and documentation. Also, cutting the first 0.21.0 release candidate.

Link to sprint board: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=35

0.21.0 Release
- cutting the first candidate for 0.21.0

Persistence
- persistent disk resource
- slave checkpointing resources

TaskStatus
- adding source and reason to TaskStatus

Isolation
- documentation

Maintenance
- design document

Technical debt
- balloon framework
- flaky tests
- overcommit in Command executor
- segfault on test failure

-- Dominic Hamon | @mrdo | Twitter
*There are no bad ideas; only good ideas that go horribly wrong.*
Re: Review Request 27531: Update Master metrics to match task source and reason scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/#review59613 --- src/master/master.cpp https://reviews.apache.org/r/27531/#comment100910 We cannot/should not remove these without a deprecation cycle. This is an API change! src/master/master.cpp https://reviews.apache.org/r/27531/#comment100945 Hmm. This is rather unwieldy. Exposing the cross product of status and source and reason in the metrics is a bit strange, considering most reasons are only related to TASK_LOST. IOW, most of those metrics will be 0s which I find weird. AFAICT, this is what we have: TASK_STAGING, TASK_STARTING, TASK_RUNNING and TASK_FINISHED are sent by the master (reconciliation) or executor. Either way, there is no reason associated with them. TASK_FAILED is generated by master (reconciliation) or slave (oom or command executor failed) or executor. I'll comment on the slave aspect in the dependent review, because I realized it sets an incorrect reason. TASK_KILLED can be generated by master (reconciliation, pending), slave (pending, framework removed) or executor. TASK_LOST can be generated by master or slave or executor and can contain any of the reasons. Given the above, I would rather we use explicit combinations of status, source and reason metrics to capture these semantics, rather than using a vector of vector of vector. - Vinod Kone On Nov. 3, 2014, 6:24 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- (Updated Nov. 3, 2014, 6:24 p.m.) Review request for mesos, Tobias Weingartner and Vinod Kone. Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. 
Diffs - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 Diff: https://reviews.apache.org/r/27531/diff/ Testing --- make check run master and check endpoint: { ... master/task_error/source_executor/reason_executor_terminated: 0, master/task_error/source_executor/reason_executor_unregistered: 0, master/task_error/source_executor/reason_framework_removed: 0, master/task_error/source_executor/reason_gc_error: 0, master/task_error/source_executor/reason_invalid_frameworkid: 0, master/task_error/source_executor/reason_invalid_offers: 0, master/task_error/source_executor/reason_master_disconnected: 0, master/task_error/source_executor/reason_reconciliation: 0, master/task_error/source_executor/reason_slave_disconnected: 0, master/task_error/source_executor/reason_slave_removed: 0, master/task_error/source_executor/reason_slave_restarted: 0, master/task_error/source_executor/reason_slave_unknown: 0, master/task_error/source_executor/reason_task_invalid: 0, master/task_error/source_executor/reason_task_unauthorized: 0, master/task_error/source_executor/reason_task_unknown: 0, master/task_error/source_master/reason_executor_terminated: 0, master/task_error/source_master/reason_executor_unregistered: 0, master/task_error/source_master/reason_framework_removed: 0, master/task_error/source_master/reason_gc_error: 0, master/task_error/source_master/reason_invalid_frameworkid: 0, master/task_error/source_master/reason_invalid_offers: 0, master/task_error/source_master/reason_master_disconnected: 0, master/task_error/source_master/reason_reconciliation: 0, master/task_error/source_master/reason_slave_disconnected: 0, master/task_error/source_master/reason_slave_removed: 0, master/task_error/source_master/reason_slave_restarted: 0, master/task_error/source_master/reason_slave_unknown: 0, 
master/task_error/source_master/reason_task_invalid: 0, master/task_error/source_master/reason_task_unauthorized: 0, master/task_error/source_master/reason_task_unknown: 0, master/task_error/source_slave/reason_executor_terminated: 0, master/task_error/source_slave/reason_executor_unregistered: 0, master/task_error/source_slave/reason_framework_removed: 0, master/task_error/source_slave/reason_gc_error: 0, master/task_error/source_slave/reason_invalid_frameworkid: 0, master/task_error/source_slave/reason_invalid_offers: 0, master/task_error/source_slave/reason_master_disconnected: 0, master/task_error/source_slave/reason_reconciliation: 0, master/task_error/source_slave/reason_slave_disconnected: 0, master/task_error/source_slave/reason_slave_removed: 0,
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59644 --- src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100942 we should probably bail here, if somehow the return is != 0 (isError() || false) - Timothy St. Clair On Nov. 3, 2014, 1:52 a.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 1:52 a.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
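The fallback order in the description (hadoop first, then os::net, then a local copy), together with the comment about bailing on a non-zero return, can be sketched roughly as below. This is only an illustration under assumptions: `Outcome`, `Strategy`, `Result` and `fetchWithFallback` are hypothetical names, not code from the patch, and the split between "not applicable" and "failed" is one possible reading of the review comment.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Outcome of one fetch attempt. Hypothetical type for this sketch; the
// real fetcher uses stout/os helpers that are not reproduced here.
enum class Outcome { FETCHED, NOT_APPLICABLE, FAILED };

// A fetch strategy takes a URI and reports what happened.
using Strategy = std::function<Outcome(const std::string& uri)>;

struct Result
{
  Outcome outcome;
  std::string strategy; // Name of the strategy that decided the outcome.
};

// Tries each strategy in order. Moves on only when a strategy reports
// NOT_APPLICABLE; stops at the first FETCHED and bails at the first FAILED.
Result fetchWithFallback(
    const std::string& uri,
    const std::vector<std::pair<std::string, Strategy>>& strategies)
{
  for (const auto& entry : strategies) {
    assert(entry.second != nullptr); // Every entry must be callable.
    const Outcome outcome = entry.second(uri);
    if (outcome != Outcome::NOT_APPLICABLE) {
      return {outcome, entry.first};
    }
  }

  // No strategy could handle the URI at all.
  return {Outcome::FAILED, "none"};
}
```

Keeping each strategy behind a plain callable is what makes the chain testable without a hadoop installation: a test can pass in stub strategies and only check the ordering logic.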
Re: Unable to install Mesos on Ubuntu 14.04. Error during 'make'
Did you verify that the settings.xml file was parsed using mvn -X? Specifically after the first few lines you will see something along the lines of: [DEBUG] Reading global settings from /usr/share/maven/conf/settings.xml [DEBUG] Reading user settings from /home/abc/.m2/settings.xml [DEBUG] Using local repository at /home/abc/.m2/repository Make sure your settings file is in the right location. On Mon, Nov 3, 2014 at 11:51 AM, Sweta Rani swetarani3...@gmail.com wrote: I use campus proxy configuration for internet. But I am unable to run make in installing Mesos on my Ubuntu 14.04, as it ends up with ERROR(1). Upon searching, I created settings.xml file inside .m2 with the required configuration but failed again. Please help me out, as it is greatly hindering me from my work. On Sat Nov 01 2014 at 3:38:57 AM Joris Van Remoortere jo...@mesosphere.io wrote: I think this suggests you're fetching through a proxy: *Proxy request sent, awaiting response... 200 OK* When I wget this is my output: wget http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom --2014-10-31 15:05:32-- http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom Resolving repo.maven.apache.org (repo.maven.apache.org)... 23.235.47.215 Connecting to repo.maven.apache.org (repo.maven.apache.org)|23.235.47.215|:80... connected. *HTTP request sent, awaiting response... 200 OK* I think you might need to set up your proxy along these lines: http://maven.apache.org/guides/mini/guide-proxies.html ... but I am not familiar with Maven. 
Joris On Fri, Oct 31, 2014 at 2:36 PM, Sweta Rani swetarani3...@gmail.com wrote: Tried wget for all the given https, it was successful but still after that make was not successful and make check says YOU HAVE 3 DISABLED TESTS proxima@Centauri:~/mesos/build$ wget http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom --2014-11-01 02:55:51-- http://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom Connecting to 10.3.100.207:8080... connected. Proxy request sent, awaiting response... 200 OK Length: 14811 (14K) [text/xml] Saving to: ‘apache-11.pom’ *Joris Van Remoortere* On Sat Nov 01 2014 at 2:36:27 AM Joris Van Remoortere jo...@mesosphere.io wrote: That seems like you might have a connectivity / firewall issue. Can you try just doing a wget on that artifact? On Fri, Oct 31, 2014 at 1:56 PM, Sweta Rani swetarani3...@gmail.com wrote: I tried it again but failed with following errors : make[1]: Entering directory `/home/proxima/mesos/build/src' Building mesos-0.21.0.jar ... [INFO] Scanning for projects... Downloading: http://repo.maven.apache.org/maven2/org/apache/apache/11/ apache-11.pom [ERROR] The build could not read 1 project - [Help 1] [ERROR] [ERROR] The project org.apache.mesos:mesos:0.21.0 (/home/proxima/mesos/build/src/java/mesos.pom) has 1 error [ERROR] Non-resolvable parent POM: Could not transfer artifact org.apache:apache:pom:11 from/to central ( http://repo.maven.apache.org/maven2): Connection to http://repo.maven.apache.org refused and 'parent.relativePath' points at wrong local POM @ line 18, column 11: Connection refused - [Help 2] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
[ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException make[1]: *** [java/target/mesos-0.21.0.jar] Error 1 make[1]: Leaving directory `/home/proxima/mesos/build/src' make: *** [all-recursive] Error 1 On Fri Oct 31 2014 at 1:09:15 PM Sweta Rani swetarani3...@gmail.com wrote: Yes, I followed the steps as directed in the documentation. *Sweta Rani* *3rd Year Undergraduate* *Department of Electrical Engineering* *IIT Kharagpur http://www.iitkgp.ac.in/* On Fri, Oct 31, 2014 at 3:14 AM, Dominic Hamon dha...@twopensource.com wrote: Let's tackle the easy stuff: did you run bootstrap and configure as per the getting started doc? On Oct 30, 2014 2:04 PM, Sweta Rani swetarani3...@gmail.com wrote: Tried all the suggestions and tips given in JIRA and different google forums but still the error remains the same. Making all in . make[1]: Entering
Re: Review Request 27531: Update Master metrics to match task source and reason scheme.
On Nov. 3, 2014, 1:59 p.m., Vinod Kone wrote: src/master/master.cpp, lines 5078-5094 https://reviews.apache.org/r/27531/diff/1/?file=747568#file747568line5078 We cannot/should not remove these without a deprecation cycle. This is an API change! I considered exposing them as Gauges that would sum up the children in the tree. Thoughts? On Nov. 3, 2014, 1:59 p.m., Vinod Kone wrote: src/master/master.cpp, lines 5186-5230 https://reviews.apache.org/r/27531/diff/1/?file=747568#file747568line5186 Hmm. This is rather unwieldy. Exposing the cross product of status and source and reason in the metrics is a bit strange, considering most reasons are only related to TASK_LOST. IOW, most of those metrics will be 0s which I find weird. AFAICT, this is what we have: TASK_STAGING, TASK_STARTING, TASK_RUNNING and TASK_FINISHED are sent by the master (reconciliation) or executor. Either way, there is no reason associated with them. TASK_FAILED is generated by master (reconciliation) or slave (oom or command executor failed) or executor. I'll comment on the slave aspect in the dependent review, because I realized it sets an incorrect reason. TASK_KILLED can be generated by master (reconciliation, pending), slave (pending, framework removed) or executor. TASK_LOST can be generated by master or slave or executor and can contain any of the reasons. Given the above, I would rather we use explicit combinations of status, source and reason metrics to capture these semantics, rather than using a vector of vector of vector. All task statuses can have reason 'reconciliation' or 'None', so there needs to be a reason for everything. I'll code up the explicit combination, but it will likely be as unwieldy in terms of tracking which are valid combinations. It also means that changes to sources/reasons will require API changes whereas this covers every possible future combination and is a complete API. 
I'm starting to think I should pull the metrics out into a separate file (as per a TODO) to avoid churn. Maybe as a dependent review. What do you think? - Dominic --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/#review59613 --- On Nov. 3, 2014, 10:24 a.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- (Updated Nov. 3, 2014, 10:24 a.m.) Review request for mesos, Tobias Weingartner and Vinod Kone. Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. Diffs - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 Diff: https://reviews.apache.org/r/27531/diff/ Testing --- make check run master and check endpoint: { ... 
master/task_error/source_executor/reason_executor_terminated: 0, master/task_error/source_executor/reason_executor_unregistered: 0, master/task_error/source_executor/reason_framework_removed: 0, master/task_error/source_executor/reason_gc_error: 0, master/task_error/source_executor/reason_invalid_frameworkid: 0, master/task_error/source_executor/reason_invalid_offers: 0, master/task_error/source_executor/reason_master_disconnected: 0, master/task_error/source_executor/reason_reconciliation: 0, master/task_error/source_executor/reason_slave_disconnected: 0, master/task_error/source_executor/reason_slave_removed: 0, master/task_error/source_executor/reason_slave_restarted: 0, master/task_error/source_executor/reason_slave_unknown: 0, master/task_error/source_executor/reason_task_invalid: 0, master/task_error/source_executor/reason_task_unauthorized: 0, master/task_error/source_executor/reason_task_unknown: 0, master/task_error/source_master/reason_executor_terminated: 0, master/task_error/source_master/reason_executor_unregistered: 0, master/task_error/source_master/reason_framework_removed: 0, master/task_error/source_master/reason_gc_error: 0, master/task_error/source_master/reason_invalid_frameworkid: 0, master/task_error/source_master/reason_invalid_offers: 0, master/task_error/source_master/reason_master_disconnected: 0, master/task_error/source_master/reason_reconciliation: 0, master/task_error/source_master/reason_slave_disconnected: 0, master/task_error/source_master/reason_slave_removed: 0,
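The two ideas discussed in this thread, registering only explicit (status, source, reason) combinations and exposing a per-status aggregate that sums its children, might look roughly like the sketch below. `TaskCounters` and its methods are hypothetical and only mimic the key shape of the endpoint output above; this is not the Mesos metrics API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical counter registry keyed by names shaped like
// "master/task_lost/source_slave/reason_slave_removed".
class TaskCounters
{
public:
  // Registers one explicit (status, source, reason) combination at zero.
  void add(
      const std::string& status,
      const std::string& source,
      const std::string& reason)
  {
    counters[key(status, source, reason)] = 0;
  }

  // Increments a registered combination. Returns false for a combination
  // that was never registered, so invalid ones are caught early.
  bool increment(
      const std::string& status,
      const std::string& source,
      const std::string& reason)
  {
    auto it = counters.find(key(status, source, reason));
    if (it == counters.end()) {
      return false;
    }
    ++it->second;
    return true;
  }

  // Gauge-style aggregate: sums every child under "master/<status>/".
  // Keys sharing a prefix are adjacent in the ordered map.
  std::uint64_t total(const std::string& status) const
  {
    const std::string prefix = "master/" + status + "/";
    std::uint64_t sum = 0;
    for (auto it = counters.lower_bound(prefix);
         it != counters.end() &&
           it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
      sum += it->second;
    }
    return sum;
  }

private:
  static std::string key(
      const std::string& status,
      const std::string& source,
      const std::string& reason)
  {
    return "master/" + status + "/source_" + source + "/reason_" + reason;
  }

  std::map<std::string, std::uint64_t> counters;
};
```

With this shape, an existing top-level name could keep being served from `total()` during a deprecation cycle while the finer-grained keys are introduced alongside it.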
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59653 --- src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100948 It seems weird that we would always try and fail on HDFS if the user had specified http pulls. Given that we prefix check for curl pulls, perhaps we should switch the case order, as the cost is low for a string check but the cost is high for a HDFS check. - Timothy St. Clair On Nov. 3, 2014, 1:52 a.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 1:52 a.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
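The ordering suggested in this comment, doing the cheap string-prefix check before the expensive HDFS probe, could be sketched as follows. The prefix list and the names `startsWith`, `isNetUri`, `Method` and `firstMethod` are assumptions made for illustration; the actual patch may choose differently.

```cpp
#include <array>
#include <cassert> // For assert() when exercising the sketch.
#include <string>

// Returns true if `uri` begins with `prefix`.
bool startsWith(const std::string& uri, const std::string& prefix)
{
  return uri.compare(0, prefix.size(), prefix) == 0;
}

// Schemes assumed to be served by the plain network (curl-style) path.
// This list is an assumption for the sketch, not taken from the patch.
bool isNetUri(const std::string& uri)
{
  static const std::array<const char*, 4> prefixes = {
      "http://", "https://", "ftp://", "ftps://"};

  for (const char* prefix : prefixes) {
    if (startsWith(uri, prefix)) {
      return true;
    }
  }
  return false;
}

enum class Method { NET, HADOOP };

// Picks which method to try first: the string check costs almost nothing,
// so it runs before anything that would invoke the hadoop client.
Method firstMethod(const std::string& uri)
{
  return isNetUri(uri) ? Method::NET : Method::HADOOP;
}
```

Under this ordering an http URI never pays for a hadoop invocation, while any other scheme (including custom filesystems configured in core-site.xml) still reaches the hadoop path.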
Re: Review Request 26382: Add source and reason to TaskStatus.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59654 --- src/slave/slave.cpp https://reviews.apache.org/r/26382/#comment100949 So, after thinking a bit more and looking at MESOS-343, what we really want here is to set the reason for failed as oom or memory limit (REASON_MEMORY_LIMIT). Slave generates TASK_FAILED here not because the executor terminated but because it oomed (and in the future due to disk limit etc). Note that the slave also generates TASK_FAILED when command executor unexpectedly terminates. This is because, under normal conditions, command executor is not expected to terminate until the task finishes. For this case I think REASON_INVALID_COMMAND or REASON_BAD_COMMAND or REASON_COMMAND_FAILED is a better reason than REASON_EXECUTOR_TERMINATED because users have no idea about executors when using command tasks. The right way to do this is to plumb the reason via the Termination protobuf. That way any isolator can include the right reason if its corresponding limit is reached (memory, disk etc). If you want to punt on the plumbing, please add a TODO and tracking ticket. - Vinod Kone On Oct. 31, 2014, 10:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Oct. 31, 2014, 10:09 p.m.) Review request for mesos, Vinod Kone and Bill Farner. Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. 
Diffs - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing --- make check Thanks, Dominic Hamon
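The plumbing proposed in the review comment (letting the termination carry a reason so the slave does not have to guess) might be sketched like this. Everything here is hypothetical: the `Termination` struct is a stand-in rather than the real protobuf, and `MEMORY_LIMIT` and `COMMAND_FAILED` mirror names floated in the review, not confirmed enum values.

```cpp
#include <cassert> // For assert() when exercising the sketch.

// Hypothetical reason values mirroring names floated in the review.
enum class Reason
{
  EXECUTOR_TERMINATED,
  MEMORY_LIMIT,
  COMMAND_FAILED
};

// Hypothetical stand-in for the Termination message, extended with an
// optional reason supplied by whichever isolator hit its limit.
struct Termination
{
  bool killed = false;
  bool hasReason = false;
  Reason reason = Reason::EXECUTOR_TERMINATED;
};

// Picks the reason to attach to TASK_FAILED:
//  1. a reason plumbed through the termination wins,
//  2. otherwise a command executor exit is reported as a command failure,
//  3. otherwise fall back to the generic executor-terminated reason.
Reason failureReason(const Termination& termination, bool isCommandExecutor)
{
  if (termination.hasReason) {
    return termination.reason;
  }
  if (isCommandExecutor) {
    return Reason::COMMAND_FAILED;
  }
  return Reason::EXECUTOR_TERMINATED;
}
```

The fallback branch corresponds to the interim behaviour discussed in the thread: until the reason is plumbed through, the generic value is still returned for custom executors.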
Re: Review Request 26382: Add source and reason to TaskStatus.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59656 --- Couple more things that I wanted to add before I forget. -- Can you update the change log with these changes? You can add it under API changes subsection. -- Email the dev list about the changes coming in 0.21.0. -- FWICT, we haven't addressed https://issues.apache.org/jira/browse/MESOS-1930 with this review. The only reasons we have added are framework removed and executor unregistered (do we need this one?). I think what Alex is asking is (you can confirm with him) for a way for the executor to set a reason for TASK_KILLED? or for the slave to set different reasons based on different exit statuses? - Vinod Kone On Oct. 31, 2014, 10:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Oct. 31, 2014, 10:09 p.m.) Review request for mesos, Vinod Kone and Bill Farner. Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. 
Diffs - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing --- make check Thanks, Dominic Hamon
Re: Review Request 26382: Add source and reason to TaskStatus.
On Nov. 3, 2014, 2:13 p.m., Vinod Kone wrote: src/slave/slave.cpp, line 2952 https://reviews.apache.org/r/26382/diff/11/?file=746449#file746449line2952 So, after thinking a bit more and looking at MESOS-343, what we really want here is to set the reason for failed as oom or memory limit (REASON_MEMORY_LIMIT). Slave generates TASK_FAILED here not because the executor terminated but because it oomed (and in the future due to disk limit etc). Note that the slave also generates TASK_FAILED when command executor unexpectedly terminates. This is because, under normal conditions, command executor is not expected to terminate until the task finishes. For this case I think REASON_INVALID_COMMAND or REASON_BAD_COMMAND or REASON_COMMAND_FAILED is a better reason than REASON_EXECUTOR_TERMINATED because users have no idea about executors when using command tasks. The right way to do this is to plumb the reason via the Termination protobuf. That way any isolator can include the right reason if its corresponding limit is reached (memory, disk etc). If you want to punt on the plumbing, please add a TODO and tracking ticket. If termination is set to 'killed' then should it be TASK_KILLED instead of TASK_FAILED? I think this needs a TODO as it changes the plumbing quite a bit (and tests, etc). For now, is REASON_EXECUTOR_TERMINATED accurate enough? I could add something for command vs containerizer vs executor, but that starts to require more plumbing pretty quickly. - Dominic --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59654 --- On Oct. 31, 2014, 3:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Oct. 31, 2014, 3:09 p.m.) Review request for mesos, Vinod Kone and Bill Farner. 
Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. Diffs - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing --- make check Thanks, Dominic Hamon
Re: Review Request 26382: Add source and reason to TaskStatus.
On Nov. 3, 2014, 2:18 p.m., Vinod Kone wrote: Couple more things that I wanted to add before I forget. -- Can you update the change log with these changes? You can add it under API changes subsection. -- Email the dev list about the changes coming in 0.21.0. -- FWICT, we haven't addressed https://issues.apache.org/jira/browse/MESOS-1930 with this review. The only reasons we have added are framework removed and executor unregistered (do we need this one?). I think what Alex is asking is (you can confirm with him) for a way for the executor to set a reason for TASK_KILLED? or for the slave to set different reasons based on different exit statuses? I was going to do that once I was sure it was landing in 0.21. I'm not sure that it is. - Dominic --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59656 --- On Oct. 31, 2014, 3:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Oct. 31, 2014, 3:09 p.m.) Review request for mesos, Vinod Kone and Bill Farner. Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. 
Diffs - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing --- make check Thanks, Dominic Hamon
Re: Review Request 27531: Update Master metrics to match task source and reason scheme.
On Nov. 3, 2014, 9:59 p.m., Vinod Kone wrote: src/master/master.cpp, lines 5078-5094 https://reviews.apache.org/r/27531/diff/1/?file=747568#file747568line5078 We cannot/should not remove these without a deprecation cycle. This is an API change! Dominic Hamon wrote: I considered exposing them as Gauges that would sum up the children in the tree. Thoughts? non-terminals are gauges and terminals are counters right? why not leave them as is for now? another potential issue is that the sum of the child metrics may not be equal to the parent metric because they are calculated at different times. but i guess, that's ok? On Nov. 3, 2014, 9:59 p.m., Vinod Kone wrote: src/master/master.cpp, lines 5186-5230 https://reviews.apache.org/r/27531/diff/1/?file=747568#file747568line5186 Hmm. This is rather unwieldy. Exposing the cross product of status and source and reason in the metrics is a bit strange, considering most reasons are only related to TASK_LOST. IOW, most of those metrics will be 0s which I find weird. AFAICT, this is what we have: TASK_STAGING, TASK_STARTING, TASK_RUNNING and TASK_FINISHED are sent by the master (reconciliation) or executor. Either way, there is no reason associated with them. TASK_FAILED is generated by master (reconciliation) or slave (oom or command executor failed) or executor. I'll comment on the slave aspect in the dependent review, because i realized it sets incorrect reason. TASK_KILLED can be generated by master (reconciliation, pending), slave (pending, framework removed) or executor. TASK_LOST can be generated by master or slave or executor and can contain any of the reasons. Given the above, I would rather we use explicit combinations of status, source and reason metrics to capture these semantics, rather than using a vector of vector of vector. Dominic Hamon wrote: All task status can have reason 'reconciliation' or 'None', so there needs to be a reason for everything. 
I'll code up the explicit combination, but it will likely be as unwieldy in terms of tracking which are valid combinations. It also means that changes to sources/reasons will require API changes whereas this covers every possible future combination and is a complete API. I'm starting to think I should pull the metrics out into a separate file (as per a TODO) to avoid churn. Maybe as a dependent review. What do you think? ``` It also means that changes to sources/reasons will require API changes ``` I don't think I follow. Why? Another thing to consider is to reduce the scope of this work for 0.21.0. Maybe just include reasons for TASK_LOST and nothing else? Would that make it less unwieldy for now? - Vinod --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/#review59613 --- On Nov. 3, 2014, 6:24 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- (Updated Nov. 3, 2014, 6:24 p.m.) Review request for mesos, Tobias Weingartner and Vinod Kone. Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. Diffs - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 Diff: https://reviews.apache.org/r/27531/diff/ Testing --- make check run master and check endpoint: { ... 
master/task_error/source_executor/reason_executor_terminated: 0, master/task_error/source_executor/reason_executor_unregistered: 0, master/task_error/source_executor/reason_framework_removed: 0, master/task_error/source_executor/reason_gc_error: 0, master/task_error/source_executor/reason_invalid_frameworkid: 0, master/task_error/source_executor/reason_invalid_offers: 0, master/task_error/source_executor/reason_master_disconnected: 0, master/task_error/source_executor/reason_reconciliation: 0, master/task_error/source_executor/reason_slave_disconnected: 0, master/task_error/source_executor/reason_slave_removed: 0, master/task_error/source_executor/reason_slave_restarted: 0, master/task_error/source_executor/reason_slave_unknown: 0, master/task_error/source_executor/reason_task_invalid: 0, master/task_error/source_executor/reason_task_unauthorized: 0, master/task_error/source_executor/reason_task_unknown: 0,
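One way to read the gauge proposal in this thread is to keep the existing top-level metric name and compute its value on demand as the sum of its per-source, per-reason children. The following is only a minimal sketch of that idea: it uses a plain `std::map` as a stand-in for the real metrics library, and `Counters` and `parentGauge` are hypothetical names that do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a set of child counters keyed by metric name,
// e.g. "master/task_lost/source_slave/reason_executor_terminated".
using Counters = std::map<std::string, uint64_t>;


// Returns the value a parent gauge would report: the sum of every counter
// whose name starts with the parent name followed by '/'.
uint64_t parentGauge(const Counters& counters, const std::string& parent)
{
  const std::string prefix = parent + "/";
  uint64_t total = 0;

  // Keys sharing a prefix are contiguous in an ordered map, so start at the
  // first key that is >= the prefix and stop at the first non-matching key.
  for (auto it = counters.lower_bound(prefix);
       it != counters.end() &&
         it->first.compare(0, prefix.size(), prefix) == 0;
       ++it) {
    total += it->second;
  }

  return total;
}
```

Because the sum is taken in one pass at read time, this sketch sidesteps the "calculated at different times" concern only for a single-threaded reader; whether that holds for the real metrics endpoint is not established in the thread.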
Re: Review Request 27531: Update Master metrics to match task source and reason scheme.
On Nov. 3, 2014, 1:59 p.m., Vinod Kone wrote: src/master/master.cpp, lines 5078-5094 https://reviews.apache.org/r/27531/diff/1/?file=747568#file747568line5078 We cannot/should not remove these without a deprecation cycle. This is an API change! Dominic Hamon wrote: I considered exposing them as Gauges that would sum up the children in the tree. Thoughts? Vinod Kone wrote: non-terminals are gauges and terminals are counters right? why not leave them as is for now? another potential issue is that the sum of the child metrics may not be equal to the parent metric because they are calculated at different times. but i guess, that's ok? to avoid having multiple metric updates when we send status updates. ie, to simplify the callsites and avoid code churn. On Nov. 3, 2014, 1:59 p.m., Vinod Kone wrote: src/master/master.cpp, lines 5186-5230 https://reviews.apache.org/r/27531/diff/1/?file=747568#file747568line5186 Hmm. This is rather unwieldy. Exposing the cross product of status and source and reason in the metrics is a bit strange, considering most reasons are only related to TASK_LOST. IOW, most of those metrics will be 0s which I find weird. AFAICT, this is what we have: TASK_STAGING, TASK_STARTING, TASK_RUNNING and TASK_FINISHED are sent by the master (reconciliation) or executor. Either way, there is no reason associated with them. TASK_FAILED is generated by master (reconciliation) or slave (oom or command executor failed) or executor. I'll comment on the slave aspect in the dependent review, because i realized it sets incorrect reason. TASK_KILLED can be generated by master (reconciliation, pending), slave (pending, framework removed) or executor. TASK_LOST can be generated by master or slave or executor and can contain any of the reasons. Given the above, I would rather we use explicit combinations of status, source and reason metrics to capture these semantics, rather than using a vector of vector of vector. 
Dominic Hamon wrote: All task status can have reason 'reconciliation' or 'None', so there needs to be a reason for everything. I'll code up the explicit combination, but it will likely be as unwieldy in terms of tracking which are valid combinations. It also means that changes to sources/reasons will require API changes whereas this covers every possible future combination and is a complete API. I'm starting to think I should pull the metrics out into a separate file (as per a TODO) to avoid churn. Maybe as a dependent review. What do you think? Vinod Kone wrote: ``` It also means that changes to sources/reasons will require API changes ``` I don't think I follow. Why? Another thing to consider is to reduce the scope of this work for 0.21.0. Maybe just include reasons for TASK_LOST and nothing else? Would that make it less unwieldy for now? if we define the explicit set of task state/source/reason tuples and a reason changes for a given state/source, we'd have to deprecate the old combination to introduce the new one. concrete example: TASK_FAILED/SOURCE_SLAVE/REASON_EXECUTOR_TERMINATED will become TASK_FAILED/SOURCE_SLAVE/REASON_OOM and TASK_FAILED/SOURCE_SLAVE/REASON_INVALID_COMMAND in a future patch we'd then need to deprecate the REASON_EXECUTOR_TERMINATED metric through a cycle as it would then be unused. Actually, thinking about it more, we'd have to do that anyway and the explicit list of metrics makes it easier! let me code up the explicit combinations and see what it looks like. - Dominic --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/#review59613 --- On Nov. 3, 2014, 10:24 a.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- (Updated Nov. 3, 2014, 10:24 a.m.) Review request for mesos, Tobias Weingartner and Vinod Kone. 
Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. Diffs - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 Diff: https://reviews.apache.org/r/27531/diff/ Testing --- make check run master and check endpoint: { ... master/task_error/source_executor/reason_executor_terminated:
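The "explicit combinations" alternative discussed in this message can be pictured as a flat table of valid (state, source, reason) tuples from which the metric names are derived, instead of a vector of vector of vector covering the full cross product. The sketch below is an assumption-laden illustration: `Combination`, `validCombinations` and `metricNames` are hypothetical names, and the table is only a small subset drawn from cases mentioned in the thread, not the final list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One explicitly allowed combination of task state, source and reason.
struct Combination
{
  std::string state;
  std::string source;
  std::string reason;
};


// Illustrative subset only, based on cases mentioned in the review thread.
// The real list would be decided in the patch, not here.
std::vector<Combination> validCombinations()
{
  return {
    {"task_failed", "source_slave", "reason_executor_terminated"},
    {"task_killed", "source_slave", "reason_framework_removed"},
    {"task_lost", "source_master", "reason_slave_removed"},
    {"task_lost", "source_slave", "reason_executor_terminated"},
  };
}


// Builds one metric name per combination, following the
// "master/<state>/<source>/<reason>" naming shown in the endpoint output.
std::vector<std::string> metricNames(const std::vector<Combination>& combos)
{
  std::vector<std::string> names;
  for (const Combination& c : combos) {
    names.push_back("master/" + c.state + "/" + c.source + "/" + c.reason);
  }
  return names;
}
```

With a table like this, retiring a reason (for example replacing REASON_EXECUTOR_TERMINATED for TASK_FAILED) is a one-row change, and the row to deprecate is visible in one place.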
Re: Review Request 26382: Add source and reason to TaskStatus.
On Nov. 3, 2014, 10:13 p.m., Vinod Kone wrote: src/slave/slave.cpp, line 2952 https://reviews.apache.org/r/26382/diff/11/?file=746449#file746449line2952 So, after thinking a bit more and looking at MESOS-343, what we really want here is to set the reason for failed as oom or memory limit (REASON_MEMORY_LIMIT). Slave generates TASK_FAILED here not because the executor terminated but because it oomed (and in the future due to disk limit etc). Note that the slave also generates TASK_FAILED when command executor unexpectedly terminates. This is because, under normal conditions, command executor is not expected to terminate until the task finishes. For this case I think REASON_INVALID_COMMAND or REASON_BAD_COMMAND or REASON_COMMAND_FAILED is a better reason than REASON_EXECUTOR_TERMINATED because users have no idea about executors when using command tasks. The right way to do this is to plumb the reason via the Termination protobuf. That way any isolator can include the right reason if its corresponding limit is reached (memory, disk etc). If you want to punt on the plumbing, please add a TODO and tracking ticket. Dominic Hamon wrote: if termination is set to 'killed' then should it be TASK_KILLED instead of TASK_FAILED? I think this needs a TODO as it changes the plumbing quite a bit (and tests, etc). For now, is REASON_EXECUTOR_TERMINATED accurate enough? I could add something for command vs containerizer vs executor, but that starts to require more plumbing pretty quickly. ``` if termination is set to 'killed' then should it be TASK_KILLED instead of TASK_FAILED? ``` No. It should be TASK_FAILED. Those are the semantics. ``` I think this needs a TODO as it changes the plumbing quite a bit (and tests, etc) ``` Which one? Plumbing via the termination protobuf or setting REASON_FAILED_COMMAND and REASON_MEMORY_LIMIT? The latter could be done without any plumbing; all changes will be in executorTerminated(). - Vinod --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/26382/#review59654 --- On Oct. 31, 2014, 10:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Oct. 31, 2014, 10:09 p.m.) Review request for mesos, Vinod Kone and Bill Farner. Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. Diffs - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing --- make check Thanks, Dominic Hamon
Re: Review Request 26382: Add source and reason to TaskStatus.
On Nov. 3, 2014, 2:13 p.m., Vinod Kone wrote: src/slave/slave.cpp, line 2952 https://reviews.apache.org/r/26382/diff/11/?file=746449#file746449line2952 So, after thinking a bit more and looking at MESOS-343, what we really want here is to set the reason for failed as oom or memory limit (REASON_MEMORY_LIMIT). Slave generates TASK_FAILED here not because the executor terminated but because it oomed (and in the future due to disk limit etc). Note that the slave also generates TASK_FAILED when command executor unexpectedly terminates. This is because, under normal conditions, command executor is not expected to terminate until the task finishes. For this case I think REASON_INVALID_COMMAND or REASON_BAD_COMMAND or REASON_COMMAND_FAILED is a better reason than REASON_EXECUTOR_TERMINATED because users have no idea about executors when using command tasks. The right way to do this is to plumb the reason via the Termination protobuf. That way any isolator can include the right reason if its corresponding limit is reached (memory, disk etc). If you want to punt on the plumbing, please add a TODO and tracking ticket. Dominic Hamon wrote: if termination is set to 'killed' then should it be TASK_KILLED instead of TASK_FAILED? I think this needs a TODO as it changes the plumbing quite a bit (and tests, etc). For now, is REASON_EXECUTOR_TERMINATED accurate enough? I could add something for command vs containerizer vs executor, but that starts to require more plumbing pretty quickly. Vinod Kone wrote: ``` if termination is set to 'killed' then should it be TASK_KILLED instead of TASK_FAILED? ``` No. It should be TASK_FAILED. Those are the semantics. ``` I think this needs a TODO as it changes the plumbing quite a bit (and tests, etc) ``` Which one? Plumbing via the termination protobuf or setting REASON_FAILED_COMMAND and REASON_MEMORY_LIMIT? The latter could be done without any plumbing; all changes will be in executorTerminated(). 
this assumes that the termination reason is memory limit though, right? and we still don't have a reason for TASK_LOST. - Dominic --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59654 --- On Oct. 31, 2014, 3:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Oct. 31, 2014, 3:09 p.m.) Review request for mesos, Vinod Kone and Bill Farner. Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. Diffs - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing 
--- make check Thanks, Dominic Hamon
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
On Nov. 3, 2014, 10:10 p.m., Timothy St. Clair wrote: src/launcher/fetcher.cpp, line 219 https://reviews.apache.org/r/27483/diff/1/?file=746870#file746870line219 It seems weird that we would always try and fail on HDFS if the user had specified http pulls. Given that we prefix check for curl pulls, perhaps we should switch the case order, as the cost is low for a string check but the cost is high for a HDFS check. That seems fair. I can swap the checks. - Ankur --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59653 --- On Nov. 3, 2014, 1:52 a.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 1:52 a.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
On Nov. 3, 2014, 9:59 p.m., Timothy St. Clair wrote: src/launcher/fetcher.cpp, line 77 https://reviews.apache.org/r/27483/diff/1/?file=746870#file746870line77 we should probably bail here, if somehow the return is != 0 (isError() || false) My thinking here was: In case of a user that does not have hadoop_home set and no hadoop in path, we would like to continue with other methods. If we bail here for whatever reason (no hadoop, misconfigured hadoop etc), we would end up breaking the fetcher for all those people. - This holds only if we assume that hadoop/hdfs is **not** a hard dependency. - Ankur --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59644 --- On Nov. 3, 2014, 1:52 a.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 1:52 a.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
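The two review points on this patch (do the cheap scheme check before the expensive HDFS check, and do not bail when hadoop is missing) suggest a selection order roughly like the sketch below. It is not the code under review: `Method`, `startsWith` and `chooseMethod` are hypothetical names, `hadoopAvailable` stands in for whatever the HDFS availability check returns, and both the scheme list and the `"://"` test are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical labels for the three ways of fetching mentioned in the thread.
enum class Method { NET, HADOOP, LOCAL_COPY };


bool startsWith(const std::string& s, const std::string& prefix)
{
  return s.compare(0, prefix.size(), prefix) == 0;
}


// Picks a fetch method for `uri`. `hadoopAvailable` is an assumed input
// representing the outcome of the (expensive) hadoop availability check.
Method chooseMethod(const std::string& uri, bool hadoopAvailable)
{
  // Cheap string checks first: schemes a curl-style fetch can handle.
  // The exact scheme list here is an assumption.
  if (startsWith(uri, "http://") || startsWith(uri, "https://") ||
      startsWith(uri, "ftp://") || startsWith(uri, "ftps://")) {
    return Method::NET;
  }

  // Any other URI with a scheme is handed to hadoop, but only if hadoop is
  // usable. A missing or misconfigured hadoop is not treated as fatal.
  if (hadoopAvailable && uri.find("://") != std::string::npos) {
    return Method::HADOOP;
  }

  // Last resort; this may itself fail for URIs that are not local paths.
  return Method::LOCAL_COPY;
}
```

Ordering the checks this way keeps plain http pulls from ever touching hadoop, while users without hadoop installed still reach the remaining methods.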
Re: Review Request 26382: Add source and reason to TaskStatus.
On Nov. 3, 2014, 10:13 p.m., Vinod Kone wrote: src/slave/slave.cpp, line 2952 https://reviews.apache.org/r/26382/diff/11/?file=746449#file746449line2952 So, after thinking a bit more and looking at MESOS-343, what we really want here is to set the reason for failed as oom or memory limit (REASON_MEMORY_LIMIT). Slave generates TASK_FAILED here not because the executor terminated but because it oomed (and in the future due to disk limit etc). Note that the slave also generates TASK_FAILED when command executor unexpectedly terminates. This is because, under normal conditions, command executor is not expected to terminate until the task finishes. For this case I think REASON_INVALID_COMMAND or REASON_BAD_COMMAND or REASON_COMMAND_FAILED is a better reason than REASON_EXECUTOR_TERMINATED because users have no idea about executors when using command tasks. The right way to do this is to plumb the reason via the Termination protobuf. That way any isolator can include the right reason if its corresponding limit is reached (memory, disk etc). If you want to punt on the plumbing, please add a TODO and tracking ticket. Dominic Hamon wrote: if termination is set to 'killed' then should it be TASK_KILLED instead of TASK_FAILED? I think this needs a TODO as it changes the plumbing quite a bit (and tests, etc). For now, is REASON_EXECUTOR_TERMINATED accurate enough? I could add something for command vs containerizer vs executor, but that starts to require more plumbing pretty quickly. Vinod Kone wrote: ``` if termination is set to 'killed' then should it be TASK_KILLED instead of TASK_FAILED? ``` No. It should be TASK_FAILED. Those are the semantics. ``` I think this needs a TODO as it changes the plumbing quite a bit (and tests, etc) ``` Which one? Plumbing via the termination protobuf or setting REASON_FAILED_COMMAND and REASON_MEMORY_LIMIT? The latter could be done without any plumbing; all changes will be in executorTerminated(). 
Dominic Hamon wrote: this assumes that the termination reason is memory limit though, right? and we still don't have a reason for TASK_LOST. ``` this assumes that the termination reason is memory limit though, right? ``` that's correct, which is true as of today. ``` and we still don't have a reason for TASK_LOST. ``` TASK_LOST should be REASON_EXECUTOR_TERMINATED as you already had. That should only apply for non-terminal tasks belonging to a terminated (non-oomed and non-command) executor. - Vinod --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59654 --- On Oct. 31, 2014, 10:09 p.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Oct. 31, 2014, 10:09 p.m.) Review request for mesos, Vinod Kone and Bill Farner. Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. 
Diffs - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing --- make check Thanks, Dominic Hamon
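The state and reason rules agreed in this exchange can be summarised as a small decision function for non-terminal tasks of a terminated executor. This is a sketch under stated assumptions, not the contents of executorTerminated(): `Update` and `statusForTermination` are hypothetical names, the two boolean inputs are simplifications, and REASON_COMMAND_FAILED is just one of several candidate names floated in the thread (REASON_INVALID_COMMAND, REASON_BAD_COMMAND, REASON_FAILED_COMMAND were also proposed).

```cpp
#include <cassert>
#include <string>

// State and reason to report for a non-terminal task of a terminated executor.
struct Update
{
  std::string state;
  std::string reason;
};


// `killedByLimit`: the termination was marked 'killed', which per the thread
//   currently implies a memory limit was hit (assumption that may not hold
//   once other limits, such as disk, exist).
// `commandExecutor`: the task was a command task run by the command executor.
Update statusForTermination(bool killedByLimit, bool commandExecutor)
{
  if (killedByLimit) {
    // 'killed' still maps to TASK_FAILED, not TASK_KILLED.
    return {"TASK_FAILED", "REASON_MEMORY_LIMIT"};
  }

  if (commandExecutor) {
    // Users of command tasks do not know about executors, so report a
    // command-oriented reason. The name is a placeholder.
    return {"TASK_FAILED", "REASON_COMMAND_FAILED"};
  }

  // Non-oomed, non-command executor terminated unexpectedly.
  return {"TASK_LOST", "REASON_EXECUTOR_TERMINATED"};
}
```

Plumbing the reason through the Termination protobuf, as suggested earlier in the thread, would replace the `killedByLimit` flag with a reason supplied by the isolator that enforced the limit.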
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 11:09 p.m.) Review request for mesos and Timothy St. Clair. Changes --- Prefer the libcurl fetcher to the HDFS fetcher. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs (updated) - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59673 --- src/hdfs/hdfs.hpp https://reviews.apache.org/r/27483/#comment100962 Doing a style clean first on this patch. Move { to newline. src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100963 space between the first parenthesis: if (hdfs.available().isError()) src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100964 Ditto src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100965 Two spaces between functions src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100966 Two spaces between functions src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100968 if (result.isSome()) { src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100967 Space between parenthesis and also between starting bracket You're still working on unit tests right? - Timothy Chen On Nov. 3, 2014, 11:09 p.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 11:09 p.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). 
Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59674 --- src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100969 I think we should leave some comments why we're trying to use HDFS (I think more specifically the HDFS client right?) to fetch all other URIs, the original motivation of the refactor. - Timothy Chen On Nov. 3, 2014, 11:09 p.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 11:09 p.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27481: Updated Modules protobuf to simplify JSON parsing.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27481/#review59510 --- Ship it! Great work Kapil. Makes much more sense now. src/tests/module_tests.cpp https://reviews.apache.org/r/27481/#comment100781 Given that people like me seek the ultimate truth in the tests, I think this one is particularly helpful even though it is arguable on why you basically test JSON to protobuf here. Also this test would have allowed us to spot issues we saw with the earlier version of the libraries protobuf much quicker (documentation was not in sync with json-format). - Till Toenshoff On Nov. 2, 2014, 1:02 a.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27481/ --- (Updated Nov. 2, 2014, 1:02 a.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Repository: mesos-git Description --- Updated the --modules help messages accordingly. Diffs - src/master/flags.hpp c931fd99e847477d773c05524f4dee90b8c168cb src/messages/messages.proto 76e39808457816d67f58f08f4349cc700fd396ee src/module/manager.cpp 7a6c88444c136dc56898bc5e81fda2c2662e6e68 src/slave/flags.hpp f7a8cde5826556a477ad22a84b9f50f0d1c8103f src/tests/flags.hpp 2886e89cd5e2e45983190493cd2a0a20ab96aa6e src/tests/module_tests.cpp e079dbe58f000d80d111ff87945d37edab0db9b4 Diff: https://reviews.apache.org/r/27481/diff/ Testing --- Updated test suite, added a new test that generates the Modules protobuf by parsing a Json string and ran make check. Thanks, Kapil Arya
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 11:42 p.m.) Review request for mesos and Timothy St. Clair. Changes --- Style fixes and added comments to explain the fetcher dependency on hadoop client Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs (updated) - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27481: Updated Modules protobuf to simplify JSON parsing.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27481/ --- (Updated Nov. 3, 2014, 6:43 p.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Changes --- Add issue id. Bugs: MESOS-2036 https://issues.apache.org/jira/browse/MESOS-2036 Repository: mesos-git Description --- Updated the --modules help messages accordingly. Diffs - src/master/flags.hpp c931fd99e847477d773c05524f4dee90b8c168cb src/messages/messages.proto 76e39808457816d67f58f08f4349cc700fd396ee src/module/manager.cpp 7a6c88444c136dc56898bc5e81fda2c2662e6e68 src/slave/flags.hpp f7a8cde5826556a477ad22a84b9f50f0d1c8103f src/tests/flags.hpp 2886e89cd5e2e45983190493cd2a0a20ab96aa6e src/tests/module_tests.cpp e079dbe58f000d80d111ff87945d37edab0db9b4 Diff: https://reviews.apache.org/r/27481/diff/ Testing --- Updated test suite, added a new test that generates the Modules protobuf by parsing a Json string and ran make check. Thanks, Kapil Arya
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59681 --- src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100973 End comments with period. src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100974 Fix spacing - Timothy Chen On Nov. 3, 2014, 11:42 p.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 3, 2014, 11:42 p.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 4, 2014, 12:02 a.m.) Review request for mesos and Timothy St. Clair. Changes --- Fix build failure Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs (updated) - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2239
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2239/changes Changes: [dave] Adds Oakmore Labs to Mesos adopters documentation. -- [...truncated 69675 lines...] I1104 00:02:03.636489 15081 group.cpp:385] Trying to create path '/log' in ZooKeeper 2014-11-04 00:02:03,636:15030(0x2b3f7a857700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:37029], sessionId=0x149781bba660001, negotiated timeout=1 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@log_env@716: Client environment:host.name=penates.apache.org 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@log_env@723: Client environment:os.name=Linux 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@log_env@724: Client environment:os.arch=3.13.0-36-lowlatency 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@log_env@725: Client environment:os.version=#63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@log_env@716: Client environment:host.name=penates.apache.org 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@log_env@733: Client environment:user.name=jenkins 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@log_env@723: Client environment:os.name=Linux 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@log_env@724: Client environment:os.arch=3.13.0-36-lowlatency 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@log_env@725: Client environment:os.version=#63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@log_env@741: Client environment:user.home=/home/jenkins 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@log_env@753: Client 
environment:user.dir=/tmp/LogZooKeeperTest_WriteRead_wjDxTk 2014-11-04 00:02:03,637:15030(0x2b3bd8419700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=127.0.0.1:37029 sessionTimeout=1 watcher=0x2b3bd2d357be sessionId=0 sessionPasswd=null context=0x2b3c040c20f0 flags=0 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@log_env@733: Client environment:user.name=jenkins 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@log_env@741: Client environment:user.home=/home/jenkins 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@log_env@753: Client environment:user.dir=/tmp/LogZooKeeperTest_WriteRead_wjDxTk 2014-11-04 00:02:03,637:15030(0x2b3bd8c1d700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=127.0.0.1:37029 sessionTimeout=1 watcher=0x2b3bd2d357be sessionId=0 sessionPasswd=null context=0x2b3c1c0a6460 flags=0 I1104 00:02:03.637473 15076 group.cpp:313] Group process (group(54)@67.195.81.186:52364) connected to ZooKeeper I1104 00:02:03.637713 15076 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (1, 0, 0) I1104 00:02:03.637732 15076 group.cpp:385] Trying to create path '/log' in ZooKeeper I1104 00:02:03.637675 15075 log.cpp:238] Attempting to join replica to ZooKeeper group 2014-11-04 00:02:03,638:15030(0x2b3f7a254700):ZOO_INFO@check_events@1703: initiated connection to server [127.0.0.1:37029] I1104 00:02:03.638144 15080 recover.cpp:437] Starting replica recovery I1104 00:02:03.638434 15080 recover.cpp:463] Replica is in VOTING status I1104 00:02:03.638598 15080 recover.cpp:452] Recover process terminated 2014-11-04 00:02:03,638:15030(0x2b3f79a50700):ZOO_INFO@check_events@1703: initiated connection to server [127.0.0.1:37029] I1104 00:02:03.638901 15066 log.cpp:656] Attempting to start the writer 2014-11-04 00:02:03,640:15030(0x2b3f7a254700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:37029], sessionId=0x149781bba660002, negotiated timeout=1 2014-11-04 
00:02:03,640:15030(0x2b3f79a50700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:37029], sessionId=0x149781bba660003, negotiated timeout=1 I1104 00:02:03.642882 15077 group.cpp:313] Group process (group(55)@67.195.81.186:52364) connected to ZooKeeper I1104 00:02:03.784194 15077 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) I1104 00:02:03.784291 15077 group.cpp:385] Trying to create path '/log' in ZooKeeper I1104 00:02:03.644551 15072 network.hpp:424] ZooKeeper group memberships changed I1104 00:02:03.784762 15066 group.cpp:313] Group process (group(56)@67.195.81.186:52364) connected to ZooKeeper I1104 00:02:03.784796 15066 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (1, 0, 0) I1104 00:02:03.784809 15066 group.cpp:385] Trying to create path '/log' in ZooKeeper I1104 00:02:03.785157 15074
Jenkins build is back to normal : Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2517
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2517/changes
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59682 --- src/hdfs/hdfs.hpp https://reviews.apache.org/r/27483/#comment100975 We use proper sentences for comments. s/check/Check/ s/code == 0/code == 0./ s/hadoop/`hadoop'/ src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100981 2 blank lines between outer elements. src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100980 s/HDFS/HadoopClient/ src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100982 s/HDFS/Hadoop Client/ src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100983 thank you. src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100977 period at the end. src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100984 Why the fall through to HDFS here? src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100979 period at the end. src/launcher/fetcher.cpp https://reviews.apache.org/r/27483/#comment100978 Seems weird to do this after hdfs fetch. Is HDFS local copy as fast as cp ? I would change the semantics as follows: ``` if (local) { return fetchWithLocalCopy(); } if (http* or ftp*) { return fetchWithNet(); } return fetchWithHadoopClient(); ``` - Vinod Kone On Nov. 4, 2014, 12:02 a.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 4, 2014, 12:02 a.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). 
This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27461: fix OsTest.killtreeNoRoot: check for reparent and not zombie
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Nov. 4, 2014, 12:17 a.m.) Review request for mesos and Ian Downes. Changes --- Modify checks to address Ian's comment. Summary (updated) - fix OsTest.killtreeNoRoot: check for reparent and not zombie Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description (updated) --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check that the subtree has been reparented to a process different from its original parent (a.k.a. child) and that it is not a zombie. Diffs (updated) - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
Review Request 27550: Added DiskInfo to the Resource protobuf.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27550/ --- Review request for mesos, Ben Mahler and Vinod Kone. Repository: mesos-git Description --- See summary. Diffs - include/mesos/mesos.proto 6b93e9000761857c4f335f2a8c8088e155078f54 Diff: https://reviews.apache.org/r/27550/diff/ Testing --- make check Thanks, Jie Yu
Re: Review Request 27461: fix OsTest.killtreeNoRoot: check for reparent and not zombie
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/#review59690 --- Ship it! Could you please add to the comment to clarify why this is done, i.e., because some systems don't reparent to init pid 1. Thanks! - Ian Downes On Nov. 3, 2014, 4:17 p.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Nov. 3, 2014, 4:17 p.m.) Review request for mesos and Ian Downes. Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check that the subtree has been reparented to a process different from its original parent (a.k.a. child) and that it is not a zombie. Diffs - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
Review Request 27552: Added a slave flag to opt out using default resources.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27552/ --- Review request for mesos, Ben Mahler and Vinod Kone. Repository: mesos-git Description --- This is used in testing of the new Resources class. See the dependent review. Diffs - src/slave/containerizer/containerizer.cpp 0254679508167a390fd6fed855f19794354ac081 src/slave/flags.hpp 03c62a2fd040768392c7e24d93f64ca3a855c4a1 Diff: https://reviews.apache.org/r/27552/diff/ Testing --- make check Thanks, Jie Yu
Re: Review Request 27461: fix OsTest.killtreeNoRoot: check for reparent and not zombie
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/#review59692 --- 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp https://reviews.apache.org/r/27461/#comment100986 Is this comment still valid? 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp https://reviews.apache.org/r/27461/#comment100987 Couldn't this be grandchild.children.front()? 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp https://reviews.apache.org/r/27461/#comment100988 Should this also check if _grandchild.get().parent isn't a zombie? - Adam B On Nov. 3, 2014, 4:36 p.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Nov. 3, 2014, 4:36 p.m.) Review request for mesos and Ian Downes. Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check that the subtree has been reparented to a process different from its original parent (a.k.a. child) and that it is not a zombie. Diffs - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
Re: Review Request 27461: fix OsTest.killtreeNoRoot: check for reparent and not zombie
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Nov. 4, 2014, 12:36 a.m.) Review request for mesos and Ian Downes. Changes --- Clarifying comment as per Ian's request. Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check that the subtree has been reparented to a process different from its original parent (a.k.a. child) and that it is not a zombie. Diffs (updated) - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
Re: Review Request 26382: Add source and reason to TaskStatus.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/ --- (Updated Nov. 3, 2014, 4:38 p.m.) Review request for mesos, Vinod Kone and Bill Farner. Bugs: MESOS-1830 and MESOS-343 https://issues.apache.org/jira/browse/MESOS-1830 https://issues.apache.org/jira/browse/MESOS-343 Repository: mesos-git Description --- Added source and reason, enabled TASK_ERROR, and made the changes necessary throughout the codebase. Diffs (updated) - include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 src/examples/java/TestFramework.java bc593d0abfacb00690b1492b2b82c970f4e4de6d src/examples/low_level_scheduler_libprocess.cpp 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 src/examples/low_level_scheduler_pthread.cpp 6e233a10117a1c7aa669806b5b430e746e227ee5 src/examples/no_executor_framework.cpp f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/fault_tolerance_tests.cpp a18a41a3e34ff112e04e693447d757403e5013bd src/tests/master_authorization_tests.cpp 652e80d0d4567b225c6ffb326725ddfde06f7fd3 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/resource_offers_tests.cpp fe66432b1bf75ee25feb73c4bb353e4d4e5b503f Diff: https://reviews.apache.org/r/26382/diff/ Testing --- make check Thanks, Dominic Hamon
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59693 --- Patch looks great! Reviews applied: [27483] All tests passed. - Mesos ReviewBot On Nov. 4, 2014, 12:02 a.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 4, 2014, 12:02 a.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27461: fix OsTest.killtreeNoRoot: check for reparent and not zombie
On Nov. 4, 2014, 12:37 a.m., Adam B wrote: 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp, line 661 https://reviews.apache.org/r/27461/diff/3/?file=747878#file747878line661 Couldn't this be grandchild.children.front()? we'd have to keep a reference to the object rather than the implicit conversion to pid_t for that to work. - Joris --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/#review59692 --- On Nov. 4, 2014, 12:36 a.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Nov. 4, 2014, 12:36 a.m.) Review request for mesos and Ian Downes. Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check that the subtree has been reparented to a process different from its original parent (a.k.a. child) and that it is not a zombie. Diffs - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
Re: Review Request 27461: fix OsTest.killtreeNoRoot: check for reparent and not zombie
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Nov. 4, 2014, 12:49 a.m.) Review request for mesos and Ian Downes. Changes --- Address Adam's comments. Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check that the subtree has been reparented to a process different from its original parent (a.k.a. child) and that it is not a zombie. Diffs (updated) - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
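The check described above (the subtree has moved to a parent other than its original one, and is not a zombie) can be sketched roughly as follows. This is a minimal illustration only: `ProcessSnapshot` and `reparentedAndAlive` are hypothetical names and are not taken from os_tests.cpp or stout.

```cpp
#include <cassert>
#include <sys/types.h>

// Hypothetical snapshot of one process; not the stout os::Process type.
struct ProcessSnapshot {
  pid_t pid;     // The process being inspected (e.g. the grandchild).
  pid_t parent;  // Its parent pid as currently reported by the OS.
  bool zombie;   // True if the process has exited but was not reaped.
};

// Returns true if the process has been reparented away from
// 'originalParent' and is still alive. Note that this deliberately does
// NOT require the new parent to be pid 1, since a user init
// (e.g. 'init --user') may adopt orphans under some other pid.
bool reparentedAndAlive(const ProcessSnapshot& p, pid_t originalParent) {
  return p.parent != originalParent && !p.zombie;
}
```

Comparing against the original parent rather than against pid 1 is what makes the check hold on systems with a user init.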
Re: Review Request 27453: First version of docs/modules.md.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27453/ --- (Updated Nov. 3, 2014, 7:51 p.m.) Review request for mesos, Bernd Mathiske, Niklas Nielsen, and Till Toenshoff. Changes --- Addressed Bernd's comments. Repository: mesos-git Description --- With bits copied from https://cwiki.apache.org/confluence/display/MESOS/Mesos+Modules+Developer+Guide. Here is the url for markdown view: https://github.com/karya0/mesos/blob/modules/docs/modules.md Diffs (updated) - docs/home.md 416a52ed99dba5ba55af97a300ce428355edd199 docs/modules.md PRE-CREATION Diff: https://reviews.apache.org/r/27453/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
On Nov. 4, 2014, 12:12 a.m., Vinod Kone wrote: src/launcher/fetcher.cpp, line 230 https://reviews.apache.org/r/27483/diff/3/?file=747858#file747858line230 Why the fall through to HDFS here? My reasoning is that if the hadoop client is configured with some ftp/ftps/http/https credentials that are not in the URLs, the hadoop client would be able to fetch those resources that the libcurl client can't. I am open to changing this if this line of thinking seems confusing. On Nov. 4, 2014, 12:12 a.m., Vinod Kone wrote: src/launcher/fetcher.cpp, lines 245-246 https://reviews.apache.org/r/27483/diff/3/?file=747858#file747858line245 Seems weird to do this after hdfs fetch. Is HDFS local copy as fast as cp? I would change the semantics as follows: ``` if (local) { return fetchWithLocalCopy(); } if (http* or ftp*) { return fetchWithNet(); } return fetchWithHadoopClient(); ``` I am pretty sure that native cp is faster than hadoop client local copy. I can make this change. - Ankur --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59682 --- On Nov. 4, 2014, 12:02 a.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 4, 2014, 12:02 a.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fallback to the os::net and then a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). 
Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
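The ordering suggested in the review (local copy first, then the net fetcher for http*/ftp*, then the hadoop client for everything else) might look roughly like the sketch below. It only shows the dispatch decision. `FetchMethod`, `schemeOf` and `chooseFetchMethod` are hypothetical names that do not come from src/launcher/fetcher.cpp, and treating "no scheme or file://" as "local" is an assumption made for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical enumeration of the three strategies named in the review.
enum class FetchMethod { LocalCopy, Net, HadoopClient };

// Returns the part before "://", or an empty string if there is none.
std::string schemeOf(const std::string& uri) {
  const std::string::size_type pos = uri.find("://");
  return pos == std::string::npos ? std::string() : uri.substr(0, pos);
}

FetchMethod chooseFetchMethod(const std::string& uri) {
  const std::string scheme = schemeOf(uri);

  // Assumption: a URI without a scheme, or with file://, is local.
  if (scheme.empty() || scheme == "file") {
    return FetchMethod::LocalCopy;
  }

  // "http* or ftp*" in the review, spelled out here as four schemes.
  if (scheme == "http" || scheme == "https" ||
      scheme == "ftp" || scheme == "ftps") {
    return FetchMethod::Net;
  }

  // Anything else is left to the hadoop client, which can handle
  // whatever filesystems are configured in core-site.xml.
  return FetchMethod::HadoopClient;
}
```

With this ordering a plain path never touches the hadoop client, which matches the expectation in the thread that a native cp is faster than a hadoop client local copy.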
Re: Review Request 27461: fix OsTest.killtreeNoRoot: check for reparent and not zombie
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/#review59698 --- Ship it! Ship It! - Adam B On Nov. 3, 2014, 4:49 p.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Nov. 3, 2014, 4:49 p.m.) Review request for mesos and Ian Downes. Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check that the subtree has been reparented to a process different from its original parent (a.k.a. child) and that it is not a zombie. Diffs - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
Review Request 27556: Updated docs/configuration.md to reflect current state.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27556/ --- Review request for mesos, Ian Downes and Till Toenshoff. Repository: mesos-git Description --- Copied the output of --help messages from configure, master, and slave. Diffs - docs/configuration.md 5845ae324181d01cb65990fbf8dd38a621e1c351 Diff: https://reviews.apache.org/r/27556/diff/ Testing --- Pushed and tested output on github at: https://github.com/karya0/mesos/blob/modules/docs/configuration.md Thanks, Kapil Arya
Review Request 27555: Refactored the C++ Resources class to support persistent disk resources.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27555/ --- Review request for mesos, Ben Mahler and Vinod Kone. Repository: mesos-git Description --- The purpose of the refactor is to support persistent disk resources. Here are the main things I've done in this refactor: 1) Resource objects in Resources are stored in minimal format (validated/non-zero). That allows us to kill isAllocatable, allocatable, isZero, etc. 2) 'matches' needs to be split into two pieces: one for combining and one for removing, in order to support persistent disk resources. For example, one cannot combine two Resource objects with DiskInfo (it's like two disks), however, you can do removal if they are identical. 3) Some of the interfaces are not intuitive (e.g., =, see details in the ticket). I removed them in favor of more explicit interfaces. 4) Unified all the validation code. 5) Adjusted the tests accordingly. Diffs - include/mesos/resources.hpp 0e37170 src/cli/execute.cpp ddaa20d src/common/resources.cpp e9a0c85 src/examples/low_level_scheduler_libprocess.cpp 7ef5ea7 src/examples/low_level_scheduler_pthread.cpp 6e233a1 src/examples/no_executor_framework.cpp f98a073 src/examples/test_framework.cpp 187a611 src/master/drf_sorter.cpp 5464900 src/master/hierarchical_allocator_process.hpp 31dfb2c src/master/http.cpp a5e34cc src/master/master.cpp 0a5c9a3 src/tests/allocator_tests.cpp 58e15aa src/tests/gc_tests.cpp f7747e2 src/tests/master_tests.cpp d9dc40c src/tests/mesos.hpp 957e223 src/tests/resource_offers_tests.cpp 060039e src/tests/resources_tests.cpp 3e50889 src/tests/slave_recovery_tests.cpp 4fb357b src/tests/sorter_tests.cpp 0516ab5 Diff: https://reviews.apache.org/r/27555/diff/ Testing --- make check Thanks, Jie Yu
Re: Review Request 27555: Refactored the C++ Resources class to support persistent disk resources.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27555/ --- (Updated Nov. 4, 2014, 12:57 a.m.) Review request for mesos, Ben Mahler and Vinod Kone. Bugs: MESOS-1974 https://issues.apache.org/jira/browse/MESOS-1974 Repository: mesos-git Description --- The purpose of the refactor is to support persistent disk resources. Here are the main things I've done in this refactor: 1) Resource objects in Resources are stored in minimal format (validated/non-zero). That allows us to remove isAllocatable, allocatable, isZero, etc. 2) 'matches' needs to be split into two pieces: one for combining and one for removing, in order to support persistent disk resources. For example, one cannot combine two Resource objects with DiskInfo (it's like two disks); however, you can do removal if they are identical. 3) Some of the interfaces are not intuitive (e.g., =, see details in the ticket). I removed them in favor of more explicit interfaces. 4) Unified all the validation code. 5) Adjusted the tests accordingly. Diffs - include/mesos/resources.hpp 0e37170 src/cli/execute.cpp ddaa20d src/common/resources.cpp e9a0c85 src/examples/low_level_scheduler_libprocess.cpp 7ef5ea7 src/examples/low_level_scheduler_pthread.cpp 6e233a1 src/examples/no_executor_framework.cpp f98a073 src/examples/test_framework.cpp 187a611 src/master/drf_sorter.cpp 5464900 src/master/hierarchical_allocator_process.hpp 31dfb2c src/master/http.cpp a5e34cc src/master/master.cpp 0a5c9a3 src/tests/allocator_tests.cpp 58e15aa src/tests/gc_tests.cpp f7747e2 src/tests/master_tests.cpp d9dc40c src/tests/mesos.hpp 957e223 src/tests/resource_offers_tests.cpp 060039e src/tests/resources_tests.cpp 3e50889 src/tests/slave_recovery_tests.cpp 4fb357b src/tests/sorter_tests.cpp 0516ab5 Diff: https://reviews.apache.org/r/27555/diff/ Testing --- make check Thanks, Jie Yu
Review Request 27557: Pass user into Isolator::prepare.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27557/ --- Review request for mesos and Vinod Kone. Bugs: MESOS-1941 https://issues.apache.org/jira/browse/MESOS-1941 Repository: mesos-git Description --- Pass user into Isolator::prepare. Needed by subsequent review on unprivileged child cgroups. Diffs - src/slave/containerizer/isolator.hpp 4c9d1d87b1ae50a54b7b83707c6da7101bd983d3 src/slave/containerizer/isolator.cpp 69849d2041bd664f59516fd08e2193181c7b2a50 src/slave/containerizer/isolators/cgroups/cpushare.hpp 5d43169e9082c7eef6784a435c9bac91e36fc32b src/slave/containerizer/isolators/cgroups/cpushare.cpp f9531e447bc2b39ee99d6fb0c8f911a70a3498a3 src/slave/containerizer/isolators/cgroups/mem.hpp 25e4afc20563ebc00b70c6e23baf2bbb4854020f src/slave/containerizer/isolators/cgroups/mem.cpp 96bc506bb9dce7a455b443d9e24236babe91d890 src/slave/containerizer/isolators/cgroups/perf_event.hpp 7cb2ba297ef4de2d1de6b0da94ddc0b889ba1cf7 src/slave/containerizer/isolators/cgroups/perf_event.cpp 7ed418a8e8c4a0a68715ffb13660e57afeaef69e src/slave/containerizer/isolators/filesystem/shared.hpp 75172d5a0929652342d6376358424dccb93389bb src/slave/containerizer/isolators/filesystem/shared.cpp 49510b29bdc4f2095140e1155417fd313adc9b82 src/slave/containerizer/isolators/namespaces/pid.hpp 7c40e7730e690e69a3bbef02a46ccb32ebc6badf src/slave/containerizer/isolators/namespaces/pid.cpp edfb1f64fb535d826886922788d95d0850f49b3e src/slave/containerizer/isolators/network/port_mapping.hpp f9215b276459692d0b0097289509854988b70fb9 src/slave/containerizer/isolators/network/port_mapping.cpp 14fae1f00050afbc6b99f4aabf868a2d75774b15 src/slave/containerizer/isolators/posix.hpp 7e02f92f1b901ec4fad05571a5d3d199624fb535 src/slave/containerizer/mesos/containerizer.cpp d4b08f54d6feb453f3a9d27ca54c867176e62102 src/tests/isolator.hpp d8f3f09cae9c2aabd889ab7625691e6b44b47dba src/tests/isolator_tests.cpp 4d03f46cbe20c0e7aff15bab0a26d7738b55aab9 src/tests/port_mapping_tests.cpp 1a5e52c8151e6fa7b55263d41e175f1b0d8238a2 Diff: https://reviews.apache.org/r/27557/diff/ Testing --- make check Thanks, Ian Downes
Re: Review Request 27531: Update Master and Slave metrics to match task source and reason scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- (Updated Nov. 3, 2014, 4:59 p.m.) Review request for mesos, Tobias Weingartner and Vinod Kone. Changes --- add slave and limit to explicit metrics. Summary (updated) - Update Master and Slave metrics to match task source and reason scheme. Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. Diffs (updated) - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/slave/slave.hpp eb5de736ba63dfab5bf048d7a09462f0cff9aaea src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/slave_tests.cpp a1bd00cffe9de178b0b188c3556e479cd1f6d566 Diff: https://reviews.apache.org/r/27531/diff/ Testing (updated) --- make check run master and check endpoint: { ... 
master/reconciliation_states/task_error: 0, master/reconciliation_states/task_failed: 0, master/reconciliation_states/task_finished: 0, master/reconciliation_states/task_killed: 0, master/reconciliation_states/task_lost: 0, master/reconciliation_states/task_running: 0, master/reconciliation_states/task_staging: 0, master/reconciliation_states/task_starting: 0, master/recovery_slave_removals: 0, master/slave_registrations: 1, master/slave_removals: 0, master/slave_reregistrations: 0, master/slaves_active: 1, master/slaves_connected: 1, master/slaves_disconnected: 0, master/slaves_inactive: 0, master/tasks_failed: 0, master/tasks_finished: 0, master/tasks_killed: 0, master/tasks_lost: 0, master/tasks_lost/reason_executor_terminated: 0, master/tasks_lost/reason_executor_unregistered: 0, master/tasks_lost/reason_framework_removed: 0, master/tasks_lost/reason_gc_error: 0, master/tasks_lost/reason_invalid_command: 0, master/tasks_lost/reason_invalid_frameworkid: 0, master/tasks_lost/reason_invalid_offers: 0, master/tasks_lost/reason_master_disconnected: 0, master/tasks_lost/reason_memory_limit: 0, master/tasks_lost/reason_reconciliation: 0, master/tasks_lost/reason_slave_disconnected: 0, master/tasks_lost/reason_slave_removed: 0, master/tasks_lost/reason_slave_restarted: 0, master/tasks_lost/reason_slave_unknown: 0, master/tasks_lost/reason_task_invalid: 0, master/tasks_lost/reason_task_unauthorized: 0, master/tasks_lost/reason_task_unknown: 0, ... } and slave: { ... 
slave/tasks_lost: 0, slave/tasks_lost/reason_executor_terminated: 0, slave/tasks_lost/reason_executor_unregistered: 0, slave/tasks_lost/reason_framework_removed: 0, slave/tasks_lost/reason_gc_error: 0, slave/tasks_lost/reason_invalid_command: 0, slave/tasks_lost/reason_invalid_frameworkid: 0, slave/tasks_lost/reason_invalid_offers: 0, slave/tasks_lost/reason_master_disconnected: 0, slave/tasks_lost/reason_memory_limit: 0, slave/tasks_lost/reason_reconciliation: 0, slave/tasks_lost/reason_slave_disconnected: 0, slave/tasks_lost/reason_slave_removed: 0, slave/tasks_lost/reason_slave_restarted: 0, slave/tasks_lost/reason_slave_unknown: 0, slave/tasks_lost/reason_task_invalid: 0, slave/tasks_lost/reason_task_unauthorized: 0, slave/tasks_lost/reason_task_unknown: 0, slave/tasks_running: 0, slave/tasks_staging: 0, slave/tasks_starting: 0, ... } Thanks, Dominic Hamon
Review Request 27558: Permit unprivileged executors to create child cgroups.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27558/ --- Review request for mesos and Vinod Kone. Bugs: MESOS-1941 https://issues.apache.org/jira/browse/MESOS-1941 Repository: mesos-git Description --- MesosContainerizer Isolators using cgroups will chown the cgroup to the executor user, if specified, enabling the executor to create child cgroups. Cgroup control files are not chowned so the executor cannot modify its own cgroups, only control files for child cgroups. Diffs - src/slave/containerizer/isolators/cgroups/cpushare.cpp f9531e447bc2b39ee99d6fb0c8f911a70a3498a3 src/slave/containerizer/isolators/cgroups/mem.cpp 96bc506bb9dce7a455b443d9e24236babe91d890 src/slave/containerizer/isolators/cgroups/perf_event.cpp 7ed418a8e8c4a0a68715ffb13660e57afeaef69e src/tests/isolator_tests.cpp 4d03f46cbe20c0e7aff15bab0a26d7738b55aab9 Diff: https://reviews.apache.org/r/27558/diff/ Testing --- # New test added make check Thanks, Ian Downes
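The ownership rule in the description (chown the cgroup itself, leave its control files alone) comes down to a non-recursive chown of the cgroup directory. The following is a minimal sketch using plain POSIX calls, not the code from the patch; the helper names `chownCgroupDirectory` and `ownerOf` are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: changes the owner of the cgroup directory only.
// The call is deliberately non-recursive, so files inside the directory
// (the control files, in a real cgroup) keep their current owner.
// Passing (gid_t)-1 leaves the group unchanged, as specified for chown(2).
bool chownCgroupDirectory(const std::string& cgroup, uid_t uid, gid_t gid) {
  return ::chown(cgroup.c_str(), uid, gid) == 0;
}

// Hypothetical helper: stores the owning uid of 'path' in '*uid' and
// returns true, or returns false if the path cannot be stat'ed.
bool ownerOf(const std::string& path, uid_t* uid) {
  struct stat s;
  if (::stat(path.c_str(), &s) != 0) {
    return false;
  }
  *uid = s.st_uid;
  return true;
}
```

Because the directory is owned by the executor user, that user can create child directories (child cgroups) in it; because the existing control files are still owned by their original owner, the executor cannot rewrite the limits of its own cgroup.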
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 4, 2014, 1 a.m.) Review request for mesos and Timothy St. Clair. Changes --- * Change HDFS to hadoop client. * Changed method names to better reflect usage of hdfs client. * Code/comment style change as per review feedback and [style guide](http://mesos.apache.org/documentation/latest/mesos-c++-style-guide/). * Changed the order of fetchers to local - libcurl - hdfs. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fall back to os::net and then to a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs (updated) - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
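The fallback behaviour described above amounts to trying an ordered list of fetchers until one succeeds. A minimal sketch of that pattern follows; it is not the code in src/launcher/fetcher.cpp. The `Fetcher` type, `fetchWithFallback`, and `exampleChain` are hypothetical names, and the three stub fetchers only inspect the URI prefix instead of invoking the hadoop client, os::net, or a file copy. The order is a property of the list, so either ordering mentioned in this review can be expressed by reordering the entries.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical fetcher signature: returns true if it could fetch 'uri'.
typedef std::function<bool(const std::string&)> Fetcher;
typedef std::vector<std::pair<std::string, Fetcher> > FetcherChain;

// Tries each fetcher in order and returns the name of the first one that
// succeeds, or an empty string if none of them could handle the URI.
std::string fetchWithFallback(const FetcherChain& chain,
                              const std::string& uri) {
  for (const auto& entry : chain) {
    if (entry.second(uri)) {
      return entry.first;
    }
  }
  return "";
}

bool startsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Illustrative chain in the order given in the description. The stubs are
// assumptions: a real hadoop fetcher would succeed for whatever schemes
// core-site.xml configures, not a fixed pair of prefixes.
FetcherChain exampleChain() {
  FetcherChain chain;
  chain.push_back(std::make_pair(std::string("hadoop"),
      Fetcher([](const std::string& uri) {
        return startsWith(uri, "hdfs://") || startsWith(uri, "s3n://");
      })));
  chain.push_back(std::make_pair(std::string("net"),
      Fetcher([](const std::string& uri) {
        return startsWith(uri, "http://");
      })));
  chain.push_back(std::make_pair(std::string("local"),
      Fetcher([](const std::string& uri) {
        return startsWith(uri, "/");
      })));
  return chain;
}
```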
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 4, 2014, 1:04 a.m.) Review request for mesos and Timothy St. Clair. Changes --- Fix spacing Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine what URIs could be fetched by hadoop (if available). This is now changed such that we first check if hadoop can fetch them for us and then we fall back to os::net and then to a local copy method (same as it used to be). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs (updated) - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
Re: Review Request 27555: Refactored the C++ Resources class to support persistent disk resources.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27555/ --- (Updated Nov. 4, 2014, 1:19 a.m.) Review request for mesos, Ben Mahler and Vinod Kone. Bugs: MESOS-1974 https://issues.apache.org/jira/browse/MESOS-1974 Repository: mesos-git Description --- The purpose of the refactor is to support persistent disk resources. Here are the main things I've done in this refactor: 1) Resource objects in Resources are stored in minimal format (validated/non-zero). That allows us to remove isAllocatable, allocatable, isZero, etc. 2) 'matches' needs to be split into two pieces: one for combining and one for removing, in order to support persistent disk resources. For example, one cannot combine two Resource objects with DiskInfo (it's like two disks); however, you can do removal if they are identical. 3) Some of the interfaces are not intuitive (e.g., =, see details in the ticket). I removed them in favor of more explicit interfaces. 4) Unified all the validation code. 5) Adjusted the tests accordingly. Diffs - include/mesos/resources.hpp 0e37170 src/cli/execute.cpp ddaa20d src/common/resources.cpp e9a0c85 src/examples/low_level_scheduler_libprocess.cpp 7ef5ea7 src/examples/low_level_scheduler_pthread.cpp 6e233a1 src/examples/no_executor_framework.cpp f98a073 src/examples/test_framework.cpp 187a611 src/master/drf_sorter.cpp 5464900 src/master/hierarchical_allocator_process.hpp 31dfb2c src/master/http.cpp a5e34cc src/master/master.cpp 0a5c9a3 src/tests/allocator_tests.cpp 58e15aa src/tests/gc_tests.cpp f7747e2 src/tests/master_tests.cpp d9dc40c src/tests/mesos.hpp 957e223 src/tests/resource_offers_tests.cpp 060039e src/tests/resources_tests.cpp 3e50889 src/tests/slave_recovery_tests.cpp 4fb357b src/tests/sorter_tests.cpp 0516ab5 Diff: https://reviews.apache.org/r/27555/diff/ Testing --- make check Thanks, Jie Yu
Jenkins build is back to normal : Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2240
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2240/changes
Re: Review Request 27556: Updated docs/configuration.md to reflect current state.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27556/ --- (Updated Nov. 3, 2014, 8:35 p.m.) Review request for mesos, Ian Downes and Till Toenshoff. Changes --- Updated --isolation flag. Rebased diff; added bug Bugs: MESOS-2037 https://issues.apache.org/jira/browse/MESOS-2037 Repository: mesos-git Description --- Copied the output of --help messages from configure, master, and slave. Diffs (updated) - docs/configuration.md 5845ae324181d01cb65990fbf8dd38a621e1c351 Diff: https://reviews.apache.org/r/27556/diff/ Testing --- Pushed and tested output on github at: https://github.com/karya0/mesos/blob/modules/docs/configuration.md Thanks, Kapil Arya
Review Request 27560: Updated --isolation help message for slave.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/ --- Review request for mesos, Niklas Nielsen and Till Toenshoff. Repository: mesos-git Description --- Now also mentions the --modules flag. Diffs - src/slave/flags.hpp 838ca85d2452e7ce643f62b072670ca5cfa4ec4f src/tests/flags.hpp 0b300835545736e8a2535afd1de293511100ceb8 Diff: https://reviews.apache.org/r/27560/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27453: First version of docs/modules.md.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27453/#review59702 --- Ship it! Looks great. Just some minor nits here and there :) docs/home.md https://reviews.apache.org/r/27453/#comment101007 master, slave and tests. ? docs/modules.md https://reviews.apache.org/r/27453/#comment100999 Not sure this is needed. Mesos as a whole is work in progress. Maybe stating that this was the initial version would be cleaner. But I dont have a strong opinion on that. docs/modules.md https://reviews.apache.org/r/27453/#comment101000 (and tests)? Also s/slace/slave/ docs/modules.md https://reviews.apache.org/r/27453/#comment101001 atleast - not sure that word exists. :) docs/modules.md https://reviews.apache.org/r/27453/#comment101002 Add a blank line. docs/modules.md https://reviews.apache.org/r/27453/#comment101003 Even though our style demands two blank lines here, I think for the sake of this document we should leave as is. docs/modules.md https://reviews.apache.org/r/27453/#comment101004 i.e., - is that correct? docs/modules.md https://reviews.apache.org/r/27453/#comment101005 ... module selected for isolation: ? docs/modules.md https://reviews.apache.org/r/27453/#comment101006 src/module/manager.cpp docs/modules.md https://reviews.apache.org/r/27453/#comment100997 This should be libraries docs/modules.md https://reviews.apache.org/r/27453/#comment100998 This should be parameters - Till Toenshoff On Nov. 4, 2014, 12:51 a.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27453/ --- (Updated Nov. 4, 2014, 12:51 a.m.) Review request for mesos, Bernd Mathiske, Niklas Nielsen, and Till Toenshoff. Repository: mesos-git Description --- With bits copied from https://cwiki.apache.org/confluence/display/MESOS/Mesos+Modules+Developer+Guide. 
Here is the url for markdown view: https://github.com/karya0/mesos/blob/modules/docs/modules.md Diffs - docs/home.md 416a52ed99dba5ba55af97a300ce428355edd199 docs/modules.md PRE-CREATION Diff: https://reviews.apache.org/r/27453/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27550: Added DiskInfo to the Resource protobuf.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27550/#review59703 --- Patch looks great! Reviews applied: [27550] All tests passed. - Mesos ReviewBot On Nov. 4, 2014, 12:29 a.m., Jie Yu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27550/ --- (Updated Nov. 4, 2014, 12:29 a.m.) Review request for mesos, Ben Mahler and Vinod Kone. Repository: mesos-git Description --- See summary. Diffs - include/mesos/mesos.proto 6b93e9000761857c4f335f2a8c8088e155078f54 Diff: https://reviews.apache.org/r/27550/diff/ Testing --- make check Thanks, Jie Yu
Re: Review Request 27560: Updated --isolation help message for slave.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/#review59704 --- Ship it! src/slave/flags.hpp https://reviews.apache.org/r/27560/#comment101008 Not sure we need a capital I for isolator :) - Till Toenshoff On Nov. 4, 2014, 1:34 a.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/ --- (Updated Nov. 4, 2014, 1:34 a.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Repository: mesos-git Description --- Now also mentions the --modules flag. Diffs - src/slave/flags.hpp 838ca85d2452e7ce643f62b072670ca5cfa4ec4f src/tests/flags.hpp 0b300835545736e8a2535afd1de293511100ceb8 Diff: https://reviews.apache.org/r/27560/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27560: Updated --isolation help message for slave.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/ --- (Updated Nov. 3, 2014, 8:42 p.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Changes --- Fixed a capitalization. Repository: mesos-git Description --- Now also mentions the --modules flag. Diffs (updated) - src/slave/flags.hpp 838ca85d2452e7ce643f62b072670ca5cfa4ec4f src/tests/flags.hpp 0b300835545736e8a2535afd1de293511100ceb8 Diff: https://reviews.apache.org/r/27560/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27556: Updated docs/configuration.md to reflect current state.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27556/ --- (Updated Nov. 3, 2014, 8:43 p.m.) Review request for mesos, Ian Downes and Till Toenshoff. Changes --- Rebased. Bugs: MESOS-2037 https://issues.apache.org/jira/browse/MESOS-2037 Repository: mesos-git Description --- Copied the output of --help messages from configure, master, and slave. Diffs (updated) - docs/configuration.md 5845ae324181d01cb65990fbf8dd38a621e1c351 Diff: https://reviews.apache.org/r/27556/diff/ Testing --- Pushed and tested output on github at: https://github.com/karya0/mesos/blob/modules/docs/configuration.md Thanks, Kapil Arya
Re: Review Request 27560: Updated --isolation help message for slave.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/#review59706 --- Ship it! Ship It! - Adam B On Nov. 3, 2014, 5:42 p.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/ --- (Updated Nov. 3, 2014, 5:42 p.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Repository: mesos-git Description --- Now also mentions the --modules flag. Diffs - src/slave/flags.hpp 838ca85d2452e7ce643f62b072670ca5cfa4ec4f src/tests/flags.hpp 0b300835545736e8a2535afd1de293511100ceb8 Diff: https://reviews.apache.org/r/27560/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27560: Updated --isolation help message for slave.
On Nov. 3, 2014, 5:47 p.m., Adam B wrote: Ship It! What JIRA is this associated with? - Adam --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/#review59706 --- On Nov. 3, 2014, 5:42 p.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/ --- (Updated Nov. 3, 2014, 5:42 p.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Repository: mesos-git Description --- Now also mentions the --modules flag. Diffs - src/slave/flags.hpp 838ca85d2452e7ce643f62b072670ca5cfa4ec4f src/tests/flags.hpp 0b300835545736e8a2535afd1de293511100ceb8 Diff: https://reviews.apache.org/r/27560/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27560: Updated --isolation help message for slave.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/#review59709 --- src/tests/flags.hpp https://reviews.apache.org/r/27560/#comment101012 Are we missing the default here? - Till Toenshoff On Nov. 4, 2014, 1:42 a.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/ --- (Updated Nov. 4, 2014, 1:42 a.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Repository: mesos-git Description --- Now also mentions the --modules flag. Diffs - src/slave/flags.hpp 838ca85d2452e7ce643f62b072670ca5cfa4ec4f src/tests/flags.hpp 0b300835545736e8a2535afd1de293511100ceb8 Diff: https://reviews.apache.org/r/27560/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27560: Updated --isolation help message for slave.
On Nov. 4, 2014, 1:47 a.m., Adam B wrote: Ship It! Adam B wrote: What JIRA is this associated with? Given the tinyness of this correction, I mentioned to Kapil that we might be fine without one - or are we? - Till --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/#review59706 --- On Nov. 4, 2014, 1:42 a.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/ --- (Updated Nov. 4, 2014, 1:42 a.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Repository: mesos-git Description --- Now also mentions the --modules flag. Diffs - src/slave/flags.hpp 838ca85d2452e7ce643f62b072670ca5cfa4ec4f src/tests/flags.hpp 0b300835545736e8a2535afd1de293511100ceb8 Diff: https://reviews.apache.org/r/27560/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27453: First version of docs/modules.md.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27453/ --- (Updated Nov. 3, 2014, 8:55 p.m.) Review request for mesos, Bernd Mathiske, Niklas Nielsen, and Till Toenshoff. Changes --- Addressed Till's comments. Repository: mesos-git Description --- With bits copied from https://cwiki.apache.org/confluence/display/MESOS/Mesos+Modules+Developer+Guide. Here is the url for markdown view: https://github.com/karya0/mesos/blob/modules/docs/modules.md Diffs (updated) - docs/home.md 416a52ed99dba5ba55af97a300ce428355edd199 docs/modules.md PRE-CREATION Diff: https://reviews.apache.org/r/27453/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27560: Updated --isolation help message for slave.
On Nov. 3, 2014, 5:47 p.m., Adam B wrote: Ship It! Adam B wrote: What JIRA is this associated with? Till Toenshoff wrote: Given the tinyness of this correction, I mentioned to Kapil that we might be fine without one - or are we? Do you want this to go into 0.21? A JIRA's probably the best way to target it as such and bring it to the release manager's attention. - Adam --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/#review59706 --- On Nov. 3, 2014, 5:42 p.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27560/ --- (Updated Nov. 3, 2014, 5:42 p.m.) Review request for mesos, Niklas Nielsen and Till Toenshoff. Repository: mesos-git Description --- Now also mentions the --modules flag. Diffs - src/slave/flags.hpp 838ca85d2452e7ce643f62b072670ca5cfa4ec4f src/tests/flags.hpp 0b300835545736e8a2535afd1de293511100ceb8 Diff: https://reviews.apache.org/r/27560/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27558: Permit unprivileged executors to create child cgroups.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27558/#review59713 --- Ship it! src/tests/isolator_tests.cpp https://reviews.apache.org/r/27558/#comment101017 maybe a bigger sleep interval to account for slower machines? src/tests/isolator_tests.cpp https://reviews.apache.org/r/27558/#comment101019 ASSERT_SOME on status? - Vinod Kone On Nov. 4, 2014, 1 a.m., Ian Downes wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27558/ --- (Updated Nov. 4, 2014, 1 a.m.) Review request for mesos and Vinod Kone. Bugs: MESOS-1941 https://issues.apache.org/jira/browse/MESOS-1941 Repository: mesos-git Description --- MesosContainerizer Isolators using cgroups will chown the cgroup to the executor user, if specified, enabling the executor to create child cgroups. Cgroup control files are not chowned so the executor cannot modify its own cgroups, only control files for child cgroups. Diffs - src/slave/containerizer/isolators/cgroups/cpushare.cpp f9531e447bc2b39ee99d6fb0c8f911a70a3498a3 src/slave/containerizer/isolators/cgroups/mem.cpp 96bc506bb9dce7a455b443d9e24236babe91d890 src/slave/containerizer/isolators/cgroups/perf_event.cpp 7ed418a8e8c4a0a68715ffb13660e57afeaef69e src/tests/isolator_tests.cpp 4d03f46cbe20c0e7aff15bab0a26d7738b55aab9 Diff: https://reviews.apache.org/r/27558/diff/ Testing --- # New test added make check Thanks, Ian Downes
Re: Review Request 27557: Pass user into Isolator::prepare.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27557/#review59714 --- Ship it! Ship It! - Vinod Kone On Nov. 4, 2014, 12:58 a.m., Ian Downes wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27557/ --- (Updated Nov. 4, 2014, 12:58 a.m.) Review request for mesos and Vinod Kone. Bugs: MESOS-1941 https://issues.apache.org/jira/browse/MESOS-1941 Repository: mesos-git Description --- Pass user into Isolator::prepare. Needed by subsequent review on unprivileged child cgroups. Diffs - src/slave/containerizer/isolator.hpp 4c9d1d87b1ae50a54b7b83707c6da7101bd983d3 src/slave/containerizer/isolator.cpp 69849d2041bd664f59516fd08e2193181c7b2a50 src/slave/containerizer/isolators/cgroups/cpushare.hpp 5d43169e9082c7eef6784a435c9bac91e36fc32b src/slave/containerizer/isolators/cgroups/cpushare.cpp f9531e447bc2b39ee99d6fb0c8f911a70a3498a3 src/slave/containerizer/isolators/cgroups/mem.hpp 25e4afc20563ebc00b70c6e23baf2bbb4854020f src/slave/containerizer/isolators/cgroups/mem.cpp 96bc506bb9dce7a455b443d9e24236babe91d890 src/slave/containerizer/isolators/cgroups/perf_event.hpp 7cb2ba297ef4de2d1de6b0da94ddc0b889ba1cf7 src/slave/containerizer/isolators/cgroups/perf_event.cpp 7ed418a8e8c4a0a68715ffb13660e57afeaef69e src/slave/containerizer/isolators/filesystem/shared.hpp 75172d5a0929652342d6376358424dccb93389bb src/slave/containerizer/isolators/filesystem/shared.cpp 49510b29bdc4f2095140e1155417fd313adc9b82 src/slave/containerizer/isolators/namespaces/pid.hpp 7c40e7730e690e69a3bbef02a46ccb32ebc6badf src/slave/containerizer/isolators/namespaces/pid.cpp edfb1f64fb535d826886922788d95d0850f49b3e src/slave/containerizer/isolators/network/port_mapping.hpp f9215b276459692d0b0097289509854988b70fb9 src/slave/containerizer/isolators/network/port_mapping.cpp 14fae1f00050afbc6b99f4aabf868a2d75774b15 src/slave/containerizer/isolators/posix.hpp 7e02f92f1b901ec4fad05571a5d3d199624fb535 
src/slave/containerizer/mesos/containerizer.cpp d4b08f54d6feb453f3a9d27ca54c867176e62102 src/tests/isolator.hpp d8f3f09cae9c2aabd889ab7625691e6b44b47dba src/tests/isolator_tests.cpp 4d03f46cbe20c0e7aff15bab0a26d7738b55aab9 src/tests/port_mapping_tests.cpp 1a5e52c8151e6fa7b55263d41e175f1b0d8238a2 Diff: https://reviews.apache.org/r/27557/diff/ Testing --- make check Thanks, Ian Downes
Re: Review Request 27453: First version of docs/modules.md.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27453/ --- (Updated Nov. 3, 2014, 9:04 p.m.) Review request for mesos, Bernd Mathiske, Niklas Nielsen, and Till Toenshoff. Changes --- Added bug ids. Bugs: MESOS-1937, MESOS-1950 and MESOS-1981 https://issues.apache.org/jira/browse/MESOS-1937 https://issues.apache.org/jira/browse/MESOS-1950 https://issues.apache.org/jira/browse/MESOS-1981 Repository: mesos-git Description --- With bits copied from https://cwiki.apache.org/confluence/display/MESOS/Mesos+Modules+Developer+Guide. Here is the url for markdown view: https://github.com/karya0/mesos/blob/modules/docs/modules.md Diffs - docs/home.md 416a52ed99dba5ba55af97a300ce428355edd199 docs/modules.md PRE-CREATION Diff: https://reviews.apache.org/r/27453/diff/ Testing --- Thanks, Kapil Arya
Re: Review Request 27461: fix OsTest.killtreeNoRoot: check for reparent and not zombie
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/#review59715 --- Patch looks great! Reviews applied: [27461] All tests passed. - Mesos ReviewBot On Nov. 4, 2014, 12:49 a.m., Joris Van Remoortere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27461/ --- (Updated Nov. 4, 2014, 12:49 a.m.) Review request for mesos and Ian Downes. Bugs: MESOS-2025 https://issues.apache.org/jira/browse/MESOS-2025 Repository: mesos-git Description --- Reparenting does not always assign pid 1 (/sbin/init). If there is a user init such as init --user with some other pid, this will be the new parent. Modify os_tests to check that the subtree has been reparented to a process different from its original parent (a.k.a. child) and that it is not a zombie. Diffs - 3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 3f39017 Diff: https://reviews.apache.org/r/27461/diff/ Testing --- make check Thanks, Joris Van Remoortere
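The revised check described in this review (the subtree has a new parent, which need not be pid 1, and the process is not a zombie) can be written as a small predicate. This is a minimal sketch, not the code in os_tests.cpp: the `ProcessInfo` struct and `reparentedAndAlive` are hypothetical, and the state character is assumed to follow the Linux /proc/<pid>/stat convention where 'Z' means zombie.

```cpp
#include <cassert>

#include <sys/types.h>

// Hypothetical snapshot of a process; 'state' is assumed to use the
// Linux /proc/<pid>/stat letters ('R', 'S', 'Z', ...).
struct ProcessInfo {
  pid_t pid;
  pid_t parent;
  char state;
};

// True if the process now has a parent other than its original one and
// is not a zombie. The new parent is intentionally not compared against
// pid 1, since a user-level init with some other pid may adopt it.
bool reparentedAndAlive(const ProcessInfo& process, pid_t originalParent) {
  return process.parent != originalParent && process.state != 'Z';
}
```

For example, a grandchild whose original parent was pid 4000 passes the check whether it was adopted by pid 1 or by a user init with some other pid, and fails if it still reports pid 4000 as parent or has become a zombie.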
Re: Review Request 27531: Update Master and Slave metrics to match task source and reason scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/#review59717 --- Bad patch! Reviews applied: [26817, 26382, 27531] Failed command: ./support/apply-review.sh -n -r 27531 Error: 2014-11-04 02:42:09 URL:https://reviews.apache.org/r/27531/diff/raw/ [10523/10523] - 27531.patch [1] error: patch failed: src/master/master.hpp:677 error: src/master/master.hpp: patch does not apply Failed to apply patch - Mesos ReviewBot On Nov. 4, 2014, 12:59 a.m., Dominic Hamon wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27531/ --- (Updated Nov. 4, 2014, 12:59 a.m.) Review request for mesos, Tobias Weingartner and Vinod Kone. Bugs: MESOS-1830 https://issues.apache.org/jira/browse/MESOS-1830 Repository: mesos-git Description --- Update metrics in Master to match the source and reason split for task statuses. Diffs - src/master/master.hpp b1a2cd0f51f89d6dabbccaa67e0411fc55a4272f src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd src/slave/slave.hpp eb5de736ba63dfab5bf048d7a09462f0cff9aaea src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 src/tests/slave_tests.cpp a1bd00cffe9de178b0b188c3556e479cd1f6d566 Diff: https://reviews.apache.org/r/27531/diff/ Testing --- make check run master and check endpoint: { ... 
master/reconciliation_states/task_error: 0, master/reconciliation_states/task_failed: 0, master/reconciliation_states/task_finished: 0, master/reconciliation_states/task_killed: 0, master/reconciliation_states/task_lost: 0, master/reconciliation_states/task_running: 0, master/reconciliation_states/task_staging: 0, master/reconciliation_states/task_starting: 0, master/recovery_slave_removals: 0, master/slave_registrations: 1, master/slave_removals: 0, master/slave_reregistrations: 0, master/slaves_active: 1, master/slaves_connected: 1, master/slaves_disconnected: 0, master/slaves_inactive: 0, master/tasks_failed: 0, master/tasks_finished: 0, master/tasks_killed: 0, master/tasks_lost: 0, master/tasks_lost/reason_executor_terminated: 0, master/tasks_lost/reason_executor_unregistered: 0, master/tasks_lost/reason_framework_removed: 0, master/tasks_lost/reason_gc_error: 0, master/tasks_lost/reason_invalid_command: 0, master/tasks_lost/reason_invalid_frameworkid: 0, master/tasks_lost/reason_invalid_offers: 0, master/tasks_lost/reason_master_disconnected: 0, master/tasks_lost/reason_memory_limit: 0, master/tasks_lost/reason_reconciliation: 0, master/tasks_lost/reason_slave_disconnected: 0, master/tasks_lost/reason_slave_removed: 0, master/tasks_lost/reason_slave_restarted: 0, master/tasks_lost/reason_slave_unknown: 0, master/tasks_lost/reason_task_invalid: 0, master/tasks_lost/reason_task_unauthorized: 0, master/tasks_lost/reason_task_unknown: 0, ... } and slave: { ... 
slave/tasks_lost: 0, slave/tasks_lost/reason_executor_terminated: 0, slave/tasks_lost/reason_executor_unregistered: 0, slave/tasks_lost/reason_framework_removed: 0, slave/tasks_lost/reason_gc_error: 0, slave/tasks_lost/reason_invalid_command: 0, slave/tasks_lost/reason_invalid_frameworkid: 0, slave/tasks_lost/reason_invalid_offers: 0, slave/tasks_lost/reason_master_disconnected: 0, slave/tasks_lost/reason_memory_limit: 0, slave/tasks_lost/reason_reconciliation: 0, slave/tasks_lost/reason_slave_disconnected: 0, slave/tasks_lost/reason_slave_removed: 0, slave/tasks_lost/reason_slave_restarted: 0, slave/tasks_lost/reason_slave_unknown: 0, slave/tasks_lost/reason_task_invalid: 0, slave/tasks_lost/reason_task_unauthorized: 0, slave/tasks_lost/reason_task_unknown: 0, slave/tasks_running: 0, slave/tasks_staging: 0, slave/tasks_starting: 0, ... } Thanks, Dominic Hamon
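The key layout visible in the endpoint output above (`<component>/tasks_<state>` with an optional `/reason_<reason>` suffix) can be taken apart with a small helper. This is only a sketch based on the keys shown in this message; `TaskMetricKey` and `parseTaskMetricKey` are hypothetical names and are not part of the patch.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Pieces of a task metric key such as "master/tasks_lost/reason_gc_error".
struct TaskMetricKey {
  std::string component;  // "master" or "slave".
  std::string state;      // For example "lost".
  std::string reason;     // For example "gc_error"; empty if absent.
};

// Returns true and fills `out` if `key` follows the layout
//   <component>/tasks_<state>[/reason_<reason>]
// and returns false (leaving `out` untouched) for any other key.
bool parseTaskMetricKey(const std::string& key, TaskMetricKey* out) {
  assert(out != nullptr);

  // Split the key on '/'.
  std::vector<std::string> parts;
  std::istringstream stream(key);
  std::string part;
  while (std::getline(stream, part, '/')) {
    parts.push_back(part);
  }

  const std::string tasks = "tasks_";
  const std::string reason = "reason_";

  if (parts.size() < 2 || parts.size() > 3 ||
      parts[1].compare(0, tasks.size(), tasks) != 0) {
    return false;
  }

  TaskMetricKey parsed;
  parsed.component = parts[0];
  parsed.state = parts[1].substr(tasks.size());

  if (parts.size() == 3) {
    if (parts[2].compare(0, reason.size(), reason) != 0) {
      return false;
    }
    parsed.reason = parts[2].substr(reason.size());
  }

  *out = parsed;
  return true;
}
```

Keys outside this layout, such as `master/slave_registrations`, are rejected rather than guessed at.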
Re: Review Request 27483: Fetcher uses hadoop to fetch URIs regardless of the url scheme.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/#review59723 --- Patch looks great! Reviews applied: [27483] All tests passed. - Mesos ReviewBot On Nov. 4, 2014, 1:04 a.m., Ankur Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27483/ --- (Updated Nov. 4, 2014, 1:04 a.m.) Review request for mesos and Timothy St. Clair. Bugs: MESOS-1711 https://issues.apache.org/jira/browse/MESOS-1711 Repository: mesos-git Description --- Previously, the fetcher used a hardcoded list of schemes to determine which URIs could be fetched by hadoop (if available). This has changed: we now first check whether hadoop can fetch the URI for us, then fall back to os::net, and then to a local copy method (same as before). This allows users to fetch artifacts from arbitrary filesystems as long as hadoop is correctly configured (in core-site.xml). Diffs - src/hdfs/hdfs.hpp bbfeddef106c598d8379ced085ef0605c4b2f380 src/launcher/fetcher.cpp 9323c28237010fa065ef34d74435c151ded530a8 Diff: https://reviews.apache.org/r/27483/diff/ Testing --- make check sudo bin/mesos-tests.sh --verbose support/mesos-style.py Thanks, Ankur Chauhan
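The fallback order in the description (hadoop first, then os::net, then a local copy) can be sketched as an ordered chain of strategies. This is a minimal illustration under stated assumptions, not the code in src/launcher/fetcher.cpp: the `Fetcher` alias and `fetchWithFallback` are hypothetical, and each strategy is a caller-supplied callable rather than a real hadoop or network client.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A fetch strategy: returns true if it managed to fetch `uri` into
// `directory`. Real strategies would invoke the hadoop client, download
// over the network, or copy a local file; here they are supplied by the
// caller.
using Fetcher =
    std::function<bool(const std::string& uri, const std::string& directory)>;

// Tries each named strategy in order and returns the name of the first
// one that succeeds, or an empty string if none does. No URI scheme is
// inspected up front: every strategy simply gets its turn.
std::string fetchWithFallback(
    const std::vector<std::pair<std::string, Fetcher>>& strategies,
    const std::string& uri,
    const std::string& directory) {
  for (const auto& strategy : strategies) {
    assert(strategy.second);  // An empty std::function cannot be called.
    if (strategy.second(uri, directory)) {
      return strategy.first;
    }
  }
  return "";
}
```

Ordering the chain as hadoop, network, local copy reproduces the behaviour described above: which filesystems are reachable depends on how the first strategy is configured, not on a hardcoded scheme list.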
Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2242
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2242/changes Changes: [toenshoff] Updated --isolation help message for slave. -- [...truncated 5243 lines...] I1104 03:37:55.460032 28381 master.cpp:2337] Processing reply for offers: [ 20141104-033755-3176252227-49988-28363-O0 ] on slave 20141104-033755-3176252227-49988-28363-S0 at slave(1)@67.195.81.189:49988 (proserpina.apache.org) for framework 20141104-033755-3176252227-49988-28363- (default) at scheduler-612ac10f-0976-46e5-9df6-6d51c6c3b683@67.195.81.189:49988 W1104 03:37:55.460283 28381 master.cpp:1985] Executor default for task 0 uses less CPUs (None) than the minimum required (0.01). Please update your executor, as this will be mandatory in future releases. W1104 03:37:55.460320 28381 master.cpp:1996] Executor default for task 0 uses less memory (None) than the minimum required (32MB). Please update your executor, as this will be mandatory in future releases. I1104 03:37:55.460546 28381 master.cpp:2433] Authorizing framework principal 'test-principal' to launch task 0 as user 'jenkins' I1104 03:37:55.462381 28392 master.hpp:877] Adding task 0 with resources cpus(*):1; mem(*):64 on slave 20141104-033755-3176252227-49988-28363-S0 (proserpina.apache.org) I1104 03:37:55.462442 28392 master.cpp:2496] Launching task 0 of framework 20141104-033755-3176252227-49988-28363- (default) at scheduler-612ac10f-0976-46e5-9df6-6d51c6c3b683@67.195.81.189:49988 with resources cpus(*):1; mem(*):64 on slave 20141104-033755-3176252227-49988-28363-S0 at slave(1)@67.195.81.189:49988 (proserpina.apache.org) I1104 03:37:55.462740 28390 slave.cpp:1081] Got assigned task 0 for framework 20141104-033755-3176252227-49988-28363- I1104 03:37:55.462961 28387 hierarchical_allocator_process.hpp:563] Recovered cpus(*):1; mem(*):960; disk(*):1024; ports(*):[31000-32000] (total allocatable: cpus(*):1; mem(*):960; disk(*):1024; ports(*):[31000-32000]) on slave 20141104-033755-3176252227-49988-28363-S0 from framework 
20141104-033755-3176252227-49988-28363- I1104 03:37:55.463003 28387 hierarchical_allocator_process.hpp:599] Framework 20141104-033755-3176252227-49988-28363- filtered slave 20141104-033755-3176252227-49988-28363-S0 for 5secs I1104 03:37:55.463879 28390 slave.cpp:1191] Launching task 0 for framework 20141104-033755-3176252227-49988-28363- I1104 03:37:55.466716 28390 slave.cpp:3893] Launching executor default of framework 20141104-033755-3176252227-49988-28363- in work directory '/tmp/MasterTest_OfferNotRescindedOnceUsed_IosgUN/slaves/20141104-033755-3176252227-49988-28363-S0/frameworks/20141104-033755-3176252227-49988-28363-/executors/default/runs/a25474f8-f457-4e6d-8d8b-45c41fde753b' I1104 03:37:55.469238 28390 exec.cpp:132] Version: 0.21.0 I1104 03:37:55.469936 28377 exec.cpp:182] Executor started at: executor(1)@67.195.81.189:49988 with pid 28363 I1104 03:37:55.470327 28390 slave.cpp:1317] Queuing task '0' for executor default of framework '20141104-033755-3176252227-49988-28363- I1104 03:37:55.470551 28390 slave.cpp:555] Successfully attached file '/tmp/MasterTest_OfferNotRescindedOnceUsed_IosgUN/slaves/20141104-033755-3176252227-49988-28363-S0/frameworks/20141104-033755-3176252227-49988-28363-/executors/default/runs/a25474f8-f457-4e6d-8d8b-45c41fde753b' I1104 03:37:55.470660 28390 slave.cpp:1849] Got registration for executor 'default' of framework 20141104-033755-3176252227-49988-28363- from executor(1)@67.195.81.189:49988 I1104 03:37:55.471159 28390 slave.cpp:1968] Flushing queued task 0 for executor 'default' of framework 20141104-033755-3176252227-49988-28363- I1104 03:37:55.471324 28382 exec.cpp:206] Executor registered on slave 20141104-033755-3176252227-49988-28363-S0 I1104 03:37:55.471479 28390 slave.cpp:2824] Monitoring executor 'default' of framework '20141104-033755-3176252227-49988-28363-' in container 'a25474f8-f457-4e6d-8d8b-45c41fde753b' I1104 03:37:55.473562 28382 exec.cpp:218] Executor::registered took 44754ns I1104 03:37:55.473768 28382 
exec.cpp:293] Executor asked to run task '0' I1104 03:37:55.473961 28382 exec.cpp:302] Executor::launchTask took 168197ns I1104 03:37:55.476100 28382 exec.cpp:525] Executor sending status update TASK_RUNNING (UUID: 38a55dfc-4392-4356-ad26-09fe19e72df7) for task 0 of framework 20141104-033755-3176252227-49988-28363- I1104 03:37:55.476656 28382 slave.cpp:2202] Handling status update TASK_RUNNING (UUID: 38a55dfc-4392-4356-ad26-09fe19e72df7) for task 0 of framework 20141104-033755-3176252227-49988-28363- from executor(1)@67.195.81.189:49988 I1104 03:37:55.477147 28390 status_update_manager.cpp:317] Received status update TASK_RUNNING (UUID: 38a55dfc-4392-4356-ad26-09fe19e72df7) for task 0 of framework 20141104-033755-3176252227-49988-28363- I1104 03:37:55.477203 28390 status_update_manager.cpp:494] Creating StatusUpdate stream for task 0 of
Re: Review Request 27555: Refactored the C++ Resources class to support persistent disk resources.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27555/#review59725 --- Patch looks great! Reviews applied: [27550, 27552, 27555] All tests passed. - Mesos ReviewBot On Nov. 4, 2014, 1:19 a.m., Jie Yu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27555/ --- (Updated Nov. 4, 2014, 1:19 a.m.) Review request for mesos, Ben Mahler and Vinod Kone. Bugs: MESOS-1974 https://issues.apache.org/jira/browse/MESOS-1974 Repository: mesos-git Description --- The purpose of the refactor is to support persistent disk resources. Here are the main things I've done in this refactor: 1) Resource objects in Resources are stored in minimal format (validated/non-zero). That allows us to kill isAllocatable, allocatable, isZero, etc. 2) 'matches' needs to be split into two pieces: one for combining and one for removing, in order to support persistent disk resources. For example, one cannot combine two Resource objects with DiskInfo (it's like two disks); however, you can do removal if they are identical. 3) Some of the interfaces are not intuitive (e.g., =, see details in the ticket). I removed them in favor of more explicit interfaces. 4) Unified all the validation code. 5) Adjusted the tests accordingly. 
Diffs - include/mesos/resources.hpp 0e37170 src/cli/execute.cpp ddaa20d src/common/resources.cpp e9a0c85 src/examples/low_level_scheduler_libprocess.cpp 7ef5ea7 src/examples/low_level_scheduler_pthread.cpp 6e233a1 src/examples/no_executor_framework.cpp f98a073 src/examples/test_framework.cpp 187a611 src/master/drf_sorter.cpp 5464900 src/master/hierarchical_allocator_process.hpp 31dfb2c src/master/http.cpp a5e34cc src/master/master.cpp 0a5c9a3 src/tests/allocator_tests.cpp 58e15aa src/tests/gc_tests.cpp f7747e2 src/tests/master_tests.cpp d9dc40c src/tests/mesos.hpp 957e223 src/tests/resource_offers_tests.cpp 060039e src/tests/resources_tests.cpp 3e50889 src/tests/slave_recovery_tests.cpp 4fb357b src/tests/sorter_tests.cpp 0516ab5 Diff: https://reviews.apache.org/r/27555/diff/ Testing --- make check Thanks, Jie Yu
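The split of 'matches' into a combining check and a removing check, described in point 2 of this request, can be sketched with a simplified stand-in type. This is an illustration only: the `Resource` struct below is NOT the Mesos protobuf, `combinable` and `removable` are hypothetical names, and refusing to combine when only one side carries disk info is an assumption of this sketch (the description only covers the case where both do).

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for a scalar resource; not the Mesos Resource type.
struct Resource {
  std::string name;    // For example "disk".
  std::string role;    // For example "*".
  double value;        // Amount; assumed validated and non-zero.
  bool hasDiskInfo;    // Whether persistent disk information is attached.
  std::string diskId;  // Hypothetical identifier of the persistent disk.
};

bool sameKind(const Resource& left, const Resource& right) {
  return left.name == right.name && left.role == right.role;
}

// Combining: two resources of the same kind can be merged into one only
// if neither carries disk info. Two persistent disks stay two disks.
bool combinable(const Resource& left, const Resource& right) {
  // Mirrors the "validated/non-zero" storage invariant from point 1.
  assert(left.value > 0 && right.value > 0);
  return sameKind(left, right) && !left.hasDiskInfo && !right.hasDiskInfo;
}

// Removing: plain resources of the same kind can be subtracted from each
// other, but a persistent disk can only be removed if it is identical.
bool removable(const Resource& from, const Resource& what) {
  assert(from.value > 0 && what.value > 0);
  if (!sameKind(from, what)) {
    return false;
  }
  if (from.hasDiskInfo || what.hasDiskInfo) {
    return from.hasDiskInfo && what.hasDiskInfo &&
           from.diskId == what.diskId && from.value == what.value;
  }
  return true;
}
```

Keeping the two predicates separate makes the asymmetry explicit: identical persistent disks pass `removable` but never `combinable`.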
Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2521
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2521/changes Changes: [toenshoff] First version of docs/modules.md. -- [...truncated 64861 lines...] I1104 04:36:39.271708 30194 slave.cpp:2369] Status update manager successfully handled status update TASK_FINISHED (UUID: 6f7960a2-02c3-4421-8637-0cbef3fd8ee3) for task 2 of framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.271744 30194 slave.cpp:2375] Sending acknowledgement for status update TASK_FINISHED (UUID: 6f7960a2-02c3-4421-8637-0cbef3fd8ee3) for task 2 of framework 20141104-043633-3142697795-37858-30153- to executor(1)@67.195.81.187:55454 I1104 04:36:39.271832 30195 hierarchical_allocator_process.hpp:563] Recovered cpus(*):1; mem(*):128 (total allocatable: mem(*):10112; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):1) on slave 20141104-043633-3142697795-37858-30153-S2 from framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.271855 30186 master.cpp:4713] Removing task 4 with resources cpus(*):1; mem(*):128 of framework 20141104-043633-3142697795-37858-30153- on slave 20141104-043633-3142697795-37858-30153-S0 at slave(2)@67.195.81.187:37858 (pomona.apache.org) I1104 04:36:39.271875 30496 exec.cpp:339] Executor received status update acknowledgement fde44b10-a67d-465e-b475-912e44d6cb9e for task 0 of framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.272058 30186 master.cpp:2898] Forwarding status update acknowledgement d43993e1-1535-4fb4-87ea-ffb018709857 for task 4 of framework 20141104-043633-3142697795-37858-30153- (No Executor Framework (C++)) at scheduler-2dcd6bb5-c698-48f6-b0a3-66cad42d989a@67.195.81.187:37858 to slave 20141104-043633-3142697795-37858-30153-S0 at slave(2)@67.195.81.187:37858 (pomona.apache.org) I1104 04:36:39.272243 30186 master.cpp:3426] Forwarding status update TASK_FINISHED (UUID: fde44b10-a67d-465e-b475-912e44d6cb9e) for task 0 of framework 
20141104-043633-3142697795-37858-30153- I1104 04:36:39.272255 30454 exec.cpp:339] Executor received status update acknowledgement 6f7960a2-02c3-4421-8637-0cbef3fd8ee3 for task 2 of framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.272317 30194 status_update_manager.cpp:389] Received status update acknowledgement (UUID: d43993e1-1535-4fb4-87ea-ffb018709857) for task 4 of framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.272375 30186 master.cpp:3398] Status update TASK_FINISHED (UUID: fde44b10-a67d-465e-b475-912e44d6cb9e) for task 0 of framework 20141104-043633-3142697795-37858-30153- from slave 20141104-043633-3142697795-37858-30153-S2 at slave(3)@67.195.81.187:37858 (pomona.apache.org) I1104 04:36:39.272426 30186 master.cpp:4654] Updating the latest state of task 0 of framework 20141104-043633-3142697795-37858-30153- to TASK_FINISHED I1104 04:36:39.272433 30194 status_update_manager.cpp:525] Cleaning up status update stream for task 4 of framework 20141104-043633-3142697795-37858-30153- Task 0 is in state TASK_FINISHED I1104 04:36:39.272460 30196 sched.cpp:635] Scheduler::statusUpdate took 14010ns I1104 04:36:39.272671 30186 master.cpp:4713] Removing task 1 with resources cpus(*):1; mem(*):128 of framework 20141104-043633-3142697795-37858-30153- on slave 20141104-043633-3142697795-37858-30153-S2 at slave(3)@67.195.81.187:37858 (pomona.apache.org) I1104 04:36:39.272713 30185 hierarchical_allocator_process.hpp:563] Recovered cpus(*):1; mem(*):128 (total allocatable: mem(*):10240; disk(*):3.70122e+06; ports(*):[31000-32000]; cpus(*):2) on slave 20141104-043633-3142697795-37858-30153-S2 from framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.272773 30186 master.cpp:2898] Forwarding status update acknowledgement 745bde5a-2ecd-4795-b9b9-024ec602588e for task 1 of framework 20141104-043633-3142697795-37858-30153- (No Executor Framework (C++)) at scheduler-2dcd6bb5-c698-48f6-b0a3-66cad42d989a@67.195.81.187:37858 to slave 
20141104-043633-3142697795-37858-30153-S2 at slave(3)@67.195.81.187:37858 (pomona.apache.org) I1104 04:36:39.272864 30188 slave.cpp:1789] Status update manager successfully handled status update acknowledgement (UUID: d43993e1-1535-4fb4-87ea-ffb018709857) for task 4 of framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.272924 30188 slave.cpp:4240] Completing task 4 I1104 04:36:39.272958 30186 master.cpp:3426] Forwarding status update TASK_FINISHED (UUID: 6f7960a2-02c3-4421-8637-0cbef3fd8ee3) for task 2 of framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.272994 30188 status_update_manager.cpp:389] Received status update acknowledgement (UUID: 745bde5a-2ecd-4795-b9b9-024ec602588e) for task 1 of framework 20141104-043633-3142697795-37858-30153- I1104 04:36:39.273092 30186 master.cpp:3398] Status update TASK_FINISHED (UUID:
Jenkins build is back to normal : Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2243
See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2243/changes
Re: Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME #2242
@till: looks like a segfault in the auth code? can you triage? On Mon, Nov 3, 2014 at 7:38 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-In-Src-Set-JAVA_HOME/2242/changes Changes: [toenshoff] Updated --isolation help message for slave. -- [...truncated 5243 lines...]
Re: Review Request 27556: Updated docs/configuration.md to reflect current state.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27556/#review59727 --- Patch looks great! Reviews applied: [27560, 27556] All tests passed. - Mesos ReviewBot On Nov. 4, 2014, 1:43 a.m., Kapil Arya wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27556/ --- (Updated Nov. 4, 2014, 1:43 a.m.) Review request for mesos, Ian Downes and Till Toenshoff. Bugs: MESOS-2037 https://issues.apache.org/jira/browse/MESOS-2037 Repository: mesos-git Description --- Copied the output of --help messages from configure, master, and slave. Diffs - docs/configuration.md 5845ae324181d01cb65990fbf8dd38a621e1c351 Diff: https://reviews.apache.org/r/27556/diff/ Testing --- Pushed and tested output on github at: https://github.com/karya0/mesos/blob/modules/docs/configuration.md Thanks, Kapil Arya
Re: Build failed in Jenkins: Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui #2521
looks like the slave's authentication never succeeded while scheduler's authz (that started later) succeeded. that is strange. @yan/@till can you triage? On Mon, Nov 3, 2014 at 8:37 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: See https://builds.apache.org/job/Mesos-Trunk-Ubuntu-Build-Out-Of-Src-Disable-Java-Disable-Python-Disable-Webui/2521/changes Changes: [toenshoff] First version of docs/modules.md. -- [...truncated 64861 lines...]