Re: Review Request 17306: Added an asynchronous subprocess utility.
On Jan. 27, 2014, 7:53 a.m., Ian Downes wrote:
> What about supporting environment variables specific to the child process? This is necessary for distinct environments between different subprocesses and the parent. This could be done by prepending the command with 'env', but it would be much nicer to take a map<string, string> of environment variables and setenv them after the fork.
>
> Along the same lines, what about optionally taking a user and working directory?

Sorry for intruding with comments. What about a more generic approach: a way to pass a callable that will be invoked after the fork? Usually it is called pre_exec_fn.

- Nikita

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/#review32817
---

On Jan. 24, 2014, 7:06 a.m., Ben Mahler wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/
---

(Updated Jan. 24, 2014, 7:06 a.m.)

Review request for mesos, Benjamin Hindman, Ian Downes, and Jie Yu.

Bugs: MESOS-943
https://issues.apache.org/jira/browse/MESOS-943

Repository: mesos-git

Description
---
This adds an asynchronous mechanism for subprocess execution, per MESOS-943. What started simple was made a little more complex due to the following issues:

1. Who is responsible for closing the input / output descriptors? Placing this burden onto the caller of subprocess() seems likely to yield leaked open file descriptors. This introduced the notion of a shared_ptr / destructor / copy constructor / assignment operator to ensure that the file descriptors are closed when the handle to the file descriptors is lost. However, even with my implementation, one may copy these file descriptors, at which point they may be closed from underneath the copy.

2. What does discarding the status entail? Does it kill the process? The current implementation kills the process, which requires the use of an explicit Promise so that the discard from the caller does not affect the reaper's future. If discard() is a no-op, we must still use an explicit Promise to preserve the notification from the Reaper (so that we can know when to delete the Reaper).

That's about it. I've added tests that demonstrate the ability to communicate with the subprocess through stdin / stdout / stderr. Please let me know if you find any simplifications that can be made! (Other than C++11 lambdas, of course :))

Diffs
---
3rdparty/libprocess/Makefile.am 40f01a7b3803696ccca440c8326e1d6d7c377459
3rdparty/libprocess/include/process/subprocess.hpp PRE-CREATION
3rdparty/libprocess/src/tests/subprocess_tests.cpp PRE-CREATION

Diff: https://reviews.apache.org/r/17306/diff/

Testing
---
Tests were added and run in repetition.

Thanks,
Ben Mahler
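The descriptor-ownership question in point 1 of the description can be illustrated with a small sketch. This is not the code from the review; `ownFd` and `isOpen` are hypothetical helpers, and the sketch uses C++11 lambdas, which the patch itself avoids. It only shows the general idea of a shared handle whose last copy closes the descriptor.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <memory>

// Hypothetical helper (not from the review): wraps a raw descriptor so
// that it is closed when the last copy of the handle is released.
std::shared_ptr<int> ownFd(int fd)
{
  return std::shared_ptr<int>(new int(fd), [](int* p) {
    ::close(*p);  // Runs exactly once, when the reference count hits zero.
    delete p;
  });
}

// Returns true if 'fd' currently refers to an open descriptor.
bool isOpen(int fd)
{
  return ::fcntl(fd, F_GETFD) != -1;
}
```

Copies of the handle share ownership, so the descriptor stays open until every copy is gone. As the description warns, anyone who copies the raw int out of the handle gets no such protection.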
question about hadoop on mesos
Hi, I ran a map-only job on Hadoop, but every tasktracker had only 7 map slots running. How can I make it run 8 map slots?
Re: question about hadoop on mesos
Hi, I ran a map-only job on Hadoop. Every node has 8 cores, but every tasktracker had only 7 map slots running. How can I make the tasktracker run 8 map slots on every node? Thank you
Re: question about hadoop on mesos
Another question: the tasktracker does not stop after the Hadoop job is done. Why?

-----Original Message-----
From: HUO Jing huoj...@ihep.ac.cn
Sent: Monday, January 27, 2014
To: dev@mesos.apache.org
Cc: Brenden Matthews bren...@diddyinc.com
Subject: Re: question about hadoop on mesos

> Hi, I ran a map-only job on hadoop. Every node has 8 cores, but on every tasktracker, it had only 7 map slots running. How to let tasktracker run 8 map slots on every node? Thank you
Re: Review Request 17343: Exposed coordinator demotion.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17343/#review32782 --- Ship it! src/log/coordinator.hpp https://reviews.apache.org/r/17343/#comment61750 Please add some comments about when None() will be returned. src/log/coordinator.hpp https://reviews.apache.org/r/17343/#comment61751 Ditto here. - Jie Yu On Jan. 25, 2014, 12:43 a.m., Benjamin Hindman wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17343/ --- (Updated Jan. 25, 2014, 12:43 a.m.) Review request for mesos and Jie Yu. Repository: mesos-git Description --- See summary Diffs - src/log/coordinator.hpp 35b68e938c5f2ca76bbdf8dbb0184ef1686d94f1 src/log/coordinator.cpp bc85e668a01d60077fdd954139d60edd295ad3b0 src/log/log.cpp e83f822af86a2389e2b1abab9489713cb59838c2 src/state/zookeeper.hpp d1d1fedf27987aeaf9fbdee678d3b3848d05620a src/tests/log_tests.cpp e493af4f2f2435efe168d07acd267b61afd37fe4 Diff: https://reviews.apache.org/r/17343/diff/ Testing --- make check Thanks, Benjamin Hindman
Re: Please Help me about hadoop on Mesos
> I have some questions about running Hadoop on top of Mesos, please help me.
>
> 1. When a tasktracker is launched, if n CPU cores are allocated to it, it can only launch n-1 map tasks. Could someone tell me why? And if I want to run a map-only job, what should I do to run n map tasks on an n-CPU resource offer?

This is because 1 cpu is allocated to the task tracker itself.

> 2. After a tasktracker is launched, under what conditions will its status update to FINISHED? In my cluster, sometimes it never ends until I restart the jobtracker. Sometimes it ends if there is no task or job in the jobtracker to run.

The expected case is that the task tracker is finished/killed if there is no task/job assigned to it. If a task tracker sits idle for a long time, it's probably a bug (@brenden can correct me if the semantics have changed around this). Some logs would help diagnose the issue.

> 3. How do I use DRF with weights? I run two frameworks on Mesos, and I want to give them different proportions of resources.

Give each framework a different role (FrameworkInfo.role) and give weights to each role via master command line flags (see --roles and --weights via ./master --help).

> Please help me! Thank you very much!
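To make the effect of role weights in answer 3 a little more concrete, here is a minimal sketch of a weighted dominant-share calculation. It assumes, as I understand DRF with weights, that a role's dominant share is divided by its weight before roles are compared, so a role with weight 2 looks half as "full" as a role with weight 1 holding the same resources. `Resources` and `weightedDominantShare` are illustrative names, not Mesos APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Assumption for this sketch: resources are modelled as name -> scalar.
typedef std::map<std::string, double> Resources;

// Returns the largest fraction of any single resource that 'allocated'
// holds out of 'total', divided by the role's weight.
double weightedDominantShare(
    const Resources& allocated,
    const Resources& total,
    double weight)
{
  double share = 0.0;
  for (const auto& entry : allocated) {
    auto it = total.find(entry.first);
    if (it != total.end() && it->second > 0.0) {
      share = std::max(share, entry.second / it->second);
    }
  }
  return share / weight;
}
```

With a cluster of 8 cpus and 16 GB of memory, a role holding 4 cpus and 4 GB has a dominant share of 0.5; with weight 2 its weighted share drops to 0.25, so it would be favored over a weight-1 role with the same allocation.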
[jira] [Resolved] (MESOS-799) Mesos python egg is faulty on OS X Mavericks
[ https://issues.apache.org/jira/browse/MESOS-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niklas Quarfot Nielsen resolved MESOS-799.
--
Resolution: Fixed
Fix Version/s: 0.17.0

Python issues should be resolved by BenH's C++11/Clang patch sets.

Mesos python egg is faulty on OS X Mavericks

Key: MESOS-799
URL: https://issues.apache.org/jira/browse/MESOS-799
Project: Mesos
Issue Type: Bug
Environment: Mac OS X Mavericks, GCC 4.7 4.8
Reporter: Niklas Quarfot Nielsen
Assignee: Niklas Quarfot Nielsen
Fix For: 0.17.0

The Python framework test (and I suspect other Python frameworks) breaks on OS X Mavericks. From a quick study, this is what I found:

1) The chosen compiler in $(CC) is not propagated in src/Makefile.am to setup.py during python binding compilation.

2) When a compiler is chosen, the compiler flags in distutils are picked up from the ones used to compile Python (which most likely was clang). The effect of this is hard to spot: Clang is able to bundle both 32-bit and 64-bit executables into one with multiple -arch flags. AFAIK GCC picks only one architecture (the last one), which in this case leaves a 32-bit image only, which is then incompatible with the other binaries.

Distutils can be configured by setting environment variables to override, for example, linker and compiler flags. LDSHARED and CCSHARED are two of them, but additional ones need to be set to deal with the 32-bit/64-bit bundle issue.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (MESOS-798) ExamplesTest.PythonFramework failing
[ https://issues.apache.org/jira/browse/MESOS-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883017#comment-13883017 ]

Niklas Quarfot Nielsen commented on MESOS-798:
--

Bernardo, can we mark this as resolved as well? The C++11/Clang patches should fix this issue.

ExamplesTest.PythonFramework failing

Key: MESOS-798
URL: https://issues.apache.org/jira/browse/MESOS-798
Project: Mesos
Issue Type: Bug
Components: test
Affects Versions: 0.16.0
Environment: Linux 2.6.32-279.el6.x86_64
  Python 2.6.6
  Python Packages installed:
    python-2.6.6-37.el6_4.x86_64
    protobuf-python-2.3.0-7.el6.x86_64
    python-libs-2.6.6-37.el6_4.x86_64
    python-devel-2.6.6-37.el6_4.x86_64
    python-setuptools-0.6.10-3.el6.noarch
    python-boto-2.13.3-1.el6.noarch
Reporter: Bernardo Gomez Palacio
Labels: test

(Based on Git Commit Hash 03b9407)

Running `make check` fails while executing ExamplesTest.PythonFramework with the following trace:

[ RUN      ] ExamplesTest.PythonFramework
Using temporary directory '/tmp/ExamplesTest_PythonFramework_5Vv4TM'
Traceback (most recent call last):
  File "/builddir/build/BUILD/mesos-03b94076caddceda4c0a6a03aa44ebe4d64f1acc/src/examples/python/test_framework.py", line 23, in <module>
    import mesos
  File "build/bdist.linux-x86_64/egg/mesos.py", line 26, in <module>
  File "build/bdist.linux-x86_64/egg/_mesos.py", line 7, in <module>
  File "build/bdist.linux-x86_64/egg/_mesos.py", line 6, in __bootstrap__
  File "build/bdist.linux-x86_64/egg/mesos_pb2.py", line 1545, in <module>
AttributeError: 'FileDescriptor' object has no attribute 'message_types_by_name'
tests/script.cpp:77: Failure
Failed
python_framework_test.sh exited with status 1
[  FAILED  ] ExamplesTest.PythonFramework (432 ms)
[----------] 5 tests from ExamplesTest (4346 ms total)

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (MESOS-775) expose count of running tasks
[ https://issues.apache.org/jira/browse/MESOS-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niklas Quarfot Nielsen resolved MESOS-775.
--
Resolution: Duplicate

Duplicate of MESOS-772

expose count of running tasks

Key: MESOS-775
URL: https://issues.apache.org/jira/browse/MESOS-775
Project: Mesos
Issue Type: Improvement
Reporter: David Robinson
Priority: Minor

The stats endpoint doesn't show the current number of running tasks:

$ curl -s http://localhost:5051/slave\(1\)/stats.json | python2.7 -m json.tool
{
    "failed_tasks": 0,
    "finished_tasks": 0,
    "invalid_status_updates": 0,
    "killed_tasks": 0,
    "lost_tasks": 0,
    "recovery_errors": 0,
    "registered": 1,
    "staged_tasks": 2,
    "started_tasks": 0,
    "total_frameworks": 1,
    "uptime": 1168.518182912,
    "valid_status_updates": 0
}

Can this be added please?

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17345: Replaced Log::Writer constructor with explicit Log::Writer::start.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17345/#review32851 --- src/java/jni/org_apache_mesos_Log.cpp https://reviews.apache.org/r/17345/#comment61849 This should be 'break', right? src/log/log.hpp https://reviews.apache.org/r/17345/#comment61853 Comments need to be adjusted. Seems that currently, multiple writers can be 'valid', but only one of them has 'started'. - Jie Yu On Jan. 25, 2014, 12:43 a.m., Benjamin Hindman wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17345/ --- (Updated Jan. 25, 2014, 12:43 a.m.) Review request for mesos and Jie Yu. Repository: mesos-git Description --- See summary. Diffs - src/java/jni/org_apache_mesos_Log.cpp 36c636d155c1581eeb7734cdbc5b6fac4ca42440 src/log/log.hpp 1f0b30ddf8709cf64db7989775c8b0e926af99b5 src/log/log.cpp e83f822af86a2389e2b1abab9489713cb59838c2 src/tests/log_tests.cpp e493af4f2f2435efe168d07acd267b61afd37fe4 Diff: https://reviews.apache.org/r/17345/diff/ Testing --- make check Thanks, Benjamin Hindman
Re: Review Request 17346: Exposed coordinator demotion in Log.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17346/#review32858
---

src/java/jni/org_apache_mesos_Log.cpp
https://reviews.apache.org/r/17346/#comment61859

    Shouldn't this be position.get().get()?

src/java/jni/org_apache_mesos_Log.cpp
https://reviews.apache.org/r/17346/#comment61860

    Ditto here.

src/log/log.hpp
https://reviews.apache.org/r/17346/#comment61862

    Adjust the comments here. A none result means the writer has lost exclusive write access.

src/log/log.hpp
https://reviews.apache.org/r/17346/#comment61863

    Ditto here.

src/log/log.cpp
https://reviews.apache.org/r/17346/#comment61864

    Using a Future here does not seem necessary. What about the following:

    static Option<Log::Position> position(const Option<uint64_t>& position);

- Jie Yu

On Jan. 25, 2014, 12:44 a.m., Benjamin Hindman wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17346/
---

(Updated Jan. 25, 2014, 12:44 a.m.)

Review request for mesos and Jie Yu.

Repository: mesos-git

Description
---
See summary.

Diffs
---
src/java/jni/org_apache_mesos_Log.cpp 36c636d155c1581eeb7734cdbc5b6fac4ca42440
src/log/log.hpp 1f0b30ddf8709cf64db7989775c8b0e926af99b5
src/log/log.cpp e83f822af86a2389e2b1abab9489713cb59838c2
src/tests/log_tests.cpp e493af4f2f2435efe168d07acd267b61afd37fe4

Diff: https://reviews.apache.org/r/17346/diff/

Testing
---
make check

Thanks,
Benjamin Hindman
[jira] [Created] (MESOS-948) Docs for Java interfaces
Connor Doyle created MESOS-948: -- Summary: Docs for Java interfaces Key: MESOS-948 URL: https://issues.apache.org/jira/browse/MESOS-948 Project: Mesos Issue Type: Documentation Components: documentation, framework, java api Reporter: Connor Doyle There is some great user documentation embedded in the Java API sources (see: https://github.com/apache/mesos/tree/master/src/java/src/org/apache/mesos). Unfortunately, these are hard to find. It would be helpful to publish the generated Javadocs and link to them from the Mesos site. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (MESOS-949) slave should wipe meta directory on startup if bootid changes
Kevin Sweeney created MESOS-949: --- Summary: slave should wipe meta directory on startup if bootid changes Key: MESOS-949 URL: https://issues.apache.org/jira/browse/MESOS-949 Project: Mesos Issue Type: Bug Components: slave Reporter: Kevin Sweeney Right now, if slave metadata is persisted across a reboot the slave is left with useless metadata. Slave should detect this case and purge the metadata, perhaps by using bootid. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Re: Please Help me about hadoop on Mesos
On Mon, Jan 27, 2014 at 10:07 AM, HUO Jing huoj...@ihep.ac.cn wrote:
> So, at the very beginning, if all the resources are assigned to Hadoop, and after that there are always enough jobs in the jobtracker, does that mean the other framework will never get resources? Is it fair to do so?

That is correct. Currently there is no concept of pre-emption of resources in Mesos. While this is likely to change in the future, in the short term you could reserve resources for frameworks (see --resources on ./slave --help) to avoid starvation.
[jira] [Resolved] (MESOS-949) slave should wipe meta directory on startup if bootid changes
[ https://issues.apache.org/jira/browse/MESOS-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone resolved MESOS-949. -- Resolution: Duplicate Assignee: Benjamin Mahler slave should wipe meta directory on startup if bootid changes - Key: MESOS-949 URL: https://issues.apache.org/jira/browse/MESOS-949 Project: Mesos Issue Type: Bug Components: slave Reporter: Kevin Sweeney Assignee: Benjamin Mahler Right now, if slave metadata is persisted across a reboot the slave is left with useless metadata. Slave should detect this case and purge the metadata, perhaps by using bootid. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (MESOS-948) Docs for Java interfaces
[ https://issues.apache.org/jira/browse/MESOS-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Connor Doyle updated MESOS-948: --- Description: There is some great user documentation embedded in the Java API sources (see: https://github.com/apache/mesos/tree/master/src/java/src/org/apache/mesos/ ). Unfortunately, these are hard to find. It would be helpful to publish the generated Javadocs and link to them from the Mesos site. was: There is some great user documentation embedded in the Java API sources (see: https://github.com/apache/mesos/tree/master/src/java/src/org/apache/mesos). Unfortunately, these are hard to find. It would be helpful to publish the generated Javadocs and link to them from the Mesos site. Docs for Java interfaces Key: MESOS-948 URL: https://issues.apache.org/jira/browse/MESOS-948 Project: Mesos Issue Type: Documentation Components: documentation, framework, java api Reporter: Connor Doyle There is some great user documentation embedded in the Java API sources (see: https://github.com/apache/mesos/tree/master/src/java/src/org/apache/mesos/ ). Unfortunately, these are hard to find. It would be helpful to publish the generated Javadocs and link to them from the Mesos site. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17423: Added log implementation for state storage.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17423/ --- Review request for mesos, Ben Mahler and Jie Yu. Repository: mesos-git Description --- This is done in C++11, I've converted to C++03 in https://reviews.apache.org/r/17424. Note that this does not implement diffs but that's probably okay for our initial use case. This also does not implement caching so all state entries are stored in memory, which is also okay for our initial use case. Finally, this does not implement defragmentation which again is probably okay for our use case. These are captured as TODOs in the code. Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/messages/state.proto 7f7a8a505d6f24b01fec0c3ad47b0e15b2b17ffa src/state/log.hpp PRE-CREATION src/state/log.cpp PRE-CREATION src/tests/log_tests.cpp e493af4f2f2435efe168d07acd267b61afd37fe4 src/tests/state_tests.cpp 03c538861a88d3a07e2468dce5553eeb3acc9243 Diff: https://reviews.apache.org/r/17423/diff/ Testing --- make check Thanks, Benjamin Hindman
Review Request 17424: Refactored LogStorageProcess for C++03.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17424/ --- Review request for mesos, Ben Mahler and Jie Yu. Repository: mesos-git Description --- See summary. Diffs - src/state/log.cpp PRE-CREATION Diff: https://reviews.apache.org/r/17424/diff/ Testing --- make check Thanks, Benjamin Hindman
Re: Review Request 17423: Added log implementation for state storage.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17423/ --- (Updated Jan. 27, 2014, 9:33 p.m.) Review request for mesos, Ben Mahler and Jie Yu. Repository: mesos-git Description (updated) --- This is done in C++11, I've converted to C++03 in https://reviews.apache.org/r/17424. Note that this does not implement diffs but that's probably okay for our initial use case. This also does not implement caching so all state entries are stored in memory, which is also okay for our initial use case. Finally, this does not implement defragmentation which again is probably okay for our use case. These are captured as TODOs in the code. Also, this uses a 'sequence' operation which will be replaced with work being done by Jie Yu. Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/messages/state.proto 7f7a8a505d6f24b01fec0c3ad47b0e15b2b17ffa src/state/log.hpp PRE-CREATION src/state/log.cpp PRE-CREATION src/tests/log_tests.cpp e493af4f2f2435efe168d07acd267b61afd37fe4 src/tests/state_tests.cpp 03c538861a88d3a07e2468dce5553eeb3acc9243 Diff: https://reviews.apache.org/r/17423/diff/ Testing --- make check Thanks, Benjamin Hindman
[jira] [Commented] (MESOS-950) Add LLDB helpers
[ https://issues.apache.org/jira/browse/MESOS-950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883371#comment-13883371 ]

Benjamin Hindman commented on MESOS-950:
--

YES PLEASE YES PLEASE YES PLEASE! Do you have any experience using lldb on OS X? Did you have to build from source to use lldb?

Add LLDB helpers

Key: MESOS-950
URL: https://issues.apache.org/jira/browse/MESOS-950
Project: Mesos
Issue Type: Improvement
Components: build
Affects Versions: 0.17.0
Environment: LLVM environments, both Linux and Mac OS X.
Reporter: Niklas Quarfot Nielsen
Priority: Trivial

It would be helpful to add LLDB helpers in a similar style to gdb-mesos-tests, gdb-mesos-local, gdb-mesos-master and gdb-mesos-slave, as GDB seems to have been phased out on OS X Mavericks.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17305: Update the slave to use the libprocess Reaper.
On Jan. 24, 2014, 7:47 a.m., Benjamin Hindman wrote:
> src/launcher/executor.cpp, line 298
> https://reviews.apache.org/r/17305/diff/1/?file=447761#file447761line298
>
> I think it makes sense to have a global reaper just like we do with statistics. Having multiple libprocess processes each delaying every 1 second to call waitpid is a bit wasteful.

Ben Mahler wrote:
> That's true; at the same time, we would be doing a 1 second no-op event loop on all libprocess binaries, unless we added the ability to start/stop the Reaper. I was thinking of a library-style API:
>
>     namespace process { Future<Option<int> > reap(pid_t); }
>
> This would use a global Reaper under the hood. I'll be updating the diffs with this cleaner interface.

One side-effect of the global reaper that I discovered while testing is that when running the reaping-related tests in repetition, they gradually become slower. This is because each test that advances time for the global reaper to run its wait() loop pushes the next wait() loop further into the future, requiring more and more calls to Clock::advance + Clock::settle. Before these changes, since a Reaper was re-constructed across each test and each component, we were immune to this. I'll leave a note about this in the code. We can fix this if we implement the approach where we have a thread per pid, each blocking on waitpid().

- Ben

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17305/#review32706
---

On Jan. 24, 2014, 7:05 a.m., Ben Mahler wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17305/
---

(Updated Jan. 24, 2014, 7:05 a.m.)

Review request for mesos, Benjamin Hindman, Ian Downes, and Jie Yu.

Bugs: MESOS-943
https://issues.apache.org/jira/browse/MESOS-943

Repository: mesos-git

Description
---
This removes the Mesos Reaper in favor of the libprocess Reaper, which no longer reaps non-monitored processes.

Diffs
---
src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0
src/launcher/executor.cpp b73ab479500a7347a38ba53acecfab9229f1080d
src/slave/cgroups_isolator.hpp e86062e9abaaa263c32c55e9dbfefd700f605886
src/slave/process_isolator.hpp 4ae093fe65775a2b9bec42071961dd58aa0c3d8b
src/slave/reaper.hpp 9a31c754475ecbce5299d8f18f38253c542404e5
src/slave/reaper.cpp 5eabbc3911584cf47c353bcf4ca660c47c2c17be
src/tests/environment.cpp 6edce4552ef9a12b7b58cefea97ebacc9224ab04
src/tests/reaper_tests.cpp 608ec0eff4eaae115d75621937a39b22e3bdb068
src/tests/slave_recovery_tests.cpp 5a4c4fc4f687a37409d1afbda4c0d07fcdc3a4c7

Diff: https://reviews.apache.org/r/17305/diff/

Testing
---
make check

I've also added an orphan check in the testing environment tear down.

Thanks,
Ben Mahler
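The periodic wait() loop discussed in this thread can be sketched roughly as follows. This is not the libprocess Reaper; `PollingReaper` is a hypothetical class that only handles direct children of the calling process and leaves the once-per-second scheduling to the caller. It does show the "only reap monitored pids" behaviour mentioned in the description, since it never calls waitpid(-1, ...).

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <map>
#include <set>

// Hypothetical sketch of a reaper that polls only the pids it was asked
// to monitor, so unrelated children are never reaped by accident.
class PollingReaper
{
public:
  void monitor(pid_t pid) { pids.insert(pid); }

  // One pass of the wait loop: returns pid -> wait status for every
  // monitored child that has exited since the previous pass.
  std::map<pid_t, int> poll()
  {
    std::map<pid_t, int> exited;
    for (std::set<pid_t>::iterator it = pids.begin(); it != pids.end();) {
      int status = 0;
      pid_t result = ::waitpid(*it, &status, WNOHANG);
      if (result == *it) {
        exited[*it] = status;
        it = pids.erase(it);
      } else if (result == -1 && errno != EINTR) {
        // Not our child, or already reaped elsewhere: stop monitoring.
        it = pids.erase(it);
      } else {
        ++it;  // Still running; check again on the next pass.
      }
    }
    return exited;
  }

private:
  std::set<pid_t> pids;
};
```

A caller would invoke `poll()` on a timer. The thread-per-pid alternative mentioned above would replace the timer with a blocking `waitpid(pid, &status, 0)` per monitored child, which avoids the idle wake-ups at the cost of one thread per subprocess.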
[jira] [Commented] (MESOS-950) Add LLDB helpers
[ https://issues.apache.org/jira/browse/MESOS-950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883416#comment-13883416 ]

Niklas Quarfot Nielsen commented on MESOS-950:
--

Do you think we should put some effort into detecting whether lldb is supported? I don't think we do that for gdb.

Add LLDB helpers

Key: MESOS-950
URL: https://issues.apache.org/jira/browse/MESOS-950
Project: Mesos
Issue Type: Improvement
Components: build
Affects Versions: 0.17.0
Environment: LLVM environments, both Linux and Mac OS X.
Reporter: Niklas Quarfot Nielsen
Assignee: Niklas Quarfot Nielsen
Priority: Trivial

It would be helpful to add LLDB helpers in a similar style to gdb-mesos-tests, gdb-mesos-local, gdb-mesos-master and gdb-mesos-slave, as GDB seems to have been phased out on OS X Mavericks.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Closed] (MESOS-829) Switch website CMS from Jekyll to Middleman
[ https://issues.apache.org/jira/browse/MESOS-829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Lester closed MESOS-829. - Resolution: Fixed Fixed. Thanks for your help, [~jfarrell]! Switch website CMS from Jekyll to Middleman --- Key: MESOS-829 URL: https://issues.apache.org/jira/browse/MESOS-829 Project: Mesos Issue Type: Improvement Components: project website Reporter: Dave Lester Assignee: Jake Farrell Priority: Critical Attachments: site.tar.gz We currently use Jekyll as the CMS to manage the Mesos website. It does most of what we need by offering templates and power to control how pages render. The only issue is that when you build from source files Jekyll will wipe away any hidden files in the rendered directory. This is problematic since both the source and live directories are in version control (svn). Our recent website revision history has been a mess, because we often end up wiping out/rewriting I'd like to switch the site to use Middleman to render our static site files. We are using Middleman for Apache Aurora, and it doesn't face the issue of wiping out hidden files. Additionally, it seems capable of all the same rendering features of Markdown that we currently use. The only difference is that redcarpet (the library for converting markdown to HTML) needs to use ruby 3+. Thoughts? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17427: Handled EINTR in os::close.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/ --- Review request for mesos, Benjamin Hindman, Ian Downes, and Vinod Kone. Repository: mesos-git Description --- See above, note that GNU offers TEMP_FAILURE_RETRY as a macro for this. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp bba6f43eaeba0238a5db6e388902d92eb18f14f5 Diff: https://reviews.apache.org/r/17427/diff/ Testing --- make check Thanks, Ben Mahler
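For readers unfamiliar with the pattern, a retry-on-EINTR wrapper can be sketched as below. This is not the stout change itself; `retryOnEintr` is a hypothetical helper in the spirit of glibc's TEMP_FAILURE_RETRY macro. One caveat worth stating: the Linux close(2) man page advises against retrying close() after EINTR, because the descriptor may already have been released, so whether this pattern is appropriate for close in particular is platform dependent.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper: repeats 'call' for as long as it fails with
// EINTR, and returns the first result that is not an EINTR failure.
template <typename F>
int retryOnEintr(F call)
{
  int result;
  do {
    result = call();
  } while (result == -1 && errno == EINTR);
  return result;
}
```

A call site would look like `retryOnEintr([fd]() -> int { return ::fsync(fd); })`; any system call that reports interruption through `-1` and `errno == EINTR` fits the same shape.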
Re: Review Request 17306: Added an asynchronous subprocess utility.
On Jan. 24, 2014, 6:16 p.m., Ian Downes wrote:
> 3rdparty/libprocess/include/process/subprocess.hpp, line 120
> https://reviews.apache.org/r/17306/diff/1/?file=447773#file447773line120
>
> Should we check their exit codes, since dup2 can fail even with valid descriptors?

I've handled EINTR now, and I'm updating os::close to handle EINTR as well in a separate review. It doesn't appear that the other error codes are possible here:

EBADF: oldfd was opened above since we check the result of pipe, and newfd is in the valid range.
EBUSY: (Linux) It doesn't look like we're susceptible to a race condition between open and dup2.
EINTR: Handled this now.
EINVAL: Not possible here.
EMFILE: Since dup2 will close newfd (stdin), it does not seem possible for dup2 to cause the number of open file descriptors in the subprocess to be exceeded.

- Ben

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/#review32726
---
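The dup2 handling discussed in this reply can be sketched as follows. This is not the code under review; `dup2Retry` and `captureStdout` are hypothetical helpers, and running the command through /bin/sh is an assumption made only to keep the example short. The child redirects its stdout into a pipe, retrying dup2 on EINTR, and the parent reads the output back.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <string>

// dup2 retried for as long as it is interrupted by a signal.
int dup2Retry(int oldfd, int newfd)
{
  int result;
  do {
    result = ::dup2(oldfd, newfd);
  } while (result == -1 && errno == EINTR);
  return result;
}

// Hypothetical helper: runs "sh -c command" and returns what it wrote to
// stdout. Returns an empty string if the pipe or fork fails.
std::string captureStdout(const std::string& command)
{
  int fds[2];
  if (::pipe(fds) == -1) {
    return "";
  }

  pid_t pid = ::fork();
  if (pid == -1) {
    ::close(fds[0]);
    ::close(fds[1]);
    return "";
  }

  if (pid == 0) {
    // Child: only async-signal-safe calls between fork and exec.
    ::close(fds[0]);
    if (dup2Retry(fds[1], STDOUT_FILENO) == -1) {
      ::_exit(127);
    }
    ::close(fds[1]);
    ::execl("/bin/sh", "sh", "-c", command.c_str(),
            static_cast<char*>(nullptr));
    ::_exit(127);  // Only reached if exec failed.
  }

  // Parent: close the write end so read() sees EOF when the child exits.
  ::close(fds[1]);
  std::string output;
  char buffer[256];
  ssize_t n;
  while ((n = ::read(fds[0], buffer, sizeof(buffer))) != 0) {
    if (n == -1) {
      if (errno == EINTR) continue;
      break;
    }
    output.append(buffer, static_cast<size_t>(n));
  }
  ::close(fds[0]);

  int status = 0;
  while (::waitpid(pid, &status, 0) == -1 && errno == EINTR) {}
  return output;
}
```

Unlike the asynchronous utility in the review, this sketch blocks until the child exits; it is meant only to show where the dup2 call sits relative to pipe, fork and exec.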
Re: Review Request 17306: Added an asynchronous subprocess utility.
On Jan. 24, 2014, 8:12 p.m., Jie Yu wrote:
> 3rdparty/libprocess/include/process/subprocess.hpp, lines 163-165
> https://reviews.apache.org/r/17306/diff/1/?file=447773#file447773line163
>
> Instead of killing the subprocess when 'status' is discarded, what about having an explicit kill()? A user who doesn't care about the exit status does not necessarily want the subprocess to be killed.

I've opted to remove the discard semantics, but I've also omitted a kill(), since the user has access to the pid, so it seems best for now to just let the caller determine how to kill (which signal, kill escalation SIGTERM -> SIGKILL, etc).

On Jan. 24, 2014, 8:12 p.m., Jie Yu wrote:
> 3rdparty/libprocess/include/process/subprocess.hpp, line 30
> https://reviews.apache.org/r/17306/diff/1/?file=447773#file447773line30
>
> What's the reason for not having a copy constructor?

Added one.

On Jan. 24, 2014, 8:12 p.m., Jie Yu wrote:
> 3rdparty/libprocess/include/process/subprocess.hpp, line 157
> https://reviews.apache.org/r/17306/diff/1/?file=447773#file447773line157
>
> Not sure if we need a reaper for each subprocess call. Can we create one reaper for all subprocess calls? I guess you don't want to create a reaper if no subprocess call is made, and you want to clean up the reaper once all Subprocesses are terminated. But given that you need to handle reaper cleanup logic anyway (_cleanup), why not do it at the global level (i.e., lazy initialization, reference counting)?

Ah perfect, I've already changed this to become process::reap in the previous review, which now uses a lazily initialized Reaper. Please take a look! :)

- Ben

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/#review32740
---
Re: Review Request 17306: Added an asynchronous subprocess utility.
On Jan. 27, 2014, 7:53 a.m., Ian Downes wrote: What about supporting environment variables specific to the child process? This is necessary for distinct environments between different subprocesses and the parent. This could be done by prepending the command with 'env', but it would be much nicer to take a map<string, string> of environment variables and setenv them after the fork. Along the same lines, what about optionally taking a user and working directory? Nikita Vetoshkin wrote: Sorry for intruding with comments. What about a more generic way: passing a callable that will be invoked after fork? Usually it is called pre_exec_fn. Ian Downes wrote: Yes, this is a more general solution. However, the callable *must* be async-signal-safe so perhaps we don't want to fully expose this. I could imagine taking environment variables and arguments explicitly. Let's take the lazy approach and get this simple version committed to get things started. We'll add functionality as the need arises. :) - Ben --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/#review32817 --- On Jan. 24, 2014, 7:06 a.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/ --- (Updated Jan. 24, 2014, 7:06 a.m.) Review request for mesos, Benjamin Hindman, Ian Downes, and Jie Yu. Bugs: MESOS-943 https://issues.apache.org/jira/browse/MESOS-943 Repository: mesos-git Thanks, Ben Mahler
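To illustrate the per-child environment idea while respecting the async-signal-safety concern raised in this thread, the following sketch builds the environment array before fork() and hands it to execve(), so the child performs no allocation and never calls setenv. The names `spawnWithEnvironment` and `waitForExit` are hypothetical, and running the command through /bin/sh is an assumption of this example rather than something the review specifies.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <string>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `command` via /bin/sh with exactly the given
// environment. Returns the child's pid, or -1 if fork() failed.
pid_t spawnWithEnvironment(
    const std::string& command,
    const std::map<std::string, std::string>& environment)
{
  // Build every string and array BEFORE fork(): memory allocation is not
  // async-signal-safe, so none of it may happen in the child.
  std::vector<std::string> entries;
  for (const auto& variable : environment) {
    entries.push_back(variable.first + "=" + variable.second);
  }

  std::vector<char*> envp;
  for (std::string& entry : entries) {
    envp.push_back(&entry[0]);
  }
  envp.push_back(nullptr);

  char* const argv[] = {
    const_cast<char*>("sh"),
    const_cast<char*>("-c"),
    const_cast<char*>(command.c_str()),
    nullptr
  };

  pid_t pid = ::fork();
  if (pid == 0) {
    // Child: only async-signal-safe calls (execve, _exit) from here on.
    ::execve("/bin/sh", argv, envp.data());
    ::_exit(127); // Only reached if execve() failed.
  }

  return pid;
}

// Hypothetical helper: blocks until `pid` exits and returns its wait status,
// or -1 on error. Retries when waitpid() is interrupted by a signal.
int waitForExit(pid_t pid)
{
  int status = 0;
  pid_t result;
  do {
    result = ::waitpid(pid, &status, 0);
  } while (result == -1 && errno == EINTR);
  return result == pid ? status : -1;
}
```

For example, spawning `test "$FOO" = bar` with `{"FOO", "bar"}` in the map should exit with status 0, while the same command with an empty map should exit with status 1, because the child sees only the variables it was given.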
Re: Review Request 17306: Added an asynchronous subprocess utility.
On Jan. 24, 2014, 10:39 p.m., Ian Downes wrote: 3rdparty/libprocess/include/process/subprocess.hpp, line 133 https://reviews.apache.org/r/17306/diff/1/?file=447773#file447773line133 I don't think strlen() is async-signal-safe. I'd like to use it for convenience. My rationale is three-fold: 1. I don't think there are any known implementations of strlen that are not async-signal-safe. Given that strlen only counts the length of its string argument, I cannot think of a sane implementation that would depend on global state. 2. There have been requests to add it to the list of known async-signal-safe functions, given 1 above. http://austingroupbugs.net/view.php?id=692 3. It is highly likely that both clang and gcc will optimize out the need for strlen in these cases since the result can be determined at compile-time. What do you think? I could be sold either way, but I'd like to keep the code clean and it seems a safe assumption to make. - Ben --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/#review32758 --- On Jan. 24, 2014, 7:06 a.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/ --- (Updated Jan. 24, 2014, 7:06 a.m.) Review request for mesos, Benjamin Hindman, Ian Downes, and Jie Yu. Bugs: MESOS-943 https://issues.apache.org/jira/browse/MESOS-943 Repository: mesos-git Thanks, Ben Mahler
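If relying on strlen() between fork() and exec() turns out to be a sticking point, one alternative is to take the length from the array type of the string literal, so no library call is needed at all. This is only a sketch of that idea; `literalLength` and `writeLiteral` are hypothetical names and do not appear in the patch.

```cpp
#include <cassert>
#include <cstddef>

#include <unistd.h>

// Length of a string literal, taken from its array type at compile time.
// The trailing '\0' is excluded.
template <std::size_t N>
constexpr std::size_t literalLength(const char (&)[N])
{
  return N - 1;
}

// Hypothetical helper: writes a string literal to `fd` using only write(2),
// which is on the POSIX list of async-signal-safe functions.
// Returns whatever write() returns.
template <std::size_t N>
ssize_t writeLiteral(int fd, const char (&message)[N])
{
  return ::write(fd, message, N - 1);
}
```

In the child, `writeLiteral(STDERR_FILENO, "Failed to exec\n")` would then emit the message without calling strlen(). The trade-off is that this only works for literals and arrays of known size, not for arbitrary `const char*` values.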
Re: Review Request 17306: Added an asynchronous subprocess utility.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17306/ --- (Updated Jan. 27, 2014, 10:42 p.m.) Review request for mesos, Benjamin Hindman and Vinod Kone. Changes --- Addressed review comments. Primarily, this now ensures that the subprocess has terminated before we close the file descriptors: the Subprocess is bound into the reap callback, so a copy of the Subprocess remains for as long as the process is running. Bugs: MESOS-943 https://issues.apache.org/jira/browse/MESOS-943 Repository: mesos-git Diffs (updated) - 3rdparty/libprocess/Makefile.am 40f01a7b3803696ccca440c8326e1d6d7c377459 3rdparty/libprocess/include/process/subprocess.hpp PRE-CREATION 3rdparty/libprocess/src/tests/subprocess_tests.cpp PRE-CREATION Diff: https://reviews.apache.org/r/17306/diff/ Testing --- Tests were added and run in repetition. Thanks, Ben Mahler
Re: Review Request 17305: Update the slave to use the libprocess Reaper.
On Jan. 24, 2014, 7:47 a.m., Benjamin Hindman wrote: src/launcher/executor.cpp, line 298 https://reviews.apache.org/r/17305/diff/1/?file=447761#file447761line298 I think it makes sense to have a global reaper just like we do with statistics. Having multiple libprocess processes each wake up every second to call waitpid is a bit wasteful. Ben Mahler wrote: That's true, at the same time we would be doing a 1 second no-op event loop on all libprocess binaries, unless we added the ability to start/stop the Reaper. I was thinking of a library-style API: namespace process { Future<Option<int>> reap(pid_t); } This would use a global Reaper under the hood, I'll be updating the diffs with this cleaner interface. Ben Mahler wrote: One side-effect of the global reaper that I discovered while testing is that when running the reaping-related tests in repetition, they gradually become slower. This is because each test that advances time for the global reaper to run its wait() loop pushes the next wait() loop further into the future, requiring more and more calls to Clock::advance + Clock::settle. Before these changes, since a Reaper was re-constructed across each test and each component, we were immune to this. I'll leave a note about this in the code. We can fix this if we were to implement the approach where we have a thread per pid, each blocking on waitpid(). Why isn't resuming the clock sufficient for this? - Benjamin --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17305/#review32706 --- On Jan. 24, 2014, 7:05 a.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17305/ --- (Updated Jan. 24, 2014, 7:05 a.m.) Review request for mesos, Benjamin Hindman, Ian Downes, and Jie Yu. 
Bugs: MESOS-943 https://issues.apache.org/jira/browse/MESOS-943 Repository: mesos-git Description --- This removes the Mesos Reaper in preference of using the libprocess Reaper, which no longer reaps non-monitored processes. Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/launcher/executor.cpp b73ab479500a7347a38ba53acecfab9229f1080d src/slave/cgroups_isolator.hpp e86062e9abaaa263c32c55e9dbfefd700f605886 src/slave/process_isolator.hpp 4ae093fe65775a2b9bec42071961dd58aa0c3d8b src/slave/reaper.hpp 9a31c754475ecbce5299d8f18f38253c542404e5 src/slave/reaper.cpp 5eabbc3911584cf47c353bcf4ca660c47c2c17be src/tests/environment.cpp 6edce4552ef9a12b7b58cefea97ebacc9224ab04 src/tests/reaper_tests.cpp 608ec0eff4eaae115d75621937a39b22e3bdb068 src/tests/slave_recovery_tests.cpp 5a4c4fc4f687a37409d1afbda4c0d07fcdc3a4c7 Diff: https://reviews.apache.org/r/17305/diff/ Testing --- make check I've also added an orphan check in the testing environment tear down. Thanks, Ben Mahler
[jira] [Updated] (MESOS-496) Refactor the MasterDetector (with 'bool contend' option) into new MasterDetector and MasterContender.
[ https://issues.apache.org/jira/browse/MESOS-496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-496: - Fix Version/s: 0.16.0 Refactor the MasterDetector (with 'bool contend' option) into new MasterDetector and MasterContender. - Key: MESOS-496 URL: https://issues.apache.org/jira/browse/MESOS-496 Project: Mesos Issue Type: Improvement Components: general Affects Versions: 0.14.0 Reporter: Yan Xu Assignee: Yan Xu Labels: twitter Fix For: 0.16.0 The MasterDetector code still uses old Zookeeper abstractions and it's error-prone for maintenance and further development. There exists newer and higher-level zookeeper group management abstractions such as zookeeper::Group. After this refactoring Mesos master detecting and contending logic should reside in separate classes which depend on general Zookeeper group leader detection and contention abstractions with Future-pattern APIs. Masters should then use the MasterContender while Slaves and Schedulers should use the MasterDetector. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (MESOS-496) Refactor the MasterDetector (with 'bool contend' option) into new MasterDetector and MasterContender.
[ https://issues.apache.org/jira/browse/MESOS-496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-496: - Issue Type: Story (was: Improvement) Refactor the MasterDetector (with 'bool contend' option) into new MasterDetector and MasterContender. - Key: MESOS-496 URL: https://issues.apache.org/jira/browse/MESOS-496 Project: Mesos Issue Type: Story Components: general Affects Versions: 0.14.0 Reporter: Yan Xu Assignee: Yan Xu Labels: twitter Fix For: 0.16.0 The MasterDetector code still uses old Zookeeper abstractions and it's error-prone for maintenance and further development. There exists newer and higher-level zookeeper group management abstractions such as zookeeper::Group. After this refactoring Mesos master detecting and contending logic should reside in separate classes which depend on general Zookeeper group leader detection and contention abstractions with Future-pattern APIs. Masters should then use the MasterContender while Slaves and Schedulers should use the MasterDetector. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17427: Handled EINTR in os::close.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/#review32908 --- Ship it! Ship It! - Benjamin Hindman On Jan. 27, 2014, 10:16 p.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/ --- (Updated Jan. 27, 2014, 10:16 p.m.) Review request for mesos, Benjamin Hindman, Ian Downes, and Vinod Kone. Repository: mesos-git Description --- See above, note that GNU offers TEMP_FAILURE_RETRY as a macro for this. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp bba6f43eaeba0238a5db6e388902d92eb18f14f5 Diff: https://reviews.apache.org/r/17427/diff/ Testing --- make check Thanks, Ben Mahler
Re: Review Request 17304: Added a child Reaper utility in libprocess.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17304/ --- (Updated Jan. 27, 2014, 11:25 p.m.) Review request for mesos, Benjamin Hindman and Vinod Kone. Changes --- This now is exposed as a library utility, rather than a Reaper object: Future<Option<int>> process::reap(pid_t); Which uses a lazily initialized global reaper. This is more efficient, however, the reap related tests take increasingly long when run in repetition due to the Reaper's wait() loop getting scheduled further and further into the future (more calls to Clock::advance / Clock::settle become necessary as the tests run in repetition). Bugs: MESOS-943 https://issues.apache.org/jira/browse/MESOS-943 Repository: mesos-git Description --- This is a copy of the Reaper from Mesos, with one key difference: This now only reaps pids that were monitored explicitly. The test needed updating to reflect the fact that we can now retrieve the exit status for an already exited process. This also opens up the opportunity to eliminate the 1 second delay in the Reaper (using threads or SIGCHLD), but this is a copy of the Mesos Reaper for now. Diffs (updated) - 3rdparty/libprocess/Makefile.am 40f01a7b3803696ccca440c8326e1d6d7c377459 3rdparty/libprocess/include/process/reap.hpp PRE-CREATION 3rdparty/libprocess/src/reap.cpp PRE-CREATION 3rdparty/libprocess/src/tests/reap_tests.cpp PRE-CREATION Diff: https://reviews.apache.org/r/17304/diff/ Testing --- make check Thanks, Ben Mahler
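The key difference described above, reaping only explicitly monitored pids, can be sketched by calling waitpid() on each monitored pid with WNOHANG instead of waiting on any child. The class below is a simplified, single-threaded stand-in written for illustration: `MonitoredReaper`, `monitor`, `poll` and `pending` are hypothetical names, callbacks replace libprocess futures, and it handles direct children only, so it should not be read as the actual reap.cpp implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a reaper that leaves unmonitored children alone.
class MonitoredReaper
{
public:
  // Registers interest in `pid`; `callback` receives the wait status.
  void monitor(pid_t pid, std::function<void(int)> callback)
  {
    callbacks[pid] = std::move(callback);
  }

  // Performs one non-blocking pass over the monitored pids and returns how
  // many were reaped. A real reaper would schedule this pass periodically.
  std::size_t poll()
  {
    std::size_t reaped = 0;
    for (auto it = callbacks.begin(); it != callbacks.end();) {
      int status = 0;
      // Wait on this specific pid rather than -1, so children that nobody
      // asked about are never reaped behind their owner's back.
      pid_t result = ::waitpid(it->first, &status, WNOHANG);
      if (result == it->first) {
        it->second(status);
        it = callbacks.erase(it);
        ++reaped;
      } else {
        ++it;
      }
    }
    return reaped;
  }

  std::size_t pending() const { return callbacks.size(); }

private:
  std::map<pid_t, std::function<void(int)>> callbacks;
};
```

Because an exited child remains a zombie until it is waited on, a pid registered after the process has already exited is still reaped on the next pass, which matches the note that the exit status of an already exited process can now be retrieved.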
Re: Review Request 17305: Update the slave to use the libprocess Reaper.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17305/ --- (Updated Jan. 27, 2014, 11:25 p.m.) Review request for mesos, Benjamin Hindman and Vinod Kone. Changes --- This now uses process::reap(). Bugs: MESOS-943 https://issues.apache.org/jira/browse/MESOS-943 Repository: mesos-git Description --- This removes the Mesos Reaper in preference of using the libprocess Reaper, which no longer reaps non-monitored processes. Diffs (updated) - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/launcher/executor.cpp b73ab479500a7347a38ba53acecfab9229f1080d src/slave/cgroups_isolator.hpp e86062e9abaaa263c32c55e9dbfefd700f605886 src/slave/cgroups_isolator.cpp 80155a3f0cd684a6aad9a37f61c5204d86032735 src/slave/process_isolator.hpp 4ae093fe65775a2b9bec42071961dd58aa0c3d8b src/slave/process_isolator.cpp 0bc698f04f7c8eaad166dc9d646e13310129dd01 src/slave/reaper.hpp 9a31c754475ecbce5299d8f18f38253c542404e5 src/slave/reaper.cpp 5eabbc3911584cf47c353bcf4ca660c47c2c17be src/tests/environment.cpp 6edce4552ef9a12b7b58cefea97ebacc9224ab04 src/tests/reaper_tests.cpp 608ec0eff4eaae115d75621937a39b22e3bdb068 src/tests/slave_recovery_tests.cpp 5a4c4fc4f687a37409d1afbda4c0d07fcdc3a4c7 Diff: https://reviews.apache.org/r/17305/diff/ Testing --- make check I've also added an orphan check in the testing environment tear down. Thanks, Ben Mahler
[jira] [Commented] (MESOS-930) Provide slave-executor protocol
[ https://issues.apache.org/jira/browse/MESOS-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883520#comment-13883520 ] Benjamin Mahler commented on MESOS-930: --- I think we should step back a bit and think more generally about our goals around language bindings, and then think about ways we can achieve them. Pure bindings that contain the driver logic may have disadvantages as well, mostly around having to manage the driver related logic across all of the bindings. This discussion started due to the issues with our current implementation, that is, the Python bindings may be incompatible with some Python event / threading models. Let's step back and think about goals first, a design doc or proposal with alternatives would be great here! Goals (anything I'm missing?): 1. Make it easy to add bindings for a particular language. 2. Allow for bindings to follow an idiomatic programming model for the particular language. 3. Ensure that the bindings can be maintainable: can they be updated easily? Can they be bundled as supported and tested with Mesos? I think with the suggestion in this ticket, we're neglecting (1) and (3). (1) because the bindings will need to re-implement the driver functionality (this will become increasingly difficult if/when the scheduler driver becomes stateful). (3) because whenever we want to add driver-side implementation, we would need to ensure all of the bindings are updated. With our current implementation, (1) and (2) are neglected. (1) because some languages do not have a bridge to C++. (2) because the API is synchronous and we use libev which might interfere with other languages that use an event model. With these things in mind, we may want to consider an alternative route, where we continue to have C++ drivers, but rather than requiring a bridge through C++, we also offer the option of inter-process communication through pipes or sockets. 
This way a language binding could opt to run the driver as a subprocess and communicate with it without needing the C++ bridge. This would look identical to running an embedded driver as done today, but with a different flag passed when constructing the driver. What do you think? Do you want to drive a design doc / proposal with alternatives for this? Provide slave-executor protocol - Key: MESOS-930 URL: https://issues.apache.org/jira/browse/MESOS-930 Project: Mesos Issue Type: Improvement Components: general Reporter: Nikita Vetoshkin Priority: Minor This ticket is the result of the discussion started on the mailing list (http://www.mail-archive.com/dev@mesos.apache.org/msg05477.html). It would be great if Mesos provided a protocol for slave-executor communication in addition to the currently provided C++-based language bindings. Documenting the wire protocol could open ways to implement Executors in pure Python or golang or any other language. It could provide some benefits: * in Python one could use gevent, which is pretty popular * golang has its own event loop built in * a pure language binding could save a lot of trouble bridging with unfriendly C++ * building and using a pure language client could be performed using native tools like `pip install` or `go get` without the need to establish a C++ dev environment. Before moving any further we need to decide if this is actually a good thing to do. According to the discussion on the mailing list, it looks like it's worth doing. So the next logical thing is to decide: * should the protocol be something utterly new * should we document the current protocol used by libprocess * should the libprocess protocol be brushed up a little before documenting it for external implementation While waiting for discussion I'd like to start documenting the current protocol. Where can one do it in a way suitable for comments? RB? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- Review request for mesos. Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Diffs - src/webui/master/static/config.html PRE-CREATION src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 27, 2014, 11:47 p.m.) Review request for mesos. Repository: mesos-git Description (updated) --- Enabled configuration of the mesos master from the UI. This solution is to help out with development and testing of the UI irrespective of Mesos' version. Diffs - src/webui/master/static/config.html PRE-CREATION src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 27, 2014, 11:50 p.m.) Review request for mesos. Repository: mesos-git Description (updated) --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs (updated) - src/webui/master/static/config.html PRE-CREATION src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17427: Handled EINTR in os::close.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/#review32914 --- According to Linus the file descriptor has been closed even if EINTR is returned and it should not be closed again. Later in the thread he explicitly says that the glibc macro is incorrect. http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-09/3000.html Posix documentation says the state of the file descriptor is unspecified on EINTR. - Ian Downes On Jan. 27, 2014, 10:16 p.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/ --- (Updated Jan. 27, 2014, 10:16 p.m.) Review request for mesos, Benjamin Hindman, Ian Downes, and Vinod Kone. Repository: mesos-git Description --- See above, note that GNU offers TEMP_FAILURE_RETRY as a macro for this. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp bba6f43eaeba0238a5db6e388902d92eb18f14f5 Diff: https://reviews.apache.org/r/17427/diff/ Testing --- make check Thanks, Ben Mahler
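The point made in this review can be sketched as a close wrapper that deliberately does not retry on EINTR. It assumes the Linux behavior described in the linked thread, where the descriptor is already released when close() reports EINTR; since POSIX leaves the descriptor state unspecified, other platforms may behave differently. `closeOnce` is a hypothetical name, not the os::close from stout.

```cpp
#include <cassert>
#include <cerrno>

#include <unistd.h>

// Hypothetical wrapper: calls close() exactly once and never retries.
// Returns true if the descriptor should be treated as closed.
bool closeOnce(int fd)
{
  if (::close(fd) == 0) {
    return true;
  }

  // Assumption (Linux): the descriptor has been released even though the
  // call was interrupted. Retrying could close an unrelated descriptor that
  // another thread opened in the meantime under the same number, so EINTR
  // is reported as "closed" rather than looped on.
  return errno == EINTR;
}
```

This is why wrapping close() in a retry macro such as TEMP_FAILURE_RETRY is risky in a multi-threaded program: the second attempt may act on a descriptor that no longer belongs to the caller.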
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 27, 2014, 11:55 p.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs (updated) - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17427: Handled EINTR in os::close.
On Jan. 27, 2014, 11:54 p.m., Ian Downes wrote: According to Linus the file descriptor has been closed even if EINTR is returned and it should not be closed again. Later in the thread he explicitly says that the glibc macro is incorrect. http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-09/3000.html Posix documentation says the state of the file descriptor is unspecified on EINTR. [Incorrect in the sense that it shouldn't be used to wrap a close()] - Ian --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/#review32914 --- On Jan. 27, 2014, 10:16 p.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/ --- (Updated Jan. 27, 2014, 10:16 p.m.) Review request for mesos, Benjamin Hindman, Ian Downes, and Vinod Kone. Repository: mesos-git Description --- See above, note that GNU offers TEMP_FAILURE_RETRY as a macro for this. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp bba6f43eaeba0238a5db6e388902d92eb18f14f5 Diff: https://reviews.apache.org/r/17427/diff/ Testing --- make check Thanks, Ben Mahler
Re: Review Request 17427: Handled EINTR in os::close.
On Jan. 27, 2014, 11:54 p.m., Ian Downes wrote: According to Linus the file descriptor has been closed even if EINTR is returned and it should not be closed again. Later in the thread he explicitly says that the glibc macro is incorrect. http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-09/3000.html Posix documentation says the state of the file descriptor is unspecified on EINTR. Ian Downes wrote: [Incorrect in the sense that it shouldn't be used to wrap a close()] Yikes, thank you for finding this Ian! - Ben --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/#review32914 --- On Jan. 27, 2014, 10:16 p.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17427/ --- (Updated Jan. 27, 2014, 10:16 p.m.) Review request for mesos, Benjamin Hindman, Ian Downes, and Vinod Kone. Repository: mesos-git Description --- See above, note that GNU offers TEMP_FAILURE_RETRY as a macro for this. Diffs - 3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp bba6f43eaeba0238a5db6e388902d92eb18f14f5 Diff: https://reviews.apache.org/r/17427/diff/ Testing --- make check Thanks, Ben Mahler
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32919 --- src/webui/master/static/config.html https://reviews.apache.org/r/17431/#comment61949 Add a `class="modal-title"` here to remove the bottom/top margin from the heading. That comes from core Bootstrap: http://getbootstrap.com/javascript/#modals src/webui/master/static/config.html https://reviews.apache.org/r/17431/#comment61950 Can this describe the format? Maybe Master URL src/webui/master/static/config.html https://reviews.apache.org/r/17431/#comment61951 Needs an ID matching the `for` attribute of the associated label: id="masterHost" - Ross Allen On Jan. 27, 2014, 11:56 p.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 27, 2014, 11:56 p.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32921 --- src/webui/master/static/config.html https://reviews.apache.org/r/17431/#comment61953 Also, it might be useful to set the current hostname as the `placeholder` so it doesn't look empty. It would be something like `placeholder="http://hostname.of.mesos:5050"` - Ross Allen On Jan. 27, 2014, 11:56 p.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 27, 2014, 11:56 p.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32922 --- src/webui/master/static/index.html https://reviews.apache.org/r/17431/#comment61954 Anchors don't get the pointer cursor unless they have an `href` attribute. This might be better as a button anyway since it's not nav, something like <button type="button" class="btn btn-link" ng-click> <i class="glyphicon glyphicon-cog"></i> </button> - Ross Allen On Jan. 27, 2014, 11:56 p.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 27, 2014, 11:56 p.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
[jira] [Created] (MESOS-951) Build failure: in log/catchup.cpp on Clang
Bernd Mathiske created MESOS-951:

Summary: Build failure: in log/catchup.cpp on Clang
Key: MESOS-951
URL: https://issues.apache.org/jira/browse/MESOS-951
Project: Mesos
Issue Type: Bug
Components: build
Affects Versions: 0.17.0
Environment: Mac OS X 10.9, Clang (not gcc)
Reporter: Bernd Mathiske
Priority: Minor

./bootstrap; ./configure; make clean; make

Build fails with this output near the end:

libtool: compile: g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"0.17.0\" "-DPACKAGE_STRING=\"mesos 0.17.0\"" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" -DVERSION=\"0.17.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_PTHREAD=1 -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 -DHAVE_LIBSASL2=1 -I. -Wall -Werror -DLIBDIR=\"/usr/local/lib\" -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" -DPKGDATADIR=\"/usr/local/share/mesos\" -I../include -I../3rdparty/libprocess/include -I../3rdparty/libprocess/3rdparty/stout/include -I../include -I../3rdparty/libprocess/3rdparty/boost-1.53.0 -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src -I../3rdparty/zookeeper-3.4.5/src/c/include -I../3rdparty/zookeeper-3.4.5/src/c/generated -D_THREAD_SAFE -DGTEST_USE_OWN_TR1_TUPLE=1 -g -g2 -O2 -std=c++11 -stdlib=libc++ -MT messages/libmesos_no_3rdparty_la-messages.pb.lo -MD -MP -MF messages/.deps/libmesos_no_3rdparty_la-messages.pb.Tpo -c messages/messages.pb.cc -o messages/libmesos_no_3rdparty_la-messages.pb.o >/dev/null 2>&1

log/catchup.cpp:225:28: error: no viable conversion from '__bind<bool (process::Future<unsigned long long>::*)(), process::Future<unsigned long long> &>' to 'const lambda::function<void ()>'
    Timer::create(timeout, lambda::bind(&Future<uint64_t>::discard, catching));
                           ^~
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/c++/v1/functional:1137:5: note: candidate constructor not viable: no known conversion from '__bind<bool (process::Future<unsigned long long>::*)(), process::Future<unsigned long long> &>' to 'nullptr_t' for 1st argument
    function(nullptr_t) _NOEXCEPT : __f_(0) {}
    ^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/c++/v1/functional:1138:5: note: candidate constructor not viable: no known conversion from '__bind<bool (process::Future<unsigned long long>::*)(), process::Future<unsigned long long> &>' to 'const std::__1::function<void ()> &' for 1st argument
    function(const function&);
    ^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/c++/v1/functional:1139:5: note: candidate constructor not viable: no known conversion from '__bind<bool (process::Future<unsigned long long>::*)(), process::Future<unsigned long long> &>' to 'std::__1::function<void ()> &&' for 1st argument
    function(function&&) _NOEXCEPT;
    ^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/c++/v1/functional:1142:35: note: candidate template ignored: disabled by 'enable_if' [with _Fp = std::__1::__bind<bool (process::Future<unsigned long long>::*)(), process::Future<unsigned long long> &>]
    typename enable_if<__callable<_Fp>::value>::type* = 0);
    ^
../3rdparty/libprocess/include/process/timer.hpp:22:43: note: passing argument to parameter 'thunk' here
    const lambda::function<void(void)>& thunk);

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
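The diagnostic above boils down to a bound member function that returns `bool` being handed to a parameter that expects a `void()` callable, a conversion that this libc++ version rejected. As a minimal sketch of one possible workaround (not necessarily the change that was committed), the call can be wrapped in a lambda that throws the result away. `FakeFuture`, `makeDiscardThunk` and `runDiscardExample` are hypothetical names invented for illustration, and `std::function` is used here on the assumption that `lambda::function` aliases it in C++11 builds.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for process::Future<T>; only discard() is modeled.
struct FakeFuture {
  bool discarded = false;

  // Like the member function in the error message, this returns bool.
  bool discard() {
    if (discarded) {
      return false;  // Already discarded, nothing to do.
    }
    discarded = true;
    return true;
  }
};

// Builds a void() thunk that calls discard() and ignores its result,
// so std::function is never asked to convert a bool result to void.
// The caller must keep `future` alive for as long as the thunk is used.
std::function<void()> makeDiscardThunk(FakeFuture& future) {
  return [&future]() { (void) future.discard(); };
}

// Small usage example: returns true if running the thunk discarded the future.
bool runDiscardExample() {
  FakeFuture catching;
  std::function<void()> thunk = makeDiscardThunk(catching);
  assert(!catching.discarded);  // Nothing happens until the thunk runs.
  thunk();
  return catching.discarded;
}
```

A free helper function with a `void` return type would serve the same purpose on toolchains without lambda support.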
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32924 --- src/webui/master/static/config.html https://reviews.apache.org/r/17431/#comment61955 Convention in modals is to put the primary action in the bottom right, I think these should be switched. - Ross Allen
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 28, 2014, 12:34 a.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs (updated) - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32925 --- src/webui/master/static/js/app.js https://reviews.apache.org/r/17431/#comment61956 Can the timestamp config be moved into this object in local storage as well to use just one root key for user configuration? - Ross Allen
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32927 --- Screenshots? :) If we were to have a way to spin up the webui standalone, perhaps that's the only time we would want to expose this? - Ben Mahler
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 28, 2014, 12:52 a.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs (updated) - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 28, 2014, 12:53 a.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments (updated) Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
On Jan. 28, 2014, 12:45 a.m., Ben Mahler wrote: Screenshots? :) If we were to have a way to spin up the webui standalone, perhaps that's the only time we would want to expose this? I suspect that you're right. The impetus for the feature is having the UI be standalone at least. In general however, there's now a method for allowing local config (Ross' point about timestamps is a great one, I'll move those over). That doesn't mean that we couldn't hide the config menu however. At the moment, I'm treating this as a debug feature more than anything else. For a real standalone solution, I'd personally like to have something a little more usable. - Thomas --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32927 ---
Re: Review Request 17305: Update the slave to use the libprocess Reaper.
On Jan. 24, 2014, 7:47 a.m., Benjamin Hindman wrote: src/launcher/executor.cpp, line 298 https://reviews.apache.org/r/17305/diff/1/?file=447761#file447761line298 I think it makes sense to have a global reaper just like we do with statistics. Having multiple libprocess processes be delaying every 1 second to call waitpid is a bit wasteful. Ben Mahler wrote: That's true, at the same time we would be doing a 1 second no-op event loop on all libprocess binaries, unless we added the ability to start/stop the Reaper. I was thinking of a library-style API: `namespace process { Future<Option<int>> reap(pid_t); }` This would use a global Reaper under the hood, I'll be updating the diffs with this cleaner interface. Ben Mahler wrote: One side-effect of the global reaper that I discovered while testing is that when running the reaping related tests in repetition, they gradually become slower. This is because each test that advances time for the global reaper to run its wait() loop pushes the next wait() loop further into the future, requiring more and more calls to Clock::advance + Clock::settle. Before these changes, since a Reaper was re-constructed across each test and each component, we were immune to this. I'll leave a note about this in the code. We can fix this if we were to implement the approach where we have a thread per pid, each blocking on waitpid(). Benjamin Hindman wrote: Why isn't resuming the clock sufficient for this? Ben Mahler wrote: Because the global ReaperProcess lives across the tests. From its point of view, it is scheduling every 1 second. However, we're locally pausing and advancing to force a schedule of the wait() function. Once wait() is run, it will get re-scheduled 1 second further from when it was run **in the advanced time-frame**. So the next test that wants to pause and advance the clock, now must reach ~2 seconds into the advanced time-frame rather than 1 second. 
Resuming would be sufficient if it made time appear to continue in the advanced time frame, but it will actually result in time going _back_ to what it would have been had the clock never been paused in the first place. Sorry this is a bit tricky to explain clearly. Let me know if I'm missing something or if it's still not clear. This is actually causing some of the mesos tests to be flaky as well. Perhaps I will revert back to a Reaper per-pid and leave TODOs for cleaning this up. - Ben --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17305/#review32706 --- On Jan. 27, 2014, 11:25 p.m., Ben Mahler wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17305/ --- (Updated Jan. 27, 2014, 11:25 p.m.) Review request for mesos, Benjamin Hindman and Vinod Kone. Bugs: MESOS-943 https://issues.apache.org/jira/browse/MESOS-943 Repository: mesos-git Description --- This removes the Mesos Reaper in preference of using the libprocess Reaper, which no longer reaps non-monitored processes. Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/launcher/executor.cpp b73ab479500a7347a38ba53acecfab9229f1080d src/slave/cgroups_isolator.hpp e86062e9abaaa263c32c55e9dbfefd700f605886 src/slave/cgroups_isolator.cpp 80155a3f0cd684a6aad9a37f61c5204d86032735 src/slave/process_isolator.hpp 4ae093fe65775a2b9bec42071961dd58aa0c3d8b src/slave/process_isolator.cpp 0bc698f04f7c8eaad166dc9d646e13310129dd01 src/slave/reaper.hpp 9a31c754475ecbce5299d8f18f38253c542404e5 src/slave/reaper.cpp 5eabbc3911584cf47c353bcf4ca660c47c2c17be src/tests/environment.cpp 6edce4552ef9a12b7b58cefea97ebacc9224ab04 src/tests/reaper_tests.cpp 608ec0eff4eaae115d75621937a39b22e3bdb068 src/tests/slave_recovery_tests.cpp 5a4c4fc4f687a37409d1afbda4c0d07fcdc3a4c7 Diff: https://reviews.apache.org/r/17305/diff/ Testing --- make check I've also added an orphan check in the testing environment tear down. Thanks, Ben Mahler
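The scheduling drift described in this thread can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyReaperTimer` and `advancesNeeded` are hypothetical names, nothing here uses the real libprocess Clock or Reaper, and every test is assumed to pause the clock at roughly the same real time (0 seconds) because tests run back to back.

```cpp
#include <cassert>
#include <vector>

// Toy model (not the libprocess Clock): a long-lived periodic timer whose
// next deadline is kept in the "advanced" time frame across tests.
struct ToyReaperTimer {
  double nextDeadline = 1.0;  // The first wait() is due 1 second after start.

  // Simulates one test: the clock is paused at `pausedAt` and advanced just
  // far enough to trigger wait(). Returns how far the test had to advance.
  double runTest(double pausedAt) {
    double needed = nextDeadline - pausedAt;
    if (needed < 0) {
      needed = 0;
    }
    double firedAt = pausedAt + needed;
    // wait() re-schedules itself 1 second after it ran, measured in the
    // advanced frame; resuming the clock does not pull this deadline back.
    nextDeadline = firedAt + 1.0;
    return needed;
  }
};

// Runs `tests` consecutive tests against one shared timer and records the
// advance each test needed before wait() fired.
std::vector<double> advancesNeeded(int tests) {
  assert(tests >= 0);
  ToyReaperTimer timer;
  std::vector<double> result;
  for (int i = 0; i < tests; ++i) {
    result.push_back(timer.runTest(0.0));
  }
  return result;
}
```

In this model the required advance grows by one second per test (1, 2, 3, ...), which matches the gradual slowdown reported for repeated reaper tests. Constructing a fresh timer per test resets the deadline, so the growth disappears.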
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
On Jan. 28, 2014, 12:45 a.m., Ben Mahler wrote: Screenshots? :) If we were to have a way to spin up the webui standalone, perhaps that's the only time we would want to expose this? Thomas Rampelberg wrote: I suspect that you're right. The impetus for the feature is having the UI be standalone at least. In general however, there's now a method for allowing local config (Ross' point about timestamps is a great one, I'll move those over). That doesn't mean that we couldn't hide the config menu however. At the moment, I'm treating this as a debug feature more than anything else. For a real standalone solution, I'd personally like to have something a little more usable. This will let anyone test UI changes on production clusters without updating Mesos on those clusters. You can run the UI locally and point to a real cluster. We are curious to see if the latest changes to pagination and filtering work on large clusters. - Ross --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32927 ---
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32931 --- src/webui/master/static/index.html https://reviews.apache.org/r/17431/#comment61958 `location.origin` is copied in a couple places. This should work for now, but we should figure out a more robust config setup for defaults, etc. - Ross Allen
Review Request 17435: Changed catchup.cpp to make it compilable with clang and C++11.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17435/ --- Review request for mesos and Benjamin Hindman. Repository: mesos-git Description --- See summary. Diffs - src/log/catchup.cpp 69fac3c Diff: https://reviews.apache.org/r/17435/diff/ Testing --- clang++-3.3 make check Thanks, Jie Yu
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32932 --- src/webui/master/static/js/controllers.js https://reviews.apache.org/r/17431/#comment61960 This should use the $location service instead of the global window.location. src/webui/master/static/js/controllers.js https://reviews.apache.org/r/17431/#comment61961 Use the Angular $location API if possible instead of the global window.location: http://docs.angularjs.org/api/ng.$location src/webui/master/static/js/controllers.js https://reviews.apache.org/r/17431/#comment61962 $location service here too instead of a global. - Ross Allen
Re: Review Request 17435: Changed catchup.cpp to make it compilable with clang and C++11.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17435/#review32933 --- Ship it! Ship It! - Benjamin Hindman On Jan. 28, 2014, 1:14 a.m., Jie Yu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17435/ --- (Updated Jan. 28, 2014, 1:14 a.m.) Review request for mesos and Benjamin Hindman. Repository: mesos-git Description --- See summary. Diffs - src/log/catchup.cpp 69fac3c Diff: https://reviews.apache.org/r/17435/diff/ Testing --- clang++-3.3 make check Thanks, Jie Yu
Re: Review Request 17435: Changed catchup.cpp to make it compilable with clang and C++11.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17435/ --- (Updated Jan. 28, 2014, 1:16 a.m.) Review request for mesos and Benjamin Hindman. Bugs: MESOS-951 https://issues.apache.org/jira/browse/MESOS-951 Repository: mesos-git Description --- See summary. Diffs - src/log/catchup.cpp 69fac3c Diff: https://reviews.apache.org/r/17435/diff/ Testing --- clang++-3.3 make check Thanks, Jie Yu
[jira] [Commented] (MESOS-951) Build failure: in log/catchup.cpp on Clang
[ https://issues.apache.org/jira/browse/MESOS-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883609#comment-13883609 ] Benjamin Hindman commented on MESOS-951: Thanks for reporting Bernd! Build failure: in log/catchup.cpp on Clang -- Key: MESOS-951 URL: https://issues.apache.org/jira/browse/MESOS-951 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.17.0 Environment: Mac OS X 10.9, Clang (not gcc) Reporter: Bernd Mathiske Assignee: Jie Yu Priority: Minor Fix For: 0.17.0 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (MESOS-951) Build failure: in log/catchup.cpp on Clang
[ https://issues.apache.org/jira/browse/MESOS-951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman resolved MESOS-951. Resolution: Fixed Fix Version/s: 0.17.0 Assignee: Jie Yu https://reviews.apache.org/r/17435 Build failure: in log/catchup.cpp on Clang -- Key: MESOS-951 URL: https://issues.apache.org/jira/browse/MESOS-951 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.17.0 Environment: Mac OS X 10.9, Clang (not gcc) Reporter: Bernd Mathiske Assignee: Jie Yu Priority: Minor Fix For: 0.17.0 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (MESOS-948) Docs for Java interfaces
[ https://issues.apache.org/jira/browse/MESOS-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Connor Doyle resolved MESOS-948. Resolution: Duplicate Docs for Java interfaces Key: MESOS-948 URL: https://issues.apache.org/jira/browse/MESOS-948 Project: Mesos Issue Type: Documentation Components: documentation, framework, java api Reporter: Connor Doyle Labels: duplicate There is some great user documentation embedded in the Java API sources (see: https://github.com/apache/mesos/tree/master/src/java/src/org/apache/mesos/ ). Unfortunately, these are hard to find. It would be helpful to publish the generated Javadocs and link to them from the Mesos site. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 28, 2014, 1:44 a.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs (updated) - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Thanks, Thomas Rampelberg
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/#review32935 --- src/Makefile.am https://reviews.apache.org/r/17431/#comment61972 Oh hey, this should be config.html not config/html right? - Ross Allen On Jan. 28, 2014, 1:44 a.m., Thomas Rampelberg wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 28, 2014, 1:44 a.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Thanks, Thomas Rampelberg
[jira] [Created] (MESOS-952) Clock::resume should adjust timeouts that were created in a paused/advanced Clock context.
Benjamin Mahler created MESOS-952: - Summary: Clock::resume should adjust timeouts that were created in a paused/advanced Clock context. Key: MESOS-952 URL: https://issues.apache.org/jira/browse/MESOS-952 Project: Mesos Issue Type: Bug Reporter: Benjamin Mahler

When timeouts are created while the Clock is paused and advanced into the future, these must be adjusted once we resume the Clock. For example:

    Process A {
      initialize() { loop(); }
      loop() { delay(Seconds(1), loop); }
    }

    // T = 0
    Clock::pause();
    Clock::advance(Seconds(1)); // T = 1
    Clock::settle(); // The loop timeout will be expired, loop() is called.
                     // loop is scheduled for T = 2
    Clock::advance(Seconds(1)); // T = 2
    Clock::settle(); // The loop timeout will be expired, loop() is called.
                     // loop is scheduled for T = 3
    Clock::resume(); // T = 0 once again (assume ~ no real time has elapsed)
                     // Now loop will not be called for 3 seconds, until T = 3.
                     // Instead, we expect loop to be called at T = 1.

The semantics here can be quite tricky so please let me know if I'm missing something. It seems we should be adjusting the timers when resuming an advanced clock. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
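The scenario in MESOS-952 can be modelled with a toy clock. This is only a sketch under stated assumptions: `ToyClock`, its members and `secondsUntilLoopFires` are hypothetical names for illustration and are not libprocess APIs, and the model only considers timers that were created while the clock was paused and advanced.

```cpp
#include <cassert>
#include <vector>

// Toy model only; these names are hypothetical, not libprocess APIs.
struct ToyClock {
  double real = 0.0;      // Stand-in for wall-clock time, in seconds.
  double advanced = 0.0;  // Total amount added via advance() while paused.
  bool paused = false;
  std::vector<double> deadlines;  // Absolute expiry times of pending timers.

  // While paused, "now" is the real time plus whatever was advanced.
  double now() const { return paused ? real + advanced : real; }

  void pause() { paused = true; advanced = 0.0; }
  void advance(double secs) { advanced += secs; }

  // Schedules a timer 'secs' after the clock's current notion of now.
  void delay(double secs) { deadlines.push_back(now() + secs); }

  // Resumes the clock. With 'adjust' set, timers created in the advanced
  // context are shifted back by the advanced amount (the proposed fix).
  void resume(bool adjust) {
    if (adjust) {
      for (double& deadline : deadlines) {
        deadline -= advanced;
      }
    }
    paused = false;
    advanced = 0.0;
  }
};

// Replays the steps from the ticket and returns how long after resume()
// the pending loop timer would fire.
double secondsUntilLoopFires(bool adjust) {
  ToyClock clock;
  clock.pause();         // T = 0
  clock.advance(1);      // T = 1, loop() runs.
  clock.advance(1);      // T = 2, loop() runs again.
  clock.delay(1);        // loop is now scheduled for T = 3.
  clock.resume(adjust);  // T = 0 once again.
  return clock.deadlines.back() - clock.now();
}
```

In this model `secondsUntilLoopFires(false)` gives the 3 second wait described in the ticket, while `secondsUntilLoopFires(true)` gives the expected 1 second.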
Re: Review Request 16432: Containerizer - cgroup isolators (part 4)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16432/#review32889 --- Ship it! Looks good, just some minor nits. src/slave/containerizer/isolators/cgroups/cgroups.cpp https://reviews.apache.org/r/16432/#comment61881 Why not assign path::join(root, test) to a variable, if it's going to be used 3 times in this function? Isn't this test already performed in the cgroups_launcher? Do we really need it in both? (I guess that's what your TODO in the launcher is about.) src/slave/containerizer/isolators/cgroups/cgroups.cpp https://reviews.apache.org/r/16432/#comment61883 Do you also want to log the info->cgroup string as above? src/slave/containerizer/isolators/cgroups/cgroups.cpp https://reviews.apache.org/r/16432/#comment61884 s/cgorups/cgroups/ src/slave/containerizer/isolators/cgroups/cpushare.hpp https://reviews.apache.org/r/16432/#comment61886 Shouldn't need both mesos/resources.hpp and mesos/resources.hpp src/slave/containerizer/isolators/cgroups/cpushare.hpp https://reviews.apache.org/r/16432/#comment61930 Please use 'subsystem' instead of 'subsystem_' for the member variable. src/slave/containerizer/isolators/cgroups/cpushare.hpp https://reviews.apache.org/r/16432/#comment61887 Just because cgroups uses cpuacct, does that mean we need to keep that archaic name? I think cpuAccountHierarchy would be easier to read. Or cpuShareHierarchy if you'd rather. src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61890 Go ahead and use local variable foo_ and foo = foo_.get() for cpuacctHierarchy here like you did above for hierarchy_. We prefer not to use this-> to work around shadowed variables. src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61894 Please comment somewhere that CFS stands for Completely Fair Scheduler. My first google search for cfs cgroups suggested Chronic Fatigue Syndrome support groups. 
:P src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61901 isNone() instead of !isSome()? src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61913 No need for all this Option:: nonsense, you can just use = Some(pid);, or even just = pid; src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61917 s/hierarchy/cpuacctHierarchy/ src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61919 First CHECK_NOTNULL(infos[containerId]); ? src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61918 isNone() instead of !isSome() src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61922 Shadows the identical double cpus = resources.cpus().get(); on line 249; you can probably remove this line and use the preexisting 'cpus' variable src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61923 Can you chain these futures together with .then() to ensure that we destroy both of these hierarchies and delete the containerId/info from infos, even if the futures aren't immediately ready? src/slave/containerizer/isolators/cgroups/cpushare.cpp https://reviews.apache.org/r/16432/#comment61926 Newline at end of file src/slave/containerizer/isolators/cgroups/mem.hpp https://reviews.apache.org/r/16432/#comment61925 Apache License statement? 
src/slave/containerizer/isolators/cgroups/mem.hpp https://reviews.apache.org/r/16432/#comment61928 alpha-order src/slave/containerizer/isolators/cgroups/mem.hpp https://reviews.apache.org/r/16432/#comment61931 s/subsystem_/subsystem/ src/slave/containerizer/isolators/cgroups/mem.hpp https://reviews.apache.org/r/16432/#comment61932 If you want Doxygen to pick this up, you should either use triple-slash /// @param foo bar or double-star: /** * Description. * @param foo bar */ src/slave/containerizer/isolators/cgroups/mem.cpp https://reviews.apache.org/r/16432/#comment61933 alpha-order src/slave/containerizer/isolators/cgroups/mem.cpp https://reviews.apache.org/r/16432/#comment61935 isNone src/slave/containerizer/isolators/cgroups/mem.cpp https://reviews.apache.org/r/16432/#comment61941 isNone src/slave/containerizer/isolators/cgroups/mem.cpp https://reviews.apache.org/r/16432/#comment61942 = pid; src/slave/containerizer/isolators/cgroups/mem.cpp https://reviews.apache.org/r/16432/#comment61944 CHECK_NOTNULL(infos[containerId]); first? src/slave/containerizer/isolators/cgroups/mem.cpp
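The two Doxygen comment styles mentioned in the mem.hpp comment above (triple-slash and double-star) might look like this in practice. This is an illustrative sketch only: `joinTripleSlash` and `joinDoubleStar` are hypothetical helpers, not functions from the patch under review.

```cpp
#include <cassert>
#include <string>

/// Joins a hierarchy root and a cgroup name (triple-slash style).
/// @param root Mount point of the hierarchy.
/// @param cgroup Name of the cgroup relative to the root.
/// @return The joined path.
inline std::string joinTripleSlash(
    const std::string& root,
    const std::string& cgroup)
{
  return root + "/" + cgroup;
}

/**
 * Joins a hierarchy root and a cgroup name (double-star style).
 * @param root Mount point of the hierarchy.
 * @param cgroup Name of the cgroup relative to the root.
 * @return The joined path.
 */
inline std::string joinDoubleStar(
    const std::string& root,
    const std::string& cgroup)
{
  return root + "/" + cgroup;
}
```

A plain `//` comment in front of either function would still compile, but Doxygen would not treat it as documentation, which is the point of the reviewer's remark.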
Re: Review Request 17431: Enabled configuration of the mesos master from the UI.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17431/ --- (Updated Jan. 28, 2014, 1:48 a.m.) Review request for mesos and Ross Allen. Bugs: mesos-885 https://issues.apache.org/jira/browse/mesos-885 Repository: mesos-git Description --- Enabled configuration of the mesos master from the UI. Review: http://reviews.apache.org/r/17431 Diffs (updated) - src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 src/webui/master/static/config.html PRE-CREATION src/webui/master/static/css/mesos.css 5b1227e9d64757f9fc106e497f7fa3ed72112c10 src/webui/master/static/directives/timestamp.html 5e422b9f22f8ddaf987feec3e02a849f21e5e22c src/webui/master/static/index.html f7f3d24abfee7d30691dbc2d7adf7c05c888a7b4 src/webui/master/static/js/app.js 4ccff6314c684ae4e917345fe41a95ccc0eb5803 src/webui/master/static/js/controllers.js afb24fb9c2184772f7314162f5637dbabaa2ab94 Diff: https://reviews.apache.org/r/17431/diff/ Testing --- File Attachments Config Dialog https://reviews.apache.org/media/uploaded/files/2014/01/28/5499d3e5-077e-4aff-b29a-7d32134f29a0__Screenshot_2014-01-27_16.53.12.png Thanks, Thomas Rampelberg
Review Request 17440: Removed an unnecessary intermediate ZooKeeper event handler (WatcherProcess) which has a bug in ensuring the lifecycles of WatcherProcess and Watcher match each other thus causin
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17440/ --- Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone. Bugs: MESOS-937 https://issues.apache.org/jira/browse/MESOS-937 Repository: mesos-git Description --- Description in JIRA: Between the execution of ProcessWatcher::~ProcessWatcher() and its base class destructor Watcher::~Watcher(), the pure virtual method Watcher::process() can be invoked by WatcherProcess::event(). By eliminating WatcherProcess this problem is resolved. Diffs - src/state/zookeeper.hpp d1d1fedf27987aeaf9fbdee678d3b3848d05620a src/state/zookeeper.cpp 09b63d44e9349cab2d73659c939de3d8e96fbcc5 src/zookeeper/group.hpp e51ebb2cf5f09a633462c101f913ee8272be9a6c src/zookeeper/group.cpp ecb6c002e8194b8d67e262826d988f747414f9f3 src/zookeeper/watcher.hpp 1db0386719c2a675d29b47b417dc856993062326 src/zookeeper/zookeeper.hpp f50aca6e7035c8084c3e76fd56b9d1ef7f9d9902 src/zookeeper/zookeeper.cpp 5720f4c1cd51c4d998b52e7bf7f9f019ae80c5f8 Diff: https://reviews.apache.org/r/17440/diff/ Testing --- make check. Jenkins tests ongoing, will report the results. Thanks, Jiang Yan Xu
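The destructor window described above can be reproduced in a small standalone sketch. The assumptions are stated loudly: `ToyWatcher` and `ToyProcessWatcher` are hypothetical stand-ins, the base `process()` is given a body so the example stays well-defined (in the real classes it is pure virtual, so the same dispatch would abort with "pure virtual method called"), and the late call is made from the base destructor rather than from a concurrent `WatcherProcess::event()`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records which functions ran, in order.
std::vector<std::string>& calls() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for zookeeper's Watcher.
struct ToyWatcher {
  virtual ~ToyWatcher() {
    // The derived part is already destroyed here, so virtual dispatch
    // resolves to ToyWatcher::process(), not the override.
    process();
  }

  // Pure virtual in the real class; given a body here to stay safe.
  virtual void process() { calls().push_back("ToyWatcher::process"); }
};

// Hypothetical stand-in for ProcessWatcher.
struct ToyProcessWatcher : ToyWatcher {
  ~ToyProcessWatcher() override { calls().push_back("~ToyProcessWatcher"); }
  void process() override { calls().push_back("ToyProcessWatcher::process"); }
};

// Creates a watcher, delivers one event while it is alive, destroys it,
// and returns the recorded call order.
std::vector<std::string> destroyAndRecord() {
  calls().clear();
  {
    ToyProcessWatcher watcher;
    watcher.process();  // Normal dispatch reaches the override.
  }
  return calls();
}
```

The last recorded call is the base version, which shows why an event that arrives after `~ProcessWatcher()` but before `~Watcher()` finishes cannot reach the derived implementation.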
[jira] [Comment Edited] (MESOS-937) Fix pure virtual method called bug in zookeeper::ProcessWatcher
[ https://issues.apache.org/jira/browse/MESOS-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13880178#comment-13880178 ] Yan Xu edited comment on MESOS-937 at 1/28/14 1:59 AM: --- https://reviews.apache.org/r/17221/ https://reviews.apache.org/r/17440/ was (Author: xujyan): https://reviews.apache.org/r/17221/ https://reviews.apache.org/r/17222/ Fix pure virtual method called bug in zookeeper::ProcessWatcher - Key: MESOS-937 URL: https://issues.apache.org/jira/browse/MESOS-937 Project: Mesos Issue Type: Bug Reporter: Yan Xu Assignee: Yan Xu Fix For: 0.17.0 It appears to be the root cause for MESOS-871, MESOS-537, etc. Between the execution of ProcessWatcher::~ProcessWatcher() and its base class destructor Watcher::~Watcher(), the pure virtual method Watcher::process() can be invoked by WatcherProcess::event(). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17442: Added 'active_tasks' stat to master stats endpoint.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/ --- Review request for mesos, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. I opted for active tasks instead of running tasks because I didn't want the stats endpoint to loop through all tasks to figure out if a task is in RUNNING state. I think active is useful for most debugging purposes. Diffs - src/master/http.cpp 546e91dbb9c8ee1014bb4f0b3be2714ad6a2d520 Diff: https://reviews.apache.org/r/17442/diff/ Testing --- make check Thanks, Vinod Kone
[jira] [Assigned] (MESOS-772) expose count of running tasks
[ https://issues.apache.org/jira/browse/MESOS-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone reassigned MESOS-772: Assignee: Vinod Kone https://reviews.apache.org/r/17442/ expose count of running tasks - Key: MESOS-772 URL: https://issues.apache.org/jira/browse/MESOS-772 Project: Mesos Issue Type: Improvement Reporter: David Robinson Assignee: Vinod Kone Priority: Minor The stats endpoint doesn't show the current number of running tasks: $ curl -s http://localhost:5051/slave\(1\)/stats.json | python2.7 -m json.tool { failed_tasks: 0, finished_tasks: 0, invalid_status_updates: 0, killed_tasks: 0, lost_tasks: 0, recovery_errors: 0, registered: 1, staged_tasks: 2, started_tasks: 0, total_frameworks: 1, uptime: 1168.518182912, valid_status_updates: 0 } Can this be added please? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17442: Added 'active_tasks' stat to master stats endpoint.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/#review32942 --- Ship it! Looks great! Thanks src/master/http.cpp https://reviews.apache.org/r/17442/#comment61977 What are the active states besides RUNNING? STAGING and STARTING? Those should be fine to count too. Maybe add them to the comment to clarify. - Adam B On Jan. 27, 2014, 6:17 p.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/ --- (Updated Jan. 27, 2014, 6:17 p.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. I opted for active tasks instead of running tasks because I didn't want the stats endpoint to loop through all tasks to figure out if a task is in RUNNING state. I think active is useful for most debugging purposes. Diffs - src/master/http.cpp 546e91dbb9c8ee1014bb4f0b3be2714ad6a2d520 Diff: https://reviews.apache.org/r/17442/diff/ Testing --- make check Thanks, Vinod Kone
Re: Review Request 17442: Added 'active_tasks' stat to master stats endpoint.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/ --- (Updated Jan. 28, 2014, 2:59 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Changes --- adam's comments. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. I opted for active tasks instead of running tasks because I didn't want the stats endpoint to loop through all tasks to figure out if a task is in RUNNING state. I think active is useful for most debugging purposes. Diffs (updated) - src/master/http.cpp 546e91dbb9c8ee1014bb4f0b3be2714ad6a2d520 Diff: https://reviews.apache.org/r/17442/diff/ Testing --- make check Thanks, Vinod Kone
Review Request 17443: Added queued and launched tasks to slave stats.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17443/ --- Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-775 https://issues.apache.org/jira/browse/MESOS-775 Repository: mesos-git Description --- See summary. Diffs - src/slave/http.cpp c8357e214d2adf2cd712072f58d07b07badb79dc Diff: https://reviews.apache.org/r/17443/diff/ Testing --- make Thanks, Vinod Kone
Re: Review Request 17443: Added queued and launched tasks to slave stats.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17443/ --- (Updated Jan. 28, 2014, 3:02 a.m.) Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Changes --- added 'Depends On'. Bugs: MESOS-775 https://issues.apache.org/jira/browse/MESOS-775 Repository: mesos-git Description --- See summary. Diffs - src/slave/http.cpp c8357e214d2adf2cd712072f58d07b07badb79dc Diff: https://reviews.apache.org/r/17443/diff/ Testing --- make Thanks, Vinod Kone
Re: Review Request 17443: Added queued and launched tasks to slave stats.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17443/ --- (Updated Jan. 28, 2014, 3:02 a.m.) Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Changes --- fixed bug id. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. Diffs - src/slave/http.cpp c8357e214d2adf2cd712072f58d07b07badb79dc Diff: https://reviews.apache.org/r/17443/diff/ Testing --- make Thanks, Vinod Kone
[jira] [Comment Edited] (MESOS-772) expose count of running tasks
[ https://issues.apache.org/jira/browse/MESOS-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883664#comment-13883664 ] Vinod Kone edited comment on MESOS-772 at 1/28/14 3:02 AM: --- https://reviews.apache.org/r/17442/ https://reviews.apache.org/r/17443/ was (Author: vinodkone): https://reviews.apache.org/r/17442/ expose count of running tasks - Key: MESOS-772 URL: https://issues.apache.org/jira/browse/MESOS-772 Project: Mesos Issue Type: Improvement Reporter: David Robinson Assignee: Vinod Kone Priority: Minor The stats endpoint doesn't show the current number of running tasks: $ curl -s http://localhost:5051/slave\(1\)/stats.json | python2.7 -m json.tool { failed_tasks: 0, finished_tasks: 0, invalid_status_updates: 0, killed_tasks: 0, lost_tasks: 0, recovery_errors: 0, registered: 1, staged_tasks: 2, started_tasks: 0, total_frameworks: 1, uptime: 1168.518182912, valid_status_updates: 0 } Can this be added please? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17445: Added LLDB convenience scripts.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17445/ --- Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone. Bugs: MESOS-950 https://issues.apache.org/jira/browse/MESOS-950 Repository: mesos-git Description --- This patch adds lldb-mesos-tests, lldb-mesos-local, lldb-mesos-master and lldb-mesos-slave in a similar style to gdb-mesos-*, as GDB seems to have been phased out on OS X Mavericks. Diffs - bin/lldb-mesos-local.sh.in PRE-CREATION bin/lldb-mesos-master.sh.in PRE-CREATION bin/lldb-mesos-slave.sh.in PRE-CREATION bin/lldb-mesos-tests.sh.in PRE-CREATION configure.ac aa6ee45 Diff: https://reviews.apache.org/r/17445/diff/ Testing --- make check and functional testing of scripts with and without arguments. Thanks, Niklas Nielsen
[jira] [Commented] (MESOS-950) Add LLDB helpers
[ https://issues.apache.org/jira/browse/MESOS-950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883713#comment-13883713 ] Niklas Quarfot Nielsen commented on MESOS-950: -- Added in https://reviews.apache.org/r/17445/ Add LLDB helpers Key: MESOS-950 URL: https://issues.apache.org/jira/browse/MESOS-950 Project: Mesos Issue Type: Improvement Components: build Affects Versions: 0.17.0 Environment: LLVM environments both Linux and Mac OS X. Reporter: Niklas Quarfot Nielsen Assignee: Niklas Quarfot Nielsen Priority: Trivial It would be helpful to add LLDB helpers in a similar style to gdb-mesos-tests, gdb-mesos-local, gdb-mesos-master and gdb-mesos-slave, as GDB seems to have been phased out on OS X Mavericks. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17442: Added 'active_tasks' stat to master stats endpoint.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/#review32952 --- Ship it! I hope the past vs present tense will be enough to not make the monotonic vs instantaneous stats confusing for those consuming this data. src/master/http.cpp https://reviews.apache.org/r/17442/#comment62012 s/launched/active/ ? - Ben Mahler On Jan. 28, 2014, 2:59 a.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/ --- (Updated Jan. 28, 2014, 2:59 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. I opted for active tasks instead of running tasks because I didn't want the stats endpoint to loop through all tasks to figure out if a task is in RUNNING state. I think active is useful for most debugging purposes. Diffs - src/master/http.cpp 546e91dbb9c8ee1014bb4f0b3be2714ad6a2d520 Diff: https://reviews.apache.org/r/17442/diff/ Testing --- make check Thanks, Vinod Kone
Re: Review Request 16724: Added completed frameworks/tasks to slave re-registration.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16724/#review32949 --- Ship it! src/master/master.cpp https://reviews.apache.org/r/16724/#comment62002 Let's use the same parameter name here and in the declaration please (same for Master::readdCompletedFramework too). src/master/master.cpp https://reviews.apache.org/r/16724/#comment61999 We align our wrapped lists (parameter lists, argument lists, etc). src/master/master.cpp https://reviews.apache.org/r/16724/#comment62004 = None(); src/master/master.cpp https://reviews.apache.org/r/16724/#comment62003 s/knownFramework/completedFramework/ src/master/master.cpp https://reviews.apache.org/r/16724/#comment62005 Let's make this a TODO to make registration and reregistration time be optional in Framework. src/messages/messages.proto https://reviews.apache.org/r/16724/#comment62008 I was thinking we'd put this in the 'archive' package since we'll likely call a class 'Archive', but this is fine for now and we can always rename it later without causing upgrade/compatibility issues. src/messages/messages.proto https://reviews.apache.org/r/16724/#comment62011 Can we make this optional? I can see moving something like 'pid' into another structure and I don't think it's that big of a deal if it's not present (it's only used for constructing a completed 'Framework', and not having the pid is not a big deal). src/slave/slave.cpp https://reviews.apache.org/r/16724/#comment62014 Please kill all the 'msg' prefixes here, they're not necessary and not in line with convention. src/slave/slave.cpp https://reviews.apache.org/r/16724/#comment62013 Please align the wrapping here. - Benjamin Hindman On Jan. 17, 2014, 12:24 a.m., Adam B wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16724/ --- (Updated Jan. 17, 2014, 12:24 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, Niklas Nielsen, and Vinod Kone. 
Bugs: MESOS-767 https://issues.apache.org/jira/browse/MESOS-767 Repository: mesos-git Description --- Added completed frameworks/tasks to slave re-registration. Fixes MESOS-767. Additional issues discovered during investigation: - MESOS-905: Remove Framework.id in favor of FrameworkInfo.id - MESOS-906: Last task in Completed Framework never graduates from terminatedTasks to completedTasks. - Completed frameworks/executors/tasks are stored in circular buffers, and these may overflow in different orders on different slaves. BenH proposes an archive to replace these circular buffers. Diffs - src/master/master.hpp 95b9cec src/master/master.cpp 38c5532 src/messages/messages.proto 1f264d5 src/slave/slave.cpp 396293b Diff: https://reviews.apache.org/r/16724/diff/ Testing --- make check; manually failed-over a master, watched the slave reregister its completed frameworks, web UI shows completed tasks and stdout/stderr. Thanks, Adam B
Re: Review Request 17445: Added LLDB convenience scripts.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17445/#review32954 --- Ship it! Ship It! - Benjamin Hindman On Jan. 28, 2014, 3:41 a.m., Niklas Nielsen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17445/ --- (Updated Jan. 28, 2014, 3:41 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone. Bugs: MESOS-950 https://issues.apache.org/jira/browse/MESOS-950 Repository: mesos-git Description --- This patch adds lldb-mesos-tests, lldb-mesos-local, lldb-mesos-master and lldb-mesos-slave in a similar style to gdb-mesos-*, as GDB seems to have been phased out on OS X Mavericks. Diffs - bin/lldb-mesos-local.sh.in PRE-CREATION bin/lldb-mesos-master.sh.in PRE-CREATION bin/lldb-mesos-slave.sh.in PRE-CREATION bin/lldb-mesos-tests.sh.in PRE-CREATION configure.ac aa6ee45 Diff: https://reviews.apache.org/r/17445/diff/ Testing --- make check and functional testing of scripts with and without arguments. Thanks, Niklas Nielsen
Re: Review Request 17440: Removed an unnecessary intermediate ZooKeeper event handler (WatcherProcess) which has a bug in ensuring the lifecycles of WatcherProcess and Watcher match each other thus ca
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17440/#review32956 --- src/state/zookeeper.cpp https://reviews.apache.org/r/17440/#comment62023 But we aren't connecting! The 'zk' handle is set to NULL on the line above. What was the problem with DISCONNECTED? That was used originally in GroupProcess as well. src/state/zookeeper.cpp https://reviews.apache.org/r/17440/#comment62024 This couple of lines of code are hard to follow. Basically, I can get a 'connected' callback and be in the state CONNECTING, which makes me CONNECTED, but also be in lots of other states and do nothing? Can I be in READY and then get a 'connected' callback? Why don't I transition to CONNECTED if I'm in READY? We need to be much more explicit about what states we can be in here please. src/state/zookeeper.cpp https://reviews.apache.org/r/17440/#comment62021 We should really capture the bad state due to failed authentication rather than keep us in the CONNECTED state and also add a new READY state. src/zookeeper/watcher.hpp https://reviews.apache.org/r/17440/#comment62025 Please document why having shared state here doesn't work due to concurrency issues. - Benjamin Hindman On Jan. 28, 2014, 1:59 a.m., Jiang Yan Xu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17440/ --- (Updated Jan. 28, 2014, 1:59 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone. Bugs: MESOS-937 https://issues.apache.org/jira/browse/MESOS-937 Repository: mesos-git Description --- Description in JIRA: Between the execution of ProcessWatcher::~ProcessWatcher() and its base class destructor Watcher::~Watcher(), the pure virtual method Watcher::process() can be invoked by WatcherProcess::event(). By eliminating WatcherProcess this problem is resolved. 
Diffs - src/state/zookeeper.hpp d1d1fedf27987aeaf9fbdee678d3b3848d05620a src/state/zookeeper.cpp 09b63d44e9349cab2d73659c939de3d8e96fbcc5 src/zookeeper/group.hpp e51ebb2cf5f09a633462c101f913ee8272be9a6c src/zookeeper/group.cpp ecb6c002e8194b8d67e262826d988f747414f9f3 src/zookeeper/watcher.hpp 1db0386719c2a675d29b47b417dc856993062326 src/zookeeper/zookeeper.hpp f50aca6e7035c8084c3e76fd56b9d1ef7f9d9902 src/zookeeper/zookeeper.cpp 5720f4c1cd51c4d998b52e7bf7f9f019ae80c5f8 Diff: https://reviews.apache.org/r/17440/diff/ Testing --- make check. Jenkins tests ongoing, will report the results. Thanks, Jiang Yan Xu
Re: Review Request 17442: Added 'active_tasks' stat to master stats endpoint.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/#review32962 --- Ship it! src/master/http.cpp https://reviews.apache.org/r/17442/#comment62039 s/launched_tasks/launched_tasks_gauge/ ? (or active_tasks_gauge, please yourself) All the other *_tasks are counters, so calling this something_tasks implies that it is one too. This isn't a counter, so it's better to explicitly call it something else, such as something_tasks_gauge. - David Robinson On Jan. 28, 2014, 2:59 a.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/ --- (Updated Jan. 28, 2014, 2:59 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. I opted for active tasks instead of running tasks because I didn't want the stats endpoint to loop through all tasks to figure out if a task is in RUNNING state. I think active is useful for most debugging purposes. Diffs - src/master/http.cpp 546e91dbb9c8ee1014bb4f0b3be2714ad6a2d520 Diff: https://reviews.apache.org/r/17442/diff/ Testing --- make check Thanks, Vinod Kone
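The counter-versus-gauge distinction raised in this review can be made concrete with a small sketch. It is illustrative only: `ToyTaskStats` and `simulate` are hypothetical names, and the field names merely follow the naming suggested in the thread rather than the actual master stats code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stats holder; not the Mesos master's real data structure.
struct ToyTaskStats {
  uint64_t started_tasks = 0;       // Counter: only ever increases.
  uint64_t finished_tasks = 0;      // Counter: only ever increases.
  uint64_t active_tasks_gauge = 0;  // Gauge: current value, goes up and down.

  void taskStarted() {
    ++started_tasks;
    ++active_tasks_gauge;
  }

  void taskFinished() {
    ++finished_tasks;
    --active_tasks_gauge;
  }
};

// Starts three tasks, finishes one, and returns the resulting stats.
ToyTaskStats simulate() {
  ToyTaskStats stats;
  stats.taskStarted();
  stats.taskStarted();
  stats.taskStarted();
  stats.taskFinished();
  return stats;
}
```

A consumer can take the difference of a counter between two polls to get a rate, but must read a gauge as an instantaneous value, which is why a distinguishing suffix helps.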
Re: Review Request 17442: Added 'active_tasks' stat to master stats endpoint.
On Jan. 28, 2014, 4:55 a.m., Ben Mahler wrote: I hope the past vs present tense will be enough to not make the monotonic vs instantaneous stats confusing for those consuming this data. I'd find it confusing. active_tasks_gauge is preferable. - David --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/#review32952 --- On Jan. 28, 2014, 2:59 a.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/ --- (Updated Jan. 28, 2014, 2:59 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. I opted for active tasks instead of running tasks because I didn't want the stats endpoint to loop through all tasks to figure out if a task is in RUNNING state. I think active is useful for most debugging purposes. Diffs - src/master/http.cpp 546e91dbb9c8ee1014bb4f0b3be2714ad6a2d520 Diff: https://reviews.apache.org/r/17442/diff/ Testing --- make check Thanks, Vinod Kone
Re: Review Request 17443: Added queued and launched tasks to slave stats.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17443/#review32964 --- Ship it! src/slave/http.cpp https://reviews.apache.org/r/17443/#comment62041 Same comment as RB 17442. queued_tasks_gauge and launched_tasks_gauge (or active_tasks_gauge, same as whatever gets used in the master). - David Robinson On Jan. 28, 2014, 3:02 a.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17443/ --- (Updated Jan. 28, 2014, 3:02 a.m.) Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. Diffs - src/slave/http.cpp c8357e214d2adf2cd712072f58d07b07badb79dc Diff: https://reviews.apache.org/r/17443/diff/ Testing --- make Thanks, Vinod Kone
Re: Review Request 16724: Added completed frameworks/tasks to slave re-registration.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16724/#review32961 --- Can we add a test for this please? src/master/master.cpp https://reviews.apache.org/r/16724/#comment62036 Not yours, but do you mind s/it's/its/ src/master/master.cpp https://reviews.apache.org/r/16724/#comment62035 s/slaveCompletedFrameworks/completedFrameworks/ src/master/master.cpp https://reviews.apache.org/r/16724/#comment62037 +1 s/slaveCompletedFrameworks/completedFrameworks/ src/master/master.cpp https://reviews.apache.org/r/16724/#comment62038 New line. src/master/master.cpp https://reviews.apache.org/r/16724/#comment62050 Kill the log line or change it to VLOG. src/master/master.cpp https://reviews.apache.org/r/16724/#comment62049 Kill the log line or change it to VLOG. src/slave/slave.cpp https://reviews.apache.org/r/16724/#comment62040 s/id// src/slave/slave.cpp https://reviews.apache.org/r/16724/#comment62043 I'm not sure we want to LOG these. We don't even log the active frameworks/executors/tasks above. Maybe VLOG if you want to use it for debugging. src/slave/slave.cpp https://reviews.apache.org/r/16724/#comment62042 s/MergeFrom/CopyFrom/ src/slave/slave.cpp https://reviews.apache.org/r/16724/#comment62044 Ditto about logging. src/slave/slave.cpp https://reviews.apache.org/r/16724/#comment62046 I think you've brought this up before, but did you figure out why a completed executor has terminated tasks? src/slave/slave.cpp https://reviews.apache.org/r/16724/#comment62045 Ditto about logging. - Vinod Kone On Jan. 17, 2014, 12:24 a.m., Adam B wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16724/ --- (Updated Jan. 17, 2014, 12:24 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, Niklas Nielsen, and Vinod Kone. Bugs: MESOS-767 https://issues.apache.org/jira/browse/MESOS-767 Repository: mesos-git Description --- Added completed frameworks/tasks to slave re-registration. 
Fixes MESOS-767. Additional issues discovered during investigation: - MESOS-905: Remove Framework.id in favor of FrameworkInfo.id - MESOS-906: Last task in Completed Framework never graduates from terminatedTasks to completedTasks. - Completed frameworks/executors/tasks are stored in circular buffers, and these may overflow in different orders on different slaves. BenH proposes an archive to replace these circular buffers. Diffs - src/master/master.hpp 95b9cec src/master/master.cpp 38c5532 src/messages/messages.proto 1f264d5 src/slave/slave.cpp 396293b Diff: https://reviews.apache.org/r/16724/diff/ Testing --- make check; manually failed-over a master, watched the slave reregister its completed frameworks, web UI shows completed tasks and stdout/stderr. Thanks, Adam B
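The circular-buffer concern in the description above can be illustrated with a small C++ sketch. It is not the Mesos data structure: `BoundedHistory` is a hypothetical stand-in built on `std::deque`, and the framework IDs are made up. It only shows that two fixed-capacity histories fed the same completions in a different order end up retaining different entries.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical fixed-capacity history; a stand-in for a circular buffer.
class BoundedHistory {
 public:
  explicit BoundedHistory(std::size_t capacity) : capacity_(capacity) {}

  // Appends an ID, evicting the oldest entry once the buffer is full.
  void push(const std::string& id) {
    if (capacity_ == 0) {
      return;
    }
    if (items_.size() == capacity_) {
      items_.pop_front();
    }
    items_.push_back(id);
  }

  const std::deque<std::string>& items() const { return items_; }

 private:
  std::size_t capacity_;
  std::deque<std::string> items_;
};
```

With a capacity of 2, a slave that sees completions f1, f2, f3 keeps f2 and f3, while one that sees f3, f1, f2 keeps f1 and f2, so the two histories disagree about what has completed.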
Re: Review Request 17442: Added 'active_tasks' stat to master stats endpoint.
On Jan. 28, 2014, 6:31 a.m., David Robinson wrote: src/master/http.cpp, line 349 https://reviews.apache.org/r/17442/diff/2/?file=452628#file452628line349 s/launched_tasks/launched_tasks_gauge/ ? (or active_tasks_gauge, please yourself) All the other *_tasks are counters, so calling this something_tasks implies it is one too. This isn't a counter, so it's better to explicitly call it something else, such as something_tasks_gauge. There are lots of stats that we currently expose that are gauges but we don't explicitly name them that way. I agree adding a suffix like _gauge is more explicit but maybe we should do that across all gauge stats in one fell swoop to avoid confusion? What do you think? - Vinod --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/#review32962 --- On Jan. 28, 2014, 2:59 a.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/ --- (Updated Jan. 28, 2014, 2:59 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. I opted for active tasks instead of running tasks because I didn't want the stats endpoint to loop through all tasks to figure out if a task is in RUNNING state. I think active is useful for most debugging purposes. Diffs - src/master/http.cpp 546e91dbb9c8ee1014bb4f0b3be2714ad6a2d520 Diff: https://reviews.apache.org/r/17442/diff/ Testing --- make check Thanks, Vinod Kone
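The counter versus gauge distinction in this exchange can be sketched in C++ as follows. This is a minimal illustration, not the Mesos stats code: `TaskStats`, `launch()`, `finish()`, and both field names are hypothetical and chosen only to mirror the naming under discussion. A counter only ever increases, while a gauge reports the value at the moment it is read.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stats holder; names are illustrative, not Mesos APIs.
struct TaskStats {
  // Counter: monotonically increasing total of tasks that have finished.
  uint64_t finished_tasks = 0;

  // Gauge: number of tasks active right now; it goes up and down.
  uint64_t active_tasks_gauge = 0;

  void launch() { ++active_tasks_gauge; }

  void finish() {
    assert(active_tasks_gauge > 0);  // A task must be active to finish.
    --active_tasks_gauge;
    ++finished_tasks;
  }
};
```

A consumer can take the difference between two reads of `finished_tasks` to get a rate, but doing the same with `active_tasks_gauge` is not meaningful, which is the argument for making the suffix explicit.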
Re: Review Request 17442: Added 'active_tasks' stat to master stats endpoint.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17442/ --- (Updated Jan. 28, 2014, 6:57 a.m.) Review request for mesos, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Changes --- benm's. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. I opted for active tasks instead of running tasks because I didn't want the stats endpoint to loop through all tasks to figure out if a task is in RUNNING state. I think active is useful for most debugging purposes. Diffs (updated) - src/master/http.cpp 546e91dbb9c8ee1014bb4f0b3be2714ad6a2d520 Diff: https://reviews.apache.org/r/17442/diff/ Testing --- make check Thanks, Vinod Kone
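The trade-off named in the description can be sketched as follows. The `Task` struct, the `State` enum, and the `ActiveTasks` registry are assumptions made for illustration and are not the Mesos types. The point is only that counting RUNNING tasks needs a pass over every task, whereas an active count can be read directly if the registry holds nothing but non-terminal tasks.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical subset of non-terminal task states.
enum class State { STAGING, RUNNING };

struct Task {
  State state;
};

// Hypothetical registry of tasks that have not reached a terminal state.
using ActiveTasks = std::map<std::string, Task>;

// Linear in the number of tasks: each task's state must be inspected.
std::size_t countRunning(const ActiveTasks& tasks) {
  std::size_t running = 0;
  for (const auto& entry : tasks) {
    if (entry.second.state == State::RUNNING) {
      ++running;
    }
  }
  return running;
}

// Constant time: every entry in the registry is active by construction.
std::size_t countActive(const ActiveTasks& tasks) {
  return tasks.size();
}
```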
Re: Review Request 17443: Added queued and launched tasks to slave stats.
On Jan. 28, 2014, 5:08 a.m., Benjamin Hindman wrote: src/slave/http.cpp, line 309 https://reviews.apache.org/r/17443/diff/1/?file=452630#file452630line309 Any reason not to use 'active_tasks' here instead of 'launched_tasks' too? I went back and forth on the naming. This is my current thinking: active_tasks on master = (queued_tasks + launched_tasks) on slaves. Calling them launched_tasks on the master seemed a bit confusing because it has a different meaning on the slave. Thoughts? - Vinod --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17443/#review32953 --- On Jan. 28, 2014, 3:02 a.m., Vinod Kone wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17443/ --- (Updated Jan. 28, 2014, 3:02 a.m.) Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, David Robinson, and Niklas Nielsen. Bugs: MESOS-772 https://issues.apache.org/jira/browse/MESOS-772 Repository: mesos-git Description --- See summary. Diffs - src/slave/http.cpp c8357e214d2adf2cd712072f58d07b07badb79dc Diff: https://reviews.apache.org/r/17443/diff/ Testing --- make Thanks, Vinod Kone
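The relation Vinod describes, active_tasks on the master equals the sum of queued_tasks and launched_tasks across slaves, can be written out as a short C++ sketch. `SlaveStats` and `expectedMasterActiveTasks` are hypothetical names for illustration; the thread states this as the intended meaning of the stats, not as code that exists in Mesos.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-slave snapshot; field names follow the thread.
struct SlaveStats {
  uint64_t queued_tasks;
  uint64_t launched_tasks;
};

// Sums queued and launched tasks over all slaves, which under the relation
// stated in the thread is what the master would report as active_tasks.
uint64_t expectedMasterActiveTasks(const std::vector<SlaveStats>& slaves) {
  uint64_t active = 0;
  for (const SlaveStats& slave : slaves) {
    active += slave.queued_tasks + slave.launched_tasks;
  }
  return active;
}
```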