[jira] [Updated] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Domański updated MESOS-1774: -- Target Version/s: 1.0.0, 0.20.1 (was: 1.0.0, 0.20.0, 0.20.1) Fix protobuf detection on systems with Python 3 as default -- Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Labels: build When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
Kamil Domański created MESOS-1774: - Summary: Fix protobuf detection on systems with Python 3 as default Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14125498#comment-14125498 ] Kamil Domański commented on MESOS-1774: --- https://reviews.apache.org/r/25439/ Fix protobuf detection on systems with Python 3 as default -- Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Labels: build When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1775) Libprocess wants source for unbundled gmock
[ https://issues.apache.org/jira/browse/MESOS-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Domański updated MESOS-1775: -- Affects Version/s: 0.20.0 Libprocess wants source for unbundled gmock --- Key: MESOS-1775 URL: https://issues.apache.org/jira/browse/MESOS-1775 Project: Mesos Issue Type: Bug Components: build, libprocess Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Priority: Minor Labels: build *gmock* is installed on my system. Yet with *--disable-bundled* the libprocess configuration script is still searching for *gmock-all.cc* instead of just the headers and libraries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Domański updated MESOS-1774: -- Shepherd: Timothy St. Clair (was: Kamil Domański) Fix protobuf detection on systems with Python 3 as default -- Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Labels: build When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-1764) Minor Build Fixes from 0.20 release
[ https://issues.apache.org/jira/browse/MESOS-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121966#comment-14121966 ] Timothy St. Clair edited comment on MESOS-1764 at 9/8/14 4:49 PM: -- package config file -reviews.apache.org/r/25355/- was (Author: tstclair): package config file https://reviews.apache.org/r/25355/ Minor Build Fixes from 0.20 release --- Key: MESOS-1764 URL: https://issues.apache.org/jira/browse/MESOS-1764 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Reporter: Timothy St. Clair Assignee: Timothy St. Clair This ticket is a catch-all for minor issues caught during a rebase and testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1772) ./include/process/future.hpp(274): error: no instance of overloaded function process::Future<T>::onReady
[ https://issues.apache.org/jira/browse/MESOS-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14125749#comment-14125749 ] Dominic Hamon commented on MESOS-1772: -- I know very little about the Intel C compiler, but my gut reaction is that we shouldn't support Yet Another Toolchain. However, we should then check for this at configure time. ./include/process/future.hpp(274): error: no instance of overloaded function process::FutureT::onReady - Key: MESOS-1772 URL: https://issues.apache.org/jira/browse/MESOS-1772 Project: Mesos Issue Type: Bug Components: build Reporter: Vinson Lee Priority: Blocker build error with Intel C Compiler {noformat} libtool: compile: /opt/intel/bin/icpc -DPACKAGE_NAME=\libprocess\ -DPACKAGE_TARNAME=\libprocess\ -DPACKAGE_VERSION=\0.0.1\ -DPACKAGE_STRING=\libprocess 0.0.1\ -DPACKAGE_BUGREPORT=\\ -DPACKAGE_URL=\\ -DPACKAGE=\libprocess\ -DVERSION=\0.0.1\ -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\.libs/\ -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -I. -I./include -I./3rdparty/stout/include -I3rdparty/boost-1.53.0 -I3rdparty/libev-4.15 -I3rdparty/picojson-4f93734 -I3rdparty/glog-0.3.3/src -I3rdparty/ry-http-parser-1c3624a -g -g2 -O2 -std=c++11 -MT libprocess_la-http.lo -MD -MP -MF .deps/libprocess_la-http.Tpo -c src/http.cpp -fPIC -DPIC -o libprocess_la-http.o ./include/process/future.hpp(274): error: no instance of overloaded function process::FutureT::onReady [with T=std::string] matches the argument list argument types are: (std::_Bindstd::_Mem_fnbool (process::Futurestd::string::*)(const std::string ) (process::Futurestd::string, std::_Placeholder1), process::Futurestd::string::Prefer) return onReady(std::forwardF(f), Prefer()); ^ detected during: instantiation of const process::FutureT process::FutureT::onReady(F ) const [with T=std::string, F=std::_Bindstd::_Mem_fnbool (process::Futurestd::string::*)(const std::string ) (process::Futurestd::string, std::_Placeholder1)] at line 777 instantiation of bool process::PromiseT::associate(const process::FutureT ) [with T=std::string] at line 1435 instantiation of void process::internal::thenf(const std::shared_ptrprocess::PromiseX , const std::functionprocess::FutureX (const T ) , const process::FutureT ) [with T=Nothing, X=std::basic_stringchar, std::char_traitschar, std::allocatorchar] at line 1508 instantiation of process::FutureX process::FutureT::then(const std::functionprocess::FutureX (const T ) ) const [with T=Nothing, X=std::basic_stringchar, std::char_traitschar, std::allocatorchar] at line 355 instantiation of process::FutureX process::FutureT::then(F , process::FutureT::Prefer) const [with T=Nothing, F=std::_Bindprocess::Futurestd::string (*(int))(int), X=std::basic_stringchar, std::char_traitschar, std::allocatorchar] at line 369 instantiation of auto process::FutureT::then(F ) const-decltype((expression)) [with T=Nothing, F=std::_Bindprocess::Futurestd::string (*(int))(int)] at line 160 of src/http.cpp {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
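The failing call site uses libprocess's tag-dispatch overload idiom for Future callbacks. Below is a simplified, hedged sketch of that pattern, written from scratch for illustration rather than copied from future.hpp: a Prefer overload is tried first and drops out via SFINAE when the callable cannot accept the value, leaving a LessPrefer fallback. GCC and Clang resolve this; the quoted ICC error is a failure of that same resolution.
{noformat}
// Illustrative sketch only (not the actual process::Future code): the
// Prefer/LessPrefer tag-dispatch pattern that onReady() relies on.
#include <iostream>
#include <string>
#include <utility>

template <typename T>
class Future
{
public:
  struct LessPrefer {};
  struct Prefer : LessPrefer {};

  explicit Future(T value) : value_(std::move(value)) {}

  // Preferred overload: only participates if 'f' can be called with the value.
  template <typename F>
  auto onReady(F&& f, Prefer) const
    -> decltype(f(std::declval<const T&>()), void())
  {
    f(value_);
  }

  // Fallback overload: callable takes no arguments.
  template <typename F>
  void onReady(F&& f, LessPrefer) const
  {
    f();
  }

  template <typename F>
  void onReady(F&& f) const
  {
    // Prefer() binds exactly to the first overload when SFINAE keeps it,
    // otherwise converts to LessPrefer and selects the fallback.
    onReady(std::forward<F>(f), Prefer());
  }

private:
  T value_;
};

int main()
{
  Future<std::string> future("hello");
  future.onReady([](const std::string& s) { std::cout << s << std::endl; });
  future.onReady([]() { std::cout << "ready (no value)" << std::endl; });
  return 0;
}
{noformat}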
[jira] [Commented] (MESOS-1765) Use PID namespace to avoid freezing cgroup
[ https://issues.apache.org/jira/browse/MESOS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14125756#comment-14125756 ] Cong Wang commented on MESOS-1765: -- [~yasumoto] Sure, here is the patch I sent to the Linux kernel: https://lkml.org/lkml/2014/9/4/646 which contains the description of the bug. Use PID namespace to avoid freezing cgroup -- Key: MESOS-1765 URL: https://issues.apache.org/jira/browse/MESOS-1765 Project: Mesos Issue Type: Story Components: containerization Reporter: Cong Wang There is a known kernel issue when we freeze the whole cgroup upon OOM. Mesos can probably just use a PID namespace so that we only need to kill the init of the PID namespace, instead of freezing all the processes and killing them one by one. But I am not quite sure whether this would break the existing code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
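A minimal, hedged sketch of the proposed approach (plain clone(2) usage, not Mesos code; run as root): start the container's init in its own PID namespace, so tearing the container down only requires killing that init; the kernel then SIGKILLs every remaining process in the namespace, with no freezer involvement.
{noformat}
// Illustrative sketch only: launch a child as PID 1 of a new PID namespace
// and destroy the whole tree by killing that single init process.
#include <sched.h>      // clone, CLONE_NEWPID (exposed with _GNU_SOURCE,
                        // which g++ defines by default)
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

static int containerInit(void*)
{
  // Inside the new namespace this process is PID 1; everything it forks
  // is reparented to it, and dies with it.
  printf("init sees itself as pid %d\n", static_cast<int>(getpid()));
  pause();
  return 0;
}

int main()
{
  const size_t stackSize = 1024 * 1024;
  char* stack = static_cast<char*>(malloc(stackSize));

  // CLONE_NEWPID places the child in a fresh PID namespace (needs CAP_SYS_ADMIN).
  pid_t pid = clone(containerInit, stack + stackSize, CLONE_NEWPID | SIGCHLD, nullptr);
  if (pid < 0) {
    perror("clone");
    return 1;
  }

  sleep(1); // Illustration only: give the child a moment to start.

  // Killing the namespace's init terminates the entire process tree at once.
  kill(pid, SIGKILL);
  waitpid(pid, nullptr, 0);
  free(stack);
  return 0;
}
{noformat}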
[jira] [Updated] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy St. Clair updated MESOS-1774: - Assignee: Timothy St. Clair Fix protobuf detection on systems with Python 3 as default -- Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Assignee: Timothy St. Clair Labels: build When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy St. Clair resolved MESOS-1774. -- Resolution: Fixed commit 18d3957f2742aa83e9a73a4c6ee09cb5419487f3 Author: Kamil Domański alabat...@gmail.com Date: Mon Sep 8 12:10:27 2014 -0500 Fix protobuf detection on systems with Python 3 as default -- Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Assignee: Timothy St. Clair Labels: build When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1764) Minor Build Fixes from 0.20 release
[ https://issues.apache.org/jira/browse/MESOS-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy St. Clair updated MESOS-1764: - Shepherd: Vinod Kone Minor Build Fixes from 0.20 release --- Key: MESOS-1764 URL: https://issues.apache.org/jira/browse/MESOS-1764 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Reporter: Timothy St. Clair Assignee: Timothy St. Clair This ticket is a catch-all for minor issues caught during a rebase and testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1771) introduce unique_ptr
[ https://issues.apache.org/jira/browse/MESOS-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1771: - Description: * add unique_ptr to the configure check * document use of unique_ptr in style guide ** use when possible, use std::move when necessary * deprecate Owned in favour of unique_ptr * Move raw pointers with ownership over to unique_ptr was: * add unique_ptr to the configure check * deprecate Owned in favour of unique_ptr * Move raw pointers with ownership over to unique_ptr introduce unique_ptr Key: MESOS-1771 URL: https://issues.apache.org/jira/browse/MESOS-1771 Project: Mesos Issue Type: Improvement Reporter: Dominic Hamon Assignee: Dominic Hamon * add unique_ptr to the configure check * document use of unique_ptr in style guide ** use when possible, use std::move when necessary * deprecate Owned in favour of unique_ptr * Move raw pointers with ownership over to unique_ptr -- This message was sent by Atlassian JIRA (v6.3.4#6332)
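An illustrative sketch of the proposed style (the Task type and function names below are hypothetical, not from the Mesos tree): owning raw pointers become std::unique_ptr, and ownership transfers are written explicitly with std::move, which is the substance of the "use when possible, use std::move when necessary" guideline.
{noformat}
#include <iostream>
#include <memory>
#include <string>
#include <utility>

struct Task
{
  explicit Task(std::string name) : name(std::move(name)) {}
  std::string name;
};

// Ownership is transferred into the function; the caller's pointer is
// empty afterwards and there is no manual delete anywhere.
void schedule(std::unique_ptr<Task> task)
{
  std::cout << "scheduling " << task->name << std::endl;
}  // Task destroyed here.

int main()
{
  std::unique_ptr<Task> task(new Task("example"));  // std::make_unique in C++14
  schedule(std::move(task));                        // explicit ownership transfer
  return 0;
}
{noformat}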
[jira] [Commented] (MESOS-1771) introduce unique_ptr
[ https://issues.apache.org/jira/browse/MESOS-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14125783#comment-14125783 ] Dominic Hamon commented on MESOS-1771: -- Adding check to configure: https://reviews.apache.org/r/25448/ introduce unique_ptr Key: MESOS-1771 URL: https://issues.apache.org/jira/browse/MESOS-1771 Project: Mesos Issue Type: Improvement Reporter: Dominic Hamon Assignee: Dominic Hamon * add unique_ptr to the configure check * document use of unique_ptr in style guide ** use when possible, use std::move when necessary * deprecate Owned in favour of unique_ptr * Move raw pointers with ownership over to unique_ptr -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1715) The slave does not send pending tasks / executors during re-registration.
[ https://issues.apache.org/jira/browse/MESOS-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1715: - Sprint: Q3 Sprint 4, Q3 Sprint 5 (was: Q3 Sprint 4) The slave does not send pending tasks / executors during re-registration. - Key: MESOS-1715 URL: https://issues.apache.org/jira/browse/MESOS-1715 Project: Mesos Issue Type: Bug Components: slave Reporter: Benjamin Mahler Assignee: Benjamin Mahler In what looks like an oversight, the pending tasks and executors in the slave (Framework::pending) are not sent in the re-registration message. For tasks, this can lead to spurious TASK_LOST notifications being generated by the master when it falsely thinks the tasks are not present on the slave. For executors, this can lead to under-accounting in the master, causing an overcommit on the slave. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1586) Isolate system directories, e.g., per-container /tmp
[ https://issues.apache.org/jira/browse/MESOS-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1586: - Sprint: Q3 Sprint 1, Q3 Sprint 2, Q3 Sprint 3, Q3 Sprint 4, Q3 Sprint 5 (was: Q3 Sprint 1, Q3 Sprint 2, Q3 Sprint 3, Q3 Sprint 4) Isolate system directories, e.g., per-container /tmp Key: MESOS-1586 URL: https://issues.apache.org/jira/browse/MESOS-1586 Project: Mesos Issue Type: Improvement Components: isolation Affects Versions: 0.20.0 Reporter: Ian Downes Assignee: Ian Downes Ideally, tasks should not write outside their sandbox (executor work directory), but pragmatically they may need to write to /tmp, /var/tmp, or some other directory. 1) We should include any such files in disk usage and quota. 2) We should make these shared directories private, i.e., each container has its own. 3) We should make the lifetime of any such files the same as that of the executor work directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
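One possible mechanism for point 2 is a per-container mount namespace with /tmp bind-mounted from the sandbox; the sketch below is a hedged illustration (the source path is made up and this is not the Mesos isolator code). Keeping the backing directory inside the executor work directory would also cover points 1 and 3.
{noformat}
// Illustrative sketch only: give the task a private /tmp via a mount
// namespace and a bind mount from its sandbox. Requires root.
#include <sched.h>
#include <sys/mount.h>
#include <unistd.h>
#include <cstdio>

int main()
{
  if (unshare(CLONE_NEWNS) != 0) {
    perror("unshare");
    return 1;
  }

  // Keep mount changes from propagating back into the host namespace.
  if (mount(nullptr, "/", nullptr, MS_REC | MS_PRIVATE, nullptr) != 0) {
    perror("mount MS_PRIVATE");
    return 1;
  }

  // Hypothetical sandbox path: writes to /tmp now land in the work directory.
  if (mount("/var/lib/mesos/work/tmp", "/tmp", nullptr, MS_BIND, nullptr) != 0) {
    perror("mount MS_BIND");
    return 1;
  }

  // The task launched here sees only the private /tmp.
  execlp("sh", "sh", "-c", "df /tmp", static_cast<char*>(nullptr));
  perror("execlp");
  return 1;
}
{noformat}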
[jira] [Updated] (MESOS-1466) Race between executor exited event and launch task can cause overcommit of resources
[ https://issues.apache.org/jira/browse/MESOS-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1466: - Sprint: Q3 Sprint 3, Q3 Sprint 4, Q3 Sprint 5 (was: Q3 Sprint 3, Q3 Sprint 4) Race between executor exited event and launch task can cause overcommit of resources Key: MESOS-1466 URL: https://issues.apache.org/jira/browse/MESOS-1466 Project: Mesos Issue Type: Bug Components: allocation, master Reporter: Vinod Kone Assignee: Benjamin Mahler Labels: reliability The following sequence of events can cause an overcommit -- Launch task is called for a task whose executor is already running -- Executor's resources are not accounted for on the master -- Executor exits and the event is enqueued behind launch tasks on the master -- Master sends the task to the slave, which needs to commit resources for the task and the (new) executor. -- Master processes the executor exited event and re-offers the executor's resources, causing an overcommit of resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1721) Prevent overcommit of the slave for ports and ephemeral ports.
[ https://issues.apache.org/jira/browse/MESOS-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1721: - Sprint: Q3 Sprint 4, Q3 Sprint 5 (was: Q3 Sprint 4) Prevent overcommit of the slave for ports and ephemeral ports. -- Key: MESOS-1721 URL: https://issues.apache.org/jira/browse/MESOS-1721 Project: Mesos Issue Type: Bug Components: slave Reporter: Benjamin Mahler Assignee: Benjamin Mahler It's possible for the slave to be overcommitted (e.g. MESOS-1668). In the case of named resources like ports and ephemeral_ports, this is problematic as the resources needed by the tasks are in use. This ticket is to present the idea of rejecting tasks when the slave is overcommitted on ports or ephemeral_ports. In order to ensure the master reconciles state with the slave, we can also trigger a re-registration. For cpu / memory, this is less crucial, so preventing overcommit for these will be punted for later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1592) Design inverse resource offer support
[ https://issues.apache.org/jira/browse/MESOS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1592: - Sprint: Q3 Sprint 3, Q3 Sprint 4, Q3 Sprint 5 (was: Q3 Sprint 3, Q3 Sprint 4) Design inverse resource offer support - Key: MESOS-1592 URL: https://issues.apache.org/jira/browse/MESOS-1592 Project: Mesos Issue Type: Task Components: allocation Reporter: Benjamin Mahler Assignee: Alexandra Sava An inverse resource offer means that Mesos is requesting resources back from the framework, possibly within some time interval. This can be leveraged initially to provide more automated cluster maintenance, by offering schedulers the opportunity to move tasks to compensate for planned maintenance. Operators can set a time limit on how long to wait for schedulers to relocate tasks before the tasks are forcibly terminated. Inverse resource offers have many other potential uses, as they open the opportunity for the allocator to attempt to move tasks in the cluster through the co-operation of the framework, possibly providing better over-subscription, fairness, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1728) Libprocess: report bind parameters on failure
[ https://issues.apache.org/jira/browse/MESOS-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1728: - Sprint: Q3 Sprint 4, Q3 Sprint 5 (was: Q3 Sprint 4) Libprocess: report bind parameters on failure - Key: MESOS-1728 URL: https://issues.apache.org/jira/browse/MESOS-1728 Project: Mesos Issue Type: Improvement Components: libprocess Reporter: Nikita Vetoshkin Assignee: Nikita Vetoshkin Priority: Trivial When you attempt to start a slave or master and there's another one already running there, it is nice to report the actual parameters of the {{bind}} call that failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
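A hedged sketch of the improvement in plain sockets code (not the actual libprocess change): when bind(2) fails, print the attempted address and port alongside the errno string, so "address already in use" says which endpoint was contended.
{noformat}
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cerrno>
#include <cstring>
#include <iostream>

int main()
{
  int fd = ::socket(AF_INET, SOCK_STREAM, 0);

  sockaddr_in addr;
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port = htons(5050);              // Example port (Mesos master default).
  addr.sin_addr.s_addr = htonl(INADDR_ANY);

  if (::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
    // Report the parameters that were attempted, not just the errno text.
    std::cerr << "Failed to bind on " << inet_ntoa(addr.sin_addr) << ":"
              << ntohs(addr.sin_port) << ": " << strerror(errno) << std::endl;
    return 1;
  }
  return 0;
}
{noformat}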
[jira] [Updated] (MESOS-1466) Race between executor exited event and launch task can cause overcommit of resources
[ https://issues.apache.org/jira/browse/MESOS-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1466: - Sprint: Q3 Sprint 3, Q3 Sprint 4 (was: Q3 Sprint 3, Q3 Sprint 4, Q3 Sprint 5) Race between executor exited event and launch task can cause overcommit of resources Key: MESOS-1466 URL: https://issues.apache.org/jira/browse/MESOS-1466 Project: Mesos Issue Type: Bug Components: allocation, master Reporter: Vinod Kone Assignee: Benjamin Mahler Labels: reliability The following sequence of events can cause an overcommit -- Launch task is called for a task whose executor is already running -- Executor's resources are not accounted for on the master -- Executor exits and the event is enqueued behind launch tasks on the master -- Master sends the task to the slave, which needs to commit resources for the task and the (new) executor. -- Master processes the executor exited event and re-offers the executor's resources, causing an overcommit of resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1425) LogZooKeeperTest.WriteRead test is flaky
[ https://issues.apache.org/jira/browse/MESOS-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1425: - Sprint: Q3 Sprint 1, Q3 Sprint 2, Q3 Sprint 4 (was: Q3 Sprint 1, Q3 Sprint 2, Q3 Sprint 4, Q3 Sprint 5) LogZooKeeperTest.WriteRead test is flaky Key: MESOS-1425 URL: https://issues.apache.org/jira/browse/MESOS-1425 Project: Mesos Issue Type: Bug Components: test Affects Versions: 0.19.0 Reporter: Vinod Kone Assignee: Jie Yu {code} [ RUN ] LogZooKeeperTest.WriteRead I0527 23:23:48.286031 1352 zookeeper_test_server.cpp:158] Started ZooKeeperTestServer on port 39446 I0527 23:23:48.293916 1352 log_tests.cpp:1945] Using temporary directory '/tmp/LogZooKeeperTest_WriteRead_Vyty8g' I0527 23:23:48.296430 1352 leveldb.cpp:176] Opened db in 2.459713ms I0527 23:23:48.296740 1352 leveldb.cpp:183] Compacted db in 286843ns I0527 23:23:48.296761 1352 leveldb.cpp:198] Created db iterator in 3083ns I0527 23:23:48.296772 1352 leveldb.cpp:204] Seeked to beginning of db in 4541ns I0527 23:23:48.296777 1352 leveldb.cpp:273] Iterated through 0 keys in the db in 87ns I0527 23:23:48.296788 1352 replica.cpp:741] Replica recovered with log positions 0 - 0 with 1 holes and 0 unlearned I0527 23:23:48.297499 1383 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 505340ns I0527 23:23:48.297513 1383 replica.cpp:320] Persisted replica status to VOTING I0527 23:23:48.299492 1352 leveldb.cpp:176] Opened db in 1.73582ms I0527 23:23:48.299773 1352 leveldb.cpp:183] Compacted db in 263937ns I0527 23:23:48.299793 1352 leveldb.cpp:198] Created db iterator in 7494ns I0527 23:23:48.299806 1352 leveldb.cpp:204] Seeked to beginning of db in 235ns I0527 23:23:48.299813 1352 leveldb.cpp:273] Iterated through 0 keys in the db in 93ns I0527 23:23:48.299821 1352 replica.cpp:741] Replica recovered with log positions 0 - 0 with 1 holes and 0 unlearned I0527 23:23:48.300503 1380 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 492309ns I0527 23:23:48.300516 1380 replica.cpp:320] Persisted replica status to VOTING I0527 23:23:48.302500 1352 leveldb.cpp:176] Opened db in 1.793829ms I0527 23:23:48.303642 1352 leveldb.cpp:183] Compacted db in 1.123929ms I0527 23:23:48.303669 1352 leveldb.cpp:198] Created db iterator in 5865ns I0527 23:23:48.303689 1352 leveldb.cpp:204] Seeked to beginning of db in 8811ns I0527 23:23:48.303705 1352 leveldb.cpp:273] Iterated through 1 keys in the db in 9545ns I0527 23:23:48.303715 1352 replica.cpp:741] Replica recovered with log positions 0 - 0 with 1 holes and 0 unlearned 2014-05-27 23:23:48,303:1352(0x2b1173a29700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5 2014-05-27 23:23:48,303:1352(0x2b1173a29700):ZOO_INFO@log_env@716: Client environment:host.name=minerva 2014-05-27 23:23:48,303:1352(0x2b1173a29700):ZOO_INFO@log_env@723: Client environment:os.name=Linux 2014-05-27 23:23:48,303:1352(0x2b1173a29700):ZOO_INFO@log_env@724: Client environment:os.arch=3.2.0-57-generic 2014-05-27 23:23:48,303:1352(0x2b1173a29700):ZOO_INFO@log_env@725: Client environment:os.version=#87-Ubuntu SMP Tue Nov 12 21:35:10 UTC 2013 2014-05-27 23:23:48,303:1352(0x2b1173e2b700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5 2014-05-27 23:23:48,304:1352(0x2b1173e2b700):ZOO_INFO@log_env@716: Client environment:host.name=minerva 2014-05-27 23:23:48,304:1352(0x2b1173e2b700):ZOO_INFO@log_env@723: Client environment:os.name=Linux 2014-05-27 
23:23:48,304:1352(0x2b1173e2b700):ZOO_INFO@log_env@724: Client environment:os.arch=3.2.0-57-generic 2014-05-27 23:23:48,304:1352(0x2b1173e2b700):ZOO_INFO@log_env@725: Client environment:os.version=#87-Ubuntu SMP Tue Nov 12 21:35:10 UTC 2013 2014-05-27 23:23:48,304:1352(0x2b1173a29700):ZOO_INFO@log_env@733: Client environment:user.name=(null) I0527 23:23:48.303988 1380 log.cpp:238] Attempting to join replica to ZooKeeper group 2014-05-27 23:23:48,304:1352(0x2b1173e2b700):ZOO_INFO@log_env@733: Client environment:user.name=(null) 2014-05-27 23:23:48,304:1352(0x2b1173a29700):ZOO_INFO@log_env@741: Client environment:user.home=/home/jenkins I0527 23:23:48.304198 1385 recover.cpp:425] Starting replica recovery 2014-05-27 23:23:48,304:1352(0x2b1173e2b700):ZOO_INFO@log_env@741: Client environment:user.home=/home/jenkins 2014-05-27 23:23:48,304:1352(0x2b1173a29700):ZOO_INFO@log_env@753: Client environment:user.dir=/tmp/LogZooKeeperTest_WriteRead_Vyty8g 2014-05-27 23:23:48,304:1352(0x2b1173a29700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=127.0.0.1:39446 sessionTimeout=5000 watcher=0x2b11708e98d0 sessionId=0 sessionPasswd=null
[jira] [Updated] (MESOS-1752) Allow variadic templates
[ https://issues.apache.org/jira/browse/MESOS-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1752: - Sprint: Q3 Sprint 4, Q3 Sprint 5 (was: Q3 Sprint 4) Allow variadic templates Key: MESOS-1752 URL: https://issues.apache.org/jira/browse/MESOS-1752 Project: Mesos Issue Type: Improvement Reporter: Dominic Hamon Assignee: Dominic Hamon Priority: Minor Labels: c++11 Add variadic templates to the C++11 configure check. Once there, we can start using them in the code-base. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
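For reference, a minimal example of the kind of variadic-template code such a configure-time feature check might try to compile (illustrative only, not the actual configure test):
{noformat}
#include <iostream>
#include <string>

// Base case: nothing left to print.
void log() { std::cout << std::endl; }

// Recursive case: print the head, recurse on the tail.
template <typename Head, typename... Tail>
void log(const Head& head, const Tail&... tail)
{
  std::cout << head << ' ';
  log(tail...);
}

int main()
{
  log("slave", 5051, std::string("registered"));
  return 0;
}
{noformat}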
[jira] [Updated] (MESOS-1758) Freezer failure leads to lost task during container destruction.
[ https://issues.apache.org/jira/browse/MESOS-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1758: - Sprint: Q3 Sprint 5 Freezer failure leads to lost task during container destruction. Key: MESOS-1758 URL: https://issues.apache.org/jira/browse/MESOS-1758 Project: Mesos Issue Type: Bug Components: containerization Reporter: Benjamin Mahler In the past we've seen numerous issues around the freezer. Lately, on the 2.6.44 kernel, we've seen issues where we're unable to freeze the cgroup: (1) An oom occurs. (2) No indication of oom in the kernel logs. (3) The slave is unable to freeze the cgroup. (4) The task is marked as lost. {noformat} I0903 16:46:24.956040 25469 mem.cpp:575] Memory limit exceeded: Requested: 15488MB Maximum Used: 15488MB MEMORY STATISTICS: cache 7958691840 rss 8281653248 mapped_file 9474048 pgpgin 4487861 pgpgout 522933 pgfault 2533780 pgmajfault 11 inactive_anon 0 active_anon 8281653248 inactive_file 7631708160 active_file 326852608 unevictable 0 hierarchical_memory_limit 16240345088 total_cache 7958691840 total_rss 8281653248 total_mapped_file 9474048 total_pgpgin 4487861 total_pgpgout 522933 total_pgfault 2533780 total_pgmajfault 11 total_inactive_anon 0 total_active_anon 8281653248 total_inactive_file 7631728640 total_active_file 326852608 total_unevictable 0 I0903 16:46:24.956848 25469 containerizer.cpp:1041] Container bbb9732a-d600-4c1b-b326-846338c608c3 has reached its limit for resource mem(*):1.62403e+10 and will be terminated I0903 16:46:24.957427 25469 containerizer.cpp:909] Destroying container 'bbb9732a-d600-4c1b-b326-846338c608c3' I0903 16:46:24.958664 25481 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:34.959529 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:34.962070 25482 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.710848ms I0903 16:46:34.962658 25479 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:44.963349 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:44.965631 25472 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.588224ms I0903 16:46:44.966356 25472 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:54.967254 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:56.008447 25475 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 2.15296ms I0903 16:46:56.009071 25466 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:06.010329 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:06.012538 25467 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.643008ms I0903 16:47:06.013216 25467 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:12.516348 25480 slave.cpp:3030] Current usage 9.57%. 
Max allowed age: 5.630238827780799days I0903 16:47:16.015192 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:16.017043 25486 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.511168ms I0903 16:47:16.017555 25480 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:19.862746 25483 http.cpp:245] HTTP request for '/slave(1)/stats.json' E0903 16:47:24.960055 25472 slave.cpp:2557] Termination of executor 'E' of framework '201104070004-002563-' failed: Failed to destroy container: discarded future I0903 16:47:24.962054 25472 slave.cpp:2087] Handling status update TASK_LOST (UUID: c0c1633b-7221-40dc-90a2-660ef639f747) for task T of framework 201104070004-002563- from @0.0.0.0:0 I0903 16:47:24.963470 25469 mem.cpp:293] Updated 'memory.soft_limit_in_bytes' to 128MB for container bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:24.963541 25471 cpushare.cpp:338] Updated 'cpu.shares' to 256 (cpus 0.25) for container bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:24.964756 25471
[jira] [Updated] (MESOS-1410) Keep terminal unacknowledged tasks in the master's state.
[ https://issues.apache.org/jira/browse/MESOS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1410: - Assignee: Benjamin Mahler Keep terminal unacknowledged tasks in the master's state. - Key: MESOS-1410 URL: https://issues.apache.org/jira/browse/MESOS-1410 Project: Mesos Issue Type: Task Affects Versions: 0.19.0 Reporter: Benjamin Mahler Assignee: Benjamin Mahler Fix For: 0.21.0 Once we are sending acknowledgments through the master as per MESOS-1409, we need to keep terminal tasks that are *unacknowledged* in the Master's memory. This will allow us to identify these tasks to frameworks when we haven't yet forwarded them an update. Without this, we're susceptible to MESOS-1389. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1410) Keep terminal unacknowledged tasks in the master's state.
[ https://issues.apache.org/jira/browse/MESOS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1410: - Shepherd: Vinod Kone Keep terminal unacknowledged tasks in the master's state. - Key: MESOS-1410 URL: https://issues.apache.org/jira/browse/MESOS-1410 Project: Mesos Issue Type: Task Affects Versions: 0.19.0 Reporter: Benjamin Mahler Assignee: Benjamin Mahler Fix For: 0.21.0 Once we are sending acknowledgments through the master as per MESOS-1409, we need to keep terminal tasks that are *unacknowledged* in the Master's memory. This will allow us to identify these tasks to frameworks when we haven't yet forwarded them an update. Without this, we're susceptible to MESOS-1389. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1410) Keep terminal unacknowledged tasks in the master's state.
[ https://issues.apache.org/jira/browse/MESOS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1410: - Sprint: Q3 Sprint 5 Keep terminal unacknowledged tasks in the master's state. - Key: MESOS-1410 URL: https://issues.apache.org/jira/browse/MESOS-1410 Project: Mesos Issue Type: Task Affects Versions: 0.19.0 Reporter: Benjamin Mahler Fix For: 0.21.0 Once we are sending acknowledgments through the master as per MESOS-1409, we need to keep terminal tasks that are *unacknowledged* in the Master's memory. This will allow us to identify these tasks to frameworks when we haven't yet forwarded them an update. Without this, we're susceptible to MESOS-1389. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1476) Provide endpoints for deactivating / activating slaves.
[ https://issues.apache.org/jira/browse/MESOS-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1476: - Sprint: Q3 Sprint 5 Provide endpoints for deactivating / activating slaves. --- Key: MESOS-1476 URL: https://issues.apache.org/jira/browse/MESOS-1476 Project: Mesos Issue Type: Improvement Components: master Reporter: Benjamin Mahler Assignee: Alexandra Sava Labels: gsoc2014 When performing maintenance operations on slaves, it is important to allow these slaves to be drained of their tasks. The first essential primitive of draining slaves is to prevent them from running more tasks. This can be achieved by deactivating them: stop sending their resource offers to frameworks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1739) Add Dynamic Slave Attributes
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1739: - Sprint: Q3 Sprint 5 Add Dynamic Slave Attributes Key: MESOS-1739 URL: https://issues.apache.org/jira/browse/MESOS-1739 Project: Mesos Issue Type: Improvement Reporter: Patrick Reilly Assignee: Patrick Reilly Make it so that, either via a slave restart or an out-of-process reconfigure ping, the attributes and resources of a slave can be updated to be a superset of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-1765) Use PID namespace to avoid freezing cgroup
[ https://issues.apache.org/jira/browse/MESOS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon reassigned MESOS-1765: Assignee: Vinod Kone Use PID namespace to avoid freezing cgroup -- Key: MESOS-1765 URL: https://issues.apache.org/jira/browse/MESOS-1765 Project: Mesos Issue Type: Story Components: containerization Reporter: Cong Wang Assignee: Vinod Kone There is a known kernel issue when we freeze the whole cgroup upon OOM. Mesos can probably just use a PID namespace so that we only need to kill the init of the PID namespace, instead of freezing all the processes and killing them one by one. But I am not quite sure whether this would break the existing code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-1758) Freezer failure leads to lost task during container destruction.
[ https://issues.apache.org/jira/browse/MESOS-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon reassigned MESOS-1758: Assignee: Vinod Kone Freezer failure leads to lost task during container destruction. Key: MESOS-1758 URL: https://issues.apache.org/jira/browse/MESOS-1758 Project: Mesos Issue Type: Bug Components: containerization Reporter: Benjamin Mahler Assignee: Vinod Kone In the past we've seen numerous issues around the freezer. Lately, on the 2.6.44 kernel, we've seen issues where we're unable to freeze the cgroup: (1) An oom occurs. (2) No indication of oom in the kernel logs. (3) The slave is unable to freeze the cgroup. (4) The task is marked as lost. {noformat} I0903 16:46:24.956040 25469 mem.cpp:575] Memory limit exceeded: Requested: 15488MB Maximum Used: 15488MB MEMORY STATISTICS: cache 7958691840 rss 8281653248 mapped_file 9474048 pgpgin 4487861 pgpgout 522933 pgfault 2533780 pgmajfault 11 inactive_anon 0 active_anon 8281653248 inactive_file 7631708160 active_file 326852608 unevictable 0 hierarchical_memory_limit 16240345088 total_cache 7958691840 total_rss 8281653248 total_mapped_file 9474048 total_pgpgin 4487861 total_pgpgout 522933 total_pgfault 2533780 total_pgmajfault 11 total_inactive_anon 0 total_active_anon 8281653248 total_inactive_file 7631728640 total_active_file 326852608 total_unevictable 0 I0903 16:46:24.956848 25469 containerizer.cpp:1041] Container bbb9732a-d600-4c1b-b326-846338c608c3 has reached its limit for resource mem(*):1.62403e+10 and will be terminated I0903 16:46:24.957427 25469 containerizer.cpp:909] Destroying container 'bbb9732a-d600-4c1b-b326-846338c608c3' I0903 16:46:24.958664 25481 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:34.959529 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:34.962070 25482 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.710848ms I0903 16:46:34.962658 25479 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:44.963349 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:44.965631 25472 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.588224ms I0903 16:46:44.966356 25472 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:54.967254 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:56.008447 25475 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 2.15296ms I0903 16:46:56.009071 25466 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:06.010329 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:06.012538 25467 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.643008ms I0903 16:47:06.013216 25467 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:12.516348 25480 slave.cpp:3030] Current usage 9.57%. 
Max allowed age: 5.630238827780799days I0903 16:47:16.015192 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:16.017043 25486 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.511168ms I0903 16:47:16.017555 25480 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:19.862746 25483 http.cpp:245] HTTP request for '/slave(1)/stats.json' E0903 16:47:24.960055 25472 slave.cpp:2557] Termination of executor 'E' of framework '201104070004-002563-' failed: Failed to destroy container: discarded future I0903 16:47:24.962054 25472 slave.cpp:2087] Handling status update TASK_LOST (UUID: c0c1633b-7221-40dc-90a2-660ef639f747) for task T of framework 201104070004-002563- from @0.0.0.0:0 I0903 16:47:24.963470 25469 mem.cpp:293] Updated 'memory.soft_limit_in_bytes' to 128MB for container bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:24.963541 25471 cpushare.cpp:338] Updated 'cpu.shares' to 256 (cpus 0.25) for container
[jira] [Updated] (MESOS-1721) Prevent overcommit of the slave for ports and ephemeral ports.
[ https://issues.apache.org/jira/browse/MESOS-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1721: - Sprint: Q3 Sprint 4 (was: Q3 Sprint 4, Q3 Sprint 5) Prevent overcommit of the slave for ports and ephemeral ports. -- Key: MESOS-1721 URL: https://issues.apache.org/jira/browse/MESOS-1721 Project: Mesos Issue Type: Bug Components: slave Reporter: Benjamin Mahler Assignee: Benjamin Mahler It's possible for the slave to be overcommitted (e.g. MESOS-1668). In the case of named resources like ports and ephemeral_ports, this is problematic as the resources needed by the tasks are in use. This ticket is to present the idea of rejecting tasks when the slave is overcommitted on ports or ephemeral_ports. In order to ensure the master reconciles state with the slave, we can also trigger a re-registration. For cpu / memory, this is less crucial, so preventing overcommit for these will be punted for later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1717) The slave does not show pending tasks in the JSON endpoints.
[ https://issues.apache.org/jira/browse/MESOS-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1717: - Story Points: 1 The slave does not show pending tasks in the JSON endpoints. Key: MESOS-1717 URL: https://issues.apache.org/jira/browse/MESOS-1717 Project: Mesos Issue Type: Bug Components: json api, slave Reporter: Benjamin Mahler Assignee: Benjamin Mahler The slave does not show pending tasks in the /state.json endpoint. This is a bit tricky to add since we rely on knowing the executor directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1715) The slave does not send pending tasks / executors during re-registration.
[ https://issues.apache.org/jira/browse/MESOS-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1715: - Story Points: 3 The slave does not send pending tasks / executors during re-registration. - Key: MESOS-1715 URL: https://issues.apache.org/jira/browse/MESOS-1715 Project: Mesos Issue Type: Bug Components: slave Reporter: Benjamin Mahler Assignee: Benjamin Mahler In what looks like an oversight, the pending tasks and executors in the slave (Framework::pending) are not sent in the re-registration message. For tasks, this can lead to spurious TASK_LOST notifications being generated by the master when it falsely thinks the tasks are not present on the slave. For executors, this can lead to under-accounting in the master, causing an overcommit on the slave. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1728) Libprocess: report bind parameters on failure
[ https://issues.apache.org/jira/browse/MESOS-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1728: - Story Points: 1 Libprocess: report bind parameters on failure - Key: MESOS-1728 URL: https://issues.apache.org/jira/browse/MESOS-1728 Project: Mesos Issue Type: Improvement Components: libprocess Reporter: Nikita Vetoshkin Assignee: Nikita Vetoshkin Priority: Trivial When you attempt to start a slave or master and there's another one already running there, it is nice to report the actual parameters of the {{bind}} call that failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1392) Failure when znode is removed before we can read its contents.
[ https://issues.apache.org/jira/browse/MESOS-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1392: - Story Points: 3 Failure when znode is removed before we can read its contents. -- Key: MESOS-1392 URL: https://issues.apache.org/jira/browse/MESOS-1392 Project: Mesos Issue Type: Bug Affects Versions: 0.19.0 Reporter: Benjamin Mahler Assignee: Yan Xu Looks like the following can occur when a znode goes away right before we can read its contents: {noformat: title=Slave exit} I0520 16:33:45.721727 29155 group.cpp:382] Trying to create path '/home/mesos/test/master' in ZooKeeper I0520 16:33:48.600837 29155 detector.cpp:134] Detected a new leader: (id='2617') I0520 16:33:48.601428 29147 group.cpp:655] Trying to get '/home/mesos/test/master/info_002617' in ZooKeeper Failed to detect a master: Failed to get data for ephemeral node '/home/mesos/test/master/info_002617' in ZooKeeper: no node Slave Exit Status: 1 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1739) Add Dynamic Slave Attributes
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1739: - Story Points: 3 Add Dynamic Slave Attributes Key: MESOS-1739 URL: https://issues.apache.org/jira/browse/MESOS-1739 Project: Mesos Issue Type: Improvement Reporter: Patrick Reilly Assignee: Patrick Reilly Make it so that, either via a slave restart or an out-of-process reconfigure ping, the attributes and resources of a slave can be updated to be a superset of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-703) master fails to respect updated FrameworkInfo when the framework scheduler restarts
[ https://issues.apache.org/jira/browse/MESOS-703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-703: Sprint: Q3 Sprint 5 master fails to respect updated FrameworkInfo when the framework scheduler restarts --- Key: MESOS-703 URL: https://issues.apache.org/jira/browse/MESOS-703 Project: Mesos Issue Type: Bug Components: master Affects Versions: 0.14.0 Environment: ubuntu 13.04, mesos 0.14.0-rc3 Reporter: Jordan Curzon Assignee: Vinod Kone When I first ran marathon it was running as a personal user and registered with mesos-master as such due to putting an empty string in the user field. When I restarted marathon as nobody, tasks were still being run as the personal user which didn't exist on the slaves. I know marathon was trying to send a FrameworkInfo with nobody listed as the user because I hard coded it in. The tasks wouldn't run as nobody until I restarted the mesos-master. Each time I restarted the marathon framework, it reregistered with mesos-master and mesos-master wrote to the logs that it detected a failover because the scheduler went away and then came back. I understand the scheduler failover, but shouldn't mesos-master respect an updated FrameworkInfo when the scheduler re-registers? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14126053#comment-14126053 ] Kamil Domański commented on MESOS-1774: --- [~tstclair], I actually updated the review request, since the original patch changed the echoed message, but not the command. Fix protobuf detection on systems with Python 3 as default -- Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Assignee: Timothy St. Clair Labels: build When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Domański updated MESOS-1774: -- Comment: was deleted (was: [~tstclair], I actually updated the review request, since the original patch changed the echoed message, but not the command.) Fix protobuf detection on systems with Python 3 as default -- Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Assignee: Timothy St. Clair Labels: build When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (MESOS-1774) Fix protobuf detection on systems with Python 3 as default
[ https://issues.apache.org/jira/browse/MESOS-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Domański reopened MESOS-1774: --- [~tstclair], I actually updated the review request, since the original patch changed the echoed message, but not the command. Fix protobuf detection on systems with Python 3 as default -- Key: MESOS-1774 URL: https://issues.apache.org/jira/browse/MESOS-1774 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.20.0 Environment: Gentoo Linux ./configure --disable-bundled Reporter: Kamil Domański Assignee: Timothy St. Clair Labels: build When configuring without bundled dependencies, use of the *python* symbolic link in *m4/ac_python_module.m4* causes detection of the *google.protobuf* module to fail on systems with Python 3 set as default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-1777) Design persistent resources
Jie Yu created MESOS-1777: - Summary: Design persistent resources Key: MESOS-1777 URL: https://issues.apache.org/jira/browse/MESOS-1777 Project: Mesos Issue Type: Task Reporter: Jie Yu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1777) Design persistent resources
[ https://issues.apache.org/jira/browse/MESOS-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dominic Hamon updated MESOS-1777: - Sprint: Q3 Sprint 5 Assignee: Jie Yu Design persistent resources --- Key: MESOS-1777 URL: https://issues.apache.org/jira/browse/MESOS-1777 Project: Mesos Issue Type: Task Reporter: Jie Yu Assignee: Jie Yu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1758) Freezer failure leads to lost task during container destruction.
[ https://issues.apache.org/jira/browse/MESOS-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14126238#comment-14126238 ] Vinod Kone commented on MESOS-1758: --- short term fix: https://reviews.apache.org/r/25457/ until we get PID namespace support. Freezer failure leads to lost task during container destruction. Key: MESOS-1758 URL: https://issues.apache.org/jira/browse/MESOS-1758 Project: Mesos Issue Type: Bug Components: containerization Reporter: Benjamin Mahler Assignee: Vinod Kone In the past we've seen numerous issues around the freezer. Lately, on the 2.6.44 kernel, we've seen issues where we're unable to freeze the cgroup: (1) An oom occurs. (2) No indication of oom in the kernel logs. (3) The slave is unable to freeze the cgroup. (4) The task is marked as lost. {noformat} I0903 16:46:24.956040 25469 mem.cpp:575] Memory limit exceeded: Requested: 15488MB Maximum Used: 15488MB MEMORY STATISTICS: cache 7958691840 rss 8281653248 mapped_file 9474048 pgpgin 4487861 pgpgout 522933 pgfault 2533780 pgmajfault 11 inactive_anon 0 active_anon 8281653248 inactive_file 7631708160 active_file 326852608 unevictable 0 hierarchical_memory_limit 16240345088 total_cache 7958691840 total_rss 8281653248 total_mapped_file 9474048 total_pgpgin 4487861 total_pgpgout 522933 total_pgfault 2533780 total_pgmajfault 11 total_inactive_anon 0 total_active_anon 8281653248 total_inactive_file 7631728640 total_active_file 326852608 total_unevictable 0 I0903 16:46:24.956848 25469 containerizer.cpp:1041] Container bbb9732a-d600-4c1b-b326-846338c608c3 has reached its limit for resource mem(*):1.62403e+10 and will be terminated I0903 16:46:24.957427 25469 containerizer.cpp:909] Destroying container 'bbb9732a-d600-4c1b-b326-846338c608c3' I0903 16:46:24.958664 25481 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:34.959529 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:34.962070 25482 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.710848ms I0903 16:46:34.962658 25479 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:44.963349 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:44.965631 25472 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.588224ms I0903 16:46:44.966356 25472 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:54.967254 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:46:56.008447 25475 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 2.15296ms I0903 16:46:56.009071 25466 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:06.010329 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:06.012538 25467 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.643008ms I0903 16:47:06.013216 25467 cgroups.cpp:2192] Freezing cgroup 
/sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:12.516348 25480 slave.cpp:3030] Current usage 9.57%. Max allowed age: 5.630238827780799days I0903 16:47:16.015192 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:16.017043 25486 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.511168ms I0903 16:47:16.017555 25480 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 I0903 16:47:19.862746 25483 http.cpp:245] HTTP request for '/slave(1)/stats.json' E0903 16:47:24.960055 25472 slave.cpp:2557] Termination of executor 'E' of framework '201104070004-002563-' failed: Failed to destroy container: discarded future I0903 16:47:24.962054 25472 slave.cpp:2087] Handling status update TASK_LOST (UUID: c0c1633b-7221-40dc-90a2-660ef639f747) for task T of framework 201104070004-002563- from @0.0.0.0:0 I0903 16:47:24.963470 25469 mem.cpp:293] Updated 'memory.soft_limit_in_bytes' to 128MB for container bbb9732a-d600-4c1b-b326-846338c608c3 I0903
[jira] [Updated] (MESOS-1758) Freezer failure leads to lost task during container destruction.
[ https://issues.apache.org/jira/browse/MESOS-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-1758: -- Target Version/s: 0.20.1 Fix Version/s: 0.21.0 Story Points: 2
commit 63ed9863f927beb2cf074aacc838fb601329 Author: Vinod Kone vinodk...@gmail.com Date: Mon Sep 8 15:40:54 2014 -0700
Added kill() to freezerTimedOut() in cgroups.cpp. This is a short-term fix for MESOS-1758. Review: https://reviews.apache.org/r/25457
Freezer failure leads to lost task during container destruction. Key: MESOS-1758 URL: https://issues.apache.org/jira/browse/MESOS-1758 Project: Mesos Issue Type: Bug Components: containerization Reporter: Benjamin Mahler Assignee: Vinod Kone Fix For: 0.21.0
In the past we've seen numerous issues around the freezer. Lately, on the 2.6.44 kernel, we've seen issues where we're unable to freeze the cgroup:
(1) An oom occurs.
(2) No indication of oom in the kernel logs.
(3) The slave is unable to freeze the cgroup.
(4) The task is marked as lost.
{noformat}
I0903 16:46:24.956040 25469 mem.cpp:575] Memory limit exceeded: Requested: 15488MB Maximum Used: 15488MB
MEMORY STATISTICS: cache 7958691840 rss 8281653248 mapped_file 9474048 pgpgin 4487861 pgpgout 522933 pgfault 2533780 pgmajfault 11 inactive_anon 0 active_anon 8281653248 inactive_file 7631708160 active_file 326852608 unevictable 0 hierarchical_memory_limit 16240345088 total_cache 7958691840 total_rss 8281653248 total_mapped_file 9474048 total_pgpgin 4487861 total_pgpgout 522933 total_pgfault 2533780 total_pgmajfault 11 total_inactive_anon 0 total_active_anon 8281653248 total_inactive_file 7631728640 total_active_file 326852608 total_unevictable 0
I0903 16:46:24.956848 25469 containerizer.cpp:1041] Container bbb9732a-d600-4c1b-b326-846338c608c3 has reached its limit for resource mem(*):1.62403e+10 and will be terminated
I0903 16:46:24.957427 25469 containerizer.cpp:909] Destroying container 'bbb9732a-d600-4c1b-b326-846338c608c3'
I0903 16:46:24.958664 25481 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:46:34.959529 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:46:34.962070 25482 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.710848ms
I0903 16:46:34.962658 25479 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:46:44.963349 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:46:44.965631 25472 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.588224ms
I0903 16:46:44.966356 25472 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:46:54.967254 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:46:56.008447 25475 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 2.15296ms
I0903 16:46:56.009071 25466 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:47:06.010329 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:47:06.012538 25467 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.643008ms
I0903 16:47:06.013216 25467 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:47:12.516348 25480 slave.cpp:3030] Current usage 9.57%. Max allowed age: 5.630238827780799days
I0903 16:47:16.015192 25488 cgroups.cpp:2209] Thawing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:47:16.017043 25486 cgroups.cpp:1404] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3 after 1.511168ms
I0903 16:47:16.017555 25480 cgroups.cpp:2192] Freezing cgroup /sys/fs/cgroup/freezer/mesos/bbb9732a-d600-4c1b-b326-846338c608c3
I0903 16:47:19.862746 25483 http.cpp:245] HTTP request for '/slave(1)/stats.json'
E0903 16:47:24.960055 25472 slave.cpp:2557] Termination of executor 'E' of framework '201104070004-002563-' failed: Failed to destroy container: discarded future
I0903 16:47:24.962054 25472 slave.cpp:2087] Handling status update TASK_LOST (UUID:
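The short-term fix referenced in the commit above amounts to: if the freezer cgroup cannot reach the FROZEN state before the timeout, SIGKILL the processes in the cgroup so that the destroy sequence can make progress. The standalone sketch below illustrates that idea against the cgroup v1 freezer interface; it is not the actual cgroups.cpp change, and the helper names and example cgroup path are hypothetical.
{noformat}
// Illustrative sketch only: kill every process in a freezer cgroup when the
// FROZEN state is not reached in time. Names and paths are hypothetical and
// do not reflect Mesos' cgroups.cpp.
#include <signal.h>
#include <sys/types.h>

#include <fstream>
#include <iostream>
#include <string>

// Read the freezer state file, e.g. "FREEZING", "FROZEN" or "THAWED".
static std::string freezerState(const std::string& cgroup) {
  std::ifstream in(cgroup + "/freezer.state");
  std::string state;
  in >> state;
  return state;
}

// Send SIGKILL to every pid listed in the cgroup's cgroup.procs file.
static void killCgroupProcesses(const std::string& cgroup) {
  std::ifstream procs(cgroup + "/cgroup.procs");
  pid_t pid;
  while (procs >> pid) {
    std::cout << "Sending SIGKILL to " << pid << std::endl;
    ::kill(pid, SIGKILL);
  }
}

int main() {
  // Hypothetical cgroup path; for a Mesos container this would be the
  // freezer cgroup under /sys/fs/cgroup/freezer/mesos/<container-id>.
  const std::string cgroup = "/sys/fs/cgroup/freezer/mesos/example";

  if (freezerState(cgroup) != "FROZEN") {
    // The freeze attempt timed out: kill the tasks so that a subsequent
    // freeze (and the container destroy sequence) can complete.
    killCgroupProcesses(cgroup);
  }
  return 0;
}
{noformat}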
[jira] [Updated] (MESOS-1476) Provide endpoints for deactivating / activating slaves.
[ https://issues.apache.org/jira/browse/MESOS-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-1476: --- Sprint: (was: Mesos Q3 Sprint 5) Provide endpoints for deactivating / activating slaves. --- Key: MESOS-1476 URL: https://issues.apache.org/jira/browse/MESOS-1476 Project: Mesos Issue Type: Improvement Components: master Reporter: Benjamin Mahler Labels: gsoc2014 When performing maintenance operations on slaves, it is important to allow these slaves to be drained of their tasks. The first essential primitive of draining slaves is to prevent them from running more tasks. This can be achieved by deactivating them: stop sending their resource offers to frameworks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
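As background to the description above, deactivating a slave simply means the master stops including that slave's resources when it constructs offers. A minimal sketch of that filtering step, using entirely hypothetical types rather than the real master/allocator code:
{noformat}
// Hypothetical sketch of offer generation that skips deactivated slaves.
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Slave {
  std::string id;
  double cpus;     // Unallocated CPUs available to offer.
  bool activated;  // Cleared when an operator deactivates the slave.
};

// Return the slaves whose resources may still be offered to frameworks.
std::vector<Slave> offerableSlaves(const std::map<std::string, Slave>& slaves) {
  std::vector<Slave> result;
  for (const auto& entry : slaves) {
    const Slave& slave = entry.second;
    if (slave.activated) {
      result.push_back(slave);
    }
  }
  return result;
}

int main() {
  std::map<std::string, Slave> slaves = {
      {"s1", {"s1", 4.0, true}},
      {"s2", {"s2", 8.0, false}},  // Deactivated for maintenance: no offers.
  };
  for (const Slave& slave : offerableSlaves(slaves)) {
    std::cout << "Offering " << slave.cpus << " cpus from " << slave.id << "\n";
  }
  return 0;
}
{noformat}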
[jira] [Assigned] (MESOS-1476) Provide endpoints for deactivating / activating slaves.
[ https://issues.apache.org/jira/browse/MESOS-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler reassigned MESOS-1476: -- Assignee: (was: Alexandra Sava) Un-assigning for now since there is no longer a need for this with the updated maintenance design in MESOS-1474. Provide endpoints for deactivating / activating slaves. --- Key: MESOS-1476 URL: https://issues.apache.org/jira/browse/MESOS-1476 Project: Mesos Issue Type: Improvement Components: master Reporter: Benjamin Mahler Labels: gsoc2014 When performing maintenance operations on slaves, it is important to allow these slaves to be drained of their tasks. The first essential primitive of draining slaves is to prevent them from running more tasks. This can be achieved by deactivating them: stop sending their resource offers to frameworks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1592) Design inverse resource offer support
[ https://issues.apache.org/jira/browse/MESOS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14126421#comment-14126421 ] Benjamin Mahler commented on MESOS-1592: Moving this to reviewable as inverse offers were designed as part of the maintenance work: MESOS-1474. We are currently considering how persistent resources will interact with inverse offers and the other maintenance primitives. Design inverse resource offer support - Key: MESOS-1592 URL: https://issues.apache.org/jira/browse/MESOS-1592 Project: Mesos Issue Type: Task Components: allocation Reporter: Benjamin Mahler Assignee: Alexandra Sava An inverse resource offer means that Mesos is requesting resources back from the framework, possibly within some time interval. This can be leveraged initially to provide more automated cluster maintenance, by offering schedulers the opportunity to move tasks to compensate for planned maintenance. Operators can set a time limit on how long to wait for schedulers to relocate tasks before the tasks are forcibly terminated. Inverse resource offers have many other potential uses, as they open the opportunity for the allocator to attempt to move tasks in the cluster through the co-operation of the framework, possibly providing better over-subscription, fairness, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
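To make the concept above concrete, an inverse offer would carry the resources the master wants back from a framework on a given slave, plus the deadline after which the operator may terminate tasks forcibly. The design was still in progress under MESOS-1474 at this point, so the sketch below is purely illustrative: the struct and function names are hypothetical, not the eventual Mesos API.
{noformat}
// Hypothetical illustration of the inverse-offer concept: the master asks a
// framework to give resources back before a deadline (e.g. for maintenance).
#include <chrono>
#include <iostream>
#include <string>
#include <vector>

struct InverseOffer {
  std::string slaveId;                              // Slave being drained.
  std::vector<std::string> resources;               // Resources requested back.
  std::chrono::system_clock::time_point deadline;   // When tasks may be killed.
};

// A cooperative scheduler would try to relocate its tasks off the slave
// before the deadline; otherwise they may be forcibly terminated.
void handleInverseOffer(const InverseOffer& offer) {
  const auto secondsLeft = std::chrono::duration_cast<std::chrono::seconds>(
      offer.deadline - std::chrono::system_clock::now()).count();
  std::cout << "Asked to vacate slave " << offer.slaveId << ": "
            << offer.resources.size() << " resource(s) requested back, "
            << secondsLeft << "s until the deadline\n";
  // ... migrate or checkpoint tasks here, then respond to the inverse offer.
}

int main() {
  InverseOffer offer{
      "slave-42",
      {"cpus(*):4", "mem(*):4096"},
      std::chrono::system_clock::now() + std::chrono::hours(1)};
  handleInverseOffer(offer);
  return 0;
}
{noformat}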
[jira] [Updated] (MESOS-1717) The slave does not show pending tasks in the JSON endpoints.
[ https://issues.apache.org/jira/browse/MESOS-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-1717: --- Sprint: Q3 Sprint 4 (was: Q3 Sprint 4, Mesos Q3 Sprint 5) The slave does not show pending tasks in the JSON endpoints. Key: MESOS-1717 URL: https://issues.apache.org/jira/browse/MESOS-1717 Project: Mesos Issue Type: Bug Components: json api, slave Reporter: Benjamin Mahler The slave does not show pending tasks in the /state.json endpoint. This is a bit tricky to add since we rely on knowing the executor directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)