[mesos] branch master updated: Disable `AgentFailoverHTTPExecutorUsingResourceProviderResources` test.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new 172a46a  Disable `AgentFailoverHTTPExecutorUsingResourceProviderResources` test.
172a46a is described below

commit 172a46ad456e7c61a1694690edcc35742f306596
Author: Chun-Hung Hsiao
AuthorDate: Thu Aug 1 02:11:27 2019 -0700

    Disable `AgentFailoverHTTPExecutorUsingResourceProviderResources` test.
---
 src/tests/slave_tests.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/tests/slave_tests.cpp b/src/tests/slave_tests.cpp
index abee107..fc95f19 100644
--- a/src/tests/slave_tests.cpp
+++ b/src/tests/slave_tests.cpp
@@ -11595,7 +11595,8 @@ TEST_F(SlaveTest, RetryOperationStatusUpdateAfterRecovery)
 // This test verifies that on agent failover HTTP-based executors using resource
 // provider resources can resubscribe without crashing the agent or killing the
 // executor. This is a regression test for MESOS-9667 and MESOS-9711.
-TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)
+TEST_F(
+    SlaveTest, DISABLED_AgentFailoverHTTPExecutorUsingResourceProviderResources)
 {
   // This test is run with paused clock to avoid
   // dealing with retried task status updates.
[mesos] branch master updated: Fixed a typo in `src/Makefile.am`.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new c348285  Fixed a typo in `src/Makefile.am`.
c348285 is described below

commit c3482852abd26d1d35af3ecb0141784acfc0748f
Author: Chun-Hung Hsiao
AuthorDate: Fri Jul 26 08:38:16 2019 -0700

    Fixed a typo in `src/Makefile.am`.
---
 src/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/Makefile.am b/src/Makefile.am
index 2164c60..d27d4b6 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -1178,7 +1178,7 @@ libmesos_no_3rdparty_la_SOURCES += \
   oci/spec.cpp \
   posix/rlimits.cpp \
   posix/rlimits.hpp \
-  resource_provider/constants.cpp \
+  resource_provider/constants.hpp \
   resource_provider/daemon.cpp \
   resource_provider/daemon.hpp \
   resource_provider/detector.cpp \
[mesos] 01/02: Moved default constants for CSI RPC retry to a new header.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 668781708ac80913ad1b8eb336e4c38bff70d842
Author: Chun-Hung Hsiao
AuthorDate: Tue Jun 11 11:33:00 2019 -0700

    Moved default constants for CSI RPC retry to a new header.

    Since the default constants for CSI RPC retry do not depend on CSI
    versions, these constants are pulled off from version-specific headers
    to a common header.

    Review: https://reviews.apache.org/r/71143
---
 src/Makefile.am | 3 +-
 src/csi/constants.hpp | 38 ++
 src/csi/v0_volume_manager.cpp | 5 +--
 src/csi/v0_volume_manager_process.hpp | 11 ---
 src/csi/v1_volume_manager.cpp | 5 +--
 src/csi/v1_volume_manager_process.hpp | 11 ---
 .../storage_local_resource_provider_tests.cpp | 9 ++---
 7 files changed, 51 insertions(+), 31 deletions(-)

diff --git a/src/Makefile.am b/src/Makefile.am
index 697ab10..f0a83a2 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -1588,7 +1588,7 @@ libmesos_no_3rdparty_la_LIBADD += libbuild.la
 # Convenience library for build the CSI client.
 noinst_LTLIBRARIES += libcsi.la
 libcsi_la_SOURCES = \
-  csi/types.cpp \
+  csi/constants.hpp \
   csi/metrics.cpp \
   csi/metrics.hpp \
   csi/paths.cpp \
@@ -1597,6 +1597,7 @@ libcsi_la_SOURCES = \
   csi/service_manager.hpp \
   csi/state.hpp \
   csi/state.proto \
+  csi/types.cpp \
   csi/v0.cpp \
   csi/v0_client.cpp \
   csi/v0_client.hpp \
diff --git a/src/csi/constants.hpp b/src/csi/constants.hpp
new file mode 100644
index 000..af53cb2
--- /dev/null
+++ b/src/csi/constants.hpp
@@ -0,0 +1,38 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#ifndef __CSI_CONSTANTS_HPP__
+#define __CSI_CONSTANTS_HPP__
+
+#include
+
+namespace mesos {
+namespace csi {
+
+// The CSI volume manager initially picks a random amount of time between
+// `[0, b]`, where `b = DEFAULT_RPC_RETRY_BACKOFF_FACTOR`, to retry RPC calls.
+// Subsequent retries are exponentially backed off based on this interval (e.g.,
+// 2nd retry uses a random value between `[0, b * 2^1]`, 3rd retry between
+// `[0, b * 2^2]`, etc) up to a maximum of `DEFAULT_RPC_RETRY_INTERVAL_MAX`.
+//
+// TODO(chhsiao): Make the retry parameters configurable.
+constexpr Duration DEFAULT_RPC_RETRY_BACKOFF_FACTOR = Seconds(10);
+constexpr Duration DEFAULT_RPC_RETRY_INTERVAL_MAX = Minutes(10);
+
+} // namespace csi {
+} // namespace mesos {
+
+#endif // __CSI_CONSTANTS_HPP__
diff --git a/src/csi/v0_volume_manager.cpp b/src/csi/v0_volume_manager.cpp
index e19dc7c..4b056e7 100644
--- a/src/csi/v0_volume_manager.cpp
+++ b/src/csi/v0_volume_manager.cpp
@@ -40,6 +40,7 @@
 #include
 #include

+#include "csi/constants.hpp"
 #include "csi/paths.hpp"
 #include "csi/v0_client.hpp"
 #include "csi/v0_utils.hpp"
@@ -494,7 +495,7 @@ Future VolumeManagerProcess::call(
     const Request& request,
     const bool retry) // Made immutable in the following mutable lambda.
 {
-  Duration maxBackoff = DEFAULT_CSI_RETRY_BACKOFF_FACTOR;
+  Duration maxBackoff = DEFAULT_RPC_RETRY_BACKOFF_FACTOR;

   return process::loop(
       self(),
@@ -514,7 +515,7 @@ Future VolumeManagerProcess::call(
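The retry policy described in the `src/csi/constants.hpp` comment above can be sketched in standalone C++ as follows. This is only an illustration under stated assumptions: `kBackoffFactor`, `kIntervalMax`, `maxBackoff` and `nextDelay` are hypothetical names that do not appear in Mesos, and `std::chrono` stands in for stout's `Duration`. The two constants copy the values from the commit (10 seconds and 10 minutes).

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical stand-ins for DEFAULT_RPC_RETRY_BACKOFF_FACTOR and
// DEFAULT_RPC_RETRY_INTERVAL_MAX; the values mirror the commit.
constexpr std::chrono::seconds kBackoffFactor{10};
constexpr std::chrono::seconds kIntervalMax{600};

// Upper bound of the random delay for the given retry (1-based):
// min(b * 2^(retry - 1), max).
inline std::chrono::seconds maxBackoff(int retry)
{
  assert(retry >= 1);

  std::chrono::seconds bound = kBackoffFactor;
  for (int i = 1; i < retry && bound < kIntervalMax; ++i) {
    bound = std::min(bound * 2, kIntervalMax);
  }

  return bound;
}

// Picks a uniformly random delay in [0, maxBackoff(retry)].
template <typename Rng>
std::chrono::milliseconds nextDelay(int retry, Rng& rng)
{
  const auto bound =
    std::chrono::duration_cast<std::chrono::milliseconds>(maxBackoff(retry));

  std::uniform_int_distribution<long long> dist(0, bound.count());
  return std::chrono::milliseconds(dist(rng));
}
```

With these values the upper bound doubles from 10 seconds on the first retry to 320 seconds on the sixth, and is capped at 600 seconds from the seventh retry onward.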
[mesos] 02/02: Added `reconciliation_interval_seconds` for storage resource providers.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 61eaef09d3df7c5aab9311b571f95a890bcf211b
Author: Chun-Hung Hsiao
AuthorDate: Tue Jun 11 11:33:18 2019 -0700

    Added `reconciliation_interval_seconds` for storage resource providers.

    This new configuration option controls how frequent a storage resource
    provider reconciles existing volumes and storage pools against its CSI
    plugin to detect new or missing disk resources.

    Review: https://reviews.apache.org/r/71144
---
 include/mesos/mesos.proto | 6 ++
 include/mesos/v1/mesos.proto | 6 ++
 src/Makefile.am | 1 +
 src/resource_provider/constants.hpp | 30 ++
 4 files changed, 43 insertions(+)

diff --git a/include/mesos/mesos.proto b/include/mesos/mesos.proto
index cb6d131..8fd838e 100644
--- a/include/mesos/mesos.proto
+++ b/include/mesos/mesos.proto
@@ -1148,6 +1148,12 @@ message ResourceProviderInfo {
   // Storage resource provider related information.
   message Storage {
     required CSIPluginInfo plugin = 1;
+
+    // Amount of time to wait after the resource provider finishes reconciling
+    // existing volumes and storage pools against the CSI plugin to start the
+    // next reconciliation. A non-positive value means that no reconciliation
+    // will happen after startup.
+    optional double reconciliation_interval_seconds = 2;
   }

   optional Storage storage = 6; // EXPERIMENTAL.
diff --git a/include/mesos/v1/mesos.proto b/include/mesos/v1/mesos.proto
index 438c3fe..da19256 100644
--- a/include/mesos/v1/mesos.proto
+++ b/include/mesos/v1/mesos.proto
@@ -1136,6 +1136,12 @@ message ResourceProviderInfo {
   // Storage resource provider related information.
   message Storage {
     required CSIPluginInfo plugin = 1;
+
+    // Amount of time to wait after the resource provider finishes reconciling
+    // existing volumes and storage pools against the CSI plugin to start the
+    // next reconciliation. A non-positive value means that no reconciliation
+    // will happen after startup.
+    optional double reconciliation_interval_seconds = 2;
   }

   optional Storage storage = 6; // EXPERIMENTAL.
diff --git a/src/Makefile.am b/src/Makefile.am
index f0a83a2..2164c60 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -1178,6 +1178,7 @@ libmesos_no_3rdparty_la_SOURCES += \
   oci/spec.cpp \
   posix/rlimits.cpp \
   posix/rlimits.hpp \
+  resource_provider/constants.cpp \
   resource_provider/daemon.cpp \
   resource_provider/daemon.hpp \
   resource_provider/detector.cpp \
diff --git a/src/resource_provider/constants.hpp b/src/resource_provider/constants.hpp
new file mode 100644
index 000..218a9c9
--- /dev/null
+++ b/src/resource_provider/constants.hpp
@@ -0,0 +1,30 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#ifndef __RESOURCE_PROVIDER_CONSTANTS_HPP__
+#define __RESOURCE_PROVIDER_CONSTANTS_HPP__
+
+#include
+
+namespace mesos {
+namespace resource_provider {
+
+constexpr Duration DEFAULT_STORAGE_RECONCILIATION_INTERVAL = Seconds(10);
+
+} // namespace resource_provider {
+} // namespace mesos {
+
+#endif // __RESOURCE_PROVIDER_CONSTANTS_HPP__
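The semantics stated in the `reconciliation_interval_seconds` proto comment above (wait this long after one reconciliation finishes; a non-positive value disables reconciliation after startup) might be expressed as in the C++17 sketch below. Everything here is an assumption made for illustration: `reconciliationDelay` and `kDefaultReconciliationInterval` are hypothetical names, `std::chrono` replaces stout's `Duration`, and falling back to the 10-second default when the field is unset is a guess, since the commit only defines `DEFAULT_STORAGE_RECONCILIATION_INTERVAL` and does not show where it is used.

```cpp
#include <chrono>
#include <optional>

using Interval = std::chrono::duration<double>;

// Hypothetical stand-in for DEFAULT_STORAGE_RECONCILIATION_INTERVAL.
constexpr Interval kDefaultReconciliationInterval{10.0};

// Returns how long to wait after a reconciliation finishes before starting
// the next one, or `std::nullopt` if no further reconciliation should run.
// ASSUMPTION: an unset field falls back to the default interval.
inline std::optional<Interval> reconciliationDelay(
    const std::optional<double>& configuredSeconds)
{
  const Interval interval = configuredSeconds.has_value()
    ? Interval(*configuredSeconds)
    : kDefaultReconciliationInterval;

  // A non-positive value means no reconciliation happens after startup.
  if (interval <= Interval::zero()) {
    return std::nullopt;
  }

  return interval;
}
```

Because the delay is measured from the end of the previous reconciliation, a slow CSI plugin stretches the effective period instead of causing overlapping reconciliations.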
[mesos] branch master updated (7e160a3 -> 61eaef0)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 7e160a3  Added test to verify that Docker executor can override kill policy.
     new 6687817  Moved default constants for CSI RPC retry to a new header.
     new 61eaef0  Added `reconciliation_interval_seconds` for storage resource providers.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 include/mesos/mesos.proto | 6
 include/mesos/v1/mesos.proto | 6
 src/Makefile.am | 4 ++-
 src/csi/{metrics.hpp => constants.hpp} | 32 +-
 src/csi/v0_volume_manager.cpp | 5 ++--
 src/csi/v0_volume_manager_process.hpp | 11
 src/csi/v1_volume_manager.cpp | 5 ++--
 src/csi/v1_volume_manager_process.hpp | 11
 .../constants.hpp} | 13 +
 .../storage_local_resource_provider_tests.cpp | 9 +++---
 10 files changed, 46 insertions(+), 56 deletions(-)
 copy src/csi/{metrics.hpp => constants.hpp} (55%)
 copy src/{common/kernel_version.hpp => resource_provider/constants.hpp} (73%)
[mesos] branch master updated: Added MESOS-9785 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new e37edc0  Added MESOS-9785 to the 1.7.3 CHANGELOG.
e37edc0 is described below

commit e37edc0fc5ad580ed8828a6c88694e9d5429d893
Author: Chun-Hung Hsiao
AuthorDate: Wed May 22 14:45:02 2019 -0700

    Added MESOS-9785 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 7b464f1..c354dd9 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -483,6 +483,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9707] - Calling link::lo() may cause runtime error
   * [MESOS-9750] - Agent V1 GET_STATE response may report a complete executor's tasks as non-terminal after a graceful agent shutdown.
   * [MESOS-9766] - /__processes__ endpoint can hang.
+  * [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers.
   * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
   * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
   * [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.
[mesos] branch 1.7.x updated (511bfc3 -> 8e8c6c0)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 511bfc3  Added MESOS-9870 to the 1.7.3 CHANGELOG.
     new 17dea04  Sequentialized all events to master's `/api/v1` subscribers.
     new c3ed9ae  Notifies master `/api/v1` subscribers about recovered frameworks.
     new 8e8c6c0  Added MESOS-9785 to the 1.7.3 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 1 +
 src/common/protobuf_utils.cpp | 4
 src/master/master.cpp | 22 --
 src/master/master.hpp | 11 +++
 src/tests/api_tests.cpp | 35 +--
 5 files changed, 57 insertions(+), 16 deletions(-)
[mesos] 03/03: Added MESOS-9785 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 8e8c6c0820e1edd954ce07aa530aa64ad323f3d6
Author: Chun-Hung Hsiao
AuthorDate: Wed May 22 14:45:02 2019 -0700

    Added MESOS-9785 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index fcdf1e9..7bee655 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -20,6 +20,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9707] - Calling link::lo() may cause runtime error
   * [MESOS-9750] - Agent V1 GET_STATE response may report a complete executor's tasks as non-terminal after a graceful agent shutdown.
   * [MESOS-9766] - /__processes__ endpoint can hang.
+  * [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers.
   * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
   * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
   * [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.
[mesos] 02/03: Notifies master `/api/v1` subscribers about recovered frameworks.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit c3ed9ae5a0791cbc83cc0679d924da5b7cc6caa1
Author: Chun-Hung Hsiao
AuthorDate: Wed May 15 21:54:20 2019 -0700

    Notifies master `/api/v1` subscribers about recovered frameworks.

    If one subscribes to master's `/api/v1` endpoint after a master failover
    but before an agent reregistration, frameworks recovered through the
    agent registration should be notified to the subscriber, otherwise
    recovered tasks will have framework IDs referring to frameworks unknown
    to the subscriber.

    Review: https://reviews.apache.org/r/70651
---
 src/common/protobuf_utils.cpp | 4
 src/master/master.cpp | 7 +++
 src/tests/api_tests.cpp | 35 +--
 3 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/src/common/protobuf_utils.cpp b/src/common/protobuf_utils.cpp
index b9289a2..138bee6 100644
--- a/src/common/protobuf_utils.cpp
+++ b/src/common/protobuf_utils.cpp
@@ -1211,10 +1211,6 @@ mesos::master::Event createTaskAdded(const Task& task)
 mesos::master::Event createFrameworkAdded(
     const mesos::internal::master::Framework& _framework)
 {
-  CHECK(_framework.active());
-  CHECK(_framework.connected());
-  CHECK(!_framework.recovered());
-
   mesos::master::Event event;
   event.set_type(mesos::master::Event::FRAMEWORK_ADDED);

diff --git a/src/master/master.cpp b/src/master/master.cpp
index 0564025..3f0c8c0 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -10094,6 +10094,13 @@ void Master::recoverFramework(

   Framework* framework = new Framework(this, flags, info);

+  // Send a `FRAMEWORK_ADDED` event to subscribers before adding recovered tasks
+  // so the framework ID referred by any succeeding `TASK_ADDED` event will be
+  // known to subscribers.
+  if (!subscribers.subscribed.empty()) {
+    subscribers.send(protobuf::master::event::createFrameworkAdded(*framework));
+  }
+
   // Add active operations, tasks, and executors to the framework.
   foreachvalue (Slave* slave, slaves.registered) {
     if (slave->tasks.contains(framework->id())) {
diff --git a/src/tests/api_tests.cpp b/src/tests/api_tests.cpp
index d6a8f82..0cfc8e3 100644
--- a/src/tests/api_tests.cpp
+++ b/src/tests/api_tests.cpp
@@ -2539,8 +2539,9 @@ TEST_P(MasterAPITest, SubscribersReceiveHealthUpdates)


 // This test verifies that subscribing to the 'api/v1' endpoint between
-// a master failover and an agent re-registration won't cause the master
-// to crash. See MESOS-8601.
+// a master failover and an agent reregistration won't cause the master
+// to crash, and frameworks recovered through agent reregistration will be
+// broadcast to subscribers. See MESOS-8601 and MESOS-9785.
 TEST_P(MasterAPITest, MasterFailover)
 {
   ContentType contentType = GetParam();
@@ -2691,21 +2692,35 @@ TEST_P(MasterAPITest, MasterFailover)

   AWAIT_READY(slaveReregisteredMessage);

-  // The agent re-registration should result in an `AGENT_ADDED` event
-  // and a `TASK_ADDED` event.
-  set expectedEvents =
-    {v1::master::Event::AGENT_ADDED, v1::master::Event::TASK_ADDED};
-  set observedEvents;
+  // The agent re-registration should result in an `AGENT_ADDED` event,
+  // a `FRAMEWORK_ADDED` event and a `TASK_ADDED` event in order.
+  event = decoder.read();
+  AWAIT_READY(event);
+
+  EXPECT_EQ(v1::master::Event::AGENT_ADDED, event.get()->type());
+  const v1::master::Event::AgentAdded& agentAdded = event.get()->agent_added();
+
+  EXPECT_EQ(agentId, agentAdded.agent().agent_info().id());

   event = decoder.read();
   AWAIT_READY(event);
-  observedEvents.insert(event->get().type());
+
+  EXPECT_EQ(v1::master::Event::FRAMEWORK_ADDED, event.get()->type());
+  const v1::master::Event::FrameworkAdded& frameworkAdded =
+    event.get()->framework_added();
+
+  EXPECT_EQ(frameworkId, frameworkAdded.framework().framework_info().id());
+  EXPECT_FALSE(frameworkAdded.framework().active());
+  EXPECT_FALSE(frameworkAdded.framework().connected());
+  EXPECT_TRUE(frameworkAdded.framework().recovered());

   event = decoder.read();
   AWAIT_READY(event);
-  observedEvents.insert(event->get().type());

-  EXPECT_EQ(expectedEvents, observedEvents);
+  EXPECT_EQ(v1::master::Event::TASK_ADDED, event.get()->type());
+  const v1::master::Event::TaskAdded& taskAdded = event.get()->task_added();
+
+  EXPECT_EQ(task.task_id(), taskAdded.task().task_id());
 }
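The commit message above argues that a subscriber must see `FRAMEWORK_ADDED` before any `TASK_ADDED` that refers to that framework. The consumer-side consequence can be sketched in a few lines of standalone C++. The `Event` struct, `EventType` enum and `countOrphanedTasks` function are hypothetical and deliberately much simpler than the real `mesos::master::Event` protobuf; they exist only to show why the ordering matters.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily simplified model of the master event stream.
enum class EventType { AGENT_ADDED, FRAMEWORK_ADDED, TASK_ADDED };

struct Event
{
  EventType type;
  std::string frameworkId; // Empty for events that carry no framework ID.
};

// Counts `TASK_ADDED` events whose framework had not been announced by an
// earlier `FRAMEWORK_ADDED` event at the time the task event arrived.
inline int countOrphanedTasks(const std::vector<Event>& stream)
{
  std::set<std::string> knownFrameworks;
  int orphaned = 0;

  for (const Event& event : stream) {
    switch (event.type) {
      case EventType::FRAMEWORK_ADDED:
        knownFrameworks.insert(event.frameworkId);
        break;
      case EventType::TASK_ADDED:
        if (knownFrameworks.count(event.frameworkId) == 0) {
          ++orphaned;
        }
        break;
      case EventType::AGENT_ADDED:
        break;
    }
  }

  return orphaned;
}
```

In this model a stream of `AGENT_ADDED`, `FRAMEWORK_ADDED`, `TASK_ADDED` yields no orphaned tasks, while a stream that omits or delays the `FRAMEWORK_ADDED` event yields one, which is the situation MESOS-9785 describes.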
[mesos] 01/03: Sequentialized all events to master's `/api/v1` subscribers.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 17dea045f24c391ce7645a001338ffbc44ec55e6
Author: Chun-Hung Hsiao
AuthorDate: Wed May 22 21:08:09 2019 -0700

    Sequentialized all events to master's `/api/v1` subscribers.

    The master needs to create object approvers before sending an event to
    its `/api/v1` subscribers. The creation calls `process::collect`, which
    does not have any ordering guarantee. As a result, events might be
    reordered, which could be unexpected by subscribers.

    This patch imposes an order between events by sequentializing the
    creation of object approvers. The actual creations can still go in
    parallel, but the returned futures will be completed in the creation
    order.

    Review: https://reviews.apache.org/r/70702
---
 src/master/master.cpp | 15 +--
 src/master/master.hpp | 11 +++
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index 479d56c..0564025 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -12065,9 +12065,8 @@ void Master::Subscribers::send(
   Shared sharedTask(task.isSome() ? new Task(task.get()) : nullptr);

   foreachvalue (const Owned& subscriber, subscribed) {
-    ObjectApprovers::create(
+    subscriber->getApprovers(
         master->authorizer,
-        subscriber->principal,
         {VIEW_ROLE, VIEW_FRAMEWORK, VIEW_TASK, VIEW_EXECUTOR})
       .then(defer(
           master->self(),
@@ -12084,6 +12083,18 @@ void Master::Subscribers::send(
 }


+Future> Master::Subscribers::Subscriber::getApprovers(
+    const Option& authorizer,
+    std::initializer_list actions)
+{
+  Future> approvers =
+    ObjectApprovers::create(authorizer, principal, actions);
+
+  return approversSequence.add>(
+      [approvers] { return approvers; });
+}
+
+
 void Master::Subscribers::Subscriber::send(
     const Shared& event,
     const Owned& approvers,
diff --git a/src/master/master.hpp b/src/master/master.hpp
index 2bfe255..6830e3b 100644
--- a/src/master/master.hpp
+++ b/src/master/master.hpp
@@ -58,6 +58,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -2134,6 +2135,12 @@ private:
     Subscriber(const Subscriber&) = delete;
     Subscriber& operator=(const Subscriber&) = delete;

+    // Creates object approvers. The futures returned by this method will be
+    // completed in the calling order.
+    process::Future> getApprovers(
+        const Option& authorizer,
+        std::initializer_list actions);
+
     // TODO(greggomann): Refactor this function into multiple event-specific
     // overloads. See MESOS-8475.
     void send(
@@ -2158,6 +2165,10 @@ private:
     process::Owned> heartbeater;

     const Option principal;
+
+    // We maintain a sequence to coordinate the creation of object approvers
+    // in order to sequentialize all events to the subscriber.
+    process::Sequence approversSequence;
   };

   // Sends the event to all subscribers connected to the 'api/vX' endpoint.
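The idea in the commit message above (work may finish in any order, but results are released in the order the work was started) can be modelled without libprocess as a small reorder buffer. The sketch below is not `process::Sequence` and does not reproduce its API; `Sequencer`, `begin`, `complete` and `replayOutOfOrder` are hypothetical names, and the model is single-threaded to keep it deterministic.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reorder buffer: results may complete in any order but are
// handed to `deliver` strictly in the order their tickets were issued.
template <typename T>
class Sequencer
{
public:
  using Ticket = std::size_t;

  explicit Sequencer(std::function<void(const T&)> deliver)
    : deliver_(std::move(deliver)) {}

  // Registers a new piece of work and fixes its position in the order.
  Ticket begin() { return nextTicket_++; }

  // Records a finished result, then releases every result that is no
  // longer blocked by an earlier, still unfinished ticket.
  void complete(Ticket ticket, T value)
  {
    done_.emplace(ticket, std::move(value));

    auto it = done_.find(nextToDeliver_);
    while (it != done_.end()) {
      deliver_(it->second);
      done_.erase(it);
      it = done_.find(++nextToDeliver_);
    }
  }

private:
  std::function<void(const T&)> deliver_;
  Ticket nextTicket_ = 0;
  Ticket nextToDeliver_ = 0;
  std::map<Ticket, T> done_;
};

// Usage: three events are started in order, finish out of order, and are
// still delivered in the order they were started.
inline std::vector<std::string> replayOutOfOrder()
{
  std::vector<std::string> delivered;
  Sequencer<std::string> sequencer(
      [&delivered](const std::string& event) { delivered.push_back(event); });

  const auto agent = sequencer.begin();
  const auto framework = sequencer.begin();
  const auto task = sequencer.begin();

  sequencer.complete(task, "TASK_ADDED");           // Held back.
  sequencer.complete(agent, "AGENT_ADDED");         // Delivered immediately.
  sequencer.complete(framework, "FRAMEWORK_ADDED"); // Also releases the task.

  return delivered;
}
```

In the patch itself the per-subscriber `approversSequence` plays this role for the futures returned by `ObjectApprovers::create`; the sketch only mirrors the ordering guarantee, not the asynchronous machinery.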
[mesos] 02/03: Notifies master `/api/v1` subscribers about recovered frameworks.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit a17a536611ca38783bfdf86630664f86562ea98e
Author: Chun-Hung Hsiao
AuthorDate: Wed May 15 21:54:20 2019 -0700

    Notifies master `/api/v1` subscribers about recovered frameworks.

    If one subscribes to master's `/api/v1` endpoint after a master failover
    but before an agent reregistration, frameworks recovered through the
    agent registration should be notified to the subscriber, otherwise
    recovered tasks will have framework IDs referring to frameworks unknown
    to the subscriber.

    Review: https://reviews.apache.org/r/70651
---
 src/common/protobuf_utils.cpp | 4
 src/master/master.cpp | 7 +++
 src/tests/api_tests.cpp | 40 ++--
 3 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/src/common/protobuf_utils.cpp b/src/common/protobuf_utils.cpp
index fc67c38..7778e7f 100644
--- a/src/common/protobuf_utils.cpp
+++ b/src/common/protobuf_utils.cpp
@@ -1438,10 +1438,6 @@ mesos::master::Event createTaskAdded(const Task& task)
 mesos::master::Event createFrameworkAdded(
     const mesos::internal::master::Framework& _framework)
 {
-  CHECK(_framework.active());
-  CHECK(_framework.connected());
-  CHECK(!_framework.recovered());
-
   mesos::master::Event event;
   event.set_type(mesos::master::Event::FRAMEWORK_ADDED);

diff --git a/src/master/master.cpp b/src/master/master.cpp
index ace05b2..fbde112 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -10893,6 +10893,13 @@ void Master::recoverFramework(

   Framework* framework = new Framework(this, flags, info);

+  // Send a `FRAMEWORK_ADDED` event to subscribers before adding recovered tasks
+  // so the framework ID referred by any succeeding `TASK_ADDED` event will be
+  // known to subscribers.
+  if (!subscribers.subscribed.empty()) {
+    subscribers.send(protobuf::master::event::createFrameworkAdded(*framework));
+  }
+
   // Add active operations, tasks, and executors to the framework.
   foreachvalue (Slave* slave, slaves.registered) {
     if (slave->tasks.contains(framework->id())) {
diff --git a/src/tests/api_tests.cpp b/src/tests/api_tests.cpp
index 561ff20..3479ed3 100644
--- a/src/tests/api_tests.cpp
+++ b/src/tests/api_tests.cpp
@@ -2630,8 +2630,9 @@ TEST_P(MasterAPITest, SubscribersReceiveHealthUpdates)


 // This test verifies that subscribing to the 'api/v1' endpoint between
-// a master failover and an agent re-registration won't cause the master
-// to crash. See MESOS-8601.
+// a master failover and an agent reregistration won't cause the master
+// to crash, and frameworks recovered through agent reregistration will be
+// broadcast to subscribers. See MESOS-8601 and MESOS-9785.
 TEST_P(MasterAPITest, MasterFailover)
 {
   ContentType contentType = GetParam();
@@ -2733,15 +2734,24 @@ TEST_P(MasterAPITest, MasterFailover)
   EXPECT_CALL(subscriber, subscribed(_))
     .WillOnce(FutureArg<0>());

-  // The agent re-registration should result in an `AGENT_ADDED` event
-  // and a `TASK_ADDED` event.
-  Future taskAdded;
-  EXPECT_CALL(subscriber, taskAdded(_))
-    .WillOnce(FutureSatisfy());
+  // The agent re-registration should result in an `AGENT_ADDED` event,
+  // a `FRAMEWORK_ADDED` event and a `TASK_ADDED` event in order.
+  Sequence masterEventsSequence;

-  Future agentAdded;
+  Future agentAdded;
   EXPECT_CALL(subscriber, agentAdded(_))
-    .WillOnce(FutureSatisfy());
+    .InSequence(masterEventsSequence)
+    .WillOnce(FutureArg<0>());
+
+  Future frameworkAdded;
+  EXPECT_CALL(subscriber, frameworkAdded(_))
+    .InSequence(masterEventsSequence)
+    .WillOnce(FutureArg<0>());
+
+  Future taskAdded;
+  EXPECT_CALL(subscriber, taskAdded(_))
+    .InSequence(masterEventsSequence)
+    .WillOnce(FutureArg<0>());

   // Create event stream after the master failover but before the agent
   // re-registration. We should see no framework, agent, task and
@@ -2764,8 +2774,18 @@ TEST_P(MasterAPITest, MasterFailover)
   Clock::advance(slaveFlags.registration_backoff_factor);
   AWAIT_READY(slaveReregisteredMessage);
-  AWAIT_READY(taskAdded);
+
   AWAIT_READY(agentAdded);
+  EXPECT_EQ(agentId, agentAdded->agent().agent_info().id());
+
+  AWAIT_READY(frameworkAdded);
+  EXPECT_EQ(frameworkId, frameworkAdded->framework().framework_info().id());
+  EXPECT_FALSE(frameworkAdded->framework().active());
+  EXPECT_FALSE(frameworkAdded->framework().connected());
+  EXPECT_TRUE(frameworkAdded->framework().recovered());
+
+  AWAIT_READY(taskAdded);
+  EXPECT_EQ(task.task_id(), taskAdded->task().task_id());
 }
[mesos] 01/03: Sequentialized all events to master's `/api/v1` subscribers.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit dece081d33c8a6d4e1b56589ec5d4e16013d6643
Author: Chun-Hung Hsiao
AuthorDate: Wed May 22 21:08:09 2019 -0700

    Sequentialized all events to master's `/api/v1` subscribers.

    The master needs to create object approvers before sending an event to
    its `/api/v1` subscribers. The creation calls `process::collect`, which
    does not have any ordering guarantee. As a result, events might be
    reordered, which could be unexpected by subscribers.

    This patch imposes an order between events by sequentializing the
    creation of object approvers. The actual creations can still go in
    parallel, but the returned futures will be completed in the creation
    order.

    Review: https://reviews.apache.org/r/70702
---
 src/master/master.cpp | 15 +--
 src/master/master.hpp | 11 +++
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index f1ab034..06a89bc 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -12792,9 +12792,8 @@ void Master::Subscribers::send(
   Shared sharedTask(task.isSome() ? new Task(task.get()) : nullptr);

   foreachvalue (const Owned& subscriber, subscribed) {
-    ObjectApprovers::create(
+    subscriber->getApprovers(
         master->authorizer,
-        subscriber->principal,
         {VIEW_ROLE, VIEW_FRAMEWORK, VIEW_TASK, VIEW_EXECUTOR})
       .then(defer(
           master->self(),
@@ -12811,6 +12810,18 @@ void Master::Subscribers::send(
 }


+Future> Master::Subscribers::Subscriber::getApprovers(
+    const Option& authorizer,
+    std::initializer_list actions)
+{
+  Future> approvers =
+    ObjectApprovers::create(authorizer, principal, actions);
+
+  return approversSequence.add>(
+      [approvers] { return approvers; });
+}
+
+
 void Master::Subscribers::Subscriber::send(
     const Shared& event,
     const Owned& approvers,
diff --git a/src/master/master.hpp b/src/master/master.hpp
index 94891af..ed83167 100644
--- a/src/master/master.hpp
+++ b/src/master/master.hpp
@@ -49,6 +49,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -2140,6 +2141,12 @@ private:
     Subscriber(const Subscriber&) = delete;
     Subscriber& operator=(const Subscriber&) = delete;

+    // Creates object approvers. The futures returned by this method will be
+    // completed in the calling order.
+    process::Future> getApprovers(
+        const Option& authorizer,
+        std::initializer_list actions);
+
     // TODO(greggomann): Refactor this function into multiple event-specific
     // overloads. See MESOS-8475.
     void send(
@@ -2160,6 +2167,10 @@ private:
     StreamingHttpConnection http;
     ResponseHeartbeater heartbeater;
     const Option principal;
+
+    // We maintain a sequence to coordinate the creation of object approvers
+    // in order to sequentialize all events to the subscriber.
+    process::Sequence approversSequence;
   };

   // Sends the event to all subscribers connected to the 'api/vX' endpoint.
[mesos] branch 1.8.x updated (4ae0644 -> edaf639)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 4ae0644  Updated CHANGELOG for 1.8.1 release.
     new dece081  Sequentialized all events to master's `/api/v1` subscribers.
     new c889ba6  Notifies master `/api/v1` subscribers about recovered frameworks.
     new edaf639  Added MESOS-9785 to the 1.8.2 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 9 +
 src/common/protobuf_utils.cpp | 4
 src/master/master.cpp | 22 --
 src/master/master.hpp | 11 +++
 src/tests/api_tests.cpp | 35 +--
 5 files changed, 65 insertions(+), 16 deletions(-)
[mesos] 01/03: Sequentialized all events to master's `/api/v1` subscribers.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 586dba870858129e5f6eaac58b7e1eb118196e26 Author: Chun-Hung Hsiao AuthorDate: Wed May 22 21:08:09 2019 -0700 Sequentialized all events to master's `/api/v1` subscribers. The master needs to create object approvers before sending an event to its `/api/v1` subscribers. The creation calls `process::collect`, which does not have any ordering guarantee. As a result, events might be reordered, which could be unexpected by subscribers. This patch imposes an order between events by sequentializing the creation of object approvers. The actual creations can still go in parallel, but the returned futures will be completed in the creation order. Review: https://reviews.apache.org/r/70702 --- src/master/master.cpp | 15 +-- src/master/master.hpp | 11 +++ 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/src/master/master.cpp b/src/master/master.cpp index f1ca637..ace05b2 100644 --- a/src/master/master.cpp +++ b/src/master/master.cpp @@ -13177,9 +13177,8 @@ void Master::Subscribers::send( Shared sharedTask(task.isSome() ? 
new Task(task.get()) : nullptr); foreachvalue (const Owned& subscriber, subscribed) { -ObjectApprovers::create( +subscriber->getApprovers( master->authorizer, -subscriber->principal, {VIEW_ROLE, VIEW_FRAMEWORK, VIEW_TASK, VIEW_EXECUTOR}) .then(defer( master->self(), @@ -13196,6 +13195,18 @@ void Master::Subscribers::send( } +Future> Master::Subscribers::Subscriber::getApprovers( +const Option& authorizer, +std::initializer_list actions) +{ + Future> approvers = +ObjectApprovers::create(authorizer, principal, actions); + + return approversSequence.add>( + [approvers] { return approvers; }); +} + + void Master::Subscribers::Subscriber::send( const Shared& event, const Owned& approvers, diff --git a/src/master/master.hpp b/src/master/master.hpp index ffa7423..5c229c5 100644 --- a/src/master/master.hpp +++ b/src/master/master.hpp @@ -50,6 +50,7 @@ #include #include #include +#include #include #include @@ -2265,6 +2266,12 @@ private: Subscriber(const Subscriber&) = delete; Subscriber& operator=(const Subscriber&) = delete; + // Creates object approvers. The futures returned by this method will be + // completed in the calling order. + process::Future> getApprovers( + const Option& authorizer, + std::initializer_list actions); + // TODO(greggomann): Refactor this function into multiple event-specific // overloads. See MESOS-8475. void send( @@ -2285,6 +2292,10 @@ private: StreamingHttpConnection http; ResponseHeartbeater heartbeater; const Option principal; + + // We maintain a sequence to coordinate the creation of object approvers + // in order to sequentialize all events to the subscriber. + process::Sequence approversSequence; }; // Sends the event to all subscribers connected to the 'api/vX' endpoint.
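A note on the ordering idea in the commit message above: the work for each event may finish in any order, but results are released strictly in the order the requests were made. The following is a minimal, single-threaded sketch of that idea only. `OrderedDelivery` and `deliverOutOfOrderCompletions` are hypothetical names invented for illustration; the sketch does not use libprocess and is not how `process::Sequence` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a Mesos or libprocess class): hands out tickets in
// calling order and releases results strictly in ticket order, even when the
// underlying work finishes out of order.
class OrderedDelivery
{
public:
  using Callback = std::function<void(const std::string&)>;

  // Registers a pending item and returns its ticket, i.e. its position in
  // the calling order.
  std::size_t enqueue(Callback callback)
  {
    callbacks.push_back(std::move(callback));
    return callbacks.size() - 1;
  }

  // Marks the work for `ticket` as finished. The result is buffered until
  // every earlier ticket has been delivered.
  void complete(std::size_t ticket, std::string result)
  {
    assert(ticket < callbacks.size());
    ready.emplace(ticket, std::move(result));

    // Flush the longest contiguous run of finished tickets starting at `next`.
    for (auto it = ready.find(next); it != ready.end(); it = ready.find(next)) {
      callbacks[next](it->second);
      ready.erase(it);
      ++next;
    }
  }

private:
  std::vector<Callback> callbacks;
  std::map<std::size_t, std::string> ready;
  std::size_t next = 0;
};

// Simulates three events whose preparatory work finishes in the order
// third, first, second, and returns the order in which they are delivered.
std::vector<std::string> deliverOutOfOrderCompletions()
{
  std::vector<std::string> delivered;
  OrderedDelivery sequence;

  auto record = [&delivered](const std::string& event) {
    delivered.push_back(event);
  };

  const std::size_t first = sequence.enqueue(record);
  const std::size_t second = sequence.enqueue(record);
  const std::size_t third = sequence.enqueue(record);

  sequence.complete(third, "TASK_ADDED");
  sequence.complete(first, "AGENT_ADDED");
  sequence.complete(second, "FRAMEWORK_ADDED");

  return delivered;
}
```

In this sketch the event completed first (`TASK_ADDED`) is held back until the two earlier tickets have been delivered, which mirrors the guarantee described in the commit message: parallel creation, in-order completion.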
[mesos] 03/03: Added MESOS-9785 to the 1.8.2 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit edaf639d5e5d9af368bb986b331186a4cae4edc5
Author: Chun-Hung Hsiao
AuthorDate: Wed May 22 14:44:28 2019 -0700

    Added MESOS-9785 to the 1.8.2 CHANGELOG.
---
 CHANGELOG | 9 +
 1 file changed, 9 insertions(+)

diff --git a/CHANGELOG b/CHANGELOG
index d03be88..248e382 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,11 @@
+Release Notes - Mesos - Version 1.8.2 (WIP)
+---
+* This is a bug fix release.
+
+** Bug
+  * [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers.
+
+
 Release Notes - Mesos - Version 1.8.1
 -
 * This is a bug fix release.
@@ -22,6 +30,7 @@ Release Notes - Mesos - Version 1.8.1
   * [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
   * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.

+
 Release Notes - Mesos - Version 1.8.0
 -
 This release contains the following highlights:
[mesos] 03/03: Added MESOS-9785 to the 1.8.2 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit b96268cdcfd765152ef95ba64ca3d5260c4e384c
Author: Chun-Hung Hsiao
AuthorDate: Wed May 22 14:44:28 2019 -0700

    Added MESOS-9785 to the 1.8.2 CHANGELOG.
---
 CHANGELOG | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/CHANGELOG b/CHANGELOG
index 164465a..7b464f1 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -34,6 +34,15 @@ Additional API Changes:
 NOTE: This new overload is only available when libprocess is compiled with `--enable-ssl`.
+
+Release Notes - Mesos - Version 1.8.2 (WIP)
+---
+* This is a bug fix release.
+
+** Bug
+  * [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers.
+
+
 Release Notes - Mesos - Version 1.8.1 (WIP)
 ---
 * This is a bug fix release.
@@ -57,6 +66,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP)
 ** Improvement
   * [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.

+
 Release Notes - Mesos - Version 1.8.0
 -
 This release contains the following highlights:
[mesos] branch master updated (af54474 -> b96268c)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from af54474  Windows: Fixed exclusion of GID and Docker spec headers.
     new 586dba8  Sequentialized all events to master's `/api/v1` subscribers.
     new a17a536  Notifies master `/api/v1` subscribers about recovered frameworks.
     new b96268c  Added MESOS-9785 to the 1.8.2 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG                     | 10 ++
 src/common/protobuf_utils.cpp |  4
 src/master/master.cpp         | 22 --
 src/master/master.hpp         | 11 +++
 src/tests/api_tests.cpp       | 40 ++--
 5 files changed, 71 insertions(+), 16 deletions(-)
[mesos] 02/03: Notifies master `/api/v1` subscribers about recovered frameworks.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit c889ba6b7818e8eea8b2a11c442bac81b6a237a2 Author: Chun-Hung Hsiao AuthorDate: Wed May 15 21:54:20 2019 -0700 Notifies master `/api/v1` subscribers about recovered frameworks. If one subscribes to master's `/api/v1` endpoint after a master failover but before an agent reregistration, frameworks recovered through the agent registration should be notified to the subscriber, otherwise recovered tasks will have framework IDs referring to frameworks unknown to the subscriber. Review: https://reviews.apache.org/r/70651 --- src/common/protobuf_utils.cpp | 4 src/master/master.cpp | 7 +++ src/tests/api_tests.cpp | 35 +-- 3 files changed, 32 insertions(+), 14 deletions(-) diff --git a/src/common/protobuf_utils.cpp b/src/common/protobuf_utils.cpp index 8b252cb..6a93ac7 100644 --- a/src/common/protobuf_utils.cpp +++ b/src/common/protobuf_utils.cpp @@ -1337,10 +1337,6 @@ mesos::master::Event createTaskAdded(const Task& task) mesos::master::Event createFrameworkAdded( const mesos::internal::master::Framework& _framework) { - CHECK(_framework.active()); - CHECK(_framework.connected()); - CHECK(!_framework.recovered()); - mesos::master::Event event; event.set_type(mesos::master::Event::FRAMEWORK_ADDED); diff --git a/src/master/master.cpp b/src/master/master.cpp index 06a89bc..5488b7b 100644 --- a/src/master/master.cpp +++ b/src/master/master.cpp @@ -10521,6 +10521,13 @@ void Master::recoverFramework( Framework* framework = new Framework(this, flags, info); + // Send a `FRAMEWORK_ADDED` event to subscribers before adding recovered tasks + // so the framework ID referred by any succeeding `TASK_ADDED` event will be + // known to subscribers. 
+ if (!subscribers.subscribed.empty()) { + subscribers.send(protobuf::master::event::createFrameworkAdded(*framework)); + } + // Add active operations, tasks, and executors to the framework. foreachvalue (Slave* slave, slaves.registered) { if (slave->tasks.contains(framework->id())) { diff --git a/src/tests/api_tests.cpp b/src/tests/api_tests.cpp index 4850ba6..539c704 100644 --- a/src/tests/api_tests.cpp +++ b/src/tests/api_tests.cpp @@ -2711,8 +2711,9 @@ TEST_P(MasterAPITest, SubscribersReceiveHealthUpdates) // This test verifies that subscribing to the 'api/v1' endpoint between -// a master failover and an agent re-registration won't cause the master -// to crash. See MESOS-8601. +// a master failover and an agent reregistration won't cause the master +// to crash, and frameworks recovered through agent reregistration will be +// broadcast to subscribers. See MESOS-8601 and MESOS-9785. TEST_P(MasterAPITest, MasterFailover) { ContentType contentType = GetParam(); @@ -2863,21 +2864,35 @@ TEST_P(MasterAPITest, MasterFailover) AWAIT_READY(slaveReregisteredMessage); - // The agent re-registration should result in an `AGENT_ADDED` event - // and a `TASK_ADDED` event. - set expectedEvents = -{v1::master::Event::AGENT_ADDED, v1::master::Event::TASK_ADDED}; - set observedEvents; + // The agent re-registration should result in an `AGENT_ADDED` event, + // a `FRAMEWORK_ADDED` event and a `TASK_ADDED` event in order. 
+ event = decoder.read(); + AWAIT_READY(event); + + EXPECT_EQ(v1::master::Event::AGENT_ADDED, event.get()->type()); + const v1::master::Event::AgentAdded& agentAdded = event.get()->agent_added(); + + EXPECT_EQ(agentId, agentAdded.agent().agent_info().id()); event = decoder.read(); AWAIT_READY(event); - observedEvents.insert(event->get().type()); + + EXPECT_EQ(v1::master::Event::FRAMEWORK_ADDED, event.get()->type()); + const v1::master::Event::FrameworkAdded& frameworkAdded = +event.get()->framework_added(); + + EXPECT_EQ(frameworkId, frameworkAdded.framework().framework_info().id()); + EXPECT_FALSE(frameworkAdded.framework().active()); + EXPECT_FALSE(frameworkAdded.framework().connected()); + EXPECT_TRUE(frameworkAdded.framework().recovered()); event = decoder.read(); AWAIT_READY(event); - observedEvents.insert(event->get().type()); - EXPECT_EQ(expectedEvents, observedEvents); + EXPECT_EQ(v1::master::Event::TASK_ADDED, event.get()->type()); + const v1::master::Event::TaskAdded& taskAdded = event.get()->task_added(); + + EXPECT_EQ(task.task_id(), taskAdded.task().task_id()); }
[mesos] branch master updated: Fixed a race between status updates and acknowledgements in SLRP tests.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new af6ef2c Fixed a race between status updates and acknowledgements in SLRP tests. af6ef2c is described below commit af6ef2cca58993c4c99b3a10509d1036818d3f79 Author: Chun-Hung Hsiao AuthorDate: Fri Jul 5 16:51:47 2019 -0700 Fixed a race between status updates and acknowledgements in SLRP tests. SLRP tests `RetryOperationStatusUpdate*` check that no operation update will be sent by SLRP once acknowledgements are sent by master. However, since the acknowledgements are delivered via HTTP, it is possible that operation status updates race with acknowledgements that are still in HTTP pipes. This patch fixes this race condition. Review: https://reviews.apache.org/r/71018 --- .../storage_local_resource_provider_tests.cpp | 36 ++ 1 file changed, 36 insertions(+) diff --git a/src/tests/storage_local_resource_provider_tests.cpp b/src/tests/storage_local_resource_provider_tests.cpp index 3823305..6986126 100644 --- a/src/tests/storage_local_resource_provider_tests.cpp +++ b/src/tests/storage_local_resource_provider_tests.cpp @@ -44,6 +44,7 @@ #include #include #include +#include #include @@ -59,6 +60,8 @@ #include "module/manager.hpp" +#include "messages/messages.hpp" + #include "slave/container_daemon_process.hpp" #include "slave/paths.hpp" #include "slave/state.hpp" @@ -67,6 +70,8 @@ #include "slave/containerizer/mesos/containerizer.hpp" +#include "status_update_manager/status_update_manager_process.hpp" + #include "tests/disk_profile_server.hpp" #include "tests/environment.hpp" #include "tests/flags.hpp" @@ -4330,10 +4335,21 @@ TEST_P(StorageLocalResourceProviderTest, RetryOperationStatusUpdate) FUTURE_PROTOBUF( AcknowledgeOperationStatusMessage(), master.get()->pid, slave.get()->pid); + // Since the acknowledgement is delivered to the 
SLRP via HTTP, we wait for + // a dispatch event to ensure that the acknowledgement is received by SLRP. + Future statusUpdateManagerAcknowledgement = FUTURE_DISPATCH( + _, + (< + id::UUID, + UpdateOperationStatusRecord, + UpdateOperationStatusMessage>::acknowledgement)); + Clock::advance(slave::STATUS_UPDATE_RETRY_INTERVAL_MIN); AWAIT_READY(retriedUpdateOperationStatusMessage); + AWAIT_READY(acknowledgeOperationStatusMessage); + AWAIT_READY(statusUpdateManagerAcknowledgement); // The master acknowledged the operation status update, so the SLRP shouldn't // send further operation status updates. @@ -4491,6 +4507,15 @@ TEST_P( Future acknowledgeOperationStatusMessage = FUTURE_PROTOBUF(AcknowledgeOperationStatusMessage(), master.get()->pid, _); + // Since the acknowledgement is delivered to the SLRP via HTTP, we wait for + // a dispatch event to ensure that the acknowledgement is received by SLRP. + Future statusUpdateManagerAcknowledgement = FUTURE_DISPATCH( + _, + (< + id::UUID, + UpdateOperationStatusRecord, + UpdateOperationStatusMessage>::acknowledgement)); + slave = StartSlave(detector.get(), flags); ASSERT_SOME(slave); @@ -4512,6 +4537,7 @@ TEST_P( AWAIT_READY(retriedUpdateOperationStatusMessage); AWAIT_READY(acknowledgeOperationStatusMessage); + AWAIT_READY(statusUpdateManagerAcknowledgement); // The master has acknowledged the operation status update, so the SLRP // shouldn't send further operation status updates. @@ -5523,10 +5549,20 @@ TEST_P(StorageLocalResourceProviderTest, RetryOperationStatusUpdateToScheduler) FUTURE_PROTOBUF( AcknowledgeOperationStatusMessage(), master.get()->pid, slave.get()->pid); + // Since the acknowledgement is delivered to the SLRP via HTTP, we wait for + // a dispatch event to ensure that the acknowledgement is received by SLRP. 
+ Future statusUpdateManagerAcknowledgement = FUTURE_DISPATCH( + _, + (< + id::UUID, + UpdateOperationStatusRecord, + UpdateOperationStatusMessage>::acknowledgement)); + mesos.send(v1::createCallAcknowledgeOperationStatus( frameworkId, offer.agent_id(), resourceProviderId.get(), update.get())); AWAIT_READY(acknowledgeOperationStatusMessage); + AWAIT_READY(statusUpdateManagerAcknowledgement); // Verify that the retry was only counted as one operation. EXPECT_TRUE(metricEquals("master/operations/finished", 1));
[mesos] 02/02: Added MESOS-9803 to the 1.7.3 CHANNGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 149d2e0c9d7cd9e1641438a968765df2aed68f74
Author: Chun-Hung Hsiao
AuthorDate: Wed Jun 5 15:13:26 2019 -0700

    Added MESOS-9803 to the 1.7.3 CHANNGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 24df654..f3dbcdf 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -21,6 +21,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9766] - /__processes__ endpoint can hang.
   * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
   * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
+  * [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.

 ** Improvements
   * [MESOS-8880] - Add minimum capabilities in the master.
[mesos] branch 1.7.x updated (33a1ba9 -> 149d2e0)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 33a1ba9  Added MESOS-9750 to the 1.7.3 CHANGELOG.
     new bc40604  Fixed chaining futures infinitely in `UriDiskProfileAdaptor`.
     new 149d2e0  Added MESOS-9803 to the 1.7.3 CHANNGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG                                |  1 +
 .../storage/uri_disk_profile_adaptor.cpp | 55 +++---
 .../storage/uri_disk_profile_adaptor.hpp | 22 ++---
 3 files changed, 56 insertions(+), 22 deletions(-)
[mesos] 01/02: Fixed chaining futures infinitely in `UriDiskProfileAdaptor`.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit bc406048928095c21c0e1c8389ce60ab5549e84c Author: Chun-Hung Hsiao AuthorDate: Thu May 30 15:13:44 2019 -0700 Fixed chaining futures infinitely in `UriDiskProfileAdaptor`. Previously it is possible to have an infinite chain of futures when `UriDiskProfileAdaptor::watch` is called: if the set of profiles remains fixed for every poll, each poll would satisfy a promise that triggers an asynchronous recursive call to `UriDiskProfileAdaptor::watch` again. This patch fixes the problem by removing the asynchronous recursion. Instead, we maintain a separated promise for each watcher that is never associated to another promise. After each poll, we check if the current set of profiles differs from the known set for a watcher, and satisfy its own promise if so. Review: https://reviews.apache.org/r/70766 --- .../storage/uri_disk_profile_adaptor.cpp | 55 +++--- .../storage/uri_disk_profile_adaptor.hpp | 22 ++--- 2 files changed, 55 insertions(+), 22 deletions(-) diff --git a/src/resource_provider/storage/uri_disk_profile_adaptor.cpp b/src/resource_provider/storage/uri_disk_profile_adaptor.cpp index cb574be..dd0653d 100644 --- a/src/resource_provider/storage/uri_disk_profile_adaptor.cpp +++ b/src/resource_provider/storage/uri_disk_profile_adaptor.cpp @@ -16,6 +16,7 @@ #include "resource_provider/storage/uri_disk_profile_adaptor.hpp" +#include #include #include #include @@ -102,8 +103,7 @@ Future> UriDiskProfileAdaptor::watch( UriDiskProfileAdaptorProcess::UriDiskProfileAdaptorProcess( const UriDiskProfileAdaptor::Flags& _flags) : ProcessBase(ID::generate("uri-disk-profile-adaptor")), -flags(_flags), -watchPromise(new Promise()) {} +flags(_flags) {} void UriDiskProfileAdaptorProcess::initialize() @@ -142,24 +142,24 @@ Future> UriDiskProfileAdaptorProcess::watch( const hashset& knownProfiles, const 
ResourceProviderInfo& resourceProviderInfo) { - // Calculate the new set of profiles for the resource provider. - hashset newProfiles; - foreachpair (const string& profile, - const ProfileRecord& record, - profileMatrix) { + // Calculate the current set of profiles for the resource provider. + hashset currentProfiles; + foreachpair ( + const string& profile, const ProfileRecord& record, profileMatrix) { if (record.active && isSelectedResourceProvider(record.manifest, resourceProviderInfo)) { - newProfiles.insert(profile); + currentProfiles.insert(profile); } } - if (newProfiles != knownProfiles) { -return newProfiles; + if (currentProfiles != knownProfiles) { +return currentProfiles; } // Wait for the next update if there is no change. - return watchPromise->future() -.then(defer(self(), ::watch, knownProfiles, resourceProviderInfo)); + watchers.emplace_back(knownProfiles, resourceProviderInfo); + + return watchers.back().promise.future(); } @@ -274,12 +274,35 @@ void UriDiskProfileAdaptorProcess::notify( profileMatrix.put(entry.first, {entry.second, true}); } - // Notify any watchers and then prepare a new promise for the next - // iteration of polling. + // Notify a watcher if its current set of profiles differs from its known set. // // TODO(josephw): Delay this based on the `--max_random_wait` option. - watchPromise->set(Nothing()); - watchPromise.reset(new Promise()); + foreach (WatcherData& watcher, watchers) { +hashset current; +foreachpair ( +const string& profile, const ProfileRecord& record, profileMatrix) { + if (record.active && + isSelectedResourceProvider(record.manifest, watcher.info)) { +current.insert(profile); + } +} + +if (current != watcher.known) { + CHECK(watcher.promise.set(current)) +<< "Promise for watcher '" << watcher.info << "' is already " +<< watcher.promise.future(); +} + } + + // Remove all notified watchers. 
+ watchers.erase( + std::remove_if( + watchers.begin(), + watchers.end(), + [](const WatcherData& watcher) { +return watcher.promise.future().isReady(); + }), + watchers.end()); LOG(INFO) << "Updated disk profile mapping to " << parsed.profile_matrix().size() diff --git a/src/resource_provider/storage/uri_disk_profile_adaptor.hpp b/src/resource_provider/storage/uri_disk_profile_adaptor.hpp index 7e610d3..86287ae 100644 --- a/src/resource_provider/storage/uri_disk_profile_adaptor.hpp +++ b/src/resource_provider/storage/uri_disk_profile_ada
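The fix described in the commit message above replaces a shared promise plus asynchronous recursion with one promise per watcher, satisfied only when that watcher's view actually changes. A minimal standalone sketch of that pattern follows. It is an illustration under stated assumptions: `Slot` and `ProfileWatchers` are hypothetical stand-ins (a plain struct replaces libprocess promises and futures), everything runs on a single thread, and the per-resource-provider filtering done by `isSelectedResourceProvider` in the real code is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Profiles = std::set<std::string>;

// Hypothetical, single-threaded stand-in for a promise/future pair.
struct Slot
{
  Slot() : ready(false) {}

  bool ready;
  Profiles value;
};

// Hypothetical model of the fixed design: each watcher owns its own slot,
// which is filled at most once and never chained to another slot.
class ProfileWatchers
{
public:
  // Returns a filled slot if the current profiles already differ from
  // `known`; otherwise parks a watcher until a later poll reports a change.
  std::shared_ptr<Slot> watch(const Profiles& known)
  {
    std::shared_ptr<Slot> slot = std::make_shared<Slot>();

    if (current != known) {
      slot->ready = true;
      slot->value = current;
    } else {
      watchers.push_back(Watcher{known, slot});
    }

    return slot;
  }

  // Called after each poll. Only watchers whose known set differs from the
  // polled set are notified and removed; the rest stay parked, so a poll
  // that changes nothing allocates nothing.
  void notify(const Profiles& polled)
  {
    current = polled;

    std::vector<Watcher> pending;
    for (Watcher& watcher : watchers) {
      if (watcher.known != current) {
        assert(!watcher.slot->ready);
        watcher.slot->ready = true;
        watcher.slot->value = current;
      } else {
        pending.push_back(std::move(watcher));
      }
    }

    watchers = std::move(pending);
  }

  std::size_t pendingWatchers() const { return watchers.size(); }

private:
  struct Watcher
  {
    Profiles known;
    std::shared_ptr<Slot> slot;
  };

  Profiles current;
  std::vector<Watcher> watchers;
};
```

With this shape, repeated polls that return the same profiles leave exactly one parked watcher per caller, rather than growing a chain by one link per poll.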
[mesos] 05/06: Garbage-collected disappeared RPs when agent resources remain unchanged.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 9ad93574b54841b2dd30df58916159a0125a70b1 Author: Chun-Hung Hsiao AuthorDate: Wed Jun 5 21:47:07 2019 -0700 Garbage-collected disappeared RPs when agent resources remain unchanged. Previously when there is a missing resource provider in an `UpdateSlaveMessage` but the agent's total resources remain unchanged, the update message will be completely ignored, so the missing resource provider will still be cached in the master's state, which is not the desired behavior. This patch ensures that the master's state gets updated if any resource provider is missing. Review: https://reviews.apache.org/r/70788 --- src/master/master.cpp | 8 src/tests/api_tests.cpp | 99 + 2 files changed, 107 insertions(+) diff --git a/src/master/master.cpp b/src/master/master.cpp index 555136e..f1ab034 100644 --- a/src/master/master.cpp +++ b/src/master/master.cpp @@ -8162,6 +8162,8 @@ void Master::updateSlave(UpdateSlaveMessage&& message) // Check if resource provider information changed. 
if (!updated && message.has_resource_providers()) { +hashset receivedResourceProviders; + foreach ( const UpdateSlaveMessage::ResourceProvider& receivedProvider, message.resource_providers().providers()) { @@ -8171,6 +8173,8 @@ void Master::updateSlave(UpdateSlaveMessage&& message) const ResourceProviderID& resourceProviderId = receivedProvider.info().id(); + receivedResourceProviders.insert(resourceProviderId); + if (!slave->resourceProviders.contains(resourceProviderId)) { updated = true; break; @@ -8201,6 +8205,10 @@ void Master::updateSlave(UpdateSlaveMessage&& message) } } } + +if (slave->resourceProviders.keys() != receivedResourceProviders) { + updated = true; +} } if (!updated) { diff --git a/src/tests/api_tests.cpp b/src/tests/api_tests.cpp index 60fdd11..4850ba6 100644 --- a/src/tests/api_tests.cpp +++ b/src/tests/api_tests.cpp @@ -274,6 +274,105 @@ TEST_P(MasterAPITest, GetAgents) } +// This test verifies that if a resource provider becomes disconnected, it will +// not be reported by `GET_AGENT` calls. +TEST_P(MasterAPITest, GetAgentsDisconnectedResourceProvider) +{ + Clock::pause(); + + const ContentType contentType = GetParam(); + + master::Flags masterFlags = CreateMasterFlags(); + Try> master = this->StartMaster(masterFlags); + ASSERT_SOME(master); + + Owned detector = master.get()->createDetector(); + + // Start one agent. + Future updateSlaveMessage = +FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _); + + slave::Flags slaveFlags = CreateSlaveFlags(); + Try> slave = StartSlave(detector.get(), slaveFlags); + ASSERT_SOME(slave); + + Clock::settle(); + Clock::advance(slaveFlags.registration_backoff_factor); + + AWAIT_READY(updateSlaveMessage); + ASSERT_TRUE(updateSlaveMessage->resource_providers().providers().empty()); + + // Start a resource provider. 
+ mesos::v1::ResourceProviderInfo info; + info.set_type("org.apache.mesos.rp.test"); + info.set_name("test"); + + v1::MockResourceProvider resourceProvider(info, v1::Resources()); + + // Start and register a resource provider. + Owned endpointDetector( + resource_provider::createEndpointDetector(slave.get()->pid)); + + updateSlaveMessage = FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _); + + resourceProvider.start(std::move(endpointDetector), contentType); + + // Wait until the agent's resources have been updated to include the + // resource provider. + AWAIT_READY(updateSlaveMessage); + ASSERT_FALSE(updateSlaveMessage->resource_providers().providers().empty()); + + { +v1::master::Call v1Call; +v1Call.set_type(v1::master::Call::GET_AGENTS); + +Future v1Response = + post(master.get()->pid, v1Call, contentType); + +AWAIT_READY(v1Response); +ASSERT_TRUE(v1Response->IsInitialized()); +ASSERT_EQ(v1::master::Response::GET_AGENTS, v1Response->type()); +ASSERT_EQ(1, v1Response->get_agents().agents_size()); +ASSERT_EQ(1, v1Response->get_agents().agents(0).resource_providers_size()); + +const mesos::v1::ResourceProviderInfo& responseInfo = + v1Response->get_agents() +.agents(0) +.resource_providers(0) +.resource_provider_info(); + +EXPECT_EQ(info.type(), responseInfo.type()); +EXPECT_EQ(info.name(), responseInfo.name()); +EXPECT_TRUE(responseInfo.has_id()); + } + + updateSlaveMessage = FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _); + + // Disconnect the resource provider. + resourceProvider.stop(); + + // Wait until the agent's resources have been updated to exclude the + //
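The master-side change in the commit above boils down to one extra comparison: besides checking each received resource provider against the cache, compare the set of cached provider IDs with the set of received IDs, so that a provider that has disappeared also counts as a change. The sketch below illustrates only that comparison. The function names and the use of plain strings as IDs are assumptions made for the example, and the real `Master::updateSlave` also compares provider info and resources, which is left out here.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using ProviderIds = std::set<std::string>;

// Hypothetical model of a check that only inspects the providers named in
// the update: a provider that silently disappears is never noticed.
bool hasUnknownProvider(
    const ProviderIds& cached,
    const std::vector<std::string>& received)
{
  for (const std::string& id : received) {
    if (cached.count(id) == 0) {
      return true;
    }
  }

  return false;
}

// Adds the set comparison: the state counts as changed if a provider is new
// or if a cached provider is missing from the update.
bool providersChanged(
    const ProviderIds& cached,
    const std::vector<std::string>& received)
{
  if (hasUnknownProvider(cached, received)) {
    return true;
  }

  const ProviderIds receivedIds(received.begin(), received.end());
  return cached != receivedIds;
}
```

For a cache holding `rp-1` and `rp-2` and an update that lists only `rp-1`, the first function reports no change while the second reports one, which is the case the commit message describes.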
[mesos] branch 1.8.x updated (d19ca42 -> fca8934)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from d19ca42  Added MESOS-9750 to the 1.8.1 CHANGELOG.
     new e0246d1  Made SLRP allow changes in volume context.
     new 3de0e98  Added MESOS-9395 to the 1.8.1 CHANGELOG.
     new 1188b13  Fixed chaining futures infinitely in `UriDiskProfileAdaptor`.
     new 043ef2d  Added MESOS-9803 to the 1.8.1 CHANNGELOG.
     new 9ad9357  Garbage-collected disappeared RPs when agent resources remain unchanged.
     new fca8934  Added MESOS-9831 to the 1.8.1 CHANGELOG.

The 6 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG                                  |   3 +
 include/mesos/mesos.proto                  |   3 +-
 include/mesos/v1/mesos.proto               |   3 +-
 src/master/master.cpp                      |   8 +
 src/resource_provider/storage/provider.cpp | 177 ++---
 .../storage/uri_disk_profile_adaptor.cpp   |  55 +--
 .../storage/uri_disk_profile_adaptor.hpp   |  22 ++-
 src/tests/api_tests.cpp                    |  99
 8 files changed, 286 insertions(+), 84 deletions(-)
[mesos] 06/06: Added MESOS-9831 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit fca89344aff96a8e2ec1b5b70f4a3cb0e899c352
Author: Chun-Hung Hsiao
AuthorDate: Thu Jun 6 11:36:53 2019 -0700

    Added MESOS-9831 to the 1.8.1 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 42137e2..be43ecc 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -13,6 +13,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP)
   * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
   * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
   * [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.
+  * [MESOS-9831] - Master should not report disconnected resource providers.

 ** Improvement
   * [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
[mesos] 03/06: Fixed chaining futures infinitely in `UriDiskProfileAdaptor`.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 1188b131a09e086f3a1510f16fe3053f0ae3a46a Author: Chun-Hung Hsiao AuthorDate: Thu May 30 15:13:44 2019 -0700 Fixed chaining futures infinitely in `UriDiskProfileAdaptor`. Previously it is possible to have an infinite chain of futures when `UriDiskProfileAdaptor::watch` is called: if the set of profiles remains fixed for every poll, each poll would satisfy a promise that triggers an asynchronous recursive call to `UriDiskProfileAdaptor::watch` again. This patch fixes the problem by removing the asynchronous recursion. Instead, we maintain a separated promise for each watcher that is never associated to another promise. After each poll, we check if the current set of profiles differs from the known set for a watcher, and satisfy its own promise if so. Review: https://reviews.apache.org/r/70766 --- .../storage/uri_disk_profile_adaptor.cpp | 55 +++--- .../storage/uri_disk_profile_adaptor.hpp | 22 ++--- 2 files changed, 55 insertions(+), 22 deletions(-) diff --git a/src/resource_provider/storage/uri_disk_profile_adaptor.cpp b/src/resource_provider/storage/uri_disk_profile_adaptor.cpp index 215f7f9..40eae0c 100644 --- a/src/resource_provider/storage/uri_disk_profile_adaptor.cpp +++ b/src/resource_provider/storage/uri_disk_profile_adaptor.cpp @@ -16,6 +16,7 @@ #include "resource_provider/storage/uri_disk_profile_adaptor.hpp" +#include #include #include #include @@ -100,8 +101,7 @@ Future> UriDiskProfileAdaptor::watch( UriDiskProfileAdaptorProcess::UriDiskProfileAdaptorProcess( const UriDiskProfileAdaptor::Flags& _flags) : ProcessBase(ID::generate("uri-disk-profile-adaptor")), -flags(_flags), -watchPromise(new Promise()) {} +flags(_flags) {} void UriDiskProfileAdaptorProcess::initialize() @@ -140,24 +140,24 @@ Future> UriDiskProfileAdaptorProcess::watch( const hashset& knownProfiles, const 
ResourceProviderInfo& resourceProviderInfo) { - // Calculate the new set of profiles for the resource provider. - hashset newProfiles; - foreachpair (const string& profile, - const ProfileRecord& record, - profileMatrix) { + // Calculate the current set of profiles for the resource provider. + hashset currentProfiles; + foreachpair ( + const string& profile, const ProfileRecord& record, profileMatrix) { if (record.active && isSelectedResourceProvider(record.manifest, resourceProviderInfo)) { - newProfiles.insert(profile); + currentProfiles.insert(profile); } } - if (newProfiles != knownProfiles) { -return newProfiles; + if (currentProfiles != knownProfiles) { +return currentProfiles; } // Wait for the next update if there is no change. - return watchPromise->future() -.then(defer(self(), ::watch, knownProfiles, resourceProviderInfo)); + watchers.emplace_back(knownProfiles, resourceProviderInfo); + + return watchers.back().promise.future(); } @@ -272,12 +272,35 @@ void UriDiskProfileAdaptorProcess::notify( profileMatrix.put(entry.first, {entry.second, true}); } - // Notify any watchers and then prepare a new promise for the next - // iteration of polling. + // Notify a watcher if its current set of profiles differs from its known set. // // TODO(josephw): Delay this based on the `--max_random_wait` option. - watchPromise->set(Nothing()); - watchPromise.reset(new Promise()); + foreach (WatcherData& watcher, watchers) { +hashset current; +foreachpair ( +const string& profile, const ProfileRecord& record, profileMatrix) { + if (record.active && + isSelectedResourceProvider(record.manifest, watcher.info)) { +current.insert(profile); + } +} + +if (current != watcher.known) { + CHECK(watcher.promise.set(current)) +<< "Promise for watcher '" << watcher.info << "' is already " +<< watcher.promise.future(); +} + } + + // Remove all notified watchers. 
+ watchers.erase( + std::remove_if( + watchers.begin(), + watchers.end(), + [](const WatcherData& watcher) { +return watcher.promise.future().isReady(); + }), + watchers.end()); LOG(INFO) << "Updated disk profile mapping to " << parsed.profile_matrix().size() diff --git a/src/resource_provider/storage/uri_disk_profile_adaptor.hpp b/src/resource_provider/storage/uri_disk_profile_adaptor.hpp index a5a34dc..027ceaa 100644 --- a/src/resource_provider/storage/uri_disk_profile_adaptor.hpp +++ b/src/resource_provider/storage/uri_disk_profile_ada
[mesos] 04/06: Added MESOS-9803 to the 1.8.1 CHANNGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 043ef2de068acd3c256f8dd165e7f42e520764dd
Author: Chun-Hung Hsiao
AuthorDate: Wed Jun 5 15:12:48 2019 -0700

    Added MESOS-9803 to the 1.8.1 CHANNGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 3182eed..42137e2 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -12,6 +12,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP)
   * [MESOS-9782] - Random sorter fails to clear removed clients.
   * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
   * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
+  * [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.

 ** Improvement
   * [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
[mesos] 01/06: Made SLRP allow changes in volume context.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit e0246d1765df1be69d23fc572c24baadae500dc6 Author: Chun-Hung Hsiao AuthorDate: Thu May 9 17:59:16 2019 -0700 Made SLRP allow changes in volume context. To make SLRP more robust against non-conforming CSI plugins that change volume contexts, the `getExistVolumes` method returns a list of resource conversions consisting of one for converting old volume contexts to new volume contexts, and one to remove missing volumes and add new volumes. To make the interfaces consistent, `getStoragePools` now also returns a list of resource conversions consisting of one conversion. Review: https://reviews.apache.org/r/70620 --- include/mesos/mesos.proto | 3 +- include/mesos/v1/mesos.proto | 3 +- src/resource_provider/storage/provider.cpp | 177 +++-- 3 files changed, 121 insertions(+), 62 deletions(-) diff --git a/include/mesos/mesos.proto b/include/mesos/mesos.proto index dc6a87f..2b4f350 100644 --- a/include/mesos/mesos.proto +++ b/include/mesos/mesos.proto @@ -1510,7 +1510,8 @@ message Resource { optional string id = 4; // EXPERIMENTAL. // Additional metadata for this source. This field maps onto CSI volume - // attributes and is not expected to be set by frameworks. + // context. Frameworks should neither alter this field, nor expect this + // field to remain unchanged. optional Labels metadata = 5; // EXPERIMENTAL. // This field serves as an indirection to a set of storage diff --git a/include/mesos/v1/mesos.proto b/include/mesos/v1/mesos.proto index e8086e0..bafc274 100644 --- a/include/mesos/v1/mesos.proto +++ b/include/mesos/v1/mesos.proto @@ -1502,7 +1502,8 @@ message Resource { optional string id = 4; // EXPERIMENTAL. // Additional metadata for this source. This field maps onto CSI volume - // attributes and is not expected to be set by frameworks. + // context. 
Frameworks should neither alter this field, nor expect this + // field to remain unchanged. optional Labels metadata = 5; // EXPERIMENTAL. // This field serves as an indirection to a set of storage diff --git a/src/resource_provider/storage/provider.cpp b/src/resource_provider/storage/provider.cpp index 999fe95..6d63260 100644 --- a/src/resource_provider/storage/provider.cpp +++ b/src/resource_provider/storage/provider.cpp @@ -57,6 +57,7 @@ #include #include #include +#include #include #include #include @@ -276,8 +277,14 @@ private: const Resources& checkpointed, const Resources& discovered); - Future<Resources> getRawVolumes(); - Future<Resources> getStoragePools(); + // Returns a list of resource conversions to updates volume contexts for + // existing volumes, remove disappeared unconverted volumes, and add newly + // appeared ones. + Future<vector<ResourceConversion>> getExistingVolumes(); + + // Returns a list of resource conversions to remove disappeared unconverted + // storage pools and add newly appeared ones. + Future<vector<ResourceConversion>> getStoragePools(); // Spawns a loop to watch for changes in the set of known profiles and update // the profile mapping and storage pools accordingly. 
@@ -711,21 +718,21 @@ StorageLocalResourceProviderProcess::reconcileResourceProviderState() { return reconcileOperationStatuses() .then(defer(self(), [=] { - return collect<Resources>({getRawVolumes(), getStoragePools()}) -.then(defer(self(), [=](const vector<Resources>& discovered) { - ResourceConversion conversion = reconcileResources( - totalResources, - accumulate(discovered.begin(), discovered.end(), Resources())); - - Try<Resources> result = totalResources.apply(conversion); - CHECK_SOME(result); + return collect<vector<ResourceConversion>>( + {getExistingVolumes(), getStoragePools()}) +.then(defer(self(), [=]( +const vector<vector<ResourceConversion>>& collected) { + Resources result = totalResources; + foreach (const vector<ResourceConversion>& conversions, collected) { +result = CHECK_NOTERROR(result.apply(conversions)); + } - if (result.get() != totalResources) { + if (result != totalResources) { LOG(INFO) - << "Removing '" << conversion.consumed << "' and adding '" - << conversion.converted << "' to the total resources"; + << "Removing '" << (totalResources - result) << "' and adding '" + << (result - totalResources) << "' to the total resources"; -totalResources = result.get(); +totalResources = result; checkpointResourceProviderState(); } @@ -919,21 +926,15 @@ Future S
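The commit message above describes returning a list of resource conversions that are applied in order: one to swap old volume contexts for new ones, then one to drop missing volumes and add new ones. The following is only a minimal sketch of that idea using standard containers. `Resources`, `Conversion`, `apply`, and `applyAll` here are hypothetical stand-ins, not the Mesos `Resources` or `ResourceConversion` types, and a failed conversion returns `false` instead of triggering a `CHECK`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for Mesos `Resources` (an assumption for illustration):
// a multiset of "volume-id:context" strings.
using Resources = std::multiset<std::string>;

// Toy stand-in for a resource conversion: remove `consumed`, add `converted`.
struct Conversion
{
  Resources consumed;
  Resources converted;
};

// Applies one conversion. Returns false and leaves `total` untouched if
// some consumed resource is not contained in `total`.
bool apply(Resources& total, const Conversion& conversion)
{
  Resources result = total;
  for (const std::string& resource : conversion.consumed) {
    Resources::iterator found = result.find(resource);
    if (found == result.end()) {
      return false;
    }
    result.erase(found);
  }
  result.insert(conversion.converted.begin(), conversion.converted.end());
  total = std::move(result);
  return true;
}

// Applies the conversions in order, so a later conversion sees the effect
// of the earlier ones. All-or-nothing: `total` changes only on success.
bool applyAll(Resources& total, const std::vector<Conversion>& conversions)
{
  Resources result = total;
  for (const Conversion& conversion : conversions) {
    if (!apply(result, conversion)) {
      return false;
    }
  }
  total = std::move(result);
  return true;
}
```

Ordering matters in this sketch: the add/remove step operates on volumes whose contexts have already been refreshed, which mirrors the two-conversion list described in the commit message.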
[mesos] 02/06: Added MESOS-9395 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 3de0e9809692d0c9ceb124d59887fe0f96d6e262 Author: Chun-Hung Hsiao AuthorDate: Wed Jun 5 14:41:57 2019 -0700 Added MESOS-9395 to the 1.8.1 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 0f94b19..3182eed 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -3,6 +3,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) * This is a bug fix release. ** Bug + * [MESOS-9395] - Check failure on `StorageLocalResourceProviderProcess::applyCreateDisk`. * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9750] - Agent V1 GET_STATE response may report a complete executor's tasks as non-terminal after a graceful agent shutdown.
[mesos] branch master updated (c18efb7 -> 4b9c566)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from c18efb7 Improved operation feedback example framework. new fe70100 Made SLRP allow changes in volume context. new 6273b50 Added MESOS-9395 to the 1.8.1 CHANGELOG. new 24c70aa Used full paths as volume IDs for the test CSI plugin. new a7b98f7 Added a unit test to verify if SLRP allows changes in volume context. new 343b6d7 Fixed chaining futures infinitely in `UriDiskProfileAdaptor`. new 1271584 Added MESOS-9803 to the 1.8.1 CHANNGELOG. new 661bac0 Added MESOS-9803 to the 1.7.3 CHANNGELOG. new 4c82acb Garbage-collected disappeared RPs when agent resources remain unchanged. new 4b9c566 Added MESOS-9831 to the 1.8.1 CHANGELOG. The 9 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 4 + include/mesos/mesos.proto | 3 +- include/mesos/v1/mesos.proto | 3 +- src/examples/test_csi_plugin.cpp | 282 ++ src/master/master.cpp | 8 + src/resource_provider/storage/provider.cpp | 177 +++ .../storage/uri_disk_profile_adaptor.cpp | 55 +++- .../storage/uri_disk_profile_adaptor.hpp | 22 +- src/tests/api_tests.cpp| 99 +++ .../storage_local_resource_provider_tests.cpp | 325 +++-- 10 files changed, 628 insertions(+), 350 deletions(-)
[mesos] 01/09: Made SLRP allow changes in volume context.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit fe7010099133962fd3d58ffde18d6c8e7472e01a Author: Chun-Hung Hsiao AuthorDate: Thu May 9 17:59:16 2019 -0700 Made SLRP allow changes in volume context. To make SLRP more robust against non-conforming CSI plugins that change volume contexts, the `getExistVolumes` method returns a list of resource conversions consisting of one for converting old volume contexts to new volume contexts, and one to remove missing volumes and add new volumes. To make the interfaces consistent, `getStoragePools` now also returns a list of resource conversions consisting of one conversion. Review: https://reviews.apache.org/r/70620 --- include/mesos/mesos.proto | 3 +- include/mesos/v1/mesos.proto | 3 +- src/resource_provider/storage/provider.cpp | 177 +++-- 3 files changed, 121 insertions(+), 62 deletions(-) diff --git a/include/mesos/mesos.proto b/include/mesos/mesos.proto index dc6a87f..2b4f350 100644 --- a/include/mesos/mesos.proto +++ b/include/mesos/mesos.proto @@ -1510,7 +1510,8 @@ message Resource { optional string id = 4; // EXPERIMENTAL. // Additional metadata for this source. This field maps onto CSI volume - // attributes and is not expected to be set by frameworks. + // context. Frameworks should neither alter this field, nor expect this + // field to remain unchanged. optional Labels metadata = 5; // EXPERIMENTAL. // This field serves as an indirection to a set of storage diff --git a/include/mesos/v1/mesos.proto b/include/mesos/v1/mesos.proto index e8086e0..bafc274 100644 --- a/include/mesos/v1/mesos.proto +++ b/include/mesos/v1/mesos.proto @@ -1502,7 +1502,8 @@ message Resource { optional string id = 4; // EXPERIMENTAL. // Additional metadata for this source. This field maps onto CSI volume - // attributes and is not expected to be set by frameworks. + // context. 
Frameworks should neither alter this field, nor expect this + // field to remain unchanged. optional Labels metadata = 5; // EXPERIMENTAL. // This field serves as an indirection to a set of storage diff --git a/src/resource_provider/storage/provider.cpp b/src/resource_provider/storage/provider.cpp index 999fe95..6d63260 100644 --- a/src/resource_provider/storage/provider.cpp +++ b/src/resource_provider/storage/provider.cpp @@ -57,6 +57,7 @@ #include #include #include +#include #include #include #include @@ -276,8 +277,14 @@ private: const Resources& checkpointed, const Resources& discovered); - Future<Resources> getRawVolumes(); - Future<Resources> getStoragePools(); + // Returns a list of resource conversions to updates volume contexts for + // existing volumes, remove disappeared unconverted volumes, and add newly + // appeared ones. + Future<vector<ResourceConversion>> getExistingVolumes(); + + // Returns a list of resource conversions to remove disappeared unconverted + // storage pools and add newly appeared ones. + Future<vector<ResourceConversion>> getStoragePools(); // Spawns a loop to watch for changes in the set of known profiles and update // the profile mapping and storage pools accordingly. 
@@ -711,21 +718,21 @@ StorageLocalResourceProviderProcess::reconcileResourceProviderState() { return reconcileOperationStatuses() .then(defer(self(), [=] { - return collect<Resources>({getRawVolumes(), getStoragePools()}) -.then(defer(self(), [=](const vector<Resources>& discovered) { - ResourceConversion conversion = reconcileResources( - totalResources, - accumulate(discovered.begin(), discovered.end(), Resources())); - - Try<Resources> result = totalResources.apply(conversion); - CHECK_SOME(result); + return collect<vector<ResourceConversion>>( + {getExistingVolumes(), getStoragePools()}) +.then(defer(self(), [=]( +const vector<vector<ResourceConversion>>& collected) { + Resources result = totalResources; + foreach (const vector<ResourceConversion>& conversions, collected) { +result = CHECK_NOTERROR(result.apply(conversions)); + } - if (result.get() != totalResources) { + if (result != totalResources) { LOG(INFO) - << "Removing '" << conversion.consumed << "' and adding '" - << conversion.converted << "' to the total resources"; + << "Removing '" << (totalResources - result) << "' and adding '" + << (result - totalResources) << "' to the total resources"; -totalResources = result.get(); +totalResources = result; checkpointResourceProviderState(); } @@ -919,21 +926,15 @@ Future S
[mesos] 07/09: Added MESOS-9803 to the 1.7.3 CHANNGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 661bac038d16ec590eac8104e5a73f04cc45834d Author: Chun-Hung Hsiao AuthorDate: Wed Jun 5 15:13:26 2019 -0700 Added MESOS-9803 to the 1.7.3 CHANNGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index a19654f..69a015b 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -447,6 +447,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP) * [MESOS-9766] - /__processes__ endpoint can hang. * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master. * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup. + * [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`. ** Improvements * [MESOS-8880] - Add minimum capabilities in the master.
[mesos] 05/09: Fixed chaining futures infinitely in `UriDiskProfileAdaptor`.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 343b6d7f0d4b3024692fa7ed5669c9819658e382 Author: Chun-Hung Hsiao AuthorDate: Thu May 30 15:13:44 2019 -0700 Fixed chaining futures infinitely in `UriDiskProfileAdaptor`. Previously it is possible to have an infinite chain of futures when `UriDiskProfileAdaptor::watch` is called: if the set of profiles remains fixed for every poll, each poll would satisfy a promise that triggers an asynchronous recursive call to `UriDiskProfileAdaptor::watch` again. This patch fixes the problem by removing the asynchronous recursion. Instead, we maintain a separated promise for each watcher that is never associated to another promise. After each poll, we check if the current set of profiles differs from the known set for a watcher, and satisfy its own promise if so. Review: https://reviews.apache.org/r/70766 --- .../storage/uri_disk_profile_adaptor.cpp | 55 +++--- .../storage/uri_disk_profile_adaptor.hpp | 22 ++--- 2 files changed, 55 insertions(+), 22 deletions(-) diff --git a/src/resource_provider/storage/uri_disk_profile_adaptor.cpp b/src/resource_provider/storage/uri_disk_profile_adaptor.cpp index 215f7f9..40eae0c 100644 --- a/src/resource_provider/storage/uri_disk_profile_adaptor.cpp +++ b/src/resource_provider/storage/uri_disk_profile_adaptor.cpp @@ -16,6 +16,7 @@ #include "resource_provider/storage/uri_disk_profile_adaptor.hpp" +#include #include #include #include @@ -100,8 +101,7 @@ Future> UriDiskProfileAdaptor::watch( UriDiskProfileAdaptorProcess::UriDiskProfileAdaptorProcess( const UriDiskProfileAdaptor::Flags& _flags) : ProcessBase(ID::generate("uri-disk-profile-adaptor")), -flags(_flags), -watchPromise(new Promise()) {} +flags(_flags) {} void UriDiskProfileAdaptorProcess::initialize() @@ -140,24 +140,24 @@ Future> UriDiskProfileAdaptorProcess::watch( const hashset& knownProfiles, const 
ResourceProviderInfo& resourceProviderInfo) { - // Calculate the new set of profiles for the resource provider. - hashset<string> newProfiles; - foreachpair (const string& profile, - const ProfileRecord& record, - profileMatrix) { + // Calculate the current set of profiles for the resource provider. + hashset<string> currentProfiles; + foreachpair ( + const string& profile, const ProfileRecord& record, profileMatrix) { if (record.active && isSelectedResourceProvider(record.manifest, resourceProviderInfo)) { - newProfiles.insert(profile); + currentProfiles.insert(profile); } } - if (newProfiles != knownProfiles) { -return newProfiles; + if (currentProfiles != knownProfiles) { +return currentProfiles; } // Wait for the next update if there is no change. - return watchPromise->future() -.then(defer(self(), &Self::watch, knownProfiles, resourceProviderInfo)); + watchers.emplace_back(knownProfiles, resourceProviderInfo); + + return watchers.back().promise.future(); } @@ -272,12 +272,35 @@ void UriDiskProfileAdaptorProcess::notify( profileMatrix.put(entry.first, {entry.second, true}); } - // Notify any watchers and then prepare a new promise for the next - // iteration of polling. + // Notify a watcher if its current set of profiles differs from its known set. // // TODO(josephw): Delay this based on the `--max_random_wait` option. - watchPromise->set(Nothing()); - watchPromise.reset(new Promise<Nothing>()); + foreach (WatcherData& watcher, watchers) { +hashset<string> current; +foreachpair ( +const string& profile, const ProfileRecord& record, profileMatrix) { + if (record.active && + isSelectedResourceProvider(record.manifest, watcher.info)) { +current.insert(profile); + } +} + +if (current != watcher.known) { + CHECK(watcher.promise.set(current)) +<< "Promise for watcher '" << watcher.info << "' is already " +<< watcher.promise.future(); +} + } + + // Remove all notified watchers. 
+ watchers.erase( + std::remove_if( + watchers.begin(), + watchers.end(), + [](const WatcherData& watcher) { +return watcher.promise.future().isReady(); + }), + watchers.end()); LOG(INFO) << "Updated disk profile mapping to " << parsed.profile_matrix().size() diff --git a/src/resource_provider/storage/uri_disk_profile_adaptor.hpp b/src/resource_provider/storage/uri_disk_profile_adaptor.hpp index a5a34dc..027ceaa 100644 --- a/src/resource_provider/storage/uri_disk_profile_adaptor.hpp +++ b/src/resource_provider/storage/uri_disk_profile_ada
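The fix described in this commit replaces the recursive re-watch with one pending entry per watcher that is satisfied at most once and then erased. The sketch below illustrates that shape with the standard library only. It is not the libprocess implementation: `ProfileRegistry`, `Watcher`, and `Callback` are hypothetical names, a plain callback stands in for `Promise`, and the per-resource-provider filtering done by `isSelectedResourceProvider` is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Profiles = std::set<std::string>;
using Callback = std::function<void(const Profiles&)>;

// Hypothetical stand-in for the patch's `WatcherData`: what the watcher
// already knows, plus a callback that plays the role of its own promise.
struct Watcher
{
  Profiles known;
  Callback callback;
  bool notified;
};

class ProfileRegistry
{
public:
  // Answers immediately if `known` is stale; otherwise parks the watcher.
  // No watcher is ever chained to another one.
  void watch(const Profiles& known, Callback callback)
  {
    if (current != known) {
      callback(current);
      return;
    }

    Watcher watcher;
    watcher.known = known;
    watcher.callback = std::move(callback);
    watcher.notified = false;
    watchers.push_back(std::move(watcher));
  }

  // Called after each poll. Assumes callbacks do not call `watch` again
  // while the loop below is running.
  void notify(const Profiles& updated)
  {
    current = updated;

    for (Watcher& watcher : watchers) {
      if (current != watcher.known) {
        watcher.callback(current);
        watcher.notified = true;
      }
    }

    // Remove all notified watchers (erase-remove idiom, as in the patch).
    watchers.erase(
        std::remove_if(
            watchers.begin(),
            watchers.end(),
            [](const Watcher& watcher) { return watcher.notified; }),
        watchers.end());
  }

  std::size_t pending() const { return watchers.size(); }

private:
  Profiles current;
  std::vector<Watcher> watchers;
};
```

In this sketch a poll that leaves the profile set unchanged touches no watcher, so repeated polls allocate nothing, whereas the previous design added one more link to the chain on every poll.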
[mesos] 08/09: Garbage-collected disappeared RPs when agent resources remain unchanged.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 4c82acba45e31d2426adebdb7783b8bd762b6b0d Author: Chun-Hung Hsiao AuthorDate: Wed Jun 5 21:47:07 2019 -0700 Garbage-collected disappeared RPs when agent resources remain unchanged. Previously when there is a missing resource provider in an `UpdateSlaveMessage` but the agent's total resources remain unchanged, the update message will be completely ignored, so the missing resource provider will still be cached in the master's state, which is not the desired behavior. This patch ensures that the master's state gets updated if any resource provider is missing. Review: https://reviews.apache.org/r/70788 --- src/master/master.cpp | 8 src/tests/api_tests.cpp | 99 + 2 files changed, 107 insertions(+) diff --git a/src/master/master.cpp b/src/master/master.cpp index 4d7c37c..b3c10ab 100644 --- a/src/master/master.cpp +++ b/src/master/master.cpp @@ -8185,6 +8185,8 @@ void Master::updateSlave(UpdateSlaveMessage&& message) // Check if resource provider information changed. 
if (!updated && message.has_resource_providers()) { +hashset<ResourceProviderID> receivedResourceProviders; + foreach ( const UpdateSlaveMessage::ResourceProvider& receivedProvider, message.resource_providers().providers()) { @@ -8194,6 +8196,8 @@ void Master::updateSlave(UpdateSlaveMessage&& message) const ResourceProviderID& resourceProviderId = receivedProvider.info().id(); + receivedResourceProviders.insert(resourceProviderId); + if (!slave->resourceProviders.contains(resourceProviderId)) { updated = true; break; @@ -8224,6 +8228,10 @@ void Master::updateSlave(UpdateSlaveMessage&& message) } } } + +if (slave->resourceProviders.keys() != receivedResourceProviders) { + updated = true; +} } if (!updated) { diff --git a/src/tests/api_tests.cpp b/src/tests/api_tests.cpp index 2220cec..f191a1c 100644 --- a/src/tests/api_tests.cpp +++ b/src/tests/api_tests.cpp @@ -275,6 +275,105 @@ TEST_P(MasterAPITest, GetAgents) } +// This test verifies that if a resource provider becomes disconnected, it will +// not be reported by `GET_AGENT` calls. +TEST_P(MasterAPITest, GetAgentsDisconnectedResourceProvider) +{ + Clock::pause(); + + const ContentType contentType = GetParam(); + + master::Flags masterFlags = CreateMasterFlags(); + Try<Owned<cluster::Master>> master = this->StartMaster(masterFlags); + ASSERT_SOME(master); + + Owned<MasterDetector> detector = master.get()->createDetector(); + + // Start one agent. + Future<UpdateSlaveMessage> updateSlaveMessage = +FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _); + + slave::Flags slaveFlags = CreateSlaveFlags(); + Try<Owned<cluster::Slave>> slave = StartSlave(detector.get(), slaveFlags); + ASSERT_SOME(slave); + + Clock::settle(); + Clock::advance(slaveFlags.registration_backoff_factor); + + AWAIT_READY(updateSlaveMessage); + ASSERT_TRUE(updateSlaveMessage->resource_providers().providers().empty()); + + // Start a resource provider. 
+ mesos::v1::ResourceProviderInfo info; + info.set_type("org.apache.mesos.rp.test"); + info.set_name("test"); + + v1::MockResourceProvider resourceProvider(info, v1::Resources()); + + // Start and register a resource provider. + Owned<EndpointDetector> endpointDetector( + resource_provider::createEndpointDetector(slave.get()->pid)); + + updateSlaveMessage = FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _); + + resourceProvider.start(std::move(endpointDetector), contentType); + + // Wait until the agent's resources have been updated to include the + // resource provider. + AWAIT_READY(updateSlaveMessage); + ASSERT_FALSE(updateSlaveMessage->resource_providers().providers().empty()); + + { +v1::master::Call v1Call; +v1Call.set_type(v1::master::Call::GET_AGENTS); + +Future<v1::master::Response> v1Response = + post(master.get()->pid, v1Call, contentType); + +AWAIT_READY(v1Response); +ASSERT_TRUE(v1Response->IsInitialized()); +ASSERT_EQ(v1::master::Response::GET_AGENTS, v1Response->type()); +ASSERT_EQ(1, v1Response->get_agents().agents_size()); +ASSERT_EQ(1, v1Response->get_agents().agents(0).resource_providers_size()); + +const mesos::v1::ResourceProviderInfo& responseInfo = + v1Response->get_agents() +.agents(0) +.resource_providers(0) +.resource_provider_info(); + +EXPECT_EQ(info.type(), responseInfo.type()); +EXPECT_EQ(info.name(), responseInfo.name()); +EXPECT_TRUE(responseInfo.has_id()); + } + + updateSlaveMessage = FUTURE_PROTOBUF(UpdateSlaveMessage(), _, _); + + // Disconnect the resource provider. + resourceProvider.stop(); + + // Wait until the agent's resources have been updated to exclude the + //
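The master-side change in this commit amounts to a set comparison: besides looking for providers the master has not seen, it now also checks whether a cached provider is absent from the update. A minimal sketch of that check follows, assuming provider IDs are plain strings. `resourceProvidersChanged` is a hypothetical helper, not a function in `master.cpp`, and the per-provider info and resource comparisons from the real code are omitted.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns true if the received provider IDs differ from the cached ones,
// either because a new provider appeared or because a cached one is missing.
bool resourceProvidersChanged(
    const std::set<std::string>& cached,
    const std::vector<std::string>& received)
{
  std::set<std::string> receivedIds;

  for (const std::string& id : received) {
    receivedIds.insert(id);

    // A provider unknown to the cache is already a change.
    if (cached.count(id) == 0) {
      return true;
    }
  }

  // Every received provider is cached, so any difference left means that
  // some cached provider did not show up in the update.
  return cached != receivedIds;
}
```

Without the final comparison, an update that only drops a provider would look unchanged, which is the situation the commit message describes.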
[mesos] 02/09: Added MESOS-9395 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 6273b506f2cbd5b140bb8419c90bf22858bd615b Author: Chun-Hung Hsiao AuthorDate: Wed Jun 5 14:41:57 2019 -0700 Added MESOS-9395 to the 1.8.1 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 8b2a613..9d1e557 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -16,6 +16,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) * This is a bug fix release. ** Bug + * [MESOS-9395] - Check failure on `StorageLocalResourceProviderProcess::applyCreateDisk`. * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9750] - Agent V1 GET_STATE response may report a complete executor's tasks as non-terminal after a graceful agent shutdown.
[mesos] 03/09: Used full paths as volume IDs for the test CSI plugin.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 24c70aa15c566b6f15a3f3636664507bae3af3b6 Author: Chun-Hung Hsiao AuthorDate: Thu Apr 18 12:50:54 2019 -0700 Used full paths as volume IDs for the test CSI plugin. The full paths of simulated volumes are now in their ID instead of their contextual information. This simplifies SLRP tests, and makes it cleaner if we want to customize the contextual information in the future. Review: https://reviews.apache.org/r/70621 --- src/examples/test_csi_plugin.cpp | 245 - .../storage_local_resource_provider_tests.cpp | 170 +++--- 2 files changed, 172 insertions(+), 243 deletions(-) diff --git a/src/examples/test_csi_plugin.cpp b/src/examples/test_csi_plugin.cpp index 03f782e..7ff08b8 100644 --- a/src/examples/test_csi_plugin.cpp +++ b/src/examples/test_csi_plugin.cpp @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include @@ -59,13 +60,14 @@ #include "csi/v0_utils.hpp" #include "csi/v1_utils.hpp" +#include "csi/volume_manager.hpp" #include "linux/fs.hpp" #include "logging/logging.hpp" namespace http = process::http; -namespace fs = mesos::internal::fs; +namespace internal = mesos::internal; using std::cerr; using std::cout; @@ -95,6 +97,8 @@ using grpc::ServerContext; using grpc::Status; using grpc::WriteOptions; +using mesos::csi::VolumeInfo; + using mesos::csi::types::VolumeCapability; using process::grpc::StatusError; @@ -192,41 +196,46 @@ public: defaultVolumeCapability.mutable_access_mode() ->set_mode(VolumeCapability::AccessMode::SINGLE_NODE_WRITER); -// Scan for preprovisioned volumes. +Bytes usedCapacity(0); + +// Scan for created volumes. // // TODO(jieyu): Consider not using CHECKs here. 
-Try> paths = os::ls(workDir); -CHECK_SOME(paths); - -foreach (const string& path, paths.get()) { - Try volumeInfo = parseVolumePath(path); - CHECK_SOME(volumeInfo); - - CHECK(!volumes.contains(volumeInfo->id)); - volumes.put(volumeInfo->id, volumeInfo.get()); - - if (!_volumes.contains(volumeInfo->id)) { -CHECK_GE(availableCapacity, volumeInfo->size); -availableCapacity -= volumeInfo->size; - } +Try> paths = fs::list(path::join(workDir, "*-*")); +foreach (const string& path, CHECK_NOTERROR(paths)) { + volumes.put(path, CHECK_NOTERROR(parseVolumePath(path))); + usedCapacity += volumes.at(path).capacity; } +// Create preprovisioned volumes if they have not existed yet. foreachpair (const string& name, const Bytes& capacity, _volumes) { - if (volumes.contains(name)) { + Option found = findVolumeByName(name); + + if (found.isSome()) { +CHECK_EQ(found->capacity, capacity) + << "Expected preprovisioned volume '" << name << "' to be " + << capacity << " but found " << found->capacity << " instead"; + +usedCapacity -= found->capacity; continue; } - VolumeInfo volumeInfo; - volumeInfo.id = name; - volumeInfo.size = capacity; - - const string path = getVolumePath(volumeInfo); + VolumeInfo volumeInfo{ +capacity, getVolumePath(capacity, name), Map()}; - Try mkdir = os::mkdir(path); - CHECK_SOME(mkdir); + Try mkdir = os::mkdir(volumeInfo.id); + CHECK_SOME(mkdir) +<< "Failed to create directory for preprovisioned volume '" << name +<< "': " << mkdir.error(); - volumes.put(volumeInfo.id, volumeInfo); + volumes.put(volumeInfo.id, std::move(volumeInfo)); } + +CHECK_GE(availableCapacity, usedCapacity) + << "Insufficient available capacity for volumes, expected to be at least " + << usedCapacity; + +availableCapacity -= usedCapacity; } void run(); @@ -406,14 +415,9 @@ public: csi::v1::NodeGetInfoResponse* response) override; private: - struct VolumeInfo - { -string id; -Bytes size; - }; - - string getVolumePath(const VolumeInfo& volumeInfo); - Try parseVolumePath(const string& 
path); + string getVolumePath(const Bytes& capacity, const string& name); + Try parseVolumePath(const string& dir); + Option findVolumeByName(const string& name); Try createVolume( const string& name, @@ -568,9 +572,8 @@ Status TestCSIPlugin::CreateVolume( } response->mutable_volume()->set_id(result->id); - response->mutable_volume()->set_capacity_bytes(result->size.bytes()); - (*response->mutable
[mesos] 04/09: Added a unit test to verify if SLRP allows changes in volume context.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit a7b98f702ddf4e8c9cfd9a7e14e050d802f80084 Author: Chun-Hung Hsiao AuthorDate: Mon May 6 11:44:40 2019 -0700 Added a unit test to verify if SLRP allows changes in volume context. Review: https://reviews.apache.org/r/70622 --- src/examples/test_csi_plugin.cpp | 43 -- .../storage_local_resource_provider_tests.cpp | 155 +++-- 2 files changed, 172 insertions(+), 26 deletions(-) diff --git a/src/examples/test_csi_plugin.cpp b/src/examples/test_csi_plugin.cpp index 7ff08b8..6202173 100644 --- a/src/examples/test_csi_plugin.cpp +++ b/src/examples/test_csi_plugin.cpp @@ -146,6 +146,12 @@ public: "specified as a semicolon-delimited list of param=value pairs.\n" "(Example: 'param1=value1;param2=value2')"); +add(::volume_metadata, +"volume_metadata", +"The static properties to add to the contextual information of each\n" +"volume. The metadata are specified as a semicolon-delimited list of\n" +"prop=value pairs. (Example: 'prop1=value1;prop2=value2')"); + add(::volumes, "volumes", "Creates preprovisioned volumes upon start-up. The volumes are\n" @@ -164,6 +170,7 @@ public: string work_dir; Bytes available_capacity; Option create_parameters; + Option volume_metadata; Option volumes; Option forward; }; @@ -184,12 +191,14 @@ public: const string& _workDir, const Bytes& _availableCapacity, const hashmap& _createParameters, + const hashmap& _volumeMetadata, const hashmap& _volumes) : apiVersion(_apiVersion), endpoint(_endpoint), workDir(_workDir), availableCapacity(_availableCapacity), - createParameters(_createParameters.begin(), _createParameters.end()) + createParameters(_createParameters.begin(), _createParameters.end()), + volumeMetadata(_volumeMetadata.begin(), _volumeMetadata.end()) { // Construct the default mount volume capability. 
defaultVolumeCapability.mutable_mount(); @@ -221,7 +230,7 @@ public: } VolumeInfo volumeInfo{ -capacity, getVolumePath(capacity, name), Map()}; +capacity, getVolumePath(capacity, name), volumeMetadata}; Try mkdir = os::mkdir(volumeInfo.id); CHECK_SOME(mkdir) @@ -483,6 +492,7 @@ private: Bytes availableCapacity; VolumeCapability defaultVolumeCapability; Map createParameters; + Map volumeMetadata; hashmap volumes; }; @@ -1289,7 +1299,7 @@ Try TestCSIPlugin::parseVolumePath(const string& dir) << "Cannot reconstruct volume path '" << dir << "' from volume name '" << name.get() << "' and capacity " << capacity.get(); - return VolumeInfo{capacity.get(), dir, Map()}; + return VolumeInfo{capacity.get(), dir, volumeMetadata}; } @@ -1352,7 +1362,7 @@ Try TestCSIPlugin::createVolume( VolumeInfo volumeInfo{min(max(defaultSize, requiredBytes), limitBytes), getVolumePath(volumeInfo.capacity, name), - Map()}; + volumeMetadata}; Try mkdir = os::mkdir(volumeInfo.id); if (mkdir.isError()) { @@ -1995,19 +2005,27 @@ int main(int argc, char** argv) foreachpair (const string& param, const vector& values, strings::pairs(flags.create_parameters.get(), ";", "=")) { - Option error; - if (values.size() != 1) { -error = "Parameter keys must be unique"; - } else { -createParameters.put(param, values[0]); +cerr << "Parameter key '" << param << "' is not unique" << endl; +return EXIT_FAILURE; } - if (error.isSome()) { -cerr << "Failed to parse the '--create_parameters' flags: " - << error->message << endl; + createParameters.put(param, values[0]); +} + } + + hashmap volumeMetadata; + + if (flags.volume_metadata.isSome()) { +foreachpair (const string& prop, + const vector& values, + strings::pairs(flags.volume_metadata.get(), ";", "=")) { + if (values.size() != 1) { +cerr << "Metadata key '" << prop << "' is not unique" << endl; return EXIT_FAILURE; } + + volumeMetadata.put(prop, values[0]); } } @@ -2060,6 +2078,7 @@ int main(int argc, char** argv) flags.work_dir, flags.available_capacity, 
createParameters, +volumeMetadata,
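The new `--volume_metadata` flag takes a semicolon-delimited list of `prop=value` pairs and rejects repeated keys. The sketch below shows one way to parse such a string with the standard library. It does not use the stout `strings::pairs` helper from the patch, and `parsePairs` together with its error messages for malformed input are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Parses "prop1=value1;prop2=value2" into `result`. On failure, returns
// false, fills `error`, and leaves `result` untouched.
bool parsePairs(
    const std::string& input,
    std::map<std::string, std::string>& result,
    std::string& error)
{
  std::map<std::string, std::string> parsed;
  std::istringstream stream(input);
  std::string token;

  while (std::getline(stream, token, ';')) {
    if (token.empty()) {
      continue; // Tolerate empty segments such as a trailing ';'.
    }

    const std::string::size_type eq = token.find('=');
    if (eq == std::string::npos || eq == 0) {
      error = "Malformed pair '" + token + "'";
      return false;
    }

    const std::string key = token.substr(0, eq);
    const std::string value = token.substr(eq + 1);

    // `insert` reports whether the key was new, which gives the
    // uniqueness check for free.
    if (!parsed.insert(std::make_pair(key, value)).second) {
      error = "Metadata key '" + key + "' is not unique";
      return false;
    }
  }

  result = parsed;
  return true;
}
```

Failing fast on the first duplicate matches the flow in the patch, where `main` prints the offending key and returns `EXIT_FAILURE`.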
[mesos] 09/09: Added MESOS-9831 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 4b9c5667e599fc44ee04f1b524c30b95a2c840f8 Author: Chun-Hung Hsiao AuthorDate: Thu Jun 6 11:36:53 2019 -0700 Added MESOS-9831 to the 1.8.1 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 69a015b..197d791 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -26,6 +26,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master. * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup. * [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`. + * [MESOS-9831] - Master should not report disconnected resource providers. ** Improvement * [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
[mesos] 06/09: Added MESOS-9803 to the 1.8.1 CHANNGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 1271584060af14bbfc6a5715c9da47cdbd7a1438 Author: Chun-Hung Hsiao AuthorDate: Wed Jun 5 15:12:48 2019 -0700 Added MESOS-9803 to the 1.8.1 CHANNGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 9d1e557..a19654f 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -25,6 +25,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) * [MESOS-9782] - Random sorter fails to clear removed clients. * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master. * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup. + * [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`. ** Improvement * [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
[mesos] branch master updated (421728f -> bf07bbd)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from 421728f Fixed the race between validating and applying FrameworkInfo updates. new 365e06c Added a helper to devolve offer operations. new 06aec42 Changed the `*TaskIdEq` test matchers to take a `TaskID`. new 3747331 Removed the `TaskStatusUpdateIsTerminalState` matcher. new bf07bbd Added a unit test for master operation authorization. The 4 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/internal/devolve.cpp | 6 + src/internal/devolve.hpp | 1 + src/slave/slave.hpp| 3 +- src/tests/api_tests.cpp| 8 +- .../containerizer/nvidia_gpu_isolator_tests.cpp| 21 +- src/tests/default_executor_tests.cpp | 96 ++-- src/tests/gc_tests.cpp | 20 +- src/tests/master_authorization_tests.cpp | 597 + src/tests/master_tests.cpp | 18 +- src/tests/mesos.hpp| 141 +++-- src/tests/mock_slave.cpp | 7 + src/tests/mock_slave.hpp | 6 + src/tests/partition_tests.cpp | 6 +- src/tests/slave_tests.cpp | 33 +- 14 files changed, 832 insertions(+), 131 deletions(-)
[mesos] 01/04: Added a helper to devolve offer operations.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 365e06c91dc57c04c995c49f19a37f56c94dfdf7 Author: Chun-Hung Hsiao AuthorDate: Tue May 21 21:02:06 2019 -0700 Added a helper to devolve offer operations. Review: https://reviews.apache.org/r/70683 --- src/internal/devolve.cpp | 6 ++ src/internal/devolve.hpp | 1 + 2 files changed, 7 insertions(+) diff --git a/src/internal/devolve.cpp b/src/internal/devolve.cpp index e23ed3c..1d300b4 100644 --- a/src/internal/devolve.cpp +++ b/src/internal/devolve.cpp @@ -104,6 +104,12 @@ Offer devolve(const v1::Offer& offer) } +Offer::Operation devolve(const v1::Offer::Operation& operation) +{ + return devolve<Offer::Operation>(operation); +} + + OperationStatus devolve(const v1::OperationStatus& status) { OperationStatus _status = devolve<OperationStatus>(status); diff --git a/src/internal/devolve.hpp b/src/internal/devolve.hpp index a1f8d8d..fefe86e 100644 --- a/src/internal/devolve.hpp +++ b/src/internal/devolve.hpp @@ -60,6 +60,7 @@ FrameworkInfo devolve(const v1::FrameworkInfo& frameworkInfo); HealthCheck devolve(const v1::HealthCheck& check); InverseOffer devolve(const v1::InverseOffer& inverseOffer); Offer devolve(const v1::Offer& offer); +Offer::Operation devolve(const v1::Offer::Operation& operation); OperationStatus devolve(const v1::OperationStatus& status); Resource devolve(const v1::Resource& resource); ResourceProviderID devolve(const v1::ResourceProviderID& resourceProviderId);
[mesos] 03/04: Removed the `TaskStatusUpdateIsTerminalState` matcher.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 3747331d05b4372ed57696941a19d6f41cd7670d Author: Chun-Hung Hsiao AuthorDate: Wed May 15 16:13:19 2019 -0700 Removed the `TaskStatusUpdateIsTerminalState` matcher. The `TaskStatusUpdateIsTerminalState` test matcher does not handle both v0 and v1 status updates. Since introducing such a matcher in the `v1::scheduler` namespace is inconsistent and it is not hard to use the built-in `Truly` matcher to implement the same functionality, this matcher is removed for now. Review: https://reviews.apache.org/r/70685 --- src/tests/containerizer/nvidia_gpu_isolator_tests.cpp | 17 ++--- src/tests/mesos.hpp | 10 -- 2 files changed, 14 insertions(+), 13 deletions(-) diff --git a/src/tests/containerizer/nvidia_gpu_isolator_tests.cpp b/src/tests/containerizer/nvidia_gpu_isolator_tests.cpp index 5e753e7..fe82c82 100644 --- a/src/tests/containerizer/nvidia_gpu_isolator_tests.cpp +++ b/src/tests/containerizer/nvidia_gpu_isolator_tests.cpp @@ -32,6 +32,8 @@ #include #include +#include "common/protobuf_utils.hpp" + #include "docker/docker.hpp" #include "master/master.hpp" @@ -73,6 +75,7 @@ using testing::AtMost; using testing::DoAll; using testing::Eq; using testing::Return; +using testing::Truly; namespace mesos { namespace internal { @@ -307,7 +310,9 @@ TEST_F(NvidiaGpuTest, ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage) Future task1Finished; EXPECT_CALL(*scheduler, update(_, AllOf( TaskStatusUpdateTaskIdEq(task1.task_id()), - TaskStatusUpdateIsTerminalState( + Truly([](const v1::scheduler::Event::Update& update) { +return protobuf::isTerminalState(devolve(update.status()).state()); + } .WillOnce(DoAll( FutureArg<1>(), v1::scheduler::SendAcknowledge(frameworkId, agentId))); @@ -315,7 +320,9 @@ TEST_F(NvidiaGpuTest, ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage) Future task2Failed; 
EXPECT_CALL(*scheduler, update(_, AllOf( TaskStatusUpdateTaskIdEq(task2.task_id()), - TaskStatusUpdateIsTerminalState( + Truly([](const v1::scheduler::Event::Update& update) { +return protobuf::isTerminalState(devolve(update.status()).state()); + } .WillOnce(DoAll( FutureArg<1>(), v1::scheduler::SendAcknowledge(frameworkId, agentId))); @@ -429,7 +436,11 @@ TEST_F(NvidiaGpuTest, ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_TensorflowGpuImage) .WillOnce(v1::scheduler::SendAcknowledge(frameworkId, agentId)); Future terminalStatusUpdate; - EXPECT_CALL(*scheduler, update(_, TaskStatusUpdateIsTerminalState())) + EXPECT_CALL(*scheduler, update( + _, + Truly([](const v1::scheduler::Event::Update& update) { +return protobuf::isTerminalState(devolve(update.status()).state()); + }))) .WillOnce(DoAll( FutureArg<1>(), v1::scheduler::SendAcknowledge(frameworkId, agentId))); diff --git a/src/tests/mesos.hpp b/src/tests/mesos.hpp index 605a69f..c886789 100644 --- a/src/tests/mesos.hpp +++ b/src/tests/mesos.hpp @@ -73,8 +73,6 @@ #include "common/http.hpp" #include "common/protobuf_utils.hpp" -#include "internal/devolve.hpp" - #include "messages/messages.hpp" // For google::protobuf::Message. #include "master/master.hpp" @@ -3697,14 +3695,6 @@ MATCHER_P(TaskStatusUpdateStateEq, taskState, "") } -// This matcher is used to match an `Event.update.status` message whose state is -// terminal. -MATCHER(TaskStatusUpdateIsTerminalState, "") -{ - return protobuf::isTerminalState(devolve(arg.status()).state()); -} - - // This matcher is used to match the task id of // `authorization::Request.Object.TaskInfo`. MATCHER_P(AuthorizationRequestHasTaskID, taskId, "")
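Note on the commit above: it replaces a dedicated matcher with a lambda wrapped in the built-in `Truly` matcher. The idea of turning an ordinary predicate into a matcher can be sketched without gmock as follows. This is only an illustration under stated assumptions: `TaskState`, `Update`, `isTerminalState`, `PredicateMatcher` and `terminalUpdate` are hypothetical stand-ins written for this sketch, not the Mesos or gmock types, and the set of terminal states is deliberately reduced.

```cpp
#include <functional>
#include <utility>

// Hypothetical, reduced stand-ins for the task state and the update event.
// The real Mesos types are protobuf messages with more states.
enum class TaskState { STARTING, RUNNING, FINISHED, FAILED, KILLED };

struct Update
{
  TaskState state;
};

// Assumed helper mirroring the role of `protobuf::isTerminalState`.
inline bool isTerminalState(TaskState state)
{
  return state == TaskState::FINISHED ||
         state == TaskState::FAILED ||
         state == TaskState::KILLED;
}

// A minimal predicate-backed matcher: it stores any callable taking a
// `const T&` and returning `bool`, and reports whether a value satisfies it.
template <typename T>
class PredicateMatcher
{
public:
  explicit PredicateMatcher(std::function<bool(const T&)> predicate)
    : predicate_(std::move(predicate)) {}

  bool matches(const T& value) const { return predicate_(value); }

private:
  std::function<bool(const T&)> predicate_;
};

// Builds the matcher inline from a lambda, so no named matcher is needed.
inline PredicateMatcher<Update> terminalUpdate()
{
  return PredicateMatcher<Update>(
      [](const Update& update) { return isTerminalState(update.state); });
}
```

With this shape, a one-off condition lives next to the expectation that uses it, which is the trade-off the commit message describes: no shared matcher to keep consistent across the v0 and v1 namespaces.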
[mesos] 02/04: Changed the `*TaskIdEq` test matchers to take a `TaskID`.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 06aec4208a506bd14534876e36575bb84a1076ca Author: Chun-Hung Hsiao AuthorDate: Wed May 15 14:19:08 2019 -0700 Changed the `*TaskIdEq` test matchers to take a `TaskID`. The `TaskStatusTaskIdEq` and `TaskStatusUpdateTaskIdEq` matchers previously take a `TaskInfo`, which is not very intuitive and inconsistent with other matchers such as `OptionTaskHasTaskID`. By making these matchers take a `TaskID` we also reduce coupling of this matcher and the structure of `TaskInfo`. Review: https://reviews.apache.org/r/70684 --- src/tests/api_tests.cpp| 8 +- .../containerizer/nvidia_gpu_isolator_tests.cpp| 4 +- src/tests/default_executor_tests.cpp | 96 +++--- src/tests/gc_tests.cpp | 20 ++--- src/tests/master_tests.cpp | 18 ++-- src/tests/mesos.hpp| 8 +- src/tests/partition_tests.cpp | 6 +- src/tests/slave_tests.cpp | 33 +--- 8 files changed, 102 insertions(+), 91 deletions(-) diff --git a/src/tests/api_tests.cpp b/src/tests/api_tests.cpp index bc19d7e..37d0cb1 100644 --- a/src/tests/api_tests.cpp +++ b/src/tests/api_tests.cpp @@ -5025,7 +5025,7 @@ TEST_P(MasterAPITest, UnreachableAgentMarkedGone) EXPECT_CALL( *scheduler, update(_, AllOf( - TaskStatusUpdateTaskIdEq(taskInfo), + TaskStatusUpdateTaskIdEq(taskInfo.task_id()), TaskStatusUpdateStateEq(v1::TASK_STARTING .InSequence(updateSequence) .WillOnce(DoAll( @@ -5035,7 +5035,7 @@ TEST_P(MasterAPITest, UnreachableAgentMarkedGone) EXPECT_CALL( *scheduler, update(_, AllOf( -TaskStatusUpdateTaskIdEq(taskInfo), +TaskStatusUpdateTaskIdEq(taskInfo.task_id()), TaskStatusUpdateStateEq(v1::TASK_RUNNING .InSequence(updateSequence) .WillOnce(DoAll( @@ -5056,7 +5056,7 @@ TEST_P(MasterAPITest, UnreachableAgentMarkedGone) EXPECT_CALL( *scheduler, update(_, AllOf( -TaskStatusUpdateTaskIdEq(taskInfo), +TaskStatusUpdateTaskIdEq(taskInfo.task_id()), 
TaskStatusUpdateStateEq(v1::TASK_UNREACHABLE .WillOnce(FutureArg<1>()); @@ -5092,7 +5092,7 @@ TEST_P(MasterAPITest, UnreachableAgentMarkedGone) EXPECT_CALL( *scheduler, update(_, AllOf( -TaskStatusUpdateTaskIdEq(taskInfo), +TaskStatusUpdateTaskIdEq(taskInfo.task_id()), TaskStatusUpdateStateEq(v1::TASK_GONE_BY_OPERATOR .WillOnce(FutureArg<1>()); diff --git a/src/tests/containerizer/nvidia_gpu_isolator_tests.cpp b/src/tests/containerizer/nvidia_gpu_isolator_tests.cpp index 7f5bd8c..5e753e7 100644 --- a/src/tests/containerizer/nvidia_gpu_isolator_tests.cpp +++ b/src/tests/containerizer/nvidia_gpu_isolator_tests.cpp @@ -306,7 +306,7 @@ TEST_F(NvidiaGpuTest, ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage) Future task1Finished; EXPECT_CALL(*scheduler, update(_, AllOf( - TaskStatusUpdateTaskIdEq(task1), + TaskStatusUpdateTaskIdEq(task1.task_id()), TaskStatusUpdateIsTerminalState( .WillOnce(DoAll( FutureArg<1>(), @@ -314,7 +314,7 @@ TEST_F(NvidiaGpuTest, ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage) Future task2Failed; EXPECT_CALL(*scheduler, update(_, AllOf( - TaskStatusUpdateTaskIdEq(task2), + TaskStatusUpdateTaskIdEq(task2.task_id()), TaskStatusUpdateIsTerminalState( .WillOnce(DoAll( FutureArg<1>(), diff --git a/src/tests/default_executor_tests.cpp b/src/tests/default_executor_tests.cpp index 93d7a1c..1c3b488 100644 --- a/src/tests/default_executor_tests.cpp +++ b/src/tests/default_executor_tests.cpp @@ -413,7 +413,7 @@ TEST_P(DefaultExecutorTest, KillTask) EXPECT_CALL( *scheduler, update(_, AllOf( - TaskStatusUpdateTaskIdEq(taskInfo1), + TaskStatusUpdateTaskIdEq(taskInfo1.task_id()), TaskStatusUpdateStateEq(v1::TASK_STARTING .InSequence(task1) .WillOnce( @@ -424,7 +424,7 @@ TEST_P(DefaultExecutorTest, KillTask) EXPECT_CALL( *scheduler, update(_, AllOf( - TaskStatusUpdateTaskIdEq(taskInfo1), + TaskStatusUpdateTaskIdEq(taskInfo1.task_id()), TaskStatusUpdateStateEq(v1::TASK_RUNNING .InSequence(task1) .WillOnce( @@ -435,7 +435,7 @@ 
TEST_P(DefaultExecutorTest, KillTask) EXPECT_CALL( *scheduler, update(_, AllOf( - TaskStatusUpdateTaskIdEq(taskInfo1), + TaskStatusUpdateTaskIdEq(taskInfo1.task_id()), TaskStatusUpdateStateEq(v1::TASK_KILLED .InSequence(task1) .WillOnce( @@
[mesos] 04/04: Added a unit test for master operation authorization.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit bf07bbd1cf103b6e8c23ff3831a11dacf9dfc398 Author: Chun-Hung Hsiao AuthorDate: Thu Jan 17 18:48:59 2019 -0800 Added a unit test for master operation authorization. This test verifies that allowing or denying an action will only result in a success or failure on specific operations but not other operations in an accept call. This is a regression test for MESOS-9474 and MESOS-9480. Review: https://reviews.apache.org/r/70686 --- src/slave/slave.hpp | 3 +- src/tests/master_authorization_tests.cpp | 597 +++ src/tests/mesos.hpp | 123 +-- src/tests/mock_slave.cpp | 7 + src/tests/mock_slave.hpp | 6 + 5 files changed, 709 insertions(+), 27 deletions(-) diff --git a/src/slave/slave.hpp b/src/slave/slave.hpp index c740bf7..6954f53 100644 --- a/src/slave/slave.hpp +++ b/src/slave/slave.hpp @@ -268,7 +268,8 @@ public: void checkpointResourcesMessage( const std::vector& resources); - void applyOperation(const ApplyOperationMessage& message); + // Made 'virtual' for Slave mocking. + virtual void applyOperation(const ApplyOperationMessage& message); // Reconciles pending operations with the master. This is necessary to handle // cases in which operations were dropped in transit, or in which an agent's diff --git a/src/tests/master_authorization_tests.cpp b/src/tests/master_authorization_tests.cpp index f65b621..21e450c 100644 --- a/src/tests/master_authorization_tests.cpp +++ b/src/tests/master_authorization_tests.cpp @@ -14,7 +14,10 @@ // See the License for the specific language governing permissions and // limitations under the License. 
+#include +#include #include +#include #include #include @@ -29,6 +32,7 @@ #include #include +#include #include #include #include @@ -36,9 +40,16 @@ #include #include +#include #include "authorizer/local/authorizer.hpp" +#include "common/protobuf_utils.hpp" +#include "common/resources_utils.hpp" + +#include "internal/devolve.hpp" +#include "internal/evolve.hpp" + #include "master/master.hpp" #include "master/allocator/mesos/allocator.hpp" @@ -57,6 +68,8 @@ namespace http = process::http; +using google::protobuf::RepeatedPtrField; + using mesos::internal::master::Master; using mesos::internal::master::allocator::MesosAllocatorProcess; @@ -81,11 +94,14 @@ using std::string; using std::vector; using testing::_; +using testing::AllOf; using testing::An; using testing::AtMost; using testing::DoAll; using testing::Eq; +using testing::Invoke; using testing::Return; +using testing::Truly; namespace mesos { namespace internal { @@ -2748,6 +2764,587 @@ TEST_F(MasterAuthorizationTest, AuthorizedToRegisterNoStaticReservations) AWAIT_READY(slaveRegisteredMessage); } + +class MasterOperationAuthorizationTest + : public MesosTest, +public ::testing::WithParamInterface +{ +public: + static Resources createAgentResources(const Resources& resources) + { +Resources agentResources; +foreach ( +Resource resource, +resources - resources.filter(::hasResourceProvider)) { + if (Resources::isPersistentVolume(resource)) { +if (resource.disk().has_source()) { + resource.mutable_disk()->clear_persistence(); + resource.mutable_disk()->clear_volume(); +} else { + resource.clear_disk(); +} + } + + agentResources += resource; +} + +return agentResources; + } + + static vector createOperations( + const v1::FrameworkID& frameworkId, + const v1::AgentID& agentId, + const authorization::Action& action) + { +switch (action) { + case authorization::RUN_TASK: { +const v1::Resources taskResources = + v1::Resources::parse("cpus:1;mem:32").get(); + +v1::ExecutorInfo executorInfo = v1::createExecutorInfo( 
+v1::DEFAULT_EXECUTOR_ID, +None(), +v1::Resources::parse("cpus:1;mem:32;disk:32").get(), +v1::ExecutorInfo::DEFAULT, +frameworkId); + +return {v1::LAUNCH( +{v1::createTask(agentId, taskResources, ""), + v1::createTask(agentId, taskResources, "")}), +v1::LAUNCH_GROUP( +executorInfo, +v1::createTaskGroupInfo( +{v1::createTask(agentId, taskResources, ""), + v1::createTask(agentId, taskResources, "")}))}; + } + case authorization::RESERVE_RESOURCES: { +v1::Operat
[mesos] 01/02: Explicitly marked agent resource provider config calls as experimental.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit c9e3a191e0c58b1d22af5b64a7c839d2cea58251 Author: Chun-Hung Hsiao AuthorDate: Fri May 10 14:14:30 2019 -0700 Explicitly marked agent resource provider config calls as experimental. The resource provider feature has been marked as experimental, so we should also call out the related config API calls are experimental. Review: https://reviews.apache.org/r/70627 --- include/mesos/agent/agent.proto| 9 + include/mesos/v1/agent/agent.proto | 9 + 2 files changed, 18 insertions(+) diff --git a/include/mesos/agent/agent.proto b/include/mesos/agent/agent.proto index ff408a4..316a384 100644 --- a/include/mesos/agent/agent.proto +++ b/include/mesos/agent/agent.proto @@ -319,6 +319,9 @@ message Call { // resource provider of the same type and name exists, but the content is // not identical. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message AddResourceProviderConfig { required ResourceProviderInfo info = 1; } @@ -342,6 +345,9 @@ message Call { // Returns 404 Not Found if no config file describes a resource // provider of the same type and name exists. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message UpdateResourceProviderConfig { required ResourceProviderInfo info = 1; } @@ -360,6 +366,9 @@ message Call { // exists. // Returns 403 Forbidden if the call is not authorized. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. 
message RemoveResourceProviderConfig { required string type = 1; required string name = 2; diff --git a/include/mesos/v1/agent/agent.proto b/include/mesos/v1/agent/agent.proto index 19d6c42..2797d20 100644 --- a/include/mesos/v1/agent/agent.proto +++ b/include/mesos/v1/agent/agent.proto @@ -319,6 +319,9 @@ message Call { // resource provider of the same type and name exists, but the content is // not identical. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message AddResourceProviderConfig { required ResourceProviderInfo info = 1; } @@ -342,6 +345,9 @@ message Call { // Returns 404 Not Found if no config file describes a resource // provider of the same type and name exists. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message UpdateResourceProviderConfig { required ResourceProviderInfo info = 1; } @@ -360,6 +366,9 @@ message Call { // exists. // Returns 403 Forbidden if the call is not authorized. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message RemoveResourceProviderConfig { required string type = 1; required string name = 2;
[mesos] branch 1.8.x updated (f5770dc -> 484782c)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from f5770dc Revert "Made nested contaienr can access its sandbox via `MESOS_SANDBOX`." new c9e3a19 Explicitly marked agent resource provider config calls as experimental. new 484782c Return 409 if `UPDATE_RESOURCE_PROVIDER_CONFIG` names a missing config. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: include/mesos/agent/agent.proto | 19 ++- include/mesos/v1/agent/agent.proto| 19 ++- src/slave/http.cpp| 12 .../agent_resource_provider_config_api_tests.cpp | 4 ++-- 4 files changed, 38 insertions(+), 16 deletions(-)
[mesos] branch master updated: Return 409 if `UPDATE_RESOURCE_PROVIDER_CONFIG` names a missing config.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new ff2f1d5 Return 409 if `UPDATE_RESOURCE_PROVIDER_CONFIG` names a missing config. ff2f1d5 is described below commit ff2f1d5bc9d068352e791245a3d867c2e6518f59 Author: Chun-Hung Hsiao AuthorDate: Fri May 10 15:57:28 2019 -0700 Return 409 if `UPDATE_RESOURCE_PROVIDER_CONFIG` names a missing config. Since 404 is returned if the API endpoint route is not set yet, this error code becomes ambiguous and makes clients hard to programmatically handle it. Therefore, the error code for specifying a missing config in this API call is changed to 409 Conflict. Review: https://reviews.apache.org/r/70628 --- include/mesos/agent/agent.proto| 10 +- include/mesos/v1/agent/agent.proto | 10 +- src/slave/http.cpp | 12 src/tests/agent_resource_provider_config_api_tests.cpp | 4 ++-- 4 files changed, 20 insertions(+), 16 deletions(-) diff --git a/include/mesos/agent/agent.proto b/include/mesos/agent/agent.proto index 316a384..83eb7bb 100644 --- a/include/mesos/agent/agent.proto +++ b/include/mesos/agent/agent.proto @@ -315,9 +315,9 @@ message Call { // exists. // Returns 400 Bad Request if `info` is not well-formed. // Returns 403 Forbidden if the call is not authorized. - // Returns 409 Conflict if another config file that describes a - // resource provider of the same type and name exists, but the content is - // not identical. + // Returns 409 Conflict if another config file that describes a resource + // provider of the same type and name exists, but the content is not + // identical. // Returns 500 Internal Server Error if anything goes wrong. // // NOTE: For the time being, this API is subject to change and the related @@ -342,8 +342,8 @@ message Call { // in the config file. // Returns 400 Bad Request if `info` is not well-formed. 
// Returns 403 Forbidden if the call is not authorized. - // Returns 404 Not Found if no config file describes a resource - // provider of the same type and name exists. + // Returns 409 Conflict if no config file describes a resource provider of the + // same type and name exists. // Returns 500 Internal Server Error if anything goes wrong. // // NOTE: For the time being, this API is subject to change and the related diff --git a/include/mesos/v1/agent/agent.proto b/include/mesos/v1/agent/agent.proto index 2797d20..f6574cb 100644 --- a/include/mesos/v1/agent/agent.proto +++ b/include/mesos/v1/agent/agent.proto @@ -315,9 +315,9 @@ message Call { // exists. // Returns 400 Bad Request if `info` is not well-formed. // Returns 403 Forbidden if the call is not authorized. - // Returns 409 Conflict if another config file that describes a - // resource provider of the same type and name exists, but the content is - // not identical. + // Returns 409 Conflict if another config file that describes a resource + // provider of the same type and name exists, but the content is not + // identical. // Returns 500 Internal Server Error if anything goes wrong. // // NOTE: For the time being, this API is subject to change and the related @@ -342,8 +342,8 @@ message Call { // in the config file. // Returns 400 Bad Request if `info` is not well-formed. // Returns 403 Forbidden if the call is not authorized. - // Returns 404 Not Found if no config file describes a resource - // provider of the same type and name exists. + // Returns 409 Conflict if no config file describes a resource provider of the + // same type and name exists. // Returns 500 Internal Server Error if anything goes wrong. 
// // NOTE: For the time being, this API is subject to change and the related diff --git a/src/slave/http.cpp b/src/slave/http.cpp index 2c4e792..69e6d74 100644 --- a/src/slave/http.cpp +++ b/src/slave/http.cpp @@ -3249,9 +3249,11 @@ Future Http::addResourceProviderConfig( } return slave->localResourceProviderDaemon->add(info) -.then([](bool added) -> Response { +.then([info](bool added) -> Response { if (!added) { -return Conflict(); +return Conflict( +"Resource provider with type '" + info.type() + +"' and name '" + info.name() + "' already exists"); } return OK(); @@ -3294,9 +3296,11 @@ Future Http::updateResourceProviderConfig( } return slave->localResourceProviderDaemon->update(info) -.then([](bool updated) -> Response { +.then([info](bool
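Note on the commit above: per its message, 404 is also what a client sees when the API endpoint route is not set yet, so after this change a missing config is reported as 409 Conflict instead. A client could then tell the two cases apart roughly as sketched below. This is a minimal sketch, not part of Mesos: `UpdateOutcome` and `classifyUpdateResponse` are hypothetical names, and the mapping assumes only the status codes listed in the `agent.proto` comments in this diff.

```cpp
// Hypothetical outcome categories for a client issuing
// `UPDATE_RESOURCE_PROVIDER_CONFIG`; these names are illustrative only.
enum class UpdateOutcome
{
  Updated,           // 200 OK: the config was updated.
  MissingConfig,     // 409 Conflict: no config with this type and name.
  EndpointNotReady,  // 404 Not Found: assumed to mean the route is not set.
  Other              // 400, 403, 500, or anything unexpected.
};

// Maps the HTTP status code of the response to an outcome. Before the change
// above, 404 covered both `MissingConfig` and `EndpointNotReady`.
inline UpdateOutcome classifyUpdateResponse(int statusCode)
{
  switch (statusCode) {
    case 200:
      return UpdateOutcome::Updated;
    case 409:
      return UpdateOutcome::MissingConfig;
    case 404:
      return UpdateOutcome::EndpointNotReady;
    default:
      return UpdateOutcome::Other;
  }
}
```

A caller might retry on `EndpointNotReady` and fall back to `ADD_RESOURCE_PROVIDER_CONFIG` on `MissingConfig`; that policy is a design choice for the client and is not prescribed by the commit.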
[mesos] branch 1.8.x updated: Added MESOS-9779 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/1.8.x by this push: new 755604b Added MESOS-9779 to the 1.8.1 CHANGELOG. 755604b is described below commit 755604bed4dda09df9d79c7fe184c292942f25c1 Author: Chun-Hung Hsiao AuthorDate: Tue May 21 14:25:45 2019 -0700 Added MESOS-9779 to the 1.8.1 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 431df6a..bc24064 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -6,6 +6,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9766] - /__processes__ endpoint can hang. + * [MESOS-9779] - `UPDATE_RESOURCE_PROVIDER_CONFIG` agent call returns 404 ambiguously. * [MESOS-9782] - Random sorter fails to clear removed clients. * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master. * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
[mesos] branch master updated: Added MESOS-9779 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new d945c73 Added MESOS-9779 to the 1.8.1 CHANGELOG. d945c73 is described below commit d945c73b27567f28258030c9576771c18f0de9d3 Author: Chun-Hung Hsiao AuthorDate: Tue May 21 14:25:45 2019 -0700 Added MESOS-9779 to the 1.8.1 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 237790b..e569dc3 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -16,6 +16,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9766] - /__processes__ endpoint can hang. + * [MESOS-9779] - `UPDATE_RESOURCE_PROVIDER_CONFIG` agent call returns 404 ambiguously. * [MESOS-9782] - Random sorter fails to clear removed clients. * [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master. * [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
[mesos] 02/02: Return 409 if `UPDATE_RESOURCE_PROVIDER_CONFIG` names a missing config.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 484782c324ebead46f5f7977ee4f667cdc45f25f Author: Chun-Hung Hsiao AuthorDate: Fri May 10 15:57:28 2019 -0700 Return 409 if `UPDATE_RESOURCE_PROVIDER_CONFIG` names a missing config. Since 404 is returned if the API endpoint route is not set yet, this error code becomes ambiguous and makes clients hard to programmatically handle it. Therefore, the error code for specifying a missing config in this API call is changed to 409 Conflict. Review: https://reviews.apache.org/r/70628 --- include/mesos/agent/agent.proto| 10 +- include/mesos/v1/agent/agent.proto | 10 +- src/slave/http.cpp | 12 src/tests/agent_resource_provider_config_api_tests.cpp | 4 ++-- 4 files changed, 20 insertions(+), 16 deletions(-) diff --git a/include/mesos/agent/agent.proto b/include/mesos/agent/agent.proto index 316a384..83eb7bb 100644 --- a/include/mesos/agent/agent.proto +++ b/include/mesos/agent/agent.proto @@ -315,9 +315,9 @@ message Call { // exists. // Returns 400 Bad Request if `info` is not well-formed. // Returns 403 Forbidden if the call is not authorized. - // Returns 409 Conflict if another config file that describes a - // resource provider of the same type and name exists, but the content is - // not identical. + // Returns 409 Conflict if another config file that describes a resource + // provider of the same type and name exists, but the content is not + // identical. // Returns 500 Internal Server Error if anything goes wrong. // // NOTE: For the time being, this API is subject to change and the related @@ -342,8 +342,8 @@ message Call { // in the config file. // Returns 400 Bad Request if `info` is not well-formed. // Returns 403 Forbidden if the call is not authorized. - // Returns 404 Not Found if no config file describes a resource - // provider of the same type and name exists. 
+ // Returns 409 Conflict if no config file describes a resource provider of the + // same type and name exists. // Returns 500 Internal Server Error if anything goes wrong. // // NOTE: For the time being, this API is subject to change and the related diff --git a/include/mesos/v1/agent/agent.proto b/include/mesos/v1/agent/agent.proto index 2797d20..f6574cb 100644 --- a/include/mesos/v1/agent/agent.proto +++ b/include/mesos/v1/agent/agent.proto @@ -315,9 +315,9 @@ message Call { // exists. // Returns 400 Bad Request if `info` is not well-formed. // Returns 403 Forbidden if the call is not authorized. - // Returns 409 Conflict if another config file that describes a - // resource provider of the same type and name exists, but the content is - // not identical. + // Returns 409 Conflict if another config file that describes a resource + // provider of the same type and name exists, but the content is not + // identical. // Returns 500 Internal Server Error if anything goes wrong. // // NOTE: For the time being, this API is subject to change and the related @@ -342,8 +342,8 @@ message Call { // in the config file. // Returns 400 Bad Request if `info` is not well-formed. // Returns 403 Forbidden if the call is not authorized. - // Returns 404 Not Found if no config file describes a resource - // provider of the same type and name exists. + // Returns 409 Conflict if no config file describes a resource provider of the + // same type and name exists. // Returns 500 Internal Server Error if anything goes wrong. 
// // NOTE: For the time being, this API is subject to change and the related diff --git a/src/slave/http.cpp b/src/slave/http.cpp index 2c4e792..69e6d74 100644 --- a/src/slave/http.cpp +++ b/src/slave/http.cpp @@ -3249,9 +3249,11 @@ Future Http::addResourceProviderConfig( } return slave->localResourceProviderDaemon->add(info) -.then([](bool added) -> Response { +.then([info](bool added) -> Response { if (!added) { -return Conflict(); +return Conflict( +"Resource provider with type '" + info.type() + +"' and name '" + info.name() + "' already exists"); } return OK(); @@ -3294,9 +3296,11 @@ Future Http::updateResourceProviderConfig( } return slave->localResourceProviderDaemon->update(info) -.then([](bool updated) -> Response { +.then([info](bool updated) -> Response { if (!updated) { -return NotFound(); +return Conflict( +"Resource provider with type '"
[mesos] branch master updated: Explicitly marked agent resource provider config calls as experimental.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 6608074 Explicitly marked agent resource provider config calls as experimental. 6608074 is described below commit 660807426805b81938891916b5b1f103bcd0b99a Author: Chun-Hung Hsiao AuthorDate: Fri May 10 14:14:30 2019 -0700 Explicitly marked agent resource provider config calls as experimental. The resource provider feature has been marked as experimental, so we should also call out the related config API calls are experimental. Review: https://reviews.apache.org/r/70627 --- include/mesos/agent/agent.proto| 9 + include/mesos/v1/agent/agent.proto | 9 + 2 files changed, 18 insertions(+) diff --git a/include/mesos/agent/agent.proto b/include/mesos/agent/agent.proto index ff408a4..316a384 100644 --- a/include/mesos/agent/agent.proto +++ b/include/mesos/agent/agent.proto @@ -319,6 +319,9 @@ message Call { // resource provider of the same type and name exists, but the content is // not identical. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message AddResourceProviderConfig { required ResourceProviderInfo info = 1; } @@ -342,6 +345,9 @@ message Call { // Returns 404 Not Found if no config file describes a resource // provider of the same type and name exists. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message UpdateResourceProviderConfig { required ResourceProviderInfo info = 1; } @@ -360,6 +366,9 @@ message Call { // exists. // Returns 403 Forbidden if the call is not authorized. // Returns 500 Internal Server Error if anything goes wrong. 
+ // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message RemoveResourceProviderConfig { required string type = 1; required string name = 2; diff --git a/include/mesos/v1/agent/agent.proto b/include/mesos/v1/agent/agent.proto index 19d6c42..2797d20 100644 --- a/include/mesos/v1/agent/agent.proto +++ b/include/mesos/v1/agent/agent.proto @@ -319,6 +319,9 @@ message Call { // resource provider of the same type and name exists, but the content is // not identical. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message AddResourceProviderConfig { required ResourceProviderInfo info = 1; } @@ -342,6 +345,9 @@ message Call { // Returns 404 Not Found if no config file describes a resource // provider of the same type and name exists. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message UpdateResourceProviderConfig { required ResourceProviderInfo info = 1; } @@ -360,6 +366,9 @@ message Call { // exists. // Returns 403 Forbidden if the call is not authorized. // Returns 500 Internal Server Error if anything goes wrong. + // + // NOTE: For the time being, this API is subject to change and the related + // feature is experimental. message RemoveResourceProviderConfig { required string type = 1; required string name = 2;
[mesos] 01/02: Launched tasks with more memory in SLRP unit tests.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 07862ee764e85d164b9e84a33d08845a55422d2e Author: Chun-Hung Hsiao AuthorDate: Fri May 3 12:04:16 2019 -0700 Launched tasks with more memory in SLRP unit tests. Raised the task memory to 128MB (which is the value used in most persistent volume tests) in all SLRP tests that launch tasks to avoid being OOM-killed. Review: https://reviews.apache.org/r/70596 --- .../storage_local_resource_provider_tests.cpp | 29 +- 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/src/tests/storage_local_resource_provider_tests.cpp b/src/tests/storage_local_resource_provider_tests.cpp index ba55728..487047f 100644 --- a/src/tests/storage_local_resource_provider_tests.cpp +++ b/src/tests/storage_local_resource_provider_tests.cpp @@ -494,6 +494,11 @@ public: return stringify(diskProfileMapping); } + static Resources createTaskResources(const Resources& additional) + { +return Resources::parse("cpus:0.1;mem:128").get() + additional; + } + string metricName(const string& basename) { return "resource_providers/" + stringify(TEST_SLRP_TYPE) + "." 
+ @@ -1792,7 +1797,7 @@ TEST_P(StorageLocalResourceProviderTest, ROOT_AgentRegisteredWithNewId) {CREATE(persistentVolumes), LAUNCH({createTask( offer.slave_id(), - persistentVolumes, + createTaskResources(persistentVolumes), createCommandInfo( "touch " + path::join(containerPaths[0], "file") + " " + path::join(containerPaths[1], "file")))})}); @@ -1966,7 +1971,7 @@ TEST_P(StorageLocalResourceProviderTest, ROOT_AgentRegisteredWithNewId) {CREATE(imported), LAUNCH({createTask( offer.slave_id(), - imported, + createTaskResources(imported), createCommandInfo("test -f " + path::join("volume", "file")))}), DESTROY_DISK(preprovisioned[1])}); @@ -2117,7 +2122,7 @@ TEST_P( {CREATE(persistentVolume), LAUNCH({createTask( offer.slave_id(), - persistentVolume, + createTaskResources(persistentVolume), createCommandInfo("touch " + path::join("volume", "file")))})}); AWAIT_READY(taskFinished); @@ -2290,7 +2295,7 @@ TEST_P( {CREATE(persistentVolume), LAUNCH({createTask( offer.slave_id(), - persistentVolume, + createTaskResources(persistentVolume), createCommandInfo("touch " + path::join("volume", "file")))})}); AWAIT_READY(taskFinished); @@ -2328,7 +2333,7 @@ TEST_P( {offer.id()}, {LAUNCH({createTask( offer.slave_id(), - persistentVolume, + createTaskResources(persistentVolume), createCommandInfo("test -f " + path::join("volume", "file")))})}); AWAIT_READY(taskFinished); @@ -2563,7 +2568,7 @@ TEST_P( {CREATE(persistentVolume), LAUNCH({createTask( offer.slave_id(), - persistentVolume, + createTaskResources(persistentVolume), createCommandInfo("touch " + path::join("volume", "file")))})}); AWAIT_READY(taskFinished); @@ -2640,7 +2645,7 @@ TEST_P( {offer.id()}, {LAUNCH({createTask( offer.slave_id(), - persistentVolume, + createTaskResources(persistentVolume), createCommandInfo("test -f " + path::join("volume", "file")))})}); AWAIT_READY(taskFinished); @@ -2967,7 +2972,7 @@ TEST_P( {CREATE(persistentVolume), LAUNCH({createTask( volumeCreatedOffers->at(0).slave_id(), - persistentVolume, + 
createTaskResources(persistentVolume), createCommandInfo("test -f " + path::join("volume", "file")))})}); AWAIT_READY(taskStarting); @@ -3271,7 +3276,7 @@ TEST_P(StorageLocalResourceProviderTest, DestroyUnpublishedPersistentVolume) {CREATE(persistentVolume), LAUNCH({createTask( offer.slave_id(), - persistentVolume, + createTaskResources(persistentVolume), createCommandInfo("touch " + path::join("volume", "file")))})}); AWAIT_READY(nodePublishVolumeCall); @@ -3435,7 +3440,7 @@ TEST_P( {CREATE(persistentVolume), LAUNCH({createTask( offer.slave_id(), - persistentVolume, + createTaskResources(persistentVolume), createCommandInfo("touch " + path::join("
[mesos] 02/02: Made SLRP test's `metricName` helper static.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 2862d9c61f5d02cff4764a374c28f92129f1df62 Author: Chun-Hung Hsiao AuthorDate: Thu May 9 20:58:00 2019 -0700 Made SLRP test's `metricName` helper static. --- src/tests/storage_local_resource_provider_tests.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/tests/storage_local_resource_provider_tests.cpp b/src/tests/storage_local_resource_provider_tests.cpp index 487047f..609aebc 100644 --- a/src/tests/storage_local_resource_provider_tests.cpp +++ b/src/tests/storage_local_resource_provider_tests.cpp @@ -499,7 +499,7 @@ public: return Resources::parse("cpus:0.1;mem:128").get() + additional; } - string metricName(const string& basename) + static string metricName(const string& basename) { return "resource_providers/" + stringify(TEST_SLRP_TYPE) + "." + stringify(TEST_SLRP_NAME) + "/" + basename;
[mesos] branch master updated (6bbf183 -> 2862d9c)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from 6bbf183 Logged headroom related info in the allocator. new 07862ee Launched tasks with more memory in SLRP unit tests. new 2862d9c Made SLRP test's `metricName` helper static. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../storage_local_resource_provider_tests.cpp | 31 +- 1 file changed, 18 insertions(+), 13 deletions(-)
[mesos] branch master updated: Fixed flakiness in 'RetryRpcWithExponentialBackoff'.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new a87b66a Fixed flakiness in 'RetryRpcWithExponentialBackoff'. a87b66a is described below commit a87b66aeed840abbf06a2a300d85e8098e3d6fd4 Author: Jan Schlicht AuthorDate: Tue May 7 20:40:21 2019 -0700 Fixed flakiness in 'RetryRpcWithExponentialBackoff'. Under some circumstances, offers would be filtered, resulting in the test being stuck while waiting for offers. This has been resolved by settling the clock before accepting new offers. Review: https://reviews.apache.org/r/70184/ --- src/tests/storage_local_resource_provider_tests.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/tests/storage_local_resource_provider_tests.cpp b/src/tests/storage_local_resource_provider_tests.cpp index ecd..ba55728 100644 --- a/src/tests/storage_local_resource_provider_tests.cpp +++ b/src/tests/storage_local_resource_provider_tests.cpp @@ -5861,7 +5861,9 @@ TEST_P(StorageLocalResourceProviderTest, RetryRpcWithExponentialBackoff) AWAIT_READY(updateOperationStatus); EXPECT_EQ(OPERATION_FINISHED, updateOperationStatus->status().state()); - // Advance the clock to trigger a batch allocation. + // Settle the clock to recover the created disk, then advance the clock to + // trigger a batch allocation. + Clock::settle(); Clock::advance(masterFlags.allocation_interval); AWAIT_READY(offers);
[mesos] branch master updated (18bc6c9 -> 7fd2d65)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from 18bc6c9 Added SLRP unit tests for destroying unpublished persistent volumes. new 14c7f7e Renamed variables in `Master::_accept` to improve readability. new 7fd2d65 Removed unnecessary accept filters in SLRP tests. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/master/master.cpp | 98 +++--- .../storage_local_resource_provider_tests.cpp | 32 ++- 2 files changed, 54 insertions(+), 76 deletions(-)
[mesos] 02/02: Removed unnecessary accept filters in SLRP tests.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 7fd2d65c8722bf9a5268ce7462511fa32f357fc5 Author: Chun-Hung Hsiao AuthorDate: Tue Mar 5 16:40:13 2019 -0800 Removed unnecessary accept filters in SLRP tests. Review: https://reviews.apache.org/r/70133 --- .../storage_local_resource_provider_tests.cpp | 32 -- 1 file changed, 5 insertions(+), 27 deletions(-) diff --git a/src/tests/storage_local_resource_provider_tests.cpp b/src/tests/storage_local_resource_provider_tests.cpp index a2d2705..ecd 100644 --- a/src/tests/storage_local_resource_provider_tests.cpp +++ b/src/tests/storage_local_resource_provider_tests.cpp @@ -2134,12 +2134,7 @@ TEST_P( EXPECT_CALL(sched, resourceOffers(, OffersHaveResource(created))) .WillOnce(FutureArg<1>()); - // TODO(chhsiao): We use the following filter so that the resources will not - // be filtered for 5 seconds (the default) because of MESOS-9616. Remove the - // filter once it is resolved. - Filters acceptFilters; - acceptFilters.set_refuse_seconds(0); - driver.acceptOffers({offer.id()}, {DESTROY(persistentVolume)}, acceptFilters); + driver.acceptOffers({offer.id()}, {DESTROY(persistentVolume)}); // NOTE: Since `DESTROY` would be applied by the master synchronously, we // might get an offer before the persistent volume is cleaned up on the agent, @@ -2402,12 +2397,7 @@ TEST_P( EXPECT_CALL(sched, resourceOffers(, OffersHaveResource(created))) .WillOnce(FutureArg<1>()); - // TODO(chhsiao): We use the following filter so that the resources will not - // be filtered for 5 seconds (the default) because of MESOS-9616. Remove the - // filter once it is resolved. 
- Filters acceptFilters; - acceptFilters.set_refuse_seconds(0); - driver.acceptOffers({offer.id()}, {DESTROY(persistentVolume)}, acceptFilters); + driver.acceptOffers({offer.id()}, {DESTROY(persistentVolume)}); // NOTE: Since `DESTROY` would be applied by the master synchronously, we // might get an offer before the persistent volume is cleaned up on the agent, @@ -2724,12 +2714,7 @@ TEST_P( EXPECT_CALL(sched, resourceOffers(, OffersHaveResource(created))) .WillOnce(FutureArg<1>()); - // TODO(chhsiao): We use the following filter so that the resources will not - // be filtered for 5 seconds (the default) because of MESOS-9616. Remove the - // filter once it is resolved. - Filters acceptFilters; - acceptFilters.set_refuse_seconds(0); - driver.acceptOffers({offer.id()}, {DESTROY(persistentVolume)}, acceptFilters); + driver.acceptOffers({offer.id()}, {DESTROY(persistentVolume)}); // NOTE: Since `DESTROY` would be applied by the master synchronously, we // might get an offer before the persistent volume is cleaned up on the agent, @@ -3124,15 +3109,8 @@ TEST_P(StorageLocalResourceProviderTest, CreatePersistentBlockVolume) std::bind(isBlockDisk, lambda::_1, "test" .WillOnce(FutureArg<1>()); - // We use the following filter so that the resources will not be filtered for - // 5 seconds (the default). 
- Filters acceptFilters; - acceptFilters.set_refuse_seconds(0); - driver.acceptOffers( - {offer.id()}, - {CREATE_DISK(raw, Resource::DiskInfo::Source::BLOCK)}, - acceptFilters); + {offer.id()}, {CREATE_DISK(raw, Resource::DiskInfo::Source::BLOCK)}); AWAIT_READY(offers); ASSERT_EQ(1u, offers->size()); @@ -3160,7 +3138,7 @@ TEST_P(StorageLocalResourceProviderTest, CreatePersistentBlockVolume) sched, resourceOffers(, OffersHaveResource(created))) .WillOnce(FutureArg<1>()); - driver.acceptOffers({offer.id()}, {CREATE(persistentVolume)}, acceptFilters); + driver.acceptOffers({offer.id()}, {CREATE(persistentVolume)}); AWAIT_READY(createOperationStatus); EXPECT_EQ(OPERATION_FAILED, createOperationStatus->status().state());
[mesos] 01/02: Renamed variables in `Master::_accept` to improve readability.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 14c7f7e1432d2b0b428ba5fa36f6221fe29f3524 Author: Chun-Hung Hsiao AuthorDate: Mon Apr 22 18:13:17 2019 -0700 Renamed variables in `Master::_accept` to improve readability. Review: https://reviews.apache.org/r/70521 --- src/master/master.cpp | 98 +-- 1 file changed, 49 insertions(+), 49 deletions(-) diff --git a/src/master/master.cpp b/src/master/master.cpp index a8ee629..6c0e30b 100644 --- a/src/master/master.cpp +++ b/src/master/master.cpp @@ -4833,24 +4833,24 @@ void Master::_accept( return; } - // Some operations update the offered resources. We keep - // updated offered resources here. When a task is successfully - // launched, we remove its resource from offered resources. - Resources _offeredResources = offeredResources; + // We maintain the "running remaining" resources here to support pipelining of + // speculative operations (e.g., RESERVE), which would modify the remaining + // resources. Resources consumed by non-speculative operations (e.g., LAUNCH) + // are removed from the remaining resources. + Resources remainingResources = offeredResources; // Converted resources from volume resizes. These converted resources are not - // put into `_offeredResources`, so no other operations can consume them. + // put into `remainingResources`, so no other operations can consume them. // TODO(zhitao): This will be unnecessary once `GROW_VOLUME` and // `SHRINK_VOLUME` become non-speculative. Resources resizedResources; - // We keep track of the shared resources from the offers separately. - // `offeredSharedResources` can be modified by CREATE/DESTROY but we - // don't remove from it when a task is successfully launched so this - // variable always tracks the *total* amount. We do this to support - // validation of tasks involving shared resources. See comments in - // the LAUNCH case below. 
- Resources offeredSharedResources = offeredResources.shared(); + // We keep track of the "running remaining" shared resources from the offers + // separately. `remainingSharedResources` can be modified by CREATE/DESTROY + // but we don't remove from it when a task is successfully launched so this + // variable always tracks the *total* amount. We do this to support validation + // of tasks involving shared resources. See comments in the LAUNCH case below. + Resources remainingSharedResources = offeredResources.shared(); // Maintain a list of resource conversions to pass to the allocator // as a result of operations. Note that: @@ -4927,13 +4927,13 @@ void Master::_accept( continue; } -Try resources = _offeredResources.apply(_conversions.get()); +Try resources = remainingResources.apply(_conversions.get()); if (resources.isError()) { drop(framework, operation, resources.error()); continue; } -_offeredResources = resources.get(); +remainingResources = resources.get(); LOG(INFO) << "Applying RESERVE operation for resources " << operation.reserve().resources() << " from framework " @@ -4994,13 +4994,13 @@ void Master::_accept( continue; } -Try resources = _offeredResources.apply(_conversions.get()); +Try resources = remainingResources.apply(_conversions.get()); if (resources.isError()) { drop(framework, operation, resources.error()); continue; } -_offeredResources = resources.get(); +remainingResources = resources.get(); LOG(INFO) << "Applying UNRESERVE operation for resources " << operation.unreserve().resources() << " from framework " @@ -5071,14 +5071,14 @@ void Master::_accept( continue; } -Try resources = _offeredResources.apply(_conversions.get()); +Try resources = remainingResources.apply(_conversions.get()); if (resources.isError()) { drop(framework, operation, resources.error()); continue; } -_offeredResources = resources.get(); -offeredSharedResources = _offeredResources.shared(); +remainingResources = resources.get(); +remainingSharedResources = 
remainingResources.shared(); LOG(INFO) << "Applying CREATE operation for volumes " << operation.create().volumes() << " from framework " @@ -5165,14 +5165,14 @@ void Master::_accept( continue; } -Try resources = _offeredResources.apply(_conversions.get()); +Try resources = remainingResources.apply(
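The "running remaining" bookkeeping described in the renamed comments can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `ToyResources` is a hypothetical name-to-amount map standing in for Mesos' `Resources`, and `applyConversion` is a hypothetical helper, not the `Resources::apply` used in `Master::_accept`. A speculative operation such as RESERVE replaces some remaining resources with their converted form, while a non-speculative operation such as LAUNCH only removes what it consumes.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for Mesos' `Resources`: resource name -> amount.
using ToyResources = std::map<std::string, double>;

// Returns true if `pool` holds at least the amounts listed in `wanted`.
bool contains(const ToyResources& pool, const ToyResources& wanted)
{
  for (const auto& entry : wanted) {
    auto it = pool.find(entry.first);
    if (it == pool.end() || it->second < entry.second) {
      return false;
    }
  }
  return true;
}

// Replaces `consumed` with `converted` in `pool`. Returns `std::nullopt` if
// `pool` does not contain `consumed`, in which case the operation would be
// dropped. A LAUNCH-like operation passes an empty `converted`.
std::optional<ToyResources> applyConversion(
    ToyResources pool,
    const ToyResources& consumed,
    const ToyResources& converted)
{
  if (!contains(pool, consumed)) {
    return std::nullopt;
  }

  for (const auto& entry : consumed) {
    pool[entry.first] -= entry.second;
    if (pool[entry.first] <= 0.0) {
      pool.erase(entry.first); // Drop entries that have been used up.
    }
  }

  for (const auto& entry : converted) {
    pool[entry.first] += entry.second;
  }

  return pool;
}
```

Under these assumptions, starting from `cpus:10`, a RESERVE of one CPU leaves `cpus:9;cpus(role):1` as the remaining resources, and a subsequent LAUNCH that consumes `cpus(role):1` leaves `cpus:9`.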
[mesos] 02/03: Added a SLRP unit test for persistent block volume creation.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 75d718f3141d073593c4608bbe8d8027e1e1123a Author: Chun-Hung Hsiao AuthorDate: Wed Feb 6 12:29:37 2019 -0800 Added a SLRP unit test for persistent block volume creation. Test `CreateDestroyPersistentBlockVolume` verifies that SLRP would fail a `CREATE` operation on a BLOCK disk resource, and a followup `DESTROY` will be dropped (instead of failing the SLRP). Review: https://reviews.apache.org/r/69954 --- src/tests/mock_csi_plugin.cpp | 30 +++- .../storage_local_resource_provider_tests.cpp | 162 + 2 files changed, 190 insertions(+), 2 deletions(-) diff --git a/src/tests/mock_csi_plugin.cpp b/src/tests/mock_csi_plugin.cpp index dacdc15..82dae64 100644 --- a/src/tests/mock_csi_plugin.cpp +++ b/src/tests/mock_csi_plugin.cpp @@ -16,6 +16,8 @@ #include "tests/mock_csi_plugin.hpp" +#include + #include using std::string; @@ -61,8 +63,20 @@ MockCSIPlugin::MockCSIPlugin() EXPECT_CALL(*this, Probe(_, _, A())) .WillRepeatedly(Return(Status::OK)); + // Return a success by default for testing with the test CSI plugin in + // forwarding mode. EXPECT_CALL(*this, CreateVolume(_, _, A())) -.WillRepeatedly(Return(Status::OK)); +.WillRepeatedly(Invoke([]( +ServerContext* context, +const csi::v0::CreateVolumeRequest* request, +csi::v0::CreateVolumeResponse* response) { + response->mutable_volume()->set_capacity_bytes(std::max( + request->capacity_range().required_bytes(), + request->capacity_range().limit_bytes())); + response->mutable_volume()->set_id(request->name()); + + return Status::OK; +})); EXPECT_CALL(*this, DeleteVolume(_, _, A())) .WillRepeatedly(Return(Status::OK)); @@ -169,8 +183,20 @@ MockCSIPlugin::MockCSIPlugin() EXPECT_CALL(*this, Probe(_, _, A())) .WillRepeatedly(Return(Status::OK)); + // Return a success by default for testing with the test CSI plugin in + // forwarding mode. 
EXPECT_CALL(*this, CreateVolume(_, _, A())) -.WillRepeatedly(Return(Status::OK)); +.WillRepeatedly(Invoke([]( +ServerContext* context, +const csi::v1::CreateVolumeRequest* request, +csi::v1::CreateVolumeResponse* response) { + response->mutable_volume()->set_capacity_bytes(std::max( + request->capacity_range().required_bytes(), + request->capacity_range().limit_bytes())); + response->mutable_volume()->set_volume_id(request->name()); + + return Status::OK; +})); EXPECT_CALL(*this, DeleteVolume(_, _, A())) .WillRepeatedly(Return(Status::OK)); diff --git a/src/tests/storage_local_resource_provider_tests.cpp b/src/tests/storage_local_resource_provider_tests.cpp index 09e7ca0..efc03c2 100644 --- a/src/tests/storage_local_resource_provider_tests.cpp +++ b/src/tests/storage_local_resource_provider_tests.cpp @@ -501,6 +501,23 @@ static bool isMountDisk(const Resource& r, const string& profile) } +// Tests whether a resource is a BLOCK disk of a given profile but not a +// persistent volume. A BLOCK disk has both profile and source ID set. +template +static bool isBlockDisk(const Resource& r, const string& profile) +{ + return r.has_disk() && +r.disk().has_source() && +r.disk().source().type() == Resource::DiskInfo::Source::BLOCK && +r.disk().source().has_vendor() && +r.disk().source().vendor() == TEST_CSI_VENDOR && +r.disk().source().has_id() && +r.disk().source().has_profile() && +r.disk().source().profile() == profile && +!r.disk().has_persistence(); +} + + // Tests whether a resource is a preprovisioned volume. A preprovisioned volume // is a RAW disk resource with a source ID but no profile. template @@ -2968,6 +2985,151 @@ TEST_P( } +// This test verifies that the storage local resource provider would fail to +// create a persistent volume on a BLOCK disk resource. +// +// TODO(chhsiao): Update this test once persistent BLOCK volumes are supported. 
+TEST_P(StorageLocalResourceProviderTest, CreatePersistentBlockVolume) +{ + const string profilesPath = path::join(sandbox.get(), "profiles.json"); + Try blockDiskProfileMapping = strings::format( + R"~( + { +"profile_matrix": { + "test": { +"csi_plugin_type_selector": { + "plugin_type": "%s" +}, +"volume_capabilities": { + "block": {}, + "access_mode": { +"mode": "SINGLE_NODE_WRITER" + } +} + }
[mesos] 01/03: Made the `RetryRpcWithExponentialBackoff` SLRP test work with CSI v1.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 5eff8b208e23b0ce9d064a7acea92a984f9b7c64 Author: Chun-Hung Hsiao AuthorDate: Mon Apr 8 21:22:36 2019 -0700 Made the `RetryRpcWithExponentialBackoff` SLRP test work with CSI v1. This patch enables the unit test to test against CSI v1 through the following changes: * The forwarding mode of the test CSI plugin now respects the `--api_version` option. When specified, only requests of the proper CSI version will be forwarded. * The expectations of `CreateVolume` and `DeleteVolume` calls in the unit tests are parameterized against the CSI version string. * The mock CSI plugin now provides a default implementation for the `GetCapacity` call so the unit test can be simplified. Review: https://reviews.apache.org/r/70431 --- src/examples/test_csi_plugin.cpp | 53 ++- src/tests/mock_csi_plugin.cpp | 24 +- .../storage_local_resource_provider_tests.cpp | 396 +++-- 3 files changed, 259 insertions(+), 214 deletions(-) diff --git a/src/examples/test_csi_plugin.cpp b/src/examples/test_csi_plugin.cpp index b54d666..03f782e 100644 --- a/src/examples/test_csi_plugin.cpp +++ b/src/examples/test_csi_plugin.cpp @@ -1716,8 +1716,12 @@ Try TestCSIPlugin::nodeUnpublishVolume( class CSIProxy { public: - CSIProxy(const string& _endpoint, const string& forward) -: endpoint(_endpoint), + CSIProxy( + const Option& _apiVersion, + const string& _endpoint, + const string& forward) +: apiVersion(_apiVersion), + endpoint(_endpoint), stub(grpc::CreateChannel(forward, grpc::InsecureChannelCredentials())), service(new AsyncGenericService()) {} @@ -1747,6 +1751,7 @@ private: void serve(ServerCompletionQueue* completionQueue); + const Option apiVersion; const string endpoint; GenericStub stub; @@ -1779,13 +1784,13 @@ void CSIProxy::run() // The lifecycle of a forwarded CSI call is shown as follows. 
The transitions // happen after the completions of the API calls. // -// Server-side -//+-+ +-+ WriteAndFinish +---+ -//| INITIALIZED | | FINISHING +> X | -//+--+--+ +--^--++---+ -// Server-side | | Client-side -// RequestCall |Server-side| Finish (unary call) -//+--v--+Read +--+--+ +//Unsupported Server-side +//+-+ API version +-+ WriteAndFinish +---+ +//| INITIALIZED | +-> FINISHING +> X | +//+--+--+ | +--^--++---+ +// Server-side | +--+| Client-side +// RequestCall | |Server-side| Finish (unary call) +//+--v---+--+Read +--+--+ //| REQUESTED +-> FORWARDING | //+-+ +-+ // @@ -1816,7 +1821,7 @@ void CSIProxy::serve(ServerCompletionQueue* completionQueue) if (!ok) { // Server-side `RequestCall`: the server has been shutdown so continue // to drain the queue. - continue; + break; } call->state = Call::State::REQUESTED; @@ -1841,13 +1846,29 @@ void CSIProxy::serve(ServerCompletionQueue* completionQueue) case Call::State::REQUESTED: { if (!ok) { // Server-side `Read`: the client has done a `WritesDone` already, so - // clean up the call and move on to the next one. + // clean up the call and move to the next iteration immediately. delete call; continue; } -LOG(INFO) << "Forwarding " << call->serverContext.method() << " call"; +// The expected method names are of the following form: +// /csi../ +// See: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#requests // NOLINT +if (apiVersion.isSome() && +!strings::startsWith( +call->serverContext.method(), +"/csi." + apiVersion.get() + ".")) { + // The proxy does not support the API version of the call so respond + // with `UNIMPLEMENTED`. + call->state = Call::State::FINISHING; + call->status = Status(grpc::UNIMPLEMENTED, ""); + call->serverReaderWriter.WriteAndFinish( + call->response, WriteOptions(), call->status, call); +
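The API version check added to the proxy's forwarding mode can be sketched in isolation. This is a simplified illustration rather than the code from `test_csi_plugin.cpp`: `shouldForward` is a hypothetical function name, and `std::optional` stands in for the `Option` type used in the patch. It assumes gRPC method names start with `/csi.`, followed by the API version and a dot, as the commit describes.

```cpp
#include <optional>
#include <string>

// Returns true if a call with the given gRPC method name should be forwarded.
// When no API version is configured, every call is forwarded. Otherwise only
// methods whose name starts with "/csi.<apiVersion>." are forwarded, and the
// proxy would answer the rest with `UNIMPLEMENTED`.
bool shouldForward(
    const std::optional<std::string>& apiVersion,
    const std::string& method)
{
  if (!apiVersion.has_value()) {
    return true;
  }

  const std::string prefix = "/csi." + *apiVersion + ".";

  // `compare` on the leading `prefix.size()` characters acts as a
  // "starts with" check and is safe when `method` is shorter than `prefix`.
  return method.compare(0, prefix.size(), prefix) == 0;
}
```

With the version set to `v1`, a method such as `/csi.v1.Controller/CreateVolume` would be forwarded, while `/csi.v0.Controller/CreateVolume` would not.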
[mesos] branch master updated (6bc1c80 -> 18bc6c9)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from 6bc1c80 Fixed the flaky ExamplesTest.DynamicReservationFramework. new 5eff8b2 Made the `RetryRpcWithExponentialBackoff` SLRP test work with CSI v1. new 75d718f Added a SLRP unit test for persistent block volume creation. new 18bc6c9 Added SLRP unit tests for destroying unpublished persistent volumes. The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/examples/test_csi_plugin.cpp | 53 +- src/tests/mock_csi_plugin.cpp | 54 +- .../storage_local_resource_provider_tests.cpp | 1383 3 files changed, 1195 insertions(+), 295 deletions(-)
[mesos] 03/03: Added SLRP unit tests for destroying unpublished persistent volumes.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 18bc6c95a67e2ac2dd4d5557608d75b7fb01d383 Author: Chun-Hung Hsiao AuthorDate: Mon Feb 11 20:52:26 2019 -0800 Added SLRP unit tests for destroying unpublished persistent volumes. This patch adds 3 unit tests: `DestroyUnpublishedPersistentVolume`, `DestroyUnpublishedPersistentVolumeWithRecovery`, and `DestroyUnpublishedPersistentVolumeWithReboot` to test that the SLRP is resilient to misbehaved CSI plugins that fail to publish volumes. Review: https://reviews.apache.org/r/69955 --- .../storage_local_resource_provider_tests.cpp | 667 + 1 file changed, 667 insertions(+) diff --git a/src/tests/storage_local_resource_provider_tests.cpp b/src/tests/storage_local_resource_provider_tests.cpp index efc03c2..a2d2705 100644 --- a/src/tests/storage_local_resource_provider_tests.cpp +++ b/src/tests/storage_local_resource_provider_tests.cpp @@ -416,6 +416,46 @@ public: UNREACHABLE(); } + // Set up an expected `NodePublishVolume` CSI call for a given mock CSI + // plugin. When the call is made to the mock plugin, `result` will be + // responded. When the response is received by the volume manager, the + // returned future will be satisfied. + Future futureNodePublishVolumeCall( + MockCSIPlugin* plugin, const Try& result) + { +if (GetParam() == csi::v0::API_VERSION) { + EXPECT_CALL(*plugin, NodePublishVolume( + _, _, A())) +.WillOnce(Invoke([result]( +grpc::ServerContext* context, +const csi::v0::NodePublishVolumeRequest* request, +csi::v0::NodePublishVolumeResponse* response) { + return result.isError() ? 
result.error().status : grpc::Status::OK; +})); + + return FUTURE_DISPATCH(_, ::v0::VolumeManagerProcess::__call< + csi::v0::NodePublishVolumeResponse>); +} else if (GetParam() == csi::v1::API_VERSION) { + EXPECT_CALL(*plugin, NodePublishVolume( + _, _, A())) +.WillOnce(Invoke([result]( +grpc::ServerContext* context, +const csi::v1::NodePublishVolumeRequest* request, +csi::v1::NodePublishVolumeResponse* response) { + return result.isError() ? result.error().status : grpc::Status::OK; +})); + + return FUTURE_DISPATCH(_, ::v1::VolumeManagerProcess::__call< + csi::v1::NodePublishVolumeResponse>); +} + +// This extra closure is necessary in order to use `FAIL` as it requires a +// void return type. +[&] { FAIL() << "Unsupported CSI API version " << GetParam(); }(); + +UNREACHABLE(); + } + // Create a JSON string representing a disk profile mapping containing the // given profile-parameter pairs. static string createDiskProfileMapping( @@ -3130,6 +3170,633 @@ TEST_P(StorageLocalResourceProviderTest, CreatePersistentBlockVolume) } +// This test verifies that if a persistent volumes is never published by the +// storage local resource provider, the volume can be destroyed. +// +// To accomplish this: +// 1. Create a MOUNT disk from a RAW disk resource. +// 2. Create a persistent volume on the MOUNT disk then launches a task to +// write a file into it. +// 3. Return `UNIMPLEMENTED` for the `NodePublishVolume` call. The task will +// fail to launch. +// 4. Destroy the persistent volume and the MOUNT disk. 
+TEST_P(StorageLocalResourceProviderTest, DestroyUnpublishedPersistentVolume) +{ + const string profilesPath = path::join(sandbox.get(), "profiles.json"); + + ASSERT_SOME( + os::write(profilesPath, createDiskProfileMapping({{"test", None()}}))); + + loadUriDiskProfileAdaptorModule(profilesPath); + + const string mockCsiEndpoint = +"unix://" + path::join(sandbox.get(), "mock_csi.sock"); + + MockCSIPlugin plugin; + ASSERT_SOME(plugin.startup(mockCsiEndpoint)); + + setupResourceProviderConfig(Bytes(0), None(), None(), mockCsiEndpoint); + + Try> master = StartMaster(); + ASSERT_SOME(master); + + Owned detector = master.get()->createDetector(); + + slave::Flags slaveFlags = CreateSlaveFlags(); + slaveFlags.disk_profile_adaptor = URI_DISK_PROFILE_ADAPTOR_NAME; + + Future slaveRegisteredMessage = +FUTURE_PROTOBUF(SlaveRegisteredMessage(), _, _); + + Try> slave = StartSlave(detector.get(), slaveFlags); + ASSERT_SOME(slave); + + AWAIT_READY(slaveRegisteredMessage); + + // Register a framework to exercise operations. + FrameworkInfo framework = DEFAULT_FRAMEWORK_INFO; + framework.set_roles(0, "storage"); + + MockScheduler sched; + MesosSchedulerDriver driver( + , framework, master.get()->pid, DEFAULT_CREDENTIAL); + + EXPECT_CALL(sched, registered(, _, _)); + + // We use the following filter to fil
[mesos] 02/02: Added MESOS-9616 to the 1.5.4 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 13b2888ead47cc7efbdd20c267089e673546a380 Author: Chun-Hung Hsiao AuthorDate: Wed May 1 13:41:32 2019 -0700 Added MESOS-9616 to the 1.5.4 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index fd85213..c38fa85 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -4,6 +4,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP) ** Bug * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. + * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9707] - Calling link::lo() may cause runtime error
[mesos] 01/02: Do not implicitly decline speculatively converted resources.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 6c0d92b5826205481be133e8f054f184b2cbb4cc
Author: Chun-Hung Hsiao
AuthorDate: Mon Apr 22 15:46:36 2019 -0700

    Do not implicitly decline speculatively converted resources.

    Currently if a framework accepts an offer with a `RESERVE` operation
    without a task consuming the reserved resources, the resources will be
    implicitly declined. This is counter to what one would expect (that only
    the remaining resources in the offer will be declined):

      Offer `cpus:10` -> `ACCEPT` with `RESERVE cpus(role):1`
      *Actual* implicit decline: `cpus:9;cpus(role):1`
      *Expected* implicit decline: `cpus:9`

    The same issue is present with other transformational operations
    (i.e., `UNRESERVE`, `CREATE` and `DESTROY`).

    This patch fixes this issue by only implicitly declining the "remaining"
    untransformed resources, computed as follows:

      Offered = `cpus:10`
      Remaining = `cpus:9;cpus(role):1`
      Implicitly declined = remaining - (remaining - offered)
                          = `cpus:9;cpus(role):1` - `cpus(role):1`
                          = `cpus:9`

    Review: https://reviews.apache.org/r/70132
---
 src/master/master.cpp | 37 +++--
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index 995ff55..42f88b6 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -5456,13 +5456,38 @@ void Master::_accept(
         conversions);
   }
 
-  if (!_offeredResources.empty()) {
-    // Tell the allocator about the unused (e.g., refused) resources.
+  // We now need to compute the amounts of remaining (1) speculatively converted
+  // resources to recover without a filter and (2) resources that are implicitly
+  // declined with the filter:
+  //
+  // Speculatively converted resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources
+  //
+  // (The last equality holds because resource subtraction yields no negatives.)
+  //
+  // Implicitly declined resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - speculatively converted resources
+  //   = `_offeredResources` - speculatively converted resources
+  Resources speculativelyConverted = _offeredResources - offeredResources;
+  Resources implicitlyDeclined = _offeredResources - speculativelyConverted;
+
+  // Tell the allocator about the net speculatively converted resources. These
+  // resources should not be implicitly declined.
+  if (!speculativelyConverted.empty()) {
     allocator->recoverResources(
-        frameworkId,
-        slaveId,
-        _offeredResources,
-        accept.filters());
+        frameworkId, slaveId, speculativelyConverted, None());
+  }
+
+  // Tell the allocator about the implicitly declined resources.
+  if (!implicitlyDeclined.empty()) {
+    allocator->recoverResources(
+        frameworkId, slaveId, implicitlyDeclined, accept.filters());
   }
 }
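The arithmetic in the patch above can be checked with a small standalone sketch. Everything here is hypothetical and for illustration only: `ToyResources`, `subtract`, and `split` are not Mesos APIs, and the toy model treats each resource as a named scalar, whereas the real `Resources` class also handles roles, ranges, and sets. The one property the sketch relies on is the one the patch comment states: subtraction never yields negative amounts.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for Mesos `Resources` (an assumption, not the real class):
// resource name -> scalar amount.
using ToyResources = std::map<std::string, double>;

// Subtraction that drops anything that would be zero or negative, mirroring
// "resource subtraction yields no negatives" from the patch comment.
ToyResources subtract(const ToyResources& left, const ToyResources& right)
{
  ToyResources result;
  for (const auto& entry : left) {
    auto it = right.find(entry.first);
    double amount = entry.second - (it == right.end() ? 0.0 : it->second);
    if (amount > 0.0) {
      result[entry.first] = amount;
    }
  }
  return result;
}

struct Recovered
{
  ToyResources speculativelyConverted; // Recovered without a filter.
  ToyResources implicitlyDeclined;     // Recovered with the accept filter.
};

// `offered` plays the role of `offeredResources`, and `remaining` the role of
// `_offeredResources` after the operations have been applied.
Recovered split(const ToyResources& offered, const ToyResources& remaining)
{
  Recovered recovered;
  recovered.speculativelyConverted = subtract(remaining, offered);
  recovered.implicitlyDeclined =
    subtract(remaining, recovered.speculativelyConverted);
  return recovered;
}
```

With the commit message's example (offer `cpus:10`, `RESERVE cpus(role):1`, no task launched), `split` should yield `cpus(role):1` as speculatively converted and `cpus:9` as implicitly declined, so only the untouched part of the offer is subject to `refuse_seconds`.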
[mesos] branch master updated: Added MESOS-9616 to the 1.5.4 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 3a0c2fa Added MESOS-9616 to the 1.5.4 CHANGELOG. 3a0c2fa is described below commit 3a0c2fa2ae338eeab292e7c0d3dae55b66f5886d Author: Chun-Hung Hsiao AuthorDate: Wed May 1 13:41:32 2019 -0700 Added MESOS-9616 to the 1.5.4 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index f2bf363..2ee079b 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1352,6 +1352,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP) ** Bug * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. + * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9707] - Calling link::lo() may cause runtime error
[mesos] 01/02: Do not implicitly decline speculatively converted resources.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 899fac19fd0082ce96a0c15bf00ac9d9d0453932
Author: Chun-Hung Hsiao
AuthorDate: Mon Apr 22 15:46:36 2019 -0700

    Do not implicitly decline speculatively converted resources.

    Currently if a framework accepts an offer with a `RESERVE` operation
    without a task consuming the reserved resources, the resources will be
    implicitly declined. This is counter to what one would expect (that only
    the remaining resources in the offer will be declined):

      Offer `cpus:10` -> `ACCEPT` with `RESERVE cpus(role):1`
      *Actual* implicit decline: `cpus:9;cpus(role):1`
      *Expected* implicit decline: `cpus:9`

    The same issue is present with other transformational operations
    (i.e., `UNRESERVE`, `CREATE` and `DESTROY`).

    This patch fixes this issue by only implicitly declining the "remaining"
    untransformed resources, computed as follows:

      Offered = `cpus:10`
      Remaining = `cpus:9;cpus(role):1`
      Implicitly declined = remaining - (remaining - offered)
                          = `cpus:9;cpus(role):1` - `cpus(role):1`
                          = `cpus:9`

    Review: https://reviews.apache.org/r/70132
---
 src/master/master.cpp | 44
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index 28a1593..66e8e92 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -5785,16 +5785,44 @@ void Master::_accept(
         conversions);
   }
+  // We now need to compute the amounts of remaining (1) speculatively converted
+  // resources to recover without a filter and (2) resources that are implicitly
+  // declined with the filter:
+  //
+  // Speculatively converted resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources
+  //
+  // (The last equality holds because resource subtraction yields no negatives.)
+  //
+  // Implicitly declined resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - speculatively converted resources
+  //   = `_offeredResources` - speculatively converted resources
+  //
+  // TODO(zhitao): Right now `GROW_VOLUME` and `SHRINK_VOLUME` are implemented
+  // as speculative operations. Since the plan is to make them non-speculative
+  // in the future, their results are not in `_offeredResources`, so we add them
+  // back here. Remove this once the operations become non-speculative.
+  Resources speculativelyConverted =
+    _offeredResources + resizedResources - offeredResources;
+  Resources implicitlyDeclined = _offeredResources - speculativelyConverted;
+
+  // Tell the allocator about the net speculatively converted resources. These
+  // resources should not be implicitly declined.
+  if (!speculativelyConverted.empty()) {
+    allocator->recoverResources(
+        frameworkId, slaveId, speculativelyConverted, None());
+  }
 
-  // TODO(zhitao): Remove `resizedResources` once `GROW_VOLUME` and
-  // `SHRINK_VOLUME` become non-speculative.
-  if (!_offeredResources.empty() || !resizedResources.empty()) {
-    // Tell the allocator about the unused (e.g., refused) resources.
+  // Tell the allocator about the implicitly declined resources.
+  if (!implicitlyDeclined.empty()) {
     allocator->recoverResources(
-        frameworkId,
-        slaveId,
-        _offeredResources + resizedResources,
-        accept.filters());
+        frameworkId, slaveId, implicitlyDeclined, accept.filters());
   }
 }
[mesos] branch 1.6.x updated (13fdaa4 -> d8e3909)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 13fdaa4 Added MESOS-9695 to the 1.6.3 CHANGELOG. new 899fac1 Do not implicitly decline speculatively converted resources. new d8e3909 Added MESOS-9616 to the 1.6.3 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/master/master.cpp | 44 2 files changed, 37 insertions(+), 8 deletions(-)
[mesos] 02/02: Added MESOS-9616 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit d8e39097a92ac304ab241d0c51e9a65ff3fdba0e Author: Chun-Hung Hsiao AuthorDate: Wed May 1 13:40:52 2019 -0700 Added MESOS-9616 to the 1.6.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 55b74d1..3f84ffb 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -7,6 +7,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP) * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`. * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace. + * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9692] - Quota may be under allocated for disk resources. * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
[mesos] branch master updated: Added MESOS-9616 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 4fff6dc Added MESOS-9616 to the 1.6.3 CHANGELOG. 4fff6dc is described below commit 4fff6dc1e1ad858ec872d7694e529ab85c514ba5 Author: Chun-Hung Hsiao AuthorDate: Wed May 1 13:40:52 2019 -0700 Added MESOS-9616 to the 1.6.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 366fe81..f2bf363 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -886,6 +886,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP) * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`. * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace. + * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9692] - Quota may be under allocated for disk resources. * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
[mesos] 02/02: Added MESOS-9616 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit b82ad53dfcd222619ce1ab53caa81201facd59ec Author: Chun-Hung Hsiao AuthorDate: Wed May 1 13:40:30 2019 -0700 Added MESOS-9616 to the 1.7.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 369a2c8..5767b1f 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -12,6 +12,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP) * [MESOS-9568] - SLRP does not clean up mount directories for destroyed MOUNT disks. * [MESOS-9607] - Removing a resource provider with consumers breaks resource publishing. * [MESOS-9610] - Fetcher vulnerability - escaping from sandbox. + * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations. * [MESOS-9692] - Quota may be under allocated for disk resources.
[mesos] 01/02: Do not implicitly decline speculatively converted resources.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f05ac51eafd35d3ec0fb8a6492d0a0bb2622e375
Author: Chun-Hung Hsiao
AuthorDate: Mon Apr 22 15:46:36 2019 -0700

    Do not implicitly decline speculatively converted resources.

    Currently if a framework accepts an offer with a `RESERVE` operation
    without a task consuming the reserved resources, the resources will be
    implicitly declined. This is counter to what one would expect (that only
    the remaining resources in the offer will be declined):

      Offer `cpus:10` -> `ACCEPT` with `RESERVE cpus(role):1`
      *Actual* implicit decline: `cpus:9;cpus(role):1`
      *Expected* implicit decline: `cpus:9`

    The same issue is present with other transformational operations
    (i.e., `UNRESERVE`, `CREATE` and `DESTROY`).

    This patch fixes this issue by only implicitly declining the "remaining"
    untransformed resources, computed as follows:

      Offered = `cpus:10`
      Remaining = `cpus:9;cpus(role):1`
      Implicitly declined = remaining - (remaining - offered)
                          = `cpus:9;cpus(role):1` - `cpus(role):1`
                          = `cpus:9`

    Review: https://reviews.apache.org/r/70132
---
 src/master/master.cpp | 44
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index ed072d3..479d56c 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -5866,16 +5866,44 @@ void Master::_accept(
         conversions);
   }
+  // We now need to compute the amounts of remaining (1) speculatively converted
+  // resources to recover without a filter and (2) resources that are implicitly
+  // declined with the filter:
+  //
+  // Speculatively converted resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources
+  //
+  // (The last equality holds because resource subtraction yields no negatives.)
+  //
+  // Implicitly declined resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - speculatively converted resources
+  //   = `_offeredResources` - speculatively converted resources
+  //
+  // TODO(zhitao): Right now `GROW_VOLUME` and `SHRINK_VOLUME` are implemented
+  // as speculative operations. Since the plan is to make them non-speculative
+  // in the future, their results are not in `_offeredResources`, so we add them
+  // back here. Remove this once the operations become non-speculative.
+  Resources speculativelyConverted =
+    _offeredResources + resizedResources - offeredResources;
+  Resources implicitlyDeclined = _offeredResources - speculativelyConverted;
+
+  // Tell the allocator about the net speculatively converted resources. These
+  // resources should not be implicitly declined.
+  if (!speculativelyConverted.empty()) {
+    allocator->recoverResources(
+        frameworkId, slaveId, speculativelyConverted, None());
+  }
 
-  // TODO(zhitao): Remove `resizedResources` once `GROW_VOLUME` and
-  // `SHRINK_VOLUME` become non-speculative.
-  if (!_offeredResources.empty() || !resizedResources.empty()) {
-    // Tell the allocator about the unused (e.g., refused) resources.
+  // Tell the allocator about the implicitly declined resources.
+  if (!implicitlyDeclined.empty()) {
     allocator->recoverResources(
-        frameworkId,
-        slaveId,
-        _offeredResources + resizedResources,
-        accept.filters());
+        frameworkId, slaveId, implicitlyDeclined, accept.filters());
   }
 }
[mesos] branch 1.7.x updated (80c9fd7 -> b82ad53)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 80c9fd7 Added MESOS-9695 to the 1.7.3 CHANGELOG. new f05ac51 Do not implicitly decline speculatively converted resources. new b82ad53 Added MESOS-9616 to the 1.7.3 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/master/master.cpp | 44 2 files changed, 37 insertions(+), 8 deletions(-)
[mesos] 01/02: Do not implicitly decline speculatively converted resources.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 7e43f1fcfab8983f102ba79419a81f62fba677d8
Author: Chun-Hung Hsiao
AuthorDate: Mon Apr 22 15:46:36 2019 -0700

    Do not implicitly decline speculatively converted resources.

    Currently if a framework accepts an offer with a `RESERVE` operation
    without a task consuming the reserved resources, the resources will be
    implicitly declined. This is counter to what one would expect (that only
    the remaining resources in the offer will be declined):

      Offer `cpus:10` -> `ACCEPT` with `RESERVE cpus(role):1`
      *Actual* implicit decline: `cpus:9;cpus(role):1`
      *Expected* implicit decline: `cpus:9`

    The same issue is present with other transformational operations
    (i.e., `UNRESERVE`, `CREATE` and `DESTROY`).

    This patch fixes this issue by only implicitly declining the "remaining"
    untransformed resources, computed as follows:

      Offered = `cpus:10`
      Remaining = `cpus:9;cpus(role):1`
      Implicitly declined = remaining - (remaining - offered)
                          = `cpus:9;cpus(role):1` - `cpus(role):1`
                          = `cpus:9`

    Review: https://reviews.apache.org/r/70132
---
 src/master/master.cpp     | 49 +++
 src/tests/slave_tests.cpp |  3 ++-
 2 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index ad54ae2..555136e 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -5959,17 +5959,50 @@ void Master::_accept(
         conversions);
   }
+  // We now need to compute the amounts of remaining (1) speculatively converted
+  // resources to recover without a filter and (2) resources that are implicitly
+  // declined with the filter:
+  //
+  // Speculatively converted resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources
+  //
+  // (The last equality holds because resource subtraction yields no negatives.)
+  //
+  // Implicitly declined resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - speculatively converted resources
+  //   = `_offeredResources` - speculatively converted resources
+  //
+  // TODO(zhitao): Right now `GROW_VOLUME` and `SHRINK_VOLUME` are implemented
+  // as speculative operations. Since the plan is to make them non-speculative
+  // in the future, their results are not in `_offeredResources`, so we add them
+  // back here. Remove this once the operations become non-speculative.
+  Resources speculativelyConverted =
+    _offeredResources + resizedResources - offeredResources;
+  Resources implicitlyDeclined = _offeredResources - speculativelyConverted;
+
+  // Prevent any allocations from occurring during resource recovery below.
+  allocator->pause();
 
-  // TODO(zhitao): Remove `resizedResources` once `GROW_VOLUME` and
-  // `SHRINK_VOLUME` become non-speculative.
-  if (!_offeredResources.empty() || !resizedResources.empty()) {
-    // Tell the allocator about the unused (e.g., refused) resources.
+  // Tell the allocator about the net speculatively converted resources. These
+  // resources should not be implicitly declined.
+  if (!speculativelyConverted.empty()) {
     allocator->recoverResources(
-        frameworkId,
-        slaveId,
-        _offeredResources + resizedResources,
-        accept.filters());
+        frameworkId, slaveId, speculativelyConverted, None());
+  }
+
+  // Tell the allocator about the implicitly declined resources.
+  if (!implicitlyDeclined.empty()) {
+    allocator->recoverResources(
+        frameworkId, slaveId, implicitlyDeclined, accept.filters());
   }
+
+  allocator->resume();
 }
 
diff --git a/src/tests/slave_tests.cpp b/src/tests/slave_tests.cpp
index 019dbd7..50882a5 100644
--- a/src/tests/slave_tests.cpp
+++ b/src/tests/slave_tests.cpp
@@ -6495,7 +6495,8 @@ TEST_F(SlaveTest, UpdateOperationStatusRetry)
   Future<v1::scheduler::Event::Offers> offers;
   EXPECT_CALL(*scheduler, offers(_, _))
-    .WillOnce(FutureArg<1>(&offers));
+    .WillOnce(FutureArg<1>(&offers))
+    .WillRepeatedly(Return()); // Ignore subsequent offers.
 
   ContentType contentType = ContentType::PROTOBUF;
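The 1.8.x variant of the patch wraps the two `recoverResources` calls in `allocator->pause()` and `allocator->resume()`. A rough way to see why that matters is the toy model below. It is a sketch under loud assumptions: `ToyAllocator` and `recoverBoth` are hypothetical, and the model simply assumes that every recovery may trigger an allocation run unless the allocator is paused. The real Mesos allocator is asynchronous and schedules allocations differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy allocator; NOT the Mesos allocator interface.
class ToyAllocator
{
public:
  void pause() { paused = true; }

  void resume()
  {
    paused = false;
    // Run the allocation that was deferred while paused, if any.
    if (pending) {
      allocate();
    }
  }

  void recoverResources(const std::string& resources)
  {
    recovered.push_back(resources);

    // Assumption of this toy model: a recovery requests an allocation run,
    // which is deferred while the allocator is paused.
    if (paused) {
      pending = true;
    } else {
      allocate();
    }
  }

  int allocationRuns = 0;
  std::vector<std::string> recovered;

private:
  void allocate()
  {
    ++allocationRuns;
    pending = false;
  }

  bool paused = false;
  bool pending = false;
};

// Recovers the two resource sets from the patch, optionally inside a
// pause/resume pair, and reports how many allocation runs happened.
int recoverBoth(ToyAllocator& allocator, bool usePause)
{
  if (usePause) {
    allocator.pause();
  }

  allocator.recoverResources("cpus(role):1"); // Speculatively converted.
  allocator.recoverResources("cpus:9");       // Implicitly declined.

  if (usePause) {
    allocator.resume();
  }

  return allocator.allocationRuns;
}
```

In this model, pausing means no allocation can run between the two recoveries, so an allocation never observes the converted resources without the declined remainder also having been recovered.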
[mesos] branch 1.8.x updated (a684f07 -> 400f8e7)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from a684f07 Allowed compiling Seccomp isolator on older kernel versions. new 7e43f1f Do not implicitly decline speculatively converted resources. new 400f8e7 Added MESOS-9616 to the 1.8.1 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/master/master.cpp | 49 +++ src/tests/slave_tests.cpp | 3 ++- 3 files changed, 44 insertions(+), 9 deletions(-)
[mesos] 02/02: Added MESOS-9616 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 400f8e716d07d2f9d54000dabd37186b07667796 Author: Chun-Hung Hsiao AuthorDate: Wed May 1 13:39:50 2019 -0700 Added MESOS-9616 to the 1.8.1 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index c99523c..ed2862c 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -4,6 +4,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) ** Bug * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`. + * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer Release Notes - Mesos - Version 1.8.0
[mesos] branch master updated (fc8847d -> 8d28725)
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from fc8847d Added MESOS-9695 to the 1.4.4 CHANGELOG. new de7c969 Do not implicitly decline speculatively converted resources. new 8d28725 Added MESOS-9616 to the 1.8.1 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/master/master.cpp | 49 +++ src/tests/slave_tests.cpp | 3 ++- 3 files changed, 44 insertions(+), 9 deletions(-)
[mesos] 01/02: Do not implicitly decline speculatively converted resources.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit de7c969d061d0d6a815cc38dc59bcb4e8dfc2ff7
Author: Chun-Hung Hsiao
AuthorDate: Mon Apr 22 15:46:36 2019 -0700

    Do not implicitly decline speculatively converted resources.

    Currently if a framework accepts an offer with a `RESERVE` operation
    without a task consuming the reserved resources, the resources will be
    implicitly declined. This is counter to what one would expect (that only
    the remaining resources in the offer will be declined):

      Offer `cpus:10` -> `ACCEPT` with `RESERVE cpus(role):1`
      *Actual* implicit decline: `cpus:9;cpus(role):1`
      *Expected* implicit decline: `cpus:9`

    The same issue is present with other transformational operations
    (i.e., `UNRESERVE`, `CREATE` and `DESTROY`).

    This patch fixes this issue by only implicitly declining the "remaining"
    untransformed resources, computed as follows:

      Offered = `cpus:10`
      Remaining = `cpus:9;cpus(role):1`
      Implicitly declined = remaining - (remaining - offered)
                          = `cpus:9;cpus(role):1` - `cpus(role):1`
                          = `cpus:9`

    Review: https://reviews.apache.org/r/70132
---
 src/master/master.cpp     | 49 +++
 src/tests/slave_tests.cpp |  3 ++-
 2 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index 9f0a976..a8ee629 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -5938,17 +5938,50 @@ void Master::_accept(
         conversions);
   }
+  // We now need to compute the amounts of remaining (1) speculatively converted
+  // resources to recover without a filter and (2) resources that are implicitly
+  // declined with the filter:
+  //
+  // Speculatively converted resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources not consumed by any operation
+  //   = `_offeredResources` - offered resources
+  //
+  // (The last equality holds because resource subtraction yields no negatives.)
+  //
+  // Implicitly declined resources
+  //   = (offered resources).apply(speculative operations)
+  //     - resources consumed by non-speculative operations
+  //     - speculatively converted resources
+  //   = `_offeredResources` - speculatively converted resources
+  //
+  // TODO(zhitao): Right now `GROW_VOLUME` and `SHRINK_VOLUME` are implemented
+  // as speculative operations. Since the plan is to make them non-speculative
+  // in the future, their results are not in `_offeredResources`, so we add them
+  // back here. Remove this once the operations become non-speculative.
+  Resources speculativelyConverted =
+    _offeredResources + resizedResources - offeredResources;
+  Resources implicitlyDeclined = _offeredResources - speculativelyConverted;
+
+  // Prevent any allocations from occurring during resource recovery below.
+  allocator->pause();
 
-  // TODO(zhitao): Remove `resizedResources` once `GROW_VOLUME` and
-  // `SHRINK_VOLUME` become non-speculative.
-  if (!_offeredResources.empty() || !resizedResources.empty()) {
-    // Tell the allocator about the unused (e.g., refused) resources.
+  // Tell the allocator about the net speculatively converted resources. These
+  // resources should not be implicitly declined.
+  if (!speculativelyConverted.empty()) {
     allocator->recoverResources(
-        frameworkId,
-        slaveId,
-        _offeredResources + resizedResources,
-        accept.filters());
+        frameworkId, slaveId, speculativelyConverted, None());
+  }
+
+  // Tell the allocator about the implicitly declined resources.
+  if (!implicitlyDeclined.empty()) {
+    allocator->recoverResources(
+        frameworkId, slaveId, implicitlyDeclined, accept.filters());
   }
+
+  allocator->resume();
 }
 
diff --git a/src/tests/slave_tests.cpp b/src/tests/slave_tests.cpp
index 019dbd7..50882a5 100644
--- a/src/tests/slave_tests.cpp
+++ b/src/tests/slave_tests.cpp
@@ -6495,7 +6495,8 @@ TEST_F(SlaveTest, UpdateOperationStatusRetry)
   Future<v1::scheduler::Event::Offers> offers;
   EXPECT_CALL(*scheduler, offers(_, _))
-    .WillOnce(FutureArg<1>(&offers));
+    .WillOnce(FutureArg<1>(&offers))
+    .WillRepeatedly(Return()); // Ignore subsequent offers.
 
   ContentType contentType = ContentType::PROTOBUF;
[mesos] 02/02: Added MESOS-9616 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 8d28725b7035308acdb9807d9c967d3c2bd34777 Author: Chun-Hung Hsiao AuthorDate: Wed May 1 13:39:50 2019 -0700 Added MESOS-9616 to the 1.8.1 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index f3f6f92..56f3ef6 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -4,6 +4,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) ** Bug * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`. + * [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers. * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer Release Notes - Mesos - Version 1.8.0
[mesos] branch master updated: Added mesos-resource-provider-sdk in client libraries.
This is an automated email from the ASF dual-hosted git repository. chhsiao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new fb1695f Added mesos-resource-provider-sdk in client libraries. fb1695f is described below commit fb1695f93acad12f07634423e205fad1384dde1e Author: longfei AuthorDate: Mon Apr 15 11:25:41 2019 -0700 Added mesos-resource-provider-sdk in client libraries. Added Resource Provider SDK in Go contributed by @carlonelong, which is based on @verizonlab' Mesos Framework SDK. The latter is removed since it is no longer maintained. This closes #333 --- docs/api-client-libraries.md | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/docs/api-client-libraries.md b/docs/api-client-libraries.md index 4d623bd..bb5e579 100644 --- a/docs/api-client-libraries.md +++ b/docs/api-client-libraries.md @@ -53,7 +53,7 @@ run into any issues, file them with the library maintainers.* Go - https://github.com/verizonlabs/mesos-framework-sdk;> + https://github.com/carlonelong/mesos-framework-sdk;> mesos-framework-sdk Go @@ -145,3 +145,21 @@ run into any issues, file them with the library maintainers.* + +## Resource Provider API + +### User Contributed + +*Note: These libraries are supported by their authors, so if you +run into any issues, file them with the library maintainers.* + + + +NameLanguage + + + https://github.com/carlonelong/mesos-resource-provider-sdk;> + mesos-resource-provider-sdk + Go + +
[mesos] branch 1.8.x updated: Updated the 1.8.0 CHANGELOG to highlight CSI v1 support.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/1.8.x by this push:
     new ccd790b Updated the 1.8.0 CHANGELOG to highlight CSI v1 support.
ccd790b is described below

commit ccd790b1816821b259f7a0b357362670e826864f
Author: Chun-Hung Hsiao
AuthorDate: Fri Apr 12 15:52:50 2019 -0700

    Updated the 1.8.0 CHANGELOG to highlight CSI v1 support.
---
 CHANGELOG | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG b/CHANGELOG
index 62d6462..fddb34e 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -20,7 +20,7 @@ This release contains the following highlights:
 use the flag `--enable-new-cli` with Autotools and `-DENABLE_NEW_CLI=1` with
 CMake on MacOS or Linux.

- * API Changes:
+ * Operation Feedback:

 * V1 schedulers can now receive operation feedback for operations on agent
 default resources, i.e. normal cpu, memory, and disk. This means that the
@@ -36,6 +36,18 @@ This release contains the following highlights:
 reconciliation request. This is similar to the way in which the master
 replies to requests for task status reconciliation.

+ * Container Storage Interface (CSI):
+
+* **Experimental** Supported the new CSI v1 API. Operators can deploy
+ plugins that are compatible to either CSI v0 or v1 to create persistent
+ volumes through storage local resource providers, and Mesos will
+ automatically detect which CSI versions are supported by the plugins.
+
+Additional API Changes:
+ * [MESOS-9540] - Improved the experimental `DESTROY_DISK` operations so
+frameworks can now deprovision any unwanted pre-provisioned CSI volume
+directly, if they are authorized to perform `DESTROY_RAW_DISK` actions.
+
 Unresolved Critical Issues:
 * [MESOS-9697] - Release RPMs are not uploaded to bintray
 * [MESOS-9672] - Docker containerizer should ignore pids of executors that do not pass the connection check.
@@ -78,6 +90,7 @@ Unresolved Critical Issues:
 * [MESOS-2842] - Master crashes when framework changes principal on re-registration

 All Resolved Issues:
+
 ** Bug
 * [MESOS-5048] - MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
 * [MESOS-5189] - SSLTest.ProtocolMismatch is slow
[mesos] branch master updated: Updated the 1.8.0 CHANGELOG to highlight CSI v1 support.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new 5e71313 Updated the 1.8.0 CHANGELOG to highlight CSI v1 support.
5e71313 is described below

commit 5e71313e5d4373c88e1387e60e85323105a42c9d
Author: Chun-Hung Hsiao
AuthorDate: Fri Apr 12 15:52:50 2019 -0700

    Updated the 1.8.0 CHANGELOG to highlight CSI v1 support.
---
 CHANGELOG | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG b/CHANGELOG
index e1b3ab9..cc2db30 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -20,7 +20,7 @@ This release contains the following highlights:
 use the flag `--enable-new-cli` with Autotools and `-DENABLE_NEW_CLI=1` with
 CMake on MacOS or Linux.

- * API Changes:
+ * Operation Feedback:

 * V1 schedulers can now receive operation feedback for operations on agent
 default resources, i.e. normal cpu, memory, and disk. This means that the
@@ -36,6 +36,18 @@ This release contains the following highlights:
 reconciliation request. This is similar to the way in which the master
 replies to requests for task status reconciliation.

+ * Container Storage Interface (CSI):
+
+* **Experimental** Supported the new CSI v1 API. Operators can deploy
+ plugins that are compatible to either CSI v0 or v1 to create persistent
+ volumes through storage local resource providers, and Mesos will
+ automatically detect which CSI versions are supported by the plugins.
+
+Additional API Changes:
+ * [MESOS-9540] - Improved the experimental `DESTROY_DISK` operations so
+frameworks can now deprovision any unwanted pre-provisioned CSI volume
+directly, if they are authorized to perform `DESTROY_RAW_DISK` actions.
+
 Unresolved Critical Issues:
 * [MESOS-9697] - Release RPMs are not uploaded to bintray
 * [MESOS-9672] - Docker containerizer should ignore pids of executors that do not pass the connection check.
@@ -78,6 +90,7 @@ Unresolved Critical Issues:
 * [MESOS-2842] - Master crashes when framework changes principal on re-registration

 All Resolved Issues:
+
 ** Bug
 * [MESOS-5048] - MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
 * [MESOS-5189] - SSLTest.ProtocolMismatch is slow
[mesos] branch 1.8.x updated (950247d -> 2ec9762)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 950247d Updated the 1.8.0 CHANGELOG with a new feature.
     new 134eda9 Fixed crash when recovering a volume failed to publish with CSI v1.
     new 2ec9762 Added MESOS-9729 to the 1.8.0 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 1 +
 src/csi/v1_volume_manager.cpp | 16
 2 files changed, 13 insertions(+), 4 deletions(-)
[mesos] 02/02: Added MESOS-9729 to the 1.8.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 2ec97620fe61589de6343960418361b49957948a
Author: Chun-Hung Hsiao
AuthorDate: Fri Apr 12 15:19:10 2019 -0700

    Added MESOS-9729 to the 1.8.0 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index c8d67ec..62d6462 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -221,6 +221,7 @@ All Resolved Issues:
 * [MESOS-9711] - Avoid shutting down executors registering before a required resource provider.
 * [MESOS-9712] - StorageLocalResourceProviderTest.CsiPluginRpcMetrics is flaky.
 * [MESOS-9727] - Heartbeat calls from executor to agent are reported as errors.
+ * [MESOS-9729] - Unpublishing a volume that is failed to publish crashes the agent with CSI v1.

 ** Epic
 * [MESOS-8054] - Feedback for operations
[mesos] 01/02: Fixed crash when recovering a volume failed to publish with CSI v1.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 134eda9b1d537683994fa87acac0a96fdca4c730
Author: Chun-Hung Hsiao
AuthorDate: Fri Apr 12 14:43:09 2019 -0700

    Fixed crash when recovering a volume failed to publish with CSI v1.

    The CSI v1 volume manager falsely assumed that the target path always
    exists when unpublishing a volume, which is not true if there is a
    failure when publishing the volume.

    Review: https://reviews.apache.org/r/70468
---
 src/csi/v1_volume_manager.cpp | 16
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/csi/v1_volume_manager.cpp b/src/csi/v1_volume_manager.cpp
index bf640f9..e7e0329 100644
--- a/src/csi/v1_volume_manager.cpp
+++ b/src/csi/v1_volume_manager.cpp
@@ -975,7 +975,12 @@ Future VolumeManagerProcess::_publishVolume(const string& volumeId)
 }

 return call(NODE_SERVICE, ::nodePublishVolume, std::move(request))
-.then(defer(self(), [this, volumeId, targetPath] {
+.then(process::defer(self(), [this, volumeId, targetPath]()
+-> Future {
+ if (!os::exists(targetPath)) {
+return Failure("Target path '" + targetPath + "' not created");
+ }
+
 CHECK(volumes.contains(volumeId));
 VolumeState& volumeState = volumes.at(volumeId).state;

@@ -1158,8 +1163,6 @@ Future VolumeManagerProcess::__unpublishVolume(const string& volumeId)
 const string targetPath = paths::getMountTargetPath(
 paths::getMountRootDir(rootDir, info.type(), info.name()), volumeId);

- CHECK(os::exists(targetPath));
-
 LOG(INFO) << "Calling '/csi.v1.Node/NodeUnpublishVolume' for volume '"
 << volumeId << "'";

@@ -1168,7 +1171,12 @@ Future VolumeManagerProcess::__unpublishVolume(const string& volumeId)
 request.set_target_path(targetPath);

 return call(NODE_SERVICE, ::nodeUnpublishVolume, std::move(request))
-.then(process::defer(self(), [this, volumeId] {
+.then(process::defer(self(), [this, volumeId, targetPath]()
+-> Future {
+ if (os::exists(targetPath)) {
+return Failure("Target path '" + targetPath + "' not removed");
+ }
+
 CHECK(volumes.contains(volumeId));
 VolumeState& volumeState = volumes.at(volumeId).state;
 volumeState.set_state(VolumeState::VOL_READY);
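
The commit above drops a hard `CHECK` on the target path and instead reports a failure after each node call returns. As a rough illustration of that pattern only, here is a minimal, self-contained C++17 sketch. It is not the Mesos code: `Error`, `verifyPublished`, and `verifyUnpublished` are hypothetical names, `std::filesystem::exists` stands in for `os::exists`, and a plain `std::optional` stands in for a failed future.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for a failed future: an error message on
// failure, or std::nullopt on success.
using Error = std::optional<std::string>;

// After a publish call returns, the target path is expected to exist.
// A missing path is reported to the caller instead of aborting.
Error verifyPublished(const fs::path& targetPath)
{
  if (!fs::exists(targetPath)) {
    return "Target path '" + targetPath.string() + "' not created";
  }

  return std::nullopt;
}

// After an unpublish call returns, the target path is expected to be
// gone. A path that never existed (e.g., because an earlier publish
// failed) passes this check, so no precondition on existence is needed.
Error verifyUnpublished(const fs::path& targetPath)
{
  if (fs::exists(targetPath)) {
    return "Target path '" + targetPath.string() + "' not removed";
  }

  return std::nullopt;
}
```

The point of the design is that an unexpected filesystem state becomes an error the caller can handle or retry, rather than a process-wide abort.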
[mesos] branch master updated (728a176 -> 1ce5ded)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 728a176 Updated the 1.8.0 CHANGELOG with a new feature.
     new dff1eac Fixed crash when recovering a volume failed to publish with CSI v1.
     new 1ce5ded Added MESOS-9729 to the 1.8.0 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 1 +
 src/csi/v1_volume_manager.cpp | 16
 2 files changed, 13 insertions(+), 4 deletions(-)
[mesos] 01/02: Fixed crash when recovering a volume failed to publish with CSI v1.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit dff1eac2e0d7485181b49d4ee93e50ac6ba83e63
Author: Chun-Hung Hsiao
AuthorDate: Fri Apr 12 14:43:09 2019 -0700

    Fixed crash when recovering a volume failed to publish with CSI v1.

    The CSI v1 volume manager falsely assumed that the target path always
    exists when unpublishing a volume, which is not true if there is a
    failure when publishing the volume.

    Review: https://reviews.apache.org/r/70468
---
 src/csi/v1_volume_manager.cpp | 16
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/csi/v1_volume_manager.cpp b/src/csi/v1_volume_manager.cpp
index bf640f9..e7e0329 100644
--- a/src/csi/v1_volume_manager.cpp
+++ b/src/csi/v1_volume_manager.cpp
@@ -975,7 +975,12 @@ Future VolumeManagerProcess::_publishVolume(const string& volumeId)
 }

 return call(NODE_SERVICE, ::nodePublishVolume, std::move(request))
-.then(defer(self(), [this, volumeId, targetPath] {
+.then(process::defer(self(), [this, volumeId, targetPath]()
+-> Future {
+ if (!os::exists(targetPath)) {
+return Failure("Target path '" + targetPath + "' not created");
+ }
+
 CHECK(volumes.contains(volumeId));
 VolumeState& volumeState = volumes.at(volumeId).state;

@@ -1158,8 +1163,6 @@ Future VolumeManagerProcess::__unpublishVolume(const string& volumeId)
 const string targetPath = paths::getMountTargetPath(
 paths::getMountRootDir(rootDir, info.type(), info.name()), volumeId);

- CHECK(os::exists(targetPath));
-
 LOG(INFO) << "Calling '/csi.v1.Node/NodeUnpublishVolume' for volume '"
 << volumeId << "'";

@@ -1168,7 +1171,12 @@ Future VolumeManagerProcess::__unpublishVolume(const string& volumeId)
 request.set_target_path(targetPath);

 return call(NODE_SERVICE, ::nodeUnpublishVolume, std::move(request))
-.then(process::defer(self(), [this, volumeId] {
+.then(process::defer(self(), [this, volumeId, targetPath]()
+-> Future {
+ if (os::exists(targetPath)) {
+return Failure("Target path '" + targetPath + "' not removed");
+ }
+
 CHECK(volumes.contains(volumeId));
 VolumeState& volumeState = volumes.at(volumeId).state;
 volumeState.set_state(VolumeState::VOL_READY);
[mesos] 02/02: Added MESOS-9729 to the 1.8.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 1ce5dedd7d3512f84eb18ccdb542fa5a566cd426
Author: Chun-Hung Hsiao
AuthorDate: Fri Apr 12 15:19:10 2019 -0700

    Added MESOS-9729 to the 1.8.0 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 977790c..e1b3ab9 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -221,6 +221,7 @@ All Resolved Issues:
 * [MESOS-9711] - Avoid shutting down executors registering before a required resource provider.
 * [MESOS-9712] - StorageLocalResourceProviderTest.CsiPluginRpcMetrics is flaky.
 * [MESOS-9727] - Heartbeat calls from executor to agent are reported as errors.
+ * [MESOS-9729] - Unpublishing a volume that is failed to publish crashes the agent with CSI v1.

 ** Epic
 * [MESOS-8054] - Feedback for operations
[mesos] branch 1.8.x updated (6730da8 -> 0c503b0)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 6730da8 Added MESOS-9712 to the 1.8.0 CHANGELOG.
     new 9498d92 Avoid publishing resources when an HTTP executor resubscribes.
     new 0c503b0 Added MESOS-9711 to the 1.8.0 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 2 ++
 src/slave/slave.cpp | 17 +++-
 src/tests/slave_tests.cpp | 51 ++-
 3 files changed, 46 insertions(+), 24 deletions(-)
[mesos] 01/02: Avoid publishing resources when an HTTP executor resubscribes.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 9498d928976dee6b814420c1dd392d7ffc0685da
Author: Chun-Hung Hsiao
AuthorDate: Wed Apr 10 16:16:41 2019 -0700

    Avoid publishing resources when an HTTP executor resubscribes.

    After an agent failover, an HTTP executor may resubscribe before any
    resource provider resubscribes. If that happens and the executor has
    tasks consuming resources from an unsubscribed resource provider, the
    agent will fail to publish the resources and kill the executor, which
    is an undesired behavior. The patch fixes this issue.

    Review: https://reviews.apache.org/r/70449
---
 src/slave/slave.cpp | 17 +++-
 src/tests/slave_tests.cpp | 51 ++-
 2 files changed, 44 insertions(+), 24 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index a3ea5d2..95f05a1 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5024,7 +5024,22 @@ void Slave::subscribe(
 const ContainerID& containerId = executor->containerId;
 const Resources& resources = executor->allocatedResources();

- publishResources(containerId, resources)
+ Future resourcesPublished;
+ if (executor->queuedTasks.empty()) {
+// Since no task is queued, all resources should have been published
+// before, so we skip resource publishing here. This avoids failures due
+// to unregistered resource providers during recovery (see MESOS-9711).
+//
+// NOTE: It is safe to not update the published resources when the
+// executor reduces its resource consumption (e.g., due to task
+// completion) because we don't require resources to be unpublished
+// after use. See comments in `publishResources` for details.
+resourcesPublished = Nothing();
+ } else {
+resourcesPublished = publishResources(containerId, resources);
+ }
+
+ resourcesPublished
 .then(defer(self(), [this, containerId, resources] {
 // NOTE: The executor struct could have been removed before
 // containerizer update, so we use the captured container ID and
diff --git a/src/tests/slave_tests.cpp b/src/tests/slave_tests.cpp
index 6df461b..019dbd7 100644
--- a/src/tests/slave_tests.cpp
+++ b/src/tests/slave_tests.cpp
@@ -11568,9 +11568,9 @@ TEST_F(SlaveTest, RetryOperationStatusUpdateAfterRecovery)
 }


-// This test verifies that on agent failover HTTP-based executors using
-// resource provider resources can resubscribe without crashing the
-// agent. This is a regression test for MESOS-9667.
+// This test verifies that on agent failover HTTP-based executors using resource
+// provider resources can resubscribe without crashing the agent or killing the
+// executor. This is a regression test for MESOS-9667 and MESOS-9711.
 TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)
 {
 // This test is run with paused clock to avoid
@@ -11623,7 +11623,7 @@ TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)

 AWAIT_READY(updateSlaveMessage);

- // Register a framework to excercise operations.
+ // Register a framework to exercise operations.
 auto scheduler = std::make_shared();

 Future connected;
@@ -11692,7 +11692,6 @@ TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)
 Future taskStarting;
 Future taskRunning;
 EXPECT_CALL(*scheduler, update(_, _))
-.Times(AtLeast(2))
 .WillOnce(DoAll(
 v1::scheduler::SendAcknowledge(frameworkId, agentId),
 FutureArg<1>()))
@@ -11700,6 +11699,16 @@ TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)
 v1::scheduler::SendAcknowledge(frameworkId, agentId),
 FutureArg<1>()));

+ // The following futures will ensure that the task status update manager has
+ // checkpointed the status update acknowledgements so there will be no retry.
+ //
+ // NOTE: The order of the two `FUTURE_DISPATCH`s is reversed because Google
+ // Mock will search the expectations in reverse order.
+ Future _taskRunningAcknowledgement =
+FUTURE_DISPATCH(_, ::_statusUpdateAcknowledgement);
+ Future _taskStartingAcknowledgement =
+FUTURE_DISPATCH(_, ::_statusUpdateAcknowledgement);
+
 {
 v1::Resources executorResources =
 *v1::Resources::parse("cpus:0.1;mem:32;disk:32");
@@ -11727,41 +11736,37 @@ TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)

 AWAIT_READY(taskStarting);
 ASSERT_EQ(v1::TaskState::TASK_STARTING, taskStarting->status().state());
+ ASSERT_EQ(v1::TaskStatus::SOURCE_EXECUTOR, taskStarting->status().source());
+ AWAIT_READY(_taskStartingAcknowledgement);

 AWAIT_READY(taskRunning);
 ASSERT_EQ(v1::TaskSta
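
The `Slave::subscribe` change above skips resource publishing when the resubscribing executor has no queued tasks, so a resource provider that has not yet resubscribed cannot cause a failure. The following is a minimal, synchronous C++ sketch of that guard only. It makes several simplifying assumptions: `Executor`, `resubscribe`, and the `publish` callback are hypothetical stand-ins, and a `bool` result replaces the future used in the real code.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, heavily simplified executor bookkeeping.
struct Executor
{
  std::vector<std::string> queuedTasks;
};

// Returns true if resubscription can proceed. `publish` stands in for
// a publishing step that may fail while a resource provider is still
// unsubscribed after an agent failover.
bool resubscribe(
    const Executor& executor,
    const std::function<bool()>& publish)
{
  if (executor.queuedTasks.empty()) {
    // No task is queued, so everything the executor uses was
    // published earlier; skip publishing and avoid a spurious failure.
    return true;
  }

  // Queued tasks may need resources that are not published yet.
  return publish();
}
```

With an idle executor the publishing step is never invoked, so its failure cannot lead to the executor being killed; with queued tasks the original behavior is kept.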
[mesos] branch master updated (9aac730 -> 56c34ee)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 9aac730 Added MESOS-9712 to the 1.8.0 CHANGELOG.
     new e31ed7d Avoid publishing resources when an HTTP executor resubscribes.
     new 56c34ee Added MESOS-9711 to the 1.8.0 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 2 ++
 src/slave/slave.cpp | 17 +++-
 src/tests/slave_tests.cpp | 51 ++-
 3 files changed, 46 insertions(+), 24 deletions(-)
[mesos] 02/02: Added MESOS-9711 to the 1.8.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 0c503b01d3a9428ec9db35d09da5e237d737c570
Author: Chun-Hung Hsiao
AuthorDate: Wed Apr 10 16:30:30 2019 -0700

    Added MESOS-9711 to the 1.8.0 CHANGELOG.
---
 CHANGELOG | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/CHANGELOG b/CHANGELOG
index d86f5eb..b1b0aa6 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -188,12 +188,14 @@ All Resolved Issues:
 * [MESOS-9635] - OperationReconciliationTest.AgentPendingOperationAfterMasterFailover is flaky again (3x) due to orphan operations
 * [MESOS-9637] - Impossible to CREATE a volume on resource provider resources over the operator API
 * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations.
+ * [MESOS-9667] - Check failure when executor for task using resource provider resources subscribes before agent is registered.
 * [MESOS-9688] - Quota is not enforced properly when subroles have reservations.
 * [MESOS-9691] - Quota headroom calculation is off when subroles are involved.
 * [MESOS-9692] - Quota may be under allocated for disk resources.
 * [MESOS-9696] - Test MasterQuotaTest.AvailableResourcesSingleDisconnectedAgent is flaky
 * [MESOS-9707] - Calling link::lo() may cause runtime error
 * [MESOS-9667] - Check failure when executor for task using resource provider resources subscribes before agent is registered.
+ * [MESOS-9711] - Avoid shutting down executors registering before a required resource provider.
 * [MESOS-9712] - StorageLocalResourceProviderTest.CsiPluginRpcMetrics is flaky.
 * [MESOS-9727] - Heartbeat calls from executor to agent are reported as errors.
[mesos] 01/02: Avoid publishing resources when an HTTP executor resubscribes.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit e31ed7d78199f612cc098c6b8d41bac82339e558
Author: Chun-Hung Hsiao
AuthorDate: Wed Apr 10 16:16:41 2019 -0700

    Avoid publishing resources when an HTTP executor resubscribes.

    After an agent failover, an HTTP executor may resubscribe before any
    resource provider resubscribes. If that happens and the executor has
    tasks consuming resources from an unsubscribed resource provider, the
    agent will fail to publish the resources and kill the executor, which
    is an undesired behavior. The patch fixes this issue.

    Review: https://reviews.apache.org/r/70449
---
 src/slave/slave.cpp | 17 +++-
 src/tests/slave_tests.cpp | 51 ++-
 2 files changed, 44 insertions(+), 24 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index a3ea5d2..95f05a1 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5024,7 +5024,22 @@ void Slave::subscribe(
 const ContainerID& containerId = executor->containerId;
 const Resources& resources = executor->allocatedResources();

- publishResources(containerId, resources)
+ Future resourcesPublished;
+ if (executor->queuedTasks.empty()) {
+// Since no task is queued, all resources should have been published
+// before, so we skip resource publishing here. This avoids failures due
+// to unregistered resource providers during recovery (see MESOS-9711).
+//
+// NOTE: It is safe to not update the published resources when the
+// executor reduces its resource consumption (e.g., due to task
+// completion) because we don't require resources to be unpublished
+// after use. See comments in `publishResources` for details.
+resourcesPublished = Nothing();
+ } else {
+resourcesPublished = publishResources(containerId, resources);
+ }
+
+ resourcesPublished
 .then(defer(self(), [this, containerId, resources] {
 // NOTE: The executor struct could have been removed before
 // containerizer update, so we use the captured container ID and
diff --git a/src/tests/slave_tests.cpp b/src/tests/slave_tests.cpp
index 6df461b..019dbd7 100644
--- a/src/tests/slave_tests.cpp
+++ b/src/tests/slave_tests.cpp
@@ -11568,9 +11568,9 @@ TEST_F(SlaveTest, RetryOperationStatusUpdateAfterRecovery)
 }


-// This test verifies that on agent failover HTTP-based executors using
-// resource provider resources can resubscribe without crashing the
-// agent. This is a regression test for MESOS-9667.
+// This test verifies that on agent failover HTTP-based executors using resource
+// provider resources can resubscribe without crashing the agent or killing the
+// executor. This is a regression test for MESOS-9667 and MESOS-9711.
 TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)
 {
 // This test is run with paused clock to avoid
@@ -11623,7 +11623,7 @@ TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)

 AWAIT_READY(updateSlaveMessage);

- // Register a framework to excercise operations.
+ // Register a framework to exercise operations.
 auto scheduler = std::make_shared();

 Future connected;
@@ -11692,7 +11692,6 @@ TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)
 Future taskStarting;
 Future taskRunning;
 EXPECT_CALL(*scheduler, update(_, _))
-.Times(AtLeast(2))
 .WillOnce(DoAll(
 v1::scheduler::SendAcknowledge(frameworkId, agentId),
 FutureArg<1>()))
@@ -11700,6 +11699,16 @@ TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)
 v1::scheduler::SendAcknowledge(frameworkId, agentId),
 FutureArg<1>()));

+ // The following futures will ensure that the task status update manager has
+ // checkpointed the status update acknowledgements so there will be no retry.
+ //
+ // NOTE: The order of the two `FUTURE_DISPATCH`s is reversed because Google
+ // Mock will search the expectations in reverse order.
+ Future _taskRunningAcknowledgement =
+FUTURE_DISPATCH(_, ::_statusUpdateAcknowledgement);
+ Future _taskStartingAcknowledgement =
+FUTURE_DISPATCH(_, ::_statusUpdateAcknowledgement);
+
 {
 v1::Resources executorResources =
 *v1::Resources::parse("cpus:0.1;mem:32;disk:32");
@@ -11727,41 +11736,37 @@ TEST_F(SlaveTest, AgentFailoverHTTPExecutorUsingResourceProviderResources)

 AWAIT_READY(taskStarting);
 ASSERT_EQ(v1::TaskState::TASK_STARTING, taskStarting->status().state());
+ ASSERT_EQ(v1::TaskStatus::SOURCE_EXECUTOR, taskStarting->status().source());
+ AWAIT_READY(_taskStartingAcknowledgement);

 AWAIT_READY(taskRunning);
 ASSERT_EQ(v1::TaskSta
[mesos] 01/02: Fixed potential use-after-free bug in storage local resource provider.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit e3504a6a86558d264a939d974fdb5c257ab53f5d
Author: Benjamin Bannier
AuthorDate: Thu Apr 11 08:55:43 2019 -0700

    Fixed potential use-after-free bug in storage local resource provider.

    The storage local resource provider manages a metrics instance which is
    shared with the service and volume managers it also holds; these
    services do not manage the lifetime of the metrics instance. This patch
    fixes the lifetime of the metrics instance.

    Review: https://reviews.apache.org/r/70454/
---
 src/resource_provider/storage/provider.cpp | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/resource_provider/storage/provider.cpp b/src/resource_provider/storage/provider.cpp
index b2ca5d0..999fe95 100644
--- a/src/resource_provider/storage/provider.cpp
+++ b/src/resource_provider/storage/provider.cpp
@@ -377,6 +377,19 @@ private:

 Runtime runtime;

+ // NOTE: `metrics` must be destructed after `volumeManager` and
+ // `serviceManager` since they hold a pointer to it.
+ struct Metrics : public csi::Metrics
+ {
+explicit Metrics(const string& prefix);
+~Metrics();
+
+hashmap operations_pending;
+hashmap operations_finished;
+hashmap operations_failed;
+hashmap operations_dropped;
+ } metrics;
+
 // NOTE: `serviceManager` must be destructed after `volumeManager` since the
 // latter holds a pointer of the former.
 Owned serviceManager;
@@ -401,17 +414,6 @@ private:
 // keeps track of pending operations that disallow reconciliation, and ensures
 // that any reconciliation waits for these operations to finish.
 Sequence sequence;
-
- struct Metrics : public csi::Metrics
- {
-explicit Metrics(const string& prefix);
-~Metrics();
-
-hashmap operations_pending;
-hashmap operations_finished;
-hashmap operations_failed;
-hashmap operations_dropped;
- } metrics;
 };


@@ -434,9 +436,9 @@ StorageLocalResourceProviderProcess::StorageLocalResourceProviderProcess(
 slaveId(_slaveId),
 authToken(_authToken),
 strict(_strict),
+metrics("resource_providers/" + info.type() + "." + info.name() + "/"),
 resourceVersion(id::UUID::random()),
-sequence("storage-local-resource-provider-sequence"),
-metrics("resource_providers/" + info.type() + "." + info.name() + "/")
+sequence("storage-local-resource-provider-sequence")
 {
 diskProfileAdaptor = DiskProfileAdaptor::getAdaptor();
 CHECK_NOTNULL(diskProfileAdaptor.get());
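
The fix above works because C++ destroys non-static data members in reverse order of declaration: moving `metrics` ahead of the managers makes it outlive them. The sketch below demonstrates just that language rule in isolation. All types here (`Metrics`, `Manager`, `Provider`) and the `destroyed` log are hypothetical simplifications, not the Mesos classes, and the manager is held by value rather than through an owning pointer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which destructors run.
std::vector<std::string> destroyed;

struct Metrics
{
  ~Metrics() { destroyed.push_back("metrics"); }

  int calls = 0;
};

struct Manager
{
  explicit Manager(Metrics* _metrics) : metrics(_metrics) {}

  ~Manager()
  {
    // Touches the shared metrics during teardown. This is only safe
    // if the `Metrics` object is still alive at this point.
    ++metrics->calls;
    destroyed.push_back("manager");
  }

  Metrics* metrics; // Non-owning pointer.
};

struct Provider
{
  Provider() : manager(&metrics) {}

  // Members are destroyed in reverse declaration order, so `metrics`
  // (declared first) is destroyed after `manager`.
  Metrics metrics;
  Manager manager;
};
```

Swapping the two member declarations in `Provider` would destroy `metrics` first and leave `~Manager` writing through a dangling pointer, which is the kind of use-after-free the commit message describes.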
[mesos] 01/02: Fixed potential use-after-free bug in storage local resource provider.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 1d4bd05ed36cdbf7b75fa1e0322263a13240807a
Author: Benjamin Bannier
AuthorDate: Thu Apr 11 08:55:43 2019 -0700

    Fixed potential use-after-free bug in storage local resource provider.

    The storage local resource provider manages a metrics instance which is
    shared with the service and volume managers it also holds; these
    services do not manage the lifetime of the metrics instance. This patch
    fixes the lifetime of the metrics instance.

    Review: https://reviews.apache.org/r/70454/
---
 src/resource_provider/storage/provider.cpp | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/src/resource_provider/storage/provider.cpp b/src/resource_provider/storage/provider.cpp
index b2ca5d0..999fe95 100644
--- a/src/resource_provider/storage/provider.cpp
+++ b/src/resource_provider/storage/provider.cpp
@@ -377,6 +377,19 @@ private:

 Runtime runtime;

+ // NOTE: `metrics` must be destructed after `volumeManager` and
+ // `serviceManager` since they hold a pointer to it.
+ struct Metrics : public csi::Metrics
+ {
+explicit Metrics(const string& prefix);
+~Metrics();
+
+hashmap operations_pending;
+hashmap operations_finished;
+hashmap operations_failed;
+hashmap operations_dropped;
+ } metrics;
+
 // NOTE: `serviceManager` must be destructed after `volumeManager` since the
 // latter holds a pointer of the former.
 Owned serviceManager;
@@ -401,17 +414,6 @@ private:
 // keeps track of pending operations that disallow reconciliation, and ensures
 // that any reconciliation waits for these operations to finish.
 Sequence sequence;
-
- struct Metrics : public csi::Metrics
- {
-explicit Metrics(const string& prefix);
-~Metrics();
-
-hashmap operations_pending;
-hashmap operations_finished;
-hashmap operations_failed;
-hashmap operations_dropped;
- } metrics;
 };


@@ -434,9 +436,9 @@ StorageLocalResourceProviderProcess::StorageLocalResourceProviderProcess(
 slaveId(_slaveId),
 authToken(_authToken),
 strict(_strict),
+metrics("resource_providers/" + info.type() + "." + info.name() + "/"),
 resourceVersion(id::UUID::random()),
-sequence("storage-local-resource-provider-sequence"),
-metrics("resource_providers/" + info.type() + "." + info.name() + "/")
+sequence("storage-local-resource-provider-sequence")
 {
 diskProfileAdaptor = DiskProfileAdaptor::getAdaptor();
 CHECK_NOTNULL(diskProfileAdaptor.get());
[mesos] branch 1.8.x updated (1c65e1b -> 6730da8)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 1c65e1b Updated comments in v1/mesos.proto.
     new 1d4bd05 Fixed potential use-after-free bug in storage local resource provider.
     new 6730da8 Added MESOS-9712 to the 1.8.0 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 1 +
 src/resource_provider/storage/provider.cpp | 28 +++-
 2 files changed, 16 insertions(+), 13 deletions(-)
[mesos] 02/02: Added MESOS-9712 to the 1.8.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 9aac730672232435dfab788e9542cde0cd93dfd7
Author: Chun-Hung Hsiao
AuthorDate: Thu Apr 11 09:04:02 2019 -0700

    Added MESOS-9712 to the 1.8.0 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index b51de9b..e312244 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -194,6 +194,7 @@ All Resolved Issues:
 * [MESOS-9696] - Test MasterQuotaTest.AvailableResourcesSingleDisconnectedAgent is flaky
 * [MESOS-9707] - Calling link::lo() may cause runtime error
 * [MESOS-9667] - Check failure when executor for task using resource provider resources subscribes before agent is registered.
+ * [MESOS-9712] - StorageLocalResourceProviderTest.CsiPluginRpcMetrics is flaky.
 * [MESOS-9727] - Heartbeat calls from executor to agent are reported as errors.

 ** Epic
[mesos] branch master updated (4a33d2a -> 9aac730)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 4a33d2a Updated comments in v1/mesos.proto.
     new e3504a6 Fixed potential use-after-free bug in storage local resource provider.
     new 9aac730 Added MESOS-9712 to the 1.8.0 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 1 +
 src/resource_provider/storage/provider.cpp | 28 +++-
 2 files changed, 16 insertions(+), 13 deletions(-)
[mesos] 02/02: Added MESOS-9712 to the 1.8.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 6730da8eeff20c35f12ed2aa4a79314ebb955b61
Author: Chun-Hung Hsiao
AuthorDate: Thu Apr 11 09:04:02 2019 -0700

    Added MESOS-9712 to the 1.8.0 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 841bc80..d86f5eb 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -194,6 +194,7 @@ All Resolved Issues:
 * [MESOS-9696] - Test MasterQuotaTest.AvailableResourcesSingleDisconnectedAgent is flaky
 * [MESOS-9707] - Calling link::lo() may cause runtime error
 * [MESOS-9667] - Check failure when executor for task using resource provider resources subscribes before agent is registered.
+ * [MESOS-9712] - StorageLocalResourceProviderTest.CsiPluginRpcMetrics is flaky.
 * [MESOS-9727] - Heartbeat calls from executor to agent are reported as errors.

 ** Epic
[mesos] 01/02: Avoid dereferencing removed executors and launching containers for them.
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 9cce93413b691f7627491ac264022f7d15ead9cc
Author: Chun-Hung Hsiao
AuthorDate: Fri Mar 1 15:49:32 2019 -0800

    Avoid dereferencing removed executors and launching containers for them.

    When launching executors and tasks, there is no guarantee that the
    executors still remain after `Slave::publishResources` is returned. If
    not, the executor struct should not be dereferenced and the executor
    containers should not be launched at all.

    NOTE: The patch makes `Slave::launchExecutor` called asynchronously even
    if there is no secret generator. However this should not affect the
    correctness of executor launching.

    Review: https://reviews.apache.org/r/70084
---
 src/slave/slave.cpp | 242 ++--
 src/slave/slave.hpp |  12 ++-
 2 files changed, 131 insertions(+), 123 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 10af517..c0b5388 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -2964,22 +2964,51 @@ void Slave::__run(
 
     executor = added.get();
 
-    if (secretGenerator) {
-      generateSecret(framework->id(), executor->id, executor->containerId)
-        .onAny(defer(
-              self(),
-              ::launchExecutor,
-              lambda::_1,
-              frameworkId,
-              executorId,
-              taskGroup.isNone() ? task.get() : Option::none()));
-    } else {
-      Slave::launchExecutor(
-          None(),
+    // NOTE: We make a copy of the executor info because we may mutate it with
+    // some default fields and resources.
+    ExecutorInfo executorInfo_ = executorInfo;
+
+    // Populate the command info for default executor. We modify the executor
+    // info to avoid resetting command info upon reregistering with the master
+    // since the master doesn't store them; they are generated by the slave.
+    if (executorInfo_.has_type() &&
+        executorInfo_.type() == ExecutorInfo::DEFAULT) {
+      CHECK(!executorInfo_.has_command());
+
+      *executorInfo_.mutable_command() =
+        defaultExecutorCommandInfo(flags.launcher_dir, executor->user);
+    }
+
+    // NOTE: We modify the ExecutorInfo to include the task's resources when
+    // launching the executor so that the containerizer has non-zero resources
+    // to work with when the executor has no resources. This should be revisited
+    // after MESOS-600.
+    if (task.isSome()) {
+      *executorInfo_.mutable_resources() =
+        Resources(executorInfo.resources()) + task->resources();
+    }
+
+    // Add the default container info to the executor info.
+    // TODO(jieyu): Rename the flag to be default_mesos_container_info.
+    if (!executorInfo_.has_container() &&
+        flags.default_container_info.isSome()) {
+      *executorInfo_.mutable_container() = flags.default_container_info.get();
+    }
+
+    publishResources(executor->containerId, executorInfo_.resources())
+      .then(defer(
+          self(),
+          ::generateSecret,
           frameworkId,
           executorId,
-          taskGroup.isNone() ? task.get() : Option::none());
-    }
+          executor->containerId))
+      .onAny(defer(
+          self(),
+          ::launchExecutor,
+          lambda::_1,
+          frameworkId,
+          executorInfo_,
+          taskGroup.isNone() ? task.get() : Option::none()));
   }
 
   CHECK_NOTNULL(executor);
@@ -3064,11 +3093,16 @@ void Slave::__run(
   LOG(INFO) << "Queued " << taskOrTaskGroup(task, taskGroup)
             << " for executor " << *executor;
 
-  publishResources(executor->containerId, executor->allocatedResources())
-    .then(defer(self(), [=] {
-      return containerizer->update(
-          executor->containerId,
-          executor->allocatedResources());
+  const ContainerID& containerId = executor->containerId;
+  const Resources& resources = executor->allocatedResources();
+
+  publishResources(containerId, resources)
+    .then(defer(self(), [this, containerId, resources] {
+      // NOTE: The executor struct could have been removed before
+      // containerizer update, so we use the captured container ID and
+      // resources here. If this happens, the containerizer would simply
+      // skip updating a destroyed container.
+      return containerizer->update(containerId, resources);
     }))
     .onAny(defer(self(),
                  ::___run,
@@ -3313,12 +3347,15 @@ void Slave::___run(
 }
 
 
-// Generates a secret for executor authentication.
-Future Slave::generateSecret(
+Future> Slave::generateSecret(
     const FrameworkID& frameworkId,
     const ExecutorID
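
The hazard described in the commit message above (an executor being removed while an asynchronous step is still pending) can be illustrated outside of Mesos with a small, self-contained sketch. Everything in it is hypothetical and only stands in for the real agent: `ToyAgent`, `Executor`, `queueUpdate`, and `drain` are made-up names, and a plain callback queue replaces libprocess futures and `defer`. The idea it shows is the one the patch relies on: copy the values the continuation needs while the executor is known to exist, and look the executor up by ID again instead of holding a pointer to it.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for per-executor bookkeeping; not Mesos code.
struct Executor {
  std::string containerId;
  int cpus;
};

// Hypothetical toy agent. A queue of callbacks plays the role of
// continuations that run after an asynchronous step has completed.
class ToyAgent {
public:
  void addExecutor(const std::string& id, int cpus) {
    executors[id] = Executor{"container-" + id, cpus};
  }

  void removeExecutor(const std::string& id) { executors.erase(id); }

  // Queues a deferred update for the executor with the given ID.
  void queueUpdate(const std::string& id) {
    assert(executors.count(id) == 1);
    const Executor& executor = executors.at(id);

    // Copy what the continuation needs now, while the executor exists.
    const std::string containerId = executor.containerId;
    const int cpus = executor.cpus;

    pending.push([this, id, containerId, cpus] {
      // The executor may be gone by the time this runs, so it is looked
      // up by ID again rather than dereferenced through a stale pointer.
      if (executors.count(id) == 0) {
        events.push_back("skipped " + containerId);
        return;
      }
      events.push_back(
          "updated " + containerId + " cpus=" + std::to_string(cpus));
    });
  }

  // Runs all queued continuations in FIFO order.
  void drain() {
    while (!pending.empty()) {
      pending.front()();
      pending.pop();
    }
  }

  std::vector<std::string> events;

private:
  std::map<std::string, Executor> executors;
  std::queue<std::function<void()>> pending;
};
```

In this sketch, removing an executor between `queueUpdate` and `drain` yields a "skipped" event instead of an access to a destroyed object. The real patch differs in its details: for the containerizer update it relies on the containerizer skipping a destroyed container rather than on an explicit lookup.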
[mesos] branch 1.7.x updated (9f22a3b -> 150644e)
This is an automated email from the ASF dual-hosted git repository.

chhsiao pushed a change to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 9f22a3b  Added MESOS-9707 to the 1.7.3 CHANGELOG.
     new 9cce934  Avoid dereferencing removed executors and launching containers for them.
     new 150644e  Added MESOS-8467 to the 1.7.3 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG           |   1 +
 src/slave/slave.cpp | 242 ++--
 src/slave/slave.hpp |  12 ++-
 3 files changed, 132 insertions(+), 123 deletions(-)