[jira] [Commented] (MESOS-8745) Add a `LIST_RESOURCE_PROVIDER_CONFIGS` agent API call.
[ https://issues.apache.org/jira/browse/MESOS-8745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742592#comment-16742592 ] Chun-Hung Hsiao commented on MESOS-8745: We could use this API to expose state that is not available to the resource provider manager. See MESOS-9223. > Add a `LIST_RESOURCE_PROVIDER_CONFIGS` agent API call. > -- > > Key: MESOS-8745 > URL: https://issues.apache.org/jira/browse/MESOS-8745 > Project: Mesos > Issue Type: Task > Components: agent > Reporter: Chun-Hung Hsiao > Priority: Minor > Labels: mesosphere, storage > > For API completeness, it would be nice if we could provide a call to list all > valid resource provider configs on an agent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-9223) Storage local provider does not sufficiently handle container launch failures or errors
[ https://issues.apache.org/jira/browse/MESOS-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742591#comment-16742591 ] Chun-Hung Hsiao commented on MESOS-9223: For surfacing the error: We could pass the failure to {{LocalResourceProviderManager}} through the interface I proposed above, then expose it in the {{LIST_RESOURCE_PROVIDER_CONFIGS}} API through MESOS-8745. > Storage local provider does not sufficiently handle container launch failures > or errors > --- > > Key: MESOS-9223 > URL: https://issues.apache.org/jira/browse/MESOS-9223 > Project: Mesos > Issue Type: Improvement > Components: agent, storage > Reporter: Benjamin Bannier > Assignee: Benjamin Bannier > Priority: Critical > > The storage local resource provider as currently implemented does not handle > launch failures or task errors of its standalone containers well enough. If, > e.g., an RP container fails to come up during node start, a warning would be > logged, but an operator still needs to detect degraded functionality, > manually check the state of containers with {{GET_CONTAINERS}}, and decide > whether the agent needs restarting; I suspect they do not always have > enough context for this decision. It would be better if the provider would > either enforce a restart by failing over the whole agent, or retry the > operation (optionally: up to some maximum number of retries).
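The "retry up to some maximum number of retries, then surface the failure" idea from MESOS-9223 could be sketched roughly as follows. This is only an illustration under stated assumptions, not Mesos code: {{launchWithRetries}}, {{RetryOutcome}}, and the convention that an empty string means success are all hypothetical names made up for this sketch, and no Mesos or libprocess API is used.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical summary of a bounded retry loop (not a Mesos type).
struct RetryOutcome {
  bool succeeded;         // true if any attempt succeeded
  int attempts;           // number of attempts actually made
  std::string lastError;  // error of the final failed attempt; empty on success
};

// Calls `launch` up to `maxAttempts` times. By assumption, `launch` returns
// an empty string on success and a human-readable error message on failure.
RetryOutcome launchWithRetries(
    const std::function<std::string()>& launch,
    int maxAttempts)
{
  assert(maxAttempts > 0);

  RetryOutcome outcome{false, 0, ""};

  for (int i = 0; i < maxAttempts; ++i) {
    ++outcome.attempts;

    const std::string error = launch();
    if (error.empty()) {
      outcome.succeeded = true;
      outcome.lastError.clear();
      return outcome;
    }

    // Remember the most recent failure so the caller can surface it.
    outcome.lastError = error;
  }

  // All attempts failed; the caller decides what to do next.
  return outcome;
}
```

If {{succeeded}} is false after the last attempt, a caller could either fail over the agent or hand {{lastError}} to whatever component ends up reporting provider state, which is the part the comment above proposes to expose through MESOS-8745.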
[jira] [Commented] (MESOS-9521) MasterAPITest.OperationUpdatesUponAgentGone is flaky
[ https://issues.apache.org/jira/browse/MESOS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742276#comment-16742276 ] Benno Evers commented on MESOS-9521: Review: https://reviews.apache.org/r/69726/ The warning is known, but given the caveat that is printed right below it {noformat} NOTE: You can safely ignore the above warning unless this call should not happen. Do not suppress it by blindly adding an EXPECT_CALL() if you don't mean to enforce the call. See https://github.com/google/googletest/blob/master/googlemock/docs/CookBook.md#knowing-when-to-expect for details. {noformat} I left it in, because the test does not really care whether `disconnect()` is called or not. > MasterAPITest.OperationUpdatesUponAgentGone is flaky > > > Key: MESOS-9521 > URL: https://issues.apache.org/jira/browse/MESOS-9521 > Project: Mesos > Issue Type: Bug > Components: test > Affects Versions: 1.8.0 > Environment: Fedora28, cmake w/ SSL > Reporter: Benjamin Bannier > Priority: Major > Labels: flaky, flaky-test > > The recently added test {{MasterAPITest.OperationUpdatesUponAgentGone}} is > flaky, e.g., > {noformat}../src/tests/api_tests.cpp:5051: Failure > Value of: resources.empty() > Actual: true > Expected: false > ../3rdparty/libprocess/src/../include/process/gmock.hpp:504: Failure > Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(to, > testing::A()))... > Expected args: message matcher (32-byte object 24-00 00-00 00-00 00-00 24-00 00-00 00-00 00-00 41-63 74-75 61-6C 20-66>, > 1-byte object ) > Expected: to be called once > Actual: never called - unsatisfied and active > {noformat} > I am able to reproduce this reliably in less than 10 iterations when running > the test in repetition under additional system stress. > Even if the test does not fail, it produces the following gmock warning, > {noformat} > GMOCK WARNING: > Uninteresting mock function call - returning directly. > Function call: disconnected() > {noformat}
[jira] [Comment Edited] (MESOS-9521) MasterAPITest.OperationUpdatesUponAgentGone is flaky
[ https://issues.apache.org/jira/browse/MESOS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741868#comment-16741868 ] Benjamin Bannier edited comment on MESOS-9521 at 1/14/19 9:06 AM: -- cc [~bennoe], [~greggomann] was (Author: bbannier): cc @bennoe, [~greggomann]
[jira] [Commented] (MESOS-9521) MasterAPITest.OperationUpdatesUponAgentGone is flaky
[ https://issues.apache.org/jira/browse/MESOS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741868#comment-16741868 ] Benjamin Bannier commented on MESOS-9521: - cc @bennoe, [~greggomann]
[jira] [Created] (MESOS-9521) MasterAPITest.OperationUpdatesUponAgentGone is flaky
Benjamin Bannier created MESOS-9521: --- Summary: MasterAPITest.OperationUpdatesUponAgentGone is flaky Key: MESOS-9521 URL: https://issues.apache.org/jira/browse/MESOS-9521 Project: Mesos Issue Type: Bug Components: test Affects Versions: 1.8.0 Environment: Fedora28, cmake w/ SSL Reporter: Benjamin Bannier The recently added test {{MasterAPITest.OperationUpdatesUponAgentGone}} is flaky, e.g., {noformat}../src/tests/api_tests.cpp:5051: Failure Value of: resources.empty() Actual: true Expected: false ../3rdparty/libprocess/src/../include/process/gmock.hpp:504: Failure Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(to, testing::A()))... Expected args: message matcher (32-byte object , 1-byte object ) Expected: to be called once Actual: never called - unsatisfied and active {noformat} I am able to reproduce this reliably in less than 10 iterations when running the test in repetition under additional system stress. Even if the test does not fail, it produces the following gmock warning, {noformat} GMOCK WARNING: Uninteresting mock function call - returning directly. Function call: disconnected() {noformat}
[jira] [Created] (MESOS-9520) IOTest.Read hangs on Windows
Jan Schlicht created MESOS-9520: --- Summary: IOTest.Read hangs on Windows Key: MESOS-9520 URL: https://issues.apache.org/jira/browse/MESOS-9520 Project: Mesos Issue Type: Bug Components: test Environment: Windows Reporter: Jan Schlicht Noticed in test runs that {{IOTest.Read}} hangs in Windows environments. Test runs need to be aborted.