[jira] [Commented] (MESOS-8745) Add a `LIST_RESOURCE_PROVIDER_CONFIGS` agent API call.

2019-01-14 Thread Chun-Hung Hsiao (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-8745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742592#comment-16742592
 ] 

Chun-Hung Hsiao commented on MESOS-8745:


We could use this API to expose state that is not available to the resource 
provider manager. See MESOS-9223.

> Add a `LIST_RESOURCE_PROVIDER_CONFIGS` agent API call.
> --
>
> Key: MESOS-8745
> URL: https://issues.apache.org/jira/browse/MESOS-8745
> Project: Mesos
>  Issue Type: Task
>  Components: agent
>Reporter: Chun-Hung Hsiao
>Priority: Minor
>  Labels: mesosphere, storage
>
> For API completeness, it would be nice if we could provide a call to list all 
> valid resource provider configs on an agent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-9223) Storage local provider does not sufficiently handle container launch failures or errors

2019-01-14 Thread Chun-Hung Hsiao (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742591#comment-16742591
 ] 

Chun-Hung Hsiao commented on MESOS-9223:


For surfacing the error:
We could pass the failure to {{LocalResourceProviderManager}} through the 
interface I proposed above, then expose that in the 
{{LIST_RESOURCE_PROVIDER_CONFIGS}} API through MESOS-8745.
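
To make the idea concrete, here is a minimal sketch of the bookkeeping this would need: the manager records the last failure per provider config, and a list call returns each config together with that failure. All names below ({{ProviderConfigStatus}}, {{ProviderStatusRegistry}}, {{reportFailure}}) are hypothetical and for illustration only; they are not the actual Mesos classes or the interface proposed on the ticket.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-config status; the real Mesos protobuf messages are
// not reproduced here.
struct ProviderConfigStatus
{
  std::string name;   // Name of the resource provider config.
  bool failed;        // True once a failure has been reported.
  std::string error;  // Last reported failure message, empty if none.
};

// Hypothetical manager-side registry: providers report failures, and a
// list call returns every known config with its last reported error.
class ProviderStatusRegistry
{
public:
  // Registers a config; re-adding an existing name resets its status.
  void addConfig(const std::string& name)
  {
    assert(!name.empty());
    statuses_[name] = ProviderConfigStatus{name, false, ""};
  }

  // Records a failure for a known config. Returns false if the config
  // has not been registered.
  bool reportFailure(const std::string& name, const std::string& message)
  {
    std::map<std::string, ProviderConfigStatus>::iterator it =
      statuses_.find(name);
    if (it == statuses_.end()) {
      return false;
    }
    it->second.failed = true;
    it->second.error = message;
    return true;
  }

  // Returns all configs, ordered by name, including any recorded error.
  std::vector<ProviderConfigStatus> list() const
  {
    std::vector<ProviderConfigStatus> result;
    for (const auto& entry : statuses_) {
      result.push_back(entry.second);
    }
    return result;
  }

private:
  std::map<std::string, ProviderConfigStatus> statuses_;
};
```

With something like this, an operator could see which provider failed and why from a single list call, instead of correlating agent logs with {{GET_CONTAINERS}} output.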

> Storage local provider does not sufficiently handle container launch failures 
> or errors
> ---
>
> Key: MESOS-9223
> URL: https://issues.apache.org/jira/browse/MESOS-9223
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, storage
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Critical
>
> The storage local resource provider as currently implemented does not handle 
> launch failures or task errors of its standalone containers well enough. If, 
> e.g., an RP container fails to come up during node start, a warning is 
> logged, but an operator still needs to detect the degraded functionality, 
> manually check the state of containers with {{GET_CONTAINERS}}, and decide 
> whether the agent needs restarting; I suspect they do not always have 
> enough context for this decision. It would be better if the provider 
> either enforced a restart by failing over the whole agent or retried the 
> operation (optionally: up to some maximum number of retries).
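
The bounded-retry option described above could look roughly like the following sketch. It is synchronous and has no backoff, purely to show the control flow; {{launchWithRetries}}, {{LaunchOutcome}}, and the {{launch}} callback are hypothetical names, not part of the storage local resource provider.

```cpp
#include <cassert>
#include <functional>

// Possible outcomes of trying to bring up a provider container.
enum class LaunchOutcome
{
  Launched,         // Some attempt succeeded.
  RetriesExhausted  // Every attempt failed; the caller should escalate.
};

// Calls `launch` until it returns true or `maxAttempts` attempts have
// been made. `launch` is a hypothetical callback standing in for the
// actual container launch. If `attemptsMade` is non-null, it receives
// the number of attempts that were actually run.
LaunchOutcome launchWithRetries(
    const std::function<bool()>& launch,
    int maxAttempts,
    int* attemptsMade = nullptr)
{
  assert(maxAttempts > 0);

  int attempts = 0;
  LaunchOutcome outcome = LaunchOutcome::RetriesExhausted;

  while (attempts < maxAttempts) {
    ++attempts;
    if (launch()) {
      outcome = LaunchOutcome::Launched;
      break;
    }
  }

  if (attemptsMade != nullptr) {
    *attemptsMade = attempts;
  }

  return outcome;
}
```

On {{RetriesExhausted}} the caller would choose the escalation, e.g., failing over the agent as suggested in the description. A real implementation would presumably be asynchronous and wait between attempts.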





[jira] [Commented] (MESOS-9521) MasterAPITest.OperationUpdatesUponAgentGone is flaky

2019-01-14 Thread Benno Evers (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742276#comment-16742276
 ] 

Benno Evers commented on MESOS-9521:


Review: https://reviews.apache.org/r/69726/

The warning is known, but due to the caveat that is printed right below the 
warning
{noformat}
NOTE: You can safely ignore the above warning unless this call should not 
happen.  Do not suppress it by blindly adding an EXPECT_CALL() if you don't 
mean to enforce the call.  See 
https://github.com/google/googletest/blob/master/googlemock/docs/CookBook.md#knowing-when-to-expect
 for details.
{noformat}

I left it, because the test does not really care whether `disconnected()` 
is called or not.

> MasterAPITest.OperationUpdatesUponAgentGone is flaky
> 
>
> Key: MESOS-9521
> URL: https://issues.apache.org/jira/browse/MESOS-9521
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.8.0
> Environment: Fedora28, cmake w/ SSL
>Reporter: Benjamin Bannier
>Priority: Major
>  Labels: flaky, flaky-test
>
> The recently added test {{MasterAPITest.OperationUpdatesUponAgentGone}} is 
> flaky, e.g.,
> {noformat}../src/tests/api_tests.cpp:5051: Failure
> Value of: resources.empty()
>   Actual: true
> Expected: false
> ../3rdparty/libprocess/src/../include/process/gmock.hpp:504: Failure
> Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(to, 
> testing::A()))...
> Expected args: message matcher (32-byte object <24-00 00-00 00-00 00-00 24-00 00-00 00-00 00-00 41-63 74-75 61-6C 20-66>, 
> 1-byte object )
>  Expected: to be called once
>Actual: never called - unsatisfied and active
> {noformat}
> I am able to reproduce this reliably in fewer than 10 iterations when running 
> the test in repetition under additional system stress.
> Even if the test does not fail it produces the following gmock warning,
> {noformat}
> GMOCK WARNING:
> Uninteresting mock function call - returning directly.
> Function call: disconnected()
> {noformat}





[jira] [Comment Edited] (MESOS-9521) MasterAPITest.OperationUpdatesUponAgentGone is flaky

2019-01-14 Thread Benjamin Bannier (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741868#comment-16741868
 ] 

Benjamin Bannier edited comment on MESOS-9521 at 1/14/19 9:06 AM:
--

cc [~bennoe], [~greggomann]


was (Author: bbannier):
cc @bennoe, [~greggomann]

> MasterAPITest.OperationUpdatesUponAgentGone is flaky
> 
>
> Key: MESOS-9521
> URL: https://issues.apache.org/jira/browse/MESOS-9521
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.8.0
> Environment: Fedora28, cmake w/ SSL
>Reporter: Benjamin Bannier
>Priority: Major
>
> The recently added test {{MasterAPITest.OperationUpdatesUponAgentGone}} is 
> flaky, e.g.,
> {noformat}../src/tests/api_tests.cpp:5051: Failure
> Value of: resources.empty()
>   Actual: true
> Expected: false
> ../3rdparty/libprocess/src/../include/process/gmock.hpp:504: Failure
> Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(to, 
> testing::A()))...
> Expected args: message matcher (32-byte object <24-00 00-00 00-00 00-00 24-00 00-00 00-00 00-00 41-63 74-75 61-6C 20-66>, 
> 1-byte object )
>  Expected: to be called once
>Actual: never called - unsatisfied and active
> {noformat}
> I am able to reproduce this reliably in fewer than 10 iterations when running 
> the test in repetition under additional system stress.
> Even if the test does not fail it produces the following gmock warning,
> {noformat}
> GMOCK WARNING:
> Uninteresting mock function call - returning directly.
> Function call: disconnected()
> {noformat}





[jira] [Commented] (MESOS-9521) MasterAPITest.OperationUpdatesUponAgentGone is flaky

2019-01-14 Thread Benjamin Bannier (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741868#comment-16741868
 ] 

Benjamin Bannier commented on MESOS-9521:
-

cc @bennoe, [~greggomann]

> MasterAPITest.OperationUpdatesUponAgentGone is flaky
> 
>
> Key: MESOS-9521
> URL: https://issues.apache.org/jira/browse/MESOS-9521
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 1.8.0
> Environment: Fedora28, cmake w/ SSL
>Reporter: Benjamin Bannier
>Priority: Major
>
> The recently added test {{MasterAPITest.OperationUpdatesUponAgentGone}} is 
> flaky, e.g.,
> {noformat}../src/tests/api_tests.cpp:5051: Failure
> Value of: resources.empty()
>   Actual: true
> Expected: false
> ../3rdparty/libprocess/src/../include/process/gmock.hpp:504: Failure
> Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(to, 
> testing::A()))...
> Expected args: message matcher (32-byte object <24-00 00-00 00-00 00-00 24-00 00-00 00-00 00-00 41-63 74-75 61-6C 20-66>, 
> 1-byte object )
>  Expected: to be called once
>Actual: never called - unsatisfied and active
> {noformat}
> I am able to reproduce this reliably in fewer than 10 iterations when running 
> the test in repetition under additional system stress.
> Even if the test does not fail it produces the following gmock warning,
> {noformat}
> GMOCK WARNING:
> Uninteresting mock function call - returning directly.
> Function call: disconnected()
> {noformat}





[jira] [Created] (MESOS-9521) MasterAPITest.OperationUpdatesUponAgentGone is flaky

2019-01-14 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-9521:
---

 Summary: MasterAPITest.OperationUpdatesUponAgentGone is flaky
 Key: MESOS-9521
 URL: https://issues.apache.org/jira/browse/MESOS-9521
 Project: Mesos
  Issue Type: Bug
  Components: test
Affects Versions: 1.8.0
 Environment: Fedora28, cmake w/ SSL
Reporter: Benjamin Bannier


The recently added test {{MasterAPITest.OperationUpdatesUponAgentGone}} is 
flaky, e.g.,
{noformat}../src/tests/api_tests.cpp:5051: Failure
Value of: resources.empty()
  Actual: true
Expected: false
../3rdparty/libprocess/src/../include/process/gmock.hpp:504: Failure
Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(to, 
testing::A()))...
Expected args: message matcher (32-byte object , 
1-byte object )
 Expected: to be called once
   Actual: never called - unsatisfied and active
{noformat}

I am able to reproduce this reliably in fewer than 10 iterations when running 
the test in repetition under additional system stress.

Even if the test does not fail it produces the following gmock warning,
{noformat}

GMOCK WARNING:
Uninteresting mock function call - returning directly.
Function call: disconnected()
{noformat}





[jira] [Created] (MESOS-9520) IOTest.Read hangs on Windows

2019-01-14 Thread Jan Schlicht (JIRA)
Jan Schlicht created MESOS-9520:
---

 Summary: IOTest.Read hangs on Windows
 Key: MESOS-9520
 URL: https://issues.apache.org/jira/browse/MESOS-9520
 Project: Mesos
  Issue Type: Bug
  Components: test
 Environment: Windows
Reporter: Jan Schlicht


Noticed in test runs that {{IOTest.Read}} hangs in Windows environments. Test 
runs need to be aborted.


