[jira] [Created] (MESOS-9486) Set up `object.value` for `CREATE_DISK` and `DESTROY_DISK` authorizations.

2018-12-17 Thread Chun-Hung Hsiao (JIRA)
Chun-Hung Hsiao created MESOS-9486:
--

 Summary: Set up `object.value` for `CREATE_DISK` and 
`DESTROY_DISK` authorizations.
 Key: MESOS-9486
 URL: https://issues.apache.org/jira/browse/MESOS-9486
 Project: Mesos
  Issue Type: Improvement
  Components: master
Reporter: Chun-Hung Hsiao
Assignee: Chun-Hung Hsiao


We should be defensive and set {{object.value}} to the role of the resource 
for authorization actions {{CREATE_BLOCK_DISK}}, {{DESTROY_BLOCK_DISK}}, 
{{CREATE_MOUNT_DISK}} and {{DESTROY_MOUNT_DISK}} so that an old-school 
authorizer can rely on the field to perform authorization.

This behavior is deprecated, though, and will be removed once all `*_WITH_ROLE` 
authorization action aliases are removed.
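
A minimal sketch of the idea, not the actual master code: the role of the 
resource is mirrored into the legacy value field for the four disk actions, so 
an authorizer that only reads that field still sees the role. The classes and 
the make_object helper below are hypothetical stand-ins; the real Mesos types 
are protobuf messages and may differ in shape.

```python
from dataclasses import dataclass
from typing import Optional

# Action names taken from the issue text above.
DISK_ACTIONS = {
    "CREATE_BLOCK_DISK",
    "DESTROY_BLOCK_DISK",
    "CREATE_MOUNT_DISK",
    "DESTROY_MOUNT_DISK",
}


@dataclass
class Resource:
    """Hypothetical, simplified resource carrying only a name and a role."""
    name: str
    role: str


@dataclass
class AuthorizationObject:
    """Hypothetical, simplified authorization object."""
    resource: Resource
    value: Optional[str] = None  # legacy field read by older authorizers


def make_object(action: str, resource: Resource) -> AuthorizationObject:
    """Build the authorization object, mirroring the role into `value`."""
    obj = AuthorizationObject(resource=resource)
    if action in DISK_ACTIONS:
        # Defensive: an authorizer that only inspects `value` still sees the role.
        obj.value = resource.role
    return obj
```

For any action outside the set, the sketch leaves the legacy field unset.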



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (MESOS-9485) Unit test for master operation authorization.

2018-12-17 Thread Chun-Hung Hsiao (JIRA)
Chun-Hung Hsiao created MESOS-9485:
--

 Summary: Unit test for master operation authorization.
 Key: MESOS-9485
 URL: https://issues.apache.org/jira/browse/MESOS-9485
 Project: Mesos
  Issue Type: Task
  Components: test
Reporter: Chun-Hung Hsiao
Assignee: Chun-Hung Hsiao


We should create a unit test for MESOS-9474 and MESOS-9480.





[jira] [Commented] (MESOS-9482) Resource provider manager can crash on invalid data from resource providers

2018-12-17 Thread Chun-Hung Hsiao (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723277#comment-16723277
 ] 

Chun-Hung Hsiao commented on MESOS-9482:


Actually, I already created MESOS-9407 a while ago. Closing that one as a 
duplicate of this one, since this one describes the problem more generally.

> Resource provider manager can crash on invalid data from resource providers
> ---
>
> Key: MESOS-9482
> URL: https://issues.apache.org/jira/browse/MESOS-9482
> Project: Mesos
>  Issue Type: Bug
>Reporter: Benjamin Bannier
>Priority: Major
>
> The resource provider manager code currently contains a number of assertions 
> which will crash the manager (and its agent) if some forms of invalid data 
> are received from a resource provider. This is dangerous since resource 
> providers are not necessarily part of Mesos-controlled code (they talk to the 
> manager over an HTTP API and could even be in external processes).
> Instead of crashing, the resource provider manager should disconnect the 
> resource providers in such scenarios.





[jira] [Commented] (MESOS-9459) Reviewbot is not verifying reviews that need verification

2018-12-17 Thread Till Toenshoff (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722957#comment-16722957
 ] 

Till Toenshoff commented on MESOS-9459:
---

https://reviews.apache.org/r/69559

> Reviewbot is not verifying reviews that need verification
> -
>
> Key: MESOS-9459
> URL: https://issues.apache.org/jira/browse/MESOS-9459
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Armand Grillet
>Priority: Major
>  Labels: ci, integration
>
> For example this run of ReviewBot 
> https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Reviewbot/23594/console
>  says that there are no reviews to be verified, which is false because if we 
> look at ReviewBoard there are a bunch of reviews that have not been commented 
> on by ReviewBot since a new diff has been posted.
> {noformat}
> 12-05-18_23:41:54 - Running 
> /home/jenkins/jenkins-slave/workspace/Mesos-Reviewbot/support/verify-reviews.py
> 0 review requests need verification
> {noformat}
> I see that the logic of the verify-reviews.py script was changed as part of 
> the python3 transition here: https://reviews.apache.org/r/68619/diff/1#27 
> which likely caused the bug. 
> As an aside, it's unfortunate that the python3 update was bundled with logic 
> changes in this review. cc [~andschwa]





[jira] [Created] (MESOS-9484) GroupTest.GroupDataWithDisconnect is flaky

2018-12-17 Thread Benno Evers (JIRA)
Benno Evers created MESOS-9484:
--

 Summary: GroupTest.GroupDataWithDisconnect is flaky
 Key: MESOS-9484
 URL: https://issues.apache.org/jira/browse/MESOS-9484
 Project: Mesos
  Issue Type: Bug
 Environment: Mac OSX w/ libevent
Reporter: Benno Evers


Observed the following error in our CI:
{noformat}
../../src/tests/group_tests.cpp:129: Failure
data.get() is NONE
{noformat}

Full log:
{noformat}
[ RUN  ] GroupTest.GroupDataWithDisconnect
I1214 15:06:53.386937 398710208 zookeeper_test_server.cpp:156] Started 
ZooKeeperTestServer on port 51193
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-14 15:06:53,387:69505(0x739ee000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:51193 sessionTimeout=1 
watcher=0x11a65f9a0 sessionId=0 sessionPasswd= context=0x7fcd06163550 
flags=0
2018-12-14 15:06:53,387:69505(0x74415000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:51193]
2018-12-14 15:06:53,389:69505(0x74415000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:51193], 
sessionId=0x167aef9004a, negotiated timeout=1
I1214 15:06:53.389168 60743680 group.cpp:341] Group process 
(zookeeper-group(40)@10.0.49.4:49309) connected to ZooKeeper
I1214 15:06:53.389210 60743680 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (1, 0, 0)
I1214 15:06:53.389227 60743680 group.cpp:419] Trying to create path '/test' in 
ZooKeeper
I1214 15:06:53.392253 398710208 zookeeper_test_server.cpp:116] Shutting down 
ZooKeeperTestServer on port 51193
2018-12-14 
15:06:53,393:69505(0x74415000):ZOO_ERROR@handle_socket_error_msg@1782: 
Socket [127.0.0.1:51193] zk retcode=-4, errno=64(Host is down): failed while 
receiving a server response
I1214 15:06:53.393187 59133952 group.cpp:452] Lost connection to ZooKeeper, 
attempting to reconnect ...
I1214 15:06:53.393661 59670528 group.cpp:700] Trying to get '/test/00' 
in ZooKeeper
2018-12-14 
15:06:53,393:69505(0x74415000):ZOO_ERROR@handle_socket_error_msg@1758: 
Socket [127.0.0.1:51193] zk retcode=-4, errno=61(Connection refused): server 
refused to accept the client
I1214 15:06:53.395321 398710208 zookeeper_test_server.cpp:156] Started 
ZooKeeperTestServer on port 51193
W1214 15:07:04.003191 59670528 group.cpp:495] Timed out waiting to connect to 
ZooKeeper. Forcing ZooKeeper session (sessionId=167aef9004a) expiration
I1214 15:07:04.003652 59670528 group.cpp:511] ZooKeeper session expired
2018-12-14 15:07:04,004:69505(0x738e8000):ZOO_INFO@zookeeper_close@2579: 
Freeing zookeeper resources for sessionId=0x167aef9004a

2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-14 15:07:04,004:69505(0x739ee000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:51193 sessionTimeout=1 

[jira] [Created] (MESOS-9483) ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors is flaky

2018-12-17 Thread Benno Evers (JIRA)
Benno Evers created MESOS-9483:
--

 Summary: ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors 
is flaky
 Key: MESOS-9483
 URL: https://issues.apache.org/jira/browse/MESOS-9483
 Project: Mesos
  Issue Type: Bug
 Environment: Mac OSX w/ libevent
Reporter: Benno Evers


Observed a failure with the following error:
{noformat}
../../src/tests/master_contender_detector_tests.cpp:409: Failure
Failed to wait 15secs for group1.join("data")
{noformat}

Full log:
{noformat}
[ RUN  ] ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors
I1214 15:03:56.036525 398710208 zookeeper_test_server.cpp:156] Started 
ZooKeeperTestServer on port 50199
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-14 15:03:56,036:69505(0x7396b000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:50199 sessionTimeout=1 
watcher=0x11a65f9a0 sessionId=0 sessionPasswd= context=0x7fcd061125a0 
flags=0
2018-12-14 15:03:56,037:69505(0x74415000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:50199]
2018-12-14 15:03:56,039:69505(0x74415000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:50199], 
sessionId=0x167aef64b83, negotiated timeout=1
I1214 15:03:56.039242 60207104 group.cpp:341] Group process 
(zookeeper-group(14)@10.0.49.4:49309) connected to ZooKeeper
I1214 15:03:56.039286 60207104 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (1, 0, 0)
I1214 15:03:56.039309 60207104 group.cpp:395] Authenticating with ZooKeeper 
using digest
2018-12-14 15:04:05,989:69505(0x74415000):ZOO_WARN@zookeeper_interest@1597: 
Exceeded deadline by 6619ms
2018-12-14 
15:04:05,989:69505(0x74415000):ZOO_ERROR@handle_socket_error_msg@1702: 
Socket [127.0.0.1:50199] zk retcode=-7, errno=60(Operation timed out): 
connection to 127.0.0.1:50199 timed out (exceeded timeout by 3284ms)
2018-12-14 15:04:05,989:69505(0x74415000):ZOO_WARN@zookeeper_interest@1597: 
Exceeded deadline by 6619ms
I1214 15:04:05.990031 60207104 group.cpp:452] Lost connection to ZooKeeper, 
attempting to reconnect ...
2018-12-14 15:04:09,332:69505(0x74415000):ZOO_WARN@zookeeper_interest@1597: 
Exceeded deadline by 9963ms
2018-12-14 15:04:09,332:69505(0x74415000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:50199]
2018-12-14 
15:04:09,333:69505(0x74415000):ZOO_ERROR@handle_socket_error_msg@1800: 
Socket [127.0.0.1:50199] zk retcode=-112, errno=70(Stale NFS file handle): 
sessionId=0x167aef64b83 has expired.
I1214 15:04:09.333552 59670528 group.cpp:511] ZooKeeper session expired
2018-12-14 15:04:09,333:69505(0x738e8000):ZOO_INFO@zookeeper_close@2579: 
Freeing zookeeper resources for sessionId=0x167aef64b83

2018-12-14 15:04:09,333:69505(0x7375f000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-14 15:04:09,333:69505(0x7375f000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-14 15:04:09,333:69505(0x7375f000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-14 15:04:09,333:69505(0x7375f000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-14 15:04:09,333:69505(0x7375f000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-14 15:04:09,333:69505(0x7375f000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-14 15:04:09,333:69505(0x7375f000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-14 15:04:09,333:69505(0x7375f000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-14 

[jira] [Assigned] (MESOS-9480) Master may skip processing authorization results for `LAUNCH_GROUP`.

2018-12-17 Thread Jan Schlicht (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Schlicht reassigned MESOS-9480:
---

Assignee: Chun-Hung Hsiao  (was: Jan Schlicht)

> Master may skip processing authorization results for `LAUNCH_GROUP`.
> 
>
> Key: MESOS-9480
> URL: https://issues.apache.org/jira/browse/MESOS-9480
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.7.0
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>Priority: Blocker
>  Labels: mesosphere
>
> If there is a validation error for {{LAUNCH_GROUP}}, or if there are multiple 
> authorization errors for some of the tasks in a {{LAUNCH_GROUP}}, the master 
> will skip processing the remaining authorization results, so subsequent 
> operations incorrectly examine these leftover results:
> https://github.com/apache/mesos/blob/3ade731d0c1772206c4afdf56318cfab6356acee/src/master/master.cpp#L5487-L5521
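
The failure mode described above can be sketched in simplified form. This is 
not the code in master.cpp; it assumes a hypothetical FIFO of per-task 
authorization outcomes shared by all operations in one call, and both function 
names are made up for illustration. The buggy variant returns at the first 
failure and leaves the rest of its results in the queue, where the next 
operation would read them as its own.

```python
from collections import deque
from typing import Deque


def process_launch_group(results: Deque[bool], task_count: int) -> bool:
    """Consume exactly `task_count` results, even after the first failure."""
    # Drain this operation's share first so later operations stay aligned.
    own = [results.popleft() for _ in range(task_count)]
    return all(own)


def process_launch_group_buggy(results: Deque[bool], task_count: int) -> bool:
    """Stop at the first failure and leave the remaining results queued."""
    for _ in range(task_count):
        if not results.popleft():
            return False  # remaining results are never consumed
    return True
```

With two failing tasks followed by one result for a later operation, the first 
variant leaves only the later operation's result in the queue, while the buggy 
variant also leaves behind the second task's failure.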





[jira] [Created] (MESOS-9482) Resource provider manager can crash on invalid data from resource providers

2018-12-17 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-9482:
---

 Summary: Resource provider manager can crash on invalid data from 
resource providers
 Key: MESOS-9482
 URL: https://issues.apache.org/jira/browse/MESOS-9482
 Project: Mesos
  Issue Type: Bug
Reporter: Benjamin Bannier


The resource provider manager code currently contains a number of assertions 
which will crash the manager (and its agent) if some forms of invalid data are 
received from a resource provider. This is dangerous since resource providers 
are not necessarily part of Mesos-controlled code (they talk to the manager 
over an HTTP API and could even be in external processes).

Instead of crashing, the resource provider manager should disconnect the 
resource providers in such scenarios.
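
A minimal sketch of the proposed behavior, assuming a hypothetical Manager 
class and a made-up validation rule (the real manager is C++ and validates 
protobuf messages): invalid input is reported as an error and the offending 
provider is disconnected, instead of tripping an assertion that takes down the 
whole process.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Manager:
    """Hypothetical stand-in for a resource provider manager."""
    connected: Set[str] = field(default_factory=set)

    def validate(self, message: Dict[str, object]) -> Optional[str]:
        """Return an error string for malformed input, or None if it is valid."""
        # Illustrative rule only: require a string-valued 'type' key.
        if not isinstance(message.get("type"), str):
            return "missing or non-string 'type'"
        return None

    def handle(self, provider_id: str, message: Dict[str, object]) -> bool:
        """Process one message; disconnect the sender rather than assert."""
        error = self.validate(message)
        if error is not None:
            # Untrusted input: drop this provider, keep the manager running.
            self.connected.discard(provider_id)
            return False
        return True
```

Other connected providers are unaffected when one of them sends bad data.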





[jira] [Created] (MESOS-9481) Registration of frameworks with set but empty ID should not be allowed

2018-12-17 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-9481:
---

 Summary: Registration of frameworks with set but empty ID should 
not be allowed
 Key: MESOS-9481
 URL: https://issues.apache.org/jira/browse/MESOS-9481
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Benjamin Bannier


Mesos currently allows frameworks to register with a set but empty ID. 
Internally this is treated identically to registration without a set ID, and 
a fair amount of code exists to support this.

We should check whether we really need to provide this level of flexibility. It 
complicates both the implementation and the API, which leads to code that is 
conceptually harder to grasp (and therefore tends to be error-prone). Ideally we 
should reject a set but empty {{FrameworkID}}.
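
A minimal sketch of such a check, assuming a hypothetical 
validate_framework_id helper in which None models an unset ID and a string 
models a set ID (the real FrameworkID is a protobuf message):

```python
from typing import Optional


def validate_framework_id(framework_id: Optional[str]) -> Optional[str]:
    """Return an error for a set-but-empty ID, or None if acceptable.

    `None` models an unset ID (a new registration); a non-empty string
    models a set ID. Both are accepted.
    """
    if framework_id is not None and framework_id == "":
        return "FrameworkID is set but empty"
    return None
```

With this rule, the unset and the non-empty cases remain the only two paths 
that downstream code has to handle.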


