[jira] [Created] (MESOS-4649) Speed up ZooKeeperTest.LeaderContender by advance Clock.

2016-02-11 Thread haosdent (JIRA)
haosdent created MESOS-4649:
---

 Summary: Speed up ZooKeeperTest.LeaderContender by advance Clock.
 Key: MESOS-4649
 URL: https://issues.apache.org/jira/browse/MESOS-4649
 Project: Mesos
  Issue Type: Improvement
Reporter: haosdent
Assignee: haosdent
Priority: Minor


ZooKeeperTest.LeaderContender reconnects multiple times. We could advance the 
Clock to avoid waiting for those reconnect timeouts.
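
The idea of advancing a clock instead of waiting can be sketched as below. This is a minimal stand-alone illustration, not libprocess code: `ManualClock` and `countReconnects` are hypothetical names, and the 10-second timeout in the usage note is an assumed value rather than one taken from the test.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <utility>

// Hypothetical manual clock (NOT libprocess's Clock): virtual time only
// moves when advance() is called, so timers fire without real waiting.
class ManualClock {
public:
  using Duration = std::chrono::milliseconds;

  // Schedule `callback` to fire once `delay` of virtual time has elapsed.
  void schedule(Duration delay, std::function<void()> callback) {
    timers.emplace(now + delay, std::move(callback));
  }

  // Move virtual time forward and fire every timer that is now due.
  void advance(Duration delta) {
    now += delta;
    while (!timers.empty() && timers.begin()->first <= now) {
      std::function<void()> callback = std::move(timers.begin()->second);
      timers.erase(timers.begin());
      callback();
    }
  }

  Duration elapsed() const { return now; }

private:
  Duration now{0};
  std::multimap<Duration, std::function<void()>> timers;
};

// Hypothetical test helper: schedules `attempts` reconnect timers spaced
// `timeout` apart, then skips over all of them with a single advance().
int countReconnects(
    ManualClock& clock, int attempts, ManualClock::Duration timeout) {
  int fired = 0;
  for (int i = 1; i <= attempts; ++i) {
    clock.schedule(timeout * i, [&fired]() { ++fired; });
  }
  clock.advance(timeout * attempts);
  return fired;
}
```

Calling `countReconnects(clock, 3, std::chrono::milliseconds(10000))` moves 30 seconds of virtual time at once, so a test built this way would not need to sleep through each timeout.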



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4159) Speed up GroupTest.*

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142471#comment-15142471
 ] 

haosdent commented on MESOS-4159:
-

GroupTest.GroupPathWithRestrictivePerms depends on 
[MESOS-4648|https://issues.apache.org/jira/browse/MESOS-4648]

> Speed up GroupTest.*
> 
>
> Key: MESOS-4159
> URL: https://issues.apache.org/jira/browse/MESOS-4159
> Project: Mesos
>  Issue Type: Epic
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> Execution times on Mac OS 10.10.4:
> {code}
> GroupTest.GroupJoinWithDisconnect (3352 ms)
> GroupTest.GroupDataWithDisconnect (3350 ms)
> GroupTest.GroupCancelWithDisconnect (2013 ms)
> GroupTest.GroupPathWithRestrictivePerms (13368 ms)
> GroupTest.RetryableErrors (26720 ms)
> {code}





[jira] [Comment Edited] (MESOS-4649) Speed up ZooKeeperTest.LeaderContender by advance Clock.

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142404#comment-15142404
 ] 

haosdent edited comment on MESOS-4649 at 2/11/16 8:32 AM:
--

Patch: https://reviews.apache.org/r/43472


was (Author: haosd...@gmail.com):
Patch: https://issues.apache.org/jira/browse/MESOS-4649

> Speed up ZooKeeperTest.LeaderContender by advance Clock.
> 
>
> Key: MESOS-4649
> URL: https://issues.apache.org/jira/browse/MESOS-4649
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
> ZooKeeperTest.LeaderContender reconnects multiple times. We could advance the 
> Clock to avoid waiting for those reconnect timeouts.





[jira] [Created] (MESOS-4652) Speed up ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster by advance Clock.

2016-02-11 Thread haosdent (JIRA)
haosdent created MESOS-4652:
---

 Summary: Speed up 
ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster
 by advance Clock.
 Key: MESOS-4652
 URL: https://issues.apache.org/jira/browse/MESOS-4652
 Project: Mesos
  Issue Type: Improvement
Reporter: haosdent
Assignee: haosdent
Priority: Minor


ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession 
contains a reconnect. We could advance the Clock to avoid waiting for the 
expiration timeout and speed up reconnecting.





[jira] [Updated] (MESOS-4651) Speed up ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession by advance Clock.

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4651:

Description: 
ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession 
contains a reconnect. We could advance the Clock to avoid waiting for the 
expiration timeout and speed up reconnecting.  (was: 
ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork contains 
a reconnect. We could advance the Clock to avoid waiting for the expiration 
timeout and speed up reconnecting.)

> Speed up 
> ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession by 
> advance Clock.
> --
>
> Key: MESOS-4651
> URL: https://issues.apache.org/jira/browse/MESOS-4651
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
> ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession 
> contains a reconnect. We could advance the Clock to avoid waiting for the 
> expiration timeout and speed up reconnecting.





[jira] [Updated] (MESOS-4650) Speed up ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork by advance Clock.

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4650:

Description:   We could advance the Clock to avoid waiting for the expiration 
timeout and speed up reconnecting.  (was: ZooKeeperTest.LeaderContender 
reconnects multiple times. We could advance the Clock to avoid waiting for 
those reconnect timeouts.)

> Speed up 
> ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork by 
> advance Clock.
> 
>
> Key: MESOS-4650
> URL: https://issues.apache.org/jira/browse/MESOS-4650
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
>   We could advance the Clock to avoid waiting for the expiration timeout and 
> speed up reconnecting.





[jira] [Commented] (MESOS-4648) Backport zookeeper slow add_auth patch

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142401#comment-15142401
 ] 

haosdent commented on MESOS-4648:
-

This issue is pending until the patch for 
[ZOOKEEPER-770|https://issues.apache.org/jira/browse/ZOOKEEPER-770] is merged 
upstream.

> Backport zookeeper slow add_auth patch
> --
>
> Key: MESOS-4648
> URL: https://issues.apache.org/jira/browse/MESOS-4648
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: test, zookeeper
>
> Backport [ZOOKEEPER-770 Slow add_auth calls with multi-threaded 
> client|https://issues.apache.org/jira/browse/ZOOKEEPER-770] to fix slow 
> add_auth calls in the C client.





[jira] [Commented] (MESOS-4159) Speed up GroupTest.*

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142468#comment-15142468
 ] 

haosdent commented on MESOS-4159:
-

After applying the patch:
{code}
[   OK ] GroupTest.GroupJoinWithDisconnect (405 ms)
[   OK ] GroupTest.GroupDataWithDisconnect (192 ms)
[   OK ] GroupTest.GroupCancelWithDisconnect (250 ms)
[   OK ] GroupTest.GroupPathWithRestrictivePerms (334 ms)
[   OK ] GroupTest.RetryableErrors (341 ms)
{code}

> Speed up GroupTest.*
> 
>
> Key: MESOS-4159
> URL: https://issues.apache.org/jira/browse/MESOS-4159
> Project: Mesos
>  Issue Type: Epic
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> Execution times on Mac OS 10.10.4:
> {code}
> GroupTest.GroupJoinWithDisconnect (3352 ms)
> GroupTest.GroupDataWithDisconnect (3350 ms)
> GroupTest.GroupCancelWithDisconnect (2013 ms)
> GroupTest.GroupPathWithRestrictivePerms (13368 ms)
> GroupTest.RetryableErrors (26720 ms)
> {code}





[jira] [Updated] (MESOS-4650) Speed up ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork by advance Clock.

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4650:

Description: 
ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork contains 
a reconnect. We could advance the Clock to avoid waiting for the expiration 
timeout and speed up reconnecting.  (was:   We could advance the Clock to avoid 
waiting for the expiration timeout and speed up reconnecting.)

> Speed up 
> ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork by 
> advance Clock.
> 
>
> Key: MESOS-4650
> URL: https://issues.apache.org/jira/browse/MESOS-4650
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
> ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork 
> contains a reconnect. We could advance the Clock to avoid waiting for the 
> expiration timeout and speed up reconnecting.





[jira] [Created] (MESOS-4651) Speed up ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession by advance Clock.

2016-02-11 Thread haosdent (JIRA)
haosdent created MESOS-4651:
---

 Summary: Speed up 
ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSession by 
advance Clock.
 Key: MESOS-4651
 URL: https://issues.apache.org/jira/browse/MESOS-4651
 Project: Mesos
  Issue Type: Improvement
Reporter: haosdent
Assignee: haosdent
Priority: Minor


ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork contains 
a reconnect. We could advance the Clock to avoid waiting for the expiration 
timeout and speed up reconnecting.





[jira] [Created] (MESOS-4650) Speed up ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork by advance Clock.

2016-02-11 Thread haosdent (JIRA)
haosdent created MESOS-4650:
---

 Summary: Speed up 
ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork by 
advance Clock.
 Key: MESOS-4650
 URL: https://issues.apache.org/jira/browse/MESOS-4650
 Project: Mesos
  Issue Type: Improvement
Reporter: haosdent
Assignee: haosdent
Priority: Minor


ZooKeeperTest.LeaderContender reconnects multiple times. We could advance the 
Clock to avoid waiting for those reconnect timeouts.





[jira] [Commented] (MESOS-4633) Tests will dereference stack allocated agent objects upon assertion/expectation failure.

2016-02-11 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142429#comment-15142429
 ] 

Michael Park commented on MESOS-4633:
-

{noformat}
commit 28917d657a69ee4731375ba54fe67d13816198ff
Author: Joseph Wu 
Date:   Wed Feb 10 16:56:35 2016 -0800

Constrain `Option`'s forwarding constructor to constructible types.

Review: https://reviews.apache.org/r/43434/
{noformat}

> Tests will dereference stack allocated agent objects upon 
> assertion/expectation failure.
> 
>
> Key: MESOS-4633
> URL: https://issues.apache.org/jira/browse/MESOS-4633
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: flaky, mesosphere, tech-debt, test
>
> Tests that use the {{StartSlave}} test helper are generally fragile when the 
> test fails an assert/expect in the middle of the test.  This is because the 
> {{StartSlave}} helper takes raw pointer arguments, which may be 
> stack-allocated.
> In case of an assert failure, the test immediately exits (destroying stack 
> allocated objects) and proceeds onto test cleanup.  The test cleanup may 
> dereference some of these destroyed objects, leading to a test crash like:
> {code}
> [18:27:36][Step 8/8] F0204 18:27:35.981302 23085 logging.cpp:64] RAW: Pure 
> virtual method called
> [18:27:36][Step 8/8] @ 0x7f7077055e1c  google::LogMessage::Fail()
> [18:27:36][Step 8/8] @ 0x7f707705ba6f  google::RawLog__()
> [18:27:36][Step 8/8] @ 0x7f70760f76c9  __cxa_pure_virtual
> [18:27:36][Step 8/8] @   0xa9423c  
> mesos::internal::tests::Cluster::Slaves::shutdown()
> [18:27:36][Step 8/8] @  0x1074e45  
> mesos::internal::tests::MesosTest::ShutdownSlaves()
> [18:27:36][Step 8/8] @  0x1074de4  
> mesos::internal::tests::MesosTest::Shutdown()
> [18:27:36][Step 8/8] @  0x1070ec7  
> mesos::internal::tests::MesosTest::TearDown()
> {code}
> The {{StartSlave}} helper should take {{shared_ptr}} arguments instead.
> This also means that we can remove the {{Shutdown}} helper from most of these 
> tests.
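
The ownership change proposed in the issue description can be sketched as below. This is a minimal stand-alone illustration, not the Mesos test harness: `Isolator`, `Cluster`, and `runTestBody` are hypothetical stand-ins, and the real `StartSlave` signature is not reproduced here.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object that a test hands to the cluster.
struct Isolator {
  virtual ~Isolator() = default;
  virtual std::string name() const { return "isolator"; }
};

// Hypothetical stand-in for the test cluster. Because it stores
// shared_ptr copies, the objects stay alive until shutdown(), even if
// the test body has already returned.
class Cluster {
public:
  void start(std::shared_ptr<Isolator> isolator) {
    isolators.push_back(std::move(isolator));
  }

  // Dereferences every registered object during cleanup and returns how
  // many were visited. With raw pointers to stack objects this is the
  // step that could crash after an early test exit.
  std::size_t shutdown() {
    std::size_t visited = 0;
    for (const std::shared_ptr<Isolator>& isolator : isolators) {
      if (!isolator->name().empty()) {
        ++visited;
      }
    }
    isolators.clear();
    return visited;
  }

private:
  std::vector<std::shared_ptr<Isolator>> isolators;
};

// Simulates a test body that returns early, as a failed assertion would.
// Returns the use count seen inside the body: one owner is the local
// variable, the other is the cluster.
long runTestBody(Cluster& cluster) {
  std::shared_ptr<Isolator> isolator = std::make_shared<Isolator>();
  cluster.start(isolator);
  return isolator.use_count();
}
```

After `runTestBody` returns, the local `shared_ptr` is gone but the cluster's copy keeps the object valid, so `shutdown()` can still call its virtual method safely.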





[jira] [Commented] (MESOS-4164) MasterTest.RecoverResources is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144121#comment-15144121
 ] 

haosdent commented on MESOS-4164:
-

After:
{code}
[   OK ] MasterTest.RecoverResources (113 ms)
{code}

> MasterTest.RecoverResources is slow
> ---
>
> Key: MESOS-4164
> URL: https://issues.apache.org/jira/browse/MESOS-4164
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterTest.RecoverResources}} test takes more than {{1s}} to finish on 
> my Mac OS 10.10.4:
> {code}
> MasterTest.RecoverResources (1018 ms)
> {code}





[jira] [Commented] (MESOS-4168) MasterMaintenanceTest.EnterMaintenanceMode is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144119#comment-15144119
 ] 

haosdent commented on MESOS-4168:
-

{code}
[   OK ] MasterMaintenanceTest.EnterMaintenanceMode (138 ms)
{code}


> MasterMaintenanceTest.EnterMaintenanceMode is slow 
> ---
>
> Key: MESOS-4168
> URL: https://issues.apache.org/jira/browse/MESOS-4168
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterMaintenanceTest.EnterMaintenanceMode}} test takes more than 
> {{5s}} to finish on my Mac OS 10.10.4:
> {code}
> MasterMaintenanceTest.EnterMaintenanceMode (5087 ms)
> {code}





[jira] [Commented] (MESOS-4169) MasterMaintenanceTest.InverseOffers is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144120#comment-15144120
 ] 

haosdent commented on MESOS-4169:
-

{code}
[   OK ] MasterMaintenanceTest.InverseOffers (134 ms)
{code}

> MasterMaintenanceTest.InverseOffers is slow
> ---
>
> Key: MESOS-4169
> URL: https://issues.apache.org/jira/browse/MESOS-4169
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterMaintenanceTest.InverseOffers}} test takes more than {{2s}} to 
> finish on my Mac OS 10.10.4:
> {code}
> MasterMaintenanceTest.InverseOffers (2027 ms)
> {code}





[jira] [Commented] (MESOS-4172) GarbageCollectorIntegrationTest.Restart is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144118#comment-15144118
 ] 

haosdent commented on MESOS-4172:
-

After 
{code}
[   OK ] GarbageCollectorIntegrationTest.Restart (158 ms)
{code}

> GarbageCollectorIntegrationTest.Restart is slow
> ---
>
> Key: MESOS-4172
> URL: https://issues.apache.org/jira/browse/MESOS-4172
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{GarbageCollectorIntegrationTest.Restart}} test takes more than {{5s}} 
> to finish on my Mac OS 10.10.4:
> {code}
> GarbageCollectorIntegrationTest.Restart (5102 ms)
> {code}





[jira] [Commented] (MESOS-4167) MasterTest.OfferTimeout is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144125#comment-15144125
 ] 

haosdent commented on MESOS-4167:
-

After
{code}
[   OK ] MasterTest.OfferTimeout (62 ms)
{code}

> MasterTest.OfferTimeout is slow
> ---
>
> Key: MESOS-4167
> URL: https://issues.apache.org/jira/browse/MESOS-4167
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterTest.OfferTimeout}} test takes more than {{1s}} to finish on my 
> Mac OS 10.10.4:
> {code}
> MasterTest.OfferTimeout (1053 ms)
> {code}





[jira] [Commented] (MESOS-4165) MasterTest.MasterInfoOnReElection is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144122#comment-15144122
 ] 

haosdent commented on MESOS-4165:
-

After
{code}
[   OK ] MasterTest.MasterInfoOnReElection (62 ms)
{code}

> MasterTest.MasterInfoOnReElection is slow
> -
>
> Key: MESOS-4165
> URL: https://issues.apache.org/jira/browse/MESOS-4165
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterTest.MasterInfoOnReElection}} test takes more than {{1s}} to 
> finish on my Mac OS 10.10.4:
> {code}
> MasterTest.MasterInfoOnReElection (1024 ms)
> {code}





[jira] [Commented] (MESOS-4171) OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144127#comment-15144127
 ] 

haosdent commented on MESOS-4171:
-

After
{code}
[   OK ] OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover (56 ms)
{code}

> OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover is slow
> --
>
> Key: MESOS-4171
> URL: https://issues.apache.org/jira/browse/MESOS-4171
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover}} test takes 
> more than {{1s}} to finish on my Mac OS 10.10.4:
> {code}
> OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover (1018 ms)
> {code}





[jira] [Commented] (MESOS-4170) OversubscriptionTest.UpdateAllocatorOnSchedulerFailover is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144126#comment-15144126
 ] 

haosdent commented on MESOS-4170:
-

After
{code}
[   OK ] OversubscriptionTest.UpdateAllocatorOnSchedulerFailover (56 ms)
{code}

> OversubscriptionTest.UpdateAllocatorOnSchedulerFailover is slow
> ---
>
> Key: MESOS-4170
> URL: https://issues.apache.org/jira/browse/MESOS-4170
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{OversubscriptionTest.UpdateAllocatorOnSchedulerFailover}} test takes 
> more than {{1s}} to finish on my Mac OS 10.10.4:
> {code}
> OversubscriptionTest.UpdateAllocatorOnSchedulerFailover (1018 ms)
> {code}





[jira] [Commented] (MESOS-4166) MasterTest.LaunchCombinedOfferTest is slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144124#comment-15144124
 ] 

haosdent commented on MESOS-4166:
-

After
{code}
[   OK ] MasterTest.LaunchCombinedOfferTest (101 ms)
{code}

> MasterTest.LaunchCombinedOfferTest is slow
> --
>
> Key: MESOS-4166
> URL: https://issues.apache.org/jira/browse/MESOS-4166
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterTest.LaunchCombinedOfferTest}} test takes more than {{2s}} to 
> finish on my Mac OS 10.10.4:
> {code}
> MasterTest.LaunchCombinedOfferTest (2023 ms)
> {code}





[jira] [Commented] (MESOS-4160) Log recover tests are slow

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144164#comment-15144164
 ] 

haosdent commented on MESOS-4160:
-

Hi [~lins05], are you still working on this?

> Log recover tests are slow
> --
>
> Key: MESOS-4160
> URL: https://issues.apache.org/jira/browse/MESOS-4160
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: Shuai Lin
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> On Mac OS 10.10.4, some tests take longer than {{1s}} to finish:
> {code}
> RecoverTest.AutoInitialization (1003 ms)
> RecoverTest.AutoInitializationRetry (1000 ms)
> {code}





[jira] [Commented] (MESOS-3738) Mesos health check is invoked incorrectly when Mesos slave is within the docker container

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144175#comment-15144175
 ] 

haosdent commented on MESOS-3738:
-

Hi [~meatmanek], I think you could patch it as I mentioned above.

> Mesos health check is invoked incorrectly when Mesos slave is within the 
> docker container
> -
>
> Key: MESOS-3738
> URL: https://issues.apache.org/jira/browse/MESOS-3738
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.25.0
> Environment: Docker 1.8.0:
> Client:
>  Version:  1.8.0
>  API version:  1.20
>  Go version:   go1.4.2
>  Git commit:   0d03096
>  Built:        Tue Aug 11 16:48:39 UTC 2015
>  OS/Arch:  linux/amd64
> Server:
>  Version:  1.8.0
>  API version:  1.20
>  Go version:   go1.4.2
>  Git commit:   0d03096
>  Built:        Tue Aug 11 16:48:39 UTC 2015
>  OS/Arch:  linux/amd64
> Host: Ubuntu 14.04
> Container: Debian 8.1 + Java-7
>Reporter: Yong Tang
>Assignee: haosdent
> Fix For: 0.26.0
>
> Attachments: MESOS-3738-0_23_1.patch, MESOS-3738-0_24_1.patch, 
> MESOS-3738-0_25_0.patch
>
>
> When the Mesos slave runs inside a Docker container, the COMMAND health check 
> from Marathon is invoked incorrectly.
> In such a scenario, the sandbox directory (instead of the 
> launcher/health-check directory) is used. This results in an error with the 
> container.
> Command to invoke the Mesos slave container:
> {noformat}
> sudo docker run -d -v /sys:/sys -v /usr/bin/docker:/usr/bin/docker:ro -v 
> /usr/lib/x86_64-linux-gnu/libapparmor.so.1:/usr/lib/x86_64-linux-gnu/libapparmor.so.1:ro
>  -v /var/run/docker.sock:/var/run/docker.sock -v /tmp/mesos:/tmp/mesos mesos 
> mesos slave --master=zk://10.2.1.2:2181/mesos --containerizers=docker,mesos 
> --executor_registration_timeout=5mins --docker_stop_timeout=10secs 
> --launcher=posix
> {noformat}
> Marathon JSON file:
> {code}
> {
>   "id": "ubuntu",
>   "container":
>   {
> "type": "DOCKER",
> "docker":
> {
>   "image": "ubuntu",
>   "network": "BRIDGE",
>   "parameters": []
> }
>   },
>   "args": [ "bash", "-c", "while true; do echo 1; sleep 5; done" ],
>   "uris": [],
>   "healthChecks":
>   [
> {
>   "protocol": "COMMAND",
>   "command": { "value": "echo Success" },
>   "gracePeriodSeconds": 3000,
>   "intervalSeconds": 5,
>   "timeoutSeconds": 5,
>   "maxConsecutiveFailures": 300
> }
>   ],
>   "instances": 1
> }
> {code}
> {noformat}
> STDOUT:
> root@cea2be47d64f:/mnt/mesos/sandbox# cat stdout 
> --container="mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f"
>  --stop_timeout="10secs"
> --container="mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f"
>  --stop_timeout="10secs"
> Registered docker executor on b01e2e75afcb
> Starting task ubuntu.86bca10f-72c9-11e5-b36d-02420a020106
> 1
> Launching health check process: 
> /tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f/mesos-health-check
>  --executor=(1)@10.2.1.7:40695 
> --health_check_json={"command":{"shell":true,"value":"docker exec 
> mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f
>  sh -c \" echo Success 
> \""},"consecutive_failures":300,"delay_seconds":0.0,"grace_period_seconds":3000.0,"interval_seconds":5.0,"timeout_seconds":5.0}
>  --task_id=ubuntu.86bca10f-72c9-11e5-b36d-02420a020106
> Health check process launched at pid: 94
> 1
> 1
> 1
> 1
> 1
> STDERR:
> root@cea2be47d64f:/mnt/mesos/sandbox# cat stderr
> I1014 23:15:58.127950    56 exec.cpp:134] Version: 0.25.0
> I1014 23:15:58.130627    62 exec.cpp:208] Executor registered on slave 
> 

[jira] [Assigned] (MESOS-4162) SlaveTest.MetricsSlaveLaunchErrors is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4162:
---

Assignee: haosdent

> SlaveTest.MetricsSlaveLaunchErrors is slow
> --
>
> Key: MESOS-4162
> URL: https://issues.apache.org/jira/browse/MESOS-4162
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{SlaveTest.MetricsSlaveLaunchErrors}} test takes around {{1s}} to finish 
> on my Mac OS 10.10.4:
> {code}
> SlaveTest.MetricsSlaveLaunchErrors (1009 ms)
> {code}





[jira] [Created] (MESOS-4656) strings::split behaves incorrectly when n=1

2016-02-11 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-4656:
--

 Summary: strings::split behaves incorrectly when n=1
 Key: MESOS-4656
 URL: https://issues.apache.org/jira/browse/MESOS-4656
 Project: Mesos
  Issue Type: Bug
  Components: stout
Reporter: Benjamin Mahler
Assignee: Benjamin Mahler


While looking at the patches for MESOS-3833, I noticed that the code for 
strings::split behaves incorrectly for n=1 (maximum number of tokens).

Adding the following test case demonstrates the issue:

{code}
TEST(StringsTest, SplitNOne)
{
  vector<string> tokens = strings::split("foo,bar,,,", ",", 1);
  ASSERT_EQ(1u, tokens.size());
  EXPECT_EQ("foo,bar,,,", tokens[0]);
}
{code}

This fails as follows:

{noformat}
[ RUN  ] StringsTest.SplitNOne
../../../../3rdparty/libprocess/3rdparty/stout/tests/strings_tests.cpp:357: 
Failure
Value of: tokens.size()
  Actual: 5
Expected: 1u
Which is: 1
[  FAILED  ] StringsTest.SplitNOne (0 ms)
{noformat}
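
The behavior that the test case above expects can be sketched as below. This is a hypothetical stand-alone helper (`splitAtMost` is an illustrative name), not the stout implementation or its eventual fix. It assumes that `n` caps the number of tokens and that the last token keeps the unsplit remainder, as the expectations in `SplitNOne` imply.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, NOT stout's strings::split. Splits `s` on any
// character in `delims`, producing at most `n` tokens; the final token
// holds whatever is left of the input, delimiters included.
std::vector<std::string> splitAtMost(
    const std::string& s,
    const std::string& delims,
    std::size_t n)
{
  std::vector<std::string> tokens;
  if (n == 0) {
    return tokens;
  }

  std::size_t start = 0;

  // Only split while there is still room for one more token after the
  // one being cut. With n == 1 this loop never runs.
  while (tokens.size() + 1 < n) {
    std::size_t next = s.find_first_of(delims, start);
    if (next == std::string::npos) {
      break;
    }
    tokens.push_back(s.substr(start, next - start));
    start = next + 1;
  }

  // The remainder (possibly empty) becomes the last token.
  tokens.push_back(s.substr(start));
  return tokens;
}
```

With this sketch, `splitAtMost("foo,bar,,,", ",", 1)` yields the single token `"foo,bar,,,"`, while a large `n` yields the five tokens seen in the failing output.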





[jira] [Commented] (MESOS-4646) PortMappingIsolatorTests get kernel stuck.

2016-02-11 Thread Cong Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143334#comment-15143334
 ] 

Cong Wang commented on MESOS-4646:
--

OK, the kernel hang is probably another kernel bug, but without a kernel stack 
trace I have no idea which bug it is. Could you please try to set up kdump to 
capture the kernel crash/hang?

BTW, here at Twitter we use a 4.1 kernel plus the above fix. I just ran 
PortMappingIsolatorTest 30 times and every run passed, so this may be a new 
kernel bug that I have not seen before.

> PortMappingIsolatorTests get kernel stuck.
> --
>
> Key: MESOS-4646
> URL: https://issues.apache.org/jira/browse/MESOS-4646
> Project: Mesos
>  Issue Type: Bug
> Environment: Linux Kernel 3.19.9-49-generic,
> libnl-3.2.27
>Reporter: Till Toenshoff
>Assignee: Cong Wang
>
> {noformat}
> $ sudo ./bin/mesos-tests.sh --gtest_filter="*PortMappingIsolatorTest*"
> Source directory: /home/till/scratchpad/mesos
> Build directory: /home/till/scratchpad/mesos/build
> -
> We cannot run any cgroups tests that require mounting
> hierarchies because you have the following hierarchies mounted:
> /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, 
> /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, 
> /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/net_cls, 
> /sys/fs/cgroup/net_prio, /sys/fs/cgroup/perf_event, /sys/fs/cgroup/systemd
> We'll disable the CgroupsNoHierarchyTest test fixture for now.
> -
> WARNING: perf not found for kernel 3.19.0-49
>   You may need to install the following packages for this specific kernel:
> linux-tools-3.19.0-49-generic
> linux-cloud-tools-3.19.0-49-generic
>   You may also want to install one of the following packages to keep up to 
> date:
> linux-tools-generic-lts-
> linux-cloud-tools-generic-lts-
> -
> No 'perf' command found so no 'perf' tests will be run
> -
> WARNING: perf not found for kernel 3.19.0-49
>   You may need to install the following packages for this specific kernel:
> linux-tools-3.19.0-49-generic
> linux-cloud-tools-3.19.0-49-generic
>   You may also want to install one of the following packages to keep up to 
> date:
> linux-tools-generic-lts-
> linux-cloud-tools-generic-lts-
> -
> The 'perf' command wasn't found so tests using it
> to sample the 'cycles' hardware event will not be run.
> -
> /bin/nc
> /usr/local/bin/curl
> Note: Google Test filter = 
> 

[jira] [Created] (MESOS-4659) Consider how to handle orphaned tasks after master failover

2016-02-11 Thread Neil Conway (JIRA)
Neil Conway created MESOS-4659:
--

 Summary: Consider how to handle orphaned tasks after master 
failover
 Key: MESOS-4659
 URL: https://issues.apache.org/jira/browse/MESOS-4659
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Neil Conway


If a framework becomes disconnected from the master, its tasks are killed after 
waiting for {{failover_timeout}}.

However, if a master failover occurs but a framework never reconnects to the 
new master, we never kill any of the tasks associated with that framework. 
These tasks remain orphaned and presumably would need to be manually removed by 
the operator.

We should consider whether to kill such orphaned tasks automatically, likely 
after waiting for some (framework-configurable?) timeout.
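
One possible shape of such a check is sketched below. It is purely illustrative: `FrameworkState`, `orphanTimeout`, and `orphanedFrameworks` are hypothetical names, and the sketch assumes the new master somehow knows a per-framework timeout, which is exactly the open question raised above.

```cpp
#include <chrono>
#include <string>
#include <vector>

// Hypothetical bookkeeping for a framework that the new master learned
// about after failover. None of these fields mirror Mesos structures.
struct FrameworkState {
  std::string id;
  bool reconnected;                   // Has it reconnected to the new master?
  std::chrono::seconds orphanTimeout; // Assumed per-framework grace period.
};

// Returns the ids of frameworks that have not reconnected and whose
// (assumed) grace period has run out since the master failover.
std::vector<std::string> orphanedFrameworks(
    const std::vector<FrameworkState>& frameworks,
    std::chrono::seconds sinceFailover)
{
  std::vector<std::string> orphaned;
  for (const FrameworkState& framework : frameworks) {
    if (!framework.reconnected &&
        sinceFailover >= framework.orphanTimeout) {
      orphaned.push_back(framework.id);
    }
  }
  return orphaned;
}
```

A caller would then decide what to do with the tasks of the returned frameworks; the sketch deliberately stops short of any kill policy.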





[jira] [Updated] (MESOS-4660) Document net_cls isolator in docs/mesos-containerizer.md.

2016-02-11 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4660:
--
  Sprint: Mesosphere Sprint 28
Story Points: 1

> Document net_cls isolator in docs/mesos-containerizer.md.
> -
>
> Key: MESOS-4660
> URL: https://issues.apache.org/jira/browse/MESOS-4660
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>
> We need to add a section in the doc to describe how to use cgroups/net_cls 
> isolator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4660) Document net_cls isolator in docs/mesos-containerizer.md.

2016-02-11 Thread Jie Yu (JIRA)
Jie Yu created MESOS-4660:
-

 Summary: Document net_cls isolator in docs/mesos-containerizer.md.
 Key: MESOS-4660
 URL: https://issues.apache.org/jira/browse/MESOS-4660
 Project: Mesos
  Issue Type: Task
Reporter: Jie Yu


We need to add a section in the doc to describe how to use cgroups/net_cls 
isolator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-4633) Tests will dereference stack allocated agent objects upon assertion/expectation failure.

2016-02-11 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141913#comment-15141913
 ] 

Joseph Wu edited comment on MESOS-4633 at 2/11/16 7:33 PM:
---

|| Review || Summary ||
| https://reviews.apache.org/r/43434/ | Change to {{Option}} |
|| Discarded below | (decided to take a different approach) |
| https://reviews.apache.org/r/43435/ | Change to {{StartSlave}} helper |
| https://reviews.apache.org/r/43436/ | Change to {{TestContainerizer}} |
| https://reviews.apache.org/r/43437/
https://reviews.apache.org/r/43438/
https://reviews.apache.org/r/43439/
https://reviews.apache.org/r/43440/
https://reviews.apache.org/r/43441/
https://reviews.apache.org/r/43442/
https://reviews.apache.org/r/43444/
https://reviews.apache.org/r/43445/
https://reviews.apache.org/r/43446/
https://reviews.apache.org/r/43447/
https://reviews.apache.org/r/43448/ | Tons and tons of test changes |


was (Author: kaysoky):
|| Review || Summary ||
| https://reviews.apache.org/r/43434/ | Change to {{Option}} |
| https://reviews.apache.org/r/43435/ | Change to {{StartSlave}} helper |
| https://reviews.apache.org/r/43436/ | Change to {{TestContainerizer}} |
| https://reviews.apache.org/r/43437/
https://reviews.apache.org/r/43438/
https://reviews.apache.org/r/43439/
https://reviews.apache.org/r/43440/
https://reviews.apache.org/r/43441/
https://reviews.apache.org/r/43442/
https://reviews.apache.org/r/43444/
https://reviews.apache.org/r/43445/
https://reviews.apache.org/r/43446/
https://reviews.apache.org/r/43447/
https://reviews.apache.org/r/43448/ | Tons and tons of test changes |

> Tests will dereference stack allocated agent objects upon 
> assertion/expectation failure.
> 
>
> Key: MESOS-4633
> URL: https://issues.apache.org/jira/browse/MESOS-4633
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: flaky, mesosphere, tech-debt, test
>
> Tests that use the {{StartSlave}} test helper are generally fragile when the 
> test fails an assert/expect in the middle of the test.  This is because the 
> {{StartSlave}} helper takes raw pointer arguments, which may be 
> stack-allocated.
> In case of an assert failure, the test immediately exits (destroying stack 
> allocated objects) and proceeds onto test cleanup.  The test cleanup may 
> dereference some of these destroyed objects, leading to a test crash like:
> {code}
> [18:27:36][Step 8/8] F0204 18:27:35.981302 23085 logging.cpp:64] RAW: Pure 
> virtual method called
> [18:27:36][Step 8/8] @ 0x7f7077055e1c  google::LogMessage::Fail()
> [18:27:36][Step 8/8] @ 0x7f707705ba6f  google::RawLog__()
> [18:27:36][Step 8/8] @ 0x7f70760f76c9  __cxa_pure_virtual
> [18:27:36][Step 8/8] @   0xa9423c  
> mesos::internal::tests::Cluster::Slaves::shutdown()
> [18:27:36][Step 8/8] @  0x1074e45  
> mesos::internal::tests::MesosTest::ShutdownSlaves()
> [18:27:36][Step 8/8] @  0x1074de4  
> mesos::internal::tests::MesosTest::Shutdown()
> [18:27:36][Step 8/8] @  0x1070ec7  
> mesos::internal::tests::MesosTest::TearDown()
> {code}
> The {{StartSlave}} helper should take {{shared_ptr}} arguments instead.
> This also means that we can remove the {{Shutdown}} helper from most of these 
> tests.
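
The ownership change suggested above can be sketched in isolation as follows. `FakeContainerizer` and `FakeCluster` are hypothetical stand-ins, not the real Mesos test classes; the sketch only shows why shared ownership keeps cleanup safe after the scope that created an object has been left early.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a component that a test hands to the agent.
struct FakeContainerizer {
  bool destroyed = false;
  void destroy() { destroyed = true; }
};

// Hypothetical stand-in for the test cluster that tracks started agents.
class FakeCluster {
 public:
  // Taking a shared_ptr (instead of a raw pointer) means the cluster shares
  // ownership, so the object outlives the scope that created it.
  void startSlave(std::shared_ptr<FakeContainerizer> containerizer) {
    containerizers_.push_back(std::move(containerizer));
  }

  // Cleanup can touch every registered object even if the test body has
  // already returned. Returns the number of objects that were shut down.
  std::size_t shutdown() {
    std::size_t count = 0;
    for (auto& containerizer : containerizers_) {
      containerizer->destroy();
      ++count;
    }
    containerizers_.clear();
    return count;
  }

 private:
  std::vector<std::shared_ptr<FakeContainerizer>> containerizers_;
};
```

With raw pointers, the equivalent of `shutdown()` would dereference an object whose stack frame is already gone, which matches the "pure virtual method called" crash shown in the trace.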



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4658) process::Connection can lead to deadlock around execution in the same context.

2016-02-11 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-4658:
-

 Summary: process::Connection can lead to deadlock around execution 
in the same context.
 Key: MESOS-4658
 URL: https://issues.apache.org/jira/browse/MESOS-4658
 Project: Mesos
  Issue Type: Bug
  Components: HTTP API, libprocess
Reporter: Anand Mazumdar


The {{Connection}} abstraction is prone to deadlocks arising from the object 
being destroyed inside the same execution context.

Consider this example:

{code}
Option connection = process::http::connect(...);
connection.disconnected()
  .onAny(defer(self(), , connection));

connection.disconnect();
connection = None();
{code}

In the above snippet, if {{connection = None()}} is executed before the actual 
dispatch to {{ConnectionProcess}} happens, you might lose the only existing 
reference to the {{Connection}} object inside 
{{ConnectionProcess::disconnect}}. This would lead to the destruction of the 
{{Connection}} object in the {{ConnectionProcess}} execution context.

We do have a snippet in our existing code that alludes to such occurrences 
happening: 
https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/http.cpp#L1325

{code}
  // This is a one time request which will close the connection when
  // the response is received. Since 'Connection' is reference-counted,
  // we must keep a copy around until the disconnection occurs. Note
  // that in order to avoid a deadlock (Connection destruction occurring
  // from the ConnectionProcess execution context), we use 'async'.
{code}

AFAICT, for scenarios where we need to hold on to the {{Connection}} object for 
later, this approach does not suffice.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4654) Add a test env to keep temporary folder

2016-02-11 Thread haosdent (JIRA)
haosdent created MESOS-4654:
---

 Summary: Add a test env to keep temporary folder
 Key: MESOS-4654
 URL: https://issues.apache.org/jira/browse/MESOS-4654
 Project: Mesos
  Issue Type: Improvement
  Components: technical debt, test
Reporter: haosdent
Assignee: haosdent
Priority: Minor


Currently, we clean up temporary folders after we tear down test cases. 
But sometimes we want to check the container's stdout/stderr logs, which are 
located in the temporary folder, to see what happened. We may also want to check 
resource files in the temporary folder to find out why the test cases failed. I 
think it would be more convenient if we could keep the temporary folder when an 
environment variable is set.
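
A minimal sketch of the idea, assuming a POSIX system: a RAII helper that removes its temporary directory on destruction unless an environment variable is set. The variable name `KEEP_TEST_TMP_DIR` and the class `TemporaryDirectory` are hypothetical; the ticket does not name either, and the real test fixtures in Mesos are structured differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

#include <stdlib.h>    // mkdtemp, getenv, setenv, unsetenv (POSIX)
#include <sys/stat.h>  // stat
#include <unistd.h>    // rmdir

// Hypothetical variable name; the ticket does not specify one.
static const char* const KEEP_ENV = "KEEP_TEST_TMP_DIR";

// Creates a temporary directory and removes it on destruction,
// unless KEEP_ENV is set in the environment.
class TemporaryDirectory {
 public:
  TemporaryDirectory() {
    const std::string tmpl = "/tmp/sketch_XXXXXX";
    std::vector<char> buffer(tmpl.begin(), tmpl.end());
    buffer.push_back('\0');
    if (::mkdtemp(buffer.data()) != nullptr) {
      path_ = buffer.data();
    }
  }

  TemporaryDirectory(const TemporaryDirectory&) = delete;
  TemporaryDirectory& operator=(const TemporaryDirectory&) = delete;

  ~TemporaryDirectory() {
    if (!path_.empty() && ::getenv(KEEP_ENV) == nullptr) {
      ::rmdir(path_.c_str());  // Only removes an empty directory.
    }
  }

  const std::string& path() const { return path_; }

 private:
  std::string path_;
};

// Returns true if `path` exists and is a directory.
bool directoryExists(const std::string& path) {
  struct stat info;
  return ::stat(path.c_str(), &info) == 0 && S_ISDIR(info.st_mode);
}
```

A real implementation would need a recursive delete rather than `rmdir`, since the point of the ticket is to inspect the logs and resource files left inside the folder.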



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4655) PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in CentOS 7.1

2016-02-11 Thread haosdent (JIRA)
haosdent created MESOS-4655:
---

 Summary: PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in 
CentOS 7.1
 Key: MESOS-4655
 URL: https://issues.apache.org/jira/browse/MESOS-4655
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: haosdent
Assignee: haosdent






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2444) Update mesos presentations documentation

2016-02-11 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142572#comment-15142572
 ] 

Michael Park commented on MESOS-2444:
-

{noformat}
commit da240e2ca55ab3ebd3ca51934bf29a81cb5169ab
Author: Guangya Liu 
Date:   Thu Feb 11 00:42:32 2016 -0800

Updated `docs/presentations.md` to include MesosCon 2015 slides.

Review: https://reviews.apache.org/r/43403/
{noformat}

> Update mesos presentations documentation
> 
>
> Key: MESOS-2444
> URL: https://issues.apache.org/jira/browse/MESOS-2444
> Project: Mesos
>  Issue Type: Task
>  Components: documentation, project website
>Reporter: Dave Lester
>Assignee: Disha Singh
>  Labels: newbie
>
> The list of Mesos presentations in `docs/mesos-presentations.md` only 
> reflects presentations as of mid-2014 and could be more-comprehensive. It 
> would be great to include additional presentations (both slides and videos) 
> on this page.
> Optionally, the display of content on this page could be improved -- 
> potentially using a table and generating thumbnails for each video/slideshow 
> to make it more visual. If this route is taken, images can be added to 
> docs/images; ideally within a subfolder to organize them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4655) PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in CentOS 7.1

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142538#comment-15142538
 ] 

haosdent commented on MESOS-4655:
-

Patch: https://reviews.apache.org/r/43283/
https://reviews.apache.org/r/43284/

> PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in CentOS 7.1
> --
>
> Key: MESOS-4655
> URL: https://issues.apache.org/jira/browse/MESOS-4655
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Reporter: haosdent
>Assignee: haosdent
>
> PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in CentOS 7.1, error log is:
> {code}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from PerfEventIsolatorTest
> [ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> I0207 00:58:32.392724 16501 perf_event.cpp:71] Creating PerfEvent isolator
> I0207 00:58:32.440187 16501 perf_event.cpp:109] PerfEvent isolator will 
> profile for 250ms every 500ms for events: { cycles, task-clock }
> I0207 00:58:32.443006 16521 perf_event.cpp:217] Preparing perf event cgroup 
> for 239d30bb-f7a1-413b-9d99-0914149d5899
> E0207 00:58:33.224544 16518 perf_event.cpp:408] Failed to get perf sample: 
> Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected 
> number of fields
> E0207 00:58:33.727793 16516 perf_event.cpp:408] Failed to get perf sample: 
> Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected 
> number of fields
> E0207 00:58:34.230981 16517 perf_event.cpp:408] Failed to get perf sample: 
> Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected 
> number of fields
> E0207 00:58:34.734318 16520 perf_event.cpp:408] Failed to get perf sample: 
> Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected 
> number of fields
> E0207 00:58:35.237889 16517 perf_event.cpp:408] Failed to get perf sample: 
> Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected 
> number of fields
> E0207 00:58:35.742452 16522 perf_event.cpp:408] Failed to get perf sample: 
> Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected 
> number of fields
> E0207 00:58:36.246068 16515 perf_event.cpp:408] Failed to get perf sample: 
> Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected 
> number of fields
> ../../src/tests/containerizer/isolator_tests.cpp:1083: Failure
> Expected: (statistics1.get().perf().timestamp()) != 
> (statistics2.perf().timestamp()), actual: 1.45478e+09 vs 1.45478e+09
> ../../src/tests/containerizer/isolator_tests.cpp:1085: Failure
> Value of: statistics2.perf().has_cycles()
>   Actual: false
> Expected: true
> ../../src/tests/containerizer/isolator_tests.cpp:1088: Failure
> Value of: statistics2.perf().has_task_clock()
>   Actual: false
> Expected: true
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample (4069 ms)
> [--] 1 test from PerfEventIsolatorTest (4069 ms total)
> [--] Global test environment tear-down
> ../../src/tests/environment.cpp:732: Failure
> Failed
> Tests completed with child processes remaining:
> -+- 16501 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests 
> --gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
>  |-+- 16580 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests 
> --gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
>  | \-+- 16582 perf stat --all-cpus --field-separator , --log-fd 1 --event 
> cycles --cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 --event task-clock 
> --cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 -- sleep 0.25
>  |   \--- 16584 sleep 0.25
>  \--- 16581 ()
> [==] 1 test from 1 test case ran. (4095 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4039) PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142537#comment-15142537
 ] 

haosdent commented on MESOS-4039:
-

Sorry, my bad; I didn't realize this ticket had already been closed.

> PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails
> ---
>
> Key: MESOS-4039
> URL: https://issues.apache.org/jira/browse/MESOS-4039
> Project: Mesos
>  Issue Type: Bug
>Reporter: Greg Mann
>Assignee: Jan Schlicht
>  Labels: mesosphere, test-fail
>
> PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails on CentOS 6.6:
> {code}
> [--] 1 test from PerfEventIsolatorTest
> [ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> ../../src/tests/containerizer/isolator_tests.cpp:848: Failure
> isolator: Perf is not supported
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample (79 ms)
> [--] 1 test from PerfEventIsolatorTest (79 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (86 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-1356) Uncaught exceptions

2016-02-11 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142591#comment-15142591
 ] 

Klaus Ma commented on MESOS-1356:
-

Yes, there may be more or fewer issues than the description lists.

> Uncaught exceptions
> ---
>
> Key: MESOS-1356
> URL: https://issues.apache.org/jira/browse/MESOS-1356
> Project: Mesos
>  Issue Type: Bug
>Reporter: Niklas Quarfot Nielsen
>Assignee: Michael Browning
>  Labels: coverity, newbie
>
> We usually do _not_ use exceptions in Mesos, but some libraries may and we 
> should handle them and perhaps convert them into Try<>/Error.
> 
> *** CID 1213893:  Uncaught exception  (UNCAUGHT_EXCEPT)
> /src/slave/containerizer/linux_launcher.cpp: 148 in 
> mesos::internal::slave::_childMain(const std::tr1::function &, int 
> *)()
> 142   return (*func)();
> 143 }
> 144
> 145
> 146 // Helper that creates a new session then blocks on reading the pipe 
> before
> 147 // calling the supplied function.
> >>> CID 1213893:  Uncaught exception  (UNCAUGHT_EXCEPT)
> >>> In function "_childMain" an exception of type 
> >>> "std::tr1::bad_function_call" is thrown and never caught.
> 148 static int _childMain(
> 149 const lambda::function& childFunction,
> 150 int pipes[2])
> 151 {
> 152   // In child.
> 153   os::close(pipes[1]);
> 
> *** CID 1213894:  Uncaught exception  (UNCAUGHT_EXCEPT)
> /src/slave/containerizer/linux_launcher.cpp: 137 in 
> mesos::internal::slave::childMain(void *)()
> 131
> 132   return Nothing();
> 133 }
> 134
> 135
> 136 // Helper for clone() which expects an int(void*).
> >>> CID 1213894:  Uncaught exception  (UNCAUGHT_EXCEPT)
> >>> In function "childMain" an exception of type 
> >>> "std::tr1::bad_function_call" is thrown and never caught.
> 137 static int childMain(void* child)
> 138 {
> 139   const lambda::function* func =
> 140 static_cast*> (child);
> 141
> 142   return (*func)();
> 
> *** CID 1213895:  Uncaught exception  (UNCAUGHT_EXCEPT)
> /src/usage/main.cpp: 72 in main()
> 66<< endl
> 67<< "Supported options:" << endl
> 68<< flags.usage();
> 69 }
> 70
> 71
> >>> CID 1213895:  Uncaught exception  (UNCAUGHT_EXCEPT)
> >>> In function "main" an exception of type 
> >>> "google::protobuf::FatalException" is thrown and never caught.
> 72 int main(int argc, char** argv)
> 73 {
> 74   GOOGLE_PROTOBUF_VERIFY_VERSION;
> 75
> 76   Flags flags;
> 77
> 
> *** CID 1213896:  Uncaught exception  (UNCAUGHT_EXCEPT)
> /src/launcher/executor.cpp: 423 in main()
> 417 };
> 418
> 419 } // namespace internal {
> 420 } // namespace mesos {
> 421
> 422
> >>> CID 1213896:  Uncaught exception  (UNCAUGHT_EXCEPT)
> >>> In function "main" an exception of type "std::tr1::bad_function_call" 
> >>> is thrown and never caught.
> 423 int main(int argc, char** argv)
> 424 {
> 425   mesos::internal::CommandExecutor executor;
> 426   mesos::MesosExecutorDriver driver();
> 427   return driver.run() == mesos::DRIVER_STOPPED ? 0 : 1;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4647) Use in_memory as default registry when testing

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142676#comment-15142676
 ] 

haosdent commented on MESOS-4647:
-

The draft: https://reviews.apache.org/r/43480/

So far it only changes:
# MesosZooKeeperTest
# MasterTest.RecoveredSlaveDoesNotReregister
# MasterTest.RateLimitRecoveredSlaveRemoval
# MasterTest.CancelRecoveredSlaveRemoval

> Use in_memory as default registry when testing
> --
>
> Key: MESOS-4647
> URL: https://issues.apache.org/jira/browse/MESOS-4647
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>
> Currently, we use {{replicated_log}} as the default registry when testing. This 
> causes I/O operations during testing and slows down the test cases. We should 
> change it to use {{in_memory}} when testing and only use {{replicated_log}} 
> when necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4039) PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142525#comment-15142525
 ] 

haosdent commented on MESOS-4039:
-

Got it.

> PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails
> ---
>
> Key: MESOS-4039
> URL: https://issues.apache.org/jira/browse/MESOS-4039
> Project: Mesos
>  Issue Type: Bug
>Reporter: Greg Mann
>Assignee: Jan Schlicht
>  Labels: mesosphere, test-fail
>
> PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails on CentOS 6.6:
> {code}
> [--] 1 test from PerfEventIsolatorTest
> [ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> ../../src/tests/containerizer/isolator_tests.cpp:848: Failure
> isolator: Perf is not supported
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample (79 ms)
> [--] 1 test from PerfEventIsolatorTest (79 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (86 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4655) PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in CentOS 7.1

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4655:

Description: 
{code}
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from PerfEventIsolatorTest
[ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
I0207 00:58:32.392724 16501 perf_event.cpp:71] Creating PerfEvent isolator
I0207 00:58:32.440187 16501 perf_event.cpp:109] PerfEvent isolator will profile 
for 250ms every 500ms for events: { cycles, task-clock }
I0207 00:58:32.443006 16521 perf_event.cpp:217] Preparing perf event cgroup for 
239d30bb-f7a1-413b-9d99-0914149d5899
E0207 00:58:33.224544 16518 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:33.727793 16516 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:34.230981 16517 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:34.734318 16520 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:35.237889 16517 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:35.742452 16522 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:36.246068 16515 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
../../src/tests/containerizer/isolator_tests.cpp:1083: Failure
Expected: (statistics1.get().perf().timestamp()) != 
(statistics2.perf().timestamp()), actual: 1.45478e+09 vs 1.45478e+09
../../src/tests/containerizer/isolator_tests.cpp:1085: Failure
Value of: statistics2.perf().has_cycles()
  Actual: false
Expected: true
../../src/tests/containerizer/isolator_tests.cpp:1088: Failure
Value of: statistics2.perf().has_task_clock()
  Actual: false
Expected: true
[  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample (4069 ms)
[--] 1 test from PerfEventIsolatorTest (4069 ms total)

[--] Global test environment tear-down
../../src/tests/environment.cpp:732: Failure
Failed
Tests completed with child processes remaining:
-+- 16501 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests 
--gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
 |-+- 16580 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests 
--gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
 | \-+- 16582 perf stat --all-cpus --field-separator , --log-fd 1 --event 
cycles --cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 --event task-clock 
--cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 -- sleep 0.25
 |   \--- 16584 sleep 0.25
 \--- 16581 ()
[==] 1 test from 1 test case ran. (4095 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
{code}

> PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in CentOS 7.1
> --
>
> Key: MESOS-4655
> URL: https://issues.apache.org/jira/browse/MESOS-4655
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Reporter: haosdent
>Assignee: haosdent
>
> {code}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from PerfEventIsolatorTest
> [ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> I0207 00:58:32.392724 16501 perf_event.cpp:71] Creating PerfEvent isolator
> I0207 00:58:32.440187 16501 perf_event.cpp:109] PerfEvent isolator will 
> profile for 250ms every 500ms for events: { cycles, task-clock }
> I0207 00:58:32.443006 16521 perf_event.cpp:217] Preparing perf event cgroup 
> for 239d30bb-f7a1-413b-9d99-0914149d5899
> E0207 00:58:33.224544 16518 perf_event.cpp:408] Failed to get perf sample: 
> Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected 
> number of fields
> E0207 00:58:33.727793 16516 perf_event.cpp:408] Failed to get perf sample: 
> Failed to 

[jira] [Updated] (MESOS-4655) PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in CentOS 7.1

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4655:

Description: 
PerfEventIsolatorTest.ROOT_CGROUPS_Sample failed in CentOS 7.1, error log is:
{code}
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from PerfEventIsolatorTest
[ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
I0207 00:58:32.392724 16501 perf_event.cpp:71] Creating PerfEvent isolator
I0207 00:58:32.440187 16501 perf_event.cpp:109] PerfEvent isolator will profile 
for 250ms every 500ms for events: { cycles, task-clock }
I0207 00:58:32.443006 16521 perf_event.cpp:217] Preparing perf event cgroup for 
239d30bb-f7a1-413b-9d99-0914149d5899
E0207 00:58:33.224544 16518 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:33.727793 16516 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:34.230981 16517 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:34.734318 16520 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:35.237889 16517 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:35.742452 16522 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:36.246068 16515 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
../../src/tests/containerizer/isolator_tests.cpp:1083: Failure
Expected: (statistics1.get().perf().timestamp()) != 
(statistics2.perf().timestamp()), actual: 1.45478e+09 vs 1.45478e+09
../../src/tests/containerizer/isolator_tests.cpp:1085: Failure
Value of: statistics2.perf().has_cycles()
  Actual: false
Expected: true
../../src/tests/containerizer/isolator_tests.cpp:1088: Failure
Value of: statistics2.perf().has_task_clock()
  Actual: false
Expected: true
[  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample (4069 ms)
[--] 1 test from PerfEventIsolatorTest (4069 ms total)

[--] Global test environment tear-down
../../src/tests/environment.cpp:732: Failure
Failed
Tests completed with child processes remaining:
-+- 16501 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests 
--gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
 |-+- 16580 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests 
--gtest_filter=PerfEventIsolatorTest.ROOT_CGROUPS_Sample --verbose
 | \-+- 16582 perf stat --all-cpus --field-separator , --log-fd 1 --event 
cycles --cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 --event task-clock 
--cgroup mesos/239d30bb-f7a1-413b-9d99-0914149d5899 -- sleep 0.25
 |   \--- 16584 sleep 0.25
 \--- 16581 ()
[==] 1 test from 1 test case ran. (4095 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
{code}

  was:
{code}
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from PerfEventIsolatorTest
[ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
I0207 00:58:32.392724 16501 perf_event.cpp:71] Creating PerfEvent isolator
I0207 00:58:32.440187 16501 perf_event.cpp:109] PerfEvent isolator will profile 
for 250ms every 500ms for events: { cycles, task-clock }
I0207 00:58:32.443006 16521 perf_event.cpp:217] Preparing perf event cgroup for 
239d30bb-f7a1-413b-9d99-0914149d5899
E0207 00:58:33.224544 16518 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:33.727793 16516 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 
of fields
E0207 00:58:34.230981 16517 perf_event.cpp:408] Failed to get perf sample: 
Failed to parse perf sample: Failed to parse perf sample line '<not counted>,,cycles,mesos/239d30bb-f7a1-413b-9d99-0914149d5899': Unexpected number 

[jira] [Updated] (MESOS-4653) Unify test case temporary folder name format

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4653:

Component/s: test

> Unify test case temporary folder name format
> 
>
> Key: MESOS-4653
> URL: https://issues.apache.org/jira/browse/MESOS-4653
> Project: Mesos
>  Issue Type: Improvement
>  Components: test
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: test
>
> In 
> [environment.cpp#L759|https://github.com/apache/mesos/blob/master/src/tests/environment.cpp#L759]
> {code}
>   const string& path =
>     path::join("/tmp", strings::join("_", testCase, testName, "XX"));
> {code}
> The temporary folder name format here is {{testCase_testName_xx}}.
> But in 
> [utils.hpp#L37|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/3rdparty/stout/include/stout/tests/utils.hpp#L37]
> {code}
> // Create a temporary directory for the test.
> Try directory = os::mkdtemp();
> {code}
> The temporary folder we create here is {{xx}}. I think it would be better 
> if we could unify this.





[jira] [Updated] (MESOS-4653) Unify test case temporary folder name format

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4653:

Labels: test  (was: )

> Unify test case temporary folder name format
> 
>
> Key: MESOS-4653
> URL: https://issues.apache.org/jira/browse/MESOS-4653
> Project: Mesos
>  Issue Type: Improvement
>  Components: test
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: test
>
> In 
> [environment.cpp#L759|https://github.com/apache/mesos/blob/master/src/tests/environment.cpp#L759]
> {code}
>   const string& path =
>     path::join("/tmp", strings::join("_", testCase, testName, "XX"));
> {code}
> The temporary folder name format here is {{testCase_testName_xx}}.
> But in 
> [utils.hpp#L37|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/3rdparty/stout/include/stout/tests/utils.hpp#L37]
> {code}
> // Create a temporary directory for the test.
> Try<std::string> directory = os::mkdtemp();
> {code}
> The temporary folder we create here is {{xx}}. I think it would be better 
> if we could unify this.





[jira] [Commented] (MESOS-4039) PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails

2016-02-11 Thread Jan Schlicht (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142513#comment-15142513
 ] 

Jan Schlicht commented on MESOS-4039:
-

Looks like the test fails for a different reason than the one in this bug. Can 
you create a separate JIRA issue for your case to better track the differences? 
I'll close this issue here again, because the {{Perf is not supported}} failure 
has been fixed.

> PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails
> ---
>
> Key: MESOS-4039
> URL: https://issues.apache.org/jira/browse/MESOS-4039
> Project: Mesos
>  Issue Type: Bug
>Reporter: Greg Mann
>Assignee: Jan Schlicht
>  Labels: mesosphere, test-fail
>
> PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails on CentOS 6.6:
> {code}
> [--] 1 test from PerfEventIsolatorTest
> [ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> ../../src/tests/containerizer/isolator_tests.cpp:848: Failure
> isolator: Perf is not supported
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample (79 ms)
> [--] 1 test from PerfEventIsolatorTest (79 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (86 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> {code}





[jira] [Commented] (MESOS-4039) PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails

2016-02-11 Thread Jan Schlicht (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142532#comment-15142532
 ] 

Jan Schlicht commented on MESOS-4039:
-

Thanks!

> PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails
> ---
>
> Key: MESOS-4039
> URL: https://issues.apache.org/jira/browse/MESOS-4039
> Project: Mesos
>  Issue Type: Bug
>Reporter: Greg Mann
>Assignee: Jan Schlicht
>  Labels: mesosphere, test-fail
>
> PerfEventIsolatorTest.ROOT_CGROUPS_Sample fails on CentOS 6.6:
> {code}
> [--] 1 test from PerfEventIsolatorTest
> [ RUN  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> ../../src/tests/containerizer/isolator_tests.cpp:848: Failure
> isolator: Perf is not supported
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample (79 ms)
> [--] 1 test from PerfEventIsolatorTest (79 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (86 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> {code}





[jira] [Commented] (MESOS-4353) Limit the number of processes created by libprocess

2016-02-11 Thread Maged Michael (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142699#comment-15142699
 ] 

Maged Michael commented on MESOS-4353:
--

Replying to Joris:
> I don't think it makes sense to make this a maximum. Rather, it is just the 
> number of libprocess_worker_threads.

My concern is that the number may be set to a very large value. How about we set 
a hardwired maximum value to limit the given value if it is too large?

> Limit the number of processes created by libprocess
> ---
>
> Key: MESOS-4353
> URL: https://issues.apache.org/jira/browse/MESOS-4353
> Project: Mesos
>  Issue Type: Improvement
>  Components: libprocess
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> Currently libprocess will create {{max(8, number of CPU cores)}} processes 
> during the initialization, see 
> https://github.com/apache/mesos/blob/0.26.0/3rdparty/libprocess/src/process.cpp#L2146
>  for details. This should be OK for a normal machine which does not have many 
> cores (e.g., 16, 32), but for a powerful machine which may have a large number 
> of cores (e.g., an IBM Power machine may have 192 cores), this will create too 
> many worker threads, which are not necessary.
> And since libprocess is widely used in Mesos (master, agent, scheduler, 
> executor), it may also cause performance issues. For example, when a user 
> creates a Docker container via Mesos on a Mesos agent which is running on a 
> powerful machine with 192 cores, the DockerContainerizer in the Mesos agent 
> will create a dedicated executor for the container, and there will be 192 
> worker threads in that executor. And if the user creates 1000 Docker 
> containers on that machine, then there will be 1000 executors, i.e., 
> 1000 * 192 worker threads, which is a large number and may thrash the OS.
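
The capping idea discussed in this thread could look roughly like the sketch below. The function name workerThreadCount, the "0 means not set" convention for the requested value, and the hard limit passed by the caller are all assumptions made for illustration; this is not the code in process.cpp.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper. `cores` is the number of CPU cores, `requested` is a
// user-supplied thread count (0 means "not set"), and `hardLimit` is an
// assumed hardwired upper bound.
std::size_t workerThreadCount(
    std::size_t cores,
    std::size_t requested,
    std::size_t hardLimit)
{
  // Default described in the ticket: max(8, number of CPU cores).
  const std::size_t fallback = std::max<std::size_t>(8, cores);

  const std::size_t wanted = requested > 0 ? requested : fallback;

  // Clamp both the default and the user-supplied value.
  return std::min(wanted, hardLimit);
}
```

For example, with an assumed limit of 64, a 192-core machine would get 64 worker threads per executor instead of 192, so 1000 executors would create 64,000 threads rather than 192,000. A caller could pass std::thread::hardware_concurrency() as the core count.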





[jira] [Updated] (MESOS-4647) Use in_memory as default registry when testing

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4647:

Description: 
Currently, we use {{replicated_log}} as the default registry when testing. This 
causes I/O operations during testing and slows down test cases. We should 
change it to use {{in_memory}} when testing and only use {{replicated_log}} 
when necessary.

Results when running the tests without sudo:

Before
{code}
[--] Global test environment tear-down
[==] 978 tests from 129 test cases ran. (678321 ms total)
[  PASSED  ] 978 tests.
{code}

After
{code}
[--] Global test environment tear-down
[==] 978 tests from 129 test cases ran. (422265 ms total)
[  PASSED  ] 978 tests.
{code}


  was:Currently, we use {{replicated_log}} as default registry when testing. 
This cause io operations when testings and slow down test cases. We should 
change it to use {{in_memory}} when testing and only use {{replicated_log}} 
when necessary.


> Use in_memory as default registry when testing
> --
>
> Key: MESOS-4647
> URL: https://issues.apache.org/jira/browse/MESOS-4647
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>
> Currently, we use {{replicated_log}} as the default registry when testing. 
> This causes I/O operations during testing and slows down test cases. We 
> should change it to use {{in_memory}} when testing and only use 
> {{replicated_log}} when necessary.
> Results when running the tests without sudo:
> Before
> {code}
> [--] Global test environment tear-down
> [==] 978 tests from 129 test cases ran. (678321 ms total)
> [  PASSED  ] 978 tests.
> {code}
> After
> {code}
> [--] Global test environment tear-down
> [==] 978 tests from 129 test cases ran. (422265 ms total)
> [  PASSED  ] 978 tests.
> {code}
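
To put the two totals above in perspective: 678321 ms minus 422265 ms is 256056 ms saved, which is about 37.7% of the original run time. The small helper below (percentReduction is a hypothetical name, not part of the Mesos test suite) shows the arithmetic.

```cpp
// Percentage by which `after` is smaller than `before`.
// Both values are expected in the same unit, here milliseconds.
double percentReduction(double before, double after)
{
  return (before - after) / before * 100.0;
}
```

Calling percentReduction(678321.0, 422265.0) gives a value between 37.7 and 37.8.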





[jira] [Commented] (MESOS-4646) PortMappingIsolatorTests get kernel stuck.

2016-02-11 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142766#comment-15142766
 ] 

Till Toenshoff commented on MESOS-4646:
---

I have now tried a 4.3 kernel. The results are just a bit better in that the 
kernel does not get stuck, but the tests still fail utterly while getting stuck 
themselves.

{noformat}
[ RUN  ] PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP
I0211 05:56:11.255408 90890 port_mapping_tests.cpp:224] Using eth0 as the 
public interface
I0211 05:56:11.255954 90890 port_mapping_tests.cpp:232] Using lo as the 
loopback interface
I0211 05:56:13.144747 90890 port_mapping.cpp:1255] Using eth0 as the public 
interface
I0211 05:56:13.145141 90890 port_mapping.cpp:1280] Using lo as the loopback 
interface
I0211 05:56:13.146286 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/neigh/default/gc_thresh3 = '1024'
I0211 05:56:13.146486 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/neigh/default/gc_thresh1 = '128'
I0211 05:56:13.146747 90890 port_mapping.cpp:1567] /proc/sys/net/ipv4/tcp_wmem 
= '4096  16384   4194304'
I0211 05:56:13.147191 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_synack_retries = '5'
I0211 05:56:13.147518 90890 port_mapping.cpp:1567] /proc/sys/net/core/somaxconn 
= '128'
I0211 05:56:13.147707 90890 port_mapping.cpp:1567] /proc/sys/net/core/rmem_max 
= '212992'
I0211 05:56:13.147971 90890 port_mapping.cpp:1567] /proc/sys/net/ipv4/tcp_rmem 
= '4096  87380   6291456'
I0211 05:56:13.148393 90890 port_mapping.cpp:1567] /proc/sys/net/core/wmem_max 
= '212992'
I0211 05:56:13.148653 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_keepalive_time = '7200'
I0211 05:56:13.148808 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/neigh/default/gc_thresh2 = '512'
I0211 05:56:13.148962 90890 port_mapping.cpp:1567] 
/proc/sys/net/core/netdev_max_backlog = '1000'
I0211 05:56:13.150074 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_keepalive_intvl = '75'
I0211 05:56:13.150271 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_keepalive_probes = '9'
I0211 05:56:13.150394 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_max_syn_backlog = '512'
I0211 05:56:13.150619 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_retries2 = '15'
I0211 05:56:17.074481 90890 linux_launcher.cpp:101] Using 
/sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I0211 05:56:17.078749 90909 port_mapping.cpp:2162] Using non-ephemeral ports 
{[31000,31500)} and ephemeral ports [30016,30032) for container container1 of 
executor ''
I0211 05:56:17.334048 90890 linux_launcher.cpp:363] Cloning child process with 
flags = CLONE_NEWNET | CLONE_NEWNS
../../src/tests/containerizer/port_mapping_tests.cpp:507: Failure
Failed to wait 15secs for isolator.get()->isolate(containerId1, pid.get())
I0211 05:56:34.901305 90907 port_mapping.cpp:2226] Bind mounted 
'/proc/90956/ns/net' to '/var/run/netns/90956' for container container1
[  FAILED  ] PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP (29652 ms)
[ RUN  ] PortMappingIsolatorTest.ROOT_NC_ContainerToContainerUDP
I0211 05:56:40.905812 90890 port_mapping_tests.cpp:224] Using eth0 as the 
public interface
I0211 05:56:40.906904 90890 port_mapping_tests.cpp:232] Using lo as the 
loopback interface
I0211 05:56:40.938251 90890 port_mapping.cpp:1255] Using eth0 as the public 
interface
I0211 05:56:40.938639 90890 port_mapping.cpp:1280] Using lo as the loopback 
interface
I0211 05:56:41.037220 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/neigh/default/gc_thresh3 = '1024'
I0211 05:56:41.037513 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/neigh/default/gc_thresh1 = '128'
I0211 05:56:41.037768 90890 port_mapping.cpp:1567] /proc/sys/net/ipv4/tcp_wmem 
= '4096  16384   4194304'
I0211 05:56:41.038230 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_synack_retries = '5'
I0211 05:56:41.038434 90890 port_mapping.cpp:1567] /proc/sys/net/core/somaxconn 
= '128'
I0211 05:56:41.038596 90890 port_mapping.cpp:1567] /proc/sys/net/core/rmem_max 
= '212992'
I0211 05:56:41.051391 90890 port_mapping.cpp:1567] /proc/sys/net/ipv4/tcp_rmem 
= '4096  87380   6291456'
I0211 05:56:41.051430 90890 port_mapping.cpp:1567] /proc/sys/net/core/wmem_max 
= '212992'
I0211 05:56:41.051456 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_keepalive_time = '7200'
I0211 05:56:41.051482 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/neigh/default/gc_thresh2 = '512'
I0211 05:56:41.051507 90890 port_mapping.cpp:1567] 
/proc/sys/net/core/netdev_max_backlog = '1000'
I0211 05:56:41.051534 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_keepalive_intvl = '75'
I0211 05:56:41.051558 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_keepalive_probes = '9'
I0211 05:56:41.051583 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_max_syn_backlog = '512'
I0211 05:56:41.051606 90890 port_mapping.cpp:1567] 
/proc/sys/net/ipv4/tcp_retries2 = '15'

[jira] [Commented] (MESOS-4519) configure.ac uses a mix of tabs and spaces indentation

2016-02-11 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142730#comment-15142730
 ] 

Klaus Ma commented on MESOS-4519:
-

[~jvanremoortere], would you help to shepherd this?

> configure.ac uses a mix of tabs and spaces indentation
> --
>
> Key: MESOS-4519
> URL: https://issues.apache.org/jira/browse/MESOS-4519
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Reporter: Benjamin Bannier
>Priority: Trivial
>  Labels: newbie
>
> configure.ac uses a mix of tabs and spaces for indentation while only spaces 
> should be used. Replacing tabs with 8 spaces each appears to be safe and 
> seems to give the desired indentation.





[jira] [Commented] (MESOS-4646) PortMappingIsolatorTests get kernel stuck.

2016-02-11 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142782#comment-15142782
 ] 

Till Toenshoff commented on MESOS-4646:
---

Ow, after having left the machine in that state for a few minutes, at some 
point the kernel got stuck as well, even with 4.3.

> PortMappingIsolatorTests get kernel stuck.
> --
>
> Key: MESOS-4646
> URL: https://issues.apache.org/jira/browse/MESOS-4646
> Project: Mesos
>  Issue Type: Bug
> Environment: Linux Kernel 3.19.9-49-generic,
> libnl-3.2.27
>Reporter: Till Toenshoff
>Assignee: Cong Wang
>
> {noformat}
> $ sudo ./bin/mesos-tests.sh --gtest_filter="*PortMappingIsolatorTest*"
> Source directory: /home/till/scratchpad/mesos
> Build directory: /home/till/scratchpad/mesos/build
> -
> We cannot run any cgroups tests that require mounting
> hierarchies because you have the following hierarchies mounted:
> /sys/fs/cgroup/blkio, /sys/fs/cgroup/cpu, /sys/fs/cgroup/cpuacct, 
> /sys/fs/cgroup/cpuset, /sys/fs/cgroup/devices, /sys/fs/cgroup/freezer, 
> /sys/fs/cgroup/hugetlb, /sys/fs/cgroup/memory, /sys/fs/cgroup/net_cls, 
> /sys/fs/cgroup/net_prio, /sys/fs/cgroup/perf_event, /sys/fs/cgroup/systemd
> We'll disable the CgroupsNoHierarchyTest test fixture for now.
> -
> WARNING: perf not found for kernel 3.19.0-49
>   You may need to install the following packages for this specific kernel:
> linux-tools-3.19.0-49-generic
> linux-cloud-tools-3.19.0-49-generic
>   You may also want to install one of the following packages to keep up to 
> date:
> linux-tools-generic-lts-
> linux-cloud-tools-generic-lts-
> -
> No 'perf' command found so no 'perf' tests will be run
> -
> WARNING: perf not found for kernel 3.19.0-49
>   You may need to install the following packages for this specific kernel:
> linux-tools-3.19.0-49-generic
> linux-cloud-tools-3.19.0-49-generic
>   You may also want to install one of the following packages to keep up to 
> date:
> linux-tools-generic-lts-
> linux-cloud-tools-generic-lts-
> -
> The 'perf' command wasn't found so tests using it
> to sample the 'cycles' hardware event will not be run.
> -
> /bin/nc
> /usr/local/bin/curl
> Note: Google Test filter = 
> 

[jira] [Commented] (MESOS-4643) PortMappingIsolatorTest fail when no namespaces are set.

2016-02-11 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142770#comment-15142770
 ] 

Till Toenshoff commented on MESOS-4643:
---

My workaround was to simply add a namespace before running the test-suite:
{{sudo ip netns add foo}}

> PortMappingIsolatorTest fail when no namespaces are set.
> 
>
> Key: MESOS-4643
> URL: https://issues.apache.org/jira/browse/MESOS-4643
> Project: Mesos
>  Issue Type: Bug
> Environment: Linux Kernel 3.19.0-49-generic,
> libnl-3.2.27
>Reporter: Till Toenshoff
>Priority: Minor
>
> Currently our network isolator tests fail with the following output on an 
> Ubuntu 14.04 VM.
> {noformat}
> [02:10:15][Step 8/8] [ RUN  ] 
> PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP
> [02:10:15][Step 8/8] 
> ../../src/tests/containerizer/port_mapping_tests.cpp:164: Failure
> [02:10:15][Step 8/8] entries: Failed to opendir '/var/run/netns': No such 
> file or directory
> [02:10:15][Step 8/8] 
> ../../src/tests/containerizer/port_mapping_tests.cpp:164: Failure
> [02:10:15][Step 8/8] entries: Failed to opendir '/var/run/netns': No such 
> file or directory
> [02:10:15][Step 8/8] [  FAILED  ] 
> PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP (4 ms)
> {noformat}
> The machine has no network namespaces set, hence {{/var/run/netns}} does not 
> exist. 
> We should help users understand this prerequisite or maybe even set these 
> things up in a fixture.
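
A fixture-side check for this prerequisite might look like the sketch below, which uses POSIX stat to test for the directory. The helper name directoryExists is an assumption for illustration; how the test suite would react (skip, create the directory, or print a hint) is left open here.

```cpp
#include <sys/stat.h>  // stat, S_ISDIR (POSIX)

#include <string>

// Returns true if `path` exists and is a directory.
bool directoryExists(const std::string& path)
{
  struct stat s;
  return ::stat(path.c_str(), &s) == 0 && S_ISDIR(s.st_mode);
}
```

A test setup step could call directoryExists("/var/run/netns") and report a clear message when it returns false, instead of failing later in opendir.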





[jira] [Created] (MESOS-4657) Add LOG(INFO) in `cgroups/net_cls` for debugging allocation of net_cls handles.

2016-02-11 Thread Avinash Sridharan (JIRA)
Avinash Sridharan created MESOS-4657:


 Summary: Add LOG(INFO) in `cgroups/net_cls` for debugging 
allocation of net_cls handles.
 Key: MESOS-4657
 URL: https://issues.apache.org/jira/browse/MESOS-4657
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
 Environment: Linux
Reporter: Avinash Sridharan
Assignee: Avinash Sridharan
Priority: Minor


We need to add LOG(INFO) during the prepare phase of `cgroups/net_cls` for 
debugging management of `net_cls` handles within the isolator. 





[jira] [Commented] (MESOS-4479) Implement reservation labels

2016-02-11 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143764#comment-15143764
 ] 

Michael Park commented on MESOS-4479:
-

{noformat}
commit 3b02b80fae886caccd242f5fc205e91a42723861
Author: Neil Conway 
Date:   Thu Feb 11 16:07:05 2016 -0800

Added documentation for labeled reserved resources.

Review: https://reviews.apache.org/r/42755/
{noformat}
{noformat}
commit 77448c0bda4109ceb0c2aadbb5d240faa12b1f3e
Author: Neil Conway 
Date:   Thu Feb 11 15:56:39 2016 -0800

Added support for labels to resource reservations.

Labels are free-form key-value pairs that can be used to associate
metadata with reserved resources.

Review: https://reviews.apache.org/r/42754/
{noformat}

> Implement reservation labels
> 
>
> Key: MESOS-4479
> URL: https://issues.apache.org/jira/browse/MESOS-4479
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: labels, mesosphere, reservations
>






[jira] [Created] (MESOS-4661) SlaveRecoveryTest/0.ReconnectHTTPExecutor is flaky

2016-02-11 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-4661:
-

 Summary: SlaveRecoveryTest/0.ReconnectHTTPExecutor is flaky
 Key: MESOS-4661
 URL: https://issues.apache.org/jira/browse/MESOS-4661
 Project: Mesos
  Issue Type: Bug
Reporter: Anand Mazumdar


Showed up on ASF CI:
https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu:14.04,label_exp=docker%7C%7CHadoop/1660/consoleFull

{code}
[ RUN  ] SlaveRecoveryTest/0.ReconnectHTTPExecutor
I0212 00:23:08.177824   702 leveldb.cpp:174] Opened db in 2.499462ms
I0212 00:23:08.179204   702 leveldb.cpp:181] Compacted db in 1.206514ms
I0212 00:23:08.179400   702 leveldb.cpp:196] Created db iterator in 36168ns
I0212 00:23:08.179538   702 leveldb.cpp:202] Seeked to beginning of db in 2343ns
I0212 00:23:08.179651   702 leveldb.cpp:271] Iterated through 0 keys in the db 
in 471ns
I0212 00:23:08.179816   702 replica.cpp:779] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0212 00:23:08.180547   736 recover.cpp:447] Starting replica recovery
I0212 00:23:08.181025   736 recover.cpp:473] Replica is in EMPTY status
I0212 00:23:08.182406   722 replica.cpp:673] Replica in EMPTY status received a 
broadcasted recover request from (9452)@172.17.0.2:57200
I0212 00:23:08.182624   724 recover.cpp:193] Received a recover response from a 
replica in EMPTY status
I0212 00:23:08.183368   736 recover.cpp:564] Updating replica status to STARTING
I0212 00:23:08.184329   730 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 726589ns
I0212 00:23:08.184361   730 replica.cpp:320] Persisted replica status to 
STARTING
I0212 00:23:08.184501   722 recover.cpp:473] Replica is in STARTING status
I0212 00:23:08.186000   733 replica.cpp:673] Replica in STARTING status 
received a broadcasted recover request from (9453)@172.17.0.2:57200
I0212 00:23:08.186311   735 recover.cpp:193] Received a recover response from a 
replica in STARTING status
I0212 00:23:08.186650   724 recover.cpp:564] Updating replica status to VOTING
I0212 00:23:08.186785   727 master.cpp:376] Master 
6508f198-e145-4d76-844f-0460dc5d7d39 (ca60addecc0b) started on 172.17.0.2:57200
I0212 00:23:08.186808   727 master.cpp:378] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="true" --authenticate_http="true" --authenticate_slaves="true" 
--authenticators="crammd5" --authorizers="local" 
--credentials="/tmp/9KHFn8/credentials" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
--quiet="false" --recovery_slave_removal_limit="100%" 
--registry="replicated_log" --registry_fetch_timeout="1mins" 
--registry_store_timeout="100secs" --registry_strict="true" 
--root_submissions="true" --slave_ping_timeout="15secs" 
--slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
--webui_dir="/mesos/mesos-0.28.0/_inst/share/mesos/webui" 
--work_dir="/tmp/9KHFn8/master" --zk_session_timeout="10secs"
I0212 00:23:08.187353   727 master.cpp:423] Master only allowing authenticated 
frameworks to register
I0212 00:23:08.187366   727 master.cpp:428] Master only allowing authenticated 
slaves to register
I0212 00:23:08.187376   727 credentials.hpp:35] Loading credentials for 
authentication from '/tmp/9KHFn8/credentials'
I0212 00:23:08.187533   724 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 460382ns
I0212 00:23:08.187676   724 replica.cpp:320] Persisted replica status to VOTING
I0212 00:23:08.187770   727 master.cpp:468] Using default 'crammd5' 
authenticator
I0212 00:23:08.188096   727 master.cpp:537] Using default 'basic' HTTP 
authenticator
I0212 00:23:08.188344   727 master.cpp:571] Authorization enabled
I0212 00:23:08.188544   728 recover.cpp:578] Successfully joined the Paxos group
I0212 00:23:08.189209   722 hierarchical.cpp:144] Initialized hierarchical 
allocator process
I0212 00:23:08.189337   731 whitelist_watcher.cpp:77] No whitelist given
I0212 00:23:08.189357   728 recover.cpp:462] Recover process terminated
I0212 00:23:08.192903   733 master.cpp:1712] The newly elected leader is 
master@172.17.0.2:57200 with id 6508f198-e145-4d76-844f-0460dc5d7d39
I0212 00:23:08.192940   733 master.cpp:1725] Elected as the leading master!
I0212 00:23:08.193133   733 master.cpp:1470] Recovering from registrar
I0212 00:23:08.193269   734 registrar.cpp:307] Recovering registrar
I0212 00:23:08.194031   734 log.cpp:659] Attempting to start the writer
I0212 00:23:08.195296   730 replica.cpp:493] Replica received implicit promise 
request from (9455)@172.17.0.2:57200 with proposal 1
I0212 00:23:08.196018   730 

[jira] [Created] (MESOS-4662) PortMapping network isolator should not assume BIND_MOUNT_ROOT is a realpath.

2016-02-11 Thread Jie Yu (JIRA)
Jie Yu created MESOS-4662:
-

 Summary: PortMapping network isolator should not assume 
BIND_MOUNT_ROOT is a realpath.
 Key: MESOS-4662
 URL: https://issues.apache.org/jira/browse/MESOS-4662
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.25.0, 0.26.0, 0.27.0
Reporter: Jie Yu


On some newer Linux distributions, /var/run is a symlink to /run. The port 
mapping isolator assumes that PORT_MAPPING_BIND_MOUNT_ROOT is a realpath 
(exists in the mount table), which obviously is not true on those systems.
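
As a rough illustration of the fix direction, the sketch below resolves symlinks with POSIX realpath before a path is compared against mount table entries. The helper name resolvePath is an assumption; the actual isolator may resolve the path differently.

```cpp
#include <limits.h>  // PATH_MAX
#include <stdlib.h>  // realpath (POSIX)

#include <string>

// Returns the canonical form of `path` with all symlinks resolved, or an
// empty string if the path cannot be resolved (e.g., it does not exist).
std::string resolvePath(const std::string& path)
{
  char resolved[PATH_MAX];

  if (::realpath(path.c_str(), resolved) == nullptr) {
    return std::string();
  }

  return std::string(resolved);
}
```

On a system where /var/run is a symlink to /run, resolvePath("/var/run/netns") would return "/run/netns" (provided the directory exists), which is the form that appears in the mount table.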





[jira] [Updated] (MESOS-4439) Fix appc CachedImage image validation

2016-02-11 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4439:
--
Story Points: 1  (was: 2)

> Fix appc CachedImage image validation
> -
>
> Key: MESOS-4439
> URL: https://issues.apache.org/jira/browse/MESOS-4439
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere, unified-containerizer-mvp
> Fix For: 0.28.0
>
>
> Currently image validation is done assuming that the image's filename will 
> have digest (SHA-512) information. This is not part of the spec
> (https://github.com/appc/spec/blob/master/spec/discovery.md).
> 
> The spec specifies the tuple  as unique identifier 
> for  discovering an image.





[jira] [Assigned] (MESOS-4658) process::Connection can lead to deadlock around execution in the same context.

2016-02-11 Thread Shuai Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Lin reassigned MESOS-4658:


Assignee: Shuai Lin

> process::Connection can lead to deadlock around execution in the same context.
> --
>
> Key: MESOS-4658
> URL: https://issues.apache.org/jira/browse/MESOS-4658
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, libprocess
>Reporter: Anand Mazumdar
>Assignee: Shuai Lin
>  Labels: mesosphere
>
> The {{Connection}} abstraction is prone to deadlocks arising from the object 
> being destroyed inside the same execution context.
> Consider this example:
> {code}
> Option connection = process::http::connect(...);
> connection.disconnected()
>   .onAny(defer(self(), , connection));
> connection.disconnect();
> connection = None();
> {code}
> In the above snippet, if {{connection = None()}} gets executed before the 
> actual dispatch to {{ConnectionProcess}} happens, you might lose the only 
> existing reference to the {{Connection}} object inside 
> {{ConnectionProcess::disconnect}}. This would lead to the destruction of the 
> {{Connection}} object in the {{ConnectionProcess}} execution context.
> We do have a snippet in our existing code that alludes to such occurrences 
> happening: 
> https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/http.cpp#L1325
> {code}
>   // This is a one time request which will close the connection when
>   // the response is received. Since 'Connection' is reference-counted,
>   // we must keep a copy around until the disconnection occurs. Note
>   // that in order to avoid a deadlock (Connection destruction occurring
>   // from the ConnectionProcess execution context), we use 'async'.
> {code}
> AFAICT, for scenarios where we need to hold on to the {{Connection}} object 
> for later, this approach does not suffice.
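
The underlying mechanism can be shown without libprocess. The sketch below uses std::shared_ptr and std::thread as stand-ins (Handle and destroyedOnWorker are hypothetical names, not Mesos code): the destructor of a reference-counted object runs in whichever context drops the last reference, so releasing the caller's copy early hands destruction to the other context.

```cpp
#include <future>
#include <memory>
#include <thread>

// Hypothetical stand-in for a reference-counted handle such as Connection.
// It records the thread on which its destructor runs.
struct Handle
{
  explicit Handle(std::thread::id* out) : destroyedOn(out) {}
  ~Handle() { *destroyedOn = std::this_thread::get_id(); }

  std::thread::id* destroyedOn;
};

// Returns true if the destructor ran on the worker thread.
bool destroyedOnWorker()
{
  std::thread::id destroyedOn;
  std::shared_ptr<Handle> handle = std::make_shared<Handle>(&destroyedOn);

  std::promise<void> callerReleased;
  std::future<void> released = callerReleased.get_future();

  // The worker holds a copy, much like a deferred callback bound to
  // `connection` would.
  std::thread worker([copy = handle, &released]() mutable {
    released.wait();  // Wait until the caller has dropped its reference.
    copy.reset();     // Last reference: ~Handle runs here, on the worker.
  });
  const std::thread::id workerId = worker.get_id();

  handle.reset();  // Analogous to `connection = None()` in the ticket.
  callerReleased.set_value();
  worker.join();

  return destroyedOn == workerId;
}
```

In this sketch the destructor is harmless. In the ticket's scenario the destructor has to wait on the very context it is running in, which is where the deadlock comes from.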





[jira] [Commented] (MESOS-2971) Implement OverlayFS based provisioner backend

2016-02-11 Thread Mei Wan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143721#comment-15143721
 ] 

Mei Wan commented on MESOS-2971:


Hi Shuai, I have a reviewboard still under review 
https://reviews.apache.org/r/37853/ but I haven't had much time to look at it. 
Feel free to take a look or start afresh!

> Implement OverlayFS based provisioner backend
> -
>
> Key: MESOS-2971
> URL: https://issues.apache.org/jira/browse/MESOS-2971
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Mei Wan
>  Labels: mesosphere, twitter, unified-containerizer-mvp
>
> Part of the image provisioning process is to call a backend to create a root 
> filesystem based on the image on disk layout.
> The problem with the copy backend is that it wastes both I/O and space, 
> and the bind backend can only deal with one layer.
> Overlayfs backend allows us to utilize the filesystem to merge multiple 
> filesystems into one efficiently.





[jira] [Updated] (MESOS-4596) Add common Appc spec utilities.

2016-02-11 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4596:
--
Story Points: 2  (was: 3)

> Add common Appc spec utilities.
> ---
>
> Key: MESOS-4596
> URL: https://issues.apache.org/jira/browse/MESOS-4596
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere, unified-containerizer-mvp
> Fix For: 0.28.0
>
>
>  Add common utility functions such as:
>   - validating image information against actual data in the image 
> directory.
>   - getting list of dependencies at depth 1 for an image.
>   - getting image path simple image discovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2971) Implement OverlayFS based provisioner backend

2016-02-11 Thread Shuai Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143713#comment-15143713
 ] 

Shuai Lin commented on MESOS-2971:
--

Hi all, what's the status of this ticket? [~mwan] Can I take it if you're not 
currently working on it? 

> Implement OverlayFS based provisioner backend
> -
>
> Key: MESOS-2971
> URL: https://issues.apache.org/jira/browse/MESOS-2971
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Mei Wan
>  Labels: mesosphere, twitter, unified-containerizer-mvp
>
> Part of the image provisioning process is to call a backend to create a root 
> filesystem based on the image on disk layout.
> The problem with the copy backend is that it wastes both I/O and space, 
> and the bind backend can only deal with one layer.
> Overlayfs backend allows us to utilize the filesystem to merge multiple 
> filesystems into one efficiently.





[jira] [Assigned] (MESOS-4164) MasterTest.RecoverResources is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4164:
---

Assignee: haosdent

> MasterTest.RecoverResources is slow
> ---
>
> Key: MESOS-4164
> URL: https://issues.apache.org/jira/browse/MESOS-4164
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterTest.RecoverResources}} test takes more than {{1s}} to finish on 
> my Mac OS 10.10.4:
> {code}
> MasterTest.RecoverResources (1018 ms)
> {code}





[jira] [Updated] (MESOS-4173) HealthCheckTest.CheckCommandTimeout is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4173:

Assignee: Timothy Chen  (was: haosdent)

> HealthCheckTest.CheckCommandTimeout is slow
> ---
>
> Key: MESOS-4173
> URL: https://issues.apache.org/jira/browse/MESOS-4173
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: Timothy Chen
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{HealthCheckTest.CheckCommandTimeout}} test takes more than {{15s}}! to 
> finish on my Mac OS 10.10.4:
> {code}
> HealthCheckTest.CheckCommandTimeout (15483 ms)
> {code}





[jira] [Assigned] (MESOS-4165) MasterTest.MasterInfoOnReElection is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4165:
---

Assignee: haosdent

> MasterTest.MasterInfoOnReElection is slow
> -
>
> Key: MESOS-4165
> URL: https://issues.apache.org/jira/browse/MESOS-4165
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterTest.MasterInfoOnReElection}} test takes more than {{1s}} to 
> finish on my Mac OS 10.10.4:
> {code}
> MasterTest.MasterInfoOnReElection (1024 ms)
> {code}





[jira] [Assigned] (MESOS-4170) OversubscriptionTest.UpdateAllocatorOnSchedulerFailover is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4170:
---

Assignee: haosdent

> OversubscriptionTest.UpdateAllocatorOnSchedulerFailover is slow
> ---
>
> Key: MESOS-4170
> URL: https://issues.apache.org/jira/browse/MESOS-4170
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{OversubscriptionTest.UpdateAllocatorOnSchedulerFailover}} test takes 
> more than {{1s}} to finish on my Mac OS 10.10.4:
> {code}
> OversubscriptionTest.UpdateAllocatorOnSchedulerFailover (1018 ms)
> {code}





[jira] [Assigned] (MESOS-4172) GarbageCollectorIntegrationTest.Restart is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4172:
---

Assignee: haosdent

> GarbageCollectorIntegrationTest.Restart is slow
> ---
>
> Key: MESOS-4172
> URL: https://issues.apache.org/jira/browse/MESOS-4172
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{GarbageCollectorIntegrationTest.Restart}} test takes more than {{5s}} 
> to finish on my Mac OS 10.10.4:
> {code}
> GarbageCollectorIntegrationTest.Restart (5102 ms)
> {code}





[jira] [Assigned] (MESOS-4168) MasterMaintenanceTest.EnterMaintenanceMode is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4168:
---

Assignee: haosdent

> MasterMaintenanceTest.EnterMaintenanceMode is slow 
> ---
>
> Key: MESOS-4168
> URL: https://issues.apache.org/jira/browse/MESOS-4168
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterMaintenanceTest.EnterMaintenanceMode}} test takes more than 
> {{5s}} to finish on my Mac OS 10.10.4:
> {code}
> MasterMaintenanceTest.EnterMaintenanceMode (5087 ms)
> {code}





[jira] [Assigned] (MESOS-4171) OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4171:
---

Assignee: haosdent

> OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover is slow
> --
>
> Key: MESOS-4171
> URL: https://issues.apache.org/jira/browse/MESOS-4171
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover}} test takes 
> more than {{1s}} to finish on my Mac OS 10.10.4:
> {code}
> OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover (1018 ms)
> {code}





[jira] [Assigned] (MESOS-4167) MasterTest.OfferTimeout is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4167:
---

Assignee: haosdent

> MasterTest.OfferTimeout is slow
> ---
>
> Key: MESOS-4167
> URL: https://issues.apache.org/jira/browse/MESOS-4167
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterTest.OfferTimeout}} test takes more than {{1s}} to finish on my 
> Mac OS 10.10.4:
> {code}
> MasterTest.OfferTimeout (1053 ms)
> {code}





[jira] [Assigned] (MESOS-4166) MasterTest.LaunchCombinedOfferTest is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4166:
---

Assignee: haosdent

> MasterTest.LaunchCombinedOfferTest is slow
> --
>
> Key: MESOS-4166
> URL: https://issues.apache.org/jira/browse/MESOS-4166
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterTest.LaunchCombinedOfferTest}} test takes more than {{2s}} to 
> finish on my Mac OS 10.10.4:
> {code}
> MasterTest.LaunchCombinedOfferTest (2023 ms)
> {code}





[jira] [Assigned] (MESOS-4173) HealthCheckTest.CheckCommandTimeout is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4173:
---

Assignee: haosdent

> HealthCheckTest.CheckCommandTimeout is slow
> ---
>
> Key: MESOS-4173
> URL: https://issues.apache.org/jira/browse/MESOS-4173
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{HealthCheckTest.CheckCommandTimeout}} test takes more than {{15s}}! to 
> finish on my Mac OS 10.10.4:
> {code}
> HealthCheckTest.CheckCommandTimeout (15483 ms)
> {code}





[jira] [Assigned] (MESOS-4169) MasterMaintenanceTest.InverseOffers is slow

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4169:
---

Assignee: haosdent

> MasterMaintenanceTest.InverseOffers is slow
> ---
>
> Key: MESOS-4169
> URL: https://issues.apache.org/jira/browse/MESOS-4169
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{MasterMaintenanceTest.InverseOffers}} test takes more than {{2s}} to 
> finish on my Mac OS 10.10.4:
> {code}
> MasterMaintenanceTest.InverseOffers (2027 ms)
> {code}





[jira] [Updated] (MESOS-297) Speed up the slow running tests.

2016-02-11 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-297:
---
Assignee: Benjamin Mahler

> Speed up the slow running tests.
> 
>
> Key: MESOS-297
> URL: https://issues.apache.org/jira/browse/MESOS-297
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Benjamin Mahler
>Assignee: Benjamin Mahler
>Priority: Minor
>
> The tests currently take 70 seconds on my machine:
> [==] 200 tests from 37 test cases ran. (68963 ms total)
> There are some major culprits:
> [--] 12 tests from ZooKeeperTest (27484 ms total)
> [--] 5 tests from SampleFrameworks (12529 ms total)
> [--] 8 tests from ResourceOffersTest (4166 ms total)
> [--] 2 tests from AllocatorZooKeeperTest/0 (4128 ms total)
> [--] 3 tests from GarbageCollectorTest (3117 ms total)
> Hopefully there are some quick gains to be had.
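
As a quick check of the figures above, the five listed test cases add up to 51424 ms, a little under three quarters of the 68963 ms total. A minimal sketch of that arithmetic follows; the function and constant names are illustrative only.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Durations (ms) of the five test cases listed in the description.
const std::vector<long> kCulpritsMs = {27484, 12529, 4166, 4128, 3117};

// Total run time (ms) reported for all 200 tests.
const long kTotalMs = 68963;

// Sum of the listed culprits in milliseconds.
long culpritSumMs()
{
  return std::accumulate(kCulpritsMs.begin(), kCulpritsMs.end(), 0L);
}

// Fraction of the total run time spent in the listed culprits.
double culpritShare()
{
  assert(kTotalMs > 0);
  return static_cast<double>(culpritSumMs()) / kTotalMs;
}
```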





[jira] [Commented] (MESOS-3738) Mesos health check is invoked incorrectly when Mesos slave is within the docker container

2016-02-11 Thread Evan Krall (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144020#comment-15144020
 ] 

Evan Krall commented on MESOS-3738:
---

Any chance we could get that patch applied and versions 0.23.2, 0.24.2, and 0.25.2 
released?

> Mesos health check is invoked incorrectly when Mesos slave is within the 
> docker container
> -
>
> Key: MESOS-3738
> URL: https://issues.apache.org/jira/browse/MESOS-3738
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.25.0
> Environment: Docker 1.8.0:
> Client:
>  Version:  1.8.0
>  API version:  1.20
>  Go version:   go1.4.2
>  Git commit:   0d03096
>  Built:Tue Aug 11 16:48:39 UTC 2015
>  OS/Arch:  linux/amd64
> Server:
>  Version:  1.8.0
>  API version:  1.20
>  Go version:   go1.4.2
>  Git commit:   0d03096
>  Built:Tue Aug 11 16:48:39 UTC 2015
>  OS/Arch:  linux/amd64
> Host: Ubuntu 14.04
> Container: Debian 8.1 + Java-7
>Reporter: Yong Tang
>Assignee: haosdent
> Fix For: 0.26.0
>
> Attachments: MESOS-3738-0_23_1.patch, MESOS-3738-0_24_1.patch, 
> MESOS-3738-0_25_0.patch
>
>
> When the Mesos slave runs inside a Docker container, the COMMAND health check from 
> Marathon is invoked incorrectly.
> In such a scenario, the sandbox directory (instead of the 
> launcher/health-check directory) is used. This results in an error with the 
> container.
> Command to invoke the Mesos slave container:
> {noformat}
> sudo docker run -d -v /sys:/sys -v /usr/bin/docker:/usr/bin/docker:ro -v 
> /usr/lib/x86_64-linux-gnu/libapparmor.so.1:/usr/lib/x86_64-linux-gnu/libapparmor.so.1:ro
>  -v /var/run/docker.sock:/var/run/docker.sock -v /tmp/mesos:/tmp/mesos mesos 
> mesos slave --master=zk://10.2.1.2:2181/mesos --containerizers=docker,mesos 
> --executor_registration_timeout=5mins --docker_stop_timeout=10secs 
> --launcher=posix
> {noformat}
> Marathon JSON file:
> {code}
> {
>   "id": "ubuntu",
>   "container":
>   {
> "type": "DOCKER",
> "docker":
> {
>   "image": "ubuntu",
>   "network": "BRIDGE",
>   "parameters": []
> }
>   },
>   "args": [ "bash", "-c", "while true; do echo 1; sleep 5; done" ],
>   "uris": [],
>   "healthChecks":
>   [
> {
>   "protocol": "COMMAND",
>   "command": { "value": "echo Success" },
>   "gracePeriodSeconds": 3000,
>   "intervalSeconds": 5,
>   "timeoutSeconds": 5,
>   "maxConsecutiveFailures": 300
> }
>   ],
>   "instances": 1
> }
> {code}
> {noformat}
> STDOUT:
> root@cea2be47d64f:/mnt/mesos/sandbox# cat stdout 
> --container="mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f"
>  --stop_timeout="10secs"
> --container="mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f"
>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
> --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" 
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
> --sandbox_directory="/tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f"
>  --stop_timeout="10secs"
> Registered docker executor on b01e2e75afcb
> Starting task ubuntu.86bca10f-72c9-11e5-b36d-02420a020106
> 1
> Launching health check process: 
> /tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f/mesos-health-check
>  --executor=(1)@10.2.1.7:40695 
> --health_check_json={"command":{"shell":true,"value":"docker exec 
> mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f
>  sh -c \" echo Success 
> \""},"consecutive_failures":300,"delay_seconds":0.0,"grace_period_seconds":3000.0,"interval_seconds":5.0,"timeout_seconds":5.0}
>  --task_id=ubuntu.86bca10f-72c9-11e5-b36d-02420a020106
> Health check process launched at pid: 94
> 1
> 1
> 1
> 1
> 1
> STDERR:
> root@cea2be47d64f:/mnt/mesos/sandbox# cat stderr
> I1014 23:15:58.127950 56 exec.cpp:134] Version: 0.25.0
> I1014 23:15:58.130627 62 exec.cpp:208] Executor registered on 

[jira] [Commented] (MESOS-4653) Unify test case temporary folder name format

2016-02-11 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142981#comment-15142981
 ] 

Joseph Wu commented on MESOS-4653:
--

[~haosd...@gmail.com] If you'd like to work on this, feel free to take 
[MESOS-3848], which is already scoped.  I believe [~jieyu] should be willing to 
shepherd (but confirm that he has cycles first).

> Unify test case temporary folder name format
> 
>
> Key: MESOS-4653
> URL: https://issues.apache.org/jira/browse/MESOS-4653
> Project: Mesos
>  Issue Type: Improvement
>  Components: test
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: test
>
> In 
> [environment.cpp#L759|https://github.com/apache/mesos/blob/master/src/tests/environment.cpp#L759]
> {code}
>   const string& path =
>     path::join("/tmp", strings::join("_", testCase, testName, "XX"));
> {code}
> The temporary directory name format here is {{testCase_testName_xx}}.
> But in 
> [utils.hpp#L37|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/3rdparty/stout/include/stout/tests/utils.hpp#L37]
> {code}
> // Create a temporary directory for the test.
> Try<std::string> directory = os::mkdtemp();
> {code}
> The temporary folder we create here is {{xx}}. I think it would be better 
> if we could unify this.
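
The unified naming discussed above can be sketched with plain POSIX calls. This is a minimal sketch under stated assumptions: `makeTestDirectory` is a hypothetical helper, not the Mesos `Environment::mkdtemp` or stout's `os::mkdtemp`, and it assumes the six-character `XXXXXX` template suffix that mkdtemp(3) requires.

```cpp
#include <stdlib.h>   // mkdtemp (POSIX)
#include <unistd.h>   // rmdir, for callers that clean up

#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: creates /tmp/<testCase>_<testName>_XXXXXX and
// returns the path that was actually created.
std::string makeTestDirectory(
    const std::string& testCase,
    const std::string& testName)
{
  assert(!testCase.empty() && !testName.empty());

  const std::string pattern =
    "/tmp/" + testCase + "_" + testName + "_XXXXXX";

  // mkdtemp rewrites the trailing "XXXXXX" in place, so it needs a
  // mutable, NUL-terminated buffer.
  std::vector<char> buffer(pattern.begin(), pattern.end());
  buffer.push_back('\0');

  if (::mkdtemp(buffer.data()) == nullptr) {
    throw std::runtime_error("mkdtemp failed for " + pattern);
  }

  return std::string(buffer.data());
}
```

With a name like this, a leftover directory in `/tmp` can be traced back to the test that created it.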





[jira] [Commented] (MESOS-3848) Refactor Environment::mkdtemp into TemporaryDirectoryTest.

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143001#comment-15143001
 ] 

haosdent commented on MESOS-3848:
-

{quote}
Move the temporary directory logic from Environment::mkdtemp to 
TemporaryDirectoryTest.
{quote}
+1. Does this mean we could call mkdtemp multiple times in 
{{TemporaryDirectoryTest}} and destroy the directories in 
{{TemporaryDirectoryTest::TearDown}}, just as we do now in 
{{Environment::TearDown}}?

I also saw that
* process_tests.cpp
* subprocess_tests.cpp
* zookeeper_test_server.cpp

still use os::mkdtemp. I think it would be better to change them to use the directory 
created by {{TemporaryDirectoryTest}}.

> Refactor Environment::mkdtemp into TemporaryDirectoryTest.
> --
>
> Key: MESOS-3848
> URL: https://issues.apache.org/jira/browse/MESOS-3848
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: mesosphere
>
> As part of [MESOS-3762], many tests were changed from one 
> {{TemporaryDirectoryTest}} to another {{TemporaryDirectoryTest}}.  One subtle 
> difference is that the name of the temporary directory no longer contains the 
> name of the test.  In [MESOS-3847], the duplicate {{TemporaryDirectoryTest}} 
> was removed.
> The original {{TemporaryDirectoryTest}} called 
> [{{environment->mkdtemp}}|https://github.com/apache/mesos/blob/master/src/tests/environment.cpp#L494].
>   We would like the naming, which is valuable for debugging, to be available 
> for a majority of tests.  (A majority of tests inherit from 
> {{TemporaryDirectoryTest}} in some way.)
> Note:
> * Any additional directories created via {{environment->mkdtemp}} are cleaned 
> up after the test.
> * We don't want mesos-specific logic in Stout, like the {{umount}} shell 
> command in {{Environment::TearDown}}.
> *Proposed change:*
> Move the temporary directory logic from {{Environment::mkdtemp}} to 
> {{TemporaryDirectoryTest}}.
> *Tests that need to change*
> | {{log_tests.cpp}} | {{LogZooKeeperTest}} | We can change {{ZooKeeperTest}} 
> to inherit from {{TemporaryDirectoryTest}} to get rid of code duplication |
> | {{tests/mesos.cpp}} | {{MesosTest::CreateSlaveFlags}} | {{MesosTest}} 
> already inherits from {{TemporaryDirectoryTest}}. |
> | {{tests/script.hpp}} | {{TEST_SCRIPT}} | This is used for the 
> {{ExampleTests}}.  We can define a test class that inherits appropriately. |
> | {{docker_tests.cpp}} | {{*}} | Already inherits from {{MesosTest}}. |
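
The proposed change can be illustrated with a small helper that hands out any number of named temporary directories and removes them all on teardown. This is only a sketch: `TemporaryDirectories` is a hypothetical class, not the actual `TemporaryDirectoryTest`, it uses C++17 `std::filesystem` for brevity, and it omits the Mesos-specific unmounting mentioned in the notes.

```cpp
#include <stdlib.h>   // mkdtemp (POSIX)
#include <unistd.h>

#include <cassert>
#include <filesystem> // C++17
#include <string>
#include <system_error>
#include <vector>

// Hypothetical stand-in for the directory handling of a test fixture.
class TemporaryDirectories
{
public:
  explicit TemporaryDirectories(const std::string& testName)
    : testName_(testName) {}

  TemporaryDirectories(const TemporaryDirectories&) = delete;
  TemporaryDirectories& operator=(const TemporaryDirectories&) = delete;

  ~TemporaryDirectories() { tearDown(); }

  // Creates /tmp/<testName>_XXXXXX; returns "" if creation fails.
  std::string create()
  {
    assert(!testName_.empty());

    std::string path = "/tmp/" + testName_ + "_XXXXXX";
    if (::mkdtemp(&path[0]) == nullptr) {
      return "";
    }

    directories_.push_back(path);
    return path;
  }

  // Removes every directory handed out so far (best effort).
  void tearDown()
  {
    for (const std::string& directory : directories_) {
      std::error_code error; // Removal failures are ignored.
      std::filesystem::remove_all(directory, error);
    }
    directories_.clear();
  }

private:
  const std::string testName_;
  std::vector<std::string> directories_;
};
```

Keeping the bookkeeping in one place means tests that need several working directories get the same naming and cleanup as tests that need only one.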





[jira] [Commented] (MESOS-3848) Refactor Environment::mkdtemp into TemporaryDirectoryTest.

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143027#comment-15143027
 ] 

haosdent commented on MESOS-3848:
-

Got it. Thank you.

> Refactor Environment::mkdtemp into TemporaryDirectoryTest.
> --
>
> Key: MESOS-3848
> URL: https://issues.apache.org/jira/browse/MESOS-3848
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: mesosphere
>
> As part of [MESOS-3762], many tests were changed from one 
> {{TemporaryDirectoryTest}} to another {{TemporaryDirectoryTest}}.  One subtle 
> difference is that the name of the temporary directory no longer contains the 
> name of the test.  In [MESOS-3847], the duplicate {{TemporaryDirectoryTest}} 
> was removed.
> The original {{TemporaryDirectoryTest}} called 
> [{{environment->mkdtemp}}|https://github.com/apache/mesos/blob/master/src/tests/environment.cpp#L494].
>   We would like the naming, which is valuable for debugging, to be available 
> for a majority of tests.  (A majority of tests inherit from 
> {{TemporaryDirectoryTest}} in some way.)
> Note:
> * Any additional directories created via {{environment->mkdtemp}} are cleaned 
> up after the test.
> * We don't want mesos-specific logic in Stout, like the {{umount}} shell 
> command in {{Environment::TearDown}}.
> *Proposed change:*
> Move the temporary directory logic from {{Environment::mkdtemp}} to 
> {{TemporaryDirectoryTest}}.
> *Tests that need to change*
> | {{log_tests.cpp}} | {{LogZooKeeperTest}} | We can change {{ZooKeeperTest}} 
> to inherit from {{TemporaryDirectoryTest}} to get rid of code duplication |
> | {{tests/mesos.cpp}} | {{MesosTest::CreateSlaveFlags}} | {{MesosTest}} 
> already inherits from {{TemporaryDirectoryTest}}. |
> | {{tests/script.hpp}} | {{TEST_SCRIPT}} | This is used for the 
> {{ExampleTests}}.  We can define a test class that inherits appropriately. |
> | {{docker_tests.cpp}} | {{*}} | Already inherits from {{MesosTest}}. |





[jira] [Commented] (MESOS-4547) Introduce TASK_KILLING state.

2016-02-11 Thread Abhishek Dasgupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143070#comment-15143070
 ] 

Abhishek Dasgupta commented on MESOS-4547:
--

Please find the patches at:
https://reviews.apache.org/r/43487/
https://reviews.apache.org/r/43488/
https://reviews.apache.org/r/43489/
https://reviews.apache.org/r/43490/

> Introduce TASK_KILLING state.
> -
>
> Key: MESOS-4547
> URL: https://issues.apache.org/jira/browse/MESOS-4547
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Benjamin Mahler
>Assignee: Abhishek Dasgupta
>  Labels: mesosphere
>
> Currently there is no state to express that a task is being killed, but is 
> not yet killed (see MESOS-4140). In a similar way to how we have 
> TASK_STARTING to indicate the task is starting but not yet running, a 
> TASK_KILLING state would indicate the task is being killed but is not yet 
> killed.
> This would need to be guarded by a framework capability to protect old 
> frameworks that cannot understand the TASK_KILLING state.
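
The capability guard described above could look roughly like the sketch below. The enums and `updatesForKill` are illustrative assumptions, not the Mesos protobuf definitions or the actual patches: a framework that has not opted in never sees the new state.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Illustrative subset of task states; not the Mesos protobuf enum.
enum class TaskState { STARTING, RUNNING, KILLING, KILLED };

// Hypothetical capability a framework sets to opt in to the new state.
enum class Capability { TASK_KILLING_STATE };

// Returns the status updates to send, in order, when a kill is requested.
std::vector<TaskState> updatesForKill(
    const std::set<Capability>& capabilities)
{
  std::vector<TaskState> updates;

  // Only frameworks that opted in are told the kill is in progress.
  if (capabilities.count(Capability::TASK_KILLING_STATE) > 0) {
    updates.push_back(TaskState::KILLING);
  }

  updates.push_back(TaskState::KILLED);

  // Postcondition: a kill always ends with a terminal update.
  assert(updates.back() == TaskState::KILLED);
  return updates;
}
```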





[jira] [Commented] (MESOS-3296) Failing ROOT_ tests on CentOS 7.1 - LinuxFilesystemIsolatorTest

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143071#comment-15143071
 ] 

haosdent commented on MESOS-3296:
-

{code}
[ RUN  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
I0212 01:11:25.390995 25282 linux.cpp:81] Making 
'/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_f15DnB' a shared 
mount
I0212 01:11:25.402125 25282 linux_launcher.cpp:101] Using 
/sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I0212 01:11:25.404479 25282 systemd.cpp:223] systemd version `219` detected
I0212 01:11:25.405414 25303 containerizer.cpp:666] Starting container 
'720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7' for executor 'test_executor' of 
framework ''
I0212 01:11:25.407177 25299 provisioner.cpp:285] Provisioning image rootfs 
'/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_f15DnB/provisioner/containers/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7/backends/copy/rootfses/b121c623-51db-4ab3-8daf-de3aae6c56d6'
 for container 720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7
I0212 01:11:28.773602 25299 linux.cpp:306] Bind mounting work directory from 
'/tmp/Pe7dyr/sandbox' to 
'/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_f15DnB/provisioner/containers/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7/backends/copy/rootfses/b121c623-51db-4ab3-8daf-de3aae6c56d6/mnt/mesos/sandbox'
 for container 720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7
I0212 01:11:28.778959 25302 linux_launcher.cpp:304] Cloning child process with 
flags = CLONE_NEWNS
+ /home/haosdent/mesos/build/src/mesos-containerizer mount --help=false 
--operation=make-rslave --path=/
+ grep -E /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_f15DnB/.+ 
/proc/self/mountinfo
+ grep -v 720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7
+ cut '-d ' -f5
+ xargs --no-run-if-empty umount -l
Changing root to 
/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_f15DnB/provisioner/containers/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7/backends/copy/rootfses/b121c623-51db-4ab3-8daf-de3aae6c56d6
I0212 01:11:28.972585 25299 containerizer.cpp:1585] Executor for container 
'720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7' has exited
I0212 01:11:28.972681 25299 containerizer.cpp:1369] Destroying container 
'720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7'
I0212 01:11:28.977098 25297 cgroups.cpp:2427] Freezing cgroup 
/sys/fs/cgroup/freezer/mesos/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7
I0212 01:11:28.980403 25296 cgroups.cpp:1409] Successfully froze cgroup 
/sys/fs/cgroup/freezer/mesos/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7 after 
3.208704ms
I0212 01:11:28.983417 25298 cgroups.cpp:2445] Thawing cgroup 
/sys/fs/cgroup/freezer/mesos/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7
I0212 01:11:28.986616 25296 cgroups.cpp:1438] Successfullly thawed cgroup 
/sys/fs/cgroup/freezer/mesos/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7 after 
3.096832ms
I0212 01:11:28.990787 25303 linux.cpp:768] Unmounting sandbox/work directory 
'/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_f15DnB/provisioner/containers/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7/backends/copy/rootfses/b121c623-51db-4ab3-8daf-de3aae6c56d6/mnt/mesos/sandbox' 
for container 720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7
I0212 01:11:28.991528 25300 provisioner.cpp:330] Destroying container rootfs at 
'/tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystem_f15DnB/provisioner/containers/720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7/backends/copy/rootfses/b121c623-51db-4ab3-8daf-de3aae6c56d6' 
for container 720db9f5-14b8-4c1f-9d1c-1ad52a1ae3d7
../../src/tests/containerizer/filesystem_isolator_tests.cpp:284: Failure
Failed to wait 15secs for wait
[  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem (43267 ms)
[--] 1 test from LinuxFilesystemIsolatorTest (43267 ms total)

[--] Global test environment tear-down
../../src/tests/environment.cpp:728: Failure
Failed
Tests completed with child processes remaining:
-+- 25282 /home/haosdent/mesos/build/src/.libs/lt-mesos-tests 
--gtest_filter=LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem --verbose
 \--- 25367 ()
[==] 1 test from 1 test case ran. (43468 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
{code}

I tried updating systemd, but it did not help.
{code}
# rpm -qa|grep systemd
systemd-219-19.el7.x86_64
systemd-sysv-219-19.el7.x86_64
systemd-libs-219-19.el7.x86_64
{code}

> Failing ROOT_ tests on CentOS 7.1 - LinuxFilesystemIsolatorTest
> ---
>
> Key: MESOS-3296
> URL: https://issues.apache.org/jira/browse/MESOS-3296
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker, test
>Affects Versions: 0.23.0, 0.24.0
> Environment: CentOS Linux release 7.1
> Linux 3.10.0
>Reporter: Marco Massenzio
>Assignee: Greg Mann
>   

[jira] [Updated] (MESOS-4657) Add LOG(INFO) in `cgroups/net_cls` for debugging allocation of net_cls handles.

2016-02-11 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4657:
--
Sprint: Mesosphere Sprint 29

> Add LOG(INFO) in `cgroups/net_cls` for debugging allocation of net_cls 
> handles.
> ---
>
> Key: MESOS-4657
> URL: https://issues.apache.org/jira/browse/MESOS-4657
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Minor
>  Labels: mesosphere
>
> We need to add LOG(INFO) during the prepare phase of `cgroups/net_cls` for 
> debugging management of `net_cls` handles within the isolator. 
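
A log line of the kind requested above might be assembled as in this sketch. It is not the isolator's code: `formatNetClsHandle` and `prepareLogLine` are hypothetical names, the message wording is invented for illustration, and the real isolator would use `LOG(INFO)` rather than returning a string. The only fact relied on is that a `net_cls` classid packs a 16-bit major handle in the upper half and a 16-bit minor handle in the lower half.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Formats a 32-bit net_cls classid (0xAAAABBBB) as "major:minor" in hex.
std::string formatNetClsHandle(uint32_t classid)
{
  const uint32_t majorHandle = classid >> 16;
  const uint32_t minorHandle = classid & 0xffff;

  std::ostringstream out;
  out << std::hex << majorHandle << ":" << minorHandle;
  return out.str();
}

// Hypothetical message that a prepare phase could pass to LOG(INFO).
std::string prepareLogLine(const std::string& containerId, uint32_t classid)
{
  assert(!containerId.empty());

  return "Allocated net_cls handle " + formatNetClsHandle(classid) +
         " to container " + containerId;
}
```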



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4657) Add LOG(INFO) in `cgroups/net_cls` for debugging allocation of net_cls handles.

2016-02-11 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4657:
--
Sprint: Mesosphere Sprint 28  (was: Mesosphere Sprint 29)

> Add LOG(INFO) in `cgroups/net_cls` for debugging allocation of net_cls 
> handles.
> ---
>
> Key: MESOS-4657
> URL: https://issues.apache.org/jira/browse/MESOS-4657
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Minor
>  Labels: mesosphere
>
> We need to add LOG(INFO) during the prepare phase of `cgroups/net_cls` for 
> debugging management of `net_cls` handles within the isolator. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4653) Unify test case temporary folder name format

2016-02-11 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142984#comment-15142984
 ] 

haosdent commented on MESOS-4653:
-

Thank you very much. Let me close this.

> Unify test case temporary folder name format
> 
>
> Key: MESOS-4653
> URL: https://issues.apache.org/jira/browse/MESOS-4653
> Project: Mesos
>  Issue Type: Improvement
>  Components: test
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>  Labels: test
>
> In 
> [environment.cpp#L759|https://github.com/apache/mesos/blob/master/src/tests/environment.cpp#L759]
> {code}
>   const string& path =
>     path::join("/tmp", strings::join("_", testCase, testName, "XX"));
> {code}
> The temporary directory name format here is {{testCase_testName_xx}}.
> But in 
> [utils.hpp#L37|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/3rdparty/stout/include/stout/tests/utils.hpp#L37]
> {code}
> // Create a temporary directory for the test.
> Try<std::string> directory = os::mkdtemp();
> {code}
> The temporary folder we create here is {{xx}}. I think it would be better 
> if we could unify this.





[jira] [Commented] (MESOS-3848) Refactor Environment::mkdtemp into TemporaryDirectoryTest.

2016-02-11 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143015#comment-15143015
 ] 

Joseph Wu commented on MESOS-3848:
--

There are a couple of tests that need multiple working directories.  These are the 
tests that currently use {{environment->mkdtemp}}.

> Refactor Environment::mkdtemp into TemporaryDirectoryTest.
> --
>
> Key: MESOS-3848
> URL: https://issues.apache.org/jira/browse/MESOS-3848
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: mesosphere
>
> As part of [MESOS-3762], many tests were changed from one 
> {{TemporaryDirectoryTest}} to another {{TemporaryDirectoryTest}}.  One subtle 
> difference is that the name of the temporary directory no longer contains the 
> name of the test.  In [MESOS-3847], the duplicate {{TemporaryDirectoryTest}} 
> was removed.
> The original {{TemporaryDirectoryTest}} called 
> [{{environment->mkdtemp}}|https://github.com/apache/mesos/blob/master/src/tests/environment.cpp#L494].
>   We would like the naming, which is valuable for debugging, to be available 
> for a majority of tests.  (A majority of tests inherit from 
> {{TemporaryDirectoryTest}} in some way.)
> Note:
> * Any additional directories created via {{environment->mkdtemp}} are cleaned 
> up after the test.
> * We don't want mesos-specific logic in Stout, like the {{umount}} shell 
> command in {{Environment::TearDown}}.
> *Proposed change:*
> Move the temporary directory logic from {{Environment::mkdtemp}} to 
> {{TemporaryDirectoryTest}}.
> *Tests that need to change*
> | {{log_tests.cpp}} | {{LogZooKeeperTest}} | We can change {{ZooKeeperTest}} 
> to inherit from {{TemporaryDirectoryTest}} to get rid of code duplication |
> | {{tests/mesos.cpp}} | {{MesosTest::CreateSlaveFlags}} | {{MesosTest}} 
> already inherits from {{TemporaryDirectoryTest}}. |
> | {{tests/script.hpp}} | {{TEST_SCRIPT}} | This is used for the 
> {{ExampleTests}}.  We can define a test class that inherits appropriately. |
> | {{docker_tests.cpp}} | {{*}} | Already inherits from {{MesosTest}}. |





[jira] [Updated] (MESOS-2971) Implement OverlayFS based provisioner backend

2016-02-11 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-2971:
--
Sprint: Mesosphere Sprint 29

> Implement OverlayFS based provisioner backend
> -
>
> Key: MESOS-2971
> URL: https://issues.apache.org/jira/browse/MESOS-2971
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Mei Wan
>  Labels: mesosphere, twitter, unified-containerizer-mvp
>
> Part of the image provisioning process is to call a backend to create a root 
> filesystem based on the image's on-disk layout.
> The problem with the copy backend is that it wastes both IO and space, 
> and the bind backend can only deal with one layer.
> An OverlayFS backend allows us to use the filesystem to merge multiple 
> filesystems into one efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)