[jira] [Commented] (MESOS-3202) Avoid frameworks starving in DRF allocator.

2015-08-05 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654897#comment-14654897
 ] 

Joerg Schad commented on MESOS-3202:


Short update: After various discussions with [~alex-mesos] and 
[~benjaminhindman], we hope to avoid such situations using Quota (MESOS-1791).

 Avoid frameworks starving in DRF allocator.
 ---

 Key: MESOS-3202
 URL: https://issues.apache.org/jira/browse/MESOS-3202
 Project: Mesos
  Issue Type: Bug
Reporter: Joerg Schad

 We currently run into issues with the DRF allocator where frameworks do not 
 receive offers (see https://github.com/mesosphere/marathon/issues/1931 for 
 details). 
 Imagine that we have 10 frameworks and unallocated resources from a single 
 slave.
 The allocation interval is 1 sec, and refuse_seconds (i.e. the time for which a 
 declined resource is filtered) is 3 sec across all frameworks. 
 The allocator offers the resources to framework 1 (according to DRF), which 
 declines the offer immediately. 
 In the next allocation interval framework 1 is skipped because of its earlier 
 decline, so framework 2 is offered the resources, which it also declines.
 The same happens in the next allocation interval (with framework 3). 
 In the following allocation interval the refuse_seconds filter for framework 1 
 has expired, and since it still has the lowest DRF share it is offered the 
 resources again, which it again declines. And the cycle begins again. 
 Framework 4 (which is actually waiting for these resources) is never offered 
 them.
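For illustration, here is a minimal stand-alone sketch of the cycle described above (not allocator code; it assumes the frameworks are ordered by DRF share with framework 1 lowest, that frameworks 1-3 always decline, and that framework 4 would accept):

{code}
// Illustrative simulation of the starvation cycle (self-contained).
#include <iostream>
#include <vector>

int main()
{
  const int frameworks = 10;
  const int allocationInterval = 1;  // seconds
  const int refuseSeconds = 3;       // how long a decline filters the resource

  std::vector<int> filteredUntil(frameworks + 1, 0);

  for (int t = 0; t <= 12; t += allocationInterval) {
    // Offer to the unfiltered framework with the lowest DRF share.
    for (int f = 1; f <= frameworks; ++f) {
      if (t >= filteredUntil[f]) {
        std::cout << "t=" << t << "s: resources offered to framework " << f << std::endl;
        if (f != 4) {
          filteredUntil[f] = t + refuseSeconds;  // frameworks 1-3 decline
        } else {
          std::cout << "framework 4 accepts" << std::endl;  // never reached
        }
        break;
      }
    }
  }

  // The output rotates through frameworks 1, 2, 3 forever; framework 4,
  // which actually wants the resources, is never offered them.
  return 0;
}
{code}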
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2916) Expose State API via HTTP

2015-08-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/MESOS-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654984#comment-14654984
 ] 

Tomás Senart commented on MESOS-2916:
-

This refers to the state abstraction.

 Expose State API via HTTP
 -

 Key: MESOS-2916
 URL: https://issues.apache.org/jira/browse/MESOS-2916
 Project: Mesos
  Issue Type: Story
Reporter: Tomás Senart
  Labels: http

 The State API is a useful service for frameworks to use. It would make sense 
 to have it available via the public HTTP API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3207) C++ style guide is not rendered correctly (code section syntax disregarded)

2015-08-05 Thread Bernd Mathiske (JIRA)
Bernd Mathiske created MESOS-3207:
-

 Summary: C++ style guide is not rendered correctly (code section 
syntax disregarded)
 Key: MESOS-3207
 URL: https://issues.apache.org/jira/browse/MESOS-3207
 Project: Mesos
  Issue Type: Bug
  Components: project website, webui
Affects Versions: 0.23.0
Reporter: Anand Mazumdar
Assignee: Bernd Mathiske
Priority: Minor


Some paragraphs at the bottom of docs/mesos-c++-style-guide.md containing code 
sections are not rendered correctly by the web site generator. It looks fine in 
a github gist and apparently the syntax used is correct. 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3207) C++ style guide is not rendered correctly (code section syntax disregarded)

2015-08-05 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655060#comment-14655060
 ] 

haosdent commented on MESOS-3207:
-

{quote}
  Prefer `constexpr to `const` for all constant POD declarations, `constexpr` 
`char` arrays are preferred to `const` `string` literals.
{quote}
should be
{quote}
  Prefer `constexpr` to `const` for all constant POD declarations, `constexpr` 
`char` arrays are preferred to `const` `string` literals.
{quote}


 C++ style guide is not rendered correctly (code section syntax disregarded)
 ---

 Key: MESOS-3207
 URL: https://issues.apache.org/jira/browse/MESOS-3207
 Project: Mesos
  Issue Type: Bug
  Components: project website, webui
Affects Versions: 0.23.0
Reporter: Anand Mazumdar
Assignee: Bernd Mathiske
Priority: Minor
  Labels: mesosphere

 Some paragraphs at the bottom of docs/mesos-c++-style-guide.md containing 
 code sections are not rendered correctly by the web site generator. It looks 
 fine in a github gist and apparently the syntax used is correct. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3207) C++ style guide is not rendered correctly (code section syntax disregarded)

2015-08-05 Thread Bernd Mathiske (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655061#comment-14655061
 ] 

Bernd Mathiske commented on MESOS-3207:
---

Thus I learned the hard way that using a gist preview is not enough to find out 
if a markdown file will be rendered as expected eventually.  Better to generate 
the web site and observe it in dev mode (see the README.md in the web site 
repository), checking the real thing.


 C++ style guide is not rendered correctly (code section syntax disregarded)
 ---

 Key: MESOS-3207
 URL: https://issues.apache.org/jira/browse/MESOS-3207
 Project: Mesos
  Issue Type: Bug
  Components: project website, webui
Affects Versions: 0.23.0
Reporter: Anand Mazumdar
Assignee: Bernd Mathiske
Priority: Minor
  Labels: mesosphere

 Some paragraphs at the bottom of docs/mesos-c++-style-guide.md containing 
 code sections are not rendered correctly by the web site generator. It looks 
 fine in a github gist and apparently the syntax used is correct. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3207) C++ style guide is not rendered correctly (code section syntax disregarded)

2015-08-05 Thread Bernd Mathiske (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655063#comment-14655063
 ] 

Bernd Mathiske commented on MESOS-3207:
---

Ah, thx [~haosd...@gmail.com]! Maybe the missing backtick threw it off. In any 
case, our markdown style guide now recommends using tildes instead. That's what 
I will do in the fix as well.
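For reference, a sketch of the tilde-fence form meant here (assuming the ~~~{.cpp} style used in the Mesos docs; the snippet content is only an example):

{noformat}
~~~{.cpp}
// Prefer `constexpr` to `const` for all constant POD declarations.
constexpr char LITERAL[] = "value";
~~~
{noformat}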

 C++ style guide is not rendered correctly (code section syntax disregarded)
 ---

 Key: MESOS-3207
 URL: https://issues.apache.org/jira/browse/MESOS-3207
 Project: Mesos
  Issue Type: Bug
  Components: project website, webui
Affects Versions: 0.23.0
Reporter: Anand Mazumdar
Assignee: Bernd Mathiske
Priority: Minor
  Labels: mesosphere

 Some paragraphs at the bottom of docs/mesos-c++-style-guide.md containing 
 code sections are not rendered correctly by the web site generator. It looks 
 fine in a github gist and apparently the syntax used is correct. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3062) Add authorization for dynamic reservation

2015-08-05 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655124#comment-14655124
 ] 

Michael Park commented on MESOS-3062:
-

Introduced ACL protobuf definitions for dynamic reservation: 
https://reviews.apache.org/r/37002/
Enabled the Authorizer to handle Reserve/Unreserve ACLs: 
https://reviews.apache.org/r/37110/
Added 'Master::authorize' for Reserve/Unreserve: 
https://reviews.apache.org/r/37125/
Added authorization for dynamic reservation master endpoints: 
https://reviews.apache.org/r/37126/
Added framework authorization for dynamic reservation: 
https://reviews.apache.org/r/37127/

 Add authorization for dynamic reservation
 -

 Key: MESOS-3062
 URL: https://issues.apache.org/jira/browse/MESOS-3062
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Michael Park
Assignee: Michael Park
  Labels: mesosphere

 Dynamic reservations should be authorized with the {{principal}} of the 
 reserving entity (framework or master). The idea is to introduce {{Reserve}} 
 and {{Unreserve}} into the ACL.
 {code}
   message Reserve {
     // Subjects.
     required Entity principals = 1;

     // Objects.  MVP: Only possible values = ANY, NONE
     required Entity resources = 2;
   }

   message Unreserve {
     // Subjects.
     required Entity principals = 1;

     // Objects.
     required Entity reserver_principals = 2;
   }
 {code}
 When a framework/operator reserves resources, reserve ACLs are checked to 
 see if the framework ({{FrameworkInfo.principal}}) or the operator 
 ({{Credential.user}}) is authorized to reserve the specified resources. If 
 not authorized, the reserve operation is rejected.
 When a framework/operator unreserves resources, unreserve ACLs are checked 
 to see if the framework ({{FrameworkInfo.principal}}) or the operator 
 ({{Credential.user}}) is authorized to unreserve the resources reserved by a 
 framework or operator ({{Resource.ReservationInfo.principal}}). If not 
 authorized, the unreserve operation is rejected.
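A minimal sketch of the intended check (illustrative only; the ReserveACL type and matching rules below are simplified assumptions, not the actual Authorizer interface):

{code}
#include <set>
#include <string>
#include <vector>

// Simplified model of a Reserve ACL: subjects are principals, the object is
// either ANY or NONE of the resources (the MVP described above).
struct ReserveACL
{
  std::set<std::string> principals;  // "ANY" stands for every principal.
  bool resourcesAny;                 // true = ANY, false = NONE.
};

// Returns true if some ACL authorizes `principal` to reserve resources.
bool authorizeReserve(
    const std::string& principal,
    const std::vector<ReserveACL>& acls)
{
  for (const ReserveACL& acl : acls) {
    bool subjectMatches =
      acl.principals.count("ANY") > 0 || acl.principals.count(principal) > 0;

    if (subjectMatches) {
      return acl.resourcesAny;  // ANY permits, NONE rejects.
    }
  }

  return false;  // No matching ACL: reject in this sketch.
}
{code}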



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2073) Fetcher cache file verification, updating and invalidation

2015-08-05 Thread Bernd Mathiske (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bernd Mathiske updated MESOS-2073:
--
Story Points:   (was: 2)
Target Version/s:   (was: 0.24.0)
  Issue Type: Epic  (was: Improvement)

After working out how to implement this, it turned out to be an epic in its own 
right, so I have removed it from under the fetcher cache epic.

 Fetcher cache file verification, updating and invalidation
 --

 Key: MESOS-2073
 URL: https://issues.apache.org/jira/browse/MESOS-2073
 Project: Mesos
  Issue Type: Epic
  Components: fetcher, slave
Reporter: Bernd Mathiske
Assignee: Bernd Mathiske
Priority: Minor
  Labels: mesosphere
   Original Estimate: 96h
  Remaining Estimate: 96h

 The other tickets in the fetcher cache epic do not necessitate a check sum 
 (e.g. MD5, SHA*) for files cached by the fetcher. Whereas such a check sum 
 could be used to verify whether the file arrived without unintended 
 alterations, it can first and foremost be employed to detect and trigger 
 updates. 
 Scenario: If a URI is requested for fetching and the indicated download has 
 the same check sum as the cached file, then the cache file will be used and 
 the download forgone. If the check sum is different, then fetching proceeds 
 and the cached file gets replaced. 
 This capability will be indicated by an additional field in the URI protobuf. 
 Details TBD, i.e. to be discussed in comments below.
 In addition to the above, even if the check sum is the same, we can support 
 voluntary cache file invalidation: a fresh download can be requested, or the 
 caching behavior can be revoked entirely.
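A minimal self-contained sketch of the cache decision described above (illustrative only, not fetcher code; the check sum value is faked):

{code}
#include <iostream>
#include <map>
#include <string>

std::map<std::string, std::string> cache;  // URI -> check sum of cached file

std::string checksumOf(const std::string& uri)
{
  // Placeholder: in reality this would be the MD5/SHA* of the download.
  return "sha256:abc123";
}

void fetch(const std::string& uri)
{
  auto cached = cache.find(uri);
  if (cached != cache.end() && cached->second == checksumOf(uri)) {
    std::cout << "same check sum: reuse cached " << uri << ", forgo download" << std::endl;
    return;
  }

  std::cout << "new or changed check sum: download and replace " << uri << std::endl;
  cache[uri] = checksumOf(uri);
}

int main()
{
  fetch("hdfs://example/executor.tar.gz");  // downloads
  fetch("hdfs://example/executor.tar.gz");  // reuses the cache file
  return 0;
}
{code}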



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (MESOS-336) Mesos slave should cache executors

2015-08-05 Thread Bernd Mathiske (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bernd Mathiske updated MESOS-336:
-
Comment: was deleted

(was: By spinning off MESOS-2073 into its own epic, this one can now be 
regarded as completed. 
(That we are only using two-level hierarchies to break down tickets led to this 
repositioning.))

 Mesos slave should cache executors
 --

 Key: MESOS-336
 URL: https://issues.apache.org/jira/browse/MESOS-336
 Project: Mesos
  Issue Type: Epic
  Components: slave
Reporter: brian wickman
Assignee: Bernd Mathiske
  Labels: mesosphere
   Original Estimate: 672h
  Remaining Estimate: 672h

 The slave should be smarter about how it handles pulling down executors.  In 
 our environment, executors rarely change, but the slave will always pull them 
 down from HDFS regardless.  This puts undue stress on our HDFS clusters, and 
 is not resilient to reduced HDFS availability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3208) Fetch checksum files to inform fetcher cache use

2015-08-05 Thread Bernd Mathiske (JIRA)
Bernd Mathiske created MESOS-3208:
-

 Summary: Fetch checksum files to inform fetcher cache use
 Key: MESOS-3208
 URL: https://issues.apache.org/jira/browse/MESOS-3208
 Project: Mesos
  Issue Type: Improvement
  Components: fetcher
Reporter: Bernd Mathiske
Assignee: Bernd Mathiske
Priority: Minor


This is the first part of phase 1 as described in the comments for MESOS-2073. 
We add a field to CommandInfo::URI that contains the URI of a checksum file. 
When this file has new content, then the contents of the associated value URI 
needs to be refreshed in the fetcher cache. 

In this implementation step, we just add the above basic functionality 
(download, checksum comparison). In later steps, we will add more control flow 
to cover corner cases and thus make this feature more useful.
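A rough sketch of the kind of field addition meant here (the field name and number are placeholders, not the final mesos.proto change):

{code}
// Illustrative only -- not the actual CommandInfo::URI definition.
message URI {
  required string value = 1;

  // Hypothetical new field: URI of a checksum file for `value`. When its
  // content changes, the cached copy of `value` must be refreshed.
  optional string checksum_uri = 2;
}
{code}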




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2073) Fetcher cache file verification, updating and invalidation

2015-08-05 Thread Bernd Mathiske (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bernd Mathiske updated MESOS-2073:
--
Epic Name: Fetcher cache checksums

 Fetcher cache file verification, updating and invalidation
 --

 Key: MESOS-2073
 URL: https://issues.apache.org/jira/browse/MESOS-2073
 Project: Mesos
  Issue Type: Epic
  Components: fetcher, slave
Reporter: Bernd Mathiske
Assignee: Bernd Mathiske
Priority: Minor
  Labels: mesosphere
   Original Estimate: 96h
  Remaining Estimate: 96h

 The other tickets in the fetcher cache epic do not necessitate a check sum 
 (e.g. MD5, SHA*) for files cached by the fetcher. Whereas such a check sum 
 could be used to verify whether the file arrived without unintended 
 alterations, it can first and foremost be employed to detect and trigger 
 updates. 
 Scenario: If a URI is requested for fetching and the indicated download has 
 the same check sum as the cached file, then the cache file will be used and 
 the download forgone. If the check sum is different, then fetching proceeds 
 and the cached file gets replaced. 
 This capability will be indicated by an additional field in the URI protobuf. 
 Details TBD, i.e. to be discussed in comments below.
 In addition to the above, even if the check sum is the same, we can support 
 voluntary cache file invalidation: a fresh download can be requested, or the 
 caching behavior can be revoked entirely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (MESOS-2073) Fetcher cache file verification, updating and invalidation

2015-08-05 Thread Bernd Mathiske (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bernd Mathiske updated MESOS-2073:
--
Comment: was deleted

(was: First patch in a series to implement phase 1: 

https://reviews.apache.org/r/37075/
)

 Fetcher cache file verification, updating and invalidation
 --

 Key: MESOS-2073
 URL: https://issues.apache.org/jira/browse/MESOS-2073
 Project: Mesos
  Issue Type: Epic
  Components: fetcher, slave
Reporter: Bernd Mathiske
Assignee: Bernd Mathiske
Priority: Minor
  Labels: mesosphere
   Original Estimate: 96h
  Remaining Estimate: 96h

 The other tickets in the fetcher cache epic do not necessitate a check sum 
 (e.g. MD5, SHA*) for files cached by the fetcher. Whereas such a check sum 
 could be used to verify whether the file arrived without unintended 
 alterations, it can first and foremost be employed to detect and trigger 
 updates. 
 Scenario: If a URI is requested for fetching and the indicated download has 
 the same check sum as the cached file, then the cache file will be used and 
 the download forgone. If the check sum is different, then fetching proceeds 
 and the cached file gets replaced. 
 This capability will be indicated by an additional field in the URI protobuf. 
 Details TBD, i.e. to be discussed in comments below.
 In addition to the above, even if the check sum is the same, we can support 
 voluntary cache file invalidation: a fresh download can be requested, or the 
 caching behavior can be revoked entirely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-2073) Fetcher cache file verification, updating and invalidation

2015-08-05 Thread Bernd Mathiske (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653655#comment-14653655
 ] 

Bernd Mathiske edited comment on MESOS-2073 at 8/5/15 11:36 AM:


First patch in a series to implement phase 1: 

https://reviews.apache.org/r/37075/



was (Author: bernd-mesos):
First patch in a series to implement phase 1: 

https://reviews.apache.org/r/37075/


 Fetcher cache file verification, updating and invalidation
 --

 Key: MESOS-2073
 URL: https://issues.apache.org/jira/browse/MESOS-2073
 Project: Mesos
  Issue Type: Epic
  Components: fetcher, slave
Reporter: Bernd Mathiske
Assignee: Bernd Mathiske
Priority: Minor
  Labels: mesosphere
   Original Estimate: 96h
  Remaining Estimate: 96h

 The other tickets in the fetcher cache epic do not necessitate a check sum 
 (e.g. MD5, SHA*) for files cached by the fetcher. Whereas such a check sum 
 could be used to verify whether the file arrived without unintended 
 alterations, it can first and foremost be employed to detect and trigger 
 updates. 
 Scenario: If a URI is requested for fetching and the indicated download has 
 the same check sum as the cached file, then the cache file will be used and 
 the download forgone. If the check sum is different, then fetching proceeds 
 and the cached file gets replaced. 
 This capability will be indicated by an additional field in the URI protobuf. 
 Details TBD, i.e. to be discussed in comments below.
 In addition to the above, even if the check sum is the same, we can support 
 voluntary cache file invalidation: a fresh download can be requested, or the 
 caching behavior can be revoked entirely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3208) Fetch checksum files to inform fetcher cache use

2015-08-05 Thread Bernd Mathiske (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655233#comment-14655233
 ] 

Bernd Mathiske commented on MESOS-3208:
---

First patch in a series to implement phase 1:
https://reviews.apache.org/r/37075/

 Fetch checksum files to inform fetcher cache use
 

 Key: MESOS-3208
 URL: https://issues.apache.org/jira/browse/MESOS-3208
 Project: Mesos
  Issue Type: Improvement
  Components: fetcher
Reporter: Bernd Mathiske
Assignee: Bernd Mathiske
Priority: Minor

 This is the first part of phase 1 as described in the comments for 
 MESOS-2073. We add a field to CommandInfo::URI that contains the URI of a 
 checksum file. When this file has new content, then the contents of the 
 associated value URI needs to be refreshed in the fetcher cache. 
 In this implementation step, we just add the above basic functionality 
 (download, checksum comparison). In later steps, we will add more control 
 flow to cover corner cases and thus make this feature more useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-1457) Process IDs should be required to be human-readable

2015-08-05 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658324#comment-14658324
 ] 

Till Toenshoff commented on MESOS-1457:
---

Shepherd will get assigned shortly.

 Process IDs should be required to be human-readable 
 

 Key: MESOS-1457
 URL: https://issues.apache.org/jira/browse/MESOS-1457
 Project: Mesos
  Issue Type: Improvement
  Components: libprocess
Reporter: Dominic Hamon
Assignee: Palak Choudhary
Priority: Minor

 When debugging, it's very useful to understand which processes are getting 
 timeslices. As such, the human-readable names that can be passed to 
 {{ProcessBase}} are incredibly valuable, however they are currently optional.
 If the constructor of {{ProcessBase}} took a mandatory string, every process 
 would get a human-readable name and debugging would be much easier.
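A minimal sketch of what that looks like with libprocess today, passing an explicit ID (names are illustrative):

{code}
#include <process/id.hpp>
#include <process/process.hpp>

// The process will show up with a readable ID such as "example(1)" instead
// of an anonymous, auto-generated one.
class ExampleProcess : public process::Process<ExampleProcess>
{
public:
  ExampleProcess()
    : process::ProcessBase(process::ID::generate("example")) {}
};

int main()
{
  ExampleProcess process;
  process::spawn(process);
  process::terminate(process);
  process::wait(process);
  return 0;
}
{code}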



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3199) Validate Quota Requests.

2015-08-05 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3199:
--
Shepherd: Bernd Mathiske

 Validate Quota Requests.
 

 Key: MESOS-3199
 URL: https://issues.apache.org/jira/browse/MESOS-3199
 Project: Mesos
  Issue Type: Task
Reporter: Joerg Schad
Assignee: Joerg Schad
  Labels: mesosphere

 We need to validate quota requests in terms of syntax correctness, update 
 Master bookkeeping structures, and persist quota requests in the {{Registry}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3063) Add an example framework using dynamic reservation

2015-08-05 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658408#comment-14658408
 ] 

Klaus Ma commented on MESOS-3063:
-

I have a draft example that reserves the resources; I'm thinking of un-reserving the 
resources after all tasks are done. Will update the code by the end of this week.
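A sketch of the reserve step with the 0.23-era scheduler driver API (illustrative only; the role, principal, and resource values are placeholders, and error handling is omitted):

{code}
#include <mesos/mesos.hpp>
#include <mesos/scheduler.hpp>

using namespace mesos;

void reserve(SchedulerDriver* driver, const Offer& offer)
{
  Resource cpus;
  cpus.set_name("cpus");
  cpus.set_type(Value::SCALAR);
  cpus.mutable_scalar()->set_value(1.0);
  cpus.set_role("test-role");                                  // placeholder
  cpus.mutable_reservation()->set_principal("test-principal"); // placeholder

  Offer::Operation operation;
  operation.set_type(Offer::Operation::RESERVE);
  operation.mutable_reserve()->add_resources()->CopyFrom(cpus);

  driver->acceptOffers({offer.id()}, {operation});
}
{code}

Un-reserving after the tasks finish would be the symmetric {{Offer::Operation::UNRESERVE}} over the reserved resources.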

 Add an example framework using dynamic reservation
 --

 Key: MESOS-3063
 URL: https://issues.apache.org/jira/browse/MESOS-3063
 Project: Mesos
  Issue Type: Task
Reporter: Michael Park
Assignee: Klaus Ma

 An example framework using dynamic reservation should be added to
 # test dynamic reservations further, and
 # to be used as a reference for those who want to use the dynamic reservation 
 feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3015) Add hooks for Slave exits

2015-08-05 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3015:
--
Shepherd: Niklas Quarfot Nielsen

 Add hooks for Slave exits
 -

 Key: MESOS-3015
 URL: https://issues.apache.org/jira/browse/MESOS-3015
 Project: Mesos
  Issue Type: Task
Reporter: Kapil Arya
Assignee: Kapil Arya
  Labels: mesosphere

 The hook will be triggered on slave exits. A master hook module can use this 
 to do Slave-specific cleanups.
 In our particular use case, the hook would trigger cleanup of IPs assigned to 
 the given Slave (see the [design doc | 
 https://docs.google.com/document/d/17mXtAmdAXcNBwp_JfrxmZcQrs7EO6ancSbejrqjLQ0g/edit#]).
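A rough sketch of such a module (the hook name below is an assumption for illustration -- the ticket predates the final interface -- and the cleanup body is a placeholder):

{code}
#include <mesos/hook.hpp>
#include <mesos/mesos.hpp>

#include <stout/nothing.hpp>
#include <stout/try.hpp>

class IPCleanupHook : public mesos::Hook
{
public:
  // Hypothetical hook invoked by the master when a slave exits.
  virtual Try<Nothing> masterSlaveLostHook(const mesos::SlaveInfo& slaveInfo)
  {
    // Release the IPs assigned to this slave (placeholder).
    return Nothing();
  }
};
{code}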



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3092) Configure Jenkins to run Docker tests

2015-08-05 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658410#comment-14658410
 ] 

Niklas Quarfot Nielsen commented on MESOS-3092:
---

[~vinodkone] Would you mind shepherding this? :)

 Configure Jenkins to run Docker tests
 -

 Key: MESOS-3092
 URL: https://issues.apache.org/jira/browse/MESOS-3092
 Project: Mesos
  Issue Type: Improvement
  Components: docker
Reporter: Timothy Chen
Assignee: Timothy Chen
  Labels: mesosphere

 Add a Jenkins job to run the Docker tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3021) Implement Docker Image Provisioner Reference Store

2015-08-05 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3021:
--
Shepherd: Timothy Chen

 Implement Docker Image Provisioner Reference Store
 --

 Key: MESOS-3021
 URL: https://issues.apache.org/jira/browse/MESOS-3021
 Project: Mesos
  Issue Type: Improvement
Reporter: Lily Chen
Assignee: Lily Chen
  Labels: mesosphere

 Create a comprehensive store to look up the image layer ID associated with an 
 image and tag. Implement adding, removing, saving, and updating images and 
 their associated tags.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-1010) Python extension build is broken if gflags-dev is installed

2015-08-05 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658414#comment-14658414
 ] 

Niklas Quarfot Nielsen commented on MESOS-1010:
---

[~jvanremoortere] Would you mind shepherding this? :)

 Python extension build is broken if gflags-dev is installed
 ---

 Key: MESOS-1010
 URL: https://issues.apache.org/jira/browse/MESOS-1010
 Project: Mesos
  Issue Type: Bug
  Components: build, python api
 Environment: Fedora 20, amd64. GCC: 4.8.2.
Reporter: Nikita Vetoshkin
Assignee: Greg Mann
  Labels: flaky-test, mesosphere

 In my environment mesos build from master results in broken python api module 
 {{_mesos.so}}:
 {noformat}
 nekto0n@ya-darkstar ~/workspace/mesos/src/python $ 
 PYTHONPATH=build/lib.linux-x86_64-2.7/ python -c "import _mesos"
 Traceback (most recent call last):
   File "<string>", line 1, in <module>
 ImportError: 
 /home/nekto0n/workspace/mesos/src/python/build/lib.linux-x86_64-2.7/_mesos.so:
  undefined symbol: _ZN6google14FlagRegistererC1EPKcS2_S2_S2_PvS3_
 {noformat}
 Unmangled version of symbol looks like this:
 {noformat}
 google::FlagRegisterer::FlagRegisterer(char const*, char const*, char const*, 
 char const*, void*, void*)
 {noformat}
 During {{./configure}} step {{glog}} finds {{gflags}} development files and 
 starts using them, thus *implicitly* adding dependency on {{libgflags.so}}. 
 This breaks Python extensions module and perhaps can break other mesos 
 subsystems when moved to hosts without {{gflags}} installed.
 This task is done when the ExamplesTest.PythonFramework test passes on a 
 system with gflags installed.
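One way to confirm the implicit gflags dependency on such a system (a diagnostic suggestion, not from the original report):

{noformat}
$ ldd build/lib.linux-x86_64-2.7/_mesos.so | grep gflags
$ nm -D build/lib.linux-x86_64-2.7/_mesos.so | grep FlagRegisterer | c++filt
{noformat}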



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-830) ExamplesTest.JavaFramework is flaky

2015-08-05 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658415#comment-14658415
 ] 

Niklas Quarfot Nielsen commented on MESOS-830:
--

[~jvanremoortere] Would you mind shepherding this? :)

 ExamplesTest.JavaFramework is flaky
 ---

 Key: MESOS-830
 URL: https://issues.apache.org/jira/browse/MESOS-830
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Vinod Kone
Assignee: Greg Mann
  Labels: flaky, mesosphere

 Identify the cause of the following test failure:
 [ RUN  ] ExamplesTest.JavaFramework
 Using temporary directory '/tmp/ExamplesTest_JavaFramework_wSc7u8'
 Enabling authentication for the framework
 I1120 15:13:39.820032 1681264640 master.cpp:285] Master started on 
 172.25.133.171:52576
 I1120 15:13:39.820180 1681264640 master.cpp:299] Master ID: 
 201311201513-2877626796-52576-3234
 I1120 15:13:39.820194 1681264640 master.cpp:302] Master only allowing 
 authenticated frameworks to register!
 I1120 15:13:39.821197 1679654912 slave.cpp:112] Slave started on 
 1)@172.25.133.171:52576
 I1120 15:13:39.821795 1679654912 slave.cpp:212] Slave resources: cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.822855 1682337792 slave.cpp:112] Slave started on 
 2)@172.25.133.171:52576
 I1120 15:13:39.823652 1682337792 slave.cpp:212] Slave resources: cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.825330 1679118336 master.cpp:744] The newly elected leader is 
 master@172.25.133.171:52576
 I1120 15:13:39.825445 1679118336 master.cpp:748] Elected as the leading 
 master!
 I1120 15:13:39.825907 1681264640 state.cpp:33] Recovering state from 
 '/tmp/ExamplesTest_JavaFramework_wSc7u8/0/meta'
 I1120 15:13:39.826127 1681264640 status_update_manager.cpp:180] Recovering 
 status update manager
 I1120 15:13:39.826331 1681801216 process_isolator.cpp:317] Recovering isolator
 I1120 15:13:39.826738 1682874368 slave.cpp:2743] Finished recovery
 I1120 15:13:39.827747 1682337792 state.cpp:33] Recovering state from 
 '/tmp/ExamplesTest_JavaFramework_wSc7u8/1/meta'
 I1120 15:13:39.827945 1680191488 slave.cpp:112] Slave started on 
 3)@172.25.133.171:52576
 I1120 15:13:39.828415 1682337792 status_update_manager.cpp:180] Recovering 
 status update manager
 I1120 15:13:39.828608 1680728064 sched.cpp:260] Authenticating with master 
 master@172.25.133.171:52576
 I1120 15:13:39.828606 1680191488 slave.cpp:212] Slave resources: cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.828680 1682874368 slave.cpp:497] New master detected at 
 master@172.25.133.171:52576
 I1120 15:13:39.828765 1682337792 process_isolator.cpp:317] Recovering isolator
 I1120 15:13:39.829828 1680728064 sched.cpp:229] Detecting new master
 I1120 15:13:39.830288 1679654912 authenticatee.hpp:100] Initializing client 
 SASL
 I1120 15:13:39.831635 1680191488 state.cpp:33] Recovering state from 
 '/tmp/ExamplesTest_JavaFramework_wSc7u8/2/meta'
 I1120 15:13:39.831991 1679118336 status_update_manager.cpp:158] New master 
 detected at master@172.25.133.171:52576
 I1120 15:13:39.832042 1682874368 slave.cpp:524] Detecting new master
 I1120 15:13:39.832314 1682337792 slave.cpp:2743] Finished recovery
 I1120 15:13:39.832309 1681264640 master.cpp:1266] Attempting to register 
 slave on vkone.local at slave(1)@172.25.133.171:52576
 I1120 15:13:39.832929 1680728064 status_update_manager.cpp:180] Recovering 
 status update manager
 I1120 15:13:39.833371 1681801216 slave.cpp:497] New master detected at 
 master@172.25.133.171:52576
 I1120 15:13:39.833273 1681264640 master.cpp:2513] Adding slave 
 201311201513-2877626796-52576-3234-0 at vkone.local with cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.833595 1680728064 process_isolator.cpp:317] Recovering isolator
 I1120 15:13:39.833859 1681801216 slave.cpp:524] Detecting new master
 I1120 15:13:39.833861 1682874368 status_update_manager.cpp:158] New master 
 detected at master@172.25.133.171:52576
 I1120 15:13:39.834092 1680191488 slave.cpp:542] Registered with master 
 master@172.25.133.171:52576; given slave ID 
 201311201513-2877626796-52576-3234-0
 I1120 15:13:39.834486 1681264640 master.cpp:1266] Attempting to register 
 slave on vkone.local at slave(2)@172.25.133.171:52576
 I1120 15:13:39.834549 1681264640 master.cpp:2513] Adding slave 
 201311201513-2877626796-52576-3234-1 at vkone.local with cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.834750 1680191488 slave.cpp:555] Checkpointing SlaveInfo to 
 '/tmp/ExamplesTest_JavaFramework_wSc7u8/0/meta/slaves/201311201513-2877626796-52576-3234-0/slave.info'
 I1120 15:13:39.834875 1682874368 hierarchical_allocator_process.hpp:445] 
 Added slave 

[jira] [Commented] (MESOS-830) ExamplesTest.JavaFramework is flaky

2015-08-05 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658434#comment-14658434
 ] 

Till Toenshoff commented on MESOS-830:
--

[~greggomann] I added some debug code to that macro, which told me that 
pthread_rwlock_wrlock returned 22 (Invalid Argument), and from that I assumed 
that the lock in question had already been destroyed.
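For reference, error 22 is EINVAL; a check along the lines of the debug code described above could look like this (illustrative, not the actual macro):

{code}
#include <cstdio>
#include <cstring>
#include <pthread.h>

// pthread_rwlock_wrlock() returns the error number directly (it does not set
// errno); 22 == EINVAL, which typically means the lock is invalid, e.g. it
// has already been destroyed.
void debugWrlock(pthread_rwlock_t* lock)
{
  int result = pthread_rwlock_wrlock(lock);
  if (result != 0) {
    fprintf(stderr, "pthread_rwlock_wrlock: %d (%s)\n",
            result, strerror(result));
  }
}
{code}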

 ExamplesTest.JavaFramework is flaky
 ---

 Key: MESOS-830
 URL: https://issues.apache.org/jira/browse/MESOS-830
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Vinod Kone
Assignee: Greg Mann
  Labels: flaky, mesosphere

 Identify the cause of the following test failure:
 [ RUN  ] ExamplesTest.JavaFramework
 Using temporary directory '/tmp/ExamplesTest_JavaFramework_wSc7u8'
 Enabling authentication for the framework
 I1120 15:13:39.820032 1681264640 master.cpp:285] Master started on 
 172.25.133.171:52576
 I1120 15:13:39.820180 1681264640 master.cpp:299] Master ID: 
 201311201513-2877626796-52576-3234
 I1120 15:13:39.820194 1681264640 master.cpp:302] Master only allowing 
 authenticated frameworks to register!
 I1120 15:13:39.821197 1679654912 slave.cpp:112] Slave started on 
 1)@172.25.133.171:52576
 I1120 15:13:39.821795 1679654912 slave.cpp:212] Slave resources: cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.822855 1682337792 slave.cpp:112] Slave started on 
 2)@172.25.133.171:52576
 I1120 15:13:39.823652 1682337792 slave.cpp:212] Slave resources: cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.825330 1679118336 master.cpp:744] The newly elected leader is 
 master@172.25.133.171:52576
 I1120 15:13:39.825445 1679118336 master.cpp:748] Elected as the leading 
 master!
 I1120 15:13:39.825907 1681264640 state.cpp:33] Recovering state from 
 '/tmp/ExamplesTest_JavaFramework_wSc7u8/0/meta'
 I1120 15:13:39.826127 1681264640 status_update_manager.cpp:180] Recovering 
 status update manager
 I1120 15:13:39.826331 1681801216 process_isolator.cpp:317] Recovering isolator
 I1120 15:13:39.826738 1682874368 slave.cpp:2743] Finished recovery
 I1120 15:13:39.827747 1682337792 state.cpp:33] Recovering state from 
 '/tmp/ExamplesTest_JavaFramework_wSc7u8/1/meta'
 I1120 15:13:39.827945 1680191488 slave.cpp:112] Slave started on 
 3)@172.25.133.171:52576
 I1120 15:13:39.828415 1682337792 status_update_manager.cpp:180] Recovering 
 status update manager
 I1120 15:13:39.828608 1680728064 sched.cpp:260] Authenticating with master 
 master@172.25.133.171:52576
 I1120 15:13:39.828606 1680191488 slave.cpp:212] Slave resources: cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.828680 1682874368 slave.cpp:497] New master detected at 
 master@172.25.133.171:52576
 I1120 15:13:39.828765 1682337792 process_isolator.cpp:317] Recovering isolator
 I1120 15:13:39.829828 1680728064 sched.cpp:229] Detecting new master
 I1120 15:13:39.830288 1679654912 authenticatee.hpp:100] Initializing client 
 SASL
 I1120 15:13:39.831635 1680191488 state.cpp:33] Recovering state from 
 '/tmp/ExamplesTest_JavaFramework_wSc7u8/2/meta'
 I1120 15:13:39.831991 1679118336 status_update_manager.cpp:158] New master 
 detected at master@172.25.133.171:52576
 I1120 15:13:39.832042 1682874368 slave.cpp:524] Detecting new master
 I1120 15:13:39.832314 1682337792 slave.cpp:2743] Finished recovery
 I1120 15:13:39.832309 1681264640 master.cpp:1266] Attempting to register 
 slave on vkone.local at slave(1)@172.25.133.171:52576
 I1120 15:13:39.832929 1680728064 status_update_manager.cpp:180] Recovering 
 status update manager
 I1120 15:13:39.833371 1681801216 slave.cpp:497] New master detected at 
 master@172.25.133.171:52576
 I1120 15:13:39.833273 1681264640 master.cpp:2513] Adding slave 
 201311201513-2877626796-52576-3234-0 at vkone.local with cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.833595 1680728064 process_isolator.cpp:317] Recovering isolator
 I1120 15:13:39.833859 1681801216 slave.cpp:524] Detecting new master
 I1120 15:13:39.833861 1682874368 status_update_manager.cpp:158] New master 
 detected at master@172.25.133.171:52576
 I1120 15:13:39.834092 1680191488 slave.cpp:542] Registered with master 
 master@172.25.133.171:52576; given slave ID 
 201311201513-2877626796-52576-3234-0
 I1120 15:13:39.834486 1681264640 master.cpp:1266] Attempting to register 
 slave on vkone.local at slave(2)@172.25.133.171:52576
 I1120 15:13:39.834549 1681264640 master.cpp:2513] Adding slave 
 201311201513-2877626796-52576-3234-1 at vkone.local with cpus(*):4; 
 mem(*):7168; disk(*):481998; ports(*):[31000-32000]
 I1120 15:13:39.834750 1680191488 slave.cpp:555] Checkpointing SlaveInfo to 
 

[jira] [Created] (MESOS-3209) parameterize allocator benchmark by framework count

2015-08-05 Thread James Peach (JIRA)
James Peach created MESOS-3209:
--

 Summary: parameterize allocator benchmark by framework count
 Key: MESOS-3209
 URL: https://issues.apache.org/jira/browse/MESOS-3209
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: James Peach
Assignee: James Peach
Priority: Minor


In order to explore allocation performance with multiple frameworks, extend the 
{{HierarchicalAllocator_BENCHMARK_Test}} benchmark so it is parameterized by 
the framework count as well as the slave count.
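One way to do this with gtest value parameterization (a sketch; the class and test names below are illustrative, not the existing benchmark code):

{code}
#include <cstddef>
#include <tuple>

#include <gtest/gtest.h>

// Parameterize the benchmark by (slave count, framework count).
class AllocatorBenchmark
  : public ::testing::TestWithParam<std::tuple<size_t, size_t>> {};

TEST_P(AllocatorBenchmark, DeclineOffers)
{
  const size_t slaveCount = std::get<0>(GetParam());
  const size_t frameworkCount = std::get<1>(GetParam());

  // ... add `slaveCount` slaves and `frameworkCount` frameworks to the
  // allocator and measure the allocation cycle ...
  EXPECT_GT(slaveCount, 0u);
  EXPECT_GT(frameworkCount, 0u);
}

INSTANTIATE_TEST_CASE_P(
    SlaveAndFrameworkCount,
    AllocatorBenchmark,
    ::testing::Values(
        std::make_tuple(1000u, 1u),
        std::make_tuple(1000u, 50u),
        std::make_tuple(5000u, 200u)));
{code}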



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-1838) Add documentation for Authentication

2015-08-05 Thread Tim Anderegg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Anderegg reassigned MESOS-1838:
---

Assignee: Tim Anderegg

 Add documentation for Authentication
 

 Key: MESOS-1838
 URL: https://issues.apache.org/jira/browse/MESOS-1838
 Project: Mesos
  Issue Type: Documentation
  Components: documentation
Reporter: Vinod Kone
Assignee: Tim Anderegg

 We need some documentation about how to enable framework and slave 
 authentication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3210) DiscoveryInfo is broken in state.json

2015-08-05 Thread Dr. Stefan Schimanski (JIRA)
Dr. Stefan Schimanski created MESOS-3210:


 Summary: DiscoveryInfo is broken in state.json
 Key: MESOS-3210
 URL: https://issues.apache.org/jira/browse/MESOS-3210
 Project: Mesos
  Issue Type: Bug
  Components: master
Affects Versions: 0.23.0
Reporter: Dr. Stefan Schimanski


The DiscoveryInfo field of a task in state.json is broken: the ports and labels 
fields are nested one level too deep.

Got:
{code}
"discovery" : {
  "name" : "docker",
  "labels" : {
    "labels" : [
      {
        "key" : "canary",
        "value" : "Mallorca"
      }
    ]
  },
  "visibility" : "CLUSTER",
  "ports" : {
    "ports" : [
      {
        "name" : "health",
        "number" : 1080,
        "protocol" : "http"
      }
    ]
  }
},
{code}

Expected:
{code}
"discovery" : {
  "name" : "docker",
  "labels" : [
    {
      "key" : "canary",
      "value" : "Mallorca"
    }
  ],
  "visibility" : "CLUSTER",
  "ports" : [
    {
      "name" : "health",
      "number" : 1080,
      "protocol" : "http"
    }
  ]
},
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3210) DiscoveryInfo is broken in state.json

2015-08-05 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658498#comment-14658498
 ] 

haosdent commented on MESOS-3210:
-

Because we use a general function to convert the DiscoveryInfo protobuf to JSON, and 
{quote}
"ports" : {
  "ports" : [
    {
      "name" : "health",
      "number" : 1080,
      "protocol" : "http"
    }
  ]
}
{quote}
matches the protobuf message structure. Suppose we added a notes field to ports in the 
protobuf; the structure would then change to 
{quote}
"ports" : {
  "notes" : "This is a note.",
  "ports" : [
    {
      "name" : "health",
      "number" : 1080,
      "protocol" : "http"
    }
  ]
}
{quote}
which would not be compatible with your expected structure. So I think keeping the current 
structure is better, unless we change the DiscoveryInfo protobuf.
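For context, the extra level comes from the wrapper messages in mesos.proto (abridged sketch; field numbers shown only for illustration):

{code}
message Label {
  required string key = 1;
  optional string value = 2;
}

message Labels {
  repeated Label labels = 1;  // -> "labels" : { "labels" : [...] } in JSON
}

message Port {
  required uint32 number = 1;
  optional string name = 2;
  optional string protocol = 3;
}

message Ports {
  repeated Port ports = 1;    // -> "ports" : { "ports" : [...] } in JSON
}
{code}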

 DiscoveryInfo is broken in state.json
 -

 Key: MESOS-3210
 URL: https://issues.apache.org/jira/browse/MESOS-3210
 Project: Mesos
  Issue Type: Bug
  Components: master
Affects Versions: 0.23.0
Reporter: Dr. Stefan Schimanski

 The DiscoveryInfo field of a task in state.json is broken: the ports and labels 
 fields are nested one level too deep.
 Got:
 {code}
 "discovery" : {
   "name" : "docker",
   "labels" : {
     "labels" : [
       {
         "key" : "canary",
         "value" : "Mallorca"
       }
     ]
   },
   "visibility" : "CLUSTER",
   "ports" : {
     "ports" : [
       {
         "name" : "health",
         "number" : 1080,
         "protocol" : "http"
       }
     ]
   }
 },
 {code}
 Expected:
 {code}
 "discovery" : {
   "name" : "docker",
   "labels" : [
     {
       "key" : "canary",
       "value" : "Mallorca"
     }
   ],
   "visibility" : "CLUSTER",
   "ports" : [
     {
       "name" : "health",
       "number" : 1080,
       "protocol" : "http"
     }
   ]
 },
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-3210) DiscoveryInfo is broken in state.json

2015-08-05 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658498#comment-14658498
 ] 

haosdent edited comment on MESOS-3210 at 8/5/15 4:51 PM:
-

Because we use a general function to convert the DiscoveryInfo protobuf to JSON, and 
{code}
"ports" : {
  "ports" : [
    {
      "name" : "health",
      "number" : 1080,
      "protocol" : "http"
    }
  ]
}
{code}
matches the protobuf message structure. Suppose we added a notes field to ports in the 
protobuf; the structure would then change to 
{code}
"ports" : {
  "notes" : "This is a note.",
  "ports" : [
    {
      "name" : "health",
      "number" : 1080,
      "protocol" : "http"
    }
  ]
}
{code}
which would not be compatible with your expected structure. So I think keeping the current 
structure is better, unless we change the DiscoveryInfo protobuf.


was (Author: haosd...@gmail.com):
Because we use a general function to convert the DiscoveryInfo protobuf to JSON, and 
{quote}
"ports" : {
  "ports" : [
    {
      "name" : "health",
      "number" : 1080,
      "protocol" : "http"
    }
  ]
}
{quote}
matches the protobuf message structure. Suppose we added a notes field to ports in the 
protobuf; the structure would then change to 
{quote}
"ports" : {
  "notes" : "This is a note.",
  "ports" : [
    {
      "name" : "health",
      "number" : 1080,
      "protocol" : "http"
    }
  ]
}
{quote}
which would not be compatible with your expected structure. So I think keeping the current 
structure is better, unless we change the DiscoveryInfo protobuf.

 DiscoveryInfo is broken in state.json
 -

 Key: MESOS-3210
 URL: https://issues.apache.org/jira/browse/MESOS-3210
 Project: Mesos
  Issue Type: Bug
  Components: master
Affects Versions: 0.23.0
Reporter: Dr. Stefan Schimanski

 The DiscoveryInfo field of a task in state.json is broken: the ports and labels 
 fields are nested one level too deep.
 Got:
 {code}
 "discovery" : {
   "name" : "docker",
   "labels" : {
     "labels" : [
       {
         "key" : "canary",
         "value" : "Mallorca"
       }
     ]
   },
   "visibility" : "CLUSTER",
   "ports" : {
     "ports" : [
       {
         "name" : "health",
         "number" : 1080,
         "protocol" : "http"
       }
     ]
   }
 },
 {code}
 Expected:
 {code}
 "discovery" : {
   "name" : "docker",
   "labels" : [
     {
       "key" : "canary",
       "value" : "Mallorca"
     }
   ],
   "visibility" : "CLUSTER",
   "ports" : [
     {
       "name" : "health",
       "number" : 1080,
       "protocol" : "http"
     }
   ]
 },
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3209) parameterize allocator benchmark by framework count

2015-08-05 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658562#comment-14658562
 ] 

James Peach commented on MESOS-3209:


https://reviews.apache.org/r/37133/

 parameterize allocator benchmark by framework count
 ---

 Key: MESOS-3209
 URL: https://issues.apache.org/jira/browse/MESOS-3209
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: James Peach
Assignee: James Peach
Priority: Minor

 In order to explore allocation performance with multiple frameworks, extend 
 the {{HierarchicalAllocator_BENCHMARK_Test}} benchmark so it is parameterized 
 by the framework count as well as the slave count.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3209) parameterize allocator benchmark by framework count

2015-08-05 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-3209:
---
Shepherd: Benjamin Mahler

 parameterize allocator benchmark by framework count
 ---

 Key: MESOS-3209
 URL: https://issues.apache.org/jira/browse/MESOS-3209
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: James Peach
Assignee: James Peach
Priority: Minor

 In order to explore allocation performance with multiple frameworks, extend 
 the {{HierarchicalAllocator_BENCHMARK_Test}} benchmark so it is parameterized 
 by the framework count as well as the slave count.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3092) Configure Jenkins to run Docker tests

2015-08-05 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658586#comment-14658586
 ] 

Vinod Kone commented on MESOS-3092:
---

Yup. Happy to shepherd.

I was going to delete 
https://builds.apache.org/user/vinodkone/my-views/view/Mesos/job/Mesos-Docker-Tests/
 because I thought it was something I created for testing a while ago and 
forgot. Now, it looks like [~tnachen] created this? Tim, let's not point this 
job to builds@ until it has been baked and green for a while. 

 Configure Jenkins to run Docker tests
 -

 Key: MESOS-3092
 URL: https://issues.apache.org/jira/browse/MESOS-3092
 Project: Mesos
  Issue Type: Improvement
  Components: docker
Reporter: Timothy Chen
Assignee: Timothy Chen
  Labels: mesosphere

 Add a Jenkins job to run the Docker tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3202) Avoid frameworks starving in DRF allocator.

2015-08-05 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658595#comment-14658595
 ] 

Vinod Kone commented on MESOS-3202:
---

Do you want to close this as a duplicate of MESOS-1791 then?

 Avoid frameworks starving in DRF allocator.
 ---

 Key: MESOS-3202
 URL: https://issues.apache.org/jira/browse/MESOS-3202
 Project: Mesos
  Issue Type: Bug
Reporter: Joerg Schad

 We currently run into issues with the DRF allocator where frameworks do not 
 receive offers (see https://github.com/mesosphere/marathon/issues/1931 for 
 details). 
 Imagine that we have 10 frameworks and unallocated resources from a single 
 slave.
 The allocation interval is 1 sec, and refuse_seconds (i.e. the time for which a 
 declined resource is filtered) is 3 sec across all frameworks. 
 The allocator offers the resources to framework 1 (according to DRF), which 
 declines the offer immediately. 
 In the next allocation interval framework 1 is skipped because of its earlier 
 decline, so framework 2 is offered the resources, which it also declines.
 The same happens in the next allocation interval (with framework 3). 
 In the following allocation interval the refuse_seconds filter for framework 1 
 has expired, and since it still has the lowest DRF share it is offered the 
 resources again, which it again declines. And the cycle begins again. 
 Framework 4 (which is actually waiting for these resources) is never offered 
 them.
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2562) 0.24.0 release

2015-08-05 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2562:
--
Description: The main feature of this release is going to be the v1 (beta) 
release of the HTTP scheduler API (part of the MESOS-2288 epic).

 0.24.0 release
 --

 Key: MESOS-2562
 URL: https://issues.apache.org/jira/browse/MESOS-2562
 Project: Mesos
  Issue Type: Task
Reporter: Kapil Arya
Assignee: Vinod Kone

 The main feature of this release is going to be the v1 (beta) release of the HTTP 
 scheduler API (part of the MESOS-2288 epic).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3142) As a Developer I want a better way to run shell commands

2015-08-05 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658663#comment-14658663
 ] 

Marco Massenzio commented on MESOS-3142:


Waiting on this to be available, as it greatly simplifies the code for this 
functionality.

 As a Developer I want a better way to run shell commands
 

 Key: MESOS-3142
 URL: https://issues.apache.org/jira/browse/MESOS-3142
 Project: Mesos
  Issue Type: Story
  Components: stout
Affects Versions: 0.23.0
Reporter: Benjamin Hindman
Assignee: Marco Massenzio
  Labels: mesosphere, tech-debt

 When reviewing the code in [r/36425|https://reviews.apache.org/r/36425/] 
 [~benjaminhindman] noticed that there is a better abstraction that is 
 possible to introduce for {{os::shell()}} that will simplify the caller's 
 life.
 Instead of having to handle all possible outcomes, we propose to refactor 
 {{os::shell()}} as follows:
 {code}
 /**
  * Returns the output from running the specified command with the shell.
  */
 Try<std::string> shell(const string& command)
 {
   // Actually handle the WIFEXITED, WIFSIGNALED here!
 }
 {code}
 where the returned string is {{stdout}} and, should the program be signaled, 
 or exit with a non-zero exit code, we will simply return a {{Failure}} with 
 an error message that will encapsulate both the returned/signaled state, and, 
 possibly {{stderr}}.
 And some test driven development:
 {code}
 EXPECT_ERROR(os::shell("false"));
 EXPECT_SOME(os::shell("true"));
 EXPECT_SOME_EQ("hello world", os::shell("echo hello world"));
 {code}
 Alternatively, the caller can ask to have {{stderr}} conflated with 
 {{stdout}}:
 {code}
 Try<string> outAndErr = os::shell("myCmd --foo 2>&1");
 {code}
 However, {{stderr}} will be ignored by default:
 {code}
 // We don't read standard error by default.
 EXPECT_SOME_EQ("", os::shell("echo hello world 1>&2"));
 // We don't even read stderr if something fails (to return in Try::error).
 Try<string> output = os::shell("echo hello world 1>&2 && false");
 EXPECT_ERROR(output);
 EXPECT_FALSE(strings::contains(output.error(), "hello world"));
 {code}
 An analysis of existing usage shows that in almost all cases, the caller only 
 cares {{if not error}}; in fact, the actual exit code is read only once, and 
 even then, in a test case.
 We believe this will simplify the API to the caller, and will significantly 
 reduce the length and complexity at the calling sites (6 LOC against the 
 current 20+).
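For comparison, a call site under the proposed API would shrink to something like this (sketch; {{parse}} stands in for the caller's own logic):

{code}
Try<std::string> output = os::shell("docker ps -q");

if (output.isError()) {
  return Error("Failed to run command: " + output.error());
}

return parse(output.get());
{code}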



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-1013) ExamplesTest.JavaLog is flaky

2015-08-05 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-1013:
---
Assignee: Greg Mann

 ExamplesTest.JavaLog is flaky
 -

 Key: MESOS-1013
 URL: https://issues.apache.org/jira/browse/MESOS-1013
 Project: Mesos
  Issue Type: Bug
  Components: test
Affects Versions: 0.19.0
Reporter: Vinod Kone
Assignee: Greg Mann
  Labels: flaky

 [ RUN  ] ExamplesTest.JavaLog
 Using temporary directory '/tmp/ExamplesTest_JavaLog_WBWEb9'
 Feb 18, 2014 12:10:57 PM TestLog main
 INFO: Starting a local ZooKeeper server
 log4j:WARN No appenders could be found for logger 
 (org.apache.zookeeper.server.ZooKeeperServer).
 log4j:WARN Please initialize the log4j system properly.
 Feb 18, 2014 12:10:57 PM TestLog main
 INFO: Initializing log /tmp/mesos-epljTr/log1 with 
 /var/jenkins/workspace/mesos-fedora-19-clang/src/mesos-log
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I0218 12:10:58.107450 17404 process.cpp:1591] libprocess is initialized on 
 192.168.122.134:36627 for 8 cpus
 I0218 12:10:58.111640 17404 leveldb.cpp:166] Opened db in 3.145702ms
 I0218 12:10:58.113097 17404 leveldb.cpp:173] Compacted db in 770230ns
 I0218 12:10:58.113137 17404 leveldb.cpp:188] Created db iterator in 20506ns
 I0218 12:10:58.113152 17404 leveldb.cpp:194] Seeked to beginning of db in 
 12095ns
 I0218 12:10:58.113198 17404 leveldb.cpp:255] Iterated through 1 keys in the 
 db in 43127ns
 I0218 12:10:58.113248 17404 replica.cpp:732] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@712: Client 
 environment:zookeeper.version=zookeeper C client 3.4.5
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@716: Client 
 environment:host.name=fedora-19
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@723: Client 
 environment:os.name=Linux
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@724: Client 
 environment:os.arch=3.12.9-201.fc19.x86_64
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@725: Client 
 environment:os.version=#1 SMP Wed Jan 29 15:44:35 UTC 2014
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@733: Client 
 environment:user.name=jenkins
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@741: Client 
 environment:user.home=/home/jenkins
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@753: Client 
 environment:user.dir=/tmp/ExamplesTest_JavaLog_WBWEb9
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@zookeeper_init@786: 
 Initiating client connection, host=0:0:0:0:0:0:0:0:40410 sessionTimeout=3000 
 watcher=0x7f792228c440 sessionId=0 sessionPasswd=null context=0x13089c0 
 flags=0
 2014-02-18 12:10:58,117:17397(0x7f7921407700):ZOO_INFO@log_env@712: Client 
 environment:zookeeper.version=zookeeper C client 3.4.5
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@716: Client 
 environment:host.name=fedora-19
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@723: Client 
 environment:os.name=Linux
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@724: Client 
 environment:os.arch=3.12.9-201.fc19.x86_64
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@725: Client 
 environment:os.version=#1 SMP Wed Jan 29 15:44:35 UTC 2014
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@733: Client 
 environment:user.name=jenkins
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@741: Client 
 environment:user.home=/home/jenkins
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@753: Client 
 environment:user.dir=/tmp/ExamplesTest_JavaLog_WBWEb9
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@zookeeper_init@786: 
 Initiating client connection, host=0:0:0:0:0:0:0:0:40410 sessionTimeout=3000 
 watcher=0x7f792228c440 sessionId=0 sessionPasswd=null 
 context=0x7f7904000e40 flags=0
 I0218 12:10:58.119313 17452 log.cpp:222] Attempting to join replica to 
 ZooKeeper group
 I0218 12:10:58.119781 17452 recover.cpp:103] Start recovering a replica
 I0218 12:10:58.119881 17452 recover.cpp:139] Replica is in VOTING status
 I0218 12:10:58.119923 17452 recover.cpp:117] Recover process terminated
 Feb 18, 2014 12:10:58 PM TestLog main
 INFO: Initializing log /tmp/mesos-epljTr/log2 with 
 /var/jenkins/workspace/mesos-fedora-19-clang/src/mesos-log
 2014-02-18 12:10:58,126:17397(0x7f78fcff9700):ZOO_INFO@check_events@1703: 
 initiated connection to server [:::40410]
 2014-02-18 12:10:58,131:17397(0x7f78fdffb700):ZOO_INFO@check_events@1703: 
 initiated connection to server [:::40410]
 2014-02-18 12:10:58,165:17397(0x7f78fcff9700):ZOO_INFO@check_events@1750: 
 session establishment complete on server 

[jira] [Commented] (MESOS-1013) ExamplesTest.JavaLog is flaky

2015-08-05 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658683#comment-14658683
 ] 

Marco Massenzio commented on MESOS-1013:


[~greggomann] can you please review this one and see whether it fits with the 
other ones you're looking at?
It may well be that this no longer applies (it is from quite some time ago and I 
don't remember seeing the {{JavaLog}} example framework).

Maybe [~vinodkone] has more info.

Thanks!

 ExamplesTest.JavaLog is flaky
 -

 Key: MESOS-1013
 URL: https://issues.apache.org/jira/browse/MESOS-1013
 Project: Mesos
  Issue Type: Bug
  Components: test
Affects Versions: 0.19.0
Reporter: Vinod Kone
Assignee: Greg Mann
  Labels: flaky

 [ RUN  ] ExamplesTest.JavaLog
 Using temporary directory '/tmp/ExamplesTest_JavaLog_WBWEb9'
 Feb 18, 2014 12:10:57 PM TestLog main
 INFO: Starting a local ZooKeeper server
 log4j:WARN No appenders could be found for logger 
 (org.apache.zookeeper.server.ZooKeeperServer).
 log4j:WARN Please initialize the log4j system properly.
 Feb 18, 2014 12:10:57 PM TestLog main
 INFO: Initializing log /tmp/mesos-epljTr/log1 with 
 /var/jenkins/workspace/mesos-fedora-19-clang/src/mesos-log
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 I0218 12:10:58.107450 17404 process.cpp:1591] libprocess is initialized on 
 192.168.122.134:36627 for 8 cpus
 I0218 12:10:58.111640 17404 leveldb.cpp:166] Opened db in 3.145702ms
 I0218 12:10:58.113097 17404 leveldb.cpp:173] Compacted db in 770230ns
 I0218 12:10:58.113137 17404 leveldb.cpp:188] Created db iterator in 20506ns
 I0218 12:10:58.113152 17404 leveldb.cpp:194] Seeked to beginning of db in 
 12095ns
 I0218 12:10:58.113198 17404 leveldb.cpp:255] Iterated through 1 keys in the 
 db in 43127ns
 I0218 12:10:58.113248 17404 replica.cpp:732] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@712: Client 
 environment:zookeeper.version=zookeeper C client 3.4.5
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@716: Client 
 environment:host.name=fedora-19
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@723: Client 
 environment:os.name=Linux
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@724: Client 
 environment:os.arch=3.12.9-201.fc19.x86_64
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@725: Client 
 environment:os.version=#1 SMP Wed Jan 29 15:44:35 UTC 2014
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@733: Client 
 environment:user.name=jenkins
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@741: Client 
 environment:user.home=/home/jenkins
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@log_env@753: Client 
 environment:user.dir=/tmp/ExamplesTest_JavaLog_WBWEb9
 2014-02-18 12:10:58,115:17397(0x7f79152d9700):ZOO_INFO@zookeeper_init@786: 
 Initiating client connection, host=0:0:0:0:0:0:0:0:40410 sessionTimeout=3000 
 watcher=0x7f792228c440 sessionId=0 sessionPasswd=null context=0x13089c0 
 flags=0
 2014-02-18 12:10:58,117:17397(0x7f7921407700):ZOO_INFO@log_env@712: Client 
 environment:zookeeper.version=zookeeper C client 3.4.5
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@716: Client 
 environment:host.name=fedora-19
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@723: Client 
 environment:os.name=Linux
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@724: Client 
 environment:os.arch=3.12.9-201.fc19.x86_64
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@725: Client 
 environment:os.version=#1 SMP Wed Jan 29 15:44:35 UTC 2014
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@733: Client 
 environment:user.name=jenkins
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@741: Client 
 environment:user.home=/home/jenkins
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@log_env@753: Client 
 environment:user.dir=/tmp/ExamplesTest_JavaLog_WBWEb9
 2014-02-18 12:10:58,118:17397(0x7f7921407700):ZOO_INFO@zookeeper_init@786: 
 Initiating client connection, host=0:0:0:0:0:0:0:0:40410 sessionTimeout=3000 
 watcher=0x7f792228c440 sessionId=0 sessionPasswd=null 
 context=0x7f7904000e40 flags=0
 I0218 12:10:58.119313 17452 log.cpp:222] Attempting to join replica to 
 ZooKeeper group
 I0218 12:10:58.119781 17452 recover.cpp:103] Start recovering a replica
 I0218 12:10:58.119881 17452 recover.cpp:139] Replica is in VOTING status
 I0218 12:10:58.119923 17452 recover.cpp:117] Recover process terminated
 Feb 18, 2014 12:10:58 PM TestLog main
 INFO: Initializing log /tmp/mesos-epljTr/log2 with 
 /var/jenkins/workspace/mesos-fedora-19-clang/src/mesos-log
 2014-02-18 

[jira] [Commented] (MESOS-1201) Store IP addresses in host order

2015-08-05 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658688#comment-14658688
 ] 

Marco Massenzio commented on MESOS-1201:


[~jieyu]: We have now updated {{MasterInfo}} to use an {{Address}} field 
instead of the raw {{ip}} int field (it's still there, for compatibility 
purposes, as it's declared as {{required}}).

Internally, as mentioned, {{net::IP}} is consistent, so the need for this is 
greatly reduced.

What do you think: is this still necessary? Or could we close it as "won't 
fix"?

 Store IP addresses in host order
 

 Key: MESOS-1201
 URL: https://issues.apache.org/jira/browse/MESOS-1201
 Project: Mesos
  Issue Type: Bug
  Components: technical debt
Reporter: Jie Yu

 Currently, in our code base, we store IP addresses in network order (for 
 instance, in UPID). Ironically, we store ports in host order.
 This can cause some subtle bugs which will be very hard to debug. For 
 example, we store the IP in MasterInfo. Say the IP address is 01.02.03.04. Since 
 we don't convert it into host order in our code, on x86 (little endian) its 
 integer value will be 0x04030201. Now, we store it as a uint32 field in the 
 MasterInfo protobuf. Protobuf will convert all integers into little endian 
 format; since x86 is a little endian machine, no conversion will take place. As 
 a result, the value stored in the protobuf will be 0x04030201. Now, if a big 
 endian machine reads this protobuf, it will do the conversion. If it later 
 interprets the IP from this integer, it will interpret it as 04.03.02.01.
 So I plan to store all IP addresses in our code base in host order 
 (which is the common practice).
 We may have some compatibility issues as we store MasterInfo in ZooKeeper for 
 master detection and redirection. For example, what if the new code reads an 
 old MasterInfo? What would happen?
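
To make the byte-order pitfall described above concrete, here is a minimal standalone C++ sketch (not Mesos code) of converting with ntohl()/htonl() at the boundaries so the value stored in a protobuf uint32 field is machine independent:

{code}
#include <arpa/inet.h>  // htonl(), ntohl(), inet_ntop()

#include <cstdint>
#include <cstdio>

int main()
{
  // 1.2.3.4 as it arrives from the socket layer: network (big endian) order.
  const uint32_t ipNetworkOrder = htonl(0x01020304);

  // Storing ipNetworkOrder directly in a protobuf uint32 field bakes the
  // writer's byte interpretation into the message. Converting to host order
  // first yields the same logical value (0x01020304) on every architecture.
  const uint32_t ipHostOrder = ntohl(ipNetworkOrder);
  printf("host-order value: 0x%08x\n", ipHostOrder);

  // Convert back to network order only at the socket boundary.
  struct in_addr address;
  address.s_addr = htonl(ipHostOrder);

  char buffer[INET_ADDRSTRLEN];
  inet_ntop(AF_INET, &address, buffer, sizeof(buffer));
  printf("dotted quad: %s\n", buffer);  // 1.2.3.4
  return 0;
}
{code}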



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-426) Python-based frameworks use old API and are broken

2015-08-05 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-426:
--
Assignee: David Greenberg

 Python-based frameworks use old API and are broken
 --

 Key: MESOS-426
 URL: https://issues.apache.org/jira/browse/MESOS-426
 Project: Mesos
  Issue Type: Bug
  Components: framework, python api
Affects Versions: 0.9.0
Reporter: David Greenberg
Assignee: David Greenberg
 Attachments: mesos_changes.p1


 If you try to use mesos-submit or torque with Mesos 0.9.0+, you get 
 exceptions due to API mismatches in these frameworks' expectations of the 
 Python API.
 Steps to reproduce: try running mesos-submit mymaster echo hi, and note the 
 stack traces.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-426) Python-based frameworks use old API and are broken

2015-08-05 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658700#comment-14658700
 ] 

Marco Massenzio edited comment on MESOS-426 at 8/5/15 7:05 PM:
---

Looking at the review it would appear that it was committed by 
[~benjaminhindman] (almost two years ago?) so I'm closing this.

If this is mistaken, please feel free to re-open and update.


was (Author: marco-mesos):
Looking at the review it would appear that it was committed by 
[~benjaminhindman] (a year ago?) so I'm closing this.

If this is mistaken, please feel free to re-open and update.

 Python-based frameworks use old API and are broken
 --

 Key: MESOS-426
 URL: https://issues.apache.org/jira/browse/MESOS-426
 Project: Mesos
  Issue Type: Bug
  Components: framework, python api
Affects Versions: 0.9.0
Reporter: David Greenberg
Assignee: David Greenberg
 Attachments: mesos_changes.p1


 If you try to use mesos-submit or torque with Mesos 0.9.0+, you get 
 exceptions due to API mismatches in these frameworks' expectations of the 
 Python API.
 Steps to reproduce: try running mesos-submit mymaster echo hi, and note the 
 stack traces.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3069) Registry operations do not exist for manipulating maintenance schedules

2015-08-05 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658705#comment-14658705
 ] 

Joseph Wu commented on MESOS-3069:
--

Review: https://reviews.apache.org/r/37052/

 Registry operations do not exist for manipulating maintenance schedules
 ---

 Key: MESOS-3069
 URL: https://issues.apache.org/jira/browse/MESOS-3069
 Project: Mesos
  Issue Type: Task
  Components: master, replicated log
Reporter: Joseph Wu
Assignee: Joseph Wu
  Labels: mesosphere

 In order to modify the maintenance schedule in the replicated registry, we 
 will need Operations (src/master/registrar.hpp).
 The operations will likely correspond to the HTTP API:
 * UpdateMaintenance: Given a blob representing a maintenance schedule, write 
 the blob to the registry.  Possibly perform some verification on the blob.
 * UpdateSlaveMaintenanceStatus:  Given a set of machines and a status 
 (action), change the machines' status in the maintenance schedule.
 Possible test(s):
 * UpdateMaintenance:
 ** Add a schedule with 1 slave, 2+ slaves, and 0 slaves.
 ** Add multiple schedules (different intervals).
 ** Delete schedules (empty schedule).
 * UpdateSlaveMaintenanceStatus:
 ** Add schedule.
 ** Change a slave's status.
 ** Change a slave's status, given a slave that is not in the schedule (slave 
 should be added to the schedule).
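
To make the Operation idea above concrete, here is a hypothetical, self-contained C++ sketch using toy stand-ins for the registry types; the real Operation interface in src/master/registrar.hpp and the maintenance protobufs differ from this:

{code}
#include <iostream>
#include <string>
#include <vector>

// Toy stand-ins for the registry protobufs (assumptions for illustration).
struct Window { std::vector<std::string> machines; };
struct Schedule { std::vector<Window> windows; };
struct Registry { std::vector<Schedule> schedules; };

// Hypothetical UpdateMaintenance-style operation: verify the incoming blob,
// then overwrite the schedules stored in the registry.
class UpdateMaintenance
{
public:
  explicit UpdateMaintenance(const Schedule& schedule) : schedule_(schedule) {}

  // Returns true if the registry was mutated, false if the schedule was
  // rejected by verification.
  bool perform(Registry* registry) const
  {
    for (const Window& window : schedule_.windows) {
      if (window.machines.empty()) {
        std::cerr << "Rejecting schedule: empty maintenance window" << std::endl;
        return false;
      }
    }

    registry->schedules.clear();
    registry->schedules.push_back(schedule_);
    return true;
  }

private:
  Schedule schedule_;
};

int main()
{
  Registry registry;

  Schedule schedule;
  schedule.windows.push_back(Window{{"machine-1", "machine-2"}});

  UpdateMaintenance operation(schedule);
  std::cout << std::boolalpha << operation.perform(&registry) << std::endl;
  return 0;
}
{code}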



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3212) As a Java developer I want a simple way to obtain information about Master from ZooKeeper

2015-08-05 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3212:
---
Sprint:   (was: Mesosphere Sprint 16)

 As a Java developer I want a simple way to obtain information about Master 
 from ZooKeeper
 -

 Key: MESOS-3212
 URL: https://issues.apache.org/jira/browse/MESOS-3212
 Project: Mesos
  Issue Type: Story
Reporter: Marco Massenzio
Assignee: Marco Massenzio
  Labels: mesosphere

 With the new JSON {{MasterInfo}} published to ZK, we want to provide a simple 
 library class for Python developers to retrieve info about the masters and 
 the leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3212) As a Java developer I want a simple way to obtain information about Master from ZooKeeper

2015-08-05 Thread Marco Massenzio (JIRA)
Marco Massenzio created MESOS-3212:
--

 Summary: As a Java developer I want a simple way to obtain 
information about Master from ZooKeeper
 Key: MESOS-3212
 URL: https://issues.apache.org/jira/browse/MESOS-3212
 Project: Mesos
  Issue Type: Story
Reporter: Marco Massenzio
Assignee: Marco Massenzio


With the new JSON {{MasterInfo}} published to ZK, we want to provide a simple 
library class for Python developers to retrieve info about the masters and the 
leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3211) As a Python developer I want a simple way to obtain information about Master from ZooKeeper

2015-08-05 Thread Marco Massenzio (JIRA)
Marco Massenzio created MESOS-3211:
--

 Summary: As a Python developer I want a simple way to obtain 
information about Master from ZooKeeper
 Key: MESOS-3211
 URL: https://issues.apache.org/jira/browse/MESOS-3211
 Project: Mesos
  Issue Type: Story
Reporter: Marco Massenzio
Assignee: Marco Massenzio


With the new JSON {{MasterInfo}} published to ZK, we want to provide a simple 
library class for Python developers to retrieve info about the masters and the 
leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3212) As a Java developer I want a simple way to obtain information about Master from ZooKeeper

2015-08-05 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3212:
---
Description: With the new JSON {{MasterInfo}} published to ZK, we want to 
provide a simple library class for Java Framework developers to retrieve info 
about the masters and the leader.  (was: With the new JSON {{MasterInfo}} 
published to ZK, we want to provide a simple library class for Python 
developers to retrieve info about the masters and the leader.)

 As a Java developer I want a simple way to obtain information about Master 
 from ZooKeeper
 -

 Key: MESOS-3212
 URL: https://issues.apache.org/jira/browse/MESOS-3212
 Project: Mesos
  Issue Type: Story
Reporter: Marco Massenzio
Assignee: Marco Massenzio
  Labels: mesosphere

 With the new JSON {{MasterInfo}} published to ZK, we want to provide a simple 
 library class for Java Framework developers to retrieve info about the 
 masters and the leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3211) As a Python developer I want a simple way to obtain information about Master from ZooKeeper

2015-08-05 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658714#comment-14658714
 ] 

Vinod Kone commented on MESOS-3211:
---

Duplicate of MESOS-2912?

 As a Python developer I want a simple way to obtain information about Master 
 from ZooKeeper
 ---

 Key: MESOS-3211
 URL: https://issues.apache.org/jira/browse/MESOS-3211
 Project: Mesos
  Issue Type: Story
Reporter: Marco Massenzio
Assignee: Marco Massenzio
  Labels: mesosphere

 With the new JSON {{MasterInfo}} published to ZK, we want to provide a simple 
 library class for Python developers to retrieve info about the masters and 
 the leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3212) As a Java developer I want a simple way to obtain information about Master from ZooKeeper

2015-08-05 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658716#comment-14658716
 ] 

Vinod Kone commented on MESOS-3212:
---

Dup of MESOS-2298?

 As a Java developer I want a simple way to obtain information about Master 
 from ZooKeeper
 -

 Key: MESOS-3212
 URL: https://issues.apache.org/jira/browse/MESOS-3212
 Project: Mesos
  Issue Type: Story
Reporter: Marco Massenzio
Assignee: Marco Massenzio
  Labels: mesosphere

 With the new JSON {{MasterInfo}} published to ZK, we want to provide a simple 
 library class for Java Framework developers to retrieve info about the 
 masters and the leader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2834) Support different perf output formats

2015-08-05 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658723#comment-14658723
 ] 

Benjamin Mahler commented on MESOS-2834:


Cleanup of existing subprocess usage:

{noformat}
commit e52f43fe7d3d0606b111b8a9c17212caecbde05e
Author: Paul Brett pau...@twopensource.com
Date:   Wed Aug 5 11:47:08 2015 -0700

Cleanups to Subprocess usage in Linux perf sampling.

Review: https://reviews.apache.org/r/37045
{noformat}

 Support different perf output formats
 -

 Key: MESOS-2834
 URL: https://issues.apache.org/jira/browse/MESOS-2834
 Project: Mesos
  Issue Type: Improvement
  Components: isolation
Reporter: Ian Downes
Assignee: Paul Brett
  Labels: twitter

 The output format of perf changed in 3.14 (inserting an additional field) and 
 again in 4.1 (appending additional fields). See kernel commits:
 410136f5dd96b6013fe6d1011b523b1c247e1ccb
 d73515c03c6a2706e088094ff6095a3abefd398b
 Update the perf::parse() function to understand all these formats.
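
As a rough sketch of tolerating both layouts, the following standalone C++ snippet assumes (for illustration only) that the value is the first comma-separated field and that the event name moved from the second to the third field once the unit column was inserted; it is not the actual perf::parse() implementation:

{code}
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct Sample
{
  std::string event;
  double value;
};

// Split one CSV line produced by `perf stat -x,` into fields.
static std::vector<std::string> split(const std::string& line)
{
  std::vector<std::string> fields;
  std::stringstream stream(line);
  std::string field;
  while (std::getline(stream, field, ',')) {
    fields.push_back(field);
  }
  return fields;
}

// Tolerant parse: the value is taken from the first field; the event name is
// taken from the second field (old layout) or the third (layout with the
// inserted unit column). Any appended fields are ignored.
static bool parse(const std::string& line, Sample* sample)
{
  const std::vector<std::string> fields = split(line);
  if (fields.size() < 2) {
    return false;
  }

  sample->value = std::stod(fields[0]);
  sample->event = (fields.size() == 2) ? fields[1] : fields[2];
  return true;
}

int main()
{
  Sample sample;

  parse("123456,cycles", &sample);                 // older layout
  std::cout << sample.event << " = " << sample.value << std::endl;

  parse("123456,,cycles,1.000,100.00", &sample);   // newer layout
  std::cout << sample.event << " = " << sample.value << std::endl;
  return 0;
}
{code}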



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3201) Libev handle_async can deadlock with run_in_event_loop

2015-08-05 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658802#comment-14658802
 ] 

Joris Van Remoortere commented on MESOS-3201:
-

Indeed it is.
If you run {{valgrind --tool=helgrind --num-callers=60 ./tests 
--gtest_filter=IOTest.Read --gtest_repeat=10}}, 
you will likely find this (as well as other locking order violations) in the 
output:
{code}
==2083== Thread #10: lock order 0xC911BD8 before 0xC9110C0 violated
==2083==
==2083== Observed (incorrect) order is: acquisition of lock at 0xC9110C0
==2083==at 0x4C33596: pthread_mutex_lock (in 
/usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==2083==by 0x72F4C4: __gthread_mutex_lock(pthread_mutex_t*) 
(gthr-default.h:748)
==2083==by 0x753F74: std::mutex::lock() (mutex:135)
==2083==by 0x753F58: Synchronizedstd::mutex 
synchronizestd::mutex(std::mutex*)::{lambda(std::mutex*)#1}::operator()(std::mutex*)
 const (in /mesos/build/3rdparty/libprocess/tests)
==2083==by 0x753F37: Synchronizedstd::mutex 
synchronizestd::mutex(std::mutex*)::{lambda(std::mutex*)#1}::__invoke(std::mutex*)
 (synchronized.hpp:58)
==2083==by 0x753DA8: Synchronizedstd::mutex::Synchronized(std::mutex*, 
void (*)(std::mutex*), void (*)(std::mutex*)) (synchronized.hpp:35)
==2083==by 0x753C7B: Synchronizedstd::mutex 
synchronizestd::mutex(std::mutex*) (synchronized.hpp:56)
==2083==by 0x7E3C31: process::handle_async(ev_loop*, ev_async*, int) 
(libev.cpp:48)
==2083==by 0x827EF4: ev_invoke_pending (ev.c:2994)
==2083==by 0x828A72: ev_run (ev.c:3394)
==2083==by 0x7E41BA: ev_loop(ev_loop*, int) (ev.h:826)
==2083==by 0x7E4133: process::EventLoop::run() (libev.cpp:135)
==2083==by 0x7B31AE: void std::_Bind_simplevoid 
(*())()::_M_invoke(std::_Index_tuple) (in 
/mesos/build/3rdparty/libprocess/tests)
==2083==by 0x7B3184: std::_Bind_simplevoid (*())()::operator()() (in 
/mesos/build/3rdparty/libprocess/tests)
==2083==by 0x7B315B: std::thread::_Implstd::_Bind_simplevoid (*())() 
::_M_run() (in /mesos/build/3rdparty/libprocess/tests)
==2083==by 0x6371E2F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==2083==by 0x4C31FD6: ??? (in 
/usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==2083==by 0x4E456A9: start_thread (pthread_create.c:333)
==2083==by 0x68E2EEC: clone (clone.S:109)
==2083==
==2083==  followed by a later acquisition of lock at 0xC911BD8
==2083==at 0x4C33596: pthread_mutex_lock (in 
/usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==2083==by 0x52DA84: __gthread_mutex_lock(pthread_mutex_t*) 
(gthr-default.h:748)
==2083==by 0x52DA54: __gthread_recursive_mutex_lock(pthread_mutex_t*) 
(gthr-default.h:810)
==2083==by 0x5AFE04: std::recursive_mutex::lock() (mutex:176)
==2083==by 0x5AFDE8: Synchronizedstd::recursive_mutex 
synchronizestd::recursive_mutex(std::recursive_mutex*)::{lambda(std::recursive_mutex*)#1}::operator()(std::recursive_mutex*)
 const (in /mesos/build/3rdparty/libprocess/tests)
==2083==by 0x5AFDC7: Synchronizedstd::recursive_mutex 
synchronizestd::recursive_mutex(std::recursive_mutex*)::{lambda(std::recursive_mutex*)#1}::__invoke(std::recursive_mutex*)
 (synchronized.hpp:58)
==2083==by 0x5AFC2E: 
Synchronizedstd::recursive_mutex::Synchronized(std::recursive_mutex*, void 
(*)(std::recursive_mutex*), void (*)(std::recursive_mutex*)) 
(synchronized.hpp:35)
==2083==by 0x5AA70B: Synchronizedstd::recursive_mutex 
synchronizestd::recursive_mutex(std::recursive_mutex*) (synchronized.hpp:56)
==2083==by 0x75A52C: process::ProcessManager::use(process::UPID const) 
(process.cpp:2136)
==2083==by 0x7694D8: process::ProcessManager::terminate(process::UPID 
const, bool, process::ProcessBase*) (process.cpp:2613)
==2083==by 0x76BF0A: process::terminate(process::UPID const, bool) 
(process.cpp:3147)
==2083==by 0x72C98C: process::Latch::trigger() (latch.cpp:53)
==2083==by 0x41F394: 
process::internal::awaited(process::Ownedprocess::Latch) (future.hpp:1001)
==2083==by 0x48F903: void std::_Bindvoid 
(*(process::Ownedprocess::Latch))(process::Ownedprocess::Latch)::__callvoid,
 process::Futureunsigned long const, 
0ul(std::tupleprocess::Futureunsigned long const, 
std::_Index_tuple0ul) (functional:1263)
==2083==by 0x48F88C: void std::_Bindvoid 
(*(process::Ownedprocess::Latch))(process::Ownedprocess::Latch)::operator()process::Futureunsigned
 long const, void(process::Futureunsigned long const) (in 
/mesos/build/3rdparty/libprocess/tests)
==2083==by 0x48F841: std::_Function_handlervoid (process::Futureunsigned 
long const), std::_Bindvoid 
(*(process::Ownedprocess::Latch))(process::Ownedprocess::Latch) 
::_M_invoke(std::_Any_data const, process::Futureunsigned long const) 
(functional:2039)
==2083==by 0x7249D7: std::functionvoid (process::Futureunsigned long 
const)::operator()(process::Futureunsigned long const) 

[jira] [Updated] (MESOS-3166) Design doc for docker image registry client

2015-08-05 Thread Jojy Varghese (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jojy Varghese updated MESOS-3166:
-
Summary: Design doc for docker image registry client  (was: Design doc for 
docker image registry authenticator)

 Design doc for docker image registry client
 ---

 Key: MESOS-3166
 URL: https://issues.apache.org/jira/browse/MESOS-3166
 Project: Mesos
  Issue Type: Bug
  Components: containerization
 Environment: linux
Reporter: Jojy Varghese
Assignee: Jojy Varghese
  Labels: mesosphere

 Create design document for the docker registry Authenticator component so 
 that we have a baseline for the implementation. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3213) Design doc for docker registry token manager

2015-08-05 Thread Jojy Varghese (JIRA)
Jojy Varghese created MESOS-3213:


 Summary: Design doc for docker registry token manager
 Key: MESOS-3213
 URL: https://issues.apache.org/jira/browse/MESOS-3213
 Project: Mesos
  Issue Type: Task
  Components: containerization, docker
 Environment: linux
Reporter: Jojy Varghese


Create a design document describing the component and the interaction between 
the Docker Registry Client and a remote Docker Registry for token-based 
authorization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3213) Design doc for docker registry token manager

2015-08-05 Thread Jojy Varghese (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jojy Varghese updated MESOS-3213:
-
Sprint: Mesosphere Sprint 16

 Design doc for docker registry token manager
 

 Key: MESOS-3213
 URL: https://issues.apache.org/jira/browse/MESOS-3213
 Project: Mesos
  Issue Type: Task
  Components: containerization, docker
 Environment: linux
Reporter: Jojy Varghese

 Create a design document describing the component and the interaction between 
 the Docker Registry Client and a remote Docker Registry for token-based 
 authorization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-3205) No need to checkpoint container root filesystem path.

2015-08-05 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu reassigned MESOS-3205:
-

Assignee: Jie Yu

 No need to checkpoint container root filesystem path.
 -

 Key: MESOS-3205
 URL: https://issues.apache.org/jira/browse/MESOS-3205
 Project: Mesos
  Issue Type: Task
Reporter: Jie Yu
Assignee: Jie Yu

 Given the design discussed in 
 [MESOS-3004|https://issues.apache.org/jira/browse/MESOS-3004], one container 
 might have multiple provisioned root filesystems. Checkpointing only the root 
 filesystem for ContainerInfo::image does not make sense.
 Also, we realized that checkpointing the container root filesystem path is not 
 necessary because each provisioner should be able to destroy root filesystems 
 for a given container based on a canonical directory layout (e.g., 
 appc_rootfs_dir/container_id/xxx).
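
As a rough illustration of destroying root filesystems from a canonical layout alone (without checkpointed paths), here is a standalone C++ sketch; the directory shown is hypothetical, and real provisioners use their own layouts and the stout helpers:

{code}
#include <dirent.h>

#include <iostream>
#include <string>
#include <vector>

// List entries directly under `dir`; in this toy layout each entry name is a
// container ID whose provisioned root filesystems live below it.
static std::vector<std::string> listContainers(const std::string& dir)
{
  std::vector<std::string> containers;

  DIR* handle = opendir(dir.c_str());
  if (handle == nullptr) {
    return containers;
  }

  for (struct dirent* entry = readdir(handle);
       entry != nullptr;
       entry = readdir(handle)) {
    const std::string name = entry->d_name;
    if (name != "." && name != "..") {
      containers.push_back(name);
    }
  }

  closedir(handle);
  return containers;
}

int main()
{
  // Hypothetical provisioner root; not a path used by Mesos itself.
  const std::string root = "/var/lib/provisioner/rootfses";

  for (const std::string& id : listContainers(root)) {
    std::cout << "would destroy root filesystems under "
              << root << "/" << id << std::endl;
  }
  return 0;
}
{code}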



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3214) Replace boost foreach with range-based for

2015-08-05 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659336#comment-14659336
 ] 

Klaus Ma commented on MESOS-3214:
-

+1 for option 1; it enforces all contributors to use range-based for. Option 2 is 
safer for production, but foreach and range-based for may be mixed in the code 
without this background.

 Replace boost foreach with range-based for
 --

 Key: MESOS-3214
 URL: https://issues.apache.org/jira/browse/MESOS-3214
 Project: Mesos
  Issue Type: Task
  Components: stout
Reporter: Michael Park
  Labels: mesosphere

 It's desirable to replace the boost {{foreach}} macro with the C++11 
 range-based {{for}}. This will help avoid some of the pitfalls of boost 
 {{foreach}} such as dealing with types with commas in them, as well as 
 improving compiler diagnostics by avoiding the macro expansion.
 One way to accomplish this is to replace the existing {{foreach (const Elem& 
 elem, container)}} pattern with {{for (const Elem& elem : container)}}. We 
 could support {{foreachkey}} and {{foreachvalue}} semantics via adaptors 
 {{keys}} and {{values}} which would be used like this: {{for (const Key& key 
 : keys(container))}}, {{for (const Value& value : values(container))}}. This 
 leaves {{foreachpair}} which cannot be used with {{for}}. I think it would be 
 desirable to support {{foreachpair}} for cases where the implicit unpacking 
 is useful.
 Another approach is to keep {{foreach}}, {{foreachpair}}, {{foreachkey}} and 
 {{foreachvalue}}, but simply implement them based on range-based {{for}}. For 
 example, {{#define foreach(elem, container) for (elem : container)}}. While 
 the consistency in the names is desirable, the unnecessary indirection of the 
 macro definition is not.
 It's unclear to me which approach we would favor in Mesos, so please share 
 your thoughts and preferences.
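
As a rough illustration of the first approach (adaptors usable with the range-based {{for}}), here is a minimal self-contained C++ sketch; the {{KeysView}} type and {{keys()}} helper are hypothetical names rather than stout code, and a {{values()}} adaptor would be analogous:

{code}
#include <iostream>
#include <map>
#include <string>

// Minimal keys() adaptor usable with the range-based for; a values() adaptor
// would be analogous, returning it->second instead of it->first.
template <typename Map>
struct KeysView
{
  const Map& map;

  struct iterator
  {
    typename Map::const_iterator it;

    bool operator!=(const iterator& other) const { return it != other.it; }
    void operator++() { ++it; }
    const typename Map::key_type& operator*() const { return it->first; }
  };

  iterator begin() const { return iterator{map.begin()}; }
  iterator end() const { return iterator{map.end()}; }
};

template <typename Map>
KeysView<Map> keys(const Map& map)
{
  return KeysView<Map>{map};
}

int main()
{
  const std::map<std::string, double> resources = {{"cpus", 8}, {"mem", 4096}};

  // The proposed replacement for `foreachkey (const std::string& name, resources)`.
  for (const std::string& name : keys(resources)) {
    std::cout << name << std::endl;
  }
  return 0;
}
{code}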



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3214) Replace boost foreach with range-based for

2015-08-05 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-3214:

Description: 
Replace boost {{foreach}} macro with the C++11 range-based {{for}}. This will 
help avoid some of the pitfalls of boost {{foreach}} such as dealing with types 
with commas in them, as well as improving compiler diagnostics by avoiding the 
macro expansion.

The existing {{foreach (const Elem& elem, container)}} pattern can be replaced 
with {{for (const Elem& elem : container)}}.

{{foreachpair}}, {{foreachkey}} and {{foreachvalue}} will still be supported 
for cases where the implicit unpacking is useful.

The implementation of {{foreachpair}} can be simplified by using the 
range-based for within it; {{foreachkey}} and {{foreachvalue}} will be exactly as 
is, except they can use {{std::ignore}} instead of the hand-rolled version.

  was:
Replace boost {{foreach}} macro with the C++11 range-based {{for}}. This will 
help avoid some of the pitfalls of boost {{foreach}} such as dealing with types 
with commas in them, as well as improving compiler diagnostics by avoiding the 
macro expansion.

The existing {{foreach (const Elem& elem, container)}} can be replaced with 
{{for (const Elem& elem : container)}}.

{{foreachpair}}, {{foreachkey}} and {{foreachvalue}} will still be supported 
for cases where the implicit unpacking is useful.

The implementation of {{foreachpair}} can be simplified by using the 
range-based for within it; {{foreachkey}} and {{foreachvalue}} will be exactly as 
is, except they can use {{std::ignore}} instead of the hand-rolled version.


 Replace boost foreach with range-based for
 --

 Key: MESOS-3214
 URL: https://issues.apache.org/jira/browse/MESOS-3214
 Project: Mesos
  Issue Type: Task
  Components: stout
Reporter: Michael Park
  Labels: mesosphere

 Replace boost {{foreach}} macro with the C++11 range-based {{for}}. This will 
 help avoid some of the pitfalls of boost {{foreach}} such as dealing with 
 types with commas in them, as well as improving compiler diagnostics by 
 avoiding the macro expansion.
 The existing {{foreach (const Elem& elem, container)}} pattern can be 
 replaced with {{for (const Elem& elem : container)}}.
 {{foreachpair}}, {{foreachkey}} and {{foreachvalue}} will still be supported 
 for cases where the implicit unpacking is useful.
 The implementation of {{foreachpair}} can be simplified by using the 
 range-based for within it; {{foreachkey}} and {{foreachvalue}} will be exactly 
 as is, except they can use {{std::ignore}} instead of the hand-rolled version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3215) CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04

2015-08-05 Thread Artem Harutyunyan (JIRA)
Artem Harutyunyan created MESOS-3215:


 Summary: CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 
14.04
 Key: MESOS-3215
 URL: https://issues.apache.org/jira/browse/MESOS-3215
 Project: Mesos
  Issue Type: Bug
Reporter: Artem Harutyunyan


[ RUN  ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
../../src/tests/containerizer/cgroups_tests.cpp:172: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup 
'/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
../../src/tests/containerizer/cgroups_tests.cpp:190: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup 
'/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
[  FAILED  ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf (9 ms)
[--] 1 test from CgroupsAnyHierarchyWithPerfEventTest (9 ms total)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2402) MesosContainerizerDestroyTest.LauncherDestroyFailure is flaky

2015-08-05 Thread billow (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659529#comment-14659529
 ] 

billow commented on MESOS-2402:
---

I encountered this problem two days ago.

 MesosContainerizerDestroyTest.LauncherDestroyFailure is flaky
 -

 Key: MESOS-2402
 URL: https://issues.apache.org/jira/browse/MESOS-2402
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Vinod Kone
Assignee: Vinod Kone
 Fix For: 0.23.0


 Failed to os::execvpe in childMain. Never seen this one before.
 {code}
 [ RUN  ] MesosContainerizerDestroyTest.LauncherDestroyFailure
 Using temporary directory 
 '/tmp/MesosContainerizerDestroyTest_LauncherDestroyFailure_QpjQEn'
 I0224 18:55:49.326912 21391 containerizer.cpp:461] Starting container 
 'test_container' for executor 'executor' of framework ''
 I0224 18:55:49.332252 21391 launcher.cpp:130] Forked child with pid '23496' 
 for container 'test_container'
 ABORT: (src/subprocess.cpp:165): Failed to os::execvpe in childMain
 *** Aborted at 1424832949 (unix time) try date -d @1424832949 if you are 
 using GNU date ***
 PC: @ 0x2b178c5db0d5 (unknown)
 I0224 18:55:49.340955 21392 process.cpp:2117] Dropped / Lost event for PID: 
 scheduler-509d37ac-296f-4429-b101-af433c1800e9@127.0.1.1:39647
 I0224 18:55:49.342300 21386 containerizer.cpp:911] Destroying container 
 'test_container'
 *** SIGABRT (@0x3e85bc8) received by PID 23496 (TID 0x2b178f9f0700) from 
 PID 23496; stack trace: ***
 @ 0x2b178c397cb0 (unknown)
 @ 0x2b178c5db0d5 (unknown)
 @ 0x2b178c5de83b (unknown)
 @   0x87a945 _Abort()
 @ 0x2b1789f610b9 process::childMain()
 I0224 18:55:49.391793 21386 containerizer.cpp:1120] Executor for container 
 'test_container' has exited
 I0224 18:55:49.400478 21391 process.cpp:2770] Handling HTTP event for process 
 'metrics' with path: '/metrics/snapshot'
 tests/containerizer_tests.cpp:485: Failure
 Value of: metrics.values[containerizer/mesos/container_destroy_errors]
   Actual: 16-byte object 02-00 00-00 17-2B 00-00 E0-86 0E-04 00-00 00-00
 Expected: 1u
 Which is: 1
 [  FAILED  ] MesosContainerizerDestroyTest.LauncherDestroyFailure (89 ms)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3214) Replace boost foreach with range-based for

2015-08-05 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-3214:

Description: 
It's desirable to replace the boost {{foreach}} macro with the C++11 
range-based {{for}}. This will help avoid some of the pitfalls of boost 
{{foreach}} such as dealing with types with commas in them, as well as 
improving compiler diagnostics by avoiding the macro expansion.

One way to accomplish this is to replace the existing {{foreach (const Elem& 
elem, container)}} pattern with {{for (const Elem& elem : container)}}. We 
could support {{foreachkey}} and {{foreachvalue}} semantics via adaptors 
{{keys}} and {{values}} which would be used like this: {{for (const Key& key : 
keys(container))}}, {{for (const Value& value : values(container))}}. This 
leaves {{foreachpair}} which cannot be used with {{for}}. I think it would be 
desirable to support {{foreachpair}} for cases where the implicit unpacking is 
useful.

Another approach is to keep {{foreach}}, {{foreachpair}}, {{foreachkey}} and 
{{foreachvalue}}, but simply implement them based on range-based {{for}}. For 
example, {{#define foreach(elem, container) for (elem : container)}}. While the 
consistency in the names is desirable, the unnecessary indirection of the macro 
definition is not.

It's unclear to me which approach we would favor in Mesos, so please share your 
thoughts and preferences.

  was:
Replace boost {{foreach}} macro with the C++11 range-based {{for}}. This will 
help avoid some of the pitfalls of boost {{foreach}} such as dealing with types 
with commas in them, as well as improving compiler diagnostics by avoiding the 
macro expansion.

The existing {{foreach (const Elem& elem, container)}} pattern can be replaced 
with {{for (const Elem& elem : container)}}.

{{foreachpair}}, {{foreachkey}} and {{foreachvalue}} will still be supported 
for cases where the implicit unpacking is useful.

The implementation of {{foreachpair}} can be simplified by using the 
range-based for within it; {{foreachkey}} and {{foreachvalue}} will be exactly as 
is, except they can use {{std::ignore}} instead of the hand-rolled version.


 Replace boost foreach with range-based for
 --

 Key: MESOS-3214
 URL: https://issues.apache.org/jira/browse/MESOS-3214
 Project: Mesos
  Issue Type: Task
  Components: stout
Reporter: Michael Park
  Labels: mesosphere

 It's desirable to replace the boost {{foreach}} macro with the C++11 
 range-based {{for}}. This will help avoid some of the pitfalls of boost 
 {{foreach}} such as dealing with types with commas in them, as well as 
 improving compiler diagnostics by avoiding the macro expansion.
 One way to accomplish this is to replace the existing {{foreach (const Elem& 
 elem, container)}} pattern with {{for (const Elem& elem : container)}}. We 
 could support {{foreachkey}} and {{foreachvalue}} semantics via adaptors 
 {{keys}} and {{values}} which would be used like this: {{for (const Key& key 
 : keys(container))}}, {{for (const Value& value : values(container))}}. This 
 leaves {{foreachpair}} which cannot be used with {{for}}. I think it would be 
 desirable to support {{foreachpair}} for cases where the implicit unpacking 
 is useful.
 Another approach is to keep {{foreach}}, {{foreachpair}}, {{foreachkey}} and 
 {{foreachvalue}}, but simply implement them based on range-based {{for}}. For 
 example, {{#define foreach(elem, container) for (elem : container)}}. While 
 the consistency in the names is desirable, the unnecessary indirection of the 
 macro definition is not.
 It's unclear to me which approach we would favor in Mesos, so please share 
 your thoughts and preferences.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3214) Replace boost foreach with range-based for

2015-08-05 Thread Michael Park (JIRA)
Michael Park created MESOS-3214:
---

 Summary: Replace boost foreach with range-based for
 Key: MESOS-3214
 URL: https://issues.apache.org/jira/browse/MESOS-3214
 Project: Mesos
  Issue Type: Task
  Components: stout
Reporter: Michael Park


Replace boost {{foreach}} macro with the C++11 range-based {{for}}. This will 
help avoid some of the pitfalls of boost {{foreach}} such as dealing with types 
with commas in them, as well as improving compiler diagnostics by avoiding the 
macro expansion.

The existing {{foreach (const Elem& elem, container)}} can be replaced with 
{{for (const Elem& elem : container)}}.

{{foreachpair}}, {{foreachkey}} and {{foreachvalue}} will still be supported 
for cases where the implicit unpacking is useful.

The implementation of {{foreachpair}} can be simplified by using the 
range-based for within it; {{foreachkey}} and {{foreachvalue}} will be exactly as 
is, except they can use {{std::ignore}} instead of the hand-rolled version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3214) Replace boost foreach with range-based for

2015-08-05 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659340#comment-14659340
 ] 

Benjamin Mahler commented on MESOS-3214:


As an iterative first step, how about doing your second suggestion in order to 
remove the boost header dependency? FWICT foreach.hpp is a pretty expensive 
header for compilation?

 Replace boost foreach with range-based for
 --

 Key: MESOS-3214
 URL: https://issues.apache.org/jira/browse/MESOS-3214
 Project: Mesos
  Issue Type: Task
  Components: stout
Reporter: Michael Park
  Labels: mesosphere

 It's desirable to replace the boost {{foreach}} macro with the C++11 
 range-based {{for}}. This will help avoid some of the pitfalls of boost 
 {{foreach}} such as dealing with types with commas in them, as well as 
 improving compiler diagnostics by avoiding the macro expansion.
 One way to accomplish this is to replace the existing {{foreach (const Elem& 
 elem, container)}} pattern with {{for (const Elem& elem : container)}}. We 
 could support {{foreachkey}} and {{foreachvalue}} semantics via adaptors 
 {{keys}} and {{values}} which would be used like this: {{for (const Key& key 
 : keys(container))}}, {{for (const Value& value : values(container))}}. This 
 leaves {{foreachpair}} which cannot be used with {{for}}. I think it would be 
 desirable to support {{foreachpair}} for cases where the implicit unpacking 
 is useful.
 Another approach is to keep {{foreach}}, {{foreachpair}}, {{foreachkey}} and 
 {{foreachvalue}}, but simply implement them based on range-based {{for}}. For 
 example, {{#define foreach(elem, container) for (elem : container)}}. While 
 the consistency in the names is desirable, the unnecessary indirection of the 
 macro definition is not.
 It's unclear to me which approach we would favor in Mesos, so please share 
 your thoughts and preferences.
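
As a rough illustration of the second approach (the one suggested above as an iterative first step), here is a minimal self-contained C++ sketch; it is not the actual stout change, just the macro idea:

{code}
#include <iostream>
#include <vector>

// Keep the existing call-site syntax, but expand to a C++11 range-based for,
// dropping the <boost/foreach.hpp> dependency. The macro's two arguments are
// separated by the comma that call sites already use.
#define foreach(elem, container) for (elem : container)

int main()
{
  const std::vector<int> numbers = {1, 2, 3};

  // Expands to: for (const int& n : numbers) { ... }
  foreach (const int& n, numbers) {
    std::cout << n << std::endl;
  }
  return 0;
}
{code}

Note that element types containing commas (e.g. {{std::pair<A, B>}}) would still confuse the macro's argument parsing, which is one of the pitfalls the description mentions and part of what makes dropping the macro entirely attractive.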



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-1535) Pyspark on Mesos scheduler error

2015-08-05 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659221#comment-14659221
 ] 

Timothy Chen commented on MESOS-1535:
-

kill task is now supported in the Mesos scheduler.

 Pyspark on Mesos scheduler error
 

 Key: MESOS-1535
 URL: https://issues.apache.org/jira/browse/MESOS-1535
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.18.0, 0.18.1
 Environment: Running Mesos on a cluster of CentOS 6.5 machines. 180 
 GB memory.
Reporter: Ajay Viswanathan
  Labels: pyspark

 This is an error that I get while running fine-grained PySpark on the mesos 
 cluster. This comes after running some 200-1000 tasks generally.
 Pyspark code:
 while True:
 sc.parallelize(range(10)).map(lambda n : n*2).collect()
 Error log:
 (In console)
 ERROR DAGSchedulerActorSupervisor: eventProcesserActor failed due to the 
 error EOF reached before Python server acknowledged; shutting down 
 SparkContext
 Traceback (most recent call last):
   File stdin, line 2, in module
   File .../spark-1.0.0/python/pyspark/rdd.py, line 583, in collect
 bytesInJava = self._jrdd.collect().iterator()
   File .../spark-1.0.0/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py, 
 line 537,
   File .../spark-1.0.0/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py, line 
 300, in
 py4j.protocol.Py4JJavaError: An error occurred while calling o847.collect.
 org.apache.spark.SparkException: Job 75 cancelled as part of cancellation of 
 all jobs
 at 
 org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$fail
 at 
 org.apache.spark.scheduler.DAGScheduler.handleJobCancellation(DAGScheduler.scala:998)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply$mcVI$sp(DAGS
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply(DAGScheduler
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply(DAGScheduler
 at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
 at 
 org.apache.spark.scheduler.DAGScheduler.doCancelAllJobs(DAGScheduler.scala:499)
 at 
 org.apache.spark.scheduler.DAGSchedulerActorSupervisor$$anonfun$2.applyOrElse(DAGSche
 at 
 org.apache.spark.scheduler.DAGSchedulerActorSupervisor$$anonfun$2.applyOrElse(DAGSche
 at 
 akka.actor.SupervisorStrategy.handleFailure(FaultHandling.scala:295)
 at 
 akka.actor.dungeon.FaultHandling$class.handleFailure(FaultHandling.scala:253)
 at akka.actor.ActorCell.handleFailure(ActorCell.scala:338)
 at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:423)
 at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447)
 at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262)
 at akka.dispatch.Mailbox.run(Mailbox.scala:218)
 at 
 akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.s
 at 
 scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
 at 
 scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
 at 
 scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
 at 
 scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
  14/06/24 02:58:19 ERROR OneForOneStrategy:
 java.lang.UnsupportedOperationException
 at 
 org.apache.spark.scheduler.SchedulerBackend$class.killTask(SchedulerBackend.scala:32)
 at 
 org.apache.spark.scheduler.cluster.mesos.MesosSchedulerBackend.killTask(MesosSchedule
 at 
 org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3$$anonfun$apply$1.
 at 
 org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3$$anonfun$apply$1.
 at 
 org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3$$anonfun$apply$1.
 at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
 at 
 org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3.apply(TaskSchedul
 at 
 org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$cancelTasks$3.apply(TaskSchedul
 at scala.Option.foreach(Option.scala:236)
 at 
 org.apache.spark.scheduler.TaskSchedulerImpl.cancelTasks(TaskSchedulerImpl.scala:176)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGSchedu
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGSchedu
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGSchedu
 at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
 at 
 org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$fail
 at