date:20150831

[jira] [Commented] (MESOS-3351) nextSlaveId in master was not updated when recover

2015-08-31 Thread Klaus Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724868#comment-14724868
 ] 

Klaus Ma commented on MESOS-3351:
-

[~vinodkone], I added you as shepherd because this is also a duplicated-ID issue.

> nextSlaveId in master was not updated when recover
> --
>
> Key: MESOS-3351
> URL: https://issues.apache.org/jira/browse/MESOS-3351
> Project: Mesos
>  Issue Type: Bug
>  Components: master
> Environment: Mac OS (Darwin da-macbookair.cn.ibm.com 14.5.0 Darwin 
> Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015; 
> root:xnu-2782.40.9~1/RELEASE_X86_64 x86_64)
>Reporter: Klaus Ma
>Assignee: Klaus Ma
>  Labels: race-condition, uuid
> Attachments: test.log
>
>
> When a slave registers with the master, the master generates a slave ID for 
> it as slaveInfo.id + "-S" + nextSlaveId (in master.cpp) to avoid duplicate 
> slaveInfo.id values. But after a master failover, nextSlaveId is reset to 0, 
> which may produce duplicate slave IDs between an old slave and a new slave.
> So far this has been reproduced only on Mac OS, and only intermittently; it 
> could NOT be reproduced on Ubuntu, and other OSes have not been checked.
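
The failure mode described in the report can be sketched roughly as follows. 
This is a minimal illustration, not the code in master.cpp: ToyMaster, prefix 
and recover() are hypothetical names, the sketch assumes the ID prefix stays 
the same across the failover as the report describes, and restoring the 
counter from recovered IDs is only one conceivable remedy, not necessarily 
the fix that was committed.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the master; not the real Mesos class.
struct ToyMaster {
  std::string prefix;    // plays the role of the part of the ID before "-S"
  long nextSlaveId = 0;  // starts at 0 in every freshly started master

  // Builds an ID of the form <prefix>-S<counter> and bumps the counter.
  std::string newSlaveId() {
    assert(!prefix.empty());
    return prefix + "-S" + std::to_string(nextSlaveId++);
  }

  // Assumed remedy: after a failover, move the counter past every
  // recovered ID that carries our prefix and a purely numeric suffix.
  void recover(const std::vector<std::string>& knownIds) {
    const std::string head = prefix + "-S";
    for (const std::string& id : knownIds) {
      if (id.compare(0, head.size(), head) != 0) continue;
      const std::string tail = id.substr(head.size());
      const bool numeric = !tail.empty() &&
          std::all_of(tail.begin(), tail.end(),
                      [](unsigned char c) { return std::isdigit(c) != 0; });
      if (!numeric) continue;
      nextSlaveId = std::max(nextSlaveId, std::stol(tail) + 1);
    }
  }
};
```

Without the recover() step, two ToyMaster instances with the same prefix both 
hand out "<prefix>-S0", which mirrors the duplicated ID reported here.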



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3351) nextSlaveId in master was not updated when recover

2015-08-31 Thread Klaus Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-3351:

Shepherd: Vinod Kone


[jira] [Updated] (MESOS-3351) nextSlaveId in master was not updated when recover

2015-08-31 Thread Klaus Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-3351:

Attachment: test.log

Attached the log of the UT cases. In the log, the two slaves are using the same 
slave ID, and the newly started slave was rejected by the master.

{code}
da-macbookair:build dma$ grep "Registering slave at" test.log 
I0901 13:44:40.462039 430882816 master.cpp:3670] Registering slave at 
slave(1)@9.181.90.57:49795 (da-macbookair.cn.ibm.com) with id 
20150901-134440-962245897-49795-59127-S0
I0901 13:44:40.660033 433565696 master.cpp:3670] Registering slave at 
slave(2)@9.181.90.57:49795 (da-macbookair.cn.ibm.com) with id 
20150901-134440-962245897-49795-59127-S0
{code}


[jira] [Commented] (MESOS-3070) Master CHECK failure if a framework uses duplicated task id.

2015-08-31 Thread Klaus Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724861#comment-14724861
 ] 

Klaus Ma commented on MESOS-3070:
-

Until MESOS-3351 is fixed, we cannot get new resources by starting a new 
slave, so we cannot trigger the duplicated task ID issue.

> Master CHECK failure if a framework uses duplicated task id.
> 
>
> Key: MESOS-3070
> URL: https://issues.apache.org/jira/browse/MESOS-3070
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.22.1
>Reporter: Jie Yu
>Assignee: Klaus Ma
>
> We observed this in one of our testing clusters.
> One framework (under development) keeps launching tasks using the same 
> task_id. We don't expect the master to crash even if the framework is not 
> doing what it's supposed to do. However, the following series of events can 
> crash the master, and it then keeps crashing.
> 1) frameworkA launches task 'task_id_1' on slaveA
> 2) master fails over
> 3) slaveA has not re-registered yet
> 4) frameworkA re-registers and launches task 'task_id_1' on slaveB
> 5) slaveA re-registers and adds task 'task_id_1' to frameworkA
> 6) CHECK failure in addTask
> {noformat}
> I0716 21:52:50.759305 28805 master.hpp:159] Adding task 'task_id_1' with 
> resources cpus(*):4; mem(*):32768 on slave 
> 20150417-232509-1735470090-5050-48870-S25 (hostname)
> ...
> ...
> F0716 21:52:50.760136 28805 master.hpp:362] Check failed: 
> !tasks.contains(task->task_id()) Duplicate task 'task_id_1' of framework 
> 
> {noformat}
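
The sequence above boils down to inserting the same task ID twice into one 
framework's bookkeeping. A minimal sketch of handling that without aborting 
is shown below; ToyFramework and its addTask() are hypothetical simplified 
names, not the types in master.hpp, and returning a flag is just one 
possible way to surface the duplicate to the caller.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified bookkeeping; not the real master.hpp types.
struct ToyFramework {
  // Maps a task ID to the slave the task was launched on.
  std::map<std::string, std::string> tasks;

  // Returns false for a duplicate task ID so the caller can decide what
  // to do, instead of terminating the process on a failed check.
  // The existing entry is left untouched in that case.
  bool addTask(const std::string& taskId, const std::string& slaveId) {
    return tasks.emplace(taskId, slaveId).second;
  }
};
```

In the scenario from the description, step 4 would succeed and step 5 would 
return false, leaving the caller free to reject or log the re-registered task.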




[jira] [Created] (MESOS-3351) nextSlaveId in master was not updated when recover

2015-08-31 Thread Klaus Ma (JIRA)

Klaus Ma created MESOS-3351:
---

 Summary: nextSlaveId in master was not updated when recover
 Key: MESOS-3351
 URL: https://issues.apache.org/jira/browse/MESOS-3351
 Project: Mesos
  Issue Type: Bug
  Components: master
 Environment: Mac OS (Darwin da-macbookair.cn.ibm.com 14.5.0 Darwin 
Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015; 
root:xnu-2782.40.9~1/RELEASE_X86_64 x86_64)
Reporter: Klaus Ma
Assignee: Klaus Ma


When a slave registers with the master, the master generates a slave ID for 
it as slaveInfo.id + "-S" + nextSlaveId (in master.cpp) to avoid duplicate 
slaveInfo.id values. But after a master failover, nextSlaveId is reset to 0, 
which may produce duplicate slave IDs between an old slave and a new slave.

So far this has been reproduced only on Mac OS, and only intermittently; it 
could NOT be reproduced on Ubuntu, and other OSes have not been checked.




[jira] [Updated] (MESOS-3350) Create a protobuf VersionInfo to store mesos version information

2015-08-31 Thread haosdent (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-3350:

Component/s: (was: technical debt)

> Create a protobuf VersionInfo to store mesos version information
> 
>
> Key: MESOS-3350
> URL: https://issues.apache.org/jira/browse/MESOS-3350
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: Marco Massenzio
>  Labels: tech-debt
>
> Currently we use a string to store the mesos version in protobuf. It would be 
> better to create a protobuf message named VersionInfo, like:
> {code}
> message VersionInfo {
>  optional string git_sha = 1;
>  optional string build_user = 2;
>  x
> }
> {code}
> This message could then be used everywhere (to expose information through an 
> HTTP endpoint, or to replace the version string in MasterInfo).




[jira] [Created] (MESOS-3350) Create a protobuf VersionInfo to store mesos version information

2015-08-31 Thread haosdent (JIRA)

haosdent created MESOS-3350:
---

 Summary: Create a protobuf VersionInfo to store mesos version 
information
 Key: MESOS-3350
 URL: https://issues.apache.org/jira/browse/MESOS-3350
 Project: Mesos
  Issue Type: Improvement
  Components: technical debt
Reporter: haosdent
Assignee: Marco Massenzio


Currently we use a string to store the mesos version in protobuf. It would be 
better to create a protobuf message named VersionInfo, like:

{code}
message VersionInfo {
 optional string git_sha = 1;
 optional string build_user = 2;
 x
}
{code}

This message could then be used everywhere (to expose information through an 
HTTP endpoint, or to replace the version string in MasterInfo).




[jira] [Commented] (MESOS-3187) Docker cli option support

2015-08-31 Thread Timothy Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724838#comment-14724838
 ] 

Timothy Chen commented on MESOS-3187:
-

commit 9d8829182e2b40bafc9aaa58e96ff3ac637d0b2c
Author: Vaibhav Khanduja 
Date:   Tue Sep 1 00:08:23 2015 +

Added support for custom docker host.

Review: https://reviews.apache.org/r/37114

> Docker cli option support
> -
>
> Key: MESOS-3187
> URL: https://issues.apache.org/jira/browse/MESOS-3187
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker, slave
>Reporter: Vaibhav Khanduja
>Assignee: Vaibhav Khanduja
>Priority: Minor
> Fix For: 0.25.0
>
>
> The Mesos slave today supports docker as a container environment. The docker 
> cli supports many more options than the mesos slave does. The slave command 
> line options should be enhanced to support such parameters.




[jira] [Updated] (MESOS-3187) Docker cli option support

2015-08-31 Thread Timothy Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-3187:

Shepherd: Timothy Chen


[jira] [Updated] (MESOS-3187) Docker cli option support

2015-08-31 Thread Timothy Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-3187:

Fix Version/s: 0.25.0


[jira] [Assigned] (MESOS-2224) Add explanatory comments for Allocator interface

2015-08-31 Thread Guangya Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu reassigned MESOS-2224:
--

Assignee: Guangya Liu

> Add explanatory comments for Allocator interface
> 
>
> Key: MESOS-2224
> URL: https://issues.apache.org/jira/browse/MESOS-2224
> Project: Mesos
>  Issue Type: Task
>  Components: allocation
>Reporter: Alexander Rukletsov
>Assignee: Guangya Liu
>Priority: Minor
>
> The Allocator is a public API, and it would be great to have comments on all 
> calls to be implemented.




[jira] [Updated] (MESOS-2224) Add explanatory comments for Allocator interface

2015-08-31 Thread Guangya Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu updated MESOS-2224:
---
Shepherd: Alexander Rukletsov


[jira] [Commented] (MESOS-3147) Allocator refactor

2015-08-31 Thread Guangya Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724727#comment-14724727
 ] 

Guangya Liu commented on MESOS-3147:


Thanks [~mcypark], I have added [~alex-mesos] as shepherd.

> Allocator refactor
> --
>
> Key: MESOS-3147
> URL: https://issues.apache.org/jira/browse/MESOS-3147
> Project: Mesos
>  Issue Type: Task
>  Components: allocation, master
>Reporter: Michael Park
>Assignee: Guangya Liu
>  Labels: mesosphere, tech-debt
>
> With new features such as dynamic reservation, persistent volume, quota, 
> optimistic offers, it has become apparent that we need to refactor the 
> allocator to
> 1. solidify the API (e.g. consolidate {{updateSlave}} and {{updateAvailable}})
> 2. possibly move the offer generation to the allocator from the master
> 3. support allocator modules where the API involves returning 
> {{libprocess::Future}}
> The sequence of implementation challenges for dynamic reservation master 
> endpoints is captured in [this 
> document|https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit?usp=sharing].




[jira] [Updated] (MESOS-3147) Allocator refactor

2015-08-31 Thread Guangya Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu updated MESOS-3147:
---
Shepherd: Alexander Rukletsov


[jira] [Commented] (MESOS-3147) Allocator refactor

2015-08-31 Thread Michael Park (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724713#comment-14724713
 ] 

Michael Park commented on MESOS-3147:
-

[~gyliu] This is actually quite a hefty ticket. I'm not sure who will be able 
to help you shepherd this currently. Perhaps you could work with [~alexr] on 
this?


[jira] [Assigned] (MESOS-3147) Allocator refactor

2015-08-31 Thread Guangya Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu reassigned MESOS-3147:
--

Assignee: Guangya Liu


[jira] [Assigned] (MESOS-1831) Master should send PingSlaveMessage instead of "PING"

2015-08-31 Thread Yong Qiao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-1831:
-

Assignee: Yong Qiao Wang

> Master should send PingSlaveMessage instead of "PING"
> -
>
> Key: MESOS-1831
> URL: https://issues.apache.org/jira/browse/MESOS-1831
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> In 0.21.0 master sends "PING" message with an embedded PingSlaveMessage for 
> backwards compatibility (https://reviews.apache.org/r/25867/).
> In 0.22.0, master should send PingSlaveMessage directly instead of "PING".




[jira] [Assigned] (MESOS-1832) Slave should accept PingSlaveMessage but not "PING" message.

2015-08-31 Thread Yong Qiao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-1832:
-

Assignee: Yong Qiao Wang

> Slave should accept PingSlaveMessage but not "PING" message.
> 
>
> Key: MESOS-1832
> URL: https://issues.apache.org/jira/browse/MESOS-1832
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>  Labels: mesosphere
>
> The slave handles both the "PING" message and PingSlaveMessage until 0.22.0 
> for backwards compatibility (https://reviews.apache.org/r/25867/).
> In 0.23.0, the slave no longer needs to handle "PING".




[jira] [Assigned] (MESOS-2646) Update Master to send revocable resources in separate offers

2015-08-31 Thread Yong Qiao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang reassigned MESOS-2646:
-

Assignee: Yong Qiao Wang

> Update Master to send revocable resources in separate offers
> 
>
> Key: MESOS-2646
> URL: https://issues.apache.org/jira/browse/MESOS-2646
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Vinod Kone
>Assignee: Yong Qiao Wang
>  Labels: twitter
>
> The master will send separate offers for revocable and non-revocable/regular 
> resources. This allows the master to rescind revocable offers (e.g., when a 
> new oversubscribed resources estimate comes from the slave) without impacting 
> regular offers.




[jira] [Comment Edited] (MESOS-3345) Expand the range of integer precision when converting into/out of json.

2015-08-31 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724630#comment-14724630
 ] 

Benjamin Mahler edited comment on MESOS-3345 at 9/1/15 2:40 AM:


Yes, we use proto2, I'm suggesting we mimic the proto3 <\-> JSON conversion 
(much like we did with base64 for bytes fields). Yes, picojson will parse it as 
a string, that's the point :) Within our JSON\->Protobuf conversion, we have to 
numify accordingly.


was (Author: bmahler):
Yes, we use proto2, I'm suggesting we mimic the proto3 <-> JSON conversion 
(much like we did with base64 for bytes fields). Yes, picojson will parse it as 
a string, that's the point :) Within our JSON->Protobuf conversion, we have to 
numify accordingly.

> Expand the range of integer precision when converting into/out of json.
> ---
>
> Key: MESOS-3345
> URL: https://issues.apache.org/jira/browse/MESOS-3345
> Project: Mesos
>  Issue Type: Task
>  Components: stout
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: json, mesosphere, protobuf
>
> For [MESOS-3299], we added some protobufs to represent time with integer 
> precision.  However, this precision is not maintained through protobuf <-> 
> JSON conversion, because of how our JSON encoders/decoders convert numbers to 
> floating point.
> To maintain precision, we can try one of the following:
> * Try using a {{long double}} to represent a number.
> * Add logic to stringify/parse numbers without loss when possible.
> * Try representing {{int64_t}} as a string and parse it as such?
> * Update PicoJson and add a compiler flag, i.e. {{-DPICOJSON_USE_INT64}} 
> In all cases, we'll need to make sure that:
> * Integers are properly stringified without loss.
> * The JSON decoder parses the integer without loss.
> * We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
> small integers.
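
The precision loss described above, and the "represent int64_t as a string" 
option, can be illustrated with a small sketch. It does not use picojson or 
the stout JSON code; viaDouble() and viaString() are hypothetical helper 
names chosen for this example only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Round-trips an integer through a double, as a JSON number type backed
// by floating point would. A double has a 53-bit significand, so this is
// exact only for magnitudes up to 2^53.
int64_t viaDouble(int64_t value) {
  return static_cast<int64_t>(static_cast<double>(value));
}

// Round-trips an integer through its decimal string form, which keeps
// every digit and is therefore lossless for the whole int64_t range.
int64_t viaString(int64_t value) {
  return std::stoll(std::to_string(value));
}
```

A value such as 2^53 + 1 does not survive viaDouble() but does survive 
viaString(), which is the kind of case the proposed unit tests for large 
integers would need to cover.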




[jira] [Commented] (MESOS-3345) Expand the range of integer precision when converting into/out of json.

2015-08-31 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724630#comment-14724630
 ] 

Benjamin Mahler commented on MESOS-3345:


Yes, we use proto2, I'm suggesting we mimic the proto3 <-> JSON conversion 
(much like we did with base64 for bytes fields). Yes, picojson will parse it as 
a string, that's the point :) Within our JSON->Protobuf conversion, we have to 
numify accordingly.


[jira] [Commented] (MESOS-2858) FetcherCacheHttpTest.HttpMixed is flaky.

2015-08-31 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724625#comment-14724625
 ] 

Benjamin Mahler commented on MESOS-2858:


[~bernd-mesos] another failure just now on apache jenkins:

{noformat}
[ RUN  ] FetcherCacheHttpTest.HttpMixed
Using temporary directory '/tmp/FetcherCacheHttpTest_HttpMixed_QorfSx'
I0901 01:30:37.064944 30488 leveldb.cpp:176] Opened db in 2.397166ms
I0901 01:30:37.065899 30488 leveldb.cpp:183] Compacted db in 880282ns
I0901 01:30:37.065968 30488 leveldb.cpp:198] Created db iterator in 23806ns
I0901 01:30:37.066103 30488 leveldb.cpp:204] Seeked to beginning of db in 2305ns
I0901 01:30:37.066125 30488 leveldb.cpp:273] Iterated through 0 keys in the db 
in 414ns
I0901 01:30:37.066190 30488 replica.cpp:744] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0901 01:30:37.066812 30518 recover.cpp:449] Starting replica recovery
I0901 01:30:37.067175 30518 recover.cpp:475] Replica is in EMPTY status
I0901 01:30:37.068895 30509 replica.cpp:641] Replica in EMPTY status received a 
broadcasted recover request
I0901 01:30:37.070057 30511 recover.cpp:195] Received a recover response from a 
replica in EMPTY status
I0901 01:30:37.070122 30513 master.cpp:378] Master 
20150901-013037-1996493228-35897-30488 (998e845ced4a) started on 
172.17.0.119:35897
I0901 01:30:37.070686 30521 recover.cpp:566] Updating replica status to STARTING
I0901 01:30:37.070360 30513 master.cpp:380] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
--authorizers="local" 
--credentials="/tmp/FetcherCacheHttpTest_HttpMixed_QorfSx/credentials" 
--framework_sorter="drf" --help="false" --initialize_driver_logging="true" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_slave_ping_timeouts="5" --quiet="false" 
--recovery_slave_removal_limit="100%" --registry="replicated_log" 
--registry_fetch_timeout="1mins" --registry_store_timeout="25secs" 
--registry_strict="true" --root_submissions="true" 
--slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
--user_sorter="drf" --version="false" 
--webui_dir="/mesos/mesos-0.25.0/_inst/share/mesos/webui" 
--work_dir="/tmp/FetcherCacheHttpTest_HttpMixed_QorfSx/master" 
--zk_session_timeout="10secs"
I0901 01:30:37.071161 30513 master.cpp:425] Master only allowing authenticated 
frameworks to register
I0901 01:30:37.071323 30513 master.cpp:430] Master only allowing authenticated 
slaves to register
I0901 01:30:37.071486 30511 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 658303ns
I0901 01:30:37.071542 30513 credentials.hpp:37] Loading credentials for 
authentication from '/tmp/FetcherCacheHttpTest_HttpMixed_QorfSx/credentials'
I0901 01:30:37.071569 30511 replica.cpp:323] Persisted replica status to 
STARTING
I0901 01:30:37.071864 30521 recover.cpp:475] Replica is in STARTING status
I0901 01:30:37.072119 30513 master.cpp:469] Using default 'crammd5' 
authenticator
I0901 01:30:37.072484 30513 master.cpp:506] Authorization enabled
I0901 01:30:37.072827 30521 hierarchical.hpp:346] Initialized hierarchical 
allocator process
I0901 01:30:37.072973 30521 whitelist_watcher.cpp:79] No whitelist given
I0901 01:30:37.073446 30522 replica.cpp:641] Replica in STARTING status 
received a broadcasted recover request
I0901 01:30:37.074085 30511 recover.cpp:195] Received a recover response from a 
replica in STARTING status
I0901 01:30:37.075266 30519 recover.cpp:566] Updating replica status to VOTING
I0901 01:30:37.075773 30522 master.cpp:1559] The newly elected leader is 
master@172.17.0.119:35897 with id 20150901-013037-1996493228-35897-30488
I0901 01:30:37.075842 30522 master.cpp:1572] Elected as the leading master!
I0901 01:30:37.075920 30522 master.cpp:1332] Recovering from registrar
I0901 01:30:37.076122 30518 registrar.cpp:311] Recovering registrar
I0901 01:30:37.076969 30511 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 1.360762ms
I0901 01:30:37.077129 30511 replica.cpp:323] Persisted replica status to VOTING
I0901 01:30:37.077304 30511 recover.cpp:580] Successfully joined the Paxos group
I0901 01:30:37.077510 30511 recover.cpp:464] Recover process terminated
I0901 01:30:37.078060 30509 log.cpp:661] Attempting to start the writer
I0901 01:30:37.079370 30515 replica.cpp:477] Replica received implicit promise 
request with proposal 1
I0901 01:30:37.079810 30515 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 398985ns
I0901 01:30:37.079838 30515 replica.cpp:345] Persisted promised to 1
I0901 01:30:37.080569 30509 coordinator.cpp:231] Coordinator attemping to fill 
missing position
I0901 01:30:37.082077 30507 replica.cpp:378] Replica received explicit promise 
request for position 0 with proposal 2
I0901 01:30:37.082552 30507 leveldb.cpp:343

[jira] [Updated] (MESOS-3346) Add filter support for inverse offers

2015-08-31 Thread Artem Harutyunyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-3346:
-
Summary: Add filter support for inverse offers  (was: Inverse offers do not 
support filters)

> Add filter support for inverse offers
> -
>
> Key: MESOS-3346
> URL: https://issues.apache.org/jira/browse/MESOS-3346
> Project: Mesos
>  Issue Type: Task
>Reporter: Artem Harutyunyan
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> A filter attached to the inverse offer can be used by the framework to 
> control when it wants to be contacted again with the inverse offer, since 
> future circumstances may change the viability of the maintenance schedule.  
> The “filter” for InverseOffers is identical to the existing mechanism for 
> re-offering Offers to frameworks.




[jira] [Commented] (MESOS-3346) Add filter support for inverse offers

2015-08-31 Thread Artem Harutyunyan (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724623#comment-14724623
 ] 

Artem Harutyunyan commented on MESOS-3346:
--

Thanks to [~bmahler] for suggesting a more descriptive title.


[jira] [Commented] (MESOS-3329) Unused hashmap::existsValue functions have incomplete code paths

2015-08-31 Thread Jian Qiu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724618#comment-14724618
 ] 

Jian Qiu commented on MESOS-3329:
-

Appending the related review request: https://reviews.apache.org/r/37955/

> Unused hashmap::existsValue functions have incomplete code paths
> 
>
> Key: MESOS-3329
> URL: https://issues.apache.org/jira/browse/MESOS-3329
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Jan Schlicht
>Assignee: Jian Qiu
>Priority: Trivial
>  Labels: easyfix, mesosphere
>
> `stout/hashmap.hpp` defines functions `hashmap::existsValue`. These return 
> true if a certain value exists in the hashmap instance. The control flow of 
> these functions doesn't cover the case that the value is not found, which 
> should result in false. Right now the result in this case is undefined.
> As the `existsValue` functions are never called this doesn't result in a 
> compile error atm.
> Possible solutions:
> 1) Add `return false`
> 2) Remove function
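
A minimal sketch of option 1, for illustration only. The free function `existsValue` below is a hypothetical stand-in written against `std::unordered_map`; the actual stout `hashmap::existsValue` signature may differ. It shows the missing fall-through case being handled with an explicit `return false`.

```cpp
#include <unordered_map>

// Hypothetical stand-in for `hashmap::existsValue` (assumption: the real
// stout member function may have a different signature).
template <typename K, typename V>
bool existsValue(const std::unordered_map<K, V>& map, const V& value)
{
  for (const auto& entry : map) {
    if (entry.second == value) {
      return true;
    }
  }

  // Without this statement, control would flow off the end of a non-void
  // function when the value is absent, which is undefined behavior.
  return false;
}
```

Option 2 (removing the function) avoids the question entirely if the function really has no callers.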




[jira] [Updated] (MESOS-3347) Remove dead code in src/linux/perf.cpp

2015-08-31 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-3347:
---
Shepherd: Benjamin Mahler

> Remove dead code in src/linux/perf.cpp
> --
>
> Key: MESOS-3347
> URL: https://issues.apache.org/jira/browse/MESOS-3347
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Paul Brett
>Assignee: Paul Brett
> Fix For: 0.25.0
>
>
> Performance monitoring routines include support for sampling for single pid, 
> single cgroup and multiple pids cases but these are never used.




[jira] [Issue Comment Deleted] (MESOS-3329) Unused hashmap::existsValue functions have incomplete code paths

2015-08-31 Thread Jian Qiu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Qiu updated MESOS-3329:

Comment: was deleted

(was: commit 7ef97970ae4d5ea909abbf234a565ae7db62b192
Author: Jian Qiu 
Date:   Mon Aug 31 21:28:34 2015 +0800

Remove hashmap::existsValue since it is never called

Review: https://reviews.apache.org/r/37955
)

> Unused hashmap::existsValue functions have incomplete code paths
> 
>
> Key: MESOS-3329
> URL: https://issues.apache.org/jira/browse/MESOS-3329
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Jan Schlicht
>Assignee: Jian Qiu
>Priority: Trivial
>  Labels: easyfix, mesosphere
>
> `stout/hashmap.hpp` defines functions `hashmap::existsValue`. These return 
> true if a certain value exists in the hashmap instance. The control flow of 
> these functions doesn't cover the case that the value is not found, which 
> should result in false. Right now the result in this case is undefined.
> As the `existsValue` functions are never called this doesn't result in a 
> compile error atm.
> Possible solutions:
> 1) Add `return false`
> 2) Remove function




[jira] [Created] (MESOS-3349) PersistentVolumeTest.AccessPersistentVolume fails when run as root.

2015-08-31 Thread Benjamin Mahler (JIRA)

Benjamin Mahler created MESOS-3349:
--

 Summary: PersistentVolumeTest.AccessPersistentVolume fails when 
run as root.
 Key: MESOS-3349
 URL: https://issues.apache.org/jira/browse/MESOS-3349
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Benjamin Mahler


When running the tests as root:

{noformat}
[ RUN  ] PersistentVolumeTest.AccessPersistentVolume
I0901 02:17:26.435140 39432 exec.cpp:133] Version: 0.25.0
I0901 02:17:26.442129 39461 exec.cpp:207] Executor registered on slave 
20150901-021726-1828659978-52102-32604-S0
Registered executor on hostname
Starting task d8ff1f00-e720-4a61-b440-e111009dfdc3
sh -c 'echo abc > path1/file'
Forked command at 39484
Command exited with status 0 (pid: 39484)
../../src/tests/persistent_volume_tests.cpp:579: Failure
Value of: os::exists(path::join(directory, "path1"))
  Actual: true
Expected: false
[  FAILED  ] PersistentVolumeTest.AccessPersistentVolume (777 ms)
{noformat}

FYI [~jieyu] [~mcypark]




[jira] [Commented] (MESOS-3329) Unused hashmap::existsValue functions have incomplete code paths

2015-08-31 Thread Jian Qiu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724614#comment-14724614
 ] 

Jian Qiu commented on MESOS-3329:
-

[~haosd...@gmail.com] Thanks for reminding me :-)

> Unused hashmap::existsValue functions have incomplete code paths
> 
>
> Key: MESOS-3329
> URL: https://issues.apache.org/jira/browse/MESOS-3329
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Jan Schlicht
>Assignee: Jian Qiu
>Priority: Trivial
>  Labels: easyfix, mesosphere
>
> `stout/hashmap.hpp` defines functions `hashmap::existsValue`. These return 
> true if a certain value exists in the hashmap instance. The control flow of 
> these functions doesn't cover the case that the value is not found, which 
> should result in false. Right now the result in this case is undefined.
> As the `existsValue` functions are never called this doesn't result in a 
> compile error atm.
> Possible solutions:
> 1) Add `return false`
> 2) Remove function




[jira] [Comment Edited] (MESOS-1575) master sets failover timeout to 0 when framework requests a high value

2015-08-31 Thread JIRA


[ 
https://issues.apache.org/jira/browse/MESOS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724582#comment-14724582
 ] 

José Guilherme Vanz edited comment on MESOS-1575 at 9/1/15 2:08 AM:


Well, I do not know, [~gyliu]. As a novice in the project, I am unsure what 
the best approach is. Maybe [~vinodkone] or [~kevints] has a better 
opinion.

I believe the idea of refusing the subscription is to notify the user that 
something may be wrong with their application, instead of just silently using 
the default value. The default value can be used when the parameter is missing... 

Anyway, I can update the default value. =)


was (Author: jvanz):
Well, I do not know, [~gyliu]. As a novice in the project, I am unsure what 
the best approach is. Maybe [~vinodkone] or [~kevints] has a better 
opinion.

Anyway, I can update the default value. =)

> master sets failover timeout to 0 when framework requests a high value
> --
>
> Key: MESOS-1575
> URL: https://issues.apache.org/jira/browse/MESOS-1575
> Project: Mesos
>  Issue Type: Bug
>Reporter: Kevin Sweeney
>Assignee: José Guilherme Vanz
>  Labels: newbie, twitter
>
> In response to a registered RPC we observed the following behavior:
> {noformat}
> W0709 19:07:32.982997 11400 master.cpp:612] Using the default value for 
> 'failover_timeout' becausethe input value is invalid: Argument out of the 
> range that a Duration can represent due to int64_t's size limit
> I0709 19:07:32.983008 11404 hierarchical_allocator_process.hpp:408] 
> Deactivated framework 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983013 11400 master.cpp:617] Giving framework 
> 20140709-184342-119646400-5050-11380-0003 0ns to failover
> I0709 19:07:32.983271 11404 master.cpp:2201] Framework failover timeout, 
> removing framework 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983294 11404 master.cpp:2688] Removing framework 
> 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983678 11404 hierarchical_allocator_process.hpp:363] Removed 
> framework 20140709-184342-119646400-5050-11380-0003
> {noformat}
> This was using the following frameworkInfo.
> {code}
> FrameworkInfo frameworkInfo = FrameworkInfo.newBuilder()
> .setUser("test")
> .setName("jvm")
> .setFailoverTimeout(Long.MAX_VALUE)
> .build();
> {code}
> Instead of silently defaulting large values to 0 the master should refuse to 
> process the request.
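
The overflow behind this report can be sketched as follows. This is an illustration under stated assumptions, not the Mesos implementation: it assumes the timeout arrives as a `double` number of seconds and is stored as `int64_t` nanoseconds, as the log message above suggests. The helper name `secondsToNanoseconds` is hypothetical. It rejects out-of-range input instead of silently falling back to 0.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a Mesos or stout API). Converts seconds to
// nanoseconds and reports failure when the result cannot fit in int64_t.
bool secondsToNanoseconds(double seconds, int64_t* nanoseconds)
{
  const double nanos = seconds * 1e9;

  // 2^63 as a double; the largest int64_t value is 2^63 - 1.
  const double limit = 9223372036854775808.0;

  if (std::isnan(nanos) || nanos >= limit || nanos < -limit) {
    return false; // Caller should refuse the request.
  }

  *nanoseconds = static_cast<int64_t>(nanos);
  return true;
}
```

With this check, a value such as `Long.MAX_VALUE` seconds is reported as invalid, so the caller can refuse the request rather than treat the timeout as 0ns.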




[jira] [Commented] (MESOS-1575) master sets failover timeout to 0 when framework requests a high value

2015-08-31 Thread Guangya Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724586#comment-14724586
 ] 

Guangya Liu commented on MESOS-1575:


I think it is not good to set the default failover value to 0; a value 
greater than 0 would be more meaningful.

> master sets failover timeout to 0 when framework requests a high value
> --
>
> Key: MESOS-1575
> URL: https://issues.apache.org/jira/browse/MESOS-1575
> Project: Mesos
>  Issue Type: Bug
>Reporter: Kevin Sweeney
>Assignee: José Guilherme Vanz
>  Labels: newbie, twitter
>
> In response to a registered RPC we observed the following behavior:
> {noformat}
> W0709 19:07:32.982997 11400 master.cpp:612] Using the default value for 
> 'failover_timeout' becausethe input value is invalid: Argument out of the 
> range that a Duration can represent due to int64_t's size limit
> I0709 19:07:32.983008 11404 hierarchical_allocator_process.hpp:408] 
> Deactivated framework 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983013 11400 master.cpp:617] Giving framework 
> 20140709-184342-119646400-5050-11380-0003 0ns to failover
> I0709 19:07:32.983271 11404 master.cpp:2201] Framework failover timeout, 
> removing framework 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983294 11404 master.cpp:2688] Removing framework 
> 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983678 11404 hierarchical_allocator_process.hpp:363] Removed 
> framework 20140709-184342-119646400-5050-11380-0003
> {noformat}
> This was using the following frameworkInfo.
> {code}
> FrameworkInfo frameworkInfo = FrameworkInfo.newBuilder()
> .setUser("test")
> .setName("jvm")
> .setFailoverTimeout(Long.MAX_VALUE)
> .build();
> {code}
> Instead of silently defaulting large values to 0 the master should refuse to 
> process the request.




[jira] [Comment Edited] (MESOS-2430) Update Mesos version that appears in getting started guide

2015-08-31 Thread Yong Qiao Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724583#comment-14724583
 ] 

Yong Qiao Wang edited comment on MESOS-2430 at 9/1/15 1:59 AM:
---

I checked the latest getting-started doc and this issue has been 
addressed, so it can be marked as resolved directly.


was (Author: jamesyongqiaowang):
I checked the latest code and this issue has been addressed.

> Update Mesos version that appears in getting started guide
> --
>
> Key: MESOS-2430
> URL: https://issues.apache.org/jira/browse/MESOS-2430
> Project: Mesos
>  Issue Type: Task
>  Components: project website
>Reporter: Dave Lester
>Assignee: Yong Qiao Wang
>  Labels: newbie
>
> The latest Mesos version that appears in docs/getting-started.md gives the 
> example of using version 0.20.1. This should be updated to reflect the 
> most-current version, Mesos version




[jira] [Commented] (MESOS-1575) master sets failover timeout to 0 when framework requests a high value

2015-08-31 Thread JIRA


[ 
https://issues.apache.org/jira/browse/MESOS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724582#comment-14724582
 ] 

José Guilherme Vanz commented on MESOS-1575:


Well, I do not know, [~gyliu]. As a novice in the project, I am unsure what 
the best approach is. Maybe [~vinodkone] or [~kevints] has a better 
opinion.

Anyway, I can update the default value. =)

> master sets failover timeout to 0 when framework requests a high value
> --
>
> Key: MESOS-1575
> URL: https://issues.apache.org/jira/browse/MESOS-1575
> Project: Mesos
>  Issue Type: Bug
>Reporter: Kevin Sweeney
>Assignee: José Guilherme Vanz
>  Labels: newbie, twitter
>
> In response to a registered RPC we observed the following behavior:
> {noformat}
> W0709 19:07:32.982997 11400 master.cpp:612] Using the default value for 
> 'failover_timeout' becausethe input value is invalid: Argument out of the 
> range that a Duration can represent due to int64_t's size limit
> I0709 19:07:32.983008 11404 hierarchical_allocator_process.hpp:408] 
> Deactivated framework 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983013 11400 master.cpp:617] Giving framework 
> 20140709-184342-119646400-5050-11380-0003 0ns to failover
> I0709 19:07:32.983271 11404 master.cpp:2201] Framework failover timeout, 
> removing framework 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983294 11404 master.cpp:2688] Removing framework 
> 20140709-184342-119646400-5050-11380-0003
> I0709 19:07:32.983678 11404 hierarchical_allocator_process.hpp:363] Removed 
> framework 20140709-184342-119646400-5050-11380-0003
> {noformat}
> This was using the following frameworkInfo.
> {code}
> FrameworkInfo frameworkInfo = FrameworkInfo.newBuilder()
> .setUser("test")
> .setName("jvm")
> .setFailoverTimeout(Long.MAX_VALUE)
> .build();
> {code}
> Instead of silently defaulting large values to 0 the master should refuse to 
> process the request.




[jira] [Commented] (MESOS-2430) Update Mesos version that appears in getting started guide

2015-08-31 Thread Yong Qiao Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724583#comment-14724583
 ] 

Yong Qiao Wang commented on MESOS-2430:
---

I checked the latest code and this issue has been addressed.

> Update Mesos version that appears in getting started guide
> --
>
> Key: MESOS-2430
> URL: https://issues.apache.org/jira/browse/MESOS-2430
> Project: Mesos
>  Issue Type: Task
>  Components: project website
>Reporter: Dave Lester
>Assignee: Yong Qiao Wang
>  Labels: newbie
>
> The latest Mesos version that appears in docs/getting-started.md gives the 
> example of using version 0.20.1. This should be updated to reflect the 
> most-current version, Mesos version




[jira] [Created] (MESOS-3348) Add either log rotation or capped-size logging

2015-08-31 Thread Joseph Wu (JIRA)

Joseph Wu created MESOS-3348:


 Summary: Add either log rotation or capped-size logging
 Key: MESOS-3348
 URL: https://issues.apache.org/jira/browse/MESOS-3348
 Project: Mesos
  Issue Type: Story
Affects Versions: 0.23.0
Reporter: Joseph Wu
Assignee: Joseph Wu


Tasks currently log their output (stdout/stderr) to files on an agent's disk.  
In some cases, the accumulation of these logs can completely fill up the 
agent's disk and thereby kill the task or machine.

To prevent this, we should either implement a log rotation mechanism or 
capped-size logging.

We will first scope out several possible approaches for log rotation/capping in 
a design document (linked below).  Once an approach is chosen, this story 
will be broken down into corresponding issues.
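
To make the "capped-size" option concrete, here is a small sketch. It is purely illustrative and is not the approach chosen for Mesos (the design is still open, per the text above). The class name `CappedLog` and its in-memory buffer are assumptions; a real implementation would write to files on the agent's disk.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration of capped-size logging: keeps only the most
// recent `capacity` bytes and discards the oldest output.
class CappedLog
{
public:
  explicit CappedLog(std::size_t capacity) : capacity_(capacity)
  {
    assert(capacity > 0);
  }

  // Appends `data`, then drops the oldest bytes so that at most
  // `capacity_` bytes are retained.
  void append(const std::string& data)
  {
    buffer_ += data;
    if (buffer_.size() > capacity_) {
      buffer_.erase(0, buffer_.size() - capacity_);
    }
  }

  const std::string& contents() const { return buffer_; }

private:
  std::size_t capacity_;
  std::string buffer_;
};
```

The trade-off against rotation is that capping bounds disk usage with a single file but loses older output, whereas rotation keeps a bounded number of older files.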




[jira] [Commented] (MESOS-3301) Should remove redundant check

2015-08-31 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724437#comment-14724437
 ] 

Vinod Kone commented on MESOS-3301:
---

Commented on the ticket.

As a piece of advice, I would highly recommend looking for a shepherd and 
discussing with him/her, *before* sending out reviews. It will help streamline 
the review process and avoid delays.

Looking forward to your contributions!

> Should remove redundant check
> -
>
> Key: MESOS-3301
> URL: https://issues.apache.org/jira/browse/MESOS-3301
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>Priority: Minor
>
> In the functions HierarchicalAllocatorProcess<RoleSorter, FrameworkSorter>::initialize() 
> and HierarchicalAllocatorProcess<RoleSorter, FrameworkSorter>::allocate(), there is no 
> need to check roleSorter->count(), because the role * is always added to roleSorter.
> I suggest removing the following lines from the initialize() and allocate() 
> functions:
> -if (roleSorter->count() == 0) { 
> - LOG(ERROR) << "No roles specified, cannot allocate resources!"; 
> - return; 
> -}




[jira] [Created] (MESOS-3347) Remove dead code in src/linux/perf.cpp

2015-08-31 Thread Paul Brett (JIRA)

Paul Brett created MESOS-3347:
-

 Summary: Remove dead code in src/linux/perf.cpp
 Key: MESOS-3347
 URL: https://issues.apache.org/jira/browse/MESOS-3347
 Project: Mesos
  Issue Type: Improvement
Reporter: Paul Brett
Assignee: Paul Brett


Performance monitoring routines include support for sampling for single pid, 
single cgroup and multiple pids cases but these are never used.




[jira] [Commented] (MESOS-2562) 0.24.0 release

2015-08-31 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724414#comment-14724414
 ] 

Vinod Kone commented on MESOS-2562:
---

Cherry picked the following commits for 0.24.0-rc2

{code}
* 0d41ee6 Fixed user cgroup failing test on centos 7.
* ec3b93f Document iptable rule need added when use docker bridge network mode.
* a5a37a3 Added symlink test for /bin, lib, and /lib64 when preparing test root 
filesystem.
{code}

> 0.24.0 release
> --
>
> Key: MESOS-2562
> URL: https://issues.apache.org/jira/browse/MESOS-2562
> Project: Mesos
>  Issue Type: Task
>Reporter: Kapil Arya
>Assignee: Vinod Kone
>
> The main feature of this release is going to be v1 (beta) release of the HTTP 
> scheduler API (part of MESOS-2288 epic).
> Unresolved issues tracker: 
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20MESOS%20AND%20status%20!%3D%20Resolved%20AND%20%22Target%20Version%2Fs%22%20%3D%200.24.0%20ORDER%20BY%20status%20DESC




[jira] [Updated] (MESOS-3297) Failing ROOT_ tests on CentOS 7.1 - MesosContainerizerLaunchTest

2015-08-31 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3297:
--
Fix Version/s: 0.24.0

> Failing ROOT_ tests on CentOS 7.1 - MesosContainerizerLaunchTest
> 
>
> Key: MESOS-3297
> URL: https://issues.apache.org/jira/browse/MESOS-3297
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker, test
>Affects Versions: 0.23.0, 0.24.0
> Environment: CentOS Linux release 7.1
> Linux 3.10.0
>Reporter: Marco Massenzio
>Assignee: Greg Mann
>Priority: Blocker
>  Labels: mesosphere, tech-debt
> Fix For: 0.24.0
>
>
> h2. MesosContainerizerLaunchTest
> This is one of several ROOT failing tests: we want to track them 
> *individually* and for each of them decide whether to:
> * fix;
> * remove; OR
> * redesign.
> (full verbose logs attached)
> h2. Steps to Reproduce
> Completely cleaned the build, removed directory, clean pull from {{master}} 
> (SHA: {{fb93d93}}) - same results, 9 failed tests:
> {noformat}
> [==] 751 tests from 114 test cases ran. (231218 ms total)
> [  PASSED  ] 742 tests.
> [  FAILED  ] 9 tests, listed below:
> [  FAILED  ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
> [  FAILED  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess
> [  FAILED  ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint
> [  FAILED  ] 
> LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem
> [  FAILED  ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs
>  9 FAILED TESTS
>   YOU HAVE 10 DISABLED TESTS
> {noformat}




[jira] [Updated] (MESOS-3294) Failing ROOT_ tests on CentOS 7.1 - UserCgroupIsolatorTest

2015-08-31 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3294:
--
Fix Version/s: 0.24.0

> Failing ROOT_ tests on CentOS 7.1 - UserCgroupIsolatorTest
> --
>
> Key: MESOS-3294
> URL: https://issues.apache.org/jira/browse/MESOS-3294
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker, test
>Affects Versions: 0.23.0, 0.24.0
> Environment: CentOS Linux release 7.1
> Linux 3.10.0
>Reporter: Marco Massenzio
>Assignee: Timothy Chen
>Priority: Blocker
>  Labels: mesosphere, tech-debt
> Fix For: 0.24.0
>
>
> h2. UserCgroupIsolatorTest
> This is one of several ROOT failing tests: we want to track them 
> *individually* and for each of them decide whether to:
> * fix;
> * remove; OR
> * redesign.
> (full verbose logs attached)
> h2. Steps to Reproduce
> Completely cleaned the build, removed directory, clean pull from {{master}} 
> (SHA: {{fb93d93}}) - same results, 9 failed tests:
> {noformat}
> [==] 751 tests from 114 test cases ran. (231218 ms total)
> [  PASSED  ] 742 tests.
> [  FAILED  ] 9 tests, listed below:
> [  FAILED  ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
> [  FAILED  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess
> [  FAILED  ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint
> [  FAILED  ] 
> LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem
> [  FAILED  ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs
>  9 FAILED TESTS
>   YOU HAVE 10 DISABLED TESTS
> {noformat}




[jira] [Updated] (MESOS-3053) Failing DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged

2015-08-31 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3053:
--
Fix Version/s: 0.24.0

> Failing DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged
> ---
>
> Key: MESOS-3053
> URL: https://issues.apache.org/jira/browse/MESOS-3053
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
> Environment: Linux klaus-OptiPlex-780 3.13.0-57-generic #95-Ubuntu 
> SMP Fri Jun 19 09:28:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>Reporter: Klaus Ma
>Assignee: haosdent
> Fix For: 0.24.0
>
>
> {code}
> [ RUN  ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged
> ../../src/tests/docker_containerizer_tests.cpp:618: Failure
> Value of: statusRunning.get().state()
>   Actual: TASK_LOST
> Expected: TASK_RUNNING
> ../../src/tests/docker_containerizer_tests.cpp:619: Failure
> Failed to wait 1mins for statusFinished
> ../../src/tests/docker_containerizer_tests.cpp:610: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, 
> statusUpdate(&driver, _))...
>  Expected: to be called twice
>Actual: called once - unsatisfied and active
> *** Aborted at 1436949820 (unix time) try "date -d @1436949820" if you are 
> using GNU date ***
> PC: @   0x86ecf2 mesos::internal::tests::Cluster::Slaves::shutdown()
> *** SIGSEGV (@0x7ffd010100c7) received by PID 1244 (TID 0x2ba741a49140) from 
> PID 16842951; stack trace: ***
> @ 0x2ba74761b340 (unknown)
> @   0x86ecf2 mesos::internal::tests::Cluster::Slaves::shutdown()
> @   0x86eab2 mesos::internal::tests::Cluster::Slaves::~Slaves()
> @   0x870546 mesos::internal::tests::Cluster::~Cluster()
> @   0x8705b7 mesos::internal::tests::MesosTest::~MesosTest()
> @   0xa38b23 
> mesos::internal::tests::DockerContainerizerTest::~DockerContainerizerTest()
> @   0xa67583 
> mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test()
> @   0xa675b2 
> mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test()
> @  0x11adb2e testing::Test::DeleteSelf_()
> @  0x11b6a57 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @  0x11b1bde 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @  0x119a70f testing::TestInfo::Run()
> @  0x119ac4a testing::TestCase::Run()
> @  0x119f914 testing::internal::UnitTestImpl::RunAllTests()
> @  0x11b78ee 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @  0x11b2903 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @  0x119e820 testing::UnitTest::Run()
> @   0xce54c3 main
> @ 0x2ba74784aec5 (unknown)
> @   0x864c09 (unknown)
> make[3]: *** [check-local] Segmentation fault (core dumped)
> make[3]: Leaving directory `/home/klaus/mesos/build/src'
> make[2]: *** [check-am] Error 2
> make[2]: Leaving directory `/home/klaus/mesos/build/src'
> make[1]: *** [check] Error 2
> make[1]: Leaving directory `/home/klaus/mesos/build/src'
> make: *** [check-recursive] Error 1
> {code}




[jira] [Updated] (MESOS-3296) Failing ROOT_ tests on CentOS 7.1 - LinuxFilesystemIsolatorTest

2015-08-31 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3296:
--
Fix Version/s: 0.24.0

> Failing ROOT_ tests on CentOS 7.1 - LinuxFilesystemIsolatorTest
> ---
>
> Key: MESOS-3296
> URL: https://issues.apache.org/jira/browse/MESOS-3296
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker, test
>Affects Versions: 0.23.0, 0.24.0
> Environment: CentOS Linux release 7.1
> Linux 3.10.0
>Reporter: Marco Massenzio
>Assignee: Greg Mann
>Priority: Blocker
>  Labels: mesosphere, tech-debt
> Fix For: 0.24.0
>
>
> h2. LinuxFilesystemIsolatorTest
> This is one of several ROOT failing tests: we want to track them 
> *individually* and for each of them decide whether to:
> * fix;
> * remove; OR
> * redesign.
> (full verbose logs attached)
> h2. Steps to Reproduce
> Completely cleaned the build, removed directory, clean pull from {{master}} 
> (SHA: {{fb93d93}}) - same results, 9 failed tests:
> {noformat}
> [==] 751 tests from 114 test cases ran. (231218 ms total)
> [  PASSED  ] 742 tests.
> [  FAILED  ] 9 tests, listed below:
> [  FAILED  ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
> [  FAILED  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess
> [  FAILED  ] ContainerizerTest.ROOT_CGROUPS_BalloonFramework
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost
> [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint
> [  FAILED  ] 
> LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem
> [  FAILED  ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs
>  9 FAILED TESTS
>   YOU HAVE 10 DISABLED TESTS
> {noformat}




[jira] [Updated] (MESOS-3273) EventCall Test Framework is flaky

2015-08-31 Thread Vinod Kone (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3273:
--
Assignee: (was: Vinod Kone)

> EventCall Test Framework is flaky
> -
>
> Key: MESOS-3273
> URL: https://issues.apache.org/jira/browse/MESOS-3273
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API
>Affects Versions: 0.24.0
> Environment: 
> https://builds.apache.org/job/Mesos/705/COMPILER=clang,CONFIGURATION=--verbose,OS=ubuntu:14.04,label_exp=docker%7C%7CHadoop/consoleFull
>Reporter: Vinod Kone
>  Labels: flaky-test, tech-debt, twitter
>
> Observed this on ASF CI. h/t [~haosd...@gmail.com]
> Looks like the HTTP scheduler never sent a SUBSCRIBE request to the master.
> {code}
> [ RUN  ] ExamplesTest.EventCallFramework
> Using temporary directory '/tmp/ExamplesTest_EventCallFramework_k4vXkx'
> I0813 19:55:15.643579 26085 exec.cpp:443] Ignoring exited event because the 
> driver is aborted!
> Shutting down
> Sending SIGTERM to process tree at pid 26061
> Killing the following process trees:
> [ 
> ]
> Shutting down
> Sending SIGTERM to process tree at pid 26062
> Shutting down
> Killing the following process trees:
> [ 
> ]
> Sending SIGTERM to process tree at pid 26063
> Killing the following process trees:
> [ 
> ]
> Shutting down
> Sending SIGTERM to process tree at pid 26098
> Killing the following process trees:
> [ 
> ]
> Shutting down
> Sending SIGTERM to process tree at pid 26099
> Killing the following process trees:
> [ 
> ]
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I0813 19:55:17.161726 26100 process.cpp:1012] libprocess is initialized on 
> 172.17.2.10:60249 for 16 cpus
> I0813 19:55:17.161888 26100 logging.cpp:177] Logging to STDERR
> I0813 19:55:17.163625 26100 scheduler.cpp:157] Version: 0.24.0
> I0813 19:55:17.175302 26100 leveldb.cpp:176] Opened db in 3.167446ms
> I0813 19:55:17.176393 26100 leveldb.cpp:183] Compacted db in 1.047996ms
> I0813 19:55:17.176496 26100 leveldb.cpp:198] Created db iterator in 77155ns
> I0813 19:55:17.176518 26100 leveldb.cpp:204] Seeked to beginning of db in 
> 8429ns
> I0813 19:55:17.176527 26100 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 4219ns
> I0813 19:55:17.176708 26100 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0813 19:55:17.178951 26136 recover.cpp:449] Starting replica recovery
> I0813 19:55:17.179934 26136 recover.cpp:475] Replica is in EMPTY status
> I0813 19:55:17.181970 26126 master.cpp:378] Master 
> 20150813-195517-167907756-60249-26100 (297daca2d01a) started on 
> 172.17.2.10:60249
> I0813 19:55:17.182317 26126 master.cpp:380] Flags at startup: 
> --acls="permissive: false
> register_frameworks {
>   principals {
> type: SOME
> values: "test-principal"
>   }
>   roles {
> type: SOME
> values: "*"
>   }
> }
> run_tasks {
>   principals {
> type: SOME
> values: "test-principal"
>   }
>   users {
> type: SOME
> values: "mesos"
>   }
> }
> " --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_slaves="false" 
> --authenticators="crammd5" 
> --credentials="/tmp/ExamplesTest_EventCallFramework_k4vXkx/credentials" 
> --framework_sorter="drf" --help="false" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_slave_ping_timeouts="5" --quiet="false" 
> --recovery_slave_removal_limit="100%" --registry="replicated_log" 
> --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" 
> --registry_strict="false" --root_submissions="true" 
> --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
> --user_sorter="drf" --version="false" 
> --webui_dir="/mesos/mesos-0.24.0/src/webui" --work_dir="/tmp/mesos-II8Gua" 
> --zk_session_timeout="10secs"
> I0813 19:55:17.183475 26126 master.cpp:427] Master allowing unauthenticated 
> frameworks to register
> I0813 19:55:17.183536 26126 master.cpp:432] Master allowing unauthenticated 
> slaves to register
> I0813 19:55:17.183615 26126 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/ExamplesTest_EventCallFramework_k4vXkx/credentials'
> W0813 19:55:17.183859 26126 credentials.hpp:52] Permissions on credentials 
> file '/tmp/ExamplesTest_EventCallFramework_k4vXkx/credentials' are too open. 
> It is recommended that your credentials file is NOT accessible by others.
> I0813 19:55:17.183969 26123 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I0813 19:55:17.184306 26126 master.cpp:469] Using default 'crammd5' 
> authenticator
> I0813 19:55:17.184661 26126 authenticator.cpp:512] Initializing server SASL
> I0813 19:55:17.185104 26138 recover.cpp:195] Received a recover response from 
> a

[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2840:
---
Sprint: Mesosphere Sprint 12, Mesosphere Sprint 13, Mesosphere Sprint 14, 
Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17  (was: 
Mesosphere Sprint 12, Mesosphere Sprint 13, Mesosphere Sprint 14, Mesosphere 
Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18)

> MesosContainerizer support multiple image provisioners
> --
>
> Key: MESOS-2840
> URL: https://issues.apache.org/jira/browse/MESOS-2840
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization, docker
>Affects Versions: 0.23.0
>Reporter: Marco Massenzio
>Assignee: Timothy Chen
>  Labels: mesosphere, twitter
>
> We want to use the Appc integration interfaces to make MesosContainerizer 
> support multiple image formats.
> This allows our future work on isolators to support any container image 
> format.
> Design
> https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing




[jira] [Commented] (MESOS-2840) MesosContainerizer support multiple image provisioners

2015-08-31 Thread Marco Massenzio (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724386#comment-14724386
 ] 

Marco Massenzio commented on MESOS-2840:


Sorry about that!
I didn't even realize you could actually put an Epic into a Sprint (the only 
reason Jira shows it was me is that I closed the previous Sprint(s) and it was 
automatically moved to the next one).

I'm taking this out of the Sprint; it should totally not be there.

> MesosContainerizer support multiple image provisioners
> --
>
> Key: MESOS-2840
> URL: https://issues.apache.org/jira/browse/MESOS-2840
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization, docker
>Affects Versions: 0.23.0
>Reporter: Marco Massenzio
>Assignee: Timothy Chen
>  Labels: mesosphere, twitter
>
> We want to use the Appc integration interfaces to make MesosContainerizer 
> support multiple image formats.
> This allows our future work on isolators to support any container image 
> format.
> Design
> https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing




[jira] [Created] (MESOS-3346) Inverse offers do not support filters

2015-08-31 Thread Artem Harutyunyan (JIRA)

Artem Harutyunyan created MESOS-3346:


 Summary: Inverse offers do not support filters
 Key: MESOS-3346
 URL: https://issues.apache.org/jira/browse/MESOS-3346
 Project: Mesos
  Issue Type: Task
Reporter: Artem Harutyunyan
Assignee: Artem Harutyunyan


A filter attached to the inverse offer can be used by the framework to control 
when it wants to be contacted again with the inverse offer, since future 
circumstances may change the viability of the maintenance schedule.  The 
“filter” for InverseOffers is identical to the existing mechanism for 
re-offering Offers to frameworks.




[jira] [Updated] (MESOS-3346) Inverse offers do not support filters

2015-08-31 Thread Artem Harutyunyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-3346:
-
Labels: mesosphere  (was: )

> Inverse offers do not support filters
> -
>
> Key: MESOS-3346
> URL: https://issues.apache.org/jira/browse/MESOS-3346
> Project: Mesos
>  Issue Type: Task
>Reporter: Artem Harutyunyan
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> A filter attached to the inverse offer can be used by the framework to 
> control when it wants to be contacted again with the inverse offer, since 
> future circumstances may change the viability of the maintenance schedule.  
> The “filter” for InverseOffers is identical to the existing mechanism for 
> re-offering Offers to frameworks.




[jira] [Updated] (MESOS-3345) Expand the range of integer precision when converting into/out of json.

2015-08-31 Thread Joseph Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-3345:
-
Description: 
For [MESOS-3299], we added some protobufs to represent time with integer 
precision.  However, this precision is not maintained through protobuf <-> JSON 
conversion, because of how our JSON encoders/decoders convert numbers to 
floating point.

To maintain precision, we can:
1) Try using a {{long double}} to represent a number.
2) Add logic to stringify/parse numbers without loss when possible.
3) Try representing {{int64_t}} as a string and parse it as such?
4) Update PicoJson and add a compiler flag, i.e. {{-DPICOJSON_USE_INT64}} 

In all cases, we'll need to make sure that:
* Integers are properly stringified without loss.
* The JSON decoder parses the integer without loss.
* We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
small integers.

  was:
For [MESOS-3299], we added some protobufs to represent time with integer 
precision.  However, this precision is not maintained through protobuf <-> JSON 
conversion, because of how our JSON encoders/decoders convert numbers to 
floating point.

To maintain precision, we can:
1) Try using a {{long double}} to represent a number.
2) Add logic to stringify/parse numbers without loss when possible.

In all cases, we'll need to make sure that:
* Integers are properly stringified without loss.
* The JSON decoder parses the integer without loss.
* We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
small integers.


> Expand the range of integer precision when converting into/out of json.
> ---
>
> Key: MESOS-3345
> URL: https://issues.apache.org/jira/browse/MESOS-3345
> Project: Mesos
>  Issue Type: Task
>  Components: stout
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: json, mesosphere, protobuf
>
> For [MESOS-3299], we added some protobufs to represent time with integer 
> precision.  However, this precision is not maintained through protobuf <-> 
> JSON conversion, because of how our JSON encoders/decoders convert numbers to 
> floating point.
> To maintain precision, we can:
> 1) Try using a {{long double}} to represent a number.
> 2) Add logic to stringify/parse numbers without loss when possible.
> 3) Try representing {{int64_t}} as a string and parse it as such?
> 4) Update PicoJson and add a compiler flag, i.e. {{-DPICOJSON_USE_INT64}} 
> In all cases, we'll need to make sure that:
> * Integers are properly stringified without loss.
> * The JSON decoder parses the integer without loss.
> * We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
> small integers.




[jira] [Updated] (MESOS-3345) Expand the range of integer precision when converting into/out of json.

2015-08-31 Thread Joseph Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-3345:
-
Description: 
For [MESOS-3299], we added some protobufs to represent time with integer 
precision.  However, this precision is not maintained through protobuf <-> JSON 
conversion, because of how our JSON encoders/decoders convert numbers to 
floating point.

To maintain precision, we can try one of the following:
* Try using a {{long double}} to represent a number.
* Add logic to stringify/parse numbers without loss when possible.
* Try representing {{int64_t}} as a string and parse it as such?
* Update PicoJson and add a compiler flag, i.e. {{-DPICOJSON_USE_INT64}} 

In all cases, we'll need to make sure that:
* Integers are properly stringified without loss.
* The JSON decoder parses the integer without loss.
* We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
small integers.

  was:
For [MESOS-3299], we added some protobufs to represent time with integer 
precision.  However, this precision is not maintained through protobuf <-> JSON 
conversion, because of how our JSON encoders/decoders convert numbers to 
floating point.

To maintain precision, we can:
1) Try using a {{long double}} to represent a number.
2) Add logic to stringify/parse numbers without loss when possible.
3) Try representing {{int64_t}} as a string and parse it as such?
4) Update PicoJson and add a compiler flag, i.e. {{-DPICOJSON_USE_INT64}} 

In all cases, we'll need to make sure that:
* Integers are properly stringified without loss.
* The JSON decoder parses the integer without loss.
* We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
small integers.


> Expand the range of integer precision when converting into/out of json.
> ---
>
> Key: MESOS-3345
> URL: https://issues.apache.org/jira/browse/MESOS-3345
> Project: Mesos
>  Issue Type: Task
>  Components: stout
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: json, mesosphere, protobuf
>
> For [MESOS-3299], we added some protobufs to represent time with integer 
> precision.  However, this precision is not maintained through protobuf <-> 
> JSON conversion, because of how our JSON encoders/decoders convert numbers to 
> floating point.
> To maintain precision, we can try one of the following:
> * Try using a {{long double}} to represent a number.
> * Add logic to stringify/parse numbers without loss when possible.
> * Try representing {{int64_t}} as a string and parse it as such?
> * Update PicoJson and add a compiler flag, i.e. {{-DPICOJSON_USE_INT64}} 
> In all cases, we'll need to make sure that:
> * Integers are properly stringified without loss.
> * The JSON decoder parses the integer without loss.
> * We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
> small integers.




[jira] [Commented] (MESOS-2840) MesosContainerizer support multiple image provisioners

2015-08-31 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724335#comment-14724335
 ] 

Vinod Kone commented on MESOS-2840:
---

[~marco-mesos] Can you avoid putting epics into sprints? Mainly because epics 
labeled with "mesosphere" and "twitter" end up polluting our sprints. For 
example, Twitter cannot close its sprint because Mesosphere has an active 
sprint with a ticket that has "twitter" label. Meanwhile, I'll see if I can fix 
our sprints to not depend on the "twitter" label.

> MesosContainerizer support multiple image provisioners
> --
>
> Key: MESOS-2840
> URL: https://issues.apache.org/jira/browse/MESOS-2840
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization, docker
>Affects Versions: 0.23.0
>Reporter: Marco Massenzio
>Assignee: Timothy Chen
>  Labels: mesosphere, twitter
>
> We want to use the Appc integration interfaces to make MesosContainerizer 
> support multiple image formats.
> This allows our future work on isolators to support any container image 
> format.
> Design
> https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing




[jira] [Commented] (MESOS-3345) Expand the range of integer precision when converting into/out of json.

2015-08-31 Thread Joseph Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724332#comment-14724332
 ] 

Joseph Wu commented on MESOS-3345:
--

I don't think this will quite work.  (And aren't we using protobufs version 2?)

From what I can tell, if there's a {{"}} in the JSON, our PicoJson parser will 
return it as a string.  If there's a digit (0-9), PicoJson will parse it as a 
double.  Some things might break if we start printing JSON ints as strings.

Perhaps we could update PicoJson: 
https://github.com/kazuho/picojson#experimental-support-for-int64_t
And then use the experimental int64_t.

> Expand the range of integer precision when converting into/out of json.
> ---
>
> Key: MESOS-3345
> URL: https://issues.apache.org/jira/browse/MESOS-3345
> Project: Mesos
>  Issue Type: Task
>  Components: stout
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: json, mesosphere, protobuf
>
> For [MESOS-3299], we added some protobufs to represent time with integer 
> precision.  However, this precision is not maintained through protobuf <-> 
> JSON conversion, because of how our JSON encoders/decoders convert numbers to 
> floating point.
> To maintain precision, we can:
> 1) Try using a {{long double}} to represent a number.
> 2) Add logic to stringify/parse numbers without loss when possible.
> In all cases, we'll need to make sure that:
> * Integers are properly stringified without loss.
> * The JSON decoder parses the integer without loss.
> * We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
> small integers.




[jira] [Commented] (MESOS-3345) Expand the range of integer precision when converting into/out of json.

2015-08-31 Thread Vinod Kone (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724305#comment-14724305
 ] 

Vinod Kone commented on MESOS-3345:
---

Looks like ResourceStatistics has uint64 fields, which will break if we 
change the Protobuf to JSON mapping :(

https://github.com/apache/mesos/blob/master/src/slave/monitor.cpp#L140

https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L704

> Expand the range of integer precision when converting into/out of json.
> ---
>
> Key: MESOS-3345
> URL: https://issues.apache.org/jira/browse/MESOS-3345
> Project: Mesos
>  Issue Type: Task
>  Components: stout
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: json, mesosphere, protobuf
>
> For [MESOS-3299], we added some protobufs to represent time with integer 
> precision.  However, this precision is not maintained through protobuf <-> 
> JSON conversion, because of how our JSON encoders/decoders convert numbers to 
> floating point.
> To maintain precision, we can:
> 1) Try using a {{long double}} to represent a number.
> 2) Add logic to stringify/parse numbers without loss when possible.
> In all cases, we'll need to make sure that:
> * Integers are properly stringified without loss.
> * The JSON decoder parses the integer without loss.
> * We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
> small integers.




[jira] [Commented] (MESOS-3345) Expand the range of integer precision when converting into/out of json.

2015-08-31 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724137#comment-14724137
 ] 

Benjamin Mahler commented on MESOS-3345:


[~kaysoky] Ah, let's just always use JSON strings for int64, per the proto3 
mapping:
https://developers.google.com/protocol-buffers/docs/proto3?hl=en#json

Can we do this in a non-breaking way given what we have today?
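
A minimal sketch of the string-based mapping suggested above, using only the 
C++ standard library. The helper names are hypothetical and this is not the 
stout or PicoJson code; it only illustrates that a 64-bit integer survives a 
round trip when carried as a quoted decimal string.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper: renders an int64 as a JSON string token, e.g. "123"
// including the surrounding double quotes.
std::string encodeInt64AsJsonString(int64_t value) {
  return "\"" + std::to_string(value) + "\"";
}

// Hypothetical helper: parses a token produced by the encoder above.
// Returns false if the token is not a quoted decimal integer that fits
// into a signed 64-bit value; `*out` is only written on success.
bool decodeInt64FromJsonString(const std::string& token, int64_t* out) {
  assert(out != nullptr);

  // Require at least one character between the two quotes.
  if (token.size() < 3 || token.front() != '"' || token.back() != '"') {
    return false;
  }

  const std::string digits = token.substr(1, token.size() - 2);

  errno = 0;
  char* end = nullptr;
  const long long parsed = std::strtoll(digits.c_str(), &end, 10);

  // Reject out-of-range values and trailing garbage such as "12abc".
  if (errno == ERANGE || end != digits.c_str() + digits.size()) {
    return false;
  }

  *out = static_cast<int64_t>(parsed);
  return true;
}
```

Note that a consumer which currently expects a bare JSON number would see a 
string instead, which is the compatibility concern raised in this thread.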

> Expand the range of integer precision when converting into/out of json.
> ---
>
> Key: MESOS-3345
> URL: https://issues.apache.org/jira/browse/MESOS-3345
> Project: Mesos
>  Issue Type: Task
>  Components: stout
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: json, mesosphere, protobuf
>
> For [MESOS-3299], we added some protobufs to represent time with integer 
> precision.  However, this precision is not maintained through protobuf <-> 
> JSON conversion, because of how our JSON encoders/decoders convert numbers to 
> floating point.
> To maintain precision, we can:
> 1) Try using a {{long double}} to represent a number.
> 2) Add logic to stringify/parse numbers without loss when possible.
> In all cases, we'll need to make sure that:
> * Integers are properly stringified without loss.
> * The JSON decoder parses the integer without loss.
> * We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
> small integers.




[jira] [Created] (MESOS-3345) Expand the range of integer precision when converting into/out of json.

2015-08-31 Thread Joseph Wu (JIRA)

Joseph Wu created MESOS-3345:


 Summary: Expand the range of integer precision when converting 
into/out of json.
 Key: MESOS-3345
 URL: https://issues.apache.org/jira/browse/MESOS-3345
 Project: Mesos
  Issue Type: Task
  Components: stout
Reporter: Joseph Wu
Assignee: Joseph Wu
Priority: Minor


For [MESOS-3299], we added some protobufs to represent time with integer 
precision.  However, this precision is not maintained through protobuf <-> JSON 
conversion, because of how our JSON encoders/decoders convert numbers to 
floating point.

To maintain precision, we can:
1) Try using a {{long double}} to represent a number.
2) Add logic to stringify/parse numbers without loss when possible.

In all cases, we'll need to make sure that:
* Integers are properly stringified without loss.
* The JSON decoder parses the integer without loss.
* We have some unit tests for big (close to {{INT32_MAX}}/{{INT64_MAX}}) and 
small integers.
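
The precision loss described above can be reproduced without any JSON library. 
The sketch below assumes an IEEE 754 {{double}} (53-bit significand); the 
function name is hypothetical and not part of stout. It reports whether an 
integer survives a conversion to {{double}} and back, which is what a 
double-only JSON number representation effectively does.

```cpp
#include <cstdint>

// Hypothetical helper (not part of stout): returns true if `value` is
// unchanged after being converted to double and back to int64_t.
bool survivesDoubleRoundTrip(int64_t value) {
  const double asDouble = static_cast<double>(value);

  // Values close to INT64_MAX round up to 2^63, which is outside the
  // int64_t range. Converting that back would be undefined behavior, so
  // treat it as a lossy conversion without performing the cast.
  if (asDouble >= 9223372036854775808.0) {
    return false;
  }

  return static_cast<int64_t>(asDouble) == value;
}
```

Under that assumption every 32-bit integer is preserved, while 2^53 + 1 and 
{{INT64_MAX}} are not, so unit tests near both limits exercise different paths.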




[jira] [Assigned] (MESOS-3344) Add more comments for strings::internal::fmt

2015-08-31 Thread Guangya Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu reassigned MESOS-3344:
--

Assignee: Guangya Liu

> Add more comments for strings::internal::fmt
> 
>
> Key: MESOS-3344
> URL: https://issues.apache.org/jira/browse/MESOS-3344
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.25.0
>Reporter: Guangya Liu
>Assignee: Guangya Liu
> Fix For: 0.25.0
>
>
> MESOS-1805 proposes changing const pass-by-value to const reference for some 
> functions, but this is not right: in some cases it will cause unexpected 
> behavior. We should add comments to those functions explaining why they do 
> not use a const reference, to make the code clear.




[jira] [Commented] (MESOS-1805) change const pass-by-value to const reference in stout

2015-08-31 Thread Guangya Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724055#comment-14724055
 ] 

Guangya Liu commented on MESOS-1805:


A new issue, https://issues.apache.org/jira/browse/MESOS-3344, has been filed 
to add comments to those functions explaining why they do not take a reference.

> change const pass-by-value to const reference in stout
> --
>
> Key: MESOS-1805
> URL: https://issues.apache.org/jira/browse/MESOS-1805
> Project: Mesos
>  Issue Type: Improvement
>  Components: stout
>Affects Versions: 0.25.0
>Reporter: Kamil Domański
>Assignee: Guangya Liu
>Priority: Trivial
>  Labels: easyfix, patch, performance
> Fix For: 0.25.0
>
>
> {{os::shell}} and an overload of {{strings::internal::fmt}} in stout pass a 
> {{const std::string}} parameter instead of {{const std::string&}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (MESOS-3344) Add more comments for strings::internal::fmt

2015-08-31 Thread Guangya Liu (JIRA)

Guangya Liu created MESOS-3344:
--

 Summary: Add more comments for strings::internal::fmt
 Key: MESOS-3344
 URL: https://issues.apache.org/jira/browse/MESOS-3344
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.25.0
Reporter: Guangya Liu
 Fix For: 0.25.0


MESOS-1805 proposes changing const pass-by-value to const reference for some 
functions, but this is not right: in some cases it will cause unexpected 
behavior. We should add comments to those functions explaining why they do not 
use a const reference, to make the code clear.
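
One case where a const reference is unsafe is a C-style variadic function: the 
C++ standard leaves {{va_start}} undefined when the last named parameter has 
reference type. The sketch below assumes this is the situation the ticket has 
in mind; {{formatMessage}} is a hypothetical stand-in, not the actual 
{{strings::internal::fmt}} code, and it shows the kind of comment that could 
be added.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical printf-style helper (not the stout implementation).
//
// NOTE: `format` is deliberately passed by value. It is the last named
// parameter before the ellipsis, and calling va_start on a parameter of
// reference type is undefined behavior, so `const std::string&` must not
// be used here.
std::string formatMessage(const std::string format, ...) {
  char buffer[256];  // Longer results are truncated in this sketch.

  va_list args;
  va_start(args, format);
  const int written =
      std::vsnprintf(buffer, sizeof(buffer), format.c_str(), args);
  va_end(args);

  // vsnprintf returns a negative value on an encoding error.
  if (written < 0) {
    return std::string();
  }

  return std::string(buffer);
}
```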




[jira] [Commented] (MESOS-1805) change const pass-by-value to const reference in stout

2015-08-31 Thread Guangya Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724044#comment-14724044
 ] 

Guangya Liu commented on MESOS-1805:


It seems that we should not change this part to a const reference, as it may 
cause unexpected behavior; please refer to 
http://stackoverflow.com/a/222314 for details.

> change const pass-by-value to const reference in stout
> --
>
> Key: MESOS-1805
> URL: https://issues.apache.org/jira/browse/MESOS-1805
> Project: Mesos
>  Issue Type: Improvement
>  Components: stout
>Affects Versions: 0.25.0
>Reporter: Kamil Domański
>Assignee: Guangya Liu
>Priority: Trivial
>  Labels: easyfix, patch, performance
> Fix For: 0.25.0
>
>
> {{os::shell}} and an overload of {{strings::internal::fmt}} in stout pass a 
> {{const std::string}} parameter instead of {{const std::string&}}




[jira] [Created] (MESOS-3343) Rate Limiting functionality for HTTP Frameworks

2015-08-31 Thread Anand Mazumdar (JIRA)

Anand Mazumdar created MESOS-3343:
-

 Summary: Rate Limiting functionality for HTTP Frameworks
 Key: MESOS-3343
 URL: https://issues.apache.org/jira/browse/MESOS-3343
 Project: Mesos
  Issue Type: Task
Reporter: Anand Mazumdar


We need to build rate limiting functionality for frameworks connecting via the 
Scheduler HTTP API similar to the PID based frameworks.

Link to the rate-limiting section from design doc:
https://docs.google.com/document/d/1pnIY_HckimKNvpqhKRhbc9eSItWNFT-priXh_urR-T0/edit#heading=h.kzgdk4d5fmba

- This ticket deals with refactoring the existing PID based framework 
functionality and extending it for HTTP frameworks.
- The second part, notifying the framework when rate-limiting is active (i.e. 
returning a status of 429), can be undertaken as part of MESOS-1664.
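
As a rough illustration of the behaviour described above, the sketch below 
maps a rate-limiting decision to an HTTP status. It is not the existing Mesos 
rate limiter: the {{TokenBucket}} class, the {{statusFor}} function and the 
use of 202 for an accepted call are assumptions made for the example, and the 
caller supplies the clock so that the logic stays deterministic.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical token-bucket limiter (not the Mesos implementation).
class TokenBucket {
public:
  TokenBucket(double ratePerSecond, double capacity)
    : rate_(ratePerSecond), capacity_(capacity), tokens_(capacity), last_(0.0) {
    assert(ratePerSecond > 0.0 && capacity >= 1.0);
  }

  // `now` is a monotonic timestamp in seconds supplied by the caller.
  // Returns true and consumes one token if the call may proceed.
  bool tryAcquire(double now) {
    // Refill in proportion to the elapsed time, capped at the capacity.
    const double elapsed = std::max(0.0, now - last_);
    tokens_ = std::min(capacity_, tokens_ + elapsed * rate_);
    last_ = now;

    if (tokens_ < 1.0) {
      return false;
    }

    tokens_ -= 1.0;
    return true;
  }

private:
  double rate_;
  double capacity_;
  double tokens_;
  double last_;
};

// Maps the limiter decision to an HTTP status code: 202 (assumed here for
// an accepted call) or 429 Too Many Requests when the framework is limited.
int statusFor(TokenBucket& bucket, double now) {
  return bucket.tryAcquire(now) ? 202 : 429;
}
```

With a rate of one call per second and a capacity of two, two immediate calls 
are accepted, a third gets 429, and a call one second later is accepted again.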




[jira] [Updated] (MESOS-2720) Implement protobufs for master operator endpoints

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2720:
---
Labels: mesosphere  (was: )

> Implement protobufs for master operator endpoints
> -
>
> Key: MESOS-2720
> URL: https://issues.apache.org/jira/browse/MESOS-2720
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Isabel Jimenez
>  Labels: mesosphere
>
> We should define protobufs for master operator endpoints so as to provide a 
> structure we can refer to for each possible return from an endpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2971) Implement OverlayFS based provisioner backend

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2971:
---
Labels: twitter  (was: mesosphere)

> Implement OverlayFS based provisioner backend
> -
>
> Key: MESOS-2971
> URL: https://issues.apache.org/jira/browse/MESOS-2971
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>Assignee: Mei Wan
>  Labels: twitter
>
> Part of the image provisioning process is to call a backend to create a root 
> filesystem based on the image's on-disk layout.
> The problem with the copy backend is that it wastes both IO and space, and 
> the bind backend can only deal with one layer.
> The OverlayFS backend allows us to use the filesystem to merge multiple 
> filesystems into one efficiently.
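
A sketch of how several image layers could be combined into a single overlayfs 
mount. The helper name and the directory paths are hypothetical and this is 
not the provisioner backend itself; it only builds the option string that the 
Linux overlay filesystem accepts, where {{lowerdir}} lists the read-only 
layers top-most first, separated by colons.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not Mesos code). `layers` is ordered bottom-most
// first (base image first). overlayfs expects `lowerdir` top-most first,
// so the list is emitted in reverse order.
std::string overlayMountOptions(
    const std::vector<std::string>& layers,
    const std::string& upperDir,
    const std::string& workDir) {
  assert(!layers.empty());

  std::string lower;
  for (auto it = layers.rbegin(); it != layers.rend(); ++it) {
    if (!lower.empty()) {
      lower += ":";
    }
    lower += *it;
  }

  return "lowerdir=" + lower + ",upperdir=" + upperDir + ",workdir=" + workDir;
}
```

The resulting string would be passed as the mount options for a filesystem of 
type {{overlay}}; no layer contents are copied, which is the IO and space 
saving compared with the copy backend.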



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2879) Random recursive_mutex errors in when running make check

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2879:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 18  (was: Mesosphere Sprint 
15)

> Random recursive_mutex errors in when running make check
> 
>
> Key: MESOS-2879
> URL: https://issues.apache.org/jira/browse/MESOS-2879
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Alexander Rojas
>Assignee: Greg Mann
>  Labels: mesosphere, tech-debt
>
> While running make check on OS X, from time to time {{recursive_mutex}} 
> errors appear after all the tests have run successfully. Only one of the 
> messages experienced actually stops {{make check}}, which reports an error.
> The following error messages have been experienced:
> {code}
> libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: 
> libc++abi.dylib: libc++abi.dylib: terminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argument
> *** Aborted at 1434553937 (unix time) try "date -d @1434553937" if you are 
> using GNU date ***
> {code}
> {code}
> libc++abi.dylib: terminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid argument
> *** Aborted at 1434557001 (unix time) try "date -d @1434557001" if you are 
> using GNU date ***
> libc++abi.dylib: PC: @ 0x7fff93855286 __pthread_kill
> libc++abi.dylib: *** SIGABRT (@0x7fff93855286) received by PID 88060 (TID 
> 0x10fc4) stack trace: ***
> @ 0x7fff8e1d6f1a _sigtramp
> libc++abi.dylib: @0x10fc3f1a8 (unknown)
> libc++abi.dylib: @ 0x7fff979deb53 abort
> libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: terminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentMaking check in include
> {code}
> {code}
> Assertion failed: (e == 0), function ~recursive_mutex, file 
> /SourceCache/libcxx/libcxx-120/src/mutex.cpp, line 82.
> *** Aborted at 1434555685 (unix time) try "date -d @1434555685" if you are 
> using GNU date ***
> PC: @ 0x7fff93855286 __pthread_kill
> *** SIGABRT (@0x7fff93855286) received by PID 60235 (TID 0x7fff7ebdc300) 
> stack trace: ***
> @ 0x7fff8e1d6f1a _sigtramp
> @0x10b512350 google::CheckNotNull<>()
> @ 0x7fff979deb53 abort
> @ 0x7fff979a6c39 __assert_rtn
> @ 0x7fff9bffdcc9 std::__1::recursive_mutex::~recursive_mutex()
> @0x10b881928 process::ProcessManager::~ProcessManager()
> @0x10b874445 process::ProcessManager::~ProcessManager()
> @0x10b874418 process::finalize()
> @0x10b2f7aec main
> @ 0x7fff98edc5c9 start
> make[5]: *** [check-local] Abort trap: 6
> make[4]: *** [check-am] Error 2
> make[3]: *** [check-recursive] Error 1
> make[2]: *** [check-recursive] Error 1
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> {code}




[jira] [Assigned] (MESOS-2879) Random recursive_mutex errors in when running make check

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-2879:
--

Assignee: Greg Mann  (was: Joris Van Remoortere)

> Random recursive_mutex errors in when running make check
> 
>
> Key: MESOS-2879
> URL: https://issues.apache.org/jira/browse/MESOS-2879
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Alexander Rojas
>Assignee: Greg Mann
>  Labels: mesosphere, tech-debt
>
> While running make check on OS X, from time to time {{recursive_mutex}} 
> errors appear after all the tests have run successfully. Only one of the 
> messages experienced actually stops {{make check}}, which reports an error.
> The following error messages have been experienced:
> {code}
> libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: 
> libc++abi.dylib: libc++abi.dylib: terminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argument
> *** Aborted at 1434553937 (unix time) try "date -d @1434553937" if you are 
> using GNU date ***
> {code}
> {code}
> libc++abi.dylib: terminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid argument
> *** Aborted at 1434557001 (unix time) try "date -d @1434557001" if you are 
> using GNU date ***
> libc++abi.dylib: PC: @ 0x7fff93855286 __pthread_kill
> libc++abi.dylib: *** SIGABRT (@0x7fff93855286) received by PID 88060 (TID 
> 0x10fc4) stack trace: ***
> @ 0x7fff8e1d6f1a _sigtramp
> libc++abi.dylib: @0x10fc3f1a8 (unknown)
> libc++abi.dylib: @ 0x7fff979deb53 abort
> libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: terminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentterminating with uncaught 
> exception of type std::__1::system_error: recursive_mutex lock failed: 
> Invalid argumentterminating with uncaught exception of type 
> std::__1::system_error: recursive_mutex lock failed: Invalid 
> argumentterminating with uncaught exception of type std::__1::system_error: 
> recursive_mutex lock failed: Invalid argumentMaking check in include
> {code}
> {code}
> Assertion failed: (e == 0), function ~recursive_mutex, file 
> /SourceCache/libcxx/libcxx-120/src/mutex.cpp, line 82.
> *** Aborted at 1434555685 (unix time) try "date -d @1434555685" if you are 
> using GNU date ***
> PC: @ 0x7fff93855286 __pthread_kill
> *** SIGABRT (@0x7fff93855286) received by PID 60235 (TID 0x7fff7ebdc300) 
> stack trace: ***
> @ 0x7fff8e1d6f1a _sigtramp
> @0x10b512350 google::CheckNotNull<>()
> @ 0x7fff979deb53 abort
> @ 0x7fff979a6c39 __assert_rtn
> @ 0x7fff9bffdcc9 std::__1::recursive_mutex::~recursive_mutex()
> @0x10b881928 process::ProcessManager::~ProcessManager()
> @0x10b874445 process::ProcessManager::~ProcessManager()
> @0x10b874418 process::finalize()
> @0x10b2f7aec main
> @ 0x7fff98edc5c9 start
> make[5]: *** [check-local] Abort trap: 6
> make[4]: *** [check-am] Error 2
> make[3]: *** [check-recursive] Error 1
> make[2]: *** [check-recursive] Error 1
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3073:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18  (was: Mesosphere Sprint 15, Mesosphere Sprint 16, 
Mesosphere Sprint 17)

> Introduce HTTP endpoints for Quota
> --
>
> Key: MESOS-3073
> URL: https://issues.apache.org/jira/browse/MESOS-3073
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> We need to implement the HTTP endpoints for Quota as outlined in the Design 
> Doc: 
> (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3158) Libprocess Process: Join runqueue workers during finalization

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3158:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Libprocess Process: Join runqueue workers during finalization
> -
>
> Key: MESOS-3158
> URL: https://issues.apache.org/jira/browse/MESOS-3158
> Project: Mesos
>  Issue Type: Improvement
>  Components: libprocess
>Reporter: Joris Van Remoortere
>Assignee: Greg Mann
>  Labels: beginner, libprocess, mesosphere, newbie
>
> The lack of synchronization between ProcessManager destruction and the thread 
> pool threads running the queued processes means that the shared state that is 
> part of the ProcessManager gets destroyed prematurely.
> Synchronizing the ProcessManager destructor with draining the work queues and 
> stopping the workers will remove the need to leak the shared state in order 
> to avoid use after destruction.
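
The idea of joining the workers during finalization can be sketched in isolation. The following is a minimal illustration using only the C++ standard library; `WorkerPool` and its members are hypothetical names, not libprocess's actual ProcessManager. The destructor drains the queue and joins every worker before the mutex, condition variable, and queue are destroyed.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object that owns worker threads and the
// state they share; not the libprocess implementation.
class WorkerPool {
public:
  explicit WorkerPool(std::size_t workers) {
    for (std::size_t i = 0; i < workers; ++i) {
      threads.emplace_back([this] { run(); });
    }
  }

  // Signals the workers to stop, lets them drain the queue, and joins
  // them. Only after this body returns are the members destroyed.
  ~WorkerPool() {
    {
      std::lock_guard<std::mutex> lock(mutex);
      stopping = true;
    }
    ready.notify_all();
    for (std::thread& thread : threads) {
      thread.join();
    }
  }

  void enqueue(std::function<void()> work) {
    {
      std::lock_guard<std::mutex> lock(mutex);
      queue.push_back(std::move(work));
    }
    ready.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> work;
      {
        std::unique_lock<std::mutex> lock(mutex);
        ready.wait(lock, [this] { return stopping || !queue.empty(); });
        if (queue.empty()) {
          return;  // Reached only when stopping and nothing is left to run.
        }
        work = std::move(queue.front());
        queue.pop_front();
      }
      work();  // Run outside the lock so other workers can proceed.
    }
  }

  std::mutex mutex;
  std::condition_variable ready;
  std::deque<std::function<void()>> queue;
  bool stopping = false;
  std::vector<std::thread> threads;
};
```

Because the join happens in the destructor body, no worker can touch the shared state after it has been torn down, so nothing needs to be leaked.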



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3265) Starting maintenance needs to deactivate agents and kill tasks.

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3265:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Starting maintenance needs to deactivate agents and kill tasks.
> ---
>
> Key: MESOS-3265
> URL: https://issues.apache.org/jira/browse/MESOS-3265
> Project: Mesos
>  Issue Type: Task
>  Components: master, slave
>Reporter: Joseph Wu
>Assignee: Joris Van Remoortere
>  Labels: mesosphere
>
> After using the {{/maintenance/start}} endpoint to begin maintenance on a 
> machine, agents running on said machine should:
> * Be deactivated such that no offers are sent from that agent.  (Investigate 
> if {{Master::deactivate(Slave*)}} can be used or modified for this purpose.)
> * Kill all tasks still running on the agent (See MESOS-1475).
> * Prevent other agents on that machine from registering or sending out 
> offers.  This will likely involve some modifications to {{Master::register}} 
> and {{Master::reregister}}.
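
The three requirements above can be sketched as follows. This is a minimal illustration with hypothetical `Agent` and `MaintenanceTracker` types, not the master's actual data structures; killing a task is modeled simply as removing it from a list.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of an agent as tracked by the master.
struct Agent {
  std::string id;
  std::string hostname;
  bool active;
  std::vector<std::string> tasks;
};

// Hypothetical helper; the real logic would live in the master.
struct MaintenanceTracker {
  std::vector<Agent> agents;
  std::set<std::string> down;  // Hostnames currently under maintenance.

  // Deactivates every agent on `hostname` and returns the IDs of the
  // tasks that were killed as a result.
  std::vector<std::string> start(const std::string& hostname) {
    std::vector<std::string> killed;
    down.insert(hostname);
    for (Agent& agent : agents) {
      if (agent.hostname != hostname) {
        continue;
      }
      agent.active = false;  // No further offers are sent for this agent.
      killed.insert(killed.end(), agent.tasks.begin(), agent.tasks.end());
      agent.tasks.clear();
    }
    return killed;
  }

  // Registration and re-registration are refused while a machine is down.
  bool mayRegister(const std::string& hostname) const {
    return down.count(hostname) == 0;
  }
};
```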



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2062) Add InverseOffer to Event/Call API.

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2062:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Add InverseOffer to Event/Call API.
> ---
>
> Key: MESOS-2062
> URL: https://issues.apache.org/jira/browse/MESOS-2062
> Project: Mesos
>  Issue Type: Task
>  Components: c++ api
>Reporter: Benjamin Mahler
>Assignee: Joris Van Remoortere
>  Labels: mesosphere
>
> The initial use case for InverseOffer in the framework API will be the 
> maintenance primitives in Mesos: MESOS-1474.
> One way to add this is to tack it on to the OFFERS Event:
> {code}
> message Offers {
>   repeated Offer offers = 1;
>   repeated InverseOffer inverse_offers = 2;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3043) Master does not handle InverseOffers in the Accept call (Event/Call API)

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3043:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Master does not handle InverseOffers in the Accept call (Event/Call API)
> 
>
> Key: MESOS-3043
> URL: https://issues.apache.org/jira/browse/MESOS-3043
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Joseph Wu
>Assignee: Joris Van Remoortere
>  Labels: mesosphere
>
> InverseOffers are similar to Offers in that they are Accepted or Declined 
> based on their OfferID.  
> Some additional logic may be necessary in Master::accept 
> (src/master/master.cpp) to gracefully handle the acceptance of InverseOffers.
> * The InverseOffer needs to be removed from the set of pending InverseOffers.
> * The InverseOffer should not result in any errors/warnings.
> Note: accepted InverseOffers do not preclude further InverseOffers from being 
> sent to the framework.  Instead, an accepted InverseOffer merely signifies 
> that the framework is _currently_ fine with the expected downtime.
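
A minimal sketch of the accept path described above, assuming the master keeps pending Offers and InverseOffers in two separate ID sets. `PendingOffers`, `AcceptResult`, and the free function `accept` are hypothetical names for illustration, not the signature of Master::accept.

```cpp
#include <set>
#include <string>

// Hypothetical bookkeeping: IDs of offers the master is still waiting on.
struct PendingOffers {
  std::set<std::string> offers;
  std::set<std::string> inverseOffers;
};

enum class AcceptResult { OFFER, INVERSE_OFFER, UNKNOWN };

// Removes the accepted ID from whichever pending set holds it. Accepting
// an InverseOffer is a normal outcome, not an error or warning.
AcceptResult accept(PendingOffers& pending, const std::string& offerId) {
  if (pending.inverseOffers.erase(offerId) > 0) {
    return AcceptResult::INVERSE_OFFER;
  }
  if (pending.offers.erase(offerId) > 0) {
    return AcceptResult::OFFER;
  }
  return AcceptResult::UNKNOWN;  // The caller decides how to report this.
}
```

Nothing in this sketch prevents a later InverseOffer for the same resources from being added to the pending set again, which matches the note that acceptance only reflects the framework's current position.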



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3313) Rework Jenkins build script

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3313:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Rework Jenkins build script
> ---
>
> Key: MESOS-3313
> URL: https://issues.apache.org/jira/browse/MESOS-3313
> Project: Mesos
>  Issue Type: Task
>Reporter: Artem Harutyunyan
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> The Mesos Jenkins build script needs to be reworked to support the following:
> - Wider test coverage (libevent, libssl, root tests, Docker tests).
> - More OS/compiler Docker images for testing Mesos.
> - Excluding tests on a per-image basis.
> - Reproducing the test image locally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3062) Add authorization for dynamic reservation

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3062:
---
Sprint: Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18  
(was: Mesosphere Sprint 16, Mesosphere Sprint 17)

> Add authorization for dynamic reservation
> -
>
> Key: MESOS-3062
> URL: https://issues.apache.org/jira/browse/MESOS-3062
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Michael Park
>Assignee: Michael Park
>  Labels: mesosphere
>
> Dynamic reservations should be authorized with the {{principal}} of the 
> reserving entity (framework or master). The idea is to introduce {{Reserve}} 
> and {{Unreserve}} into the ACL.
> {code}
>   message Reserve {
>     // Subjects.
>     required Entity principals = 1;
>     // Objects.  MVP: Only possible values = ANY, NONE
>     required Entity resources = 2;
>   }
>   message Unreserve {
>     // Subjects.
>     required Entity principals = 1;
>     // Objects.
>     required Entity reserver_principals = 2;
>   }
> {code}
> When a framework/operator reserves resources, "reserve" ACLs are checked to 
> see if the framework ({{FrameworkInfo.principal}}) or the operator 
> ({{Credential.user}}) is authorized to reserve the specified resources. If 
> not authorized, the reserve operation is rejected.
> When a framework/operator unreserves resources, "unreserve" ACLs are checked 
> to see if the framework ({{FrameworkInfo.principal}}) or the operator 
> ({{Credential.user}}) is authorized to unreserve the resources reserved by a 
> framework or operator ({{Resource.ReservationInfo.principal}}). If not 
> authorized, the unreserve operation is rejected.
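
The unreserve check described above can be sketched with plain structs. This is an illustration only: `Entity` and `UnreserveACL` are simplified mirrors of the proposed messages, and both the first-match rule and the `permissive` fallback are assumptions that the ticket does not specify.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified mirror of an ACL entity.
struct Entity {
  enum Type { SOME, ANY, NONE };
  Type type;
  std::set<std::string> values;  // Only consulted when type == SOME.

  bool matches(const std::string& value) const {
    switch (type) {
      case ANY:
        return true;
      case NONE:
        return false;
      case SOME:
        return values.count(value) > 0;
    }
    return false;
  }
};

struct UnreserveACL {
  Entity principals;           // Subjects: who is unreserving.
  Entity reserver_principals;  // Objects: whose reservations may be removed.
};

// Assumption: the first ACL whose subject matches decides the outcome, and
// `permissive` is the answer when no ACL mentions the principal at all.
bool authorizeUnreserve(
    const std::vector<UnreserveACL>& acls,
    const std::string& principal,
    const std::string& reserver,
    bool permissive) {
  for (const UnreserveACL& acl : acls) {
    if (acl.principals.matches(principal)) {
      return acl.reserver_principals.matches(reserver);
    }
  }
  return permissive;
}
```

Here `principal` stands for {{FrameworkInfo.principal}} or {{Credential.user}}, and `reserver` for {{Resource.ReservationInfo.principal}}.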



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2968) Implement shared copy based provisioner backend

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2968:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Implement shared copy based provisioner backend
> ---
>
> Key: MESOS-2968
> URL: https://issues.apache.org/jira/browse/MESOS-2968
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Timothy Chen
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> Currently the Appc and Docker provisioners each implement their own copy 
> backend, but most of the logic is the same: the input is just an image name 
> with its dependencies.
> We can refactor both so that a single copy backend implementation is shared 
> between the Appc and Docker provisioners.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3304) Remove remnants of LIBPROCESS_STATISTICS_WINDOW

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3304:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Remove remnants of LIBPROCESS_STATISTICS_WINDOW
> ---
>
> Key: MESOS-3304
> URL: https://issues.apache.org/jira/browse/MESOS-3304
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Greg Mann
>Assignee: Greg Mann
>Priority: Trivial
>  Labels: easyfix, mesosphere
>
> As seen in MESOS-1283, LIBPROCESS_STATISTICS_WINDOW is no longer needed since 
> metrics now require specification of a window size, and default to no history 
> if not provided.
> Some commented-out code remnants associated with this environment variable 
> still remain and should be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2719) Deprecating '.json' extension in master endpoints urls

2015-08-31 Thread Isabel Jimenez (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2719:
--
Sprint: Mesosphere Sprint 18

> Deprecating '.json' extension in master endpoints urls
> --
>
> Key: MESOS-2719
> URL: https://issues.apache.org/jira/browse/MESOS-2719
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Isabel Jimenez
>Assignee: Isabel Jimenez
>  Labels: HTTP, mesosphere
>
> For each master endpoint with a '.json' extension, such as 
> `/master/stats.json`, add an equivalent endpoint without the extension 
> (`/master/stats`), so that the '.json' form can be removed after a 
> deprecation cycle.
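
During the deprecation cycle both paths need to reach the same handler. A minimal sketch of that aliasing, using a hypothetical `Router` class rather than libprocess's actual routing API:

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

using Handler = std::function<std::string()>;

// Hypothetical router for illustration only.
class Router {
public:
  // Registers `path` and keeps `path + ".json"` as a deprecated alias
  // served by the same handler.
  void route(const std::string& path, Handler handler) {
    routes[path + ".json"] = handler;
    routes[path] = std::move(handler);
  }

  // Returns true and fills `body` when a handler exists for `path`.
  bool dispatch(const std::string& path, std::string* body) const {
    auto it = routes.find(path);
    if (it == routes.end()) {
      return false;
    }
    *body = it->second();
    return true;
  }

private:
  std::map<std::string, Handler> routes;
};
```

Once the deprecation cycle ends, dropping the alias is a one-line change in `route`, and callers of the extension-less path are unaffected.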



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3289) Add DockerRegistry unit tests

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3289:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Add DockerRegistry unit tests
> -
>
> Key: MESOS-3289
> URL: https://issues.apache.org/jira/browse/MESOS-3289
> Project: Mesos
>  Issue Type: Task
>  Components: docker
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Add a unit test suite for the Docker registry implementation.  This could include:
> - Creating a mock Docker registry server.
> - Using the OpenSSL library for digest functions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3223) Implement token manager for docker registry

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3223:
---
Sprint: Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18  
(was: Mesosphere Sprint 16, Mesosphere Sprint 17)

> Implement token manager for docker registry
> ---
>
> Key: MESOS-3223
> URL: https://issues.apache.org/jira/browse/MESOS-3223
> Project: Mesos
>  Issue Type: Task
>  Components: containerization, docker
> Environment: linux
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Implement a component that:
> - Fetches a JSON web authorization token from a given registry.
> - Caches the token, keyed on registry, service, and scope.
> - Validates cached tokens against their expiry date.
> Nice to have:
> - Prune the cache as tokens age beyond their expiration time.
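
The caching part of this ticket can be sketched on its own. The following is a minimal illustration, assuming a token is just an opaque string plus an expiry time. `Token` and `TokenCache` are hypothetical names, and fetching the token from the registry is out of scope here.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>

using Clock = std::chrono::system_clock;

// Hypothetical token representation: raw string plus expiry time.
struct Token {
  std::string raw;
  Clock::time_point expiry;
};

class TokenCache {
public:
  // Key order: registry, service, scope.
  using Key = std::tuple<std::string, std::string, std::string>;

  void put(const Key& key, Token token) { tokens[key] = std::move(token); }

  // Returns nullptr when there is no token for `key` or it has expired.
  const Token* get(const Key& key, Clock::time_point now) const {
    auto it = tokens.find(key);
    if (it == tokens.end() || it->second.expiry <= now) {
      return nullptr;
    }
    return &it->second;
  }

  // Nice to have: drops expired tokens and returns how many were removed.
  std::size_t prune(Clock::time_point now) {
    std::size_t removed = 0;
    for (auto it = tokens.begin(); it != tokens.end();) {
      if (it->second.expiry <= now) {
        it = tokens.erase(it);
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }

  std::size_t size() const { return tokens.size(); }

private:
  std::map<Key, Token> tokens;
};
```

Passing `now` explicitly, rather than reading the clock inside the cache, keeps expiry handling easy to exercise in unit tests.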



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3038) Resource offers do not contain Unavailability, given a maintenance schedule

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3038:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Resource offers do not contain Unavailability, given a maintenance schedule
> ---
>
> Key: MESOS-3038
> URL: https://issues.apache.org/jira/browse/MESOS-3038
> Project: Mesos
>  Issue Type: Task
>  Components: allocation, master
>Reporter: Joseph Wu
>Assignee: Joris Van Remoortere
>  Labels: mesosphere
>
> Given a schedule, defined elsewhere, any resource offers to affected slaves 
> must include an Unavailability field.
> The maintenance schedule for a single slave should be held in [persistent 
> storage|MESOS-2075] and locally by the master.  i.e. In src/master/master.hpp:
> {code}
> struct Slave {
>   ... // Existing fields.
>   // New field that the master/allocator can access
>   Maintenances pendingDowntime;
> };
> {code}
> The new field should be populated via an API call (see [MESOS-2067]).
> The Unavailability field can be added to Master::offer 
> (src/master/master.cpp).
> {code}
> offer->mutable_unavailability()->MergeFrom(slave->pendingDowntime);
> {code}
> Possible test(s):
> * PendingUnavailabilityTest
> ** Start master, slave.
> ** Check unavailability of offer == none.
> ** Set unavailability to the future.
> ** Check offer has unavailability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2600) Add /reserve and /unreserve endpoints on the master for dynamic reservation

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2600:
---
Sprint: Mesosphere Sprint 10, Mesosphere Sprint 11, Mesosphere Sprint 15, 
Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18  (was: 
Mesosphere Sprint 10, Mesosphere Sprint 11, Mesosphere Sprint 15, Mesosphere 
Sprint 16, Mesosphere Sprint 17)

> Add /reserve and /unreserve endpoints on the master for dynamic reservation
> ---
>
> Key: MESOS-2600
> URL: https://issues.apache.org/jira/browse/MESOS-2600
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Michael Park
>Assignee: Michael Park
>Priority: Critical
>  Labels: mesosphere
>
> Enable operators to manage dynamic reservations by introducing the 
> {{/reserve}} and {{/unreserve}} HTTP endpoints on the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2949) Design generalized Authorizer interface

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2949:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18  (was: Mesosphere Sprint 15, Mesosphere Sprint 16, 
Mesosphere Sprint 17)

> Design generalized Authorizer interface
> ---
>
> Key: MESOS-2949
> URL: https://issues.apache.org/jira/browse/MESOS-2949
> Project: Mesos
>  Issue Type: Task
>  Components: master, security
>Reporter: Alexander Rojas
>Assignee: Alexander Rojas
>  Labels: acl, mesosphere, security
>
> As mentioned in MESOS-2948 the current {{mesos::Authorizer}} interface is 
> rather inflexible if new _Actions_ or _Objects_ need to be added.
> A new API needs to be designed in a way that allows for arbitrary _Actions_ 
> and _Objects_ to be added to the authorization mechanism without having to 
> recompile Mesos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3164) Introduce QuotaInfo message

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3164:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18  (was: Mesosphere Sprint 15, Mesosphere Sprint 16, 
Mesosphere Sprint 17)

> Introduce QuotaInfo message
> ---
>
> Key: MESOS-3164
> URL: https://issues.apache.org/jira/browse/MESOS-3164
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> A {{QuotaInfo}} protobuf message is the internal representation of 
> quota-related information (e.g. for persisting quota). The protobuf message 
> should be extensible for future needs and allow for easy aggregation across 
> roles and operator principals. It may also be used to pass quota information 
> to allocators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3074) Check satisfiability of quota requests in Master

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3074:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18  (was: Mesosphere Sprint 15, Mesosphere Sprint 16, 
Mesosphere Sprint 17)

> Check satisfiability of quota requests in Master
> 
>
> Key: MESOS-3074
> URL: https://issues.apache.org/jira/browse/MESOS-3074
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> We need to validate quota requests in the Mesos Master as outlined in 
> the Design Doc: 
> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I
> This ticket aims to validate satisfiability (in terms of available resources) 
> of a quota request using a heuristic algorithm in the Mesos Master, rather 
> than validating the syntax of the request.
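
The ticket does not say which heuristic is intended, and the design doc may specify a different one. As one naive possibility, the sketch below treats a request as satisfiable when the free resources summed over all agents cover every requested quantity. `Resources` and `satisfiable` are hypothetical names, and resources are reduced to named scalar amounts.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical scalar-only resource model, e.g. "cpus" -> 4.0.
using Resources = std::map<std::string, double>;

// Naive heuristic (an assumption, not the design doc's algorithm): sum
// the free resources over all agents and compare against the request.
bool satisfiable(
    const Resources& request,
    const std::vector<Resources>& freePerAgent) {
  Resources total;
  for (const Resources& agent : freePerAgent) {
    for (const auto& entry : agent) {
      total[entry.first] += entry.second;
    }
  }

  for (const auto& entry : request) {
    auto it = total.find(entry.first);
    const double available = (it == total.end()) ? 0.0 : it->second;
    if (available < entry.second) {
      return false;
    }
  }
  return true;
}
```

A cluster-wide sum ignores fragmentation across agents, so this check can pass for a request that no actual allocation could honor; that is the sense in which it is only a heuristic.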



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2466) Write documentation for all the LIBPROCESS_* environment variables.

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2466:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Write documentation for all the LIBPROCESS_* environment variables.
> ---
>
> Key: MESOS-2466
> URL: https://issues.apache.org/jira/browse/MESOS-2466
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Alexander Rojas
>Assignee: Greg Mann
>  Labels: documentation, mesosphere
>
> libprocess uses a set of environment variables to modify its behaviour; 
> however, these variables are not documented anywhere, nor is it defined where 
> the documentation should live.
> What is needed is a decision on where the environment variables should be 
> documented (a new doc file or an existing one), followed by adding the 
> documentation there.
> After searching in the code, these are the variables which need to be 
> documented:
> # {{LIBPROCESS_IP}}
> # {{LIBPROCESS_PORT}}
> # {{LIBPROCESS_ADVERTISE_IP}}
> # {{LIBPROCESS_ADVERTISE_PORT}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2083) Add documentation for maintenance primitives.

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2083:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Add documentation for maintenance primitives.
> -
>
> Key: MESOS-2083
> URL: https://issues.apache.org/jira/browse/MESOS-2083
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Benjamin Mahler
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> We should provide some guiding documentation around the upcoming maintenance 
> primitives in Mesos.
> Specifically, we should ensure that general users, framework developers, and 
> operators understand the notion of maintenance in Mesos. Some guidance and 
> recommendations for the latter two audiences will be necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3092) Configure Jenkins to run Docker tests

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3092:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18  (was: Mesosphere Sprint 15, Mesosphere Sprint 16, 
Mesosphere Sprint 17)

> Configure Jenkins to run Docker tests
> -
>
> Key: MESOS-3092
> URL: https://issues.apache.org/jira/browse/MESOS-3092
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Timothy Chen
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> Add a Jenkins job to run the Docker tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2984) Deprecating '.json' extension in files endpoints url

2015-08-31 Thread Isabel Jimenez (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2984:
--
Sprint: Mesosphere Sprint 18

> Deprecating '.json' extension in files endpoints url
> 
>
> Key: MESOS-2984
> URL: https://issues.apache.org/jira/browse/MESOS-2984
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Isabel Jimenez
>Assignee: Isabel Jimenez
>  Labels: HTTP, mesosphere
>
> Remove the '.json' extension on endpoints such as `/files/browse.json` so it 
> becomes `/files/browse`.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3042) Master/Allocator does not send InverseOffers to resources to be maintained

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3042:
---
Sprint: Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18  
(was: Mesosphere Sprint 16, Mesosphere Sprint 17)

> Master/Allocator does not send InverseOffers to resources to be maintained
> --
>
> Key: MESOS-3042
> URL: https://issues.apache.org/jira/browse/MESOS-3042
> Project: Mesos
>  Issue Type: Task
>  Components: allocation, master
>Reporter: Joseph Wu
>Assignee: Joris Van Remoortere
>  Labels: mesosphere
>
> Offers are currently sent from master/allocator to framework via 
> ResourceOffersMessages.  InverseOffers, which are roughly equivalent to 
> negative Offers, can be sent in the same message.
> In src/messages/messages.proto
> {code}
> message ResourceOffersMessage {
>   repeated Offer offers = 1;
>   repeated string pids = 2;
>   // New field with InverseOffers
>   repeated InverseOffer inverseOffers = 3;
> }
> {code}
> Sent InverseOffers can be tracked in the master's local state:
> i.e. In src/master/master.hpp:
> {code}
> struct Slave {
>   ... // Existing fields.
>   // Active InverseOffers on this slave.
>   // Similar pattern to the "offers" field
>   hashset inverseOffers;
> };
> {code}
> One actor (master or allocator) should populate the new InverseOffers field.
> * In master (src/master/master.cpp)
> ** Master::offer is where the ResourceOffersMessage and Offer objects are 
> constructed.
> ** The same method could also check for maintenance and send InverseOffers.
> * In the allocator (src/master/allocator/mesos/hierarchical.hpp)
> ** HierarchicalAllocatorProcess::allocate is where slave resources are 
> aggregated and sent off to the frameworks.
> ** InverseOffer (i.e. negative resource) allocation could be calculated in 
> this method.
> ** A change to Master::offer (i.e. the "offerCallback") may be necessary to 
> account for the negative resources.
> Possible test(s):
> * InverseOfferTest
> ** Start master, slave, framework.
> ** Accept resource offer, start task.
> ** Set maintenance schedule to the future.
> ** Check that InverseOffer(s) are sent to the framework.
> ** Decline InverseOffer.
> ** Check that more InverseOffer(s) are sent.
> ** Accept InverseOffer.
> ** Check that more InverseOffer(s) are sent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3222) Implement docker registry client

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3222:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Implement docker registry client
> 
>
> Key: MESOS-3222
> URL: https://issues.apache.org/jira/browse/MESOS-3222
> Project: Mesos
>  Issue Type: Task
>  Components: containerization, docker
> Environment: linux
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Implement the following functionality:
> - Fetch the manifest from a remote registry, using the authorization method 
> dictated by the registry.
> - Fetch image layers from a remote registry, using the authorization method 
> dictated by the registry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3266) Stopping/Completing maintenance needs to reactivate agents.

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3266:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Stopping/Completing maintenance needs to reactivate agents.
> ---
>
> Key: MESOS-3266
> URL: https://issues.apache.org/jira/browse/MESOS-3266
> Project: Mesos
>  Issue Type: Task
>  Components: master, slave
>Reporter: Joseph Wu
>Assignee: Joris Van Remoortere
>  Labels: mesosphere
>
> After using the {{/maintenance/stop}} endpoint to end maintenance on a 
> machine, any deactivated agents must be reactivated and allowed to register 
> with the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-3041) Decline call does not include an optional "reason", in the Event/Call API

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3041:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Decline call does not include an optional "reason", in the Event/Call API
> -
>
> Key: MESOS-3041
> URL: https://issues.apache.org/jira/browse/MESOS-3041
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Joseph Wu
>Assignee: Joris Van Remoortere
>  Labels: mesosphere
>
> In the Event/Call API, the Decline call is currently used by frameworks to 
> reject resource offers.
> In the case of InverseOffers, the framework could give additional information 
> to the operators and/or allocator, as to why the InverseOffer is declined. 
> For example, suppose a cluster running some consensus algorithm is given an 
> InverseOffer on one of its nodes.  It may decline saying "Too few nodes" (or, 
> more verbosely, "Specified InverseOffer would lower the number of active 
> nodes below quorum").
> This requires the following changes:
> * include/mesos/scheduler/scheduler.proto:
> {code}
> message Call {
>   ...
>   message Decline {
>     repeated OfferID offer_ids = 1;
>     optional Filters filters = 2;
>     // Add this extra string for each OfferID
>     // i.e. reasons[i] is for offer_ids[i]
>     repeated string reasons = 3;
>   }
>   ...
> }
> {code}
> * src/master/master.cpp
> Change Master::decline to either store the reason, or log it.
> * Add a declineOffer overload in the (Mesos)SchedulerDriver with an optional 
> "reason".
> ** Extend the interface in include/mesos/scheduler.hpp
> ** Add/change the declineOffer method in src/sched/sched.cpp
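
Because `reasons` is a parallel list rather than a field on each OfferID, whatever stores or logs the reasons in Master::decline has to pair the two lists by index. A minimal sketch of that pairing, using a hypothetical plain-struct mirror of the proposed message instead of the generated protobuf class, and assuming that a missing reason is treated as empty:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mirror of the proposed Call::Decline fields.
struct Decline {
  std::vector<std::string> offer_ids;
  std::vector<std::string> reasons;  // reasons[i] belongs to offer_ids[i].
};

// Pairs each offer ID with its reason. Since the reason is optional, an
// offer without a matching entry gets an empty string (an assumption).
std::vector<std::pair<std::string, std::string>> pairReasons(
    const Decline& decline) {
  std::vector<std::pair<std::string, std::string>> result;
  for (std::size_t i = 0; i < decline.offer_ids.size(); ++i) {
    std::string reason =
        i < decline.reasons.size() ? decline.reasons[i] : std::string();
    result.emplace_back(decline.offer_ids[i], reason);
  }
  return result;
}
```

Entries in `reasons` beyond the number of offer IDs are ignored in this sketch; a real implementation might prefer to reject such a call during validation.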



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2840:
---
Sprint: Mesosphere Sprint 12, Mesosphere Sprint 13, Mesosphere Sprint 14, 
Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere 
Sprint 18  (was: Mesosphere Sprint 12, Mesosphere Sprint 13, Mesosphere Sprint 
14, Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17)

> MesosContainerizer support multiple image provisioners
> --
>
> Key: MESOS-2840
> URL: https://issues.apache.org/jira/browse/MESOS-2840
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization, docker
>Affects Versions: 0.23.0
>Reporter: Marco Massenzio
>Assignee: Timothy Chen
>  Labels: mesosphere, twitter
>
> We want to build on the Appc integration interfaces so that 
> MesosContainerizer can support multiple image formats.
> This allows our future work on isolators to support any container image 
> format.
> Design
> https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing




[jira] [Updated] (MESOS-2849) Implement Docker local image store

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2849:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18  (was: Mesosphere Sprint 15, Mesosphere Sprint 16, 
Mesosphere Sprint 17)

> Implement Docker local image store
> --
>
> Key: MESOS-2849
> URL: https://issues.apache.org/jira/browse/MESOS-2849
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Timothy Chen
>Assignee: Lily Chen
>  Labels: mesosphere, unified-prototype
>
> Given a local Docker image name and a path to the image or image tarball, 
> the store fetches the image's dependent layers, untarring if necessary. It 
> also parses the image layers' configuration JSON and places the layers and 
> image into a persistent store.
> Done when a Docker image can be successfully stored and retrieved using the 
> 'put' and 'get' methods.




[jira] [Updated] (MESOS-3312) Factor out JSON to repeated protobuf conversion

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3312:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Factor out JSON to repeated protobuf conversion
> ---
>
> Key: MESOS-3312
> URL: https://issues.apache.org/jira/browse/MESOS-3312
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> In general, a collection of protobuf messages is itself represented as 
> another protobuf message, which makes JSON -> protobuf conversion 
> straightforward. This is not always the case: for example, the {{Resources}} 
> class is not a protobuf, though it is protobuf-convertible.
> To facilitate conversions like JSON -> {{Resources}} and avoid writing code 
> for each particular case, we propose to introduce {{JSON::Array}} -> 
> {{repeated protobuf}} conversion. With this in place, {{JSON::Array}} -> 
> {{Resources}} boils down to {{JSON::Array}} -> {{repeated Resource}} -> 
> (extra c-tor call) -> {{Resources}}.
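
The conversion chain described above can be sketched roughly as below. Everything here is a simplified assumption: a "JSON array" is modelled as a list of element strings, `Resource` is a plain struct rather than a protobuf message, and `parseRepeated`, `parseResource` and the `"name:value"` element format are hypothetical names invented for the illustration. The point is only that one generic array-to-repeated step plus an extra constructor call covers the {{Resources}} case.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a single protobuf message.
struct Resource {
  std::string name;
  double value;
};

// Generic step: convert every array element with one per-element parser.
template <typename T>
std::vector<T> parseRepeated(
    const std::vector<std::string>& array,
    const std::function<T(const std::string&)>& parseOne) {
  std::vector<T> result;
  result.reserve(array.size());
  for (const std::string& element : array) {
    result.push_back(parseOne(element));
  }
  return result;
}

// Wrapper that is not itself a message but is built from the repeated form.
class Resources {
public:
  explicit Resources(const std::vector<Resource>& resources)
    : resources_(resources) {}

  std::size_t size() const { return resources_.size(); }

  // Sums the values of all entries with the given name.
  double total(const std::string& name) const {
    double sum = 0.0;
    for (const Resource& resource : resources_) {
      if (resource.name == name) {
        sum += resource.value;
      }
    }
    return sum;
  }

private:
  std::vector<Resource> resources_;
};

// Per-element parser; assumes well-formed "name:value" input.
Resource parseResource(const std::string& text) {
  const std::size_t colon = text.find(':');
  return Resource{text.substr(0, colon), std::stod(text.substr(colon + 1))};
}
```

With these pieces, the whole conversion is `Resources(parseRepeated<Resource>(array, parseResource))`, and no code specific to `Resources` is needed in the generic step.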




[jira] [Updated] (MESOS-2708) Design doc for the Executor HTTP API

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2708:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> Design doc for the Executor HTTP API
> 
>
> Key: MESOS-2708
> URL: https://issues.apache.org/jira/browse/MESOS-2708
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rojas
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> This tracks the design of the Executor HTTP API.




[jira] [Updated] (MESOS-2983) Deprecating '.json' extension in slave endpoints url

2015-08-31 Thread Isabel Jimenez (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2983:
--
Sprint: Mesosphere Sprint 18

> Deprecating '.json' extension in slave endpoints url
> 
>
> Key: MESOS-2983
> URL: https://issues.apache.org/jira/browse/MESOS-2983
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Isabel Jimenez
>Assignee: Isabel Jimenez
>  Labels: HTTP, mesosphere
>
> Remove the '.json' extension on endpoints such as `/slave/state.json` so that 
> it becomes `/slave/state`
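
One way to keep old clients working during a deprecation period is to map the old path onto the new one before routing. The helper below is a hypothetical sketch of that idea (the name `canonicalEndpoint` is invented here) and is not necessarily how the Mesos HTTP layer handles it.

```cpp
#include <string>

// Hypothetical helper: maps a deprecated '.json' endpoint path to its
// extension-free form and leaves every other path untouched.
std::string canonicalEndpoint(const std::string& path) {
  const std::string suffix = ".json";
  if (path.size() > suffix.size() &&
      path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0) {
    return path.substr(0, path.size() - suffix.size());
  }
  return path;
}
```

Under this sketch both `/slave/state.json` and `/slave/state` resolve to the same handler key, so the old spelling can be dropped later without touching the handlers.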




[jira] [Updated] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3086:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18  (was: Mesosphere Sprint 15, Mesosphere Sprint 16, 
Mesosphere Sprint 17)

> Create cgroups TasksKiller for non freeze subsystems.
> -
>
> Key: MESOS-3086
> URL: https://issues.apache.org/jira/browse/MESOS-3086
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> We have a number of test issues when cgroups cannot be removed (because 
> related tasks are still running) and the freezer subsystem is not available. 
> In the current code 
> (https://github.com/apache/mesos/blob/0.22.1/src/linux/cgroups.cpp#L1728) we 
> fall back to a very simple mechanism of recursively trying to remove the 
> cgroups, which fails if there are still tasks running. 
> Therefore we need an additional (NonFreeze)TasksKiller which doesn't rely 
> on the freezer subsystem.
> This problem caused issues when running 'sudo make check' during 0.23 release 
> testing, where BenH already provided a better error message with 
> b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604.




[jira] [Updated] (MESOS-3051) performance issues with port ranges comparison

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3051:
---
Sprint: Mesosphere Sprint 17, Mesosphere Sprint 18  (was: Mesosphere Sprint 
17)

> performance issues with port ranges comparison
> --
>
> Key: MESOS-3051
> URL: https://issues.apache.org/jira/browse/MESOS-3051
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Affects Versions: 0.22.1
>Reporter: James Peach
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> When testing in an environment with lots of frameworks (>200), where the 
> frameworks permanently decline resources they don't need, the allocator ends 
> up spending a lot of time figuring out whether offers are refused (the code 
> path through {{HierarchicalAllocatorProcess::isFiltered()}}).
> In profiling a synthetic benchmark, it turns out that comparing port ranges 
> is very expensive, involving many temporary allocations. 61% of 
> Resources::contains() run time is in operator -= (Resource). 35% of 
> Resources::contains() run time is in Resources::_contains().
> The heaviest call chain through {{Resources::_contains}} is:
> {code}
> Running Time        Self (ms)  Symbol Name
> 7237.0ms   35.5%       4.0     mesos::Resources::_contains(mesos::Resource const&) const
> 7200.0ms   35.3%       1.0     mesos::contains(mesos::Resource const&, mesos::Resource const&)
> 7133.0ms   35.0%     121.0     mesos::operator<=(mesos::Value_Ranges const&, mesos::Value_Ranges const&)
> 6319.0ms   31.0%       7.0     mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Ranges const&)
> 6240.0ms   30.6%     161.0     mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
> 1867.0ms    9.1%      25.0     mesos::Value_Ranges::add_range()
> 1694.0ms    8.3%       4.0     mesos::Value_Ranges::~Value_Ranges()
> 1495.0ms    7.3%      16.0     mesos::Value_Ranges::operator=(mesos::Value_Ranges const&)
>  445.0ms    2.1%      94.0     mesos::Value_Range::MergeFrom(mesos::Value_Range const&)
>  154.0ms    0.7%      24.0     mesos::Value_Ranges::range(int) const
>  103.0ms    0.5%      24.0     mesos::Value_Ranges::range_size() const
>   95.0ms    0.4%       2.0     mesos::Value_Range::Value_Range(mesos::Value_Range const&)
>   59.0ms    0.2%       4.0     mesos::Value_Ranges::Value_Ranges()
>   50.0ms    0.2%      50.0     mesos::Value_Range::begin() const
>   28.0ms    0.1%      28.0     mesos::Value_Range::end() const
>   26.0ms    0.1%       0.0     mesos::Value_Range::~Value_Range()
> {code}
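
For comparison, a containment check over ranges does not need to build temporaries if both sides are already in a normalized form. The sketch below is an illustration only, not the change that went into Mesos: `Range` is a hypothetical stand-in for the Value_Range protobuf, and `isSubset` assumes that `left` is sorted by `begin` and that `right` is sorted and already coalesced (no overlapping or adjacent ranges). Under those assumptions a single forward pass with no allocations is enough.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Value_Range: an inclusive interval [begin, end].
struct Range {
  std::uint64_t begin;
  std::uint64_t end;
};

// Returns true if every value covered by `left` is also covered by `right`.
// Assumes `left` is sorted by begin, and `right` is sorted and coalesced, so
// a range from `left` is covered only if one range in `right` contains it.
bool isSubset(const std::vector<Range>& left, const std::vector<Range>& right) {
  std::size_t j = 0;
  for (const Range& l : left) {
    // Skip ranges on the right that end before this one starts. Because
    // `left` is sorted, they cannot contain any later range either.
    while (j < right.size() && right[j].end < l.begin) {
      ++j;
    }
    if (j == right.size() || right[j].begin > l.begin || l.end > right[j].end) {
      return false;
    }
  }
  return true;
}
```

The pass is linear in the total number of ranges. Whether the inputs can be kept coalesced at the point where the comparison happens is the real design question, and the sketch does not address it.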
> mesos::coalesce(Value_Ranges) gets done a lot and ends up being really 
> expensive. The heaviest parts of the inverted call chain are:
> {code}
> Running Time        Self (ms)  Symbol Name
> 3209.0ms   15.7%    3209.0     mesos::Value_Range::~Value_Range()
> 3209.0ms   15.7%       0.0     google::protobuf::internal::GenericTypeHandler::Delete(mesos::Value_Range*)
> 3209.0ms   15.7%       0.0     void google::protobuf::internal::RepeatedPtrFieldBase::Destroy::TypeHandler>()
> 3209.0ms   15.7%       0.0     google::protobuf::RepeatedPtrField::~RepeatedPtrField()
> 3209.0ms   15.7%       0.0     google::protobuf::RepeatedPtrField::~RepeatedPtrField()
> 3209.0ms   15.7%       0.0     mesos::Value_Ranges::~Value_Ranges()
> 3209.0ms   15.7%       0.0     mesos::Value_Ranges::~Value_Ranges()
> 2441.0ms   11.9%       0.0     mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range const&)
>  452.0ms    2.2%       0.0     mesos::remove(mesos::Value_Ranges*, mesos::Value_Range const&)
>  169.0ms    0.8%       0.0     mesos::operator<=(mesos::Value_Ranges const&, mesos::Value_Ranges const&)
>   82.0ms    0.4%       0.0     mesos::operator-=(mesos::Value_Ranges&, mesos::Value_Ranges const&)
>   65.0ms    0.3%       0.0     mesos::Value_Ranges::~Value_Ranges()
> 2541.0ms   12.4%    2541.0     google::protobuf::internal::GenericTypeHandler::New()
> 2541.0ms   12.4%       0.0     google::protobuf::RepeatedPtrField::TypeHandler::Type* google::protobuf::internal::RepeatedPtrFieldBase::Add::TypeHandler>()
> 2305.0ms   11.3%       0.0     google::protobuf::RepeatedPtrField::Add()
> 2305.0ms   11.3%       0.0     mesos::Value_Ranges::add_range()
> 1962.0ms    9.6%       0.0     mesos::coalesce(mesos::Value_Ranges*, mesos::Value_Range cons

[jira] [Updated] (MESOS-3021) Implement Docker Image Provisioner Reference Store

2015-08-31 Thread Marco Massenzio (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3021:
---
Sprint: Mesosphere Sprint 14, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18  (was: Mesosphere Sprint 14, Mesosphere Sprint 16, 
Mesosphere Sprint 17)

> Implement Docker Image Provisioner Reference Store
> --
>
> Key: MESOS-3021
> URL: https://issues.apache.org/jira/browse/MESOS-3021
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Lily Chen
>Assignee: Lily Chen
>  Labels: mesosphere
>
> Create a comprehensive store to look up the image layer ID associated with 
> an image and tag. Implement add, remove, save, and update operations for 
> images and their associated tags.
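
As a rough illustration of the lookup described above, the in-memory part of such a store could be a map from (image name, tag) to layer ID. The class and method names below are hypothetical, and persistence (the "save" operation) is deliberately left out of the sketch.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory reference store: (image name, tag) -> layer ID.
class ReferenceStore {
public:
  // Adds a mapping, or updates the layer ID if (name, tag) already exists.
  void put(const std::string& name,
           const std::string& tag,
           const std::string& layerId) {
    layers_[std::make_pair(name, tag)] = layerId;
  }

  // Removes a mapping; returns false if it was not present.
  bool remove(const std::string& name, const std::string& tag) {
    return layers_.erase(std::make_pair(name, tag)) > 0;
  }

  // Returns the layer ID for (name, tag), or nullptr if unknown.
  const std::string* lookup(const std::string& name,
                            const std::string& tag) const {
    auto it = layers_.find(std::make_pair(name, tag));
    return it == layers_.end() ? nullptr : &it->second;
  }

private:
  std::map<std::pair<std::string, std::string>, std::string> layers_;
};
```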




[jira] [Updated] (MESOS-2968) Implement shared copy based provisioner backend

2015-08-31 Thread Timothy Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-2968:

Sprint: Mesosphere Sprint 17

> Implement shared copy based provisioner backend
> ---
>
> Key: MESOS-2968
> URL: https://issues.apache.org/jira/browse/MESOS-2968
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Timothy Chen
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> Currently, Appc and Docker each implement their own copy backend, but most 
> of the logic is the same: the input is just an image name with its 
> dependencies.
> We can refactor both so that a single implementation is shared between the 
> two provisioners, and appc and docker can reuse the shared copy backend.




[jira] [Updated] (MESOS-2684) mesos-slave should not abort when a single task has e.g. a 'mkdir' failure

2015-08-31 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-2684:
---
Component/s: docker

> mesos-slave should not abort when a single task has e.g. a 'mkdir' failure
> --
>
> Key: MESOS-2684
> URL: https://issues.apache.org/jira/browse/MESOS-2684
> Project: Mesos
>  Issue Type: Bug
>  Components: docker, slave
>Affects Versions: 0.21.1
>Reporter: Steven Schlansker
> Attachments: mesos-slave-restart.txt
>
>
> mesos-slave can encounter a variety of problems while attempting to launch a 
> task.  If the task fails, that is unfortunate, but not the end of the world.  
> Other tasks should not be affected.
> However, if the task failure happens to trigger an assertion, the entire 
> slave comes crashing down:
> F0501 19:10:46.095464  1705 paths.hpp:342] CHECK_SOME(mkdir): No space left 
> on device Failed to create executor directory 
> '/mnt/mesos/slaves/20150327-194449-419644938-5050-1649-S71/frameworks/Singularity/executors/pp-gc-eventlog-teamcity.2015.03.31T23.55.14-1430507446029-2-10.70.8.160-us_west_2b/runs/95a54aeb-322c-48e9-9f6f-5b359bccbc01'
> Immediately afterwards, all tasks on this slave were declared TASK_KILLED 
> when mesos-slave restarted.
> Something as simple as a 'mkdir' failing is not worthy of an assertion 
> failure.
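
The behaviour asked for above, failing one task instead of aborting the process, can be sketched as follows. This is a simplified illustration under stated assumptions: `Result` is a hypothetical stand-in in the spirit of stout's `Try`, `launchTask` and `TaskState` are invented for the example, and the directory-creation step is injected as a function so the sketch does not touch the filesystem.

```cpp
#include <functional>
#include <string>

// Hypothetical, simplified result type: either success or an error message.
struct Result {
  bool ok;
  std::string error;
};

enum class TaskState { RUNNING, FAILED };

// Launches a task whose first step is creating its executor directory.
// `makeDirectory` is injected; a real slave would call into the OS here.
TaskState launchTask(
    const std::string& directory,
    const std::function<Result(const std::string&)>& makeDirectory,
    std::string* message) {
  const Result created = makeDirectory(directory);
  if (!created.ok) {
    // Rather than asserting (which would abort the whole process and take
    // every other task down with it), report a failure for this task only.
    *message = "Failed to create executor directory '" + directory +
               "': " + created.error;
    return TaskState::FAILED;
  }
  return TaskState::RUNNING;
}
```

The caller can then turn `TaskState::FAILED` plus the message into a status update for that one task, while other tasks on the same slave keep running.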




[jira] [Commented] (MESOS-3324) Resource leak issue in Mesos

2015-08-31 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723939#comment-14723939
 ] 

Benjamin Mahler commented on MESOS-3324:


We've been planning to fix this by persisting framework information in the 
registry, in the same way that we handle slave failure during master failover. 
Can't seem to find the ticket related to this.

> Resource leak issue in Mesos
> 
>
> Key: MESOS-3324
> URL: https://issues.apache.org/jira/browse/MESOS-3324
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>Priority: Critical
>
> If a framework exits while the Mesos master is down, and that framework had 
> already launched some long-running tasks before the master went down, then 
> after the master recovers those tasks keep running as orphaned tasks in the 
> Mesos cluster, and no other component can kill them later. This is a 
> resource leak in Mesos. I propose adding a timeout to the Mesos master after 
> which those orphaned tasks or executors are killed.
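
The proposed timeout could look roughly like the bookkeeping below. It is only a sketch of the idea in the description, not the approach the Mesos developers settled on: `OrphanReaper` and its methods are hypothetical names, time is passed in explicitly so the logic can be exercised without waiting, and actually killing a task is left to the caller.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical tracker for tasks whose framework has not re-registered.
class OrphanReaper {
public:
  explicit OrphanReaper(Clock::duration timeout) : timeout_(timeout) {}

  // Records when a task was first seen without a registered framework.
  // emplace() keeps the earliest timestamp if the task is already tracked.
  void markOrphaned(const std::string& taskId, Clock::time_point now) {
    orphanedSince_.emplace(taskId, now);
  }

  // The owning framework came back, so the task is no longer an orphan.
  void reclaim(const std::string& taskId) { orphanedSince_.erase(taskId); }

  // Returns, and stops tracking, tasks orphaned for at least the timeout.
  std::vector<std::string> expire(Clock::time_point now) {
    std::vector<std::string> expired;
    for (auto it = orphanedSince_.begin(); it != orphanedSince_.end();) {
      if (now - it->second >= timeout_) {
        expired.push_back(it->first);
        it = orphanedSince_.erase(it);
      } else {
        ++it;
      }
    }
    return expired;
  }

private:
  Clock::duration timeout_;
  std::map<std::string, Clock::time_point> orphanedSince_;
};
```

A master would call `expire` periodically after recovery and issue kills for the returned task IDs; frameworks that re-register in time call `reclaim` and keep their tasks.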




[jira] [Commented] (MESOS-2684) mesos-slave should not abort when a single task has e.g. a 'mkdir' failure

2015-08-31 Thread Steven Schlansker (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723936#comment-14723936
 ] 

Steven Schlansker commented on MESOS-2684:
--

Here is a similar, presumably unintentional, crash that another user reported 
on the mailing list:

tag=mesos-slave[12858]:  F0831 09:37:29.838184 12898 slave.cpp:3354] 
CHECK_SOME(os::touch(path)): Failed to open file: No such file or directory 


> mesos-slave should not abort when a single task has e.g. a 'mkdir' failure
> --
>
> Key: MESOS-2684
> URL: https://issues.apache.org/jira/browse/MESOS-2684
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.21.1
>Reporter: Steven Schlansker
> Attachments: mesos-slave-restart.txt
>
>
> mesos-slave can encounter a variety of problems while attempting to launch a 
> task.  If the task fails, that is unfortunate, but not the end of the world.  
> Other tasks should not be affected.
> However, if the task failure happens to trigger an assertion, the entire 
> slave comes crashing down:
> F0501 19:10:46.095464  1705 paths.hpp:342] CHECK_SOME(mkdir): No space left 
> on device Failed to create executor directory 
> '/mnt/mesos/slaves/20150327-194449-419644938-5050-1649-S71/frameworks/Singularity/executors/pp-gc-eventlog-teamcity.2015.03.31T23.55.14-1430507446029-2-10.70.8.160-us_west_2b/runs/95a54aeb-322c-48e9-9f6f-5b359bccbc01'
> Immediately afterwards, all tasks on this slave were declared TASK_KILLED 
> when mesos-slave restarted.
> Something as simple as a 'mkdir' failing is not worthy of an assertion 
> failure.



