[jira] [Commented] (MESOS-313) Report executor terminations to framework schedulers.

2016-08-23 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434172#comment-15434172
 ] 

Vinod Kone commented on MESOS-313:
--

It's intentional. It should've been called an `executorTerminated` message 
instead.

> Report executor terminations to framework schedulers.
> -
>
> Key: MESOS-313
> URL: https://issues.apache.org/jira/browse/MESOS-313
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Charles Reiss
>Assignee: Zhitao Li
>  Labels: mesosphere, newbie
> Fix For: 0.27.0
>
>
> The Scheduler interface has a callback for executorLost, but currently it is 
> never called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-6041) Stream ID mismatch should print out expected and received stream ID

2016-08-23 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-6041:
--
Shepherd: Vinod Kone

> Stream ID mismatch should print out expected and received stream ID
> ---
>
> Key: MESOS-6041
> URL: https://issues.apache.org/jira/browse/MESOS-6041
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API
>Affects Versions: 1.0.0
>Reporter: Zameer Manji
>Assignee: Abhishek Dasgupta
> Fix For: 1.1.0
>
>
> If you send an incorrect stream id via the HTTP API the master responds with:
> {noformat}
> The stream ID included in this request didn't match the stream ID currently 
> associated with framework ID '0dffbee9-a514-4ffa-87e1-2850dd4dcf00'`
> {noformat}
> This error message should be enhanced to include the expected and received 
> stream ID to aid debugging. 
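
One way the enhanced message could be assembled is sketched below in C++. The helper name `streamIdMismatchMessage` and its parameters are hypothetical and are not taken from the Mesos source; only the wording of the existing message comes from the report above.

```cpp
#include <string>

// Hypothetical helper (not a Mesos API): builds the mismatch error text and
// appends both the expected and the received stream ID.
std::string streamIdMismatchMessage(
    const std::string& frameworkId,
    const std::string& expectedStreamId,
    const std::string& receivedStreamId)
{
  return "The stream ID included in this request didn't match the stream ID"
         " currently associated with framework ID '" + frameworkId + "'"
         " (expected '" + expectedStreamId + "', received '" +
         receivedStreamId + "')";
}
```

With both IDs in the text, a client that logs the response can tell at a glance whether it sent a stale ID or a malformed one.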





[jira] [Created] (MESOS-6077) Implement a basic default pod executor.

2016-08-23 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-6077:
-

 Summary: Implement a basic default pod executor.
 Key: MESOS-6077
 URL: https://issues.apache.org/jira/browse/MESOS-6077
 Project: Mesos
  Issue Type: Task
Reporter: Anand Mazumdar
Assignee: Anand Mazumdar


We would like to build a basic default pod executor that, upon receiving a 
{{LAUNCH_GROUP}} event from the agent, sends a {{TASK_RUNNING}} status update. 
This would be a good building block for getting to a fully functional pod-based 
default command executor.





[jira] [Created] (MESOS-6076) Implement RunTaskGroup handler on the agent.

2016-08-23 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-6076:
-

 Summary: Implement RunTaskGroup handler on the agent.
 Key: MESOS-6076
 URL: https://issues.apache.org/jira/browse/MESOS-6076
 Project: Mesos
  Issue Type: Task
Reporter: Anand Mazumdar
Assignee: Anand Mazumdar


We need to implement the {{RunTaskGroup}} handler on the agent. This would be 
similar to the {{RunTask}} handler that already exists, except that it would 
have the relevant logic to send the task group to the executor atomically.

Ideally, we would like to reuse as much of the existing functionality from the 
{{runTask()}} handler as possible. We also need to add a state 
{{queuedTaskGroups}}, since it is needed for dispatching queued task groups to 
the executor upon registration. We should also make sure to populate 
{{queuedTasks}} with the task group information, so that users can 
query it via the `/state` endpoint, master reconciliation messages, etc.
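
The queueing behaviour described above can be illustrated with a minimal C++ sketch. This is not agent code: `ExecutorState`, `runTaskGroup`, `onRegistered`, and the `sent` field are hypothetical stand-ins, and the sketch only covers the case where groups are queued before the executor registers.

```cpp
#include <string>
#include <vector>

// Hypothetical types; these are not the Mesos agent's data structures.
using Task = std::string;
using TaskGroup = std::vector<Task>;

struct ExecutorState
{
  std::vector<Task> queuedTasks;            // Every queued task, for state queries.
  std::vector<TaskGroup> queuedTaskGroups;  // Groups awaiting dispatch.
  std::vector<TaskGroup> sent;              // One entry per message sent.
};

// Queues a group; each task is also recorded in queuedTasks so that it is
// visible to anything that inspects the queued tasks.
void runTaskGroup(ExecutorState& executor, const TaskGroup& group)
{
  executor.queuedTaskGroups.push_back(group);
  executor.queuedTasks.insert(
      executor.queuedTasks.end(), group.begin(), group.end());
}

// On registration, each queued group is delivered as a single message, so the
// executor sees either the whole group or none of it.
void onRegistered(ExecutorState& executor)
{
  for (const TaskGroup& group : executor.queuedTaskGroups) {
    executor.sent.push_back(group);
  }
  executor.queuedTaskGroups.clear();
  executor.queuedTasks.clear();
}
```

Keeping the group as one unit in `queuedTaskGroups` is what makes the atomic hand-off possible; the flat `queuedTasks` list exists only so the tasks stay visible while they wait.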





[jira] [Updated] (MESOS-6074) Master check failure if the metrics endpoint is polled soon after it starts

2016-08-23 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6074:
--
Fix Version/s: 1.1.0
   1.0.2

> Master check failure if the metrics endpoint is polled soon after it starts
> ---
>
> Key: MESOS-6074
> URL: https://issues.apache.org/jira/browse/MESOS-6074
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Priority: Critical
> Fix For: 1.1.0, 1.0.2
>
>
> We observed the following check failure
> {noformat:title=}
> F0822 22:27:10.364923 10489 owned.hpp:110] Check failed: 'get()' Must be non 
> NULL
> {noformat}
> called from 
> {{mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::_resources_total}}
>  
> The code:
> {code}
> double HierarchicalAllocatorProcess::_resources_total(
>     const string& resource)
> {
>   Option<Value::Scalar> total =
>     roleSorter->totalScalarQuantities()
>       .get<Value::Scalar>(resource);
>   return total.isSome() ? total->value() : 0;
> }
> {code}
> See 
> [github|https://github.com/apache/mesos/blob/dcc8bd7d2a942889fe473c21ab64e863d0e6a13f/src/master/allocator/mesos/hierarchical.cpp#L1804]





[jira] [Commented] (MESOS-6071) Validate that an explicitly specified DEFAULT executor has disk resources.

2016-08-23 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434160#comment-15434160
 ] 

Vinod Kone commented on MESOS-6071:
---

The current validation code checks that task group executors have non-zero disk 
resources. 
https://github.com/apache/mesos/blob/master/src/master/validation.cpp#L1080

Is this ticket supposed to track something else?

> Validate that an explicitly specified DEFAULT executor has disk resources.
> --
>
> Key: MESOS-6071
> URL: https://issues.apache.org/jira/browse/MESOS-6071
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>
> When the framework is explicitly specifying the DEFAULT executor (currently 
> only supported for task groups), we should consider validating that it 
> contains disk resources. Currently, we validate that explicitly specified 
> (DEFAULT or CUSTOM) executors only contain cpus and mem.
> We should also consider supporting the omission of DEFAULT executor resources 
> and injecting a default amount of resources. However, the difficulty here is 
> that the framework must know about these amounts since they need to be 
> available in the offer. We could expose these to the framework during 
> framework registration.





[jira] [Commented] (MESOS-6072) mesos-0.28.1 failed to launch wordpress with overlay network

2016-08-23 Thread weifeng liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434082#comment-15434082
 ] 

weifeng liu commented on MESOS-6072:


I've set the mysql memory to 1024M, but it didn't help; the bug still exists.


> mesos-0.28.1 failed to launch wordpress with overlay network
> 
>
> Key: MESOS-6072
> URL: https://issues.apache.org/jira/browse/MESOS-6072
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.28.1
> Environment: centos7  64bit
> mesos-0.28.1
> docker-1.11.0
>Reporter: weifeng liu
>Priority: Minor
>  Labels: mesosphere
>
> centos 7, x86-64bit 
> mesos 0.28.1 
> docker 1.11.0
> 1. create a docker overlay network with subnet 10.10.0.0/16; the overlay 
> network name is mynet
> 2. create a mysql app on marathon, using the following json:
> {
>   "id": "/db",
>   "cmd": null,
>   "cpus": 0.6,
>   "mem": 512,
>   "disk": 0,
>   "instances": 1,
>   "container": {
> "type": "DOCKER",
> "volumes": [],
> "docker": {
>   "image": "mysql:5.7.14",
>   "network": "BRIDGE",
>   "portMappings": [
> {
>   "containerPort": 3306,
>   "hostPort": 0,
>   "servicePort": 10001,
>   "protocol": "tcp",
>   "labels": {}
> }
>   ],
>   "privileged": false,
>   "parameters": [
> {
>   "key": "net",
>   "value": "mynet"
> },
> {
>   "key": "net-alias",
>   "value": "db"
> },
> {
>   "key": "hostname",
>   "value": "db"
> }
>   ],
>   "forcePullImage": false
> }
>   },
>   "env": {
> "MYSQL_DATABASE": "wordpress",
> "MYSQL_ROOT_PASSWORD": "password"
>   },
>   "portDefinitions": [
> {
>   "port": 10001,
>   "protocol": "tcp",
>   "labels": {}
> }
>   ]
> }
> after a while, the mysql runs as expected!
> 3. create wordpress app on marathon, and the json is:
> {
>   "id": "/server",
>   "cmd": null,
>   "cpus": 0.6,
>   "mem": 256,
>   "disk": 0,
>   "instances": 1,
>   "container": {
> "type": "DOCKER",
> "volumes": [],
> "docker": {
>   "image": "wordpress",
>   "network": "BRIDGE",
>   "portMappings": [
> {
>   "containerPort": 80,
>   "hostPort": 0,
>   "servicePort": 10004,
>   "protocol": "tcp",
>   "labels": {}
> }
>   ],
>   "privileged": false,
>   "parameters": [
> {
>   "key": "net",
>   "value": "mynet"
> },
> {
>   "key": "net-alias",
>   "value": "server"
> },
> {
>   "key": "hostname",
>   "value": "server"
> }
>   ],
>   "forcePullImage": false
> }
>   },
>   "env": {
> "WORDPRESS_DB_HOST": "db:3306",
> "WORDPRESS_DB_PASSWORD": "password"
>   },
>   "portDefinitions": [
> {
>   "port": 10004,
>   "protocol": "tcp",
>   "labels": {}
> }
>   ]
> }
> Once the wordpress app starts, the mysql app is shut down and then restarted by 
> marathon. Meanwhile, wordpress cannot connect to the db, eventually fails, and 
> is restarted by marathon. This process will be repeated 
> forever.
> From the mysql log, it seems that the mysql process is killed. The error log is:
> 2016-08-23T06:35:18.271876Z 0 [Note] Event Scheduler: Loaded 0 events
> 2016-08-23T06:35:18.272052Z 0 [Note] mysqld: ready for connections.
> Version: '5.7.14'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
> 06:36:11 UTC - mysqld got signal 6 ;
> This could be because you hit a bug. It is also possible that this binary
> or one of the libraries it was linked against is corrupt, improperly built,
> or misconfigured. This error can also be caused by malfunctioning hardware.
> Attempting to collect some information that could help diagnose the problem.
> As this is a crash and something is definitely wrong, the information
> collection process might fail.
> key_buffer_size=8388608
> read_buffer_size=131072
> max_used_connections=1
> max_threads=151
> thread_count=1
> connection_count=1
> It is possible that mysqld could use up to
> key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68190 K  bytes of memory
> Hope that's ok; if not, decrease some variables in the equation.
> Thread pointer: 0x7f9b8c000ae0
> If the wordpress app is started directly with the docker command rather than by 
> marathon, it starts up successfully without any error!
> It's pretty weird, and it has puzzled me for quite a long time.
> The above bug can be reproduced with mysql:5.7.14 and later tags; 
> mysql:5.7.13 and mysql:5.6 run successfully.  
> Anyone can 

[jira] [Commented] (MESOS-5227) Implement HTTP Docker Executor that uses the Executor Library

2016-08-23 Thread Yong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433858#comment-15433858
 ] 

Yong Tang commented on MESOS-5227:
--

The review requests have been updated again. The new RRs are located at:
https://reviews.apache.org/r/51351/
https://reviews.apache.org/r/51352/

Please discard the old ones.

> Implement HTTP Docker Executor that uses the Executor Library
> -
>
> Key: MESOS-5227
> URL: https://issues.apache.org/jira/browse/MESOS-5227
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Yong Tang
>
> Similar to what we did with the HTTP command executor in MESOS-3558, we should 
> have an HTTP Docker executor that can speak the v1 Executor API.





[jira] [Assigned] (MESOS-6075) Avoid libprocess functions in `mesos-containerizer launch`.

2016-08-23 Thread Gilbert Song (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gilbert Song reassigned MESOS-6075:
---

Assignee: Gilbert Song

> Avoid libprocess functions in `mesos-containerizer launch`.
> ---
>
> Key: MESOS-6075
> URL: https://issues.apache.org/jira/browse/MESOS-6075
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Affects Versions: 1.1.0
>Reporter: Jie Yu
>Assignee: Gilbert Song
>
> Calling libprocess functions in `mesos-containerizer launch` will cause 
> libprocess to be initialized. That has some performance impact, as it will 
> create N threads (N == #cores).
> Given that `mesos-containerizer launch` can be blocking, we should avoid 
> using libprocess methods for performance.





[jira] [Created] (MESOS-6075) Avoid libprocess functions in `mesos-containerizer launch`.

2016-08-23 Thread Jie Yu (JIRA)
Jie Yu created MESOS-6075:
-

 Summary: Avoid libprocess functions in `mesos-containerizer 
launch`.
 Key: MESOS-6075
 URL: https://issues.apache.org/jira/browse/MESOS-6075
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Affects Versions: 1.1.0
Reporter: Jie Yu


Calling libprocess functions in `mesos-containerizer launch` will cause 
libprocess to be initialized. That has some performance impact, as it will 
create N threads (N == #cores).

Given that `mesos-containerizer launch` can be blocking, we should avoid using 
libprocess methods for performance.





[jira] [Updated] (MESOS-6074) Master check failure if the metrics endpoint is polled soon after it starts

2016-08-23 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6074:
--
Description: 
We observed the following check failure

{noformat:title=}
F0822 22:27:10.364923 10489 owned.hpp:110] Check failed: 'get()' Must be non 
NULL
{noformat}

called from 
{{mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::_resources_total}}
 

The code:
{code}
double HierarchicalAllocatorProcess::_resources_total(
    const string& resource)
{
  Option<Value::Scalar> total =
    roleSorter->totalScalarQuantities()
      .get<Value::Scalar>(resource);

  return total.isSome() ? total->value() : 0;
}
{code}

See 
[github|https://github.com/apache/mesos/blob/dcc8bd7d2a942889fe473c21ab64e863d0e6a13f/src/master/allocator/mesos/hierarchical.cpp#L1804]

  was:
We observed the following check failure

{noformat:title=}
F0822 22:27:10.364923 10489 owned.hpp:110] Check failed: 'get()' Must be non 
NULL
{noformat}

called from 
{{mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::_re
sources_total}} 

The code:
{code}
double HierarchicalAllocatorProcess::_resources_total(
    const string& resource)
{
  Option<Value::Scalar> total =
    roleSorter->totalScalarQuantities()
      .get<Value::Scalar>(resource);

  return total.isSome() ? total->value() : 0;
}
{code}

See 
[github|https://github.com/apache/mesos/blob/dcc8bd7d2a942889fe473c21ab64e863d0e6a13f/src/master/allocator/mesos/hierarchical.cpp#L1804]


> Master check failure if the metrics endpoint is polled soon after it starts
> ---
>
> Key: MESOS-6074
> URL: https://issues.apache.org/jira/browse/MESOS-6074
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Priority: Critical
>
> We observed the following check failure
> {noformat:title=}
> F0822 22:27:10.364923 10489 owned.hpp:110] Check failed: 'get()' Must be non 
> NULL
> {noformat}
> called from 
> {{mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::_resources_total}}
>  
> The code:
> {code}
> double HierarchicalAllocatorProcess::_resources_total(
>     const string& resource)
> {
>   Option<Value::Scalar> total =
>     roleSorter->totalScalarQuantities()
>       .get<Value::Scalar>(resource);
>   return total.isSome() ? total->value() : 0;
> }
> {code}
> See 
> [github|https://github.com/apache/mesos/blob/dcc8bd7d2a942889fe473c21ab64e863d0e6a13f/src/master/allocator/mesos/hierarchical.cpp#L1804]





[jira] [Commented] (MESOS-6074) Master check failure if the metrics endpoint is polled soon after it starts

2016-08-23 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433641#comment-15433641
 ] 

Yan Xu commented on MESOS-6074:
---

The problem seems to be this: since 1.0 we have introduced a few gauges, including 
this one, that depend on the sorters, which are Owned pointers. These pointers 
are not initialized until 
[Master::initialize|https://github.com/apache/mesos/blob/dcc8bd7d2a942889fe473c21ab64e863d0e6a13f/src/master/master.cpp#L699].
 However, the metrics, being hosted by a separate actor, can be accessed before 
the master is initialized, in which case the Owned sorter pointers are still 
uninitialized when the gauge is called.

I don't see a problem initializing the sorters in the 
[HierarchicalAllocatorProcess 
constructor|https://github.com/apache/mesos/blob/dcc8bd7d2a942889fe473c21ab64e863d0e6a13f/src/master/allocator/mesos/hierarchical.hpp#L74-L85]?

/cc [~bmahler] [~jvanremoortere] [~alexr] 
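
The ordering problem and the proposed fix can be illustrated with a small self-contained C++ sketch. It is not Mesos code: `Sorter`, `Allocator`, and `resourcesTotal` are hypothetical stand-ins, and `std::unique_ptr` takes the place of `Owned`. The point it shows is that creating the sorter in the constructor makes the gauge callback safe to call before `initialize()` runs.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the role sorter; holds scalar totals by name.
struct Sorter
{
  std::map<std::string, double> totals;
};

// Hypothetical stand-in for the allocator process.
class Allocator
{
public:
  // The sorter is created here, so it is never null afterwards, even if a
  // metrics callback fires before initialize() is called.
  Allocator() : roleSorter(new Sorter()) {}

  // Stand-in for the later initialization step that populates state.
  void initialize(const std::map<std::string, double>& totals)
  {
    roleSorter->totals = totals;
  }

  // Gauge callback: returns 0 when the resource is not (yet) known.
  double resourcesTotal(const std::string& resource) const
  {
    auto it = roleSorter->totals.find(resource);
    return it != roleSorter->totals.end() ? it->second : 0.0;
  }

private:
  std::unique_ptr<Sorter> roleSorter;
};
```

An alternative would be to keep the late initialization and null-check the pointer inside the gauge, but that check would have to be repeated in every gauge that touches a sorter.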

> Master check failure if the metrics endpoint is polled soon after it starts
> ---
>
> Key: MESOS-6074
> URL: https://issues.apache.org/jira/browse/MESOS-6074
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Priority: Critical
>
> We observed the following check failure
> {noformat:title=}
> F0822 22:27:10.364923 10489 owned.hpp:110] Check failed: 'get()' Must be non 
> NULL
> {noformat}
> called from 
> {{mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::_resources_total}} 
> The code:
> {code}
> double HierarchicalAllocatorProcess::_resources_total(
>     const string& resource)
> {
>   Option<Value::Scalar> total =
>     roleSorter->totalScalarQuantities()
>       .get<Value::Scalar>(resource);
>   return total.isSome() ? total->value() : 0;
> }
> {code}
> See 
> [github|https://github.com/apache/mesos/blob/dcc8bd7d2a942889fe473c21ab64e863d0e6a13f/src/master/allocator/mesos/hierarchical.cpp#L1804]





[jira] [Created] (MESOS-6074) Master check failure if the metrics endpoint is polled soon after it starts

2016-08-23 Thread Yan Xu (JIRA)
Yan Xu created MESOS-6074:
-

 Summary: Master check failure if the metrics endpoint is polled 
soon after it starts
 Key: MESOS-6074
 URL: https://issues.apache.org/jira/browse/MESOS-6074
 Project: Mesos
  Issue Type: Bug
  Components: master
Affects Versions: 1.0.0
Reporter: Yan Xu
Priority: Critical


We observed the following check failure

{noformat:title=}
F0822 22:27:10.364923 10489 owned.hpp:110] Check failed: 'get()' Must be non 
NULL
{noformat}

called from 
{{mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::_resources_total}} 

The code:
{code}
double HierarchicalAllocatorProcess::_resources_total(
    const string& resource)
{
  Option<Value::Scalar> total =
    roleSorter->totalScalarQuantities()
      .get<Value::Scalar>(resource);

  return total.isSome() ? total->value() : 0;
}
{code}

See 
[github|https://github.com/apache/mesos/blob/dcc8bd7d2a942889fe473c21ab64e863d0e6a13f/src/master/allocator/mesos/hierarchical.cpp#L1804]





[jira] [Created] (MESOS-6073) Update the streaming function for ContainerID to be nesting aware.

2016-08-23 Thread Jie Yu (JIRA)
Jie Yu created MESOS-6073:
-

 Summary: Update the streaming function for ContainerID to be 
nesting aware.
 Key: MESOS-6073
 URL: https://issues.apache.org/jira/browse/MESOS-6073
 Project: Mesos
  Issue Type: Task
Reporter: Jie Yu
Assignee: Gilbert Song


We need to print the hierarchical structure of the nested container. For 
instance: x/y/zz
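
A minimal C++ sketch of such a nesting-aware streaming function follows. `NestedId` is a hypothetical model (a value plus an optional parent) and not the Mesos `ContainerID` protobuf; `stringify` here is a local helper defined only for the example.

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical model of a nested container ID: a value plus an optional
// parent. This is not the Mesos ContainerID protobuf.
struct NestedId
{
  std::string value;
  std::shared_ptr<NestedId> parent;  // Null for a top-level container.
};

// Streams the ID as "root/child/grandchild" by printing ancestors first.
std::ostream& operator<<(std::ostream& stream, const NestedId& id)
{
  if (id.parent) {
    stream << *id.parent << "/";
  }
  return stream << id.value;
}

// Convenience wrapper that renders the ID to a string.
std::string stringify(const NestedId& id)
{
  std::ostringstream out;
  out << id;
  return out.str();
}
```

For a container `zz` whose parent is `y` and whose grandparent is `x`, the recursion prints the outermost ancestor first and yields the `x/y/zz` form given above.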





[jira] [Commented] (MESOS-5070) Introduce more flexible subprocess interface for child options.

2016-08-23 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433347#comment-15433347
 ] 

Jie Yu commented on MESOS-5070:
---

I made some comments on your first comment. Let me know if that makes sense to 
you.

> Introduce more flexible subprocess interface for child options.
> ---
>
> Key: MESOS-5070
> URL: https://issues.apache.org/jira/browse/MESOS-5070
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: tech-debt
>
> We introduced a number of parameters to the subprocess interface with 
> MESOS-5049.
> Adding all options explicitly to the subprocess interface makes it 
> inflexible. 
> We should investigate a flexible option that still prevents arbitrary code 
> from being executed.





[jira] [Commented] (MESOS-5070) Introduce more flexible subprocess interface for child options.

2016-08-23 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433345#comment-15433345
 ] 

Jie Yu commented on MESOS-5070:
---

I'd rename Hook -> ParentHook

Also, why is Watchdog not a ChildHook? I also suggest we rename Watchdog -> 
Supervised

> Introduce more flexible subprocess interface for child options.
> ---
>
> Key: MESOS-5070
> URL: https://issues.apache.org/jira/browse/MESOS-5070
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: tech-debt
>
> We introduced a number of parameters to the subprocess interface with 
> MESOS-5049.
> Adding all options explicitly to the subprocess interface makes it 
> inflexible. 
> We should investigate a flexible option that still prevents arbitrary code 
> from being executed.





[jira] [Commented] (MESOS-5070) Introduce more flexible subprocess interface for child options.

2016-08-23 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433343#comment-15433343
 ] 

Joerg Schad commented on MESOS-5070:


[~jieyu] I am happy to reopen/rebase this if you agree with the overall 
structure


> Introduce more flexible subprocess interface for child options.
> ---
>
> Key: MESOS-5070
> URL: https://issues.apache.org/jira/browse/MESOS-5070
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: tech-debt
>
> We introduced a number of parameters to the subprocess interface with 
> MESOS-5049.
> Adding all options explicitly to the subprocess interface makes it 
> inflexible. 
> We should investigate a flexible option that still prevents arbitrary code 
> from being executed.





[jira] [Commented] (MESOS-5070) Introduce more flexible subprocess interface for child options.

2016-08-23 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433321#comment-15433321
 ] 

Jie Yu commented on MESOS-5070:
---

[~joerg84] I saw the patches were discarded. Is this still reviewable?

Would love to get this resolved.

> Introduce more flexible subprocess interface for child options.
> ---
>
> Key: MESOS-5070
> URL: https://issues.apache.org/jira/browse/MESOS-5070
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: tech-debt
>
> We introduced a number of parameters to the subprocess interface with 
> MESOS-5049.
> Adding all options explicitly to the subprocess interface makes it 
> inflexible. 
> We should investigate a flexible option that still prevents arbitrary code 
> from being executed.





[jira] [Updated] (MESOS-5070) Introduce more flexible subprocess interface for child options.

2016-08-23 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5070:
--
Labels: tech-debt  (was: )

> Introduce more flexible subprocess interface for child options.
> ---
>
> Key: MESOS-5070
> URL: https://issues.apache.org/jira/browse/MESOS-5070
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: tech-debt
>
> We introduced a number of parameters to the subprocess interface with 
> MESOS-5049.
> Adding all options explicitly to the subprocess interface makes it 
> inflexible. 
> We should investigate a flexible option that still prevents arbitrary code 
> from being executed.





[jira] [Updated] (MESOS-5807) Support job_object in subprocess on Windows.

2016-08-23 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5807:
--
Assignee: Joseph Wu

> Support job_object in subprocess on Windows.
> 
>
> Key: MESOS-5807
> URL: https://issues.apache.org/jira/browse/MESOS-5807
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Jie Yu
>Assignee: Joseph Wu
>
> Currently, in the command executor, we use different code paths for POSIX and 
> Windows:
> {noformat}
> #ifndef __WINDOWS__
>     pid = launchTaskPosix(
>         command,
>         launcherDir,
>         user,
>         rootfs,
>         sandboxDirectory,
>         workingDirectory);
> #else
>     // A Windows process is started using the `CREATE_SUSPENDED` flag
>     // and is part of a job object. While the process handle is kept
>     // open the reap function will work.
>     PROCESS_INFORMATION processInformation = launchTaskWindows(
>         command,
>         rootfs);
>     pid = processInformation.dwProcessId;
>     ::ResumeThread(processInformation.hThread);
>     CloseHandle(processInformation.hThread);
>     processHandle = processInformation.hProcess;
> #endif
> {noformat}
> During a recent refactor (MESOS-5753), the command executor's POSIX path was 
> changed to reuse the `mesos-containerizer launch` helper to launch user tasks.
> If we were able to support job_object in Subprocess, we could get rid of 
> this divergence in the command executor. This would also allow us to support 
> custom executors on Windows.





[jira] [Commented] (MESOS-6041) Stream ID mismatch should print out expected and received stream ID

2016-08-23 Thread Abhishek Dasgupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433260#comment-15433260
 ] 

Abhishek Dasgupta commented on MESOS-6041:
--

Trivial Patch: https://reviews.apache.org/r/51342/

> Stream ID mismatch should print out expected and received stream ID
> ---
>
> Key: MESOS-6041
> URL: https://issues.apache.org/jira/browse/MESOS-6041
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API
>Affects Versions: 1.0.0
>Reporter: Zameer Manji
>Assignee: Abhishek Dasgupta
>
> If you send an incorrect stream id via the HTTP API the master responds with:
> {noformat}
> The stream ID included in this request didn't match the stream ID currently 
> associated with framework ID '0dffbee9-a514-4ffa-87e1-2850dd4dcf00'`
> {noformat}
> This error message should be enhanced to include the expected and received 
> stream ID to aid debugging. 





[jira] [Commented] (MESOS-5763) Task stuck in fetching is not cleaned up after --executor_registration_timeout.

2016-08-23 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433168#comment-15433168
 ] 

Yan Xu commented on MESOS-5763:
---

[~megha.sharma] contributed a test for this.

{noformat:title=}
commit a064505e411fe78a257e9b336a888f1eeddaa949
Author: Megha Sharma 
Date:   Mon Aug 22 14:51:07 2016 -0700

Added test to simulate slow/unresponsive fetch.

Added test to simulate the scenario of slow/unresponsive HDFS leading
to executor register timeout and verify that slave gets notified of the
failure.

Review: https://reviews.apache.org/r/5/
{noformat}

> Task stuck in fetching is not cleaned up after 
> --executor_registration_timeout.
> ---
>
> Key: MESOS-5763
> URL: https://issues.apache.org/jira/browse/MESOS-5763
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 0.28.0, 1.0.0, 0.29.0
>Reporter: Yan Xu
>Assignee: Yan Xu
>Priority: Blocker
> Fix For: 0.28.3, 1.0.0, 0.27.4
>
>
> When the fetching process hangs forever due to reasons such as HDFS issues, 
> Mesos containerizer would attempt to destroy the container and kill the 
> executor after {{--executor_registration_timeout}}. However this reliably 
> fails for us: the executor would be killed by the launcher destroy and the 
> container would be destroyed but the agent would never find out that the 
> executor is terminated thus leaving the task in the STAGING state forever.





[jira] [Updated] (MESOS-5951) Remove "strict registry" code

2016-08-23 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-5951:
---
Labels: mesosphere  (was: )

> Remove "strict registry" code
> -
>
> Key: MESOS-5951
> URL: https://issues.apache.org/jira/browse/MESOS-5951
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> Once {{PARTITION_AWARE}} frameworks are supported, we should eventually 
> remove the code that supports the "non-strict" semantics in the master. That 
> is:
> 1. The master will be "strict" in Mesos 1.1, in the sense that master 
> behavior will always reflect the content of the registry and will not change 
> depending on whether the master has failed over. The exception here is that 
> for non-PARTITION_AWARE frameworks, we will _only_ kill such tasks on a 
> reregistering agent if the master hasn't failed over in the meantime. i.e., 
> we'll remain backwards compatible with the previous "non-strict" semantics 
> that old frameworks might depend on.
> 2. The "strict" semantics will be less problematic, because the master will 
> no longer be killing tasks and shutting down agents.





[jira] [Commented] (MESOS-6072) mesos-0.28.1 failed to launch wordpress with overlay network

2016-08-23 Thread Stéphane Cottin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432607#comment-15432607
 ] 

Stéphane Cottin commented on MESOS-6072:


{{mysqld got signal 6}} = not enough memory

When you launch with docker run without specific options, memory is not limited.
You should be able to reproduce the same behavior with docker run by adding the 
{{-m 256M}} option.

Raising the limit to 512M should fix this.

> mesos-0.28.1 failed to launch wordpress with overlay network
> 
>
> Key: MESOS-6072
> URL: https://issues.apache.org/jira/browse/MESOS-6072
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.28.1
> Environment: centos7  64bit
> mesos-0.28.1
> docker-1.11.0
>Reporter: weifeng liu
>Priority: Minor
>  Labels: mesosphere
>
> centos 7, x86-64bit 
> mesos 0.28.1 
> docker 1.11.0
> 1. create a docker overlay network with subnet 10.10.0.0/16; the overlay 
> network name is mynet
> 2. create a mysql app on marathon, using the following json:
> {
>   "id": "/db",
>   "cmd": null,
>   "cpus": 0.6,
>   "mem": 512,
>   "disk": 0,
>   "instances": 1,
>   "container": {
>     "type": "DOCKER",
>     "volumes": [],
>     "docker": {
>       "image": "mysql:5.7.14",
>       "network": "BRIDGE",
>       "portMappings": [
>         {
>           "containerPort": 3306,
>           "hostPort": 0,
>           "servicePort": 10001,
>           "protocol": "tcp",
>           "labels": {}
>         }
>       ],
>       "privileged": false,
>       "parameters": [
>         {
>           "key": "net",
>           "value": "mynet"
>         },
>         {
>           "key": "net-alias",
>           "value": "db"
>         },
>         {
>           "key": "hostname",
>           "value": "db"
>         }
>       ],
>       "forcePullImage": false
>     }
>   },
>   "env": {
>     "MYSQL_DATABASE": "wordpress",
>     "MYSQL_ROOT_PASSWORD": "password"
>   },
>   "portDefinitions": [
>     {
>       "port": 10001,
>       "protocol": "tcp",
>       "labels": {}
>     }
>   ]
> }
> After a while, mysql runs as expected.
> 3. Create a wordpress app on Marathon with the following JSON:
> {
>   "id": "/server",
>   "cmd": null,
>   "cpus": 0.6,
>   "mem": 256,
>   "disk": 0,
>   "instances": 1,
>   "container": {
>     "type": "DOCKER",
>     "volumes": [],
>     "docker": {
>       "image": "wordpress",
>       "network": "BRIDGE",
>       "portMappings": [
>         {
>           "containerPort": 80,
>           "hostPort": 0,
>           "servicePort": 10004,
>           "protocol": "tcp",
>           "labels": {}
>         }
>       ],
>       "privileged": false,
>       "parameters": [
>         {
>           "key": "net",
>           "value": "mynet"
>         },
>         {
>           "key": "net-alias",
>           "value": "server"
>         },
>         {
>           "key": "hostname",
>           "value": "server"
>         }
>       ],
>       "forcePullImage": false
>     }
>   },
>   "env": {
>     "WORDPRESS_DB_HOST": "db:3306",
>     "WORDPRESS_DB_PASSWORD": "password"
>   },
>   "portDefinitions": [
>     {
>       "port": 10004,
>       "protocol": "tcp",
>       "labels": {}
>     }
>   ]
> }
> Once the wordpress app has started, the mysql app is shut down and then 
> restarted by Marathon. Meanwhile, wordpress cannot connect to the db, 
> eventually fails, and is restarted by Marathon as well. This cycle repeats 
> forever.
> From the mysql log, it seems that the mysql process is killed. The error log is:
> 2016-08-23T06:35:18.271876Z 0 [Note] Event Scheduler: Loaded 0 events
> 2016-08-23T06:35:18.272052Z 0 [Note] mysqld: ready for connections.
> Version: '5.7.14'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
> 06:36:11 UTC - mysqld got signal 6 ;
> This could be because you hit a bug. It is also possible that this binary
> or one of the libraries it was linked against is corrupt, improperly built,
> or misconfigured. This error can also be caused by malfunctioning hardware.
> Attempting to collect some information that could help diagnose the problem.
> As this is a crash and something is definitely wrong, the information
> collection process might fail.
> key_buffer_size=8388608
> read_buffer_size=131072
> max_used_connections=1
> max_threads=151
> thread_count=1
> connection_count=1
> It is possible that mysqld could use up to
> key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68190 K  bytes of memory
> Hope that's ok; if not, decrease some variables in the equation.
> Thread pointer: 0x7f9b8c000ae0
> If the wordpress app is started directly with the docker command instead of 
> by Marathon, it starts up successfully without any error.
> it

[jira] [Created] (MESOS-6072) mesos-0.28.1 failed to launch wordpress with overlay network

2016-08-23 Thread weifeng liu (JIRA)
weifeng liu created MESOS-6072:
--

 Summary: mesos-0.28.1 failed to launch wordpress with overlay 
network
 Key: MESOS-6072
 URL: https://issues.apache.org/jira/browse/MESOS-6072
 Project: Mesos
  Issue Type: Bug
  Components: docker
Affects Versions: 0.28.1
 Environment: centos7  64bit
mesos-0.28.1
docker-1.11.0
Reporter: weifeng liu
Priority: Minor


centos 7, x86-64bit 
mesos 0.28.1 
docker 1.11.0

1. Create a docker overlay network named mynet with subnet 10.10.0.0/16.
2. Create a mysql app on Marathon with the following JSON:
{
  "id": "/db",
  "cmd": null,
  "cpus": 0.6,
  "mem": 512,
  "disk": 0,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "mysql:5.7.14",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 3306,
          "hostPort": 0,
          "servicePort": 10001,
          "protocol": "tcp",
          "labels": {}
        }
      ],
      "privileged": false,
      "parameters": [
        {
          "key": "net",
          "value": "mynet"
        },
        {
          "key": "net-alias",
          "value": "db"
        },
        {
          "key": "hostname",
          "value": "db"
        }
      ],
      "forcePullImage": false
    }
  },
  "env": {
    "MYSQL_DATABASE": "wordpress",
    "MYSQL_ROOT_PASSWORD": "password"
  },
  "portDefinitions": [
    {
      "port": 10001,
      "protocol": "tcp",
      "labels": {}
    }
  ]
}

After a while, mysql runs as expected.
3. Create a wordpress app on Marathon with the following JSON:
{
  "id": "/server",
  "cmd": null,
  "cpus": 0.6,
  "mem": 256,
  "disk": 0,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "wordpress",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 0,
          "servicePort": 10004,
          "protocol": "tcp",
          "labels": {}
        }
      ],
      "privileged": false,
      "parameters": [
        {
          "key": "net",
          "value": "mynet"
        },
        {
          "key": "net-alias",
          "value": "server"
        },
        {
          "key": "hostname",
          "value": "server"
        }
      ],
      "forcePullImage": false
    }
  },
  "env": {
    "WORDPRESS_DB_HOST": "db:3306",
    "WORDPRESS_DB_PASSWORD": "password"
  },
  "portDefinitions": [
    {
      "port": 10004,
      "protocol": "tcp",
      "labels": {}
    }
  ]
}

Once the wordpress app has started, the mysql app is shut down and then 
restarted by Marathon. Meanwhile, wordpress cannot connect to the db, 
eventually fails, and is restarted by Marathon as well. This cycle repeats 
forever.

From the mysql log, it seems that the mysql process is killed. The error log is:

2016-08-23T06:35:18.271876Z 0 [Note] Event Scheduler: Loaded 0 events
2016-08-23T06:35:18.272052Z 0 [Note] mysqld: ready for connections.
Version: '5.7.14'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
06:36:11 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=1
max_threads=151
thread_count=1
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68190 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f9b8c000ae0



If the wordpress app is started directly with the docker command instead of 
by Marathon, it starts up successfully without any error.

It's pretty weird, and it has puzzled me for quite a long time.

The bug can be reproduced with the mysql:5.7.14 tag and later; mysql:5.7.13 
and mysql:5.6 run successfully.

Can anyone help me out?





[jira] [Updated] (MESOS-6013) Use readdir instead of readdir_r

2016-08-23 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-6013:
---
Shepherd: Alexander Rukletsov

> Use readdir instead of readdir_r
> 
>
> Key: MESOS-6013
> URL: https://issues.apache.org/jira/browse/MESOS-6013
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
> Environment: Linux archlinux.vagrant.vm 4.6.4-1-ARCH #1 SMP PREEMPT 
> Mon Jul 11 19:12:32 CEST 2016 x86_64 GNU/Linux
>Reporter: Neil Conway
>  Labels: mesosphere
>
> {{readdir_r}} is deprecated in recent versions of glibc 
> (https://sourceware.org/ml/libc-alpha/2016-02/msg00093.html). As a result, 
> Mesos doesn't build on recent Arch Linux:
> {noformat}
> /bin/sh ../libtool  --tag=CXX   --mode=compile ccache g++ 
> -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
> -DPACKAGE_VERSION=\"1.1.0\" -DPACKAGE_STRING=\"mesos\ 1.1.0\" 
> -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
> -DVERSION=\"1.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
> -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
> -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
> -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 
> -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 
> -DHAVE_LIBAPR_1=1 -DHAVE_LIBCURL=1 -DMESOS_HAS_JAVA=1 -DHAVE_LIBSASL2=1 
> -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 
> -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBZ=1 -I. -I../../mesos/src   -Wall -Werror 
> -Wsign-compare -DLIBDIR=\"/usr/local/lib\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
> -DPKGDATADIR=\"/usr/local/share/mesos\" 
> -DPKGMODULEDIR=\"/usr/local/lib/mesos/modules\" -I../../mesos/include 
> -I../include -I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS 
> -isystem ../3rdparty/boost-1.53.0 -I../3rdparty/elfio-3.1 
> -I../3rdparty/glog-0.3.3/src -I../3rdparty/leveldb-1.4/include 
> -I../../mesos/3rdparty/libprocess/include -I../3rdparty/nvml-352.79 
> -I../3rdparty/picojson-1.3.0 -I../3rdparty/protobuf-2.6.1/src 
> -I../../mesos/3rdparty/stout/include 
> -I../3rdparty/zookeeper-3.4.8/src/c/include 
> -I../3rdparty/zookeeper-3.4.8/src/c/generated -DHAS_AUTHENTICATION=1 
> -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0  
> -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
> appc/libmesos_no_3rdparty_la-spec.lo -MD -MP -MF 
> appc/.deps/libmesos_no_3rdparty_la-spec.Tpo -c -o 
> appc/libmesos_no_3rdparty_la-spec.lo `test -f 'appc/spec.cpp' || echo 
> '../../mesos/src/'`appc/spec.cpp
> libtool: compile:  ccache g++ -DPACKAGE_NAME=\"mesos\" 
> -DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"1.1.0\" 
> "-DPACKAGE_STRING=\"mesos 1.1.0\"" -DPACKAGE_BUGREPORT=\"\" 
> -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" -DVERSION=\"1.1.0\" -DSTDC_HEADERS=1 
> -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 
> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 
> -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 
> -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 
> -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_LIBCURL=1 -DMESOS_HAS_JAVA=1 
> -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 
> -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBZ=1 -I. 
> -I../../mesos/src -Wall -Werror -Wsign-compare -DLIBDIR=\"/usr/local/lib\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
> -DPKGDATADIR=\"/usr/local/share/mesos\" 
> -DPKGMODULEDIR=\"/usr/local/lib/mesos/modules\" -I../../mesos/include 
> -I../include -I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS 
> -isystem ../3rdparty/boost-1.53.0 -I../3rdparty/elfio-3.1 
> -I../3rdparty/glog-0.3.3/src -I../3rdparty/leveldb-1.4/include 
> -I../../mesos/3rdparty/libprocess/include -I../3rdparty/nvml-352.79 
> -I../3rdparty/picojson-1.3.0 -I../3rdparty/protobuf-2.6.1/src 
> -I../../mesos/3rdparty/stout/include 
> -I../3rdparty/zookeeper-3.4.8/src/c/include 
> -I../3rdparty/zookeeper-3.4.8/src/c/generated -DHAS_AUTHENTICATION=1 
> -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 
> -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
> appc/libmesos_no_3rdparty_la-spec.lo -MD -MP -MF 
> appc/.deps/libmesos_no_3rdparty_la-spec.Tpo -c ../../mesos/src/appc/spec.cpp  
> -fPIC -DPIC -o appc/.libs/libmesos_no_3rdparty_la-spec.o
> In file included from ../../mesos/3rdparty/stout/include/stout/os.hpp:52:0,
>  from ../../mesos/src/appc/spec.cpp:17:
> ../../mesos/3rdparty/stout/include/stout/os/ls.hpp: In function 
> ‘Try > > os::ls(const 
> string&)’:
> ../../mesos/3rdparty/stout/include/stout/os/ls.hpp:56:19: error: ‘int 
> readdir_r(DIR*, dirent*, dirent**)’ 
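
For context, the change the issue title asks for might look roughly like the sketch below. It lists a directory with plain `readdir`, using `errno` to tell end-of-directory apart from an error. The function name `listDirectory` and the use of exceptions for error reporting are assumptions made for illustration; this is not the actual stout `os::ls` implementation.

```cpp
#include <dirent.h>

#include <cerrno>
#include <cstring>
#include <list>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical helper (not the stout implementation): returns the names of
// all entries in `directory`, skipping "." and "..".
std::list<std::string> listDirectory(const std::string& directory)
{
  // The unique_ptr closes the directory stream on every exit path.
  std::unique_ptr<DIR, int (*)(DIR*)> dir(
      opendir(directory.c_str()), closedir);

  if (dir == nullptr) {
    throw std::runtime_error(
        "opendir failed: " + std::string(std::strerror(errno)));
  }

  std::list<std::string> result;

  // readdir() returns nullptr both at end-of-directory and on error, so
  // errno is cleared before each call and inspected after the loop.
  errno = 0;
  while (struct dirent* entry = readdir(dir.get())) {
    const std::string name = entry->d_name;
    if (name != "." && name != "..") {
      result.push_back(name);
    }
    errno = 0;
  }

  if (errno != 0) {
    throw std::runtime_error(
        "readdir failed: " + std::string(std::strerror(errno)));
  }

  return result;
}
```

Each call works on its own directory stream, which is the case in which glibc documents `readdir` as safe to call from multiple threads.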

[jira] [Assigned] (MESOS-6013) Use readdir instead of readdir_r

2016-08-23 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway reassigned MESOS-6013:
--

Assignee: Neil Conway

> Use readdir instead of readdir_r
> 
>
> Key: MESOS-6013
> URL: https://issues.apache.org/jira/browse/MESOS-6013
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
> Environment: Linux archlinux.vagrant.vm 4.6.4-1-ARCH #1 SMP PREEMPT 
> Mon Jul 11 19:12:32 CEST 2016 x86_64 GNU/Linux
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> {{readdir_r}} is deprecated in recent versions of glibc 
> (https://sourceware.org/ml/libc-alpha/2016-02/msg00093.html). As a result, 
> Mesos doesn't build on recent Arch Linux:
> {noformat}
> /bin/sh ../libtool  --tag=CXX   --mode=compile ccache g++ 
> -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
> -DPACKAGE_VERSION=\"1.1.0\" -DPACKAGE_STRING=\"mesos\ 1.1.0\" 
> -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
> -DVERSION=\"1.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
> -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
> -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
> -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 
> -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 
> -DHAVE_LIBAPR_1=1 -DHAVE_LIBCURL=1 -DMESOS_HAS_JAVA=1 -DHAVE_LIBSASL2=1 
> -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 
> -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBZ=1 -I. -I../../mesos/src   -Wall -Werror 
> -Wsign-compare -DLIBDIR=\"/usr/local/lib\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
> -DPKGDATADIR=\"/usr/local/share/mesos\" 
> -DPKGMODULEDIR=\"/usr/local/lib/mesos/modules\" -I../../mesos/include 
> -I../include -I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS 
> -isystem ../3rdparty/boost-1.53.0 -I../3rdparty/elfio-3.1 
> -I../3rdparty/glog-0.3.3/src -I../3rdparty/leveldb-1.4/include 
> -I../../mesos/3rdparty/libprocess/include -I../3rdparty/nvml-352.79 
> -I../3rdparty/picojson-1.3.0 -I../3rdparty/protobuf-2.6.1/src 
> -I../../mesos/3rdparty/stout/include 
> -I../3rdparty/zookeeper-3.4.8/src/c/include 
> -I../3rdparty/zookeeper-3.4.8/src/c/generated -DHAS_AUTHENTICATION=1 
> -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0  
> -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
> appc/libmesos_no_3rdparty_la-spec.lo -MD -MP -MF 
> appc/.deps/libmesos_no_3rdparty_la-spec.Tpo -c -o 
> appc/libmesos_no_3rdparty_la-spec.lo `test -f 'appc/spec.cpp' || echo 
> '../../mesos/src/'`appc/spec.cpp
> libtool: compile:  ccache g++ -DPACKAGE_NAME=\"mesos\" 
> -DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"1.1.0\" 
> "-DPACKAGE_STRING=\"mesos 1.1.0\"" -DPACKAGE_BUGREPORT=\"\" 
> -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" -DVERSION=\"1.1.0\" -DSTDC_HEADERS=1 
> -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 
> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 
> -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 
> -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 
> -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_LIBCURL=1 -DMESOS_HAS_JAVA=1 
> -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 
> -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBZ=1 -I. 
> -I../../mesos/src -Wall -Werror -Wsign-compare -DLIBDIR=\"/usr/local/lib\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
> -DPKGDATADIR=\"/usr/local/share/mesos\" 
> -DPKGMODULEDIR=\"/usr/local/lib/mesos/modules\" -I../../mesos/include 
> -I../include -I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS 
> -isystem ../3rdparty/boost-1.53.0 -I../3rdparty/elfio-3.1 
> -I../3rdparty/glog-0.3.3/src -I../3rdparty/leveldb-1.4/include 
> -I../../mesos/3rdparty/libprocess/include -I../3rdparty/nvml-352.79 
> -I../3rdparty/picojson-1.3.0 -I../3rdparty/protobuf-2.6.1/src 
> -I../../mesos/3rdparty/stout/include 
> -I../3rdparty/zookeeper-3.4.8/src/c/include 
> -I../3rdparty/zookeeper-3.4.8/src/c/generated -DHAS_AUTHENTICATION=1 
> -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 
> -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
> appc/libmesos_no_3rdparty_la-spec.lo -MD -MP -MF 
> appc/.deps/libmesos_no_3rdparty_la-spec.Tpo -c ../../mesos/src/appc/spec.cpp  
> -fPIC -DPIC -o appc/.libs/libmesos_no_3rdparty_la-spec.o
> In file included from ../../mesos/3rdparty/stout/include/stout/os.hpp:52:0,
>  from ../../mesos/src/appc/spec.cpp:17:
> ../../mesos/3rdparty/stout/include/stout/os/ls.hpp: In function 
> ‘Try > > os::ls(const 
> string&)’:
> ../../mesos/3rdparty/stout/include/stout/os/ls.hpp:56:19: error: ‘int 
> re

[jira] [Commented] (MESOS-6071) Validate that an explicitly specified DEFAULT executor has disk resources.

2016-08-23 Thread Abhishek Dasgupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432362#comment-15432362
 ] 

Abhishek Dasgupta commented on MESOS-6071:
--

Is anyone actively working on this? I would like to take up this issue.
cc [~bmahler]

> Validate that an explicitly specified DEFAULT executor has disk resources.
> --
>
> Key: MESOS-6071
> URL: https://issues.apache.org/jira/browse/MESOS-6071
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>
> When the framework is explicitly specifying the DEFAULT executor (currently 
> only supported for task groups), we should consider validating that it 
> contains disk resources. Currently, we validate that explicitly specified 
> (DEFAULT or CUSTOM) executors only contain cpus and mem.
> We should also consider supporting the omission of DEFAULT executor resources 
> and injecting a default amount of resources. However, the difficulty here is 
> that the framework must know about these amounts since they need to be 
> available in the offer. We could expose these to the framework during 
> framework registration.
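
The proposed check could be sketched as follows. The `ResourceMap` alias, the function name, and the error string are hypothetical, chosen only to illustrate requiring `cpus`, `mem`, and `disk` on an explicitly specified DEFAULT executor; none of this is the actual master validation code.

```cpp
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical stand-in for executor resources: scalar amounts keyed by
// resource name, e.g. {"cpus": 0.1, "mem": 32, "disk": 32}.
using ResourceMap = std::map<std::string, double>;

// Returns an empty string if the resources are acceptable for a DEFAULT
// executor, or an error message naming the first missing resource.
std::string validateDefaultExecutorResources(const ResourceMap& resources)
{
  for (const char* name : {"cpus", "mem", "disk"}) {
    auto it = resources.find(name);

    // A resource counts as missing if it is absent or not positive.
    if (it == resources.end() || it->second <= 0.0) {
      return std::string("DEFAULT executor is missing '") + name +
             "' resources";
    }
  }

  return "";
}
```

An executor that specifies only `cpus` and `mem` would be rejected by this sketch with an error naming `disk`, which is the gap the ticket describes.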




