[jira] [Commented] (MESOS-5468) Add logic in long-lived-framework to handle network partitions.

2016-05-31 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309244#comment-15309244
 ] 

Jay Guo commented on MESOS-5468:


[~anandmazumdar] Sorry for the delay.
One of the two connections between the framework and the master is successfully closed, 
but the other one is left ESTABLISHED when the master attempts to remove the 
framework. After the network rejoins, the master repeatedly denies the subscription call from 
the framework. So the question is: is the EVENT connection left open intentionally 
or accidentally?

Here's the full log:
{code:title=master.log}
I0601 12:12:03.671700  2252 master.cpp:5195] Status update TASK_FINISHED (UUID: 
e370dac6-2915-4090-876f-c000d0fe71c7) for task 3 of framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- from agent 
edbc3730-e55b-4390-a1f2-5de5a66497f5-S0 at slave(1)@127.0.1.1:5051 (ubuntu)
I0601 12:12:03.671931  2252 master.cpp:5243] Forwarding status update 
TASK_FINISHED (UUID: e370dac6-2915-4090-876f-c000d0fe71c7) for task 3 of 
framework e8288e1d-2c05-4e05-9db7-713a366f7f5f-
I0601 12:12:03.672360  2252 master.cpp:6853] Updating the state of task 3 of 
framework e8288e1d-2c05-4e05-9db7-713a366f7f5f- (latest state: 
TASK_FINISHED, status update state: TASK_FINISHED)
I0601 12:14:43.677433  2247 master.cpp:5195] Status update TASK_FINISHED (UUID: 
e370dac6-2915-4090-876f-c000d0fe71c7) for task 3 of framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- from agent 
edbc3730-e55b-4390-a1f2-5de5a66497f5-S0 at slave(1)@127.0.1.1:5051 (ubuntu)
I0601 12:14:43.677781  2247 master.cpp:5243] Forwarding status update 
TASK_FINISHED (UUID: e370dac6-2915-4090-876f-c000d0fe71c7) for task 3 of 
framework e8288e1d-2c05-4e05-9db7-713a366f7f5f-
I0601 12:14:43.678387  2247 master.cpp:6853] Updating the state of task 3 of 
framework e8288e1d-2c05-4e05-9db7-713a366f7f5f- (latest state: 
TASK_FINISHED, status update state: TASK_FINISHED)
I0601 12:20:03.679064  2251 master.cpp:5195] Status update TASK_FINISHED (UUID: 
e370dac6-2915-4090-876f-c000d0fe71c7) for task 3 of framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- from agent 
edbc3730-e55b-4390-a1f2-5de5a66497f5-S0 at slave(1)@127.0.1.1:5051 (ubuntu)
I0601 12:20:03.679194  2251 master.cpp:5243] Forwarding status update 
TASK_FINISHED (UUID: e370dac6-2915-4090-876f-c000d0fe71c7) for task 3 of 
framework e8288e1d-2c05-4e05-9db7-713a366f7f5f-
I0601 12:20:03.679565  2251 master.cpp:6853] Updating the state of task 3 of 
framework e8288e1d-2c05-4e05-9db7-713a366f7f5f- (latest state: 
TASK_FINISHED, status update state: TASK_FINISHED)
E0601 12:25:02.891707  2254 process.cpp:2040] Failed to shutdown socket with fd 
13: Transport endpoint is not connected
I0601 12:25:02.895753  2248 master.cpp:1388] Framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- (Long Lived Framework (C++)) 
disconnected
I0601 12:25:02.896077  2248 master.cpp:2822] Disconnecting framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- (Long Lived Framework (C++))
I0601 12:25:02.896289  2248 master.cpp:2846] Deactivating framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- (Long Lived Framework (C++))
W0601 12:25:02.896682  2248 master.hpp:1903] Master attempted to send message 
to disconnected framework e8288e1d-2c05-4e05-9db7-713a366f7f5f- (Long Lived 
Framework (C++))
W0601 12:25:02.897027  2248 master.hpp:1909] Unable to send event to framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- (Long Lived Framework (C++)): 
connection closed
I0601 12:25:02.897341  2248 master.cpp:1401] Giving framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- (Long Lived Framework (C++)) 0ns to 
failover
I0601 12:25:02.896751  2249 hierarchical.cpp:375] Deactivated framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f-
I0601 12:25:02.901005  2251 master.cpp:5608] Framework failover timeout, 
removing framework e8288e1d-2c05-4e05-9db7-713a366f7f5f- (Long Lived 
Framework (C++))
I0601 12:25:02.901053  2251 master.cpp:6338] Removing framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- (Long Lived Framework (C++))
I0601 12:25:02.901409  2251 master.cpp:6853] Updating the state of task 3 of 
framework e8288e1d-2c05-4e05-9db7-713a366f7f5f- (latest state: 
TASK_FINISHED, status update state: TASK_KILLED)
I0601 12:25:02.901449  2251 master.cpp:6919] Removing task 3 with resources 
cpus(*):0.001; mem(*):1 of framework e8288e1d-2c05-4e05-9db7-713a366f7f5f- 
on agent edbc3730-e55b-4390-a1f2-5de5a66497f5-S0 at slave(1)@127.0.1.1:5051 
(ubuntu)
I0601 12:25:02.901721  2251 master.cpp:6948] Removing executor 'default' with 
resources cpus(*):0.1; mem(*):32 of framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f- on agent 
edbc3730-e55b-4390-a1f2-5de5a66497f5-S0 at slave(1)@127.0.1.1:5051 (ubuntu)
I0601 12:25:02.902426  2251 hierarchical.cpp:326] Removed framework 
e8288e1d-2c05-4e05-9db7-713a366f7f5f-
W0601 12:25:08.007905  2253 master.cpp:5291] Ignoring unknown exited executor 
'default' 
{code}
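
For reference, here is a minimal sketch of the re-subscription side of handling network partitions in the long-lived framework (illustrative only, using the v1 scheduler API; this is not the actual MESOS-5468 change): on every (re)connection the scheduler re-sends SUBSCRIBE with its previously assigned FrameworkID, so the master can treat it as a failover rather than a brand-new framework.

{code}
// Sketch only; helper name and call site are hypothetical.
#include <mesos/v1/mesos.hpp>
#include <mesos/v1/scheduler.hpp>

using mesos::v1::FrameworkInfo;
using mesos::v1::scheduler::Call;
using mesos::v1::scheduler::Mesos;

// Invoked from the library's `connected` callback: re-send SUBSCRIBE with the
// previously assigned framework ID so that re-subscription after a partition
// is treated as a failover rather than a new registration.
void resubscribe(Mesos* mesos, const FrameworkInfo& framework)
{
  Call call;

  if (framework.has_id()) {
    call.mutable_framework_id()->CopyFrom(framework.id());
  }

  call.set_type(Call::SUBSCRIBE);

  Call::Subscribe* subscribe = call.mutable_subscribe();
  subscribe->mutable_framework_info()->CopyFrom(framework);

  mesos->send(call);
}
{code}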

[jira] [Commented] (MESOS-5359) The scheduler library should have a delay before initiating a connection with master.

2016-05-31 Thread José Guilherme Vanz (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309115#comment-15309115
 ] 

José Guilherme Vanz commented on MESOS-5359:


I already have a preliminary patch. But before submitting it for review, I would 
like to ask a related question about which approach best fits the Mesos way of 
doing things.

Right now, I added a {{flags}} member in the {{MesosProcess}}:

{code:title=src/scheduler/scheduler.cpp|borderStyle=solid}
  MesosProcess(
      const string& master,
      ContentType _contentType,
      const lambda::function<void()>& connected,
      const lambda::function<void()>& disconnected,
      const lambda::function<void(const queue<Event>&)>& received,
      const Option<Credential>& _credential,
      const Option<shared_ptr<MasterDetector>>& _detector,
      const mesos::v1::scheduler::Flags& _flags)
    : ProcessBase(ID::generate("scheduler")),
      state(DISCONNECTED),
      contentType(_contentType),
      callbacks {connected, disconnected, received},
      credential(_credential),
      local(false),
      flags(_flags)
  {



Mesos::Mesos(
    const string& master,
    ContentType contentType,
    const lambda::function<void()>& connected,
    const lambda::function<void()>& disconnected,
    const lambda::function<void(const queue<Event>&)>& received,
    const Option<Credential>& credential,
    const Option<shared_ptr<MasterDetector>>& detector,
    const mesos::v1::scheduler::Flags& flags)
{
{code}

{{mesos::v1::scheduler::Flags}} is a class created following the 
{{src/sched/flags.hpp}} example. However, I'm not sure whether passing the {{Flags}} 
object is the best idea. I believe the old API does that because the 
scheduler driver, as an "internal" class, is responsible for it. The new API's 
{{Mesos}} class is instantiated by the scheduler itself, so I had to 
add {{mesos::v1::scheduler::Flags}} to the include dir, allowing the scheduler 
to instantiate the class. Is that ok? Or should I pass just the flag value in the 
{{Mesos}} constructor, like the master connection URL?
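
To make the question concrete, here is a minimal sketch assuming the library ends up holding a maximum connection delay (whether it arrives through the {{Flags}} object or as a plain constructor argument); the helper and parameter names below are purely illustrative and not part of any patch. It mirrors the randomized backoff pattern the agent already uses for (re-)registration.

{code}
// Sketch only: pick a random delay in [0, connectionDelayMax) before the
// first connection attempt, so many frameworks do not reconnect to the
// master at the same instant.
#include <stdlib.h>

#include <process/delay.hpp>
#include <process/process.hpp>

#include <stout/duration.hpp>

template <typename T>
void delayInitialConnect(
    const Duration& connectionDelayMax,  // e.g. from flags or a ctor argument.
    const process::PID<T>& pid,
    void (T::*connect)())
{
  Duration delay = connectionDelayMax * ((double) ::random() / RAND_MAX);

  process::delay(delay, pid, connect);
}
{code}

In {{MesosProcess}} this could be invoked from {{initialize()}} before the first connection attempt.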

> The scheduler library should have a delay before initiating a connection with 
> master.
> -
>
> Key: MESOS-5359
> URL: https://issues.apache.org/jira/browse/MESOS-5359
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Anand Mazumdar
>Assignee: José Guilherme Vanz
>  Labels: mesosphere
>
> Currently, the scheduler library {{src/scheduler/scheduler.cpp}} does not have 
> an artificially induced delay when trying to initially establish a connection 
> with the master. In the event of a master failover or ZK disconnect, a large 
> number of frameworks can get disconnected and thereby overwhelm the 
> master with TCP SYN requests. 
> On a large cluster with many agents, the master is already overwhelmed with 
> handling connection requests from the agents. This compounds the issue 
> further on the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5395) Task getting stuck in staging state if launch it on a rebooted slave.

2016-05-31 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308926#comment-15308926
 ] 

Gilbert Song commented on MESOS-5395:
-

[~Mengkui], thanks for reporting this issue. Could you reproduce this issue and 
see whether restarting the slave process resolves it?

BTW, could you verify https://issues.apache.org/jira/browse/MESOS-5482 is 
identical to this issue? Thanks. :)

> Task getting stuck in staging state if launch it on a rebooted slave.
> -
>
> Key: MESOS-5395
> URL: https://issues.apache.org/jira/browse/MESOS-5395
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.28.0
> Environment: mesos/marathon cluster,  3 maters/4 slaves
> Mesos: 0.28.0 ,  Marathon 0.15.2
>Reporter: Mengkui gong
> Attachments: mesos-log.zip
>
>
> After rebooting a slave, using Marathon to launch a task, the task 
> can start on other slaves without a problem. But if it is launched on the rebooted 
> slave, the task gets stuck. The Mesos UI shows it in the staging state in the 
> active tasks list, and the Marathon UI shows it in the deploying state. It can 
> stay stuck for more than 2 hours. After that time, Marathon will 
> automatically launch the task on this rebooted slave or another slave as 
> normal, so the rebooted slave is recovered as well after that time.
> From the Mesos log, I can see "telling slave to kill task" all the time.
> I0517 15:25:27.207237 20568 master.cpp:3826] Telling slave 
> 282745ab-423a-4350-a449-3e8cdfccfb93-S1 at slave(1)@10.254.234.236:5050 
> (mesos-slave-3) to kill task 
> project-hub_project-hub-frontend.b645f24b-1c1f-11e6-bb25-d00d2cce797e of 
> framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730- (marathon) at 
> scheduler-fe615b72-ab92-49ca-89e6-e74e600c7e15@10.254.228.3:56757.
> From rebooted slave log, I can see:
> May 17 15:28:37 euca-10-254-234-236 mesos-slave[829]: I0517 15:28:37.206831   
> 916 slave.cpp:1891] Asked to kill task 
> project-hub_project-hub-frontend.b645f24b-1c1f-11e6-bb25-d00d2cce797e of 
> framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-
> May 17 15:28:37 euca-10-254-234-236 mesos-slave[829]: W0517 15:28:37.206866   
> 916 slave.cpp:2018] Ignoring kill task 
> project-hub_project-hub-frontend.b645f24b-1c1f-11e6-bb25-d00d2cce797e because 
> the executor 
> 'project-hub_project-hub-frontend.b645f24b-1c1f-11e6-bb25-d00d2cce797e' of 
> framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730- is terminating/terminated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5520) Before starting a build, bootstrapping shows some warning

2016-05-31 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308870#comment-15308870
 ] 

Gilbert Song commented on MESOS-5520:
-

[~anksv], it seems like this relates to the libprocess/stout Makefile, which may not 
block the build. It should be an easy fix in the Makefile.
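
For reference, the usual way to silence these automake warnings is to enable the {{subdir-objects}} option in the affected {{Makefile.am}} (or globally via {{AM_INIT_AUTOMAKE}} in configure.ac), for example:

{noformat}
# 3rdparty/stout/Makefile.am
AUTOMAKE_OPTIONS = subdir-objects
{noformat}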

> Before starting a build, bootstrapping shows some warning
> -
>
> Key: MESOS-5520
> URL: https://issues.apache.org/jira/browse/MESOS-5520
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.28.1
> Environment: ubuntu 14.04.4
>Reporter: Ankur Verma
>Priority: Minor
> Attachments: bootstrap_cmd_logs.txt
>
>
> For the first time before building, some warning comes when using command 
> bootstrap
> # Bootstrap (Only required if building from git repository). 
> $ ./bootstrap
> Logs:
> 3rdparty/stout/Makefile.am:71: warning: source file 
> 'tests/subcommand_tests.cpp' is in a subdirectory,
> 3rdparty/stout/Makefile.am:71: but option 'subdir-objects' is disabled
> 3rdparty/stout/Makefile.am:71: warning: source file 'tests/svn_tests.cpp' is 
> in a subdirectory,
> 3rdparty/stout/Makefile.am:71: but option 'subdir-objects' is disabled
> 3rdparty/stout/Makefile.am:71: warning: source file 'tests/try_tests.cpp' is 
> in a subdirectory,
> 3rdparty/stout/Makefile.am:71: but option 'subdir-objects' is disabled
> 3rdparty/stout/Makefile.am:71: warning: source file 'tests/uuid_tests.cpp' is 
> in a subdirectory,
> 3rdparty/stout/Makefile.am:71: but option 'subdir-objects' is disabled
> 3rdparty/stout/Makefile.am:71: warning: source file 'tests/version_tests.cpp' 
> is in a subdirectory,
> 3rdparty/stout/Makefile.am:71: but option 'subdir-objects' is disabled
> 3rdparty/stout/Makefile.am:122: warning: source file 'tests/proc_tests.cpp' 
> is in a subdirectory,
> 3rdparty/stout/Makefile.am:122: but option 'subdir-objects' is disabled



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5405) Make fields in authorization::Request protobuf optional.

2016-05-31 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308861#comment-15308861
 ] 

Joerg Schad commented on MESOS-5405:


https://reviews.apache.org/r/48101/

> Make fields in authorization::Request protobuf optional.
> 
>
> Key: MESOS-5405
> URL: https://issues.apache.org/jira/browse/MESOS-5405
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Till Toenshoff
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Currently {{authorization::Request}} protobuf declares {{subject}} and 
> {{object}} as required fields. However, in the codebase we do not always set 
> them, which leaves the message in an uninitialized state, for example:
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/common/http.cpp#L603
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/master/http.cpp#L2057
> I believe that the reason why we don't see issues related to this is because 
> we never send authz requests over the wire, i.e., never serialize/deserialize 
> them. However, they are still invalid protobuf messages. Moreover, some 
> external authorizers may serialize these messages.
> We can either ensure all required fields are set or make both {{subject}} and 
> {{object}} fields optional. This will also require updating local authorizer, 
> which should properly handle the situation when these fields are absent. We 
> may also want to notify authors of external authorizers to update their code 
> accordingly.
> It looks like no deprecation is necessary, mainly because we 
> already—erroneously!—treat these fields as optional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5482) mesos/marathon task stuck in staging after slave reboot

2016-05-31 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308843#comment-15308843
 ] 

Gilbert Song commented on MESOS-5482:
-

[~lutfu], thanks for reporting this issue.

Are you always able to reproduce this issue, or does it happen only occasionally (like a 
race)?

> mesos/marathon task stuck in staging after slave reboot
> ---
>
> Key: MESOS-5482
> URL: https://issues.apache.org/jira/browse/MESOS-5482
> Project: Mesos
>  Issue Type: Bug
>Reporter: lutful karim
> Attachments: marathon-mesos-masters_after-reboot.log, 
> mesos-masters_mesos.log, mesos_slaves_after_reboot.log, 
> tasks_running_before_rebooot.marathon
>
>
> The main idea of mesos/marathon is to sleep well, but after a node reboot the 
> mesos task gets stuck in staging for about 4 hours.
> To reproduce the issue: 
> - set up a mesos cluster in HA mode with systemd-enabled mesos-master and 
> mesos-slave services.
> - run the docker registry (https://hub.docker.com/_/registry/ ) with a mesos 
> constraint (hostname:LIKE:mesos-slave-1) on one node. Reboot the node and 
> notice that the task gets stuck in staging.
> Possible workaround: service mesos-slave restart fixes the issue.
> OS: centos 7.2
> mesos version: 0.28.1
> marathon: 1.1.1
> zookeeper: 3.4.8
> docker: 1.9.1 dockerAPIversion: 1.21
> error message:
> May 30 08:38:24 euca-10-254-237-140 mesos-slave[832]: W0530 08:38:24.120013   
> 909 slave.cpp:2018] Ignoring kill task 
> docker-registry.066fb448-2628-11e6-bedd-d00d0ef81dc3 because the executor 
> 'docker-registry.066fb448-2628-11e6-bedd-d00d0ef81dc3' of framework 
> 8517fcb7-f2d0-47ad-ae02-837570bef929- is terminating/terminated



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-5339) Create Tests for testing fine-grained HTTP endpoint filtering.

2016-05-31 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-5339:
--
Fix Version/s: 1.0.0

> Create Tests for testing fine-grained HTTP endpoint filtering.
> --
>
> Key: MESOS-5339
> URL: https://issues.apache.org/jira/browse/MESOS-5339
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
> Fix For: 1.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5439) registerExecutor problem

2016-05-31 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308814#comment-15308814
 ] 

Gilbert Song commented on MESOS-5439:
-

Hi [~wnghksrla001], are you saying it is only slow between 'Forked child with 
pid' and 'Got registration for executor', or that all the agent 
logging is slow? If it is the former case, it may be related to the executor.

Usually this should be pretty quick. You can test it by launching some 
similar tasks using mesos-execute with the command executor.
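
For example, a minimal invocation along these lines (the master address is a placeholder) launches a short-lived task with the command executor:

{noformat}
$ mesos-execute --master=<master-ip>:5050 --name=test-task --command="sleep 5"
{noformat}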

> registerExecutor problem
> 
>
> Key: MESOS-5439
> URL: https://issues.apache.org/jira/browse/MESOS-5439
> Project: Mesos
>  Issue Type: Bug
>  Components: c++ api, slave
>Affects Versions: 0.27.0
>Reporter: kimjoohwan
>
> Currently, we are using Mesos 0.27.0. The master is built with an Intel(R) 
> Core(TM) i5-3470 CPU @ 3.20GHz and 4GB of RAM. The slave (Banana Pi) is 
> built with a Cortex-A7 dual-core CPU and 1GB of RAM.
> Using the Mesos API, we have developed and completed the execution of a 
> framework based on Python.
> However, we found that it takes too much time (5 sec) between the messages 'Forked child 
> with pid' and 'Got registration for executor' in the slave log.
> If you know how to deal with this problem, please let us know.
> I0523 17:38:16.264289  1787 slave.cpp:5208] Launching executor default of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 with resources  in work 
> directory 
> '/tmp/mesos/slaves/3fb86eea-96c4-4b07-aaa2-caf071275bdf-S2/frameworks/3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010/executors/default/runs/1c830c9a-4120-4ef0-af80-49a52d307539'
> I0523 17:38:16.290601  1789 containerizer.cpp:616] Starting container 
> '1c830c9a-4120-4ef0-af80-49a52d307539' for executor 'default' of framework 
> '3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010'
> I0523 17:38:16.293285  1787 slave.cpp:1626] Queuing task '0' for executor 
> 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010
> I0523 17:38:16.297369  1787 slave.cpp:4233] Current disk usage 2.14%. Max 
> allowed age: 6.150293798159722days
> I0523 17:38:16.504043  1789 launcher.cpp:132] Forked child with pid '1837' 
> for container '1c830c9a-4120-4ef0-af80-49a52d307539'
> I0523 17:38:21.510535  1785 slave.cpp:2573] Got registration for executor 
> 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.554608  1785 slave.cpp:1791] Sending queued task '0' to 
> executor 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 at 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.594511  1789 slave.cpp:2932] Handling status update 
> TASK_RUNNING (UUID: cd04ec2a-0e68-460a-ad2e-e4f504f3b032) for task 0 of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508
> I0523 17:38:21.600050  1789 slave.cpp:2932] Handling status update 
> TASK_FINISHED (UUID: 46e110c8-4078-4f98-ae30-30b3a1376034) for task 0 of 
> framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from 
> executor(1)@192.168.0.8:56508



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5405) Make fields in authorization::Request protobuf optional.

2016-05-31 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308765#comment-15308765
 ] 

Till Toenshoff commented on MESOS-5405:
---

sgtm

> Make fields in authorization::Request protobuf optional.
> 
>
> Key: MESOS-5405
> URL: https://issues.apache.org/jira/browse/MESOS-5405
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Till Toenshoff
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Currently {{authorization::Request}} protobuf declares {{subject}} and 
> {{object}} as required fields. However, in the codebase we do not always set 
> them, which leaves the message in an uninitialized state, for example:
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/common/http.cpp#L603
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/master/http.cpp#L2057
> I believe that the reason why we don't see issues related to this is because 
> we never send authz requests over the wire, i.e., never serialize/deserialize 
> them. However, they are still invalid protobuf messages. Moreover, some 
> external authorizers may serialize these messages.
> We can either ensure all required fields are set or make both {{subject}} and 
> {{object}} fields optional. This will also require updating local authorizer, 
> which should properly handle the situation when these fields are absent. We 
> may also want to notify authors of external authorizers to update their code 
> accordingly.
> It looks like no deprecation is necessary, mainly because we 
> already—erroneously!—treat these fields as optional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5405) Make fields in authorization::Request protobuf optional.

2016-05-31 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308764#comment-15308764
 ] 

Till Toenshoff commented on MESOS-5405:
---

Additional work belonging to this issue was done:

Added {{Request}} sanity checks in {{LocalAuthorizer}}: 
https://reviews.apache.org/r/48085/
Updated comments in authorizer.proto: https://reviews.apache.org/r/48093/

Note that the latter tries to supersede https://reviews.apache.org/r/47876 by 
borrowing some inspiration from it - thanks [~adam-mesos]!

> Make fields in authorization::Request protobuf optional.
> 
>
> Key: MESOS-5405
> URL: https://issues.apache.org/jira/browse/MESOS-5405
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Till Toenshoff
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Currently {{authorization::Request}} protobuf declares {{subject}} and 
> {{object}} as required fields. However, in the codebase we do not always set 
> them, which leaves the message in an uninitialized state, for example:
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/common/http.cpp#L603
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/master/http.cpp#L2057
> I believe that the reason why we don't see issues related to this is because 
> we never send authz requests over the wire, i.e., never serialize/deserialize 
> them. However, they are still invalid protobuf messages. Moreover, some 
> external authorizers may serialize these messages.
> We can either ensure all required fields are set or make both {{subject}} and 
> {{object}} fields optional. This will also require updating local authorizer, 
> which should properly handle the situation when these fields are absent. We 
> may also want to notify authors of external authorizers to update their code 
> accordingly.
> It looks like no deprecation is necessary, mainly because we 
> already—erroneously!—treat these fields as optional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-5405) Make fields in authorization::Request protobuf optional.

2016-05-31 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308700#comment-15308700
 ] 

Till Toenshoff edited comment on MESOS-5405 at 5/31/16 9:46 PM:


[~tillt] [~adam-mesos] [~mcypark]
This breaks some assumptions of the current `authorized` interface, which assumes 
`subject` and `object` are set (see below).

In order to accommodate these new optional fields I would propose the 
following: 
1. Change getObjectApprover's signature to accept Option, Option 

2. Change objectApprover->approved()'s signature to accept an Option
(and adapt the logic in approved for the LocalAuthorizerObjectApprover to deal 
with the None -> Any conversion)

{noformat}
  Future<bool> authorized(const authorization::Request& request)
  {
    return getObjectApprover(request.subject(), request.action())
      .then([=](const Owned<ObjectApprover>& objectApprover) -> Future<bool> {
        ObjectApprover::Object object(request.object());
        Try<bool> result = objectApprover->approved(object);
        if (result.isError()) {
          return Failure(result.error());
        }
        return result.get();
      });
  }
{noformat}



was (Author: js84):
[~tillt] [~adam-mesos] [~mcypark]
This breaks some assumptions of the current `authorized` interface, which assumes 
`subject` and `object` are set (see below).

In order to accommodate these new optional fields I would propose the 
following: 
1. Change getObjectApprover's signature to accept Option, Option 

2. Change objectApprover->approved()'s signature to accept an Option
(and adapt the logic in approved for the LocalAuthorizerObjectApprover to deal 
with the None -> Any conversion)

```
  Future<bool> authorized(const authorization::Request& request)
  {
    return getObjectApprover(request.subject(), request.action())
      .then([=](const Owned<ObjectApprover>& objectApprover) -> Future<bool> {
        ObjectApprover::Object object(request.object());
        Try<bool> result = objectApprover->approved(object);
        if (result.isError()) {
          return Failure(result.error());
        }
        return result.get();
      });
  }
```

> Make fields in authorization::Request protobuf optional.
> 
>
> Key: MESOS-5405
> URL: https://issues.apache.org/jira/browse/MESOS-5405
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Till Toenshoff
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Currently {{authorization::Request}} protobuf declares {{subject}} and 
> {{object}} as required fields. However, in the codebase we do not always set 
> them, which leaves the message in an uninitialized state, for example:
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/common/http.cpp#L603
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/master/http.cpp#L2057
> I believe that the reason why we don't see issues related to this is because 
> we never send authz requests over the wire, i.e., never serialize/deserialize 
> them. However, they are still invalid protobuf messages. Moreover, some 
> external authorizers may serialize these messages.
> We can either ensure all required fields are set or make both {{subject}} and 
> {{object}} fields optional. This will also require updating local authorizer, 
> which should properly handle the situation when these fields are absent. We 
> may also want to notify authors of external authorizers to update their code 
> accordingly.
> It looks like no deprecation is necessary, mainly because we 
> already—erroneously!—treat these fields as optional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5405) Make fields in authorization::Request protobuf optional.

2016-05-31 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308700#comment-15308700
 ] 

Joerg Schad commented on MESOS-5405:


[~tillt] [~adam-mesos] [~mcypark]
This breaks some assumptions of the current `authorized` interface, which assumes 
`subject` and `object` are set (see below).

In order to accommodate these new optional fields I would propose the 
following: 
1. Change getObjectApprover's signature to accept Option, Option 

2. Change objectApprover->approved()'s signature to accept an Option
(and adapt the logic in approved for the LocalAuthorizerObjectApprover to deal 
with the None -> Any conversion)

```
  Future<bool> authorized(const authorization::Request& request)
  {
    return getObjectApprover(request.subject(), request.action())
      .then([=](const Owned<ObjectApprover>& objectApprover) -> Future<bool> {
        ObjectApprover::Object object(request.object());
        Try<bool> result = objectApprover->approved(object);
        if (result.isError()) {
          return Failure(result.error());
        }
        return result.get();
      });
  }
```
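
As a rough illustration only (a hypothetical shape, not an actual patch under review), the adapted path could thread the optional fields through along these lines, assuming getObjectApprover() takes an Option'd subject and approved() takes an Option'd object per the proposal above:

```
  // Sketch only: assumes the proposed signatures.
  Future<bool> authorized(const authorization::Request& request)
  {
    Option<authorization::Subject> subject = request.has_subject()
      ? Option<authorization::Subject>(request.subject())
      : Option<authorization::Subject>::none();

    return getObjectApprover(subject, request.action())
      .then([=](const Owned<ObjectApprover>& objectApprover) -> Future<bool> {
        Option<ObjectApprover::Object> object = request.has_object()
          ? Option<ObjectApprover::Object>(ObjectApprover::Object(request.object()))
          : Option<ObjectApprover::Object>::none();

        Try<bool> result = objectApprover->approved(object);
        if (result.isError()) {
          return Failure(result.error());
        }
        return result.get();
      });
  }
```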

> Make fields in authorization::Request protobuf optional.
> 
>
> Key: MESOS-5405
> URL: https://issues.apache.org/jira/browse/MESOS-5405
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Till Toenshoff
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Currently {{authorization::Request}} protobuf declares {{subject}} and 
> {{object}} as required fields. However, in the codebase we do not always set 
> them, which leaves the message in an uninitialized state, for example:
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/common/http.cpp#L603
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/master/http.cpp#L2057
> I believe that the reason why we don't see issues related to this is because 
> we never send authz requests over the wire, i.e., never serialize/deserialize 
> them. However, they are still invalid protobuf messages. Moreover, some 
> external authorizers may serialize these messages.
> We can either ensure all required fields are set or make both {{subject}} and 
> {{object}} fields optional. This will also require updating local authorizer, 
> which should properly handle the situation when these fields are absent. We 
> may also want to notify authors of external authorizers to update their code 
> accordingly.
> It looks like no deprecation is necessary, mainly because we 
> already—erroneously!—treat these fields as optional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5494) Implement GET_ROLES Call in v1 master API.

2016-05-31 Thread Abhishek Dasgupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308676#comment-15308676
 ] 

Abhishek Dasgupta commented on MESOS-5494:
--

RR: https://reviews.apache.org/r/48094

> Implement GET_ROLES Call in v1 master API.
> --
>
> Key: MESOS-5494
> URL: https://issues.apache.org/jira/browse/MESOS-5494
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Abhishek Dasgupta
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-5405) Make fields in authorization::Request protobuf optional.

2016-05-31 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-5405:
--
Shepherd: Adam B

> Make fields in authorization::Request protobuf optional.
> 
>
> Key: MESOS-5405
> URL: https://issues.apache.org/jira/browse/MESOS-5405
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Till Toenshoff
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 1.0.0
>
>
> Currently {{authorization::Request}} protobuf declares {{subject}} and 
> {{object}} as required fields. However, in the codebase we do not always set 
> them, which leaves the message in an uninitialized state, for example:
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/common/http.cpp#L603
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/master/http.cpp#L2057
> I believe that the reason why we don't see issues related to this is because 
> we never send authz requests over the wire, i.e., never serialize/deserialize 
> them. However, they are still invalid protobuf messages. Moreover, some 
> external authorizers may serialize these messages.
> We can either ensure all required fields are set or make both {{subject}} and 
> {{object}} fields optional. This will also require updating local authorizer, 
> which should properly handle the situation when these fields are absent. We 
> may also want to notify authors of external authorizers to update their code 
> accordingly.
> It looks like no deprecation is necessary, mainly because we 
> already—erroneously!—treat these fields as optional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5529) Distinguish non-revocable and revocable allocation guarantees.

2016-05-31 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-5529:
--

 Summary: Distinguish non-revocable and revocable allocation 
guarantees.
 Key: MESOS-5529
 URL: https://issues.apache.org/jira/browse/MESOS-5529
 Project: Mesos
  Issue Type: Epic
  Components: allocation
Reporter: Benjamin Mahler


Currently, the notions of fair sharing and quota do not make a distinction 
between revocable and non-revocable resources. However, this makes fair sharing 
difficult since we currently offer resources as non-revocable within the fair 
share and cannot perform revocation when we need to restore fairness or quota.

As we move towards providing guarantees for the particular resources types, we 
may want to allow the operator to specify quota (absolutes) or shares 
(relatives) for both revocable or non-revocable resources:

| |*Non-revocable*|*Revocable*|
|*Quota*|absolute guarantees for non-revocable resources (well suited for 
service-like always running workloads)|absolute guarantees for revocable 
resources (useful for expressing minimum requirements of batch workload?)|
|*Fair Share*|relative guarantees for non-revocable resources (e.g. backwards 
compatibility with old behavior)|relative guarantees for revocable resources 
(e.g. well suited for fair sharing in a dynamic cluster)|

See MESOS-5526 for revocation support.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5528) Use inverse offers to reclaim resources from schedulers over their quota.

2016-05-31 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-5528:
--

 Summary: Use inverse offers to reclaim resources from schedulers 
over their quota.
 Key: MESOS-5528
 URL: https://issues.apache.org/jira/browse/MESOS-5528
 Project: Mesos
  Issue Type: Epic
  Components: allocation
Reporter: Benjamin Mahler


As we move towards distinguishing non-revocable and revocable allocation of 
resources, we need to ensure that the upper limits specified via quota are 
enforced.

For example, if a scheduler has quota for non-revocable resources and there is 
only fair sharing turned on for revocable resources, the scheduler should not 
be able to consume more non-revocable resources than its quota limit.

Even if mesos disallows this when tasks are launched, there are cases where the 
scheduler can exceed its quota:
* Unreachable nodes that were not accounted for reconnect to the cluster with 
existing resources allocated to the scheduler's role.
* The operator lowers the amount of quota for the role.

In these cases and more generally, we need an always running mechanism for 
reclaiming excess quota allocation via inverse offers. The deadline should be 
configurable by the operator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5527) Provide work conservation incentives for schedulers.

2016-05-31 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-5527:
--

 Summary: Provide work conservation incentives for schedulers.
 Key: MESOS-5527
 URL: https://issues.apache.org/jira/browse/MESOS-5527
 Project: Mesos
  Issue Type: Epic
  Components: allocation, framework
Reporter: Benjamin Mahler


As we begin to add support for schedulers to revoke resources to obtain their 
quota or fair share, we need to consider the case of non-cooperative or 
malicious schedulers that cause excessive revocation either by accident or 
intentionally.

For example, a malicious scheduler could keep a low allocation below its fair 
share, and revoke as many resources as it can in order to disturb existing work 
as much as possible.

We can provide mitigation techniques, or incentives / penalties to schedulers 
that cause excessive revocation:
* Disallow revocation when resources are available to the scheduler. The scheduler 
must choose available resources or wait until allocated resources free up. This 
means picky schedulers may not obtain the resources they want.
* Penalize schedulers causing excessive revocation in order to incentivize them 
to play nicely.
* Use a degree of pessimism to restrict which resources a scheduler can revoke 
(e.g. only batch tasks that have not been running for a long time). If we 
augment task information to know whether it is a service or a batch job we may 
be able to do better here.
* etc

The techniques employed for work conservation in the presence of revocation 
should be configurable, and users should be able to achieve their own custom 
work conservation policies by implementing an allocator (or a subcomponent of 
the existing allocator).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5526) Allow schedulers to revoke resources to obtain their quota or fair share.

2016-05-31 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-5526:
--

 Summary: Allow schedulers to revoke resources to obtain their 
quota or fair share.
 Key: MESOS-5526
 URL: https://issues.apache.org/jira/browse/MESOS-5526
 Project: Mesos
  Issue Type: Epic
  Components: allocation
Reporter: Benjamin Mahler


In order to ensure fairness and quota guarantees are met in a dynamic cluster, 
we need to ensure that schedulers can revoke existing revocable allocations in 
order to obtain their fair share or their quota. Otherwise, schedulers must 
wait (potentially forever!) until existing allocations are freed. This is a 
policy that completely favors work conservation, at the expense of meeting the 
fairness and quota guarantees in a bounded amount of time.

As we expose resource constraints to schedulers (MESOS-5524), they will be able 
to determine when Mesos will allow them to revoke resources. For example:
* If a scheduler is below its fair share, the scheduler may revoke existing 
revocable resources that are offered to it.
* If a scheduler is below its quota, it can revoke existing revocable resources 
in order to consume it for quota in a non-revocable manner.

This is orthogonal to optimistic or pessimistic allocation, in that either 
approach needs to allow the schedulers to perform revocation in this manner. 
In the pessimistic approach, we may confine what the scheduler can revoke, and 
in an optimistic approach, we may provide more choice to the scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-5526) Allow schedulers to revoke resources to obtain their quota or fair share.

2016-05-31 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-5526:
---
Component/s: framework api

> Allow schedulers to revoke resources to obtain their quota or fair share.
> -
>
> Key: MESOS-5526
> URL: https://issues.apache.org/jira/browse/MESOS-5526
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation, framework api
>Reporter: Benjamin Mahler
>
> In order to ensure fairness and quota guarantees are met in a dynamic 
> cluster, we need to ensure that schedulers can revoke existing revocable 
> allocations in order to obtain their fair share or their quota. Otherwise, 
> schedulers must wait (potentially forever!) until existing allocations are 
> freed. This is a policy that completely favors work conservation, at the expense of 
> meeting the fairness and quota guarantees in a bounded amount of time.
> As we expose resource constraints to schedulers (MESOS-5524), they will be 
> able to determine when Mesos will allow them to revoke resources. For example:
> * If a scheduler is below its fair share, the scheduler may revoke existing 
> revocable resources that are offered to it.
> * If a scheduler is below its quota, it can revoke existing revocable 
> resources in order to consume it for quota in a non-revocable manner.
> This is orthogonal to optimistic or pessimistic allocation, in that either 
> approach needs to allow the schedulers to perform revocation in this manner. 
> In the pessimistic approach, we may confine what the scheduler can revoke, 
> and in an optimistic approach, we may provide more choice to the scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5525) Allow schedulers to decide whether to consume resources as revocable or non-revocable.

2016-05-31 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-5525:
--

 Summary: Allow schedulers to decide whether to consume resources 
as revocable or non-revocable.
 Key: MESOS-5525
 URL: https://issues.apache.org/jira/browse/MESOS-5525
 Project: Mesos
  Issue Type: Epic
  Components: framework api, allocation
Reporter: Benjamin Mahler


The idea here is that although some resources may only be consumed in a 
revocable manner (e.g. oversubscribed resources, resources from "spot 
instances", etc), other resources may be consumed in a non-revocable manner 
(e.g. dedicated instance, on-premise machine).

However, a scheduler may wish to consume these non-revocable resources in a 
revocable manner. For example, if the scheduler has quota for non-revocable 
resources, it may not want to use its quota for a particular task and may 
wish to launch it in a revocable manner out of its fair share. See: 

In order to support this, we should adjust the meaning of revocable and 
non-revocable resources in order to allow schedulers to decide how to consume 
them. The scheduler could choose to consume non-revocable resources in a 
revocable manner in order to use its fair share of revocable resources rather 
than its quota.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5524) Expose resource consumption constraints (quota, shares) to schedulers.

2016-05-31 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-5524:
--

 Summary: Expose resource consumption constraints (quota, shares) 
to schedulers.
 Key: MESOS-5524
 URL: https://issues.apache.org/jira/browse/MESOS-5524
 Project: Mesos
  Issue Type: Epic
  Components: scheduler api, allocation
Reporter: Benjamin Mahler


Currently, schedulers do not have visibility into their quota or shares of the 
cluster. By providing this information, we give the scheduler the ability to 
make better decisions. As we start to allow schedulers to decide how they'd 
like to use a particular resource (e.g. as non-revocable or revocable), 
schedulers need visibility into their quota and shares to make an effective 
decision (otherwise they may accidentally exceed their quota and will not find 
out until mesos replies with TASK_LOST REASON_QUOTA_EXCEEDED).

We would start by exposing the following information:
* quota: e.g. cpus:10, mem:20, disk:40
* shares: e.g. cpus:20, mem:40, disk:80

Currently, quota is used for non-revocable resources and the idea is to use 
shares only for consuming revocable resources since the number of shares 
available to a role changes dynamically as resources come and go, frameworks 
come and go, or the operator manipulates the amount of resources sectioned off 
for quota.

By exposing quota and shares, the framework knows when it can consume 
additional non-revocable resources (i.e. when it has fewer non-revocable 
resources allocated to it than its quota) or when it can consume revocable 
resources (always! but in the future, it cannot revoke another user's revocable 
resources if the framework is above its fair share).

This also allows schedulers to determine whether they have sufficient quota 
assigned to them, and to alert the operator if they need more to run safely. 
Also, by viewing their fair share, the framework can expose monitoring 
information that shows the discrepancy between how much it would like and its 
fair share (note that the framework can actually exceed its fair share but in 
the future this will mean increased potential for revocation).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4642) Mesos Agent Json API can dump binary data from log files out as invalid JSON

2016-05-31 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308445#comment-15308445
 ] 

Vinod Kone commented on MESOS-4642:
---

Sounds like a plan, Chris. Do you want to send a PR or a review for the doc 
change? Here is the code: 
https://github.com/apache/mesos/blob/master/src/files/files.cpp#L399
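
As a hedged illustration only (this is not the actual {{files.cpp}} code, and the helper name is made up), the kind of change being discussed would replace bytes that cannot appear in a valid UTF-8 JSON string before the data reaches the JSON encoder:

{code}
#include <string>

// Sketch only: keep printable ASCII and common whitespace and replace
// everything else; a real fix would validate multi-byte UTF-8 sequences
// instead of dropping them.
std::string sanitizeForJson(const std::string& data)
{
  std::string result;
  result.reserve(data.size());

  for (unsigned char c : data) {
    if (c == '\n' || c == '\t' || c == '\r' || (c >= 0x20 && c < 0x7F)) {
      result += static_cast<char>(c);
    } else {
      result += '?';
    }
  }

  return result;
}
{code}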

> Mesos Agent Json API can dump binary data from log files out as invalid JSON
> 
>
> Key: MESOS-4642
> URL: https://issues.apache.org/jira/browse/MESOS-4642
> Project: Mesos
>  Issue Type: Bug
>  Components: json api, slave
>Affects Versions: 0.27.0
>Reporter: Steven Schlansker
>Priority: Critical
>
> One of our tasks accidentally started logging binary data to stderr.  This 
> was not intentional and generally should not happen -- however, it causes 
> severe problems with the Mesos Agent "files/read.json" API, since it gladly 
> dumps this binary data out as invalid JSON.
> {code}
> # hexdump -C /path/to/task/stderr | tail
> 0003d1f0  6f 6e 6e 65 63 74 69 6f  6e 0a 4e 45 54 3a 20 31  |onnection.NET: 1|
> 0003d200  20 6f 6e 72 65 61 64 20  45 4e 4f 45 4e 54 20 32  | onread ENOENT 2|
> 0003d210  39 35 34 35 36 20 32 35  31 20 32 39 35 37 30 37  |95456 251 295707|
> 0003d220  0a 01 00 00 00 00 00 00  ac 57 65 64 2c 20 31 30  |.Wed, 10|
> 0003d230  20 55 6e 72 65 63 6f 67  6e 69 7a 65 64 20 69 6e  | Unrecognized in|
> 0003d240  70 75 74 20 68 65 61 64  65 72 0a |put header.|
> {code}
> {code}
> # curl 
> 'http://agent-host:5051/files/read.json?path=/path/to/task/stderr=220443=9='
>  | hexdump -C
> 7970  6e 65 63 74 69 6f 6e 5c  6e 4e 45 54 3a 20 31 20  |nection\nNET: 1 |
> 7980  6f 6e 72 65 61 64 20 45  4e 4f 45 4e 54 20 32 39  |onread ENOENT 29|
> 7990  35 34 35 36 20 32 35 31  20 32 39 35 37 30 37 5c  |5456 251 295707\|
> 79a0  6e 5c 75 30 30 30 31 5c  75 30 30 30 30 5c 75 30  |n\u0001\u\u0|
> 79b0  30 30 30 5c 75 30 30 30  30 5c 75 30 30 30 30 5c  |000\u\u\|
> 79c0  75 30 30 30 30 5c 75 30  30 30 30 ac 57 65 64 2c  |u\u.Wed,|
> 79d0  20 31 30 20 55 6e 72 65  63 6f 67 6e 69 7a 65 64  | 10 Unrecognized|
> 79e0  20 69 6e 70 75 74 20 68  65 61 64 65 72 5c 6e 22  | input header\n"|
> 79f0  2c 22 6f 66 66 73 65 74  22 3a 32 32 30 34 34 33  |,"offset":220443|
> 7a00  7d|}|
> {code}
> This causes downstream sadness:
> {code}
> ERROR [2016-02-10 18:55:12,303] 
> io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: 
> 0ee749630f8b26f1
> ! com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 start byte 0xac
> !  at [Source: org.jboss.netty.buffer.ChannelBufferInputStream@6d69ee8; line: 
> 1, column: 31181]
> ! at 
> com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1487) 
> ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:518)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidInitial(UTF8StreamJsonParser.java:3339)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidChar(UTF8StreamJsonParser.java:)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2360)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2287)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:286)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:29)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:12)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:523)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:381)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1073)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.module.afterburner.deser.SuperSonicBeanDeserializer.deserializeFromObject(SuperSonicBeanDeserializer.java:196)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:142)
>  

[jira] [Commented] (MESOS-5457) Create a small testing doc for the v1 Scheduler/Executor API

2016-05-31 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308431#comment-15308431
 ] 

Vinod Kone commented on MESOS-5457:
---

Here are some of the improvements that were done as part of these tests:

commit bf7162205b53114eb7367fa322951d573cbb716d
Author: Anand Mazumdar 
Date:   Tue May 31 13:28:56 2016 -0600

Added move semantics to `Future::set`.

Review: https://reviews.apache.org/r/47989/

commit 6ce7279b2399a02f524692ff5799d637b99b38ff
Author: Anand Mazumdar 
Date:   Tue May 31 13:28:51 2016 -0600

Added move constructor/assignment to `Try`.

Review: https://reviews.apache.org/r/47988/

commit ae53e3b9980465119cd073620c02baf6e52d5695
Author: Anand Mazumdar 
Date:   Tue May 31 13:28:47 2016 -0600

Constrained constructible types constructor for `Result`.

This ensures that `Result` can only be created from constructible
types. This logic is similar to the one already present in `Option`.
Somehow, this constraint was never added for `Result`.

Review: https://reviews.apache.org/r/47987/

commit a6b3d1ad6f4e1b83b48ac58ba247a422fac32101
Author: Anand Mazumdar 
Date:   Tue May 31 13:28:43 2016 -0600

Added move constructor/assignment operator to `Result`.

Added move constructor/assignment operator to `Result`.
Note that `Some` still makes a copy and would be fixed in
a separate patch.

Review: https://reviews.apache.org/r/47986/
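
As a hedged illustration (not taken from the patches themselves), the added move support lets a {{Try}} holding a large value be transferred rather than copied:

{code}
#include <string>
#include <utility>

#include <stout/try.hpp>

int main()
{
  // Construct a Try holding a ~1MB string.
  Try<std::string> t = std::string(1024 * 1024, 'x');

  // With the move constructor/assignment in place, this transfers the
  // underlying buffer instead of copying it.
  Try<std::string> moved = std::move(t);

  return moved.isSome() ? 0 : 1;
}
{code}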



> Create a small testing doc for the v1 Scheduler/Executor API
> 
>
> Key: MESOS-5457
> URL: https://issues.apache.org/jira/browse/MESOS-5457
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>Assignee: Jay Guo
>  Labels: mesosphere
> Fix For: 1.0.0
>
>
> This is a follow up JIRA based on the comments from MESOS-3302 around testing 
> the v1 Scheduler/Executor API. I created a small document that has the 
> details of the manual testing done by me. The intent of this issue is to 
> track all the details on this ticket rather than on the epic.
> Link to the doc: 
> https://docs.google.com/document/d/1Z8_8pn-x-VYInm12_En-1oP-FxkLzpG8EgC1qQ0eDRY/edit



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5523) ValueError: A 0.7-series setuptools cannot be installed with distribute. Found one at /usr/lib/python2.7/dist-packages

2016-05-31 Thread Vinson Lee (JIRA)
Vinson Lee created MESOS-5523:
-

 Summary: ValueError: A 0.7-series setuptools cannot be installed 
with distribute. Found one at /usr/lib/python2.7/dist-packages
 Key: MESOS-5523
 URL: https://issues.apache.org/jira/browse/MESOS-5523
 Project: Mesos
  Issue Type: Bug
  Components: build
 Environment: Ubuntu 16.04
Reporter: Vinson Lee


{noformat}
$ make
[...]
Building protobuf Python egg ...
cd ../3rdparty/libprocess/3rdparty/protobuf-2.5.0/python && 
\
  CC="gcc"  \
  CXX="g++" \
  CFLAGS="-g1 -O0 -Wno-unused-local-typedefs"   \
  CXXFLAGS="-g1 -O0 -Wno-unused-local-typedefs -std=c++11"  
\
  PYTHONPATH=build/3rdparty/distribute-0.6.26   \
  /usr/bin/python setup.py build bdist_egg
Traceback (most recent call last):
  File "setup.py", line 11, in <module>
    from setuptools import setup, Extension
  File "build/3rdparty/distribute-0.6.26/setuptools/__init__.py", line 2, in <module>
    from setuptools.extension import Extension, Library
  File "build/3rdparty/distribute-0.6.26/setuptools/extension.py", line 5, in <module>
    from setuptools.dist import _get_unpatched
  File "build/3rdparty/distribute-0.6.26/setuptools/dist.py", line 6, in <module>
    from setuptools.command.install import install
  File "build/3rdparty/distribute-0.6.26/setuptools/command/__init__.py", line 8, in <module>
    from setuptools.command import install_scripts
  File "build/3rdparty/distribute-0.6.26/setuptools/command/install_scripts.py", line 3, in <module>
    from pkg_resources import Distribution, PathMetadata, ensure_directory
  File "build/3rdparty/distribute-0.6.26/pkg_resources.py", line 2731, in <module>
    add_activation_listener(lambda dist: dist.activate())
  File "build/3rdparty/distribute-0.6.26/pkg_resources.py", line 704, in subscribe
    callback(dist)
  File "build/3rdparty/distribute-0.6.26/pkg_resources.py", line 2731, in <lambda>
    add_activation_listener(lambda dist: dist.activate())
  File "build/3rdparty/distribute-0.6.26/pkg_resources.py", line 2231, in activate
    self.insert_on(path)
  File "build/3rdparty/distribute-0.6.26/pkg_resources.py", line 2332, in insert_on
    "with distribute. Found one at %s" % str(self.location))
ValueError: A 0.7-series setuptools cannot be installed with distribute. Found 
one at /usr/lib/python2.7/dist-packages
Makefile:10277: recipe for target 
'../3rdparty/libprocess/3rdparty/protobuf-2.5.0/python/dist/protobuf-2.5.0-py2.7.egg'
 failed
make[2]: *** 
[../3rdparty/libprocess/3rdparty/protobuf-2.5.0/python/dist/protobuf-2.5.0-py2.7.egg]
 Error 1
make[2]: Leaving directory 'build/src'
Makefile:2805: recipe for target 'all' failed
make[1]: *** [all] Error 2
make[1]: Leaving directory 'build/src'
Makefile:731: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-5503) Implement GET_MAINTENANCE_STATUS Call in v1 master API.

2016-05-31 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-5503:
---

Assignee: haosdent

> Implement GET_MAINTENANCE_STATUS Call in v1 master API.
> ---
>
> Key: MESOS-5503
> URL: https://issues.apache.org/jira/browse/MESOS-5503
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: haosdent
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4642) Mesos Agent Json API can dump binary data from log files out as invalid JSON

2016-05-31 Thread Chris Pennello (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308222#comment-15308222
 ] 

Chris Pennello commented on MESOS-4642:
---

At the very least, if we aren't going to modify how the code works, we should 
document that {{files/read.json}} is _not_ guaranteed to return valid 
JSON. For example, a consequent, and not _too_ unreasonable, workaround is 
to ensure that your files are themselves UTF-8 encoded; that would be a 
helpful thing to mention in the endpoint documentation.

> Mesos Agent Json API can dump binary data from log files out as invalid JSON
> 
>
> Key: MESOS-4642
> URL: https://issues.apache.org/jira/browse/MESOS-4642
> Project: Mesos
>  Issue Type: Bug
>  Components: json api, slave
>Affects Versions: 0.27.0
>Reporter: Steven Schlansker
>Priority: Critical
>
> One of our tasks accidentally started logging binary data to stderr.  This 
> was not intentional and generally should not happen -- however, it causes 
> severe problems with the Mesos Agent "files/read.json" API, since it gladly 
> dumps this binary data out as invalid JSON.
> {code}
> # hexdump -C /path/to/task/stderr | tail
> 0003d1f0  6f 6e 6e 65 63 74 69 6f  6e 0a 4e 45 54 3a 20 31  |onnection.NET: 1|
> 0003d200  20 6f 6e 72 65 61 64 20  45 4e 4f 45 4e 54 20 32  | onread ENOENT 2|
> 0003d210  39 35 34 35 36 20 32 35  31 20 32 39 35 37 30 37  |95456 251 295707|
> 0003d220  0a 01 00 00 00 00 00 00  ac 57 65 64 2c 20 31 30  |.........Wed, 10|
> 0003d230  20 55 6e 72 65 63 6f 67  6e 69 7a 65 64 20 69 6e  | Unrecognized in|
> 0003d240  70 75 74 20 68 65 61 64  65 72 0a                 |put header.|
> {code}
> {code}
> # curl 
> 'http://agent-host:5051/files/read.json?path=/path/to/task/stderr=220443=9='
>  | hexdump -C
> 7970  6e 65 63 74 69 6f 6e 5c  6e 4e 45 54 3a 20 31 20  |nection\nNET: 1 |
> 7980  6f 6e 72 65 61 64 20 45  4e 4f 45 4e 54 20 32 39  |onread ENOENT 29|
> 7990  35 34 35 36 20 32 35 31  20 32 39 35 37 30 37 5c  |5456 251 295707\|
> 79a0  6e 5c 75 30 30 30 31 5c  75 30 30 30 30 5c 75 30  |n\u0001\u0000\u0|
> 79b0  30 30 30 5c 75 30 30 30  30 5c 75 30 30 30 30 5c  |000\u0000\u0000\|
> 79c0  75 30 30 30 30 5c 75 30  30 30 30 ac 57 65 64 2c  |u0000\u0000.Wed,|
> 79d0  20 31 30 20 55 6e 72 65  63 6f 67 6e 69 7a 65 64  | 10 Unrecognized|
> 79e0  20 69 6e 70 75 74 20 68  65 61 64 65 72 5c 6e 22  | input header\n"|
> 79f0  2c 22 6f 66 66 73 65 74  22 3a 32 32 30 34 34 33  |,"offset":220443|
> 7a00  7d                                                |}|
> {code}
> This causes downstream sadness:
> {code}
> ERROR [2016-02-10 18:55:12,303] 
> io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: 
> 0ee749630f8b26f1
> ! com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 start byte 0xac
> !  at [Source: org.jboss.netty.buffer.ChannelBufferInputStream@6d69ee8; line: 
> 1, column: 31181]
> ! at 
> com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1487) 
> ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:518)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidInitial(UTF8StreamJsonParser.java:3339)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidChar(UTF8StreamJsonParser.java:)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2360)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2287)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:286)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:29)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:12)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:523)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:381)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1073)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> 

[jira] [Commented] (MESOS-5339) Create Tests for testing fine-grained HTTP endpoint filtering.

2016-05-31 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308030#comment-15308030
 ] 

Michael Park commented on MESOS-5339:
-

[~adam-mesos] We are planning to commit this today. The patch is at 
https://reviews.apache.org/r/48054/. 

> Create Tests for testing fine-grained HTTP endpoint filtering.
> --
>
> Key: MESOS-5339
> URL: https://issues.apache.org/jira/browse/MESOS-5339
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5040) Add cgroups_subsystems flag for cgroups unified isolator

2016-05-31 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307851#comment-15307851
 ] 

haosdent commented on MESOS-5040:
-

Hi [~qianzhang], thank you for your reply. Please follow this chain; the one 
above is discarded.

> Add cgroups_subsystems flag for cgroups unified isolator
> 
>
> Key: MESOS-5040
> URL: https://issues.apache.org/jira/browse/MESOS-5040
> Project: Mesos
>  Issue Type: Task
>  Components: cgroups, isolation
>Reporter: haosdent
>Assignee: haosdent
>
> In the past, we specified the cgroups subsystems used by the Mesos 
> containerizer via the {{--isolation}} flag. For the cgroups unified isolator, 
> we need to add a separate flag to control which subsystems are enabled.
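
For illustration only (this is not Mesos code), a sketch of how a comma-separated value for such a flag might be parsed and validated; the flag name and the subsystem list are assumptions based on the ticket title:

{code:title=parse_subsystems.py}
# Illustrative sketch: validate a comma-separated value of the kind the
# proposed --cgroups_subsystems flag would carry. Not Mesos code.
KNOWN_SUBSYSTEMS = {'cpu', 'cpuacct', 'cpuset', 'memory', 'devices', 'blkio',
                    'net_cls', 'net_prio', 'freezer', 'perf_event', 'pids'}

def parse_subsystems(flag_value):
    requested = {s.strip() for s in flag_value.split(',') if s.strip()}
    unknown = requested - KNOWN_SUBSYSTEMS
    if unknown:
        raise ValueError('unknown cgroup subsystems: %s' % ', '.join(sorted(unknown)))
    return requested

print(parse_subsystems('cpu,memory'))
{code}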



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5040) Add cgroups_subsystems flag for cgroups unified isolator

2016-05-31 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307841#comment-15307841
 ] 

Qian Zhang commented on MESOS-5040:
---

[~haosd...@gmail.com], is the above patch reviewable now? If yes, can you 
please make it public? :-)

> Add cgroups_subsystems flag for cgroups unified isolator
> 
>
> Key: MESOS-5040
> URL: https://issues.apache.org/jira/browse/MESOS-5040
> Project: Mesos
>  Issue Type: Task
>  Components: cgroups, isolation
>Reporter: haosdent
>Assignee: haosdent
>
> In the past, we specified the cgroups subsystems used by the Mesos 
> containerizer via the {{--isolation}} flag. For the cgroups unified isolator, 
> we need to add a separate flag to control which subsystems are enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5339) Create Tests for testing fine-grained HTTP endpoint filtering.

2016-05-31 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307568#comment-15307568
 ] 

Adam B commented on MESOS-5339:
---

[~js84], [~mcypark], are you guys planning to add tests for 0.29/1.0? If not, 
let's remove this from the parent Epic MESOS-4931 and close out the Epic now 
that the rest of its tasks are resolved.

> Create Tests for testing fine-grained HTTP endpoint filtering.
> --
>
> Key: MESOS-5339
> URL: https://issues.apache.org/jira/browse/MESOS-5339
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4772) TaskInfo/ExecutorInfo should include fine-grained ownership/namespacing

2016-05-31 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-4772:
--
Assignee: (was: Jan Schlicht)

> TaskInfo/ExecutorInfo should include fine-grained ownership/namespacing
> ---
>
> Key: MESOS-4772
> URL: https://issues.apache.org/jira/browse/MESOS-4772
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Reporter: Adam B
>  Labels: authorization, mesosphere, ownership, security
>
> We need a way to assign fine-grained ownership to tasks/executors so that 
> multi-user frameworks can tell Mesos to associate the task with a user 
> identity (rather than just the framework principal+role). Then, when an HTTP 
> user requests to view the task's sandbox contents, or kill the task, or list 
> all tasks, the authorizer can determine whether to allow/deny/filter the 
> request based on finer-grained, user-level ownership.
> Some systems may want TaskInfo.owner to represent a group rather than an 
> individual user. That's fine as long as the framework sets the field to the 
> group ID in such a way that a group-aware authorizer can interpret it.
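
As a rough illustration of the filtering half of this proposal (not an actual Mesos API), an authorizer could compare a per-task owner field against the requesting principal or one of their groups:

{code:title=filter_by_owner.py}
# Rough illustration only: filter a task list by a hypothetical per-task
# 'owner' field, allowing tasks owned by the principal or one of its groups.
def filter_tasks_for_user(tasks, principal, groups):
    visible = []
    for task in tasks:
        owner = task.get('owner')  # hypothetical fine-grained owner
        if owner is None or owner == principal or owner in groups:
            visible.append(task)
    return visible

tasks = [{'id': 't1', 'owner': 'alice'}, {'id': 't2', 'owner': 'ops-team'},
         {'id': 't3', 'owner': 'bob'}]
print(filter_tasks_for_user(tasks, 'alice', groups={'ops-team'}))
{code}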



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2717) Qemu/KVM containerizer

2016-05-31 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307274#comment-15307274
 ] 

Jie Yu commented on MESOS-2717:
---

Yes, I don't disagree. But if you take a look at the existing Docker 
containerizer, much of its logic is shared with the Mesos containerizer. I don't 
want us to introduce yet another containerizer and copy the core logic yet 
again. Also, writing a containerizer is highly non-trivial and hard to get 
right.

> Qemu/KVM containerizer
> --
>
> Key: MESOS-2717
> URL: https://issues.apache.org/jira/browse/MESOS-2717
> Project: Mesos
>  Issue Type: Wish
>  Components: containerization
>Reporter: Pierre-Yves Ritschard
>Assignee: Abhishek Dasgupta
>
> I think it would make sense for Mesos to have the ability to treat 
> hypervisors as containerizers and the most sensible one to start with would 
> probably be Qemu/KVM.
> There are a few workloads that can require full-fledged VMs (the most obvious 
> one being Windows workloads).
> The containerization code is well decoupled and seems simple enough; I can 
> definitely take a shot at it. VMs do bring some questions with them; here is 
> my take on them:
> 1. Routing, network strategy
> ==
> The simplest approach here might very well be to go for bridged networks
> and leave the setup and inter-slave routing up to the administrator.
> 2. IP Address assignment
> 
> At first, it can be up to the frameworks to deal with IP assignment.
> The simplest way to address this could be to have an executor running
> on slaves providing the qemu/kvm containerizer which would instrument a DHCP 
> server and collect IP + MAC address resources from slaves. While it may be up 
> to the frameworks to provide this, an example should most likely be provided.
> 3. VM Templates
> ==
> VM templates should probably leverage the fetcher and could thus be copied 
> locally or fetched from HTTP(S) / HDFS.
> 4. Resource limiting
> 
> Mapping resource constraints to the qemu command line is probably the easiest 
> part; additional command-line arguments should also be fetchable. For Unix VMs, 
> the sandbox could show the output of the serial console.
> 5. Libvirt / plain Qemu
> =
> I tend to favor limiting the amount of necessary hoops to jump through and 
> would thus investigate working directly with Qemu, maintaining an open 
> connection to the monitor to assert status.
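
As a rough sketch of point 4 above (purely illustrative, not a Mesos interface), mapping a task's cpu/mem resources onto a qemu command line could look like this; the image path and flag choices are placeholders:

{code:title=qemu_cmdline.py}
# Illustrative only: build a qemu command line from cpu/mem resources.
def qemu_command(image_path, cpus, mem_mb):
    return [
        'qemu-system-x86_64',
        '-smp', str(int(cpus)),      # vCPUs from the 'cpus' resource
        '-m', str(int(mem_mb)),      # memory in MiB from the 'mem' resource
        '-drive', 'file=%s,format=qcow2' % image_path,
        '-nographic',                # serial console on stdio, so the sandbox
                                     # could capture its output
    ]

print(' '.join(qemu_command('/tmp/template.qcow2', cpus=2, mem_mb=1024)))
{code}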



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2717) Qemu/KVM containerizer

2016-05-31 Thread Angus Lees (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307270#comment-15307270
 ] 

Angus Lees commented on MESOS-2717:
---

> Every time we introduce a new feature (e.g., persistent volume, gpu, etc.), 
> we need to provide two implementations for both containerizers.

You're going to need to implement those features again for a VM-based 
"containerizer" anyway.  It is highly unlikely that you could share any 
implementation regardless of where the code actually lived...

> Qemu/KVM containerizer
> --
>
> Key: MESOS-2717
> URL: https://issues.apache.org/jira/browse/MESOS-2717
> Project: Mesos
>  Issue Type: Wish
>  Components: containerization
>Reporter: Pierre-Yves Ritschard
>Assignee: Abhishek Dasgupta
>
> I think it would make sense for Mesos to have the ability to treat 
> hypervisors as containerizers and the most sensible one to start with would 
> probably be Qemu/KVM.
> There are a few workloads that can require full-fledged VMs (the most obvious 
> one being Windows workloads).
> The containerization code is well decoupled and seems simple enough; I can 
> definitely take a shot at it. VMs do bring some questions with them; here is 
> my take on them:
> 1. Routing, network strategy
> ==
> The simplest approach here might very well be to go for bridged networks
> and leave the setup and inter-slave routing up to the administrator.
> 2. IP Address assignment
> 
> At first, it can be up to the frameworks to deal with IP assignment.
> The simplest way to address this could be to have an executor running
> on slaves providing the qemu/kvm containerizer which would instrument a DHCP 
> server and collect IP + MAC address resources from slaves. While it may be up 
> to the frameworks to provide this, an example should most likely be provided.
> 3. VM Templates
> ==
> VM templates should probably leverage the fetcher and could thus be copied 
> locally or fetched from HTTP(S) / HDFS.
> 4. Resource limiting
> 
> Mapping resource constraints to the qemu command line is probably the easiest 
> part; additional command-line arguments should also be fetchable. For Unix VMs, 
> the sandbox could show the output of the serial console.
> 5. Libvirt / plain Qemu
> =
> I tend to favor limiting the amount of necessary hoops to jump through and 
> would thus investigate working directly with Qemu, maintaining an open 
> connection to the monitor to assert status.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)