[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-15 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196797#comment-15196797
 ] 

Vinod Kone commented on MESOS-3902:
---

I think Location should be "<scheme>://<leader-host>:<leader-port>/<path>" when someone hits /<path> on the non-leading master.

> The Location header when non-leading master redirects to leading master is 
> incomplete.
> --
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
>Reporter: Ben Whitehead
>Assignee: Ashwin Murthy
>  Labels: mesosphere
>
> The master now sets a location header, but it's incomplete. The path of the 
> URL isn't set. Consider an example:
> {code}
> > cat /tmp/subscribe-1072944352375841456 | http POST 
> > 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
> POST /api/v1/scheduler HTTP/1.1
> Accept: application/json
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Content-Length: 123
> Content-Type: application/x-protobuf
> Host: 127.1.0.3:5050
> User-Agent: HTTPie/0.9.0
> +-----------------------------------------+
> | NOTE: binary data not shown in terminal |
> +-----------------------------------------+
> HTTP/1.1 307 Temporary Redirect
> Content-Length: 0
> Date: Fri, 26 Feb 2016 00:54:41 GMT
> Location: //127.1.0.1:5050
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-15 Thread Ashwin Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196774#comment-15196774
 ] 

Ashwin Murthy commented on MESOS-3902:
--

[~vinodkone]

Clearly, we see the 307 Temporary Redirect, and the Location header is set as 
above. Would the fix be to ensure:

1. the URI scheme shows up as "http:"
2. the path of the URL is set by adding the suffix "/api/v1/scheduler"

Is this the expected behavior? The Mesos HTTP API spec doesn't say this; in it, 
the Location header is just <masterhost>:<port>.



[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-15 Thread Ashwin Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196772#comment-15196772
 ] 

Ashwin Murthy commented on MESOS-3902:
--

I tried to repro this in our production mesos cluster. I see the following:

ashwinm@mgmt01-sjc1:~$ curl -v -X POST -H "Content-Type: application/json" 
--data @body.json 
http://compute34-sjc1.prod.uber.internal:5050/api/v1/scheduler 
* About to connect() to compute34-sjc1.prod.uber.internal port 5050 (#0)
*   Trying 10.162.29.25... connected
> POST /api/v1/scheduler HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.2d 
> zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> Host: compute34-sjc1.prod.uber.internal:5050
> Accept: */*
> Content-Type: application/json
> Content-Length: 120
> 
* upload completely sent off: 120 out of 120 bytes
< HTTP/1.1 307 Temporary Redirect
< Date: Wed, 16 Mar 2016 05:01:13 GMT
< Location: //compute35-sjc1.prod.uber.internal:5050
< Content-Length: 0
< 
* Connection #0 to host compute34-sjc1.prod.uber.internal left intact
* Closing connection #0




[jira] [Commented] (MESOS-4954) URI fetcher error message if plugin is not found is misleading.

2016-03-15 Thread Yong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196691#comment-15196691
 ] 

Yong Tang commented on MESOS-4954:
--

Added the review request:
https://reviews.apache.org/r/44883/


> URI fetcher error message if plugin is not found is misleading.
> 
>
> Key: MESOS-4954
> URL: https://issues.apache.org/jira/browse/MESOS-4954
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Anand Mazumdar
>Assignee: Yong Tang
>  Labels: newbie
>
> In {{src/uri/fetcher.cpp}}, if we are unable to create a plugin, we skip it 
> but log a misleading ERROR message:
> {code}
>   // NOTE: We skip the plugin if it cannot be created, instead of
>   // returning an Error so that we can still use other plugins.
>   LOG(ERROR) << "Failed to create URI fetcher plugin "
>  << "'"  << name << "': " << plugin.error();
> {code}
> Ideally, it should be at most a {{LOG(INFO)}} that clearly specifies the 
> relevant plugin was skipped because it was not found.





[jira] [Commented] (MESOS-4955) Generalize perf event parsing to match PerfStatistics field name for "perf stat"

2016-03-15 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196680#comment-15196680
 ] 

haosdent commented on MESOS-4955:
-

It seems [~Bartek Plotka] submitted a patch in 
https://issues.apache.org/jira/browse/MESOS-4595 which is more general.

> Generalize perf event parsing to match PerfStatistics field name for "perf stat"
> --
>
> Key: MESOS-4955
> URL: https://issues.apache.org/jira/browse/MESOS-4955
> Project: Mesos
>  Issue Type: Improvement
>  Components: isolation
>Reporter: Fan Du
>Assignee: Fan Du
>
> Current 
> [design|https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=include/mesos/mesos.proto;h=deb9c0910a27afd67276f54b3f666a878212727b;hb=HEAD#l981]
>  does not support event like:
> {{SUBSYS/EVENT  <- Most notable intel_cqm/llc_occupancy/}}
> {{SUBSYS:EVENT  <- All tracepoint events}}
> This gap could be filled by matching EVENT against the PerfStatistics proto 
> message field name.





[jira] [Commented] (MESOS-4955) Generalize perf event parsing to match PerfStatistics field name for "perf stat"

2016-03-15 Thread Fan Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196627#comment-15196627
 ] 

Fan Du commented on MESOS-4955:
---

I have posted an RFC review request to evaluate whether this ticket is worth 
pursuing further:
https://reviews.apache.org/r/44881/

By the way, I currently use {{intel_cqm/llc_occupancy/}} and 
{{sched:intel_cqm/llc_occupancy/}} as examples only; other events could easily 
be extended later on.



[jira] [Updated] (MESOS-3890) Add notion of evictable task to RunTaskMessage

2016-03-15 Thread Guangya Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu updated MESOS-3890:
---
Description: 
{code}
// Evict Resources to launch tasks.
  message Revocation {
optional FrameworkID framework_id = 1;
required string role = 2;
repeated Resource revocable_resources = 3;
  }
  repeated Revocation revocations = 5;
{code}

  was:
{code}
message RunTaskMessage {
  ...
  // Evict Resources to launch tasks.
  message Revocation {
optional frameworkId frameworkId = 1;
string role = 2;
repeated Resource revocable_resources = 3;
  } 
  repeated Revocation revocations = 3;
}
{code}


> Add notion of evictable task to RunTaskMessage
> --
>
> Key: MESOS-3890
> URL: https://issues.apache.org/jira/browse/MESOS-3890
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Guangya Liu
>  Labels: mesosphere
>
> {code}
> // Evict Resources to launch tasks.
>   message Revocation {
> optional FrameworkID framework_id = 1;
> required string role = 2;
> repeated Resource revocable_resources = 3;
>   }
>   repeated Revocation revocations = 5;
> {code}





[jira] [Assigned] (MESOS-4954) URI fetcher error message if plugin is not found is misleading.

2016-03-15 Thread Yong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Tang reassigned MESOS-4954:


Assignee: Yong Tang



[jira] [Commented] (MESOS-4907) ClangTidy Integration

2016-03-15 Thread Shuai Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196599#comment-15196599
 ] 

Shuai Lin commented on MESOS-4907:
--

+1, thanks for the reply.

> ClangTidy Integration
> -
>
> Key: MESOS-4907
> URL: https://issues.apache.org/jira/browse/MESOS-4907
> Project: Mesos
>  Issue Type: Epic
>  Components: technical debt
>Reporter: Michael Park
>  Labels: gsoc, gsoc2016, mentor, mesosphere
>
> While {{cpplint}} has been a useful tool as a C++ linter for quite some time,
> it carries limitations since it does its best without actually parsing C++.
> [ClangTidy|http://clang.llvm.org/extra/clang-tidy/] is a tool built on Clang,
> and has the advantage that it has access to a full AST.
> There are many checks that come built-in with {{clang-tidy}} which are very
> useful, but we can extend it to fit Mesos coding style and patterns as well.
> The initial phase of the project will be to create a basis with which to
> leverage the existing checks as applicable to Mesos, then to create a
> scaffolding to add custom checks, and ways to integrate the custom checks
> into infrastructure such as Mesos ReviewBot or Apache CI.
> I've done some preliminary, experimental work on this for a Hackathon project
> and have given a
> [presentation|https://docs.google.com/presentation/d/1z_qGzpY7Mt46TXxuLRW6M5HcCWBLRz6UJfd4bPknYeg/edit?usp=sharing]





[jira] [Updated] (MESOS-3890) Add notion of evictable task to RunTaskMessage

2016-03-15 Thread Guangya Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu updated MESOS-3890:
---
Description: 
{code}
message RunTaskMessage {
  ...
  // Evict Resources to launch tasks.
  message Revocation {
optional frameworkId frameworkId = 1;
string role = 2;
repeated Resource revocable_resources = 3;
  } 
  repeated Revocation revocations = 3;
}
{code}

  was:
{code}
message RunTaskMessageTaskInfo {
  ...
  // Evict Resources to launch tasks.
  message RevocationEvictResource {
string role = 1;
repeated Resource revocable_resources = 2;
  } 
  repeated RevocationEvictResource revocationsevict_resources = 3;
}
{code}


> Add notion of evictable task to RunTaskMessage
> --
>
> Key: MESOS-3890
> URL: https://issues.apache.org/jira/browse/MESOS-3890
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Guangya Liu
>  Labels: mesosphere
>
> {code}
> message RunTaskMessage {
>   ...
>   // Evict Resources to launch tasks.
>   message Revocation {
> optional frameworkId frameworkId = 1;
> string role = 2;
> repeated Resource revocable_resources = 3;
>   } 
>   repeated Revocation revocations = 3;
> }
> {code}





[jira] [Created] (MESOS-4954) URI fetcher error message if plugin is not found is misleading.

2016-03-15 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-4954:
-

 Summary: URI fetcher error message if plugin is not found is 
misleading.
 Key: MESOS-4954
 URL: https://issues.apache.org/jira/browse/MESOS-4954
 Project: Mesos
  Issue Type: Bug
  Components: containerization
Reporter: Anand Mazumdar


In {{src/uri/fetcher.cpp}}, if we are unable to create a plugin, we skip it but 
log a misleading ERROR message:

{code}
  // NOTE: We skip the plugin if it cannot be created, instead of
  // returning an Error so that we can still use other plugins.
  LOG(ERROR) << "Failed to create URI fetcher plugin "
 << "'"  << name << "': " << plugin.error();
{code}

Ideally, it should be at most a {{LOG(INFO)}} that clearly specifies the 
relevant plugin was skipped because it was not found.





[jira] [Commented] (MESOS-4802) Update leveldb patch file to support PowerPC LE

2016-03-15 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196562#comment-15196562
 ] 

Vinod Kone commented on MESOS-4802:
---

If leveldb-1.18 allows us to not carry a patch, let's just upgrade to 
leveldb-1.18 instead of patching leveldb-1.4.

> Update leveldb patch file to support PowerPC LE
> --
>
> Key: MESOS-4802
> URL: https://issues.apache.org/jira/browse/MESOS-4802
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Chen Zhiwei
>
> See: https://github.com/google/leveldb/releases/tag/v1.18 for improvements / 
> bug fixes.
> The motivation is that leveldb 1.18 has officially supported IBM Power 
> (ppc64le), so this is needed by 
> [MESOS-4312|https://issues.apache.org/jira/browse/MESOS-4312].
> Update: Since someone updated leveldb to 1.4, I will only update the patch 
> file to support PowerPC LE, because I don't think upgrading a 3rdparty 
> library frequently is a good thing.





[jira] [Created] (MESOS-4953) DockerFetcherPluginTest.INTERNET_CURL_FetchImage is flaky

2016-03-15 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-4953:
-

 Summary: DockerFetcherPluginTest.INTERNET_CURL_FetchImage is flaky
 Key: MESOS-4953
 URL: https://issues.apache.org/jira/browse/MESOS-4953
 Project: Mesos
  Issue Type: Bug
  Components: containerization, fetcher
Reporter: Anand Mazumdar


This test fails quite regularly on my linux box. Relevant verbose logs:

{code}
[ RUN  ] DockerFetcherPluginTest.INTERNET_CURL_FetchManifest
E0315 17:28:59.233052 25940 shell.hpp:106] Command 'hadoop version 2>&1' 
failed; this is the output:
sh: 1: hadoop: not found
E0315 17:28:59.233104 25940 fetcher.cpp:59] Failed to create URI fetcher plugin 
'hadoop': Failed to create HDFS client: Failed to execute 'hadoop version 
2>&1'; the command was either not found or exited with a non-zero exit status: 
127
../../src/tests/uri_fetcher_tests.cpp:230: Failure
Failed to wait 1mins for fetcher.get()->fetch(uri, dir)
{code}





[jira] [Created] (MESOS-4952) Annoying image provisioner logging for when images are not used.

2016-03-15 Thread Yan Xu (JIRA)
Yan Xu created MESOS-4952:
-

 Summary: Annoying image provisioner logging for when images are 
not used.
 Key: MESOS-4952
 URL: https://issues.apache.org/jira/browse/MESOS-4952
 Project: Mesos
  Issue Type: Bug
Reporter: Yan Xu
Priority: Minor


{{Provisioner::destroy()}} logs this message even when images are not used in 
the Mesos cluster:

{noformat:title=}
Ignoring destroy request for unknown container 
597f511e-479d-4632-a3b9-43b1e368c744
{noformat}

See 
[code|https://github.com/apache/mesos/blob/37958fd70de1998e6c29b643abd4f43dd1ef4c79/src/slave/containerizer/mesos/provisioner/provisioner.cpp#L306].

This can be surprising and annoying to people who are not actually using this 
feature: the container is totally valid, it just isn't using images.

Let's at least tune it down to VLOG(1).





[jira] [Commented] (MESOS-2281) Deprecate plain text Credential format.

2016-03-15 Thread Cody Maloney (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196404#comment-15196404
 ] 

Cody Maloney commented on MESOS-2281:
-

The JSON format was added as part of MESOS-1391. The original author intended 
to deprecate the legacy credential format.

Original commit: 
https://github.com/apache/mesos/commit/2cb3761c6bfa80b956eaafde9c69eafaeac3deae
Review:
https://reviews.apache.org/r/2/

The JSON format should allow us to eliminate some code, as well as provide a 
more robust parser to ensure people don't read or write garbage. (In one case a 
newline or space was accidentally added to the name of a principal, which threw 
all the parsing off slightly so that things stopped working properly.)

> Deprecate plain text Credential format.
> ---
>
> Key: MESOS-2281
> URL: https://issues.apache.org/jira/browse/MESOS-2281
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Affects Versions: 0.21.1
>Reporter: Cody Maloney
>Assignee: Jan Schlicht
>  Labels: mesosphere, security, tech-debt
>
> Currently two formats of credentials are supported: JSON
> {code}
>   "credentials": [
> {
>   "principal": "sherman",
>   "secret": "kitesurf"
> }
> {code}
> And a newline-delimited file:
> {code}
> principal1 secret1
> principal2 secret2
> {code}
> We should deprecate the newline format and eventually remove support for it.





[jira] [Commented] (MESOS-4907) ClangTidy Integration

2016-03-15 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196322#comment-15196322
 ] 

Michael Park commented on MESOS-4907:
-

[~lins05] Yes, this will be something that people who want to run 
{{clang-tidy}} locally will have to be aware of. However, with the custom Mesos 
checks that will be added, I'm not sure that folks will be willing to build 
their own {{clang-tidy}} to run locally. In the short term, it'll probably be 
more beneficial for {{clang-tidy}} to be integrated into a CI tool like 
ReviewBot or ASF Buildbot.



[jira] [Comment Edited] (MESOS-4947) Persistent volumes are not listed

2016-03-15 Thread Max Neunhöffer (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196313#comment-15196313
 ] 

Max Neunhöffer edited comment on MESOS-4947 at 3/15/16 9:41 PM:


Thanks for this information. This at least gives us a way to work around our 
customers' problem with zombie resources. We now have both a manual and a 
(semi-) automatic way to clean up zombie resources. This is closed now in the 
hope of better "on-board" cleanup facilities in future Mesos versions.


was (Author: neunhoef):
Thanks for this information. This at least gives us a way to work around our 
customers' problem with zombie resources. We now have both a manual and a 
(semi-) automatic way to clean up zombie resources. I will close this now in 
the hope of better "on-board" cleanup facilities in future Mesos versions.

> Persistent volumes are not listed
> -
>
> Key: MESOS-4947
> URL: https://issues.apache.org/jira/browse/MESOS-4947
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.1
> Environment: DCOS 1.6.1 on AWS
>Reporter: Max Neunhöffer
>Assignee: Neil Conway
>
> TL;DR:  In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either (as described here: 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes).
> Situation: There are Mesos agents in the cluster that have dynamic 
> reservations as well as persistent volumes for role "arangodb" with principal 
> "arangodb" but the corresponding framework does no longer exist (was 
> "destroyed" by clicking in the Marathon UI). Let's call these "Zombie 
> persistent volumes". We try to cleanup this mess manually (or automatically).
> Effect: According to 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
>  one should be able to list these zombies using the 
> http://<master>/mesos/slaves JSON/REST endpoint. We see a summary 
> of the dynamic reservations, but the persistent disks do not appear. As a 
> consequence we can neither use the /destroy-volumes API nor the /unreserve 
> API to perform manual or automatic cleanup.
> Additional information: 
>   - If we start a new framework with role "arangodb" and principal "arangodb" 
> it will receive resource offers containing the dynamic reservations *and* the 
> persistent volumes.
>   - In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either. So this might not be limited to zombie 
> persistent volumes.





[jira] [Updated] (MESOS-4951) Enable actors to set an existing endpoint's authentication realm

2016-03-15 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-4951:
-
Description: To prepare for MESOS-4902, the Mesos master and agent need a 
way to set the authentication realm of an endpoint that has already been 
installed. Since some endpoints (like {{/profiler/*}}) get installed in 
libprocess, the master/agent should be able to specify during initialization 
what authentication realm the libprocess-level endpoints will be authenticated 
under.  (was: To prepare for MESOS-4902, the Mesos master and agent need a way 
to set the authentication realm of an endpoint that has already been installed. 
Since some endpoints (like {{/profiler/*}}) get installed in libprocess, the 
master/agent should be able to specify during initialization what 
authentication realm the libprocess-level endpoints should be authenticated 
under.)

> Enable actors to set an existing endpoint's authentication realm
> 
>
> Key: MESOS-4951
> URL: https://issues.apache.org/jira/browse/MESOS-4951
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess, slave
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authentication, http, mesosphere
>
> To prepare for MESOS-4902, the Mesos master and agent need a way to set the 
> authentication realm of an endpoint that has already been installed. Since 
> some endpoints (like {{/profiler/*}}) get installed in libprocess, the 
> master/agent should be able to specify during initialization what 
> authentication realm the libprocess-level endpoints will be authenticated 
> under.





[jira] [Updated] (MESOS-4951) Enable actors to set an existing endpoint's authentication realm

2016-03-15 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-4951:
-
Story Points: 2  (was: 3)



[jira] [Commented] (MESOS-4907) ClangTidy Integration

2016-03-15 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196292#comment-15196292
 ] 

Michael Park commented on MESOS-4907:
-

[~sachiths] Do you know how to go through the GSoC application process as a 
student?



[jira] [Assigned] (MESOS-4902) Add authentication to remaining agent endpoints

2016-03-15 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-4902:


Assignee: Greg Mann

> Add authentication to remaining agent endpoints
> ---
>
> Key: MESOS-4902
> URL: https://issues.apache.org/jira/browse/MESOS-4902
> Project: Mesos
>  Issue Type: Improvement
>  Components: HTTP API
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authentication, http, mesosphere, security
>
> In addition to the endpoints addressed by MESOS-4850, the following endpoints 
> would also benefit from HTTP authentication:
> * {{/files/*}}
> * {{/profiler/*}}
> * {{/logging/toggle}}
> * {{/metrics/snapshot}}
> * {{/monitor/statistics}}
> * {{/system/stats.json}}
> Adding HTTP authentication to these endpoints is a bit more complicated: some 
> endpoints are defined at the libprocess level, while others are defined in 
> code that is shared by the master and agent.
> While working on MESOS-4850, it became apparent that since our tests use the 
> same instance of libprocess for both master and agent, different default 
> authentication realms must be used for master/agent so that HTTP 
> authentication can be independently enabled/disabled for each.
> We should establish a mechanism for making an endpoint authenticated that 
> allows us to:
> 1) Install an endpoint like {{/files}}, whose code is shared by the master 
> and agent, with different authentication realms for the master and agent
> 2) Avoid hard-coding a default authentication realm into libprocess, to 
> permit the use of different authentication realms for the master and agent 
> and to keep application-level concerns from leaking into libprocess
> Another option would be to use a single default authentication realm and 
> always enable or disable HTTP authentication for *both* the master and agent 
> in tests. However, this wouldn't allow us to test scenarios where HTTP 
> authentication is enabled on one but disabled on the other.
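The realm-bookkeeping idea in the description can be sketched independently of libprocess. The class and method names below are hypothetical illustrations of the mechanism (install an endpoint under a realm, re-assign the realm later), not the actual Mesos or libprocess API:

```python
# Hypothetical sketch (not the real libprocess API): track which
# authentication realm each installed endpoint belongs to, and allow the
# master/agent to re-assign a realm after the endpoint is installed.
class RealmRegistry:
    def __init__(self):
        self._realms = {}

    def install(self, endpoint, realm):
        # Install an endpoint under an initial authentication realm.
        self._realms[endpoint] = realm

    def set_realm(self, endpoint, realm):
        # Re-assign an already-installed endpoint; fail if never installed.
        if endpoint not in self._realms:
            return False
        self._realms[endpoint] = realm
        return True

    def realm_of(self, endpoint):
        return self._realms.get(endpoint)

registry = RealmRegistry()
# '/files' is shared code: the master and agent would install it under
# different realms so HTTP auth can be toggled independently for each.
registry.install("/files", "mesos-master")
# A libprocess-level endpoint installed with a placeholder realm can be
# re-assigned during master/agent initialization (the MESOS-4951 idea).
registry.install("/profiler", "default")
registry.set_realm("/profiler", "mesos-agent")
print(registry.realm_of("/profiler"))  # mesos-agent
```

The point of the sketch is that realm assignment is data, not a hard-coded default, so libprocess never needs to know about master- or agent-specific realms.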





[jira] [Assigned] (MESOS-4951) Enable actors to set an existing endpoint's authentication realm

2016-03-15 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-4951:


Assignee: Greg Mann

> Enable actors to set an existing endpoint's authentication realm
> 
>
> Key: MESOS-4951
> URL: https://issues.apache.org/jira/browse/MESOS-4951
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess, slave
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: authentication, http, mesosphere
>
> To prepare for MESOS-4902, the Mesos master and agent need a way to set the 
> authentication realm of an endpoint that has already been installed. Since 
> some endpoints (like {{/profiler/*}}) get installed in libprocess, the 
> master/agent should be able to specify during initialization what 
> authentication realm the libprocess-level endpoints should be authenticated 
> under.





[jira] [Created] (MESOS-4951) Enable actors to set an existing endpoint's authentication realm

2016-03-15 Thread Greg Mann (JIRA)
Greg Mann created MESOS-4951:


 Summary: Enable actors to set an existing endpoint's 
authentication realm
 Key: MESOS-4951
 URL: https://issues.apache.org/jira/browse/MESOS-4951
 Project: Mesos
  Issue Type: Bug
  Components: libprocess, slave
Reporter: Greg Mann


To prepare for MESOS-4902, the Mesos master and agent need a way to set the 
authentication realm of an endpoint that has already been installed. Since some 
endpoints (like {{/profiler/*}}) get installed in libprocess, the master/agent 
should be able to specify during initialization what authentication realm the 
libprocess-level endpoints should be authenticated under.





[jira] [Commented] (MESOS-4947) Persistent volumes are not listed

2016-03-15 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196207#comment-15196207
 ] 

Joerg Schad commented on MESOS-4947:


Thanks, that solved the issue!

> Persistent volumes are not listed
> -
>
> Key: MESOS-4947
> URL: https://issues.apache.org/jira/browse/MESOS-4947
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.1
> Environment: DCOS 1.6.1 on AWS
>Reporter: Max Neunhöffer
>Assignee: Neil Conway
>
> TL;DR:  In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either (as described here: 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes).
> Situation: There are Mesos agents in the cluster that have dynamic 
> reservations as well as persistent volumes for role "arangodb" with principal 
> "arangodb" but the corresponding framework no longer exists (it was 
> "destroyed" by clicking in the Marathon UI). Let's call these "Zombie 
> persistent volumes". We try to cleanup this mess manually (or automatically).
> Effect: According to 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
>  one should be able to list these zombies using 
> http:///mesos/slaves JSON/REST endpoint. We see a summary 
> of the dynamic reservations, but the persistent disks do not appear. As a 
> consequence we can neither use the /destroy-volumes API nor the /unreserve 
> API to perform manual or automatic cleanup.
> Additional information: 
>   - If we start a new framework with role "arangodb" and principal "arangodb" 
> it will receive resource offers containing the dynamic reservations *and* the 
> persistent volumes.
>   - In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either. So this might not be limited to zombie 
> persistent volumes.
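For reference, once /slaves does expose them, persistent volumes could be filtered out of the response roughly as below. The field names ("reserved_resources_full" and the nested "disk"/"persistence" sections) are assumptions about the response shape for illustration, not a guaranteed schema:

```python
import json

# Hedged sketch: walk a /slaves-style response and pick out disk
# resources that carry a "persistence" section, i.e. persistent volumes.
sample = json.loads("""
{
  "slaves": [
    {
      "id": "agent-1",
      "reserved_resources_full": {
        "arangodb": [
          {"name": "cpus", "type": "SCALAR", "scalar": {"value": 1.0}},
          {"name": "disk", "type": "SCALAR", "scalar": {"value": 1024.0},
           "disk": {"persistence": {"id": "vol-1", "principal": "arangodb"}}}
        ]
      }
    }
  ]
}
""")

def persistent_volumes(state):
    # Returns (agent id, role, volume id) for every persistent volume.
    volumes = []
    for agent in state.get("slaves", []):
        for role, resources in agent.get("reserved_resources_full", {}).items():
            for r in resources:
                persistence = r.get("disk", {}).get("persistence")
                if persistence is not None:
                    volumes.append((agent["id"], role, persistence["id"]))
    return volumes

print(persistent_volumes(sample))  # [('agent-1', 'arangodb', 'vol-1')]
```

The volume ids recovered this way are exactly what the /destroy-volumes and /unreserve cleanup calls described above would need.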





[jira] [Updated] (MESOS-4864) Add flag to specify available Nvidia GPUs on an agent's command line.

2016-03-15 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4864:
---
Sprint: Mesosphere Sprint 31

> Add flag to specify available Nvidia GPUs on an agent's command line.
> -
>
> Key: MESOS-4864
> URL: https://issues.apache.org/jira/browse/MESOS-4864
> Project: Mesos
>  Issue Type: Task
>  Components: isolation
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: flags, gpu, isolation, mesosphere
>
> In the initial GPU support we will not do auto-discovery of GPUs on an agent. 
>  As such, an operator will need to specify a flag on the command line, 
> listing all of the GPUs available on the system.





[jira] [Updated] (MESOS-4865) Add GPUs as an explicit resource.

2016-03-15 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4865:
---
Sprint: Mesosphere Sprint 31

> Add GPUs as an explicit resource.
> -
>
> Key: MESOS-4865
> URL: https://issues.apache.org/jira/browse/MESOS-4865
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: containerizer, mesosphere, resources
>
> We will add "gpus" as an explicitly recognized resource in Mesos, akin to 
> cpus, memory, ports, and disk.  In the containerizer, we will verify that the 
> number of GPU resources passed in via the --resources flag matches the list 
> of GPUs passed in via the --nvidia_gpus flag.  In the future we will add 
> autodiscovery so this matching is unnecessary.  However, we will always have 
> to pass "gpus" as a resource to make any GPU available on the system (unlike 
> for cpus and memory, where the default is probed).
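The containerizer check described above amounts to comparing a declared count against a declared list. A hedged sketch, with flag names mirroring the ticket but the parsing purely illustrative (Mesos itself implements this in C++):

```python
# Verify that the number of "gpus" declared in --resources matches the
# number of GPU devices listed in --nvidia_gpus.
def validate_gpu_flags(resources, nvidia_gpus):
    # resources: e.g. "cpus:4;mem:1024;gpus:2"
    # nvidia_gpus: e.g. "0,1" (GPU device indices)
    declared = 0
    for part in resources.split(";"):
        name, _, value = part.partition(":")
        if name.strip() == "gpus":
            declared = int(float(value))
    listed = [] if nvidia_gpus == "" else nvidia_gpus.split(",")
    return declared == len(listed)

print(validate_gpu_flags("cpus:4;mem:1024;gpus:2", "0,1"))  # True
print(validate_gpu_flags("cpus:4;gpus:1", "0,1"))           # False
```

Once auto-discovery lands, the list side of this comparison disappears and only the declared count remains.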





[jira] [Updated] (MESOS-4863) Add infrastructure for Nvidia GPU specific tests.

2016-03-15 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4863:
---
Sprint: Mesosphere Sprint 31

> Add infrastructure for Nvidia GPU specific tests.
> -
>
> Key: MESOS-4863
> URL: https://issues.apache.org/jira/browse/MESOS-4863
> Project: Mesos
>  Issue Type: Task
>  Components: isolation, test
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: gpu, isolation, mesosphere, test
>
> We need to be able to run unit tests that verify GPU isolation, as well as 
> run full blown tests that actually exercise the GPUs.
> These tests should only build when the proper configure flags are set for 
> enabling Nvidia GPU support.





[jira] [Updated] (MESOS-4861) Add configure flags to build with Nvidia GPU support.

2016-03-15 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4861:
---
Sprint: Mesosphere Sprint 31

> Add configure flags to build with Nvidia GPU support.
> -
>
> Key: MESOS-4861
> URL: https://issues.apache.org/jira/browse/MESOS-4861
> Project: Mesos
>  Issue Type: Task
>  Components: isolation
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: configuration, gpu, isolation, mesosphere
>
> The configure flags can be used to enable Nvidia GPU support, as well as 
> specify the installation directories of the nvml header and library files if 
> not already installed in standard include/library paths on the system.
> They will also be used to conditionally build support for Nvidia GPUs into 
> Mesos.





[jira] [Updated] (MESOS-4623) Add a stub Nvidia GPU isolator.

2016-03-15 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4623:
---
Sprint: Mesosphere Sprint 31

> Add a stub Nvidia GPU isolator.
> ---
>
> Key: MESOS-4623
> URL: https://issues.apache.org/jira/browse/MESOS-4623
> Project: Mesos
>  Issue Type: Task
>  Components: isolation
>Reporter: Benjamin Mahler
>Assignee: Kevin Klues
>  Labels: gpu, isolator, mesosphere
>
> We'll first wire up a skeleton Nvidia GPU isolator, which needs to be guarded 
> by a configure flag due to the dependency on NVML.





[jira] [Updated] (MESOS-4860) Add a script to install the Nvidia GDK on a host.

2016-03-15 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4860:
---
Sprint: Mesosphere Sprint 31

> Add a script to install the Nvidia GDK on a host.
> -
>
> Key: MESOS-4860
> URL: https://issues.apache.org/jira/browse/MESOS-4860
> Project: Mesos
>  Issue Type: Task
>  Components: isolation
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Minor
>  Labels: gpu, isolator, mesosphere
>
> This script can be used to install the Nvidia GDK for CUDA 7.5 on a
> Mesos development machine. The purpose of the Nvidia GDK is to provide
> the header files (nvml.h) and library files (libnvidia-ml.so) necessary
> to build Mesos with Nvidia GPU support.
> If the machine on which Mesos is being compiled doesn't have any GPUs,
> then libnvidia-ml.so consists only of stubs, allowing Mesos to build
> and run, but not actually do anything useful under the hood. This
> enables us to build a GPU-enabled Mesos on a development machine
> without GPUs and then deploy it to a production machine with GPUs and
> be reasonably sure it will work.





[jira] [Updated] (MESOS-4926) Add a list parser for comma separated integers in flags.

2016-03-15 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4926:
---
Sprint: Mesosphere Sprint 31

> Add a list parser for comma separated integers in flags.
> 
>
> Key: MESOS-4926
> URL: https://issues.apache.org/jira/browse/MESOS-4926
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: mesosphere
>
> Some flags require lists of integers to be passed in.  We should have an 
> explicit parser for this instead of relying on ad hoc solutions.
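A minimal sketch of the intended contract (Mesos would implement this in C++ in stout's flags machinery; this Python version only illustrates the behavior):

```python
# Parse a comma-separated list of integers such as "0,1,2" into [0, 1, 2],
# rejecting malformed input instead of relying on ad hoc splitting at
# each call site.
def parse_int_list(value):
    if value == "":
        return []
    try:
        return [int(item.strip()) for item in value.split(",")]
    except ValueError:
        raise ValueError(
            "not a comma-separated list of integers: %r" % value)

print(parse_int_list("0,1,2"))  # [0, 1, 2]
```

Centralizing the parser also gives every flag the same error message for malformed input, instead of one failure mode per call site.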





[jira] [Created] (MESOS-4950) Implement reconnect functionality in the scheduler library.

2016-03-15 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-4950:
-

 Summary: Implement reconnect functionality in the scheduler library.
 Key: MESOS-4950
 URL: https://issues.apache.org/jira/browse/MESOS-4950
 Project: Mesos
  Issue Type: Bug
Reporter: Anand Mazumdar


Currently, there is no way for the schedulers to force a reconnection attempt 
with the master using the scheduler library {{src/scheduler/scheduler.cpp}}. It 
is especially useful when there is a one-way network partition with the 
master: the scheduler stops receiving {{HEARTBEAT}} events and might want to 
force a reconnection attempt instead of relying solely on the 
{{disconnected}} callback.





[jira] [Updated] (MESOS-4928) Remove all '.get().' calls on Option / Try variables in the resources abstraction.

2016-03-15 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4928:
---
Sprint: Mesosphere Sprint 31

> Remove all '.get().' calls on Option / Try variables in the resources 
> abstraction.
> --
>
> Key: MESOS-4928
> URL: https://issues.apache.org/jira/browse/MESOS-4928
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: mesosphere
>
> When possible, {{.get()}} calls should be replaced by {{->}} for {{Option}} / 
> {{Try}} variables.  This ticket only proposes a blanket change for this in 
> the resource abstraction files, not the code base as a whole.  This is in 
> preparation for introducing the new GPU resource.  Without this change, I 
> would need to use the old {{.get()}} calls.  Instead, I propose to fix the 
> surrounding old code first, so that consistency steers new code toward the 
> preferred style.





[jira] [Updated] (MESOS-4809) Allow parallel execution of tests

2016-03-15 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-4809:

Assignee: Benjamin Bannier

> Allow parallel execution of tests
> -
>
> Key: MESOS-4809
> URL: https://issues.apache.org/jira/browse/MESOS-4809
> Project: Mesos
>  Issue Type: Epic
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Minor
>
> We should allow parallel execution of tests. There are two flavors to this:
> (a) tests are run in parallel in the same process, or
> (b) tests are run in parallel with separate processes (e.g., with 
> gtest-parallel).
> While (a) likely has overall better performance, it depends on tests being 
> independent of global state (e.g., current directory, and others). On the 
> other hand, even (b) alone already improves execution time, and has much 
> weaker requirements.
> This epic tracks efforts to fix tests to allow scenario (b) above.





[jira] [Updated] (MESOS-4807) IOTest.BufferedRead writes to the current directory

2016-03-15 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-4807:

Labels: mesosphere newbie parallel-tests  (was: newbie parallel-tests)

> IOTest.BufferedRead writes to the current directory
> ---
>
> Key: MESOS-4807
> URL: https://issues.apache.org/jira/browse/MESOS-4807
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess, test
>Reporter: Benjamin Bannier
>Assignee: Yong Tang
>Priority: Minor
>  Labels: mesosphere, newbie, parallel-tests
> Fix For: 0.29.0
>
>
> libprocess's {{IOTest.BufferedRead}} writes to the current directory. This is 
> bad for a number of reasons, e.g.,
> * should the test fail, data might be leaked to random locations,
> * the test cannot be executed from a read-only directory, or
> * executing the same test in parallel would race on the existence of the 
> created file, and show bogus behavior.
> The test should probably be executed from a temporary directory, e.g., via 
> stout's {{TemporaryDirectoryTest}} fixture.
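The proposed fixture boils down to changing into a fresh temporary directory around the test body. A minimal sketch (the class name is borrowed from stout for flavor; this is not the stout implementation):

```python
import os
import shutil
import tempfile

# Run each test from a fresh temporary directory so that files created
# with relative paths never leak into the source tree, and parallel test
# runs cannot race on the same file name.
class TemporaryDirectoryTest:
    def setup(self):
        self._old_cwd = os.getcwd()
        self._tmpdir = tempfile.mkdtemp()
        os.chdir(self._tmpdir)

    def teardown(self):
        os.chdir(self._old_cwd)
        shutil.rmtree(self._tmpdir)  # remove everything the test wrote

fixture = TemporaryDirectoryTest()
fixture.setup()
with open("buffered_read.txt", "w") as f:  # relative path: lands in tmpdir
    f.write("data")
assert os.path.exists(os.path.join(fixture._tmpdir, "buffered_read.txt"))
fixture.teardown()
```

Because each test gets its own directory, scenario (b) from MESOS-4809 (separate parallel processes) stops racing on shared files.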





[jira] [Updated] (MESOS-4948) Move maintenance tests to use the new scheduler library interface.

2016-03-15 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4948:
--
Description: 
We need to move the existing maintenance tests to use the new scheduler 
interface. We have already moved one test, 
{{MasterMaintenanceTest.PendingUnavailabilityTest}}, to the new interface. 
It would be good to move the two remaining tests as well, since the old 
interface can lead to failures where a stack object is referenced after it 
has already been destroyed. Detailed log from an ASF CI build failure:

{code}
[ RUN  ] MasterMaintenanceTest.InverseOffers
I0315 04:16:50.786032  2681 leveldb.cpp:174] Opened db in 125.361171ms
I0315 04:16:50.836374  2681 leveldb.cpp:181] Compacted db in 50.254411ms
I0315 04:16:50.836470  2681 leveldb.cpp:196] Created db iterator in 25917ns
I0315 04:16:50.836488  2681 leveldb.cpp:202] Seeked to beginning of db in 3291ns
I0315 04:16:50.836498  2681 leveldb.cpp:271] Iterated through 0 keys in the db 
in 253ns
I0315 04:16:50.836549  2681 replica.cpp:779] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0315 04:16:50.837474  2702 recover.cpp:447] Starting replica recovery
I0315 04:16:50.837565  2681 cluster.cpp:183] Creating default 'local' authorizer
I0315 04:16:50.838191  2702 recover.cpp:473] Replica is in EMPTY status
I0315 04:16:50.839532  2704 replica.cpp:673] Replica in EMPTY status received a 
broadcasted recover request from (4784)@172.17.0.4:39845
I0315 04:16:50.839754  2705 recover.cpp:193] Received a recover response from a 
replica in EMPTY status
I0315 04:16:50.841893  2704 recover.cpp:564] Updating replica status to STARTING
I0315 04:16:50.842566  2703 master.cpp:376] Master 
c326bc68-2581-48d4-9dc4-0d6f270bdda1 (01fcd642f65f) started on 172.17.0.4:39845
I0315 04:16:50.842644  2703 master.cpp:378] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="false" --authenticate_http="true" --authenticate_slaves="true" 
--authenticators="crammd5" --authorizers="local" 
--credentials="/tmp/DE2Uaw/credentials" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
--quiet="false" --recovery_slave_removal_limit="100%" 
--registry="replicated_log" --registry_fetch_timeout="1mins" 
--registry_store_timeout="100secs" --registry_strict="true" 
--root_submissions="true" --slave_ping_timeout="15secs" 
--slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
--webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" 
--work_dir="/tmp/DE2Uaw/master" --zk_session_timeout="10secs"
I0315 04:16:50.843168  2703 master.cpp:425] Master allowing unauthenticated 
frameworks to register
I0315 04:16:50.843227  2703 master.cpp:428] Master only allowing authenticated 
slaves to register
I0315 04:16:50.843302  2703 credentials.hpp:35] Loading credentials for 
authentication from '/tmp/DE2Uaw/credentials'
I0315 04:16:50.843737  2703 master.cpp:468] Using default 'crammd5' 
authenticator
I0315 04:16:50.843969  2703 master.cpp:537] Using default 'basic' HTTP 
authenticator
I0315 04:16:50.844177  2703 master.cpp:571] Authorization enabled
I0315 04:16:50.844360  2708 hierarchical.cpp:144] Initialized hierarchical 
allocator process
I0315 04:16:50.844430  2708 whitelist_watcher.cpp:77] No whitelist given
I0315 04:16:50.848227  2703 master.cpp:1806] The newly elected leader is 
master@172.17.0.4:39845 with id c326bc68-2581-48d4-9dc4-0d6f270bdda1
I0315 04:16:50.848269  2703 master.cpp:1819] Elected as the leading master!
I0315 04:16:50.848292  2703 master.cpp:1508] Recovering from registrar
I0315 04:16:50.848563  2703 registrar.cpp:307] Recovering registrar
I0315 04:16:50.876277  2711 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 34.178445ms
I0315 04:16:50.876365  2711 replica.cpp:320] Persisted replica status to 
STARTING
I0315 04:16:50.876776  2711 recover.cpp:473] Replica is in STARTING status
I0315 04:16:50.878779  2706 replica.cpp:673] Replica in STARTING status 
received a broadcasted recover request from (4786)@172.17.0.4:39845
I0315 04:16:50.879240  2706 recover.cpp:193] Received a recover response from a 
replica in STARTING status
I0315 04:16:50.880100  2701 recover.cpp:564] Updating replica status to VOTING
I0315 04:16:50.915776  2701 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 35.472106ms
I0315 04:16:50.915868  2701 replica.cpp:320] Persisted replica status to VOTING
I0315 04:16:50.916158  2701 recover.cpp:578] Successfully joined the Paxos group
I0315 04:16:50.916363  2701 recover.cpp:462] Recover process terminated
I0315 04:16:50.917192  2701 log.cpp:659] Attempting to start the 

[jira] [Updated] (MESOS-4948) Move maintenance tests to use the new scheduler library interface.

2016-03-15 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4948:
--
Labels: flaky-test maintenance mesosphere newbie  (was: flaky-test 
maintenance mesosphere)

> Move maintenance tests to use the new scheduler library interface.
> --
>
> Key: MESOS-4948
> URL: https://issues.apache.org/jira/browse/MESOS-4948
> Project: Mesos
>  Issue Type: Bug
>  Components: tests
> Environment: Ubuntu 14.04, using gcc, with libevent and SSL enabled 
> (on ASF CI)
>Reporter: Greg Mann
>  Labels: flaky-test, maintenance, mesosphere, newbie
>
> We need to move the existing maintenance tests to use the new scheduler 
> interface. We have already moved one test, 
> {{MasterMaintenanceTest.PendingUnavailabilityTest}}, to the new interface. 
> It would be good to move the two remaining tests as well, since the old 
> interface can lead to failures where a stack object is referenced after it 
> has already been destroyed. Detailed log from an ASF CI build failure:
> {code}
> [ RUN  ] MasterMaintenanceTest.InverseOffers
> I0315 04:16:50.786032  2681 leveldb.cpp:174] Opened db in 125.361171ms
> I0315 04:16:50.836374  2681 leveldb.cpp:181] Compacted db in 50.254411ms
> I0315 04:16:50.836470  2681 leveldb.cpp:196] Created db iterator in 25917ns
> I0315 04:16:50.836488  2681 leveldb.cpp:202] Seeked to beginning of db in 
> 3291ns
> I0315 04:16:50.836498  2681 leveldb.cpp:271] Iterated through 0 keys in the 
> db in 253ns
> I0315 04:16:50.836549  2681 replica.cpp:779] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0315 04:16:50.837474  2702 recover.cpp:447] Starting replica recovery
> I0315 04:16:50.837565  2681 cluster.cpp:183] Creating default 'local' 
> authorizer
> I0315 04:16:50.838191  2702 recover.cpp:473] Replica is in EMPTY status
> I0315 04:16:50.839532  2704 replica.cpp:673] Replica in EMPTY status received 
> a broadcasted recover request from (4784)@172.17.0.4:39845
> I0315 04:16:50.839754  2705 recover.cpp:193] Received a recover response from 
> a replica in EMPTY status
> I0315 04:16:50.841893  2704 recover.cpp:564] Updating replica status to 
> STARTING
> I0315 04:16:50.842566  2703 master.cpp:376] Master 
> c326bc68-2581-48d4-9dc4-0d6f270bdda1 (01fcd642f65f) started on 
> 172.17.0.4:39845
> I0315 04:16:50.842644  2703 master.cpp:378] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_http="true" 
> --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/DE2Uaw/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --http_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="100secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" 
> --work_dir="/tmp/DE2Uaw/master" --zk_session_timeout="10secs"
> I0315 04:16:50.843168  2703 master.cpp:425] Master allowing unauthenticated 
> frameworks to register
> I0315 04:16:50.843227  2703 master.cpp:428] Master only allowing 
> authenticated slaves to register
> I0315 04:16:50.843302  2703 credentials.hpp:35] Loading credentials for 
> authentication from '/tmp/DE2Uaw/credentials'
> I0315 04:16:50.843737  2703 master.cpp:468] Using default 'crammd5' 
> authenticator
> I0315 04:16:50.843969  2703 master.cpp:537] Using default 'basic' HTTP 
> authenticator
> I0315 04:16:50.844177  2703 master.cpp:571] Authorization enabled
> I0315 04:16:50.844360  2708 hierarchical.cpp:144] Initialized hierarchical 
> allocator process
> I0315 04:16:50.844430  2708 whitelist_watcher.cpp:77] No whitelist given
> I0315 04:16:50.848227  2703 master.cpp:1806] The newly elected leader is 
> master@172.17.0.4:39845 with id c326bc68-2581-48d4-9dc4-0d6f270bdda1
> I0315 04:16:50.848269  2703 master.cpp:1819] Elected as the leading master!
> I0315 04:16:50.848292  2703 master.cpp:1508] Recovering from registrar
> I0315 04:16:50.848563  2703 registrar.cpp:307] Recovering registrar
> I0315 04:16:50.876277  2711 leveldb.cpp:304] Persisting metadata (8 bytes) to 
> leveldb took 34.178445ms
> I0315 04:16:50.876365  2711 replica.cpp:320] Persisted replica status to 
> STARTING
> I0315 04:16:50.876776  2711 recover.cpp:473] Replica is in 

[jira] [Updated] (MESOS-4948) Move maintenance tests to use the new scheduler library interface.

2016-03-15 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4948:
--
Summary: Move maintenance tests to use the new scheduler library interface. 
 (was: MasterMaintenanceTest.InverseOffers is flaky)

> Move maintenance tests to use the new scheduler library interface.
> --
>
> Key: MESOS-4948
> URL: https://issues.apache.org/jira/browse/MESOS-4948
> Project: Mesos
>  Issue Type: Bug
>  Components: tests
> Environment: Ubuntu 14.04, using gcc, with libevent and SSL enabled 
> (on ASF CI)
>Reporter: Greg Mann
>  Labels: flaky-test, maintenance, mesosphere
>
> This seems to be an issue distinct from the other tickets that have been 
> filed on this test. Failed log is included below; while the core dump appears 
> just after the start of {{MasterMaintenanceTest.InverseOffersFilters}}, it 
> looks to me like the segfault is triggered by one of the callbacks called at 
> the end of {{MasterMaintenanceTest.InverseOffers}}.
> {code}
> [ RUN  ] MasterMaintenanceTest.InverseOffers
> I0315 04:16:50.786032  2681 leveldb.cpp:174] Opened db in 125.361171ms
> I0315 04:16:50.836374  2681 leveldb.cpp:181] Compacted db in 50.254411ms
> I0315 04:16:50.836470  2681 leveldb.cpp:196] Created db iterator in 25917ns
> I0315 04:16:50.836488  2681 leveldb.cpp:202] Seeked to beginning of db in 
> 3291ns
> I0315 04:16:50.836498  2681 leveldb.cpp:271] Iterated through 0 keys in the 
> db in 253ns
> I0315 04:16:50.836549  2681 replica.cpp:779] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0315 04:16:50.837474  2702 recover.cpp:447] Starting replica recovery
> I0315 04:16:50.837565  2681 cluster.cpp:183] Creating default 'local' 
> authorizer
> I0315 04:16:50.838191  2702 recover.cpp:473] Replica is in EMPTY status
> I0315 04:16:50.839532  2704 replica.cpp:673] Replica in EMPTY status received 
> a broadcasted recover request from (4784)@172.17.0.4:39845
> I0315 04:16:50.839754  2705 recover.cpp:193] Received a recover response from 
> a replica in EMPTY status
> I0315 04:16:50.841893  2704 recover.cpp:564] Updating replica status to 
> STARTING
> I0315 04:16:50.842566  2703 master.cpp:376] Master 
> c326bc68-2581-48d4-9dc4-0d6f270bdda1 (01fcd642f65f) started on 
> 172.17.0.4:39845
> I0315 04:16:50.842644  2703 master.cpp:378] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_http="true" 
> --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/DE2Uaw/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --http_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="100secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" 
> --work_dir="/tmp/DE2Uaw/master" --zk_session_timeout="10secs"
> I0315 04:16:50.843168  2703 master.cpp:425] Master allowing unauthenticated 
> frameworks to register
> I0315 04:16:50.843227  2703 master.cpp:428] Master only allowing 
> authenticated slaves to register
> I0315 04:16:50.843302  2703 credentials.hpp:35] Loading credentials for 
> authentication from '/tmp/DE2Uaw/credentials'
> I0315 04:16:50.843737  2703 master.cpp:468] Using default 'crammd5' 
> authenticator
> I0315 04:16:50.843969  2703 master.cpp:537] Using default 'basic' HTTP 
> authenticator
> I0315 04:16:50.844177  2703 master.cpp:571] Authorization enabled
> I0315 04:16:50.844360  2708 hierarchical.cpp:144] Initialized hierarchical 
> allocator process
> I0315 04:16:50.844430  2708 whitelist_watcher.cpp:77] No whitelist given
> I0315 04:16:50.848227  2703 master.cpp:1806] The newly elected leader is 
> master@172.17.0.4:39845 with id c326bc68-2581-48d4-9dc4-0d6f270bdda1
> I0315 04:16:50.848269  2703 master.cpp:1819] Elected as the leading master!
> I0315 04:16:50.848292  2703 master.cpp:1508] Recovering from registrar
> I0315 04:16:50.848563  2703 registrar.cpp:307] Recovering registrar
> I0315 04:16:50.876277  2711 leveldb.cpp:304] Persisting metadata (8 bytes) to 
> leveldb took 34.178445ms
> I0315 04:16:50.876365  2711 replica.cpp:320] Persisted replica status to 
> STARTING
> I0315 04:16:50.876776  2711 recover.cpp:473] Replica is in STARTING status
> I0315 

[jira] [Assigned] (MESOS-3243) Replace NULL with nullptr

2016-03-15 Thread Tomasz Janiszewski (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Janiszewski reassigned MESOS-3243:
-

Assignee: Tomasz Janiszewski

> Replace NULL with nullptr
> -
>
> Key: MESOS-3243
> URL: https://issues.apache.org/jira/browse/MESOS-3243
> Project: Mesos
>  Issue Type: Bug
>Reporter: Michael Park
>Assignee: Tomasz Janiszewski
>
> As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over 
> to use {{nullptr}}. I think it would be an interesting exercise to do this 
> with {{clang-modernize}} using the [nullptr 
> transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although 
> it's probably just as easy to use {{sed}}).





[jira] [Comment Edited] (MESOS-4781) Executor env variables should not be leaked to the command task.

2016-03-15 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193705#comment-15193705
 ] 

Gilbert Song edited comment on MESOS-4781 at 3/15/16 6:26 PM:
--

https://reviews.apache.org/r/44499/
https://reviews.apache.org/r/44500/


was (Author: gilbert):
https://reviews.apache.org/r/44498/

> Executor env variables should not be leaked to the command task.
> 
>
> Key: MESOS-4781
> URL: https://issues.apache.org/jira/browse/MESOS-4781
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> Currently, the command task inherits the env variables of the command executor. 
> This is not ideal because the command executor environment variables include 
> some Mesos internal env variables like MESOS_XXX and LIBPROCESS_XXX. Also, 
> this behavior does not match what Docker containerizer does. We should 
> construct the env variables from scratch for the command task, rather than 
> relying on inheriting the env variables from the command executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-1571) Signal escalation timeout is not configurable.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-1571:
-
Sprint: Mesosphere Q4 Sprint 2 - 11/14, Mesosphere Q4 Sprint 3 - 12/7, 
Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  (was: 
Mesosphere Q4 Sprint 2 - 11/14, Mesosphere Q4 Sprint 3 - 12/7, Mesosphere 
Sprint 29, Mesosphere Sprint 30)

> Signal escalation timeout is not configurable.
> --
>
> Key: MESOS-1571
> URL: https://issues.apache.org/jira/browse/MESOS-1571
> Project: Mesos
>  Issue Type: Bug
>Reporter: Niklas Quarfot Nielsen
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Even though the executor shutdown grace period is set to a larger interval, 
> the signal escalation timeout will still be 3 seconds. It should either be 
> configurable or dependent on EXECUTOR_SHUTDOWN_GRACE_PERIOD.
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4764) The network/cni isolator should report assigned IP address.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4764:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> The network/cni isolator should report assigned IP address. 
> 
>
> Key: MESOS-4764
> URL: https://issues.apache.org/jira/browse/MESOS-4764
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Qian Zhang
>
> In order for service discovery to work in some cases, the network/cni 
> isolator needs to report the assigned IP address through the 
> isolator->status() interface.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4722) Add allocator metric for number of active offer filters

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4722:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  
(was: Mesosphere Sprint 29, Mesosphere Sprint 30)

> Add allocator metric for number of active offer filters
> ---
>
> Key: MESOS-4722
> URL: https://issues.apache.org/jira/browse/MESOS-4722
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> To diagnose scenarios where frameworks unexpectedly do not receive offers, 
> information on the currently active filters is needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4673) Agent fails to shutdown after re-registering period timed-out.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4673:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  
(was: Mesosphere Sprint 29, Mesosphere Sprint 30)

> Agent fails to shutdown after re-registering period timed-out.
> --
>
> Key: MESOS-4673
> URL: https://issues.apache.org/jira/browse/MESOS-4673
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>  Labels: mesosphere
>
> Under certain conditions, when a Mesos agent loses connection to the master 
> for an extended period of time (Say a switch fails), the master will 
> de-register the agent, and then when the agent comes back up, refuse to let 
> it register: {{Slave asked to shut down by master@10.102.25.1:5050 because 
> 'Slave attempted to re-register after removal'}}.
> The agent doesn't seem to be able to properly shut down and remove running 
> tasks, as it should in order to register as a new agent. Hence this message will 
> persist until it's resolved by manual intervention.
> This seems to be caused by Docker tasks that couldn't shut down cleanly when 
> the agent was asked to shut down running tasks so that it could register as a 
> new agent with the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4823:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose the ports that micro-services are 
> listening on to the outside world. When containers are running on bridged (or ptp) 
> networking this can be achieved by installing port forwarding rules on the 
> agent (using iptables). This can be done in the `network/cni` isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not in a CNI plugin, is that the CNI specification 
> currently does not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information: the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.
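The kind of iptables forwarding rule described here could look like the sketch below. The addresses and ports are hypothetical, and the command is only printed rather than executed, since installing it requires root; in a real deployment the `network/cni` isolator would manage such rules itself:

```shell
# Hypothetical mapping: host port 31000 -> container 172.16.0.2:8080.
CONTAINER_IP="172.16.0.2"
HOST_PORT=31000
CONTAINER_PORT=8080

# The isolator would install a DNAT rule along these lines (printed, not run):
echo "iptables -t nat -A PREROUTING -p tcp --dport ${HOST_PORT}" \
     "-j DNAT --to-destination ${CONTAINER_IP}:${CONTAINER_PORT}"
```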



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4718) Add allocator metric for number of completed allocation runs

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4718:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  
(was: Mesosphere Sprint 29, Mesosphere Sprint 30)

> Add allocator metric for number of completed allocation runs
> 
>
> Key: MESOS-4718
> URL: https://issues.apache.org/jira/browse/MESOS-4718
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4849) Add agent flags for HTTP authentication

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4849:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Add agent flags for HTTP authentication
> ---
>
> Key: MESOS-4849
> URL: https://issues.apache.org/jira/browse/MESOS-4849
> Project: Mesos
>  Issue Type: Task
>  Components: security, slave
>Reporter: Adam B
>Assignee: Greg Mann
>  Labels: mesosphere, security
>
> Flags should be added to the agent to:
> 1. Enable HTTP authentication ({{--authenticate_http}})
> 2. Specify credentials ({{--http_credentials}})
> 3. Specify HTTP authenticators ({{--authenticators}})
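Taken together, an agent invocation using the three proposed flags might look like the sketch below. The flag names come from this ticket, while the credentials path and authenticator name are hypothetical; the command is only printed, not executed:

```shell
# Hypothetical agent invocation with the proposed HTTP authentication flags.
HTTP_AUTH_FLAGS="--authenticate_http --http_credentials=/etc/mesos/http-credentials --authenticators=crammd5"
echo "mesos-slave ${HTTP_AUTH_FLAGS}"
```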



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3003) Support mounting in default configuration files/volumes into every new container

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-3003:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Support mounting in default configuration files/volumes into every new 
> container
> 
>
> Key: MESOS-3003
> URL: https://issues.apache.org/jira/browse/MESOS-3003
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Timothy Chen
>Assignee: Gilbert Song
>  Labels: mesosphere, unified-containerizer-mvp
>
> Most container images leave out system configuration (e.g., /etc/*) and expect 
> the container runtime to mount in specific configuration files as needed, such 
> as /etc/resolv.conf from the host into the container.
> We need to support mounting in specific configuration files for command 
> executor to work, and also allow the user to optionally define other 
> configuration files to mount in as well via flags.
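Conceptually this amounts to bind-mounting selected host files into the container's root filesystem before entering it. A minimal sketch with a hypothetical rootfs path; the mount commands are printed rather than executed, since real bind mounts require root:

```shell
ROOTFS=/tmp/container-rootfs   # hypothetical container root filesystem

# Configuration files the runtime would inject into the container rootfs.
for f in /etc/resolv.conf /etc/hosts; do
  echo "mount --bind ${f} ${ROOTFS}${f}"
done
```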



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4781) Executor env variables should not be leaked to the command task.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4781:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Executor env variables should not be leaked to the command task.
> 
>
> Key: MESOS-4781
> URL: https://issues.apache.org/jira/browse/MESOS-4781
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> Currently, the command task inherits the env variables of the command executor. 
> This is not ideal because the command executor environment variables include 
> some Mesos internal env variables like MESOS_XXX and LIBPROCESS_XXX. Also, 
> this behavior does not match what Docker containerizer does. We should 
> construct the env variables from scratch for the command task, rather than 
> relying on inheriting the env variables from the command executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4633) Tests will dereference stack allocated agent objects upon assertion/expectation failure.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4633:
-
Sprint: Mesosphere Sprint 28, Mesosphere Sprint 29, Mesosphere Sprint 30, 
Mesosphere Sprint 31  (was: Mesosphere Sprint 28, Mesosphere Sprint 29, 
Mesosphere Sprint 30)

> Tests will dereference stack allocated agent objects upon 
> assertion/expectation failure.
> 
>
> Key: MESOS-4633
> URL: https://issues.apache.org/jira/browse/MESOS-4633
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: flaky, mesosphere, tech-debt, test
>
> Tests that use the {{StartSlave}} test helper are generally fragile when the 
> test fails an assert/expect in the middle of the test.  This is because the 
> {{StartSlave}} helper takes raw pointer arguments, which may be 
> stack-allocated.
> In case of an assert failure, the test immediately exits (destroying stack 
> allocated objects) and proceeds onto test cleanup.  The test cleanup may 
> dereference some of these destroyed objects, leading to a test crash like:
> {code}
> [18:27:36][Step 8/8] F0204 18:27:35.981302 23085 logging.cpp:64] RAW: Pure 
> virtual method called
> [18:27:36][Step 8/8] @ 0x7f7077055e1c  google::LogMessage::Fail()
> [18:27:36][Step 8/8] @ 0x7f707705ba6f  google::RawLog__()
> [18:27:36][Step 8/8] @ 0x7f70760f76c9  __cxa_pure_virtual
> [18:27:36][Step 8/8] @   0xa9423c  
> mesos::internal::tests::Cluster::Slaves::shutdown()
> [18:27:36][Step 8/8] @  0x1074e45  
> mesos::internal::tests::MesosTest::ShutdownSlaves()
> [18:27:36][Step 8/8] @  0x1074de4  
> mesos::internal::tests::MesosTest::Shutdown()
> [18:27:36][Step 8/8] @  0x1070ec7  
> mesos::internal::tests::MesosTest::TearDown()
> {code}
> The {{StartSlave}} helper should take {{shared_ptr}} arguments instead.
> This also means that we can remove the {{Shutdown}} helper from most of these 
> tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4544) Propose design doc for agent partitioning behavior

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4544:
-
Sprint: Mesosphere Sprint 28, Mesosphere Sprint 29, Mesosphere Sprint 30, 
Mesosphere Sprint 31  (was: Mesosphere Sprint 28, Mesosphere Sprint 29, 
Mesosphere Sprint 30)

> Propose design doc for agent partitioning behavior
> --
>
> Key: MESOS-4544
> URL: https://issues.apache.org/jira/browse/MESOS-4544
> Project: Mesos
>  Issue Type: Task
>  Components: general
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4810) ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand fails.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4810:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand fails.
> --
>
> Key: MESOS-4810
> URL: https://issues.apache.org/jira/browse/MESOS-4810
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.28.0
> Environment: CentOS 7 on AWS, both with or without SSL.
>Reporter: Bernd Mathiske
>Assignee: Jie Yu
>  Labels: docker, mesosphere, test
>
> {noformat}
> [09:46:46] :   [Step 11/11] [ RUN  ] 
> ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.628413  1166 leveldb.cpp:174] 
> Opened db in 4.242882ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629926  1166 leveldb.cpp:181] 
> Compacted db in 1.483621ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629966  1166 leveldb.cpp:196] 
> Created db iterator in 15498ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629977  1166 leveldb.cpp:202] 
> Seeked to beginning of db in 1405ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629984  1166 leveldb.cpp:271] 
> Iterated through 0 keys in the db in 239ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630015  1166 replica.cpp:779] 
> Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630470  1183 recover.cpp:447] 
> Starting replica recovery
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630702  1180 recover.cpp:473] 
> Replica is in EMPTY status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.631767  1182 replica.cpp:673] 
> Replica in EMPTY status received a broadcasted recover request from 
> (14567)@172.30.2.124:37431
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.632115  1183 recover.cpp:193] 
> Received a recover response from a replica in EMPTY status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.632450  1186 recover.cpp:564] 
> Updating replica status to STARTING
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633476  1186 master.cpp:375] 
> Master 3fbb2fb0-4f18-498b-a440-9acbf6923a13 (ip-172-30-2-124.mesosphere.io) 
> started on 172.30.2.124:37431
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633491  1186 master.cpp:377] Flags 
> at startup: --acls="" --allocation_interval="1secs" 
> --allocator="HierarchicalDRF" --authenticate="true" 
> --authenticate_http="true" --authenticate_slaves="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/4UxXoW/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --http_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="100secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/4UxXoW/master" 
> --zk_session_timeout="10secs"
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633677  1186 master.cpp:422] 
> Master only allowing authenticated frameworks to register
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633685  1186 master.cpp:427] 
> Master only allowing authenticated slaves to register
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633692  1186 credentials.hpp:35] 
> Loading credentials for authentication from '/tmp/4UxXoW/credentials'
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633851  1183 leveldb.cpp:304] 
> Persisting metadata (8 bytes) to leveldb took 1.191043ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633873  1183 replica.cpp:320] 
> Persisted replica status to STARTING
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633894  1186 master.cpp:467] Using 
> default 'crammd5' authenticator
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634003  1186 master.cpp:536] Using 
> default 'basic' HTTP authenticator
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634062  1184 recover.cpp:473] 
> Replica is in STARTING status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634109  1186 master.cpp:570] 
> Authorization enabled
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634249  1187 
> whitelist_watcher.cpp:77] No whitelist given
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634255  1184 hierarchical.cpp:144] 
> Initialized hierarchical allocator process
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634884  1187 

[jira] [Updated] (MESOS-4821) Introduce a port field in `ImageManifest` in order to set exposed ports for a container.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4821:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Introduce a port field in `ImageManifest` in order to set exposed ports for a 
> container.
> 
>
> Key: MESOS-4821
> URL: https://issues.apache.org/jira/browse/MESOS-4821
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
>
> Networking isolators such as `network/cni` need to learn about the ports that a 
> container wishes to expose to the outside world. This can be achieved by 
> adding a field to the `ImageManifest` protobuf and allowing the 
> `ImageProvisioner` to set this field to inform the isolator of the ports 
> that the container wishes to expose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4763) Add test mock for CNI plugins.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4763:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Add test mock for CNI plugins.
> --
>
> Key: MESOS-4763
> URL: https://issues.apache.org/jira/browse/MESOS-4763
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Avinash Sridharan
>  Labels: mesosphere
>
> In order to test the network/cni isolator, we need to mock the behavior of a 
> CNI plugin. One option is to write a mock script which acts as a CNI plugin. 
> The isolator will talk to the mock script the same way it talks to an actual 
> CNI plugin.
> The mock script can just join the host network?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4634) Tests will dereference stack allocated master objects upon assertion/expectation failure.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4634:
-
Sprint: Mesosphere Sprint 28, Mesosphere Sprint 29, Mesosphere Sprint 30, 
Mesosphere Sprint 31  (was: Mesosphere Sprint 28, Mesosphere Sprint 29, 
Mesosphere Sprint 30)

> Tests will dereference stack allocated master objects upon 
> assertion/expectation failure.
> -
>
> Key: MESOS-4634
> URL: https://issues.apache.org/jira/browse/MESOS-4634
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: flaky, mesosphere, tech-debt, test
>
> Tests that use the {{StartMaster}} test helper are generally fragile when the 
> test fails an assert/expect in the middle of the test.  This is because the 
> {{StartMaster}} helper takes raw pointer arguments, which may be 
> stack-allocated.
> In case of an assert failure, the test immediately exits (destroying stack 
> allocated objects) and proceeds onto test cleanup.  The test cleanup may 
> dereference some of these destroyed objects, leading to a test crash like:
> {code}
> [18:27:36][Step 8/8] F0204 18:27:35.981302 23085 logging.cpp:64] RAW: Pure 
> virtual method called
> [18:27:36][Step 8/8] @ 0x7f7077055e1c  google::LogMessage::Fail()
> [18:27:36][Step 8/8] @ 0x7f707705ba6f  google::RawLog__()
> [18:27:36][Step 8/8] @ 0x7f70760f76c9  __cxa_pure_virtual
> [18:27:36][Step 8/8] @   0xa9423c  
> mesos::internal::tests::Cluster::Slaves::shutdown()
> [18:27:36][Step 8/8] @  0x1074e45  
> mesos::internal::tests::MesosTest::ShutdownSlaves()
> [18:27:36][Step 8/8] @  0x1074de4  
> mesos::internal::tests::MesosTest::Shutdown()
> [18:27:36][Step 8/8] @  0x1070ec7  
> mesos::internal::tests::MesosTest::TearDown()
> {code}
> The {{StartMaster}} helper should take {{shared_ptr}} arguments instead.
> This also means that we can remove the {{Shutdown}} helper from most of these 
> tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4720) Add allocator metric for current allocation breakdown

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4720:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  
(was: Mesosphere Sprint 29, Mesosphere Sprint 30)

> Add allocator metric for current allocation breakdown
> -
>
> Key: MESOS-4720
> URL: https://issues.apache.org/jira/browse/MESOS-4720
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> Exposing the current allocation breakdown as seen by the allocator will allow 
> us to correlate the corresponding metrics in the master with what the 
> allocator sees. We should expose at least allocated or available, and total.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4850) Add authentication to agent endpoints /state and /flags

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4850:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Add authentication to agent endpoints /state and /flags
> ---
>
> Key: MESOS-4850
> URL: https://issues.apache.org/jira/browse/MESOS-4850
> Project: Mesos
>  Issue Type: Task
>  Components: security, slave
>Reporter: Adam B
>Assignee: Greg Mann
>  Labels: mesosphere, security
>
> The {{/state}} and {{/flags}} endpoints are installed in 
> {{src/slave/slave.cpp}}, and thus are straightforward to make authenticated. 
> Other agent endpoints require a bit more consideration, and are tracked in 
> MESOS-4902.
> For more information on agent endpoints, see 
> http://mesos.apache.org/documentation/latest/endpoints/
> or search for `route(` in the source code:
> {code}
> $ grep -rn "route(" src/ |grep -v master |grep -v tests |grep -v json
> src/version/version.cpp:75:  route("/", VERSION_HELP(), 
> ::version);
> src/files/files.cpp:150:  route("/browse",
> src/files/files.cpp:153:  route("/read",
> src/files/files.cpp:156:  route("/download",
> src/files/files.cpp:159:  route("/debug",
> src/slave/slave.cpp:580:  route("/api/v1/executor",
> src/slave/slave.cpp:595:  route("/state",
> src/slave/slave.cpp:601:  route("/flags",
> src/slave/slave.cpp:607:  route("/health",
> src/slave/monitor.cpp:100:route("/statistics",
> $ grep -rn "route(" 3rdparty/ |grep -v tests |grep -v README |grep -v 
> examples |grep -v help |grep -v "process..pp"
> 3rdparty/libprocess/include/process/profiler.hpp:34:route("/start", 
> START_HELP(), ::start);
> 3rdparty/libprocess/include/process/profiler.hpp:35:route("/stop", 
> STOP_HELP(), ::stop);
> 3rdparty/libprocess/include/process/system.hpp:70:route("/stats.json", 
> statsHelp(), ::stats);
> 3rdparty/libprocess/include/process/logging.hpp:44:route("/toggle", 
> TOGGLE_HELP(), ::toggle);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4759) Add network/cni isolator for Mesos containerizer.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4759:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Add network/cni isolator for Mesos containerizer.
> -
>
> Key: MESOS-4759
> URL: https://issues.apache.org/jira/browse/MESOS-4759
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Qian Zhang
>
> See the design doc for more context (MESOS-4742).
> The isolator will interact with CNI plugins to create the network for the 
> container to join.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4762) Setup proper DNS resolver for containers in network/cni isolator.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4762:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Setup proper DNS resolver for containers in network/cni isolator.
> -
>
> Key: MESOS-4762
> URL: https://issues.apache.org/jira/browse/MESOS-4762
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Avinash Sridharan
>  Labels: mesosphere
>
> Please get more context from the design doc (MESOS-4742).
> The CNI plugin will return the DNS information about the network. The 
> network/cni isolator needs to properly setup /etc/resolv.conf for the 
> container. We should consider the following cases:
> 1) container is using host filesystem
> 2) container is using a different filesystem
> 3) custom executor and command executor



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4719) Add allocator metric for number of offers each framework received

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4719:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  
(was: Mesosphere Sprint 29, Mesosphere Sprint 30)

> Add allocator metric for number of offers each framework received
> -
>
> Key: MESOS-4719
> URL: https://issues.apache.org/jira/browse/MESOS-4719
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> A counter for the number of allocations to a framework can be used to monitor 
> allocation progress, e.g., when agents are added to a cluster, and as other 
> frameworks are added or removed.
> Currently, an offer by the hierarchical allocator to a framework consists of 
> a list of resources on possibly many agents. Resources might be offered in 
> order to satisfy outstanding quota or for fairness. To capture allocations at 
> a fine granularity we should not count the number of offers, but instead the 
> pieces making up that offer, as such a metric would better resolve the effect 
> of changes (e.g., adding/removing a framework).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4721) Add allocator metric for allocation duration

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4721:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  
(was: Mesosphere Sprint 29, Mesosphere Sprint 30)

> Add allocator metric for allocation duration
> 
>
> Key: MESOS-4721
> URL: https://issues.apache.org/jira/browse/MESOS-4721
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> Similar allocator timing-related information is already exposed in the log, 
> but should also be exposed via an endpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4761) Add agent flags to allow operators to specify CNI plugin and config directories.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4761:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Add agent flags to allow operators to specify CNI plugin and config 
> directories.
> 
>
> Key: MESOS-4761
> URL: https://issues.apache.org/jira/browse/MESOS-4761
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Qian Zhang
>
> According to design doc, we plan to add the following flags:
> “--network_cni_plugins_dir”
> Location of the CNI plugin binaries. The “network/cni” isolator will find CNI 
> plugins under this directory so that it can execute them to add/delete 
> containers from CNI networks. It is the operator’s responsibility to 
> install the CNI plugin binaries in the specified directory.
> “--network_cni_config_dir”
> Location of the CNI network configuration files. For each network that 
> containers launched on the Mesos agent can connect to, the operator should 
> install a network configuration file in JSON format in the specified 
> directory.
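An operator setup following this description might look like the sketch below. The directory layout, network name, and plugin type are hypothetical examples, and the agent command is only printed:

```shell
# Hypothetical locations for the two proposed flags (printed, not run).
PLUGINS_DIR=/opt/cni/plugins
CONFIG_DIR=/etc/mesos/cni
echo "mesos-slave --network_cni_plugins_dir=${PLUGINS_DIR}" \
     "--network_cni_config_dir=${CONFIG_DIR}"

# Each network gets a JSON configuration file in the config directory, e.g.:
cat <<'EOF'
{
  "name": "example-bridge-net",
  "type": "bridge"
}
EOF
```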



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4723) Add allocator metric for currently satisfied quotas

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4723:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  
(was: Mesosphere Sprint 29, Mesosphere Sprint 30)

> Add allocator metric for currently satisfied quotas
> ---
>
> Key: MESOS-4723
> URL: https://issues.apache.org/jira/browse/MESOS-4723
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> We currently expose information on set quotas via dedicated quota endpoints. 
> To diagnose allocator problems one additionally needs information about used 
> quotas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4772) TaskInfo/ExecutorInfo should include owner information

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4772:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> TaskInfo/ExecutorInfo should include owner information
> --
>
> Key: MESOS-4772
> URL: https://issues.apache.org/jira/browse/MESOS-4772
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Reporter: Adam B
>Assignee: Jan Schlicht
>  Labels: authorization, mesosphere, ownership, security
>
> We need a way to assign fine-grained ownership to tasks/executors so that 
> multi-user frameworks can tell Mesos to associate the task with a user 
> identity (rather than just the framework principal+role). Then, when an HTTP 
> user requests to view the task's sandbox contents, or kill the task, or list 
> all tasks, the authorizer can determine whether to allow/deny/filter the 
> request based on finer-grained, user-level ownership.
> Some systems may want TaskInfo.owner to represent a group rather than an 
> individual user. That's fine as long as the framework sets the field to the 
> group ID in such a way that a group-aware authorizer can interpret it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4818) Add end to end testing for Appc images.

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4818:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Add end to end testing for Appc images.
> ---
>
> Key: MESOS-4818
> URL: https://issues.apache.org/jira/browse/MESOS-4818
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere, unified-containerizer-mvp
>
> Add tests that cover end-to-end integration of the Appc provisioner feature 
> with the Mesos containerizer.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3011) Publish release documentation for major releases on website

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-3011:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Publish release documentation for major releases on website
> ---
>
> Key: MESOS-3011
> URL: https://issues.apache.org/jira/browse/MESOS-3011
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation, project website
>Reporter: Paul Brett
>Assignee: Joerg Schad
>  Labels: documentation, mesosphere
>
> Currently, the website only provides a single version of the documentation.  
> We should publish documentation for each release on the website independently 
> (for example as https://mesos.apache.org/documentation/0.22/index.html, 
> https://mesos.apache.org/documentation/0.23/index.html) and make latest 
> redirect to the current version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2317) Remove deprecated checkpoint=false code

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-2317:
-
Sprint: Mesosphere Q1 Sprint 6 - 4/3, Mesosphere Q1 Sprint 7 - 4/17, 
Mesosphere Q2 Sprint 8 - 5/1, Mesosphere Q1 Sprint 9 - 5/15, Mesosphere Sprint 
10, Mesosphere Sprint 11, Mesosphere Sprint 26, Mesosphere Sprint 27, 
Mesosphere Sprint 28, Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere 
Sprint 31  (was: Mesosphere Q1 Sprint 6 - 4/3, Mesosphere Q1 Sprint 7 - 4/17, 
Mesosphere Q2 Sprint 8 - 5/1, Mesosphere Q1 Sprint 9 - 5/15, Mesosphere Sprint 
10, Mesosphere Sprint 11, Mesosphere Sprint 26, Mesosphere Sprint 27, 
Mesosphere Sprint 28, Mesosphere Sprint 29, Mesosphere Sprint 30)

> Remove deprecated checkpoint=false code
> ---
>
> Key: MESOS-2317
> URL: https://issues.apache.org/jira/browse/MESOS-2317
> Project: Mesos
>  Issue Type: Epic
>Affects Versions: 0.22.0
>Reporter: Adam B
>Assignee: Joerg Schad
>  Labels: checkpoint, mesosphere
>
> Cody's plan from MESOS-444 was:
> 1) -Make it so the flag can't be changed at the command line-
> 2) -Remove the checkpoint variable entirely from slave/flags.hpp. This is a 
> fairly involved change since a number of unit tests depend on manually 
> setting the flag, as well as the default being non-checkpointing.-
> 3) -Remove logic around checkpointing in the slave, remove logic inside the 
> master.-
> 4) Drop the flag from the SlaveInfo struct (Will require a deprecation cycle).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4576) Introduce a stout helper for "which"

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4576:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  
(was: Mesosphere Sprint 29, Mesosphere Sprint 30)

> Introduce a stout helper for "which"
> 
>
> Key: MESOS-4576
> URL: https://issues.apache.org/jira/browse/MESOS-4576
> Project: Mesos
>  Issue Type: Improvement
>  Components: stout
>Reporter: Joseph Wu
>Assignee: Disha Singh
>  Labels: mesosphere
>
> We may want to add a helper to {{stout/os.hpp}} that will natively emulate 
> the functionality of the Linux utility {{which}}.  i.e.
> {code}
> Option<string> which(const string& command)
> {
>   Option<string> path = os::getenv("PATH");
>   // Loop through path and return the first one which os::exists(...).
>   return None();
> }
> {code}
> This helper may be useful:
> * for test filters in {{src/tests/environment.cpp}}
> * a few tests in {{src/tests/containerizer/port_mapping_tests.cpp}}
> * the {{sha512}} utility in {{src/common/command_utils.cpp}}
> * as runtime checks in the {{LogrotateContainerLogger}}
> * etc.
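A minimal standalone sketch of such a helper, using std::optional in place of stout's Option and POSIX stat(2) in place of os::exists (names and structure are illustrative, not the eventual stout API):

```cpp
#include <cstdlib>
#include <optional>
#include <sstream>
#include <string>

#include <sys/stat.h>

// Hypothetical standalone emulation of the proposed helper; the real
// version would live in stout/os.hpp and return Option<std::string>.
std::optional<std::string> which(const std::string& command)
{
  const char* path = std::getenv("PATH");
  if (path == nullptr) {
    return std::nullopt;
  }

  // Scan each PATH entry and return the first candidate that exists,
  // mirroring what an os::exists(...) check would do.
  std::istringstream entries(path);
  std::string dir;
  while (std::getline(entries, dir, ':')) {
    if (dir.empty()) {
      continue;
    }
    const std::string candidate = dir + "/" + command;
    struct stat buf;
    if (::stat(candidate.c_str(), &buf) == 0) {
      return candidate;
    }
  }

  return std::nullopt;
}
```

Note that, like os::exists, a plain stat(2) check does not verify the candidate is an executable regular file; a stricter helper might use access(2) with X_OK.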



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4820) Need to set `EXPOSED` ports from docker images into `ContainerConfig`

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4820:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31  (was: Mesosphere Sprint 
30)

> Need to set `EXPOSED` ports from docker images into `ContainerConfig`
> -
>
> Key: MESOS-4820
> URL: https://issues.apache.org/jira/browse/MESOS-4820
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker images have an `EXPOSE` command associated with them. This tells 
> the container run-time the TCP ports that the micro-service "wishes" to 
> expose to the outside world. 
> With the `Unified containerizer` project since `MesosContainerizer` is going 
> to natively support docker images it is imperative that the Mesos container 
> run time have a mechanism to expose ports listed in a Docker image. The first 
> step to achieve this is to extract this information from the `Docker` image 
> and set in the `ContainerConfig` . The `ContainerConfig` can then be used to 
> pass this information to any isolator (for e.g. `network/cni` isolator) that 
> will install port forwarding rules to expose the desired ports.
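For reference, `EXPOSE` directives surface in the image's config JSON under `Config.ExposedPorts` (a minimal sketch of just the relevant portion; surrounding fields elided):

```json
{
  "Config": {
    "ExposedPorts": {
      "80/tcp": {},
      "443/tcp": {}
    }
  }
}
```

This is the map the store would need to parse and carry over into `ContainerConfig` for isolators to act on.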



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4233) Logging is too verbose for sysadmins / syslog

2016-03-15 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4233:
-
Sprint: Mesosphere Sprint 26, Mesosphere Sprint 27, Mesosphere Sprint 28, 
Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31  (was: 
Mesosphere Sprint 26, Mesosphere Sprint 27, Mesosphere Sprint 28, Mesosphere 
Sprint 29, Mesosphere Sprint 30)

> Logging is too verbose for sysadmins / syslog
> -
>
> Key: MESOS-4233
> URL: https://issues.apache.org/jira/browse/MESOS-4233
> Project: Mesos
>  Issue Type: Epic
>Reporter: Cody Maloney
>Assignee: Kapil Arya
>  Labels: mesosphere
> Attachments: giant_port_range_logging
>
>
> Currently mesos logs a lot. When launching a thousand tasks in the space of 
> 10 seconds it will print tens of thousands of log lines, overwhelming syslog 
> (there is a max rate at which a process can send stuff over a unix socket) 
> and not giving useful information to a sysadmin who cares about just the 
> high-level activity and when something goes wrong.
> Note mesos also blocks writing to its log locations, so when writing a lot of 
> log messages, it can fill up the write buffer in the kernel, and be suspended 
> until the syslog agent catches up reading from the socket (GLOG does a 
> blocking fwrite to stderr). GLOG also has a big mutex around logging so only 
> one thing logs at a time.
> While for "internal debugging" it is useful to see things like "message went 
> from internal component x to internal component y", from a sysadmin 
> perspective I only care about the high level actions taken (launched task for 
> framework x), sent offer to framework y, got task failed from host z. Note 
> those are what I'd expect at the "INFO" level. At the "WARNING" level I'd 
> expect very little to be logged / almost nothing in normal operation. Just 
> things like "WARN: Replicated log write took longer than expected". WARN 
> would also get things like backtraces on crashes and abnormal exits / abort.
> When trying to launch 3k+ tasks inside a second, mesos logging currently 
> overwhelms syslog with 100k+ messages, many of which are thousands of bytes. 
> Sysadmins expect to be able to use syslog to monitor basic events in their 
> system. This is too much.
> We can keep logging the messages to files, but the logging to stderr needs to 
> be reduced significantly (stderr gets picked up and forwarded to syslog / 
> central aggregation).
> What I would like is if I can set the stderr logging level to be different / 
> independent from the file logging level (Syslog giving the "sysadmin" 
> aggregated overview, files useful for debugging in depth what happened in a 
> cluster). A lot of what mesos currently logs at info is really debugging info 
> / should show up as debug log level.
> Some samples of mesos logging a lot more than a sysadmin would want / expect 
> are attached, and some are below:
>  - Every task gets printed multiple times for a basic launch:
> {noformat}
> Dec 15 22:58:30 ip-10-0-7-60.us-west-2.compute.internal mesos-master[1311]: 
> I1215 22:58:29.382644  1315 master.cpp:3248] Launching task 
> envy.5b19a713-a37f-11e5-8b3e-0251692d6109 of framework 
> 5178f46d-71d6-422f-922c-5bbe82dff9cc- (marathon)
> Dec 15 22:58:30 ip-10-0-7-60.us-west-2.compute.internal mesos-master[1311]: 
> I1215 22:58:29.382925  1315 master.hpp:176] Adding task 
> envy.5b1958f2-a37f-11e5-8b3e-0251692d6109 with resources cpus(*):0.0001; 
> mem(*):16; ports(*):[14047-14047]
> {noformat}
>  - Every task status update prints many log lines, successful ones are part 
> of normal operation and maybe should be logged at info / debug levels, but 
> not to a sysadmin (Just show when things fail, and maybe aggregate counters 
> to tell of the volume of working)
>  - No log message should be really big / more than 1k characters (would 
> prevent the giant port list attached, and make such cases easily discoverable 
> / bug-filable / fixable) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4947) Persistent volumes are not listed

2016-03-15 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195594#comment-15195594
 ] 

Neil Conway commented on MESOS-4947:


Vanilla Mesos 0.27.1 does not support listing persistent volumes. DCOS 1.6.1 
includes a patched version of Mesos that adds this information to the 
{{/state}} endpoint (under the {{reserved_resources_full}} key). Future 
versions of DCOS will use Mesos 0.28+, which exposes this information via 
{{/slaves}}.

Can you check whether the information is present in {{/state}} with DCOS 1.6.1?

> Persistent volumes are not listed
> -
>
> Key: MESOS-4947
> URL: https://issues.apache.org/jira/browse/MESOS-4947
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.1
> Environment: DCOS 1.6.1 on AWS
>Reporter: Max Neunhöffer
>Assignee: Neil Conway
>
> TL;DR:  In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either (as described here: 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes).
> Situation: There are Mesos agents in the cluster that have dynamic 
> reservations as well as persistent volumes for role "arangodb" with principal 
> "arangodb" but the corresponding framework no longer exists (it was 
> "destroyed" by clicking in the Marathon UI). Let's call these "Zombie 
> persistent volumes". We try to clean up this mess manually (or automatically).
> Effect: According to 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
>  one should be able to list these zombies using 
> http:///mesos/slaves JSON/REST endpoint. We see a summary 
> of the dynamic reservations, but the persistent disks do not appear. As a 
> consequence we can neither use the /destroy-volumes API nor the /unreserve 
> API to perform manual or automatic cleanup.
> Additional information: 
>   - If we start a new framework with role "arangodb" and principal "arangodb" 
> it will receive resource offers containing the dynamic reservations *and* the 
> persistent volumes.
>   - In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either. So this might not be limited to zombie 
> persistent volumes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-4947) Persistent volumes are not listed

2016-03-15 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway reassigned MESOS-4947:
--

Assignee: Neil Conway

> Persistent volumes are not listed
> -
>
> Key: MESOS-4947
> URL: https://issues.apache.org/jira/browse/MESOS-4947
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.1
> Environment: DCOS 1.6.1 on AWS
>Reporter: Max Neunhöffer
>Assignee: Neil Conway
>
> TL;DR:  In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either (as described here: 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes).
> Situation: There are Mesos agents in the cluster that have dynamic 
> reservations as well as persistent volumes for role "arangodb" with principal 
> "arangodb" but the corresponding framework no longer exists (it was 
> "destroyed" by clicking in the Marathon UI). Let's call these "Zombie 
> persistent volumes". We try to clean up this mess manually (or automatically).
> Effect: According to 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
>  one should be able to list these zombies using 
> http:///mesos/slaves JSON/REST endpoint. We see a summary 
> of the dynamic reservations, but the persistent disks do not appear. As a 
> consequence we can neither use the /destroy-volumes API nor the /unreserve 
> API to perform manual or automatic cleanup.
> Additional information: 
>   - If we start a new framework with role "arangodb" and principal "arangodb" 
> it will receive resource offers containing the dynamic reservations *and* the 
> persistent volumes.
>   - In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either. So this might not be limited to zombie 
> persistent volumes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-4949) Executor shutdown grace period should be configurable.

2016-03-15 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195382#comment-15195382
 ] 

Alexander Rukletsov edited comment on MESOS-4949 at 3/15/16 3:48 PM:
-

https://reviews.apache.org/r/44655/
https://reviews.apache.org/r/44854/


was (Author: alexr):
https://reviews.apache.org/r/44655/

> Executor shutdown grace period should be configurable.
> --
>
> Key: MESOS-4949
> URL: https://issues.apache.org/jira/browse/MESOS-4949
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Currently, executor shutdown grace period is specified by an agent flag, 
> which is propagated to executors via the 
> {{MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD}} environment variable. There is no 
> way to adjust this timeout for the needs of a particular executor.
> To tackle this problem, we propose to introduce an optional 
> {{shutdown_grace_period}} field in {{ExecutorInfo}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2281) Deprecate plain text Credential format.

2016-03-15 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195458#comment-15195458
 ] 

Adam B commented on MESOS-2281:
---

Do we have any logical reason for preferring the JSON format over the simpler 
newline-delimited plaintext format? Please explain.

> Deprecate plain text Credential format.
> ---
>
> Key: MESOS-2281
> URL: https://issues.apache.org/jira/browse/MESOS-2281
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Affects Versions: 0.21.1
>Reporter: Cody Maloney
>Assignee: Jan Schlicht
>  Labels: mesosphere, security, tech-debt
>
> Currently two formats of credentials are supported: JSON
> {code}
> {
>   "credentials": [
>     {
>       "principal": "sherman",
>       "secret": "kitesurf"
>     }
>   ]
> }
> {code}
> And a new line file:
> {code}
> principal1 secret1
> principal2 secret2
> {code}
> We should deprecate the newline format and eventually remove support for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4723) Add allocator metric for currently satisfied quotas

2016-03-15 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-4723:

Description: We currently expose information on set quotas via dedicated 
quota endpoints. To diagnose allocator problems one additionally needs 
information about used quotas.

> Add allocator metric for currently satisfied quotas
> ---
>
> Key: MESOS-4723
> URL: https://issues.apache.org/jira/browse/MESOS-4723
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> We currently expose information on set quotas via dedicated quota endpoints. 
> To diagnose allocator problems one additionally needs information about used 
> quotas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4722) Add allocator metric for number of active offer filters

2016-03-15 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-4722:

Description: To diagnose scenarios where frameworks unexpectedly do not 
receive offers, information on currently active filters is needed.

> Add allocator metric for number of active offer filters
> ---
>
> Key: MESOS-4722
> URL: https://issues.apache.org/jira/browse/MESOS-4722
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> To diagnose scenarios where frameworks unexpectedly do not receive offers, 
> information on currently active filters is needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4721) Add allocator metric for allocation duration

2016-03-15 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-4721:

Description: Similar allocator timing-related information is already 
exposed in the log, but should also be exposed via an endpoint.

> Add allocator metric for allocation duration
> 
>
> Key: MESOS-4721
> URL: https://issues.apache.org/jira/browse/MESOS-4721
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> Similar allocator timing-related information is already exposed in the log, 
> but should also be exposed via an endpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4909) Introduce kill policy for tasks.

2016-03-15 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4909:
---
  Sprint: Mesosphere Sprint 31
Story Points: 5

> Introduce kill policy for tasks.
> 
>
> Key: MESOS-4909
> URL: https://issues.apache.org/jira/browse/MESOS-4909
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> A task may require some time to clean up or even a special mechanism to issue 
> a kill request (currently it's a SIGTERM followed by SIGKILL). Introducing 
> kill policies per task will help address these issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4910) Deprecate the --docker_stop_timeout agent flag.

2016-03-15 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4910:
---
  Sprint: Mesosphere Sprint 31
Story Points: 1

> Deprecate the --docker_stop_timeout agent flag.
> ---
>
> Key: MESOS-4910
> URL: https://issues.apache.org/jira/browse/MESOS-4910
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Instead, a combination of {{executor_shutdown_grace_period}}
> agent flag and optionally task kill policies should be used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4949) Executor shutdown grace period should be configurable.

2016-03-15 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4949:
---
  Sprint: Mesosphere Sprint 31
Story Points: 1

> Executor shutdown grace period should be configurable.
> --
>
> Key: MESOS-4949
> URL: https://issues.apache.org/jira/browse/MESOS-4949
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Currently, executor shutdown grace period is specified by an agent flag, 
> which is propagated to executors via the 
> {{MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD}} environment variable. There is no 
> way to adjust this timeout for the needs of a particular executor.
> To tackle this problem, we propose to introduce an optional 
> {{shutdown_grace_period}} field in {{ExecutorInfo}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4911) Executor driver does not respect executor shutdown grace period.

2016-03-15 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4911:
---
Sprint: Mesosphere Sprint 31

> Executor driver does not respect executor shutdown grace period.
> 
>
> Key: MESOS-4911
> URL: https://issues.apache.org/jira/browse/MESOS-4911
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Executor shutdown grace period, configured on the agent, is
> propagated to executors via the `MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD`
> environment variable. The executor driver must use this timeout to delay
> the hard shutdown of the related executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-1571) Signal escalation timeout is not configurable.

2016-03-15 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154191#comment-15154191
 ] 

Alexander Rukletsov edited comment on MESOS-1571 at 3/15/16 2:34 PM:
-

https://reviews.apache.org/r/44658/


was (Author: alexr):
https://reviews.apache.org/r/43757/
https://reviews.apache.org/r/43758/
https://reviews.apache.org/r/43759/
https://reviews.apache.org/r/43760/
https://reviews.apache.org/r/43761/
https://reviews.apache.org/r/43762/
https://reviews.apache.org/r/43763/
https://reviews.apache.org/r/43764/

> Signal escalation timeout is not configurable.
> --
>
> Key: MESOS-1571
> URL: https://issues.apache.org/jira/browse/MESOS-1571
> Project: Mesos
>  Issue Type: Bug
>Reporter: Niklas Quarfot Nielsen
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Even though the executor shutdown grace period is set to a larger interval, 
> the signal escalation timeout will still be 3 seconds. It should either be 
> configurable or dependent on EXECUTOR_SHUTDOWN_GRACE_PERIOD.
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4949) Executor shutdown grace period should be configurable.

2016-03-15 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4949:
---
Description: 
Currently, executor shutdown grace period is specified by an agent flag, which 
is propagated to executors via the {{MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD}} 
environment variable. There is no way to adjust this timeout for the needs of a 
particular executor.

To tackle this problem, we propose to introduce an optional 
{{shutdown_grace_period}} field in {{ExecutorInfo}}.
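A sketch of the proposed protobuf addition, reusing the existing {{DurationInfo}} message (the field number here is hypothetical and would be assigned when the change lands):

```protobuf
message ExecutorInfo {
  // ... existing fields ...

  // Overrides the agent-level executor shutdown grace period for this
  // executor only. Hypothetical field number for illustration.
  optional DurationInfo shutdown_grace_period = 13;
}
```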

> Executor shutdown grace period should be configurable.
> --
>
> Key: MESOS-4949
> URL: https://issues.apache.org/jira/browse/MESOS-4949
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Currently, executor shutdown grace period is specified by an agent flag, 
> which is propagated to executors via the 
> {{MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD}} environment variable. There is no 
> way to adjust this timeout for the needs of a particular executor.
> To tackle this problem, we propose to introduce an optional 
> {{shutdown_grace_period}} field in {{ExecutorInfo}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4949) Executor shutdown grace period should be configurable.

2016-03-15 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-4949:
--

 Summary: Executor shutdown grace period should be configurable.
 Key: MESOS-4949
 URL: https://issues.apache.org/jira/browse/MESOS-4949
 Project: Mesos
  Issue Type: Improvement
Reporter: Alexander Rukletsov
Assignee: Alexander Rukletsov






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4948) MasterMaintenanceTest.InverseOffers is flaky

2016-03-15 Thread Greg Mann (JIRA)
Greg Mann created MESOS-4948:


 Summary: MasterMaintenanceTest.InverseOffers is flaky
 Key: MESOS-4948
 URL: https://issues.apache.org/jira/browse/MESOS-4948
 Project: Mesos
  Issue Type: Bug
  Components: tests
 Environment: Ubuntu 14.04, using gcc, with libevent and SSL enabled 
(on ASF CI)
Reporter: Greg Mann


This seems to be an issue distinct from the other tickets that have been filed 
on this test. Failed log is included below; while the core dump appears just 
after the start of {{MasterMaintenanceTest.InverseOffersFilters}}, it looks to 
me like the segfault is triggered by one of the callbacks called at the end of 
{{MasterMaintenanceTest.InverseOffers}}.

{code}
[ RUN  ] MasterMaintenanceTest.InverseOffers
I0315 04:16:50.786032  2681 leveldb.cpp:174] Opened db in 125.361171ms
I0315 04:16:50.836374  2681 leveldb.cpp:181] Compacted db in 50.254411ms
I0315 04:16:50.836470  2681 leveldb.cpp:196] Created db iterator in 25917ns
I0315 04:16:50.836488  2681 leveldb.cpp:202] Seeked to beginning of db in 3291ns
I0315 04:16:50.836498  2681 leveldb.cpp:271] Iterated through 0 keys in the db 
in 253ns
I0315 04:16:50.836549  2681 replica.cpp:779] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0315 04:16:50.837474  2702 recover.cpp:447] Starting replica recovery
I0315 04:16:50.837565  2681 cluster.cpp:183] Creating default 'local' authorizer
I0315 04:16:50.838191  2702 recover.cpp:473] Replica is in EMPTY status
I0315 04:16:50.839532  2704 replica.cpp:673] Replica in EMPTY status received a 
broadcasted recover request from (4784)@172.17.0.4:39845
I0315 04:16:50.839754  2705 recover.cpp:193] Received a recover response from a 
replica in EMPTY status
I0315 04:16:50.841893  2704 recover.cpp:564] Updating replica status to STARTING
I0315 04:16:50.842566  2703 master.cpp:376] Master 
c326bc68-2581-48d4-9dc4-0d6f270bdda1 (01fcd642f65f) started on 172.17.0.4:39845
I0315 04:16:50.842644  2703 master.cpp:378] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="false" --authenticate_http="true" --authenticate_slaves="true" 
--authenticators="crammd5" --authorizers="local" 
--credentials="/tmp/DE2Uaw/credentials" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
--quiet="false" --recovery_slave_removal_limit="100%" 
--registry="replicated_log" --registry_fetch_timeout="1mins" 
--registry_store_timeout="100secs" --registry_strict="true" 
--root_submissions="true" --slave_ping_timeout="15secs" 
--slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
--webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" 
--work_dir="/tmp/DE2Uaw/master" --zk_session_timeout="10secs"
I0315 04:16:50.843168  2703 master.cpp:425] Master allowing unauthenticated 
frameworks to register
I0315 04:16:50.843227  2703 master.cpp:428] Master only allowing authenticated 
slaves to register
I0315 04:16:50.843302  2703 credentials.hpp:35] Loading credentials for 
authentication from '/tmp/DE2Uaw/credentials'
I0315 04:16:50.843737  2703 master.cpp:468] Using default 'crammd5' 
authenticator
I0315 04:16:50.843969  2703 master.cpp:537] Using default 'basic' HTTP 
authenticator
I0315 04:16:50.844177  2703 master.cpp:571] Authorization enabled
I0315 04:16:50.844360  2708 hierarchical.cpp:144] Initialized hierarchical 
allocator process
I0315 04:16:50.844430  2708 whitelist_watcher.cpp:77] No whitelist given
I0315 04:16:50.848227  2703 master.cpp:1806] The newly elected leader is 
master@172.17.0.4:39845 with id c326bc68-2581-48d4-9dc4-0d6f270bdda1
I0315 04:16:50.848269  2703 master.cpp:1819] Elected as the leading master!
I0315 04:16:50.848292  2703 master.cpp:1508] Recovering from registrar
I0315 04:16:50.848563  2703 registrar.cpp:307] Recovering registrar
I0315 04:16:50.876277  2711 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 34.178445ms
I0315 04:16:50.876365  2711 replica.cpp:320] Persisted replica status to 
STARTING
I0315 04:16:50.876776  2711 recover.cpp:473] Replica is in STARTING status
I0315 04:16:50.878779  2706 replica.cpp:673] Replica in STARTING status 
received a broadcasted recover request from (4786)@172.17.0.4:39845
I0315 04:16:50.879240  2706 recover.cpp:193] Received a recover response from a 
replica in STARTING status
I0315 04:16:50.880100  2701 recover.cpp:564] Updating replica status to VOTING
I0315 04:16:50.915776  2701 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 35.472106ms
I0315 04:16:50.915868  2701 replica.cpp:320] Persisted replica status to VOTING
I0315 04:16:50.916158  2701 recover.cpp:578] 
{code}

[jira] [Commented] (MESOS-4947) Persistent volumes are not listed

2016-03-15 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195273#comment-15195273
 ] 

Joerg Schad commented on MESOS-4947:


[~neilconway] Could you take a look here? Thanks! 

> Persistent volumes are not listed
> -
>
> Key: MESOS-4947
> URL: https://issues.apache.org/jira/browse/MESOS-4947
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.1
> Environment: DCOS 1.6.1 on AWS
>Reporter: Max Neunhöffer
>
> TL;DR:  In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either (as described here: 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes).
> Situation: There are Mesos agents in the cluster that have dynamic 
> reservations as well as persistent volumes for role "arangodb" with principal 
> "arangodb", but the corresponding framework no longer exists (it was 
> "destroyed" by clicking in the Marathon UI). Let's call these "Zombie 
> persistent volumes". We try to clean up this mess manually (or automatically).
> Effect: According to 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
>  one should be able to list these zombies using the 
> http:///mesos/slaves JSON/REST endpoint. We see a summary 
> of the dynamic reservations, but the persistent disks do not appear. As a 
> consequence, we can neither use the /destroy-volumes API nor the /unreserve 
> API to perform manual or automatic cleanup.
> Additional information: 
>   - If we start a new framework with role "arangodb" and principal "arangodb" 
> it will receive resource offers containing the dynamic reservations *and* the 
> persistent volumes.
>   - In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either. So this might not be limited to zombie 
> persistent volumes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4947) Persistent volumes are not listed

2016-03-15 Thread Joerg Schad (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Schad updated MESOS-4947:
---
Component/s: (was: HTTP API)

> Persistent volumes are not listed
> -
>
> Key: MESOS-4947
> URL: https://issues.apache.org/jira/browse/MESOS-4947
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.1
> Environment: DCOS 1.6.1 on AWS
>Reporter: Max Neunhöffer
>
> TL;DR:  In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either (as described here: 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes).
> Situation: There are Mesos agents in the cluster that have dynamic 
> reservations as well as persistent volumes for role "arangodb" with principal 
> "arangodb", but the corresponding framework no longer exists (it was 
> "destroyed" by clicking in the Marathon UI). Let's call these "Zombie 
> persistent volumes". We try to clean up this mess manually (or automatically).
> Effect: According to 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
>  one should be able to list these zombies using the 
> http:///mesos/slaves JSON/REST endpoint. We see a summary 
> of the dynamic reservations, but the persistent disks do not appear. As a 
> consequence, we can neither use the /destroy-volumes API nor the /unreserve 
> API to perform manual or automatic cleanup.
> Additional information: 
>   - If we start a new framework with role "arangodb" and principal "arangodb" 
> it will receive resource offers containing the dynamic reservations *and* the 
> persistent volumes.
>   - In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either. So this might not be limited to zombie 
> persistent volumes.





[jira] [Updated] (MESOS-4947) Persistent volumes are not listed

2016-03-15 Thread Joerg Schad (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Schad updated MESOS-4947:
---
Description: 
TL;DR:  In a newly created DCOS cluster with a running framework and actually 
used dynamic reservations and persistent volumes the /slaves API does not list 
the persistent volumes either (as described here: 
https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes).

Situation: There are Mesos agents in the cluster that have dynamic reservations 
as well as persistent volumes for role "arangodb" with principal "arangodb", 
but the corresponding framework no longer exists (it was "destroyed" by 
clicking in the Marathon UI). Let's call these "Zombie persistent volumes". We 
try to clean up this mess manually (or automatically).

Effect: According to 
https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
 one should be able to list these zombies using the 
http:///mesos/slaves JSON/REST endpoint. We see a summary of 
the dynamic reservations, but the persistent disks do not appear. As a 
consequence, we can neither use the /destroy-volumes API nor the /unreserve API 
to perform manual or automatic cleanup.

Additional information: 

  - If we start a new framework with role "arangodb" and principal "arangodb" 
it will receive resource offers containing the dynamic reservations *and* the 
persistent volumes.
  - In a newly created DCOS cluster with a running framework and actually used 
dynamic reservations and persistent volumes the /slaves API does not list the 
persistent volumes either. So this might not be limited to zombie persistent 
volumes.

  was:
Situation: There are Mesos agents in the cluster that have dynamic reservations 
as well as persistent volumes for role "arangodb" with principal "arangodb", 
but the corresponding framework no longer exists (it was "destroyed" by 
clicking in the Marathon UI). Let's call these "Zombie persistent volumes". We 
try to clean up this mess manually (or automatically).

Effect: According to 
https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
 one should be able to list these zombies using the 
http:///mesos/slaves JSON/REST endpoint. We see a summary of 
the dynamic reservations, but the persistent disks do not appear. As a 
consequence, we can neither use the /destroy-volumes API nor the /unreserve API 
to perform manual or automatic cleanup.

Additional information: 

  - If we start a new framework with role "arangodb" and principal "arangodb" 
it will receive resource offers containing the dynamic reservations *and* the 
persistent volumes.
  - In a newly created DCOS cluster with a running framework and actually used 
dynamic reservations and persistent volumes the /slaves API does not list the 
persistent volumes either. So this might not be limited to zombie persistent 
volumes.


> Persistent volumes are not listed
> -
>
> Key: MESOS-4947
> URL: https://issues.apache.org/jira/browse/MESOS-4947
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.1
> Environment: DCOS 1.6.1 on AWS
>Reporter: Max Neunhöffer
>
> TL;DR:  In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either (as described here: 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes).
> Situation: There are Mesos agents in the cluster that have dynamic 
> reservations as well as persistent volumes for role "arangodb" with principal 
> "arangodb", but the corresponding framework no longer exists (it was 
> "destroyed" by clicking in the Marathon UI). Let's call these "Zombie 
> persistent volumes". We try to clean up this mess manually (or automatically).
> Effect: According to 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
>  one should be able to list these zombies using the 
> http:///mesos/slaves JSON/REST endpoint. We see a summary 
> of the dynamic reservations, but the persistent disks do not appear. As a 
> consequence, we can neither use the /destroy-volumes API nor the /unreserve 
> API to perform manual or automatic cleanup.
> Additional information: 
>   - If we start a new framework with role "arangodb" and principal "arangodb" 
> it will receive resource offers containing the dynamic reservations *and* the 
> persistent volumes.
>   - In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either. So this might not be limited to zombie 
> persistent volumes.




[jira] [Updated] (MESOS-4947) Persistent volumes are not listed

2016-03-15 Thread Max Neunhöffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Neunhöffer updated MESOS-4947:
--
Summary: Persistent volumes are not listed  (was: Zombie persistent volumes 
are not listed)

> Persistent volumes are not listed
> -
>
> Key: MESOS-4947
> URL: https://issues.apache.org/jira/browse/MESOS-4947
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API
>Affects Versions: 0.27.1
> Environment: DCOS 1.6.1 on AWS
>Reporter: Max Neunhöffer
>
> Situation: There are Mesos agents in the cluster that have dynamic 
> reservations as well as persistent volumes for role "arangodb" with principal 
> "arangodb", but the corresponding framework no longer exists (it was 
> "destroyed" by clicking in the Marathon UI). Let's call these "Zombie 
> persistent volumes". We try to clean up this mess manually (or automatically).
> Effect: According to 
> https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
>  one should be able to list these zombies using the 
> http:///mesos/slaves JSON/REST endpoint. We see a summary 
> of the dynamic reservations, but the persistent disks do not appear. As a 
> consequence, we can neither use the /destroy-volumes API nor the /unreserve 
> API to perform manual or automatic cleanup.
> Additional information: 
>   - If we start a new framework with role "arangodb" and principal "arangodb" 
> it will receive resource offers containing the dynamic reservations *and* the 
> persistent volumes.
>   - In a newly created DCOS cluster with a running framework and actually 
> used dynamic reservations and persistent volumes the /slaves API does not 
> list the persistent volumes either. So this might not be limited to zombie 
> persistent volumes.





[jira] [Created] (MESOS-4947) Zombie persistent volumes are not listed

2016-03-15 Thread Max Neunhöffer (JIRA)
Max Neunhöffer created MESOS-4947:
-

 Summary: Zombie persistent volumes are not listed
 Key: MESOS-4947
 URL: https://issues.apache.org/jira/browse/MESOS-4947
 Project: Mesos
  Issue Type: Bug
  Components: HTTP API
Affects Versions: 0.27.1
 Environment: DCOS 1.6.1 on AWS
Reporter: Max Neunhöffer


Situation: There are Mesos agents in the cluster that have dynamic reservations 
as well as persistent volumes for role "arangodb" with principal "arangodb", 
but the corresponding framework no longer exists (it was "destroyed" by 
clicking in the Marathon UI). Let's call these "Zombie persistent volumes". We 
try to clean up this mess manually (or automatically).

Effect: According to 
https://github.com/apache/mesos/blob/master/docs/persistent-volume.md#listing-persistent-volumes
 one should be able to list these zombies using the 
http:///mesos/slaves JSON/REST endpoint. We see a summary of 
the dynamic reservations, but the persistent disks do not appear. As a 
consequence, we can neither use the /destroy-volumes API nor the /unreserve API 
to perform manual or automatic cleanup.

Additional information: 

  - If we start a new framework with role "arangodb" and principal "arangodb" 
it will receive resource offers containing the dynamic reservations *and* the 
persistent volumes.
  - In a newly created DCOS cluster with a running framework and actually used 
dynamic reservations and persistent volumes the /slaves API does not list the 
persistent volumes either. So this might not be limited to zombie persistent 
volumes.





[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role and command uris

2016-03-15 Thread Jian Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195151#comment-15195151
 ] 

Jian Qiu commented on MESOS-4744:
-

Review request for setting role:
https://reviews.apache.org/r/43935/

> mesos-execute should allow setting role and command uris
> 
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
>  Issue Type: Bug
>  Components: cli
>Reporter: Jian Qiu
>Assignee: Jian Qiu
>Priority: Minor
>
> It will be quite useful if we can set role and command uris when running 
> mesos-execute





[jira] [Updated] (MESOS-2281) Deprecate plain text Credential format.

2016-03-15 Thread Jan Schlicht (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Schlicht updated MESOS-2281:

Summary: Deprecate plain text Credential format.  (was: Deprecate legacy 
Credential format.)

> Deprecate plain text Credential format.
> ---
>
> Key: MESOS-2281
> URL: https://issues.apache.org/jira/browse/MESOS-2281
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, slave
>Affects Versions: 0.21.1
>Reporter: Cody Maloney
>Assignee: Jan Schlicht
>  Labels: mesosphere, security, tech-debt
>
> Currently two formats of credentials are supported: JSON
> {code}
>   "credentials": [
> {
>   "principal": "sherman",
>   "secret": "kitesurf"
> }
> {code}
> And a newline-delimited plain-text file:
> {code}
> principal1 secret1
> principal2 secret2
> {code}
> We should deprecate the plain-text format and eventually remove support for 
> it.
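For illustration, both accepted formats can be distinguished and parsed in a few lines. A hedged sketch in Python; the actual Mesos credential parser is C++ in the master/agent sources, so this only mirrors the two documented shapes:

```python
import json

def parse_credentials(text):
    """Parse either supported credentials format into (principal, secret)
    pairs. Illustrative sketch only, not the Mesos implementation."""
    if text.lstrip().startswith("{"):
        # JSON format: {"credentials": [{"principal": ..., "secret": ...}]}
        doc = json.loads(text)
        return [(c["principal"], c["secret"]) for c in doc["credentials"]]
    # Plain-text format: one "principal secret" pair per line.
    pairs = []
    for line in text.splitlines():
        if line.strip():
            principal, secret = line.split(None, 1)
            pairs.append((principal, secret.strip()))
    return pairs
```

Having a single unambiguous format (JSON) removes the need for this kind of sniffing, which is part of the motivation for the deprecation.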





[jira] [Commented] (MESOS-3243) Replace NULL with nullptr

2016-03-15 Thread Tomasz Janiszewski (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194978#comment-15194978
 ] 

Tomasz Janiszewski commented on MESOS-3243:
---

Review: https://reviews.apache.org/r/44843/
clang-tidy did not replace all NULLs, so I used sed.
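The sed-style pass amounts to a word-boundary token replacement. A minimal sketch for illustration; unlike clang-tidy it is not syntax-aware, so occurrences of NULL inside string literals or comments would also be rewritten, which is why a review of the diff is still needed:

```python
import re

def null_to_nullptr(source):
    # \b word boundaries keep identifiers such as "NULLABLE" untouched,
    # but this blindly rewrites any bare NULL token, including ones in
    # strings and comments.
    return re.sub(r"\bNULL\b", "nullptr", source)

print(null_to_nullptr("if (ptr == NULL) { ptr = NULL; }"))
```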

> Replace NULL with nullptr
> -
>
> Key: MESOS-3243
> URL: https://issues.apache.org/jira/browse/MESOS-3243
> Project: Mesos
>  Issue Type: Bug
>Reporter: Michael Park
>
> As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over 
> to use {{nullptr}}. I think it would be an interesting exercise to do this 
> with {{clang-modernize}} using the [nullptr 
> transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although 
> it's probably just as easy to use {{sed}}).





[jira] [Comment Edited] (MESOS-4886) Support mesos containerizer force_pull_image option.

2016-03-15 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194844#comment-15194844
 ] 

Guangya Liu edited comment on MESOS-4886 at 3/15/16 9:03 AM:
-

RR:
https://reviews.apache.org/r/44837/
https://reviews.apache.org/r/44838/
https://reviews.apache.org/r/44839/


was (Author: gyliu):
Two patches for docker:
https://reviews.apache.org/r/44837/
https://reviews.apache.org/r/44838/
https://reviews.apache.org/r/44839/

> Support mesos containerizer force_pull_image option.
> 
>
> Key: MESOS-4886
> URL: https://issues.apache.org/jira/browse/MESOS-4886
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Guangya Liu
>  Labels: containerizer
>
> Currently, for the unified containerizer, images that are already cached by 
> the metadata manager cannot be updated. The user has to delete the 
> corresponding images in the store if an update is needed. We should support a 
> `force_pull_image` option for the unified containerizer, to override the 
> cached image if one exists.
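The proposed semantics reduce to a cache lookup with an override. A hypothetical sketch; `cache` (a dict) and `pull` (a callable) stand in for the metadata manager and the store's puller, and are not the actual Mesos API:

```python
def get_image(name, cache, pull, force_pull=False):
    """Return a cached image unless force_pull is set, in which case
    fetch a fresh copy and replace the cached entry. Illustrative only."""
    if not force_pull and name in cache:
        return cache[name]
    cache[name] = pull(name)
    return cache[name]

cache = {"ubuntu": "digest-old"}
print(get_image("ubuntu", cache, lambda n: "digest-new"))        # cached entry
print(get_image("ubuntu", cache, lambda n: "digest-new", True))  # forced re-pull
```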





[jira] [Comment Edited] (MESOS-4886) Support mesos containerizer force_pull_image option.

2016-03-15 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194844#comment-15194844
 ] 

Guangya Liu edited comment on MESOS-4886 at 3/15/16 9:02 AM:
-

Two patches for docker:
https://reviews.apache.org/r/44837/
https://reviews.apache.org/r/44838/
https://reviews.apache.org/r/44839/


was (Author: gyliu):
Two patches for docker:
https://reviews.apache.org/r/44837/
https://reviews.apache.org/r/44838/

> Support mesos containerizer force_pull_image option.
> 
>
> Key: MESOS-4886
> URL: https://issues.apache.org/jira/browse/MESOS-4886
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Guangya Liu
>  Labels: containerizer
>
> Currently, for the unified containerizer, images that are already cached by 
> the metadata manager cannot be updated. The user has to delete the 
> corresponding images in the store if an update is needed. We should support a 
> `force_pull_image` option for the unified containerizer, to override the 
> cached image if one exists.





[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role and command uris

2016-03-15 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194903#comment-15194903
 ] 

haosdent commented on MESOS-4744:
-

I suggest splitting this into two tickets: one for role, the other for 
command_uris.

> mesos-execute should allow setting role and command uris
> 
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
>  Issue Type: Bug
>  Components: cli
>Reporter: Jian Qiu
>Assignee: Jian Qiu
>Priority: Minor
>
> It will be quite useful if we can set role and command uris when running 
> mesos-execute





[jira] [Commented] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.

2016-03-15 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194899#comment-15194899
 ] 

Jay Guo commented on MESOS-4891:


[~jieyu] Should this endpoint be added directly to the agent, or under /monitor/containers?

> Add a '/containers' endpoint to the agent to list all the active containers.
> 
>
> Key: MESOS-4891
> URL: https://issues.apache.org/jira/browse/MESOS-4891
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Jie Yu
>Assignee: Jay Guo
>
> This endpoint will be similar to /monitor/statistics.json endpoint, but it'll 
> also contain the 'container_status' about the container (see ContainerStatus 
> in mesos.proto). We'll eventually deprecate the /monitor/statistics.json 
> endpoint.





[jira] [Commented] (MESOS-4744) mesos-execute should allow setting role and command uris

2016-03-15 Thread Jian Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194881#comment-15194881
 ] 

Jian Qiu commented on MESOS-4744:
-

I have updated the description of the ticket. Do you think adding a flag
{code}
--command_uris=uri1,uri2...
{code}
is sufficient?
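Parsing such a flag would be straightforward. A hedged sketch; the flag name comes from the suggestion above and the dict shape mirrors CommandInfo.uris entries, neither of which exists in mesos-execute yet:

```python
def parse_command_uris(flag_value):
    """Split a comma-separated --command_uris value into URI entries
    shaped like CommandInfo.uris. Hypothetical helper for illustration."""
    return [{"value": uri.strip()} for uri in flag_value.split(",") if uri.strip()]
```

Each entry could later grow optional fields such as `executable` or `extract` if the flag syntax is extended.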

> mesos-execute should allow setting role and command uris
> 
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
>  Issue Type: Bug
>  Components: cli
>Reporter: Jian Qiu
>Assignee: Jian Qiu
>Priority: Minor
>
> It will be quite useful if we can set role and command uris when running 
> mesos-execute





[jira] [Updated] (MESOS-4744) mesos-execute should allow setting role and command uris

2016-03-15 Thread Jian Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Qiu updated MESOS-4744:

Description: It will be quite useful if we can set role and command uris 
when running mesos-execute  (was: It will be quite useful if we can set role 
when running mesos-execute)

> mesos-execute should allow setting role and command uris
> 
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
>  Issue Type: Bug
>  Components: cli
>Reporter: Jian Qiu
>Assignee: Jian Qiu
>Priority: Minor
>
> It will be quite useful if we can set role and command uris when running 
> mesos-execute





[jira] [Updated] (MESOS-4744) mesos-execute should allow setting role and command uris

2016-03-15 Thread Jian Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Qiu updated MESOS-4744:

Summary: mesos-execute should allow setting role and command uris  (was: 
mesos-execute should allow setting role)

> mesos-execute should allow setting role and command uris
> 
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
>  Issue Type: Bug
>  Components: cli
>Reporter: Jian Qiu
>Assignee: Jian Qiu
>Priority: Minor
>
> It will be quite useful if we can set role when running mesos-execute




