Review Request 64970: Replace ad hoc venv under support/ with tox.

2018-01-05 Thread Eric Chung

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64970/
---

Review request for mesos, Armand Grillet and Kevin Klues.


Repository: mesos


Description
---

Replace ad hoc venv under support/ with tox.

This patch changes the ad hoc .virtualenv created under support/ with tox, a 
tool that manages virtualenvs with a declaritive config file. The cool thing 
about it is that you can actually have multiple tox.ini files distributed in 
different python source trees, so you don't really have to install all of the 
depedencies into a single virtualenv; all you need to do is to run `tox -e 
 ` at the root of the source tree.


Diffs
-

  support/.gitignore a21a0f95b9113eae2881d2e346821c86761bb2bc 
  support/mesos-style.py 1b34ea2d9afa8f17b545841cea7a6853a6e18144 


Diff: https://reviews.apache.org/r/64970/diff/1/


Testing
---

1. intentionally create a lint error, such as extra spaces before a parens in a 
python file
2. run the pre-commit hook and see tox in action


Thanks,

Eric Chung



Re: Review Request 64953: Added error message in tests for orphaned containers.

2018-01-05 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64953/#review194845
---


Ship it!




Ship It!

- Alexander Rukletsov


On Jan. 4, 2018, 5:28 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64953/
> ---
> 
> (Updated Jan. 4, 2018, 5:28 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Joseph Wu, and Jan 
> Schlicht.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added error message in tests for orphaned containers.
> 
> 
> Diffs
> -
> 
>   src/tests/cluster.cpp f964bf0cd0cf22374877e5748ba142dcb5fee133 
> 
> 
> Diff: https://reviews.apache.org/r/64953/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 64970: Replace ad hoc venv under support/ with tox.

2018-01-05 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64970/#review194848
---



Is the assumption that the user already have `tox` installed on their box? I've 
never installed tox on my machine outside of a virtualenv (and would prefer not 
to have to unless absolutely necessary).

- Kevin Klues


On Jan. 5, 2018, 8:26 a.m., Eric Chung wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64970/
> ---
> 
> (Updated Jan. 5, 2018, 8:26 a.m.)
> 
> 
> Review request for mesos, Armand Grillet and Kevin Klues.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Replace ad hoc venv under support/ with tox.
> 
> This patch changes the ad hoc .virtualenv created under support/ with tox, a 
> tool that manages virtualenvs with a declaritive config file. The cool thing 
> about it is that you can actually have multiple tox.ini files distributed in 
> different python source trees, so you don't really have to install all of the 
> depedencies into a single virtualenv; all you need to do is to run `tox -e 
>  ` at the root of the source tree.
> 
> 
> Diffs
> -
> 
>   support/.gitignore a21a0f95b9113eae2881d2e346821c86761bb2bc 
>   support/mesos-style.py 1b34ea2d9afa8f17b545841cea7a6853a6e18144 
> 
> 
> Diff: https://reviews.apache.org/r/64970/diff/1/
> 
> 
> Testing
> ---
> 
> 1. intentionally create a lint error, such as extra spaces before a parens in 
> a python file
> 2. run the pre-commit hook and see tox in action
> 
> 
> Thanks,
> 
> Eric Chung
> 
>



Review Request 64972: Renamed VolumeProfileAdaptor to DiskProfileAdaptor.

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64972/
---

Review request for mesos and Jie Yu.


Repository: mesos


Description
---

"Disk Profile" is more accurate than "Volume Profile" when describing
the set of parameters passed into a Storage Resource Provider while
creating disk resources.  A disk resource is not necessarily a "Volume"
but will always be a disk.

This commit only renames the affected files.


Diffs
-

  include/mesos/module/volume_profile.hpp  
  include/mesos/resource_provider/storage/volume_profile.hpp  
  src/resource_provider/storage/uri_volume_profile.hpp  
  src/resource_provider/storage/uri_volume_profile.cpp  
  src/resource_provider/storage/volume_profile.cpp  
  src/resource_provider/storage/volume_profile.proto  
  src/resource_provider/storage/volume_profile_utils.hpp  
  src/resource_provider/storage/volume_profile_utils.cpp  
  src/tests/volume_profile_tests.cpp  


Diff: https://reviews.apache.org/r/64972/diff/1/


Testing
---

See next patch.

The two patches will probably be squashed together before commit.


Thanks,

Joseph Wu



Review Request 64973: Renamed VolumeProfileAdaptor to DiskProfileAdaptor (continued).

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64973/
---

Review request for mesos and Jie Yu.


Repository: mesos


Description
---

This renames the contents of the files rather than just the filenames.

This commit will be squashed into the previous review before commit.


Diffs
-

  docs/csi.md d268df19c35720937323995b60f51601f7866342 
  include/mesos/mesos.proto c677a8be07d0ef209d42622ae32056d36e55ff78 
  include/mesos/module/disk_profile.hpp 
bb95b32a18aa3ca316dd806c99b428bddfaba423 
  include/mesos/resource_provider/storage/disk_profile.hpp 
7141a14a0cef4757717316a6fc6b92ecf40f20af 
  include/mesos/v1/mesos.proto da7b4587891c47c02345209e0fdca60585a36fdc 
  src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
  src/module/manager.cpp f038f05ced9a4f967c67b48487f53a39ff4e9e89 
  src/resource_provider/storage/disk_profile.cpp 
336220d241458b3fc4cd356354677c89b442440f 
  src/resource_provider/storage/disk_profile.proto 
c0628ecd451312f521680cacad854ab04dcd4c90 
  src/resource_provider/storage/disk_profile_utils.hpp 
46afbd3a0eba8e1a0635ba9551843f2dce4d6def 
  src/resource_provider/storage/disk_profile_utils.cpp 
3d04558459ed5f970533c4d02a57cb07a5984bbb 
  src/resource_provider/storage/provider.cpp 
a9f007a061691ea3c6c61fc965cf3a3e764a7275 
  src/resource_provider/storage/uri_disk_profile.hpp 
a58758cd943bc7dff4dd096689e5a78a70f9d2ce 
  src/resource_provider/storage/uri_disk_profile.cpp 
4431ad11207d08f86f919042ef957b916c55ddfa 
  src/slave/flags.hpp 1cf5272301b54c73a4bdee4ae1dda7f7d75d55ce 
  src/slave/flags.cpp 48b882152b89cfd3f942382a28b79b7f26aeeefd 
  src/slave/slave.hpp b5eec3352e950371097b74a7f0516e474a4e2a0c 
  src/slave/slave.cpp cfb675df677e7a0d476b8d5a586afc2f197ab810 
  src/tests/disk_profile_tests.cpp 6ed81c90d8769889cfc4f91f9b378acd9c42c038 
  src/tests/storage_local_resource_provider_tests.cpp 
b522b406205ab4a8cb430bedd18ec00c9f437e8a 


Diff: https://reviews.apache.org/r/64973/diff/1/


Testing
---

make check


Thanks,

Joseph Wu



Re: Review Request 64805: Changed 'ConstantEndpointDetector' to have value semantics.

2018-01-05 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64805/#review194850
---


Ship it!




Ship It!

- Jan Schlicht


On Dec. 22, 2017, 2:36 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64805/
> ---
> 
> (Updated Dec. 22, 2017, 2:36 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Jan Schlicht.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The class 'ConstantEndpointDetector' was holding a reference to one of
> its constructor parameters which introduces unnecessary brittleness.
> 
> In this patch we adjust the class to just hold a copy of the
> originally passed value instead.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/detector.hpp 68ea8cf2de4f2decc3eede16cf59fa8a851502f4 
> 
> 
> Diff: https://reviews.apache.org/r/64805/diff/1/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 64932: Added example framework converting disk resources.

2018-01-05 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64932/
---

(Updated Jan. 5, 2018, 1:37 p.m.)


Review request for mesos, Greg Mann and Jie Yu.


Changes
---

Addressed comments from Gaston and Greg.


Repository: mesos


Description
---

This patch introduces an example HTTP framework which transforms
'RAW' disk resources from resource providers into 'MOUNT' volumes and
subsequently unreserves them.


Diffs (updated)
-

  src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
  src/examples/CMakeLists.txt d4f1af4f072efdc68fa0b722f42b1d8aa1779b6e 
  src/examples/test_csi_user_framework.cpp PRE-CREATION 


Diff: https://reviews.apache.org/r/64932/diff/2/

Changes: https://reviews.apache.org/r/64932/diff/1-2/


Testing
---

`make check`


Thanks,

Benjamin Bannier



Re: Review Request 64932: Added example framework converting disk resources.

2018-01-05 Thread Benjamin Bannier


> On Jan. 5, 2018, 6:01 a.m., Greg Mann wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 282 (patched)
> > 
> >
> > Is it not possible that a single resource provider can offer both 
> > reserved and unreserved resources? Even if the current SLRP provides this 
> > guarantee, can we be sure that all RPs will honor it?

This guarantee only works under a very specific setup, and we should be able to 
deal with a much broader set of resources.

I removed this block here, but instead added an additional check to above 
filter loop.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64932/#review194830
---


On Jan. 5, 2018, 1:37 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64932/
> ---
> 
> (Updated Jan. 5, 2018, 1:37 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces an example HTTP framework which transforms
> 'RAW' disk resources from resource providers into 'MOUNT' volumes and
> subsequently unreserves them.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
>   src/examples/CMakeLists.txt d4f1af4f072efdc68fa0b722f42b1d8aa1779b6e 
>   src/examples/test_csi_user_framework.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64932/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 64932: Added example framework converting disk resources.

2018-01-05 Thread Benjamin Bannier


> On Jan. 4, 2018, 9:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 278 (patched)
> > 
> >
> > I'd rephrase this:
> > 
> > ```
> > Check whether the given resources are reserved, ensuring that all 
> > RAW/MOUNT disk resources offered by the resource provider are either 
> > reserved or unreserved.
> > ```
> > 
> > Wouldn't this break if a resource provider offers both reserved and 
> > unreserved disk resources? I guess this will be usual once the agent's 
> > default disk resources (non-csi) are handled by an SLRP.

Fixed, see comment below, https://reviews.apache.org/r/64932/#comment273888.


> On Jan. 4, 2018, 9:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 279 (patched)
> > 
> >
> > Why do you use an option here?
> > 
> > Wouldn't `CHECK(!resources.isEmpty())` be clearer?

This piece checked whether all resources handled here are either _all reserved_ 
or that _none is reserved_. I am not sure the alternative you proposed is 
equivalent.

I moved the checked of reservedness to the previous step so this issue does not 
apply anymore; dropping.


> On Jan. 4, 2018, 9:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 302 (patched)
> > 
> >
> > Use `endl` instead of `'\n'`.

Adjusted here and elsewhere for consistency.


> On Jan. 4, 2018, 9:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 331-332 (patched)
> > 
> >
> > I'd phrase it:
> > 
> > If we didn't create an `ACCEPT` operation, create a `DELCINE` operation.
> 
> Greg Mann wrote:
> +1

We might still create an `ACCEPT` but add no operations in which case we would 
like to decline as well. I went with

// If we did not create operations to accept the offer with decline it.


> On Jan. 4, 2018, 9:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 380-386 (patched)
> > 
> >
> > This method doesn't seem to be used. `int main(...)` should probably 
> > call `Flags::setUsageMessage` instead of this.
> 
> Greg Mann wrote:
> Yea weird, we have these in a couple other example frameworks as well. 
> Should probably clean them up when we have a chance.

I removed the function here.


> On Jan. 4, 2018, 9:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 399 (patched)
> > 
> >
> > Why don't we make this a `string` instead of an `Option`?
> > 
> > That way we could remove the: `if (flags.master.isNone())` block in the 
> > main function.

That's a great suggestion. The current code here does follow what virtually 
every other example framework does though.


> On Jan. 4, 2018, 9:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 444-447 (patched)
> > 
> >
> > Shouldn't this be a flag like in `no_executor_framework.cpp` and in 
> > `long_lived_framework.cpp`?

We do appear to be inconsistent here, e.g., `load_generator_framework.cpp`, 
`test_framework.cpp`, or `test_http_framework.cpp` use an environment variable.

I'd prefer to keep this as is and leave it for a broader cleanup which could 
e.g., make it possible to load flags from both environment or command line 
flags; dropping.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64932/#review194780
---


On Jan. 5, 2018, 1:37 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64932/
> ---
> 
> (Updated Jan. 5, 2018, 1:37 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces an example HTTP framework which transforms
> 'RAW' disk resources from resource providers into 'MOUNT' volumes and
> subsequently unreserves them.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
>   src/examples/CMakeLists.txt d4f1af4f072efdc68fa0b722f42b1d8aa1779b6e 
>   src/examples/test_csi_user_framework.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64932/diff/2/
> 
> 
> Tes

Re: Review Request 64970: Replace ad hoc venv under support/ with tox.

2018-01-05 Thread Armand Grillet

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64970/#review194853
---



I agree with Kevin, hhaving tox managing our virtualenvs rather than doing it 
manually is a good idea but the cost of having a new tool that Mesos 
contributors need to install for the Python linting process seems too high. 
This is reinforced by the fact that the JS linter will still require a manually 
managed virtualenv in `/support`, this patch will thus add a new virtualenv 
instead of replacing/removing an existent one.


support/mesos-style.py
Line 150 (original), 150 (patched)


`mesos/support/tox.ini` does not exist and is not added in this patch, is 
that normal?


- Armand Grillet


On Jan. 5, 2018, 8:26 a.m., Eric Chung wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64970/
> ---
> 
> (Updated Jan. 5, 2018, 8:26 a.m.)
> 
> 
> Review request for mesos, Armand Grillet and Kevin Klues.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Replace ad hoc venv under support/ with tox.
> 
> This patch changes the ad hoc .virtualenv created under support/ with tox, a 
> tool that manages virtualenvs with a declaritive config file. The cool thing 
> about it is that you can actually have multiple tox.ini files distributed in 
> different python source trees, so you don't really have to install all of the 
> depedencies into a single virtualenv; all you need to do is to run `tox -e 
>  ` at the root of the source tree.
> 
> 
> Diffs
> -
> 
>   support/.gitignore a21a0f95b9113eae2881d2e346821c86761bb2bc 
>   support/mesos-style.py 1b34ea2d9afa8f17b545841cea7a6853a6e18144 
> 
> 
> Diff: https://reviews.apache.org/r/64970/diff/1/
> 
> 
> Testing
> ---
> 
> 1. intentionally create a lint error, such as extra spaces before a parens in 
> a python file
> 2. run the pre-commit hook and see tox in action
> 
> 
> Thanks,
> 
> Eric Chung
> 
>



Re: Review Request 64932: Added example framework converting disk resources.

2018-01-05 Thread Benjamin Bannier


> On Jan. 4, 2018, 9:49 p.m., Gaston Kleiman wrote:
> > Should we add this new framework to `src/tests/examples_tests.cpp`?

I am not sure the framework would be useful in its current form for such a 
test; it e.g., has no clear termination criterion.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64932/#review194780
---


On Jan. 5, 2018, 1:37 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64932/
> ---
> 
> (Updated Jan. 5, 2018, 1:37 p.m.)
> 
> 
> Review request for mesos, Greg Mann and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces an example HTTP framework which transforms
> 'RAW' disk resources from resource providers into 'MOUNT' volumes and
> subsequently unreserves them.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
>   src/examples/CMakeLists.txt d4f1af4f072efdc68fa0b722f42b1d8aa1779b6e 
>   src/examples/test_csi_user_framework.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64932/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 61172: Added mesos.http and mesos.exceptions for CLI.

2018-01-05 Thread Armand Grillet

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61172/#review194862
---



Some small issues to fix, there are also some typos in the commit description.


src/python/cli_new/bootstrap
Lines 72 (patched)


This file does not exist, is that normal?



src/python/lib/mesos/exceptions.py
Lines 26 (patched)


s/mesos/Mesos



src/python/lib/mesos/exceptions.py
Lines 30 (patched)


We use the syntax `"{msg}: {exception}".format(msg=message, 
exception=original_exception)` in the rest of our Python codebase. I do not 
have an opinion regarding the syntax but we should follow the current one in 
that review request and do a refactoring later if necessary.



src/python/lib/mesos/exceptions.py
Lines 60 (patched)


s/'/"



src/python/lib/mesos/exceptions.py
Lines 75 (patched)






src/python/lib/mesos/exceptions.py
Lines 83 (patched)


Unecessary comma.



src/python/lib/mesos/exceptions.py
Lines 106 (patched)


"Given when " is unecessary.



src/python/lib/mesos/http.py
Lines 20 (patched)


s/`restful API`/`RESTful API.`



src/python/lib/mesos/http.py
Lines 25 (patched)


Let's do `from urlparse import urlparse` instead as we only use this once 
for that function in that file.



src/python/lib/mesos/http.py
Lines 33 (patched)


Not in alphabetical order but it makes sense due to the inheritance of 
these classes.



src/python/lib/mesos/http.py
Lines 72 (patched)


We should do that everywhere in the CLI, this is a useful addition. We 
could also use [mypy](http://www.mypy-lang.org/).



src/python/lib/mesos/http.py
Lines 117 (patched)


Will just be `urlparse(url)` following previous comment.



src/python/lib/mesos/http.py
Lines 245 (patched)


```
:param use_gzip_encoding: boolean indicating whether to pass gzip
  encoding in the request headers or not
```



src/python/lib/mesos/http.py
Lines 261 (patched)


I have some trouble reading this comment. I would write something like "We 
retry only when it makes sense: either due to a network partition (e.g. 
connection errors) or if the request failed due to a server error such as 500s, 
timeouts, and so on."



src/python/lib/tests/test_http.py
Lines 239 (patched)


s/resrouce/resource


- Armand Grillet


On Jan. 1, 2018, 1:52 a.m., Eric Chung wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61172/
> ---
> 
> (Updated Jan. 1, 2018, 1:52 a.m.)
> 
> 
> Review request for mesos, Armand Grillet, Jason Lai, and Kevin Klues.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Part of MESOS-7310, this patch adds the mesos.http and mesos.exceptions
> modules, which provides a Resource class and its descendants for
> abstracting away common operations over http connectioins with JSON
> serialization.
> 
> updated patch according to klueska's recommendations
> 
> 
> addressed comments
> 
> 
> Diffs
> -
> 
>   src/python/cli_new/bootstrap c84e549e59bc7a07b5b70c24c63bc0c16881e442 
>   src/python/lib/mesos/exceptions.py PRE-CREATION 
>   src/python/lib/mesos/http.py PRE-CREATION 
>   src/python/lib/requirements.in e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 
>   src/python/lib/tests/conftest.py PRE-CREATION 
>   src/python/lib/tests/test_exceptions.py PRE-CREATION 
>   src/python/lib/tests/test_http.py PRE-CREATION 
>   src/python/lib/tox.ini 8ad030d2dbf1fb7a04f1eaadb587462b4ef3f054 
>   support/pylint.config f74f553e238553bd6a6c06f4dd888cc5954a33eb 
> 
> 
> Diff: https://reviews.apache.org/r/61172/diff/8/
> 
> 
> Testing
> ---
> 
> install tox
> cd src/python/lib
> tox
> 
> 
> Thanks,
> 
> Eric Chung
> 
>



Review Request 64978: WIP: Attached and detached executor volume directory for task.

2018-01-05 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64978/
---

Review request for mesos.


Bugs: MESOS-8279
https://issues.apache.org/jira/browse/MESOS-8279


Repository: mesos


Description
---

WIP: Attached and detached executor volume directory for task.


Diffs
-

  src/slave/slave.cpp cfb675df677e7a0d476b8d5a586afc2f197ab810 


Diff: https://reviews.apache.org/r/64978/diff/1/


Testing
---


Thanks,

Qian Zhang



Review Request 64980: Updated health check doc with 3xx redirects.

2018-01-05 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64980/
---

Review request for mesos and Till Toenshoff.


Repository: mesos


Description
---

See summary.


Diffs
-

  docs/health-checks.md b9b6327dea8c70dbc4aab204487a088f76c79479 


Diff: https://reviews.apache.org/r/64980/diff/1/


Testing
---

Rendered in MacDown.


Thanks,

Alexander Rukletsov



Re: Review Request 64980: Updated health check doc with 3xx redirects.

2018-01-05 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64980/
---

(Updated Jan. 5, 2018, 4 p.m.)


Review request for mesos and Till Toenshoff.


Repository: mesos


Description
---

See summary.


Diffs (updated)
-

  docs/health-checks.md b9b6327dea8c70dbc4aab204487a088f76c79479 


Diff: https://reviews.apache.org/r/64980/diff/2/

Changes: https://reviews.apache.org/r/64980/diff/1-2/


Testing
---

Rendered in MacDown.


Thanks,

Alexander Rukletsov



Re: Review Request 64980: Updated health check doc with 3xx redirects.

2018-01-05 Thread Till Toenshoff

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64980/#review194870
---


Ship it!





docs/health-checks.md
Lines 389-390 (original), 392-393 (patched)


This may sound confusing but after discussing with Alex, I understood it 
indeed is exactly accurate.

So we do actually not expect CURL with activate redirect-follow to ever 
come back with a 3XX as it should follow instead and return what it gets then. 
But if CURL now decided it followed enough redirects, maybe it does actually 
come back with a 3XX even though it should follow instead. SUCH case is treated 
as a success and that is exactly what this says :)


- Till Toenshoff


On Jan. 5, 2018, 4 p.m., Alexander Rukletsov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64980/
> ---
> 
> (Updated Jan. 5, 2018, 4 p.m.)
> 
> 
> Review request for mesos and Till Toenshoff.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   docs/health-checks.md b9b6327dea8c70dbc4aab204487a088f76c79479 
> 
> 
> Diff: https://reviews.apache.org/r/64980/diff/2/
> 
> 
> Testing
> ---
> 
> Rendered in MacDown.
> 
> 
> Thanks,
> 
> Alexander Rukletsov
> 
>



Re: Review Request 64973: Renamed VolumeProfileAdaptor to DiskProfileAdaptor (continued).

2018-01-05 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64973/#review194875
---


Fix it, then Ship it!





src/Makefile.am
Line 386 (original), 386 (patched)


alignment?


- Jie Yu


On Jan. 5, 2018, 10:46 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64973/
> ---
> 
> (Updated Jan. 5, 2018, 10:46 a.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This renames the contents of the files rather than just the filenames.
> 
> This commit will be squashed into the previous review before commit.
> 
> 
> Diffs
> -
> 
>   docs/csi.md d268df19c35720937323995b60f51601f7866342 
>   include/mesos/mesos.proto c677a8be07d0ef209d42622ae32056d36e55ff78 
>   include/mesos/module/disk_profile.hpp 
> bb95b32a18aa3ca316dd806c99b428bddfaba423 
>   include/mesos/resource_provider/storage/disk_profile.hpp 
> 7141a14a0cef4757717316a6fc6b92ecf40f20af 
>   include/mesos/v1/mesos.proto da7b4587891c47c02345209e0fdca60585a36fdc 
>   src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
>   src/module/manager.cpp f038f05ced9a4f967c67b48487f53a39ff4e9e89 
>   src/resource_provider/storage/disk_profile.cpp 
> 336220d241458b3fc4cd356354677c89b442440f 
>   src/resource_provider/storage/disk_profile.proto 
> c0628ecd451312f521680cacad854ab04dcd4c90 
>   src/resource_provider/storage/disk_profile_utils.hpp 
> 46afbd3a0eba8e1a0635ba9551843f2dce4d6def 
>   src/resource_provider/storage/disk_profile_utils.cpp 
> 3d04558459ed5f970533c4d02a57cb07a5984bbb 
>   src/resource_provider/storage/provider.cpp 
> a9f007a061691ea3c6c61fc965cf3a3e764a7275 
>   src/resource_provider/storage/uri_disk_profile.hpp 
> a58758cd943bc7dff4dd096689e5a78a70f9d2ce 
>   src/resource_provider/storage/uri_disk_profile.cpp 
> 4431ad11207d08f86f919042ef957b916c55ddfa 
>   src/slave/flags.hpp 1cf5272301b54c73a4bdee4ae1dda7f7d75d55ce 
>   src/slave/flags.cpp 48b882152b89cfd3f942382a28b79b7f26aeeefd 
>   src/slave/slave.hpp b5eec3352e950371097b74a7f0516e474a4e2a0c 
>   src/slave/slave.cpp cfb675df677e7a0d476b8d5a586afc2f197ab810 
>   src/tests/disk_profile_tests.cpp 6ed81c90d8769889cfc4f91f9b378acd9c42c038 
>   src/tests/storage_local_resource_provider_tests.cpp 
> b522b406205ab4a8cb430bedd18ec00c9f437e8a 
> 
> 
> Diff: https://reviews.apache.org/r/64973/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 64973: Renamed VolumeProfileAdaptor to DiskProfileAdaptor (continued).

2018-01-05 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64973/#review194877
---




src/Makefile.am
Line 1477 (original), 1477 (patched)


ditto. alignment for the `` in the end


- Jie Yu


On Jan. 5, 2018, 10:46 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64973/
> ---
> 
> (Updated Jan. 5, 2018, 10:46 a.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This renames the contents of the files rather than just the filenames.
> 
> This commit will be squashed into the previous review before commit.
> 
> 
> Diffs
> -
> 
>   docs/csi.md d268df19c35720937323995b60f51601f7866342 
>   include/mesos/mesos.proto c677a8be07d0ef209d42622ae32056d36e55ff78 
>   include/mesos/module/disk_profile.hpp 
> bb95b32a18aa3ca316dd806c99b428bddfaba423 
>   include/mesos/resource_provider/storage/disk_profile.hpp 
> 7141a14a0cef4757717316a6fc6b92ecf40f20af 
>   include/mesos/v1/mesos.proto da7b4587891c47c02345209e0fdca60585a36fdc 
>   src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
>   src/module/manager.cpp f038f05ced9a4f967c67b48487f53a39ff4e9e89 
>   src/resource_provider/storage/disk_profile.cpp 
> 336220d241458b3fc4cd356354677c89b442440f 
>   src/resource_provider/storage/disk_profile.proto 
> c0628ecd451312f521680cacad854ab04dcd4c90 
>   src/resource_provider/storage/disk_profile_utils.hpp 
> 46afbd3a0eba8e1a0635ba9551843f2dce4d6def 
>   src/resource_provider/storage/disk_profile_utils.cpp 
> 3d04558459ed5f970533c4d02a57cb07a5984bbb 
>   src/resource_provider/storage/provider.cpp 
> a9f007a061691ea3c6c61fc965cf3a3e764a7275 
>   src/resource_provider/storage/uri_disk_profile.hpp 
> a58758cd943bc7dff4dd096689e5a78a70f9d2ce 
>   src/resource_provider/storage/uri_disk_profile.cpp 
> 4431ad11207d08f86f919042ef957b916c55ddfa 
>   src/slave/flags.hpp 1cf5272301b54c73a4bdee4ae1dda7f7d75d55ce 
>   src/slave/flags.cpp 48b882152b89cfd3f942382a28b79b7f26aeeefd 
>   src/slave/slave.hpp b5eec3352e950371097b74a7f0516e474a4e2a0c 
>   src/slave/slave.cpp cfb675df677e7a0d476b8d5a586afc2f197ab810 
>   src/tests/disk_profile_tests.cpp 6ed81c90d8769889cfc4f91f9b378acd9c42c038 
>   src/tests/storage_local_resource_provider_tests.cpp 
> b522b406205ab4a8cb430bedd18ec00c9f437e8a 
> 
> 
> Diff: https://reviews.apache.org/r/64973/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 64932: Added example framework converting disk resources.

2018-01-05 Thread Gaston Kleiman


> On Jan. 4, 2018, 12:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 399 (patched)
> > 
> >
> > Why don't we make this a `string` instead of an `Option`?
> > 
> > That way we could remove the: `if (flags.master.isNone())` block in the 
> > main function.
> 
> Benjamin Bannier wrote:
> That's a great suggestion. The current code here does follow what 
> virtually every other example framework does though.

So we should fix it here and file a JIRA to cleanup the other example 
frameworks?


> On Jan. 4, 2018, 12:49 p.m., Gaston Kleiman wrote:
> > src/examples/test_csi_user_framework.cpp
> > Lines 444-447 (patched)
> > 
> >
> > Shouldn't this be a flag like in `no_executor_framework.cpp` and in 
> > `long_lived_framework.cpp`?
> 
> Benjamin Bannier wrote:
> We do appear to be inconsistent here, e.g., 
> `load_generator_framework.cpp`, `test_framework.cpp`, or 
> `test_http_framework.cpp` use an environment variable.
> 
> I'd prefer to keep this as is and leave it for a broader cleanup which 
> could e.g., make it possible to load flags from both environment or command 
> line flags; dropping.

I think that using an env variable instead of a flag is technical debt that 
would be trivial to fix. This is a new file, and fixing this tech debt in this 
file is trivial.

So I think we should fix it here, and file a JIRA to do a broader cleanup of 
the other example frameworks later.


- Gaston


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64932/#review194780
---


On Jan. 5, 2018, 4:37 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64932/
> ---
> 
> (Updated Jan. 5, 2018, 4:37 a.m.)
> 
> 
> Review request for mesos, Greg Mann and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch introduces an example HTTP framework which transforms
> 'RAW' disk resources from resource providers into 'MOUNT' volumes and
> subsequently unreserves them.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
>   src/examples/CMakeLists.txt d4f1af4f072efdc68fa0b722f42b1d8aa1779b6e 
>   src/examples/test_csi_user_framework.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/64932/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 63860: Windows: Mapped the Docker network info types.

2018-01-05 Thread Akash Gupta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63860/
---

(Updated Jan. 5, 2018, 6:31 p.m.)


Review request for mesos, Alexander Rukletsov, Andrew Schwartzmeyer, Gaston 
Kleiman, Jie Yu, John Kordich, Joseph Wu, and Michael Park.


Changes
---

Updated with Jie's feedback.


Bugs: MESOS-7342
https://issues.apache.org/jira/browse/MESOS-7342


Repository: mesos


Description (updated)
---

The Network enum in DockerInfo is specific to Linux containers. `HOST`
doesn't exist on Windows and `BRIDGE` is `NAT` on Windows. The current
default docker network setting was always `HOST`, which broke the
Windows docker executor. Now, if a specific network isn't given, the
network mode will default to `HOST` on Linux agents and `NAT` on Windows
agents. Also, `BRIDGE` mode will be translated to `NAT` on Windows.


Diffs (updated)
-

  src/docker/docker.cpp 722a54ad113fc4e2bb22a8f08e307ab38d5fbfed 


Diff: https://reviews.apache.org/r/63860/diff/6/

Changes: https://reviews.apache.org/r/63860/diff/5-6/


Testing
---

See https://reviews.apache.org/r/63862/ for test results.


Thanks,

Akash Gupta



Re: Review Request 63861: Windows: Updated networking doc.

2018-01-05 Thread Akash Gupta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63861/
---

(Updated Jan. 5, 2018, 6:32 p.m.)


Review request for mesos, Alexander Rukletsov, Andrew Schwartzmeyer, Gaston 
Kleiman, Jie Yu, John Kordich, and Michael Park.


Changes
---

Updated to reflect networking changes.


Bugs: MESOS-7342
https://issues.apache.org/jira/browse/MESOS-7342


Repository: mesos


Description
---

The networking docs now describe how the Docker network modes in the
`Network` enum work on Windows, since the enum only has Linux network
modes.


Diffs (updated)
-

  docs/networking.md bdd3a762435aae163fc659ccfea7da825983 


Diff: https://reviews.apache.org/r/63861/diff/7/

Changes: https://reviews.apache.org/r/63861/diff/6-7/


Testing
---


Thanks,

Akash Gupta



Re: Review Request 63862: Windows: Ported docker_tests.cpp.

2018-01-05 Thread Akash Gupta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63862/
---

(Updated Jan. 5, 2018, 6:33 p.m.)


Review request for mesos, Alexander Rukletsov, Andrew Schwartzmeyer, Gaston 
Kleiman, Jie Yu, John Kordich, Joseph Wu, and Michael Park.


Changes
---

Updated to reflect network changes.


Bugs: MESOS-7342
https://issues.apache.org/jira/browse/MESOS-7342


Repository: mesos


Description
---

Ported the disabled tests to run on Windows. The following tests
could not be ported due to Windows platform limitations and remain
diabled:
  - ROOT_DOCKER_MountRelativeContainerPath (can't mount volumes inside
sandbox).
  - ROOT_DOCKER_NVIDIA_GPU_DeviceAllow (no GPU container support).
  - ROOT_DOCKER_NVIDIA_GPU_InspectDevices (no GPU container support).


Diffs (updated)
-

  src/tests/containerizer/docker_tests.cpp 
0ac4a79e03d5e11ead5a57a9749e26c20a1ddd57 


Diff: https://reviews.apache.org/r/63862/diff/5/

Changes: https://reviews.apache.org/r/63862/diff/4-5/


Testing
---

Windows mesos-test:
[==] 754 tests from 77 test cases ran. (270586 ms total)
[  PASSED  ] 754 tests.


Windows DockerTest only:
[==] 11 tests from 1 test case ran. (85617 ms total)
[  PASSED  ] 11 tests.

Linux DockerTest (only 12 tests instead of 14, because I don't have Nvidia GPU):
[==] 12 tests from 1 test case ran. (12413 ms total)
[  PASSED  ] 12 tests.


Thanks,

Akash Gupta



Re: Review Request 63860: Windows: Mapped the Docker network info types.

2018-01-05 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63860/#review194884
---


Fix it, then Ship it!





src/docker/docker.cpp
Line 741 (original), 747-754 (patched)


I'd suggest the following way. `DEFAULT_DOCKER_NETWORK` should be killed.

```
ContainerInfo::DockerInfo::Network network;
if (!dockerInfo.has_network()) {
#ifdef __WINDOWS__
  network = ContainerInfo::DockerInfo::Network::BRIDGE;
#else
  network = ContainerInfo::DockerInfo::Network::HOST;
#endif
} else {
  network = dockerInfo.network();
}

switch (network) {
  ...
}
```



src/docker/docker.cpp
Line 777 (original), 812 (patched)


I don't mention `nat` at all here because from API perspective, `nat` is 
not expose to the users. Users only knows about BRIDGE.

```
Port mappings are only supported for BRIDGE and USER network
```


- Jie Yu


On Jan. 5, 2018, 6:31 p.m., Akash Gupta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63860/
> ---
> 
> (Updated Jan. 5, 2018, 6:31 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Andrew Schwartzmeyer, Gaston 
> Kleiman, Jie Yu, John Kordich, Joseph Wu, and Michael Park.
> 
> 
> Bugs: MESOS-7342
> https://issues.apache.org/jira/browse/MESOS-7342
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The Network enum in DockerInfo is specific to Linux containers. `HOST`
> doesn't exist on Windows and `BRIDGE` is `NAT` on Windows. The current
> default docker network setting was always `HOST`, which broke the
> Windows docker executor. Now, if a specific network isn't given, the
> network mode will default to `HOST` on Linux agents and `NAT` on Windows
> agents. Also, `BRIDGE` mode will be translated to `NAT` on Windows.
> 
> 
> Diffs
> -
> 
>   src/docker/docker.cpp 722a54ad113fc4e2bb22a8f08e307ab38d5fbfed 
> 
> 
> Diff: https://reviews.apache.org/r/63860/diff/6/
> 
> 
> Testing
> ---
> 
> See https://reviews.apache.org/r/63862/ for test results.
> 
> 
> Thanks,
> 
> Akash Gupta
> 
>



Re: Review Request 64940: Prevented a crash when an agent with terminal tasks is lost.

2018-01-05 Thread Jiang Yan Xu


> On Jan. 4, 2018, 5:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp
> > Lines 10037-10056 (original), 10039-10062 (patched)
> > 
> >
> > I think we shouldn't create a TASK_UNREACHABLE status update and call 
> > `updateTask` or `forward` for a terminal task at all. . Also, `forward` 
> > sends TASK_UNREACHABLE update for terminal task to the framework which 
> > looks incorrect.
> > 
> > 
> > Ideally, we want terminal but unacknowledged tasks to still be marked 
> > unreachable in some way, either via task state being TASK_UNREACHABLE or 
> > task being present in `unreachableTasks`. This allows, for example, the 
> > WebUI to not show sandbox links for unreachable tasks irrespective of 
> > whether they were terminal or not before going unreachable. 
> > 
> > But doing this is tricky for various reasons:
> > 
> > --> `updateTask()` doesn't allow a terminal state to be transitioned to 
> > TASK_UNREACHABLE. Right now when we call `updateTask` for a terminal task, 
> > it adds TASK_UNREACHABLE status to `Task.statuses` and also sends it to 
> > operator API stream subscribers which looks incorrect. The fact that 
> > `updateTask` internally deals with already terminal tasks is a bad design 
> > decision in retrospect. I think the callers shouldn't call it for terminal 
> > tasks instead.
> > 
> > --> It's not clear to our users what a `completed` task means. The 
> > intention was for this to hold a cache of terminal and acknowledged tasks 
> > for storing recent history. The users of the WebUI probably equate 
> > "Completed Tasks" to terminal tasks irrespective of their acknowledgement 
> > status, which is why it is confusing for them to see terminal but 
> > unacknowledged tasks in the "Active tasks" section in the WebUI.
> > 
> > --> When a framework reconciles the state of a task on an unreachable 
> > agent, master replies with TASK_UNREACHABLE irrespective of whether the 
> > task was in a non-terminal state or terminal but un-acknowledged state or 
> > terminal and acknowledged state when the agent went unreachable.  
> > 
> > I think the direction we want to go towards is
> > 
> > --> Completed tasks should consist of terminal unacknowledged and 
> > terminal acknowled tasks, likely in two different data structures.
> > --> Unreachable tasks should consist of all non-complete tasks on an 
> > unreachable agent.  All the tasks in this map should be in TASK_UNREACHABLE 
> > state.
> > 
> > 
> > Given all the above is a very involved change, I would recommend 
> > keeping what you have here but with a giant TODO (your current comment in 
> > #10058 doesn't go into enough detail about the complexity here) that talks 
> > about the above stuff. Your change at least keeps the parity with the 
> > (broken) semantics that we have in 1.4 and earlier so that's a bit better.
> 
> Vinod Kone wrote:
> Ignore the first line. Forgot to delete it.

Future direction

1. If completed == terminal unacknowledged + terminal acknowledged, then 
completed == terminal right? Should we then unify the terminology and pick one?
2. Unreachable tasks == non-terminal tasks on an unreachable agent: this is 
what this RR is going to do but IIUC you want a different behavior.

Current semantics

1. In 1.4 the the master (in `updateTask` sends `TASK_UNREACHABLE` to the 
operator API subsribers for terminal tasks), as it stands right now we are 
going to send `TASK_UNREACHABLE` to the schedulers as well. Should we change 
that?
2. You also said above that "Ideally, we want terminal but unacknowledged tasks 
to still be marked unreachable in some way" which seems to contradict your 
later point that "Unreachable tasks should consist of all non-complete 
(terminal) tasks", could you clarify?

Overall it sounds to me that the most correct semantic is still to set 
`TASK_UNREACHABLE` only for the tasks that are non-terminal (because otherwise 
we know that the state is not going to change to something else that we don't 
know yet) but perhaps we can use another field in the status to signal the fact 
that the agent is partitioned?


- Jiang Yan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64940/#review194794
---


On Jan. 3, 2018, 5:35 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64940/
> ---
> 
> (Updated Jan. 3, 2018, 5:35 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gaston Kleiman, Jie Yu, Vinod 
> Kone, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8337
> https://issues.apache.org/jira/browse/MESOS-8337
> 
>

Re: Review Request 64940: Prevented a crash when an agent with terminal tasks is lost.

2018-01-05 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64940/
---

(Updated Jan. 5, 2018, 7:06 p.m.)


Review request for mesos, Benjamin Mahler, Gaston Kleiman, Jie Yu, Vinod Kone, 
and Jiang Yan Xu.


Bugs: MESOS-8337
https://issues.apache.org/jira/browse/MESOS-8337


Repository: mesos


Description
---

If an agent is lost, we try to remove all the tasks that might
have been lost. However, if a task is already terminal, it hasn't
really been lost so we should not be tracking it in the framework's
unreachable tasks list.


Diffs (updated)
-

  src/master/master.hpp 130f6e28cc62a8912aac66ecfbf014fe1ee444e3 
  src/master/master.cpp 28d8be3a4769b418b61cff0b95845e4232135bc7 
  src/tests/partition_tests.cpp 3813139f576ea01db0197f0fe8a73597db1bb69a 


Diff: https://reviews.apache.org/r/64940/diff/4/

Changes: https://reviews.apache.org/r/64940/diff/3-4/


Testing
---

make check (Fedora 27)


Thanks,

James Peach



Re: Review Request 64940: Prevented a crash when an agent with terminal tasks is lost.

2018-01-05 Thread James Peach


> On Jan. 5, 2018, 1:25 a.m., Vinod Kone wrote:
> > src/tests/master_tests.cpp
> > Lines 7630 (patched)
> > 
> >
> > `the task`, which task? are both the tasks using the same executor?

We need one task to finish and one to not finish. They have separate executors.


> On Jan. 5, 2018, 1:25 a.m., Vinod Kone wrote:
> > src/tests/master_tests.cpp
> > Lines 7645-7653 (patched)
> > 
> >
> > Why are you launching 2 tasks instead of just one? You just want to 
> > test the unacknowledged terminal task case right? Doesn't really matter if 
> > it is TASK_FINISHED or TASK_LOST right?

`TASK_LOST` is also a terminal state. So I'm distinguishing that the completed 
task doesn't get tracked as unreachable but that the `LOST` task does. If I 
didn't have any non-lost tasks, the `CHECK` would not be triggered.


> On Jan. 5, 2018, 1:25 a.m., Vinod Kone wrote:
> > src/tests/master_tests.cpp
> > Lines 7692-7711 (patched)
> > 
> >
> > Are you verifying that the latest state of the task has been preserved? 
> > I would recommend hitting "/tasks" endpoint instead of "state-summary" to 
> > be more direct.

I'm checking that one task remained at `FINISHED` and one was moved to `LOST`. 
I can make the same checks using the `/tasks` endpoint, but it's going to be 
more code and more complicated.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64940/#review194794
---


On Jan. 5, 2018, 7:06 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64940/
> ---
> 
> (Updated Jan. 5, 2018, 7:06 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gaston Kleiman, Jie Yu, Vinod 
> Kone, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8337
> https://issues.apache.org/jira/browse/MESOS-8337
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> If an agent is lost, we try to remove all the tasks that might
> have been lost. However, if a task is already terminal, it hasn't
> really been lost so we should not be tracking it in the framework's
> unreachable tasks list.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 130f6e28cc62a8912aac66ecfbf014fe1ee444e3 
>   src/master/master.cpp 28d8be3a4769b418b61cff0b95845e4232135bc7 
>   src/tests/partition_tests.cpp 3813139f576ea01db0197f0fe8a73597db1bb69a 
> 
> 
> Diff: https://reviews.apache.org/r/64940/diff/4/
> 
> 
> Testing
> ---
> 
> make check (Fedora 27)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 64940: Prevented a crash when an agent with terminal tasks is lost.

2018-01-05 Thread James Peach


> On Jan. 5, 2018, 1:25 a.m., Vinod Kone wrote:
> > src/master/master.cpp
> > Lines 10037-10056 (original), 10039-10062 (patched)
> > 
> >
> > I think we shouldn't create a TASK_UNREACHABLE status update and call 
> > `updateTask` or `forward` for a terminal task at all. . Also, `forward` 
> > sends TASK_UNREACHABLE update for terminal task to the framework which 
> > looks incorrect.
> > 
> > 
> > Ideally, we want terminal but unacknowledged tasks to still be marked 
> > unreachable in some way, either via task state being TASK_UNREACHABLE or 
> > task being present in `unreachableTasks`. This allows, for example, the 
> > WebUI to not show sandbox links for unreachable tasks irrespective of 
> > whether they were terminal or not before going unreachable. 
> > 
> > But doing this is tricky for various reasons:
> > 
> > --> `updateTask()` doesn't allow a terminal state to be transitioned to 
> > TASK_UNREACHABLE. Right now when we call `updateTask` for a terminal task, 
> > it adds TASK_UNREACHABLE status to `Task.statuses` and also sends it to 
> > operator API stream subscribers which looks incorrect. The fact that 
> > `updateTask` internally deals with already terminal tasks is a bad design 
> > decision in retrospect. I think the callers shouldn't call it for terminal 
> > tasks instead.
> > 
> > --> It's not clear to our users what a `completed` task means. The 
> > intention was for this to hold a cache of terminal and acknowledged tasks 
> > for storing recent history. The users of the WebUI probably equate 
> > "Completed Tasks" to terminal tasks irrespective of their acknowledgement 
> > status, which is why it is confusing for them to see terminal but 
> > unacknowledged tasks in the "Active tasks" section in the WebUI.
> > 
> > --> When a framework reconciles the state of a task on an unreachable 
> > agent, master replies with TASK_UNREACHABLE irrespective of whether the 
> > task was in a non-terminal state or terminal but un-acknowledged state or 
> > terminal and acknowledged state when the agent went unreachable.  
> > 
> > I think the direction we want to go towards is
> > 
> > --> Completed tasks should consist of terminal unacknowledged and 
> > terminal acknowled tasks, likely in two different data structures.
> > --> Unreachable tasks should consist of all non-complete tasks on an 
> > unreachable agent.  All the tasks in this map should be in TASK_UNREACHABLE 
> > state.
> > 
> > 
> > Given all the above is a very involved change, I would recommend 
> > keeping what you have here but with a giant TODO (your current comment in 
> > #10058 doesn't go into enough detail about the complexity here) that talks 
> > about the above stuff. Your change at least keeps the parity with the 
> > (broken) semantics that we have in 1.4 and earlier so that's a bit better.
> 
> Vinod Kone wrote:
> Ignore the first line. Forgot to delete it.
> 
> Jiang Yan Xu wrote:
> Future direction
> 
> 1. If completed == terminal unacknowledged + terminal acknowledged, then 
> completed == terminal right? Should we then unify the terminology and pick 
> one?
> 2. Unreachable tasks == non-terminal tasks on an unreachable agent: this 
> is what this RR is going to do but IIUC you want a different behavior.
> 
> Current semantics
> 
> 1. In 1.4 the the master (in `updateTask` sends `TASK_UNREACHABLE` to the 
> operator API subsribers for terminal tasks), as it stands right now we are 
> going to send `TASK_UNREACHABLE` to the schedulers as well. Should we change 
> that?
> 2. You also said above that "Ideally, we want terminal but unacknowledged 
> tasks to still be marked unreachable in some way" which seems to contradict 
> your later point that "Unreachable tasks should consist of all non-complete 
> (terminal) tasks", could you clarify?
> 
> Overall it sounds to me that the most correct semantic is still to set 
> `TASK_UNREACHABLE` only for the tasks that are non-terminal (because 
> otherwise we know that the state is not going to change to something else 
> that we don't know yet) but perhaps we can use another field in the status to 
> signal the fact that the agent is partitioned?

https://issues.apache.org/jira/browse/MESOS-8405


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64940/#review194794
---


On Jan. 5, 2018, 7:06 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64940/
> ---
> 
> (Updated Jan. 5, 2018, 7:06 p.m.)
> 
> 
> Review request for me

Re: Review Request 64940: Prevented a crash when an agent with terminal tasks is lost.

2018-01-05 Thread Jiang Yan Xu


> On Jan. 4, 2018, 2:09 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp
> > Lines 10061 (patched)
> > 
> >
> > Perhaps move all the logic around determining the unreachable task down 
> > here.
> > 
> > ```
> > // We need to inform `removeTask` whether the task became unreachable 
> > due to the agent being removed. Note that terminal tasks are not considered 
> > unreachable. We can not rely on the task's state because it may have been 
> > set to `TASK_LOST` for backwards-compatibility.
> > bool unreachable =
> >   unreachableTime.isSome() && !protobuf::isTerminalState(task->state());
> > ```
> > 
> > Inlining `unreachable` is probably fine too but separating it helps 
> > clarify it with comments?
> 
> James Peach wrote:
> You have to sample the task state before the update because it might 
> transition to a terminal state at that time (and LOST is terminal).

Right, so put the line above `updateTask`? The main thing I am commenting on is 
that we can consolidate the logic without first defining `bool unreachable = 
true;` much far above?


> On Jan. 4, 2018, 2:09 p.m., Jiang Yan Xu wrote:
> > src/tests/master_tests.cpp
> > Lines 7637 (patched)
> > 
> >
> > You could just wait for the the status update (being droppped) instead 
> > of capaturing `Slave::executorTerminated` and risk the race condition of 
> > the master not receiving the status update before the agent is deemed 
> > partitioned?
> 
> James Peach wrote:
> Dealing with all the status updates was less reliable.

Do you see the race condition I was trying to point out?


> On Jan. 4, 2018, 2:09 p.m., Jiang Yan Xu wrote:
> > src/tests/master_tests.cpp
> > Lines 7686 (patched)
> > 
> >
> > No need to settle?
> 
> James Peach wrote:
> This is settling to ensure that events following from marking the agent 
> down are fully processed.

Oh ok that's right.


- Jiang Yan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64940/#review194777
---


On Jan. 5, 2018, 11:06 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64940/
> ---
> 
> (Updated Jan. 5, 2018, 11:06 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Gaston Kleiman, Jie Yu, Vinod 
> Kone, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8337
> https://issues.apache.org/jira/browse/MESOS-8337
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> If an agent is lost, we try to remove all the tasks that might
> have been lost. However, if a task is already terminal, it hasn't
> really been lost so we should not be tracking it in the framework's
> unreachable tasks list.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 130f6e28cc62a8912aac66ecfbf014fe1ee444e3 
>   src/master/master.cpp 28d8be3a4769b418b61cff0b95845e4232135bc7 
>   src/tests/partition_tests.cpp 3813139f576ea01db0197f0fe8a73597db1bb69a 
> 
> 
> Diff: https://reviews.apache.org/r/64940/diff/4/
> 
> 
> Testing
> ---
> 
> make check (Fedora 27)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 64940: Prevented a crash when an agent with terminal tasks is lost.

2018-01-05 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64940/
---

(Updated Jan. 5, 2018, 7:37 p.m.)


Review request for mesos, Benjamin Mahler, Gaston Kleiman, Jie Yu, Vinod Kone, 
and Jiang Yan Xu.


Bugs: MESOS-8337
https://issues.apache.org/jira/browse/MESOS-8337


Repository: mesos


Description
---

If an agent is lost, we try to remove all the tasks that might
have been lost. However, if a task is already terminal, it hasn't
really been lost so we should not be tracking it in the framework's
unreachable tasks list.


Diffs (updated)
-

  src/master/master.hpp 130f6e28cc62a8912aac66ecfbf014fe1ee444e3 
  src/master/master.cpp 28d8be3a4769b418b61cff0b95845e4232135bc7 
  src/tests/partition_tests.cpp 3813139f576ea01db0197f0fe8a73597db1bb69a 


Diff: https://reviews.apache.org/r/64940/diff/5/

Changes: https://reviews.apache.org/r/64940/diff/4-5/


Testing
---

make check (Fedora 27)


Thanks,

James Peach



Review Request 64969: Added an performance benchmark for master `getstate` v1 api.

2018-01-05 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64969/
---

Review request for mesos and Benjamin Mahler.


Repository: mesos


Description
---

This test is based on the MasterFailover benchmark which
currently lacks things like executor. It is possible
to add these when needed.


Diffs
-

  src/tests/master_benchmarks.cpp 7d69588b3a4090ad1a755b592d2f8756ad6dff9e 


Diff: https://reviews.apache.org/r/64969/diff/1/


Testing
---

make check


Thanks,

Meng Zhu



Review Request 64961: Windows: Fixed memory leak.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64961/
---

Review request for mesos, Akash Gupta, Jeff Coffler, Joseph Wu, and Michael 
Park.


Repository: mesos


Description
---

The environment strings variable block is allocated by the system, and
must be manually freed (according to the docs).


Diffs
-

  3rdparty/stout/include/stout/os/windows/environment.hpp 
f79cbfa17cbe47f024802f3be813759f8fd62baa 


Diff: https://reviews.apache.org/r/64961/diff/1/


Testing
---

ctest -V on Windows (all passed)


Thanks,

Andrew Schwartzmeyer



Review Request 64962: Windows: Explicitly state source and destination path for extract.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64962/
---

Review request for mesos, Akash Gupta, Jeff Coffler, Joseph Wu, and Michael 
Park.


Repository: mesos


Description
---

Makes it more clear that it is in the correct order.


Diffs
-

  src/launcher/fetcher.cpp e2372a1b52d19937a62bd3fd5fe594c9e347e4a8 


Diff: https://reviews.apache.org/r/64962/diff/1/


Testing
---

ctest -V on Windows (all passed)


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64729: Made `Duration::operator*` accept `int` and `size_t`.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64729/
---

(Updated Jan. 5, 2018, 1:53 p.m.)


Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.


Changes
---

Rebased.


Repository: mesos


Description (updated)
---

Without this change, every use of `Duration * 2` throws a conversion
warning because we multiple `Duration` by `int` and `size_t` without
casting to `double` (the type accepted by the actual implementation of
`operator*`).


Diffs (updated)
-

  3rdparty/stout/include/stout/duration.hpp 
d10b6a4d4d1a0096334c30fdec725ecc55c9f0ea 


Diff: https://reviews.apache.org/r/64729/diff/2/

Changes: https://reviews.apache.org/r/64729/diff/1-2/


Testing
---


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64731: Ended `IOSwitchboard::_prepare` with `UNREACHABLE`.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64731/
---

(Updated Jan. 5, 2018, 1:53 p.m.)


Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.


Changes
---

Rebased.


Repository: mesos


Description
---

All paths should `return`, but the compiler wasn't able to determine
this.


Diffs (updated)
-

  src/slave/containerizer/mesos/io/switchboard.cpp 
89dd4988c7132d3c7aeacecad7a76961b0ec5a2c 


Diff: https://reviews.apache.org/r/64731/diff/2/

Changes: https://reviews.apache.org/r/64731/diff/1-2/


Testing
---


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64730: Changed `Sorter::count()` to return `size_t` instead of `int`.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64730/
---

(Updated Jan. 5, 2018, 1:53 p.m.)


Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.


Changes
---

Rebased.


Repository: mesos


Description
---

Also applies to `DRFSorter::count()`. The implementations return
`hashmap::count()` which is a `size_t`. Converting `size_t` to `int`
implicitly generates a "possible loss of data" warning.


Diffs (updated)
-

  src/master/allocator/sorter/drf/sorter.hpp 
1a7681c69b3d28efbf3d37dfaece55743e48c101 
  src/master/allocator/sorter/drf/sorter.cpp 
ecc5586737b6b447c5a1cf1a37037832bcbacd69 
  src/master/allocator/sorter/sorter.hpp 
2c6e1621fc8b3a01750f6fa3b15f5dcd57216cc6 


Diff: https://reviews.apache.org/r/64730/diff/1/


Testing
---


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64732: Fixed conversion warnings in tests.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64732/
---

(Updated Jan. 5, 2018, 1:53 p.m.)


Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.


Changes
---

Rebased.


Repository: mesos


Description
---

The signature of `flags.load` takes an `int`, but
`arraySize` (correctly) returns a `size_t`. We don't want to change
`flags.load` to take a `size_t` because its signature is `int argc,
char** argv`, and so we may be sending it `int` in other places.

As all the test arrays are statically sized, we know that we're not
overflowing an `int`, so it is safe to just `static_cast` here.


Diffs (updated)
-

  3rdparty/stout/tests/flags_tests.cpp 60ed8ebdd7b491e0d9469985f6263aad64b44c51 
  3rdparty/stout/tests/subcommand_tests.cpp 
54d70c99083fc1de16c732917e1faf1ee59ef646 


Diff: https://reviews.apache.org/r/64732/diff/2/

Changes: https://reviews.apache.org/r/64732/diff/1-2/


Testing
---

These are in stout.


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64733: Fixed conversion warnings in tests.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64733/
---

(Updated Jan. 5, 2018, 1:54 p.m.)


Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.


Changes
---

Rebased.


Repository: mesos


Description
---

Stringified a `size_t` for `http::Request.headers` value.

The types in `metrics_tests.cpp` were defined as `double`, and so should
be compared with `EXPECT_DOUBLE_EQ`, not `EXPECT_FLOAT_EQ`.


Diffs (updated)
-

  3rdparty/libprocess/src/tests/http_tests.cpp 
9daac715f0242921b7f9f5c20b3eb27f1be802d4 
  3rdparty/libprocess/src/tests/metrics_tests.cpp 
bb7c9e37e5576f5b2e3ecc1a215e1edac758b254 


Diff: https://reviews.apache.org/r/64733/diff/2/

Changes: https://reviews.apache.org/r/64733/diff/1-2/


Testing
---

These are in libprocess.


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64734: Fixed conversion warnings.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64734/
---

(Updated Jan. 5, 2018, 1:54 p.m.)


Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.


Changes
---

Rebased.


Repository: mesos


Description
---

Just casting to the used type.


Diffs (updated)
-

  3rdparty/libprocess/src/http.cpp f51d2aaa639ee36ce960268cb32a3aa88f14aa29 
  3rdparty/libprocess/src/process.cpp 75cf1d3b6d3d257ba9bc81c68017a74a6511cebf 


Diff: https://reviews.apache.org/r/64734/diff/2/

Changes: https://reviews.apache.org/r/64734/diff/1-2/


Testing
---

These are in libprocess.


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64736: Fixed conversion warnings.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64736/
---

(Updated Jan. 5, 2018, 1:55 p.m.)


Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.


Changes
---

Rebased.


Repository: mesos


Description
---

Fixed conversion warnings in `cram_md5` authentication. The potential
for overflow here is exceedingly unlikely, and we have been ignoring
these warnings to date without problem.

Fixed conversion warning in `files.cpp`. The signature accepts an
`off_t` type, but we calculated the offset from a `size_t`.

Fixed `size_t` to `int` warning in `default_executor.cpp`. The only use
of this counter is in `taskGroup.tasks().Get(index++)`, which expects an
`int`, so we changed the type.

Fixed conversion warnings in master. We can't change the return types
because the `Metric` class only accepts `double` types, so we
`static_cast` (as we were already doing elsewhere in the class).

Fixed warnings when converting to `pid_t` in the containerizer. Can't
change the arguments types because they're in the protocol, but they
obviously represent a `pid_t` originally, so this is a safe
`static_cast`.

Fixed warnings when constructing a `Seconds`a in `slave.cpp` as it takes
`int64_t`, but we're giving it a `double`. We can't change the `double`
because it's set in the protocol, and we're currently ignoring the
conversion anyway, so we just cast. We may want to provide a `double`
constructor for `Seconds` in the future that does proper rounding.

Fixed `size_t` to `int` conversion warning in `slave.cpp`. This counter
was only used in `usage->mutable_exedcutors(i++)` which expected an
`int`, so we changed the type.

Fixed conversion warnings in `zookeeper.cpp`. The signatures of these
functions take an `int`, and they're coming from a `size()`. We have
bigger problems if this overflows.


Diffs (updated)
-

  src/authentication/cram_md5/authenticatee.cpp 
7b3f767e29163de489018081089e67a6f22971c5 
  src/authentication/cram_md5/authenticator.cpp 
2a12bb3f4aec79056ec6c82a8c896ee3af15bb4c 
  src/files/files.cpp ad2189b9f2b919eb2f70c738655fa6ecfaaa1b11 
  src/launcher/default_executor.cpp 6c88de413379d3b58f19a24ac0c2f43e38e85e54 
  src/master/master.hpp 130f6e28cc62a8912aac66ecfbf014fe1ee444e3 
  src/master/master.cpp 28d8be3a4769b418b61cff0b95845e4232135bc7 
  src/master/registrar.cpp ffb4582086aa939ffe7ccf4ff1d52fb68a9ecb78 
  src/slave/containerizer/mesos/containerizer.cpp 
cddc6173d17c8bd2585899b154f976c3822dffdd 
  src/slave/containerizer/mesos/isolators/posix.hpp 
caa282c95746e845992c971982892cf60c9b982c 
  src/slave/containerizer/mesos/isolators/windows/cpu.cpp 
6b376efa9ff88ff93ee3dc8441fae4cda109d55f 
  src/slave/containerizer/mesos/isolators/windows/mem.cpp 
64d11347d73328cb9665697b1ddd69a00276b51f 
  src/slave/containerizer/mesos/launcher.cpp 
ec31fa24c8b91583b8b327a0c658ed6e87bd292f 
  src/slave/slave.cpp cfb675df677e7a0d476b8d5a586afc2f197ab810 
  src/zookeeper/zookeeper.cpp 5ede4e5c47b30693ef9f296777dea8db2c4a48d8 


Diff: https://reviews.apache.org/r/64736/diff/2/

Changes: https://reviews.apache.org/r/64736/diff/1-2/


Testing
---

These are in Mesos.

Excluding 3rdparty:

On Windows 10:

```
Build succeeded.
0 Warning(s)
0 Error(s)
```

On CentOS 7, with Autotools:

```
make check
```

Passed. Also apparently -Wsign-compare is on, but not for Windows. Might 
re-evaluate our Windows warning level. I'd like to turn -Werror on, but all the 
compilers generate different sets of warnings :(


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64735: Fixed conversion warnings in tests.

2018-01-05 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64735/
---

(Updated Jan. 5, 2018, 1:54 p.m.)


Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.


Changes
---

Rebased.


Repository: mesos


Description
---

Fixed `double -> float` conversions. While the Protobuf type and literal
type are both `double`, the `FLOAT_EQ` tests were being used instead of
`DOUBLE_EQ`, causing a conversion warning.

Fixed `size_t` to `int` conversion warnings by casting.

Fixed warning for `Seconds(double)` as it takes `int64_t`. Can't change
the argument types because they're set in the protocol.


Diffs (updated)
-

  src/tests/attributes_tests.cpp f053292e9185e9709d4da9c0b462541691772f1b 
  src/tests/protobuf_io_tests.cpp 4a2e3a3bc87c1a3394368439e5dd30b410e468a4 
  src/tests/resource_offers_tests.cpp 5564636dbabe25801a12b613a1cb04cd52ebb458 
  src/tests/resources_tests.cpp bd328b254eff3815fa2e85d17179cf593cb8a349 
  src/tests/resources_utils.cpp 9736c2a090250562704f2c874ce3872be68555ed 
  src/tests/scheduler_tests.cpp 87ef589fc296d01cfca8bf0b67390b097f68406b 
  src/tests/slave_tests.cpp 159192053087e971746943a1bc17a76143a2d9af 
  src/tests/values_tests.cpp f7e6fa786b85460185dbf83572e5d282671ab519 


Diff: https://reviews.apache.org/r/64735/diff/2/

Changes: https://reviews.apache.org/r/64735/diff/1-2/


Testing
---

These are in Mesos.


Thanks,

Andrew Schwartzmeyer



Re: Review Request 64961: Windows: Fixed memory leak.

2018-01-05 Thread Jeff Coffler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64961/#review194905
---


Ship it!




Ship It!

- Jeff Coffler


On Jan. 5, 2018, 9:49 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64961/
> ---
> 
> (Updated Jan. 5, 2018, 9:49 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Jeff Coffler, Joseph Wu, and Michael 
> Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The environment strings variable block is allocated by the system, and
> must be manually freed (according to the docs).
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/os/windows/environment.hpp 
> f79cbfa17cbe47f024802f3be813759f8fd62baa 
> 
> 
> Diff: https://reviews.apache.org/r/64961/diff/1/
> 
> 
> Testing
> ---
> 
> ctest -V on Windows (all passed)
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Review Request 64993: Made the `os::read` error less redundant.

2018-01-05 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64993/
---

Review request for mesos and Gilbert Song.


Bugs: MESOS-8404
https://issues.apache.org/jira/browse/MESOS-8404


Repository: mesos


Description
---

The error from `os::read` would typically format as "Failed to open
file: File not found", which is unnecessarily redundant. It should
be sufficient to just report the `ErrnoError()`.


Diffs
-

  3rdparty/stout/include/stout/os/read.hpp 
2eb74216792ee1ac902f234076a4508d40483598 


Diff: https://reviews.apache.org/r/64993/diff/1/


Testing
---

make check (Fedora 27)


Thanks,

James Peach



Review Request 64995: Improved image store manifest parsing errors.

2018-01-05 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64995/
---

Review request for mesos and Gilbert Song.


Bugs: MESOS-8404
https://issues.apache.org/jira/browse/MESOS-8404


Repository: mesos


Description
---

When one of the image store backends fails to parse an image manifest,
it is most probable that the the on-disk state is incorrect. Emitting
the path in the error makes it easier for an operator to figure out
what went wrong.


Diffs
-

  src/appc/spec.cpp 8471ccb1452662e11aac01886dcc6af81558db27 
  src/slave/containerizer/mesos/provisioner/appc/cache.cpp 
09190e2c21cf39ab023ee08e12cf10610343a103 
  src/slave/containerizer/mesos/provisioner/docker/local_puller.cpp 
c4879ecfe77bbeb9ec2922831d373695aeb4e180 
  src/slave/containerizer/mesos/provisioner/docker/store.cpp 
d64e6eb0f93a5d43ad756d4f0b0d1a665375dada 


Diff: https://reviews.apache.org/r/64995/diff/1/


Testing
---

make check (Fedora 27)


Thanks,

James Peach



Re: Review Request 64962: Windows: Explicitly state source and destination path for extract.

2018-01-05 Thread Jeff Coffler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64962/#review194907
---




src/launcher/fetcher.cpp
Line 101 (original), 103 (patched)


I got the syntax straight from the help page from `help Expand-Archive`. 
But I guess if you don't know the parameters and don't want to get help, this 
will clarify.


- Jeff Coffler


On Jan. 5, 2018, 9:50 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64962/
> ---
> 
> (Updated Jan. 5, 2018, 9:50 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Jeff Coffler, Joseph Wu, and Michael 
> Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Makes it more clear that it is in the correct order.
> 
> 
> Diffs
> -
> 
>   src/launcher/fetcher.cpp e2372a1b52d19937a62bd3fd5fe594c9e347e4a8 
> 
> 
> Diff: https://reviews.apache.org/r/64962/diff/1/
> 
> 
> Testing
> ---
> 
> ctest -V on Windows (all passed)
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 64962: Windows: Explicitly state source and destination path for extract.

2018-01-05 Thread Jeff Coffler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64962/#review194908
---


Ship it!




Ship It!

- Jeff Coffler


On Jan. 5, 2018, 9:50 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64962/
> ---
> 
> (Updated Jan. 5, 2018, 9:50 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Jeff Coffler, Joseph Wu, and Michael 
> Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Makes it more clear that it is in the correct order.
> 
> 
> Diffs
> -
> 
>   src/launcher/fetcher.cpp e2372a1b52d19937a62bd3fd5fe594c9e347e4a8 
> 
> 
> Diff: https://reviews.apache.org/r/64962/diff/1/
> 
> 
> Testing
> ---
> 
> ctest -V on Windows (all passed)
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Review Request 64992: Added SLRP unit tests for profile updates and corner cases.

2018-01-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64992/
---

Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.


Bugs: MESOS-8407
https://issues.apache.org/jira/browse/MESOS-8407


Repository: mesos


Description
---

This patch adds the following tests:

`NoResource`: RP updates its state with no resource, and can recover
from a checkpointed state that contains no resource.

`ZeroSizedDisk`: CSI plugin reports a pre-existing volume with
zero capacity.

`SmallDisk`: CSI plugin reports a storage pool and a pre-existing volume
with sizes < 1MB.

`NewProfile`: A new profile is added after RP updates its state.


Diffs
-

  src/examples/test_csi_plugin.cpp f6b2c98c44366d5c752aa44bdbf8ae0374a7765c 
  src/tests/storage_local_resource_provider_tests.cpp 
e9ea021867bf2175d5ee53a89a0619563ab8f329 


Diff: https://reviews.apache.org/r/64992/diff/1/


Testing
---

sudo make check


Thanks,

Chun-Hung Hsiao



Review Request 64994: Renamed SLRP tests to describe them better.

2018-01-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64994/
---

Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.


Repository: mesos


Description
---

Renamed SLRP tests to describe them better.


Diffs
-

  src/tests/storage_local_resource_provider_tests.cpp 
e9ea021867bf2175d5ee53a89a0619563ab8f329 


Diff: https://reviews.apache.org/r/64994/diff/1/


Testing
---

sudo make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 64729: Made `Duration::operator*` accept `int` and `size_t`.

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64729/#review194909
---




3rdparty/stout/include/stout/duration.hpp
Lines 173 (patched)


Why sacrifice the precision of integer * integer multiplication?

```
nanos = nanos * multiplier;
```
(You might need to `friend` one of these operators to get this to work)

Same with the other operator.


- Joseph Wu


On Jan. 5, 2018, 1:53 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64729/
> ---
> 
> (Updated Jan. 5, 2018, 1:53 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Without this change, every use of `Duration * 2` throws a conversion
> warning because we multiple `Duration` by `int` and `size_t` without
> casting to `double` (the type accepted by the actual implementation of
> `operator*`).
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/duration.hpp 
> d10b6a4d4d1a0096334c30fdec725ecc55c9f0ea 
> 
> 
> Diff: https://reviews.apache.org/r/64729/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 64730: Changed `Sorter::count()` to return `size_t` instead of `int`.

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64730/#review194910
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 5, 2018, 1:53 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64730/
> ---
> 
> (Updated Jan. 5, 2018, 1:53 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Also applies to `DRFSorter::count()`. The implementations return
> `hashmap::count()` which is a `size_t`. Converting `size_t` to `int`
> implicitly generates a "possible loss of data" warning.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/sorter/drf/sorter.hpp 
> 1a7681c69b3d28efbf3d37dfaece55743e48c101 
>   src/master/allocator/sorter/drf/sorter.cpp 
> ecc5586737b6b447c5a1cf1a37037832bcbacd69 
>   src/master/allocator/sorter/sorter.hpp 
> 2c6e1621fc8b3a01750f6fa3b15f5dcd57216cc6 
> 
> 
> Diff: https://reviews.apache.org/r/64730/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 64731: Ended `IOSwitchboard::_prepare` with `UNREACHABLE`.

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64731/#review194911
---


Fix it, then Ship it!





src/slave/containerizer/mesos/io/switchboard.cpp
Lines 322-323 (original), 322-323 (patched)


It should be clearer to place the UNREACHABLE here.  Otherwise the reader 
needs to scroll all the way up and find where the second `#ifdef` starts.

```
#ifdef __WINDOWS__
  // NOTE: On Windows, both return values of
  // `IOSwitchboard::requiresServer(containerConfig)`
  // are checked and will return before reaching here.
  UNREACHABLE();
#else
  // Other existing stuff...
#endif // __WINDOWS__
```


- Joseph Wu


On Jan. 5, 2018, 1:53 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64731/
> ---
> 
> (Updated Jan. 5, 2018, 1:53 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> All paths should `return`, but the compiler wasn't able to determine
> this.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/io/switchboard.cpp 
> 89dd4988c7132d3c7aeacecad7a76961b0ec5a2c 
> 
> 
> Diff: https://reviews.apache.org/r/64731/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 63860: Windows: Mapped the Docker network info types.

2018-01-05 Thread Akash Gupta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63860/
---

(Updated Jan. 5, 2018, 10:25 p.m.)


Review request for mesos, Alexander Rukletsov, Andrew Schwartzmeyer, Gaston 
Kleiman, Jie Yu, John Kordich, Joseph Wu, and Michael Park.


Changes
---

Updated with Jie's feedback


Bugs: MESOS-7342
https://issues.apache.org/jira/browse/MESOS-7342


Repository: mesos


Description
---

The Network enum in DockerInfo is specific to Linux containers. `HOST`
doesn't exist on Windows and `BRIDGE` is `NAT` on Windows. The current
default docker network setting was always `HOST`, which broke the
Windows docker executor. Now, if a specific network isn't given, the
network mode will default to `HOST` on Linux agents and `NAT` on Windows
agents. Also, `BRIDGE` mode will be translated to `NAT` on Windows.


Diffs (updated)
-

  src/docker/docker.cpp 722a54ad113fc4e2bb22a8f08e307ab38d5fbfed 


Diff: https://reviews.apache.org/r/63860/diff/7/

Changes: https://reviews.apache.org/r/63860/diff/6-7/


Testing
---

See https://reviews.apache.org/r/63862/ for test results.


Thanks,

Akash Gupta



Re: Review Request 64732: Fixed conversion warnings in tests.

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64732/#review194912
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 5, 2018, 1:53 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64732/
> ---
> 
> (Updated Jan. 5, 2018, 1:53 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The signature of `flags.load` takes an `int`, but
> `arraySize` (correctly) returns a `size_t`. We don't want to change
> `flags.load` to take a `size_t` because its signature is `int argc,
> char** argv`, and so we may be sending it `int` in other places.
> 
> As all the test arrays are statically sized, we know that we're not
> overflowing an `int`, so it is safe to just `static_cast` here.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/tests/flags_tests.cpp 
> 60ed8ebdd7b491e0d9469985f6263aad64b44c51 
>   3rdparty/stout/tests/subcommand_tests.cpp 
> 54d70c99083fc1de16c732917e1faf1ee59ef646 
> 
> 
> Diff: https://reviews.apache.org/r/64732/diff/2/
> 
> 
> Testing
> ---
> 
> These are in stout.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 64733: Fixed conversion warnings in tests.

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64733/#review194914
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 5, 2018, 1:54 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64733/
> ---
> 
> (Updated Jan. 5, 2018, 1:54 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Stringified a `size_t` for `http::Request.headers` value.
> 
> The types in `metrics_tests.cpp` were defined as `double`, and so should
> be compared with `EXPECT_DOUBLE_EQ`, not `EXPECT_FLOAT_EQ`.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/http_tests.cpp 
> 9daac715f0242921b7f9f5c20b3eb27f1be802d4 
>   3rdparty/libprocess/src/tests/metrics_tests.cpp 
> bb7c9e37e5576f5b2e3ecc1a215e1edac758b254 
> 
> 
> Diff: https://reviews.apache.org/r/64733/diff/2/
> 
> 
> Testing
> ---
> 
> These are in libprocess.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Review Request 64998: Added a SLRP test for CSI plugin restart.

2018-01-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64998/
---

Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.


Bugs: MESOS-8408
https://issues.apache.org/jira/browse/MESOS-8408


Repository: mesos


Description
---

The test does the same as the `PublishResources` test, but it kills the
CSI plugin contaier between each operation.


Diffs
-

  src/Makefile.am 30cd4d426e797e4c8ee556d1bc3de99830a5fe41 
  src/slave/container_daemon.cpp d74fa5105c61a6cadfb2610790e559d387ca13a0 
  src/tests/storage_local_resource_provider_tests.cpp 
e9ea021867bf2175d5ee53a89a0619563ab8f329 


Diff: https://reviews.apache.org/r/64998/diff/1/


Testing
---

sudo make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 64734: Fixed conversion warnings.

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64734/#review194916
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 5, 2018, 1:54 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64734/
> ---
> 
> (Updated Jan. 5, 2018, 1:54 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Just casting to the used type.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/http.cpp f51d2aaa639ee36ce960268cb32a3aa88f14aa29 
>   3rdparty/libprocess/src/process.cpp 
> 75cf1d3b6d3d257ba9bc81c68017a74a6511cebf 
> 
> 
> Diff: https://reviews.apache.org/r/64734/diff/2/
> 
> 
> Testing
> ---
> 
> These are in libprocess.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 64735: Fixed conversion warnings in tests.

2018-01-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64735/#review194917
---


Fix it, then Ship it!




Ship It!


src/tests/protobuf_io_tests.cpp
Line 161 (original), 161 (patched)


Both `expected` and `actual` are the same type here, so why does only 
`actual` need the cast?


- Joseph Wu


On Jan. 5, 2018, 1:54 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64735/
> ---
> 
> (Updated Jan. 5, 2018, 1:54 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed `double -> float` conversions. While the Protobuf type and literal
> type are both `double`, the `FLOAT_EQ` tests were being used instead of
> `DOUBLE_EQ`, causing a conversion warning.
> 
> Fixed `size_t` to `int` conversion warnings by casting.
> 
> Fixed warning for `Seconds(double)` as it takes `int64_t`. Can't change
> the argument types because they're set in the protocol.
> 
> 
> Diffs
> -
> 
>   src/tests/attributes_tests.cpp f053292e9185e9709d4da9c0b462541691772f1b 
>   src/tests/protobuf_io_tests.cpp 4a2e3a3bc87c1a3394368439e5dd30b410e468a4 
>   src/tests/resource_offers_tests.cpp 
> 5564636dbabe25801a12b613a1cb04cd52ebb458 
>   src/tests/resources_tests.cpp bd328b254eff3815fa2e85d17179cf593cb8a349 
>   src/tests/resources_utils.cpp 9736c2a090250562704f2c874ce3872be68555ed 
>   src/tests/scheduler_tests.cpp 87ef589fc296d01cfca8bf0b67390b097f68406b 
>   src/tests/slave_tests.cpp 159192053087e971746943a1bc17a76143a2d9af 
>   src/tests/values_tests.cpp f7e6fa786b85460185dbf83572e5d282671ab519 
> 
> 
> Diff: https://reviews.apache.org/r/64735/diff/2/
> 
> 
> Testing
> ---
> 
> These are in Mesos.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 64630: Narrowed task sandbox permissions from 0755 to 0750.

2018-01-05 Thread Ilya Pronin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64630/#review194915
---



Looks good to me! Just some minor comments about naming and function arguments.


src/slave/paths.hpp
Lines 403 (patched)


Judging by other functions in this file I think a better name for this 
function would be `createSandboxDirectory()`. Path is a locator, not a reasoure 
that is being created.



src/slave/paths.hpp
Lines 404-405 (patched)


Again judging by other functions here. The convention seems to be that the 
user provides a root dir path and a container ID for the function to construct 
the directory path. So the function prototype would look like this:
```cpp
Try createSandboxDirectory(
const std::string& rootDir,
const ContainerID& containerId,
const Option& user = None());
```



src/slave/paths.cpp
Line 726 (original), 726 (patched)


Misleading naming. Maybe we can create a function like "create, set mode 
and own" and call it from here and from `createSandboxDirectory()`?



src/slave/paths.cpp
Lines 763 (patched)


No error checking?


- Ilya Pronin


On Dec. 14, 2017, 4:10 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64630/
> ---
> 
> (Updated Dec. 14, 2017, 4:10 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Ilya Pronin, Jie Yu, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-8332
> https://issues.apache.org/jira/browse/MESOS-8332
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Since task sandboxes can contain private data, we should not
> make them accessible to others by default. This changes all the
> places that create a task sandbox directory to use a helper API
> `slave::paths::createSandboxPath` that consistently deals with
> setting the directory mode and ownership.
> 
> A number of tests depended on the previous behavior where
> failing to change the ownership was logged but did not cause
> a failure. Depending on the test, these were updated to either
> disable the agent `switch_user` flag, or to specify the current
> user in the task launch message.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 7ab0b07f689f872573ca458ae47cd6426ebc0365 
>   src/slave/containerizer/mesos/paths.cpp 
> 8a188a918873eef468a984b80f5ea7ebaa8fb923 
>   src/slave/http.cpp ed22b9f6bfa1c480a0672ce25d364bba6e33a200 
>   src/slave/paths.hpp 9cbacd8da62e7c7386dca7031fc09a46ae773161 
>   src/slave/paths.cpp fca2a0eec2a75ed76028ea54dc992502275d4bce 
>   src/tests/api_tests.cpp 86cbba4fab5e7a45298d17f3f2969391cc18be68 
>   src/tests/master_allocator_tests.cpp 
> 9bca27c7612b9ac4813f794bcc9ed38aeed078e5 
>   src/tests/master_authorization_tests.cpp 
> 676543a5ad1bb5d47011fc2a8b05dfaaeef18c64 
>   src/tests/slave_authorization_tests.cpp 
> 4ba0b8e96614a2df0daec576c08fe02462ccaa27 
> 
> 
> Diff: https://reviews.apache.org/r/64630/diff/1/
> 
> 
> Testing
> ---
> 
> make check (Fedora 27)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 64630: Narrowed task sandbox permissions from 0755 to 0750.

2018-01-05 Thread Ilya Pronin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64630/#review194919
---




src/tests/api_tests.cpp
Lines 2327 (patched)


Seems that we don't have `switch_user` flag on Windows.



src/tests/master_authorization_tests.cpp
Lines 104 (patched)


Ditto re `switch_user` flag and Windows here and in other test fixtures.


- Ilya Pronin


On Dec. 14, 2017, 4:10 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64630/
> ---
> 
> (Updated Dec. 14, 2017, 4:10 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Ilya Pronin, Jie Yu, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-8332
> https://issues.apache.org/jira/browse/MESOS-8332
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Since task sandboxes can contain private data, we should not
> make them accessible to others by default. This changes all the
> places that create a task sandbox directory to use a helper API
> `slave::paths::createSandboxPath` that consistently deals with
> setting the directory mode and ownership.
> 
> A number of tests depended on the previous behavior where
> failing to change the ownership was logged but did not cause
> a failure. Depending on the test, these were updated to either
> disable the agent `switch_user` flag, or to specify the current
> user in the task launch message.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 7ab0b07f689f872573ca458ae47cd6426ebc0365 
>   src/slave/containerizer/mesos/paths.cpp 
> 8a188a918873eef468a984b80f5ea7ebaa8fb923 
>   src/slave/http.cpp ed22b9f6bfa1c480a0672ce25d364bba6e33a200 
>   src/slave/paths.hpp 9cbacd8da62e7c7386dca7031fc09a46ae773161 
>   src/slave/paths.cpp fca2a0eec2a75ed76028ea54dc992502275d4bce 
>   src/tests/api_tests.cpp 86cbba4fab5e7a45298d17f3f2969391cc18be68 
>   src/tests/master_allocator_tests.cpp 
> 9bca27c7612b9ac4813f794bcc9ed38aeed078e5 
>   src/tests/master_authorization_tests.cpp 
> 676543a5ad1bb5d47011fc2a8b05dfaaeef18c64 
>   src/tests/slave_authorization_tests.cpp 
> 4ba0b8e96614a2df0daec576c08fe02462ccaa27 
> 
> 
> Diff: https://reviews.apache.org/r/64630/diff/1/
> 
> 
> Testing
> ---
> 
> make check (Fedora 27)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Review Request 65000: Added an SLRP test for agent being registered with a new ID.

2018-01-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65000/
---

Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.


Repository: mesos


Description
---

Added an SLRP test for agent being registered with a new ID. Also, we
changed `PublishResourcesRecovery` to launch one more task to test that
a persistent volume persists across agent restarts.


Diffs
-

  src/tests/storage_local_resource_provider_tests.cpp 
e9ea021867bf2175d5ee53a89a0619563ab8f329 


Diff: https://reviews.apache.org/r/65000/diff/1/


Testing
---

sudo make check

NOTE: This test and other tests involving agent restarts is currently broken 
due to MESOS-8393.
It should be fixed once MESOS-8375 is landed.


Thanks,

Chun-Hung Hsiao



Re: Review Request 64969: Added an performance benchmark for master `getstate` v1 api.

2018-01-05 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64969/#review194920
---



Can you link in the output in the 'Testing Done'?

- Benjamin Mahler


On Jan. 6, 2018, 12:39 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64969/
> ---
> 
> (Updated Jan. 6, 2018, 12:39 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Michael Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8344
> https://issues.apache.org/jira/browse/MESOS-8344
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test is based on the MasterFailover benchmark which
> currently lacks things like executor. It is possible
> to add these when needed.
> 
> 
> Diffs
> -
> 
>   src/tests/master_benchmarks.cpp 7d69588b3a4090ad1a755b592d2f8756ad6dff9e 
> 
> 
> Diff: https://reviews.apache.org/r/64969/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 64969: Added an performance benchmark for master `getstate` v1 api.

2018-01-05 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64969/
---

(Updated Jan. 5, 2018, 5:40 p.m.)


Review request for mesos, Benjamin Mahler, Michael Park, and Jiang Yan Xu.


Bugs: MESOS-8344
https://issues.apache.org/jira/browse/MESOS-8344


Repository: mesos


Description
---

This test is based on the MasterFailover benchmark which
currently lacks things like executor. It is possible
to add these when needed.


Diffs
-

  src/tests/master_benchmarks.cpp 7d69588b3a4090ad1a755b592d2f8756ad6dff9e 


Diff: https://reviews.apache.org/r/64969/diff/1/


Testing (updated)
---

Output from running on MacBook Pro with 3.3 GHz Intel Core i7:

[==] Running 6 tests from 1 test case.
[--] Global test environment set-up.
[--] 6 tests from 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test
[ RUN  ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/0
Test setup: 2000 agents with a total of 10 running tasks and 10 
completed tasks.
Starting reregistration for all agents...
All reregistration completed. Starting querying master...
Getstate application/x-protobuf query response took 10.716890195secs with the 
following test setup:
2000 agents with a total of 10 running tasks and 10 completed tasks.
[   OK ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/0 
(36683 ms)
[ RUN  ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/1
Test setup: 2000 agents with a total of 20 running tasks and 0 completed 
tasks.
Starting reregistration for all agents...
All reregistration completed. Starting querying master...
Getstate application/x-protobuf query response took 21.103205786secs with the 
following test setup:
2000 agents with a total of 20 running tasks and 0 completed tasks.
[   OK ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/1 
(62539 ms)
[ RUN  ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/2
Test setup: 2 agents with a total of 10 running tasks and 0 completed 
tasks.
Starting reregistration for all agents...
All reregistration completed. Starting querying master...
Getstate application/x-protobuf query response took 14.185437528secs with the 
following test setup:
2 agents with a total of 10 running tasks and 0 completed tasks.
[   OK ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/2 
(48821 ms)
[ RUN  ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/3
Test setup: 2000 agents with a total of 10 running tasks and 10 
completed tasks.
Starting reregistration for all agents...
All reregistration completed. Starting querying master...
Getstate application/json query response took 2.46403078761667mins with the 
following test setup:
2000 agents with a total of 10 running tasks and 10 completed tasks.
[   OK ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/3 
(176789 ms)
[ RUN  ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/4
Test setup: 2000 agents with a total of 20 running tasks and 0 completed 
tasks.
Starting reregistration for all agents...
All reregistration completed. Starting querying master...
Getstate application/json query response took 5.05502142155mins with the 
following test setup:
2000 agents with a total of 20 running tasks and 0 completed tasks.
[   OK ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/4 
(351393 ms)
[ RUN  ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/5
Test setup: 2 agents with a total of 10 running tasks and 0 completed 
tasks.
Starting reregistration for all agents...
All reregistration completed. Starting querying master...
Getstate application/json query response took 3.26795947955mins with the 
following test setup:
2 agents with a total of 10 running tasks and 0 completed tasks.
[   OK ] 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/5 
(236676 ms)
[--] 6 tests from 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test (912902 ms 
total)

[--] Global test environment tear-down
[==] 6 tests from 1 test case ran. (912933 ms total)
[  PASSED  ] 6 tests.


Thanks,

Meng Zhu



Re: Review Request 64969: Added an performance benchmark for master `getstate` v1 api.

2018-01-05 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64969/#review194923
---



- Can you measure with optimization turned on?
- No need to include the test setup in the output multiple times
- No need to include the messages in the output

- Benjamin Mahler


On Jan. 6, 2018, 1:40 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64969/
> ---
> 
> (Updated Jan. 6, 2018, 1:40 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Michael Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-8344
> https://issues.apache.org/jira/browse/MESOS-8344
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test is based on the MasterFailover benchmark which
> currently lacks things like executor. It is possible
> to add these when needed.
> 
> 
> Diffs
> -
> 
>   src/tests/master_benchmarks.cpp 7d69588b3a4090ad1a755b592d2f8756ad6dff9e 
> 
> 
> Diff: https://reviews.apache.org/r/64969/diff/1/
> 
> 
> Testing
> ---
> 
> Output from running on MacBook Pro with 3.3 GHz Intel Core i7:
> 
> [==] Running 6 tests from 1 test case.
> [--] Global test environment set-up.
> [--] 6 tests from 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test
> [ RUN  ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/0
> Test setup: 2000 agents with a total of 10 running tasks and 10 
> completed tasks.
> Starting reregistration for all agents...
> All reregistration completed. Starting querying master...
> Getstate application/x-protobuf query response took 10.716890195secs with the 
> following test setup:
> 2000 agents with a total of 10 running tasks and 10 completed tasks.
> [   OK ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/0
>  (36683 ms)
> [ RUN  ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/1
> Test setup: 2000 agents with a total of 20 running tasks and 0 completed 
> tasks.
> Starting reregistration for all agents...
> All reregistration completed. Starting querying master...
> Getstate application/x-protobuf query response took 21.103205786secs with the 
> following test setup:
> 2000 agents with a total of 20 running tasks and 0 completed tasks.
> [   OK ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/1
>  (62539 ms)
> [ RUN  ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/2
> Test setup: 2 agents with a total of 10 running tasks and 0 completed 
> tasks.
> Starting reregistration for all agents...
> All reregistration completed. Starting querying master...
> Getstate application/x-protobuf query response took 14.185437528secs with the 
> following test setup:
> 2 agents with a total of 10 running tasks and 0 completed tasks.
> [   OK ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/2
>  (48821 ms)
> [ RUN  ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/3
> Test setup: 2000 agents with a total of 10 running tasks and 10 
> completed tasks.
> Starting reregistration for all agents...
> All reregistration completed. Starting querying master...
> Getstate application/json query response took 2.46403078761667mins with the 
> following test setup:
> 2000 agents with a total of 10 running tasks and 10 completed tasks.
> [   OK ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/3
>  (176789 ms)
> [ RUN  ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/4
> Test setup: 2000 agents with a total of 20 running tasks and 0 completed 
> tasks.
> Starting reregistration for all agents...
> All reregistration completed. Starting querying master...
> Getstate application/json query response took 5.05502142155mins with the 
> following test setup:
> 2000 agents with a total of 20 running tasks and 0 completed tasks.
> [   OK ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/4
>  (351393 ms)
> [ RUN  ] 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetstateV1/5
> Test setup: 2 agents with a total of 10 running tasks and 0 completed 
> tasks.
> Starting reregistration for all agents...
> All reregistration completed. Starting querying master...
> Getstate application/json query response took 3.26795947955mins with the 
> following test setup:
> 2 agents with a total of 10 running tasks and 0 completed tasks.
> [   OK ] 
> AgentFrameworkTaskCountContentType/MasterStateQu