Re: Typed Error Handling in Mesos

2016-04-06 Thread Bernd Mathiske
+1

Another use case:

https://github.com/mesosphere/marathon/commit/d616c05619753e398e882fa8d80e35e137775b30
 

https://issues.apache.org/jira/browse/MESOS-2522 

https://issues.apache.org/jira/browse/MESOS-4548 


Bernd


> On Apr 6, 2016, at 1:25 PM, Kevin Klues  wrote:
> 
> +1
> 
> This is similar to how errors are typed in Go.
> 
> On Wednesday, April 6, 2016, Alexander Rojas wrote:
> 
>> +1
>> 
>> What I like is that it brings some kind of type safety to error
>> management, beyond trying to parse error strings.
>> 
>>> On 05 Apr 2016, at 03:48, Michael Park wrote:
>>> 
>>> Contrary to standard C++ practices, Mesos uses return values as the
>>> mechanism for error handling rather than exceptions.
>>> 
>>> This proposal is simply an evolution of the current mechanism we have in
>>> Mesos today.
>>> This direction is consistent with the design decisions made in Rust, which
>>> uses return values as the error handling mechanism at the language level.
>>> 
>>> The first step is to add an additional template parameter to class template
>>> *Try<T>*, to get *Try<T, E>*.
>>> 
>>> The proposed design defaults *E* to *Error*, and requires that *E* be, or
>>> inherit from, *Error*.
>>> The return type of *error()* is *const std::string&* if *E == Error* and
>>> *const E&* otherwise, for backwards-compatibility reasons.
>>> 
>>> So in the end, *Try<T>* behaves exactly as before.
>>> 
>>> The work is being tracked in MESOS-5107, and I've written a quick design doc
>>> <https://docs.google.com/document/d/1tG21sD-ZX64FHAKJwhEPk6JkgsBIv12AmA1Y3J0kCYY/edit#>
>>> capturing some of the preliminary thoughts on this topic, and a proposal for
>>> an immediate use case for the Windows work.
>>> 
>>> If you're interested in how Rust deals with error handling, check out
>>> https://doc.rust-lang.org/book/error-handling.html. Our *Option<T>* is their
>>> *Option<T>*, our *Try<T, E>* is their *Result<T, E>*, and they don't have our
>>> *Result<T>*.
>>> 
>>> I'm going to be pushing the proposed changes shortly, but the changes are
>>> small and do not require any large sweeping changes or anything like that.
>>> So please reach out to me with your concerns and complaints and I will be
>>> sure to address them.
>>> 
>>> Thanks,
>>> 
>>> MPark
>> 
>> 
> 
> --
> ~Kevin
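
For readers following the thread, here is a minimal sketch of what the proposed *Try<T, E>* could look like. It is not the actual stout implementation: `Error`, `NotFoundError`, `parsePositive` and `lookup` are hypothetical stand-ins written for illustration, and only the rules stated in the proposal are modelled (E defaults to Error, E must be or derive from Error, and error() returns a string for the default E and the typed error otherwise).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the existing untyped error: just a message.
struct Error {
  explicit Error(std::string m) : message(std::move(m)) {}
  std::string message;
};

// Hypothetical typed error that also carries structured data.
struct NotFoundError : Error {
  explicit NotFoundError(std::string p)
    : Error("not found: " + p), path(std::move(p)) {}
  std::string path;
};

template <typename T, typename E = Error>
class Try {
  static_assert(std::is_base_of<Error, E>::value,
                "E must be Error or derive from Error");

public:
  // error() yields the message string for the default E (as before)
  // and the typed error object otherwise.
  using error_type = typename std::conditional<
      std::is_same<E, Error>::value, std::string, E>::type;

  Try(T value) : value_(new T(std::move(value))) {}
  Try(E error) : error_(new E(std::move(error))) {}

  bool isSome() const { return value_ != nullptr; }
  bool isError() const { return error_ != nullptr; }

  const T& get() const { return *value_; }

  const error_type& error() const {
    return unwrap(*error_, std::is_same<E, Error>());
  }

private:
  // Tag dispatch: selected when E is exactly Error.
  static const std::string& unwrap(const E& e, std::true_type) {
    return e.message;
  }
  // Selected for any E derived from Error.
  static const E& unwrap(const E& e, std::false_type) { return e; }

  std::unique_ptr<T> value_;
  std::unique_ptr<E> error_;
};

// Untyped use, written the same way as with the one-parameter Try<T>.
Try<int> parsePositive(int n) {
  if (n <= 0) return Error("expected a positive number");
  return n;
}

// Typed use: the caller inspects fields instead of parsing a string.
Try<std::string, NotFoundError> lookup(const std::string& path) {
  if (path != "/known") return NotFoundError(path);
  return std::string("contents");
}

void example() {
  Try<std::string, NotFoundError> result = lookup("/missing");
  assert(result.isError());
  assert(result.error().path == "/missing");
}
```

The sketch uses tag dispatch on `std::is_same<E, Error>` so that it compiles as C++11; the real change may well pick the return type differently.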



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: Making 'curl' a prerequisite for installing Mesos

2016-03-03 Thread Bernd Mathiske
+1

> On Mar 3, 2016, at 6:10 PM, Jie Yu  wrote:
> 
> Hi,
> 
> I am proposing making 'curl' a prerequisite when installing Mesos. Currently,
> we require 'libcurl' to be present when installing Mesos
> (http://mesos.apache.org/gettingstarted/). However, we found that it does
> not compose well with our asynchronous runtime environment (i.e., it'll block
> the current worker thread).
> 
> Recent work on the URI fetcher uses 'curl' directly, instead of using
> 'libcurl', to fetch artifacts, because it composes well with our async
> runtime env. 'curl' is installed by default on most systems (e.g., OS X,
> CentOS, RHEL).
> 
> So I am proposing adding 'curl' to our prerequisite list. Let me know if you
> have any concerns about this. I'll update the Getting Started doc if you are
> OK with this change.
> 
> Thanks,
> - Jie
> 





Re: [VOTE] Release Apache Mesos 0.27.1 (rc1)

2016-02-17 Thread Bernd Mathiske
+1 (binding)

Test failures look a lot like those with 0.27.0. Not clean, but nothing deemed
too drastic yet.

CentOS 7 plain:
FetcherCacheHttpTest.HttpCachedSerialized flaky again, filed MESOS-4692
LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem known as 
flaky: MESOS-4674
LinuxFilesystemIsolatorTest.ROOT_MultipleContainers known as flaky: MESOS-4674

CentOS 7 SSL-enabled:
LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem,
LinuxFilesystemIsolatorTest.ROOT_MultipleContainers
both known as flaky: MESOS-4674

CentOS 6 plain:  OK

CentOS 6 SSL-enabled:
MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
flaky as often observed before, probably MESOS-4053

Ubuntu 14.04 plain/SSL, Ubuntu 12.04 plain/SSL, Ubuntu 15 plain: OK

Ubuntu 15 SSL-enabled:
DockerContainerizerTest.ROOT_DOCKER_Logs known as flaky: MESOS-4676

Other known frequently flaky tests that have not been tested this time 
(filtered out):
HealthCheckTest.ROOT_DOCKER_DockerHealthyTask
HealthCheckTest.ROOT_DOCKER_DockerHealthStatusChange
HookTest.ROOT_DOCKER_VerifySlavePreLaunchDockerHook
DockerContainerizerTest.ROOT_DOCKER_Launch_Executor

Bernd

> On Feb 17, 2016, at 1:52 AM, Michael Park  wrote:
> 
> Hi all,
> 
> Please vote on releasing the following candidate as Apache Mesos 0.27.1.
> 
> 
> 0.27.1 includes the following:
> 
> * Improved `systemd` integration.
> * Ability to disable `systemd` integration.
> 
> * Additional performance improvements to /state endpoint.
> * Removed duplicate "active" keys from the /state endpoint.
> 
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.27.1-rc1
> 
> 
> The candidate for Mesos 0.27.1 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/0.27.1-rc1/mesos-0.27.1.tar.gz
> 
> The tag to be voted on is 0.27.1-rc1:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.27.1-rc1
> 
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/0.27.1-rc1/mesos-0.27.1.tar.gz.md5
> 
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/0.27.1-rc1/mesos-0.27.1.tar.gz.asc
> 
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
> 
> The JAR is up in Maven in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1102
> 
> Please vote on releasing this package as Apache Mesos 0.27.1!
> 
> The vote is open until Fri Feb 19 17:00:00 PST 2016 and passes if a majority 
> of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Mesos 0.27.1
> [ ] -1 Do not release this package because ...
> 
> Thanks,
> 
> Joris, MPark





Re: Mesos 0.27.0 release update

2016-01-25 Thread Bernd Mathiske
Hi Sarjeet,

I hope we fixed this in the upcoming 0.27:

https://issues.apache.org/jira/browse/MESOS-4304 


Bernd

> On Jan 25, 2016, at 7:30 PM, Sarjeet Singh  wrote:
> 
> I ran into an issue when I tried Mesos 0.26 and started a Marathon
> app using a URI (with a maprfs path) on a Mesos cluster.
> 
> The issue seems to be caused by the MESOS-3602 fix, and it breaks maprfs
> (MapR filesystem) when a maprfs path is specified as the URI in Marathon.
> 
> The problem is that a '/' is prepended to the specified maprfs URI, so the
> command is not executed as expected, e.g.:
> 
> =
> *  hadoop fs -copyToLocal '/maprfs:///dist/hadoop-2.7.0.myriad1.tar.gz'
> '/opt/mapr/slaves/67d1f64c-449b-4609-82f3-5da309f3c5c5-S9/frameworks/67d1f64c-449b-4609-82f3-5da309f3c5c5-/executors/myriad1.63bbb98c-c072-11e5-b686-0cc47a587d20/runs/427fe309-82c5-4f8b-9fa3-6dd39a4a5ef4/hadoop-2.7.0.myriad1.tar.gz*
> 
> -copyToLocal: java.net.URISyntaxException: Expected scheme-specific part at
> index 7: maprfs:
> =
> 
> The fix for MESOS-3602 assumes only HDFS paths and doesn't consider other
> cases, such as maprfs or other DFS paths. I haven't filed a JIRA on this
> issue yet, but would like to get some feedback on it, and expect it to be
> fixed in the next Mesos release.
> 
> Let me know if there is anything else I could provide related to the issue.
> 
> -Sarjeet
> 
> On Sat, Jan 23, 2016 at 12:41 AM, Timothy Chen  wrote:
> 
>> Hi all,
>> 
>> We (Kapil, MPark and I) still have 3 blocker issues outstanding
>> at this moment:
>> 
>> MESOS-4449: SegFault on agent during executor startup (shepherd: Joris)
>> MESOS-4441: Do not allocate non-revocable resources beyond quota
>> guarantee. (shepherd: Joris)
>> MESOS-4410: Introduce protobuf for quota set request. (shepherd: Joris)
>> 
>> The remaining major tickets are ContainerLogger related and should be
>> committed today according to Ben.
>> 
>> We've started to test the latest master and will be looking at the test
>> failures to see what needs to be addressed.
>> 
>> I encourage everyone to test the latest master on your platform if
>> possible to catch issues early, and once the Blocker issues are
>> resolved we'll be sending an RC to test and vote.
>> 
>> Thanks,
>> 
>> Tim
>> 
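
The failure Sarjeet quotes comes from a '/' being prepended to a value that already carries a scheme. A minimal sketch of one way to guard against that is to leave anything that looks like `scheme://...` untouched. The helper names `hasScheme` and `toFetchTarget` are hypothetical, and this is neither the MESOS-3602 change nor the MESOS-4304 fix, only an illustration of the check.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns true if `uri` starts with a scheme followed by "://",
// e.g. "hdfs://", "maprfs://" or "http://". The scheme must start with
// a letter and may contain letters, digits, '+', '-' and '.'.
bool hasScheme(const std::string& uri) {
  const std::string::size_type pos = uri.find("://");
  if (pos == std::string::npos || pos == 0) return false;
  if (!std::isalpha(static_cast<unsigned char>(uri[0]))) return false;
  for (std::string::size_type i = 1; i < pos; ++i) {
    const unsigned char c = static_cast<unsigned char>(uri[i]);
    if (!std::isalnum(c) && c != '+' && c != '-' && c != '.') return false;
  }
  return true;
}

// Hypothetical normalization step: only scheme-less, relative values get
// a leading '/'. Values with a scheme or an absolute path pass through.
std::string toFetchTarget(const std::string& uri) {
  if (hasScheme(uri) || (!uri.empty() && uri[0] == '/')) return uri;
  return "/" + uri;
}

void example() {
  const std::string uri = "maprfs:///dist/hadoop-2.7.0.myriad1.tar.gz";
  assert(toFetchTarget(uri) == uri);  // no '/' is prepended
}
```

With a check like this, the scheme is not hard-coded, so hdfs, maprfs and other DFS schemes are treated the same way.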





Re: `F()` vs `F(void)`

2016-01-21 Thread Bernd Mathiske
High five!

> On Jan 21, 2016, at 12:40 AM, Michael Park  wrote:
> 
> `void` parameters are no longer with us.
> 
> https://github.com/apache/mesos/commit/93a5708294d6d66a5e1350a0bb1c8fe87605ee1d
> https://github.com/apache/mesos/commit/05f9fb2fa66968f37418d28fc8cebd0770a54dca
> https://github.com/apache/mesos/commit/4d4d7166414f0ebd8d5e40df34070446098a3c91
> 
> On Mon, Dec 14, 2015 at 5:36 AM Alexander Rojas wrote:
> 
>> +1
>> 
>>> On 13 Dec 2015, at 19:46, Michael Park  wrote:
>>> 
>>> Hello,
>>> 
>>> In the C++ world, the *void* parameter is considered to be there only for C
>>> compatibility reasons.
>>> 
>>> We do a good job of not using *void* parameters in function declarations,
>>> e.g., *void F();*. On the other hand, we're *not* so good at doing so for
>>> function types, e.g., *std::function*.
>>> 
>>> I would like to see the codebase converge to *not* use *void* as a
>>> parameter type, and would like feedback if anyone holds strong opposing
>>> opinions.
>>> 
>>> Thanks,
>>> 
>>> MPark.
>> 
>> 
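
As a small illustration of why this cleanup is safe: in C++ an empty parameter list and a lone `void` parameter denote the same function type, so dropping `void` cannot change an interface. The names `answer` and `callTwice` below are made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Both spellings name the same function type.
static_assert(std::is_same<void(), void(void)>::value,
              "void() and void(void) are the same type");

// The same holds when the type is used as a template argument.
static_assert(std::is_same<std::function<int()>,
                           std::function<int(void)>>::value,
              "std::function<int()> and std::function<int(void)> match");

// Preferred spelling: no `void` parameter.
int answer() { return 42; }

// Accepts any callable that takes no arguments and returns int.
int callTwice(const std::function<int()>& f) { return f() + f(); }

void example() {
  assert(callTwice(answer) == 84);
}
```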





Re: Mesos build & testing environment instructions

2016-01-06 Thread Bernd Mathiske
+1. (Really excited about this prospect!)

This kind of documentation is not necessarily for “testing” only. How can we 
expect certain features to run in production if they don’t pass the unit tests 
on a given platform? So I am in favor of the “let’s expand our existing install 
instructions” tack. The setup for testing and production should be identical.

It is indeed quite tricky to get any given variant of Linux set up just right 
for every supported feature and it takes a lot of time to debug the setup 
script when something goes wrong. In particular when starting from a predefined 
vagrant file or AMI, which is often the case.

I suggest we post “verified” scripts (for a limited set of platforms, arguably 
the same as in “Getting Started”) that explicitly name their starting 
configuration (vagrant file name, AMI name, or…) and contain all useful 
commands with comments why they are necessary. “Verified” means that as part of 
the release process, a committer has checked that the script worked as 
described for the RC. Bugs can be filed against this kind of documentation.

Regarding the problem of a feature depending on an obscure source: I’d suggest 
that if we cannot name any viable source then we cannot support the feature 
depending on that component on that platform in the given release. If the 
source is available during the release and later becomes non-viable, that’s 
another matter. We should label all commands pertaining to sources that are 
potentially in danger of changing as such, by source code comments in the 
scripts. Example for packages that can be regarded as “reliable”: what you can 
install by straight “apt-get install -y” from an LTS version, without prior 
repo meddling.

BTW: personally, I prefer a script that runs from beginning to end in one swoop 
to instructions like “open this file in an editor then change the line 
containing this and that to such and such and add a line that says bla.” Best 
to have a comment that explains all that next to a sed or similar command that 
just does it.

2c

Bernd

> On Dec 17, 2015, at 10:08 PM, Neil Conway  wrote:
> 
> +1 to the general idea of including this information in the documentation.
> 
> I'd probably lean towards including this information in the current
> "Getting Started" page, but in a separate section ("Running The Test
> Suite"?).
> 
> Neil
> 
> On Thu, Dec 17, 2015 at 12:38 PM, Greg Mann  wrote:
>> Hey folks!
>> Something occurred to me recently which is related to the extensive testing
>> we did in preparation for the 0.26.0 release. Since I started contributing
>> to the project, my Source of Truth for "how to prepare a given platform to
>> compile and run Mesos" has been the Getting Started page of our
>> documentation. However, this doc doesn't provide guidance for all platforms
>> on "how to prepare this platform to compile Mesos and then TEST it in all
>> configurations", which is crucial information for us when it comes to
>> testing, and would be useful to our users as well. I wonder if it makes
>> sense to have a separate place in our documentation where we include these
>> exhaustive installation instructions, which may be beyond the scope of a
>> "Getting Started" document for new users.
>> 
>> One option is to introduce a new documentation section on testing, where we
>> can include supplementary installation instructions, as well as information
>> on the test suite, how to run it, the available options, etc. We already
>> have a page on good patterns to use when *writing* tests, but nothing I can
>> find on running the tests, besides a brief mention in "Getting Started".
>> 
>> Another option would be to just expand our existing install instructions to
>> be a bit more comprehensive and include instructions for optional
>> components like libevent2, docker, kernel updates to enable cgroup tests,
>> etc.  Especially on an older platform like CentOS 6.6, this can be tricky.
>> 
>> Note that some of these installations (like libevent2 on CentOS 6.6)
>> require the use of hard-to-find RPMs whose origin is uncertain, and it's
>> possible that we wouldn't want to offer such instructions publicly to our
>> users.
>> 
>> Thoughts?
>> 
>> Cheers,
>> Greg





Re: [VOTE] Release Apache Mesos 0.26.0 (rc4)

2015-12-10 Thread Bernd Mathiske
I think that whereas this would clearly be a desirable bug fix to have, it is 
not a blocker:
- Not a regression. This problem has been around for a long time, since 0.20 
AFAIK.
- There is a simple workaround.

Bernd

> On Dec 10, 2015, at 3:05 AM, Benjamin Mahler wrote:
> 
> I'd really like to pull in the fix for:
> https://issues.apache.org/jira/browse/MESOS-4106
> 
> This has been a long-standing bug that makes the health checking not function 
> correctly some of the time. While it is rare in CI, it appeared in a 
> colleague's cluster for about a third of the tasks he was launching to 
> demonstrate how he ran into this. The fix is trivial and is in review.
> 
> Thoughts?
> 
> On Tue, Dec 8, 2015 at 7:01 AM, Bernd Mathiske wrote:
> +1 (binding)
> 
> Ran make check, make distcheck, sudo bin/mesos-tests.sh, with SSL enabled and 
> without on: Ubuntu 12.04, CentOS 7.1.
> 
> Had 4 test failures with CentOS 7.1 for each configuration variant. All of 
> the failed tests are known to be flaky, they have MESOS tickets, and they 
> have been investigated and are deemed non-blockers.
> 
> Bernd
> 
> > On Dec 8, 2015, at 4:59 AM, Till Toenshoff wrote:
> >
> > Hi friends,
> >
> > we had noticed some discrepancies between the V0 API and the V1 API,
> > hence we had to create a new release candidate even after the voting of
> > 0.26.0-rc3 had officially ended. Sorry for that!
> >
> > Please vote on releasing the following candidate as Apache Mesos 0.26.0.
> >
> > The CHANGELOG for the release is available at:
> > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.26.0-rc4
> >
> > The candidate for Mesos 0.26.0 release is available at:
> > https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc4/mesos-0.26.0.tar.gz
> >
> > The tag to be voted on is 0.26.0-rc4:
> > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.26.0-rc4
> >
> > The MD5 checksum of the tarball can be found at:
> > https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc4/mesos-0.26.0.tar.gz.md5
> >
> > The signature of the tarball can be found at:
> > https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc4/mesos-0.26.0.tar.gz.asc
> >
> > The PGP key used to sign the release is here:
> > https://dist.apache.org/repos/dist/release/mesos/KEYS
> >
> > The JAR is up in Maven in a staging repository here:
> > https://repository.apache.org/content/repositories/orgapachemesos-1093
> >
> > Please vote on releasing this package as Apache Mesos 0.26.0!
> >
> > The vote is open until Fri Dec 11 04:50:51 CET 2015 and passes if a 
> > majority of at least 3 +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Mesos 0.26.0
> > [ ] -1 Do not release this package because ...
> >
> > Thanks,
> > Bernd & Till
> >
> 
> 





Re: [VOTE] Release Apache Mesos 0.26.0 (rc4)

2015-12-08 Thread Bernd Mathiske
+1 (binding)

Ran make check, make distcheck, sudo bin/mesos-tests.sh, with SSL enabled and 
without on: Ubuntu 12.04, CentOS 7.1.

Had 4 test failures with CentOS 7.1 for each configuration variant. All of the 
failed tests are known to be flaky, they have MESOS tickets, and they have been 
investigated and are deemed non-blockers.

Bernd

> On Dec 8, 2015, at 4:59 AM, Till Toenshoff  wrote:
> 
> Hi friends,
> 
> we had noticed some discrepancies between the V0 API and the V1 API,
> hence we had to create a new release candidate even after the voting of
> 0.26.0-rc3 had officially ended. Sorry for that!
> 
> Please vote on releasing the following candidate as Apache Mesos 0.26.0.
> 
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.26.0-rc4
> 
> 
> The candidate for Mesos 0.26.0 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc4/mesos-0.26.0.tar.gz
> 
> The tag to be voted on is 0.26.0-rc4:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.26.0-rc4
> 
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc4/mesos-0.26.0.tar.gz.md5
> 
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc4/mesos-0.26.0.tar.gz.asc
> 
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
> 
> The JAR is up in Maven in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1093
> 
> Please vote on releasing this package as Apache Mesos 0.26.0!
> 
> The vote is open until Fri Dec 11 04:50:51 CET 2015 and passes if a majority 
> of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Mesos 0.26.0
> [ ] -1 Do not release this package because ...
> 
> Thanks,
> Bernd & Till
> 





Re: [VOTE] Release Apache Mesos 0.26.0 (rc3)

2015-12-07 Thread Bernd Mathiske
We found an issue with the v1 API, which is not completely on par with the 
legacy mesos.proto. Spinning RC4 shortly…

> On Dec 7, 2015, at 10:48 AM, Bernd Mathiske  wrote:
> 
> +1 (binding)
> 
> Ubuntu 14 (clean without SSL, a few known flaky tests with SSL, all analyzed 
> and deemed non-blockers)
> CentOS 6.6 (a few known flaky tests with SSL, all analyzed and deemed 
> non-blockers)
> 
>> On Dec 4, 2015, at 4:52 AM, Benjamin Mahler wrote:
>> 
>> +1 (binding) tests pass on OS X 10.11.1 with both SSL and non-SSL
>> configurations.
>> 
>> Some feedback from framework developers would be great here.
>> 
>> Agreed that MESOS-3973 <https://issues.apache.org/jira/browse/MESOS-3973> is
>> not a blocker, given it also occurs on 0.26.0 all the way back to 0.21.0.
>> 
>> 
>> On Wed, Dec 2, 2015 at 9:01 AM, Bernd Mathiske  wrote:
>> 
>>> We are still working on that, but we do not regard "make distcheck" on Mac
>>> as a blocker. Other opinions?
>>> 
>>> On Dec 2, 2015, at 2:27 PM, Alex Rukletsov  wrote:
>>> 
>>> `make check -j7` — OK
>>> `make distcheck -j7` — fails, probably MESOS-3973
>>> <https://issues.apache.org/jira/browse/MESOS-3973>, see hints below.
>>> 
>>> Both on Mac OS 10.10.4
>>> 
>>> I see the following lines in the log:
>>> ...
>>> libtool: warning: 'libmesos.la' has not been installed in
>>> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
>>> libtool: warning: 'libmesos.la' has not been installed in
>>> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
>>> ...
>>> libtool: warning: 'libmesos.la' has not been installed in
>>> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
>>> libtool: warning: 'libmesos.la' has not been installed in
>>> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
>>> ...
>>> Cannot uninstall requirement mesos, not installed
>>> Cannot uninstall requirement mesos.cli, not installed
>>> Cannot uninstall requirement mesos.interface, not installed
>>> Cannot uninstall requirement mesos.native, not installed
>>> ERROR: files left after uninstall:
>>> ...
>>> 
>>> On Tue, Dec 1, 2015 at 8:49 PM, Till Toenshoff  wrote:
>>> 
>>>> Hi friends,
>>>> 
>>>> Please vote on releasing the following candidate as Apache Mesos 0.26.0.
>>>> 
>>>> The CHANGELOG for the release is available at:
>>>> 
>>>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.26.0-rc3
>>>> 
>>>> 
>>>> 
>>>> The candidate for Mesos 0.26.0 release is available at:
>>>> 
>>>> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz
>>>> 
>>>> The tag to be voted on is 0.26.0-rc3:
>>>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.26.0-rc3
>>>> 
>>>> The MD5 checksum of the tarball can be found at:
>>>> 
>>>> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz.md5
>>>> 
>>>> The signature of the tarball can be found at:
>>>> 
>>>> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz.asc
>>>> 
>>>> The PGP key used to sign the release is here:
>>>> https://dist.apache.org/repos/dist/release/mesos/KEYS
>>>> 
>>>> The JAR is up in Maven in a staging repository here:
>>>> https://repository.apache.org/content/repositories/orgapachemesos-1091
>>>> 
>>>> Please vote on releasing this package as Apache Mesos 0.26.0!
>>>> 
>>>> The vote is open until Fri Dec  4 19:00:35 CET 2015 and passes if a
>>>> majority of at least 3 +1 PMC votes are cast.
>>>> 
>>>> [ ] +1 Release this package as Apache Mesos 0.26.0
>>>> [ ] -1 Do not release this package because …
>>>> 
>>>> Thanks,
>>>> Bernd & Till
>>>> 
>>>> 
>>> 
>>> 
> 





Re: [VOTE] Release Apache Mesos 0.26.0 (rc3)

2015-12-07 Thread Bernd Mathiske
+1 (binding)

Ubuntu 14 (clean without SSL, a few known flaky tests with SSL, all analyzed 
and deemed non-blockers)
CentOS 6.6 (a few known flaky tests with SSL, all analyzed and deemed 
non-blockers)

> On Dec 4, 2015, at 4:52 AM, Benjamin Mahler  wrote:
> 
> +1 (binding) tests pass on OS X 10.11.1 with both SSL and non-SSL
> configurations.
> 
> Some feedback from framework developers would be great here.
> 
> Agreed that MESOS-3973 <https://issues.apache.org/jira/browse/MESOS-3973> is
> not a blocker, given it also occurs on 0.26.0 all the way back to 0.21.0.
> 
> 
> On Wed, Dec 2, 2015 at 9:01 AM, Bernd Mathiske  wrote:
> 
>> We are still working on that, but we do not regard "make distcheck" on Mac
>> as a blocker. Other opinions?
>> 
>> On Dec 2, 2015, at 2:27 PM, Alex Rukletsov  wrote:
>> 
>> `make check -j7` — OK
>> `make distcheck -j7` — fails, probably MESOS-3973
>> <https://issues.apache.org/jira/browse/MESOS-3973>, see hints below.
>> 
>> Both on Mac OS 10.10.4
>> 
>> I see the following lines in the log:
>> ...
>> libtool: warning: 'libmesos.la' has not been installed in
>> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
>> libtool: warning: 'libmesos.la' has not been installed in
>> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
>> ...
>> libtool: warning: 'libmesos.la' has not been installed in
>> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
>> libtool: warning: 'libmesos.la' has not been installed in
>> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
>> ...
>> Cannot uninstall requirement mesos, not installed
>> Cannot uninstall requirement mesos.cli, not installed
>> Cannot uninstall requirement mesos.interface, not installed
>> Cannot uninstall requirement mesos.native, not installed
>> ERROR: files left after uninstall:
>> ...
>> 
>> On Tue, Dec 1, 2015 at 8:49 PM, Till Toenshoff  wrote:
>> 
>>> Hi friends,
>>> 
>>> Please vote on releasing the following candidate as Apache Mesos 0.26.0.
>>> 
>>> The CHANGELOG for the release is available at:
>>> 
>>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.26.0-rc3
>>> 
>>> 
>>> 
>>> The candidate for Mesos 0.26.0 release is available at:
>>> 
>>> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz
>>> 
>>> The tag to be voted on is 0.26.0-rc3:
>>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.26.0-rc3
>>> 
>>> The MD5 checksum of the tarball can be found at:
>>> 
>>> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz.md5
>>> 
>>> The signature of the tarball can be found at:
>>> 
>>> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz.asc
>>> 
>>> The PGP key used to sign the release is here:
>>> https://dist.apache.org/repos/dist/release/mesos/KEYS
>>> 
>>> The JAR is up in Maven in a staging repository here:
>>> https://repository.apache.org/content/repositories/orgapachemesos-1091
>>> 
>>> Please vote on releasing this package as Apache Mesos 0.26.0!
>>> 
>>> The vote is open until Fri Dec  4 19:00:35 CET 2015 and passes if a
>>> majority of at least 3 +1 PMC votes are cast.
>>> 
>>> [ ] +1 Release this package as Apache Mesos 0.26.0
>>> [ ] -1 Do not release this package because …
>>> 
>>> Thanks,
>>> Bernd & Till
>>> 
>>> 
>> 
>> 





How to write numbers in docs

2015-12-03 Thread Bernd Mathiske
Should numbers be written as digits (1) or as words (one) in markdown docs?
When and how to decide? How to be consistent about this?
Do you even care?

Anyway, here is a proposal from Jörg to make this easy:

https://reviews.apache.org/r/40292/ 

The proposal is to use words for one through nine and digits/figures
from 10 onwards (10, 11, 12, and so on).

Please review the RR if you agree or disagree strongly.

Bernd
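
To make the proposed rule concrete, here is a small sketch of it as a function. The name `formatNumber` is made up for illustration, and the handling of zero and negative numbers (falling back to digits) is an assumption, since the summary above only covers one through nine and 10 onwards.

```cpp
#include <cassert>
#include <string>

// Spell out one through nine; use digits from 10 onwards.
// Assumption: zero and negative numbers are not covered by the summary
// of the proposal, so this sketch falls back to digits for them.
std::string formatNumber(int n) {
  static const char* const words[] = {
    "one", "two", "three", "four", "five",
    "six", "seven", "eight", "nine"};
  if (n >= 1 && n <= 9) return words[n - 1];
  return std::to_string(n);
}

void example() {
  assert(formatNumber(9) == "nine");
  assert(formatNumber(10) == "10");
}
```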





Re: [VOTE] Release Apache Mesos 0.26.0 (rc3)

2015-12-02 Thread Bernd Mathiske
We are still working on that, but we do not regard "make distcheck" on Mac as 
a blocker. Other opinions?

> On Dec 2, 2015, at 2:27 PM, Alex Rukletsov  wrote:
> 
> `make check -j7` — OK
> `make distcheck -j7` — fails, probably MESOS-3973, see hints below.
> 
> Both on Mac OS 10.10.4
> 
> I see the following lines in the log:
> ...
> libtool: warning: 'libmesos.la' has not been installed in
> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
> libtool: warning: 'libmesos.la' has not been installed in
> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
> ...
> libtool: warning: 'libmesos.la' has not been installed in
> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
> libtool: warning: 'libmesos.la' has not been installed in
> '/Users/alex/Projects/mesos/build/default/mesos-0.26.0/_inst/lib'
> ...
> Cannot uninstall requirement mesos, not installed
> Cannot uninstall requirement mesos.cli, not installed
> Cannot uninstall requirement mesos.interface, not installed
> Cannot uninstall requirement mesos.native, not installed
> ERROR: files left after uninstall:
> ...
> 
> On Tue, Dec 1, 2015 at 8:49 PM, Till Toenshoff wrote:
> Hi friends,
> 
> Please vote on releasing the following candidate as Apache Mesos 0.26.0.
> 
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.26.0-rc3
> 
> The candidate for Mesos 0.26.0 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz
> 
> The tag to be voted on is 0.26.0-rc3:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.26.0-rc3
> 
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz.md5
> 
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc3/mesos-0.26.0.tar.gz.asc
> 
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
> 
> The JAR is up in Maven in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1091
> 
> Please vote on releasing this package as Apache Mesos 0.26.0!
> 
> The vote is open until Fri Dec  4 19:00:35 CET 2015 and passes if a majority 
> of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Mesos 0.26.0
> [ ] -1 Do not release this package because …
> 
> Thanks,
> Bernd & Till
> 
> 





Re: [VOTE] Release Apache Mesos 0.26.0 (rc1)

2015-11-16 Thread Bernd Mathiske
All,

while we are investigating this test failure, and holding the release back, 
additional feedback on how rc-1 is performing is still welcome.

Bernd

> On Nov 14, 2015, at 3:14 AM, Marco Massenzio  wrote:
> 
> -1
> (non-binding)
> 
> Run on CentOS 7.1
> Builds and all tests pass
> 
> ROOT tests fail (with segfault) - --verbose logs attached.
> 
> Currently running on Ubuntu 14.04 too.
> 
> --
> Marco Massenzio
> Distributed Systems Engineer
> http://codetrips.com 
> On Fri, Nov 13, 2015 at 3:14 AM, Till Toenshoff wrote:
> Hi friends,
> 
> Please vote on releasing the following candidate as Apache Mesos 0.26.0.
> 
> 
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.26.0-rc1
> 
> The candidate for Mesos 0.26.0 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc1/mesos-0.26.0.tar.gz
> 
> The tag to be voted on is 0.26.0-rc1:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.26.0-rc1
> 
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc1/mesos-0.26.0.tar.gz.md5
> 
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/0.26.0-rc1/mesos-0.26.0.tar.gz.asc
> 
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
> 
> The JAR is up in Maven in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1085
> 
> Please vote on releasing this package as Apache Mesos 0.26.0!
> 
> The vote is open until Sun Nov 15 20:13:46 CET 2015 and passes if a majority 
> of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Mesos 0.26.0
> [ ] -1 Do not release this package because ...
> 
> Thanks,
> Bernd & Till
> 
> 





Re: Fetcher refactor proposal

2015-11-11 Thread Bernd Mathiske
+1 - go for it!

> On Nov 11, 2015, at 12:45 AM, Jie Yu  wrote:
> 
> Hi,
> 
> Fetcher was originally designed to fetch CommandInfo::URIs (e.g., executor
> binary) for executors/tasks. A recent refactor (MESOS-336) added caching
> support to the fetcher. The recent work on filesystem isolation/unified
> containerizer (MESOS-2840) requires Mesos to fetch filesystem images (e.g.,
> APPC/DOCKER images) as well. The natural question is: can we leverage the
> fetcher to fetch those filesystem images (and cache them accordingly)?
> Unfortunately, the existing fetcher interface is tightly coupled with
> CommandInfo::URIs for executors/tasks, making it very hard to reuse for
> fetching/caching filesystem images.
> 
> Another motivation for the refactor is that we want to extend the fetcher
> to support more types of schemes. For instance, we want to support magnet
> URIs to enable p2p fetching. This is in fact quite important for operating a
> large cluster (MESOS-3596).
> The goal here is to allow the fetcher to be extended (e.g., using modules) so
> that operators can add custom fetching support.
> 
> I proposed a solution in this doc.
> The main idea is to decouple artifact fetching from artifact cache
> management. We can make artifact fetching extensible (e.g., to support p2p
> fetching), and solve the cache management part later.
> 
> Let me know your thoughts! Thanks!
> 
> - Jie
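
One way to picture the decoupling described in the quoted proposal is a fetcher that only dispatches on the URI scheme to pluggable handlers and leaves caching out entirely. The following is a minimal C++ sketch under that assumption. `FetcherPlugin`, `EchoPlugin`, `Fetcher`, `add`, and `fetch` are hypothetical names chosen for illustration; they are not taken from the linked design doc or from the Mesos code base.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical plugin interface: one implementation per URI scheme.
class FetcherPlugin {
public:
  virtual ~FetcherPlugin() = default;

  // Returns a description of what was fetched; a real plugin would
  // download `uri` into `directory` instead.
  virtual std::string fetch(
      const std::string& uri,
      const std::string& directory) = 0;
};

// Stand-in plugin that only records what it was asked to do.
class EchoPlugin : public FetcherPlugin {
public:
  explicit EchoPlugin(std::string name) : name_(std::move(name)) {}

  std::string fetch(
      const std::string& uri,
      const std::string& directory) override {
    return name_ + ": " + uri + " -> " + directory;
  }

private:
  std::string name_;
};

// Dispatches on the URI scheme only. Cache management is deliberately
// left out, mirroring the "solve the cache part later" idea.
class Fetcher {
public:
  void add(const std::string& scheme, std::unique_ptr<FetcherPlugin> plugin) {
    plugins_[scheme] = std::move(plugin);
  }

  // Returns an empty string if no plugin is registered for the scheme.
  std::string fetch(const std::string& uri, const std::string& directory) {
    const std::string scheme = uri.substr(0, uri.find(':'));
    auto it = plugins_.find(scheme);
    if (it == plugins_.end()) {
      return "";
    }
    return it->second->fetch(uri, directory);
  }

private:
  std::map<std::string, std::unique_ptr<FetcherPlugin>> plugins_;
};
```

In this shape, supporting a new scheme such as `magnet` would mean registering one more plugin; whether the real design uses modules in this way is not stated in the thread.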





Re: Mesos Style Guideline Adjustments

2015-11-09 Thread Bernd Mathiske
n...@mesosphere.io> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> just to echo Alexander’s point, for newbies like me being able to
>>>> delegate formatting decisions to tools as much as possible frees up a
>>>> lot of mental resources for tackling the real issues.
>>>> 
>>>> 
>>>> Cheers,
>>>> 
>>>> Benjamin
>>>> 
>>>> ps. Also looking forward to an updated and expanded clang-format config.
>>>> 
>>>> 
>>>>> On Nov 6, 2015, at 1:44 PM, Alexander Rojas wrote:
>>>>> 
>>>>> I think one of the main reasons we moved to having 80 as the limit for
>>>>> both code and comments is the ability it gives us to use tools (e.g.
>>>>> clang-format) to enforce formatting rules, so personally I would rather
>>>>> have us putting effort towards that goal. On that note, the developer
>>>>> branch of clang-format allows formatting options much closer to the
>>>>> ones we use. On OS X it can be installed using
>>>>> `brew install --HEAD clang-format`.
>>>>> 
>>>>> Right now I’m working on setting the config file to be as close as
>>>>> possible to our style.
>>>>> 
>>>>>> On 06 Nov 2015, at 10:09, Alex Rukletsov wrote:
>>>>>> 
>>>>>> I think jaggedness in the example you provide comes mainly from the
>>>>>> fact that the second comment has multiple logical blocks. I have
>>>>>> formatted both comments at 70 and at 80, here is the outcome:
>>>>>> http://pastebin.com/nRQB0nCD
>>>>>> 
>>>>>> While the first comment indeed looks better when wrapped at 70, I
>>>>>> can't say the same about the second one.
>>>>>> 
>>>>>> I would say that the longer a line can be, the less jagged the
>>>>>> comment block is. The ratio (`averageWordLength` / `maxLineLength`)
>>>>>> approaches 0 as `maxLineLength` approaches infinity, which means
>>>>>> wrapping a long word right before the line end should be perceived
>>>>>> as less jagged : ).
>>>>>> 
>>>>>> Also, the longer an individual line can be, the fewer total lines are
>>>>>> needed for a comment block, which reduces jaggedness and makes code a
>>>>>> little bit more readable.
>>>>>> 
>>>>>> But my strongest argument is that having a separate soft rule for
>>>>>> comments is hard to enforce. I think what we can do is to encourage
>>>>>> contributors / committers to wrap comments in the most logical way
>>>>>> (like the first comment in the example you provide) even if the line
>>>>>> length is not fully utilized. Having said that, I would rather keep a
>>>>>> single number: hard limit at 80 for simplicity.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Thu, Nov 5, 2015 at 10:15 PM, Benjamin Mahler
>>>>>> <benjamin.mah...@gmail.com> wrote:
>>>>>> 
>>>>>>> This has come up in a couple of reviews, seems like we should add
>>>>>>> some soft guidelines around how to format comments for readability.
>>>>>>> 
>>>>>>> In particular, the reason that we wrapped at 70 in the past was for
>>>>>>> readability, so it would be great to continue doing so as a soft
>>>>>>> stylistic rule. The other thing we've been doing for readability is
>>>>>>> reducing "jaggedness" (variability in line lengths).
>>>>>>> 
>>>>>>> It would be great to establish these as soft rules and encourage new
>>>>>>> contributors / committers to follow them. Compare these two comments
>>>>>>> in Master::updateTask. The first one wraps at 70 and reduces
>>>>>>> jaggedness, the second wraps at 80 and is more jagged:
>>>>>
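
The ratio argument quoted above can be tried out numerically. The sketch below greedily wraps text at a given width and reports the average unused space per line as a fraction of that width. `wrap` and `averageSlack` are hypothetical helper names, and "slack" is only one possible stand-in for jaggedness; nobody in the thread proposed this particular measure.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Greedy word wrap: put as many words on a line as fit in `width`.
// A single word longer than `width` gets a line of its own.
std::vector<std::string> wrap(const std::string& text, std::size_t width) {
  std::vector<std::string> lines;
  std::istringstream words(text);
  std::string word;
  std::string line;
  while (words >> word) {
    if (line.empty()) {
      line = word;
    } else if (line.size() + 1 + word.size() <= width) {
      line += " " + word;
    } else {
      lines.push_back(line);
      line = word;
    }
  }
  if (!line.empty()) {
    lines.push_back(line);
  }
  return lines;
}

// Average unused space per line, as a fraction of `width`.
// The last line is skipped because it is short at any width.
double averageSlack(const std::vector<std::string>& lines, std::size_t width) {
  if (lines.size() < 2) {
    return 0.0;
  }
  double total = 0.0;
  for (std::size_t i = 0; i + 1 < lines.size(); ++i) {
    const std::size_t used = lines[i].size() < width ? lines[i].size() : width;
    total += static_cast<double>(width - used) / width;
  }
  return total / (lines.size() - 1);
}
```

Comparing `averageSlack(wrap(comment, 70), 70)` with `averageSlack(wrap(comment, 80), 80)` on a real comment gives a rough feel for the effect, though it says nothing about wrapping at logical breaks, which is the other half of the discussion.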

Re: More Project Structure in JIRA

2015-11-09 Thread Bernd Mathiske
All,

thanks for upvoting this. AFAICT, we have consensus to go ahead. Let’s do this 
from now on!

Bernd

> On Oct 28, 2015, at 2:52 AM, Klaus Ma  wrote:
> 
> +1
> 
> On Sun, Oct 25, 2015 at 11:57 PM, Shuai Lin  wrote:
> 
>> +1
>> 
>> On Wed, Oct 21, 2015 at 12:55 AM, Greg Mann  wrote:
>> 
>>> +1
>>> 
>>> On Tue, Oct 20, 2015 at 9:50 AM, tommy xiao  wrote:
>>> 
>>>> +1 Yes please!
>>>> 
>>>> 2015-10-19 16:09 GMT+08:00 Alexander Rojas :
>>>> 
>>>>> +1 Yes please!
>>>>> 
>>>>>> On 15 Oct 2015, at 10:11, Bernd Mathiske wrote:
>>>>>> 
>>>>>> Proposal: in extension of today’s limited two-level (epic, task)
>>>>>> approach, make full use of expressive power already available in JIRA
>>>>>> to provide more structure for larger projects to facilitate planning,
>>>>>> tracking, and reporting. This will facilitate dynamic planning of
>>>>>> sub-projects, which will make us more agile.
>>>>>> 
>>>>>> The general idea is to use links between epics to provide a recursive
>>>>>> hierarchical structure, with which one can span trees or DAGs of
>>>>>> arbitrarily large projects. This does not mean that we want to plan
>>>>>> everything in minute detail before starting to work. On the contrary.
>>>>>> 
>>>>>> You can start anywhere in the eventual tree and express part of the
>>>>>> overall effort, maybe say a short epic with a few task tickets. Then
>>>>>> you can LATER make this epic a dependency for a larger effort.
>>>>>> 
>>>>>> Conversely, you can subdivide a task in the epic into subtasks.
>>>>>> However, this does not mean that you have to literally use the feature
>>>>>> “subtask” in JIRA for this. Instead, staying recursive in our JIRA
>>>>>> grammar, so to speak, convert the task to an epic and then create
>>>>>> ordinary tasks in it to represent subtasks.
>>>>>> 
>>>>>> Now the task cannot be a task in its parent epic anymore. We fix this
>>>>>> by putting in a link of type "blocks" to the parent. When you then
>>>>>> look at the parent, it still holds a number of tasks, and it has one
>>>>>> dependency on an epic (to which you can add more).
>>>>>> 
>>>>>> Thus our dependency tree can grow in all directions. You can also
>>>>>> rearrange and update it in any shape or form if necessary.
>>>>>> 
>>>>>> Overall, we only use two JIRA elements: epics and tasks (of different
>>>>>> flavors such as bugs, improvements, etc.). Tasks are the leaves,
>>>>>> everything else is an epic. Review requests only ever happen for
>>>>>> tasks.
>>>>>> 
>>>>>> The epics are there to provide a high level view and to allow dynamic
>>>>>> (“more agilish”, non-waterfall) planning. Granted, you’d also use a
>>>>>> tree if you did waterfall. The difference is that you’d spec it all
>>>>>> out at once. My observation is that quite a few of us do exactly this,
>>>>>> outside JIRA, and then try to remember what tickets are where in
>>>>>> their tree. Let’s make this part of JIRA!
>>>>>> 
>>>>>> Why not use labels? Because they are in a flat name space and we want
>>>>>> to represent tree structure. How would you know that a label denotes a
>>>>>> subproject of another label? By memorizing or by depicting a tree
>>>>>> outside JIRA. Why not use components? Same problem as with labels:
>>>>>> flat name space. We can use labels and components for many other
>>>>>> purposes. Separate discussion.
>>>>>> 
>>>>>> Aren’t we doing this already? Probably. I have not checked
>>>>>> thoroughly. There may occasionally be epics that link to other epics.
>>>>>> If so, I would merely like to encourage us to use this powerful
>>>>>> expressive means more often.
>>>>>> 
>>>>>> Bernd
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Deshi Xiao
>>>> Twitter: xds2000
>>>> E-mail: xiaods(AT)gmail.com
>>>> 
>>> 
>> 
> 
> 
> 
> --
> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
> Platform Symphony/DCOS Development & Support, STG, IBM GCG
> +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
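
The epic/task structure in the quoted proposal amounts to a DAG whose leaves are tasks. A rough C++ sketch of that shape follows, using a hypothetical `Issue` type and made-up ticket keys. It models only the data structure and makes no claim about the JIRA API.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model: an issue with no children is a task (a leaf),
// anything with children is an epic.
struct Issue {
  std::string key;

  // Tasks held by the epic plus epics linked to it via "blocks".
  std::vector<const Issue*> children;

  bool isEpic() const { return !children.empty(); }
};

// Collects the keys of all leaf tasks reachable from `root`. A set is
// used because in a DAG a task can be reachable along several paths.
void collectTasks(const Issue& root, std::set<std::string>& tasks) {
  if (!root.isEpic()) {
    tasks.insert(root.key);
    return;
  }
  for (const Issue* child : root.children) {
    collectTasks(*child, tasks);
  }
}
```

In this model, "converting a task to an epic" is simply giving a former leaf some children, which matches the recursive reading of the proposal.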





Re: Add JIRA ticket# to `TODO`s in comments

2015-11-09 Thread Bernd Mathiske
 +1 on converting lots of TODOs into JIRAs with links to them in the TODOs.

Questions with opinions:
- Do we need to create extra tickets like “Edit TODO to mention ticket 
MESOS-XXX”? I suppose not.
- Do we even need an RR for updating a TODO? I suppose yes.
- Can we do several TODO updates at once across several/many files/topics in 
one RR? I propose: no limits, except stout/libprocess boundaries.
- Every TODO *CAN* be a ticket - how’s that for starters? I’d also go along 
with MUST if there is consensus on that.

Opinions "without question":
- The assignee MUST be left open until the ticket is in a sprint.
- Typically, the reporter should be the person now mentioned in the TODO. 
Alternatively, if there is significant extra information in the ticket, the 
person making the effort to write the ticket can be the Reporter, provided 
they leave a comment giving some credit to the original TODO author.

Bernd

> On Nov 9, 2015, at 11:54 AM, Till Toenshoff  wrote:
> 
> +1 in general for this proposal.
> 
> Using JIRA for tracking TODO’s is great, especially for things like 
> deprecation over/at releases. I am however unsure if *all* TODOs need to have 
> a ticket assigned, so that is a detail we may want to discuss as well?
> 
>> On Nov 9, 2015, at 9:55 AM, Alex Clemmer  wrote:
>> 
>> I like this proposal a lot, as I often end up making a point to
>> mention the MESOS- number in the comment anyway. I would rather
>> have the format `TODO(MESOS-XXX)` though, because (1) the JIRA should
>> capture the reporter as well as the assignee, and (2) it's not
>> immediately clear from the structure that the name should be the
>> reporter and not, say, the assignee.
>> 
>> On Sat, Nov 7, 2015 at 8:50 PM, Kapil Arya  wrote:
>>> Folks,
>>> 
>>> I wanted to bring up a style issue related to the TODO tag in comments. I
>>> have filed a Jira ticket (https://issues.apache.org/jira/browse/MESOS-3850)
>>> with the following description:
>>> 
>>> Currently, we have TODO() tags to note stuff that "should be"/"has to
>>> be" done in future. While this provides us with some notion of
>>> accounting, it's not enough.
>>> 
>>> The author listed in the TODO comment should be considered the "Reporter",
>>> but not necessarily the "Assignee". Further, since the stuff "should
>>> be"/"has to be" done, why not have a Jira issue tracking it?
>>> 
>>> We can use TODO(MESOS-XXX) or TODO(:MESOS-XXX) or something
>>> similar. Finally, we might want to consider adding this to the style guide
>>> to make it a soft/hard requirement.
>>> 
>>> 
>>> Are there any opinions/suggestions on this one?
>>> 
>>> Best,
>>> Kapil
>> 
>> 
>> 
>> --
>> Alex
>> 
>> Theory is the first term in the Taylor series of practice. -- Thomas M
>> Cover (1992)
> 
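
If a TODO-with-ticket format were adopted, it could be checked mechanically. The sketch below is one possible check, not an agreed convention: `todoHasTicket` is a hypothetical helper, and the `name:MESOS-123` form is an assumption about what the second variant in the quoted message was meant to look like.

```cpp
#include <regex>
#include <string>

// Returns true if `comment` contains a TODO that references a ticket,
// either as `TODO(MESOS-123)` or as `TODO(name:MESOS-123)`.
// The `name:` prefix is an assumed format, not one fixed by the thread.
bool todoHasTicket(const std::string& comment) {
  static const std::regex pattern(
      R"(TODO\(([A-Za-z0-9_.-]+:)?MESOS-[0-9]+\))");
  return std::regex_search(comment, pattern);
}
```

A check like this could run over changed lines in a review, flagging plain `TODO(name)` comments without forcing a decision on whether every TODO must have a ticket.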





Re: Finish Oversubscription before 0.26.0?

2015-11-06 Thread Bernd Mathiske
Scratch that. We already announced this with Mesos 0.23. Only the epic never 
got closed. And we merely improved the feature since 0.23.

So, no exciting new features in 0.26.0 AFAIK.

> On Nov 6, 2015, at 11:16 AM, Bernd Mathiske  wrote:
> 
> Cool! So we now have a major feature to announce with this release. Thanks 
> for all the good work on this!
> 
> Bernd
> 
>> On Nov 5, 2015, at 8:11 PM, Niklas Nielsen <n...@qni.dk> wrote:
>> 
>> Nope - go ahead and close
>> 
>> On Thu, Nov 5, 2015 at 10:24 AM, Jie Yu <j...@twitter.com> wrote:
>> I would say the MVP is done. Of course, there'll be some followup 
>> improvement to the feature, and all the remaining issues are within that 
>> category. I am fine resolving this epic. Any one has any objection?
>> 
>> - Jie
>> 
>> On Thu, Nov 5, 2015 at 10:18 AM, Bernd Mathiske <be...@mesosphere.io> wrote:
>> All who worked on MESOS-354,
>> 
>> What’s the status of the Oversubscription epic? Can we already call it a 
>> feature in 0.26? Shall we wait a few days to finish it? Will it slip into 
>> 0.27?
>> 
>> I see only 6 unresolved tickets and lots of resolved ones here:
>> 
>> https://issues.apache.org/jira/browse/MESOS-354
>> 
>> (Could you please assign someone to this ticket as overall responsible epic 
>> master?)
>> 
>> Bernd
>> 
>> 
>> 
>> 
>> 
>> --
>> Niklas
> 





Re: Finish Oversubscription before 0.26.0?

2015-11-06 Thread Bernd Mathiske
Cool! So we now have a major feature to announce with this release. Thanks for 
all the good work on this!

Bernd

> On Nov 5, 2015, at 8:11 PM, Niklas Nielsen  wrote:
> 
> Nope - go ahead and close
> 
> On Thu, Nov 5, 2015 at 10:24 AM, Jie Yu <j...@twitter.com> wrote:
> I would say the MVP is done. Of course, there'll be some followup improvement 
> to the feature, and all the remaining issues are within that category. I am 
> fine resolving this epic. Any one has any objection?
> 
> - Jie
> 
> On Thu, Nov 5, 2015 at 10:18 AM, Bernd Mathiske <be...@mesosphere.io> wrote:
> All who worked on MESOS-354,
> 
> What’s the status of the Oversubscription epic? Can we already call it a 
> feature in 0.26? Shall we wait a few days to finish it? Will it slip into 
> 0.27?
> 
> I see only 6 unresolved tickets and lots of resolved ones here:
> 
> https://issues.apache.org/jira/browse/MESOS-354
> 
> (Could you please assign someone to this ticket as overall responsible epic 
> master?)
> 
> Bernd
> 
> 
> 
> 
> 
> --
> Niklas





Finish Oversubscription before 0.26.0?

2015-11-05 Thread Bernd Mathiske
All who worked on MESOS-354,

What’s the status of the Oversubscription epic? Can we already call it a 
feature in 0.26? Shall we wait a few days to finish it? Will it slip into 0.27?

I see only 6 unresolved tickets and lots of resolved ones here:

https://issues.apache.org/jira/browse/MESOS-354 


(Could you please assign someone to this ticket as overall responsible epic 
master?)

Bernd





Re: Proposal: move towards #pragma and away from #include guards

2015-11-05 Thread Bernd Mathiske
+1

This site has a list of compilers that support #pragma once.

https://en.wikipedia.org/wiki/Pragma_once#Portability 


Clang, MS Visual C++, GCC as of 3.4 from 2006. OK!

(Too bad for Sun/Oracle Studio C++. But you can use GCC on Solaris, right?)

We could tackle this change at the same time as correcting the copyright notes, 
then we have the history clutter only once:

https://issues.apache.org/jira/browse/MESOS-3581 


Any other projects of that nature that can be bundled?

Bernd

> On Nov 5, 2015, at 6:36 AM, Alex Clemmer  wrote:
> 
> Hey folks.
> 
> In r/39803[1], Mike Hopcroft (in quintessential MSFT style, heh)
> brought up the issue of moving away from #include guards and towards
> `#pragma once`.
> 
> As this has been brought up before, I will be brief: we think it's
> worth revisiting because the primary objection in previous threads
> appears to be that, though `#pragma once` is a cleaner solution to the
> multiple-include problem, it's not so much better that it's worth the
> code churn. However, the ongoing Windows integration work means we
> have to touch these files anyway, so if we agree this is cleaner and
> desirable, then this is an opportunity to obtain that additional code
> clarity, without the cost of the churn.
> 
> For the remainder of the email, I will summarize the history of our
> discussion of this issue, who will do the work, and what the next
> steps are.
> 
> PROPOSAL: We propose that all new code use `#pragma once` instead of
> #include guards; for existing files, we propose that you change
> #include guards when you touch them.
> 
> HISTORY: This has been discussed before, most recently a year ago on
> the mailing list[2]. There is a relevant JIRA[3] and a discarded
> review[4] that changes the style guide's recommendation on the matter.
> 
> SUMMARIZED OBJECTIONS:
> 1. The Google style guide explicitly forbids `#pragma once`.
> 2. This results in a lot of code churn, but is only marginally better.
> 3. It's not C++ standardized/it's platform dependent/IBM's compiler
> doesn't support it.
> 4. Popular projects like Chrome don't do `#pragma once` because of
> history clutter.
> 5. Intermediate state of inconsistency as we transition to `#pragma
> once` from #include guards.
> 
> OUR RESPONSE:
> Objections (1), (2), and (4) are essentially the same -- Dominic Hamon
> points out in a previous thread that the Google style guide was
> canonized when `#pragma once` was Windows-only, and the guidance has
> not changed since because of the history churn problem. As noted
> above, we think the history churn problem is minimized by the fact
> that it can be wrapped up into the Windows integration work.
> 
> For objection (3), the consensus seems to be that the vast majority of
> compilers we care about (in particular, the ones supporting C++11) do
> support it.
> 
> For objection (5) we believe the inconsistent state is likely to not
> be long lived, as long as we commit to wrapping this work up into the
> Windows integration work.
> 
> SUMMARIZED ADVANTAGES:
> * Basically fool-proof. Communicates simply what its function is (you
> include this file once). Semantically it is "the right tool for the
> job".
> * No need for namespacing conventions for #include guards.
> * No conflicts with reserved identifiers[5].
> * No internal conflicts between include guards in Stout, Process
> library, and Mesos (this is one reason we need the namespacing
> conventions)
> * Reduces preprocessor definition clutter (we should rely on #define
> as little as humanly possible).
> * Optimized to be easy to read and reason about.
> 
> NEXT STEPS:
> If we agree that this is the right thing to do, committers would ask
> people to use `#pragma once` for new code when presented in code
> reviews. For files that exist, I will take point on transitioning as
> we complete the Windows integration work. I expect this work to
> completely land before the new year.
> 
> 
> Thanks,
> 
> 
> [1] https://reviews.apache.org/r/39803/
> [2] https://www.marc.info/?t=142540100400015&r=1&w=2
> [3] https://issues.apache.org/jira/browse/MESOS-2211
> [4] https://reviews.apache.org/r/30100/
> [5] 
> http://stackoverflow.com/questions/228783/what-are-the-rules-about-using-an-underscore-in-a-c-identifier
> 
> 
> --
> Alex
> 
> Theory is the first term in the Taylor series of practice. -- Thomas M
> Cover (1992)
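
For readers less familiar with the two mechanisms compared in the quoted proposal, here is a small illustration of what an include guard does. The header `example/widget.hpp`, the guard macro, and `Widget` are hypothetical; the header text is pasted inline twice so that the snippet is a single translation unit that mimics a double inclusion.

```cpp
// First inclusion of a hypothetical header `example/widget.hpp`.
#ifndef EXAMPLE_WIDGET_HPP
#define EXAMPLE_WIDGET_HPP

struct Widget {
  int id;
};

inline int widgetId(const Widget& widget) { return widget.id; }

#endif // EXAMPLE_WIDGET_HPP

// Second inclusion of the same header text. The guard macro is already
// defined, so the preprocessor skips the body; without the guard this
// would be a redefinition error.
#ifndef EXAMPLE_WIDGET_HPP
#define EXAMPLE_WIDGET_HPP

struct Widget {
  int id;
};

inline int widgetId(const Widget& widget) { return widget.id; }

#endif // EXAMPLE_WIDGET_HPP

// With `#pragma once`, the header would instead start with that single
// line and need neither the macro name nor the closing `#endif`.
```

The guard version depends on every header picking a unique macro name, which is where the namespacing conventions mentioned in the advantages list come from.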





Re: Backticks in comments

2015-11-04 Thread Bernd Mathiske
+1

> On Nov 2, 2015, at 8:34 PM, Isabel Jimenez wrote:
> 
> +1 for backticks, same comment as Kapil, really nice to be able to make a
> difference from string literals.
> 
> On Mon, Nov 2, 2015 at 11:26 AM, Kapil Arya  wrote:
> 
>> +1 for backticks. Also allows us to differentiate ordinary string literals
>> like names, etc., from code.
>> 
>> On Mon, Nov 2, 2015 at 2:18 PM, Marco Massenzio 
>> wrote:
>> 
>>> +1
>>> 
>>> I much favor using backticks everywhere for consistency, since (as you
>>> correctly pointed out) our Doxygen style requires that.
>>> Hopefully, over time, we will have the whole codebase consistent again
>>> (also an invite to folks, if you touch the code, to update comments
>>> accordingly).
>>> 
>>> BTW - unfortunately, Jira's markdown does not support backticks IIRC, but
>>> {{ }} to demarcate 'fixed font' in paragraphs (and {code} or {noformat}
>>> blocks for code snippets).
>>> 
>>> (RB uses "generally-accepted" markdown, though, so that's good!)
>>> 
>>> Thanks for raising awareness about this, Greg!
>>> 
>>> --
>>> *Marco Massenzio*
>>> Distributed Systems Engineer
>>> http://codetrips.com
>>> 
>>> On Mon, Nov 2, 2015 at 10:38 AM, Alex Clemmer
>>> <clemmer.alexan...@gmail.com> wrote:
>>> 
>>>> +1. Additional note is that this is now the de facto syntax for code
>>>> snippets on the rest of our tools, too, including RB and JIRA.
>>>> 
>>>> On Mon, Nov 2, 2015 at 10:32 AM, Greg Mann wrote:
>>>>> Hey folks!
>>>>> I wanted to bring up a style issue that I noticed recently. In some
>>>>> comments in the codebase, backticks are used to quote code excerpts
>>>>> and object names, while in other comments, single quotes are used.
>>>>> This doesn't seem to be documented in our style guide (nor in
>>>>> Google's), and I think it would be a good idea to establish a policy
>>>>> on this and document it, so that we can avoid wasted review cycles
>>>>> related to this in the future.
>>>>> 
>>>>> It's likely that the backtick convention began because Doxygen will
>>>>> render backtick-enclosed text in monospace, and for this reason I
>>>>> would propose that we consistently use backticks to quote code
>>>>> excerpts and object names in comments from now on. What does everyone
>>>>> else think?
>>>>> 
>>>>> Cheers,
>>>>> Greg
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Alex
>>>> 
>>>> Theory is the first term in the Taylor series of practice. -- Thomas M
>>>> Cover (1992)
>>>> 
>>> 
>> 
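
As an illustration of the convention discussed in this thread, a comment might look like the following. The function `nameLength` is a hypothetical example, not code from Mesos; the point is only that backticks mark code while single quotes remain free for ordinary string values.

```cpp
#include <cstddef>
#include <string>

/**
 * Returns the length of `name`.
 *
 * Backticks mark code (`name`, `std::string`), while quotes stay
 * available for ordinary literals such as the executor called 'default'.
 */
inline std::size_t nameLength(const std::string& name) {
  return name.size();
}
```

With Doxygen's Markdown support enabled, the backtick spans in such a comment should come out as inline code in the generated documentation.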





Re: RFC: license headers interfere with doxygen documentation (MESOS-3581)

2015-10-21 Thread Bernd Mathiske
If this means that c) requires a), then we should do a) first, and then c) 
incrementally.

> On Oct 21, 2015, at 10:23 AM, Benjamin Bannier wrote:
> 
> Hi Joseph,
> 
> yes, doing the right thing and having everything documented would make most 
> of this cleaner.
> 
> There is still an issue with e.g. namespaces (or anything else the particular 
> language allows to be extended later on):
> 
>     {foo.hpp}
>     /** Licensed ..
>      */
> 
>     /** Foo is doxygenized!
>      */
>     namespace foo {}
> 
>     {foo/bar.hpp}
>     /** Licensed ..
>      */
> 
>     namespace foo {
>     /** Bar is doxygenized!
>      */
>     struct Bar {};
>     }
> 
> Here the doxygen documentation for `foo` will contain both the license 
> header, and the namespace doc, so to prevent implicit inclusion of license 
> headers in the generated documentation one still needs to pick either of the 
> original options.
> 
> 
> Cheers,
> 
> Benjamin
> 
> 
> 
>> On Oct 20, 2015, at 11:49 PM, Joseph Wu  wrote:
>> 
>> +/- 0 (a) wouldn't hurt, but isn't the best solution.
>> 
>> 
>> I'd vote for adding actual comment blocks to each class.  Doxygen takes the
>> comment block immediately preceding the class and uses that as the
>> description.  This means a file like this would show up correctly on
>> Doxygen:
>> 
>> /**
>>  * License ...
>>  */
>> 
>> #include <...>
>> 
>> /**
>>  * Bar!  <- This is what would show up on Doxygen.
>>  * A lot of our existing classes don't have a comment block
>>  * so Doxygen takes the License instead :(
>>  */
>> class Foo {
>>   ...
>> };
>> 
>> ~Joseph
>> 
>> On Tue, Oct 20, 2015 at 2:32 PM, Marco Massenzio 
>> wrote:
>> 
>>> +1
>>> (and thanks for flagging this!)
>>> 
>>> --
>>> *Marco Massenzio*
>>> Distributed Systems Engineer
>>> http://codetrips.com
>>> 
>>> On Tue, Oct 20, 2015 at 12:14 PM, Joris Van Remoortere <
>>> jo...@mesosphere.io>
>>> wrote:
>>> 
>>>> +1 for (a).
>>>> 
>>>> 
>>>> —
>>>> *Joris Van Remoortere*
>>>> Mesosphere
>>>> 
>>>> On Tue, Oct 20, 2015 at 3:02 PM, Benjamin Mahler <
>>>> benjamin.mah...@gmail.com>
>>>> wrote:
>>>> 
>>>>> +1 for (a), in this case the wide sweep only touches the license
>>>>> comments, so it won't be disruptive to history.
>>>>> 
>>>>> On Tue, Oct 20, 2015 at 11:59 AM, James Peach wrote:
>>>>> 
>>>>>> 
>>>>>>> On Oct 20, 2015, at 8:55 AM, Bernd Mathiske wrote:
>>>>>>> 
>>>>>>> All, is changing every source code file prohibitive or not?
>>>>>>> 
>>>>>>>> On Oct 20, 2015, at 10:01 AM, Benjamin Bannier
>>>>>>>> <benjamin.bann...@mesosphere.io> wrote:
>>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I would like to ask for input on how we plan to fix (both short-
>>>>>>>> and long-term) the interference of the license headers and Doxygen
>>>>>>>> documentation (https://issues.apache.org/jira/browse/MESOS-3581).
>>>>>>>> 
>>>>>>>> Currently, and in line with the respective guidelines, license
>>>>>>>> blocks are wrapped in Javadoc-style comments which are also used for
>>>>>>>> Doxygen documentation. This leads to Doxygen interpreting license
>>>>>>>> headers as documentation for whatever entity follows them in the
>>>>>>>> code, and heavily clutters the generated documentation (see e.g.
>>>>>>>> http://mesos.apache.org/api/latest/c++/annotated.html). Given that
>>>>>>>> considerable effort is being made to improve the documentation, this
>>>>>>>> is unfortunate.
>>>>>>>> 
>>>>>>>> * * *
>>>>>>>> 
>>>>>>>> For a TL;DR of the Jira issue, there are two ways to fix this:
>>>>>>>> 
>>>>>>>> (a) change *all* license headers to be wrapped in e.g. `/* .. */`,
>>>>>>>> also update the coding guidelines, or
>>>>>>>> (b) perform some preprocessor-like magic in the Doxygen layer.
>>>>>>>> 
>>>>>>>> Option (a) is very noisy but obvious and stable; option (b) OTOH
>>>>>>>> employs a simple but stupid text replacement under the covers
>>>>>>>> codified in the Doxygen config; it might produce some artifacts and
>>>>>>>> be surprising since the code Doxygen sees will be different from
>>>>>>>> what is in the source.
>>>>>>>> 
>>>>>>>> I personally believe option (a) is superior for purely technical
>>>>>>>> reasons
>>>>>> 
>>>>>> +1 for (a); there's no value in showing license headers to doxygen or
>>>>>> tooling workarounds
>>>>>> 
>>>>>>>> with option (b) a possible temporary workaround.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> To make sure that the generated documentation shows actual
>>>>>>>> documentation content in overviews like
>>>>>>>> http://mesos.apache.org/api/latest/c++/annotated.html and elsewhere
>>>>>>>> we should fix this. Please comment in the Jira issue
>>>>>>>> (https://issues.apache.org/jira/browse/MESOS-3581) your input on how
>>>>>>>> you think this should be fixed (short- and long-term).
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> 
>>>>>>>> Benjamin
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
> 
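
To make option (a) from the quoted message concrete, here is a sketch of what a header could look like afterwards: the license sits in a plain `/* ... */` comment, which Doxygen does not treat as documentation, and each entity carries its own `/** ... */` block. The names `foo` and `Bar` follow the example in the quoted message; the license text and the `name` member are placeholders.

```cpp
/*
 * Licensed ... (placeholder for the license text). A plain comment:
 * Doxygen does not treat it as documentation.
 */

#include <string>

/**
 * Everything related to foo. This block, not the license, becomes the
 * description of `foo`.
 */
namespace foo {

/**
 * A bar with a name.
 */
struct Bar {
  std::string name;
};

} // namespace foo
```

This also shows why (a) and the per-class comment blocks are complementary: changing the license comment removes the clutter, while the `/** ... */` blocks supply the content that should appear instead.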





Re: RFC: license headers interfere with doxygen documentation (MESOS-3581)

2015-10-21 Thread Bernd Mathiske
Excellent idea! Let’s call this c)

It’s more work than a), but has to be done eventually anyway.

> On Oct 20, 2015, at 11:49 PM, Joseph Wu  wrote:
> 
> +/- 0 (a) wouldn't hurt, but isn't the best solution.
> 
> 
> I'd vote for adding actual comment blocks to each class.  Doxygen takes the
> comment block immediately preceding the class and uses that as the
> description.  This means a file like this would show up correctly on
> Doxygen:
> 
> /**
>  * License ...
>  */
> 
> #include <...>
> 
> /**
>  * Bar!  <- This is what would show up on Doxygen.
>  * A lot of our existing classes don't have a comment block
>  * so Doxygen takes the License instead :(
>  */
> class Foo {
>   ...
> };
> 
> ~Joseph
> 
> On Tue, Oct 20, 2015 at 2:32 PM, Marco Massenzio 
> wrote:
> 
>> +1
>> (and thanks for flagging this!)
>> 
>> --
>> *Marco Massenzio*
>> Distributed Systems Engineer
>> http://codetrips.com
>> 
>> On Tue, Oct 20, 2015 at 12:14 PM, Joris Van Remoortere <
>> jo...@mesosphere.io>
>> wrote:
>> 
>>> +1 for (a).
>>> 
>>> 
>>> —
>>> *Joris Van Remoortere*
>>> Mesosphere
>>> 
>>> On Tue, Oct 20, 2015 at 3:02 PM, Benjamin Mahler <
>>> benjamin.mah...@gmail.com>
>>> wrote:
>>> 
>>>> +1 for (a), in this case the wide sweep only touches the license
>>>> comments, so it won't be disruptive to history.
>>>> 
>>>> On Tue, Oct 20, 2015 at 11:59 AM, James Peach wrote:
>>>> 
>>>>> 
>>>>>> On Oct 20, 2015, at 8:55 AM, Bernd Mathiske wrote:
>>>>>> 
>>>>>> All, is changing every source code file prohibitive or not?
>>>>>> 
>>>>>>> On Oct 20, 2015, at 10:01 AM, Benjamin Bannier
>>>>>>> <benjamin.bann...@mesosphere.io> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I would like to ask for input on how we plan to fix (both short- and
>>>>>>> long-term) the interference of the license headers and Doxygen
>>>>>>> documentation (https://issues.apache.org/jira/browse/MESOS-3581).
>>>>>>> 
>>>>>>> Currently, and in line with the respective guidelines, license
>>>>>>> blocks are wrapped in Javadoc-style comments which are also used for
>>>>>>> Doxygen documentation. This leads to Doxygen interpreting license
>>>>>>> headers as documentation for whatever entity follows them in the
>>>>>>> code, and heavily clutters the generated documentation (see e.g.
>>>>>>> http://mesos.apache.org/api/latest/c++/annotated.html). Given that
>>>>>>> considerable effort is being made to improve the documentation, this
>>>>>>> is unfortunate.
>>>>>>> 
>>>>>>> * * *
>>>>>>> 
>>>>>>> For a TL;DR of the Jira issue, there are two ways to fix this:
>>>>>>> 
>>>>>>> (a) change *all* license headers to be wrapped in e.g. `/* .. */`,
>>>>>>> also update the coding guidelines, or
>>>>>>> (b) perform some preprocessor-like magic in the Doxygen layer.
>>>>>>> 
>>>>>>> Option (a) is very noisy but obvious and stable; option (b) OTOH
>>>>>>> employs a simple but stupid text replacement under the covers
>>>>>>> codified in the Doxygen config; it might produce some artifacts and
>>>>>>> be surprising since the code Doxygen sees will be different from
>>>>>>> what is in the source.
>>>>>>> 
>>>>>>> I personally believe option (a) is superior for purely technical
>>>>>>> reasons
>>>>> 
>>>>> +1 for (a); there's no value in showing license headers to doxygen or
>>>>> tooling workarounds
>>>>> 
>>>>>>> with option (b) a possible temporary workaround.
>>>>>>> 
>>>>>>> 
>>>>>>> To make sure that the generated documentation shows actual
>>>>>>> documentation content in overviews like
>>>>>>> http://mesos.apache.org/api/latest/c++/annotated.html and elsewhere
>>>>>>> we should fix this. Please comment in the Jira issue
>>>>>>> (https://issues.apache.org/jira/browse/MESOS-3581) your input on how
>>>>>>> you think this should be fixed (short- and long-term).
>>>>>>> 
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> 
>>>>>>> Benjamin
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: RFC: license headers interfere with doxygen documentation (MESOS-3581)

2015-10-20 Thread Bernd Mathiske
All, is changing every source code file prohibitive or not?
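
For context, a minimal sketch of what option (a) in the quoted message changes. Doxygen treats a Javadoc-style `/** ... */` block as documentation for the entity that follows it, while a plain `/* ... */` block is an ordinary comment. The struct and function names are made up for illustration and the license text is abbreviated.

```cpp
#include <cassert>

/**
 * Licensed under the Apache License, Version 2.0 (header abbreviated).
 * Because this block opens with two asterisks, Doxygen would attach it
 * to `Before` as if it were that struct's documentation.
 */
struct Before
{
  int value = 1;
};

/*
 * Licensed under the Apache License, Version 2.0 (header abbreviated).
 * A block opening with a single asterisk is a plain comment, so Doxygen
 * would leave `After` without this text.
 */
struct After
{
  int value = 2;
};

// Hypothetical helpers so the sketch can be compiled and exercised.
inline int beforeValue() { return Before().value; }
inline int afterValue() { return After().value; }
```

The compiler sees no difference between the two styles; only the documentation tooling does.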

> On Oct 20, 2015, at 10:01 AM, Benjamin Bannier 
>  wrote:
> 
> Hi,
> 
> I would like to ask for input on how we plan to fix (both short- and 
> longterm) the interference of the license headers and Doxygen documentation 
> (https://issues.apache.org/jira/browse/MESOS-3581).
> 
> Currently, and in line with the respective guidelines, license blocks are 
> wrapped in Javadoc-style comments which are also used for Doxygen 
> documentation. This leads to Doxygen interpreting license headers as 
> documentation for whatever entity follows them in the code, and heavily 
> clutters the generated documentation (see e.g. 
> http://mesos.apache.org/api/latest/c++/annotated.html). Given that 
> considerable effort is being made to improve the documentation, this is unfortunate.
> 
> * * *
> 
> For a TLDR; of the Jira issue, there are two ways to fix this:
> 
> (a) change *all* license headers to be wrapped in e.g. `/* .. */`, also 
> update the coding guidelines, or
> (b) perform some preprocessor-like magic in the Doxygen layer.
> 
> Option (a) is very noisy but obvious and stable; option (b) OTOH employs a 
> simple but stupid text replacement under the covers codified in the Doxygen 
> config; it might produce some artifacts and be surprising since the code 
> Doxygen sees will be different from what is in the source.
> 
> I personally believe option (a) is superior for purely technical reasons with 
> option (b) a possible temporary workaround.
> 
> 
> To make sure that the generated documentation shows actual documentation 
> content in overviews like 
> http://mesos.apache.org/api/latest/c++/annotated.html and elsewhere we should 
> fix this. Please comment in the Jira issue 
> (https://issues.apache.org/jira/browse/MESOS-3581) your input on how you 
> think this should be fixed (short- and longterm).
> 
> 
> Cheers,
> 
> Benjamin





Towards Mesos release 0.26.0

2015-10-19 Thread Bernd Mathiske
Dear Mesos fans,

this is the tracking ticket for the upcoming Mesos release:

https://issues.apache.org/jira/browse/MESOS-3758 


Please read the ticket’s description for further instructions and tips on how 
to get features and other improvements included.

The release managers will be Till and Bernd. If you have any questions, don’t 
hesitate to post them here or contact us directly:

{till, bernd}mesosphere.io

We aim at making the first cut with a tag around November 6 and at finalizing 
the release before November 17.

Best greetings,

Till and Bernd





Re: Community Sync Interval

2015-10-16 Thread Bernd Mathiske
+1 for bi-weekly. I have been assuming rotating times in the first place, as 
I suspect many others have as well.

(9am and 3pm are not very far apart. Probably better to go for 8am and 5pm or 
something like that.)

> On Oct 16, 2015, at 11:08 AM, haosdent  wrote:
> 
> If we rotate times, weekly is reasonable. I think 9am and 9pm (Pacific) are OK
> for UTC+8.
> 
> +1 weekly, with rotating times. And do we need to confirm host places?
> 
> On Fri, Oct 16, 2015 at 4:45 PM, Adam Bordelon  wrote:
> 
>> Wow, split crowd. Keep in mind that we also want to adjust the times to
>> better accommodate people in different time zones. This could mean
>> something like 9am, 3pm, 9pm, 3pm (Pacific). If we do one of these a week,
>> then we end up with bi-weekly 3pm meetings, for those on the west coast
>> that don't want to wake up early or stay up late. And we can still include
>> Europe and Asia with the early/late meetings, so they get to attend at
>> least one a month, hopefully with at least one committer present.
>> Nobody's expected to attend every meeting, but those who crave weekly
>> meetings have a chance.
>> 
>> +1 weekly, with rotating times. Let's decide soon, so we can get it on the
>> calendar!
>> 
>> On Thu, Oct 15, 2015 at 8:22 PM, Klaus Ma  wrote:
>> 
>>> +1 for weekly
>>> 
>>> Agree with a shorter weekly meeting to sync up, make progress step by
>> step
>>> 
>>> 
>>> On 2015年10月16日 10:49, Yong Qiao Wang wrote:
>>> 
 +1 for bi-weekly.
 
 Regards!
 
 Yong Qiao Wang(王勇桥)
 
 
 
 From:   Jojy Varghese 
 To: dev@mesos.apache.org
 Date:   16/10/2015 09:01
 Subject:Re: Community Sync Interval
 
 
 
 +1 bi-weekly
 
 
 On Oct 15, 2015, at 5:42 PM, Isabel Jimenez
> 
  wrote:
 
> +1 for bi-weekly
> 
> On Thu, Oct 15, 2015 at 4:18 PM, Alex Rukletsov 
> 
 wrote:
 
> +1 for bi-weekly.
>> 
>> On Fri, Oct 16, 2015 at 12:50 AM, Elizabeth Lingg
>> 
> >>> 
> wrote:
>> 
>> +1 Weekly
>>> 
>>> -Elizabeth
>>> 
>>> On Thursday, October 15, 2015, Greg Mann  wrote:
>>> 
>>> I agree with Daniel, +1 for weekly and we can re-evaluate in a bit to
 
>>> see
>> 
>>> how people are liking it.
 
 On Thu, Oct 15, 2015 at 3:33 PM, Guangya Liu >>> > wrote:
 
 +1 for bi-weekly
> 
> Thanks,
> 
> Guangya
> 
> On Fri, Oct 16, 2015 at 6:08 AM, Daniel Mercer <
> 
 dmer...@fantoccini.com
>> 
>>> >
 
> wrote:
> 
> +1 for weekly -- if this results in diminishing returns we can
>> 
> always
>> 
>>> reset
> 
>> to biweekly.
>> 
>> On Thu, Oct 15, 2015 at 2:44 PM, Kapil Arya > 
> > wrote:
 
> +1 for bi-weekly.
>>> 
>>> On Thu, Oct 15, 2015 at 4:40 PM, Jan Schlicht >> 
>> >
 
> wrote:
> 
>> +1 for weekly.
 
 On Thu, Oct 15, 2015 at 1:36 PM, Artem Harutyunyan <
 
>>> ar...@mesosphere.io >
>> 
>>> wrote:
 
 +1 for weekly.
> 
> On Thu, Oct 15, 2015 at 10:41 AM, haosdent <
> 
 haosd...@gmail.com
>> 
>>> >
 
> wrote:
>> 
>>> +1 for bi-weekly
>> 
>> On Fri, Oct 16, 2015 at 1:19 AM, Michael Park <
>> 
> mcyp...@gmail.com 
 
> wrote:
 
> We discussed whether the community syncs should be weekly
>>> 
>> or
>> 
>>> bi-weekly
>>> 
 (once every 2 weeks).
>>> 
>>> There were differing opinions on the subject during the
>>> 
>> community
> 
>> sync
>>> 
 today.
>>> 
>>> An argument for weekly: meetings can be shorter and
>>> 
>> missing
>> 
>>> a
>>> 
 meeting
>>> 
 won't
> 
>> be as big a deal as missing a longer meeting.
>>> 
>>> An argument for bi-weekly: there are many people involved
>>> 
>> in
>> 
>>> these
> 
>> meetings, we should keep it infrequent so that it reduces
>>> 
>> people's
> 
>> time
 
> commitments.
>>> 
>>> This email is intended to capture your +1s or other ideas
>>> 
>> you
>>> 
 might
>> 
>>> have!
> 
>> Thanks,
>>> 
>>> MPark.
>>> 
>>> 
>> 
>> --
>> Bes

More Project Structure in JIRA

2015-10-15 Thread Bernd Mathiske
Proposal: in extension of today’s limited two-level (epic, task) approach, make 
full use of expressive power already available in JIRA to provide more 
structure for larger projects to facilitate planning, tracking, and reporting. 
This will facilitate dynamic planning of sub-projects, which will make us 
more agile.

The general idea is to use links between epics to provide a recursive 
hierarchical structure, with which one can span trees or DAGs of arbitrarily 
large projects. This does not mean that we want to plan everything in minute 
detail before starting to work. On the contrary.

You can start anywhere in the eventual tree and express part of the overall 
effort, maybe say a short epic with a few task tickets. Then you can LATER make 
this epic a dependency for a larger effort.

Conversely, you can subdivide a task in the epic into subtasks. However, this 
does not mean that you have to literally use the feature “subtask” in JIRA for 
this. Instead, staying recursive in our JIRA grammar, so to speak, convert the 
task to an epic and then create ordinary tasks in it to represent subtasks.

Now the task cannot be a task in its parent epic anymore. We fix this by 
putting in a link of type "blocks" to the parent. When you then look at the 
parent, it still holds a number of tasks, and it has one dependency on an epic 
(to which you can add more).

Thus our dependency tree can grow in all directions. You can also rearrange and 
update it in any shape or form if necessary.

Overall, we only use two JIRA elements: epics and tasks (of different flavors 
such as bugs, improvements, etc.). Tasks are the leaves, everything else is an 
epic. Review requests only ever happen for tasks.
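
To make the structure concrete, here is a minimal sketch of the idea as a data model. This is not JIRA's API; `Ticket`, `promoteToEpic` and `countLeafTasks` are hypothetical names, and the sketch assumes a tree (in a DAG, shared sub-epics would be counted more than once).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a ticket: either a leaf task or an epic.
struct Ticket
{
  explicit Ticket(std::string k, bool epic = false)
    : key(std::move(k)), isEpic(epic) {}

  std::string key;
  bool isEpic;
  std::vector<Ticket*> tasks;     // Tasks held directly by this epic.
  std::vector<Ticket*> blockedBy; // Epics linked to this one via "blocks".
};

// Converts `task` into an epic: it leaves the parent's task list and is
// attached to the parent through a "blocks" link instead.
inline void promoteToEpic(Ticket& parent, Ticket& task)
{
  std::vector<Ticket*>& tasks = parent.tasks;
  tasks.erase(std::remove(tasks.begin(), tasks.end(), &task), tasks.end());
  task.isEpic = true;
  parent.blockedBy.push_back(&task);
}

// Counts the leaf tasks reachable from `epic` by following "blocks" links.
inline std::size_t countLeafTasks(const Ticket& epic)
{
  std::size_t count = epic.tasks.size();
  for (const Ticket* child : epic.blockedBy) {
    count += countLeafTasks(*child);
  }
  return count;
}
```

Promoting a task moves it from `tasks` to `blockedBy` on the parent, after which it can hold its own tasks, so the tree can grow at any node.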

The epics are there to provide a high level view and to allow dynamic (“more 
agilish”, non-waterfall) planning. Granted, you’d also use a tree if you did 
waterfall. The difference is that you’d spec it all out at once. My observation 
is that quite a few of us do exactly this - outside JIRA - and then try to 
remember what tickets are where in their tree. Let’s make this part of JIRA!

Why not use labels? Because they are in a flat name space and we want to 
represent tree structure. How would you know that a label denotes a subproject 
of another label? By memorizing or by depicting a tree outside JIRA. Why not 
use components? Same problem as with labels: flat name space. We can use labels 
and components for many other purposes. Separate discussion.

Aren’t we doing this already? Probably. I have not checked thoroughly. There 
may occasionally be epics that link to other epics. If so, I would merely like 
to encourage us to use this powerful expressive means more often.

Bernd





Re: [VOTE] Release Apache Mesos 0.25.0 (rc2)

2015-10-08 Thread Bernd Mathiske
I suppose this makes my vote +1 binding, assuming the cherry-picking happens.

> On Oct 8, 2015, at 3:37 PM, Bernd Mathiske  wrote:
> 
> I have the exact same result as Greg.
> 
>> On Oct 8, 2015, at 12:14 AM, Greg Mann  wrote:
>> 
>> Successfully built `sudo make distcheck` on CentOS 7.1 and Ubuntu 14.04
>> with only expected test failures. On our Fedora 22 CI build, however, while
>> the tests are building the following compile-time error is produced:
>> 
>> [17:18:46][Step 4/6]   CXX
>> tests/containerizer/mesos_tests-composing_containerizer_tests.o
>> 
>> [17:18:48][Step 4/6] In file included from
>> ../../../src/tests/values_tests.cpp:22:0:
>> 
>> [17:18:48][Step 4/6]
>> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In
>> instantiation of ‘testing::AssertionResult
>> testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const
>> T2&) [with T1 = int; T2 = long unsigned int]’:
>> 
>> [17:18:48][Step 4/6]
>> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1484:23:
>> required from ‘static testing::AssertionResult
>> testing::internal::EqHelper::Compare(const char*,
>> const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int;
>> bool lhs_is_null_literal = false]’
>> 
>> [17:18:48][Step 4/6] ../../../src/tests/values_tests.cpp:149:3:   required
>> from here
>> 
>> [17:18:48][Step 4/6]
>> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1448:16:
>> error: comparison between signed and unsigned integer expressions
>> [-Werror=sign-compare]
>> 
>> [17:18:48][Step 4/6]if (expected == actual) {
>> 
>> [17:18:48][Step 4/6] ^
>> 
>> 
>> Cherry-picking one commit (bfeb070a2aef52f445e "Fixed compiler warning in
>> values test.") resolves this issue.
>> 
>> 
>> 
>> On Wed, Oct 7, 2015 at 2:32 AM, Joris Van Remoortere 
>> wrote:
>> 
>>> +1 (binding)
>>> 
>>> On Mon, Oct 5, 2015 at 11:12 PM, Niklas Nielsen 
>>> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> Please vote on releasing the following candidate as Apache Mesos 0.25.0.
>>>> 
>>>> 
>>>> 
>>>> 0.25.0 includes the following:
>>>> 
>>>> 
>>>> 
>>>> 
>>>> * [MESOS-1474] - Experimental support for maintenance primitives.
>>>> 
>>>> * [MESOS-2600] - Added master endpoints /reserve and /unreserve for
>>>> dynamic reservations.
>>>> 
>>>> * [MESOS-2044] - Extended Module APIs to enable IP per container
>>>> assignment, isolation and resolution.
>>>> 
>>>> 
>>>> ** Bug fixes
>>>> 
>>>> * [MESOS-2635] - Web UI Display Bug when starting lots of tasks with
>>>> small cpu value.
>>>> 
>>>> * [MESOS-2986] - Docker version output is not compatible with Mesos.
>>>> 
>>>> * [MESOS-3046] - Stout's UUID re-seeds a new random generator during
>>>> each call to UUID::random.
>>>> 
>>>> * [MESOS-3051] - performance issues with port ranges comparison.
>>>> 
>>>> * [MESOS-3052] - Allocator performance issue when using a large number
>>>> of filters.
>>>> 
>>>> * [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken.
>>>> 
>>>> * [MESOS-3169] - FrameworkInfo should only be updated if the
>>>> re-registration is valid.
>>>> 
>>>> * [MESOS-3185] - Refactor Subprocess logic in linux/perf.cpp to use
>>>> common subroutine.
>>>> 
>>>> * [MESOS-3239] - Refactor master HTTP endpoints help messages such that
>>>> they cannot be out of sync.
>>>> 
>>>> * [MESOS-3245] - The comments of DRFSorter::dirty is not correct.
>>>> 
>>>> * [MESOS-3254] - Cgroup CHECK fails test harness.
>>>> 
>>>> * [MESOS-3258] - Remove Frameworkinfo capabilities on re-registration.
>>>> 
>>>> * [MESOS-3261] - Move QoS plug-ins to a specified folder like
>>>> resource_estimator.
>>>> 
>>>> * [MESOS-3269] - The comments of Master::updateSlave() is not correct.
>>>> 
>>>> * [MESOS-3282] - Web UI no longer shows Tasks information.
>>>> 
>>>> * [MES

Re: [VOTE] Release Apache Mesos 0.25.0 (rc2)

2015-10-08 Thread Bernd Mathiske
I have the exact same result as Greg.
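
For readers unfamiliar with the error quoted below: it appears when an equality helper is instantiated with a signed and an unsigned type (here `int` and `long unsigned int`) while `-Werror=sign-compare` is active. The following is a minimal sketch of the pattern and of one common remedy. It does not claim to reproduce the cherry-picked commit; `equalValues` is a simplified stand-in for gtest's comparison helper, and `sizeMatches` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for gtest's templated equality helper: both operand
// types are deduced independently, so they can differ in signedness.
template <typename T1, typename T2>
bool equalValues(const T1& expected, const T2& actual)
{
  return expected == actual;
}

inline bool sizeMatches(const std::vector<int>& values)
{
  // `equalValues(3, values.size())` would deduce T1 = int and
  // T2 = std::size_t, which triggers -Wsign-compare inside the helper.
  // Giving the literal an unsigned type keeps both operands unsigned.
  return equalValues(3u, values.size());
}
```

The same effect can be had with `static_cast<std::size_t>(3)` if the unsigned suffix is considered too subtle.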

> On Oct 8, 2015, at 12:14 AM, Greg Mann  wrote:
> 
> Successfully built `sudo make distcheck` on CentOS 7.1 and Ubuntu 14.04
> with only expected test failures. On our Fedora 22 CI build, however, while
> the tests are building the following compile-time error is produced:
> 
> [17:18:46][Step 4/6]   CXX
> tests/containerizer/mesos_tests-composing_containerizer_tests.o
> 
> [17:18:48][Step 4/6] In file included from
> ../../../src/tests/values_tests.cpp:22:0:
> 
> [17:18:48][Step 4/6]
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In
> instantiation of ‘testing::AssertionResult
> testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const
> T2&) [with T1 = int; T2 = long unsigned int]’:
> 
> [17:18:48][Step 4/6]
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1484:23:
>  required from ‘static testing::AssertionResult
> testing::internal::EqHelper::Compare(const char*,
> const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int;
> bool lhs_is_null_literal = false]’
> 
> [17:18:48][Step 4/6] ../../../src/tests/values_tests.cpp:149:3:   required
> from here
> 
> [17:18:48][Step 4/6]
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1448:16:
> error: comparison between signed and unsigned integer expressions
> [-Werror=sign-compare]
> 
> [17:18:48][Step 4/6]if (expected == actual) {
> 
> [17:18:48][Step 4/6] ^
> 
> 
> Cherry-picking one commit (bfeb070a2aef52f445e "Fixed compiler warning in
> values test.") resolves this issue.
> 
> 
> 
> On Wed, Oct 7, 2015 at 2:32 AM, Joris Van Remoortere 
> wrote:
> 
>> +1 (binding)
>> 
>> On Mon, Oct 5, 2015 at 11:12 PM, Niklas Nielsen 
>> wrote:
>> 
>>> Hi all,
>>> 
>>> Please vote on releasing the following candidate as Apache Mesos 0.25.0.
>>> 
>>> 
>>> 
>>> 0.25.0 includes the following:
>>> 
>>> 
>>> 
>>> 
>>> * [MESOS-1474] - Experimental support for maintenance primitives.
>>> 
>>> * [MESOS-2600] - Added master endpoints /reserve and /unreserve for
>>> dynamic reservations.
>>> 
>>> * [MESOS-2044] - Extended Module APIs to enable IP per container
>>> assignment, isolation and resolution.
>>> 
>>> 
>>> ** Bug fixes
>>> 
>>>  * [MESOS-2635] - Web UI Display Bug when starting lots of tasks with
>>> small cpu value.
>>> 
>>>  * [MESOS-2986] - Docker version output is not compatible with Mesos.
>>> 
>>>  * [MESOS-3046] - Stout's UUID re-seeds a new random generator during
>>> each call to UUID::random.
>>> 
>>>  * [MESOS-3051] - performance issues with port ranges comparison.
>>> 
>>>  * [MESOS-3052] - Allocator performance issue when using a large number
>>> of filters.
>>> 
>>>  * [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken.
>>> 
>>>  * [MESOS-3169] - FrameworkInfo should only be updated if the
>>> re-registration is valid.
>>> 
>>>  * [MESOS-3185] - Refactor Subprocess logic in linux/perf.cpp to use
>>> common subroutine.
>>> 
>>>  * [MESOS-3239] - Refactor master HTTP endpoints help messages such that
>>> they cannot be out of sync.
>>> 
>>>  * [MESOS-3245] - The comments of DRFSorter::dirty is not correct.
>>> 
>>>  * [MESOS-3254] - Cgroup CHECK fails test harness.
>>> 
>>>  * [MESOS-3258] - Remove Frameworkinfo capabilities on re-registration.
>>> 
>>>  * [MESOS-3261] - Move QoS plug-ins to a specified folder like
>>> resource_estimator.
>>> 
>>>  * [MESOS-3269] - The comments of Master::updateSlave() is not correct.
>>> 
>>>  * [MESOS-3282] - Web UI no longer shows Tasks information.
>>> 
>>>  * [MESOS-3344] - Add more comments for strings::internal::fmt.
>>> 
>>>  * [MESOS-3351] - duplicated slave id in master after master failover.
>>> 
>>>  * [MESOS-3387] - Refactor MesosContainerizer to accept namespace
>>> dynamically.
>>> 
>>>  * [MESOS-3408] - Labels field of FrameworkInfo should be added into v1
>>> mesos.proto.
>>> 
>>>  * [MESOS-3411] - ReservationEndpointsTest.AvailableResources appears to
>>> be faulty.
>>> 
>>>  * [MESOS-3423] - Perf event isolator stops performing sampling if a
>>> single timeout occurs.
>>> 
>>>  * [MESOS-3426] - process::collect and process::await do not perform
>>> discard propagation.
>>> 
>>>  * [MESOS-3430] -
>>> LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithoutRootFilesystem
>>> fails on CentOS 7.1.
>>> 
>>>  * [MESOS-3450] - Update Mesos C++ Style Guide for namespace usage.
>>> 
>>>  * [MESOS-3451] - Failing tests after changes to
>>> Isolator/MesosContainerizer API.
>>> 
>>>  * [MESOS-3458] - Segfault when accepting or declining inverse offers.
>>> 
>>>  * [MESOS-3474] - ExamplesTest.{TestFramework, JavaFramework,
>>> PythonFramework} failed on CentOS 6.
>>> 
>>>  * [MESOS-3489] - Add support for exposing Accept/Decline responses for
>>> inverse offers.
>>> 
>>>  * [MESOS-3490] - Mesos UI fails to represent JSON entities.
>>> 
>>>  * [MESOS-3512] - D

Re: MesosCon EU Hackathon hot list?

2015-10-06 Thread Bernd Mathiske
Hi Soheila,

I have worked on the fetcher cache, but I won’t be at MesosCon Europe. 
Nevertheless, I am interested in this problem area, so I want to support your 
effort, also after MesosCon.

Bernd

> On Oct 5, 2015, at 6:49 PM, Soheila Dehghanzadeh  wrote:
> 
> Hi Casey, All,
> 
> As mentioned in [1] :
> "Unfortunately, there is no mechanism to refresh a cache entry in the
> current experimental version of the fetcher cache."
> 
> So in the community there is a need for a maintenance policy to keep the
> cache up-to-date and I would like to solve this problem as a 1-day hackaton
> project. So if any of the Mesos Fetcher Cache folks are attending the
> hackathon I would like to team up with them and solve this problem. In my
> phd I proposed some optimizations of increasing the freshness of the task
> outcome when cache data providers apply some restrictions on the number of
> accesses which is briefly explained here [2]. So  after we designed the
> maintenance policy, we can optimized it for these constraints based
> accesses using this idea as well.
> 
> [1].http://mesos.apache.org/documentation/latest/fetcher/
> [2].http://www.www2015.it/documents/proceedings/companion/p25.pdf
> 
> -Thanks.
> -Soheila





Re: Mesos Style Guideline Adjustments

2015-09-10 Thread Bernd Mathiske
+1
> On Sep 10, 2015, at 4:21 PM, tommy xiao  wrote:
> 
> +1
> 
> 2015-09-10 9:44 GMT+08:00 Marco Massenzio :
> 
>> +1
>> 
>> 
>> 
>> 
>> Thanks, Michael!
>> 
>> 
>> 
>> —
>> Sent from my iPhone, which is not as good as you'd hope to fix trypos n
>> abbrvtn.
>> 
>> On Wed, Sep 9, 2015 at 6:23 PM, Michael Park  wrote:
>> 
>>> I've removed the 70 column restriction on comments from the style guide:
>>> 
>> https://github.com/apache/mesos/commit/f9c2604ea97b91f8a9ec3b2863317761679b1c86
>>> Also, based on the comments, it seems like we should allow 80 column
>>> comments but omit the sweeping change.
>>> Thanks,
>>> MPark.
>>> On Wed, Aug 12, 2015 at 6:13 PM Marco Massenzio 
>> wrote:
>>>> On Wed, Aug 12, 2015 at 4:09 AM, Bernd Mathiske 
>>>> wrote:
>>>> 
>>>>> Like BenM,
>>>>> 
>>>>> +1 on allowing 80 column comments
>>>>> 
>>>> +1
>>>> (it really IS annoying having to keep an eye on the bottom column
>> counter
>>>> when typing comments :)
>>>> 
>>>> 
>>>>> -1 on sweeping changes; incremental changes when touching old comments
>>>>> will do IMHO
>>>>> 
>>>>> +1 on the -1? :)
>>>> Incremental changes are good and I doubt anyone will be "confused" by
>> them.
>>>> 
>>>> 
>>>>> Bernd
>>>>> 
>>>>>> On Aug 12, 2015, at 12:51 AM, Michael Park 
>> wrote:
>>>>>> 
>>>>>> Ben, thanks for your input!
>>>>>> 
>>>>>> Another update on this topic: the patches around break before braces
>>>> for
>>>>>> *enum* style and overloaded operators have been committed.
>>>>>> 
>>>>>> On Tue, Aug 11, 2015 at 6:23 PM Benjamin Mahler <
>>>>> benjamin.mah...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> We already don't necessarily wrap at 70 characters (often we wrap
>>>>> before 70
>>>>>>> to reduce "jaggedness" or to make it look cleaner).
>>>>>>> 
>>>>>>> So with the change to 80, this still makes all existing comments
>>>> valid.
>>>>> We
>>>>>>> can still encourage folks to write paragraphs in a way that is
>> easy to
>>>>>>> digest for the reader. That is, I think we should still be trying
>> not
>>>> to
>>>>>>> write jagged paragraphs of comments, it's just not a hard stylistic
>>>>>>> violation given we don't have an algorithm for this.
>>>>>>> 
>>>>>>> So +1 to relaxing the hard 70 character rule, but -1 to sweeping
>>>> across
>>>>> all
>>>>>>> the comments or doing wrapping based only on line length rather
>> than
>>>>>>> jaggedness going forward.
>>>>>>> 
>>>>>>> On Sat, Aug 8, 2015 at 3:25 PM, Joris Van Remoortere <
>>>>> jo...@mesosphere.io>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> I will volunteer to update all the comments to wrap at 80 if we
>> agree
>>>>> to
>>>>>>>> keep the code base consistent.
>>>>>>>> Naturally that is also my vote ;-)
>>>>>>>> Joris
>>>>>>>> 
>>>>>>>>> On Aug 8, 2015, at 1:40 PM, Michael Park 
>> wrote:
>>>>>>>>> 
>>>>>>>>> An update on this topic since we covered it at the community
>>>> developer
>>>>>>>> sync.
>>>>>>>>> 
>>>>>>>>> 1. We will adopt *Mozilla*'s *BreakBeforeBraces* style as their
>>>> style
>>>>>>>> is
>>>>>>>>> equivalent to ours. The only change this entails for our
>> codebase
>>>> is
>>>>>>> to
>>>>>>>>> consistently wrap the braces for *enum* definitions, as we're
>>>>>>> currently
>>>>>>>>> inconsistent. I've taken on the work involved in this change:
>>>>>>>>>- stout: https://r

Re: [VOTE] Release Apache Mesos 0.24.0 (rc2)

2015-09-04 Thread Bernd Mathiske
And also Ubuntu 13.10: [  FAILED  ] ExamplesTest.PythonFramework, known flaky 
test, so still +1

> On Sep 4, 2015, at 9:11 PM, Bernd Mathiske  wrote:
> 
> +1 [binding]
> 
> MacOS X (make check)
> CentOS 7 (make distcheck)
> Ubuntu 14.04 (make distcheck)
> 
> 
>> On Sep 3, 2015, at 11:47 PM, Niklas Nielsen <nik...@mesosphere.io> wrote:
>> 
>> +1 - tested on our CI
>> 
>> On Tuesday, September 1, 2015, Vinod Kone <vinodk...@apache.org> wrote:
>> Hi all,
>> 
>> 
>> Please vote on releasing the following candidate as Apache Mesos 0.24.0.
>> 
>> 
>> 0.24.0 includes the following:
>> 
>> 
>> 
>> Experimental support for v1 scheduler HTTP API!
>> 
>> This release also wraps up support for fetcher.
>> 
>> The CHANGELOG for the release is available at:
>> 
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.24.0-rc2
>> 
>> 
>> 
>> 
>> The candidate for Mesos 0.24.0 release is available at:
>> 
>> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz 
>> 
>> 
>> The tag to be voted on is 0.24.0-rc2:
>> 
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.24.0-rc2 
>> 
>> 
>> The MD5 checksum of the tarball can be found at:
>> 
>> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz.md5
>> 
>> 
>> The signature of the tarball can be found at:
>> 
>> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz.asc
>> 
>> 
>> The PGP key used to sign the release is here:
>> 
>> https://dist.apache.org/repos/dist/release/mesos/KEYS 
>> 
>> 
>> The JAR is up in Maven in a staging repository here:
>> 
>> https://repository.apache.org/content/repositories/orgapachemesos-1066 
>> 
>> 
>> Please vote on releasing this package as Apache Mesos 0.24.0!
>> 
>> 
>> The vote is open until Fri Sep  4 17:33:05 PDT 2015 and passes if a
>> majority of at least 3 +1 PMC votes are cast.
>> 
>> 
>> [ ] +1 Release this package as Apache Mesos 0.24.0
>> 
>> [ ] -1 Do not release this package because ...
>> 
>> 
>> Thanks,
>> 
>> Vinod
> 





Re: [VOTE] Release Apache Mesos 0.24.0 (rc2)

2015-09-04 Thread Bernd Mathiske
+1 [binding]

MacOS X (make check)
CentOS 7 (make distcheck)
Ubuntu 14.04 (make distcheck)


> On Sep 3, 2015, at 11:47 PM, Niklas Nielsen  wrote:
> 
> +1 - tested on our CI
> 
> On Tuesday, September 1, 2015, Vinod Kone wrote:
> Hi all,
> 
> 
> Please vote on releasing the following candidate as Apache Mesos 0.24.0.
> 
> 
> 0.24.0 includes the following:
> 
> 
> 
> Experimental support for v1 scheduler HTTP API!
> 
> This release also wraps up support for fetcher.
> 
> The CHANGELOG for the release is available at:
> 
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.24.0-rc2
>  
> 
> 
> 
> 
> 
> The candidate for Mesos 0.24.0 release is available at:
> 
> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz 
> 
> 
> 
> The tag to be voted on is 0.24.0-rc2:
> 
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.24.0-rc2 
> 
> 
> 
> The MD5 checksum of the tarball can be found at:
> 
> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz.md5
>  
> 
> 
> 
> The signature of the tarball can be found at:
> 
> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz.asc
>  
> 
> 
> 
> The PGP key used to sign the release is here:
> 
> https://dist.apache.org/repos/dist/release/mesos/KEYS 
> 
> 
> 
> The JAR is up in Maven in a staging repository here:
> 
> https://repository.apache.org/content/repositories/orgapachemesos-1066 
> 
> 
> 
> Please vote on releasing this package as Apache Mesos 0.24.0!
> 
> 
> The vote is open until Fri Sep  4 17:33:05 PDT 2015 and passes if a
> majority of at least 3 +1 PMC votes are cast.
> 
> 
> [ ] +1 Release this package as Apache Mesos 0.24.0
> 
> [ ] -1 Do not release this package because ...
> 
> 
> Thanks,
> 
> Vinod





Re: Lets stop using the CHECK macro in the test harness.

2015-08-14 Thread Bernd Mathiske
+1, but…

If we are going to touch all our tests, then IMHO while we are at it we might as 
well move to something better than the current mechanism, in which the macros 
abort a test with a local return from a void function.

If we used exceptions instead, it should be easy to catch those in a wrapper 
somewhere in the test class and then we could install some test-class-specific 
or test-superclass-specific or test-specific, but NOT macro-specific (!), extra 
code that prints out extra diagnostic info iff the test has failed. For 
example, it can dump the contents of the sandbox.
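
A minimal sketch of what such a wrapper could look like, independent of gtest. All names here (`TestFailure`, `require`, `runTest`, `parsePort`) are hypothetical and only illustrate the control flow; this is not existing Mesos or gtest code.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical exception type signalling a failed test assertion.
struct TestFailure : std::runtime_error
{
  explicit TestFailure(const std::string& message)
    : std::runtime_error(message) {}
};

// Hypothetical assertion helper. Because it throws instead of returning,
// it can be used at any nesting depth and in functions of any return type.
inline void require(bool condition, const std::string& message)
{
  if (!condition) {
    throw TestFailure(message);
  }
}

// Wrapper around a test body: the failure hook runs only if the body
// failed, e.g. to dump the contents of the sandbox.
inline bool runTest(
    const std::function<void()>& body,
    const std::function<void()>& onFailure)
{
  try {
    body();
    return true;
  } catch (const TestFailure&) {
    onFailure();
    return false;
  }
}

// A nested helper with a non-void return type can still abort the test.
inline int parsePort(const std::string& text)
{
  require(!text.empty(), "port must not be empty");
  return std::stoi(text);
}
```

The hook is attached once per test (or per test class) rather than per macro, which is the point of the proposal.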

If you really don’t like exceptions (?), we could make tests return a value 
instead of void and make the macros indicate that the test failed that way. 
Then we could also have a failure hook in the wrapper. (This is somewhat 
inferior, because it still does not support using the macros in nested 
methods/functions with a different return type. So I prefer exceptions. Not 
saying I want them everywhere in Mesos. Just in tests! 2c)

I have started to modify one individual AWAIT macro for the above purpose, 
because I need more info when an unreproducible flaky fetcher cache test 
failure happens. I need the extra info dump then and there when it happens, 
because I cannot get to it later. The problem with this approach is that it 
instruments only one macro. Here is what it looks like (ugly!):

#define AWAIT_READY_FOR_WITH_FAILURE_HOOK(actual, duration, onFailureHook) \
  GTEST_PRED_FORMAT2_( \
      AwaitAssertReady, \
      actual, \
      duration, \
      return [=](const char* message) { \
        onFailureHook(); \
        GTEST_FATAL_FAILURE_(message); \
      })

#define AWAIT_READY_WITH_FAILURE_HOOK(actual, onFailureHook) \
  AWAIT_READY_FOR_WITH_FAILURE_HOOK(actual, Seconds(15), onFailureHook)

Then I use it this way:

AWAIT_READY_WITH_FAILURE_HOOK(
    someFuture,
    [=]() { logSandbox(); }); // action on failure by timeout

Before you ask, this approach does not work, i.e. the output on failure does 
not happen:

AWAIT_READY(someFuture
  .onFailed([=]() { logSandbox(); }));

---

Always printing out all the info would IMHO be prohibitively verbose.

An alternative would be to have two log files, one at a lower log level for 
success only and one with the highest log level, which buildbot prints only in 
case something bad happens. etc. We could overwrite the latter after each 
individual test. This approach would mean extra work on the logging system.

Opinions?

Of course, Paul’s proposal should be tackled in any case. :-)

Bernd

> On Aug 15, 2015, at 1:24 AM, Marco Massenzio  wrote:
> 
> +1
> 
> *Marco Massenzio*
> 
> *Distributed Systems Engineer* http://codetrips.com
> 
> On Fri, Aug 14, 2015 at 3:46 PM, Paul Brett 
> wrote:
> 
>> We are currently using the Google log CHECK macros (CHECK_SOME,
>> CHECK_NOTNULL etc) in the test harness, usually to verify test setup.  When
>> these checks fail, it causes the test harness to abort rather than simply
>> move onto the next test. The abort prevents any subsequent tests from
>> running, hiding errors and preventing the generation of the XML test
>> report.
>> 
>> I would like to propose that we eliminate the use of CHECK in the test
>> harness and replace it with the appropriate Google test macros to fail the
>> test case. I am not proposing that we change the use of CHECK outside the
>> test harness (although CHECK calls in master and slave can also kill the
>> test harness).
>> 
>> For void functions, CHECK can easily be replaced with the corresponding
>> ASSERT equivalent.
>> 
>> For non-void functions, ASSERT cannot be used because it does not return the
>> correct data type and hence we need to use a combination of ADD_FAILURE()
>> and return.
>> 
>> For example:
>> 
>>     CHECK(foo)
>> 
>> would become:
>> 
>>     if (!foo) {
>>       ADD_FAILURE();
>>       return anything;
>>     }
>> 
>> If there is general agreement, I will raise tickets to update the Mesos
>> testing patterns document and each of the test cases.
>> 
>> Thanks
>> 
>> 
>> -- Paul Brett
>> 



Re: Prepping for next release

2015-08-14 Thread Bernd Mathiske
I just committed it. Thanks, James!

> On Aug 13, 2015, at 9:53 PM, James DeFelice  wrote:
> 
> Hi Vinod,
> 
> Would *really* like to see https://issues.apache.org/jira/browse/MESOS-2841
> in 0.24.0. Currently in review.
> 
> Any chance that can make it in?
> 
> 
> On Wed, Aug 12, 2015 at 1:16 PM, Vinod Kone  wrote:
> 
>> Removed the target versions for all unresolved tickets (except for HTTP
>> scheduler API ones) targeted for 0.24.0
>> 
>> 
>> Hoping to cut an RC tomorrow.
>> 
>> On Wed, Aug 5, 2015 at 11:31 AM, Vinod Kone  wrote:
>> 
>>> Hi,
>>> 
>>> The tracking ticket for the 0.24.0 release is
>>> https://issues.apache.org/jira/browse/MESOS-2562
>>> 
>>> The main feature of this release is going to be the v1 (beta) release of the
>>> HTTP scheduler API.
>>> 
>>> Hoping to cut an RC early next week, so if there's anything you
>> absolutely
>>> need to be in 0.24.0 please land them by EOW.
>>> 
>>> Thanks,
>>> 
>>> On Tue, Jul 21, 2015 at 4:10 PM, Adam Bordelon 
>> wrote:
>>> 
 Thanks, Vinod.
 
 I've got a handful of JIRAs I'd really like to see land in 0.24.0.
 https://issues.apache.org/jira/browse/MESOS-2559 Do not use
 RunTaskMessage.framework_id.
 https://issues.apache.org/jira/browse/MESOS-2600 Add /reserve and
 /unreserve endpoints on the master for dynamic reservation
 https://issues.apache.org/jira/browse/MESOS-2998 Disable Persistent
 Volumes, Dynamic Reservations via master flags
 https://issues.apache.org/jira/browse/MESOS-3050 Failing ROOT_ tests in
 0.23.0-rc3 on CentOS 7.1
 https://issues.apache.org/jira/browse/MESOS-3079 `sudo make distcheck`
 fails on Ubuntu 14.04 (and possibly other OSes too)
 
 I understand your desire to untarget the majority of the tickets, since
 it's a time-based release, but we might want to keep some of these
 targeted
 so we can track the priority issues. When the actual rc1 cut date
 approaches, it's pretty easy to aggressively push things out of the
 release
 that haven't made it. Let me know what you think.
 
 Cheers,
 -A-
 
 
 On Tue, Jul 21, 2015 at 11:02 AM, Vinod Kone 
>> wrote:
 
> Hi folks,
> 
> I will be the release manager for the upcoming release (ETA early
 August).
> 
> To prep for the release (and make my life easy) I'm planning to remove
 the
> target versions for all *unresolved* tickets that have a target
>> version
> 0.24.0.
> 
> I would like folks to explicitly set the target version to 0.24.0* for
> tickets they want to absolutely land in the next release (keeping in
 mind
> the time frame). If you are unsure, please reach out to me or reply to
 this
> thread.
> 
> The main blocking feature for this release is going to be the new HTTP
>> API.
> 
> Thanks,
> Vinod
> 
> P.S. If things go according to plan we might make this 1.0 release!
> 
 
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> James DeFelice
> 585.241.9488 (voice)
> 650.649.6071 (fax)



Re: Mesos Style Guideline Adjustments

2015-08-12 Thread Bernd Mathiske
Like BenM, 

+1 on allowing 80 column comments
-1 on sweeping changes; incremental changes when touching old comments will do 
IMHO

Bernd

> On Aug 12, 2015, at 12:51 AM, Michael Park  wrote:
> 
> Ben, thanks for your input!
> 
> Another update on this topic: the patches around break before braces for
> *enum* style and overloaded operators have been committed.
> 
> On Tue, Aug 11, 2015 at 6:23 PM Benjamin Mahler 
> wrote:
> 
>> We already don't necessarily wrap at 70 characters (often we wrap before 70
>> to reduce "jaggedness" or to make it look cleaner).
>> 
>> So with the change to 80, this still makes all existing comments valid. We
>> can still encourage folks to write paragraphs in a way that is easy to
>> digest for the reader. That is, I think we should still be trying not to
>> write jagged paragraphs of comments, it's just not a hard stylistic
>> violation given we don't have an algorithm for this.
>> 
>> So +1 to relaxing the hard 70 character rule, but -1 to sweeping across all
>> the comments or doing wrapping based only on line length rather than
>> jaggedness going forward.
>> 
>> On Sat, Aug 8, 2015 at 3:25 PM, Joris Van Remoortere 
>> wrote:
>> 
>>> I will volunteer to update all the comments to wrap at 80 if we agree to
>>> keep the code base consistent.
>>> Naturally that is also my vote ;-)
>>> Joris
>>> 
 On Aug 8, 2015, at 1:40 PM, Michael Park  wrote:
 
 An update on this topic since we covered it at the community developer
>>> sync.
 
  1. We will adopt *Mozilla*'s *BreakBeforeBraces* style as their style
>>> is
  equivalent to ours. The only change this entails for our codebase is
>> to
  consistently wrap the braces for *enum* definitions, as we're
>> currently
  inconsistent. I've taken on the work involved in this change:
 - stout: https://reviews.apache.org/r/37258
 - libprocess: https://reviews.apache.org/r/37259
 - mesos: https://reviews.apache.org/r/37260
 2. We will drop the rule for adding spaces around overloaded
  operators. We'll simply do a sweep of the codebase to update all of
>>> them
  consistently. Artem has kindly taken action on this already:
 - stout: https://reviews.apache.org/r/37018/
 - libprocess: https://reviews.apache.org/r/37017/
 - mesos: https://reviews.apache.org/r/37013/
 3. We will drop the rule for wrapping comments at 70 characters.
>> We
  have a few options to proceed here:
 - Keep all the existing comments intact, and simply allow new
 comments to wrap at 80, this is less work.
 - Update all instances of the comments wrapping at 70 to be
>> wrapped
 at 80, so that we can be consistent.
 
 I proposed that we simply allow new comments to wrap at 80, but I have
 heard arguments to update the existing comments, so that we can be
 consistent across the codebase. If you have a suggestion/opinion on how
>>> we
 should proceed with (3), please share!
 
 Thanks,
 
 MPark.
 
 On Mon, Aug 3, 2015 at 2:01 PM Alexander Rojas <
>> alexan...@mesosphere.io>
 wrote:
 
> I also vote up for that! I rather change our guidelines a little bit
>>> than
> waiting for months
> to get our changes into the clang-format source without any security
>>> that
> it will actually land.
> 
>> On 31 Jul 2015, at 09:31, Alex Rukletsov 
>> wrote:
>> 
>> I think automation is very important. If we should slightly change
>> our
>> style in order to set-up easily enforceable rules, I vote with both
>>> hands
>> for that.
>> 
>>> On Fri, Jul 31, 2015 at 3:25 AM, Michael Park 
>>> wrote:
>>> 
>>> Oops, sorry I was so excited that this could just solve the issue
>>> that I
>>> forgot to answer your question.
>>> 
>>> In general, the clang-format strives to adopt widely used styles,
>>> which
> I'm
>>> not sure if we would be considered widely used. Aside from that,
>>> another
>>> concern was that it could take a while for our style proposals to
>> make
> it
>>> upstream and for it to be useful.
>>> 
 On Thu, Jul 30, 2015, 6:12 PM Michael Park 
>>> wrote:
 
 Is it worth adding our own style?
 
 
 
 I noticed others have (LLVM, Google, Chromium, Mozilla, WebKit).
>> How
> hard is it?
 
 
 I was just looking into this again and *Mozilla* was added as the
> newest
 *BreakBeforeBraces* style. It breaks before braces on enum,
>> function,
> and
 record definitions (struct, class, union). I think we can actually
>>> use
>>> that
 one and be done with it. Having looked through the codebase, we
>> wrap
> the
 braces for *enum* for about half of the cases. It would be about 35
 instances that we have to fix from what I can see in our codebase.
>>> What
>>> do
 

Re: Beware of ASSERT_* Placement

2015-07-28 Thread Bernd Mathiske
IMHO we would be better off with exception-based asserts, checks, and expects.

Bernd

> On Jul 28, 2015, at 7:53 AM, Paul Brett  wrote:
> 
> Michael
> 
> I think Ben's suggestion of using Try<> is just what we want for common
> functions.
> 
> In regards to ASSERTs, they can cause tests to be skipped within
> instantiations of the fixtures or test case as expected.
> 
> For example, if you look at tests such as
> SlaveRecoveryTest::ReconnectExecutor, it has 9 ASSERTs in a single test
> case.  The first 5 are in setup code and seem pretty reasonable but the
> last 4 are:
> 
> 489   // Executor should inform about the unacknowledged update.
> 490   ASSERT_EQ(1, reregister.updates_size());
> 491   const StatusUpdate& update = reregister.updates(0);
> 492   ASSERT_EQ(task.task_id(), update.status().task_id());
> 493   ASSERT_EQ(TASK_RUNNING, update.status().state());
> 494
> 495   // Scheduler should receive the recovered update.
> 496   AWAIT_READY(status);
> 497   ASSERT_EQ(TASK_RUNNING, status.get().state());
> 
> So looking at this code, I suspect that lines 492 and 493 might be better
> as EXPECT?  What about 497?  What follows afterwards is only cleanup code,
> so either it is not necessary and we can omit it or 497 should be an
> expect.
> 
> Looking through the tests directory, this appears to be a common pattern.
> Of course, it is all harmless while the code is passing the tests but when
> a change breaks things, the scope of the breakage can be obscured because
> of the skipped test conditions.
> 
> Given the restrictions you point out on the use of ASSERT combined with the
> ability to hide failing tests, I believe we should have a strong preference
> for EXPECT over ASSERT unless it is clear that every subsequent test in the
> test case is dependent on the result of this test.
> 
> Just my 5c worth
> 
> @paul_b
> 
> On Mon, Jul 27, 2015 at 7:34 PM, Michael Park  wrote:
> 
>> Paul,
>> 
>> With ASSERT, I completely agree with you about the perils of using ASSERT
>>> that you list above, but additionally we have cases where ASSERT exits a
>>> test fixture skipping later tests that might or might not have failed.
>> 
>> 
>> We should only be using *ASSERT_** in cases where it doesn't make sense to
>> proceed with the rest of the test if the condition fails, so exiting the
>> test case seems like it's exactly what we would want. If you're saying that
>> we currently use it incorrectly, then yeah, we should perhaps write a guide
>> to help with how to use it correctly. But it sounds like you're saying we
>> shouldn't use it at all?
>> 
>> Since the CHECK macro aborts the test harness, a single test failure
>>> prevents tests from running on all the remaining tests.  Particularly
>>> annoying for anyone running automated regression tests.
>> 
>> 
>> Perhaps my suggestion of resorting to *CHECK_** was not a good one, but I
>> still don't think *EXPECT_** is what we want. If we have a condition in
>> which it doesn't make sense to proceed with the rest of the test, we should
>> stop. Perhaps the helper function should return a *Try* as Ben suggested,
>> followed by an *ASSERT_** of the result within the test case or something
>> like that.
>> 
>> I mainly wanted to inform folks of this limitation and the corresponding
>> confusing error message that follows.
>> 
>> On 27 July 2015 at 18:42, Benjamin Mahler 
>> wrote:
>> 
>>> Michael, note that we've avoided having ASSERT_ or EXPECT_ inside test
>>> helper methods because they print out the line number of the helper
>> method,
>>> rather than the line number where the helper method was called from the
>>> test. This is why we've been pretty careful when adding helpers and have
>>> tried to push assertions into the test itself (e.g. helper returns a Try
>>> instead of internally asserting).
>>> 
>>> Paul, are you saying that ASSERT within one case in a fixture will stop
>>> running all other cases for the fixture? Do you have a pointer to this?
>>> Sounds surprising.
>>> 
>>> On Mon, Jul 27, 2015 at 3:04 PM, Paul Brett 
>>> wrote:
>>> 
 Mike
 
 I would suggest that we want to avoid both ASSERT and CHECK macros in
 tests.
 
 With ASSERT, I completely agree with you about the perils of using
>> ASSERT
 that you list above, but additionally we have cases where ASSERT exits
>> a
 test fixture skipping later tests that might or might not have failed.
 
 Since the CHECK macro aborts the test harness, a single test failure
 prevents tests from running on all the remaining tests.  Particularly
 annoying for anyone running automated regression tests.
 
 We should add this to either the style guide or mesos-testing-patterns.
 
 -- @paul_b
 
 On Mon, Jul 27, 2015 at 2:28 PM, Michael Park 
>> wrote:
 
> I've had at least 3 individuals who ran into the issue of *ASSERT_**
 macro
> placement and since the generated error message is less than
>> useless, I
> would like to share wi

Re: Doxygen / Javadoc changes

2015-07-10 Thread Bernd Mathiske
s/non-/

> On Jul 10, 2015, at 2:27 AM, Benjamin Hindman  wrote:
> 
> We're only using non-javadoc comments for APIs, which are mostly in
> headers. We're still using // based C++ comments in implementations and
> places we don't want to be picked up via doxygen.
> 
> On Thu, Jul 9, 2015 at 5:23 PM Benjamin Mahler 
> wrote:
> 
>> A couple of thoughts:
>> 
>> (1) When introducing javadoc comments, can we please keep comment style
>> consistent within files and APIs? For the most part, it seems folks are
>> introducing javadoc in consistent sweeps, which is great. However, it looks
>> also like there are reviews and commits where we are introducing javadoc +
>> non-javadoc within a file / api, would love to avoid the inconsistency. :(
>> 
>> (2) Where are we planning to introduce javadoc comments? APIs only? All
>> headers? Would love to see some communication around how we'd like folks to
>> be proceeding. Maybe I missed it, but can't seem to find an email with
>> this.
>> 
>> (3) I ask because there is a tradeoff: we make the code more verbose to
>> navigate visually. Also, sometimes we document things unnecessarily:
>> 
>> /**
>> * Sends a message with data without a return address.
>> *
>> * @param to Receiver of the message.
>> * @param name Name of the message.
>> * @param data Data to send (gets copied).
>> * @param length Length of data.
>> */
>> void post(const UPID& to,
>>  const std::string& name,
>>  const char* data = NULL,
>>  size_t length = 0);
>> 
>> Here, having a 'to' or 'receiver' as a variable name is pretty
>> self-evident, ditto for 'messageName', 'data', 'length'. Are we ok with
>> omitting these kinds of comments? It seems like we have to be asking
>> ourselves when this provides value. Thoughts?
>> 
>> Ben
>> 



Doxygen style for libprocess: JavaDoc vs. Triple-Slash (and others?)

2015-06-02 Thread Bernd Mathiske
Are you interested in what style is being proposed for documenting libprocess? 
(And subsequently, there is a good chance that more of Mesos source code may be 
enhanced in the same way.) If so, please check out my most recent comment in 
this ticket:

https://issues.apache.org/jira/browse/MESOS-2501 


The core of the matter is: shall we adopt JavaDoc style or “///” style? I have 
put some pros and cons into the JIRA discussion. Please contribute more 
(good:-) reasons to decide either way or a third way!

(Note that this discussion is in anticipation of later getting to doxygenize 
some interface-level source code while we are currently focussing on a 
libprocess User Guide, first, because this promises to have (even) more utility 
in the short term.)

Bernd
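For readers who have not seen the two styles side by side, here is a small sketch. Both comment forms are recognized by Doxygen; the two functions and their names are hypothetical and exist only to carry the comments:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

/**
 * Returns the number of bytes in a message (JavaDoc-style block).
 *
 * @param message Text to measure.
 * @return Length of the text in bytes.
 */
inline std::size_t lengthJavadoc(const std::string& message)
{
  return message.size();
}

/// Returns the number of bytes in a message (triple-slash style).
///
/// @param message Text to measure.
/// @return Length of the text in bytes.
inline std::size_t lengthTripleSlash(const std::string& message)
{
  return message.size();
}
```

The generated documentation is the same in both cases; the difference is purely in how the source reads.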



Re: [jira] [Closed] (MESOS-2780) Non-POD static variables

2015-06-01 Thread Bernd Mathiske
Sorry, did I say MESOS-2777? I meant MESOS-2779!

> On Jun 1, 2015, at 10:10 AM, Bernd Mathiske  wrote:
> 
> Fair enough. How about making the JIRA tickets easier to spot as different by 
> making their summary at least a little bit different from each other? My 
> impression was that they ought to be merged into one, addressing all non-POD 
> variables at once, since the summary said so in all its generality.
> 
> Bernd
> 
>> On May 29, 2015, at 5:46 PM, Paul Brett  wrote:
>> 
>> Disagree that this is a duplicate.  MESOS-2777 identifies an issue with the
>> handling of coverity reports within the project while MESOS-2880 identifies
>> around 80 locations where non-POD static initializations within the code
>> base should be corrected.  Should MESOS-2777 have a coverity report
>> attached listing these defects?
>> 
>> -- Paul
>> 
>> On Fri, May 29, 2015 at 8:18 AM, Bernd Mathiske (JIRA) 
>> wrote:
>> 
>>> 
>>>[
>>> https://issues.apache.org/jira/browse/MESOS-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>>> ]
>>> 
>>> Bernd Mathiske closed MESOS-2780.
>>> -
>>>   Resolution: Duplicate
>>> 
>>> Seems to be the same as MESOS-2777.
>>> 
>>>> Non-POD static variables
>>>> 
>>>> 
>>>>   Key: MESOS-2780
>>>>   URL: https://issues.apache.org/jira/browse/MESOS-2780
>>>>   Project: Mesos
>>>>Issue Type: Bug
>>>>  Reporter: Paul Brett
>>>> 
>>>> We declare const non-POD static variables which should be converted to
>>> C++11 constexpr.  These include the following:
>>>> {noformat}
>>>> tests/isolator_tests.cpp:1080:const string UNPRIVILEGED_USERNAME =
>>> "mesos.test.unprivileged.user";
>>>> tests/mesos.hpp:215:const static std::string TEST_CGROUPS_HIERARCHY =
>>> "/tmp/mesos_test_cgroup";
>>>> tests/mesos.hpp:218:const static std::string TEST_CGROUPS_ROOT =
>>> "mesos_test";
>>>> tests/zookeeper.cpp:53:const Duration ZooKeeperTest::NO_TIMEOUT =
>>> Seconds(10);
>>>> master/contender.cpp:45:const Duration
>>> MASTER_CONTENDER_ZK_SESSION_TIMEOUT = Seconds(10);
>>>> master/constants.cpp:33:const Bytes MIN_MEM = Megabytes(32);
>>>> master/constants.cpp:34:const Duration SLAVE_PING_TIMEOUT = Seconds(15);
>>>> master/constants.cpp:36:const Duration MIN_SLAVE_REREGISTER_TIMEOUT =
>>> Minutes(10);
>>>> master/constants.cpp:41:const Duration WHITELIST_WATCH_INTERVAL =
>>> Seconds(5);
>>>> master/constants.cpp:43:const std::string MASTER_INFO_LABEL = "info";
>>>> master/constants.cpp:44:const Duration ZOOKEEPER_SESSION_TIMEOUT =
>>> Seconds(10);
>>>> master/constants.cpp:45:const std::string DEFAULT_AUTHENTICATOR =
>>> "crammd5";
>>>> master/constants.cpp:46:const std::string DEFAULT_ALLOCATOR =
>>> "HierarchicalDRF";
>>>> master/detector.cpp:56:const Duration MASTER_DETECTOR_ZK_SESSION_TIMEOUT
>>> = Seconds(10);
>>>> master/http.cpp:274:const string Master::Http::HEALTH_HELP = HELP(
>>>> master/http.cpp:289:const static string HOSTS_KEY = "hosts";
>>>> master/http.cpp:290:const static string LEVEL_KEY = "level";
>>>> master/http.cpp:291:const static string MONITOR_KEY = "monitor";
>>>> master/http.cpp:293:const string Master::Http::OBSERVE_HELP = HELP(
>>>> master/http.cpp:385:const string Master::Http::REDIRECT_HELP = HELP(
>>>> master/http.cpp:424:const string Master::Http::SLAVES_HELP = HELP(
>>>> master/http.cpp:687:const TaskStateSummary TaskStateSummary::EMPTY;
>>>> master/http.cpp:864:const string Master::Http::SHUTDOWN_HELP = HELP(
>>>> master/http.cpp:877:const string Master::Http::TEARDOWN_HELP = HELP(
>>>> master/http.cpp:974:const string Master::Http::TASKS_HELP = HELP(
>>>> zookeeper/group.cpp:43:const Duration GroupProcess::RETRY_INTERVAL =
>>> Seconds(2);
>>>> zookeeper/authentication.cpp:11:const ACL_vector
>>> EVERYONE_READ_CREATOR_ALL = {
>>>> zookeeper/authentication.cpp:23:const ACL_vector
>>> EVERYONE_CREATE_AND_READ_CREATOR_ALL = {
>>>> common/build.cpp:32:const std::string DATE = BUILD_DATE;
>>>> common/build.cpp:34:const std::string USER = BUILD_USER;
>>>

Re: [jira] [Closed] (MESOS-2780) Non-POD static variables

2015-06-01 Thread Bernd Mathiske
Fair enough. How about making the JIRA tickets easier to spot as different by 
making their summary at least a little bit different from each other? My 
impression was that they ought to be merged into one, addressing all non-POD 
variables at once, since the summary said so in all its generality.

Bernd

> On May 29, 2015, at 5:46 PM, Paul Brett  wrote:
> 
> Disagree that this is a duplicate.  MESOS-2777 identifies an issue with the
> handling of coverity reports within the project while MESOS-2880 identifies
> around 80 locations where non-POD static initializations within the code
> base should be corrected.  Should MESOS-2777 have a coverity report
> attached listing these defects?
> 
> -- Paul
> 
> On Fri, May 29, 2015 at 8:18 AM, Bernd Mathiske (JIRA) 
> wrote:
> 
>> 
>> [
>> https://issues.apache.org/jira/browse/MESOS-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>> ]
>> 
>> Bernd Mathiske closed MESOS-2780.
>> -
>>Resolution: Duplicate
>> 
>> Seems to be the same as MESOS-2777.
>> 
>>> Non-POD static variables
>>> 
>>> 
>>>Key: MESOS-2780
>>>URL: https://issues.apache.org/jira/browse/MESOS-2780
>>>Project: Mesos
>>> Issue Type: Bug
>>>   Reporter: Paul Brett
>>> 
>>> We declare const non-POD static variables which should be converted to
>> C++11 constexpr.  These include the following:
>>> {noformat}
>>> tests/isolator_tests.cpp:1080:const string UNPRIVILEGED_USERNAME =
>> "mesos.test.unprivileged.user";
>>> tests/mesos.hpp:215:const static std::string TEST_CGROUPS_HIERARCHY =
>> "/tmp/mesos_test_cgroup";
>>> tests/mesos.hpp:218:const static std::string TEST_CGROUPS_ROOT =
>> "mesos_test";
>>> tests/zookeeper.cpp:53:const Duration ZooKeeperTest::NO_TIMEOUT =
>> Seconds(10);
>>> master/contender.cpp:45:const Duration
>> MASTER_CONTENDER_ZK_SESSION_TIMEOUT = Seconds(10);
>>> master/constants.cpp:33:const Bytes MIN_MEM = Megabytes(32);
>>> master/constants.cpp:34:const Duration SLAVE_PING_TIMEOUT = Seconds(15);
>>> master/constants.cpp:36:const Duration MIN_SLAVE_REREGISTER_TIMEOUT =
>> Minutes(10);
>>> master/constants.cpp:41:const Duration WHITELIST_WATCH_INTERVAL =
>> Seconds(5);
>>> master/constants.cpp:43:const std::string MASTER_INFO_LABEL = "info";
>>> master/constants.cpp:44:const Duration ZOOKEEPER_SESSION_TIMEOUT =
>> Seconds(10);
>>> master/constants.cpp:45:const std::string DEFAULT_AUTHENTICATOR =
>> "crammd5";
>>> master/constants.cpp:46:const std::string DEFAULT_ALLOCATOR =
>> "HierarchicalDRF";
>>> master/detector.cpp:56:const Duration MASTER_DETECTOR_ZK_SESSION_TIMEOUT
>> = Seconds(10);
>>> master/http.cpp:274:const string Master::Http::HEALTH_HELP = HELP(
>>> master/http.cpp:289:const static string HOSTS_KEY = "hosts";
>>> master/http.cpp:290:const static string LEVEL_KEY = "level";
>>> master/http.cpp:291:const static string MONITOR_KEY = "monitor";
>>> master/http.cpp:293:const string Master::Http::OBSERVE_HELP = HELP(
>>> master/http.cpp:385:const string Master::Http::REDIRECT_HELP = HELP(
>>> master/http.cpp:424:const string Master::Http::SLAVES_HELP = HELP(
>>> master/http.cpp:687:const TaskStateSummary TaskStateSummary::EMPTY;
>>> master/http.cpp:864:const string Master::Http::SHUTDOWN_HELP = HELP(
>>> master/http.cpp:877:const string Master::Http::TEARDOWN_HELP = HELP(
>>> master/http.cpp:974:const string Master::Http::TASKS_HELP = HELP(
>>> zookeeper/group.cpp:43:const Duration GroupProcess::RETRY_INTERVAL =
>> Seconds(2);
>>> zookeeper/authentication.cpp:11:const ACL_vector
>> EVERYONE_READ_CREATOR_ALL = {
>>> zookeeper/authentication.cpp:23:const ACL_vector
>> EVERYONE_CREATE_AND_READ_CREATOR_ALL = {
>>> common/build.cpp:32:const std::string DATE = BUILD_DATE;
>>> common/build.cpp:34:const std::string USER = BUILD_USER;
>>> common/build.cpp:35:const std::string FLAGS = BUILD_FLAGS;
>>> common/build.cpp:36:const std::string JAVA_JVM_LIBRARY =
>> BUILD_JAVA_JVM_LIBRARY;
>>> common/build.cpp:39:const Option<std::string> GIT_SHA =
>> std::string(BUILD_GIT_SHA);
>>> common/build.cpp:41:const Option<std::string> GIT_SHA = None();
>>> common/build.cpp:45:const Option<std::string> GIT_BRANCH =
>> std::string(BUILD_GIT_BRANCH);
>>> common/build.cpp:47:const Opti
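To illustrate the kind of conversion the ticket asks for, here is a minimal sketch of two common alternatives to a namespace-scope `const std::string`. The constant names echo entries from the list above, but the replacements are assumptions for illustration only, not the change that was made in Mesos; `SLAVE_PING_TIMEOUT_SECS` and `defaultAllocator()` are hypothetical:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Literal types can be constexpr in C++11: the value is fixed at compile
// time, so no constructor runs during static initialization.
constexpr char MASTER_INFO_LABEL[] = "info";
constexpr int SLAVE_PING_TIMEOUT_SECS = 15;

// std::string has no constexpr constructor in C++11. One alternative is a
// function-local static, which is constructed on first use (thread-safe
// since C++11) instead of at an unspecified point during program start-up.
inline const std::string& defaultAllocator()
{
  static const std::string value("HierarchicalDRF");
  return value;
}

static_assert(SLAVE_PING_TIMEOUT_SECS == 15, "evaluated at compile time");
```

The first form removes dynamic initialization entirely; the second keeps a `std::string` but avoids depending on the initialization order of other translation units.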

How to reference images in /docs?

2015-05-19 Thread Bernd Mathiske
When adding/editing MD documentation in the “docs” folder of the Mesos 
workspace, I’d like to add some graphics. I have found two pre-existing 
approaches. 

1. In mesos-architecture.md, we find:
![Mesos Architecture](http://mesos.apache.org/assets/img/documentation/architecture3.jpg)

2. In external-containerizer.md, we find:
 ![Recovery Scheme](images/ec_recover_seqdiag.png?raw=true)

In case of #1 the images turn up on the Apache Mesos web page, eventually, but 
not until that has been pushed. In case of #2 you can point your MD renderer at 
your local Mesos workspace and they show up then.

Is there a better third path that displays the images in multiple situations? 
If not, am I assuming correctly that #1 is preferred?

Bernd



Apache Mesos Committer Checklist

2015-05-08 Thread Bernd Mathiske
As mentioned at the community meeting at Twitter headquarters yesterday, a few 
of us have compiled a list that Apache Mesos committer candidates can use to 
gather material for their nomination. It is in no way binding. Just a 
collection of categories of achievements and behaviors that can be helpful to 
look at, including handy links to relevant JIRA tickets etc. Some of us 
aspiring committers will play with this and start filling it out to gauge by 
ourselves where we stand so far. I propose that eventually some variant of this 
doc can be handed over to the PMC to facilitate evaluating a nomination case.

The doc at this link is open for viewing and commenting by everybody who has 
the link:

https://docs.google.com/document/d/1orsIXW41W_W-v4fb3jU1QntND6m961qOLalFDsZ-sxg/edit?usp=sharing
 


Bernd



Re: Review Request 33263: Extended SlaveTest.ShutdownUnregisteredExecutor test with a reason check.

2015-04-16 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33263/#review80332
---

Ship it!


- Bernd Mathiske


On April 16, 2015, 7:31 a.m., Andrey Dyatlov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33263/
> ---
> 
> (Updated April 16, 2015, 7:31 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Bernd Mathiske, and Till 
> Toenshoff.
> 
> 
> Bugs: MESOS-2625
> https://issues.apache.org/jira/browse/MESOS-2625
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Extended SlaveTest.ShutdownUnregisteredExecutor test with a reason check. 
> Check that the reason is REASON_COMMAND_EXECUTOR_FAILED. According to the 
> Slave::sendExecutorTerminatedStatusUpdate member function, this reason is 
> expected instead of more general REASON_EXECUTOR_TERMINATED because the 
> command executor is used in this test.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp b826000e0a4221690f956ea51f49ad4c99d5e188 
> 
> Diff: https://reviews.apache.org/r/33263/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Andrey Dyatlov
> 
>



Re: Review Request 30774: Fetcher Cache

2015-04-13 Thread Bernd Mathiske
 of the actual cache logic, 
including a hashmap of cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Download results 
in the cache are reused when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet most simple way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs whose cache files are currently in use. We 
use reference counting for this, since there may be concurrent fetch attempts 
using the same cache files.


Thanks,

Bernd Mathiske
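The interplay of space reservation, eviction, and reference counting described for 30626 can be sketched roughly as follows. This is a synchronous toy model under simplifying assumptions (sizes known up front, oldest-first eviction, no asynchronous size queries); `CacheModel` and all of its members are hypothetical and do not mirror the actual fetcher code:

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <map>
#include <string>

// Hypothetical, simplified model of a size-limited cache.
class CacheModel
{
public:
  explicit CacheModel(uint64_t capacity) : capacity_(capacity), used_(0) {}

  // Reserves `size` bytes for `uri`, evicting unreferenced entries (oldest
  // first) if the free space is insufficient. Returns false and leaves the
  // cache unchanged when not enough space can be freed.
  bool reserve(const std::string& uri, uint64_t size)
  {
    if (entries_.count(uri) > 0 || size > capacity_) {
      return false;
    }

    // First pass: check whether evicting unreferenced entries would suffice.
    uint64_t reclaimable = 0;
    for (const std::string& key : order_) {
      const Entry& entry = entries_.at(key);
      if (entry.references == 0) {
        reclaimable += entry.size;
      }
    }
    if ((capacity_ - used_) + reclaimable < size) {
      return false;
    }

    // Second pass: evict until the reservation fits.
    auto it = order_.begin();
    while (capacity_ - used_ < size && it != order_.end()) {
      const Entry& entry = entries_.at(*it);
      if (entry.references == 0) {
        used_ -= entry.size;
        entries_.erase(*it);
        it = order_.erase(it);
      } else {
        ++it;  // In use by a fetch attempt; eviction is disabled for it.
      }
    }

    entries_[uri] = Entry{size, 1};  // The reserving fetch holds a reference.
    order_.push_back(uri);
    used_ += size;
    return true;
  }

  // Called when a fetch attempt is done with the cache file.
  void unreference(const std::string& uri)
  {
    auto it = entries_.find(uri);
    if (it != entries_.end() && it->second.references > 0) {
      --it->second.references;
    }
  }

  bool contains(const std::string& uri) const
  {
    return entries_.count(uri) > 0;
  }

  uint64_t used() const { return used_; }

private:
  struct Entry
  {
    uint64_t size;
    int references;
  };

  uint64_t capacity_;
  uint64_t used_;
  std::map<std::string, Entry> entries_;
  std::list<std::string> order_;  // Insertion order, oldest first.
};
```

The two-pass structure is a design choice for the sketch: nothing is evicted unless the whole reservation can succeed, which loosely corresponds to undoing a reservation when eviction fails.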



Re: Review Request 30774: Fetcher Cache

2015-04-13 Thread Bernd Mathiske


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 491
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line491>
> >
> > Why not just mock _fetch and do a barrier on it by giving it a promise 
> > in test?
> 
> Bernd Mathiske wrote:
> "just mock _fetch" is more work and harder to understand.
> 
> It would also function, but then you would need to touch test code every 
> time you change _fetch(). Furthermore, it would not be as clear why we wait 
> for this particular call.

Meanwhile I tried mocking _fetch, but it does not work. See the 
related/duplicate issue below. Let's drop this one here now so we can keep the 
comments on the same topic and code region in one place going forward, OK?


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review76902
---


On April 10, 2015, 4:33 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated April 10, 2015, 4:33 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 54c4e31ed6dfed3c23d492c19a301ce119a0519b 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto 3a8e8bf303e0576c212951f6028af77e54d93537 
>   include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
>   src/Makefile.am fa609da08e23d6595a3f6d2efddd3e333b6c78f1 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp 6893684e6d199a5d69fc8bba8e60c4acaae9c3c9 
>   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> e4136095fca55637864f495098189ab3ad8d8fe7 
>   src/slave/flags.hpp d3b1ce117fbb4e0b97852ef150b63f35cc991032 
>   src/slave/flags.cpp 35f56252cfda5011d21aa188f33cc3e68a694968 
>   src/slave/slave.cpp 9fec023b643d410f4d511fa6f80e9835bab95b7e 
>   src/tests/docker_containerizer_tests.cpp 
> c772d4c836de18b0e87636cb42200356d24ec73d 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp 0e98572a62ae05437bd2bc800c370ad1a0c43751 
>   src/tests/mesos.cpp 02cbb4b8cf1206d0f32d160addc91d7e0f1ab28b 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> s

Re: Review Request 30774: Fetcher Cache

2015-04-13 Thread Bernd Mathiske


> On April 12, 2015, 11:11 p.m., Timothy Chen wrote:
> > src/tests/fetcher_cache_tests.cpp, line 308
> > <https://reviews.apache.org/r/30774/diff/42/?file=922829#file922829line308>
> >
> > Not sure why you picked an arbitrary number 5 here, why not let it be 
> > passed in?

OK, I will add an explanation in a comment. Two requirements need to be met by 
this constant.
- It needs to be larger than the expected number of status updates. We might 
choose something much larger than 5, but all tests run just fine with 5.
- It needs to be finite. Otherwise we will keep waiting for updates when none 
arrive due to a bug.

However, if we passed this constant in, then we would need to explain it at all 
the call sites, i.e. multiple times instead of only once. But the situation is 
exactly the same every time. So I will refrain from that.
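
To illustrate the two requirements, here is a minimal sketch of a bounded wait. It is not the code from fetcher_cache_tests.cpp: kMaxUpdates, awaitUpdates() and the deque standing in for the stream of status updates are hypothetical names chosen only for this example.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical upper bound on the number of status updates we collect.
// It must exceed the expected number of updates, yet stay finite so that
// a bug which stops updates from arriving cannot make the test wait forever.
const std::size_t kMaxUpdates = 5;

// Takes at most kMaxUpdates entries from `pending` (a stand-in for the
// incoming status updates) and stops early once `terminal` has been seen.
std::vector<std::string> awaitUpdates(
    std::deque<std::string>& pending,
    const std::string& terminal)
{
  std::vector<std::string> seen;

  for (std::size_t i = 0; i < kMaxUpdates && !pending.empty(); ++i) {
    seen.push_back(pending.front());
    pending.pop_front();

    if (seen.back() == terminal) {
      break;
    }
  }

  return seen;
}
```

Since the bound lives in one place, the explanation is needed only once, next to the constant, rather than at every call site.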


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review79838
---



Re: Changing Mesos Minimum Compiler Version

2015-04-13 Thread Bernd Mathiske
+1

> On Apr 10, 2015, at 6:02 PM, Michael Park  wrote:
> 
> +1
> 
> On 9 April 2015 at 17:33, Alexander Gallego  wrote:
> 
>> This is amazing for native devs/frameworks.
>> 
>> Sent from my iPhone
>> 
>>> On Apr 9, 2015, at 5:16 PM, Joris Van Remoortere wrote:
>>> 
>>> +1
>>> 
>>>> On Thu, Apr 9, 2015 at 2:14 PM, Cody Maloney wrote:
>>>> As discussed in the last community meeting, we'd like to bump the
>>>> minimum required compiler version from GCC 4.4 to GCC 4.8.
>>>> 
>>>> The overall goals are to make Mesos development safer, faster, and
>>>> reduce the maintenance burden. Currently a lot of stout has different
>>>> codepaths for Pre-C++11 and Post-C++11 compilers.
>>>> 
>>>> Progress will be tracked in the JIRA: MESOS-2604
>>>> 
>>>> The resulting supported compiler versions will be:
>>>> GCC 4.8, GCC 4.9
>>>> Clang 3.5, Clang 3.6
>>>> 
>>>> For reference
>>>> Compilers by Distribution Version: http://goo.gl/p1t1ls
>>>> 
>>>> C++11 features supported by each compiler:
>>>> https://gcc.gnu.org/projects/cxx0x.html
>>>> http://clang.llvm.org/cxx_status.html



Re: Changing Mesos Minimum Compiler Version

2015-04-13 Thread Bernd Mathiske
Yeah! +1

> On Apr 10, 2015, at 6:02 PM, Michael Park  wrote:
> 
> +1
> 
> On 9 April 2015 at 17:33, Alexander Gallego  wrote:
> 
>> This is amazing for native devs/frameworks.
>> 
>> Sent from my iPhone
>> 
>>> On Apr 9, 2015, at 5:16 PM, Joris Van Remoortere wrote:
>>> 
>>> +1
>>> 
>>>> On Thu, Apr 9, 2015 at 2:14 PM, Cody Maloney wrote:
>>>> As discussed in the last community meeting, we'd like to bump the
>>>> minimum required compiler version from GCC 4.4 to GCC 4.8.
>>>> 
>>>> The overall goals are to make Mesos development safer, faster, and
>>>> reduce the maintenance burden. Currently a lot of stout has different
>>>> codepaths for Pre-C++11 and Post-C++11 compilers.
>>>> 
>>>> Progress will be tracked in the JIRA: MESOS-2604
>>>> 
>>>> The resulting supported compiler versions will be:
>>>> GCC 4.8, GCC 4.9
>>>> Clang 3.5, Clang 3.6
>>>> 
>>>> For reference
>>>> Compilers by Distribution Version: http://goo.gl/p1t1ls
>>>> 
>>>> C++11 features supported by each compiler:
>>>> https://gcc.gnu.org/projects/cxx0x.html
>>>> http://clang.llvm.org/cxx_status.html



Re: Review Request 30218: Add version() to docker abstraction.

2015-03-25 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30218/#review77742
---



src/docker/docker.cpp
<https://reviews.apache.org/r/30218/#comment125942>

It seems to me that you changed it to an indefinite wait.


- Bernd Mathiske


On Jan. 23, 2015, 9:44 a.m., Timothy Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30218/
> ---
> 
> (Updated Jan. 23, 2015, 9:44 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Bernd Mathiske.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Add version() to docker abstraction.
> 
> 
> Diffs
> -
> 
>   src/docker/docker.hpp 3ebbc1f 
>   src/docker/docker.cpp 3a485a2 
> 
> Diff: https://reviews.apache.org/r/30218/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>



Re: Review Request 29336: Recover docker containers that launched in containers.

2015-03-25 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29336/#review77740
---



src/slave/containerizer/docker.cpp
<https://reviews.apache.org/r/29336/#comment125937>

Maybe use the "_" for the parameter, not the local variable? Then both 
local vars "containers" and "executors" look the same in that regard.

However, see the next issue, fixing which would cancel this one.



src/slave/containerizer/docker.cpp
<https://reviews.apache.org/r/29336/#comment125939>

This type/name scheme is hard to understand. How can Docker::Container* 
values be containers and executors? 

How about choosing different variable names that make this clearer? 

It seems that these renames would fit:

containers -> logContainers
executors -> executorContainers


- Bernd Mathiske


On Jan. 16, 2015, 5:37 p.m., Timothy Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29336/
> ---
> 
> (Updated Jan. 16, 2015, 5:37 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Bernd Mathiske.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Recover docker containers that launched in containers.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/docker.hpp b7bf54a 
>   src/slave/containerizer/docker.cpp 5f4b4ce 
> 
> Diff: https://reviews.apache.org/r/29336/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>



Re: Review Request 30774: Fetcher Cache

2015-03-24 Thread Bernd Mathiske
ially a cache file name. Refactors fetch() and run(), so 
there is only one of each. Introduces about half of the actual cache logic, 
including a hashmap of cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuses download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful.
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one happens per 
URI in play.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs whose related cache files are currently in 
use. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.
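
The reservation math in the list above can be sketched roughly as follows. This is only an illustration under simplifying assumptions (synchronous, eviction in list order), not the code under review; Entry, Reservation and reserve() are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical bookkeeping record for one cache file.
struct Entry
{
  std::string key;
  std::uint64_t size;
  int references; // > 0 means the file is in use, so eviction is disabled.
};

// Hypothetical outcome of a reservation attempt.
struct Reservation
{
  bool ok;
  std::vector<std::string> evicted; // Keys of the entries that were evicted.
};

// Tries to reserve `needed` bytes in a cache of `capacity` bytes. Free space
// is used first; any remainder is covered by evicting unreferenced entries
// in the order given. Nothing is evicted unless the whole reservation
// can be satisfied, which plays the role of "undoing" a failed reservation.
Reservation reserve(
    std::uint64_t needed,
    std::uint64_t capacity,
    std::vector<Entry>& entries)
{
  std::uint64_t used = 0;
  for (const Entry& entry : entries) {
    used += entry.size;
  }

  std::uint64_t available = capacity > used ? capacity - used : 0;

  // Pick eviction candidates until enough space would be available.
  std::vector<std::string> victims;
  for (const Entry& entry : entries) {
    if (available >= needed) {
      break;
    }
    if (entry.references == 0) {
      victims.push_back(entry.key);
      available += entry.size;
    }
  }

  if (available < needed) {
    return Reservation{false, {}};
  }

  // Commit: remove the chosen entries from the bookkeeping.
  for (const std::string& key : victims) {
    for (auto it = entries.begin(); it != entries.end(); ++it) {
      if (it->key == key) {
        entries.erase(it);
        break;
      }
    }
  }

  return Reservation{true, victims};
}
```

With a capacity of 100 bytes and entries of 40 (in use), 30 and 20 bytes, a request for 35 bytes takes the 10 free bytes plus the 30-byte entry, and leaves the referenced entry alone.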


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-21 Thread Bernd Mathiske
30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuses download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful.
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one happens per 
URI in play.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs whose related cache files are currently in 
use. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 32233: Replaced raw pointer by Owned pointer

2015-03-20 Thread Bernd Mathiske


> On March 20, 2015, 4:01 a.m., Bernd Mathiske wrote:
> > LGTM. Thanks!
> 
> Akanksha Agrawal wrote:
> Thank you! Could you please merge this?

I'd love to, but I am not a committer. I see you have assigned this review to 
Ben Mahler. He is a committer. I just wanted to help with reviewing :-)


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32233/#review77197
---


On March 20, 2015, 1:11 a.m., Akanksha Agrawal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32233/
> ---
> 
> (Updated March 20, 2015, 1:11 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Replaced raw pointer by Owned pointer
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/external_containerizer.hpp 
> 70491375a6cd8988a33e3f18870c9170a37c0f17 
>   src/slave/containerizer/external_containerizer.cpp 
> 42c67f548caf7bddbe131e0dfa7d74227d8c2593 
> 
> Diff: https://reviews.apache.org/r/32233/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Akanksha Agrawal
> 
>



Re: Review Request 32233: Replaced raw pointer by Owned pointer

2015-03-20 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32233/#review77197
---


LGTM. Thanks!

- Bernd Mathiske


On March 20, 2015, 1:11 a.m., Akanksha Agrawal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32233/
> ---
> 
> (Updated March 20, 2015, 1:11 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Replaced raw pointer by Owned pointer
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/external_containerizer.hpp 
> 70491375a6cd8988a33e3f18870c9170a37c0f17 
>   src/slave/containerizer/external_containerizer.cpp 
> 42c67f548caf7bddbe131e0dfa7d74227d8c2593 
> 
> Diff: https://reviews.apache.org/r/32233/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Akanksha Agrawal
> 
>



Re: Review Request 32233: Replaced raw pointer by Owned pointer

2015-03-20 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32233/#review77179
---



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment125059>

Can you please try again? Did you perhaps try this with an editor that uses 
TABs? You can use blanks to get the alignment you want, since we should replace 
TABs with blanks anyway. I downloaded your patch, applied it, and tried it. It 
worked.


- Bernd Mathiske


On March 19, 2015, 6:47 a.m., Akanksha Agrawal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32233/
> ---
> 
> (Updated March 19, 2015, 6:47 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Replaced raw pointer by Owned pointer
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/external_containerizer.hpp 
> 70491375a6cd8988a33e3f18870c9170a37c0f17 
>   src/slave/containerizer/external_containerizer.cpp 
> 42c67f548caf7bddbe131e0dfa7d74227d8c2593 
> 
> Diff: https://reviews.apache.org/r/32233/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Akanksha Agrawal
> 
>



Re: Review Request 30774: Fetcher Cache

2015-03-19 Thread Bernd Mathiske
30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuses download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful.
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one happens per 
URI in play.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs whose related cache files are currently in 
use. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 31986: Container ID is sent on each TaskStatus message back to the framework.

2015-03-19 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31986/#review77159
---


LGTM

- Bernd Mathiske


On March 13, 2015, 8:21 a.m., Alexander Rojas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31986/
> ---
> 
> (Updated March 13, 2015, 8:21 a.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Isabel Jimenez, Joerg Schad, Till 
> Toenshoff, and Vinod Kone.
> 
> 
> Bugs: MESOS-2191
> https://issues.apache.org/jira/browse/MESOS-2191
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 9df972d750ce1e4a81d2e96cc508d6f83cad2fc8 
>   src/common/type_utils.cpp e92f6f36de0955784619029a016667b46bbe221b 
>   src/exec/exec.cpp d678f0682d803b0b080c3a6c296067ac9ab5dbf8 
>   src/tests/slave_tests.cpp a975305430097a8295b4b155e8448572c12bde22 
> 
> Diff: https://reviews.apache.org/r/31986/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Alexander Rojas
> 
>



Re: Review Request 30774: Fetcher Cache

2015-03-19 Thread Bernd Mathiske


> On Feb. 25, 2015, 10:30 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 686
> > <https://reviews.apache.org/r/30774/diff/19/?file=875881#file875881line686>
> >
> > Seems like lookupEntry is only used here, and it's always coupled with 
> > a reference call.
> > 
> > How about making the API less error-prone? We can increment the 
> > reference directly in lookupEntry (also, we should call it getEntry, as 
> > we don't usually use lookup in our code base).
> > 
> > so getEntry becomes ->
> > - get cache key
> > - get cache item
> > - if cache item is present, increment ref
> > - return item
> > 
> > This way no one needs to call reference().
> 
> Bernd Mathiske wrote:
> Renamed to getEntry(). 
> 
> I disagree with the plan to combine looking up entries and referencing 
> them. I think this would make the code more error-prone, not less. It is too 
> easy then to use getEntry() and inadvertently also reference() when someone 
> later changes the code base. 
> 
> We would then better rename it to getAndReferenceEntry(). However, this 
> does not increase code readability at all.
> 
> BTW, getEntry is also used in one other location, but I could inline it 
> there, so we are back to one place.

In the latest revision I changed it to two different methods. One is called 
getEntry() and does not reference and the other is called referenceEntry() and 
does. So we can have it both ways.
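
As a rough sketch of that split (not the actual fetcher.cpp code; the Entry and Cache types and the add() helper are made up for this illustration), the two methods could look like this:

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical cache entry whose reference count blocks eviction.
struct Entry
{
  std::string path;
  int references = 0;
};

class Cache
{
public:
  // Hypothetical helper to populate the table for this example.
  void add(const std::string& key, const std::string& path)
  {
    std::shared_ptr<Entry> entry = std::make_shared<Entry>();
    entry->path = path;
    table[key] = entry;
  }

  // Pure lookup: returns a null pointer when the key is absent and never
  // touches the reference count.
  std::shared_ptr<Entry> getEntry(const std::string& key) const
  {
    auto it = table.find(key);
    if (it == table.end()) {
      return std::shared_ptr<Entry>();
    }
    return it->second;
  }

  // Lookup plus reference: the method name states the side effect.
  std::shared_ptr<Entry> referenceEntry(const std::string& key)
  {
    std::shared_ptr<Entry> entry = getEntry(key);
    if (entry != nullptr) {
      ++entry->references;
    }
    return entry;
  }

private:
  std::map<std::string, std::shared_ptr<Entry>> table;
};
```

A caller that only wants to inspect an entry uses getEntry() and cannot bump the count by accident; a caller that needs the entry pinned says so by calling referenceEntry().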


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review74048
---


On March 18, 2015, 11:43 p.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 18, 2015, 11:43 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 7119b1421ac1506fa118e9f91d07e027dec3d92e 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto ec8efaec13f54a56d82411f6cdbdb8ad8b103748 
>   src/Makefile.am 7a06c7028eca8164b1f5fdea6a7ecd37ee6826bb 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> fbd1c0a0e5f4f227adb022f0baaa6d2c7e3ad748 
>   src/slave/flags.hpp dbaf5f532d0bc65a6d16856b8ffcc2c06a98f1fa 
>   src

Re: Review Request 30774: Fetcher Cache

2015-03-19 Thread Bernd Mathiske


> On Feb. 24, 2015, 11:26 p.m., Timothy Chen wrote:
> > docs/fetcher.md, line 68
> > <https://reviews.apache.org/r/30774/diff/17/?file=872869#file872869line68>
> >
> > Not sure if putting the struct here is a good idea, as it's most likely 
> > going to be changed in the future.
> 
> Bernd Mathiske wrote:
> When the struct changes, we change the doc.
> 
> Timothy Chen wrote:
> We've avoided doing so in all our docs, and we also suggested removing 
> it from another review for the service discovery docs.
> No committer is going to remember this and help enforce keeping two 
> places as the source of truth. If you'd really like to provide it, leave a 
> reference to where the actual proto lives, with a comment, and refer folks 
> there for the true definitions.

Yes, just putting the proto there is not good if we tend to avoid that, which I 
was not aware of. Thanks! In this case, it is very hard to explain what is 
going on without the proto, so I would very much like to opt for your way out: 
to put references in place that point from the proto to the doc and vice versa.


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review73991
---


On March 18, 2015, 11:43 p.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 18, 2015, 11:43 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 7119b1421ac1506fa118e9f91d07e027dec3d92e 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto ec8efaec13f54a56d82411f6cdbdb8ad8b103748 
>   src/Makefile.am 7a06c7028eca8164b1f5fdea6a7ecd37ee6826bb 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> fbd1c0a0e5f4f227adb022f0baaa6d2c7e3ad748 
>   src/slave/flags.hpp dbaf5f532d0bc65a6d16856b8ffcc2c06a98f1fa 
>   src/slave/slave.cpp 0f99e4efb8fa2b96f120a3e49191158ca0364c06 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp 45e35204d1aa876fa0c871acf0f21afcd5ababe8 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing

Re: Review Request 30774: Fetcher Cache

2015-03-19 Thread Bernd Mathiske


> On Feb. 25, 2015, 10:56 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 528
> > <https://reviews.apache.org/r/30774/diff/19/?file=875881#file875881line528>
> >
> >     return size.error();
> 
> Bernd Mathiske wrote:
> That does not compile.
> 
> Timothy Chen wrote:
> It doesn't? How about wrapping in Error again?
> And why is returning size correct here?

Yes, we can wrap it in Error again. If we don't, it does not compile, because a 
string does not match a Try. I will wrap it so it does not stand out.
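
To make the compile issue concrete, here is a minimal sketch. The Try and Error classes below are simplified stand-ins written for this example, not stout's actual implementation, and fileSize() and sizeInMegabytes() are hypothetical names. The point is only that error() yields a plain string, which has to be wrapped in Error before it can be returned as a Try of a different value type.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Simplified stand-in for stout's Error: a tagged error message.
struct Error
{
  explicit Error(std::string m) : message(std::move(m)) {}
  std::string message;
};

// Simplified stand-in for stout's Try<T>: holds a value or an error message.
template <typename T>
class Try
{
public:
  Try(const T& t) : value_(t), failed_(false) {}
  Try(const Error& e) : value_(), error_(e.message), failed_(true) {}

  bool isError() const { return failed_; }
  const T& get() const { assert(!failed_); return value_; }
  const std::string& error() const { assert(failed_); return error_; }

private:
  T value_;
  std::string error_;
  bool failed_;
};

// Hypothetical helper: rejects negative sizes.
Try<uint64_t> fileSize(int64_t raw)
{
  if (raw < 0) {
    return Error("negative size");
  }
  return static_cast<uint64_t>(raw);
}

// Hypothetical caller whose value type differs from that of fileSize().
Try<double> sizeInMegabytes(int64_t raw)
{
  Try<uint64_t> size = fileSize(raw);
  if (size.isError()) {
    // `return size.error();` would not compile here: error() is a
    // std::string, which converts neither to double nor (implicitly)
    // to Error. Wrapping it again propagates the message.
    return Error(size.error());
  }
  return static_cast<double>(size.get()) / (1024.0 * 1024.0);
}
```

With these stand-ins, sizeInMegabytes(-1) carries the message "negative size" through unchanged.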


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review74053
---


On March 18, 2015, 11:43 p.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 18, 2015, 11:43 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 7119b1421ac1506fa118e9f91d07e027dec3d92e 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto ec8efaec13f54a56d82411f6cdbdb8ad8b103748 
>   src/Makefile.am 7a06c7028eca8164b1f5fdea6a7ecd37ee6826bb 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> fbd1c0a0e5f4f227adb022f0baaa6d2c7e3ad748 
>   src/slave/flags.hpp dbaf5f532d0bc65a6d16856b8ffcc2c06a98f1fa 
>   src/slave/slave.cpp 0f99e4efb8fa2b96f120a3e49191158ca0364c06 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp 45e35204d1aa876fa0c871acf0f21afcd5ababe8 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those. In dependency order:
> 
> 30033: Removes the fetcher env tests since these won't be needed any more 
> when the fetcher uses JSON in a single env var as a parameter. They never 
> tested anything that won't be covered by other tests anyway.

Re: Review Request 30774: Fetcher Cache

2015-03-19 Thread Bernd Mathiske


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 503
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line503>
> >
> > Since this is only called in one place, how about putting this in ___fetch, 
> > passing it the future, and logging there if it failed?
> 
> Bernd Mathiske wrote:
> How would this be simpler and more readable?
> 
> What is wrong with abstracting functions that are called only once? Doing 
> so saves a comment / pulls what would have been a comment into code!

Since we probably don't need a comment here, I'll fix it.


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 726
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line726>
> >
> > Why ignore error?
> 
> Bernd Mathiske wrote:
> The code that follows this line as of line 712 handles the error case.

See issue below for resolution.


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review76902
---


On March 18, 2015, 11:43 p.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 18, 2015, 11:43 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 7119b1421ac1506fa118e9f91d07e027dec3d92e 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto ec8efaec13f54a56d82411f6cdbdb8ad8b103748 
>   src/Makefile.am 7a06c7028eca8164b1f5fdea6a7ecd37ee6826bb 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> fbd1c0a0e5f4f227adb022f0baaa6d2c7e3ad748 
>   src/slave/flags.hpp dbaf5f532d0bc65a6d16856b8ffcc2c06a98f1fa 
>   src/slave/slave.cpp 0f99e4efb8fa2b96f120a3e49191158ca0364c06 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp 45e35204d1aa876fa0c871acf0f21afcd5ababe8 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
>

Re: Review Request 30774: Fetcher Cache

2015-03-19 Thread Bernd Mathiske


> On March 19, 2015, 9:40 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 379
> > <https://reviews.apache.org/r/30774/diff/38/?file=899705#file899705line379>
> >
> > benh, what do you think of Bernd's contentionBarrier injection? 
> > Commonly we just mock the callback (_fetch in this case) in tests to 
> > block, but Bernd wanted to introduce a specific empty method for tests. I 
> > told him this is not a pattern we use in Mesos, but I'd like to see what 
> > you think.

Of course I will stick to the prevalent patterns unless you start liking this 
one :-)


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review77057
-------


On March 18, 2015, 11:43 p.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 18, 2015, 11:43 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 7119b1421ac1506fa118e9f91d07e027dec3d92e 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto ec8efaec13f54a56d82411f6cdbdb8ad8b103748 
>   src/Makefile.am 7a06c7028eca8164b1f5fdea6a7ecd37ee6826bb 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> fbd1c0a0e5f4f227adb022f0baaa6d2c7e3ad748 
>   src/slave/flags.hpp dbaf5f532d0bc65a6d16856b8ffcc2c06a98f1fa 
>   src/slave/slave.cpp 0f99e4efb8fa2b96f120a3e49191158ca0364c06 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp 45e35204d1aa876fa0c871acf0f21afcd5ababe8 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those. In dependency order:
> 
> 30033: Removes the fetcher env tests since these won't be needed any more 
> when the fetcher uses JSON in a single env var as a parameter. They never 
> tested anything that 

Re: Review Request 30774: Fetcher Cache

2015-03-19 Thread Bernd Mathiske


> On March 18, 2015, 11:48 p.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 726
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line726>
> >
> > I'm not sure I understand. The error is never logged, and in the end we 
> > simply return 0 if os::find returns an error. To me that looks like we're 
> > ignoring whether the Try has an error, right?

No problem, I'll rewrite it and add a comment: when there is an error, the 
cache directory does not exist, which means the number of files in the cache is 
zero.
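
A rough sketch of what such a rewrite could look like, using C++17 std::filesystem instead of stout's os::find so that it stands alone. The function name countCacheFiles() is hypothetical, and this is not the code from the review; it only shows the "error means empty cache" decision being made explicit and commented.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper (not the Mesos code): returns the number of regular
// files below `cacheDirectory`, or zero if the directory cannot be listed.
std::size_t countCacheFiles(const std::string& cacheDirectory)
{
  namespace fs = std::filesystem;

  // An empty path would be a caller bug rather than a missing cache.
  assert(!cacheDirectory.empty());

  std::error_code error;
  fs::recursive_directory_iterator iterator(cacheDirectory, error);

  if (error) {
    // Deliberately not reported as a failure: if the directory cannot be
    // listed, we take it that the cache directory does not exist yet, so
    // the number of files in the cache is zero.
    return 0;
  }

  std::size_t count = 0;
  const fs::recursive_directory_iterator end;

  while (iterator != end) {
    std::error_code ignored;
    if (iterator->is_regular_file(ignored)) {
      ++count;
    }

    iterator.increment(error);
    if (error) {
      break; // Stop counting if the traversal itself fails.
    }
  }

  return count;
}
```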


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review77025
---


On March 18, 2015, 11:43 p.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 18, 2015, 11:43 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 7119b1421ac1506fa118e9f91d07e027dec3d92e 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto ec8efaec13f54a56d82411f6cdbdb8ad8b103748 
>   src/Makefile.am 7a06c7028eca8164b1f5fdea6a7ecd37ee6826bb 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> fbd1c0a0e5f4f227adb022f0baaa6d2c7e3ad748 
>   src/slave/flags.hpp dbaf5f532d0bc65a6d16856b8ffcc2c06a98f1fa 
>   src/slave/slave.cpp 0f99e4efb8fa2b96f120a3e49191158ca0364c06 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp 45e35204d1aa876fa0c871acf0f21afcd5ababe8 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those. In dependency order:
> 
> 30033: Removes the fetcher env tests since these won't be needed any more 
> when the fetcher uses JSON in a single env var as a parameter. They never 
> tested anything that won't be covered by other tests anyway

Re: Review Request 32233: Replaced raw pointer by Owned pointer

2015-03-19 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32233/#review77047
---



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment124839>

dispatch(process.get(),
         &ExternalContainerizerProcess::destroy,
         containerId);


Testing done?

- Bernd Mathiske


On March 19, 2015, 6:47 a.m., Akanksha Agrawal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32233/
> ---
> 
> (Updated March 19, 2015, 6:47 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Replaced raw pointer by Owned pointer
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/external_containerizer.hpp 
> 70491375a6cd8988a33e3f18870c9170a37c0f17 
>   src/slave/containerizer/external_containerizer.cpp 
> 42c67f548caf7bddbe131e0dfa7d74227d8c2593 
> 
> Diff: https://reviews.apache.org/r/32233/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Akanksha Agrawal
> 
>



Re: Review Request 30774: Fetcher Cache

2015-03-18 Thread Bernd Mathiske
troduces about half of the actual cache logic, 
including a hashmap of cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Download results 
in the cache are reused when multiple fetcher runs occur concurrently. 
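
One way to picture "only one download per URI" is a table of in-flight downloads with a list of waiters per URI. The sketch below is an assumption about the general technique, not the bookkeeping used in the patch; the class InFlightDownloads and its methods are hypothetical, and all calls are assumed to be serialized on one thread, as they would be inside a single actor.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping for downloads that are currently in flight.
class InFlightDownloads
{
public:
  using Callback = std::function<void(const std::string& cachePath)>;

  // Registers interest in `uri`. Returns true only for the first caller,
  // who is then responsible for actually starting the download.
  bool subscribe(const std::string& uri, Callback callback)
  {
    std::vector<Callback>& waiters = waiting[uri];
    waiters.push_back(std::move(callback));
    return waiters.size() == 1;
  }

  // Called once when the download of `uri` has finished; notifies everyone
  // who subscribed in the meantime and forgets the URI again.
  void complete(const std::string& uri, const std::string& cachePath)
  {
    auto it = waiting.find(uri);
    assert(it != waiting.end());

    std::vector<Callback> waiters = std::move(it->second);
    waiting.erase(it);

    for (const Callback& callback : waiters) {
      callback(cachePath);
    }
  }

private:
  std::map<std::string, std::vector<Callback>> waiting;
};
```

A second fetch attempt for the same URI gets `false` from subscribe() and simply waits for its callback instead of downloading again.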

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in the simplest acceptable way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.
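
As an illustration of what "URI type separation" means, here is a small sketch that classifies a URI by its scheme prefix so that both the download code and a size query could share the decision. The enum, the function names, and the list of prefixes are assumptions made for this example; they are not taken from the patch and do not claim to match the full set of schemes mesos-fetcher supports.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>

// Hypothetical classification of fetch sources.
enum class UriType { NETWORK, HDFS, LOCAL };

// True if `s` begins with `prefix`.
bool startsWith(const std::string& s, const std::string& prefix)
{
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Decides how a URI would be fetched. The prefixes are illustrative only.
UriType classify(const std::string& uri)
{
  assert(!uri.empty());

  for (const char* scheme : {"http://", "https://", "ftp://"}) {
    if (startsWith(uri, scheme)) {
      return UriType::NETWORK;
    }
  }

  if (startsWith(uri, "hdfs://")) {
    return UriType::HDFS;
  }

  // Everything else is treated as a path to copy locally.
  return UriType::LOCAL;
}
```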

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one happens per 
URI in play.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. whose related 
cache files are in use. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.
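
The reservation math and the reference counting from the list above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in the patch: the class CacheSpace and its methods are hypothetical, everything is synchronous (the real code waits asynchronously for evictions), and entries are evicted in map order because the eviction policy is not modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of cache space accounting.
class CacheSpace
{
public:
  explicit CacheSpace(uint64_t capacity) : available(capacity) {}

  // Reserves `size` bytes for a new entry `key`, taking the space partly
  // from what is available and partly from evicting unreferenced entries.
  // Returns false, without evicting anything, if that cannot suffice.
  bool reserve(const std::string& key, uint64_t size)
  {
    assert(entries.count(key) == 0);

    uint64_t evictable = 0;
    for (const auto& item : entries) {
      if (item.second.references == 0) {
        evictable += item.second.size;
      }
    }

    if (size > available + evictable) {
      return false;
    }

    // Evict unreferenced entries only until the request fits.
    for (auto it = entries.begin();
         it != entries.end() && available < size;) {
      if (it->second.references == 0) {
        available += it->second.size;
        it = entries.erase(it);
      } else {
        ++it;
      }
    }

    available -= size;
    entries[key] = Entry{size, 0};
    return true;
  }

  // A fetch attempt that uses the cache file disables its eviction.
  void reference(const std::string& key) { ++entries.at(key).references; }

  // Eviction becomes possible again once the last user is done.
  void unreference(const std::string& key)
  {
    Entry& entry = entries.at(key);
    assert(entry.references > 0);
    --entry.references;
  }

  bool contains(const std::string& key) const
  {
    return entries.count(key) > 0;
  }

  uint64_t availableSpace() const { return available; }

private:
  struct Entry
  {
    uint64_t size;
    int references;
  };

  uint64_t available;
  std::map<std::string, Entry> entries;
};
```

Because the count is per entry, two concurrent fetch attempts using the same cache file each call reference(), and the file only becomes evictable after both have called unreference().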


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-18 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review77019
---



src/slave/containerizer/fetcher.cpp
<https://reviews.apache.org/r/30774/#comment124786>

The error case is handled right after line 711, which closes the branch for 
non-error.


- Bernd Mathiske


On March 17, 2015, 6:59 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 17, 2015, 6:59 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 7119b1421ac1506fa118e9f91d07e027dec3d92e 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto ec8efaec13f54a56d82411f6cdbdb8ad8b103748 
>   src/Makefile.am 7a06c7028eca8164b1f5fdea6a7ecd37ee6826bb 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> fbd1c0a0e5f4f227adb022f0baaa6d2c7e3ad748 
>   src/slave/flags.hpp dbaf5f532d0bc65a6d16856b8ffcc2c06a98f1fa 
>   src/slave/slave.cpp 0f99e4efb8fa2b96f120a3e49191158ca0364c06 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp 45e35204d1aa876fa0c871acf0f21afcd5ababe8 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those. In dependency order:
> 
> 30033: Removes the fetcher env tests since these won't be needed any more 
> when the fetcher uses JSON in a single env var as a parameter. They never 
> tested anything that won't be covered by other tests anyway.
> 
> 30034: Makes the code structure of all fetcher tests the same. Instead of 
> calling the run method of the fetcher directly, calling through fetch(). Also 
> removes all uses of I/O redirection, which is not really needed for 
> debugging, and thus the next patch can refactor fetch() and run(). (The 
> latter comes in two varieties, which complicates matters without much 
> benefit.)

Re: Review Request 30774: Fetcher Cache

2015-03-18 Thread Bernd Mathiske


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 406
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line406>
> >
> > Do we call fetch even if we don't have anything to fetch? I think it 
> > will be a good idea to have a fast return if there is nothing to be fetched.

There is a check for this in Fetcher::fetch(). No need to even dispatch the 
call to the process either if there is nothing to fetch.


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 491
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line491>
> >
> > Why not just mock _fetch and do a barrier on it by giving it a promise 
> > in test?

"just mock _fetch" is more work and harder to understand.

It would also function, but then you would need to touch test code every time 
you change _fetch(). Furthermore, it would not be as clear why we wait for this 
particular call.


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 503
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line503>
> >
> > Since this is only called in one place, how about putting this in ___fetch, 
> > passing it the future, and logging there if it failed?

How would this be simpler and more readable?

What is wrong with abstracting functions that are called only once? Doing so 
saves a comment / pulls what would have been a comment into code!


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 518
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line518>
> >
> > In what scenario should a cache entry not exist?
> > If it somehow doesn't, we won't be able to use it either?

As you can see at the call sites, this method is used in scenarios where 
fetching succeeded, where it failed, and incidentally where it left a partial 
download lying around. I added this comment:

  // We may or may not have started downloading. The download may or may
  // not have been partial. In any case, clean up whatever is there.
  
If there is no file, that's fine. Then we tried fetching and failed before 
starting to write the file. 

In any case, we remove the cache entry and the space amount it had 
reserved/claimed is released for later use.


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 521
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line521>
> >
> > Feels like this can end up in an infinite loop, where if we can expire one 
> > item then other fetch items will get stuck forever?
> > I wonder if we should have some remedy action, or simply crash too?

This is not a loop, because the cache entry gets removed BEFORE we attempt to 
delete the file. See line 500 just above.

However, just in case future code changes ever call this method several times 
on the same entry, I added a line that sets the entry's size field to zero. 
This way, accounted cache space is only released once.
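
The effect of zeroing the size field can be shown with a tiny sketch. The types CacheEntry and SpaceAccounting are hypothetical and much simpler than the real cache classes; the only point is that a repeated release of the same entry subtracts nothing the second time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cache entry: only the accounted size matters here.
struct CacheEntry
{
  uint64_t size;
};

// Hypothetical accounting of how much cache space is in use.
class SpaceAccounting
{
public:
  explicit SpaceAccounting(uint64_t used) : used_(used) {}

  // Gives the entry's space back. Safe to call more than once.
  void release(CacheEntry& entry)
  {
    assert(entry.size <= used_);
    used_ -= entry.size;

    // After this, a second release() of the same entry subtracts zero,
    // so the accounted space cannot be released twice.
    entry.size = 0;
  }

  uint64_t used() const { return used_; }

private:
  uint64_t used_;
};
```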


> On March 18, 2015, 11:05 a.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 726
> > <https://reviews.apache.org/r/30774/diff/37/?file=897704#file897704line726>
> >
> > Why ignore error?

The code that follows this line as of line 712 handles the error case.


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review76902
---


On March 17, 2015, 6:59 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 17, 2015, 6:59 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 

Re: Review Request 32233: Replaced raw pointer by Owned pointer

2015-03-18 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32233/#review77012
---



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment124770>

Put each parameter on a new line if they don't all fit in one:

    return dispatch(process.get(),
                    &ExternalContainerizerProcess::recover,
                    state);



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment124769>

Align the parameters like this:

    return dispatch(process.get(),
                    &ExternalContainerizerProcess::launch,
                    containerId,
                    None(),
                    executorInfo,
                    directory,



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment124771>

Align parameters.



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment124773>

Align parameters.



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment124774>

1 param per line



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment124775>

1 param per line



src/slave/containerizer/external_containerizer.cpp
<https://reviews.apache.org/r/32233/#comment124776>

1 param per line


- Bernd Mathiske


On March 18, 2015, 9:14 p.m., Akanksha Agrawal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32233/
> ---
> 
> (Updated March 18, 2015, 9:14 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Replaced raw pointer by Owned pointer
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/external_containerizer.hpp 
> 70491375a6cd8988a33e3f18870c9170a37c0f17 
>   src/slave/containerizer/external_containerizer.cpp 
> 42c67f548caf7bddbe131e0dfa7d74227d8c2593 
> 
> Diff: https://reviews.apache.org/r/32233/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Akanksha Agrawal
> 
>



Re: Review Request 32163: Added a function which checks if a json object is contained within another.

2015-03-18 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32163/#review76897
---



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124590>

1. I'd rather forward-declare the class, not the inline function. Matter of 
taste, yes. "inline" without function body just seems weird.

More importantly, reading the whole function body and the function comments 
(see below) first provides a nice setup for understanding the class, which is 
the complicated part here, not the other way around.

2. The function needs a comment about what it does more than the class, 
because it is part of the API. (The class still needs one, too.)



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124604>

This is still pretty vague. What gets compared to what? Which side is the 
value in the comparator on?



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124588>

"for which checks" -> "for which it checks"



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124589>

// see bool contains(const Value&, const Value&)

->

// See 'bool contains(const Value&, const Value&)'.



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124608>

I am guessing (finding out later) that this is testing if 'object' is 
contained in 'value'? Correct? Why not write the purpose of this operator down?



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124611>

The text names the comparison in the opposite direction of the code. Easier 
to read if they parallel each other.

However, this is not really a line that needs any comment. This particular 
part of the code is entirely self-explanatory and any comment is redundant.

Higher up, describing the whole operator, is where I would put high-level 
information about what this algorithm tries to do.



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124612>

Need to have or are guaranteed to have or both? I believe both.



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124615>

Redundant. That's exactly what the code says in one line, too.



3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp
<https://reviews.apache.org/r/32163/#comment124616>

    Also redundant.


Looks good. Just small matters of taste marked in this review. First pass, have 
not looked at the tests yet.

- Bernd Mathiske


On March 18, 2015, 7:32 a.m., Alexander Rojas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32163/
> ---
> 
> (Updated March 18, 2015, 7:32 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Bernd Mathiske, Joerg Schad, 
> Niklas Nielsen, and Till Toenshoff.
> 
> 
> Bugs: MESOS-2510
> https://issues.apache.org/jira/browse/MESOS-2510
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Adds a function that performs comparison tests on subsets of JSON blobs, 
> e.g.:
> 
> ```cpp
> JSON::Value expected = JSON::parse(
>     "{"
>     "  \"key\" : true"
>     "}").get();
> 
> // Returned json:
> // {
> //   "uptime" : 45234.123,
> //   "key" : true
> // }
> JSON::Value actual = bar();
> 
> // I'm only interested in the "key" entry and ignore the rest.
> EXPECT_TRUE(contains(actual, expected));
> ```
> 
> This increases readability for tests that include JSON.
> 
> For more information on why this patch is needed, please check the JIRA 
> entry.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 
> 334c898906018be6e663f53815abbe047806b95c 
>   3rdparty/libprocess/3rdparty/stout/tests/json_tests.cpp 
> f60d1bbe60f2e2b6460c06bba98e8b85ebb6a3f9 
> 
> Diff: https://reviews.apache.org/r/32163/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Alexander Rojas
> 
>
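
The subset comparison described in the review request above can be sketched in isolation as follows. This is not the patch under review and not stout's `JSON::Value`: the `Value` type, the `make*` helpers, and `lookup` are hypothetical, arrays are left out, and scalars are assumed to compare by plain equality.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Minimal, hypothetical JSON value; this is not stout's JSON::Value.
struct Value
{
  enum class Kind { Boolean, Number, String, Object };

  Kind kind = Kind::Boolean;
  bool boolean = false;
  double number = 0.0;
  std::string text;
  std::vector<std::pair<std::string, Value>> members; // For objects only.
};

Value makeBoolean(bool b)
{
  Value v;
  v.kind = Value::Kind::Boolean;
  v.boolean = b;
  return v;
}

Value makeNumber(double n)
{
  Value v;
  v.kind = Value::Kind::Number;
  v.number = n;
  return v;
}

Value makeString(const std::string& s)
{
  Value v;
  v.kind = Value::Kind::String;
  v.text = s;
  return v;
}

Value makeObject(std::vector<std::pair<std::string, Value>> members)
{
  Value v;
  v.kind = Value::Kind::Object;
  v.members = std::move(members);
  return v;
}

// Returns the member stored under 'key', or nullptr if there is none.
const Value* lookup(const Value& object, const std::string& key)
{
  assert(object.kind == Value::Kind::Object);
  for (const auto& member : object.members) {
    if (member.first == key) {
      return &member.second;
    }
  }
  return nullptr;
}

// True if everything in 'expected' also appears in 'actual'. Extra members
// of 'actual' are ignored, at every nesting level.
bool contains(const Value& actual, const Value& expected)
{
  if (actual.kind != expected.kind) {
    return false;
  }

  switch (expected.kind) {
    case Value::Kind::Boolean: return actual.boolean == expected.boolean;
    case Value::Kind::Number: return actual.number == expected.number;
    case Value::Kind::String: return actual.text == expected.text;
    case Value::Kind::Object:
      for (const auto& member : expected.members) {
        const Value* found = lookup(actual, member.first);
        if (found == nullptr || !contains(*found, member.second)) {
          return false;
        }
      }
      return true;
  }

  return false;
}
```

The argument order follows the example in the description: the first argument is the larger blob, the second is the subset being looked for. The relation is not symmetric.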



Re: Review Request 30774: Fetcher Cache

2015-03-17 Thread Bernd Mathiske
 
passing the slave ID all the way down to the place where the cache dir name is 
constructed.

30037: Extends the fetcher info protobuf with "actions" (fetch directly 
bypassing the cache, fetch through the cache, retrieve from the cache). 
Switches the basis for dealing with uris to "items", which contain the uri, the 
action, and potentially a cache file name. Refactors fetch() and run(), so 
there is only one of each. Introduces about half of the actual cache logic, 
including a hashmap of cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in the simplest way that is still acceptable.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.
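
The reservation and eviction rules listed for 30626 can be sketched as below. This is a simplified, synchronous illustration, whereas the actual code waits asynchronously for evictions. The names (`Entry`, `Cache`, `reserve`, `unreserve`, `insert`) and the oldest-first eviction order are assumptions made for this sketch only.

```cpp
#include <cstdint>
#include <list>
#include <string>

// Hypothetical cache file record; not the class used in fetcher.cpp.
struct Entry
{
  std::string uri;
  uint64_t size;
  int references; // Greater than zero while fetch attempts use the file.
};

class Cache
{
public:
  explicit Cache(uint64_t capacity) : available_(capacity) {}

  // Reserves 'size' bytes, partly from available space and partly by
  // evicting unreferenced entries. Changes nothing and returns false if
  // that cannot free enough space.
  bool reserve(uint64_t size)
  {
    uint64_t evictable = 0;
    for (const Entry& entry : entries_) {
      if (entry.references == 0) {
        evictable += entry.size;
      }
    }

    if (size > available_ + evictable) {
      return false;
    }

    auto it = entries_.begin();
    while (available_ < size && it != entries_.end()) {
      if (it->references == 0) {
        available_ += it->size;
        it = entries_.erase(it);
      } else {
        ++it; // In use: eviction is disabled for this entry.
      }
    }

    available_ -= size;
    return true;
  }

  // Undoes a reservation, e.g. after a failed download.
  void unreserve(uint64_t size) { available_ += size; }

  // Records a cache file whose space has already been reserved.
  void insert(const std::string& uri, uint64_t size, int references)
  {
    entries_.push_back(Entry{uri, size, references});
  }

  bool contains(const std::string& uri) const
  {
    for (const Entry& entry : entries_) {
      if (entry.uri == uri) {
        return true;
      }
    }
    return false;
  }

  uint64_t available() const { return available_; }

private:
  uint64_t available_;
  std::list<Entry> entries_; // Oldest first.
};
```

With a capacity of 100, a 60-byte unreferenced file and a 30-byte referenced file, reserving 50 bytes evicts only the unreferenced file and leaves 20 bytes available. A further request for 40 bytes then fails without side effects.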


Thanks,

Bernd Mathiske



Re: Review Request 32108: Added manual make for readability training source code

2015-03-16 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32108/#review76569
---


Needs fixing. Does not seem to work for make dist-check yet.

- Bernd Mathiske


On March 16, 2015, 9:09 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32108/
> ---
> 
> (Updated March 16, 2015, 9:09 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Readability source code now has a Makefile that is generated by bootstrap and 
> configure. "cd build/readability; make" compiles the sources in 
> "readability/". No linking occurs. This is just to ensure that we have 
> syntactically correct and type-checked example files.
> 
> Slightly rearranged the content of naming_*.cpp and broke out an extra file 
> for whitespace issues.
> 
> Replaces:
> https://reviews.apache.org/r/31990/
> https://reviews.apache.org/r/31992/
> 
> 
> Diffs
> -
> 
>   configure.ac 9b2d7f15f535aaaf85faf9b4f7af750f1dbdf472 
>   readability/Makefile.am PRE-CREATION 
>   readability/TODO PRE-CREATION 
>   readability/naming_comments.cpp PRE-CREATION 
>   readability/naming_review.cpp PRE-CREATION 
>   readability/whitespace_comments.cpp PRE-CREATION 
>   readability/whitespace_review.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/32108/diff/
> 
> 
> Testing
> ---
> 
> cd build/readability; make
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>



Review Request 32108: Added manual make for readability training source code

2015-03-16 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32108/
---

Review request for mesos, Benjamin Hindman and Ben Mahler.


Repository: mesos


Description
---

Readability source code now has a Makefile that is generated by bootstrap and 
configure. "cd build/readability; make" compiles the sources in "readability/". 
No linking occurs. This is just to ensure that we have syntactically correct and 
type-checked example files.

Slightly rearranged the content of naming_*.cpp and broke out an extra file for 
whitespace issues.

Replaces:
https://reviews.apache.org/r/31990/
https://reviews.apache.org/r/31992/


Diffs
-

  configure.ac 9b2d7f15f535aaaf85faf9b4f7af750f1dbdf472 
  readability/Makefile.am PRE-CREATION 
  readability/TODO PRE-CREATION 
  readability/naming_comments.cpp PRE-CREATION 
  readability/naming_review.cpp PRE-CREATION 
  readability/whitespace_comments.cpp PRE-CREATION 
  readability/whitespace_review.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/32108/diff/


Testing
---

cd build/readability; make


Thanks,

Bernd Mathiske



Re: Review Request 30609: Added a function that reports file size, not following links.

2015-03-11 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30609/
---

(Updated March 11, 2015, 10:06 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy 
Chen.


Changes
---

Rebased. Moved os::size() to os::stat::size().


Bugs: MESOS-2072
https://issues.apache.org/jira/browse/MESOS-2072


Repository: mesos


Description
---

This returns a file's size (on UNIXes as reported by lstat(), not stat()). It 
is desired that in case of a link, the size of the link, not the size of the 
referenced file, is returned.
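
A minimal POSIX sketch of the behavior described above, assuming a plain result struct in place of stout's error-carrying return types. `SizeResult` and `linkSize` are hypothetical names. Only `lstat()` itself is the real system call the description refers to.

```cpp
#include <sys/stat.h>

#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical result type standing in for a Try-style return value.
struct SizeResult
{
  bool ok;
  off_t size;        // Valid only if 'ok' is true.
  std::string error; // Set only if 'ok' is false.
};

// Reports the size of 'path' without following a symbolic link: for a link,
// this is the size of the link itself, not of the file it refers to.
SizeResult linkSize(const std::string& path)
{
  struct stat s;

  if (::lstat(path.c_str(), &s) < 0) {
    return SizeResult{false, 0, std::strerror(errno)};
  }

  return SizeResult{true, s.st_size, ""};
}
```

Replacing `lstat()` with `stat()` in this sketch would report the size of the referenced file instead, which is the behavior the description wants to avoid.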


Diffs (updated)
-

  3rdparty/libprocess/3rdparty/stout/include/stout/os/stat.hpp 
af940a48b161c194f2efb529b3d589c543b12f61 
  3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 
c396c1d2d833b2f1721092fa35b23b5c3c3d99b3 

Diff: https://reviews.apache.org/r/30609/diff/


Testing
---

Wrote a simple test that creates a file and tests its size, and also checks if 
a non-existing file yields an error.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-11 Thread Bernd Mathiske
rations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in the simplest way that is still acceptable.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.
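
The point that only one size-determination request per URI is in play can be sketched with a table of waiters. This is a single-threaded illustration with hypothetical names (`SizeQueries`, `request`, `complete`). It assumes the caller serializes access, as an actor would, and it is not the mechanism used in the patch.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects everyone waiting for the size of a URI so
// that the actual query runs once per URI at a time.
class SizeQueries
{
public:
  using Callback = std::function<void(uint64_t)>;

  // Registers interest in the size of 'uri'. Returns true if the caller
  // has to start the actual query, false if one is already in play.
  bool request(const std::string& uri, Callback callback)
  {
    std::vector<Callback>& waiters = pending_[uri];
    waiters.push_back(std::move(callback));
    return waiters.size() == 1;
  }

  // Hands the result to every waiter and forgets the URI, so a later
  // request starts a fresh query.
  void complete(const std::string& uri, uint64_t size)
  {
    auto it = pending_.find(uri);
    if (it == pending_.end()) {
      return;
    }

    std::vector<Callback> waiters = std::move(it->second);
    pending_.erase(it);

    for (const Callback& waiter : waiters) {
      waiter(size);
    }
  }

private:
  std::map<std::string, std::vector<Callback>> pending_;
};
```

The second caller asking for the same URI gets `false` from `request()` and simply waits. Both callers receive the same value when `complete()` runs.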


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-11 Thread Bernd Mathiske


> On March 10, 2015, 8:35 a.m., Benjamin Hindman wrote:
> > src/launcher/fetcher.cpp, line 321
> > <https://reviews.apache.org/r/30774/diff/33/?file=888350#file888350line321>
> >
> > Can you comment the relationship between the FetcherInfo::Item and the 
> > FetcherInfo here? Is the FetcherInfo::Item within the FetcherInfo but 
> > FetcherInfo is included because you just want to get the 
> > 'sandbox_directory' and 'cache_directory' and rather than pulling those out 
> > explicitly you just passed the entire FetcherInfo?

There are more items in the FetcherInfo than just the one we are working on 
here. That's why this one is called out explicitly. I changed this to passing 
both directories in.


> On March 10, 2015, 8:35 a.m., Benjamin Hindman wrote:
> > src/launcher/fetcher.cpp, lines 364-366
> > <https://reviews.apache.org/r/30774/diff/33/?file=888350#file888350line364>
> >
> > Why are these not CHECKs? Since you're the one setting up the 
> > FetcherInfo it seems like you should know explicitly whether or not the 
> > cache_filename was set!
> > 
> > Same for the cache_directory below as well.

What if somebody else uses mesos-fetcher?


> On March 10, 2015, 8:35 a.m., Benjamin Hindman wrote:
> > src/launcher/fetcher.cpp, lines 403-404
> > <https://reviews.apache.org/r/30774/diff/33/?file=888350#file888350line403>
> >
> > As mentioned above, it would be great to really capture the 
> > relationship between the FetcherInfo and the FetcherInfo::Item. If the 
> > FetcherInfo encapsulates the FetcherInfo::Item I would also suggest 
> > switching the order of the parameters to signify that.

The main purpose here is to fetch this one particular item, not everything 
FetcherInfo carries. FetcherInfo is a secondary parameter that provides extra 
parameters like cache_directory, sandbox_directory, and framework_home. Putting 
it second makes this relationship clear IMHO. Do you suggest adding all these 
as individual parameters?

Yes, the item is included in the list of items in FetcherInfo. Shall we break 
up FetcherInfo into several shells, the inner one without items?


> On March 10, 2015, 8:35 a.m., Benjamin Hindman wrote:
> > src/slave/flags.hpp, line 487
> > <https://reviews.apache.org/r/30774/diff/33/?file=888358#file888358line487>
> >
> > Can we make this a Path to start?

Then it would be the only one. Confusing. I'd rather have a wholesale sweep 
over the whole code base to introduce Path - as a separate ticket.


> On March 10, 2015, 8:35 a.m., Benjamin Hindman wrote:
> > src/slave/slave.cpp, line 796
> > <https://reviews.apache.org/r/30774/diff/33/?file=888359#file888359line796>
> >
> > We should do recovery on the fetcher itself:
> > 
> > Try recover = fetcher->recover(flags, slaveId);
> > 
> > It seems very weird to have a static generic Fetcher recover 
> > functionality that implies that we can't have multiple Fetchers running at 
> > the same time. How do we start multiple slaves at the same time?

This is an artefact of the lack of injection of slaveId and flags. It should be 
cleaned up when we refactor those. The slave does not have access to the 
fetcher instance as it is right now. It would cause a lot of collateral changes 
if it did. I advise refraining from this for now. I have put a comment at the 
static method to explain this. That's the best fix for now IMHO.

There is no problem starting multiple slaves, because they all have a different 
slaveID that gets passed into this call.


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review75754
---


On March 7, 2015, 7:21 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 7, 2015, 7:21 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: me

Re: Review Request 30774: Fetcher Cache

2015-03-11 Thread Bernd Mathiske


> On March 9, 2015, 8:37 a.m., Joerg Schad wrote:
> > include/mesos/mesos.proto, line 208
> > <https://reviews.apache.org/r/30774/diff/33/?file=888347#file888347line208>
> >
> > Could you add a comment (i.e. backlink to the documentation) reminding 
> > developers to update docs/fetcher.md when the protobuf is changed?

Since we are dropping the enum, there will be no such comment. There is one 
next to the remaining "cache" field, though.


> On March 9, 2015, 8:37 a.m., Joerg Schad wrote:
> > src/slave/containerizer/fetcher.cpp, line 450
> > <https://reviews.apache.org/r/30774/diff/33/?file=888355#file888355line450>
> >
> > size_t position?

Using an iterator now.


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review75700
---


On March 7, 2015, 7:21 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 7, 2015, 7:21 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto 9df972d750ce1e4a81d2e96cc508d6f83cad2fc8 
>   src/Makefile.am d299f07d865080676ca8a550cf6005c6ab32839f 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> ec4626f903d44c0911093ff763ef16ad27c418a9 
>   src/slave/flags.hpp 56b25caf3901b38bdecb50310e8bcae0b114efa8 
>   src/slave/slave.cpp a06d68032f26ccb3f786b6ea7c3a6c3c52449bd2 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp e91e5e484eea4587ac8f2eb9cefeab4acc9f4615 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those. In dependency order:
> 
> 30033: Removes the fetcher env tests since these won't be needed any more 
> when the fetcher uses JSON in a single env var as a parameter. They 

Re: Review Request 30774: Fetcher Cache

2015-03-11 Thread Bernd Mathiske


> On March 6, 2015, 2:15 p.m., Timothy Chen wrote:
> > include/mesos/fetcher/fetcher.proto, line 58
> > <https://reviews.apache.org/r/30774/diff/32/?file=887350#file887350line58>
> >
> > It's harder to make an optional field required, but it's much easier the 
> > other way around.
> > 
> > If we always want it to be required, I think we should make the sandbox 
> > a required field.
> 
> Bernd Mathiske wrote:
> There was some discussion about whether this field should be required or 
> not. The general idea here is that a task might be able to run without 
> fetching anything into its sandbox. In this case, the framework may get away 
> without naming the sandbox. But since a task always has one, we could also 
> make it required. I am impartial in this choice, but I see that your argument 
> that required->optional is easier has pull.

I have heard good arguments both ways. Here is how I see it. 

For the recipient of a message, "optional" is the preferred choice. Then any 
legacy recipient's code is always prepared for everything and robust wrt. 
changing to "required". Not the other way around.

But for the sender, "required" is the better choice, making sender code more 
robust. If legacy senders still provide the field when it has become optional, 
that's OK. Not the other way around.

So which side are we on in this case? Inasmuch as this is an internal 
protocol, we are on neither side and we can change it in arbitrary ways. 

It becomes an external protocol if someone other than a Mesos slave uses 
mesos-fetcher (perhaps a special external containerizer). Then we are providing 
the message recipient and we have to be on that side. Therefore I am voting for 
"optional".


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review75491
---


On March 7, 2015, 7:21 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 7, 2015, 7:21 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto 9df972d750ce1e4a81d2e96cc508d6f83cad2fc8 
>   src/Makefile.am d299f07d865080676ca8a550cf6005c6ab32839f 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> ec4626f903d44c

Re: Review Request 30774: Fetcher Cache

2015-03-11 Thread Bernd Mathiske


> On March 2, 2015, 8:13 p.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 759
> > <https://reviews.apache.org/r/30774/diff/25/?file=882228#file882228line759>
> >
> > If you're incrementing all the time just to count, why not just get the 
> > size from list?
> 
> Bernd Mathiske wrote:
> I am not incrementing to count anything. I am incrementing to hit the 
> right index in a vector that parallels the list I am iterating over. Is there 
> a C++ or Boost construct that can do this without indices?

Switched to using a const_iterator for this. It should now parallel the 
foreach more obviously.
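
For readers without the diff at hand, the idea can be sketched as follows. The 
function and variable names are hypothetical and this is not the code under 
review: a const_iterator over the vector advances in lockstep with the loop 
over the list, so no integer index is needed.

```cpp
#include <list>
#include <string>
#include <utility>
#include <vector>

// Pairs each list element with the vector element at the same position,
// advancing a const_iterator alongside the loop instead of using an index.
std::vector<std::pair<std::string, int>> zip(
    const std::list<std::string>& names,
    const std::vector<int>& sizes) {
  std::vector<std::pair<std::string, int>> result;
  std::vector<int>::const_iterator size = sizes.begin();
  for (const std::string& name : names) {
    if (size == sizes.end()) {
      break;  // The vector is shorter than the list; stop early.
    }
    result.emplace_back(name, *size);
    ++size;
  }
  return result;
}
```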


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review74885
---


On March 7, 2015, 7:21 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 7, 2015, 7:21 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto 9df972d750ce1e4a81d2e96cc508d6f83cad2fc8 
>   src/Makefile.am d299f07d865080676ca8a550cf6005c6ab32839f 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> ec4626f903d44c0911093ff763ef16ad27c418a9 
>   src/slave/flags.hpp 56b25caf3901b38bdecb50310e8bcae0b114efa8 
>   src/slave/slave.cpp a06d68032f26ccb3f786b6ea7c3a6c3c52449bd2 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp e91e5e484eea4587ac8f2eb9cefeab4acc9f4615 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those. In dependency order:
> 
> 30033: Removes the fetcher env tests since these won't be needed any more 
> when the fetcher uses JSON in a single env var as a parameter. They never 
> tested anything that won't be covered by other tests 

Re: Review Request 30774: Fetcher Cache

2015-03-09 Thread Bernd Mathiske


> On March 9, 2015, 8:37 a.m., Joerg Schad wrote:
> > src/tests/fetcher_cache_tests.cpp, line 134
> > <https://reviews.apache.org/r/30774/diff/33/?file=888361#file888361line134>
> >
> > Can't we simulate SERIALIZED_TASK externally (as discussed)? In that case 
> > we would not have several modes...

The whole ExecutionMode enum should go. We should use executeTask inside the 
loop that creates TaskInfos in each test and then wait explicitly inside or 
outside the loop as needed. I'll refactor accordingly in the next iteration. 
Also, we don't need enum value FAIL_TO_FETCH. It is no longer used anywhere.


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review75700
---


On March 7, 2015, 7:21 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 7, 2015, 7:21 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto 9df972d750ce1e4a81d2e96cc508d6f83cad2fc8 
>   src/Makefile.am d299f07d865080676ca8a550cf6005c6ab32839f 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> ec4626f903d44c0911093ff763ef16ad27c418a9 
>   src/slave/flags.hpp 56b25caf3901b38bdecb50310e8bcae0b114efa8 
>   src/slave/slave.cpp a06d68032f26ccb3f786b6ea7c3a6c3c52449bd2 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp e91e5e484eea4587ac8f2eb9cefeab4acc9f4615 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those. In dependency order:
> 
> 30033: Removes the fetcher env tests since these won't be needed any more 
> when the fetcher uses JSON in a single env var as a parameter. They never 
> tested anything that won't be covered by other tests anyway.
> 
>

Re: Review Request 30774: Fetcher Cache

2015-03-07 Thread Bernd Mathiske
concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet most simple way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one request per 
URI is in flight at a time.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction is disabled for URIs whose cache files are currently in use. We use 
reference counting for this, since there may be concurrent fetch attempts 
using the same cache files.
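
The reservation and eviction rules listed above can be sketched roughly as 
follows. This is an illustration under stated assumptions, not the patch 
itself: CacheSpace and all of its member names are hypothetical, and evicting 
in map order is an arbitrary choice, since the eviction policy is not 
described here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

class CacheSpace {
public:
  explicit CacheSpace(std::uint64_t capacity)
    : capacity_(capacity), used_(0) {}

  // Registers a file that already resides in the cache.
  void add(const std::string& uri, std::uint64_t size) {
    assert(size <= available());
    entries_[uri] = Entry{size, 0};
    used_ += size;
  }

  // A fetch in progress holds a reference, which disables eviction.
  void reference(const std::string& uri) { ++entries_.at(uri).references; }

  void unreference(const std::string& uri) {
    Entry& entry = entries_.at(uri);
    assert(entry.references > 0);
    --entry.references;
  }

  // Reserves `size` bytes, partly from free space and partly by evicting
  // unreferenced entries. Returns false and changes nothing if that is
  // not enough.
  bool reserve(std::uint64_t size) {
    std::uint64_t evictable = 0;
    for (const auto& item : entries_) {
      if (item.second.references == 0) evictable += item.second.size;
    }
    if (available() + evictable < size) return false;

    // Evict in map order until the request fits (an arbitrary policy).
    for (auto it = entries_.begin();
         it != entries_.end() && available() < size;) {
      if (it->second.references == 0) {
        used_ -= it->second.size;
        it = entries_.erase(it);
      } else {
        ++it;
      }
    }
    used_ += size;
    return true;
  }

  // Undoes a reservation, for example after a failed download.
  void release(std::uint64_t size) {
    assert(size <= used_);
    used_ -= size;
  }

  bool contains(const std::string& uri) const {
    return entries_.count(uri) > 0;
  }

  std::uint64_t available() const { return capacity_ - used_; }

private:
  struct Entry {
    std::uint64_t size;
    int references;
  };

  std::uint64_t capacity_;
  std::uint64_t used_;
  std::map<std::string, Entry> entries_;
};
```

With a capacity of 100 and two 40-byte entries, one of them referenced, a 
request for 50 bytes takes 20 from free space and the rest by evicting the 
unreferenced entry; the referenced one stays until its reference is dropped.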


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-07 Thread Bernd Mathiske


> On March 6, 2015, 2:15 p.m., Timothy Chen wrote:
> > src/launcher/fetcher.cpp, line 178
> > <https://reviews.apache.org/r/30774/diff/32/?file=887354#file887354line178>
> >
> > You log the extraction command but in this case don't log the copy 
> > command.
> > 
> > I think to be consistent, let's not log the command, and like you do 
> > here only log when the command fails.
> > 
> > What do you think?

Logging the command in both cases now.


> On March 6, 2015, 2:15 p.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.hpp, line 195
> > <https://reviews.apache.org/r/30774/diff/32/?file=887358#file887358line195>
> >
> > Why not just store the Path and return that?

"directory" is a temporary artefact that will disappear once we refactor so 
that flags get injected into the fetcher. I added a comment saying that.


> On March 6, 2015, 2:15 p.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 175
> > <https://reviews.apache.org/r/30774/diff/32/?file=887359#file887359line175>
> >
> > Let's use strings::contains instead of find to be consistent here.

Also fixed all other occurrences.
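
For context, the point of the change is that a named helper states intent, 
whereas find requires a comparison against npos at every call site. A minimal 
sketch, in which the free function contains is a hypothetical stand-in for the 
stout helper named in the review (its exact signature is not shown in this 
thread):

```cpp
#include <string>

// Stand-in for a "contains" helper: true if `substr` occurs in `s`.
inline bool contains(const std::string& s, const std::string& substr) {
  return s.find(substr) != std::string::npos;
}
```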


> On March 6, 2015, 2:15 p.m., Timothy Chen wrote:
> > src/slave/slave.cpp, line 3710
> > <https://reviews.apache.org/r/30774/diff/32/?file=887363#file887363line3710>
> >
> > Why is this just a Failure but the other recover is a LOG(FATAL)? 
> > Shouldn't we exit here too if being unable to recover the cache is a 
> > critical event?

The method we are in returns a future, so we can return a Failure here. This 
leads to exiting eventually. At the other site, the method we are in only 
returns void. Suggestions for that?


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review75491
---


On March 6, 2015, 5:46 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 6, 2015, 5:46 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto 9df972d750ce1e4a81d2e96cc508d6f83cad2fc8 
>   src/Makefile.am d299f07d865080676ca8a550cf6005c6ab32839f 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 

Re: Review Request 30774: Fetcher Cache

2015-03-06 Thread Bernd Mathiske
n all the tests in Mesos, the "unmocked" methods 
> > tend to just have a prefix of "_", so run -> _run

If you use "_run" how do you distinguish this from a continuation of the same 
name? We cannot possibly use this naming scheme. Please either convert to mine 
or come up with a better one. I think that unmocked-something makes it very 
clear what is going on without making first-time readers guess or requiring 
extra comments. So I'd prefer leaving it like that.

(I had mocked a method _fetch in an earlier patch and there is also a 
continuation __fetch...)


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review75491
---


On March 6, 2015, 5:46 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 6, 2015, 5:46 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto 9df972d750ce1e4a81d2e96cc508d6f83cad2fc8 
>   src/Makefile.am d299f07d865080676ca8a550cf6005c6ab32839f 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> ec4626f903d44c0911093ff763ef16ad27c418a9 
>   src/slave/flags.hpp 56b25caf3901b38bdecb50310e8bcae0b114efa8 
>   src/slave/slave.cpp a06d68032f26ccb3f786b6ea7c3a6c3c52449bd2 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp e91e5e484eea4587ac8f2eb9cefeab4acc9f4615 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> --- longer Description: ---
> 
> -Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those. In dependency order:
> 
> 30033: Removes the fetcher env tests since these won't be needed any more 
> when the fetcher uses JSON in a single env var as a parameter. They never 
> tested anything that won't be covered by other tests anyway.
> 
> 30034: Makes the code structure of all fetcher tests the same. Instead of

Re: Review Request 30774: Fetcher Cache

2015-03-06 Thread Bernd Mathiske
asic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Download results 
in the cache are reused when multiple fetcher runs occur concurrently. 
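
The reuse of download results can be illustrated with a small sketch. It is 
not the FetcherProcess implementation: DownloadOnce and its members are 
hypothetical names, and a mutex replaces the actor-based coordination used in 
the patch.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

class DownloadOnce {
public:
  using Downloader = std::function<std::string(const std::string&)>;

  explicit DownloadOnce(Downloader download)
    : download_(std::move(download)) {}

  // Returns the cache path for `uri`, downloading only on the first request.
  // Holding the lock during the download serializes all fetches, which keeps
  // the sketch short; an actor-based design would wait asynchronously.
  std::string fetch(const std::string& uri) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = paths_.find(uri);
    if (it == paths_.end()) {
      it = paths_.emplace(uri, download_(uri)).first;
    }
    return it->second;
  }

private:
  Downloader download_;
  std::mutex mutex_;
  std::map<std::string, std::string> paths_;
};
```

Every later request for the same URI gets the stored path, so the downloader 
runs once per URI no matter how many fetch attempts ask for it.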

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet most simple way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one request per 
URI is in flight at a time.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction is disabled for URIs whose cache files are currently in use. We use 
reference counting for this, since there may be concurrent fetch attempts 
using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-05 Thread Bernd Mathiske


> On March 4, 2015, 4:39 p.m., Jay Buffington wrote:
> > Hey Bernd,
> > 
> > I'm really looking forward to this feature.  There's a lot here, so I was 
> > hoping you could help me understand by responding to some of these 
> > questions:
> > 
> > Why do you need the cache table data structure?  Just use the filesystem?
> > Why are the expanded files cached as well?  
> > There shouldn’t be different behavior if we’re using the cache.  My 
> > understanding is that with this patch, if we use the cache the tar doesn’t 
> > exist in the sandbox.  Isn't this a regression?
> > What’s the point of segregating the cache by user?
> > Why not respect http caching headers?
> > Why does the framework need to even know if the cache is in use or not?
> > The images referenced in the fetcher docs aren’t part of the review.  Where 
> > can I find them?
> > 
> > Thanks!
> > Jay

Hi Jay,

Thanks for these great questions! In summary, everything you are asking for 
feature-wise can be offered later (soonish) through feature additions that are 
relatively simple to implement. 

Answers to your questions follow, in order.

- If I just used the file system to implement the cache without a libprocess 
actor as complement, I would need to persist state about cache contents, use 
file locks, coordinate multiple instances of running mesos-fetcher programs, 
etc. There is a possible alternative architecture for this that would also 
work. See the JIRA comments on MESOS-336 for an earlier discussion on this. My 
personal preference would be to perhaps further develop what is now 
FetcherProcess into an external program (with fail-over) rather than trying to 
beef up mesos-fetcher, which would lead to a lot of IPC for coordination.
- I am not aware of caching expanded files. We only cache the archive file 
itself.
- Not having a tar file in the sandbox is not a regression if you see using the 
cache at all as a new feature. But I can copy it over optionally if so desired 
in an add-on patch. This is just MVP and it seems more likely that people would 
rather not have the tar file copy.
- I would not want to have a framework for one user plant a cache file that a 
framework of another user then picks up. This file could be lying around for a 
long time, from way before the second framework starts. We can later make this 
optional as an extra feature. I am erring on the side of caution in this MVP.
- Excellent suggestion. But this is for later. Extra feature that I also find 
important.
- We can have another URI.cache value that makes it so.
- Sorry for having removed the images for now. I had trouble applying the patch 
with pictures in it. Advice on what git/RB supports here is welcome! For now, 
you can git clone https://github.com/bernd-mesos/MesosFetcherDocs and then open 
the md files locally or you can look at the PDFs which I also uploaded.


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review75266
---


On March 5, 2015, 3:15 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 5, 2015, 3:15 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) H

Re: Review Request 30774: Fetcher Cache

2015-03-05 Thread Bernd Mathiske
rrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet most simple way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.
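
The reservation arithmetic in the list above (part from available space, part from evictions) can be sketched roughly as follows. This is an illustrative model only, not the code in fetcher.cpp: `Entry`, `CacheSpace`, and `reserve` are hypothetical names, and eviction is modeled as synchronous for brevity, whereas the real fetcher waits for it asynchronously.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <list>
#include <string>

// Hypothetical model of a cached file; not the actual Mesos type.
struct Entry {
  std::string uri;
  uint64_t size;
  int referenceCount;  // > 0 means "in use", so eviction is disabled.
};

// Hypothetical bookkeeping for the cache directory's space.
struct CacheSpace {
  uint64_t capacity = 0;     // Size limit given by the slave flag.
  uint64_t used = 0;         // Bytes held by entries plus reservations.
  std::list<Entry> entries;  // Oldest first.

  // Reserves `needed` bytes, taking available space first and evicting
  // unreferenced entries for the rest. Returns false, changing nothing,
  // if even eviction cannot free enough space.
  bool reserve(uint64_t needed) {
    assert(used <= capacity);
    const uint64_t fromAvailable = std::min(needed, capacity - used);
    const uint64_t shortfall = needed - fromAvailable;

    // First pass: can evictions cover the shortfall at all?
    uint64_t evictable = 0;
    for (const Entry& entry : entries) {
      if (entry.referenceCount == 0) evictable += entry.size;
    }
    if (evictable < shortfall) return false;

    // Second pass: evict the oldest unreferenced entries until covered.
    uint64_t freed = 0;
    for (auto it = entries.begin(); it != entries.end() && freed < shortfall;) {
      if (it->referenceCount == 0) {
        freed += it->size;
        used -= it->size;
        it = entries.erase(it);
      } else {
        ++it;
      }
    }

    used += needed;
    return true;
  }
};
```

If a later step fails, undoing the reservation in this model amounts to subtracting `needed` from `used` again.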


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-04 Thread Bernd Mathiske
f cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 
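
As a rough illustration of how concurrent requests for one URI can share a single download, consider the sketch below. It is not the fetcher's actual code: `DownloadCoordinator`, `request`, and `complete` are hypothetical names, and the model assumes all calls arrive on one actor-like thread (as they would inside a libprocess process), so it uses no locking.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical coordinator that lets later requests for a URI wait on the
// download started by the first request.
class DownloadCoordinator {
 public:
  using Callback = std::function<void(const std::string& cachePath)>;

  // Registers interest in `uri`. Returns true if the caller is the first
  // requester and must start the download; false if one is in progress.
  bool request(const std::string& uri, Callback onDone) {
    assert(!uri.empty());
    const bool first = (waiters_.find(uri) == waiters_.end());
    waiters_[uri].push_back(std::move(onDone));
    return first;
  }

  // Called once the download has finished; notifies every waiter.
  void complete(const std::string& uri, const std::string& cachePath) {
    auto it = waiters_.find(uri);
    if (it == waiters_.end()) return;
    std::vector<Callback> callbacks = std::move(it->second);
    waiters_.erase(it);
    for (const Callback& callback : callbacks) callback(cachePath);
  }

 private:
  std::map<std::string, std::vector<Callback>> waiters_;
};
```

Two fetcher runs asking for the same URI would both call `request`; only the first gets `true` and downloads, and both callbacks fire when `complete` is called.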

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-03 Thread Bernd Mathiske
f cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).
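
Because "-h" yields human-readable sizes, the returned string has to be turned into a byte count before it can be compared with the free cache space. A minimal sketch of such a conversion follows. It assumes the size part looks like "12 M" or "1.5 K" (a number, a space, and a binary unit letter); the exact output format depends on the Hadoop version, and `parseHumanSize` is a hypothetical helper, not part of the hdfs.hpp interface.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Converts a human-readable size such as "1.5 K" into bytes.
// Returns false if the text does not match the assumed format.
bool parseHumanSize(const std::string& text, uint64_t* bytes) {
  assert(bytes != nullptr);

  std::istringstream in(text);
  double value = 0;
  if (!(in >> value) || value < 0) return false;

  std::string unit;
  in >> unit;  // Stays empty when the size is given in plain bytes.

  uint64_t factor = 1;
  if (!unit.empty() && unit != "B") {
    switch (unit[0]) {
      case 'K': factor = 1ull << 10; break;
      case 'M': factor = 1ull << 20; break;
      case 'G': factor = 1ull << 30; break;
      case 'T': factor = 1ull << 40; break;
      default: return false;  // Unknown unit (assumption: binary units only).
    }
  }

  *bytes = static_cast<uint64_t>(value * static_cast<double>(factor));
  return true;
}
```

Fractional values are truncated toward zero, so a caller that must guarantee enough space may prefer to round up.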

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-03 Thread Bernd Mathiske
loading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.
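
The reference counting mentioned in the last item can be pictured with the small sketch below. It is only a model of the idea, with hypothetical names (`CacheReferences`, `reference`, `unreference`, `evictable`), not the bookkeeping actually used in fetcher.cpp.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-URI reference counts. A cache file may be evicted only
// while no fetch attempt holds a reference to it.
class CacheReferences {
 public:
  // Called when a fetch attempt starts using the cache file for `uri`.
  void reference(const std::string& uri) { ++counts_[uri]; }

  // Called when a fetch attempt is done with the cache file, including
  // from the cleanup path after an error.
  void unreference(const std::string& uri) {
    auto it = counts_.find(uri);
    assert(it != counts_.end() && it->second > 0);
    if (--it->second == 0) counts_.erase(it);
  }

  // Eviction is disabled while at least one reference is held.
  bool evictable(const std::string& uri) const {
    return counts_.find(uri) == counts_.end();
  }

 private:
  std::map<std::string, int> counts_;
};
```

With two concurrent fetch attempts on the same cache file, the file becomes evictable again only after both have called `unreference`.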


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-03 Thread Bernd Mathiske
. Refactors fetch() and run(), so 
there is only one of each. Introduces about half of the actual cache logic, 
including a hashmap of cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.
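
To make the idea of URI type separation concrete, a classification step might look like the sketch below. The set of schemes shown is an assumption for illustration (the description only names http, hdfs, and local copying, "etc."), and `UriType` and `classify` are hypothetical names rather than the functions factored out in this review.

```cpp
#include <cassert>
#include <string>

// Hypothetical URI categories; the real fetcher distinguishes more cases.
enum class UriType { Http, Hdfs, Local, Unknown };

static bool startsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Decides which fetch (or size query) strategy applies to `uri`.
UriType classify(const std::string& uri) {
  assert(!uri.empty());
  if (startsWith(uri, "http://") || startsWith(uri, "https://")) {
    return UriType::Http;
  }
  if (startsWith(uri, "hdfs://")) {
    return UriType::Hdfs;
  }
  // Assumption: "file://" URIs and absolute paths are copied locally.
  if (startsWith(uri, "file://") || uri[0] == '/') {
    return UriType::Local;
  }
  return UriType::Unknown;
}
```

Keeping this decision in one function is what allows both the download itself and the preceding size query to dispatch on the same result.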

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-03 Thread Bernd Mathiske


> On March 2, 2015, 8:13 p.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 759
> > <https://reviews.apache.org/r/30774/diff/25/?file=882228#file882228line759>
> >
> > If you're incrementing all the time just to count, why not just get the 
> > size from list?

I am not incrementing to count anything. I am incrementing to hit the right 
index in a vector that parallels the list I am iterating over. Is there a C++ 
or Boost construct that can do this without indices?
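
For what it is worth, one index-free way to walk a list together with a parallel vector is to advance two iterators in lockstep. The sketch below shows that pattern; `forEachPair` is a hypothetical helper written for this illustration, not a stout or Boost function. Boost.Range also offers zip-style iteration (for example `boost::combine`), which may be an alternative.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Calls f(item, companion) for each list element and the vector element
// at the same position, without a manually maintained index.
template <typename A, typename B, typename F>
void forEachPair(const std::list<A>& items,
                 const std::vector<B>& companions,
                 F f) {
  assert(items.size() == companions.size());
  auto companion = companions.begin();
  for (auto item = items.begin(); item != items.end(); ++item, ++companion) {
    f(*item, *companion);
  }
}
```

Whether this reads better than an explicit index is a matter of taste; the size check at the top at least documents the assumption that both containers stay in step.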


> On March 2, 2015, 8:13 p.m., Timothy Chen wrote:
> > src/slave/containerizer/fetcher.cpp, line 831
> > <https://reviews.apache.org/r/30774/diff/25/?file=882228#file882228line831>
> >
> > Why is the check of the entries necessary? It seems that if this is for 
> > tests only, we should do the validations in the tests?

This is "in tests". This method is for testing. It says so in its header file 
comment.


- Bernd


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review74885
-------


On March 3, 2015, 5:01 a.m., Bernd Mathiske wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> ---
> 
> (Updated March 3, 2015, 5:01 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
> MESOS-2074
> https://issues.apache.org/jira/browse/MESOS-2057
> https://issues.apache.org/jira/browse/MESOS-2069
> https://issues.apache.org/jira/browse/MESOS-2070
> https://issues.apache.org/jira/browse/MESOS-2072
> https://issues.apache.org/jira/browse/MESOS-2073
> https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Almost all of the functionality in epic MESOS-336. Downloaded files from 
> CommandInfo::URIs can now be cached in a cache directory designated by a 
> slave flag. This only happens when asked for by an extra flag in the URI and 
> is thus backwards-compatible. The cache has a size limit also given by a new 
> slave flag. Cache-resident files are evicted as necessary to make space for 
> newly fetched ones. Concurrent attempts to cache the same URI lead to only 
> one download. The fetcher program remains external for safety reasons, but is 
> now augmented with more elaborate parameters packed into a JSON object to 
> implement specific fetch actions for all of the above. Additional testing 
> includes fetching from (mock) HDFS and coverage of the new features.
> 
> 
> Diffs
> -
> 
>   docs/fetcher-cache-internals.md PRE-CREATION 
>   docs/fetcher.md PRE-CREATION 
>   docs/images/fetch_cache.jpg PRE-CREATION 
>   docs/images/fetch_components.jpg PRE-CREATION 
>   docs/images/fetch_flow.jpg PRE-CREATION 
>   docs/images/fetch_force1.jpg PRE-CREATION 
>   docs/images/fetch_force2.jpg PRE-CREATION 
>   docs/images/fetch_state.jpg PRE-CREATION 
>   include/mesos/fetcher/fetcher.proto 
> 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
>   include/mesos/mesos.proto 9df972d750ce1e4a81d2e96cc508d6f83cad2fc8 
>   src/Makefile.am d299f07d865080676ca8a550cf6005c6ab32839f 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
>   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
>   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
>   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
>   src/slave/containerizer/fetcher.hpp 
> 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
>   src/slave/containerizer/fetcher.cpp 
> 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
>   src/slave/containerizer/mesos/containerizer.hpp 
> ae61a0fcd19f2ba808624312401f020121baf5d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> ec4626f903d44c0911093ff763ef16ad27c418a9 
>   src/slave/flags.hpp 56b25caf3901b38bdecb50310e8bcae0b114efa8 
>   src/slave/slave.cpp a06d68032f26ccb3f786b6ea7c3a6c3c52449bd2 
>   src/tests/docker_containerizer_tests.cpp 
> 06cd3d89ecbaaac17ae6970604b21fbe29f6e887 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
>   src/tests/mesos.hpp e91e5e484eea4587ac8f2eb9cefeab4acc9f4615 
>   src/tests/mesos.cpp c8f43d21b214e75eaac2870cbdf4f03fd18707d1 
> 
> Diff: https://reviews.apache.org/

Re: Review Request 30774: Fetcher Cache

2015-03-03 Thread Bernd Mathiske
le name. Refactors fetch() and run(), so 
there is only one of each. Introduces about half of the actual cache logic, 
including a hashmap of cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-02 Thread Bernd Mathiske
 of the actual cache logic, 
including a hashmap of cache file objects for bookkeeping and basic operations 
on it. 

30039: Enables fetcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-01 Thread Bernd Mathiske
loading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-01 Thread Bernd Mathiske
 concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30774: Fetcher Cache

2015-03-01 Thread Bernd Mathiske
tcher cache actions in the mesos fetcher program.

30006: Enables concurrent downloading into the fetcher cache. Reuse of download 
results in the cache when multiple fetcher runs occur concurrently. 

30614: This is to ensure that all this refactoring of fetcher code has not 
broken HDFS fetching. Adds a test that exercises the C++ code paths in Mesos 
and mesos-fetcher related to fetching from HDFS. Uses a mock HDFS client 
written in bash that acts just like a real "hadoop" command if used in the 
right limited way.

30124: Inserted fetcher cache zap upon slave startup, recovery and shutdown. 
This implements recovery in an acceptable, yet the simplest possible, way.

30173: Created fetcher cache tests. Adds a new test source file containing a 
test fixture and tests to find out if the fetcher cache works with a variety of 
settings.

30616: Adds hdfs::du() which calls "hadoop fs -du -h" and returns a string that 
contains the file size for the URI passed as argument. This is needed to 
determine the size of a file on HDFS before downloading it to the fetcher cache 
(to ensure there is enough space).

30621: Refactored URI type separation in mesos-fetcher. Moved the URI type 
separation code (distinguishes http, hdfs, local copying, etc.) from 
mesos-fetcher to the fetcher process/actor, since it is going to be reused by 
download size queries when we introduce fetcher cache management. Also factored 
out URI validation, which will be used the same way by mesos-fetcher and the 
fetcher process/actor.

30626: Fetcher cache eviction. This happens when the cache does not have enough 
space to accommodate upcoming downloads to the cache. Necessary provisions 
included here:
- mesos-fetcher does not run until evictions have been successful
- Cache space is reserved while (async) waiting for eviction to succeed. If it 
fails, the reservation gets undone.
- Reservations can be partly from available space, partly from evictions. All 
math included :-)
- To find out how much space is needed, downloading has a prelude in which we 
query the download size from the URI. This works for all URI types that 
mesos-fetcher currently supports, including http and hdfs.
- Size-determination requests are now synchronized, too. Only one per URI in 
play happens.
- There is cleanup code for all kinds of error situations. At the very end of 
the fetch attempt, each list is processed for undoing things like space 
reservations and eviction disabling.
- Eviction gets disabled for URIs that are currently in use, i.e. the related 
cache files are. We use reference counting for this, since there may be 
concurrent fetch attempts using the same cache files.


Thanks,

Bernd Mathiske



Re: Review Request 30606: Added net::contentLength() to query "content-length" field from HTTP header.

2015-02-28 Thread Bernd Mathiske

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30606/
---

(Updated Feb. 28, 2015, 6:51 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy 
Chen.


Changes
---

Added comment that content-length may not contain any useful value.


Bugs: MESOS-2072
https://issues.apache.org/jira/browse/MESOS-2072


Repository: mesos


Description
---

Adds net::contentLength(). This makes a short HTTP request to read the 
"content-length" field from the HTTP header.

This will be used to determine the size of a file before downloading it to the 
fetcher cache, in order to ensure there is enough space ahead of time.
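
To illustrate the header lookup involved, the sketch below extracts the field from response header text that has already been received. It is not the net::contentLength() implementation from this review (which issues the HTTP request itself); `parseContentLength` is a hypothetical helper. A missing or non-numeric value is reported as failure, since the field may not contain any useful value.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Finds "Content-Length" (case-insensitively) in raw HTTP header text and
// stores its value in *length. Returns false if absent or malformed.
bool parseContentLength(const std::string& headers, uint64_t* length) {
  assert(length != nullptr);

  const std::string name = "content-length:";
  std::istringstream in(headers);
  std::string line;

  while (std::getline(in, line)) {
    if (line.size() < name.size()) continue;

    // Case-insensitive comparison of the field name.
    bool match = true;
    for (std::size_t i = 0; i < name.size(); ++i) {
      if (std::tolower(static_cast<unsigned char>(line[i])) != name[i]) {
        match = false;
        break;
      }
    }
    if (!match) continue;

    // Skip optional whitespace, then read decimal digits.
    std::size_t pos = name.size();
    while (pos < line.size() && (line[pos] == ' ' || line[pos] == '\t')) ++pos;

    const std::size_t start = pos;
    uint64_t value = 0;  // Overflow is not checked in this sketch.
    while (pos < line.size() &&
           std::isdigit(static_cast<unsigned char>(line[pos]))) {
      value = value * 10 + static_cast<uint64_t>(line[pos] - '0');
      ++pos;
    }
    if (pos == start) return false;  // No digits: no useful value.

    *length = value;
    return true;
  }
  return false;
}
```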


Diffs (updated)
-

  3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 
9635bbc6f7dae1d75a780069fcc60fb706221053 

Diff: https://reviews.apache.org/r/30606/diff/


Testing
---

Used this function in the context of my implementation of MESOS-2072 and 
MESOS-2074. It correctly reported the content length from an HTTP server 
constructed from libprocess primitives.


Thanks,

Bernd Mathiske


