Re: RFC: Improving linting in Mesos (MESOS-9630)

2019-08-18 Thread James Peach



> On Aug 17, 2019, at 10:12 PM, Benjamin Bannier  wrote:
> 
> Hi,
> 
> I opened MESOS-9360[^1] to improve the way we do linting in Mesos some time
> ago. I have put some polish on my private setup and have now published it; I
> am asking for feedback, as linting is an important part of working with Mesos
> for most of you. I moved my workflow to pre-commit more than 6 months ago
> and prefer it so much that I will not go back to `support/mesos-style.py`.
> 
> * * *
> 
> We use `support/mesos-style.py` to perform linting, most often triggered
> automatically when committing. This setup is powerful, but also hard to
> maintain and extend. pre-commit[^2] is a framework for managing Git commit
> hooks with an exciting set of features: it can often be configured with YAML
> alone, and it comes with a long list of existing linters[^3]. Should we go
> with this approach, we could, e.g., trivially enable linters for Markdown or
> HTML (after fixing the current, sometimes wild state of the sources).
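For illustration, a minimal pre-commit configuration might look like the sketch below. This is not the actual Mesos configuration; the first repository and its hook ids are generic examples from pre-commit's public hook list, and the local `mesos-style` hook is hypothetical.

```yaml
# .pre-commit-config.yaml -- illustrative sketch only, not the Mesos config.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
      - id: trailing-whitespace   # strip trailing whitespace
      - id: end-of-file-fixer     # ensure files end with a single newline
  - repo: local
    hooks:
      - id: mesos-style           # hypothetical local C++ style hook
        name: mesos-style
        entry: support/mesos-style.py
        language: script
        files: \.(cpp|hpp)$
```

With such a file in place, `pre-commit install` wires the hooks into `.git/hooks`, and `pre-commit run --all-files` lints the whole tree.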
> 
> I would encourage you to play with the chain ending in r/71300[^4] on some
> fresh clone (as this modifies your Git hooks). You need to install
> pre-commit[^5] _before applying the chain_, and then run
> `support/setup_dev.sh`. This setup mirrors the existing functionality of
> `support/mesos-style.py`, but also has new linters activated. This should
> present a pretty streamlined workflow. I have also adjusted the Windows
> setup, but have not tested it.
> 
> I have also spent some time making the transition from our current linting
> setup easier. If you are feeling adventurous, you can apply the chain up to
> r/71209/ on your existing setup and run `support/setup_dev.sh`.
> 
> One noticeable change is that with pre-commit we will store (some) linters in
> `$XDG_CACHE_HOME` (default: `$HOME/.cache`). The existing setup stores some
> linter files in the build directory, so a "clean build" might require
> downloading linter files again. With pre-commit, OTOH, one needs to perform
> garbage-collection out of band (e.g., by executing `pre-commit gc`, or by
> deleting the cache directory).
> 
> * * *
> 
> Please let me know whether we should move forward with this change, whether
> you think it needs important adjustments, or whether you see fundamental
> reasons that this is a bad idea. If you like what you see here, I would be
> happy to know about that as well.

I set this up and did a quick test commit. The only issue I hit was installing 
pre-commit on Fedora 30 (I needed to do "python3 -m pip install pre-commit").
This is a much more polished experience than the previous scripts, and I liked 
it a lot.

J

Re: Changing behaviour of suppressOffers() to preserve suppressed state on transparent re-registration by the scheduler driver

2019-06-21 Thread James Peach
So this proposal would only affect schedulers using the libmesos scheduler 
driver API? Schedulers using the v1 HTTP API would not get any changes in 
behaviour, right?

> On Jun 21, 2019, at 9:56 PM, Andrei Sekretenko  
> wrote:
> 
> Hi all,
> 
> we are intending to change the behavior of the suppressOffers() method of
> MesosSchedulerDriver with regard to the transparent re-registration.
> 
> Currently, when the driver becomes disconnected from a master, it performs
> a re-registration on its own with an empty set of suppressed roles. This
> causes un-suppression of all the suppressed roles of the framework.
> 
> The plan is to alter this behavior to preserve the suppression state on
> this re-registration.
> 
> The required set of suppressed roles will be stored in the driver, which
> will now perform re-registration with this set (instead of an empty one)
> and update the stored set whenever a call modifying the suppression state
> of the roles in the allocator is performed.
> Currently, the driver has two methods which perform such calls:
> suppressOffers() and reviveOffers().
> 
> Please feel free to raise any concerns or objections - especially if you
> are aware of any V0 frameworks which (probably implicitly) depend on
> un-suppression of the roles when this re-registration occurs.
> 
> 
> 
> Note that:
> - Frameworks which do not call suppressOffers() are, obviously, unaffected
> by this change.
> 
> - Frameworks that reliably prevent transparent re-registration (for
> example, by calling driver.abort() immediately from the disconnected()
> callback) should also be unaffected.
> 
> - Storing the suppressed roles list for re-registration and clearing it in
> reviveOffers() do not change anything for existing frameworks. It is
> setting this list in suppressOffers() which might be a cause for concern.
> 
> - I'm using the word "un-suppression" because re-registering with roles
> removed from the suppressed roles list is NOT equivalent to performing
> REVIVE call for these roles (unlike REVIVE, it does not clear offerFilters
> in the allocator).
> 
> =
> A bit of background on why this change is needed.
> 
> To properly support V0 frameworks with a large number of roles, it is
> necessary for the driver not to change the suppression state of the roles
> on its own.
> Therefore, due to the existence of the transparent re-registration in the
> driver, we will need to store the required suppression state in the driver
> and make it re-register using this state.
> 
> We could possibly avoid the proposed change to suppressOffers() by adding
> a new interface to the driver for changing the suppression state, leaving
> suppressOffers() as it is, and marking it as deprecated.
> 
> However, this will leave the behaviour of suppressOffers() deeply
> inconsistent with everything else.
> Compare the following two sequences of events.
> First one:
> - The framework creates and starts a driver with roles "role1", "role2"...
> "role500", the driver registers
> - The framework calls a new method driver.suppressOffersForRoles({"role1",
> ..., "role500"}), the driver performs SUPPRESS call for these roles and
> stores them in its suppressed roles set.
>   (Alternative with the same result: the framework calls
> driver.updateFramework(FrameworkInfo, suppressedRoles={"role1", ...,
> "role500"}), the driver performs UPDATE_FRAMEWORK call with those
> parameters and stores the new suppressed roles set).
> - The driver, due to some reason, disconnects and re-registers with the
> same master, providing the stored suppressed roles set.
> - All the roles are still suppressed
> Second one:
> - The framework creates and starts a driver with roles "role1", "role2"...
> "role500", the driver registers
> - The framework calls driver.suppressOffers(), the driver performs a
> SUPPRESS call for all roles, but doesn't modify the required suppression state.
> - The driver, due to some reason, disconnects and re-registers with the
> same master, providing the stored suppressed roles set, which is empty.
> - Now, none of the roles are suppressed, and the allocator generates offers
> for 500 roles, which will likely be declined by the framework.
> 
> This is one of the examples that make us strongly consider altering the
> interaction between suppressOffers() and the transparent re-registration
> when we add storing of the suppression state to the driver.
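The proposed bookkeeping can be sketched as a small model. This is not the actual MesosSchedulerDriver code, and `suppressOffersForRoles` is the hypothetical new method from the example above; only `suppressOffers()` and `reviveOffers()` exist in the current driver.

```cpp
#include <set>
#include <string>

// Minimal model of the proposed driver-side bookkeeping (not real Mesos code).
class DriverModel {
public:
  explicit DriverModel(const std::set<std::string>& roles) : roles_(roles) {}

  // Proposed: suppressOffers() records that *all* roles are suppressed,
  // so a transparent re-registration preserves the suppression state.
  void suppressOffers() { suppressed_ = roles_; }

  // Hypothetical per-role variant discussed in the thread.
  void suppressOffersForRoles(const std::set<std::string>& roles) {
    suppressed_.insert(roles.begin(), roles.end());
  }

  // reviveOffers() clears the stored suppression state.
  void reviveOffers() { suppressed_.clear(); }

  // The suppressed-roles set the driver would send when it re-registers.
  // Today this is effectively always empty; the proposal sends the stored set.
  std::set<std::string> reregistrationSuppressedRoles() const {
    return suppressed_;
  }

private:
  std::set<std::string> roles_;
  std::set<std::string> suppressed_;
};
```

Under this model, the second sequence above no longer un-suppresses anything: after `suppressOffers()`, a re-registration carries the full suppressed set instead of an empty one.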
> 
> Regards,
> Andrei Sekretenko



Re: Propose to run debug container as the same user of its parent container by default

2018-10-25 Thread James Peach


> On Oct 23, 2018, at 7:47 PM, Qian Zhang  wrote:
> 
> Hi all,
> 
> Currently, when launching a debug container (e.g., via `dcos task exec` or a 
> command health check) to debug a task, by default the Mesos agent will use the 
> executor's user as the debug container's user. There are actually 2 cases:
> 1. Command task: Since the command executor's user is the same as the command 
> task's user, the debug container will be launched as the same user as the 
> command task.
> 2. Task in a task group: The default executor's user is the same as the 
> framework user, so in this case the debug container will be launched as the 
> same user as the framework rather than the task.
> 
> Basically I think the behavior of case 1 is correct. For case 2, we may run 
> into a situation where the task runs as one user (e.g., root), but the debug 
> container used to debug that task runs as another user (e.g., a normal user, 
> supposing the framework runs as a normal user); this may not be what the user 
> expects.
> 
> So I created MESOS-9332  
> and propose to run the debug container as the same user as its parent 
> container (i.e., the task to be debugged) by default. Please let me know if 
> you have any comments, thanks!

This sounds like a sensible default to me. I can imagine for debug use cases 
you might want to run the debug container as root or give it elevated 
capabilities, but that should not be the default.

J

Re: Libevent bundling ahead.

2018-09-12 Thread James Peach



> On Sep 11, 2018, at 6:14 PM, Till Toenshoff  wrote:
> 
> Hey All,
> 
> We are considering bundling/vendoring libevent 2.0.22 with upcoming releases 
> of Mesos.
> 
> Let me explain the motivation and then go into some details.
> 
> Due to https://issues.apache.org/jira/browse/MESOS-7076, SSL builds of Mesos 
> stopped functioning on distributions that offer libevent 2.1.8 by default. 
> Specifically, the failure was observed on Ubuntu 17/18 as well as on macOS. It 
> has also just come to my attention that Fedora 18 shares the same fate.

F28

> So the problem is less likely OS-specific and more likely libevent + SSL + 
> libprocess specific.
> Instead of getting stuck in the rabbit hole of debugging right away, I 
> decided that bundling a known-good version of libevent was the most reliable 
> way to prevent sad faces when building Mesos with SSL; instead, we can be 
> sure SSL builds of Mesos function properly across all supported platforms, 
> out of the box.
> 
> Details on the bundling:
> We will include libevent 2.0.22, and we also include a patch that makes that 
> version build against both openssl 1.0.x and 1.1.x. For unbundled 
> builds (--with-libevent) I have foreseen some additional checks that try to 
> prevent a build of a known-bad variant of libevent + SSL + Mesos.
> 
> The bundling and those checks are a workaround, not a solution. I am still 
> pursuing debugging of the underlying cause. However, way too much time has 
> passed already without a proper solution, hence this suggestion of a 
> quick-fix bundling workaround.
> 
> Let me know your thoughts!

I think this is OK as long as we have a reasonable expectation that we can 
unbundle soon-ish.

J

Re: [VOTE] Release Apache Mesos 1.7.0 (rc2)

2018-08-29 Thread James Peach
+1 (binding)

Built and tested on Fedora 28 (clang).

> On Aug 24, 2018, at 4:42 PM, Chun-Hung Hsiao  wrote:
> 
> Hi all,
> 
> Please vote on releasing the following candidate as Apache Mesos 1.7.0.
> 
> 
> 1.7.0 includes the following:
> 
> * Performance Improvements:
>   * Master `/state` endpoint: ~130% throughput improvement through RapidJSON
>   * Allocator: Improved allocator cycle significantly
>   * Agent `/containers` endpoint: Fixed a performance issue
>   * Agent container launch / destroy throughput is significantly improved
> * Containerization:
>   * **Experimental** Supported docker image tarball fetching from HDFS
>   * Added new `cgroups/all` and `linux/devices` isolators
>   * Added metrics for `network/cni` isolator and docker pull latency
> * Windows:
>   * Added support to libprocess for the Windows Thread Pool API
> * Multi-Framework Workloads:
>   * **Experimental** Added per-framework metrics to the master
>   * A new weighted random sorter was added as an alternative to the DRF sorter
> 
> The CHANGELOG for the release is available at:
> https://gitbox.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.7.0-rc2
>  
> 
> 
> 
> The candidate for Mesos 1.7.0 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc2/mesos-1.7.0.tar.gz 
> 
> 
> The tag to be voted on is 1.7.0-rc2:
> https://gitbox.apache.org/repos/asf?p=mesos.git;a=commit;h=1.7.0-rc2 
> 
> 
> The SHA512 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc2/mesos-1.7.0.tar.gz.sha512
>  
> 
> 
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc2/mesos-1.7.0.tar.gz.asc 
> 
> 
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS 
> 
> 
> The JAR is in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1233 
> 
> 
> Please vote on releasing this package as Apache Mesos 1.7.0!
> 
> The vote is open until Mon Aug 27 16:37:35 PDT 2018 and passes if a majority 
> of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Mesos 1.7.0
> [ ] -1 Do not release this package because ...
> 
> Thanks,
> Chun-Hung & Gaston



Re: Volume ownership and permission

2018-08-16 Thread James Peach



> On Aug 15, 2018, at 6:22 PM, Qian Zhang  wrote:
> 
> Hi Folks,
> 
> We found some issues for the solutions of this project and propose a better
> one, see here
> <https://docs.google.com/document/d/1QyeDDX4Zr9E-0jKMoPTzsGE-v4KWwjmnCR0l8V4Tq2U/edit#heading=h.tjuy5xk67tuu>
> for details. Please let me know if you have any comments, thanks!

Some general comments.

I assume that this scheme will only be supported on Linux, due to the 
dependencies on the Linux ACLs and supplementary group behaviour?  

Rewriting ACLs on volumes at each container launch sounds hugely expensive. 
It's an IOPS-bound process and there is an effectively unbounded number of files 
in the volume. Would this serialize container cleanup?

It seems like ACL evaluation will mean that this scheme will only mostly work. 
For example, if the container process UID matches a user ACE, then access could 
be denied independently of the volume policy.

Will the VolumeAclManager apply a default ACL on the root of the volume? Does 
this imply that when it updates the ACEs for the container GID, it also needs 
to update the default ACLs on all directories?

> 
> 
> Regards,
> Qian Zhang
> 
> On Sat, Apr 28, 2018 at 7:57 AM, Qian Zhang  wrote:
> 
>>> The framework launched tasks in a group with different users? Sounds
>> like they dug their own hole :)
>> 
>> So you mean we should actually put a best practice or limitation in the doc:
>> when launching a task group with multiple tasks to share a SANDBOX volume
>> of PARENT type, all the tasks should be run as the same user, and that
>> user must be the same as the user launching the executor? Otherwise the task
>> will not be able to write to the volume.
>> 
>>> I'd argue that the "rw" on the sandbox path is analogous to the "rw"
>> mount option. That is, it is mounted writeable, but says nothing about
>> which credentials can write to it.
>> 
>> Can you please elaborate a bit on this? What would you suggest for the
>> "rw` volume mode?
>> 
>> 
>> Regards,
>> Qian Zhang
>> 
>> On Fri, Apr 27, 2018 at 12:07 PM, James Peach  wrote:
>> 
>>> 
>>> 
>>>> On Apr 26, 2018, at 7:25 PM, Qian Zhang  wrote:
>>>> 
>>>> Hi James,
>>>> 
>>>> Thanks for your comment!
>>>> 
>>>> I think you are talking about the SANDBOX_PATH volume ownership issue
>>>> mentioned in the design doc
>>>> <https://docs.google.com/document/d/1QyeDDX4Zr9E-0jKMoPTzsGE
>>> -v4KWwjmnCR0l8V4Tq2U/edit#heading=h.s6f8rmu65g2p>,
>>>> IIUC, you prefer to leaving it to framework, i.e., framework itself
>>> ought
>>>> to be able to handle such issue. But I am curious how framework can
>>> handle
>>>> it in such situation. If the framework launches a task group with
>>> different
>>>> users and with a SANDBOX_PATH volume of PARENT type, the tasks in the
>>> group
>>>> will definitely fail to write to the volume due to the ownership issue
>>>> though the volume's mode is set to "rw". So in this case, how should
>>>> framework handle it?
>>> 
>>> The framework launched tasks in a group with different users? Sounds like
>>> they dug their own hole :)
>>> 
>>> I'd argue that the "rw" on the sandbox path is analogous to the "rw"
>>> mount option. That is, it is mounted writeable, but says nothing about
>>> which credentials can write to it.
>>> 
>>>> And if we want to document it, what is our recommended
>>>> solution in the doc?
>>>> 
>>>> 
>>>> 
>>>> Regards,
>>>> Qian Zhang
>>>> 
>>>> On Fri, Apr 27, 2018 at 1:16 AM, James Peach  wrote:
>>>> 
>>>>> I commented on the doc, but at least some of the issues raised there I
>>>>> would not regard as issues. Rather, they are about setting expectations
>>>>> correctly and ensuring that we are documenting (and maybe enforcing)
>>>>> sensible behavior.
>>>>> 
>>>>> I'm not that keen on Mesos automatically "fixing" filesystem
>>> permissions
>>>>> and we should proceed down that path with caution, especially in the
>>> ACLs
>>>>> case.
>>>>> 
>>>>>> On Apr 10, 2018, at 3:15 AM, Qian Zhang  wrote:
>>>>>> 
>>>>>> Hi Folks,
>>>>>> 
>>>>>> I am working on MESOS-8767 to improve Mesos volume support regarding
>>>>> volume ownership and permission, here is the design doc. Please feel
>>> free
>>>>> to let me know if you have any comments/feedbacks, you can reply this
>>> mail
>>>>> or comment on the design doc directly. Thanks!
>>>>>> 
>>>>>> 
>>>>>> Regards,
>>>>>> Qian Zhang
>>>>> 
>>>>> 
>>> 
>>> 
>> 



Re: Using jemalloc as default allocator

2018-08-10 Thread James Peach



> On Aug 10, 2018, at 8:56 AM, Benno Evers  wrote:
> 
> Hi guys,
> 
> it's quite late in the release cycle, but I've been thinking about
> proposing to enable the `--enable-jemalloc-allocator` configuration setting
> by default for linux builds of Mesos.
> 
> The thinking is that
> - Benchmarks consistently show a small to medium performance improvement
> - The bundled jemalloc version (5.0.1) has been released as stable for
> over a year and has not seen any severe bugs
> - Our own Mesos builds with jemalloc don't show any issues so far
> 
> What do you think?

I don't think it's worth it. Anyone who wants to use jemalloc can already do 
it, and the Mesos profiling support works nicely without also forcing a 
build-time dependency. In general, I feel that bundling dependencies is a 
burden on the build (e.g. our bundled jemalloc is already out of date).

J

Re: [VOTE] Move the project repos to gitbox

2018-07-17 Thread James Peach



> On Jul 17, 2018, at 7:58 AM, Vinod Kone  wrote:
> 
> Hi,
> 
> As discussed in another thread and in the committers sync, there seems to be 
> heavy interest in moving our project repos ("mesos", "mesos-site") from the 
> "git-wip" git server to the new "gitbox" server to better avail ourselves of 
> GitHub integrations.
> 
> Please vote +1, 0, -1 regarding the move to gitbox. The vote will close in 3 
> business days.


+1

Re: RFC: update C++ style to require the "override" keyword

2018-07-09 Thread James Peach



> On Jul 8, 2018, at 10:55 PM, Benjamin Bannier 
>  wrote:
> 
> Hi James,
> 
>> I’d like to propose that we update our style to require that the
>> “override” keyword always be used when overriding virtual functions
>> (including destructors). The proposed text is below. I’ll also prepare
>> a clang-tidy patch to update stout, libprocess and mesos globally.
> 
> +1!
> 
> Thanks for bringing this up and offering to do the clean-up. Using `override`
> consistently would really give us some certainty as interface methods evolve.
> 
> * * *
> 
> Note that since our style guide _is_ the Google style guide plus some
> additions, we shouldn't need to update anything in our style guide; the Google
> style guide seems to have started requiring this from February of this year,
> and our code base just got out of sync.
I'd prefer to hoist the rationale up to our guide, since the Google one is 
pretty long and I don't expect us all to re-read it regularly :)

I'll definitely make the tooling changes you suggest (look for a review request 
in the near future).

J

RFC: update C++ style to require the "override" keyword

2018-07-08 Thread James Peach
Hi all,

I’d like to propose that we update our style to require that the “override” 
keyword always be used when overriding virtual functions (including 
destructors). The proposed text is below. I’ll also prepare a clang-tidy patch 
to update stout, libprocess and mesos globally.

--- a/docs/c++-style-guide.md
+++ b/docs/c++-style-guide.md
@@ -647,3 +647,16 @@ Const expression constructors allow object initialization 
at compile time provid
   C++11 does not provide `constexpr string` or `constexpr` containers in the 
STL and hence `constexpr` cannot be used for any class using stout's Error() 
class.

 * `enum class`.
+
+* `override`.
+
+When overriding a virtual member function, the `override` keyword should 
always be used.  The [Google C++ Style 
Guide](https://google.github.io/styleguide/cppguide.html#Inheritance) supplies 
the rationale for this:
+
+
+A function or destructor marked override or final that is not an
+override of a base class virtual function will not compile, and
+this helps catch common errors. The specifiers serve as documentation;
+if no specifier is present, the reader has to check all ancestors
+of the class in question to determine if the function or destructor
+is virtual or not.
+

thanks,
James

Re: Shall we move SASL based CRAM-MD5 authentication out of libmesos?

2018-07-02 Thread James Peach



> On Jul 2, 2018, at 12:45 AM, Till Toenshoff  wrote:
> 
> Dear fellow Apache Mesos developers,
> 
> as you know, Apache Mesos supports authentication on various levels - among 
> those is the RPC-style authentication allowing frameworks and agents to 
> authenticate against the master. Even though  this authentication interface 
> has been modularized a long time ago, we still kept the default, SASL based 
> challenge-response authentication mechanism, message digest 5 (aka CRAM-MD5)  
> authentication within libmesos. 
> 
> Modularizing of our RPC authentication has enabled you to add new 
> authentication mechanisms. User have chosen authentication fitting their 
> company security landscape - e.g. ticket based things like Kerberos or 
> Mesosphere’s use of JWT. It has also come to my attention that there are 
> users out there using directory backed (e.g.LDAP) variants - or even 
> combinations of those like LDAP backed Kerberos.
> 
> CRAM-MD5, while still being regarded as secure, is not very convenient or 
> flexible and therefor in my experience it is not chosen in production 
> environments.  This in turn means that all those builds of libmesos drag SASL 
> in as a dependency while in fact not making any use of it - and that is what 
> I would like to have fixed (since ages). It would benefit in reducing loading 
> times of libmesos dependent runnable and it would also reduce provisioning 
> complexity.
> 
> To fix that, we would have SASL CRAM-MD5 be available as a module 
> exclusively, provisioned only when really needed. That in turn means that 
> users of this mechanism would need to additionally provide the “modules” or 
> "modules_dir" flag to master, agent and/or framework - that would be a 
> breaking change for those that rely on the fact that CRAM-MD5 works out of 
> the box.
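For illustration, the extra flag might carry a JSON blob along these lines. This is a hedged sketch only: the `--modules` flag and its libraries/modules JSON layout exist in Mesos, but the library path and module names below are hypothetical, not names Mesos actually ships.

```json
{
  "libraries": [
    {
      "file": "/usr/lib/mesos/modules/libcrammd5authentication.so",
      "modules": [
        { "name": "org_apache_mesos_CRAMMD5Authenticator" },
        { "name": "org_apache_mesos_CRAMMD5Authenticatee" }
      ]
    }
  ]
}
```

Such a blob would be passed via something like `--modules=file:///etc/mesos/modules.json` (or dropped into a directory named by `--modules_dir`) on the master, agent, and/or framework.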
> 
> I could imagine ways where the user would not even have to provide that 
> "modules*" flag and we internally generate that data as a convenience - it 
> is another option.
> 
> Given that CRAM-MD5 is the only RPC authentication mechanism Mesos is 
> bundling right now and given that our tests do rely on authentication to be 
> available and tested, we will not be able to remove the dependency against 
> SASL entirely for building and testing.
> 
> This first step would only ease the deployment.
> 
> But, there is a drawback - Windows builds currently do not support modules. 
> So either we get modules support for Windows up and running OR we would need 
> to let Windows be an exception to this plan for now. 
> 
> What do you think? Is it worth following this path - or do you have other 
> suggestions?

I'm +1 for simplifying libmesos dependencies in principle. In this case, 
CRAM-MD5 is the only auth mechanism available in the main Mesos tree, so we 
should consider the upgrade path carefully. For my usage, I can update our 
puppet configs, but we need to make sure that either the old configuration 
continues to work, or that it hard-fails so that operators can switch to the 
new config.

J

Re: mesos git commit: Made `gpu/nvidia` isolator works with `cgroups/all` isolation option.

2018-06-27 Thread James Peach
Do we still need this check? The order of built-in isolators is now fixed, so 
do we still need to verify this ordering?

> On Jun 27, 2018, at 4:17 PM, gilb...@apache.org wrote:
> 
> Repository: mesos
> Updated Branches:
>  refs/heads/master 2e913d545 -> b581136bd
> 
> 
> Made `gpu/nvidia` isolator works with `cgroups/all` isolation option.
> 
> Review: https://reviews.apache.org/r/67743/
> 
> 
> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/b581136b
> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/b581136b
> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/b581136b
> 
> Branch: refs/heads/master
> Commit: b581136bd5d28ea74e410c7a57ac10d05c334b5b
> Parents: 2e913d5
> Author: Qian Zhang 
> Authored: Tue Jun 26 23:16:36 2018 -0700
> Committer: Gilbert Song 
> Committed: Tue Jun 26 23:16:36 2018 -0700
> 
> --
> .../mesos/isolators/gpu/isolator.cpp| 33 +---
> 1 file changed, 29 insertions(+), 4 deletions(-)
> --
> 
> 
> http://git-wip-us.apache.org/repos/asf/mesos/blob/b581136b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> --
> diff --git a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp 
> b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> index d79c940..5066882 100644
> --- a/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> +++ b/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
> @@ -102,20 +102,45 @@ Try NvidiaGpuIsolatorProcess::create(
> const Flags& flags,
> const NvidiaComponents& components)
> {
> -  // Make sure both the 'cgroups/devices' isolator and the
> -  // 'filesystem/linux' isolators are present and precede the GPU
> -  // isolator.
> +  // Make sure both the 'cgroups/devices' (or 'cgroups/all') isolator and the
> +  // 'filesystem/linux' isolators are present and precede the GPU isolator.
>   vector<string> tokens = strings::tokenize(flags.isolation, ",");
> 
>   auto gpuIsolator =
> std::find(tokens.begin(), tokens.end(), "gpu/nvidia");
> +
>   auto devicesIsolator =
> std::find(tokens.begin(), tokens.end(), "cgroups/devices");
> +
> +  auto cgroupsAllIsolator =
> +std::find(tokens.begin(), tokens.end(), "cgroups/all");
> +
>   auto filesystemIsolator =
> std::find(tokens.begin(), tokens.end(), "filesystem/linux");
> 
>   CHECK(gpuIsolator != tokens.end());
> 
> +  if (cgroupsAllIsolator != tokens.end()) {
> +// The reason that we need to check if `devices` cgroups subsystem is
> +// enabled is, when `cgroups/all` is specified in the `--isolation` agent
> +// flag, cgroups isolator will only load the enabled subsystems. So if
> +// `cgroups/all` is specified but `devices` is not enabled, cgroups 
> isolator
> +// will not load `devices` subsystem in which case we should error out.
> +Try<bool> result = cgroups::enabled("devices");
> +if (result.isError()) {
> +  return Error(
> +  "Failed to check if the `devices` cgroups subsystem"
> +  " is enabled by kernel: " + result.error());
> +} else if (!result.get()) {
> +  return Error(
> +  "The `devices` cgroups subsystem is not enabled by the kernel");
> +}
> +
> +if (devicesIsolator > cgroupsAllIsolator) {
> +  devicesIsolator = cgroupsAllIsolator;
> +}
> +  }
> +
>   if (devicesIsolator == tokens.end()) {
> return Error("The 'cgroups/devices' isolator must be enabled in"
>  " order to use the 'gpu/nvidia' isolator");
> @@ -127,7 +152,7 @@ Try NvidiaGpuIsolatorProcess::create(
>   }
> 
>   if (devicesIsolator > gpuIsolator) {
> -return Error("'cgroups/devices' must precede 'gpu/nvidia'"
> +return Error("'cgroups/devices' or 'cgroups/all' must precede 
> 'gpu/nvidia'"
>  " in the --isolation flag");
>   }
> 
> 



Re: Getting write access to our GitHub repo

2018-06-22 Thread James Peach



> On Jun 22, 2018, at 7:34 PM, Jie Yu  wrote:
> 
> +1
> 
> Does this means we can add CI webhooks to the git repo?

FWIW, I'm hugely -1 on doing code reviews on GitHub. I'm cautiously optimistic 
about other kinds of integration though.

> On Thu, Jun 21, 2018 at 3:45 PM, James Peach  wrote:
> 
>> 
>> 
>>> On Jun 20, 2018, at 7:58 PM, Vinod Kone  wrote:
>>> 
>>> Hi folks,
>>> 
>>> Looks like ASF now supports <https://gitbox.apache.org/> giving write
>>> access to committers for their GitHub mirrors, which means we can merge
>> PRs
>>> directly on GitHub!
>> 
>> Are you proposing that we move to Github generally?
>> 
>>> FWICT, this requires us moving our repo to a new gitbox server by filing
>> an
>>> INFRA ticket. We probably need to update our CI and other tooling that
>>> references our git repo directly, so there will be work involved on our
>> end
>>> as well.
>>> 
>>> This has been one of the long requested features from several committers,
>>> so I'm gauging interest to see if folks think we should go down this
>> route
>>> (several projects seem to be already moving
>>> <https://issues.apache.org/jira/issues/?jql=text%20~%20%22gitbox%22>)
>> too.
>>> 
>>> If there is enough interest, we could start a vote.
>>> 
>>> Thanks,
>>> Vinod
>> 
>> 



Re: Getting write access to our GitHub repo

2018-06-21 Thread James Peach



> On Jun 20, 2018, at 7:58 PM, Vinod Kone  wrote:
> 
> Hi folks,
> 
> Looks like ASF now supports <https://gitbox.apache.org/> giving write
> access to committers for their GitHub mirrors, which means we can merge PRs
> directly on GitHub!

Are you proposing that we move to Github generally?

> FWICT, this requires us moving our repo to a new gitbox server by filing an
> INFRA ticket. We probably need to update our CI and other tooling that
> references our git repo directly, so there will be work involved on our end
> as well.
> 
> This has been one of the long requested features from several committers,
> so I'm gauging interest to see if folks think we should go down this route
> (several projects seem to be already moving
> <https://issues.apache.org/jira/issues/?jql=text%20~%20%22gitbox%22>) too.
> 
> If there is enough interest, we could start a vote.
> 
> Thanks,
> Vinod



Re: Support image and resource pre-fetching in Mesos

2018-06-20 Thread James Peach


> On Jun 20, 2018, at 4:02 PM, Zhitao Li  wrote:
> 
> Hi,
> 
> We have been working on optimizing container launch latency in our Mesos 
> based stack,

How are you measuring the launch latency?

> and one of the optimizations we are considering is to pre-fetch the docker 
> image and any necessary resources for the task/executor.
> 
> This is especially useful in "updating" of containers of long running 
> services.
> 
> Before delving into detailed proposal, I wonder if anyone has done similar 
> things or has similar requirements.
> 
> Thanks!
> 
> -- 
> Cheers,
> 
> Zhitao Li



Re: narrowing task sandbox permissions

2018-06-15 Thread James Peach



> On Jun 15, 2018, at 11:06 AM, Zhitao Li  wrote:
> 
> Sorry for getting back to this really late, but we got bit by this behavior
> change in our environment.
> 
> The broken scenario we had:
> 
>   1. We are using Aurora to launch docker containerizer based tasks on
>   Mesos;
>   2. Most of our docker containers had some legacy behavior: *the
>   execution entered as "root" in the entry point script,* set up a couple
>   of symlinks and did other preparation work, then *"de-escalated" into a
>   non-privileged user (i.e., "user")*;
>  1. This was added so that the entry point script has enough
>  permission to reconfigure certain side car processes (i.e, nginx) and
>  filesystem paths;
>   3. Unfortunately, the "user" user will lose access to the sandbox after
>   this change.
> 
> 
> While I'd acknowledge that above behavior is legacy and a piece of major
> tech debt, cleaning it up for the thousands of applications on our platform
> was never easy. Given that our org has other useful features available in
> 1.6, I would propose a couple of options:
> 
>   1. making the sandbox permission bits configurable
>  1. Certain frameworks know that their tasks do not leave sensitive
>  data in the sandbox, so we could provide this flexibility (it's very useful in
>  practice for migration to a container-based system);
>  2. Alternatively, making this possible to reconfigure via agent flags:
>  this could be more secure and easier to manage, but lacks the flexibility of
>  allowing different frameworks to do different things.
>   2. Until the customization is in place, consider a revert of the
>   permission bit change so we preserve the original behavior.

That's a pretty unfortunate outcome. Can you change the permissions in your 
script, or carry a Mesos patch until the legacy behavior can be addressed?

J

Re: Follow up on instructions on CMake and VSCode support

2018-06-11 Thread James Peach



> On Jun 11, 2018, at 1:24 PM, Andrew Schwartzmeyer  
> wrote:
> 
> Ah, so with VS Code it's _mostly_ automatic; I don't actually use VS
> Code much myself, though Akash does.
> 
> I have a review up here https://reviews.apache.org/r/67308/ that goes
> over setting up and using Cquery (which is _also_ used by VS Code, but
> in the background) with Emacs, and was hoping to get someone who uses it
> with Vim to fill that in too. Really it should add support for any
> editor which supports LSP.

FWIW I had pretty good luck with a .cquery file:
http://jorgar.tumblr.com/post/174319627674/mesos-config-for-cquery


> On 06/11/2018 1:17 pm, Zhitao Li wrote: 
> 
>> Hi Andrew, 
>> 
>> I remember that you gave a pretty exciting talk about setting up CMake 
>> and VSCode to parse the Mesos code base. I'm quite curious about setting that up 
>> myself. 
>> 
>> Can you share any instructions/notes about that? 
>> 
>> Thanks!
>> 
>> -- 
>> 
>> Cheers,
>> 
>> Zhitao Li



Re: Deprecating the Python bindings

2018-06-06 Thread James Peach


> On May 9, 2018, at 11:51 AM, Andrew Schwartzmeyer  
> wrote:
> 
> Hi all,
> 
> There are two parallel efforts underway that would both benefit from 
> officially deprecating (and then removing) the Python bindings. The first 
> effort is the move to the CMake system: adding support to generate the Python 
> bindings was investigated but paused (see MESOS-8118), and the second effort 
> is the move to Python 3: producing Python 3 compatible bindings is under 
> investigation but not in progress (see MESOS-7163).
> 
> Benjamin Bannier, Joseph Wu, and I have all at some point just wondered how 
> the community would fare if the Python bindings were officially deprecated 
> and removed. So please, if this would negatively impact you or your project, 
> let me know in this thread.

Another approach could be to move the bindings from the `mesos` git repo to a 
separate repo (either the ASF or in the `mesos` GitHub org). This could 
decouple it from the main Mesos build infrastructure and create a project for a 
Python community to coalesce around. I think there's value in nominating an 
official Python binding, but maybe we don't have to carry that in the same git 
repo and build system.

J

Re: [6/6] mesos git commit: Added `linux/devices` isolator whitelist tests.

2018-05-28 Thread James Peach


> On May 28, 2018, at 6:05 AM, Alex R <al...@apache.org> wrote:
> 
> This commit breaks the build on Ubuntu 14.04 with `gcc (Ubuntu 
> 4.8.4-2ubuntu1~14.04.4) 4.8.4` due to what seems to me a compiler bug, likely 
> this one [1]. Ubuntu 14.04 is officially supported until mid-2019, hence not 
> sure we can ignore this issue.
> 
> James, can you commit a workaround?

Oh it’s the raw strings that break it. I’ll try to come up with something.


> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57824
> 
>> On 25 May 2018 at 23:12, <jpe...@apache.org> wrote:
>> Added `linux/devices` isolator whitelist tests.
>> 
>> Added test to verify that the `linux/devices` isolator supports
>> populating devices that are whitelisted by the `allowed_devices`
>> agent flag.
>> 
>> Review: https://reviews.apache.org/r/67145/
>> 
>> 
>> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
>> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/ee6c6cfc
>> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/ee6c6cfc
>> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/ee6c6cfc
>> 
>> Branch: refs/heads/master
>> Commit: ee6c6cfcbe2b91eaf540afa38bab4521d23b747f
>> Parents: 68db3f9
>> Author: James Peach <jpe...@apache.org>
>> Authored: Fri May 25 13:38:14 2018 -0700
>> Committer: James Peach <jpe...@apache.org>
>> Committed: Fri May 25 13:38:14 2018 -0700
>> 
>> --
>>  src/Makefile.am |   1 +
>>  src/tests/CMakeLists.txt|   1 +
>>  .../linux_devices_isolator_tests.cpp| 231 +++
>>  3 files changed, 233 insertions(+)
>> --
>> 
>> 
>> http://git-wip-us.apache.org/repos/asf/mesos/blob/ee6c6cfc/src/Makefile.am
>> --
>> diff --git a/src/Makefile.am b/src/Makefile.am
>> index da0d683..b7184ce 100644
>> --- a/src/Makefile.am
>> +++ b/src/Makefile.am
>> @@ -2666,6 +2666,7 @@ mesos_tests_SOURCES += 
>>\
>>tests/containerizer/cgroups_tests.cpp\
>>tests/containerizer/cni_isolator_tests.cpp   \
>>tests/containerizer/docker_volume_isolator_tests.cpp \
>> +  tests/containerizer/linux_devices_isolator_tests.cpp \
>>tests/containerizer/linux_filesystem_isolator_tests.cpp  \
>>tests/containerizer/fs_tests.cpp \
>>tests/containerizer/memory_pressure_tests.cpp\
>> 
>> http://git-wip-us.apache.org/repos/asf/mesos/blob/ee6c6cfc/src/tests/CMakeLists.txt
>> --
>> diff --git a/src/tests/CMakeLists.txt b/src/tests/CMakeLists.txt
>> index 1fef060..b9c906d 100644
>> --- a/src/tests/CMakeLists.txt
>> +++ b/src/tests/CMakeLists.txt
>> @@ -224,6 +224,7 @@ if (LINUX)
>>  containerizer/docker_volume_isolator_tests.cpp
>>  containerizer/fs_tests.cpp
>>  containerizer/linux_capabilities_isolator_tests.cpp
>> +containerizer/linux_devices_isolator_tests.cpp
>>  containerizer/linux_filesystem_isolator_tests.cpp
>>  containerizer/memory_pressure_tests.cpp
>>  containerizer/nested_mesos_containerizer_tests.cpp
>> 
>> http://git-wip-us.apache.org/repos/asf/mesos/blob/ee6c6cfc/src/tests/containerizer/linux_devices_isolator_tests.cpp
>> --
>> diff --git a/src/tests/containerizer/linux_devices_isolator_tests.cpp 
>> b/src/tests/containerizer/linux_devices_isolator_tests.cpp
>> new file mode 100644
>> index 000..efaa43b
>> --- /dev/null
>> +++ b/src/tests/containerizer/linux_devices_isolator_tests.cpp
>> @@ -0,0 +1,231 @@
>> +// Licensed to the Apache Software Foundation (ASF) under one
>> +// or more contributor license agreements.  See the NOTICE file
>> +// distributed with this work for additional information
>> +// regarding copyright ownership.  The ASF licenses this file
>> +// to you under the Apache License, Version 2.0 (the
>> +// "License"); you may not use this file except in compliance
>> +// with the License.  You may obtain a copy of the License at
>> +//
>> +// http://www.apache.org/licenses/LICENSE-2.0
>> +//
>> +// Unless required by appli

Re: [VOTE] Release Apache Mesos 1.6.0 (rc1)

2018-05-10 Thread James Peach
+1 (binding)

Checked the signatures, build, and tests on Fedora 27.

> On May 10, 2018, at 9:06 AM, Chun-Hung Hsiao  wrote:
> 
> +1 (binding)
> 
> Tested on our internal CI (sudo make check) on Mac, CentOS 6/7, Debian 8/9
> and Ubuntu 14/16/17, with gRPC/SSL disabled/enabled.
> Also manually tested "make distcheck" w/ autotools, and "ninja check" w/
> CMake on Mac and CentOS 7 with gRPC enabled.
> 
> Observed the following failures:
> https://issues.apache.org/jira/browse/MESOS-8884
> https://issues.apache.org/jira/browse/MESOS-8875
> 
> The first one is a flaky test, and the second one is related to
> MESOS-2407, which is a known problem.
> 
> On Wed, May 9, 2018 at 11:00 AM, Vinod Kone  wrote:
> 
>> +1 (binding)
>> 
>> Ran it on ASF CI. The only failures observed were known flaky command check
>> tests.
>> 
>> *Revision*: c7df5eadc075adcf525ea091f65786aaffb9b072
>> 
>>   - refs/tags/1.6.0-rc1
>> 
>> Configuration Matrix (autotools / cmake, per compiler):
>> 
>> centos:7 --verbose --enable-libevent --enable-ssl:
>>   autotools: gcc Failed, clang Not run
>>   cmake: gcc Success, clang Not run
>> centos:7 --verbose:
>>   autotools: gcc Failed, clang Not run
>>   cmake: gcc Success, clang Not run
>> ubuntu:14.04 --verbose --enable-libevent --enable-ssl:
>>   autotools: gcc Failed, clang Success
>>   cmake: gcc Success, clang Success
>> ubuntu:14.04 --verbose:
>>   autotools: gcc Success, clang Success
>>   cmake: gcc Success, clang Success
>> 
>> 
>> On Mon, May 7, 2018 at 8:48 PM, Greg Mann  wrote:
>> 
>>> Hi all,
>>> 
>>> Please vote on releasing the following candidate 

Re: ApacheCon current-event banner

2018-05-01 Thread James Peach
Hi Piergiorgio,

I posted an update to the Mesos website at https://reviews.apache.org/r/66901/


> On Apr 26, 2018, at 2:41 AM, Piergiorgio Lucidi  
> wrote:
> 
> Hi,
> 
> I'm contributing to the Apache ComDev project and we would like to spread 
> information about Apache-related events such as ApacheCon. 
> 
> Rich Bowen started a thread about this topic, and we are looking to use 
> a dynamic banner that is updated automatically, without the need to 
> update the website every time a new event comes up. On the dev@community.a.o 
> list Rich Bowen asked to get this included in as many project websites as 
> possible [2].
> 
> It would be great to have the Mesos website updated as well to include the 
> current-event banner described in the first step of this README [1].
> 
> To track the progress of all projects updating their website, a link to this 
> thread will be put in a shared google spreadsheet [3].
> 
> Thank you so much for your cooperation.
> 
> Cheers,
> Piergiorgio
> 
> [1] - http://apache.org/events/README.txt
> [2] - 
> https://lists.apache.org/thread.html/d672b1849f6668c0f67ff4c71b20bbb4f56a49a1777607b12643d1dc@%3Cdev.community.apache.org%3E
> [3] - 
> https://docs.google.com/spreadsheets/d/101O3EVBYv_QhHW74bFLoO89ydaXoUJW4AC97YhnR530/edit#gid=0



Re: Volume ownership and permission

2018-04-26 Thread James Peach


> On Apr 26, 2018, at 7:25 PM, Qian Zhang <zhq527...@gmail.com> wrote:
> 
> Hi James,
> 
> Thanks for your comment!
> 
> I think you are talking about the SANDBOX_PATH volume ownership issue
> mentioned in the design doc
> <https://docs.google.com/document/d/1QyeDDX4Zr9E-0jKMoPTzsGE-v4KWwjmnCR0l8V4Tq2U/edit#heading=h.s6f8rmu65g2p>,
> IIUC, you prefer to leaving it to framework, i.e., framework itself ought
> to be able to handle such issue. But I am curious how framework can handle
> it in such situation. If the framework launches a task group with different
> users and with a SANDBOX_PATH volume of PARENT type, the tasks in the group
> will definitely fail to write to the volume due to the ownership issue
> though the volume's mode is set to "rw". So in this case, how should
> framework handle it?

The framework launched tasks in a group with different users? Sounds like they 
dug their own hole :)

I'd argue that the "rw" on the sandbox path is analogous to the "rw" mount 
option. That is, it is mounted writeable, but says nothing about which 
credentials can write to it.

> And if we want to document it, what is our recommended
> solution in the doc?
> 
> 
> 
> Regards,
> Qian Zhang
> 
> On Fri, Apr 27, 2018 at 1:16 AM, James Peach <jpe...@apache.org> wrote:
> 
>> I commented on the doc, but at least some of the issues raised there I
>> would not regard as issues. Rather, they are about setting expectations
>> correctly and ensuring that we are documenting (and maybe enforcing)
>> sensible behavior.
>> 
>> I'm not that keen on Mesos automatically "fixing" filesystem permissions
>> and we should proceed down that path with caution, especially in the ACLs
>> case.
>> 
>>> On Apr 10, 2018, at 3:15 AM, Qian Zhang <zhq527...@gmail.com> wrote:
>>> 
>>> Hi Folks,
>>> 
>>> I am working on MESOS-8767 to improve Mesos volume support regarding
>> volume ownership and permission, here is the design doc. Please feel free
>> to let me know if you have any comments/feedbacks, you can reply this mail
>> or comment on the design doc directly. Thanks!
>>> 
>>> 
>>> Regards,
>>> Qian Zhang
>> 
>> 



Re: Volume ownership and permission

2018-04-26 Thread James Peach
I commented on the doc, but at least some of the issues raised there I would 
not regard as issues. Rather, they are about setting expectations correctly and 
ensuring that we are documenting (and maybe enforcing) sensible behavior. 

I'm not that keen on Mesos automatically "fixing" filesystem permissions and we 
should proceed down that path with caution, especially in the ACLs case.

> On Apr 10, 2018, at 3:15 AM, Qian Zhang  wrote:
> 
> Hi Folks,
> 
> I am working on MESOS-8767 to improve Mesos volume support regarding volume 
> ownership and permission, here is the design doc. Please feel free to let me 
> know if you have any comments/feedbacks, you can reply this mail or comment 
> on the design doc directly. Thanks!
> 
> 
> Regards,
> Qian Zhang



Re: Convention for Backward Compatibility for New Operations in Mesos 1.6

2018-04-16 Thread James Peach

> On Apr 16, 2018, at 2:04 PM, Chun-Hung Hsiao  wrote:
> 
> Hi all,
> 
> As some might have already known, we are currently working on patches to
> implement the new GROW_VOLUME and SHRINK_VOLUME operations [1].
> 
> One problem that surfaces is that, since the new operations are not supported in
> Mesos 1.5, they will lead to an agent crash during the operation application
> cycle if a Mesos 1.6 master sends these operations to a Mesos 1.5 agent [2].
> 
> We are now considering two possibilities to address this compatibility problem:
> 
> 1) The Mesos 1.6 master should check the agent's Mesos version in
> `Master::accept` [3]. Moving forward, if we add new operations in future
> Mesos
> releases, we would have code like the following:

Using a capability follows the existing practice. I'm also sympathetic to the 
argument that this is an experimental feature and will cause 1.5 agents to 
crash.

> 2) Treat this issue as an agent crash bug. The Mesos master would forward
> the operation to the agent, regardless of the agent's Mesos version. In the
> agent,
> we deploy and backport the following logic in `Slave::applyOperation` [4]:
> 
> ```
> if (message.operation_info().type() == OPERATION_UNKNOWN) {
>  ... // Drop the operation and trigger a re-registration or send an
>  // `UpdateSlaveMessage` to force the master to update the total
> resource of
>  // the slave.
> }
> ```

You should never drop operations. This should respond with some sort of 
"UNKNOWN/UNSUPPORTED" status.

J

Re: Update the *Minimum Linux Kernel version* supported on Mesos

2018-04-05 Thread James Peach


> On Apr 5, 2018, at 5:00 AM, Andrei Budnik  wrote:
> 
> Hi All,
> 
> We would like to update the minimum supported Linux kernel version from 2.6.23
> to 2.6.28.
> The Linux kernel supports cgroups v1 starting from 2.6.24, but the `freezer`
> cgroup functionality, which nested containers depend on, was merged in 2.6.28.

User namespaces require >= 3.12 (November 2013). Can we make that the minimum?

J

Re: API review: max_duration on TaskInfo

2018-03-28 Thread James Peach


> On Mar 23, 2018, at 2:21 PM, Zhitao Li  wrote:
> 
> Hi everyone,
> 
> I'd like to do an API review for MESOS-8725
> . We are adding an
> optional `max_duration` to `TaskInfo` field. If a task does not terminate
> within this duration, built-in executors will kill the task with a new
> reason `REASON_MAX_DURATION_REACHED`.
> 
> Proof of concept patch:
> https://reviews.apache.org/r/66258/
> 
> Reference implementation in command executor:
> https://reviews.apache.org/r/66259/
> 
> A design choice we made is to make this relative duration rather than an
> absolute timestamp of deadline. Our rationales:
> 
>   - Cluster could suffer from clock skews, so same absolute deadline would
>   result in inconsistent behavior;
>   - Framework can just trivially translate its own clock as source of
>   truth to translate absolute deadline to current time + max_duration.
> 
> Please let me know what you think. Thanks.

Bringing our conversation about task group semantics back to the list.

The current reviews require all tasks in a group to have the same max_duration. 
This is equivalent to specifying max_duration on the task group itself. This 
means that when the time is up, the whole group gets torn down. Validation on 
the master ensures that schedulers have to set the same value across all the 
tasks.

Alternatively, we could allow the duration to differ across tasks and then 
just kill the individual task when its time expires. In this case, the task 
will have a final status of TASK_KILLED, which will cause the Mesos default 
executor to tear down the rest of the group. So we have the same effect, though 
it is expressed differently in the API.

So maybe the cleanest way to express this for task groups is to place the 
max_duration in the `TaskGroupInfo`? However if we do that, then we lose any 
information about which task exceeded the duration (since by definition they 
all did). So I'm leaning towards allowing a per-task max_duration.

We should also define what this API means for the final `TaskStatus` of the 
task. In my executor, the rule we follow is that `TASK_KILLED` is only ever 
used in response to explicit KILL requests from the scheduler. If the 
max_duration is exceeded, I think that we should classify that as `TASK_FAILED`.

thanks,
James

Re: Support deadline for tasks

2018-03-23 Thread James Peach


> On Mar 23, 2018, at 9:57 AM, Renan DelValle  wrote:
> 
> Hi Zhitao,
> 
> Since this is something that could potentially be handled by the executor 
> and/or framework, I was wondering if you could speak to the advantages of 
> making this a TaskInfo primitive vs having the executor (or even the 
> framework) handle it.

There's some discussion around this on 
https://issues.apache.org/jira/browse/MESOS-8725.

My take is that delegating too much to the scheduler makes schedulers harder to 
write and exacerbates the complexity of the system. If 4 different schedulers 
implement this feature, operators are likely to need to understand 4 different 
ways of doing the same thing, which would be unfortunate. 

J

Re: Support deadline for tasks

2018-03-22 Thread James Peach


> On Mar 22, 2018, at 10:06 AM, Zhitao Li  wrote:
> 
> In our environment, we run a lot of batch jobs, some of which have tight 
> timelines. If any task in the job runs longer than x hours, it does not make 
> sense to run it anymore. 
>  
> For instance, a team would submit a job which builds a weekly index and 
> repeats every Monday. If the job does not finish before next Monday for 
> whatever reason, there is no point to keep any task running.
>  
> We believe that implementing deadline tracking distributed across our cluster 
> makes more sense as it makes the system more scalable and also makes our 
> centralized state machine simpler.
>  
> One idea I have right now is to add an  optional TimeInfo deadline to 
> TaskInfo field, and all default executors in Mesos can simply terminate the 
> task and send a proper StatusUpdate.
> 
> I summarized above idea in MESOS-8725.
> 
> Please let me know what you think. Thanks! 

This sounds both useful and simple to implement. I’m happy to shepherd if you’d 
like.

J

Re: API Review: Resize (persistent) volume support

2018-03-18 Thread James Peach


> On Mar 16, 2018, at 11:12 AM, Zhitao Li  wrote:
> 
> Hi everyone,
> 
> Chun, Greg, Gastón and I are working on supporting resizing of persistent
> volume[1]. See [2] for the design doc in length.
> 
> The proposed new offer operation and corresponding operator API are in
> following two patches:
> 
> https://reviews.apache.org/r/66049/
> https://reviews.apache.org/r/66052
> 
> Our intention is to eventually support resizing of not only persistent
> volumes, but also CSI volumes[3] introduced after Mesos 1.5 in the same set
> of API, so we are declaring the API as experimental in its first release
> version.
> 
> We also want to make sure the API is reasonable to use to framework authors
> and operators.

Why do you have separate GROW/SHRINK operations? Could a RESIZE operation with 
a target size work?

In all of these cases, is it possible for the operation to be applied more than 
once? Clearly, replaying a SHRINK would be bad. Applying RESIZE operations out 
of order would also be bad, but not in the same way.

What is the response to this request?

> Considering the above, both APIs need to include the original volume as
> resource. Some alternatives on extra fields:
> 1) size difference in Resource format: this may not be applicable in CSI
> volume;
> 2) size difference in Scalar value: this can be applicable in both CSI and
> persistent volume case, since there is always a quantitive difference. We
> can add extra CSI only fields once the spec is defined;
> 3) target volume in `Resource` format: this may not be possible for any CSI
> volume because the implementation could change certain metadata, so we did
> not take this approach.
> 
> Therefore, we are taking option 2) in current patches.
> 
> Please let me know what you think. Thanks.
> 
> [1] https://issues.apache.org/jira/browse/MESOS-4965
> [2] https://docs.google.com/document/d/1Z16okNG8mlf2eA6NyW_PUmBfNFs_
> 6EOaPzPtwYNVQUQ/edit#
> [3] https://github.com/apache/mesos/blob/master/docs/csi.md
> 
> -- 
> Cheers,
> 
> Zhitao Li



Re: [MoA] Update Jenkins ARM configuration

2018-03-16 Thread James Peach


> On Mar 16, 2018, at 8:27 AM, James Peach <jpe...@apache.org> wrote:
> 
> 
> 
>> On Mar 16, 2018, at 5:11 AM, Tomek Janiszewski <jani...@gmail.com> wrote:
>> 
>> Hey
>> 
>> Who can edit Jenkins configuration? I noticed we run tests with the wrong
>> configuration (we miss --disable-libtool-wrappers') it should
>> be CONFIGURATION='--disable-java --disable-python
>> --disable-libtool-wrappers'
>> Can you fix it, please?
>> 
>> https://issues.apache.org/jira/browse/MESOS-7500
> 
> Thanks Tomek, I will take care of that today!

done


Re: [MoA] Update Jenkins ARM configuration

2018-03-16 Thread James Peach


> On Mar 16, 2018, at 5:11 AM, Tomek Janiszewski  wrote:
> 
> Hey
> 
> Who can edit Jenkins configuration? I noticed we run tests with the wrong
> configuration (we miss --disable-libtool-wrappers') it should
> be CONFIGURATION='--disable-java --disable-python
> --disable-libtool-wrappers'
> Can you fix it, please?
> 
> https://issues.apache.org/jira/browse/MESOS-7500

Thanks Tomek, I will take care of that today!

J


Re: API Review Policy

2018-03-13 Thread James Peach


> On Mar 13, 2018, at 3:58 PM, Greg Mann  wrote:
> 
> Sure we can use that as a starting point. The basic policy that we
> discussed in the working group was that any public API change should be
> advertised on the developer mailing list. If no concerns are raised after
> several days, then the change could proceed. If concerns were raised, then
> discussion could start on the mailing list and continue in the next meeting
> of the API working group if necessary.
> 
> The Traffic Server policy includes a few other concrete details which we
> did not discuss. I'll quote it here to make things easy:
> 
> 
> Due to the importance of getting API right, there is a required review
>> process for changes to the Traffic Server API. For every API change, the
>> developer should post a message to the dev@ list that
>> 
>>   - references the relevant Github Issue or PR
>>   - explains the motivating problem and rationale
>>   - shows the actual API change itself (ie. API signatures, etc)
>>   - documents the semantics of the proposed API
>>   - notes any ABI or compatibility implicates
>> 
>> After a comments period (1 or 2 days), the committer would add the API. If
>> there were any comments or suggestions, then the committer would address
>> those as necessary.
>> 
>> New API can be added to experimental.h if the developer believe that it
>> might change after some adoption or implementation experience. APIs
>> intended for experimental.h should still be reviewed on the mailing list.
>> APIs added to experimental.h, or another experimental header, can (and
>> should!) get moved to a frozen and stable include file when appropriate.
>> It's up to the author to propose a promotion to stable on the mailing list,
>> lazy consensus applies here.
>> 
>> It is strongly preferable that a new API to be integrated into a sample
>> plugin - giving users a good sample to copy. API documentation and unit
>> tests should be provided as a matter of course.
>> 
>> 
> 
> I think this is pretty good as-is; replace "Github Issue or PR" with "JIRA
> issue or ReviewBoard patch", and remove the portion about 'experimental.h'
> and what follows.

I’m generally in favor of this. I think that we all try to raise compatibility 
and operational issues on the mailing lists, so this seems like a formalization 
and extension of that practice. Most of the information needed for an API 
proposal would already be captured in a design document, so in the Mesos 
context this would be about improving the visibility of changes and widening 
the feedback net.

cheers,
James.



Re: Reconsidering `allocatable` check in the allocator

2018-03-07 Thread James Peach


> On Mar 7, 2018, at 5:52 AM, Benjamin Bannier  
> wrote:
> 
> Hi,
> 
>> Chatted with BenM offline on this. There's another option what both of us
>> agreed that it's probably better than any of the ones mentioned above.
>> 
>> The idea is to make `allocatable` return the portion of the input resources
>> that are allocatable, and strip the unallocatable portion.
>> 
>> For example:
>> 1) If the input resources are "cpus:0.001,gpus:1", the `allocatable` method
>> will return "gpus:1".
>> 2) If the input resources are "cpus:1,mem:1", the `allocatable` method will
>> return "cpus:1".
>> 3) If the input resources are "cpus:0.001,mem:1", the `allocatable` method
>> will return an empty Resources object.
>> 
>> Basically, the algorithm is like the following:
>> 
>> allocatable = input
>> foreach known resource type t: do
>> r = resources of type t from the input
>> if r is less than the min resource of type t; then
>>   allocatable -= r
>> fi
>> done
>> return allocatable
> 
> I think that sounds like a faithful extension of the current behavior 
> (removing too-small resources from the offerable pool), but I feel we should 
> not just filter out any resource _kind_ below the minimum, but, within a kind, 
> all _addable_ subresources:
> 
>allocatable : Resources = input
>  for (resource: Resource) in input:
>if resource < min(resource.kind):
>  allocatable -= resource
> 
>return allocatable
> 
> This would have the effect of clumping together each distinguishable resource 
> we care about instead of accumulating, say, different disks which in sum are 
> potentially not much more interesting to frameworks (they would prefer more 
> of a particular disk rather than smaller pieces scattered across multiple disks).
> 
> @alexr
>> If we are about to offer some of the resources from a particular agent, why
>> would we filter anything at all? I doubt we should be concerned about the
>> size of the offer representation travelling through the network. If
>> available resources are "cpus:0.001,gpus:1" and we want to allocate GPU,
>> what is the benefit of filtering CPU?
>> 
>> What about the following:
>> allocatable(R)
>> {
>> return true
>>   iff (there exists r in R for which size(r) > MIN(type(r)))
>> }
> 
> I think this is less about communication overhead and more a tool to help 
> make sure that offered resources are actually useful to frameworks. 

I don't know whether there's a JIRA for this, but in the past we've proposed 
the idea of schedulers suppressing or filtering offers with a minimum resources 
specification, i.e. "don't bother me with offers that aren't at least X"

J

Re: Authorization Logging

2018-02-28 Thread James Peach


> On Feb 28, 2018, at 2:52 PM, Benjamin Mahler  wrote:
> 
> When touching some code, I noticed that authorization logging is currently
> done rather inconsistently across the call-sites and many cases do not log
> the request:
> 
> $ grep -R -A 3 'LOG.*Authorizing' src
> 
> Should authorization logging be the concern of an authorizer
> implementation? For audit purposes I could imagine this also being part of
> a separate log that the authorizer maintains?

Separating this out from the authorizer was the idea behind 
https://issues.apache.org/jira/browse/MESOS-7678.

J

Re: Surfacing additional issues on agent host to schedulers

2018-02-20 Thread James Peach

> On Feb 20, 2018, at 11:11 AM, Zhitao Li  wrote:
> 
> Hi,
> 
> In one of recent Mesos meet up, quite a couple of cluster operators had
> expressed complaints that it is hard to model host issues with Mesos at the
> moment.
> 
> For example, in our environment, the only signal a scheduler would get is
> whether the Mesos agent has disconnected from the cluster. However, we have a
> family of other issues in real production which make the hosts (sometimes
> "partially") unusable. Examples include:
> - traffic routing software malfunction (i.e, haproxy): Mesos agent does not
> require this so scheduler/deployment system is not aware, but actual
> workload on the cluster will fail;
> - broken disk;
> - other long running system agent issues.
> 
> This email is looking at how Mesos can recommend best practices to surface
> these issues to schedulers, and whether we need additional primitives in
> Mesos to achieve such a goal.

In the K8s world the node can publish "conditions" that describe its status:

https://kubernetes.io/docs/concepts/architecture/nodes/#condition

The condition can automatically taint the node, which could cause pods to 
automatically be evicted (ie. if they can't tolerate that specific taint).

J

Re: [VOTE] C++14 Upgrade

2018-02-12 Thread James Peach


> On Feb 11, 2018, at 10:33 PM, Michael Park <mcyp...@gmail.com> wrote:
> 
> On Sun, Feb 11, 2018 at 6:00 PM James Peach <jpe...@apache.org> wrote:
> 
>> 
>> 
>>> On Feb 9, 2018, at 9:28 PM, Michael Park <mp...@apache.org> wrote:
>>> 
>>> I'm going to put this up for a vote. My plan is to bump us to C++14 on
>> Feb
>>> 21.
>>> 
>>> The following are the proposed changes:
>>> - Minimum GCC *4.8.1* => *5*.
>>> - Minimum Clang *3.5* => *3.6*.
>>> - Minimum Apple Clang *8* => *9*.
>>> 
>>> We'll have a standard voting, at least 3 binding votes, and no -1s.
>> 
>> +0
>> 
>> What’s the user benefit of this change?
>> 
> 
> Some of the features I've described in MESOS-7949
> <https://issues.apache.org/jira/browse/MESOS-7949> are:
> 
>   - Generic lambdas
>   - New lambda captures (Proper move captures!)
>   - SFINAE result_of (We can remove stout/result_of.hpp)
>   - Variable templates
>   - Relaxed constexpr functions
>   - Simple utilities such as std::make_unique
>   - Metaprogramming facilities such as decay_t, index_sequence

Are these all internal though? Maybe move captures could yield some performance 
improvements?



Re: [VOTE] C++14 Upgrade

2018-02-11 Thread James Peach


> On Feb 9, 2018, at 9:28 PM, Michael Park  wrote:
> 
> I'm going to put this up for a vote. My plan is to bump us to C++14 on Feb
> 21.
> 
> The following are the proposed changes:
>  - Minimum GCC *4.8.1* => *5*.
>  - Minimum Clang *3.5* => *3.6*.
>  - Minimum Apple Clang *8* => *9*.
> 
> We'll have a standard voting, at least 3 binding votes, and no -1s.

+0

What’s the user benefit of this change?

J

Re: [VOTE] Release Apache Mesos 1.5.0 (rc2)

2018-02-07 Thread James Peach
+1 (binding)

Tested on Fedora 27

> On Feb 1, 2018, at 5:36 PM, Gilbert Song  wrote:
> 
> Hi all,
> 
> Please vote on releasing the following candidate as Apache Mesos 1.5.0.
> 
> 1.5.0 includes the following:
> 
>  * Support Container Storage Interface (CSI).
>  * Agent reconfiguration policy.
>  * Auto GC docker images in Mesos Containerizer.
>  * Standalone containers.
>  * Support gRPC client.
>  * Non-leading VOTING replica catch-up.
> 
> 
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.5.0-rc2
> 
> 
> The candidate for Mesos 1.5.0 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/1.5.0-rc2/mesos-1.5.0.tar.gz
> 
> The tag to be voted on is 1.5.0-rc2:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.5.0-rc2
> 
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.5.0-rc2/mesos-1.5.0.tar.gz.md5
> 
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.5.0-rc2/mesos-1.5.0.tar.gz.asc
> 
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
> 
> The JAR is in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1222
> 
> Please vote on releasing this package as Apache Mesos 1.5.0!
> 
> The vote is open until Tue Feb  6 17:35:16 PST 2018 and passes if a
> majority of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Mesos 1.5.0
> [ ] -1 Do not release this package because ...
> 
> Thanks,
> Jie and Gilbert



Re: Soliciting Hackathon Ideas

2018-02-07 Thread James Peach


> On Feb 6, 2018, at 11:21 PM, Benjamin Mahler  wrote:
> 
> +1 Versioned documentation would be heroic!

Based on https://reviews.apache.org/r/52064/ ?

> 
> On Tue, Feb 6, 2018 at 5:49 PM Vinod Kone  wrote:
> 
>> Versioned documentation!
>> 
>> Sent from my iPhone
>> 
>>> On Feb 6, 2018, at 4:37 PM, Benjamin Mahler  wrote:
>>> 
>>> A couple of ideas from the performance related working group:
>>> 
>>> - Use protobuf arenas for all non-trivial outbound master messages (easy).
>>>   This can be done piecemeal.
>>> - Use move semantics (take a Message&&) in all of the master message
>>>   handlers to reduce copying (medium). This one can be done piecemeal. For
>>>   example, Master::statusUpdate would be a good one to start with.
>>> - Audit the Registrar code to use move semantics to reduce copying
>>>   (medium).
>>> 
>>> If there are any UI programmers:
>>> 
>>> - Consider a webui "refresh": try to find a new set of fonts and style;
>>>   could be fun.
>>> 
>>> On Fri, Feb 2, 2018 at 12:47 PM, Andrew Schwartzmeyer <
>>> and...@schwartzmeyer.com> wrote:
>>> 
 Hello all,
 
 Next month I'll be attending HackIllinois (https://hackillinois.org/)
>> as
 an open-source mentor. It's a huge student-run hackathon at the
>> University
 of Illinois at Urbana-Champaign, running from February 23rd to the 25th.
 Students from a multitude of schools will be attending (they even bus
>> them
 in). The hackathon has an open-source focus, and while there will be
>> many
 projects for the students to work on, I want to make sure Mesos gets
>> some
 attention too.
 
 I am asking you all for open issues and new ideas for small,
 beginner-friendly projects that could fit a two-day Hackathon project.
>> For
 Mesos, I'm looking through our open issues labeled "easyfix",
>> "beginner",
 or "newbie", which actually returns 74 results! If you have anything in
 particular that you think would be a good fit, please let me know. I'd
>> like
 to go with a list of vetted issues so I don't accidentally start some
 students in on a giant can of worms. Our excellent new Beginner
>> Contributor
 Guide will be a huge help too.
 
 Thanks,
 
 Andy
 
 P.S. If any of you also want to attend, let me know, and I'll get you in
 touch with their director.
 
>> 



Re: [VOTE] Release Apache Mesos 1.5.0 (rc1)

2018-01-24 Thread James Peach
+1

Verified on CentOS 6 and Fedora 27

> On Jan 22, 2018, at 7:15 PM, Gilbert Song  wrote:
> 
> Hi all,
> 
> Please vote on releasing the following candidate as Apache Mesos 1.5.0.
> 
> 1.5.0 includes the following:
> 
>   * Support Container Storage Interface (CSI).
>   * Agent reconfiguration policy.
>   * Auto GC docker images in Mesos Containerizer.
>   * Standalone containers.
>   * Support gRPC client.
>   * Non-leading VOTING replica catch-up.
> 
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.5.0-rc1
> 
> 
> The candidate for Mesos 1.5.0 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/1.5.0-rc1/mesos-1.5.0.tar.gz
> 
> The tag to be voted on is 1.5.0-rc1:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.5.0-rc1
> 
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.5.0-rc1/mesos-1.5.0.tar.gz.md5
> 
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.5.0-rc1/mesos-1.5.0.tar.gz.asc
> 
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
> 
> The JAR is in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1221
> 
> Please vote on releasing this package as Apache Mesos 1.5.0!
> 
> The vote is open until Thu Jan 25 18:24:36 PST 2018 and passes if a majority 
> of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Mesos 1.5.0
> [ ] -1 Do not release this package because ...
> 
> Thanks,
> Jie and Gilbert



Re: Doc-a-thon - January 11th, 2018

2018-01-09 Thread James Peach
Just a reminder that the doc-a-thon is this Thursday :)

> On Nov 21, 2017, at 4:14 PM, Judith Malnick  wrote:
> 
> Hi all,
> 
> I'm excited to announce the next Apache Mesos doc-a-thon!
> 
> *Date:* January 11th, 2018
> 
> Location:
> 
> Mesosphere HQ
> 
> 88 Stevenson Street
> 
> San Francisco, CA
> 
> Schedule (Pacific time):
> 
> 3 - 3:30 PM: Discuss docs projects, split into groups
> 
> 3:30 - 6:30 PM: Work on docs
> 
> 6:30 - 7 PM: Present progress
> 
> 7 - 8 PM: Drinks and hangout!
> 
> 
> If you will be attending in person, please RSVP so we
> know how much food to get.
> If you plan on attending remotely, you can join with this Zoom link.
> Feel free to brainstorm project proposals on this planning doc.
> 
> 
> Let me know if you have any questions. I'm looking forward to seeing all of
> you and your amazing projects!
> 
> All the Best,
> Judith
> -- 
> Judith Malnick
> Community Manager
> 310-709-1517



Re: Reusing `reserve_resources` ACL for static reservation

2017-12-15 Thread James Peach


> On Dec 15, 2017, at 5:34 AM, Alexander Rojas  wrote:
> 
> Hey Yan,
> 
> We were discussing this issue with James and I think this is not enough
> to guarantee that an agent won’t be assigned (neither statically nor
> dynamically) resources under a certain role. The problem here is that nothing
> will prevent a principal from dynamically reserving resources later.

Maybe my confusion arises from two possible readings of the semantics of the 
`reserve_resources` ACL.

If we read it as "these principals may reserve resources for role A", then it 
makes sense that a framework principal can place those reservations on any 
agent and that agents must register with an allowed principal to make their 
static reservations.

If we read it as "agents with this principal may host resources reserved for 
role A", then that implies that we ought to consider the agent's principal for 
both static and dynamic reservations.

Due to the security context of this change, I had assumed that the latter was 
the desired outcome.

> However, your approach does work if you want to treat statically reserved
> resources as dynamic ones. It does, however, require that agents register
> using different credentials (which I don’t think is a bad idea).
> 
> What I was thinking now is to use only one authorization call, `RegisterAgent`
> so that it looks at the whole `slave_info` message (That will require 
> modifications
> in `mesos::ObjectApprover::Object`) and then check the roles from the 
> `slave_info`.

This seems like a pretty useful enhancement. An authorizer could validate agent 
attributes and any number of other interesting properties.

> Then for each reservation we would need to not only authorize the reservation
> action, but the roles themselves with the principal used for agent 
> registration
> and compute a logical and of the two results. There are indeed antecedents
> for this solution.
> 
> What I don’t think we can get around is for agents using different principals
> on registration since we only authorize against principals and agent-id’s are
> dynamically generated.

I think this is fine, though I was originally a bit concerned that specifying a 
large number of agent principals in the ACLs JSON would not be scalable. I 
expect that people using the built-in authorization mechanism are using only a 
few agent principals, while those who have wired Mesos up to a directory 
service are probably using machine accounts.

> @jpeach do you have any objections or ideas here?
> @yan could you discuss this with @jpeach.
> 
> Finally @yan, you have been working without a shepherd and I really recommend
> that you get one in order to get this through. I could help here but I may
> lack some of the context that James has. I guess it is up to you.
> 
> Best,
> 
> 
> Alexander Rojas
> alexan...@mesosphere.io
> 
> 
> 
> 
>> On 12. Dec 2017, at 20:31, Yan Xu  wrote:
>> 
>> Hi,
>> 
>> In https://issues.apache.org/jira/browse/MESOS-8306 I am proposing that we
>> use an ACL to restrict the roles that agents can statically reserve
>> resources for to address a security concern in which a process on a
>> compromised host can impersonate an agent and then reserve resources
>> for arbitrary roles.
>> 
>> Reusing the `reserve_resources` ACL for this purpose feels intuitive to me and
>> I don't think it interferes with its use for authorizing dynamic
>> reservations by the frameworks and operators.
>> 
>> Are there any concerns about it?
>> 
>> Also as part of this change I am revising the doc to change the wording on
>> static reservations so its use is not discouraged:
>> https://reviews.apache.org/r/64516/diff
>> 
>> Thanks,
>> Yan
> 



narrowing task sandbox permissions

2017-12-14 Thread James Peach
Hi all,

In https://issues.apache.org/jira/browse/MESOS-8332, I'm proposing a change to 
narrow the permissions used for the task sandbox directory from 0755 to 0750. 
Note that this change also makes failure to chown this directory into a hard 
failure.

I expect this is a safe change for well-behaved configurations, but please let 
me know if you have any compatibility concerns.

thanks,
James

Re: Customize executor_registration_timeout per executor

2017-12-11 Thread James Peach

> On Dec 11, 2017, at 8:55 AM, Zhitao Li  wrote:
> 
> Hi,
> 
> We are running tasks which have very large docker images and tasks which use
> much smaller images in our clusters. Therefore, we expect to see occasional
> violations of --executor_registration_timeout for tasks which have uncached
> large docker images.
> 
> I wonder whether we can introduce some executor specific parameter to make
> this customizable per executor, instead of one single value per agent.

Sounds like the registration timeout should not start until all the required 
images have been staged?

J

Re: 1.4.1 release

2017-11-03 Thread James Peach
I think MESOS-8169 is a candidate, but I won't be able to get to it until next 
week.


> On Nov 3, 2017, at 1:48 AM, Qian Zhang  wrote:
> 
> And I will backport MESOS-8051 to 1.2.x, 1.3.x and 1.4.x.
> 
> 
> Regards,
> Qian Zhang
> 
> On Fri, Nov 3, 2017 at 9:01 AM, Qian Zhang  wrote:
> We want to backport https://reviews.apache.org/r/62518/ to 1.2.x, 1.3.x and 
> 1.4.x, James will work on it.
> 
> 
> Regards,
> Qian Zhang
> 
> On Fri, Nov 3, 2017 at 12:11 AM, Kapil Arya  wrote:
> Please reply to this email if you have pending patches to be backported to 
> 1.4.x as we are aiming to cut a release candidate for 1.4.1 early next week.
> 
> Thanks,
> Anand and Kapil
> 
> 



Re: clearing the executor authentication token from the task environment

2017-11-02 Thread James Peach

> On Nov 1, 2017, at 2:28 PM, James Peach <jor...@gmail.com> wrote:
> 
> Hi all,
> 
> In https://issues.apache.org/jira/browse/MESOS-8140, I'm proposing that we 
> clear the MESOS_EXECUTOR_AUTHENTICATION_TOKEN environment variable 
> immediately after consuming it in the built-in executors. This protects it 
> from observation by other tasks in the same PID namespace, however I wanted 
> to verify that no-one currently has a use case that depends on this. 
> Currently, the token is inherited to the environment of tasks running under 
> the command executor (i.e. not to task group tasks).
> 
> Eventually we would add a formal API for tasks to access the executor token 
> in MESOS-8018.

OK, we will be landing this change for Mesos 1.5.

thanks,
James

clearing the executor authentication token from the task environment

2017-11-01 Thread James Peach
Hi all,

In https://issues.apache.org/jira/browse/MESOS-8140, I'm proposing that we 
clear the MESOS_EXECUTOR_AUTHENTICATION_TOKEN environment variable immediately 
after consuming it in the built-in executors. This protects it from observation 
by other tasks in the same PID namespace, however I wanted to verify that 
no-one currently has a use case that depends on this. Currently, the token is 
inherited to the environment of tasks running under the command executor (i.e. 
not to task group tasks).

Eventually we would add a formal API for tasks to access the executor token in 
MESOS-8018.

thanks,
James

Re: Static build?

2017-10-31 Thread James Peach

> On Oct 31, 2017, at 9:51 AM, Charles Allen  
> wrote:
> 
> Is it possible to statically build mesos?
> 
> https://issues.apache.org/jira/browse/MESOS-8127 fails for me.

It looks like compiler flags are not propagated through the 3rdparty builds 
very consistently. Depending on your build environment you might be able to 
build against unbundled dependencies?

> 
> Some other related tickets
> https://issues.apache.org/jira/browse/MESOS-1633
> https://issues.apache.org/jira/browse/MESOS-144
> 
> Thank you,
> Charles Allen



Re: Adding the limited resource to TaskStatus messages

2017-10-10 Thread James Peach

> On Oct 9, 2017, at 7:15 PM, Wil Yegelwel <wyegel...@gmail.com> wrote:
> 
> Is it correct to say that the limited resource field is *only* meant to 
> provide machine readable information about what resources limits were 
> exceeded?

Yes,

> If so, does it make sense to provide richer reporting fields for all failure 
> reasons? I imagine other failure reasons could benefit from being able to 
> report details of the failure that are machine readable.

Some other reasons already have their own structured information, eg. the 
TASK_UNREACHABLE state populates the `unreachable_time` field. I'm not planning 
to add structured information to any other failure reasons, but I'd support 
doing it if you have a specific suggestion.

> On Mon, Oct 9, 2017, 3:50 PM James Peach <jor...@gmail.com> wrote:
> 
> > On Oct 9, 2017, at 1:27 PM, Vinod Kone <vinodk...@apache.org> wrote:
> >
> >> In the case that a task is killed because it violated a resource
> >> constraint (ie. the reason field is REASON_CONTAINER_LIMITATION,
> >> REASON_CONTAINER_LIMITATION_DISK or REASON_CONTAINER_LIMITATION_MEMORY),
> >> this field may be populated with the resource that triggered the
> >> limitation. This is intended to give better information to schedulers about
> >> task resource failures, in the expectation that it will help them bubble
> >> useful information up to the user or a monitoring system.
> >>
> >
> > Can you elaborate what schedulers are expected to do with this information?
> > Looking for some concrete use cases if you can.
> 
> There's no concrete use case here; it's just a matter of propagating 
> information we know in a structured way.
> 
> If we assume that the scheduler knows about some sort of monitoring system or 
> has a UI, we can present this to the user or a system that can take action on 
> it. The status quo is that the raw message string is dumped to logs, and has 
> to be manually interpreted.
> 
> Additionally, this can pave the way to getting rid of 
> REASON_CONTAINER_LIMITATION_DISK and REASON_CONTAINER_LIMITATION_MEMORY. All 
> you really need is REASON_CONTAINER_LIMITATION plus the resource information.
> 
> J
> 



Re: Adding the limited resource to TaskStatus messages

2017-10-09 Thread James Peach

> On Oct 9, 2017, at 1:27 PM, Vinod Kone  wrote:
> 
>> In the case that a task is killed because it violated a resource
>> constraint (ie. the reason field is REASON_CONTAINER_LIMITATION,
>> REASON_CONTAINER_LIMITATION_DISK or REASON_CONTAINER_LIMITATION_MEMORY),
>> this field may be populated with the resource that triggered the
>> limitation. This is intended to give better information to schedulers about
>> task resource failures, in the expectation that it will help them bubble
>> useful information up to the user or a monitoring system.
>> 
> 
> Can you elaborate what schedulers are expected to do with this information?
> Looking for some concrete use cases if you can.

There's no concrete use case here; it's just a matter of propagating 
information we know in a structured way.

If we assume that the scheduler knows about some sort of monitoring system or 
has a UI, we can present this to the user or a system that can take action on 
it. The status quo is that the raw message string is dumped to logs, and has to 
be manually interpreted. 

Additionally, this can pave the way to getting rid of 
REASON_CONTAINER_LIMITATION_DISK and REASON_CONTAINER_LIMITATION_MEMORY. All 
you really need is REASON_CONTAINER_LIMITATION plus the resource information.

J



Adding the limited resource to TaskStatus messages

2017-10-09 Thread James Peach
Hi all,

In https://reviews.apache.org/r/62644/, I am proposing to add an optional 
Resources field to the TaskStatus message named `limited_resources`.

In the case that a task is killed because it violated a resource constraint 
(ie. the reason field is REASON_CONTAINER_LIMITATION, 
REASON_CONTAINER_LIMITATION_DISK or REASON_CONTAINER_LIMITATION_MEMORY), this 
field may be populated with the resource that triggered the limitation. This is 
intended to give better information to schedulers about task resource failures, 
in the expectation that it will help them bubble useful information up to the 
user or a monitoring system.

diff --git a/include/mesos/v1/mesos.proto b/include/mesos/v1/mesos.proto
index d742adbbf..559d09e37 100644
--- a/include/mesos/v1/mesos.proto
+++ b/include/mesos/v1/mesos.proto
@@ -2252,6 +2252,13 @@ message TaskStatus {
   // status updates for tasks running on agents that are unreachable
   // (e.g., partitioned away from the master).
   optional TimeInfo unreachable_time = 14;
+
+  // If the reason field indicates a container resource limitation,
+  // this field contains the resource whose limits were violated.
+  //
+  // NOTE: 'Resources' is used here because the resource may span
+  // multiple roles (e.g. `"mem(*):1;mem(role):2"`).
+  repeated Resource limited_resources = 16;
 }



cheers,
James




Re: RFC: Partition Awareness

2017-10-05 Thread James Peach

> On Jun 21, 2017, at 10:16 AM, Megha Sharma  wrote:
> 
> Thank you all for the feedback.
> To summarize, not killing tasks for non-Partition Aware frameworks will make 
> the schedulers see a higher volume of non-terminal updates for tasks for 
> which they have already received a TASK_LOST but nothing new that they are 
> not seeing today. So, this shouldn’t be a breaking change for frameworks and 
> this will make the partition awareness logic simpler. I will update 
> MESOS-7215 with the details once the design is ready.

What happens for short-lived frameworks? That is, the lost task comes back, 
causing the master to track its framework as disconnected, but the framework is 
gone and will never return.

J

Re: Refactoring configuration docs

2017-10-02 Thread James Peach

> On Oct 2, 2017, at 10:12 AM, Andrew Schwartzmeyer  
> wrote:
> 
> I'm going to assume everyone said "go for it!"

Go for it!

> 
> Cheers,
> 
> Andrew Schwartzmeyer
> 
> On 09/28/2017 9:18 am, Andrew Schwartzmeyer wrote:
>> Hi all,
>> I'm going through and updating the CMake docs, and that brought me to
>> our docs/configuration.md file. This file is huge! In the style of
>> Jie's refactor of the isolator docs, I'd like to create a
>> docs/configuration folder, and split these files into mesos.md,
>> master.md, agent.md, libprocess.md, autotools.md, and cmake.md within
>> that folder.
>> I think this important for discoverability, especially considering
>> that we lack a table of contents. After re-organizing, I would suggest
>> that docs/configuration.md become a table of contents with relative
>> links into the configuration folder.
>> Any objections / ongoing work I might interfere with?
>> Thanks,
>> Andrew Schwartzmeyer



Re: Are there any supported systems without O_CLOEXEC?

2017-09-29 Thread James Peach

> On Sep 29, 2017, at 11:34 AM, Benjamin Mahler <bmah...@apache.org> wrote:
> 
> Is this altering the minimum Linux or OS X version we support?


I couldn't find a clear statement of what OS support we guarantee. OS X got 
O_CLOEXEC in 10.10. CentOS 6.9 has kernel 2.6.32, apparently Ubuntu 14.04 has 
3.19. Do we support anything older than that?

> 
> On Fri, Sep 29, 2017 at 9:15 AM, James Peach <jor...@gmail.com> wrote:
> 
>> 
>>> On Sep 27, 2017, at 5:03 PM, James Peach <jor...@gmail.com> wrote:
>>> 
>>> Hi all,
>>> 
>>> In MESOS-8027 and https://reviews.apache.org/r/62638/, I'm claiming
>> that, in practice, we do not have any supported platforms that don't
>> implement O_CLOEXEC to open. All current Linux, FreeBSD and Solaris
>> versions implement O_CLOEXEC. Does anyone know of a platform that doesn't
>> have O_CLOEXEC that we ought to work on?
>>> 
>>> https://www.freebsd.org/cgi/man.cgi?sektion=2&query=open
>>> http://man7.org/linux/man-pages/man2/open.2.html
>>> https://docs.oracle.com/cd/E23824_01/html/821-1463/open-2.html
>>> https://developer.apple.com/legacy/library/documentation/
>> Darwin/Reference/ManPages/man2/open.2.html
>> 
>> Bump! If you run Mesos on a platform that doesn't support O_CLOEXEC (eg.
>> Linux kernel <= 2.6.23), please let us know!
>> 
>> J



Re: Are there any supported systems without O_CLOEXEC?

2017-09-29 Thread James Peach

> On Sep 27, 2017, at 5:03 PM, James Peach <jor...@gmail.com> wrote:
> 
> Hi all,
> 
> In MESOS-8027 and https://reviews.apache.org/r/62638/, I'm claiming that, in 
> practice, we do not have any supported platforms that don't implement 
> O_CLOEXEC to open. All current Linux, FreeBSD and Solaris versions implement 
> O_CLOEXEC. Does anyone know of a platform that doesn't have O_CLOEXEC that we 
> ought to work on?
> 
> https://www.freebsd.org/cgi/man.cgi?sektion=2&query=open
> http://man7.org/linux/man-pages/man2/open.2.html
> https://docs.oracle.com/cd/E23824_01/html/821-1463/open-2.html
> https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man2/open.2.html

Bump! If you run Mesos on a platform that doesn't support O_CLOEXEC (eg. Linux 
kernel <= 2.6.23), please let us know!

J

Are there any supported systems without O_CLOEXEC?

2017-09-27 Thread James Peach
Hi all,

In MESOS-8027 and https://reviews.apache.org/r/62638/, I'm claiming that, in 
practice, we do not have any supported platforms that don't implement O_CLOEXEC 
to open. All current Linux, FreeBSD and Solaris versions implement O_CLOEXEC. 
Does anyone know of a platform that doesn't have O_CLOEXEC that we ought to 
work on?

https://www.freebsd.org/cgi/man.cgi?sektion=2&query=open
http://man7.org/linux/man-pages/man2/open.2.html
https://docs.oracle.com/cd/E23824_01/html/821-1463/open-2.html
https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man2/open.2.html


thanks,
James

Re: Collect feedbacks on TASK_FINISHED

2017-09-22 Thread James Peach

> On Sep 21, 2017, at 10:12 PM, Vinod Kone  wrote:
> 
> I think it makes sense for `TASK_KILLED` to be sent in response to a KILL
> call irrespective of the exit status. IIRC, that was the original intention.

Those are the semantics we implement and expect in our scheduler and executor. 
The only time we emit TASK_KILLED is in response to a scheduler kill, and a 
scheduler kill always ends in a TASK_KILLED.

The rationale for this is
1. We want to distinguish whether the task finished for its own reasons (ie. 
not due to a scheduler kill)
2. The scheduler told us to kill the task and we did, so it was TASK_KILLED 
(irrespective of any exit status)

> On Thu, Sep 21, 2017 at 8:20 PM, Qian Zhang  wrote:
> 
>> Hi Folks,
>> 
>> I'd like to collect the feedbacks on the task state TASK_FINISHED.
>> Currently the default and command executor will always send TASK_FINISHED
>> as long as the exit code of task is 0, this cause an issue: when scheduler
>> initiates a kill task, the executor will send SIGTERM to the task first,
>> and if the task handles SIGTERM gracefully and exits with 0, the executor
>> will send TASK_FINISHED for that task, so we will see the task state
>> transition: TASK_KILLING -> TASK_FINISHED.
>> 
>> This seems incorrect because we thought it should be TASK_KILLING ->
>> TASK_KILLED, that's why we filed a ticket MESOS-7975
>>  for it. However, I am
>> not very sure if it is really a bug, because I think it depends on how we
>> define the meaning of TASK_FINISHED, if it means the task is terminated
>> successfully on its own without external interference, then I think it does
>> not make sense for scheduler to receive a TASK_KILLING followed by a
>> TASK_FINISHED since there is indeed an external interference (killing task
>> is initiated by scheduler). However, if TASK_FINISHED means the task is
>> terminated successfully for whatever reason (no matter it is killed or
>> terminated on its own), then I think it is OK to receive a TASK_KILLING
>> followed by a TASK_FINISHED.
>> 
>> Please let us know your thoughts on this issue, thanks!
>> 
>> 
>> Regards,
>> Qian Zhang
>> 



Re: [Design Doc] Native Support for Prometheus Metrics

2017-09-11 Thread James Peach

> On Sep 9, 2017, at 5:29 AM, Benjamin Bannier  
> wrote:
> 
> Hi James,
> 
> I'd like to make a longer comment here to make it easier to discuss.
> 
>> [...]
>> 
>> Note the proposal to alter how Timer metrics are exposed in an incompatible
>> way (I argue this is OK because you can't really make use of these metrics
>> now).
> 
> I am not sure I follow your argument around `Timer`. It is similar to a gauge
> caching the last value, with associated statistics calculated from a time 
> series.

I'm arguing that this does not provide useful semantics. When we think about 
the real-world objects we are representing with Timers, they don't look at all 
like what we represent with Gauges. For example, if I'm asked how much disk 
space is free, giving an instantaneous value with no reference to prior state 
(ie. Gauge) is informative and useful. Conversely, if I was asked to bill for 
my work time over the last month and I handed you a bill for the 10min because 
that was the last interval I worked, that answer is seriously unhelpful.

> I have never used Prometheus, but a brief look at the Prometheus
> docs seems to suggest that a `Timer` could be mapped onto a Prometheus summary
> type with minimal modifications (namely, by adding a `sum` value that you
> propose as sole replacement).

Right, that's what the current implementation does.

> I believe that exposing statistics is useful, and moving all `Timer` metrics 
> to
> counters (cumulative value and number of samples) would lead to information
> loss.

I'm not proposing that we remove the Timer statistics. I am, however, proposing 
that representing a Timer as a cumulative count of elapsed time units makes it 
possible to actually use Timers for practical purposes. When we plotted the 
allocation_run Timer, would see the difference between full and partial 
allocation runs by the area under the graph. We could see the difference over 
time and we could even see how allocation runs behave across failover.

> Since most of your criticism of `Timer` is about it its associated statistics,

That wasn't my intention. The problem with the Timer is the value and count 
fields. While I did mention that I think a raw histogram would be more useful, 
I explicitly put that out of scope.

> maybe we can make fixes to libprocess' `TimeSeries` and the derived
> `Statistics` to make them more usable. Right now `Statistics` seems to be more
> apt for dealing with timing measurements where one probably worries more about
> the long tail of the distribution (it only exposes the median and higher
> percentiles). It seems that if one would e.g., make the exposed percentiles
> configurable, it should be possible to expose a useful characterization of the
> underlying distribution (think: box plot). It might be that one would need to
> revisit how `TimeSeries` sparsifies older data to make sure the quantiles we
> expose are meaningful.

I agree that it is possible to measure and improve the statistics. I'd probably 
approach this by adding extra instrumentation to capture all the raw Timer 
observations. Then I would attempt to show that the running percentile summary 
approximates the actual percentiles measured from the complete data.
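The exact side of that comparison is straightforward. As a sketch (names here are illustrative, not part of the Mesos codebase), computing a nearest-rank percentile over the complete set of raw observations gives the ground truth against which the running summary's approximation could be validated:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Exact percentile over the complete raw observations, using the
// nearest-rank method: the smallest value such that at least p% of
// the samples are at or below it.
double exactPercentile(std::vector<double> samples, double percentile) {
  assert(!samples.empty());
  assert(percentile > 0.0 && percentile <= 100.0);

  std::sort(samples.begin(), samples.end());

  size_t rank = static_cast<size_t>(
      std::ceil(percentile / 100.0 * samples.size()));

  return samples[rank - 1];
}
```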

>> First, note that the “allocator/mesos/allocation_run_ms/count” sample is not
>> useful at all. It has the semantics of a saturating counter that saturates at
>> the size of the bounded time series. To address this, there is another metric
>> “allocator/mesos/allocation_runs”, which tracks the actual count of
>> allocation runs (3161331.00 in this case). If you plot this counter over time
>> (ie. as a rate), it will be zero for all time once it reaches saturation. In
>> the case of allocation runs, this is almost all the time, since 1000
>> allocations will be performed within a few hours.
> 
> While `count` is not a useful measure of the behavior of the measured datum, 
> it
> is critical to assess whether the derived statistic is meaningful (sample
> size). Like you write, it becomes less interesting once enough data was
> collected.

If the count doesn't saturate, it is always meaningful. If it is possible for 
the metric to become non-meaningful, that's pretty bad. I'm not sure I accept 
your premise here, though. Once the count saturates at 1000 samples, how do you 
know whether the statistics are for the last hour, or 3 hours ago? It is 
possible to accumulate no samples and for that to be invisible in the metrics.
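A simplified stand-in (not the actual libprocess `TimeSeries`) shows the saturation problem: once the bounded window fills, its size stays pinned at the capacity, so plotting the count as a rate reads zero forever even though samples keep arriving, and an interval with no samples at all looks identical to one with steady arrivals:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Simplified bounded time series: keeps at most `capacity` samples,
// discarding the oldest. Its count saturates at the capacity, which is
// the behavior criticized above.
class BoundedSeries {
 public:
  explicit BoundedSeries(size_t capacity) : capacity_(capacity) {}

  void add(double value) {
    if (window_.size() == capacity_) {
      window_.pop_front();  // oldest sample silently dropped
    }
    window_.push_back(value);
  }

  // Saturates at capacity_ no matter how many samples have arrived.
  size_t count() const { return window_.size(); }

 private:
  size_t capacity_;
  std::deque<double> window_;
};
```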

>> Finally, while the derived statistics metrics can be informative, they are
>> actually less expressive than a raw histogram would be. A raw histogram of
>> timed values would allow an observer to distinguish cases where there are
>> clear performance bands (e.g. when allocation times cluster at either 15ms or
>> 200ms), but the percentile statistics obscure this information.
> 
> I would argue that is more a problem of `Statistics` only reporting 
> percentiles
> from the far out, 

[Design Doc] Native Support for Prometheus Metrics

2017-09-08 Thread James Peach
Hi all,

This document discusses a proposal to implement native support for Prometheus 
metrics in libprocess. Note the proposal to alter how Timer metrics are exposed 
in an incompatible way (I argue this is OK because you can't really make use of 
these metrics now).

> https://docs.google.com/document/d/1j1TkckxGrKixvAUoz_TJRl-YayFMCNIUYv8Cq19Kal8

thanks,
James

[Design Doc] Mesos PAM support

2017-09-01 Thread James Peach
Hi all,

I wrote up a design doc on how we can support PAM on Mesos agents. Any comments 
or feedback would be appreciated

> https://docs.google.com/document/d/1nlWC7ArgQRr5f_uH5wJ0AGV8BHMhoKgYf__RTmu7TKk

thanks,
James

Re: Sending TASK_STARTING in the built-in executors

2017-08-23 Thread James Peach

> On Aug 23, 2017, at 2:38 AM, Benno Evers  wrote:
> 
> Hi all,
> 
> when starting a task, an executor can send out the following status updates:
> 
>  - [optional] TASK_STARTING: Sent by the executor when it received the
> launch command
>  - TASK_RUNNING: Sent by the executor when the task is running


How is "running" defined?

> 
> The built-in executors currently don't send out TASK_STARTING updates. I
> think this discards potentially valuable information, because TASK_RUNNING
> informs us about the current status of the task, but not about the status
> change.
> 
> For example, if the network connection between scheduler and master is
> interrupted during task start, it has no good way to estimate the tasks
> start time, because the TASK_RUNNING update that it eventually gets might
> be a much later one. Also, for tasks with a long delay between STARTING and
> RUNNING, to an outside observer it will look the same as if the task was
> stuck in STAGING.
> 
> There is a small risk that sending an additional update could break
> existing frameworks. We briefly looked through some of the most popular
> open-source frameworks and didn't find any major issues, but of course it's
> impossible to do an exhaustive check.
> 
> In particular, a framework will break if
> 
> 1. It runs tasks using one of the built-in mesos executors, and
> 2. it doesn't handle the possibility of receiving TASK_STARTING update, and
> 3. it reports an error whenever it encounters an unexpected task state in
> an update.
> 
> 
> If you are aware of any such framework, please speak up so we can consider
> it.
> 
> 
> Thanks,
> -- 
> Benno Evers
> Software Engineer, Mesosphere



Re: [Proposal] Use jemalloc as default memory allocator for Mesos

2017-08-18 Thread James Peach

> On Aug 18, 2017, at 3:49 AM, Benno Evers  wrote:
> 
> Hi all,
> 
> I would like to propose bundling jemalloc as a new dependency
> under `3rdparty/`, and to link Mesos against this new memory
> allocator by default.

I support doing this for all the Mesos executable programs. We have been 
running under jemalloc for a couple of years with zero problems and improved 
performance (we run a pretty old glibc).

Note that we must not link libmesos.so to jemalloc since that is used by 
programs that may not be able to tolerate linking in a separate malloc.

> # Motivation
> 
> The Mesos master and agent binaries are, ideally, very long-running
> processes. This makes them susceptible to memory issues, because
> even small leaks have a chance to build up over time to the point
> where they become problematic.
> 
> We have seen several such issues on our internal Mesos installations,
> for example https://issues.apache.org/jira/browse/MESOS-7748
> or https://issues.apache.org/jira/browse/MESOS-7800.
> 
> I imagine any organization running Mesos for an extended period
> of time has had its share of similar issues, so I expect this
> proposal to be useful for the whole community.
> 
> 
> # Why jemalloc?
> 
> Given that memory issues tend to be most visible after a given
> process has been running for a long time, it would be great to
> have the option to enable heap tracking and profiling at runtime,
> without having to restart the process. (This ability could then
> be connected to a Mesos endpoint, similar to how we can adjust
> the log level at runtime)
> 
> The two production-quality memory allocators that have this
> ability currently seem to be tcmalloc and jemalloc. Of these,
> jemalloc does produce in our experience better and more
> detailed statistics.
> 
> 
> # What is the impact on users who do not need this feature?
> 
> Naturally, not every single user of Mesos will have a need
> for this feature. To ensure these users would not experience serious
> performance regressions as a result of this change, we conducted
> a preliminary set of benchmarks whose results are collected
> under https://issues.apache.org/jira/browse/MESOS-7876
> 
> It turns out that we could probably even expect a small speedup (1% - 5%)
> as a nice side-effect of this change.
> 
> Users who compile Mesos themselves would of course have the option
> to disable jemalloc at configuration time or replace it with their
> memory allocator of choice.
> 
> 
> 
> I'm looking forward to hearing any thoughts and comments.
> 
> 
> Thanks,
> -- 
> Benno Evers
> Software Engineer, Mesosphere



Re: Future Mesos Developer Community Meetings

2017-08-10 Thread James Peach

> On Aug 9, 2017, at 10:58 AM, Michael Park  wrote:
> 
> A few announcements here:
> 
>   - I'll no longer be hosting Mesos Developer Community Meetings going
>   forward. There's a lot of work involved in hosting/running good meetings. I
>   feel that I have not been putting enough work to make it good, and would
>   rather have someone else run it better instead.
>   - The meeting tomorrow, Aug 10, 2017, is *CANCELLED*.
>   - We'll resume Aug 24, 2017 with *Benjamin Hindman* as our new host!
>   - The meetings will occur *monthly* going forward rather than *every
>   other week*.

Mike, thanks for the time and effort you've put into the Community Meetings. 
I'm glad to hear that they will be continuing.

cheers,
James

Re: Deprecating `--disable-zlib` in libprocess

2017-08-08 Thread James Peach

> On Aug 8, 2017, at 10:57 AM, Chun-Hung Hsiao  wrote:
> 
> Hi all,
> 
> In libprocess, we have an optional `--disable-zlib` flag, but it's
> currently not used
> for conditional compilation and we always use zlib in libprocess,
> and there's a requirement check in Mesos to make sure that zlib exists.
> Should this option be removed then?

Yes.

> Or is there anyone working on a system without zlib?
> 
> Thanks for your opinions!
> Chun-Hung



seeking shepherd for MESOS-7675 (network ports isolator)

2017-07-03 Thread James Peach
Hi all,

I'm looking for a shepherd for 
https://issues.apache.org/jira/browse/MESOS-7675, which implements a 
network/ports isolator for ensuring correct ports resource usage on the host 
network.

thanks,
James



Re: Metrics for committing code for contributors

2017-07-03 Thread James Peach

> On Jul 1, 2017, at 1:07 AM, Benjamin Mahler  wrote:

[snip]

> I know reviewing code from newer contributors is often hard work and can
> sometimes be thankless, but it's really important. So I just wanted to
> celebrate and thank all those that have helped to grow the community of
> contributors so far!

Thank you to all the committers who have spent time reviewing my contributions. 
The review process in Mesos is much more stringent than in most other projects 
I've been involved with, but in almost all cases the end result has been worth 
it for me.

Thanks!

Re: how to wait for nested container launch in tests

2017-07-01 Thread James Peach

> On Jun 30, 2017, at 11:23 AM, Jie Yu <yujie@gmail.com> wrote:
> 
> One way is to define a health check?

Good idea, I'll give that a crack

> 
> - Jie
> 
> On Fri, Jun 30, 2017 at 3:36 AM, James Peach <jor...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> I'm trying to write a nested container test for an isolator. It turns out
>> that the default executor sends TASK_RUNNING immediately after receiving
>> the LAUNCH_NESTED_CONTAINER response. However, at this time the nested
>> container is not actually up, and the command specified in the task info
>> hasn't been launched. Can anyone suggest a way for a test to detect that
>> the nested container command is actually running, so that I can advance the
>> clock to reliably trigger the isolator?
>> 
>> J



how to wait for nested container launch in tests

2017-06-30 Thread James Peach
Hi all,

I'm trying to write a nested container test for an isolator. It turns out that 
the default executor sends TASK_RUNNING immediately after receiving the 
LAUNCH_NESTED_CONTAINER response. However, at this time the nested container is 
not actually up, and the command specified in the task info hasn't been 
launched. Can anyone suggest a way for a test to detect that the nested 
container command is actually running, so that I can advance the clock to 
reliably trigger the isolator?

J

Re: RFC: removing process implementations from common headers

2017-06-28 Thread James Peach

> On Jun 28, 2017, at 2:19 AM, Benjamin Mahler <bmah...@apache.org> wrote:
> 
> Thanks James! As you said, removing Process implementations from the
> headers is the existing practice, but we need to do a sweep to enforce this
> consistently. Folks could work on this sweep today to make progress on the
> 3 benefits you outlined.
> 
> This proposal to me seems to just be:
> 
> (1) When needed for testing, whether to expose the Process declaration in
> its own foo_process.hpp header, rather than within foo.hpp.
> (2) whether to name the .cpp as foo_process.cpp rather than foo.cpp.
> 
> I'm not sure if I like (2), instead of keeping the .cpp named foo.cpp.
> Consider the case where there is no foo_process.hpp (not needed for
> testing), then you just have foo.hpp and foo_process.cpp. Or consider the
> case where a user is looking for the implementation of limiter.hpp, they
> have to know to look for limiter_process.cpp rather than limiter.cpp (but
> only when a Process is involved!). Seems unfortunate?

I'm OK with putting both Foo and FooProcess in foo.cpp.

> For Mesos, (1) sounds good, but I'm not sure if libprocess should be
> exposing the foo_process.hpp header in the public includes alongside the
> foo.hpp header. Because then libprocess users are assuming our particular
> implementation of the interface. I think for the libprocess testing
> purposes, we probably want the *_process.hpp header to be within the src/
> directory?

There are 2 options for libprocess. Put the internal headers in the src/ 
directory, or keep them in include/ but don't install them (use 
noinst_HEADERS). The former gives better protection against accidentally 
consuming the internal headers from Mesos.

> On Sat, Jun 24, 2017 at 8:23 AM, James Peach <jor...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> There is a common Mesos pattern where a subsystem is implemented by a
>> facade class that forwards calls to an internal Process class, eg. Fetcher
>> and FetcherProcess, or zookeeper::Group and zookeeper::GroupProcess. Since
>> the Process is an internal implementation detail, I'd like to propose that
>> we adopt a general policy that it should not be exposed in the primary
>> header file. This has the following benefits:
>> 
>> - reduces the number of symbols exposed to clients including the primary
>> header file
>> - reduces the number of header files needed in the primary header file
>> - reduces the number of rebuilt dependencies when the process
>> implementation changes
>> 
>> Although each individual case of this practice may not improve build
>> times, I think it is likely that over time, consistent application of this
>> will help.
>> 
>> In many cases, when FooProcess is only used by Foo, both the declaration
>> and definitions of Foo can be inlined into "foo.cpp", which is already our
>> common practice. If the implementation of the Process class is needed
>> outside the facade (eg. for testing), the pattern I would propose is:
>> 
>>foo.hpp - Primary API for Foo, forward declares FooProcess
>>foo_process.hpp - Declarations for FooProcess
>>foo_process.cpp - Definitions of FooProcess
>> 
>> The "checks/checker.hpp" interface almost follows this pattern, but gives
>> up the build benefits by including "checker_process.hpp" in "checker.hpp".
>> This should be simple to fix however.
>> 
>> thanks,
>> James



RFC: removing process implementations from common headers

2017-06-23 Thread James Peach
Hi all,

There is a common Mesos pattern where a subsystem is implemented by a facade 
class that forwards calls to an internal Process class, eg. Fetcher and 
FetcherProcess, or zookeeper::Group and zookeeper::GroupProcess. Since the 
Process is an internal implementation detail, I'd like to propose that we adopt 
a general policy that it should not be exposed in the primary header file. This 
has the following benefits:

- reduces the number of symbols exposed to clients including the primary header 
file
- reduces the number of header files needed in the primary header file
- reduces the number of rebuilt dependencies when the process implementation 
changes

Although each individual case of this practice may not improve build times, I 
think it is likely that over time, consistent application of this will help.

In many cases, when FooProcess is only used by Foo, both the declaration and 
definitions of Foo can be inlined into "foo.cpp", which is already our common 
practice. If the implementation of the Process class is needed outside the 
facade (eg. for testing), the pattern I would propose is:

foo.hpp - Primary API for Foo, forward declares FooProcess
foo_process.hpp - Declarations for FooProcess
foo_process.cpp - Definitions of FooProcess

The "checks/checker.hpp" interface almost follows this pattern, but gives up 
the build benefits by including "checker_process.hpp" in "checker.hpp". This 
should be simple to fix however.
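Collapsed into a single translation unit for illustration (with a hypothetical `Foo`, not a class from the tree), the proposed layout looks like this. In the real tree only the forward declaration and the `Foo` interface would live in "foo.hpp"; `FooProcess` would be declared in "foo_process.hpp" and defined in "foo_process.cpp" (or inlined into "foo.cpp" when it is not needed outside the facade):

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- foo.hpp: primary API for Foo, forward declares FooProcess ---
class FooProcess;  // internal implementation detail, never exposed here

class Foo {
 public:
  Foo();
  ~Foo();
  std::string greet(const std::string& name);  // forwards to FooProcess

 private:
  std::unique_ptr<FooProcess> process;
};

// --- foo_process.hpp / foo_process.cpp: FooProcess itself ---
class FooProcess {
 public:
  std::string greet(const std::string& name) { return "hello, " + name; }
};

Foo::Foo() : process(new FooProcess()) {}
Foo::~Foo() = default;

std::string Foo::greet(const std::string& name) {
  return process->greet(name);
}
```

Clients that include only "foo.hpp" never see `FooProcess` or its header dependencies, so changing the Process implementation rebuilds only the translation units that opted in.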

thanks,
James

Re: An independent server communicating between master and client

2017-06-21 Thread James Peach

> On Jun 20, 2017, at 11:23 AM, Wenzhao Zhang  wrote:
> 
> Hello, All:
> 
> I'm working on an independent server, which should be able to talk to the
> master via  HTTP POST requests.
> I set up a Jersey REST server. My initial plan is to use *JSON/XML* to map
> Entities, as this is widely used.
> 
> However, I find some communication compatibility issues,
> 1. I generate the Java classes from the *.proto files.
>I try to create an Event.Offers object via a REST (with *JSON*) call,
> but get some data stream deserializing errors.
>I think this is because of the complex structure of the generated
> classes. They are not POJO's, e.g. they don't have public constructors.

Your JSON serializer needs to follow the Protobuf JSON mapping, with the 
exception that field names are not mapped to lowerCamelCase.

https://developers.google.com/protocol-buffers/docs/proto3#json


> 2. "src/cli/execute.cpp" sets "ContentType" to "*PROTOBUF*".
>I think in most cases, Mesos internally uses "PROTOBUF", not JSON.
> 
> So, given the above issues,
> Should I implement my server with *Protocol Buffers*, is this a better
> approach?
> Or should I try to convert PROTOBUF to JSON inside Mesos?
> 
> Could anyone kindly give some suggestions? I become confused on this point.
> 
> Thanks very much
> Wenzhao



Re: Work group on Community

2017-06-16 Thread James Peach

> On Jun 15, 2017, at 10:57 AM, Vinod Kone  wrote:
> 
> Hi folks,
> 
> Seeing that our first official containerizer WG is off to a good start, we
> want to use that momentum to start new WGs.
> 
> I'm proposing that we start a new work group on community. The mission of
> this work group would be to figure out ways to grow the size of our
> community and improve the experience of community members (users, devs,
> contributors, committers etc).
> 
> In the first meeting, we can nail down what the charter of this work group
> should be etc. My initial ideas for the topics/components this work group
> could cover
> 
> --> Releases
> --> Roadmap
> --> Reviews
> --> JIRA
> --> CI
> 
> Over time, I'm hoping that new specific work groups will spring up that can
> own some of these topics.
> 
> If you are interested in joining this work group, please reply to this
> thread and I'll add you to the invite.

I'm interested, but unlikely to have much bandwidth to contribute anything 
substantial. One suggestion I have is that a weekly Mesos newsletter would be 
great. There is a lot of activity on reviewboard, slack and in design documents 
and collecting that in a regular newsletter would give that activity a lot more 
visibility.

J

Re: Isolating metrics collection from master/agent slowness

2017-05-22 Thread James Peach

> On May 19, 2017, at 11:35 AM, Zhitao Li  wrote:
> 
> Hi,
> 
> I'd like to start a conversation to talk about metrics collection endpoints
> (especially `/metrics/snapshot`) behavior.
> 
> Right now, these endpoints are served from the same master/agent's
> libprocess, and extensively uses `gauge` to chain further callbacks to
> collect various metrics (DRF allocator specifically adds several metrics
> per role).
> 
> This brings a problem when the system is under load: when the
> master/allocator libprocess becomes busy, stats collection itself becomes
> slow too. Flying dark when the system is under load is specifically painful
> for an operator.

Yes, sampling metrics should approach zero cost.

> I would like to explore the direction of isolating metric collection even
> when the master is slow. A couple of ideas:
> 
> - (short term) reduce usage of gauge and prefer counter (since I believe
> they are less affected);

I'd rather not squash the semantics for performance reasons. If a metric has 
gauge semantics, I don't think we should represent that as a Counter.

> - alternative implementation of `gauge` which does not contend on
> master/allocator's event queue;

This is doable in some circumstances, but not always. For example, 
Master::_uptime_secs() doesn't need to run on the master queue, but 
Master::_outstanding_offers arguably does. The latter could be implemented by 
sampling a variable that is updated, but that's not very generic, so we should 
try to think of something better.

> - serving metrics collection from a different libprocess routine.

See MetricsProcess. One (mitigation?) approach would be to sample the metrics 
at a fixed rate and then serve the cached samples from the MetricsProcess. I 
expect most installations have multiple clients sampling the metrics, so this 
would at least decouple the sample rate from the metrics request rate.
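A minimal sketch of that mitigation (illustrative names, not the actual MetricsProcess API): sample all the metrics at a fixed interval and serve every snapshot request from the cache, so many polling clients cost no more than one sample per interval:

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <map>
#include <string>

// Serve cached metric samples, re-sampling only when the cache is older
// than the configured interval. This decouples the sample rate from the
// /metrics/snapshot request rate.
class CachedSnapshot {
 public:
  using Clock = std::chrono::steady_clock;
  using Snapshot = std::map<std::string, double>;

  CachedSnapshot(std::function<Snapshot()> sample, Clock::duration interval)
    : sample_(std::move(sample)), interval_(interval) {}

  // Serve the cache; re-sample only when it has gone stale.
  const Snapshot& get(Clock::time_point now) {
    if (!sampled_ || now - last_ >= interval_) {
      cache_ = sample_();
      last_ = now;
      sampled_ = true;
    }
    return cache_;
  }

 private:
  std::function<Snapshot()> sample_;
  Clock::duration interval_;
  Clock::time_point last_;
  bool sampled_ = false;
  Snapshot cache_;
};
```

The trade-off is that every client sees samples up to one interval old, which for most dashboards is acceptable in exchange for a bounded load on the master.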

> 
> Any thoughts on these?
> 
> -- 
> Cheers,
> 
> Zhitao Li



Re: Added task status update reason for health checks

2017-05-22 Thread James Peach

> On May 22, 2017, at 5:28 AM, Andrei Budnik  wrote:
> 
> Hi All,
> 
> The new reason is REASON_TASK_HEALTH_CHECK_STATUS_UPDATED.
> The corresponding ticket is https://issues.apache.org/jira/browse/MESOS-6905

Is there any documentation about how executors ought to use this reason? Even a 
comment in the proto files would help executor authors use this consistently.

J

Re: New YouTube channel to house working group recordings?

2017-05-18 Thread James Peach

> On May 18, 2017, at 4:53 PM, Michael Park  wrote:
> 
> Is there a reason why you want to use new YouTube channel? I think I would
> prefer to use the existing channel and house them in a different playlist.


Is there a link to this from mesos.apache.org? I couldn't find one ...


> On Thu, May 18, 2017 at 5:09 PM Vinod Kone  wrote:
> 
>> +1
>> 
>> @vinodkone
>> 
>>> On May 18, 2017, at 12:29 PM, Judith Malnick 
>> wrote:
>>> 
>>> Hi All,
>>> 
>>> I'd like to start an Apache Mesos YouTube channel to post the recordings
>> of
>>> public meetings and would also like to post the meetings on a new
>> playlist
>>> here , so
>>> that community members in China can access the recordings without VPN.
>>> 
>>> Please let me know your thoughts.
>>> 
>>> Best,
>>> Judith
>>> --
>>> Judith Malnick
>>> DC/OS Community Manager
>>> 310-709-1517
>> 



Re: Choice between LOG(FATAL) and EXIT(EXIT_FAILURE)

2017-05-08 Thread James Peach

> On May 8, 2017, at 2:02 PM, Zhitao Li  wrote:
> 
> Hi Vinod,
> 
> I'm reviving this old conversation from last year.
> 
> We are feeling some operational pain again, mostly due to journald
> truncates the stderr of Mesos agent so we cannot see a full exit message
> even in journald, and the error is not in GLOG output at all.
> 
> I filed https://issues.apache.org/jira/browse/MESOS-7472 to track this. We
> can submit the patch as long as some committer can shepherd this.


https://reviews.apache.org/r/56681/


> 
> Thanks!
> 
> On Fri, Sep 16, 2016 at 2:17 PM, Vinod Kone  wrote:
> 
>> We typically used LOG(FATAL) when we were interested in the stack trace. If
>> not, we preferred to use EXIT(EXIT_FAILURE). While that was the original
>> intention, not sure if we have been following that distinction diligently.
>> 
>> Separately, we should fix EXIT to log at ERROR level instead of just
>> printing to stderr.
>> 
>> On Tue, Aug 30, 2016 at 10:42 AM, Zhitao Li  wrote:
>> 
>>> Hi,
>>> 
>>> Can someone explain better about when we should use LOG(FATAL) and when
>>> EXIT(EXIT_FAILURE) in Mesos codebase?
>>> 
>>> One thing is that EXIT(EXIT_FAILURE) does not seem to leave anything in
>> the
>>> level separated GLOG files so it could be mysterious to people who relies
>>> on that for debugging issues.
>>> 
>>> I see we have about 100 call sites to LOG(FATAL) and 200 call sites
>>> to EXIT(EXIT_FAILURE) at the moment.
>>> 
>>> Many thanks!
>>> 
>>> --
>>> Cheers,
>>> 
>>> Zhitao Li
>>> 
>> 
> 
> 
> 
> -- 
> Cheers,
> 
> Zhitao Li



Re: documenting test expectations

2017-05-08 Thread James Peach

> On May 1, 2017, at 4:28 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> 
> Do you have some examples?

I think that this:

    EXPECT_EQ(Bytes(512u), BasicBlocks(Bytes(128)).bytes())
        << "a partial block should round up";

is a strict superset of this:

    // A partial block should round up.
    EXPECT_EQ(Bytes(512u), BasicBlocks(Bytes(128)).bytes());

The former is preferable since the person triaging test failures gets the 
immediate context of what the expectation is doing. This is valuable even if 
you might also find you need to check the source.

> 
> Thinking through my own experience debugging tests, I tend to only get
> value out of EXPECT messages when they are providing information that I
> can't get access to from the line number / actual vs expected printing.
> (e.g. the value of a variable). If the EXPECT message is simply explaining
> what the test is doing, then I tend to ignore it and read the test instead,
> so it would be helpful to discuss some examples to get a better sense. :)
> 
> On Sat, Apr 29, 2017 at 10:02 AM, James Peach <jor...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> In a couple of reviews, I've been asked to avoid emitting explanatory
>> messages from the EXPECT() macro. The rationale for this is that tests
>> usually use comments. However, I think that emitting the reason for a
>> failed expectation into the test log is pretty helpful and we should do it
>> more often.
>> 
>> What do people think about explicitly allowing (or even encouraging) this?
>> ie. EXPECT(...) << "some explanation goes here"
>> 
>> J



documenting test expectations

2017-04-29 Thread James Peach
Hi all,

In a couple of reviews, I've been asked to avoid emitting explanatory messages 
from the EXPECT() macro. The rationale for this is that tests usually use 
comments. However, I think that emitting the reason for a failed expectation 
into the test log is pretty helpful and we should do it more often.

What do people think about explicitly allowing (or even encouraging) this? ie. 
EXPECT(...) << "some explanation goes here"

J

Re: RFC: constraining UPIDs in libprocess messages

2017-04-26 Thread James Peach

> On Apr 25, 2017, at 5:45 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> 
> Thanks, this sounds good, just want to clarify a few things:
> 
> (1) When you say "bind" the UPID IP address to the address of the message
> sender, you specifically just mean disallowing messages where they do not
> match?

Correct.

> 
> (2) Why IP and not port? Or do you think port enforcement would be
> optionally useful on top of IP enforcement?

The UPID contains the listener port, not the sender port, so they are 
guaranteed not to match.

> 
> (3) What will libprocess do when it finds the IP doesn't match? Drop and
> log?

It responds with an InternalServerError.

> 
> On Thu, Apr 20, 2017 at 9:16 AM, James Peach <jor...@gmail.com> wrote:
> 
>> 
>>> On Apr 19, 2017, at 5:24 PM, Benjamin Mahler <bmah...@apache.org> wrote:
>>> 
>>> It's not obvious to me what all of the implications are. I feel we need
>> a more comprehensive proposal here, can you do a small write up that shows
>> you've thought through all the implications?
>> 
>> Ok, let me summarize ...
>> 
>> In libprocess, actors (processes) are globally identified by their UPID.
>> The UPID is a string of the format id@ip:port. When libprocess
>> actors communicate, they listen on a single IP address and port and claim
>> this endpoint in their UPID. The expectation of other actors is that they
>> can send messages to this actor by connecting to the endpoint claimed in
>> the UPID. Note that the peer address used to send messages is not the UPID,
>> but whichever source address:port that the sender happened to bind.
>> 
>> Due to various network configurations (eg. NAT, multihoming), situations
>> arise where a receiver is not able to easily predict which IP address it
>> needs to claim in its UPID such that senders will be able to connect to it.
>> This leads to the need for configuration options like LIBPROCESS_IP
>> (specify the address to listen on) and LIBPROCESS_ADVERTISE_IP (specify the
>> address in the UPID).
>> 
>> The fundamental problem this proposal attempts to mitigate is that
>> libprocess has no mechanism that can authenticate a UPID. This means that
>> unless the network fabric is completely trusted, anyone with access to the
>> network can inject arbitrary messages into any libprocess actor,
>> impersonating any other actor.
>> 
>> This proposal adds an optional libprocess setting to bind the IP address
>> claimed in the UPID to the IP address of the message sender. This is a
>> receiver option, where the receiver compares the result of getpeername(2)
>> with the UPID from the libprocess message. The IP address is required to
>> match, but the port and ID are allowed to differ. In practice, this means
>> that libprocess actors are required to send from the same IP address they
>> are listening on. So to impersonate a specific UPID you have to be able to
>> send the message from the same IP address the impersonatee is using (ie. be
>> running on the same host), which increases the difficulty of impersonation.
>> 
>> In our deployments, all the cluster members are single-homed and remote
>> access to the system is restricted. We know in advance that neither
>> LIBPROCESS_IP nor LIBPROCESS_ADVERTISE_IP options are required. We have
>> some confidence that we control the services running on cluster hosts. If
>> we bind the UPID address to the socket peer address, then UPID
>> impersonation requires that malicious code is already running on a cluster
>> host, at which point we probably have bigger problems.
>> 
>> In terms of testing, the whole test suite passes except for 
>> ExamplesTest.DiskFullFramework.
>> This test uses the LIBPROCESS_IP option to bind to the lo0 interface, which
>> can result in the UPID claiming 127.0.0.1 but getpeername(2) returning one
>> of the hosts external IP addresses. This is really just one type of a
>> multi-homed host configuration, so it demonstrates that this configuration
>> will break.
>> 
>>> 
>>> E.g. how does this affect the use of proxies, multihomed hosts, our
>> testing facilities for message spoofing, etc.
>>> 
>>> On Tue, Apr 18, 2017 at 2:07 PM, James Peach <jor...@gmail.com> wrote:
>>> 
>>>> On Apr 7, 2017, at 8:47 AM, Jie Yu <yujie@gmail.com> wrote:
>>>> 
>>>> + BenM
>>>> 
>>>> James, I don't have immediate context on this issue. BenM will be back
>> next
>>>> week and he should be able to give you more accurate feedback.
>>> 
>>> 

Re: RFC: constraining UPIDs in libprocess messages

2017-04-20 Thread James Peach

> On Apr 19, 2017, at 5:24 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> 
> It's not obvious to me what all of the implications are. I feel we need a 
> more comprehensive proposal here, can you do a small write up that shows 
> you've thought through all the implications?

Ok, let me summarize ...

In libprocess, actors (processes) are globally identified by their UPID. The 
UPID is a string of the format id@ip:port. When libprocess actors 
communicate, they listen on a single IP address and port and claim this 
endpoint in their UPID. The expectation of other actors is that they can send 
messages to this actor by connecting to the endpoint claimed in the UPID. Note 
that the peer address used to send messages is not the UPID, but whichever 
source address:port that the sender happened to bind.

Due to various network configurations (eg. NAT, multihoming), situations arise 
where a receiver is not able to easily predict which IP address it needs to 
claim in its UPID such that senders will be able to connect to it. This leads 
to the need for configuration options like LIBPROCESS_IP (specify the address 
to listen on) and LIBPROCESS_ADVERTISE_IP (specify the address in the UPID). 

The fundamental problem this proposal attempts to mitigate is that libprocess 
has no mechanism that can authenticate a UPID. This means that unless the 
network fabric is completely trusted, anyone with access to the network can 
inject arbitrary messages into any libprocess actor, impersonating any other 
actor.

This proposal adds an optional libprocess setting to bind the IP address 
claimed in the UPID to the IP address of the message sender. This is a 
receiver-side option: the receiver compares the result of getpeername(2) with 
the UPID from the libprocess message. The IP address is required to match, but 
the port and ID are allowed to differ. In practice, this means that libprocess 
actors are required to send from the same IP address they are listening on. So 
to impersonate a specific UPID you have to be able to send the message from the 
same IP address the impersonatee is using (i.e., be running on the same host), 
which increases the difficulty of impersonation.

In our deployments, all the cluster members are single-homed and remote access 
to the system is restricted. We know in advance that neither LIBPROCESS_IP nor 
LIBPROCESS_ADVERTISE_IP options are required. We have some confidence that we 
control the services running on cluster hosts. If we bind the UPID address to 
the socket peer address, then UPID impersonation requires that malicious code 
is already running on a cluster host, at which point we probably have bigger 
problems.

In terms of testing, the entire test suite passes except for 
ExamplesTest.DiskFullFramework. This test uses the LIBPROCESS_IP option to bind 
to the lo0 interface, which can result in the UPID claiming 127.0.0.1 but 
getpeername(2) returning one of the host's external IP addresses. This is really 
just one kind of multi-homed host configuration, so it demonstrates that such 
configurations will break.

> 
> E.g. how does this affect the use of proxies, multihomed hosts, our testing 
> facilities for message spoofing, etc.
> 
> On Tue, Apr 18, 2017 at 2:07 PM, James Peach <jor...@gmail.com> wrote:
> 
> > On Apr 7, 2017, at 8:47 AM, Jie Yu <yujie@gmail.com> wrote:
> >
> > + BenM
> >
> > James, I don't have immediate context on this issue. BenM will be back next
> > week and he should be able to give you more accurate feedback.
> 
> 
> I updated the review chain (from https://reviews.apache.org/r/58517/). Is 
> anyone able to shepherd this?
> 
> 
> >
> > - Jie
> >
> > On Fri, Apr 7, 2017 at 8:37 AM, James Peach <jor...@gmail.com> wrote:
> >
> >>
> >>> On Apr 5, 2017, at 5:42 PM, Jie Yu <yujie@gmail.com> wrote:
> >>>
> >>> One comment here is that:
> >>>
> >>> We plan to support libprocess communication using domain socket. In other
> >>> words, we plan to make UPID a socket addr. Can we make sure this approach
> >>> also works for the case where UPID is a unix address in the future? For
> >>> instance, what will `socket->peer();` returns for domain socket?
> 
> This would probably work, but it depends on the libprocess implementation. 
> Since this is a (default false) option, I think it is OK for now. I'll be happy 
> to revisit when libprocess messaging supports domain sockets.
> 
> >>
> >> I can look into that.
> >>
> >> So you would consider this approach a reasonable mitigation?
> >>
> >>>
> >>> - Jie
> >>>
> >>> On Wed, Apr 5, 2017 at 3:27 PM, James Peach <jor...@gmail.com> wrote:
> >>>
> >

Re: RFC: constraining UPIDs in libprocess messages

2017-04-18 Thread James Peach

> On Apr 7, 2017, at 8:47 AM, Jie Yu <yujie@gmail.com> wrote:
> 
> + BenM
> 
> James, I don't have immediate context on this issue. BenM will be back next
> week and he should be able to give you more accurate feedback.


I updated the review chain (from https://reviews.apache.org/r/58517/). Is 
anyone able to shepherd this?


> 
> - Jie
> 
> On Fri, Apr 7, 2017 at 8:37 AM, James Peach <jor...@gmail.com> wrote:
> 
>> 
>>> On Apr 5, 2017, at 5:42 PM, Jie Yu <yujie@gmail.com> wrote:
>>> 
>>> One comment here is that:
>>> 
>>> We plan to support libprocess communication using domain socket. In other
>>> words, we plan to make UPID a socket addr. Can we make sure this approach
>>> also works for the case where UPID is a unix address in the future? For
>>> instance, what will `socket->peer();` returns for domain socket?

This would probably work, but it depends on the libprocess implementation. 
Since this is a (default false) option, I think it is OK for now. I'll be happy 
to revisit when libprocess messaging supports domain sockets.

>> 
>> I can look into that.
>> 
>> So you would consider this approach a reasonable mitigation?
>> 
>>> 
>>> - Jie
>>> 
>>> On Wed, Apr 5, 2017 at 3:27 PM, James Peach <jor...@gmail.com> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> Currently, libprocess messages contain a UPID, which is sent by the peer
>>>> in the HTTP message header. There's no validation of this, so generally
>>>> messages are trusted to be from the UPID they claim to be.
>>>> 
>>>> As an RFC, I've pushed https://reviews.apache.org/r/58224/. This patch
>>>> constrains the UPID to not change for the lifetime of the socket

I dropped this constraint. The DiskQuotaTest.SlaveRecovery test depends on the 
UPID being able to change. More accurately, libprocess only matches on the 
address portion of the UPID when finding a socket to use, and for now I don't 
think this change is beneficial enough to break that assumption.

>>>> , and
>> also
>>>> enforces that the IP address portion of the UPID matches the peer
>>>> socket address. This makes UPIDs more reliable, but the latter check
>> would
>>>> break existing configurations. I'd appreciate any feedback on whether
>> this
>>>> is worth pursuing at the libprocess level and whether people feel that
>>>> this specific mitigation is worthwhile.
>>>> 
>>>> thanks,
>>>> James
>> 
>> 



Re: RFC: constraining UPIDs in libprocess messages

2017-04-07 Thread James Peach

> On Apr 5, 2017, at 5:42 PM, Jie Yu <yujie@gmail.com> wrote:
> 
> One comment here is that:
> 
> We plan to support libprocess communication using domain socket. In other
> words, we plan to make UPID a socket addr. Can we make sure this approach
> also works for the case where UPID is a unix address in the future? For
> instance, what will `socket->peer();` returns for domain socket?

I can look into that.

So you would consider this approach a reasonable mitigation?

> 
> - Jie
> 
> On Wed, Apr 5, 2017 at 3:27 PM, James Peach <jor...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> Currently, libprocess messages contain a UPID, which is sent by the peer
>> in the HTTP message header. There's no validation of this, so generally
>> messages are trusted to be from the UPID they claim to be.
>> 
>> As an RFC, I've pushed https://reviews.apache.org/r/58224/. This patch
>> constrains the UPID to not change for the lifetime of the socket, and also
>> enforces that the IP address portion of the UPID matches the peer
>> socket address. This makes UPIDs more reliable, but the latter check would
>> break existing configurations. I'd appreciate any feedback on whether this
>> is worth pursuing at the libprocess level and whether people feel that
>> this specific mitigation is worthwhile.
>> 
>> thanks,
>> James



RFC: constraining UPIDs in libprocess messages

2017-04-05 Thread James Peach
Hi all,

Currently, libprocess messages contain a UPID, which is sent by the peer in the 
HTTP message header. There's no validation of this, so generally messages are 
trusted to be from the UPID they claim to be.

As an RFC, I've pushed https://reviews.apache.org/r/58224/. This patch 
constrains the UPID to not change for the lifetime of the socket, and also 
enforces that the IP address portion of the UPID matches the peer socket 
address. This makes UPIDs more reliable, but the latter check would break 
existing configurations. I'd appreciate any feedback on whether this is worth 
pursuing at the libprocess level and whether people feel that this specific 
mitigation is worthwhile.

thanks,
James

Re: protbuf to json not compatible

2017-03-24 Thread James Peach

> On Mar 24, 2017, at 12:49 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> 
> James, I'm curious, do you know specifically what the incompatibility is?

https://developers.google.com/protocol-buffers/docs/proto3#json

"Message field names are mapped to lowerCamelCase and become JSON object keys. "

So field names like "failover_timeout" are renamed to "failoverTimeout".

> 
> Olivier, if you're dealing with protobuf already and trying to send it to
> mesos, there's no need to use JSON. Unless you have a requirement to do so?
> There are some outstanding issues with our JSON<->Protobuf conversion,
> specifically we currently are inconsistent from proto3 when it comes to the
> int(32|64), fixed(32|64), uint(32|64) handling, for one (we don't allow
> strings on the input side (tomek is addressing that), and we don't use
> strings on the output side).
> 
> On Fri, Mar 24, 2017 at 12:44 AM, Olivier Sallou <olivier.sal...@irisa.fr>
> wrote:
> 
>> 
>> 
>> On 03/24/2017 04:02 AM, James Peach wrote:
>>>> On Mar 23, 2017, at 7:58 PM, James Peach <jor...@gmail.com> wrote:
>>>> 
>>>>> On Mar 23, 2017, at 1:54 AM, Olivier Sallou <olivier.sal...@irisa.fr>
>> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> when transforming a protobuf message to json with MessageToJson, the
>>>>> json is not compatible with the json format expected by Mesos master.
>>>> This is because you generated the protobuf bindings with proto3
>> compiler. AFAICT they made an incompatible change to the JSON wire format.
>> This bites you when using the jsonpb Go package, for example. I ended up
>> post-processing the generated Go code to correct the field names.
>>> Sorry I forgot to mention that the other workaround is to generate the
>> protobuf bindings with the proto2 compiler.
>> Thanks
>> My first workaround is to generate json directly, not a big deal in my
>> case, but I wanted to understand.
>> 
>> Olivier
>>> 
>>> J
>> 
>> --
>> Olivier Sallou
>> IRISA / University of Rennes 1
>> Campus de Beaulieu, 35000 RENNES - FRANCE
>> Tel: 02.99.84.71.95
>> 
>> gpg key id: 4096R/326D8438  (keyring.debian.org)
>> Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438
>> 
>> 



Re: protbuf to json not compatible

2017-03-23 Thread James Peach

> On Mar 23, 2017, at 7:58 PM, James Peach <jor...@gmail.com> wrote:
> 
>> 
>> On Mar 23, 2017, at 1:54 AM, Olivier Sallou <olivier.sal...@irisa.fr> wrote:
>> 
>> Hi,
>> 
>> when transforming a protobuf message to json with MessageToJson, the
>> json is not compatible with the json format expected by Mesos master.
> 
> This is because you generated the protobuf bindings with proto3 compiler. 
> AFAICT they made an incompatible change to the JSON wire format. This bites 
> you when using the jsonpb Go package, for example. I ended up post-processing 
> the generated Go code to correct the field names.

Sorry I forgot to mention that the other workaround is to generate the protobuf 
bindings with the proto2 compiler.

J

Re: protbuf to json not compatible

2017-03-23 Thread James Peach

> On Mar 23, 2017, at 1:54 AM, Olivier Sallou  wrote:
> 
> Hi,
> 
> when transforming a protobuf message to json with MessageToJson, the
> json is not compatible with the json format expected by Mesos master.

This is because you generated the protobuf bindings with proto3 compiler. 
AFAICT they made an incompatible change to the JSON wire format. This bites you 
when using the jsonpb Go package, for example. I ended up post-processing the 
generated Go code to correct the field names.

> 
> For example, for volumes it generates
> 
> 
> volumes: [
> 
>{'hostPath': '',
> 
>  'containerPath': '...',
> 
> ...
> 
>   }
> 
> ]
> 
> 
> but HTTP API expects "source" and "container_path"
> 
> is it an expected behavior ? This prevents from "creating" a task in
> protobuf format and sending it to HTTP API with a protobug to json
> conversion.

It’s expected from a proto3 compiler. IMHO this is a breaking change and they 
should have provided a fallback option, but there isn’t one. There are no good 
choices here AFAICT.

J

Re: One question about src/Makefile.am

2017-03-13 Thread James Peach

> On Mar 13, 2017, at 9:20 AM, Yu Wei  wrote:
> 
> Hi guys,
> 
> 
> In src/Makefile.am, there is two lines as below,
> 
> MESOS_CPPFLAGS += -I$(top_srcdir)/include
> MESOS_CPPFLAGS += -I../include

This one refers to the directory you are building in, which might not be the 
same as $(top_srcdir).

> 
> 
> In current code structure, it seems they references the same directory.
> 
> 
> Is this redundant code? Or any story about this?
> 
> 
> Thanks,
> 
> Jared, (韦煜)
> Software developer
> Interested in open source software, big data, Linux



Re: thread_local supported on Apple

2016-12-19 Thread James Peach

> On Dec 19, 2016, at 3:00 PM, Joris Van Remoortere  wrote:
> 
> Thanks for your input Zameer.
> 
> Is it common for developers on mac to use XCode as their compilation
> environment as well? I would think if you used clang on the command line
> then you could still install an updated version of clang without having to
> do a system upgrade from Yosemite?

Xcode provides both the integrated development environment (typically not used 
with Mesos) and one or more toolchains and SDKs (used by Mesos). Whether the 
modern toolchain can be used on an older macOS release depends on whether that 
Xcode release is considered supported on that macOS release.

> 
> I'm getting the impression that it's reasonable to make this change without
> a deprecation cycle. Please let me know if you (anyone) disagrees.
> 
> —
> *Joris Van Remoortere*
> Mesosphere
> 
> On Mon, Dec 19, 2016 at 2:17 PM, Zameer Manji  wrote:
> 
>> I believe this thread_local support is in XCode 8.2. From the link you
>> shared:
>> 
>>> Xcode 8.2 requires a Mac running macOS 10.11.5 or later
>> 
>> This means that users can upgrade the compiler on El Capitan just fine
>> without a system upgrade.
>> 
>> Users on Yosemite need to do a system upgrade to pick up the new compiler
>> however.
>> 
>> On Mon, Dec 19, 2016 at 12:33 PM, Joris Van Remoortere <
>> jo...@mesosphere.io>
>> wrote:
>> 
>>> Is my understanding incorrect regarding the ability to upgrade the
>> compiler
>>> version on Yosemite and El Capitan without requiring a full system
>> upgrade?
>>> 
>>> @Mpark are you making a case for not updating to `thread_local` just yet?
>>> 
>>> —
>>> *Joris Van Remoortere*
>>> Mesosphere
>>> 
>>> On Fri, Dec 16, 2016 at 11:11 AM, Michael Park  wrote:
>>> 
 Brief survey from the #dev channel: https://mesos.slack.com/
 archives/dev/p1481760285000430
 
 Yosemite 10.10: Fail. Compilation error. (by @hausdorff
 https://mesos.slack.com/archives/dev/p1481760552000435)
 El Capitan 10.11: Fail. Compilation error. (by @zhitao
 https://mesos.slack.com/files/zhitao/F3F7WUCNM/-.diff)
 Sierra 10.12: Success (by @mpark)
 
 On Wed, Dec 14, 2016 at 3:27 PM, Joris Van Remoortere <
>>> jo...@mesosphere.io
> 
 wrote:
 
> The time has come and we finally have `thread_local` support in the
>>> Apple
> tool chain:
> https://developer.apple.com/library/content/releasenotes/Dev
> eloperTools/RN-Xcode/Introduction.html
> 
> In our code base we have a special exception for Apple that defines
>> our
> thread local to be `__thread` rather than the c++11 standard
> `thread_local`.
> https://github.com/apache/mesos/blob/812e5e3d4e4d9e044a1cfe6
> cc7eaab10efb499b6/3rdparty/stout/include/stout/thread_local.hpp
> 
> A consequence of using `__thread` on Apple is that initializers for
 thread
> locals are required to be constant expressions. This is not the case
>>> for
> the c++11 standard `thread_local`.
> 
> I would like to propose that we remove this exception on the Apple
 platform
> now that the Apple toolchain supports the c++11 standard.
> 
> As I am not a common user of the Apple development experience I would
 like
> to ask for some input from the community as to whether requiring this
> toolchain update is acceptable, and if we need a deprecation period
>> or
>>> if
> we can just make this change now.
> 
> I am leaning towards no deprecation period as I am not aware of
 production
> environments running on systems that define `__APPLE__`.
> —
> *Joris Van Remoortere*
> Mesosphere
> 
 
>>> 
>>> --
>>> Zameer Manji
>>> 
>> 



Re: Structured logging for Mesos (or c++ glog)

2016-12-19 Thread James Peach

> On Dec 19, 2016, at 9:43 AM, Zhitao Li  wrote:
> 
> Hi,
> 
> I'm looking at how to better utilize ElasticSearch to perform log analysis 
> for logs from Mesos. It seems like ElasticSearch would generally work better 
> for structured logging, but Mesos still uses glog thus all logs produced are 
> old-school unstructured lines.
> 
> I wonder whether anyone has brought the conversation of making Mesos logs 
> easier to process, or if anyone has experience to share.

Are you trying to stitch together sequences of events? In that case, would 
direct event logging be more useful?

J

Re: Building on OS X 10.12

2016-12-05 Thread James Peach

> On Dec 2, 2016, at 10:54 PM, Jie Yu  wrote:
> 
> Another tip. If you are on macOS sierra, you might notice the linking is
> extremely slow using the default clang.
> 
> Using CXXFLAGS `-fvisibility-inlines-hidden` will greatly speedup the
> linking.

Is there a reason we should not always do this? It reduces the number of 
exported symbols in libmesos.so from 250K to 100K.

J 

Re: MESOS-6233 Allow agents to re-register post a host reboot

2016-11-29 Thread James Peach

> On Nov 28, 2016, at 6:09 PM, Yan Xu  wrote:
> 
> So one thing that was brought up during offline conversations was that if the 
> host reboot is associated with hardware change (e.g., a new memory stick):
> 
>   • Currently: the agent would skip the recovery (and the chance of 
> running into incompatible agent info) and register as a new agent.
>   • With the change: the agent could run into incompatible agent info due 
> to resource change and flap indefinitely until the operator intervenes.
> 
> To mitigate this and maintain the current behavior, we can have the agent run 
> `rm -f /meta/slaves/latest` automatically upon recovery failure, but only 
> after the host has rebooted. This way the agent can restart as a new agent 
> without operator intervention. 
> 
> Any thoughts?

I still think you need a mechanism for the master/agent to tell you whether it 
will honor the restart policy. Without this, you have to lock the framework to 
a Mesos version.

An empty RestartPolicy is also problematic since it precludes using 
RestartPolicy in pods. If you later want to restart a task inside a pod but not 
across agent restarts you would have no way to express that.

J

Re: Attendance for Mesos Developer Community Meeting (Nov 17)

2016-11-16 Thread James Peach

> On Nov 16, 2016, at 3:06 PM, Michael Park  wrote:
> 
> If you're planning to attend this meeting, please reply to this before Nov
> 17 8am PST. If there are less than 5 people planning to attend (including
> me), we'll skip it.

+1

> 
> On Wed, Nov 16, 2016 at 11:02 AM, Haripriya Ayyalasomayajula <
> aharipriy...@gmail.com> wrote:
> 
>> +1.
>> 
>> On Wed, Nov 16, 2016 at 10:58 AM, Michael Park  wrote:
>> 
>>> Many people will be in China for MesosCon, so I'd like to get a quick
>> count
>>> for how many people are planning to join the developer community meeting
>>> tomorrow.
>>> 
>>> Please reply with a +1 if you're planning to attend.
>>> 
>> 
>> 
>> 
>> --
>> Regards,
>> Haripriya Ayyalasomayajula
>> 



Re: Mesos Documentation Project

2016-11-09 Thread James Peach

> On Nov 9, 2016, at 4:29 PM, James Neiman  wrote:
> 
> Dear Mesos Users, Operators, Developers, and Contributors:
> 
> My name is James Neiman. I have been working with Benjamin Hindman, Artem
> Harutyunyan, Neil Conway, and Joseph Wu on improving the Mesos
> documentation. We now have a proposal for the community to critique.
> 
> Our goal is to satisfy the needs of Operators, Developers, and Contributors
> by:
> 
>   - Revising, restructuring, and expanding existing topics.
>   - Authoring new topics, such as *Quick Start* and *What is Mesos?*.
>   - Reorganizing the table of contents.
>   - Providing role-specific views of the table of contents.
> 
>   *Please note that versioning of the documentation will be addressed in a
> separate project.*
> 
> This will be an iterative process. Your feedback and contributions are very
> important to making this project a success!
> 
> I will follow up very soon with a request for your comments on proposed
> changes. I look forward to your feedback.

Is the proposal github PR that Joseph linked, or is there more? Is there a 
rendered version of the PR available anywhere?

Are the “intallation-$platform.md” files named intentionally, or is that a 
typo? A lot of the headings on these pages have a trailing ‘]’. Is that markup 
or a typo?

Did you consider switching to a more full-featured docs toolchain (I have had 
good experiences with sphinx) that can generate man pages, indices, TOC, 
cross-references, search, etc?

thanks,
James

Re: Slack as the canonical chat channel

2016-06-24 Thread James Peach
Are there public archives of the slack channel?

> On Jun 24, 2016, at 10:31 AM, Vinod Kone  wrote:
> 
> The plan is to open them up to all. Currently, slack has a limitation that
> either users have to be invited individually (which doesn't scale) or they
> need to belong to manually created white-list of corporate domains (which
> doesn't scale either).
> 
> I'm looking into using 3rd party tools like
> https://github.com/rauchg/slackin to let anyone signup.
> 
> On Fri, Jun 24, 2016 at 9:28 AM, Vaibhav Khanduja > wrote:
> 
>> We should at least open this to contributors or those who have access to
>> assign issues to them ...
>> 
>> 
>> 
>>> On 6/23/16, 10:50 PM, "Vinod Kone"  wrote:
>>> 
>>> Opened it up for few more domain names (ibm, apple etc). If your domain is
>>> listed at https://mesos.slack.com/signup please feel free to join.
>>> 
 On Thu, Jun 23, 2016 at 7:07 PM, tommy xiao  wrote:
 
 because the mesos repo is not hosted on github.  gitter.im is not best
 option on team. slack is popular than gitter.im. so i suggest based on
 slack.
 
 2016-06-24 8:11 GMT+08:00 Jay JN Guo :
 
> Great, thanks for your effort! We'd love to see it's opening up soon!
> 
> /J
> 
> Vinod Kone  wrote on 06/24/2016 00:36:44:
> 
>> From: Vinod Kone 
>> To: dev 
>> Cc: Benjamin Hindman , Jake Farrell
> 
>> Date: 06/24/2016 00:37
>> Subject: Re: Slack as the canonical chat channel
>> 
>> Looks like there is an overwhelming majority for *slack*! So, I went
> ahead
>> and created a slack team https://mesos.slack.com.
>> 
>> For now, you can signup if you have a "*apache.org <
>> http://apache.org
> *"
>> email address (https://mesos.slack.com/signup). I'll start slowly
> opening
>> it up for more people as we get our feet wet and iron out any
>> kinks. So
>> everyone should still keep using #mesos IRC channel.
>> 
>> Thanks,
>> Vinod
>> 
>> On Tue, Jun 21, 2016 at 9:42 AM, José Guilherme Vanz <
>> guilherme@gmail.com> wrote:
>> 
>>> Yeah, sound as a good option.
>>> 
>>> On Mon, 20 Jun 2016 at 20:04 Cosmin Lehene 
 wrote:
>>> 
 Looks like there's a majority of +1 for Slack, so this this may
>> be
 pointless, however :), have you considered gitter.im (
 https://gitter.im/home/explore/)?
 
 
 It has similar capabilities to Slack, but it's (unlimited) free
>> for
> open
 source projects and seamlessly works over Github organizations
>> and
> repos
 with several major open source projects using it.
 
 
 Cheers,
 
 Cosmin
 
 
 From: haosdent 
 Sent: Friday, June 17, 2016 6:48:02 PM
 To: dev
 Cc: Benjamin Hindman; Jake Farrell
 Subject: Re: Slack as the canonical chat channel
 
 +1 For Slack
 On Jun 18, 2016 4:04 AM, "Vinod Kone" 
 wrote:
 
> Looks like people have jumped the gun here before I sent the
> email :)
> 
> Here is the context. During the community sync we discussed
>> about
> using
> *Slack* or *HipChat* as our official chat channel instead of
>> our
>>> current
> #mesos IRC channel on freenode.
> 
> The main reasons for using Slack/Hipchat are
> 
>   - In-client chat history
>   - Discoverability of work group specific channels
>   - Email notifications when offline
>   - Modern UX and clients
> 
> During the sync most people preferred the move to *Slack*. I
 wanted
> to
 get
> a sense from other community members as well through this
>> email.
> Please
 let
> us know what you think.
> 
> Note that even if we move to Slack, we will make sure people
>> can
> still
> connect using IRC clients and that the chat history is
>> publicly
>>> available
> (per ASF guidelines). During the transition period, we might
 mirror
> messages from Slack channel to IRC and vice-versa.
> 
> Thoughts?
> 
> On Fri, Jun 17, 2016 at 8:52 AM, Vinit Mahedia
>  wrote:
> 
>> +1 Slack.
>> 
>> On Fri, Jun 17, 2016 at 12:59 AM, Jay JN Guo
> 
>> wrote:
>> 
>>> +1 Slack!
>>> 
>>> /J
>>> 
>>> Vaibhav Khanduja  wrote on
> 06/16/2016

Re: [Proposal] Remove the default value for agent work_dir

2016-04-12 Thread James Peach

> On Apr 12, 2016, at 3:58 PM, Greg Mann  wrote:
> 
> Hey folks!
> A number of situations have arisen in which the default value of the Mesos 
> agent `--work_dir` flag (/tmp/mesos) has caused problems on systems in which 
> the automatic cleanup of '/tmp' deletes agent metadata. To resolve this, we 
> would like to eliminate the default value of the agent `--work_dir` flag. You 
> can find the relevant JIRA here.
> 
> We considered simply changing the default value to a more appropriate 
> location, but decided against this because the expected filesystem structure 
> varies from platform to platform, and because it isn't guaranteed that the 
> Mesos agent would have access to the default path on a particular platform.
> 
> Eliminating the default `--work_dir` value means that the agent would exit 
> immediately if the flag is not provided, whereas currently it launches 
> successfully in this case. This will break existing infrastructure which 
> relies on launching the Mesos agent without specifying the work directory. I 
> believe this is an acceptable change because '/tmp/mesos' is not a suitable 
> location for the agent work directory except for short-term local testing, 
> and any production scenario that is currently using this location should be 
> altered immediately.

+1 from me too. Defaulting to /tmp just helps people shoot themselves in the 
foot.

J

Re: [02/11] mesos git commit: Added support for contender and detector modules.

2016-04-07 Thread James Peach

> On Apr 6, 2016, at 3:48 PM, ka...@apache.org wrote:
> 
> Added support for contender and detector modules.
> 
> 
> http://git-wip-us.apache.org/repos/asf/mesos/blob/cbbc8f0b/src/tests/module.cpp
> --
> diff --git a/src/tests/module.cpp b/src/tests/module.cpp
> index 8cc305c..4b24048 100644
> --- a/src/tests/module.cpp
> +++ b/src/tests/module.cpp
> @@ -222,6 +222,52 @@ static void addHttpAuthenticatorModules(Modules* modules)
> }
> 
> 
> +static void addMasterContenderModules(Modules* modules)
> +{
> +  CHECK_NOTNULL(modules);
> +
> +  const string libraryPath = path::join(
> +  tests::flags.build_dir,
> +  "src",
> +  ".libs",
> +  os::libraries::expandName("testmastercontender"));

mesos::internal::tests::getModulePath("testmastercontender");



Re: Compile with CFLAGS=-DWITH_NETWORK_ISOLATOR

2016-03-22 Thread James Peach

> On Mar 22, 2016, at 6:21 AM, Jay Guo  wrote:
> 
> Hi,
> 
> I got error trying to compile Mesos
> on Ubuntu
> with CFLAG WITH_NETWORK_ISOLATOR
> 
> Here's what I did:
> 1. apt-get install libnl-dev
> 2. ./bootstrap
> 3. mkdir build && cd build
> 4. CXXFLAGS=-DWITH_NETWORK_ISOLATOR ../configure --disable-java
> --disable-python

You should do:

../configure --disable-java --disable-python --with-network-isolator

This will check for the dependencies correctly and enable the right build 
components.

> 5. make check
> 
> Although I got following error:
> 
> In file included from ../../src/linux/routing/filter/ip.hpp:35:0,
> from
> ../../src/slave/containerizer/mesos/isolators/network/port_mapping.hpp:44,
> from
> ../../src/slave/containerizer/mesos/containerizer.cpp:82:
> ../../src/linux/routing/handle.hpp:92:39: error: ‘TC_H_ROOT’ was not
> declared in this scope
> constexpr Handle EGRESS_ROOT = Handle(TC_H_ROOT);
>   ^
> ../../src/linux/routing/handle.hpp:93:40: error: ‘TC_H_INGRESS’ was not
> declared in this scope
> constexpr Handle INGRESS_ROOT = Handle(TC_H_INGRESS);
> 
> Any ideas?
> 
> Also, does this work with OSX? Is there any equivalent library as libnl?
> 
> Cheers,
> /J



Re: Upgrade to clang-format-3.8

2016-03-21 Thread James Peach

> On Mar 18, 2016, at 10:22 AM, Michael Park <mp...@apache.org> wrote:
> 
> Hi James,
> 
> Someone would need to propose to auto-format everything with clang-format
> and convince the community
> that while clang-format will never be perfect, it generates a systematic,
> sane and just as readable codebase.
> If we can reach consensus on that, we could add it as a commit hook and no
> one will worry about formatting.
> 
> I dreamed of the world described above, but at least at the time
> clang-format was still learning about some
> of the C++11 constructs (especially around lambdas), so it seemed a bit too
> far fetched.
> 
> My goal with clang-format currently is for people to have it integrated in
> their editors to help them get 80, 90%
> of the way there with single key-stroke, then follow-up with necessary
> minor edits.

Yeah, the current config did help me get a start on the correct style. 
Unfortunately you can only run it once, otherwise it eats it again. To deal with 
whitespace, IMHO the best approach is to automatically run git-stripspace and 
do without the double blanks :)

> We've been converging from
> both sides to get that percentage higher (new features in clang-format +
> modifying our style), and we'll continue
> to make such efforts where it makes sense.
> 
> MPark
> 
> On 18 March 2016 at 12:45, James Peach <jor...@gmail.com> wrote:
> 
>> 
>>> On Mar 17, 2016, at 10:41 AM, Yong Tang <yong.tang.git...@outlook.com>
>> wrote:
>>> 
>>> Hi All
>>> 
>>> 
>>> This email is to announce that the default configuration and the
>> recommended version of the clang-format is being upgraded to 3.8 (from 3.5)
>> in mesos.
>>> 
>>> 
>>> In clang-format-3.8, the newly introduced option "AlignAfterOpenBracket:
>> AlwaysBreak" closes the largest gap between ClangFormat and the style guide
>> in mesos. It avoids  "jaggedness" in function calls and is worth migrating
>> for.
>>> 
>>> 
>>> Along with the changes in clang-format configuration
>> (support/clang-format), the documentation (docs/clang-format.md) is also
>> going to be updated to reflect changes in version and the recommended
>> installation process.
>>> 
>>> 
>>> More details about this upgrade could be found in MESOS-4906 (
>> https://issues.apache.org/jira/browse/MESOS-4906). By the way, thanks
>> Michael for the help on this issue.
>> 
>> This sounds really promising. Is the plan to auto-format everything with
>> clang-format?



Re: Upgrade to clang-format-3.8

2016-03-19 Thread James Peach

> On Mar 17, 2016, at 10:41 AM, Yong Tang  wrote:
> 
> Hi All
> 
> 
> This email is to announce that the default configuration and the recommended 
> version of the clang-format is being upgraded to 3.8 (from 3.5) in mesos.
> 
> 
> In clang-format-3.8, the newly introduced option "AlignAfterOpenBracket: 
> AlwaysBreak" closes the largest gap between ClangFormat and the style guide 
> in mesos. It avoids  "jaggedness" in function calls and is worth migrating 
> for.
> 
> 
> Along with the changes in clang-format configuration (support/clang-format), 
> the documentation (docs/clang-format.md) is also going to be updated to 
> reflect changes in version and the recommended installation process.
> 
> 
> More details about this upgrade could be found in MESOS-4906 
> (https://issues.apache.org/jira/browse/MESOS-4906). By the way, thanks 
> Michael for the help on this issue.

This sounds really promising. Is the plan to auto-format everything with 
clang-format?

Re: [3/3] mesos git commit: New python lib with only the executor driver.

2016-03-14 Thread James Peach
FWIW I bet it would be helpful to have a general-purpose minimal executor 
library.

> On Mar 11, 2016, at 6:55 PM, Benjamin Mahler  wrote:
> 
> +vinod
> 
> This breaks the build for me on OS X, it appears this line is the culprit:
> 
> EXTRA_LINK_ARGS = ['-Wl,--as-needed']
> 
> This leads to the following:
> 
> clang++ -bundle -undefined dynamic_lookup -arch x86_64 -arch i386 -Wl,-F. 
> -L/usr/local/opt/subversion/lib -O2 -O2 -Wno-unused-local-typedef -std=c++11 
> -stdlib=libc++ -DGTEST_USE_OWN_TR1_TUPLE=1 -DGTEST_LANG_CXX11 
> -Qunused-arguments -I/usr/local/opt/subversion/include/subversion-1 
> -I/usr/include/apr-1 -I/usr/include/apr-1.0 -Qunused-arguments 
> build/temp.macosx-10.11-intel-2.7/src/mesos/executor/mesos_executor_driver_impl.o
>  build/temp.macosx-10.11-intel-2.7/src/mesos/executor/module.o 
> build/temp.macosx-10.11-intel-2.7/src/mesos/executor/proxy_executor.o 
> /Users/bmahler/git/mesos/build/src/.libs/libmesos_no_3rdparty.a 
> /Users/bmahler/git/mesos/build/3rdparty/libprocess/.libs/libprocess.a 
> /Users/bmahler/git/mesos/build/3rdparty/leveldb-1.4/libleveldb.a 
> /Users/bmahler/git/mesos/build/3rdparty/zookeeper-3.4.5/src/c/.libs/libzookeeper_mt.a
>  
> /Users/bmahler/git/mesos/build/3rdparty/libprocess/3rdparty/glog-0.3.3/.libs/libglog.a
>  
> /Users/bmahler/git/mesos/build/3rdparty/libprocess/3rdparty/protobuf-2.5.0/src/.libs/libprotobuf.a
>  -o build/lib.macosx-10.11-intel-2.7/mesos/executor/_executor.so 
> -Wl,--as-needed -L/usr/local/opt/subversion/lib -levent_openssl -lcrypto 
> -lssl -levent_pthreads -levent -lsasl2 -lsvn_delta-1 -lsvn_subr-1 -lapr-1 
> -lcurl -lz
> ld: unknown option: --as-needed
> clang: error: linker command failed with exit code 1 (use -v to see 
> invocation)
> 
> On Fri, Mar 11, 2016 at 1:56 PM,  wrote:
> New python lib with only the executor driver.
> 
> This patch produces a new python egg, mesos.executor, which contains only the
> code needed to create a MesosExecutorDriver. By doing so, the linker can
> remove unused code in libmesos_no_3rdparty.a, and therefore not include any
> external dependencies in the resulting _mesos.so.
> 
> Review: https://reviews.apache.org/r/41049/
> 
> 
> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/c81a52ec
> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/c81a52ec
> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/c81a52ec
> 
> Branch: refs/heads/master
> Commit: c81a52ec22266e1f2beb61b224c0f0d9be82521f
> Parents: 482dc14
> Author: Steve Niemitz 
> Authored: Fri Mar 11 16:56:13 2016 -0500
> Committer: Vinod Kone 
> Committed: Fri Mar 11 16:56:13 2016 -0500
> 
> --
>  configure.ac|   7 +-
>  src/Makefile.am |  33 +-
>  src/python/executor/setup.py.in |  39 +
>  src/python/executor/src/mesos/__init__.py   |  10 +
>  .../executor/src/mesos/executor/__init__.py |  17 +
>  .../executor/mesos_executor_driver_impl.cpp | 347 
>  .../executor/mesos_executor_driver_impl.hpp | 103 +++
>  .../executor/src/mesos/executor/module.cpp  |  91 +++
>  .../src/mesos/executor/proxy_executor.cpp   | 273 +++
>  .../src/mesos/executor/proxy_executor.hpp   |  64 ++
>  src/python/native/ext_modules.py.in | 151 
>  src/python/native/setup.py.in   |   9 +-
>  src/python/native/src/mesos/native/__init__.py  |   7 +-
>  .../mesos/native/mesos_executor_driver_impl.cpp | 347 
>  .../mesos/native/mesos_executor_driver_impl.hpp | 103 ---
>  .../native/mesos_scheduler_driver_impl.cpp  | 782 ---
>  .../native/mesos_scheduler_driver_impl.hpp  | 134 
>  src/python/native/src/mesos/native/module.cpp   | 100 ---
>  src/python/native/src/mesos/native/module.hpp   | 136 
>  .../native/src/mesos/native/proxy_executor.cpp  | 273 ---
>  .../native/src/mesos/native/proxy_executor.hpp  |  64 --
>  .../native/src/mesos/native/proxy_scheduler.cpp | 384 -
>  .../native/src/mesos/native/proxy_scheduler.hpp |  72 --
>  src/python/native_common/common.hpp | 136 
>  src/python/native_common/ext_modules.py.in  | 154 
>  src/python/scheduler/setup.py.in|  39 +
>  src/python/scheduler/src/mesos/__init__.py  |  10 +
>  .../scheduler/src/mesos/scheduler/__init__.py   |  17 +
>  .../scheduler/mesos_scheduler_driver_impl.cpp   | 782 +++
>  .../scheduler/mesos_scheduler_driver_impl.hpp   | 134 
>  .../scheduler/src/mesos/scheduler/module.cpp|  91 +++
>  .../src/mesos/scheduler/proxy_scheduler.cpp | 384 +
>  .../src/mesos/scheduler/proxy_scheduler.hpp |  72 ++
>  33 files changed, 2795 insertions(+), 2570 deletions(-)
> 
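
The macOS failure quoted above comes from passing `-Wl,--as-needed`, a GNU ld option that Apple's linker rejects. One way the `EXTRA_LINK_ARGS` line could be guarded is sketched below; the helper name and the choice of `-dead_strip_dylibs` as the macOS analogue are illustrative assumptions, not part of the patch under discussion.

```python
import sys

def extra_link_args(platform=sys.platform):
    """Pick linker flags the host toolchain actually understands (sketch)."""
    if platform == 'darwin':
        # Apple's ld has no --as-needed (hence "ld: unknown option" above);
        # -dead_strip_dylibs is its closest analogue: drop dylibs that
        # satisfy no symbol references.
        return ['-Wl,-dead_strip_dylibs']
    # GNU ld: only link shared libraries that are actually referenced.
    return ['-Wl,--as-needed']

EXTRA_LINK_ARGS = extra_link_args()
```

A guard like this keeps the dead-code-stripping benefit on Linux while letting the same setup.py link cleanly on OS X.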
