Re: jobs are stuck in agents and staying in staging state

2016-09-01 Thread Erik Weathers
You need to ensure there is no overlap between these 3 things:

1. Static ports for mesos agent/master, e.g. 5050, 5051 by default.
2. Linux ephemeral port range (32768-61000 by default).
3. Mesos view of ports as resources (31000-32000 by default).

The behavior initially described here sounds to me like a scheduler may
have been offered an already used port on the agent host.
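
As a rough illustration of the no-overlap rule above, the check can be
sketched in Python. The range values are just the defaults listed above, and
the names (PORT_RANGES, overlaps, find_overlaps) are made up for this sketch;
nothing here is part of Mesos itself.

```python
from itertools import combinations

# Defaults quoted in the list above; adjust them to match your hosts.
PORT_RANGES = {
    "mesos static ports": (5050, 5051),
    "linux ephemeral ports": (32768, 61000),
    "mesos port resources": (31000, 32000),
}


def overlaps(a, b):
    """Return True if two inclusive (low, high) port ranges share a port."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_overlaps(ranges):
    """Return a sorted list of name pairs whose port ranges overlap."""
    return sorted(
        (name_a, name_b)
        for (name_a, range_a), (name_b, range_b)
        in combinations(ranges.items(), 2)
        if overlaps(range_a, range_b)
    )
```

With the defaults, find_overlaps(PORT_RANGES) reports nothing. With the
values discussed further down in this thread (ephemeral range 5000-6000,
Mesos port resources 5001-6000, agent on 5051), every pair would be reported.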

I answered a related question before:

http://unix.stackexchange.com/a/237543

- Erik

On Thursday, September 1, 2016, haosdent  wrote:

> If you use Linux, you could run the following command on every Mesos agent
> so that ephemeral ports are assigned from the range 5000-6000:
>
> echo "5000 6000" > /proc/sys/net/ipv4/ip_local_port_range
>
> On Thu, Sep 1, 2016 at 3:12 PM, Vinod Kone wrote:
>
> > AFAICT, your agent is listening on port 8082 and not the default 5051.
> >
> > I0829 14:24:21.750063  2679 slave.cpp:193] Slave started on 1)@128.226.116.69:8082
> >
> > The fact that the agent is receiving a task from the master means that
> > the firewall on the agent allows incoming connections to 8082. So I'm
> > surprised that a local connection from the executor to the agent is
> > being denied. What exactly are your firewall rules on the agent?
> >
> > Also, can you share the stderr/stdout of an example executor?
> >
> >
> > On Wed, Aug 31, 2016 at 6:18 PM, Pankaj Saha wrote:
> >
> > > I think the executor tries to register by communicating with the
> > > Mesos master, and it fails due to network restrictions.
> > > How can I change the /tmp/ path? I have set /var/lib/mesos as my
> > > work_dir.
> > >
> > >
> > > *Here is my setup:*
> > > I have a Mesos setup where the master and the agent both run on my
> > > university campus network. The Mesos agent node sits behind a
> > > firewall, and only ports 5000 to 6000 are open for incoming traffic,
> > > whereas the Mesos master has no such restrictions. I am running the
> > > master service on master:5050 and the agent on agent:5051, the default.
> > >
> > > I can see that the agent communicates correctly with the master and
> > > offers its available resources. I have set the available ports for
> > > agents to ports:[5001-6000] in *src/slave/constants.cpp* so that
> > > frameworks can communicate only through the ports that are open on my
> > > agent system behind the firewall.
> > >
> > > Now when I launch jobs through the Mesosphere Marathon framework, I can
> > > see that all jobs are connected to mesos-agent through the configured
> > > port range [5001-6000]. But my jobs are not getting submitted. So I
> > > started debugging and realised that when launching jobs, the Mesos
> > > agent creates and launches an executor (*/erc/executor/executor.cpp*)
> > > which communicates with the Mesos master through a random port, which
> > > is outside my range of open ports (5000-6000). Since my agent machine
> > > cannot accept requests on those ports, the executor times out and is
> > > restarted again and again, each time after the 1 minute limit.
> > >
> > > I could not find out where exactly that random port is assigned. Is
> > > there a socket setting that we can change so that the executor
> > > connection happens within the desired range of ports? Please let me
> > > know whether my understanding is correct and how I can change those
> > > ports for executor registration.
> > >
> > >
> > >
> > >
> > > On Wed, Aug 31, 2016 at 3:09 AM, haosdent wrote:
> > >
> > > > >I0829 14:27:38.322805  2700 slave.cpp:4307] *Terminating executor
> > > > ''test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' of framework
> > > > c796100f-9ecb-46fa-90a2-72ad649c5dd3-' because it did not register
> > > > within 1mins
> > > >
> > > > This log looks weird. Could you find anything in the stdout/stderr of
> > > > the executor? For the executor
> > > > 'test.1fb85a35-6e16-11e6-bec9-c27afc834a0c' above, it should be under
> > > > the folder
> > > > '/tmp/mesos/slaves/d6f0e3e2-d144-4275-9d38-82327408622b-S12/frameworks/c796100f-9ecb-46fa-90a2-72ad649c5dd3-/executors/test.1fb85a35-6e16-11e6-bec9-c27afc834a0c/runs/dff399f0-beb1-4c49-bd8e-c19621de2f71/'
> > > >
> > > > Apart from that, running Mesos under '/tmp' is not recommended.
> > > >
> > > > On Tue, Aug 30, 2016 at 2:32 AM, Pankaj Saha wrote:
> > > >
> > > > > here is the log:
> > > > >
> > > > >
> > > > >
> > > > > I0829 14:24:21.727960 2679 main.cpp:223] Build: 2016-08-28 13:39:46 by root
> > > > > I0829 14:24:21.728159  2679 main.cpp:225] Version: 0.28.2
> > > > > I0829 14:24:21.733256  2679 containerizer.cpp:149] Using isolation: posix/cpu,posix/mem,filesystem/posix
> > > > > I0829 14:24:21.738895  2679 linux_launcher.cpp:101] Using
> > > 

Re: New external dependency

2016-06-20 Thread Erik Weathers
@Kevin:

FYI, it's best practice to use a commit SHA in GitHub links so that future
readers are seeing the content you intended.

i.e., instead of:

   - https://github.com/NVIDIA/nvidia-docker/blob/master/tools/src/nvidia/volumes.go#L109

It's best to do:

   - https://github.com/NVIDIA/nvidia-docker/blob/101b436c89c3a74e9a3025a104587b6612d903d8/tools/src/nvidia/volumes.go#L109
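
The rewrite from the first link to the second can also be sketched in code,
for cases where links are generated by a script. This is only an
illustration: pin_github_blob_url is a made-up helper, and it assumes a
branch name without slashes (such as "master").

```python
import re

# Matches https://github.com/<owner>/<repo>/blob/<ref>/<path...>
BLOB_URL = re.compile(r"^(https://github\.com/[^/]+/[^/]+/blob/)([^/]+)(/.*)$")


def pin_github_blob_url(url, commit_sha):
    """Replace the branch segment of a GitHub blob URL with a commit SHA."""
    match = BLOB_URL.match(url)
    if match is None:
        raise ValueError("not a github.com blob URL: " + url)
    # Keep the prefix and the file path (including any #L anchor) unchanged.
    return match.group(1) + commit_sha + match.group(3)
```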


And (awesomely!) GitHub makes it trivial to do this!  [1]

   - when you're looking at a file (such as the original link you pasted),
   just type "y" and GitHub will redirect to a permanent link pinned to the
   latest commit in master.

- Erik

[1] https://help.github.com/articles/getting-permanent-links-to-files/

On Mon, Jun 20, 2016 at 6:59 PM, Kevin Klues  wrote:

> For now we've decided to actually remove the hard dependence on libelf
> for the 1.0 release and spend a bit more time thinking about the right
> way to pull it in.
>
> Jean, to answer your question though -- someone would still need to
> consolidate these libraries, even if it wasn't left to Mesos to do so.
> These libraries are spread across the file system, and need to be
> pulled into a single place for easy injection. The full list of
> binaries / libraries is here:
>
>
> https://github.com/NVIDIA/nvidia-docker/blob/master/tools/src/nvidia/volumes.go#L109
>
> We could put this burden on the operator and trust he gets it right,
> or we could have Mesos programmatically do it itself. We considered
> just leveraging the nvidia-docker-plugin itself (instead of
> duplicating its functionality into mesos), but ultimately decided it
> was better not to introduce an external dependency on it (since it is
> a separate running executable, rather than a simple library, like
> libelf).
>
> On Mon, Jun 20, 2016 at 5:12 PM, Jean Christophe “JC” Martin
>  wrote:
> > As an operator not using GPUs, I feel that the burden seems misplaced
> > and disproportionate.
> > I assume that the operator of a GPU cluster knows the location of the
> > libraries based on their OS, and could potentially provide this
> > information at the time of creating the containers. I am not sure I see
> > why this is something that Mesos is required to do (consolidating the
> > libraries in the volume, versus taking it as configuration/external
> > information).
> >
> > Thanks,
> >
> > JC
> >
> >> On Jun 20, 2016, at 2:30 PM, Kevin Klues  wrote:
> >>
> >> Sorry, the ticket just links to the nvidia-docker project without much
> >> further explanation. The information at the link below should make it
> >> a bit more clear:
> >>
> >> https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver.
> >>
> >> The crux of the issue is that we need to be able to consolidate all of
> >> the Nvidia binaries/libraries into a single volume that we inject into
> >> a docker container.  libelf is used to get the canonical names
> >> of all the Nvidia libraries (i.e. SONAME in their dynamic sections) as
> >> well as to look up what external dependencies they have (i.e. NEEDED in
> >> their dynamic sections) in order to build this volume.
> >>
> >> NOTE: None of this volume support is actually in Mesos yet -- we just
> >> added the libelf dependence in anticipation of it.
> >>
> >>
> >>
> >>
> >> On Mon, Jun 20, 2016 at 12:59 PM, Yan Xu  wrote:
> >>> It's not immediately clear from the ticket why the dependency changed
> >>> from optional to required, though. Could you summarize?
> >>>
> >>>
> >>> On Sun, Jun 19, 2016 at 12:33 PM, Kevin Klues wrote:
> 
>  Thanks Zhitao,
> 
>  I just pushed out a review for upgrades.md and added you as a reviewer.
> 
>  The new dependence was added in the JIRA that haosdent linked, but the
>  actual reason for adding the dependence is more related to:
>  https://issues.apache.org/jira/browse/MESOS-5401
> 
>  On Sun, Jun 19, 2016 at 9:34 AM, haosdent  wrote:
> > The related issue is "Change build to always enable Nvidia GPU support
> > for Linux".
> > My local build broke before Kevin sent out the email, and then I
> > found this change.
> >
> > On Mon, Jun 20, 2016 at 12:11 AM, Zhitao Li wrote:
> >>
> >> Hi Kevin,
> >>
> >> Thanks for letting us know. It seems like this is not called out in
> >> upgrades.md, so can you please document this additional dependency
> >> there?
> >>
> >> Also, can you include the link to the JIRA or patch requiring this
> >> dependency so we can have some context?
> >>
> >> Thanks!
> >>
> >> On Sat, Jun 18, 2016 at 10:25 AM, Kevin Klues wrote:
> >>
> >>> Hello all,
> >>>
> >>> Just an FYI that the newest libmesos now has an external dependence
> >>> on libelf on Linux. This dependence can be installed via the
> >>> following packages:
> >>>
> 

Re: Mesos admin REST API

2016-05-18 Thread Erik Weathers
Maybe I'm misunderstanding the question, but I've used this mechanism to
kill tasks via the scheduler REST API, so from my perspective that *does*
exist already:

https://mesos.apache.org/documentation/latest/scheduler-http-api/

We don't do any authentication stuff in our Mesos system though.
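
A sketch of the KILL call body described on the page linked above, assuming
the v1 JSON field names (framework_id, type, kill, task_id, agent_id). The
helper names and the IDs passed in are placeholders made up for this
example; please check the field names against the docs for your Mesos
version.

```python
import json


def build_kill_call(framework_id, task_id, agent_id=None):
    """Build the body of a scheduler API KILL call as a plain dict."""
    call = {
        "framework_id": {"value": framework_id},
        "type": "KILL",
        "kill": {"task_id": {"value": task_id}},
    }
    # agent_id is optional in this sketch; include it only when it is known.
    if agent_id is not None:
        call["kill"]["agent_id"] = {"value": agent_id}
    return call


def kill_call_json(framework_id, task_id, agent_id=None):
    """Serialize the KILL call to the JSON text that would be POSTed."""
    return json.dumps(build_kill_call(framework_id, task_id, agent_id))
```

The resulting JSON would be POSTed to the master's /api/v1/scheduler
endpoint by a framework that is already subscribed, so this does not bypass
the framework in the way the original question asks for.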

- Erik

On Wed, May 18, 2016 at 1:47 AM, Guangya Liu  wrote:

> No, but there is some discussion, and a JIRA tracking this:
> https://issues.apache.org/jira/browse/MESOS-3220
>
> On Wed, May 18, 2016 at 4:08 PM, Olivier Sallou wrote:
>
> > Hi,
> > Is there any way for an operator/admin to kill a task via an admin API?
> >
> > I faced an issue where Mesos does not send any offers to my framework
> > after a task failure (the task remains in staging, or an old framework
> > can't be contacted). The result is that my framework cannot send new
> > kills, etc.
> >
> > I'd like, as a Mesos admin, to send a kill request (or other kinds of
> > requests), bypassing the framework.
> >
> > Thanks
> >
> > Olivier
> >
> > --
> > Olivier Sallou
> > IRISA / University of Rennes 1
> > Campus de Beaulieu, 35000 RENNES - FRANCE
> > Tel: 02.99.84.71.95
> >
> > gpg key id: 4096R/326D8438  (keyring.debian.org)
> > Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438
> >
> >
>


Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-18 Thread Erik Weathers
BTW, if the tag is created against a commit that *doesn't* become
"unreachable" from HEAD [1], then `git pull` is sufficient to also pull
down the tags.

The only time I've needed to do `git fetch --tags` is when the tagged
commit SHA gets merged away.  So presumably the process being followed by
the core committers / releasers is resulting in these "unreachable" tags.
Not sure if that is preventable though.

- Erik

[1] http://eddiemoya.com/2013/02/21/better-git-git-fetch-not-getting-tags/

From the git manual ("git help fetch"): [1]

-t, --tags Most of the tags are fetched automatically as branch heads are
downloaded, but tags that do not point at objects reachable from the branch
heads that are being tracked will not be fetched by this mechanism. This
flag lets all tags and their associated objects be downloaded. The default
behavior for a remote may be specified with the remote.<name>.tagopt
setting. See git-config(1).



On Fri, Mar 18, 2016 at 6:22 PM, Michael Browning wrote:

> I agree with Kevin -- tags are immutable, so they're naturally suited
> for labeling releases, which ought to be immutable too.
>
> On Fri, Mar 18, 2016 at 4:59 PM, Kevin Klues  wrote:
> > I respectfully disagree.
> >
> > The whole purpose of tags is to mark permanent things like releases,
> > whereas branches are designed as temporary lines of development that
> > come and go (and grow and shrink) dynamically all the time.
> >
> > On Fri, Mar 18, 2016 at 4:04 PM, Jie Yu  wrote:
> >> I like the idea of using branches to manage releases.
> >>
> >> We can use that to manage point releases and backports as well.
> >>
> >> Say we want to cut 0.29.0 now, we fork a branch 0.29.0 and tag RCs in
> >> that branch. Once the RC is accepted, the head of that branch will
> >> become the release.
> >>
> >> Then, we immediately fork that branch and create the 0.29.1 branch.
> >>
> >> When a new bug fix is committed on the trunk, the committer will decide
> >> whether it affects the old releases (a bounded number, we can decide
> >> that later). If it does, the committer of that patch should also
> >> cherry-pick that patch to the point releases (e.g., 0.29.1 in this
> >> case). We can do time-based point releases.
> >>
> >> - Jie
> >>
> >> On Fri, Mar 18, 2016 at 1:35 PM, Cong Wang wrote:
> >>
> >>> On Wed, Mar 16, 2016 at 11:56 AM, Joseph Wu wrote:
> >>> > Cong Wang,
> >>> >
> >>> > The tags are sync'd.  See: https://github.com/apache/mesos/releases
> >>> >
> >>> > You might not have done: git pull --tags
> >>>
> >>>
> >>> Yeah, I figured it out by myself too. This is why I hate tags
> >>> personally, branches are better since they are fetched without
> >>> additional parameters.
> >>>
> >>> Any reason why Mesos maintainers picked tags over branches to manage
> >>> releases? Just curious...
> >>>
> >
> >
> >
> > --
> > ~Kevin
>


Re: Port management on the host

2016-03-09 Thread Erik Weathers
On Wed, Mar 9, 2016 at 4:16 PM, Ashwin Murthy wrote:

> Is mesos aware of ports on the host that are in use vs free?


Not directly.  Mesos only knows about a port being used if it is part of an
accepted resource Offer from a host.  i.e., if a Framework uses one of the
offered ports to launch an Executor or Task, then Mesos records it and
won't offer the port up again until that Executor or Task terminates.  By
default Mesos allows frameworks to use ports 31000-32000, which is separate
from the default Linux ephemeral port range of 32768-61000.  So in general
nothing *should* be using the Mesos ports.  See my answer on this question:

   - https://unix.stackexchange.com/questions/211647/how-safe-is-it-to-change-the-linux-ephemeral-port-range/237543#237543

However, we've seen problems with cgroups where a process can get stuck in
a zombie state where it holds onto a port, even though it is unschedulable
by the kernel.   And unfortunately Mesos may have already been told that
the task has finished (e.g., the Executor dies and so the Task is
considered lost).  In such cases, Mesos can offer this unavailable port up
to a Framework which attempts to run a Task which will fail due to the port
not being bindable.
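
As a hedged illustration of the failure mode above: a task (or a wrapper
script) could probe the offered ports before trying to use them. This is
only a sketch, not something Mesos does itself; the function names are made
up, and the probe is inherently racy because another process can grab the
port between the check and the real bind.

```python
import socket


def port_is_bindable(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: something still holds the port.
            return False
        return True


def unbindable_ports(offered_ports, host="0.0.0.0"):
    """Return the offered ports that cannot be bound on this host right now."""
    return [port for port in offered_ports
            if not port_is_bindable(port, host)]
```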

> Are these offered to frameworks to make scheduling decisions?
>

Yes, as noted above, ports are part of the resources offered by Mesos.


>
> If yes, then does mesos make any assumptions on IP addresses on the host?
>

These questions about multiple IPs are getting into an area of Mesos that
I'm not very familiar with.  But the way *I* use Mesos is with just the
host-network stack with a single IP on the host, so I have a single IP that
is shared between the slave/agent host and the containers running on that
host, and thus the ports are all from that single IP.

- Erik


> What if there are multiple IPs configured on multiple NICs/vNICs? The port
> range is mapped to an IP.
>
> Thanks
> Ashwin
>


Re: Reorganize 3rdparty directory

2016-02-16 Thread Erik Weathers
If we go to git submodules, please ensure there are good docs around how to
update cloned repos.

e.g., From ansible: https://docs.ansible.com/ansible/intro_installation.html

Note when updating ansible, be sure to not only update the source tree, but
also the “submodules” in git which point at Ansible’s own modules (not the
same kind of modules, alas).

$ git pull --rebase
$ git submodule update --init --recursive

Thanks,

- Erik

On Tue, Feb 16, 2016 at 8:54 AM, Alexander Rojas wrote:

> +1
> I am totally in favor of that change. It is not only the directories
> problem: the structure has also led to the stout tests (which do
> need to be compiled) being managed in the libprocess Makefile, on
> top of all the things you have already mentioned.
>
>
> > On 09 Feb 2016, at 17:53, Kapil Arya  wrote:
> >
> > On Tue, Feb 9, 2016 at 8:23 PM, Jie Yu  wrote:
> >> Kapil,
> >>
> >> I guess what I want to understand is why the existing structure makes it
> >> hard for you to do the things that you want to do (installing
> >> module-specific 3rdparty dependencies into "${pkglibdir}/3rdparty" as
> >> part of "make install").
> >
> > Let me see if I can answer that :-).
> >
> > This is somewhat related. For example, if we want to install protobuf
> > in 3rdparty/{include,lib} (for module developers to use them without
> > doing a proper mesos installation), you need to provide the correct
> > "--prefix" flag that points to 3rdparty/. However, due to multiple
> > levels of configure.ac, the "--prefix" can at best be generated by
> > prepending "../../../" to get to the great-grandparent directory. This
> > is because we have a separate configure.ac which manages
> > 3rdparty/libprocess/3rdparty/Makefile.am. There are ways around it,
> > but they are not clean.
> >
> > Similar thing holds for system-wide installation of these 3rdparty
> > packages. For example, ideally, we would want to use
> > "${pkglibdir}/3rdparty" as a prefix for those packages. However, since
> > they are part of libprocess package, we don't get the correct
> > directory and have to use either hardwired $pkglibdir, or somehow pass
> > it from the top-level configure all the way down to
> > 3rdparty/libprocess/3rdparty/Makefile.am :-(.
> >
> >
> >> The only reason you mentioned in the original email is that "in the
> >> current code base, we don't strictly follow the 3rdparty structure",
> >> which IMO is not a very convincing reason for such a big change.
> >
> > How about a not so big change? :-). What if we just move
> > 3rdparty/libprocess/3rdparty/* stuff out to 3rdparty/ while leaving
> > stout as is? That is not a big change since we are not touching
> > libprocess/stout. Just adjusting Makefiles and I am pretty sure it
> > will be cleaner and simpler than what we have right now.
> >
> > At a later time, we can then consider moving stout out to 3rdparty/
> > while leaving libprocess as is. But that's something we can decide
> > later and leave stout as an exception for now.
> >
> > BTW, if we were to install all the 3rdparty packages in 3rdparty/,
> > that would also cut down a lot on the compiler flags (i.e., fewer "-I"
> > and "-L" flags) :-).
> >
> > Kapil
> >
> >>
> >> - Jie
> >>
> >> On Tue, Feb 9, 2016 at 5:04 PM, Kapil Arya  wrote:
> >>
> >>> On Tue, Feb 9, 2016 at 7:20 PM, Jie Yu  wrote:
> 
> >
> > However, in the current code base, we don't strictly follow the
> >>> 3rdparty
> > structure. For example, stout has a dependency on picojson and
> > google-protobuf, but we don't put these two packages inside
> > 3rdparty/libprocess/3rdparty/stout/3rdparty/.
> 
> 
>  My understanding is that stout is header only. So it does not have to
>  bundle 3rdparty libraries. The user of stout is responsible for
> bundling
>  them if they are used.
> >>>
> >>>
> >>> I don't think being header-only is an excuse to have a broken
> >>> installation :-). Further, we don't make it easier for the user to get
> >>> the 3rdparty binaries either. For example, if the user has a different
> >>> version of protobuf installed on the system, the compilation of any
> >>> program that uses stout will fail spectacularly!
> >>>
> >>> Having said that, the gist here is that we have somewhat deviated from
> >>> original motivation behind the 3rdparty directory and it would be nice
> >>> if we can have a flatter structure.
> >>>
> 
> 
>  - Jie
> 
>  On Tue, Feb 9, 2016 at 4:14 PM, Kapil Arya 
> wrote:
> 
> > Hi All,
> >
> > TLDR: Move everything from 3rdparty/libprocess/3rdparty/* into
> >>> 3rdparty/.
> > (Optionally) Move libprocess/stout to the top-level directory.
> >
> > I wanted to start some discussion around reorganizing stuff inside
> > "3rdparty". I apologize for the length of the email, please bear with
> >>> 

Re: are mesos package version names predictable?

2016-02-04 Thread Erik Weathers
hey Till, thanks for the response!   Would be great to have those
CI-build-numbers squelched in the future if possible.  Or some sort of
alias set up to allow redirecting to the fully expanded build.

Any suggestions for the short term of how to perform this redirection?  For
reference, the purpose of this is to allow the storm-on-mesos project's
Dockerfile to have a configurable mesos version.

   - https://github.com/mesos/storm/pull/91

- Erik

On Thu, Feb 4, 2016 at 1:51 PM, Till Toenshoff <toensh...@me.com> wrote:

> Hey Erik,
>
> those added values (e.g. "-0.2.190") come from Mesosphere's CI system
> which adds build-numbers (the last digit) and some more "noise" for those
> distribution packages.
>
> I will raise this issue internally and get back to you - hoping that we
> can get rid of these not so useful extensions in a not too distant future.
>
> hth,
> Till
>
>
> > On Feb 3, 2016, at 2:23 AM, Erik Weathers <eweath...@groupon.com.INVALID> wrote:
> >
> > I've noticed that in the published mesos packages [1] & docker images [2]
> > that the version name isn't simply:
> >
> >   -  mesos_0.27.0.ubuntu1404_amd64
> >
> > Instead it has the form of:
> >
> >   - mesos_0.27.0*-0.2.190*.ubuntu1404_amd64
> >
> > Here are a few more examples of this numeric suffix:
> >
> >   - 0.27.0 -> 0.27.0-0.2.190
> >   - 0.26.0 -> 0.26.0-0.2.145
> >   - 0.25.0 -> 0.25.0-0.2.70
> >   - 0.24.1 -> 0.24.1-0.2.35
> >   - 0.24.0 -> 0.24.0-1.0.27
> >
> > It is not clear to me what these suffixes represent, and it makes it hard
> > to write code that can download or install the mesos package for a
> > particular version given just the simple version name (e.g., 0.27.0).  I
> > tried searching for what might be generating this version suffix, or for
> > documentation of the release process for mesos, but I have failed.
> >
> > So my question is really 2-fold:
> > (1) Where does this extra suffix come from?  Does it represent something
> > specific?  What is its purpose?   Why isn't the version simply the
> > version?
> > (I'm sure there *is* a reason, but I haven't found it on my own.)
> > (2) What is the "right" way to handle this seeming unpredictability?
> >
> > Thanks!
> >
> > - Erik
> >
> > References:
> > [1] http://open.mesosphere.com/downloads/mesos/
> > [2] https://hub.docker.com/r/mesosphere/mesos/tags/
>
>


Re: are mesos package version names predictable?

2016-02-04 Thread Erik Weathers
Thanks Till, that is what I expected, and I appreciate the extra historical
insight.  Fingers crossed for future releases!  ;-)
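
For anyone needing the short-term workaround, the mapping-table approach
Till describes below can be sketched roughly as follows. The suffixes are
copied from the examples in the original question; the function names and
the default distro/arch values are assumptions for this sketch, and the
table has to be extended by hand for each new release.

```python
# Simple version -> full Mesosphere package version, taken from the
# examples quoted in this thread. Extend by hand for new releases.
PACKAGE_VERSIONS = {
    "0.27.0": "0.27.0-0.2.190",
    "0.26.0": "0.26.0-0.2.145",
    "0.25.0": "0.25.0-0.2.70",
    "0.24.1": "0.24.1-0.2.35",
    "0.24.0": "0.24.0-1.0.27",
}


def full_package_version(mesos_version):
    """Look up the full package version for a simple Mesos version."""
    try:
        return PACKAGE_VERSIONS[mesos_version]
    except KeyError:
        raise KeyError(
            "no known package build for Mesos %s; add it to PACKAGE_VERSIONS"
            % mesos_version
        ) from None


def package_name(mesos_version, distro="ubuntu1404", arch="amd64"):
    """Build a package name in the form shown in the original question."""
    return "mesos_%s.%s_%s" % (
        full_package_version(mesos_version), distro, arch)
```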

On Thu, Feb 4, 2016 at 2:42 PM, Till Toenshoff <toensh...@me.com> wrote:

> Given that the final number (build number) is constantly being raised, I
> fear there is really not much you can do. The “0.2” and “1.0” were
> originally planned for snapshots (0.2) vs. releases (1.0) but as you can
> see, we changed that over time.
>
> The only robust option I see right now is a mapping table as you drafted
> in your initial mail already. You would have to touch that mapping with
> each new release :(
>
> … but hope is in sight, for future releases at least… will keep you
> updated.
>
> > On Feb 4, 2016, at 11:07 PM, Erik Weathers <eweath...@groupon.com.INVALID> wrote:
> >
> > hey Till, thanks for the response!   Would be great to have those
> > CI-build-numbers squelched in the future if possible.  Or some sort of
> > alias set up to allow redirecting to the fully expanded build.
> >
> > Any suggestions for the short term of how to perform this redirection?
> > For reference, the purpose of this is to allow the storm-on-mesos
> > project's Dockerfile to have a configurable mesos version.
> >
> >   - https://github.com/mesos/storm/pull/91
> >
> > - Erik
> >
> > On Thu, Feb 4, 2016 at 1:51 PM, Till Toenshoff <toensh...@me.com> wrote:
> >
> >> Hey Erik,
> >>
> >> those added values (e.g. “-0.2.190") come from Mesosphere's CI system
> >> which adds build-numbers (the last digit) and some more “noise" for
> those
> >> distribution packages.
> >>
> >> I will raise this issue internally and get back to you - hoping that we
> >> can get rid of these not so useful extensions in a not too distant
> future.
> >>
> >> hth,
> >> Till
> >>
> >>
> >>> On Feb 3, 2016, at 2:23 AM, Erik Weathers
> <eweath...@groupon.com.INVALID>
> >> wrote:
> >>>
> >>> I've noticed that in the published mesos packages [1] & docker images
> [2]
> >>> that the version name isn't simply:
> >>>
> >>>  -  mesos_0.27.0.ubuntu1404_amd64
> >>>
> >>> Instead it has the form of:
> >>>
> >>>  - mesos_0.27.0*-0.2.190*.ubuntu1404_amd64
> >>>
> >>> Here are a few more examples of this numeric suffix:
> >>>
> >>>  - 0.27.0 -> 0.27.0-0.2.190
> >>>  - 0.26.0 -> 0.26.0-0.2.145
> >>>  - 0.25.0 -> 0.25.0-0.2.70
> >>>  - 0.24.1 -> 0.24.1-0.2.35
> >>>  - 0.24.0 -> 0.24.0-1.0.27
> >>>
> >>> It is not clear to me what these suffixes represent, and it makes it
> hard
> >>> to write code that can download or install the mesos package for a
> >>> particular version given just the simple version name (e.g., 0.27.0).
> I
> >>> tried searching for what might be generating this version suffix, or
> for
> >>> documentation of the release process for mesos, but I have failed.
> >>>
> >>> So my question is really 2-fold:
> >>> (1) Where does this extra suffix come from?  Does it represent
> something
> >>> specific?  What is its purpose?   Why isn't the version simply the
> >> version?
> >>> (I'm sure there *is* a reason, but I haven't found it on my own.)
> >>> (2) What is the "right" way to handle this seeming unpredictability?
> >>>
> >>> Thanks!
> >>>
> >>> - Erik
> >>>
> >>> References:
> >>> [1] http://open.mesosphere.com/downloads/mesos/
> >>> [2] https://hub.docker.com/r/mesosphere/mesos/tags/
> >>
> >>
>
>


Contributor request

2016-02-03 Thread Erik Weathers
Please add me as a contributor in the Mesos JIRA project.
Apache JIRA username: erikdw

Thanks!

- Erik


Re: are mesos package version names predictable?

2016-02-03 Thread Erik Weathers
Ah, thanks Tommy.  So I see that the Apache community doesn't release
binary packages, just versioned source code tarballs:

   - https://mesos.apache.org/downloads/

I'll follow up with Mesosphere directly, thanks!

- Erik

On Wednesday, February 3, 2016, tommy xiao <xia...@gmail.com> wrote:

> That is a Mesosphere concern; it is outside the Apache community's scope.
>
> 2016-02-03 9:23 GMT+08:00 Erik Weathers <eweath...@groupon.com.invalid>:
>
> > I've noticed that in the published mesos packages [1] & docker images [2]
> > that the version name isn't simply:
> >
> >   -  mesos_0.27.0.ubuntu1404_amd64
> >
> > Instead it has the form of:
> >
> >   - mesos_0.27.0*-0.2.190*.ubuntu1404_amd64
> >
> > Here are a few more examples of this numeric suffix:
> >
> >   - 0.27.0 -> 0.27.0-0.2.190
> >   - 0.26.0 -> 0.26.0-0.2.145
> >   - 0.25.0 -> 0.25.0-0.2.70
> >   - 0.24.1 -> 0.24.1-0.2.35
> >   - 0.24.0 -> 0.24.0-1.0.27
> >
> > It is not clear to me what these suffixes represent, and it makes it hard
> > to write code that can download or install the mesos package for a
> > particular version given just the simple version name (e.g., 0.27.0).  I
> > tried searching for what might be generating this version suffix, or for
> > documentation of the release process for mesos, but I have failed.
> >
> > So my question is really 2-fold:
> > (1) Where does this extra suffix come from?  Does it represent something
> > specific?  What is its purpose?   Why isn't the version simply the
> version?
> >  (I'm sure there *is* a reason, but I haven't found it on my own.)
> > (2) What is the "right" way to handle this seeming unpredictability?
> >
> > Thanks!
> >
> > - Erik
> >
> > References:
> > [1] http://open.mesosphere.com/downloads/mesos/
> > [2] https://hub.docker.com/r/mesosphere/mesos/tags/
> >
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>


Re: tasks not being scheduled; cfs_rq for /mesos is missing

2016-01-04 Thread Erik Weathers
hi Jojy,

Unfortunately, I haven't been able to reproduce this issue on demand, it
has just happened spontaneously a few times.   So I cannot say for sure if
it would happen on a newer mesos/kernel version.  I'm thinking of trying to
force reproduction by creating and destroying a ton of cgroups, since the
issue does *seem* to possibly correlate with some badly behaved storm
topologies that are constantly crashing and causing the cgroups to be
created and destroyed often.

I have a couple test hosts that are in this bad state right now, so I'm
trying to get as much info out of them as I can.  I'm thinking of trying
SystemTap to introspect the kernel's run queue state and see what is
happening.
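
The symptom in this thread's subject (a child cfs_rq present while its
parent "/mesos" queue is absent from /proc/sched_debug) can be checked for
with a small script. The sketch below assumes the "cfs_rq[N]:/path" line
format shown in the grep output quoted further down; the function names are
made up for this example.

```python
import re

# Matches lines such as "cfs_rq[0]:/mesos/<container-id>".
CFS_RQ_LINE = re.compile(r"^cfs_rq\[(\d+)\]:(/\S*)\s*$")


def parse_cfs_rqs(sched_debug_text):
    """Map each CPU number to the set of cgroup paths that have a cfs_rq."""
    queues = {}
    for line in sched_debug_text.splitlines():
        match = CFS_RQ_LINE.match(line)
        if match:
            cpu = int(match.group(1))
            queues.setdefault(cpu, set()).add(match.group(2))
    return queues


def missing_parents(paths):
    """Return ancestor paths lacking a cfs_rq although a descendant has one."""
    missing = set()
    for path in paths:
        parent = path.rsplit("/", 1)[0]
        while parent:  # "" means we have reached the root "/"
            if parent not in paths:
                missing.add(parent)
            parent = parent.rsplit("/", 1)[0]
    return sorted(missing)
```

Feeding it the contents of /proc/sched_debug and calling missing_parents on
each CPU's set would list "/mesos" on the problematic hosts and nothing on
the healthy ones.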

Here is the info you requested:

*/cgroup/cpu files:*

% for f in cpu.cfs_quota_us cpu.cfs_period_us cpu.shares cpu.stat tasks; do
echo $f: ; cat /cgroup/cpu/$f; done | head -n20
cpu.cfs_quota_us:
0
cpu.cfs_period_us:
0
cpu.shares:
1024
cpu.stat:
nr_periods 0
nr_throttled 0
throttled_time 0
tasks:
1
...

*/cgroup/cpu/mesos files:*

% for f in cpu.cfs_quota_us cpu.cfs_period_us cpu.shares cpu.stat tasks; do
echo $f: ; cat /cgroup/cpu/mesos/$f; done
cpu.cfs_quota_us:
-1
cpu.cfs_period_us:
10
cpu.shares:
1024
cpu.stat:
nr_periods 0
nr_throttled 0
throttled_time 0
tasks:

NOTE: no tasks, and the cpu.cfs_quota_us being -1.  But those are both
consistent with other hosts that aren't exhibiting this problem.

*/cgroup/cpu/mesos/08610169-76d5-4fd2-86bc-d3ef4d163e3e files:*

% for f in cpu.cfs_quota_us cpu.cfs_period_us cpu.shares cpu.stat tasks; do
echo $f: ; cat /cgroup/cpu/mesos/08610169-76d5-4fd2-86bc-d3ef4d163e3e/$f; done
cpu.cfs_quota_us:
180
cpu.cfs_period_us:
10
cpu.shares:
18432
cpu.stat:
nr_periods 680868
nr_throttled 254025
throttled_time 55400010353125
tasks:
6473
...
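
To put the cpu.stat numbers above in perspective, the throttling ratio can
be computed with a few lines of Python. This is a sketch only: the function
names are made up, and it assumes the cgroup v1 cpu.stat layout shown
above, where throttled_time is reported in nanoseconds.

```python
def parse_cpu_stat(text):
    """Parse the 'name value' lines of a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            stats[fields[0]] = int(fields[1])
    return stats


def throttled_fraction(stats):
    """Fraction of enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


def throttled_seconds(stats):
    """Total throttled time in seconds (cpu.stat reports nanoseconds)."""
    return stats.get("throttled_time", 0) / 1e9
```

For the container above that works out to roughly 37% of periods being
throttled, so the container was regularly hitting its CFS quota.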

- Erik

On Sun, Jan 3, 2016 at 10:38 AM, Jojy Varghese <j...@mesosphere.io> wrote:

> Hi Erik
>   Happy to work on this with you. Thanks for the details.
>
> As you might know, in cfs_rq:/<name> (from /proc/sched_debug), <name> is
> the CPU cgroup hierarchy name. I am curious about the contents and cgroups
> hierarchy when this happens. Could you send the "mesos" hierarchy
> (directory tree) and contents of files like
> 'tasks', 'cpu.cfs_quota_us', 'cpu.cfs_period_us', 'cpu.shares', 'cpu.stat'?
>
> It does look strange that the parent cgroup is missing when child is
> present.
>
> Also, wondering if you are able to see same issue with latest Mesos and/or
> kernel?
>
> -Jojy
>
>
> > On Jan 2, 2016, at 9:43 PM, Erik Weathers <eweath...@groupon.com.INVALID> wrote:
> >
> > hey Jojy,  Thanks for your reply.  Response inline.
> >
> > On Thu, Dec 31, 2015 at 11:31 AM, Jojy Varghese <j...@mesosphere.io>
> wrote:
> >
> >>> Are /foo/bar cgroups hierarchical such that /foo missing would prevent
> >>>  /foo/bar tasks from being scheduled?  i.e., might that be the root
> >> cause of
> >>>  why the kernel is ignoring these tasks?
> >>
> >> Was curious why you said the above. CPU scheduling shares are a function
> >> of their parent’s CPU bandwidth.
> >>
> >
> > This question arose from an earlier observation in my initial email:
> >
> > In my initial email I pointed out that the contents of /proc/sched_debug
> > list all of the CFS run queues, but it seems like some of those run
> queues
> > are missing on the affected hosts.  i.e., usually they look like this
> (only
> > including output for the 1st CPU's CFS run queues):
> >
> > % grep 'cfs_rq\[0\]' /proc/sched_debug
> > cfs_rq[0]:/mesos/e8aa3b46-8004-466a-9a5e-249d6d19993f
> > cfs_rq[0]:/mesos
> > cfs_rq[0]:/
> >
> > But on the problematic hosts, they look like this:
> >
> > % grep 'cfs_rq\[0\]' /proc/sched_debug
> > cfs_rq[0]:/mesos/5cf9a444-e612-4d5b-b8bb-7ee93e44b352
> > cfs_rq[0]:/
> >
> > Notably, "cfs_rq[0]:/mesos" is missing on the problematic hosts.
> >
> > I'm not sure how that is possible, given my understanding that these
> > cfs_rq's are created from the special cgroups filesystem having
> directories
> > added to it, and since the /cgroup/cpu/mesos dir exists (as well as
> > /cgroup/cpu/mesos/5cf9a444-e612-4d5b-b8bb-7ee93e44b352/), I don't see how
> > the CFS run queues for "/mesos" could have been deleted.   I've been
> trying
> > to read the kernel cgroup CFS scheduling code, but it's tough for a newb.
> >
> > Notably, the cgroup settings that I see in /cgroup/cpu/mesos and
> > /cgroup/cpu/mesos/5cf9a444-e612-4d5b-b8bb-7ee93e44b352 are not
> suspici

Re: tasks not being scheduled; cfs_rq for /mesos is missing

2016-01-02 Thread Erik Weathers
hey Jojy,  Thanks for your reply.  Response inline.

On Thu, Dec 31, 2015 at 11:31 AM, Jojy Varghese <j...@mesosphere.io> wrote:

> > Are /foo/bar cgroups hierarchical such that /foo missing would prevent
> > /foo/bar tasks from being scheduled?  i.e., might that be the root cause
> > of why the kernel is ignoring these tasks?
>
> Was curious why you said the above. CPU scheduling shares are a function
> of their parent’s CPU bandwidth.
>

This question arose from an observation in my initial email: the contents
of /proc/sched_debug list all of the CFS run queues, but some of those run
queues seem to be missing on the affected hosts.  Usually they look like
this (only including output for the 1st CPU's CFS run queues):

% grep 'cfs_rq\[0\]' /proc/sched_debug
cfs_rq[0]:/mesos/e8aa3b46-8004-466a-9a5e-249d6d19993f
cfs_rq[0]:/mesos
cfs_rq[0]:/

But on the problematic hosts, they look like this:

% grep 'cfs_rq\[0\]' /proc/sched_debug
cfs_rq[0]:/mesos/5cf9a444-e612-4d5b-b8bb-7ee93e44b352
cfs_rq[0]:/

Notably, "cfs_rq[0]:/mesos" is missing on the problematic hosts.
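
A minimal sketch of how this check could be automated, assuming the cfs_rq lines in /proc/sched_debug have the `cfs_rq[N]:/path` form shown above. The function name is hypothetical, and the two sample strings are simply the grep outputs from this message.

```python
import re


def missing_parent_queues(sched_debug_text):
    """Return ancestor cgroup paths that have no cfs_rq entry although a descendant does."""
    paths = set(re.findall(r"^cfs_rq\[\d+\]:(/\S*)", sched_debug_text, re.MULTILINE))
    missing = set()
    for path in paths:
        parts = path.strip("/").split("/")
        # Walk every proper ancestor below the root, e.g. /mesos for /mesos/<id>.
        for depth in range(1, len(parts)):
            ancestor = "/" + "/".join(parts[:depth])
            if ancestor not in paths:
                missing.add(ancestor)
    return sorted(missing)


healthy = """cfs_rq[0]:/mesos/e8aa3b46-8004-466a-9a5e-249d6d19993f
cfs_rq[0]:/mesos
cfs_rq[0]:/
"""

broken = """cfs_rq[0]:/mesos/5cf9a444-e612-4d5b-b8bb-7ee93e44b352
cfs_rq[0]:/
"""

print(missing_parent_queues(healthy))
print(missing_parent_queues(broken))
```

On a real host the text would come from reading /proc/sched_debug; a non-empty result flags the condition described here.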

I'm not sure how that is possible, given my understanding that these
cfs_rq's are created from the special cgroups filesystem having directories
added to it, and since the /cgroup/cpu/mesos dir exists (as well as
/cgroup/cpu/mesos/5cf9a444-e612-4d5b-b8bb-7ee93e44b352/), I don't see how
the CFS run queues for "/mesos" could have been deleted.   I've been trying
to read the kernel cgroup CFS scheduling code, but it's tough for a newb.

Notably, the cgroup settings that I see in /cgroup/cpu/mesos and
/cgroup/cpu/mesos/5cf9a444-e612-4d5b-b8bb-7ee93e44b352 are not suspicious.
i.e., it's not that the cgroup settings of the "parent" /mesos cgroup are
preventing the tasks from being scheduled.  It seems to be that the cgroup
settings of the parent are simply gone from the kernel.  Poof.

At this point I'm assuming that the above observation is indeed the root
cause of the problem, and I'm simply hoping that whatever logic deleted the
"/mesos" run queue is fixed in either a newer kernel or newer mesos version.

Thanks!

- Erik



>
> -Jojy
>
>
> > On Dec 30, 2015, at 6:55 PM, Erik Weathers <eweath...@groupon.com.INVALID>
> wrote:
> >
> > I'm trying to figure out a situation where we see tasks in a mesos
> > container no longer being scheduled by the Linux kernel.  None of the
> > tasks in the container are zombies, nor are they stuck in "Disk sleep"
> > state.  They are all in Running state.  But if I try to strace the
> > processes the strace cmd just hangs.  I've also noticed that none of the
> > RIPs (64-bit instruction pointers) are changing at all in these tasks,
> > and they're not accumulating any cputime.   So the kernel is just not
> > scheduling them.
> >
> > Despite the behavior described above, these non-running tasks *are*
> > listed in the run queues of /proc/sched_debug.  Notably, I have observed
> > that on hosts without this problem there exist "cfs_rq[N]:/mesos" run
> > queues, but on the hosts that have the broken scheduling, these run
> > queues don't exist, though we still have
> > "cfs_rq[N]:/mesos/<container-id>" in /proc/sched_debug.  That is mighty
> > suspicious to me.
> >
> > I'm curious about:
> >
> >   - Has anyone seen similar behavior?
> >   - Are /foo/bar cgroups hierarchical such that /foo missing would
> >     prevent /foo/bar tasks from being scheduled?  i.e., might that be the
> >     root cause of why the kernel is ignoring these tasks?
> >   - What creates the /mesos cfs run queue, and why would that cease to
> >     exist without the subordinate cgroups being cleaned up?
> >       - I'm assuming the creation of the "cpu" cgroup with the path
> >         "/mesos" done by mesos-slave creates this run queue.
> >       - But I'm not sure how/why it would be removed, since I still see a
> >         mesos cgroup in my cgroupfs cpu path (i.e., /cgroup/cpu/mesos
> >         exists).
> >
> > I'm assuming that this is a kernel bug, and I'm hopeful RedHat has
> > patched fixes into newer kernel versions that we are running on other
> > hosts (e.g., 2.6.32-573.7.1.el6).
> >
> > Setup info:
> >
> > Kernel version:  2.6.32-431.el6.x86_64
> > Mesos version:  0.22.1
> > Containerizer: Mesos
> > Isolators: Have seen this behavior with both of these configs:
> >   cgroups/cpu,cgroups/mem
> >   cgroups/cpu,cgroups/mem,namespaces/pid
> >
> > Thanks for any insight or help!
> >
> > - Erik
>
>


Re: minor doc fixes - do I need to have a JIRA ticket and a shepherd and a compass and a map of the pacific NW?

2015-10-21 Thread Erik Weathers
Thanks @Joris!  I look forward to watching your (and Michael's) talk.

- Erik

On Wed, Oct 21, 2015 at 7:52 AM, Joris Van Remoortere <jo...@mesosphere.io>
wrote:

> Hi Erik!
> For a simple patch like spelling / grammar mistakes you don't need a
> shepherd or a JIRA.
> Just submit your patch, and add a committer as a reviewer.
> If you don't have a relationship yet with one, you can add me
> (jvanremoortere), or in general just ask in IRC and someone will volunteer.
>
> I'm sorry the current impression of the contribution process is laborious.
> Mpark and I gave a talk about this at MesosCon EU. Hopefully the material
> will be available soon!
>
> Thanks for contributing!
> Joris
>
> —
> *Joris Van Remoortere*
> Mesosphere


minor doc fixes - do I need to have a JIRA ticket and a shepherd and a compass and a map of the pacific NW?

2015-10-20 Thread Erik Weathers
i.e., the process for making contributions to the Mesos code base is, uh,
how shall I say it... *involved*.

   - https://mesos.apache.org/documentation/latest/submitting-a-patch/

Hoping I don't have to jump through all the hoops to make some minor doc
fixes.

Appreciate if anyone can tell me how to easily submit some doc fixes
without doing all of the possible 51 (sub)steps on that page.  Notably, my
fixes include a spelling/grammar fix to the doc I just linked above (how
meta).

Thanks!!

- Erik


Re: building mesos with libmesos.so RPATH customized

2015-07-13 Thread Erik Weathers
Ah, yes, and that does the autoreconf. Thanks.

- Erik

On Sunday, July 12, 2015, haosdent haosd...@gmail.com wrote:

 Yes, you need to run
 ```
 $ ./bootstrap
 ```
 before running configure.


Re: building mesos with libmesos.so RPATH customized

2015-07-12 Thread Erik Weathers
Thanks for the response Hao.  Unfortunately that didn't work for me, the
default /usr/lib64 is inserted anyways.
I'm building on CentOS 6.5, using the instructions for CentOS 6.6 here:
http://mesos.apache.org/gettingstarted/

- Erik

On Sat, Jul 11, 2015 at 10:08 PM, haosdent haosd...@gmail.com wrote:

 Hi, @Erik I think you need to change -Wl,-rpath=/usr/local/lib to
 -Wl,-rpath,/usr/local/lib. My build step:

 ```
 LDFLAGS=-Wl,-rpath,/usr/local/lib ../configure
 ```

 And the result show the RPATH only contains /usr/local/lib

 ```
 $ objdump -x ./src/.libs/libmesos.so | grep RPATH
   RPATH   /usr/local/lib
 ```

 On Sun, Jul 12, 2015 at 12:06 PM, Erik Weathers eweath...@groupon.com
 wrote:

  hi mesos dev people,
 
  I'm hoping to enlist some help in building mesos such that the
 libmesos.so
  has its RPATH set as our environment expects.  Specifically, in our
  environment we install our own custom-built libraries under
 /usr/local/lib,
  so I want the RPATH in the libmesos.so ELF to look like so:
 
 Library rpath: [/usr/local/lib:/usr/lib64]
 
  I've tried to effect this change by running configure like so:
 
 LDFLAGS=-Wl,-rpath=/usr/local/lib ./configure --prefix=/usr/local
 
  This resulted in the following RPATH being embedded in libmesos.so:
 
 Library rpath: [/usr/lib64:/usr/local/lib]
 
  The RPATH *does* have /usr/local/lib, but I want that to be the 1st
 entry,
  not the 2nd.  I'm not familiar enough with autoconf nor libtool to figure
  out how to get the order reversed.  I *could* hack the embedded RPATH
 with
  the chrpath tool, but I'd prefer changing build arguments instead.
 
  I see in the g++ cmd that generates the .so that there are includes of
  /usr/lib64 earlier than my passed LDFLAGS, so I wonder if it's a
  configure.ac change I need to make to allow the LDFLAGS to be shoved in
  front instead of behind the automatically generated /usr/lib64 portion.
 
  Notably, simple use of the --prefix=/usr/local option allows the mesos-*
  binaries to have the embedded RPATH as I want, I'm only struggling with
 the
  libmesos.so RPATH.
 
  Thanks for whatever help you might provide!
 
  - Erik
 
  P.S., this is for building mesos-0.22.1
  P.P.S., I tried --with-rpath=/usr/local/lib, but that didn't help either.
 



 --
 Best Regards,
 Haosdent Huang



Re: building mesos with libmesos.so RPATH customized

2015-07-12 Thread Erik Weathers
Thanks for bearing with me Haosong.

No environment variable mucking with it that I can see.  Only potentially
relevant thing is perhaps the LD_LIBRARY_PATH set by scl:

 
LD_LIBRARY_PATH=/opt/rh/devtoolset-2/root/usr/lib64:/opt/rh/devtoolset-2/root/usr/lib

Regarding package versions, we are identical for those:
automake-1.11.1-4.el6.noarch
autoconf-2.63-5.1.el6.noarch
libtool-2.2.6-15.5.el6.x86_64

Are you using scl with devtoolset-2?

I feel like the original suggestion you had won't yield different behavior
than my original LDFLAGS setting.  i.e., my belief is that
-Wl,-rpath=/usr/local/lib
is identical to -Wl,-rpath,/usr/local/lib.  I'm guessing these are just
syntax variants.
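
As a rough illustration of why the two spellings should be equivalent: gcc documents that the text after -Wl, is split at commas and each piece is handed to the linker as a separate argument, and GNU ld accepts both `-rpath=dir` and `-rpath dir`. The helper below only mimics that splitting; it is a sketch with a made-up function name, not how gcc is actually implemented.

```python
def linker_args(compiler_flag):
    """Mimic how gcc expands -Wl,<args>: the text after '-Wl,' is split on commas."""
    prefix = "-Wl,"
    if not compiler_flag.startswith(prefix):
        raise ValueError("not a -Wl flag: " + compiler_flag)
    return compiler_flag[len(prefix):].split(",")


# One linker argument of the form -rpath=DIR.
print(linker_args("-Wl,-rpath=/usr/local/lib"))
# Two linker arguments: -rpath followed by DIR.
print(linker_args("-Wl,-rpath,/usr/local/lib"))
```

Either way the linker receives the same rpath directory, so neither form by itself changes where /usr/local/lib lands relative to /usr/lib64.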

It's worth reemphasizing that the mesos-slave, mesos-master, etc. binaries
have the RPATH set as I expect (the --prefix setting is sufficient for
that).

- Erik

On Sat, Jul 11, 2015 at 11:24 PM, haosdent haosd...@gmail.com wrote:

 My autotool version:
 automake-1.11.1-4.el6.noarch
 autoconf-2.63-5.1.el6.noarch
 libtool-2.2.6-15.5.el6.x86_64

 On Sun, Jul 12, 2015 at 2:23 PM, haosdent haosd...@gmail.com wrote:

  I also use CentOS 6.5.
 
  On Sun, Jul 12, 2015 at 2:20 PM, haosdent haosd...@gmail.com wrote:
 
  Do any existing environment variables affect your build?
 



Re: building mesos with libmesos.so RPATH customized

2015-07-12 Thread Erik Weathers
FYI, here's how I'm doing the build:
https://gist.github.com/erikdw/67db1eac4fb1ede8

I included the RPM list on the VM.

- Erik




Re: building mesos with libmesos.so RPATH customized

2015-07-12 Thread Erik Weathers
Whoa!!  Great find Haosong!  Makes sense that it would be the source
tarball since our environments are seemingly the same.  I'll try this out
this morning and let you know if I succeed. Thanks so much for your helping
of a random stranger, I truly appreciate it!

- Erik

On Sunday, July 12, 2015, haosdent haosd...@gmail.com wrote:

 Hi, Erik. The release package at
 http://archive.apache.org/dist/mesos/0.22.1/mesos-0.22.1.tar.gz
 contains m4/libtool.m4, while
 https://github.com/apache/mesos/archive/0.22.1.tar.gz does not contain
 m4/libtool.m4. Using the m4/libtool.m4 (version 2.4.6) from
 http://archive.apache.org/dist/mesos/0.22.1/mesos-0.22.1.tar.gz
 appends /usr/lib64 to the rpath, but using your system libtool (version
 2.2.6) does not append any rpath unless you specify one.
 I still don't know why the release package contains m4/libtool.m4; maybe
 there is some special reason. But for you, I think you could use
 https://github.com/apache/mesos/archive/0.22.1.tar.gz directly.
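
 A minimal sketch of how the difference between the two tarballs could be
 checked without unpacking them, using Python's standard tarfile module. The
 helper names and the member paths in the demo are hypothetical stand-ins;
 a real check would open the downloaded .tar.gz files instead.

```python
import io
import tarfile


def tarball_has_member(fileobj, suffix):
    """Return True if any member path in the gzip'd tarball ends with the given suffix."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        return any(name.endswith(suffix) for name in tar.getnames())


def make_demo_tarball(member_names):
    """Build a tiny in-memory .tar.gz of empty files (stand-in for a real download)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in member_names:
            tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    return buf


# Hypothetical member lists imitating the two archives discussed above.
release_like = make_demo_tarball(["mesos-0.22.1/configure", "mesos-0.22.1/m4/libtool.m4"])
github_like = make_demo_tarball(["mesos-0.22.1/configure.ac"])

print(tarball_has_member(release_like, "m4/libtool.m4"))
print(tarball_has_member(github_like, "m4/libtool.m4"))
```

 With a real download, the file would be opened in binary mode and passed
 as the first argument.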

 On Sun, Jul 12, 2015 at 10:44 PM, haosdent haosd...@gmail.com wrote:

  Hi, Erik. I found the problem. The mesos-0.22.1.tar.gz (
  http://apache.cs.utah.edu/mesos/0.22.1/mesos-0.22.1.tar.gz) you provided
  has problems. Please download it from here (
  https://github.com/apache/mesos/archive/0.22.1.tar.gz). I could build it
  with the correct rpath.
 
  On Sun, Jul 12, 2015 at 10:23 PM, haosdent haosd...@gmail.com wrote:

  The problem seems to be that the libtool scripts generated in the build
  directory are different. The libtool that mesos-0.22.1 uses appends
  /usr/lib64 when linking libmesos.so. If I copy the libtool generated
  under the mesos master branch, /usr/lib64 is not included when linking.
 
  On Sun, Jul 12, 2015 at 8:03 PM, haosdent haosd...@gmail.com wrote:

  Hi, Erik. Using 0.22.1 to build libmesos.so, I could reproduce your
  problem. But using the master branch, I could not reproduce it.
 
 

Re: building mesos with libmesos.so RPATH customized

2015-07-12 Thread Erik Weathers
Had to add autoreconf -i after unpacking the mesos tar from github, in
order to generate the configure script.
But it worked, woot!
Thanks again!

- Erik

On Sun, Jul 12, 2015 at 10:01 AM, haosdent haosd...@gmail.com wrote:

 Never mind. You are welcome. I also boring at weekend. :-)

 On Mon, Jul 13, 2015 at 12:41 AM, Erik Weathers eweath...@groupon.com
 wrote:

  Whoa!!  Great find Haosong!  Makes sense that it would be the source
  tarball since our environments are seemingly the same.  I'll try this out
  this morning and let you know if I succeed. Thanks so much for your
 helping
  of a random stranger, I truly appreciate it!
 
  - Erik
 
  On Sunday, July 12, 2015, haosdent haosd...@gmail.com wrote:
 
   Hi, Erik. The release package in
   http://archive.apache.org/dist/mesos/0.22.1/mesos-0.22.1.tar.gz
   contains m4/libtool.m4 while the
   https://github.com/apache/mesos/archive/0.22.1.tar.gz don't contains
   m4/libtool.m4. And use the m4/libtool.m4 (version is 2.4.6) in
   http://archive.apache.org/dist/mesos/0.22.1/mesos-0.22.1.tar.gz would
   append /usr/lib64 to rpath. But use your system libtool(version is
  2.2.6),
   would not append any rpath except you special it.
   I still don't know why contains m4/libtool.m4 in release package, maybe
   have some special reason. But for you, I think could use
   https://github.com/apache/mesos/archive/0.22.1.tar.gz  directly.
  
   On Sun, Jul 12, 2015 at 10:44 PM, haosdent haosd...@gmail.com
   javascript:; wrote:
  
Hi, Erik. I find the problem. The mesos-0.22.1.tar.gz(
http://apache.cs.utah.edu/mesos/0.22.1/mesos-0.22.1.tar.gz) you
  provide
have problems. Please download it from here(
https://github.com/apache/mesos/archive/0.22.1.tar.gz). I could
 build
  it
with correct rpath.
   
On Sun, Jul 12, 2015 at 10:23 PM, haosdent haosd...@gmail.com
   javascript:; wrote:
   
The problem seems libtool generate in build directory are different.
  The
libtool mesos-0.22.1 used would append /usr/lib64 when link
   libmesos.so. If
copy the libtool which generate under mesos master branch, would not
contains /usr/lib64 when link.
   
On Sun, Jul 12, 2015 at 8:03 PM, haosdent haosd...@gmail.com
   javascript:; wrote:
   
Hi, Erik. I use 0.22.1 to build libmesos.so, could reproduce your
problem. But use master branch, could not reproduce.
   
   
On Sun, Jul 12, 2015 at 3:09 PM, Erik Weathers 
  eweath...@groupon.com
   javascript:;
wrote:
   
FYI, here's how I'm doing the build:
https://gist.github.com/erikdw/67db1eac4fb1ede8
   
I included the RPM list on the VM.
   
- Erik
   
   
On Sat, Jul 11, 2015 at 11:35 PM, Erik Weathers 
   eweath...@groupon.com javascript:;
wrote:
   
 Thanks for bearing with me Haosong.

 No environment variable mucking with it that I can see.  Only
potentially
 relevant thing is perhaps the LD_LIBRARY_PATH set by scl:


   
  
 
 LD_LIBRARY_PATH=/opt/rh/devtoolset-2/root/usr/lib64:/opt/rh/devtoolset-2/root/usr/lib

 Regarding package versions, we are identical for those:
 automake-1.11.1-4.el6.noarch
 autoconf-2.63-5.1.el6.noarch
 libtool-2.2.6-15.5.el6.x86_64

 Are you using scl with devtool-set2?

 I feel like the original suggestion you had won't yield
 different
behavior
 than my original LDFLAGS setting.  i.e., my belief is that
-Wl,-rpath=/usr/local/lib
 is identical to -Wl,-rpath,/usr/local/lib.  I'm guessing these
  are
just
 syntax variants.

 It's worth reemphasizing that the mesos-slave, mesos-master,
 etc.
binaries
 have the RPATH set as I expect (the --prefix setting is
 sufficient
for that
 it).

 - Erik

 On Sat, Jul 11, 2015 at 11:24 PM, haosdent haosd...@gmail.com
   javascript:;
wrote:

 My autotool version:
 automake-1.11.1-4.el6.noarch
 autoconf-2.63-5.1.el6.noarch
 libtool-2.2.6-15.5.el6.x86_64

 On Sun, Jul 12, 2015 at 2:23 PM, haosdent haosd...@gmail.com
   javascript:;
wrote:

  I also use CentOS 6.5.
 
  On Sun, Jul 12, 2015 at 2:20 PM, haosdent 
 haosd...@gmail.com
   javascript:;
wrote:
 
  Do any existing environment variables affect your build?
 
  On Sun, Jul 12, 2015 at 2:16 PM, Erik Weathers eweath...@groupon.com wrote:
 
  Thanks for the response Hao.  Unfortunately that didn't work for me, the
  default /usr/lib64 is inserted anyways.
  I'm building on CentOS 6.5, using the instructions for CentOS 6.6 here:
  http://mesos.apache.org/gettingstarted/
 
  - Erik
 
  On Sat, Jul 11, 2015 at 10:08 PM, haosdent haosd...@gmail.com wrote:
 
   Hi @Erik, I think you need to change -Wl,-rpath=/usr/local/lib to
   -Wl,-rpath,/usr/local/lib. My build step:
  
   ```
   LDFLAGS=-Wl,-rpath,/usr

building mesos with libmesos.so RPATH customized

2015-07-11 Thread Erik Weathers
hi mesos dev people,

I'm hoping to enlist some help in building mesos such that the libmesos.so
has its RPATH set as our environment expects.  Specifically, in our
environment we install our own custom-built libraries under /usr/local/lib,
so I want the RPATH in the libmesos.so ELF to look like so:

   Library rpath: [/usr/local/lib:/usr/lib64]

I've tried to effect this change by running configure like so:

   LDFLAGS=-Wl,-rpath=/usr/local/lib ./configure --prefix=/usr/local

This resulted in the following RPATH being embedded in libmesos.so:

   Library rpath: [/usr/lib64:/usr/local/lib]

The RPATH *does* have /usr/local/lib, but I want that to be the 1st entry,
not the 2nd.  I'm not familiar enough with autoconf nor libtool to figure
out how to get the order reversed.  I *could* hack the embedded RPATH with
the chrpath tool, but I'd prefer changing build arguments instead.

I see in the g++ cmd that generates the .so that there are includes of
/usr/lib64 earlier than my passed LDFLAGS, so I wonder if it's a
configure.ac change I need to make to allow the LDFLAGS to be shoved in
front instead of behind the automatically generated /usr/lib64 portion.

Notably, simple use of the --prefix=/usr/local option allows the mesos-*
binaries to have the embedded RPATH as I want, I'm only struggling with the
libmesos.so RPATH.

Thanks for whatever help you might provide!

- Erik

P.S., this is for building mesos-0.22.1
P.P.S., I tried --with-rpath=/usr/local/lib, but that didn't help either.
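
For checking the entry order after each build attempt, a small helper can be handy. The following is only a sketch: it assumes the RPATH line has the `readelf -d` form quoted above, and the names `parse_rpath` and `prefer_first` are made up for this illustration. It does not change the build; it only reads the current order and computes the order this message asks for.

```python
import re

# Matches the RPATH/RUNPATH line printed by `readelf -d`, e.g.
#   0x000000000000000f (RPATH)  Library rpath: [/usr/lib64:/usr/local/lib]
_RPATH_RE = re.compile(r"Library (?:rpath|runpath): \[(.*?)\]")


def parse_rpath(readelf_output):
    """Return the RPATH entries, in order, found in `readelf -d` text."""
    match = _RPATH_RE.search(readelf_output)
    if match is None or not match.group(1):
        return []
    return match.group(1).split(":")


def prefer_first(entries, preferred):
    """Return the entries with `preferred` moved (or added) to the front."""
    rest = [e for e in entries if e != preferred]
    return [preferred] + rest


# Sample line modeled on the output quoted in this message.
sample = " 0x000000000000000f (RPATH)  Library rpath: [/usr/lib64:/usr/local/lib]"
current = parse_rpath(sample)
wanted = ":".join(prefer_first(current, "/usr/local/lib"))
print(wanted)
```

If no build-time fix turns up, the computed string is the value one would hand to `chrpath -r` as the fallback mentioned above.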


Re: Regarding old frameworks in Mesos repository

2015-06-23 Thread Erik Weathers
Please maintain the git history for the files when you move them.  They
should not all appear to have been born into the new repos...

- Erik

On Tuesday, June 23, 2015, Yan Xu y...@jxu.me wrote:

 So I'd like to resurface this topic. The last attempt
 (https://reviews.apache.org/r/33090/) to remove things under frameworks/
 was put off because scripts under ec2/ still reference these frameworks.

 However, we seem to have reached the consensus that this unmaintained code
 needs to be moved out to avoid confusion (people asking questions about or
 reporting errors in things we don't maintain anymore). This point was
 reiterated during our last community sync, and we decided to remove the
 ec2/ folder as well.

 Therefore, if there's no objection, I will delete these files and recreate
 them as individual projects under github.com/mesos. Our website will be
 updated with the links to them either deleted or replaced by similar
 external projects.

 --
 Jiang Yan Xu y...@jxu.me @xujyan http://twitter.com/xujyan

 On Fri, Apr 10, 2015 at 2:39 AM, Alexander Rojas alexan...@mesosphere.io
 wrote:

  +1 If they are not maintained, they should be somewhere else.
 
   On 06 Apr 2015, at 21:10, Yan Xu y...@jxu.me wrote:
  
   There exist a couple of frameworks in the Mesos codebase under
   /frameworks:
   deploy_jar haproxy+apache mesos-submit   torque
   (See https://github.com/apache/mesos/tree/master/frameworks)

   Does anyone still use them?
  
   These frameworks are not trivial implementations like the ones under
   src/examples to demonstrate/test Mesos features and they rely on
   external programs to run. Since we don't actively maintain them, they
   may have already stopped working with the current versions of these
   programs.
  
   We'd like to remove these from the Mesos repository. If there are folks
   who still use them and would like to contribute, the ideal place to host
   them is in their own repos. e.g., https://github.com/mesos/hadoop
  
   Any comments?
  
   --
   Jiang Yan Xu y...@jxu.me @xujyan http://twitter.com/xujyan
 
 



Re: Problems With EC2 Script

2015-05-08 Thread Erik Weathers
Can you please clarify what "the script" is that you're referring to?  :-)

- Erik

On Thu, May 7, 2015 at 8:12 PM, Colin Williams lack...@gmail.com wrote:

 I've been trying to set up a mesos cluster on AWS, and I've run into a
 number of problems with the script:

- When the script reaches the rsync, the new instances are still
  starting up, and so the ssh daemon is not yet available.
- Once I was able to get the cluster set up, the script wasn't able to
  find it to perform additional operations like stopping.
- The default instance type is m1.large, which is now considered a
  previous generation.

 I'd like to spend some time working on these problems. Do these sound like
 issues worth solving?

 Thanks,
 Colin
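
The first problem in the list above (rsync running before the ssh daemon is up) is commonly handled by polling the SSH port before copying. The sketch below is not taken from the script under discussion. The function name `wait_for_port` and its default timings are assumptions for illustration. An open TCP port is only a proxy for "sshd is ready", so a retry around the rsync itself may still be needed.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns True once a connection is accepted, or False if `timeout`
    seconds pass without one. `interval` is both the per-attempt connect
    timeout and the pause between attempts.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A launcher could call `wait_for_port(instance_address, 22)` for each new instance and only start the rsync once it returns True.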



Re: Review Request 33257: Fixed recover tasks only by the intiated containerizer.

2015-04-16 Thread Erik Weathers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/#review80308
---



src/slave/containerizer/docker.cpp
https://reviews.apache.org/r/33257/#comment130134

Cosmetic comment: `executor.id` is enclosed in single-quotes but 
`framework.id` is not.  This is in keeping with most of this file, so it *is* 
somewhat consistent.  But the `launch` member function does enclose 
`executorInfo.framework_id()` in single-quotes, so there is precedent for doing 
so within this file.


- Erik Weathers


On April 16, 2015, 7:10 a.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33257/
 ---
 
 (Updated April 16, 2015, 7:10 a.m.)
 
 
 Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie 
 Yu, and Till Toenshoff.
 
 
 Bugs: MESOS-2601
 https://issues.apache.org/jira/browse/MESOS-2601
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fixed recover tasks only by the intiated containerizer.
 Currently both the mesos and docker containerizers recover tasks that
 weren't started by themselves.
 The proposed fix is to record the intended containerizer in the checkpointed 
 executorInfo, and reuse that information on recover to know if the 
 containerizer should recover or not. We are free to modify the executorInfo 
 since it's not being used to relaunch any task.
 The external containerizer doesn't need to change since it is only recovering 
 containers that are returned by the containers script.
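
 The recovery rule described above can be pictured with a toy model. This is a sketch in Python rather than the C++ of the patch, and every name in it (`CheckpointedExecutor`, its `containerizer` field, `executors_to_recover`) is hypothetical and not taken from the Mesos source. How the real change treats executors with no recorded containerizer is not stated here; the sketch simply skips them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CheckpointedExecutor:
    """Hypothetical stand-in for a checkpointed executorInfo record."""
    executor_id: str
    # Hypothetical field: name of the containerizer that launched the
    # executor, or None if nothing was recorded.
    containerizer: Optional[str] = None


def executors_to_recover(checkpointed: List[CheckpointedExecutor],
                         own_name: str) -> List[str]:
    """Return ids of executors whose recorded containerizer is own_name."""
    return [e.executor_id for e in checkpointed
            if e.containerizer == own_name]


state = [
    CheckpointedExecutor("exec-1", "docker"),
    CheckpointedExecutor("exec-2", "mesos"),
    CheckpointedExecutor("exec-3"),
]
print(executors_to_recover(state, "docker"))
```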
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd
   src/slave/containerizer/mesos/containerizer.cpp e4136095fca55637864f495098189ab3ad8d8fe7
   src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667
   src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f
   src/tests/docker_containerizer_tests.cpp c772d4c836de18b0e87636cb42200356d24ec73d
 
 Diff: https://reviews.apache.org/r/33257/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Timothy Chen