Re: Mesos Python Daemon Launch

2017-07-20 Thread Timothy Chen
Are you using the Docker containerizer, or something else?

Tim

On Thu, Jul 20, 2017 at 10:50 PM, Chawla,Sumit  wrote:
> Any clue on this one?
>
> The python daemon is getting launched in a different session and process
> group.  Not sure why it's getting killed when the Mesos slave is terminating
> the framework.
>
> Regards
> Sumit Chawla
>
>
> On Wed, Jul 19, 2017 at 4:24 PM, Chawla,Sumit 
> wrote:
>
>> I am using Mesos 0.27.  I am launching a Python daemon from a Spark task.
>> The idea is that this daemon should keep running even when the Mesos framework
>> shuts down. However, I am facing issues keeping this Python daemon
>> process alive. The process is getting killed as soon as the Mesos framework
>> dies.
>>
>>
>>
>> Regards
>> Sumit Chawla
>>
>>
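A standard way to detach such a daemon from its parent is the classic Unix double fork, which moves the process into its own session and process group. The sketch below is illustrative only (it is not code from the thread), and it may still not be enough here: if the agent isolates tasks with cgroups, destroying the container's cgroup kills every descendant process regardless of its session, which would explain the behavior Sumit observed.

```python
import os

def daemonize():
    """Detach the calling process via the classic double fork.

    Returns False in the original (task) process and True in the
    detached daemon grandchild.
    """
    if os.fork() > 0:
        return False          # original process: carry on as the task
    os.setsid()               # new session: leave the executor's
                              # session and process group
    if os.fork() > 0:
        os._exit(0)           # intermediate child exits immediately
    os.chdir("/")             # don't pin the task's sandbox directory
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):      # drop the executor's stdio pipes
        os.dup2(devnull, fd)
    return True
```

Even with this, a containerized agent can still reap the daemon when it tears down the container, so running a long-lived service as its own Mesos task (or via the host's init system) is usually more robust than forking it from inside another task.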


Re: Welcome Gilbert Song as a new committer and PMC member!

2017-05-24 Thread Timothy Chen
Congrats! Rocking the containerizer world!

Tim

On Wed, May 24, 2017 at 11:23 AM, Zhitao Li  wrote:
> Congrats Gilbert!
>
> On Wed, May 24, 2017 at 11:08 AM, Yan Xu  wrote:
>
>> Congrats! Well deserved!
>>
>> ---
>> Jiang Yan Xu  | @xujyan 
>>
>> On Wed, May 24, 2017 at 10:54 AM, Vinod Kone  wrote:
>>
>>> Congrats Gilbert!
>>>
>>> On Wed, May 24, 2017 at 1:32 PM, Neil Conway 
>>> wrote:
>>>
>>> > Congratulations Gilbert! Well-deserved!
>>> >
>>> > Neil
>>> >
>>> > On Wed, May 24, 2017 at 10:32 AM, Jie Yu  wrote:
>>> > > Hi folks,
>>> > >
>>> > > I'm happy to announce that the PMC has voted Gilbert Song in as a new
>>> > > committer and member of the PMC for the Apache Mesos project. Please
>>> > > join me in congratulating him!
>>> > >
>>> > > Gilbert has been working on the Mesos project for 1.5 years now. His
>>> > > main contributions are his work on the unified containerizer and
>>> > > nested container (aka Pod) support. He has also helped a lot of folks
>>> > > in the community with their patches, questions, etc., and played an
>>> > > important role in organizing MesosCon Asia last year and this year!
>>> > >
>>> > > His formal committer checklist can be found here:
>>> > > https://docs.google.com/document/d/1iSiqmtdX_0CU-YgpViA6r6PU_
>>> > aMCVuxuNUZ458FR7Qw/edit?usp=sharing
>>> > >
>>> > > Welcome, Gilbert!
>>> > >
>>> > > - Jie
>>> >
>>>
>>
>>
>
>
> --
> Cheers,
>
> Zhitao Li


Re: Welcome Kevin Klues as a Mesos Committer and PMC member!

2017-03-01 Thread Timothy Chen
Congrats Kevin!

Tim

On Wed, Mar 1, 2017 at 3:20 PM, Neil Conway  wrote:
> Congratulations Kevin! Very well-deserved.
>
> Neil
>
> On Wed, Mar 1, 2017 at 2:05 PM, Benjamin Mahler  wrote:
>> Hi all,
>>
>> Please welcome Kevin Klues as the newest committer and PMC member of the
>> Apache Mesos project.
>>
>> Kevin has been an active contributor in the project for over a year, and in
>> this time he made a number of contributions to the project: Nvidia GPU
>> support [1], the containerization side of POD support (new container init
>> process), and support for "attach" and "exec" of commands within running
>> containers [2].
>>
>> Also, Kevin took on an effort with Haris Choudhary to revive the CLI [3]
>> via a better structured python implementation (to be more accessible to
>> contributors) and a more extensible architecture to better support adding
>> new or custom subcommands. The work also adds a unit test framework for the
>> CLI functionality (we had no tests previously!). I think it's great that
>> Kevin took on this much needed improvement with Haris, and I'm very much
>> looking forward to seeing this land in the project.
>>
>> Here is his committer eligibility document for perusal:
>> https://docs.google.com/document/d/1mlO1yyLCoCSd85XeDKIxTYyboK_uiOJ4Uwr6ruKTlFM/edit
>>
>> Thanks!
>> Ben
>>
>> [1] http://mesos.apache.org/documentation/latest/gpu-support/
>> [2]
>> https://docs.google.com/document/d/1nAVr0sSSpbDLrgUlAEB5hKzCl482NSVk8V0D56sFMzU
>> [3]
>> https://docs.google.com/document/d/1r6Iv4Efu8v8IBrcUTjgYkvZ32WVscgYqrD07OyIglsA/


Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Timothy Chen
Dynamic allocation works with coarse-grained mode only; we weren't aware
of a need for it in fine-grained mode after we enabled dynamic allocation
support in coarse-grained mode.

What's the reason you're running fine-grained mode instead of
coarse-grained + dynamic allocation?

Tim

On Mon, Dec 19, 2016 at 2:45 PM, Mehdi Meziane
<mehdi.mezi...@ldmobile.net> wrote:
> We will be interested by the results if you give a try to Dynamic allocation
> with mesos !
>
>
> - Mail Original -
> De: "Michael Gummelt" <mgumm...@mesosphere.io>
> À: "Sumit Chawla" <sumitkcha...@gmail.com>
> Cc: u...@mesos.apache.org, dev@mesos.apache.org, "User"
> <u...@spark.apache.org>, d...@spark.apache.org
> Envoyé: Lundi 19 Décembre 2016 22h42:55 GMT +01:00 Amsterdam / Berlin /
> Berne / Rome / Stockholm / Vienne
> Objet: Re: Mesos Spark Fine Grained Execution - CPU count
>
>
>> Is this problem of idle executors sticking around solved in Dynamic
>> Resource Allocation?  Is there some timeout after which Idle executors can
>> just shutdown and cleanup its resources.
>
> Yes, that's exactly what dynamic allocation does.  But again I have no idea
> what the state of dynamic allocation + mesos is.
>
> On Mon, Dec 19, 2016 at 1:32 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
>>
>> Great.  Makes much better sense now.  What would be the reason to set
>> spark.mesos.mesosExecutor.cores to more than 1, given that this number
>> doesn't include the number of cores for tasks?
>>
>> So in my case it seems like 30 CPUs are allocated to executors.  And there
>> are 48 tasks, so 48 + 30 = 78 CPUs.  And I am noticing this gap of 30 is
>> maintained until the last task exits.  This explains the gap.  Thanks
>> everyone.  I am still not sure how this number 30 is calculated.  (Is it
>> dynamic based on current resources, or is it some configuration?  I have 32
>> nodes in my cluster.)
>>
>> Is this problem of idle executors sticking around solved in Dynamic
>> Resource Allocation?  Is there some timeout after which Idle executors can
>> just shutdown and cleanup its resources.
>>
>>
>> Regards
>> Sumit Chawla
>>
>>
>> On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt <mgumm...@mesosphere.io>
>> wrote:
>>>
>>> >  I should presume that the no. of executors should be less than the
>>> > number of tasks.
>>>
>>> No.  Each executor runs 0 or more tasks.
>>>
>>> Each executor consumes 1 CPU, and each task running on that executor
>>> consumes another CPU.  You can customize this via
>>> spark.mesos.mesosExecutor.cores
>>> (https://github.com/apache/spark/blob/v1.6.3/docs/running-on-mesos.md) and
>>> spark.task.cpus
>>> (https://github.com/apache/spark/blob/v1.6.3/docs/configuration.md)
>>>
>>> On Mon, Dec 19, 2016 at 12:09 PM, Chawla,Sumit <sumitkcha...@gmail.com>
>>> wrote:
>>>>
>>>> Ah thanks. Looks like I skipped reading this: "Neither will executors
>>>> terminate when they’re idle."
>>>>
>>>> So in my job scenario, I should presume that the number of executors
>>>> should be less than the number of tasks; ideally one executor should
>>>> execute 1 or more tasks.  But I am observing something strange instead.
>>>> I start my job with 48 partitions for a Spark job. In the Mesos UI I see
>>>> that the number of tasks is 48, but the no. of CPUs is 78, which is way
>>>> more than 48.  Here I am assuming that 1 CPU is 1 executor.  I am not
>>>> specifying any configuration to set the number of cores per executor.
>>>>
>>>> Regards
>>>> Sumit Chawla
>>>>
>>>>
>>>> On Mon, Dec 19, 2016 at 11:35 AM, Joris Van Remoortere
>>>> <jo...@mesosphere.io> wrote:
>>>>>
>>>>> That makes sense. From the documentation it looks like the executors
>>>>> are not supposed to terminate:
>>>>>
>>>>> http://spark.apache.org/docs/latest/running-on-mesos.html#fine-grained-deprecated
>>>>>>
>>>>>> Note that while Spark tasks in fine-grained will relinquish cores as
>>>>>> they terminate, they will not relinquish memory, as the JVM does not give
>>>>>> memory back to the Operating System. Neither will executors terminate
>>>>>> when they’re idle.
>>>>>
>>>>>
>>>>> I suppose your task to executor CPU ratio is low enough that it l
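The accounting discussed in this thread can be written down directly: in fine-grained mode each executor holds `spark.mesos.mesosExecutor.cores` (default 1) for its lifetime, and each running task adds `spark.task.cpus` (default 1) on top. A small sketch of the arithmetic; treating the thread's executor count (~30) as exact is an assumption on my part:

```python
def fine_grained_cpus(num_executors, running_tasks,
                      mesos_executor_cores=1.0, task_cpus=1.0):
    """CPUs held by a fine-grained Spark-on-Mesos job at one moment:
    a fixed per-executor share plus a per-running-task share."""
    return (num_executors * mesos_executor_cores
            + running_tasks * task_cpus)

# Peak of the job in the thread: ~30 executors alive, 48 tasks running.
print(fine_grained_cpus(30, 48))  # -> 78.0
```

Because the executor share lingers after its tasks finish, the job's CPU count stays well above the number of active tasks until the executors themselves exit.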

Re: Mesos Spark Fine Grained Execution - CPU count

2016-12-19 Thread Timothy Chen
Hi Chawla,

One possible reason is that Mesos fine-grained mode also takes up a core
per host to run the executor, so if you have 20 agents running fine-grained
executors, they will take up 20 cores while they are still running.

Tim

On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit  wrote:
> Hi
>
> I am using Spark 1.6. I have one query about the fine-grained model in Spark.
> I have a simple Spark application which transforms A -> B.  It's a single
> stage application.  To begin, the program starts with 48 partitions.
> When the program starts running, the Mesos UI shows 48 tasks and 48 CPUs
> allocated to the job.  Now as the tasks get done, the number of active tasks
> starts decreasing.  However, the number of CPUs does not decrease
> proportionally.  When the job was about to finish, there was a single
> remaining task, yet the CPU count was still 20.
>
> My question is: why is there no one-to-one mapping between tasks and CPUs
> in fine-grained mode?  How can these CPUs be released when the job is done,
> so that other jobs can start?
>
>
> Regards
> Sumit Chawla


Re: Vote on #MesosCon topics, deadline Friday, July 1

2016-06-28 Thread Timothy Chen
Hi Kiersten,

Looks like there are some duplicates in the survey?

Tim

On Tue, Jun 28, 2016 at 9:29 AM, Kiersten Gaffney
 wrote:
> Please take 15 minutes over the next few days and review what members of
> the community have submitted! If you see a topic missing, please let us
> know via the text box at the end of the survey.
>
> Voting form closes Friday, July 1, 11:59 PST
>
> A total of 92 proposals were submitted in time for #MesosCon EU review. The
> MesosCon program committee is opening these proposals up for community
> review/feedback to better inform our decisions about what should be
> included in the program. Once the program is outlined, we will look at the
> previously submitted MesosCon US proposals to fill in topic gaps.
>
>
> Survey is here: https://www.surveymonkey.com/r/MesosConEU2016
>
>
> Thank you in advance for your participation!
>
> Kiersten, David and Chris
>
> Co-Chair, MesosCon Program Committee


Re: [Proposal] Use dev mailing list for working groups

2016-03-25 Thread Timothy Chen
+1

Tim

On Thu, Mar 24, 2016 at 8:17 PM, Chris Lambert  wrote:
> Another +1.  These WGs are great, and more visibility (with the ability to
> filter) will be awesome.
>
>
> On Thu, Mar 24, 2016 at 8:13 PM, Klaus Ma  wrote:
>
>> +1, that's helpful for filtering features/questions out :).
>>
>> 
>> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
>> Platform OpenSource Technology, STG, IBM GCG
>> +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
>>
>> On Fri, Mar 25, 2016 at 11:04 AM, Du, Fan  wrote:
>>
>> > +1
>> >
>> > This will definitely make it easier for new developers to get to know
>> > where each component is heading, see which component interests them
>> > most, and then contribute.
>> >
>> > Thanks for the proposal!
>> >
>> >
>> > On 2016/3/25 6:55, Jie Yu wrote:
>> >
>> >> Hi,
>> >>
>> >> This came up during today's community sync.
>> >>
>> >> Mesos currently has a few working groups for various features:
>> >>
>> >>
>> https://cwiki.apache.org/confluence/display/MESOS/Apache+Mesos+Working+Groups
>> >>
>> >> Some of those working groups are using separate mailing lists. That
>> >> limits the visibility of some discussions. Also, some people in the
>> >> community are not aware of those mailing lists (and the wiki page).
>> >>
>> >> Therefore, I am proposing that we consolidate all working groups mailing
>> >> lists to the dev mailing list. To distinguish discussions from different
>> >> working groups, please use a special subject format. For instance, if
>> >> you want to send an email to the "Mesos GPU" working group, please use
>> >> the subject:
>> >>
>> >> "[Mesos GPU WG] YOUR SUBJECT HERE"
>> >>
>> >> Let me know if you have any comments/thoughts on this!
>> >>
>> >> - Jie
>> >>
>> >>
>>
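The proposed subject convention makes mechanical filtering straightforward. As a purely hypothetical illustration (the regex and function below are not part of the proposal), a mail filter could route working-group threads like this:

```python
import re

# Matches subjects of the proposed form "[Mesos <name> WG] ...",
# tolerating reply prefixes such as "Re:".
WG_SUBJECT = re.compile(r"^(?:Re:\s*)*\[Mesos (?P<wg>.+?) WG\]\s*(?P<rest>.*)$")

def working_group(subject):
    """Return the working-group name from a subject line, or None."""
    m = WG_SUBJECT.match(subject)
    return m.group("wg") if m else None

print(working_group("[Mesos GPU WG] Agenda for Thursday"))        # -> GPU
print(working_group("Re: [Mesos Containerizer WG] image cache"))  # -> Containerizer
print(working_group("Re: JIRA Shepherds"))                        # -> None
```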


Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-09 Thread Timothy Chen
I'd also like to include MESOS-4370, as it fixes the IP address lookup
logic and unblocks users using custom Docker networks.

Tim

On Wed, Mar 9, 2016 at 9:55 AM, Gilbert Song  wrote:
> Hi Kevin,
>
> Please remove the patch below from the list:
> Implemented runtime isolator default cmd test (still under review).
> https://reviews.apache.org/r/44469/
>
> Because the bug was fixed by patch #44468, the test should not be
> considered a blocker. I am updating MESOS-4888 and moving the test to a
> separate JIRA.
>
> Thanks,
> Gilbert
>
> On Tue, Mar 8, 2016 at 2:43 PM, Kevin Klues  wrote:
>
>> Here is the list of reviews/patches that have been called out in this
>> thread for inclusion in 0.28.0-rc2.  Some of them are still under
>> review and will need to land by Thursday to be included.
>>
>> Are there others?
>>
>> Jie's container image documentation (submitted):
>> commit 7de8cdd4d8ed1d222fa03ea0d8fa6740c4a9f84b
>> https://reviews.apache.org/r/44414
>>
>> Restore Mesos' ability to extract Docker assigned IPs (still under review):
>> https://reviews.apache.org/r/43093/
>>
>> Fixed the logic for default docker cmd case (submitted).
>> commit e42f740ccb655c0478a3002c0b6fa90c1144f41c
>> https://reviews.apache.org/r/44468/
>>
>> Implemented runtime isolator default cmd test (still under review).
>> https://reviews.apache.org/r/44469/
>>
>> Fixed a bug that causes the task stuck in staging state (still under
>> review).
>> https://reviews.apache.org/r/44435/
>>
>> On Tue, Mar 8, 2016 at 10:30 AM, Kevin Klues  wrote:
>> > Yes, will do.
>> >
>> > On Tue, Mar 8, 2016 at 10:26 AM, Vinod Kone 
>> wrote:
>> >> +kevin klues
>> >>
>> >> OK. I'm cancelling this vote since there are some show-stopper issues
>> >> that we need to cherry-pick fixes for. I'll cut another RC on Thursday.
>> >>
>> >> @shepherds: can you please make sure the blocker tickets are marked with
>> >> fix version and that they land today or tomorrow?
>> >>
>> >> @kevin: since you have volunteered to help with the release, can you
>> >> make sure we have a list of commits to cherry-pick for rc2?
>> >>
>> >> Thanks,
>> >>
>> >>
>> >> On Tue, Mar 8, 2016 at 12:05 AM, Shuai Lin 
>> wrote:
>> >>
>> >>> Maybe also https://issues.apache.org/jira/browse/MESOS-4877 and
>> >>> https://issues.apache.org/jira/browse/MESOS-4878 ?
>> >>>
>> >>>
>> >>> On Tue, Mar 8, 2016 at 9:13 AM, Jie Yu  wrote:
>> >>>
>>  I'd like to fix https://issues.apache.org/jira/browse/MESOS-4888 as
>> well
>>  if you guys plan to cut another RC
>> 
>>  On Mon, Mar 7, 2016 at 10:16 AM, Daniel Osborne <
>>  daniel.osbo...@metaswitch.com> wrote:
>> 
>> > -1
>> >
>> > If it doesn’t cause too much pain, I'm hoping we can squeeze in a
>> > relatively small patch which restores Mesos' ability to extract
>> > Docker-assigned IPs. This has been broken since Docker 1.10's release
>> > over a month ago, and prevents service discovery and DNS from working.
>> >
>> > Mesos-4370: https://issues.apache.org/jira/browse/MESOS-4370
>> > RB# 43093: https://reviews.apache.org/r/43093/
>> >
>> > I've built 0.28.0-rc1 with this patch and can confirm that it fixes it
>> > as expected.
>> >
>> > Apologies for not bringing this to attention earlier.
>> >
>> > Thanks all,
>> > Dan
>> >
>> > -Original Message-
>> > From: Vinod Kone [mailto:vinodk...@apache.org]
>> > Sent: Thursday, March 3, 2016 5:44 PM
>> > To: dev ; user 
>> > Subject: [VOTE] Release Apache Mesos 0.28.0 (rc1)
>> >
>> > Hi all,
>> >
>> >
>> > Please vote on releasing the following candidate as Apache Mesos
>> 0.28.0.
>> >
>> >
>> > 0.28.0 includes the following:
>> >
>> >   * [MESOS-4343] - A new cgroups isolator for enabling the net_cls
>> >     subsystem in Linux. The cgroups/net_cls isolator allows operators
>> >     to provide network performance isolation and network segmentation
>> >     for containers within a Mesos cluster. To enable the
>> >     cgroups/net_cls isolator, append `cgroups/net_cls` to the
>> >     `--isolation` flag when starting the slave. Please refer to
>> >     docs/mesos-containerizer.md for more details.
>> >
>> >   * [MESOS-4687] - The implementation of scalar resource values (e.g.,
>> >     "2.5 CPUs") has changed. Mesos now reliably supports resources with
>> >     up to three decimal digits of precision (e.g., "2.501 CPUs");
>> >     resources with more than

Re: 0.28.0 release

2016-03-03 Thread Timothy Chen
Sorry, I pushed a quick typo fix before seeing this email.

Tim

On Thu, Mar 3, 2016 at 4:15 PM, Vinod Kone  wrote:
> Alright, all the blockers are resolved. I'll be cutting the RC shortly.
>
> I'm also taking a soft lock on the 'master' branch. *Committers:* *Please
> do not push any commits upstream until I release the lock.*
>
> Thanks,
>
> On Mon, Feb 29, 2016 at 1:36 PM, Vinod Kone  wrote:
>
>> Hi folks,
>>
>> I'm volunteering to be the Release Manager for 0.28.0. Joris and Kevin
>> Klues have kindly agreed to help me out. The plan is cut an RC tomorrow
>> 03/01.
>>
>> The dashboard for the release is here:
>> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12327751
>>
>> *If you have a ticket marked with "Fix Version 0.28.0" and it is not in
>> the "Resolved" state, verify whether it's a blocker for 0.28.0. If not,
>> please unset the Fix Version.*
>>
>>
>> Thanks,
>> Vinod
>>
>>


Re: JIRA Shepherds

2016-01-27 Thread Timothy Chen
Yes, the intention is to make the seasoned contributors (really all
contributors!) committers, and we are definitely trying to make that
happen through active mentoring and sustained contribution over time.

I also encourage anyone interested in becoming a Mesos committer to
look at Bernd's committer checklist:

https://mail-archives.apache.org/mod_mbox/mesos-dev/201505.mbox/%3ccaakwvazmvr5nxz5ios3uohcx41yv9n73jh_gcrmnnaiekh8...@mail.gmail.com%3E

Please also reach out on the dev list or to us on IRC if you have more
questions or would like to understand what concrete steps are required.

Tim


On Wed, Jan 27, 2016 at 11:16 AM, Vaibhav Khanduja
 wrote:
> +1 on this. It would be great to discuss this in the weekly call.
>
> On Wed, Jan 27, 2016 at 4:17 AM, Christopher Hicks  wrote:
>
>> Would it be easier to invite some of those seasoned contributors to be
>> committers rather than creating a new tier of contributors?  Creating
>> additional organization complexity seems unnecessary and potentially
>> distracting unless there is some reason not to increase the core committer
>> team count.
>>
>> On Wed, Jan 27, 2016 at 2:56 AM, Alexander Rojas 
>> wrote:
>>
>> > My grain of sand here: it is true that committers are a scarce resource
>> > and it might be hard to get shepherds nowadays. However, we do have a
>> > bunch of seasoned contributors who, while not being committers, are
>> > active and know some of the innards of Mesos very well. How about having
>> > these folks as shepherds?
>> >
>> > In the end a committer may be required to sign off on a project, but all
>> > the work of communicating with the contributor and coming up with a
>> > design could be lifted off of committers.
>> >
>> > What do you all think about the idea?
>> >
>> >
>> > > On 27 Jan 2016, at 00:29, Vaibhav Khanduja 
>> > wrote:
>> > >
>> > > The community is growing, with more individuals getting interested in
>> > > contributing to the project. This definitely brings an extra bit of
>> > > workload for committers ("shepherds"), but at the same time more
>> > > developers eventually leads to more adoption across organizations and
>> > > enterprises.
>> > >
>> > > I am not sure if it is easy to find an immediate solution, but I would
>> > > really like some sort of resolution on this. If a shepherd is busy,
>> > > what else can be done for a low-priority but genuine issue?
>> > >
>> > > On Sun, Jan 24, 2016 at 4:12 PM, Joris Van Remoortere <
>> > > joris.van.remoort...@gmail.com> wrote:
>> > >
>> > >> Hello Mesos developers,
>> > >>
>> > >> You may have noticed some churn in Jira recently around the shepherd
>> > >> assignment. Specifically, we have unassigned the shepherds for a bunch
>> > of
>> > >> projects. We did this in order to get a better sense of which projects
>> > are
>> > >> being actively shepherded versus having gone stale, and to identify
>> for
>> > >> which projects we need to find a new shepherd who has sufficient time
>> to
>> > >> dedicate to it.
>> > >>
>> > >> This is not a statement that the un-assigned tickets are not
>> important,
>> > >> rather, we want to ensure that the people working on them have a
>> > shepherd
>> > >> with sufficient resources.
>> > >>
>> > >> We ask that you communicate (and agree!) with your shepherd before
>> > >> assigning them in Jira, so that they are not surprised when your
>> > >> reviews start getting posted.
>> > >>
>> > >> The benefit for the developer community should be that it will be more
>> > >> clear when working on a ticket whether there are sufficient resources
>> in
>> > >> the community to iterate on it in a timely manner.
>> > >>
>> > >> Joris
>> > >>
>> >
>> >
>>
>>
>> --
>> Christopher Hicks
>> Uber SRE
>> +1.757.598.2032
>>


[VOTE] Release Apache Mesos 0.27.0 (rc1)

2016-01-26 Thread Timothy Chen
Hi all,

Please vote on releasing the following candidate as Apache Mesos 0.27.0.

0.27.0 includes the following:

We added major features such as Implicit Roles, Quota, Multiple Disks and more.

We also added major bug fixes such as performance improvements to
state.json requests and GLOG.

The CHANGELOG for the release is available at:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.27.0-rc1


The candidate for Mesos 0.27.0 release is available at:
https://dist.apache.org/repos/dist/dev/mesos/0.27.0-rc1/mesos-0.27.0.tar.gz

The tag to be voted on is 0.27.0-rc1:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.27.0-rc1

The MD5 checksum of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/mesos/0.27.0-rc1/mesos-0.27.0.tar.gz.md5

The signature of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/mesos/0.27.0-rc1/mesos-0.27.0.tar.gz.asc

The PGP key used to sign the release is here:
https://dist.apache.org/repos/dist/release/mesos/KEYS

The JAR is up in Maven in a staging repository here:
https://repository.apache.org/content/repositories/orgapachemesos-1097
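To verify the tarball before voting, compare its MD5 against the published `.md5` file (and check the signature with `gpg --verify` after importing the KEYS file above). The sketch below shows only the checksum step; the filenames mirror the links in this email, and a real check would of course download both files first:

```python
import hashlib

def md5_hex(path, chunk=1 << 20):
    """Compute a file's MD5 digest the same way `md5sum` does,
    reading in chunks so large tarballs don't need to fit in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

# Usage against the downloaded release artifacts:
#   expected = open("mesos-0.27.0.tar.gz.md5").read().split()[0]
#   assert md5_hex("mesos-0.27.0.tar.gz") == expected
```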

Please vote on releasing this package as Apache Mesos 0.27.0!

The vote is open until Fri Jan 29 23:59:59 PST 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Mesos 0.27.0
[ ] -1 Do not release this package because ...

Thanks,

Tim, MPark and Kapil


Re: Mesos 0.27.0 release update

2016-01-25 Thread Timothy Chen
We're cutting a release from the latest master as soon as the
remaining two blockers are resolved; the current estimate is
tomorrow.

Please go ahead and try out the latest master if you can.

Thanks!

Tim

On Mon, Jan 25, 2016 at 11:53 AM, Sarjeet Singh
<sarjeetsi...@maprtech.com> wrote:
> Thanks Bernd for confirming. This should fix this issue in Mesos 0.27.
>
> Any update on when this is planned to be released?
>
> -Sarjeet
>
> On Mon, Jan 25, 2016 at 10:40 AM, Bernd Mathiske <be...@mesosphere.io>
> wrote:
>
>> Hi Sarjeet,
>>
>> I hope we fixed this in upcoming 0.27:
>>
>> https://issues.apache.org/jira/browse/MESOS-4304
>>
>> Bernd
>>
>> On Jan 25, 2016, at 7:30 PM, Sarjeet Singh <sarjeetsi...@maprtech.com>
>> wrote:
>>
>> I ran into an issue when tried Mesos 0.26 version and started a marathon
>> app using URI (with maprfs path) on a mesos cluster.
>>
>> The issue seems to be caused by Mesos-3602 fix, and is causing issue for
>> maprfs (mapr filesystem) when specified maprfs path as URI on marathon.
>>
>> The issue is that it prepends '/' to the specified maprfs URI path,
>> which then does not execute as expected, e.g.
>>
>> =
>> *  hadoop fs -copyToLocal '/
>> maprfs:///dist/hadoop-2.7.0.myriad1.tar.gz'
>>
>> '/opt/mapr/slaves/67d1f64c-449b-4609-82f3-5da309f3c5c5-S9/frameworks/67d1f64c-449b-4609-82f3-5da309f3c5c5-/executors/myriad1.63bbb98c-c072-11e5-b686-0cc47a587d20/runs/427fe309-82c5-4f8b-9fa3-6dd39a4a5ef4/hadoop-2.7.0.myriad1.tar.gz*
>>
>>
>> -copyToLocal: java.net.URISyntaxException: Expected scheme-specific part at
>> index 7: maprfs:
>> =
>>
>> The fix for MESOS-3602 assumes only hdfs paths, and doesn't consider other
>> cases, such as maprfs or other DFS paths. I haven't filed a JIRA on
>> this issue yet, but I would like to get some feedback on it, and expect it
>> to be fixed in the next Mesos release.
>>
>> Let me know if there is anything else I could provide related to the issue.
>>
>> -Sarjeet
>>
>> On Sat, Jan 23, 2016 at 12:41 AM, Timothy Chen <tnac...@gmail.com> wrote:
>>
>> Hi all,
>>
>> (Kapil, MPark and I) We're still having 3 blocker issues outstanding
>> at this moment:
>>
>> MESOS-4449: SegFault on agent during executor startup (shepherd: Joris)
>> MESOS-4441: Do not allocate non-revocable resources beyond quota
>> guarantee. (shepherd: Joris)
>> MESOS-4410: Introduce protobuf for quota set request. (shepherd: Joris)
>>
>> The remaining major tickets are ContainerLogger related and should be
>> committed today according to Ben.
>>
>> We've started to test latest master and will be looking at the test
>> failures to see what needs to be addressed.
>>
>> I encourage everyone to test the latest master on your platform if
>> possible to catch issues early, and once the Blocker issues are
>> resolved we'll be sending a RC to test and vote.
>>
>> Thanks,
>>
>> Tim
>>
>>
>>
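The maprfs breakage Sarjeet describes comes from unconditionally treating the fetch URI as a local path and prefixing `/`. A scheme-aware check avoids it. This is an illustrative sketch of the idea, not the actual Mesos fetcher code:

```python
from urllib.parse import urlparse

def normalize_fetch_uri(uri):
    """Only treat a value as a local filesystem path (and make it
    absolute) when it carries no URI scheme; maprfs://, hdfs://, etc.
    must be passed through to the hadoop client untouched."""
    if urlparse(uri).scheme:
        return uri
    return uri if uri.startswith("/") else "/" + uri

print(normalize_fetch_uri("maprfs:///dist/hadoop-2.7.0.myriad1.tar.gz"))
# -> maprfs:///dist/hadoop-2.7.0.myriad1.tar.gz  (unchanged)
print(normalize_fetch_uri("tmp/app.tar.gz"))
# -> /tmp/app.tar.gz
```

Prefixing `/` onto `maprfs:///...` yields `/maprfs:///...`, which is exactly the malformed argument seen in the `hadoop fs -copyToLocal` error above.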


Mesos 0.27.0 release update

2016-01-22 Thread Timothy Chen
Hi all,

(Kapil, MPark and I) We're still having 3 blocker issues outstanding
at this moment:

MESOS-4449: SegFault on agent during executor startup (shepherd: Joris)
MESOS-4441: Do not allocate non-revocable resources beyond quota
guarantee. (shepherd: Joris)
MESOS-4410: Introduce protobuf for quota set request. (shepherd: Joris)

The remaining major tickets are ContainerLogger related and should be
committed today according to Ben.

We've started to test latest master and will be looking at the test
failures to see what needs to be addressed.

I encourage everyone to test the latest master on your platform if
possible to catch issues early, and once the Blocker issues are
resolved we'll be sending a RC to test and vote.

Thanks,

Tim


Re: Shepherd for MESOS-4369 (Enhance DockerExecuter to support Docker's user-defined networks)

2016-01-19 Thread Timothy Chen
I'll shepherd this.

Tim

On Tue, Jan 19, 2016 at 10:00 AM, Ezra Silvera  wrote:
> Hi all,
>
> Is anyone willing to shepherd
> https://issues.apache.org/jira/browse/MESOS-4369 ?
>
> Thanks
>
> Ezra Silvera
>
>
>


Re: find a shepherd

2016-01-14 Thread Timothy Chen
Hi Jian, I can help shepherd these.

Tim

On Thu, Jan 14, 2016 at 6:53 PM, Jian Qiu  wrote:
> Hi,
>
> Anyone could help shepherding on these tickets? Thanks.
>
> https://issues.apache.org/jira/browse/MESOS-4174
> https://issues.apache.org/jira/browse/MESOS-4161
> https://issues.apache.org/jira/browse/MESOS-4158
>
> Regards
> Jian Qiu


Mesos 0.27.0

2016-01-13 Thread Timothy Chen
Hi all,

As we continue with the monthly release cadence, we should be cutting
a new release a month from the last one (12/16/15), which the next
version will be 0.27.0.

MPark, Kapil, and I are volunteering to be the release managers for
0.27.0. Let us know if there are any questions; we definitely want and
welcome the community's help.

MPark has created a dashboard for 0.27.0 release:
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12327632

We plan to try to cut a preview next monday (01/18) and a RC on
wednesday (01/20).

Please start updating your resolved JIRA tickets to add the target
version 0.27.0, or remove it from open tickets that are not going to
land before the weekend; those will just roll into next month's
release.

Thanks!


Re: Mesos 0.27.0

2016-01-13 Thread Timothy Chen
Thanks for the clarification!

Tim

On Wed, Jan 13, 2016 at 2:14 PM, Adam Bordelon <a...@mesosphere.io> wrote:
> Tim/MPark, Note that JIRAs must have their "Fixed Version" set to 0.27.0 to
> show up in the generated "Release Notes".
> We use "Target Version" to track blockers for the release (and things that
> we really really hope will make it in).
>
> On Wed, Jan 13, 2016 at 2:08 PM, Joris Van Remoortere <
> joris.van.remoort...@gmail.com> wrote:
>
>> +1
>>
>> On Wed, Jan 13, 2016 at 1:27 PM, Timothy Chen <tnac...@gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > As we continue with the monthly release cadence, we should be cutting
>> > a new release a month from the last one (12/16/15), which the next
>> > version will be 0.27.0.
>> >
>> > MPark, Kapil, and I are volunteering to be the release managers for
>> > 0.27.0. Let us know if there are any questions; we definitely want and
>> > welcome the community's help.
>> >
>> > MPark has created a dashboard for 0.27.0 release:
>> >
>> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12327632
>> >
>> > We plan to try to cut a preview next monday (01/18) and a RC on
>> > wednesday (01/20).
>> >
>> > Please start updating your resolved JIRA tickets to add the target
>> > version 0.27.0, or remove it from open tickets that are not going to
>> > land before the weekend; those will just roll into next month's
>> > release.
>> >
>> > Thanks!
>> >
>>


Re: Request Mesos contributor role

2016-01-11 Thread Timothy Chen
Have you created a jira user already? I searched for Du Fan and
couldn't find anything.

Let me know what's your jira username.

Tim

On Mon, Jan 11, 2016 at 10:22 PM, Du, Fan  wrote:
> hi Mesos committer
>
> I want to assign myself to a JIRA issue I'm about to create, as described
> at http://mesos.apache.org/documentation/latest/submitting-a-patch/
> Could you please add me to the Mesos contributor list?
>
> thanks a lot.


Re: Shepherd for MESOS-4279 (Graceful restart of docker task)

2016-01-08 Thread Timothy Chen
I'll shepherd this, can you add me to the jira?

Thanks,

Tim

> On Jan 8, 2016, at 7:04 AM, Qian Zhang  wrote:
> 
> Hi,
> 
> Can anyone shepherd https://issues.apache.org/jira/browse/MESOS-4279? I
> have posted some findings there, we can do further discussion in the ticket.
> 
> 
> Thanks,
> Qian Zhang


Re: mesos git commit: Fixed posix filesystem isolator to not allow executors with image.

2016-01-06 Thread Timothy Chen
Hi James,

There isn't any backward compatibility concern, since we never really did
anything with volumes under the posix filesystem isolator; now we're just
making sure we don't allow them, since they can cause problems, especially
volumes that have images.
Tim

On Wed, Jan 6, 2016 at 7:26 PM, James Peach <jor...@gmail.com> wrote:
> Hi Tim,
>
> What are the backwards compatibility implications of this?
>
>> On Jan 6, 2016, at 6:50 PM, tnac...@apache.org wrote:
>>
>> Repository: mesos
>> Updated Branches:
>>  refs/heads/master c258d8af7 -> 52abf8de3
>>
>>
>> Fixed posix filesystem isolator to not allow executors with image.
>>
>> Review: https://reviews.apache.org/r/41909/
>>
>>
>> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
>> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/52abf8de
>> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/52abf8de
>> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/52abf8de
>>
>> Branch: refs/heads/master
>> Commit: 52abf8de380cf7a3c3d8a2e5616b3d34d7b6b277
>> Parents: c258d8a
>> Author: Timothy Chen <tnac...@apache.org>
>> Authored: Tue Jan 5 17:29:57 2016 -0800
>> Committer: Timothy Chen <tnac...@apache.org>
>> Committed: Wed Jan 6 18:01:32 2016 -0800
>>
>> --
>> .../containerizer/mesos/isolators/filesystem/posix.cpp  | 9 +
>> 1 file changed, 5 insertions(+), 4 deletions(-)
>> --
>>
>>
>> http://git-wip-us.apache.org/repos/asf/mesos/blob/52abf8de/src/slave/containerizer/mesos/isolators/filesystem/posix.cpp
>> --
>> diff --git a/src/slave/containerizer/mesos/isolators/filesystem/posix.cpp 
>> b/src/slave/containerizer/mesos/isolators/filesystem/posix.cpp
>> index 00ff84b..4d6100e 100644
>> --- a/src/slave/containerizer/mesos/isolators/filesystem/posix.cpp
>> +++ b/src/slave/containerizer/mesos/isolators/filesystem/posix.cpp
>> @@ -78,17 +78,18 @@ Future<Option> 
>> PosixFilesystemIsolatorProcess::prepare(
>> return Failure("Container has already been prepared");
>>   }
>>
>> -  // Return failure if the container change the filesystem root
>> -  // because the symlinks will become invalid in the new root.
>>   if (executorInfo.has_container()) {
>> CHECK_EQ(executorInfo.container().type(), ContainerInfo::MESOS);
>>
>> +// Return failure if the container change the filesystem root
>> +// because the symlinks will become invalid in the new root.
>> if (executorInfo.container().mesos().has_image()) {
>>   return Failure("Container root filesystems not supported");
>> }
>>
>> -// TODO(jieyu): Also return a failure if there exists images in
>> -// the specified volumes.
>> +if (executorInfo.container().volumes().size() > 0) {
>> +  return Failure("Volumes in ContainerInfo is not supported");
>> +}
>>   }
>>
>>   infos.put(containerId, Owned<Info>(new Info(directory)));
>>
>


Re: Mesos Flocker - Custom Isolator and Docker

2015-11-25 Thread Timothy Chen
Hi Tommy,

We didn't modify MesosContainerizer container creation but only just
added image support and the ability to inherit runtime configuration
from images.

We're considering different options like runc for OCI but still just
investigating, if you have any more thoughts let us know!

Tim

On Wed, Nov 25, 2015 at 8:05 AM, tommy xiao <xia...@gmail.com> wrote:
> Tim,
>
> interesting on the MesosContainerizer's design, what technology to support
> it? runC solution?
>
> 2015-11-25 19:09 GMT+08:00 Timothy Chen <tnac...@gmail.com>:
>
>> Hi there,
>>
>> DockerContainerizer doesn't support isolators simply because we
>> delegate all container creation to Docker via docker client, and
>> therefore we don't have a way to run our isolators that are in
>> mesos-slave from Docker.
>>
>> We're working on unified containerizer that brings Docker images to
>> Mesos Containerizer, so you can launch containers with docker images
>> with it. There is a working initial code that's already in master that
>> you can try out, and once you have the isolator you can then launch
>> containers with docker images with your new isolator with
>> MesosContainerizer.
>>
>> Tim
>>
>> On Wed, Nov 25, 2015 at 2:48 AM, Frank Scholten <fr...@frankscholten.nl>
>> wrote:
>> > Hi all,
>> >
>> > We (Frank Scholten and Phil Winder) are currently developing the Mesos
>> > Flocker framework. As part of this framework we want to develop a
>> > custom isolator module which interacts with the Flocker Control
>> > Service.
>> >
>> > After creating a simple stub isolator and installing it on one of the
>> > agents we noticed it gets picked up when we run a regular task but it does
>> > not get picked up when we run a Docker container.
>> >
>> > To us it seems that this is because isolators can only be used by the
>> > MesosContainerizer. The MesosContainerizer configures isolation in its
>> > create factory method while the DockerContainerizer does not:
>> >
>> > MesosContainerizer
>> >
>> > Try<MesosContainerizer*> MesosContainerizer::create(
>> > const Flags& flags,
>> > bool local,
>> > Fetcher* fetcher)
>> > {
>> >   string isolation;
>> >
>> >   if (flags.isolation == "process") {
>> > LOG(WARNING) << "The 'process' isolation flag is deprecated, "
>> >  << "please update your flags to"
>> >  << " '--isolation=posix/cpu,posix/mem'.";
>> >
>> > isolation = "posix/cpu,posix/mem";
>> >   } else if (flags.isolation == "cgroups") {
>> > LOG(WARNING) << "The 'cgroups' isolation flag is deprecated, "
>> >  << "please update your flags to"
>> >  << " '--isolation=cgroups/cpu,cgroups/mem'.";
>> >
>> > isolation = "cgroups/cpu,cgroups/mem";
>> >   } else {
>> > isolation = flags.isolation;
>> >   }
>> >
>> > DockerContainerizer
>> >
>> > Try<DockerContainerizer*> DockerContainerizer::create(
>> > const Flags& flags,
>> > Fetcher* fetcher)
>> > {
>> >   Try<Docker*> create = Docker::create(flags.docker,
>> flags.docker_socket, true);
>> >   if (create.isError()) {
>> > return Error("Failed to create docker: " + create.error());
>> >   }
>> >
>> >   Shared<Docker> docker(create.get());
>> >
>> >   if (flags.docker_mesos_image.isSome()) {
>> > Try<Nothing> validateResult = docker->validateVersion(Version(1, 5,
>> 0));
>> > if (validateResult.isError()) {
>> >   string message = "Docker with mesos images requires docker 1.5+";
>> >   message += validateResult.error();
>> >   return Error(message);
>> > }
>> >   }
>> >
>> >   return new DockerContainerizer(flags, fetcher, docker);
>> > }
>> >
>> > The question now is how can we create an isolator to work together
>> > with the DockerContainerizer. Should we subclass the
>> > DockerContainerizer and create a FlockerContainerizer instead? Any
>> > other suggestions?
>> >
>> > Thanks in advance.
>> >
>> > Cheers,
>> >
>> > Frank and Phil
>>
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com


Re: Mesos Flocker - Custom Isolator and Docker

2015-11-25 Thread Timothy Chen
Hi there,

DockerContainerizer doesn't support isolators simply because we
delegate all container creation to Docker via docker client, and
therefore we don't have a way to run our isolators that are in
mesos-slave from Docker.

We're working on unified containerizer that brings Docker images to
Mesos Containerizer, so you can launch containers with docker images
with it. There is a working initial code that's already in master that
you can try out, and once you have the isolator you can then launch
containers with docker images with your new isolator with
MesosContainerizer.

Tim

On Wed, Nov 25, 2015 at 2:48 AM, Frank Scholten  wrote:
> Hi all,
>
> We (Frank Scholten and Phil Winder) are currently developing the Mesos
> Flocker framework. As part of this framework we want to develop a
> custom isolator module which interacts with the Flocker Control
> Service.
>
> After creating a simple stub isolator and installing it on one of the
> agents we noticed it gets picked up when we run a regular task but it does
> not get picked up when we run a Docker container.
>
> To us it seems that this is because isolators can only be used by the
> MesosContainerizer. The MesosContainerizer configures isolation in its
> create factory method while the DockerContainerizer does not:
>
> MesosContainerizer
>
> Try<MesosContainerizer*> MesosContainerizer::create(
> const Flags& flags,
> bool local,
> Fetcher* fetcher)
> {
>   string isolation;
>
>   if (flags.isolation == "process") {
> LOG(WARNING) << "The 'process' isolation flag is deprecated, "
>  << "please update your flags to"
>  << " '--isolation=posix/cpu,posix/mem'.";
>
> isolation = "posix/cpu,posix/mem";
>   } else if (flags.isolation == "cgroups") {
> LOG(WARNING) << "The 'cgroups' isolation flag is deprecated, "
>  << "please update your flags to"
>  << " '--isolation=cgroups/cpu,cgroups/mem'.";
>
> isolation = "cgroups/cpu,cgroups/mem";
>   } else {
> isolation = flags.isolation;
>   }
>
> DockerContainerizer
>
> Try<DockerContainerizer*> DockerContainerizer::create(
> const Flags& flags,
> Fetcher* fetcher)
> {
>   Try<Docker*> create = Docker::create(flags.docker, flags.docker_socket,
> true);
>   if (create.isError()) {
> return Error("Failed to create docker: " + create.error());
>   }
>
>   Shared<Docker> docker(create.get());
>
>   if (flags.docker_mesos_image.isSome()) {
> Try<Nothing> validateResult = docker->validateVersion(Version(1, 5, 0));
> if (validateResult.isError()) {
>   string message = "Docker with mesos images requires docker 1.5+";
>   message += validateResult.error();
>   return Error(message);
> }
>   }
>
>   return new DockerContainerizer(flags, fetcher, docker);
> }
>
> The question now is how can we create an isolator to work together
> with the DockerContainerizer. Should we subclass the
> DockerContainerizer and create a FlockerContainerizer instead? Any
> other suggestions?
>
> Thanks in advance.
>
> Cheers,
>
> Frank and Phil


Re: Mesos .26 failing on centos7

2015-11-09 Thread Timothy Chen
My commits that caused the trouble are reverted now.

Also, 0.26 will not be based on master; releases are typically cherry-picked
commits onto a specific tag.

Tim

> On Nov 9, 2015, at 6:37 AM, Plotka, Bartlomiej  
> wrote:
> 
> I had the same issue (broken build) on Ubuntu 14.04.. Commit “cee4958” helped.
> 
> Kind Regards,
> Bartek Plotka
> 
> From: Jan Schlicht [mailto:j...@mesosphere.io]
> Sent: Monday, November 9, 2015 3:27 PM
> To: u...@mesos.apache.org
> Cc: dev 
> Subject: Re: Mesos .26 failing on centos7
> 
> There were some build errors due to some reverts in `registry_puller.cpp`. 
> Your error logs hints that it may be related to this. They should be fixed 
> now (with `cee4958`).
> 
> On Mon, Nov 9, 2015 at 3:23 PM, haosdent 
> > wrote:
> Could you show more details about error log? I could build current master 
> branch in CentOS 7.
> 
> On Mon, Nov 9, 2015 at 10:00 PM, Pradeep Kiruvale 
> > wrote:
> Hi All,
> 
> I am trying to compile mesos on Centos7, but its failing. Please let me know 
> what is the reason.
> 
> Find the logs below.
> 
> Regards,
> Pradeep
> 
> make[2]: *** 
> [slave/containerizer/mesos/provisioner/docker/libmesos_no_3rdparty_la-registry_puller.lo]
>  Error 1
> make[2]: *** Waiting for unfinished jobs
> mv -f 
> slave/containerizer/mesos/isolators/cgroups/.deps/libmesos_no_3rdparty_la-cpushare.Tpo
>  
> slave/containerizer/mesos/isolators/cgroups/.deps/libmesos_no_3rdparty_la-cpushare.Plo
> mv -f master/.deps/libmesos_no_3rdparty_la-master.Tpo 
> master/.deps/libmesos_no_3rdparty_la-master.Plo
> mv -f java/jni/.deps/libjava_la-convert.Tpo 
> java/jni/.deps/libjava_la-convert.Plo
> mv -f examples/.deps/libexamplemodule_la-example_module_impl.Tpo 
> examples/.deps/libexamplemodule_la-example_module_impl.Plo
> mv -f 
> slave/containerizer/mesos/isolators/namespaces/.deps/libmesos_no_3rdparty_la-pid.Tpo
>  
> slave/containerizer/mesos/isolators/namespaces/.deps/libmesos_no_3rdparty_la-pid.Plo
> mv -f 
> slave/containerizer/mesos/isolators/cgroups/.deps/libmesos_no_3rdparty_la-perf_event.Tpo
>  
> slave/containerizer/mesos/isolators/cgroups/.deps/libmesos_no_3rdparty_la-perf_event.Plo
> mv -f 
> slave/containerizer/mesos/provisioner/backends/.deps/libmesos_no_3rdparty_la-bind.Tpo
>  
> slave/containerizer/mesos/provisioner/backends/.deps/libmesos_no_3rdparty_la-bind.Plo
> mv -f 
> slave/containerizer/mesos/isolators/cgroups/.deps/libmesos_no_3rdparty_la-mem.Tpo
>  
> slave/containerizer/mesos/isolators/cgroups/.deps/libmesos_no_3rdparty_la-mem.Plo
> mv -f linux/.deps/libmesos_no_3rdparty_la-perf.Tpo 
> linux/.deps/libmesos_no_3rdparty_la-perf.Plo
> mv -f 
> slave/containerizer/.deps/libmesos_no_3rdparty_la-external_containerizer.Tpo 
> slave/containerizer/.deps/libmesos_no_3rdparty_la-external_containerizer.Plo
> mv -f log/.deps/liblog_la-replica.Tpo log/.deps/liblog_la-replica.Plo
> mv -f slave/.deps/libmesos_no_3rdparty_la-slave.Tpo 
> slave/.deps/libmesos_no_3rdparty_la-slave.Plo
> mv -f 
> slave/containerizer/mesos/.deps/libmesos_no_3rdparty_la-containerizer.Tpo 
> slave/containerizer/mesos/.deps/libmesos_no_3rdparty_la-containerizer.Plo
> mv -f 
> slave/resource_estimators/.deps/libfixed_resource_estimator_la-fixed.Tpo 
> slave/resource_estimators/.deps/libfixed_resource_estimator_la-fixed.Plo
> mv -f 
> slave/containerizer/mesos/isolators/filesystem/.deps/libmesos_no_3rdparty_la-linux.Tpo
>  
> slave/containerizer/mesos/isolators/filesystem/.deps/libmesos_no_3rdparty_la-linux.Plo
> mv -f log/.deps/liblog_la-coordinator.Tpo log/.deps/liblog_la-coordinator.Plo
> mv -f log/.deps/liblog_la-recover.Tpo log/.deps/liblog_la-recover.Plo
> 
> 
> 
> 
> --
> Best Regards,
> Haosdent Huang
> 
> 
> 
> --
> Jan Schlicht
> Distributed Systems Engineer, Mesosphere
> 
> 
> Intel Technology Poland sp. z o.o.
> ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII 
> Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 
> 957-07-52-316 | Kapital zakladowy 200.000 PLN.
> 
> Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i 
> moze zawierac informacje poufne. W razie przypadkowego otrzymania tej 
> wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; 
> jakiekolwiek
> przegladanie lub rozpowszechnianie jest zabronione.
> This e-mail and any attachments may contain confidential material for the 
> sole use of the intended recipient(s). If you are not the intended recipient, 
> please contact the sender and delete all copies; any review or distribution by
> others is strictly prohibited.


Re: soliciting shepherds

2015-11-03 Thread Timothy Chen
I'll help with 3602 and 3605.

Tim

On Tue, Nov 3, 2015 at 2:59 PM, Benjamin Mahler
 wrote:
> I will take https://issues.apache.org/jira/browse/MESOS-3708, can other
> committers pitch in for the other tickets here?
>
> On Thu, Oct 29, 2015 at 8:15 PM, James Peach  wrote:
>
>> Hi all,
>>
>> Can anyone shepherd the following changes?
>>
>> https://issues.apache.org/jira/browse/MESOS-3605
>> https://issues.apache.org/jira/browse/MESOS-3602
>> https://issues.apache.org/jira/browse/MESOS-3725
>> https://issues.apache.org/jira/browse/MESOS-3708
>>
>> thanks,
>> James
>>


Re: Removing external containerizer from code base

2015-10-12 Thread Timothy Chen
Yes it can be a module.

Tim

On Mon, Oct 12, 2015 at 12:04 AM, tommy xiao  wrote:
> HI Jie Yu,
>
> In this proposal, https://issues.apache.org/jira/browse/MESOS-3435, I'm sure
> Hyper is a new containerizer. Can this be implemented as a module?
>
> 2015-10-08 8:12 GMT+08:00 Jie Yu :
>
>> Hey guys,
>>
>> Per discussion in MESOS-3370
>> , I'll start removing
>> the
>> external containerizer and cleaning up the relevant code.
>>
>> Please let me know if you have any concern. Thanks!
>>
>> - Jie
>>
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com


Re: custom(ized) conteinerization

2015-10-10 Thread Timothy Chen
As Ben said, containerizers don't conflict, since the framework specifies which
containerizer the task should be launched from.

And isolators are not configurable per task but per slave, so all tasks will 
use the same isolators in the same slave.

Tim

> On Oct 9, 2015, at 11:22 PM, Alex Glikson  wrote:
> 
> Thanks!
> 
>> You can mix containerizers, although they should not conflict with each
>> other.
> 
> How would I know whether they conflict? For example, the docker one and 
> the default mesos one with certain configuration of isolators etc?
> 
> Thanks,
> Alex
> 
> 
> Benjamin Mahler  wrote on 10/10/2015 03:51:56 
> AM:
> 
>> From: Benjamin Mahler 
>> To: dev 
>> Date: 10/10/2015 03:52 AM
>> Subject: Re: custom(ized) conteinerization
>> 
>> (1) is something that has come up before. The containerizer deals with a
>> variable sized container, the semantics of a task are defined by the
>> executor, there is no way for the containerizer to understand the
> meaning
>> of a task currently. Some tasks are "special" in that they don't have an
>> executor (e.g. command task, docker task, etc), and in this case they
> will
>> be isolated individually. The main approach that has been discussed to
> my
>> knowledge is to have the executor leverage the mesos containerizer to
>> create nested containers:
> https://issues.apache.org/jira/browse/MESOS-
>> 
>> For (2) you should implement a custom network _isolator_ that the mesos
>> containerizer can use.
>> 
>> With respect to (3), for regular resources the policy used in Mesos is
> that
>> Mesos will never make decisions to kill things, it must be triggered by
> the
>> framework or an operator. So this requirement should be met already. For
>> revocable resources, Mesos may destroy containers, but the framework
> will
>> be aware of that when deciding to use them. If they are stateful, you
>> should use reservations and store the data in a volume.
>> 
>> You can mix containerizers, although they should not conflict with each
>> other.
>> 
>>> On Fri, Oct 9, 2015 at 3:04 AM, Alex Glikson  wrote:
>>> 
>>> Triggered by the thread on potential deprecation of external
>>> containerizer, I wonder what would make sense to do to address the
>>> following set of requirements:
>>> 1. I need resource isolation for individual tasks (mainly for QoS
>>> reasons), so having container per task seems reasonable
>>> 2. I have rather advanced networking requirements, not easily
> addressable
>>> with default mesos containerizer or docker
>>> 3. Some of the tasks are stateful, so I would really prefer that Mesos
>>> doesn't kill them, pretty much ever (unless triggered by the
> framework)
>>> 
>>> It seems that having my own containerizer would be a reasonable
> approach.
>>> But given some of the requirements above, I am trying to figure out
>>> whether I would at all need to implement "usage" and "update" (and
> maybe
>>> even 'destroy', unless it is invoked as part of killTask received from
> the
>>> framework?).
>>> 
>>> Moreover, the isolation mechanism I have in mind does use the same
> Linux
>>> features as docker/mesos containerizers (cgroups, namespaces, etc),
> but in
>>> a somewhat different manner. So, I wonder whether I can use more than
> one
>>> containerizer on the same host -- e.g., to run tasks of my framework
> on
>>> the same host as tasks of , say, Marathon+docker (and if yes, how can
> I
>>> check whether they will work together). If mixing containerizers in
> the
>>> same host is not recommended, is there an easy way to dynamically
> decide
>>> which slaves are 'allocated' to which 'type' of resources (e.g., some
> sort
>>> of entire-host allocation policy)?
>>> 
>>> Some thoughts/advice would be really helpful, before we actually spend
>>> time implementing a new containerizer, one way or another.
>>> 
>>> Thanks!
>>> Alex
>>> 
>>> P.S. Disclaimer: I am new to Mesos, so maybe some (or all) of the
> above
>>> doesn't make much sense, so bear with me..
> 
> 


Fwd: Still Failing: apache/mesos#1269 (master - b74ed17)

2015-10-08 Thread Timothy Chen
I just received this and it looks like it's failing because it's missing a
Rakefile.

Someone recently set this up?

Tim

-- Forwarded message --
From: Travis CI 
Date: Thu, Oct 8, 2015 at 4:15 PM
Subject: Still Failing: apache/mesos#1269 (master - b74ed17)
To: tnac...@gmail.com


apache/mesos (master)
Build #1269 is still failing.

Duration: 7 seconds
Jojy Varghese, commit b74ed17 (changeset):

  Fixed minor style issues in Docker Store.

Review: https://reviews.apache.org/r/39141



Re: Proposing a deterministic simulation tool for Mesos master and allocator debugging and testing

2015-10-06 Thread Timothy Chen
I wonder if, for testing the allocator and the allocation choices, the
easier way might be extracting the Allocator and writing a
framework/standalone tool just around that?

Tim


On Mon, Oct 5, 2015 at 4:49 PM, Neil Conway  wrote:
> On Mon, Oct 5, 2015 at 3:20 PM, Maged Michael  wrote:
>> I have in mind three options.
>> (1) Text translation of Mesos source code. E.g., "process::Future"
>> into, say, "sim::process::Future".
>> - Pros: Does not require any changes to any Mesos or libprocess code.
>> Replace only what needs to be replaced in libprocess for simulation.
>> - Cons: Fragile.
>> (2) Integrate the simulation mode with the libprocess code.
>> - Pros: Robust. Add only what needs to be added to libprocess for
>> simulation. Partial reuse some data structures from regular-mode
>> libprocess.
>> - Cons: Might get in the way of the development and bug fixes in the
>> regular libprocess code.
>> (3) Changes to Mesos makefiles to use alternative simulation-oriented
>> libprocess code.
>> - Pros: Robust.
>> - Cons: Might need to create a lot of stubs that redirect to the
>> regular-mode (i.e., not for simulation) libprocess code that doesn't
>> need any change under simulation.
>
> My vote is for #2, with the caveat that we might have the code live in
> a separate Git repo/branch for a period of time until it has matured.
> If the simulator requires drastic (architectural) changes to
> libprocess, then merging the changes into mainline Mesos might be
> tricky -- but it might be easier to figure that out once we're closer
> to an MVP.
>
>> As an example of what I have in mind, this is a sketch of
>> sim::process::dispatch.
>>
>> template <typename R, typename T, typename... Args>
>> // Let R be an abbreviation of typename result_of<...>::type
>> sim::process::Future<R>
>> dispatch(
>>    const sim::process::Process<T>& pid,
>>    R (T::*method)(Args...),
>>    Args... args)
>> {
>> /* Still running in the context of the parent simulated thread -
>> the same C++/OS thread as the simulator. */
>> <yield to the simulator to decide the next interleaving> /* e.g., setjmp/longjmp */
>> // create a promise
>> std::shared_ptr<sim::process::Promise<R>> prom(new sim::process::Promise<R>());
>>
>> <record the pending dispatch for pid> // e.g., a map structure
>>
>> return prom->future();
>> /* The dispatched function will start running when at some point
>> later the simulator decides to switch to the child thread (pid) when
>> pid is ready to run fn. */
>> }
>
> I wonder how much of what is happening here (e.g., during the
> setjmp/longjmp) could be implemented by instead modifying the
> libprocess event queuing/dispatching logic. For example, suppose Mesos
> is running on two CPUs (and let's ignore network I/O + clock for now).
> If you want to explore all possible schedules, you could start by
> capturing the non-deterministic choices that are made when the
> processing threads (a) send messages concurrently (b) choose new
> processes to run from the run queue. Does that sound like a feasible
> approach?
>
> Other suggestions:
>
> * To make what you're suggesting concrete, it would be great if you
> started with a VERY minimal prototype -- say, a test program that
> creates three libprocess processes and has them exchange messages. The
> order in which messages will be sent/received is non-deterministic [1]
> -- can we build a simulator that (a) can explore all possible
> schedules (b) can replay the schedule chosen by a previous simulation
> run?
>
> * For a more interesting but still somewhat-tractable example, the
> replicated log (src/log) might be a good place to start. It is fairly
> decoupled from the rest of Mesos and involves a bunch of interesting
> concurrency. If you setup a test program that creates N log replicas
> (in a single OS process) and then explores the possible interleavings
> of the messages exchanged between them, that would be a pretty cool
> result! There's also a bunch of Paxos-specific invariants that you can
> check for (e.g., once the value of a position is agreed-to by a quorum
> of replicas, that value will eventually appear at that position in all
> sufficiently connected log replicas).
>
> Neil
>
> [1] Although note that not all message schedules are possible: for
> example, message schedules can't violate causal dependencies. i.e., if
> process P1 sends M1 and then M2 to P2, P2 can't see <M2, M1> (it might
> see only <>, <M1>, or <M1, M2> if P2 is remote). Actually, that suggests
> to me we probably want to distinguish between local and remote message
> sends in the simulator: the former will never be dropped.


Re: Problems with deprecation cycles for critical/hard to adapt dependencies

2015-09-30 Thread Timothy Chen
I think besides changing to time based, we should provide a lot more visibility
into the features that we are starting to deprecate. In each release we can also
highlight the remaining releases/time in each deprecated feature's lifetime, so
that on each release users are reminded of the full list they should be aware of.

Tim

> On Sep 30, 2015, at 5:17 PM, Niklas Nielsen  wrote:
> 
> @vinod, ben, jie - Any thoughts on this?
> 
> I am in favor of the time based deprecation as well and can come up with a
> proposal, taken there are no objections.
> 
> Niklas
> 
> On 28 September 2015 at 21:09, James DeFelice 
> wrote:
> 
>> +1 for time-based deprecation cycle of O(months)
>> 
>>> On Mon, Sep 28, 2015 at 6:16 PM, Zameer Manji  wrote:
>>> 
>>> Niklas,
>>> 
>>> Thanks for starting this thread. I think Mesos can best move forward if
>> it
>>> switches from release based deprecation cycle to a time based deprecation
>>> cycle. This means that APIs would be deprecated after a time period (ie 4
>>> months) instead of at a specific release. This is the model that Google's
>>> Guava library uses and I think it works really well. It ensures that the
>>> ecosystem and community has sufficient time to react to deprecations
>> while
>>> still allowing them to move forward at a reasonable pace.
>>> 
>>> On Mon, Sep 28, 2015 at 2:19 PM, Niklas Nielsen 
>>> wrote:
>>> 
 Hi everyone,
 
 With a (targeted) release cadence of *one month*, we should revisit our
 deprecation cycles of 3 releases (e.g. in version N, we warn. In
>> version
 N+1, support both old and new API. In Version N+2, we break
>>> compatibility).
 Sometimes we cannot do the first step, and we deprecate in version N+1
>>> and
 thus in 2 releases. With the new cadence, that is no longer around two
 quarters but two months which is too short for 3rd party tooling to
>>> adapt.
 
 Even though our release cycles have been longer than one month in the
>>> past,
 we are running into issues with deprecation due to lack of outreach
>> (i.e.
 our communication to framework and 3rd party tooling communities) or
 because we are simply unaware of the internal dependencies they have on
 Mesos.
 
 We/I became aware of this, when we saw a planned deprecation of
>>> /state.json
 in 0.26.0 (0.25.0 supports both). I suspect that _a lot_ of tools will
 break because of this. This, on top of the problems we have run into
 recently with the Zookeeper master info change from binary protobuf to
 json.
 
 Even though we document this in our upgrade.md, the
>> visibility/knowledge
 of
 this document seem too low and we probably need to do more.
 
 Do you guys have thoughts/ideas on how we can address this?
 
 Cheers,
 Niklas
 
 --
 Zameer Manji
>> 
>> 
>> 
>> --
>> James DeFelice
>> 585.241.9488 (voice)
>> 650.649.6071 (fax)
>> 


Re: need help/shepherd on MESOS-3435

2015-09-18 Thread Timothy Chen
I think it helps to write a design doc first, as I'm not familiar with hyper 
and basically how it works and how it maps to the containerizer API.

 But yes, docker.cpp can be a reference for how to integrate another
containerizer; it's not the only way, and the right approach depends on the
containerizer.

Tim

> On Sep 18, 2015, at 4:34 AM, tommy xiao  wrote:
> 
> I plan to add Hyper support to Marathon, so I proposed this request on
> Marathon: https://github.com/mesosphere/marathon/issues/1815
> But in the discussion Tim mentioned that I first need to get familiar with
> the Mesos Docker implementation.
> I want to know which way is best. Is this a good start:
> /src/slave/containerizer/docker.hpp
> Can I use this file as a reference to add hyper.hpp?
> 
> 
> 
> -- 
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com


Re: Newbie: Mesos issue

2015-09-04 Thread Timothy Chen
I can shepherd you on this one.

Tim

On Fri, Sep 4, 2015 at 10:00 AM, Vaibhav Khanduja
 wrote:
> Hi
>
> I am looking at issue: https://issues.apache.org/jira/browse/MESOS-2863
> marked as newbie. I would like to work on it and was wondering if somebody
> could shepherd me on this? I believe the code being talked about here is the sleep
> in executor for e.g. in docker/executor.cpp -
>   // A hack for now ... but we need to wait until the status update
>
>   // is sent to the slave before we shut ourselves down.
>
>   // TODO(tnachen): Remove this hack and also the same hack in the
>
>   // command executor when we have the new HTTP APIs to wait until
>
>   // an ack.
>
>   os::sleep(Seconds(1));
>
> Thanks


Re: Newbie: Mesos issue

2015-09-04 Thread Timothy Chen
Hi Vaibhav,

Yes let's move the conversation over JIRA and IRC, we don't have to
involve the whole dev list for this.

Tim

On Fri, Sep 4, 2015 at 11:07 AM, Vaibhav Khanduja
<vaibhavkhand...@gmail.com> wrote:
> Hi Tim
>
> Thanks you very much.
>
> I see a place in docker/executor.cpp & launch/executor.cpp where a sleep is
> added to wait till the message is received by the slave. The length of the
> sleep is based on an assumption that the time is good enough, rather than
> waiting for an acknowledgement. Am I looking at the right place? More details
> on the issue would help my analysis.
>
> Please also let me know if it is preferred to have such communication over
> jira.
>
> Thanks
>
> On Fri, Sep 4, 2015 at 10:01 AM, Timothy Chen <tnac...@gmail.com> wrote:
>
>> I can shepherd you on this one.
>>
>> Tim
>>
>> On Fri, Sep 4, 2015 at 10:00 AM, Vaibhav Khanduja
>> <vaibhavkhand...@gmail.com> wrote:
>> > Hi
>> >
>> > I am looking at issue: https://issues.apache.org/jira/browse/MESOS-2863
>> > marked as newbie. I would like to work on it and was wondering if
>> somebody
>> > could shepherd me on this? I believe the code being talked about here is the sleep
>> > in executor for e.g. in docker/executor.cpp -
>> >   // A hack for now ... but we need to wait until the status
>> update
>> >
>> >   // is sent to the slave before we shut ourselves down.
>> >
>> >   // TODO(tnachen): Remove this hack and also the same hack in
>> the
>> >
>> >   // command executor when we have the new HTTP APIs to wait
>> until
>> >
>> >   // an ack.
>> >
>> >   os::sleep(Seconds(1));
>> >
>> > Thanks
>>


Re: Guidelines for new contributors

2015-09-01 Thread Timothy Chen
Hi Vinod,

All of this is in Diana's WIP newbie guide, we should have a draft
published soon to the dev/user list.

Tim

On Tue, Sep 1, 2015 at 10:46 AM, Vinod Kone  wrote:
> Hi folks,
>
> I've seen a bunch of new contributors and contributions pop up lately
> (awesome!), so wanted to take some time to provide some guidelines to
> newbies as its been a bit chaotic. Hopefully this will be part of the
> newbie doc that Tim and Diana are working on.
>
> *Shepherds*: It is really important to find a shepherd *before* you assign
> a ticket to yourself and definitely before you submit a review. Sometimes
> working with a shepherd or discussing on the ticket will reveal its
> priority for the project at the current time. Look at the maintainers
>  file to get an
> idea for who to ask to be a shepherd.
>
> *Reviews*: Please only submit a review *after* you have come to agreement
> with your shepherd on the proposed solution. Make sure to add your shepherd
> as a "reviewer" (among others) in the review. I'll be updating the review
> bot to flag reviews that do not have reviewers. This will ensure that your
> reviews get attention.
>
> *Finding tickets*: If you are fairly new, it's best to pick tickets labeled
> "newbie". After that, it's best to work on projects that are important for
> the next release. See the tracking ticket for the release to figure out the
> high priority projects or ask the release manager to guide you. You are
> more likely to find shepherds and reviewers, when you work on projects that
> are important for the next release.
>
> Please remember that shepherds and reviewers are extremely busy, so any
> upfront work you can do to streamline the review process, will reduce their
> burden and increase the chance of your reviews getting committed.
>
> Hope this helps and looking forward to all your contributions,
>
> Vinod


Re: Docker socket path for slave

2015-08-28 Thread Timothy Chen
Hi Vaibhav,

Thanks for the ping. Sorry, as you said, there is other work going on
that has caused some delays on your review.

But yes, you're right: you need a "Ship It" from a committer, and
afterwards you need a committer to merge your patch too.

I'll take a look at your patch and we can go from there.

Tim



On Fri, Aug 28, 2015 at 3:09 PM, Khanduja, Vaibhav
vaibhav.khand...@emc.com wrote:
 Hello All,

 I apologize for my repeated requests here.

 This is my first contribution to the source code and I am a bit unaware of
 the next steps. I have updated the code with the feedback received and was
 wondering if the code is good enough to make it into the source repository.
 From what I read, there has to be a “Ship It” on the code review before it
 can make it to the main source line. I understand the community is
 preoccupied with other important patches and bugs, so I am perfectly fine
 with a wait here, but would appreciate any info on the next steps.


 Thanks


 On 8/25/15, 9:58 AM, Khanduja, Vaibhav vaibhav.khand...@emc.com wrote:

@

I have updated the review as suggested in the feedback. I would appreciate it
if the review can be marked as ship-it if no more changes are needed.

Thanks

On 8/12/15, 9:31 AM, Khanduja, Vaibhav vaibhav.khand...@emc.com wrote:

Hi

I have raised a review for the code changes:

https://issues.apache.org/jira/browse/MESOS-3187


https://reviews.apache.org/r/37114/ -  A second version updated based on
the feedback.

As per contribution documentation, the review has to be marked as “ship
it” before it can be committed.

I was wondering, if somebody could help me (Shepherd) here?

Thanks

On 8/3/15, 10:25 AM, Vinod Kone vinodk...@gmail.com wrote:

Added you to the contributors.

On Mon, Aug 3, 2015 at 9:47 AM, Khanduja, Vaibhav
vaibhav.khand...@emc.com
wrote:

 Hi Peter,

 Thanks for  your reply.

  The change for docker daemon options probably has to be in the slave
  code and not in the framework. Other than Marathon, there could be
  other frameworks requiring such support.

  During bootup, the slave checks for a connection by querying the
  version of the docker daemon.

 I have opened an issue, and plan to work on it:
 https://issues.apache.org/jira/browse/MESOS-3187

  I am not part of the contributors list, so I cannot assign it to
  myself. Can somebody do this for me? I made a few changes to the code
  to get this working.
 The changes are now in a pull request:

 https://github.com/apache/mesos/pull/53

  I am aware of the contribution requirements:
  http://mesos.apache.org/documentation/latest/, and would work on
  creating a patch if I get the bug assigned.

 — VK,

 Technologist,
 EMC OCTO

 Thx

 On 8/3/15, 6:39 AM, Peter Kolloch pe...@mesosphere.io wrote:

 Hi Vaibhav,
 
 the parameters option works for parameters of the docker run
command:
 
 

https://mesosphere.github.io/marathon/docs/native-docker.html#privileged-mode-and-arbitrary-docker-options
 
 You tried to use it with a command line argument for the docker
_daemon_.
 Starting the docker daemon with the right command line arguments is
out of
 scope for Marathon.
 
  If you find a parameter of the docker run command that works for you,
  you need to specify the long name for this option (the non-one-letter
  option) in the parameters option.
 
  If you find the Marathon documentation lacking, we would love to get
  a PR for a documentation improvement from you!
 
 See
 
 https://mesosphere.github.io/marathon/docs/contributing.html
 
 for details.
 
 Best regards,
 Peter
 
 [BTW: The Marathon mailing list might be better suited for this kind
of
 question.]
 
 On Sat, Aug 1, 2015 at 1:57 AM, Khanduja, Vaibhav
 vaibhav.khand...@emc.com
 wrote:
 
  Hi
 
   Having not received any answer, I suppose there is no solution for
   this.
 
   The mesos slave process, along with accepting other docker arguments,
   should somehow manage to take these extra arguments too. I have
   logged this as an enhancement, and am wondering if somebody can look
   at this:
 
  https://issues.apache.org/jira/browse/MESOS-3187
 
   I am available to provide the fix, if somebody can help as a
   shepherd.
 
  Thx
 
  On 7/30/15, 12:58 PM, Khanduja, Vaibhav
vaibhav.khand...@emc.com
  wrote:
 
  Hi
  
  jfyi,
  
   I have tried the parameters option in the marathon json file …
   
   …
   "parameters": {
   "key": "H", "value": "unix:///var/run/mydocker.sock"
   }
   …
  
  
  
  
  On 7/30/15, 12:29 PM, Khanduja, Vaibhav
vaibhav.khand...@emc.com
  wrote:
  
  Hi
  
   I have a use-case where the docker daemon does not run on the
   original socket path, which is /var/run/docker.sock, but on a path
   given by the user. I start my docker daemon with the -H option,
   
   docker -d -H unix:///var/run/mydocker.sock
   
   all my docker calls now use this -H, e.g. to print images:
   
   docker -H unix:///var/run/mydocker.sock images
  
  I am using Marathon and have started my slave with Docker as
container
  option.
  
   I cannot see an option in the slave where I can specify the socket
   path for it to talk with the Docker daemon. The docker_socket option
   is used for
  

Re: [VOTE] Release Apache Mesos 0.24.0 (rc1)

2015-08-27 Thread Timothy Chen
That test is failing because of a weird bug in CentOS 7 not naming the
cgroups correctly (or at least not following the pattern of every other
OS).

I filed a CentOS bug but no response so far, if we want to fix it we
will have to work around this problem by hardcoding another cgroup
name to test cpuacct,cpu.

Tim
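
The workaround Tim describes — not hardcoding a single hierarchy name when
testing cpuacct,cpu — could be sketched as below. This is a hypothetical
helper, not Mesos code: on most distros the combined hierarchy is mounted as
"cpu,cpuacct", while the CentOS 7 bug yields "cpuacct,cpu", so a test can
probe for either ordering instead of assuming one.

```python
# Hypothetical sketch: find the mounted cgroup hierarchy that carries both
# the 'cpu' and 'cpuacct' subsystems, regardless of how the OS ordered the
# comma-separated name. All names here are illustrative.

def resolve_cpuacct_hierarchy(mounted_names):
    """Return the first hierarchy name containing both cpu and cpuacct."""
    for name in mounted_names:
        subsystems = set(name.split(","))
        if {"cpu", "cpuacct"} <= subsystems:
            return name
    return None
```

On a typical distro `resolve_cpuacct_hierarchy(["cpu,cpuacct", "memory"])`
finds the hierarchy, and on the buggy CentOS 7 naming it still matches
"cpuacct,cpu".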

On Thu, Aug 27, 2015 at 4:00 PM, Vinod Kone vinodk...@apache.org wrote:
 Happy to cut another RC.

 IIUC, https://reviews.apache.org/r/37684 doesn't fix the below test.

 [  FAILED  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where
 TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess

 Is someone working on fixing that (MESOS-3294
 https://issues.apache.org/jira/browse/MESOS-3294)? If yes, I would wait a
 day or two to get that in.

 Any other issues people have encountered with RC1?



 On Thu, Aug 27, 2015 at 3:45 PM, Niklas Nielsen nik...@mesosphere.io
 wrote:

 If it is that easy to fix, why not get it in?

 How about https://issues.apache.org/jira/browse/MESOS-3053 (which
 Haosdent ran into)?

 On 27 August 2015 at 15:36, Jie Yu yujie@gmail.com wrote:

 Niklas,

 This is the known problem reported by Marco. I am OK with both because
 the linux filesystem isolator cannot be used in 0.24.0.

 If you guys prefer to cut another RC, here is the patch that needs to be
 cherry picked:

 commit 3ecd54320397c3a813d555f291b51778372e273b
 Author: Greg Mann g...@mesosphere.io
 Date:   Fri Aug 21 13:21:10 2015 -0700

 Added symlink test for /bin, lib, and /lib64 when preparing test root
 filesystem.

 Review: https://reviews.apache.org/r/37684



 On Thu, Aug 27, 2015 at 3:30 PM, Niklas Nielsen nik...@mesosphere.io
 wrote:

 -1: sudo make check on centos 7

 [--] Global test environment tear-down

 [==] 793 tests from 121 test cases ran. (606946 ms total)

 [  PASSED  ] 786 tests.

 [  FAILED  ] 7 tests, listed below:

 [  FAILED  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where
 TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess

 [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem

 [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox

 [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost

 [  FAILED  ]
 LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint

 [  FAILED  ]
 LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem

 [  FAILED  ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs

 Configured with:

 ../mesos/configure --prefix=/home/vagrant/releases/0.24.0/
 --disable-python

 On 26 August 2015 at 17:00, Khanduja, Vaibhav vaibhav.khand...@emc.com
 wrote:

 +1

  On Aug 26, 2015, at 4:43 PM, Vinod Kone vinodk...@gmail.com wrote:
 
  Pinging the thread for more (binding) votes. Hopefully people have
 caught
  up with emails after Mesos madness.
 
  On Wed, Aug 19, 2015 at 1:28 AM, haosdent haosd...@gmail.com
 wrote:
 
  +1
 
  OS: Ubutnu 14.04
  Verify command: sudo make -j8 check
  Compiler: Both gcc4.8 and clang3.5
  Configuration: default configuration
  Result: all tests(828 tests) pass
 
   MESOS-3053 https://issues.apache.org/jira/browse/MESOS-3053 is
   because we need to update/add an iptables rule first.
 
  On Wed, Aug 19, 2015 at 2:39 PM, haosdent haosd...@gmail.com
 wrote:
 
  Could not
  pass DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged in
 Ubuntu
   14.04. Already have an issue for this
   https://issues.apache.org/jira/browse/MESOS-3053, is it acceptable?
 
  On Wed, Aug 19, 2015 at 12:55 PM, Marco Massenzio 
 ma...@mesosphere.io
  wrote:
 
  +1 (non-binding)
 
  All tests (including ROOT) pass on:
  Ubuntu 14.04 (physical box)
 
  All non-ROOT tests pass on:
  CentOS 7 (VirtualBox VM)
 
  Known issue (MESOS-3050) for ROOT tests on CentOS 7, non-blocker.
 
  Thanks,
 
  *Marco Massenzio*
 
   *Distributed Systems Engineer* http://codetrips.com
 
  On Tue, Aug 18, 2015 at 3:26 PM, Vinod Kone vinodk...@apache.org
  wrote:
 
  0.24.0 includes the following:
 
 
 
 
 
  Experimental support for v1 scheduler HTTP API!
 
  This release also wraps up support for fetcher.
 
 
  The CHANGELOG for the release is available at:
 
 
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.24.0-rc1
 
 
 
 
 
 
  The candidate for Mesos 0.24.0 release is available at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz
 
 
  The tag to be voted on is 0.24.0-rc1:
 
 
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.24.0-rc1
 
 
  The MD5 checksum of the tarball can be found at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz.md5
 
 
  The signature of the tarball can be found at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz.asc
 
 
  The 

Re: [VOTE] Release Apache Mesos 0.23.0 (rc4)

2015-07-22 Thread Timothy Chen
+1 

The docker bridge network test failed because of some iptables rules that were
set in the environment. I will comment on the JIRA but it's not a blocker.

Tim


 On Jul 22, 2015, at 1:07 PM, Benjamin Hindman benjamin.hind...@gmail.com 
 wrote:
 
 +1 (binding)
 
 On Ubuntu 14.04:
 
 $ make check
 ... all tests pass ...
 $ sudo make check
 ... tests with known issues fail, but ignoring because these have all been
 resolved and are issues with the tests alone ...
 
 Thanks Adam.
 
 On Fri, Jul 17, 2015 at 4:42 PM Adam Bordelon a...@mesosphere.io wrote:
 
 Hello Mesos community,
 
 Please vote on releasing the following candidate as Apache Mesos 0.23.0.
 
 0.23.0 includes the following:
 
 
 - Per-container network isolation
 - Dockerized slaves will properly recover Docker containers upon failover.
 - Upgraded minimum required compilers to GCC 4.8+ or clang 3.5+.
 
 as well as experimental support for:
 - Fetcher Caching
 - Revocable Resources
 - SSL encryption
 - Persistent Volumes
 - Dynamic Reservations
 
 The CHANGELOG for the release is available at:
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.23.0-rc4
 
 
 
 The candidate for Mesos 0.23.0 release is available at:
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz
 
 The tag to be voted on is 0.23.0-rc4:
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.23.0-rc4
 
 The MD5 checksum of the tarball can be found at:
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz.md5
 
 The signature of the tarball can be found at:
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz.asc
 
 The PGP key used to sign the release is here:
 https://dist.apache.org/repos/dist/release/mesos/KEYS
 
 The JAR is up in Maven in a staging repository here:
 https://repository.apache.org/content/repositories/orgapachemesos-1062
 
 Please vote on releasing this package as Apache Mesos 0.23.0!
 
 The vote is open until Wed July 22nd, 17:00 PDT 2015 and passes if a
 majority of at least 3 +1 PMC votes are cast.
 
 [ ] +1 Release this package as Apache Mesos 0.23.0 (I've tested it!)
 [ ] -1 Do not release this package because ...
 
 Thanks,
 -Adam-
 
 


Re: [VOTE] Release Apache Mesos 0.23.0 (rc3)

2015-07-16 Thread Timothy Chen
As Adam mentioned, I also think this is not a blocker, as it only affects
the way we test the cgroup on CentOS 7.x due to a CentOS bug and
doesn't actually impact Mesos' normal operations.

My vote is +1 as well.

Tim

On Thu, Jul 16, 2015 at 12:10 PM, Vinod Kone vinodk...@gmail.com wrote:
 Found a bug in HTTP API related code: MESOS-3055
 https://issues.apache.org/jira/browse/MESOS-3055

 If we don't fix this in 0.23.0, we cannot expect the 0.24.0 scheduler
 driver (that will send Calls) to properly subscribe with a 0.23.0 master. I
 could add a work around in the driver to only send Calls if the master
 version is 0.24.0, but would prefer to not have to do that.

 Also, on the review https://reviews.apache.org/r/36518/ for that bug, we
 realized that we might want to make Subscribe.force 'optional' instead of
 'required'. That's an API change, which would be nice to go into 0.23.0 as
 well.

 So, not a -1 per se, but if you are willing to cut another RC, I can land
 the fixes today. Sorry for the trouble.

 On Thu, Jul 16, 2015 at 11:48 AM, Adam Bordelon a...@mesosphere.io wrote:

 +1 (binding)
 This vote has been silent for almost a week. I assume everybody's busy
 testing. My testing results: basic integration tests passed for Mesos
 0.23.0 on CoreOS with DCOS GUI/CLI, Marathon, Chronos, Spark, HDFS,
 Cassandra, and Kafka.

 `make check` passes on Ubuntu and CentOS, but `sudo make check` fails on
 CentOS 7.1 due to errors in CentOS. See
 https://issues.apache.org/jira/browse/MESOS-3050 for more details. I'm not
 convinced this is serious enough to do another release candidate and voting
 round, but I'll let Tim and others chime in with their thoughts.

 If we don't get enough deciding votes by 6pm Pacific today, I'll extend the
 vote for another day.

 On Thu, Jul 9, 2015 at 6:09 PM, Khanduja, Vaibhav 
 vaibhav.khand...@emc.com
 wrote:

  +1
 
  Sent from my iPhone. Please excuse the typos and brevity of this message.
 
   On Jul 9, 2015, at 6:07 PM, Adam Bordelon a...@mesosphere.io wrote:
  
   Hello Mesos community,
  
   Please vote on releasing the following candidate as Apache Mesos
 0.23.0.
  
   0.23.0 includes the following:
  
 
 
   - Per-container network isolation
   - Dockerized slaves will properly recover Docker containers upon
  failover.
   - Upgraded minimum required compilers to GCC 4.8+ or clang 3.5+.
  
   as well as experimental support for:
   - Fetcher Caching
   - Revocable Resources
   - SSL encryption
   - Persistent Volumes
   - Dynamic Reservations
  
   The CHANGELOG for the release is available at:
  
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.23.0-rc3
  
 
 
  
   The candidate for Mesos 0.23.0 release is available at:
  
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc3/mesos-0.23.0.tar.gz
  
   The tag to be voted on is 0.23.0-rc3:
  
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.23.0-rc3
  
   The MD5 checksum of the tarball can be found at:
  
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc3/mesos-0.23.0.tar.gz.md5
  
   The signature of the tarball can be found at:
  
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc3/mesos-0.23.0.tar.gz.asc
  
   The PGP key used to sign the release is here:
   https://dist.apache.org/repos/dist/release/mesos/KEYS
  
   The JAR is up in Maven in a staging repository here:
   https://repository.apache.org/content/repositories/orgapachemesos-1060
  
   Please vote on releasing this package as Apache Mesos 0.23.0!
  
   The vote is open until Thurs July 16th, 18:00 PDT 2015 and passes if a
   majority of at least 3 +1 PMC votes are cast.
  
   [ ] +1 Release this package as Apache Mesos 0.23.0
   [ ] -1 Do not release this package because ...
  
   Thanks,
   -Adam-
 



Re: Mesos/Marathon is not update docker image.

2015-07-02 Thread Timothy Chen
Hi John,

You should run your mesos-slave container with --pid=host so that it's
able to find the processes that are forked from the slave.

Thanks,

Tim
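
A minimal sketch of what Tim suggests: launch the slave container in the host
PID namespace so it can see the executor processes it forks via /proc. Only
--pid=host is the point of this example; the image name, master address, and
all other flags are illustrative assumptions, not a definitive invocation.

```shell
# Hypothetical mesos-slave launch; --pid=host shares the host PID
# namespace so /proc/<pid>/cgroup lookups for forked tasks resolve.
# Image name, master URL, and extra mounts below are assumptions.
docker run -d \
  --pid=host \
  --net=host \
  --privileged \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /sys:/sys \
  mesos-slave-image \
  mesos-slave --master=zk://zk1:2181/mesos --containerizers=docker,mesos
```

Without --pid=host, a PID the slave recorded for a task points into the
host's namespace, which is why the /proc/32725/cgroup read above fails from
inside the container.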

On Wed, Jul 1, 2015 at 10:10 PM, John Kim dreamerad...@gmail.com wrote:
 I have a Mesos cluster consisting of 9 Ubuntu 14.04 machines in a cloud
 environment.

 ZooKeeper and Mesos-Master are running on 3 machines and Mesos-Slave is
 running on 6 machines.

 Also, Marathon is running on master node.

 The Docker registry is a private registry.

 I am trying to deploy this container via Marathon:
 https://mesosphere.github.io/marathon/docs/rest-api.html#post-/v2/apps

 The application is successfully deployed and operated.

 After a few days, I am trying to update this container via Marathon:
 https://mesosphere.github.io/marathon/docs/rest-api.html#put-/v2/apps/%7Bappid%7D

 Then, while the container is deploying, it becomes locked.

 If ZooKeeper, Marathon, and Mesos-Master are restarted, then you can deploy
 and operate normally.


 A search of /var/log/mesos reveals:

 E0526 14:05:29.145584 31390 slave.cpp:2662] Failed to update resources for
 container 465b12d4-65a3-4f03-873f-3f41601b1db5 of executor
 device-api.1c7a63a7-0353-11e5-8ec8-56847afe9799
 running task device-api.1c7a63a7-0353-11e5-8ec8-56847afe9799 on status
 update for terminal task, destroying container: Failed to determine cgroup
 for the 'cpu' subsystem:
 Failed to read /proc/32725/cgroup: Failed to open file
 '/proc/32725/cgroup': No such file or directory

 Please tell us the reason or a solution.

 Thank you.


Re: how to get docker container id?

2015-06-12 Thread Timothy Chen
Hi Oliver,

With latest master and next 0.23 release we've added docker inspect output in 
the first task running status update data field.

Therefore, from the scheduler you can read and parse it as json, and find all the
information you need about the container, including its name and id.

Tim
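
Tim's suggestion — parsing the first TASK_RUNNING status update's data field
as json — could look like the sketch below. The payload shape (a list of
objects with "Id" and "Name" keys, as `docker inspect` prints) is an
assumption about the field's contents, and `container_identity` is a
hypothetical helper, not a Mesos API.

```python
import json

# Hedged sketch: extract the docker container id and name from a
# docker-inspect-style JSON payload carried in a status update's data
# field. The list-of-dicts shape is an assumption.

def container_identity(status_data):
    """Return (id, name) from a docker-inspect-style JSON string."""
    inspect = json.loads(status_data)
    info = inspect[0] if isinstance(inspect, list) else inspect
    return info.get("Id"), info.get("Name")

# Example payload shaped like `docker inspect` output:
sample = '[{"Id": "02a786c9556c", "Name": "/mesos-02a786c9-556c"}]'
cid, name = container_identity(sample)
```

A scheduler's statusUpdate callback could call such a helper on the first
TASK_RUNNING update and then use the name or id to query cAdvisor.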

 On Jun 12, 2015, at 6:04 AM, Olivier Sallou olivier.sal...@irisa.fr wrote:
 
 
 
 On 06/12/2015 12:02 PM, Adam Bordelon wrote:
 You can query the slave's state.json to get the container ID.
 See the previous thread:
 http://search-hadoop.com/m/0Vlr6OtCiO1p8ypc2/mesos+accessing+programmatticallysubj=Re+Accessing+stdout+stderr+of+a+task+programmattically+
 Thanks, I could get it, but it would be nice to get the information in
 the update message rather than needing to query the nodes (which return
 information for all tasks).
 
 Olivier
 
 On Fri, Jun 12, 2015 at 2:35 AM, Olivier Sallou olivier.sal...@irisa.fr
 wrote:
 
 Hi,
  how can we get the container id when executing a TaskInfo with a Docker
  ContainerInfo?
  
  Mesos executes a Docker container with the name mesos-xxx, but how can
  we get this identifier?
 
  I set a unique id in my TaskInfo's Task Id, but it is not used as the
  Docker identifier.
 
 I need it to query cAdvisor, running on my nodes.
 
 Thanks
 
 Olivier
 
 --
 
 
 gpg key id: 4096R/326D8438  (keyring.debian.org)
 Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438
 
 -- 
 Olivier Sallou
 IRISA / University of Rennes 1
 Campus de Beaulieu, 35000 RENNES - FRANCE
 Tel: 02.99.84.71.95
 
 gpg key id: 4096R/326D8438  (keyring.debian.org)
 Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438
 


Re: Introduction / Request to be added to Jira committers

2015-06-09 Thread Timothy Chen
Hi Oliver,

It's great to hear you're using the Storm over Mesos framework too! I
think there are lots of improvements that can be made there; are you
guys planning to help work on that framework as well?

Tim

On Tue, Jun 9, 2015 at 6:37 PM, Oliver Nicholas b...@uber.com wrote:
 Hello Folks,

 Name's Oliver Nicholas, I'm a sometimes-manager, sometimes-engineer here at
 Uber.  We've been running some Storm jobs over Mesos for a year or two but
 are looking into moving larger workloads under Marathon now, and ironing
 out some kinks along the way.  My first patch is pretty straightforward,
 and I'd love to get it committed so I can get on to the rest of them.

 My Jira username is bigo.

 https://issues.apache.org/jira/browse/MESOS-1825
 https://reviews.apache.org/r/35270/

 Thanks!
 -o

 --
 *bigo* / oliver nicholas | staff engineer, infrastructure | uber
 technologies, inc.


Re: Mesos/Marathon support for Docker extension

2015-06-06 Thread Timothy Chen
Yes.

Tim

Sent from my iPhone

 On Jun 6, 2015, at 8:38 PM, Khanduja, Vaibhav vaibhav.khand...@emc.com 
 wrote:
 
 Hi Tim
 
 Are you referring to following pull request
 
 https://github.com/mesosphere/marathon/pull/798
 
 Thanks
 
 
 On 6/6/15, 8:21 PM, Timothy Chen tnac...@gmail.com wrote:
 
  Hi Khanduja/Shuai,
 
 Mesos slave actually does invoke the docker cli directly for its
 integration.
 
  To support various options that docker will be adding, we allowed
  arbitrary flags to be passed when launching a docker task (the Params
  field, I believe).
 
 Therefore if you know you have latest docker installed you can pass the
 Extra volume option and it should work.
 
 Tim
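
The Params idea Tim describes — arbitrary long-form key/value pairs appended
to the `docker run` invocation — can be illustrated with the hypothetical
sketch below. The dict-based representation and `render_docker_flags` helper
are illustrative stand-ins, not the actual protobuf API or slave code.

```python
# Hedged sketch: render DockerInfo-style parameters into docker run CLI
# flags. The slave conceptually turns each {key, value} pair into
# "--key=value"; this helper only illustrates that mapping.

def render_docker_flags(parameters):
    """Render parameter dicts into long-form docker run flags."""
    return ["--%s=%s" % (p["key"], p["value"]) for p in parameters]

params = [
    {"key": "volume-driver", "value": "flocker"},
    {"key": "label", "value": "env=prod"},
]
flags = render_docker_flags(params)
```

This is why only long option names work in the parameters field: the
rendering targets the `--name=value` form, not one-letter flags.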
 
 On Jun 6, 2015, at 7:51 PM, Shuai Lin linshuai2...@gmail.com wrote:
 
 Hi Khanduja,
 
  From what I understand, the slave code which calls the docker cli
  would need this additional parameter to be passed in.
 
 
 I don't think mesos slave invokes docker cli directly. It calls the
 docker
 api instead.
 
  Also, is there any official documentation/introduction to docker
  plugin or volume extensions? I found the following two pages when
  googling, but none of them has an official introduction to volume
  extensions.
 
 https://clusterhq.com/2014/12/08/docker-extensions/
 https://github.com/docker/docker/pull/13161
 
 Best Regards,
 Shuai
 
 
 On Sat, Jun 6, 2015 at 12:56 AM, Khanduja, Vaibhav
 vaibhav.khand...@emc.com
 wrote:
 
 Recently docker pulled in code for supporting docker volume extensions.
  Docker (1.9 and above) can now be given a volume plugin name through
  the CLI, to which the docker daemon connects to get the actual
  storage. The plugin today has reasonable hooks for maintaining and
  cleaning the storage. I was wondering if there is any analysis done on
  the support of this in the code? From what I understand, the slave
  code which calls the docker cli would need this additional parameter
  to be passed in.
 
 Thx
 


Re: Docker port_mapping issue

2015-05-29 Thread Timothy Chen
Hi Oliver,

Yes, you need to include the port range you want to use in your Task,
and later in the DockerInfo specify how you want to use them (you
can map multiple ones in that list).

Thanks!

Tim
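
Tim's point — the host port must be claimed as a task "ports" resource as
well as appearing in the Docker port mappings — can be illustrated with the
simplified check below. `host_port_covered` is a hypothetical stand-in for
the slave's validation, which rejects the launch with "Port mappings require
port resources" when the host port is not covered.

```python
# Hedged illustration of the slave-side validation: each host_port in the
# Docker port mappings must fall inside a port range the task claimed as
# a "ports" resource. This membership check is a simplified stand-in.

def host_port_covered(claimed_ranges, host_port):
    """True if host_port lies inside any claimed (begin, end) range."""
    return any(begin <= host_port <= end for begin, end in claimed_ranges)

# With no ports resource on the task, nothing is claimed and the check
# fails, which is the error Olivier saw:
launch_rejected = not host_port_covered([], 31000)

# After adding the range (31000, 31000) to task.resources, it passes:
launch_accepted = host_port_covered([(31000, 31000)], 31000)
```

Note the offer containing ports(*):[31000-32000] is not enough on its own;
the task itself has to claim the range.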

On Fri, May 29, 2015 at 5:43 AM, Olivier Sallou olivier.sal...@irisa.fr wrote:


 On 05/29/2015 02:07 PM, Olivier Sallou wrote:
 Hi,
  I can run a task successfully in a Docker container in my mesos
  install using the base executor.

  However, I cannot get a task running when I add a port mapping (though
  the port is available).
 ok, it appears that in addition to Docker port_mapping, we need to add a
 port resource declaration in the task too, with something like:

 ports = task.resources.add()
 ports.name = "ports"
 ports.type = mesos_pb2.Value.RANGES
 port_range = ports.ranges.range.add()
 port_range.begin = 31000
 port_range.end = 31000

 we kind of need to duplicate the port declaration (task resources and
 docker port mappings) in the task.


 I use mesos 0.22, with python 2.7.


 If I print the sent task I have:

 name: task 0
 task_id {
   value: 0
 }
 slave_id {
   value: 20150526-114150-16777343-5050-2035-S0
 }
 resources {
   name: cpus
   type: SCALAR
   scalar {
 value: 1
   }
 }
 resources {
   name: mem
   type: SCALAR
   scalar {
 value: 128
   }
 }
 command {
   value: echo \hello world # $MESOS_SANDBOX #\
 }
 container {
   type: DOCKER
   docker {
 image: centos
 network: BRIDGE
 port_mappings {
   host_port: 31000
   container_port: 22
 }
 force_pull_image: true
   }
 }

 And it ends with error:

 Task 0 is in state TASK_FAILED
 Abnormal executor termination


 Slave shows:

 I0529 13:50:49.813928 18426 docker.cpp:626] Starting container
 'd9b5be3e-9f00-4242-aa91-d6a6f3a5175a' for task '0' (and executor '0')
 of framework '20150529-103634-16777343-5050-18179-0020'
 E0529 13:50:54.362663 18420 slave.cpp:3112] Container
 'd9b5be3e-9f00-4242-aa91-d6a6f3a5175a' for executor '0' of framework
 '20150529-103634-16777343-5050-18179-0020' failed to start: Port
 mappings require port resources

 However the offer present port resources:

 resources {
   name: ports
   type: RANGES
   ranges {
 range {
   begin: 31000
   end: 32000
 }
   }
   role: *
 }

 At slave startup I also see:
 I0529 14:05:37.481212 22455 slave.cpp:322] Slave resources: cpus(*):8;
 mem(*):6900; disk(*):215925; ports(*):[31000-32000]


 Any idea of what is going wrong?


 Thanks

 Olivier


 --
 Olivier Sallou
 IRISA / University of Rennes 1
 Campus de Beaulieu, 35000 RENNES - FRANCE
 Tel: 02.99.84.71.95

 gpg key id: 4096R/326D8438  (keyring.debian.org)
 Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438



Re: Please add me to jira group

2015-05-26 Thread Timothy Chen
Hi Shuai,

Sorry about that, just added you now.

Thanks,

Tim

On Tue, May 26, 2015 at 6:32 PM, Shuai Lin linshuai2...@gmail.com wrote:
 Hi list,

 I asked for joining the mesos contributors jira group last week, but
 seems no one replied so far:

 https://www.mail-archive.com/dev%40mesos.apache.org/msg32459.html

 So could anyone please do that at your convenience? My jira username is
 lins05.

 Best Regards,
 Shuai Lin


Re: [jira] [Updated] (MESOS-2020) mesos should send docker failure messages to scheduler

2015-05-13 Thread Timothy Chen
I have no idea either; I wanted to resolve the ticket but it seems to be stuck in
this state.

Tim

 On May 13, 2015, at 5:03 PM, Benjamin Mahler benjamin.mah...@gmail.com 
 wrote:
 
 +tim, jake
 
 What does "pending closed" mean? I noticed that this had become the default
 resolution yesterday. Did something change Jake?
 
 Tim, should this be resolved?
 
 On Tue, May 12, 2015 at 12:28 AM, Timothy Chen (JIRA) j...@apache.org
 wrote:
 
 
 [
 https://issues.apache.org/jira/browse/MESOS-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]
 
 Timothy Chen updated MESOS-2020:
 
Fix Version/s: 0.23.0
 
 mesos should send docker failure messages to scheduler
 --
 
Key: MESOS-2020
URL: https://issues.apache.org/jira/browse/MESOS-2020
Project: Mesos
 Issue Type: Bug
 Components: docker, slave
   Affects Versions: 0.20.1
Environment: mesos-slaves running in docker
   Reporter: Ian Babrou
   Assignee: Jay Buffington
Fix For: 0.23.0
 
 
 I tried to start container that cannot actually start for some reason,
 like this (from slave logs):
 {log:I1031 08:58:02.69295860 slave.cpp:1112] Launching task
 topface_demo.055a8f8f-60dc-11e4-b4ad-56847afe9799 for framework
 20141003-172543-3892422848-5050-1-\n,stream:stderr,time:2014-10-31T08:58:02.692986684Z}
 {log:I1031 08:58:02.69369660 slave.cpp:1222] Queuing task
 'topface_demo.055a8f8f-60dc-11e4-b4ad-56847afe9799' for executor
 topface_demo.055a8f8f-60dc-11e4-b4ad-56847afe9799 of framework
 '20141003-172543-3892422848-5050-1-\n,stream:stderr,time:2014-10-31T08:58:02.693734418Z}
 {log:I1031 08:58:02.69561264 docker.cpp:743] Starting container
 '02a786c9-556c-4ed9-80d2-1850a49030fe' for task
 'topface_demo.055a8f8f-60dc-11e4-b4ad-56847afe9799' (and executor
 'topface_demo.055a8f8f-60dc-11e4-b4ad-56847afe9799') of framework
 '20141003-172543-3892422848-5050-1-'\n,stream:stderr,time:2014-10-31T08:58:02.6956675Z}
 {log:E1031 08:58:05.27290265 slave.cpp:2485] Container
 '02a786c9-556c-4ed9-80d2-1850a49030fe' for executor
 'topface_demo.055a8f8f-60dc-11e4-b4ad-56847afe9799' of framework
 '20141003-172543-3892422848-5050-1-' failed to start: Failed to 'docker
 run -d -c 204 -m 67108864 -e PORT=31459 -e PORT0=31459 -e PORTS=31459 -e
 HOST=web544 -e MESOS_SANDBOX=/mnt/mesos/sandbox -v
 /var/lib/mesos/slave/slaves/20141028-073834-3909200064-5050-1-75/frameworks/20141003-172543-3892422848-5050-1-/executors/topface_demo.055a8f8f-60dc-11e4-b4ad-56847afe9799/runs/02a786c9-556c-4ed9-80d2-1850a49030fe:/mnt/mesos/sandbox
 --net host --name mesos-02a786c9-556c-4ed9-80d2-1850a49030fe
 docker.core.tf/demo:1': exit status = exited with status 1 stderr =
 2014/10/31 08:58:04 Error response from daemon: Cannot start container
 05666763bff98ad70f35968add3923018a9d457a1f4e9dab0981936841093d2f: exec:
 \/server.js\: permission
 denied\n,stream:stderr,time:2014-10-31T08:58:05.273016253Z}
 But when I go to task sandbox from mesos ui, stdout and stderr are empty.
 Marathon keeps scheduling tasks, but they all silently fail, this is
 very misleading.
 
 
 
 --
 This message was sent by Atlassian JIRA
 (v6.3.4#6332)
 


Re: [VOTE] Release Apache Mesos 0.22.1 (rc6)

2015-05-04 Thread Timothy Chen
+1 Verified with test cluster and running make check myself.

Tim

On Fri, May 1, 2015 at 2:28 PM, Alexander Rojas alexan...@mesosphere.io wrote:
 +1 non binding (Tested in OSX and 3 VM’s cluster running Ubuntu 12.10)

 On 30 Apr 2015, at 00:48, Adam Bordelon a...@mesosphere.io wrote:

 Hi all,

 Please vote on releasing the following candidate as Apache Mesos 0.22.1.

 0.22.1 is a bug fix release and includes the following:
 
  * [MESOS-1795] - Assertion failure in state abstraction crashes JVM.
  * [MESOS-2161] - AbstractState JNI check fails for Marathon framework.
  * [MESOS-2461] - Slave should provide details on processes running in its
 cgroups
  * [MESOS-2583] - Tasks getting stuck in staging.
  * [MESOS-2592] - The sandbox directory is not chown'ed if the fetcher
 doesn't run.
  * [MESOS-2601] - Tasks are not removed after recovery from slave and
 mesos containerizer
  * [MESOS-2614] - Update name, hostname, failover_timeout, and webui_url
 in master on framework re-registration
  * [MESOS-2643] - Python scheduler driver disables implicit
 acknowledgments by default.
  * [MESOS-2668] - Slave fails to recover when there are still processes
 left in its cgroup

 The CHANGELOG for the release is available at:
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.22.1-rc6
 

 The candidate for Mesos 0.22.1 release is available at:
 https://dist.apache.org/repos/dist/dev/mesos/0.22.1-rc6/mesos-0.22.1.tar.gz

 The tag to be voted on is 0.22.1-rc6:
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.22.1-rc6

 The MD5 checksum of the tarball can be found at:
 https://dist.apache.org/repos/dist/dev/mesos/0.22.1-rc6/mesos-0.22.1.tar.gz.md5

 The signature of the tarball can be found at:
 https://dist.apache.org/repos/dist/dev/mesos/0.22.1-rc6/mesos-0.22.1.tar.gz.asc

 The PGP key used to sign the release is here:
 https://dist.apache.org/repos/dist/release/mesos/KEYS

 The JAR is up in Maven in a staging repository here:
 https://repository.apache.org/content/repositories/orgapachemesos-1054

 Please vote on releasing this package as Apache Mesos 0.22.1!

 The vote is open until Mon May 4 18:00:00 PDT 2015 and passes if a majority
 of at least 3 +1 PMC votes are cast.

 [ ] +1 Release this package as Apache Mesos 0.22.1
 [ ] -1 Do not release this package because ...

 Thanks,
 -Adam-



Re: Suggestion: Mesos 0.22.1 point release

2015-04-24 Thread Timothy Chen
Thanks for the fix Jie!

Tim

On Fri, Apr 24, 2015 at 2:38 PM, Jie Yu j...@twitter.com.invalid wrote:
 Tim's patch cause a few compiler warnings. I committed a fix (just added to
 the spreadsheet).

 On Fri, Apr 24, 2015 at 2:35 PM, Jie Yu j...@twitter.com wrote:

 Now that MESOS-2601 has landed, shall we include it in 0.22.1(-rc5) too?


 +1

 On Fri, Apr 24, 2015 at 2:34 PM, Adam Bordelon a...@mesosphere.io wrote:

 Added MESOS-2643 to the cherry-pick spreadsheet
 https://docs.google.com/a/mesosphere.io/spreadsheets/d/1OzAWNjAL4zKtI-jOJqaQUcDNlnrNik2Dd7dHhwFLKcI/edit#gid=0
 .

 Now that MESOS-2601 has landed, shall we include it in 0.22.1(-rc5) too?

 On Wed, Apr 22, 2015 at 2:58 PM, Benjamin Mahler 
 bmah...@twitter.com.invalid wrote:

 Linked it with the release ticket as a blocker.

 The fix is committed.

 On Wed, Apr 22, 2015 at 2:53 PM, Benjamin Mahler bmah...@twitter.com
 wrote:

  One regression in 0.22.x we missed:
  https://issues.apache.org/jira/browse/MESOS-2643
 
  Here, the python bindings are not backwards compatible in that they
  disable implicit acknowledgements by default. The fix is trivial, and I
  have a review linked below. Can we get this in 0.22.1?
 
  https://reviews.apache.org/r/33452/
 
  On Tue, Apr 14, 2015 at 11:26 PM, Timothy Chen tnac...@gmail.com
 wrote:
 
  I also think we should push that fix for 0.23.0, it will take time to
  review and merge.
 
  Tim
 
  On Tue, Apr 14, 2015 at 10:17 PM, Benjamin Hindman
  b...@eecs.berkeley.edu wrote:
   Yes, fixing it in 0.23.0 SGTM.
  
   On Tue, Apr 14, 2015 at 10:02 PM, Jie Yu yujie@gmail.com
 wrote:
  
    I am just asking if you guys want to fix that for 0.22.1 or not. It sounds
    to me like a non-trivial fix. Given the bug has been there for quite a
    while, maybe we can fix it in 0.23.0?
  
   - Jie
  
   On Tue, Apr 14, 2015 at 9:55 PM, Benjamin Hindman 
  b...@eecs.berkeley.edu
   wrote:
  
We are going to include MESOS-2614 (it's a really trivial fix).
   
Jie, where did you get MESOS-2601 from? That's definitely not in
 the
spreadsheet.
   
On Tue, Apr 14, 2015 at 7:40 PM, Jie Yu yujie@gmail.com
 wrote:
   
 Also, this one:
 https://issues.apache.org/jira/browse/MESOS-2601

  This sounds like a non-trivial fix.

 - Jie

 On Tue, Apr 14, 2015 at 6:35 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com
 wrote:

    Per Nik's comment here:
  
   Based on input from Vinod and Adam; I will reduce the scope on the point
   release to focus on MESOS-1795 and MESOS-2583.
   I will move the other tickets back to 0.23.0 if you don't have any
   objections - let me know if you have any tickets which were regressions in
   0.22.0.
  
  
   I expected there to be fewer tickets in the spreadsheet, are the extra
   tickets (e.g. https://issues.apache.org/jira/browse/MESOS-2614) going to
   be included after all?
 
   On Tue, Apr 14, 2015 at 6:20 PM, Joris Van Remoortere jo...@mesosphere.io wrote:
 
    I think the plan is to cut a new RC by sometime tomorrow. The spreadsheet
    is up-to-date, just need to cherry-pick and modify the change-log.
  
   Joris
  
    On Tue, Apr 14, 2015 at 5:37 PM, Benjamin Mahler benjamin.mah...@gmail.com wrote:
  
 Hey Nik, any progress on this? Is the spreadsheet up-to-date?
   
    On Wed, Apr 8, 2015 at 1:00 AM, Adam Bordelon a...@mesosphere.io wrote:
   
 Hi Adam,

  Yes, once we have finalized the scope of the point release, Niklas will
  send out an announcement of Mesos 0.22.1-rc1 (release candidate) which we
  would love you to test any way you can. The email will contain instructions
  for building the release candidate and voting in the thread. See the vote
  thread from 0.22.0-rc4 (became final):

    http://www.mail-archive.com/dev%40mesos.apache.org/msg30668.html

  The current thread is to collect suggestions for bug fixes to include in
  this point release.

 Cheers,
 -Adam-

  On Tue, Apr 7, 2015 at 9:22 AM, Adam Avilla a...@avil.la wrote:

   On Fri, Apr 3, 2015 at 3:47 PM, Niklas Nielsen nik...@mesosphere.io wrote:
  
    Based on input from Vinod and Adam; I will reduce the scope on the point
    release to focus on MESOS-1795 and MESOS-2583.
  
 
  Can I help test these in any way?
 
  --
  /adam
 

   
  
 

   
  
 
 
 






Re: Review Request 29889: Recover Docker containers when mesos slave is in a container

2015-04-22 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29889/
---

(Updated April 22, 2015, 5:33 p.m.)


Review request for mesos and Benjamin Hindman.


Repository: mesos


Description
---

This is one mega patch containing many reviews that are already on rb.

This review is not meant to be merged; it is only provided for easier review.


Diffs (updated)
-

  Dockerfile 35abf25 
  docs/configuration.md 54c4e31 
  docs/docker-containerizer.md a5438b7 
  src/Makefile.am 93c7c8a 
  src/docker/docker.hpp 3ebbc1f 
  src/docker/docker.cpp 3a485a2 
  src/docker/executor.cpp PRE-CREATION 
  src/slave/containerizer/docker.hpp 0d5289c 
  src/slave/containerizer/docker.cpp f9fb078 
  src/slave/flags.hpp d3b1ce1 
  src/slave/flags.cpp d0932b0 
  src/tests/docker_containerizer_tests.cpp b119a17 

Diff: https://reviews.apache.org/r/29889/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-21 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33249/#review80915
---



src/slave/slave.cpp
https://reviews.apache.org/r/33249/#comment131094

std::string is already imported.
And we usually spell out whole words instead of abbreviations 
(e.g. errorMessage).



src/slave/slave.cpp
https://reviews.apache.org/r/33249/#comment131095

Shouldn't the mesos namespace be imported?



src/tests/slave_tests.cpp
https://reviews.apache.org/r/33249/#comment131105

You don't need a mock executor, right?



src/tests/slave_tests.cpp
https://reviews.apache.org/r/33249/#comment131100

Much better!



src/tests/slave_tests.cpp
https://reviews.apache.org/r/33249/#comment131099

Just one empty line.



src/tests/slave_tests.cpp
https://reviews.apache.org/r/33249/#comment131104

I don't think this is necessary?



src/tests/slave_tests.cpp
https://reviews.apache.org/r/33249/#comment131101

one line



src/tests/slave_tests.cpp
https://reviews.apache.org/r/33249/#comment131102

What's this for?


- Timothy Chen


On April 20, 2015, 10:43 p.m., Jay Buffington wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33249/
 ---
 
 (Updated April 20, 2015, 10:43 p.m.)
 
 
 Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
 
 
 Bugs: MESOS-2020
 https://issues.apache.org/jira/browse/MESOS-2020
 
 
 Repository: mesos
 
 
 Description
 ---
 
  When mesos is unable to launch the containerizer the scheduler should
  get a TASK_FAILED with a status message that includes the error the
  containerizer encountered when trying to launch.
 
 Introduces a new TaskStatus: REASON_CONTAINERIZER_LAUNCH_FAILED
 
 Fixes MESOS-2020
 
 
 Diffs
 -
 
   include/mesos/mesos.proto 3a8e8bf303e0576c212951f6028af77e54d93537 
   src/slave/slave.cpp 8ec80ed26f338690e0a1e712065750ab77a724cd 
   src/tests/slave_tests.cpp b826000e0a4221690f956ea51f49ad4c99d5e188 
 
 Diff: https://reviews.apache.org/r/33249/diff/
 
 
 Testing
 ---
 
 I added test case to slave_test.cpp.  I also tried this with Aurora, supplied 
 a bogus docker image url and saw the docker pull failure stderr message in 
 Aurora's web UI.
 
 
 Thanks,
 
 Jay Buffington
 




Re: Build failed in Jenkins: Mesos » clang,docker||Hadoop,ubuntu:14.10 #159

2015-04-21 Thread Timothy Chen
Thanks for the catch/forward and the jira ticket; is this actively
being worked on?

Tim

On Tue, Apr 21, 2015 at 3:28 PM, Vinod Kone vinodk...@apache.org wrote:
 https://issues.apache.org/jira/browse/MESOS-1303

 On Tue, Apr 21, 2015 at 2:50 PM, Apache Jenkins Server
 jenk...@builds.apache.org wrote:

 See
 https://builds.apache.org/job/Mesos/COMPILER=clang,LABEL=docker%7C%7CHadoop,OS=ubuntu%3A14.10/159/changes

 Changes:

 [tnachen] Fix destroying containerizer during isolator prepare.

 --
 [...truncated 92645 lines...]
 I0421 21:50:26.543983 23235 replica.cpp:744] Replica recovered with log
 positions 0 - 0 with 1 holes and 0 unlearned
 I0421 21:50:26.544834 23260 leveldb.cpp:306] Persisting metadata (8 bytes)
 to leveldb took 552418ns
 I0421 21:50:26.544863 23260 replica.cpp:323] Persisted replica status to
 VOTING
 I0421 21:50:26.548025 23235 leveldb.cpp:176] Opened db in 2.697963ms
 I0421 21:50:26.550454 23235 leveldb.cpp:183] Compacted db in 2.409055ms
 I0421 21:50:26.550513 23235 leveldb.cpp:198] Created db iterator in
 29502ns
 I0421 21:50:26.550550 23235 leveldb.cpp:204] Seeked to beginning of db in
 26404ns
 I0421 21:50:26.550631 23235 leveldb.cpp:273] Iterated through 1 keys in
 the db in 31989ns
 I0421 21:50:26.550665 23235 replica.cpp:744] Replica recovered with log
 positions 0 - 0 with 1 holes and 0 unlearned
 I0421 21:50:26.553575 23235 leveldb.cpp:176] Opened db in 2.777253ms
 I0421 21:50:26.556329 23235 leveldb.cpp:183] Compacted db in 2.731748ms
 I0421 21:50:26.556382 23235 leveldb.cpp:198] Created db iterator in
 27342ns
 I0421 21:50:26.556417 23235 leveldb.cpp:204] Seeked to beginning of db in
 27324ns
 I0421 21:50:26.556459 23235 leveldb.cpp:273] Iterated through 1 keys in
 the db in 39859ns
 I0421 21:50:26.556507 23235 replica.cpp:744] Replica recovered with log
 positions 0 - 0 with 1 holes and 0 unlearned
 I0421 21:50:26.556982 23264 recover.cpp:449] Starting replica recovery
 I0421 21:50:26.557277 23264 recover.cpp:475] Replica is in VOTING status
 I0421 21:50:26.557397 23264 recover.cpp:464] Recover process terminated
 I0421 21:50:26.560544 23266 registrar.cpp:313] Recovering registrar
 [   OK ] Strict/RegistrarTest.fetchTimeout/0 (50 ms)
 [ RUN  ] Strict/RegistrarTest.fetchTimeout/1
 Using temporary directory
 '/tmp/Strict_RegistrarTest_fetchTimeout_1_owEdnG'
 I0421 21:50:26.588191 23235 leveldb.cpp:176] Opened db in 2.831873ms
 I0421 21:50:26.589162 23235 leveldb.cpp:183] Compacted db in 966842ns
 I0421 21:50:26.589227 23235 leveldb.cpp:198] Created db iterator in
 24734ns
 I0421 21:50:26.589246 23235 leveldb.cpp:204] Seeked to beginning of db in
 7315ns
 I0421 21:50:26.589256 23235 leveldb.cpp:273] Iterated through 0 keys in
 the db in 5774ns
 I0421 21:50:26.589283 23235 replica.cpp:744] Replica recovered with log
 positions 0 - 0 with 1 holes and 0 unlearned
 I0421 21:50:26.590036 23268 leveldb.cpp:306] Persisting metadata (8 bytes)
 to leveldb took 561242ns
 I0421 21:50:26.590073 23268 replica.cpp:323] Persisted replica status to
 VOTING
 I0421 21:50:26.593502 23235 leveldb.cpp:176] Opened db in 2.884378ms
 I0421 21:50:26.594516 23235 leveldb.cpp:183] Compacted db in 992663ns
 I0421 21:50:26.594563 23235 leveldb.cpp:198] Created db iterator in
 23944ns
 I0421 21:50:26.594583 23235 leveldb.cpp:204] Seeked to beginning of db in
 7217ns
 I0421 21:50:26.594593 23235 leveldb.cpp:273] Iterated through 0 keys in
 the db in 5732ns
 I0421 21:50:26.594619 23235 replica.cpp:744] Replica recovered with log
 positions 0 - 0 with 1 holes and 0 unlearned
 I0421 21:50:26.595427 23255 leveldb.cpp:306] Persisting metadata (8 bytes)
 to leveldb took 558110ns
 I0421 21:50:26.595461 23255 replica.cpp:323] Persisted replica status to
 VOTING
 I0421 21:50:26.598736 23235 leveldb.cpp:176] Opened db in 2.747694ms
 I0421 21:50:26.601641 23235 leveldb.cpp:183] Compacted db in 2.886673ms
 I0421 21:50:26.601701 23235 leveldb.cpp:198] Created db iterator in
 30034ns
 I0421 21:50:26.601739 23235 leveldb.cpp:204] Seeked to beginning of db in
 28567ns
 I0421 21:50:26.601775 23235 leveldb.cpp:273] Iterated through 1 keys in
 the db in 30272ns
 I0421 21:50:26.601807 23235 replica.cpp:744] Replica recovered with log
 positions 0 - 0 with 1 holes and 0 unlearned
 I0421 21:50:26.604929 23235 leveldb.cpp:176] Opened db in 2.992692ms
 I0421 21:50:26.607494 23235 leveldb.cpp:183] Compacted db in 2.545334ms
 I0421 21:50:26.607553 23235 leveldb.cpp:198] Created db iterator in
 29819ns
 I0421 21:50:26.607594 23235 leveldb.cpp:204] Seeked to beginning of db in
 26889ns
 I0421 21:50:26.607631 23235 leveldb.cpp:273] Iterated through 1 keys in
 the db in 30789ns
 I0421 21:50:26.607666 23235 replica.cpp:744] Replica recovered with log
 positions 0 - 0 with 1 holes and 0 unlearned
 I0421 21:50:26.608104 23266 recover.cpp:449] Starting replica recovery
 I0421 21:50:26.608356 23266 recover.cpp:475] Replica is in VOTING status
 I0421 21:50:26.608477 23266 recover.cpp:464] 

Re: Review Request 29334: Add option to launch docker containers with helper containers.

2015-04-21 Thread Timothy Chen


 On Jan. 18, 2015, 12:18 p.m., Bernd Mathiske wrote:
  src/slave/containerizer/docker.cpp, line 982
  https://reviews.apache.org/r/29334/diff/5/?file=824297#file824297line982
 
  Not checking the outcome of this?
 
 Timothy Chen wrote:
 This is actually reverted in the last patch I have, so please ignore this.

Also we never used to check the outcome as it's just a best effort, I'll leave 
a comment in for now. Going to merge these patches and later finalize the 
remaining ones.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29334/#review68559
---


On April 7, 2015, 12:46 a.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29334/
 ---
 
 (Updated April 7, 2015, 12:46 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Bernd Mathiske.
 
 
 Bugs: MESOS-2183
 https://issues.apache.org/jira/browse/MESOS-2183
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add option to launch docker containers with helper containers.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.hpp b7bf54a 
   src/slave/containerizer/docker.cpp 5f4b4ce 
 
 Diff: https://reviews.apache.org/r/29334/diff/
 
 
 Testing
 ---
 
 make, tests are fixed in next commit
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 33257: Fixed recover tasks only by the initiated containerizer.

2015-04-20 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/#review80813
---



src/slave/containerizer/mesos/containerizer.cpp
https://reviews.apache.org/r/33257/#comment130926

Although it doesn't break anything functionally, the effect is that the mesos 
containerizer will think it owns these executors as well and report them to 
monitoring. IMO that's a separate issue we will need to fix in the mesos 
containerizer in another patch.



src/slave/slave.cpp
https://reviews.apache.org/r/33257/#comment130925

Not sure what else we'll need in the future; sounds good, I'll copy the whole 
ContainerInfo when it's there.



src/tests/containerizer_tests.cpp
https://reviews.apache.org/r/33257/#comment130927

I can set it as well.


- Timothy Chen


On April 17, 2015, 7:07 p.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33257/
 ---
 
 (Updated April 17, 2015, 7:07 p.m.)
 
 
 Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie 
 Yu, and Till Toenshoff.
 
 
 Bugs: MESOS-2601
 https://issues.apache.org/jira/browse/MESOS-2601
 
 
 Repository: mesos
 
 
 Description
 ---
 
  Fixed recover tasks only by the initiated containerizer.
  Currently both the mesos and docker containerizers recover tasks that weren't 
  started by themselves.
 The proposed fix is to record the intended containerizer in the checkpointed 
 executorInfo, and reuse that information on recover to know if the 
 containerizer should recover or not. We are free to modify the executorInfo 
 since it's not being used to relaunch any task.
 The external containerizer doesn't need to change since it is only recovering 
 containers that are returned by the containers script.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.hpp 6893684e6d199a5d69fc8bba8e60c4acaae9c3c9 
   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
   src/slave/containerizer/mesos/containerizer.cpp 
 e4136095fca55637864f495098189ab3ad8d8fe7 
   src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
   src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
   src/tests/docker_containerizer_tests.cpp 
 c772d4c836de18b0e87636cb42200356d24ec73d 
 
 Diff: https://reviews.apache.org/r/33257/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 33257: Fixed recover tasks only by the initiated containerizer.

2015-04-20 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/
---

(Updated April 20, 2015, 9:18 p.m.)


Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie Yu, 
and Till Toenshoff.


Bugs: MESOS-2601
https://issues.apache.org/jira/browse/MESOS-2601


Repository: mesos


Description
---

Fixed recover tasks only by the initiated containerizer.
Currently both the mesos and docker containerizers recover tasks that weren't 
started by themselves.
The proposed fix is to record the intended containerizer in the checkpointed 
executorInfo, and reuse that information on recover to know if the 
containerizer should recover or not. We are free to modify the executorInfo 
since it's not being used to relaunch any task.
The external containerizer doesn't need to change since it is only recovering 
containers that are returned by the containers script.


Diffs (updated)
-

  src/slave/containerizer/docker.hpp 0d5289c626bdf22420759485f2146abfb6bdaffc 
  src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
  src/slave/containerizer/mesos/containerizer.cpp 
e4136095fca55637864f495098189ab3ad8d8fe7 
  src/slave/slave.cpp 8ec80ed26f338690e0a1e712065750ab77a724cd 
  src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
  src/tests/docker_containerizer_tests.cpp 
b119a174de79c2f98a0c575e6be2f59649f509ef 

Diff: https://reviews.apache.org/r/33257/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29334: Add option to launch docker containers with helper containers.

2015-04-20 Thread Timothy Chen


 On Jan. 18, 2015, 12:18 p.m., Bernd Mathiske wrote:
  src/slave/containerizer/docker.cpp, line 615
  https://reviews.apache.org/r/29334/diff/5/?file=824297#file824297line615
 
  Here one can see that we are dealing with a more general graph than a 
  series. How can one know why ___launch (3 underscores) is not needed in 
  this branch?
  
  And how can one know what ___launch, launch, _launch do in the 
  first place? One must read them all, memorize them, and then read this 
  stretch of code again. Ideas?

We can talk about what we'd like this to be, but I'd like to avoid general 
renaming and flow changes in this patch; we can file another JIRA if we need 
changes to make things clearer.


 On Jan. 18, 2015, 12:18 p.m., Bernd Mathiske wrote:
  src/slave/containerizer/docker.cpp, line 784
  https://reviews.apache.org/r/29334/diff/5/?file=824297#file824297line784
 
  mesos-docker-executor comes out of nowhere. Declaring it as a 
  constant (possibly, but not necessarily in this file) would provide a 
  natural way to link to background information (e.g. what exactly this 
  program does and where to find it). This info could then be found trivially 
  by looking up said declaration, where one would place some explanations.

As it's only referenced here, I'll just leave a comment around it for now.


 On Jan. 18, 2015, 12:18 p.m., Bernd Mathiske wrote:
  src/slave/containerizer/docker.cpp, line 982
  https://reviews.apache.org/r/29334/diff/5/?file=824297#file824297line982
 
  Not checking the outcome of this?

This is actually reverted in the last patch I have, so please ignore this.


 On Jan. 18, 2015, 12:18 p.m., Bernd Mathiske wrote:
  src/slave/containerizer/docker.cpp, line 1421
  https://reviews.apache.org/r/29334/diff/5/?file=824297#file824297line1421
 
  not - not have
  on - in
  has - had been
  executor - executors

This is no longer relevant in my latest patch in this series; please ignore.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29334/#review68559
---


On April 7, 2015, 12:46 a.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29334/
 ---
 
 (Updated April 7, 2015, 12:46 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Bernd Mathiske.
 
 
 Bugs: MESOS-2183
 https://issues.apache.org/jira/browse/MESOS-2183
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add option to launch docker containers with helper containers.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.hpp b7bf54a 
   src/slave/containerizer/docker.cpp 5f4b4ce 
 
 Diff: https://reviews.apache.org/r/29334/diff/
 
 
 Testing
 ---
 
 make, tests are fixed in next commit
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 33257: Fixed recover tasks only by the initiated containerizer.

2015-04-20 Thread Timothy Chen


 On April 18, 2015, 2:55 a.m., Till Toenshoff wrote:
  src/slave/containerizer/docker.cpp, line 424
  https://reviews.apache.org/r/33257/diff/3/?file=934066#file934066line424
 
  Not sure what would be correct but I guess it would be right to use a 
  lower-case d for docker in all comments. Right now we have both 
  variants in this patch.

It's actually much more prevalent to use an upper-case Docker in the comments; 
I'll fix the other lower-case ones.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/#review80576
---


On April 17, 2015, 7:07 p.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33257/
 ---
 
 (Updated April 17, 2015, 7:07 p.m.)
 
 
 Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie 
 Yu, and Till Toenshoff.
 
 
 Bugs: MESOS-2601
 https://issues.apache.org/jira/browse/MESOS-2601
 
 
 Repository: mesos
 
 
 Description
 ---
 
  Fixed recover tasks only by the initiated containerizer.
  Currently both the mesos and docker containerizers recover tasks that weren't 
  started by themselves.
 The proposed fix is to record the intended containerizer in the checkpointed 
 executorInfo, and reuse that information on recover to know if the 
 containerizer should recover or not. We are free to modify the executorInfo 
 since it's not being used to relaunch any task.
 The external containerizer doesn't need to change since it is only recovering 
 containers that are returned by the containers script.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.hpp 6893684e6d199a5d69fc8bba8e60c4acaae9c3c9 
   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
   src/slave/containerizer/mesos/containerizer.cpp 
 e4136095fca55637864f495098189ab3ad8d8fe7 
   src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
   src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
   src/tests/docker_containerizer_tests.cpp 
 c772d4c836de18b0e87636cb42200356d24ec73d 
 
 Diff: https://reviews.apache.org/r/33257/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 33174: Fix for docker not configuring CFS quotas correctly

2015-04-18 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33174/#review80595
---



src/slave/containerizer/docker.cpp
https://reviews.apache.org/r/33174/#comment130681

The pid here is the executor pid, which is not the docker container pid, so 
setting the cgroup on the command executor doesn't buy you much.



src/slave/containerizer/docker.cpp
https://reviews.apache.org/r/33174/#comment130682

We shouldn't need to call update right after for cfs, and this also seems 
like it's going to write to the cfs cgroup on every update, which seems wrong 
too.

If we want to set this up, we should have a patch in launch that is only 
responsible for setting cfs if it's configured.


- Timothy Chen


On April 14, 2015, 8:32 p.m., Steve Niemitz wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33174/
 ---
 
 (Updated April 14, 2015, 8:32 p.m.)
 
 
 Review request for mesos, Ian Downes, Jie Yu, and Timothy Chen.
 
 
 Bugs: MESOS-2617
 https://issues.apache.org/jira/browse/MESOS-2617
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fix for docker containerizer not configuring CFS quotas correctly.
 
 It would be nice to refactor all this isolation code in a way that can be 
 shared between all containerizers, as this is basically just copied from the 
 CgroupsCpushareIsolator, but that's a much bigger undertaking.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
 
 Diff: https://reviews.apache.org/r/33174/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Steve Niemitz
 




Re: Review Request 30609: Added a function that reports file size, not following links.

2015-04-18 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30609/#review80596
---

Ship it!


Ship It!

- Timothy Chen


On March 11, 2015, 5:06 p.m., Bernd Mathiske wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30609/
 ---
 
 (Updated March 11, 2015, 5:06 p.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
 Timothy Chen.
 
 
 Bugs: MESOS-2072
 https://issues.apache.org/jira/browse/MESOS-2072
 
 
 Repository: mesos
 
 
 Description
 ---
 
  This returns a file's size (on UNIXes as reported by lstat(), not stat()). It 
  is desired that, in the case of a link, the size of the link itself, not the 
  size of the referenced file, is returned.
 
 
 Diffs
 -
 
   3rdparty/libprocess/3rdparty/stout/include/stout/os/stat.hpp 
 af940a48b161c194f2efb529b3d589c543b12f61 
   3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 
 c396c1d2d833b2f1721092fa35b23b5c3c3d99b3 
 
 Diff: https://reviews.apache.org/r/30609/diff/
 
 
 Testing
 ---
 
 Wrote a simple test that creates a file and tests its size, and also checks 
 if a non-existing file yields an error.
 
 
 Thanks,
 
 Bernd Mathiske
 




Re: Review Request 33174: Fix for docker not configuring CFS quotas correctly

2015-04-18 Thread Timothy Chen


 On April 18, 2015, 10:43 a.m., Timothy Chen wrote:
  src/slave/containerizer/docker.cpp, line 838
  https://reviews.apache.org/r/33174/diff/2/?file=927472#file927472line838
 
  The pid here is the executor pid, which is not the docker container 
  pid, so setting cgroup on the command executor doesn't buy you much.
 
 Steve Niemitz wrote:
 So what is your suggestion?

Look at how update() does it: we need to ask Docker for the actual pid of the 
container and use that pid to update its cgroups.


 On April 18, 2015, 10:43 a.m., Timothy Chen wrote:
  src/slave/containerizer/docker.cpp, line 976
  https://reviews.apache.org/r/33174/diff/2/?file=927472#file927472line976
 
  We shouldn't need to call update right after for cfs, and also this 
  seems like it's going to write to cfs cgroup every update, which seems 
  wrong too.
  
  If we want to set up this we should have a patch in launch that is only 
  responsible for setting cfs if it's configured.
 
 Steve Niemitz wrote:
  Isn't that exactly what I'm doing?  This update() is now called from 
  launch to set the CFS quota (and all quotas).
  The normal update() path only checks if the resources are different, and 
  if they aren't, it doesn't write anything to the cgroups. So I don't know 
  what you mean by "and also this seems like it's going to write to the cfs 
  cgroup every update".
  Also, why would it be different from the non-CFS code above it? Wouldn't 
  that have the same issue of writing on every update?

So what I'm trying to convey is that update() has a specific meaning to the 
slave and is called at different times (task launch, destroy, etc.) for the 
same executor.

The CFS values only need to be set up once per container, so we should create 
a new method used just for launch instead of modifying update, since update 
will be called several times.

We shouldn't change update unless we want the CFS quota to change over the 
lifetime of the container, which I don't believe we do.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33174/#review80595
---


On April 14, 2015, 8:32 p.m., Steve Niemitz wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33174/
 ---
 
 (Updated April 14, 2015, 8:32 p.m.)
 
 
 Review request for mesos, Ian Downes, Jie Yu, and Timothy Chen.
 
 
 Bugs: MESOS-2617
 https://issues.apache.org/jira/browse/MESOS-2617
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fix for docker containerizer not configuring CFS quotas correctly.
 
 It would be nice to refactor all this isolation code in a way that can be 
 shared between all containerizers, as this is basically just copied from the 
 CgroupsCpushareIsolator, but that's a much bigger undertaking.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
 
 Diff: https://reviews.apache.org/r/33174/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Steve Niemitz
 




Review Request 33318: Fix docker containerizer usage and test.

2015-04-17 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33318/
---

Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till Toenshoff.


Repository: mesos


Description
---

Fix docker containerizer usage and test.
The docker usage test is failing with the most recent change that includes 
executor resources in the docker containerizer.


Diffs
-

  src/slave/containerizer/docker.hpp 6893684e6d199a5d69fc8bba8e60c4acaae9c3c9 
  src/tests/docker_containerizer_tests.cpp 
c772d4c836de18b0e87636cb42200356d24ec73d 

Diff: https://reviews.apache.org/r/33318/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 33257: Fixed recover tasks only by the initiated containerizer.

2015-04-17 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/#review80504
---



src/slave/containerizer/docker.cpp
https://reviews.apache.org/r/33257/#comment130460

We'll fix this later altogether in another patch.


- Timothy Chen


On April 16, 2015, 7:10 a.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33257/
 ---
 
 (Updated April 16, 2015, 7:10 a.m.)
 
 
 Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie 
 Yu, and Till Toenshoff.
 
 
 Bugs: MESOS-2601
 https://issues.apache.org/jira/browse/MESOS-2601
 
 
 Repository: mesos
 
 
 Description
 ---
 
  Fixed recover tasks only by the initiated containerizer.
  Currently both the mesos and docker containerizers recover tasks that weren't 
  started by themselves.
 The proposed fix is to record the intended containerizer in the checkpointed 
 executorInfo, and reuse that information on recover to know if the 
 containerizer should recover or not. We are free to modify the executorInfo 
 since it's not being used to relaunch any task.
 The external containerizer doesn't need to change since it is only recovering 
 containers that are returned by the containers script.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
   src/slave/containerizer/mesos/containerizer.cpp 
 e4136095fca55637864f495098189ab3ad8d8fe7 
   src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
   src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
   src/tests/docker_containerizer_tests.cpp 
 c772d4c836de18b0e87636cb42200356d24ec73d 
 
 Diff: https://reviews.apache.org/r/33257/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 33257: Fixed recover tasks only by the initiated containerizer.

2015-04-17 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/
---

(Updated April 17, 2015, 7:07 p.m.)


Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie Yu, 
and Till Toenshoff.


Changes
---

This change also handles recovering docker containers after upgrade.


Bugs: MESOS-2601
https://issues.apache.org/jira/browse/MESOS-2601


Repository: mesos


Description
---

Fixed recovering tasks only by the initiating containerizer.
Currently both the mesos and docker containerizers recover tasks that weren't 
started by themselves.
The proposed fix is to record the intended containerizer in the checkpointed 
executorInfo, and reuse that information on recover to know if the 
containerizer should recover or not. We are free to modify the executorInfo 
since it's not being used to relaunch any task.
The external containerizer doesn't need to change since it is only recovering 
containers that are returned by the containers script.


Diffs (updated)
-

  src/slave/containerizer/docker.hpp 6893684e6d199a5d69fc8bba8e60c4acaae9c3c9 
  src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
  src/slave/containerizer/mesos/containerizer.cpp 
e4136095fca55637864f495098189ab3ad8d8fe7 
  src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
  src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
  src/tests/docker_containerizer_tests.cpp 
c772d4c836de18b0e87636cb42200356d24ec73d 

Diff: https://reviews.apache.org/r/33257/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Build failed in Jenkins: Mesos » gcc,docker||Hadoop,centos:7 #152

2015-04-17 Thread Timothy Chen
Hi Vinod,

BenH already pushed a commit to fix this.

Thanks,

Tim

On Fri, Apr 17, 2015 at 3:14 PM, Vinod Kone vinodk...@apache.org wrote:
 ben, looks like this is your commit?

 In file included from 
 ../../3rdparty/libprocess/include/process/address.hpp:9:0,
  from 
 ../../3rdparty/libprocess/include/process/process.hpp:10,
  from
 ../../3rdparty/libprocess/include/process/c++11/dispatch.hpp:8,
  from 
 ../../3rdparty/libprocess/include/process/dispatch.hpp:2,
  from ../../src/slave/containerizer/containerizer.cpp:22:
 ../../src/slave/containerizer/docker.hpp: In constructor
 'mesos::internal::slave::DockerContainerizerProcess::Container::Container(const
 mesos::ContainerID&, const Option<mesos::TaskInfo>&, const
 mesos::ExecutorInfo&, const string&, const
 Option<std::basic_string<char> >&, const mesos::SlaveID&, const
 process::PID<mesos::internal::slave::Slave>&, bool, bool, const
 mesos::internal::slave::Flags&)':
 ../../src/slave/containerizer/docker.hpp:303:36: error: no match for
 'operator<=' (operand types are 'const
 google::protobuf::RepeatedPtrField<mesos::Resource>' and 'const
 google::protobuf::RepeatedPtrField<mesos::Resource>')
  CHECK(executor.resources() <= task.get().resources());
 ^
 ../3rdparty/libprocess/3rdparty/glog-0.3.3/src/glog/logging.h:548:5:
 note: in definition of macro 'LOG_IF'
 !(condition) ? (void) 0 : google::LogMessageVoidify() & LOG(severity)
  ^
 ../3rdparty/libprocess/3rdparty/glog-0.3.3/src/glog/logging.h:562:21:
 note: in expansion of macro 'GOOGLE_PREDICT_BRANCH_NOT_TAKEN'
 LOG_IF(FATAL, GOOGLE_PREDICT_BRANCH_NOT_TAKEN(!(condition))) \
  ^
 ../../src/slave/containerizer/docker.hpp:303:9: note: in expansion of
 macro 'CHECK'
  CHECK(executor.resources() <= task.get().resources());



 On Fri, Apr 17, 2015 at 3:10 PM, Apache Jenkins Server 
 jenk...@builds.apache.org wrote:

 See 
 https://builds.apache.org/job/Mesos/COMPILER=gcc,LABEL=docker%7C%7CHadoop,OS=centos%3A7/152/changes
 

 Changes:

 [benjamin.hindman] Fix docker containerizer usage and test.

 --
 [...truncated 14953 lines...]
 [5526919.944592] docker0: port 1(vethbaa02af) entered forwarding state
 [5528260.552205] device veth2f4139b entered promiscuous mode
 [5528260.552918] IPv6: ADDRCONF(NETDEV_UP): veth2f4139b: link is not ready
 [5528260.587532] IPv6: ADDRCONF(NETDEV_CHANGE): veth2f4139b: link becomes
 ready
 [5528260.587590] docker0: port 2(veth2f4139b) entered forwarding state
 [5528260.587602] docker0: port 2(veth2f4139b) entered forwarding state
 [5528261.089483] docker0: port 2(veth2f4139b) entered disabled state
 [5528261.090011] device veth2f4139b left promiscuous mode
 [5528261.090020] docker0: port 2(veth2f4139b) entered disabled state
 [5528264.183826] device veth6605cd0 entered promiscuous mode
 [5528264.184472] IPv6: ADDRCONF(NETDEV_UP): veth6605cd0: link is not ready
 [5528264.226441] IPv6: ADDRCONF(NETDEV_CHANGE): veth6605cd0: link becomes
 ready
 [5528264.226479] docker0: port 2(veth6605cd0) entered forwarding state
 [5528264.226484] docker0: port 2(veth6605cd0) entered forwarding state
 [5528279.281925] docker0: port 2(veth6605cd0) entered forwarding state
 [5529322.453364] docker0: port 2(veth6605cd0) entered disabled state
 [5529322.454146] device veth6605cd0 left promiscuous mode
 [5529322.454154] docker0: port 2(veth6605cd0) entered disabled state
 [5530892.344572] docker0: port 1(vethbaa02af) entered disabled state
 [5530892.345423] device vethbaa02af left promiscuous mode
 [5530892.345437] docker0: port 1(vethbaa02af) entered disabled state
 [5531169.621507] device veth85da1b2 entered promiscuous mode
 [5531169.622008] IPv6: ADDRCONF(NETDEV_UP): veth85da1b2: link is not ready
 [5531169.656545] IPv6: ADDRCONF(NETDEV_CHANGE): veth85da1b2: link becomes
 ready
 [5531169.656584] docker0: port 1(veth85da1b2) entered forwarding state
 [5531169.656591] docker0: port 1(veth85da1b2) entered forwarding state
 [5531170.282755] docker0: port 1(veth85da1b2) entered disabled state
 [5531170.283477] device veth85da1b2 left promiscuous mode
 [5531170.283488] docker0: port 1(veth85da1b2) entered disabled state
 [5531173.398973] device veth0a7f876 entered promiscuous mode
 [5531173.399701] IPv6: ADDRCONF(NETDEV_UP): veth0a7f876: link is not ready
 [5531173.439748] IPv6: ADDRCONF(NETDEV_CHANGE): veth0a7f876: link becomes
 ready
 [5531173.439796] docker0: port 1(veth0a7f876) entered forwarding state
 [5531173.439806] docker0: port 1(veth0a7f876) entered forwarding state
 [5531188.478086] docker0: port 1(veth0a7f876) entered forwarding state
 [5532087.368782] docker0: port 1(veth0a7f876) entered disabled state
 [5532087.369700] device veth0a7f876 left promiscuous mode
 [5532087.369715] docker0: port 1(veth0a7f876) entered disabled state
 [5534768.804830] device vethda651d2 entered promiscuous mode
 [5534768.805253] IPv6: 

Review Request 33257: Fixed recovering tasks only by the initiating containerizer.

2015-04-16 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/
---

Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie Yu, 
and Till Toenshoff.


Bugs: MESOS-2601
https://issues.apache.org/jira/browse/MESOS-2601


Repository: mesos


Description
---

Fixed recovering tasks only by the initiating containerizer.
Currently both the mesos and docker containerizers recover tasks that weren't 
started by themselves.
The proposed fix is to record the intended containerizer in the checkpointed 
executorInfo, and reuse that information on recover to know if the 
containerizer should recover or not. We are free to modify the executorInfo 
since it's not being used to relaunch any task.


Diffs
-

  src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
  src/slave/containerizer/mesos/containerizer.cpp 
e4136095fca55637864f495098189ab3ad8d8fe7 
  src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
  src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
  src/tests/docker_containerizer_tests.cpp 
c772d4c836de18b0e87636cb42200356d24ec73d 

Diff: https://reviews.apache.org/r/33257/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 33257: Fixed recovering tasks only by the initiating containerizer.

2015-04-16 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/
---

(Updated April 16, 2015, 7:08 a.m.)


Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie Yu, 
and Till Toenshoff.


Bugs: MESOS-2601
https://issues.apache.org/jira/browse/MESOS-2601


Repository: mesos


Description
---

Fixed recovering tasks only by the initiating containerizer.
Currently both the mesos and docker containerizers recover tasks that weren't 
started by themselves.
The proposed fix is to record the intended containerizer in the checkpointed 
executorInfo, and reuse that information on recover to know if the 
containerizer should recover or not. We are free to modify the executorInfo 
since it's not being used to relaunch any task.


Diffs (updated)
-

  src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
  src/slave/containerizer/mesos/containerizer.cpp 
e4136095fca55637864f495098189ab3ad8d8fe7 
  src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
  src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
  src/tests/docker_containerizer_tests.cpp 
c772d4c836de18b0e87636cb42200356d24ec73d 

Diff: https://reviews.apache.org/r/33257/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 33257: Fixed recovering tasks only by the initiating containerizer.

2015-04-16 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33257/
---

(Updated April 16, 2015, 7:10 a.m.)


Review request for mesos, Benjamin Hindman, Bernd Mathiske, Ian Downes, Jie Yu, 
and Till Toenshoff.


Bugs: MESOS-2601
https://issues.apache.org/jira/browse/MESOS-2601


Repository: mesos


Description (updated)
---

Fixed recovering tasks only by the initiating containerizer.
Currently both the mesos and docker containerizers recover tasks that weren't 
started by themselves.
The proposed fix is to record the intended containerizer in the checkpointed 
executorInfo, and reuse that information on recover to know if the 
containerizer should recover or not. We are free to modify the executorInfo 
since it's not being used to relaunch any task.
The external containerizer doesn't need to change since it is only recovering 
containers that are returned by the containers script.


Diffs
-

  src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
  src/slave/containerizer/mesos/containerizer.cpp 
e4136095fca55637864f495098189ab3ad8d8fe7 
  src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
  src/tests/containerizer_tests.cpp 5991aa628083dac7c5e8bf7ba297f4f9edeec05f 
  src/tests/docker_containerizer_tests.cpp 
c772d4c836de18b0e87636cb42200356d24ec73d 

Diff: https://reviews.apache.org/r/33257/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-16 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33249/#review80347
---



src/slave/slave.cpp
https://reviews.apache.org/r/33249/#comment130197

We're already sending back a status update when the registration times out, 
and if we send another one here the scheduler will actually get two TASK_FAILED 
statuses.

I think we should either populate the reason in the final status update to say 
that the containerizer launch failed, or make sure we just send one here.

The nice thing about having it handled in the timeout is that there are fewer 
places in the slave where we do status updates, but with the caveat that you 
wait for the timeout to occur, which is something I never really liked.

I think if we can make the code clean and make sure there is just one 
status update propagated back, I'd rather see it happen here.


- Timothy Chen


On April 16, 2015, 3:16 p.m., Jay Buffington wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33249/
 ---
 
 (Updated April 16, 2015, 3:16 p.m.)
 
 
 Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
 
 
 Bugs: MESOS-2020
 https://issues.apache.org/jira/browse/MESOS-2020
 
 
 Repository: mesos
 
 
 Description
 ---
 
 When mesos is unable to launch the containerizer the scheduler should
 get a TASK_FAILED with a status message that includes the error the
 containerizer encountered when trying to launch.
 
 Introduces a new TaskStatus: REASON_CONTAINERIZER_LAUNCH_FAILED
 
 Fixes MESOS-2020
 
 
 Diffs
 -
 
   include/mesos/mesos.proto 3a8e8bf303e0576c212951f6028af77e54d93537 
   src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
   src/tests/containerizer.cpp 26b87ac6b16dfeaf84888e80296ef540697bd775 
   src/tests/slave_tests.cpp b826000e0a4221690f956ea51f49ad4c99d5e188 
 
 Diff: https://reviews.apache.org/r/33249/diff/
 
 
 Testing
 ---
 
 I added test case to slave_test.cpp.  I also tried this with Aurora, supplied 
 a bogus docker image url and saw the docker pull failure stderr message in 
 Aurora's web UI.
 
 
 Thanks,
 
 Jay Buffington
 




Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-16 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33249/#review80350
---



src/tests/containerizer.cpp
https://reviews.apache.org/r/33249/#comment130199

How about setting a flag in the TestContainerizer constructor that tells it 
to send a failure on launch? Checking a value like this is too implicit, and someone 
else might not know the magic meaning on the other side (as I don't see any 
comment in the test to mark that this has special meaning).


- Timothy Chen


On April 16, 2015, 3:16 p.m., Jay Buffington wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33249/
 ---
 
 (Updated April 16, 2015, 3:16 p.m.)
 
 
 Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
 
 
 Bugs: MESOS-2020
 https://issues.apache.org/jira/browse/MESOS-2020
 
 
 Repository: mesos
 
 
 Description
 ---
 
 When mesos is unable to launch the containerizer the scheduler should
 get a TASK_FAILED with a status message that includes the error the
 containerizer encountered when trying to launch.
 
 Introduces a new TaskStatus: REASON_CONTAINERIZER_LAUNCH_FAILED
 
 Fixes MESOS-2020
 
 
 Diffs
 -
 
   include/mesos/mesos.proto 3a8e8bf303e0576c212951f6028af77e54d93537 
   src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
   src/tests/containerizer.cpp 26b87ac6b16dfeaf84888e80296ef540697bd775 
   src/tests/slave_tests.cpp b826000e0a4221690f956ea51f49ad4c99d5e188 
 
 Diff: https://reviews.apache.org/r/33249/diff/
 
 
 Testing
 ---
 
 I added test case to slave_test.cpp.  I also tried this with Aurora, supplied 
 a bogus docker image url and saw the docker pull failure stderr message in 
 Aurora's web UI.
 
 
 Thanks,
 
 Jay Buffington
 




Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-16 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33249/#review80352
---



src/slave/slave.cpp
https://reviews.apache.org/r/33249/#comment130205

Btw, the framework and executor might not be there anymore if they are 
terminated and cleaned up between launch and the callback; see the checks we do below.


- Timothy Chen


On April 16, 2015, 3:16 p.m., Jay Buffington wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33249/
 ---
 
 (Updated April 16, 2015, 3:16 p.m.)
 
 
 Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
 
 
 Bugs: MESOS-2020
 https://issues.apache.org/jira/browse/MESOS-2020
 
 
 Repository: mesos
 
 
 Description
 ---
 
 When mesos is unable to launch the containerizer the scheduler should
 get a TASK_FAILED with a status message that includes the error the
 containerizer encountered when trying to launch.
 
 Introduces a new TaskStatus: REASON_CONTAINERIZER_LAUNCH_FAILED
 
 Fixes MESOS-2020
 
 
 Diffs
 -
 
   include/mesos/mesos.proto 3a8e8bf303e0576c212951f6028af77e54d93537 
   src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
   src/tests/containerizer.cpp 26b87ac6b16dfeaf84888e80296ef540697bd775 
   src/tests/slave_tests.cpp b826000e0a4221690f956ea51f49ad4c99d5e188 
 
 Diff: https://reviews.apache.org/r/33249/diff/
 
 
 Testing
 ---
 
 I added test case to slave_test.cpp.  I also tried this with Aurora, supplied 
 a bogus docker image url and saw the docker pull failure stderr message in 
 Aurora's web UI.
 
 
 Thanks,
 
 Jay Buffington
 




Re: Review Request 33249: Send statusUpdate to scheduler on containerizer launch failure

2015-04-16 Thread Timothy Chen


 On April 16, 2015, 5:32 p.m., Timothy Chen wrote:
  src/slave/slave.cpp, line 3085
  https://reviews.apache.org/r/33249/diff/1/?file=931231#file931231line3085
 
  We're already sending back a status update when the registration 
  times out, and if we send another one here the scheduler will actually get 
  two TASK_FAILED statuses.
  
  I think we should either populate the reason in the final status update 
  to say that the containerizer launch failed, or make sure we just send 
  one here.
  
  The nice thing about having it handled in the timeout is that there are 
  fewer places in the slave where we do status updates, but with the caveat 
  that you wait for the timeout to occur, which is something I never really 
  liked.
  
  I think if we can make the code clean and make sure there is just one 
  status update propagated back, I'd rather see it happen here.
 
 Jay Buffington wrote:
  Sending a terminal update (TASK_FAILED) removes the task from 
  'executor->queuedTasks', so the scheduler won't get two status updates.  See 
 https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L2112
 
 I admit this is super confusing, in fact, when I ran the code the first 
 time I was expecting to see two status updates. I pinged Vinod about it and 
 he was confused and it took us a while to work through what was going on.
 
 I am concerned that we are changing state for the callbacks that clean 
 things up, so I'm open to moving it.  When you say timeout are you 
 referring to the Slave::sendExecutorTerminatedStatusUpdate method?

Ah, I forgot about it too. I think comments will be great so we avoid confusion!


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33249/#review80347
---


On April 16, 2015, 3:16 p.m., Jay Buffington wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33249/
 ---
 
 (Updated April 16, 2015, 3:16 p.m.)
 
 
 Review request for mesos, Ben Mahler, Timothy Chen, and Vinod Kone.
 
 
 Bugs: MESOS-2020
 https://issues.apache.org/jira/browse/MESOS-2020
 
 
 Repository: mesos
 
 
 Description
 ---
 
 When mesos is unable to launch the containerizer the scheduler should
 get a TASK_FAILED with a status message that includes the error the
 containerizer encountered when trying to launch.
 
 Introduces a new TaskStatus: REASON_CONTAINERIZER_LAUNCH_FAILED
 
 Fixes MESOS-2020
 
 
 Diffs
 -
 
   include/mesos/mesos.proto 3a8e8bf303e0576c212951f6028af77e54d93537 
   src/slave/slave.cpp a0595f93ce4720f5b9926326d01210460ccb0667 
   src/tests/containerizer.cpp 26b87ac6b16dfeaf84888e80296ef540697bd775 
   src/tests/slave_tests.cpp b826000e0a4221690f956ea51f49ad4c99d5e188 
 
 Diff: https://reviews.apache.org/r/33249/diff/
 
 
 Testing
 ---
 
 I added test case to slave_test.cpp.  I also tried this with Aurora, supplied 
 a bogus docker image url and saw the docker pull failure stderr message in 
 Aurora's web UI.
 
 
 Thanks,
 
 Jay Buffington
 




Re: Review Request 31480: Fix check for lseek error.

2015-04-15 Thread Timothy Chen


 On April 8, 2015, 9:33 p.m., Ian Downes wrote:
  src/slave/state.cpp, line 621
  https://reviews.apache.org/r/31480/diff/1/?file=878304#file878304line621
 
  separate errno string:
  s/'/': /

ErrnoError actually separates it for you already.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31480/#review79432
---


On Feb. 26, 2015, 6:54 p.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31480/
 ---
 
 (Updated Feb. 26, 2015, 6:54 p.m.)
 
 
 Review request for mesos, Dominic Hamon, Ian Downes, and Kapil Arya.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fix check for lseek error.
 
 
 Diffs
 -
 
   src/slave/state.cpp 0329ba56367a89e3ac5b1f3fcc3e3315ccd33999 
 
 Diff: https://reviews.apache.org/r/31480/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Suggestion: Mesos 0.22.1 point release

2015-04-15 Thread Timothy Chen
I also think we should push that fix to 0.23.0; it will take time to
review and merge.

Tim

On Tue, Apr 14, 2015 at 10:17 PM, Benjamin Hindman
b...@eecs.berkeley.edu wrote:
 Yes, fixing it in 0.23.0 SGTM.

 On Tue, Apr 14, 2015 at 10:02 PM, Jie Yu yujie@gmail.com wrote:

 I am just asking if you guys want to fix that for 0.22.1 or not. It sounds
 to me a non trivial fix. Given the bug is there for quite a while, maybe we
 can fix it in 0.23.0?

 - Jie

 On Tue, Apr 14, 2015 at 9:55 PM, Benjamin Hindman b...@eecs.berkeley.edu
 wrote:

  We are going to include MESOS-2614 (it's a really trivial fix).
 
  Jie, where did you get MESOS-2601 from? That's definitely not in the
  spreadsheet.
 
  On Tue, Apr 14, 2015 at 7:40 PM, Jie Yu yujie@gmail.com wrote:
 
   Also, this one:
   https://issues.apache.org/jira/browse/MESOS-2601
  
   This sounds like a non trivial fix.
  
   - Jie
  
   On Tue, Apr 14, 2015 at 6:35 PM, Benjamin Mahler 
   benjamin.mah...@gmail.com
   wrote:
  
 Per Nik's comment here:
   
Based on input from Vinod and Adam; I will reduce the scope on the
  point
 release to focus on MESOS-1795 and MESOS-2583.
 I will move the other tickets back to 0.23.0 if you don't have any
 objections - let me know if you have any tickets which were
  regressions
in
 0.22.0.
   
   
I expected there to be fewer tickets in the spreadsheet, are the
 extra
tickets (e.g. https://issues.apache.org/jira/browse/MESOS-2614)
 going
  to
be
included after all?
   
On Tue, Apr 14, 2015 at 6:20 PM, Joris Van Remoortere 
   jo...@mesosphere.io

wrote:
   
 I think the plan is to cut a new RC by sometime tomorrow. The
   spreadsheet
 is up-to-date, just need to cherry-pick and modify the change-log.

 Joris

 On Tue, Apr 14, 2015 at 5:37 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com
 wrote:

  Hey Nik, any progress on this? Is the spreadsheet up-to-date?
 
  On Wed, Apr 8, 2015 at 1:00 AM, Adam Bordelon 
 a...@mesosphere.io
 wrote:
 
   Hi Adam,
  
   Yes, once we have finalized the scope of the point release,
  Niklas
will
   send out an announcement of Mesos 0.22.1-rc1 (release
 candidate)
which
 we
   would love you to test any way you can. The email will contain
  instructions
   for building the release candidate and voting in the thread.
 See
   the
 vote
   thread from 0.22.0-rc4 (became final):
  
 http://www.mail-archive.com/dev%40mesos.apache.org/msg30668.html
  
   The current thread is to collect suggestions for bug fixes to
   include
 in
   this point release.
  
   Cheers,
   -Adam-
  
   On Tue, Apr 7, 2015 at 9:22 AM, Adam Avilla a...@avil.la
  wrote:
  
On Fri, Apr 3, 2015 at 3:47 PM, Niklas Nielsen 
nik...@mesosphere.io
 
wrote:
   
 Based on input from Vinod and Adam; I will reduce the scope
  on
the
   point
 release to focus on MESOS-1795 and MESOS-2583.

   
Can I help test these in any way?
   
--
/adam
   
  
 

   
  
 



Re: Review Request 31985: Mesos container ID available to the executor through an environment variable.

2015-04-14 Thread Timothy Chen


 On March 14, 2015, 12:49 a.m., Timothy Chen wrote:
  The change looks good, but I'm not sure exposing the container id is 
  the right thing to do overall yet. The container id as I know it is meant to 
  be an internal id used only in mesos, and I believe the whole 
  motivation was for users to be able to guess the docker container name from 
  the container id. However, the container id -> docker container name mapping 
  might change since we manage it, and I'm in the process of changing it to 
  include the slave id.
  
  I personally think we should think carefully about exposing 
  container-specific information.
  
  Do you know if there is any other use case to know the container id?
 
 Alexander Rojas wrote:
  1. I am not aware of any other use case. However, I asked benh about this 
  and he mentioned it was ok to have this as extra info on the update, as long 
  as it was not docker-specific data.
  2. After asking some questions to the original reporter, the idea was 
  more to be able to group tasks assigned to the same container, but not as a 
  way to extract the specific container.
  3. I am not sure, then, what would be considered mesos-private info and 
  info which can be shared. For example, why can the framework id and the 
  executor id be shared but not the container id?
 
 Vinod Kone wrote:
  "group tasks assigned to the same container" .. What does this mean? 
 IIUC, our docker containerizer only supports single task containers.
 
 regarding why framework id and executor id are exposed: framework id is 
 needed by frameworks to reregister with master. executor id (and task id) is 
 generated by the framework and not mesos.
 
 Alexander Rojas wrote:
 So I guess then, I will discard this patch and set the issue to won't fix?
 
 Vinod Kone wrote:
 I think we should try to understand the root of the issue that the 
 reporter is having before jumping onto a specific implementation.
 
 Alexander Rojas wrote:
  Well, I felt like I had it clear but apparently not. Can you please ask 
  the questions in the Jira entry, Vinod?
 
  I was also wondering: as mentioned above, we want to keep the 
 `containerId` private within mesos, but patch 
 [32426](https://reviews.apache.org/r/32426) effectively makes it public.
 
 Alexander Rojas wrote:
 I got it now… it is still quite private. Forget my question on the second 
 paragraph.
 
 Alexander Rojas wrote:
  One more thing, Vinod: you wrote "one container per task"; however, if you 
 check the code (Slave.cpp at `Executor* 
 Framework::launchExecutor(ExecutorInfo, TaskInfo)`) what we have is one 
 container per executor.
 
 Timothy Chen wrote:
 Docker containerizer supports multiple tasks only if the container is a 
 custom executor, so you get grouping naturally through that. 
  From what I understand, most people want the container ID so they can get 
  docker-specific information like name, IP, etc. I think by sending this all 
 back through the docker executor we shouldn't need this anymore.
 
 Alexander Rojas wrote:
 Hey guys, I asked the reporter of the issue the reason why he needs it. 
 He wrote a full explanation on the [JIRA 
 Issue](https://issues.apache.org/jira/browse/MESOS-2191?focusedCommentId=14387724page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14387724),
  so if you can read that and then decide if it is worth commiting this patch 
 or not.

I think we shouldn't, as I guessed correctly that he just wants the container 
name. Since the container name is internally managed by Mesos, it's subject to 
change at any time and you can't always assume that it's "mesos-" + containerId. 
I've proposed a solution that I plan to put in that JIRA.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31985/#review76470
---


On March 20, 2015, 9:27 a.m., Alexander Rojas wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31985/
 ---
 
 (Updated March 20, 2015, 9:27 a.m.)
 
 
 Review request for mesos, Bernd Mathiske, Isabel Jimenez, Joerg Schad, Till 
 Toenshoff, and Vinod Kone.
 
 
 Bugs: MESOS-2191
 https://issues.apache.org/jira/browse/MESOS-2191
 
 
 Repository: mesos
 
 
 Description
 ---
 
 When the executor is created, the container ID where it runs is made 
 accesible through an environment variable. (sic: accessible)
 
 
 Diffs
 -
 
   src/exec/exec.cpp d678f0682d803b0b080c3a6c296067ac9ab5dbf8 
   src/slave/containerizer/containerizer.hpp 
 129e60f20835f5d151701e934330b81825887af1 
   src/slave/containerizer/containerizer.cpp 
 4d66e767de1f877cb66b37826ba7c9d00639a7c0 
   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
   src/slave/containerizer

Re: Review Request 31985: Mesos container ID available to the executor through an environment variable.

2015-04-13 Thread Timothy Chen


 On March 14, 2015, 12:49 a.m., Timothy Chen wrote:
  The change looks good, but I'm not sure exposing the container id is 
  the right thing to do overall yet. The container id as I know it is meant to 
  be an internal id used only in mesos, and I believe the whole 
  motivation was for users to be able to guess the docker container name from 
  the container id. However, the container id -> docker container name mapping 
  might change since we manage it, and I'm in the process of changing it to 
  include the slave id.
  
  I personally think we should think carefully about exposing 
  container-specific information.
  
  Do you know if there is any other use case to know the container id?
 
 Alexander Rojas wrote:
  1. I am not aware of any other use case. However, I asked benh about this 
  and he mentioned it was ok to have this as extra info on the update, as long 
  as it was not docker-specific data.
  2. After asking some questions to the original reporter, the idea was 
  more to be able to group tasks assigned to the same container, but not as a 
  way to extract the specific container.
  3. I am not sure, then, what would be considered mesos-private info and 
  info which can be shared. For example, why can the framework id and the 
  executor id be shared but not the container id?
 
 Vinod Kone wrote:
  "group tasks assigned to the same container" .. What does this mean? 
 IIUC, our docker containerizer only supports single task containers.
 
 regarding why framework id and executor id are exposed: framework id is 
 needed by frameworks to reregister with master. executor id (and task id) is 
 generated by the framework and not mesos.
 
 Alexander Rojas wrote:
 So I guess then, I will discard this patch and set the issue to won't fix?
 
 Vinod Kone wrote:
 I think we should try to understand the root of the issue that the 
 reporter is having before jumping onto a specific implementation.
 
 Alexander Rojas wrote:
  Well, I felt like I had it clear but apparently not. Can you please ask 
  the questions in the Jira entry, Vinod?
 
  I was also wondering: as mentioned above, we want to keep the 
 `containerId` private within mesos, but patch 
 [32426](https://reviews.apache.org/r/32426) effectively makes it public.
 
 Alexander Rojas wrote:
 I got it now… it is still quite private. Forget my question on the second 
 paragraph.
 
 Alexander Rojas wrote:
  One more thing, Vinod: you wrote "one container per task"; however, if you 
 check the code (Slave.cpp at `Executor* 
 Framework::launchExecutor(ExecutorInfo, TaskInfo)`) what we have is one 
 container per executor.

Docker containerizer supports multiple tasks only if the container is a custom 
executor, so you get grouping naturally through that. 
From what I understand, most people want the container ID so they can get 
docker-specific information like name, IP, etc. I think by sending this all 
back through the docker executor we shouldn't need this anymore.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31985/#review76470
---


On March 20, 2015, 9:27 a.m., Alexander Rojas wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31985/
 ---
 
 (Updated March 20, 2015, 9:27 a.m.)
 
 
 Review request for mesos, Bernd Mathiske, Isabel Jimenez, Joerg Schad, Till 
 Toenshoff, and Vinod Kone.
 
 
 Bugs: MESOS-2191
 https://issues.apache.org/jira/browse/MESOS-2191
 
 
 Repository: mesos
 
 
 Description
 ---
 
 When the executor is created, the container ID where it runs is made 
 accessible through an environment variable.
 
 
 Diffs
 -
 
   src/exec/exec.cpp d678f0682d803b0b080c3a6c296067ac9ab5dbf8 
   src/slave/containerizer/containerizer.hpp 
 129e60f20835f5d151701e934330b81825887af1 
   src/slave/containerizer/containerizer.cpp 
 4d66e767de1f877cb66b37826ba7c9d00639a7c0 
   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
   src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 
   src/slave/containerizer/external_containerizer.cpp 
 42c67f548caf7bddbe131e0dfa7d74227d8c2593 
   src/slave/containerizer/mesos/containerizer.cpp 
 fbd1c0a0e5f4f227adb022f0baaa6d2c7e3ad748 
   src/tests/containerizer.cpp 26b87ac6b16dfeaf84888e80296ef540697bd775 
   src/tests/slave_tests.cpp a975305430097a8295b4b155e8448572c12bde22 
 
 Diff: https://reviews.apache.org/r/31985/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Alexander Rojas
 




Re: Review Request 29889: Recover Docker containers when mesos slave is in a container

2015-04-13 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29889/
---

(Updated April 13, 2015, 6:36 a.m.)


Review request for mesos and Benjamin Hindman.


Repository: mesos


Description
---

This is one mega patch containing many reviews that are already on ReviewBoard.

This review is not meant to be merged, only provided for easier review.


Diffs (updated)
-

  Dockerfile 35abf25 
  docs/configuration.md 54c4e31 
  docs/docker-containerizer.md a5438b7 
  src/docker/docker.hpp 3ebbc1f 
  src/docker/docker.cpp 3a485a2 
  src/docker/executor.cpp PRE-CREATION 
  src/slave/containerizer/docker.hpp 6893684 
  src/slave/containerizer/docker.cpp f9fb078 
  src/tests/docker_containerizer_tests.cpp c772d4c 

Diff: https://reviews.apache.org/r/29889/diff/


Testing
---

make check


Thanks,

Timothy Chen




Re: Review Request 30774: Fetcher Cache

2015-04-13 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review79838
---



src/tests/fetcher_cache_tests.cpp
https://reviews.apache.org/r/30774/#comment129437

Not sure why you picked an arbitrary number 5 here, why not let it be 
passed in?



src/tests/fetcher_cache_tests.cpp
https://reviews.apache.org/r/30774/#comment129438

Always one file expected in the cache


- Timothy Chen


On April 10, 2015, 11:33 a.m., Bernd Mathiske wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30774/
 ---
 
 (Updated April 10, 2015, 11:33 a.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
 Timothy Chen.
 
 
 Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, MESOS-2073, and 
 MESOS-2074
 https://issues.apache.org/jira/browse/MESOS-2057
 https://issues.apache.org/jira/browse/MESOS-2069
 https://issues.apache.org/jira/browse/MESOS-2070
 https://issues.apache.org/jira/browse/MESOS-2072
 https://issues.apache.org/jira/browse/MESOS-2073
 https://issues.apache.org/jira/browse/MESOS-2074
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Almost all of the functionality in epic MESOS-336. Downloaded files from 
 CommandInfo::URIs can now be cached in a cache directory designated by a 
 slave flag. This only happens when asked for by an extra flag in the URI and 
 is thus backwards-compatible. The cache has a size limit also given by a new 
 slave flag. Cache-resident files are evicted as necessary to make space for 
 newly fetched ones. Concurrent attempts to cache the same URI lead to only 
 one download. The fetcher program remains external for safety reasons, but is 
 now augmented with more elaborate parameters packed into a JSON object to 
 implement specific fetch actions for all of the above. Additional testing 
 includes fetching from (mock) HDFS and coverage of the new features.
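 The eviction behavior this description names (a size limit set by a slave flag,
 with cache-resident files evicted to make space for newly fetched ones) can be
 sketched in miniature; the class and method names below are illustrative, not
 Mesos's actual fetcher cache API:

 ```python
 import collections

 class FetcherCache:
     """Illustrative size-bounded cache: evicts the oldest entries
     until a newly fetched file of the given size fits."""

     def __init__(self, capacity_bytes):
         self.capacity = capacity_bytes
         self.entries = collections.OrderedDict()  # uri -> size in bytes
         self.used = 0

     def reserve(self, uri, size):
         # Refuse files that can never fit, even with an empty cache.
         if size > self.capacity:
             return False
         # Evict oldest entries until there is room for the new file.
         while self.used + size > self.capacity:
             _, evicted_size = self.entries.popitem(last=False)
             self.used -= evicted_size
         self.entries[uri] = size
         self.used += size
         return True

 cache = FetcherCache(100)
 cache.reserve("hdfs://a", 60)
 cache.reserve("hdfs://b", 30)
 cache.reserve("hdfs://c", 50)   # evicts "hdfs://a" to make space
 ```

 The real fetcher also has to coordinate concurrent downloads of the same URI,
 which this sketch omits.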
 
 
 Diffs
 -
 
   docs/configuration.md 54c4e31ed6dfed3c23d492c19a301ce119a0519b 
   docs/fetcher-cache-internals.md PRE-CREATION 
   docs/fetcher.md PRE-CREATION 
   include/mesos/fetcher/fetcher.proto 
 311af9aebc6a85dadba9dbeffcf7036b70896bcc 
   include/mesos/mesos.proto 3a8e8bf303e0576c212951f6028af77e54d93537 
   include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
   src/Makefile.am fa609da08e23d6595a3f6d2efddd3e333b6c78f1 
   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
   src/launcher/fetcher.cpp 796526f59c25898ef6db2b828b0e2bb7b172ba25 
   src/slave/constants.hpp fd1c1aba0aa62372ab399bee5709ce81b8e92cec 
   src/slave/containerizer/docker.hpp 6893684e6d199a5d69fc8bba8e60c4acaae9c3c9 
   src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 
   src/slave/containerizer/fetcher.hpp 
 1db0eaf002c8d0eaf4e0391858e61e0912b35829 
   src/slave/containerizer/fetcher.cpp 
 9e9e9d0eb6b0801d53dec3baea32a4cd4acdd5e2 
   src/slave/containerizer/mesos/containerizer.hpp 
 ae61a0fcd19f2ba808624312401f020121baf5d4 
   src/slave/containerizer/mesos/containerizer.cpp 
 e4136095fca55637864f495098189ab3ad8d8fe7 
   src/slave/flags.hpp d3b1ce117fbb4e0b97852ef150b63f35cc991032 
   src/slave/flags.cpp 35f56252cfda5011d21aa188f33cc3e68a694968 
   src/slave/slave.cpp 9fec023b643d410f4d511fa6f80e9835bab95b7e 
   src/tests/docker_containerizer_tests.cpp 
 c772d4c836de18b0e87636cb42200356d24ec73d 
   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
   src/tests/fetcher_tests.cpp 4549e6a631e2c17cec3766efaa556593eeac9a1e 
   src/tests/mesos.hpp 0e98572a62ae05437bd2bc800c370ad1a0c43751 
   src/tests/mesos.cpp 02cbb4b8cf1206d0f32d160addc91d7e0f1ab28b 
 
 Diff: https://reviews.apache.org/r/30774/diff/
 
 
 Testing
 ---
 
 make check
 
 --- longer Description: ---
 
 -Replaces all other reviews for the fetcher cache except those related to 
 stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
 30618, 30621, 30626. See descriptions of those. In dependency order:
 
 30033: Removes the fetcher env tests since these won't be needed any more 
 when the fetcher uses JSON in a single env var as a parameter. They never 
 tested anything that won't be covered by other tests anyway.
 
 30034: Makes the code structure of all fetcher tests the same. Instead of 
 calling the run method of the fetcher directly, calling through fetch(). Also 
 removes all uses of I/O redirection, which is not really needed for 
 debugging, and thus the next patch can refactor fetch() and run(). (The 
 latter comes in two varieties, which complicates matters without much 
 benefit.)
 
 30036: Extends the CommandInfo::URI protobuf with a boolean caching field 
 that will later cause fetcher cache actions. Also introduces the notion

Re: [jira] [Commented] (MESOS-2368) Provide a backchannel for information to the framework

2015-04-11 Thread Timothy Chen
In the docker executor it can be done by the executor (once we have
the docker executor, which also makes this easier).

That's what I'm trying to figure out as well: should I just reuse data
or introduce a brand new field for this, as it can have a specific
meaning that it is container-specific information.

Probably using data is fine as it's expected to be arbitrary
information passed from the executor back to scheduler.

In the docker context, I'm also currently thinking it makes sense just
to pass the whole docker inspect JSON output back.

Tim




On Sat, Apr 11, 2015 at 4:57 PM, Vinod Kone vi...@twitter.com.invalid wrote:
 Who sends the information? If it's the executor can it just not use 'data' 
 field in status?

 @vinodkone

 On Apr 11, 2015, at 2:48 PM, Timothy Chen (JIRA) j...@apache.org wrote:


[ 
 https://issues.apache.org/jira/browse/MESOS-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491208#comment-14491208
  ]

 Timothy Chen commented on MESOS-2368:
 -

 I've started to think about this more, and I think having a generic way to 
 express container information back to the framework seems to be necessary, 
 especially when the scheduler wants to do more advanced things with the 
 container that is spawned. Currently a scheduler cannot tell which container 
 was spawned by its task, simply because there is no correlation that can be 
 made and the scheduler doesn't know about the container ID.

 What I'm thinking is perhaps we can expose a free form key/value pair that 
 are like labels, to be passed back as part of the TaskStatus update that is 
 container specific information. In the case of Docker, it can be the docker 
 inspect information that is extracted and exposed back, which will include 
 name, network, and other info.

 [~idownes] [~jieyu] what you guys think?




 Provide a backchannel for information to the framework
 --

Key: MESOS-2368
URL: https://issues.apache.org/jira/browse/MESOS-2368
Project: Mesos
 Issue Type: Improvement
 Components: containerization, docker
   Reporter: Henning Schmiedehausen
   Assignee: Timothy Chen

 So that description is not very verbose. Here is my use case:
 In our usage of Mesos and Docker, we assign IPs when the container starts 
 up. We can not allocate the IP ahead of time, but we must rely on docker to 
 give our containers their IP. This IP can be examined through docker 
 inspect.
 We added code to the docker containerizer that will pick up this 
 information and add it to an optional protobuf struct in the TaskStatus 
 message. Therefore, when the executor and slave report a task as running, 
 the corresponding message will contain information about the IP address 
 that the container was assigned by docker and we can pick up this 
 information in our orchestration framework. E.g. to drive our load 
 balancers.
 There was no good way to do that in stock Mesos, so we built that back 
 channel. However, having a generic channel (not one for four pieces of 
 arbitrary information) from the executor to a framework may be a good thing 
 in general. Clearly, this information could be transferred out of band but 
 having it in the standard Mesos communication protocol turned out to be 
 very elegant.
 To turn our current, hacked, prototype into something useful, this is what 
 I am thinking:
 - TaskStatus gains a new, optional field:
  - optional TaskContext task_context = 11; (better name suggestions very 
 welcome)
 - TaskContext has optional fields:
  - optional ContainerizerContext containerizer_context = 1;
  - optional ExecutorContext executor_context = 2;
 Each executor and containerizer can add information to the TaskContext, 
 which in turn is exposed in TaskStatus. To avoid crowding of the various 
 fields, I want to experiment with the nested extensions as described here: 
 http://www.indelible.org/ink/protobuf-polymorphism/
 At the end of the day, the goal is that any piece that is involved in 
 executing code on the slave side can send information back to the framework 
 along with TaskStatus messages. Any of these fields should be optional to 
 be backwards compatible and they should (same as any other messages back) 
 be considered best effort, but it will allow an effective way to 
 communicate execution environment state back to the framework and allow the 
 framework to react on it.
 I am planning to work on this an present a cleaned up version of our 
 prototype in a bit.
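 The use case above boils down to reading the IP that docker assigned out of
 `docker inspect` output and forwarding it with the status update. The
 extraction step can be sketched as follows; the helper name is ours, but the
 JSON layout matches docker's inspect format (an array with one object per
 inspected container, carrying `NetworkSettings.IPAddress`):

 ```python
 import json

 def container_ip(inspect_output):
     """Extract the assigned IP address from `docker inspect <id>` output."""
     containers = json.loads(inspect_output)
     if not containers:
         return None
     return containers[0].get("NetworkSettings", {}).get("IPAddress") or None

 # `docker inspect` returns a JSON array even for a single container.
 sample = json.dumps([{
     "Name": "/mesos-task-1",
     "NetworkSettings": {"IPAddress": "172.17.0.5"},
 }])
 ```

 A framework receiving this value in a TaskStatus back channel could then feed
 it straight to its load balancers, as the reporter describes.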



 --
 This message was sent by Atlassian JIRA
 (v6.3.4#6332)


Re: Review Request 29333: Add docker_sock to slave flags

2015-04-06 Thread Timothy Chen


 On Feb. 15, 2015, 3:01 a.m., Ben Mahler wrote:
  src/slave/flags.hpp, line 320
  https://reviews.apache.org/r/29333/diff/2/?file=824295#file824295line320
 
  Why not docker_socket ?

Why not :)


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29333/#review72520
---


On Jan. 17, 2015, 1:35 a.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29333/
 ---
 
 (Updated Jan. 17, 2015, 1:35 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Bernd Mathiske.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add docker_sock to slave flags
 
 
 Diffs
 -
 
   src/slave/flags.hpp a4498e6 
 
 Diff: https://reviews.apache.org/r/29333/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 29334: Add option to launch docker containers with helper containers.

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29334/
---

(Updated April 7, 2015, 12:46 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Bugs: MESOS-2183
https://issues.apache.org/jira/browse/MESOS-2183


Repository: mesos


Description
---

Add option to launch docker containers with helper containers.


Diffs
-

  src/slave/containerizer/docker.hpp b7bf54a 
  src/slave/containerizer/docker.cpp 5f4b4ce 

Diff: https://reviews.apache.org/r/29334/diff/


Testing
---

make, tests are fixed in next commit


Thanks,

Timothy Chen



Re: Review Request 29333: Add docker_socket to slave flags

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29333/
---

(Updated April 7, 2015, 12:45 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Repository: mesos


Description (updated)
---

Add docker_socket to slave flags


Diffs
-

  src/slave/flags.hpp 3da71af 

Diff: https://reviews.apache.org/r/29333/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29333: Add docker_socket to slave flags

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29333/
---

(Updated April 7, 2015, 12:45 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Bugs: MESOS-2183
https://issues.apache.org/jira/browse/MESOS-2183


Repository: mesos


Description
---

Add docker_socket to slave flags


Diffs
-

  src/slave/flags.hpp 3da71af 

Diff: https://reviews.apache.org/r/29333/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29328: Add option to disable docker containerizer killing orphans

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29328/
---

(Updated April 7, 2015, 12:28 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Bugs: MESOS-2155
https://issues.apache.org/jira/browse/MESOS-2155


Repository: mesos


Description
---

Add option to disable docker containerizer killing orphans


Diffs
-

  src/slave/containerizer/docker.cpp f9fb078 
  src/slave/flags.hpp 3da71af 

Diff: https://reviews.apache.org/r/29328/diff/


Testing
---


make check


Thanks,

Timothy Chen



Re: Review Request 29330: Integrate docker executor into containerizer.

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29330/
---

(Updated April 7, 2015, 12:36 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Changes
---

rebased


Bugs: MESOS-2115
https://issues.apache.org/jira/browse/MESOS-2115


Repository: mesos


Description
---

Integrate docker executor into containerizer.


Diffs (updated)
-

  src/slave/containerizer/docker.cpp f9fb078 

Diff: https://reviews.apache.org/r/29330/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29331: Re-enable docker recover test.

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29331/
---

(Updated April 7, 2015, 12:36 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Changes
---

rebased


Repository: mesos


Description
---

Re-enable docker recover test.


Diffs (updated)
-

  src/tests/docker_containerizer_tests.cpp 
c772d4c836de18b0e87636cb42200356d24ec73d 

Diff: https://reviews.apache.org/r/29331/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29334: Add option to launch docker containers with helper containers.

2015-04-06 Thread Timothy Chen


 On Jan. 18, 2015, 12:18 p.m., Bernd Mathiske wrote:
  src/slave/containerizer/docker.hpp, line 219
  https://reviews.apache.org/r/29334/diff/5/?file=824296#file824296line219
 
  IMHO readability is subverted by prolonging the underscore scheme when 
  there is no strict series of continuations. This code would be so much more 
  easy to read if there were descriptive method names. Absent these, please 
  add comments that summarize the purpose and general approach of these 
  methods in places such as this one.

The comments are in the cpp code itself; I'm not inclined to add comments in the 
header since this is not really used outside.


 On Jan. 18, 2015, 12:18 p.m., Bernd Mathiske wrote:
  src/slave/containerizer/docker.hpp, line 196
  https://reviews.apache.org/r/29334/diff/5/?file=824296#file824296line196
 
   At this complexity level, the comments here have begun to look like an 
  anti-pattern that might create unwanted precedent for others to mimic. IMHO 
  this was already the case before this patch, but the current additions 
  exacerbate it.
  
   The naming launchInContainer helps, but is it still in line with the 
   expected style? 
  
  It seems to me that the expected style breaks down here. Ideas?

I think this is going to stay until we have a better naming scheme.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29334/#review68559
---


On April 7, 2015, 12:46 a.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29334/
 ---
 
 (Updated April 7, 2015, 12:46 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Bernd Mathiske.
 
 
 Bugs: MESOS-2183
 https://issues.apache.org/jira/browse/MESOS-2183
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add option to launch docker containers with helper containers.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.hpp b7bf54a 
   src/slave/containerizer/docker.cpp 5f4b4ce 
 
 Diff: https://reviews.apache.org/r/29334/diff/
 
 
 Testing
 ---
 
 make, tests are fixed in next commit
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 29328: Add option to disable docker containerizer killing orphans

2015-04-06 Thread Timothy Chen


 On March 11, 2015, 5:25 p.m., Joerg Schad wrote:
  src/slave/flags.hpp, line 325
  https://reviews.apache.org/r/29328/diff/4/?file=824288#file824288line325
 
   Could you also add this flag to docs/configuration.md?

This is fixed in a later commit.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29328/#review76081
---


On Jan. 17, 2015, 1:26 a.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29328/
 ---
 
 (Updated Jan. 17, 2015, 1:26 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Bernd Mathiske.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add option to disable docker containerizer killing orphans
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.cpp 5f4b4ce 
   src/slave/flags.hpp a4498e6 
 
 Diff: https://reviews.apache.org/r/29328/diff/
 
 
 Testing
 ---
 
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 29328: Add option to disable docker containerizer killing orphans

2015-04-06 Thread Timothy Chen


 On Jan. 17, 2015, 1:34 a.m., Ben Mahler wrote:
  Just curious, what happens to the orphans if you don't kill them? Was there 
  a ticket for this?
 
 Timothy Chen wrote:
  The orphans remain untouched. There is a JIRA ticket for adding this 
  flag; I can find it later once I'm next to a computer.
 
 Ben Mahler wrote:
 I understood that part :)
 
 But why is it ok to leave orphans untouched? Sounds like a bug to me.. is 
 there some context I'm missing here?
 
 Timothy Chen wrote:
  I think the context is that sometimes it's not desirable to remove all 
  orphans on recovery, especially when the current mechanism for discovering 
  that a container was launched by Mesos is looking for Docker containers with 
  a mesos- prefix (in the future mesos-{slave_id}, which is safer).
  We want to leave this optional so that users who would like to keep the 
  containers and have their own recovery or GC plan can do so.

Btw, this flag is on by default, so unless users explicitly disable it, orphan 
Docker container tasks are always killed.
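The name-prefix discovery Tim describes can be sketched as a simple filter over
`docker ps` output; the container names below are made up, but the prefix
convention (`mesos-`, or `mesos-<slave_id>` in the safer variant) is the one the
thread discusses:

```python
def mesos_orphans(container_names, slave_id=None):
    """Return container names that look Mesos-launched: the name starts
    with 'mesos-' (or 'mesos-<slave_id>' when a slave id is given)."""
    prefix = "mesos-" if slave_id is None else "mesos-" + slave_id
    return [name for name in container_names
            if name.lstrip("/").startswith(prefix)]

# Hypothetical container names; docker prefixes names with '/'.
names = ["/mesos-s1.abc123", "/mesos-s2.def456", "/user-database"]
```

The slave-id variant matters because the bare `mesos-` prefix would also match
containers launched by a different slave on the same host.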


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29328/#review68532
---


On Jan. 17, 2015, 1:26 a.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29328/
 ---
 
 (Updated Jan. 17, 2015, 1:26 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Bernd Mathiske.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add option to disable docker containerizer killing orphans
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.cpp 5f4b4ce 
   src/slave/flags.hpp a4498e6 
 
 Diff: https://reviews.apache.org/r/29328/diff/
 
 
 Testing
 ---
 
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 29330: Integrate docker executor into containerizer.

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29330/
---

(Updated April 7, 2015, 12:39 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Bugs: MESOS-2595
https://issues.apache.org/jira/browse/MESOS-2595


Repository: mesos


Description
---

Integrate docker executor into containerizer.


Diffs
-

  src/slave/containerizer/docker.cpp f9fb078 

Diff: https://reviews.apache.org/r/29330/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29329: Add executor for docker containerizer

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29329/
---

(Updated April 7, 2015, 12:39 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Bugs: MESOS-2595
https://issues.apache.org/jira/browse/MESOS-2595


Repository: mesos


Description
---

Add executor for docker containerizer, replaces the usage of command executor


Diffs
-

  src/Makefile.am 9c01f5d6c692f835100e7cade928748cc4763cc8 
  src/docker/executor.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/29329/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29332: Add docker_mesos_image flag to slave flags.

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29332/
---

(Updated April 7, 2015, 12:44 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Changes
---

rebased


Bugs: MESOS-2183
https://issues.apache.org/jira/browse/MESOS-2183


Repository: mesos


Description
---

Add docker_mesos_image flag to slave flags.


Diffs (updated)
-

  src/slave/flags.hpp 3da71af 

Diff: https://reviews.apache.org/r/29332/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29333: Add docker_sock to slave flags

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29333/
---

(Updated April 7, 2015, 12:44 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Repository: mesos


Description
---

Add docker_sock to slave flags


Diffs (updated)
-

  src/slave/flags.hpp 3da71af 

Diff: https://reviews.apache.org/r/29333/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29327: Add slave id to docker container name prefix.

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29327/
---

(Updated April 7, 2015, 12:22 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Changes
---

rebased.


Repository: mesos


Description
---

Add slave id to docker container name prefix.


Diffs (updated)
-

  src/slave/containerizer/docker.hpp 6893684e6d199a5d69fc8bba8e60c4acaae9c3c9 
  src/slave/containerizer/docker.cpp f9fb07806e3b7d7d2afc1be3b8756eac23b32dcd 

Diff: https://reviews.apache.org/r/29327/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29329: Add executor for docker containerizer

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29329/
---

(Updated April 7, 2015, 12:34 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Bugs: MESOS-2115
https://issues.apache.org/jira/browse/MESOS-2115


Repository: mesos


Description
---

Add executor for docker containerizer, replaces the usage of command executor


Diffs (updated)
-

  src/Makefile.am 9c01f5d6c692f835100e7cade928748cc4763cc8 
  src/docker/executor.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/29329/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 29332: Add docker_mesos_image flag to slave flags.

2015-04-06 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29332/
---

(Updated April 7, 2015, 12:39 a.m.)


Review request for mesos, Benjamin Hindman and Bernd Mathiske.


Bugs: MESOS-2183
https://issues.apache.org/jira/browse/MESOS-2183


Repository: mesos


Description
---

Add docker_mesos_image flag to slave flags.


Diffs
-

  src/slave/flags.hpp a4498e6 

Diff: https://reviews.apache.org/r/29332/diff/


Testing
---

make check


Thanks,

Timothy Chen



Re: Review Request 32797: Kill the executor when docker container is destroyed.

2015-04-03 Thread Timothy Chen


 On April 3, 2015, 3:46 p.m., Benjamin Hindman wrote:
  src/slave/containerizer/docker.cpp, line 1230
  https://reviews.apache.org/r/32797/diff/1/?file=914221#file914221line1230
 
  Why kill the executor before doing Docker::stop? Can you comment here 
  why you do it in this order versus the other order and the ramifications 
  that has?

This is because we're waiting on the executor to finish 
(os::reaped(executorPid)) in the container-status future, and if we don't kill 
the executor first, the later container-status call will just hang. I can leave 
a comment about this too.
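The ordering issue here (a wait on the executor's pid never completes unless
the executor is killed first) is the usual kill-then-reap pattern. A minimal
stand-alone illustration with an ordinary child process, not Mesos code:

```python
import signal
import subprocess

# A child process standing in for the executor: left alone, it would
# keep running for a very long time.
child = subprocess.Popen(["sleep", "1000"])

# Waiting first would block until the child exits on its own; kill it
# first so the subsequent wait (the analogue of os::reaped(executorPid))
# completes promptly instead of hanging.
child.send_signal(signal.SIGKILL)
status = child.wait()
```

On POSIX, `wait()` returns the negated signal number when the child was killed
by a signal, which is how the parent can tell the kill took effect.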


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32797/#review78785
---


On April 2, 2015, 11:38 p.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32797/
 ---
 
 (Updated April 2, 2015, 11:38 p.m.)
 
 
 Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till 
 Toenshoff.
 
 
 Bugs: MESOS-2583
 https://issues.apache.org/jira/browse/MESOS-2583
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Kill the executor when docker container is destroyed.
 
 
 Diffs
 -
 
   src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
   src/slave/containerizer/docker.cpp e83b912c707a3f2687b09a647a9ed248a940ea97 
 
 Diff: https://reviews.apache.org/r/32797/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 32798: Add test to verify executor clean up in docker containerizer.

2015-04-03 Thread Timothy Chen


 On April 3, 2015, 3:38 p.m., Benjamin Hindman wrote:
  src/tests/docker_containerizer_tests.cpp, lines 2625-2627
  https://reviews.apache.org/r/32798/diff/1/?file=914222#file914222line2625
 
  My only question here is how do you know the executor is properly 
  killed and cleaned up? Is this because you know there aren't any more child 
  processes? Is that something you want to check after you call Shutdown()? 
  I.e., os::children(0).get().empty() or something like that?

Shutdown does indeed check for remaining child processes; I've verified this 
by just running the test and getting an exception on shutdown. I'll leave a 
comment at the end too.


- Timothy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32798/#review78783
---


On April 2, 2015, 11:38 p.m., Timothy Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32798/
 ---
 
 (Updated April 2, 2015, 11:38 p.m.)
 
 
 Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till 
 Toenshoff.
 
 
 Bugs: MESOS-2583
 https://issues.apache.org/jira/browse/MESOS-2583
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add test to verify executor clean up in docker containerizer.
 
 
 Diffs
 -
 
   src/tests/docker_containerizer_tests.cpp 
 fdd706a892ee1c8d55a406b3f956d99c076c623b 
 
 Diff: https://reviews.apache.org/r/32798/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Timothy Chen
 




Re: Review Request 32832: Added CHANGELOG for 0.22.1

2015-04-03 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32832/#review78831
---

Ship it!


Ship It!

- Timothy Chen


On April 3, 2015, 8:43 p.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32832/
 ---
 
 (Updated April 3, 2015, 8:43 p.m.)
 
 
 Review request for mesos, Ben Mahler and Timothy Chen.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added changelog section for Mesos 0.22.1
 
 
 Diffs
 -
 
   CHANGELOG efcadfa0f896a50f21f34b84bdcaa61046d8cd4b 
 
 Diff: https://reviews.apache.org/r/32832/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Niklas Nielsen
 




Review Request 32834: Modify gdb scripts' error message to check gdb is installed.

2015-04-03 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32834/
---

Review request for mesos, Adam B, Cody Maloney, and Niklas Nielsen.


Repository: mesos


Description
---

Ran into a problem where gdb isn't installed and I see an error message saying 
the generated libtool doesn't support gdb. 
Changed the error message to ask the user to make sure gdb is also installed.


Diffs
-

  bin/gdb-mesos-local.sh.in 72cfb68b4ff2ac796aa381cf6c49f6a4b83eb28b 
  bin/gdb-mesos-master.sh.in f00af078bb9b8a6c3689d1ddd0db6efe38614d87 
  bin/gdb-mesos-slave.sh.in e01325c59ed62eb2e0d6bdf24808fc3f0cd206ab 
  bin/gdb-mesos-tests.sh.in 626fefe7d953bf226e6d5fb84c87a6f3d66f4da9 

Diff: https://reviews.apache.org/r/32834/diff/


Testing
---

make


Thanks,

Timothy Chen



Review Request 32796: Only update docker container when resources differ.

2015-04-02 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32796/
---

Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till Toenshoff.


Bugs: MESOS-2583
https://issues.apache.org/jira/browse/MESOS-2583


Repository: mesos


Description
---

Only update the docker container when resources differ.
Also include the executor resources when launching the docker container to 
avoid updating it again later on.


Diffs
-

  src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
  src/slave/containerizer/docker.cpp e83b912c707a3f2687b09a647a9ed248a940ea97 

Diff: https://reviews.apache.org/r/32796/diff/


Testing
---

make check


Thanks,

Timothy Chen



Review Request 32798: Add test to verify executor clean up in docker containerizer.

2015-04-02 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32798/
---

Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till Toenshoff.


Bugs: MESOS-2583
https://issues.apache.org/jira/browse/MESOS-2583


Repository: mesos


Description
---

Add test to verify executor clean up in docker containerizer.


Diffs
-

  src/tests/docker_containerizer_tests.cpp 
fdd706a892ee1c8d55a406b3f956d99c076c623b 

Diff: https://reviews.apache.org/r/32798/diff/


Testing
---

make check


Thanks,

Timothy Chen



Review Request 32797: Kill the executor when docker container is destroyed.

2015-04-02 Thread Timothy Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32797/
---

Review request for mesos, Benjamin Hindman, Bernd Mathiske, and Till Toenshoff.


Bugs: MESOS-2583
https://issues.apache.org/jira/browse/MESOS-2583


Repository: mesos


Description
---

Kill the executor when docker container is destroyed.


Diffs
-

  src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 
  src/slave/containerizer/docker.cpp e83b912c707a3f2687b09a647a9ed248a940ea97 

Diff: https://reviews.apache.org/r/32797/diff/


Testing
---

make check


Thanks,

Timothy Chen


