Re: Finding Tasks Waiting for Resources in Mesos

2017-10-31 Thread Vinod Kone
I think you need to query the framework that you are running on top of
Mesos for this information. The workflow is as follows: User submits a task
to framework, framework waits for resources to be available in Mesos
cluster, once available it launches the task.
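
A minimal sketch of that workflow from the framework's side, assuming a hypothetical framework that keeps its own bookkeeping of submitted tasks. The class, method, and state names below are illustrative only and are not part of any Mesos API; resources are reduced to a single CPU count to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PendingTaskTracker:
    """Hypothetical framework-side bookkeeping; not a Mesos API."""

    # task_id -> CPUs requested, for tasks still waiting for an offer
    pending: Dict[str, float] = field(default_factory=dict)
    # task_ids that have already been launched on some agent
    launched: List[str] = field(default_factory=list)

    def submit(self, task_id: str, cpus: float) -> None:
        # Ignore a resubmission of a task the framework already knows about.
        if task_id in self.pending or task_id in self.launched:
            return
        self.pending[task_id] = cpus

    def on_offer(self, offered_cpus: float) -> List[str]:
        """Launch as many pending tasks as fit into the offered CPUs."""
        started = []
        for task_id, cpus in list(self.pending.items()):
            if cpus <= offered_cpus:
                offered_cpus -= cpus
                del self.pending[task_id]
                self.launched.append(task_id)
                started.append(task_id)
        return started

    def state(self, task_id: str) -> str:
        if task_id in self.pending:
            return "WAITING_FOR_RESOURCES"
        if task_id in self.launched:
            return "LAUNCHED"
        return "UNKNOWN"
```

With bookkeeping like this, resubmission logic can ask the framework for `state(task_id)` instead of asking Mesos, and skip the resubmit while the task is still waiting for resources.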

On Tue, Oct 31, 2017 at 11:16 AM, SenthilKumar K 
wrote:

> + User Group.
>
> --Senthil
>
> On Tue, Oct 31, 2017 at 11:44 PM, SenthilKumar K 
> wrote:
>
> > Hi all, what is the way to query a Mesos cluster to check whether a
> > task is waiting for resources or not?
> >
> > Context:
> >   Say 10 tasks are running and Mesos cluster usage is 99%. If I submit
> > another task, Mesos accepts it and the task waits for resources. We
> > have our own business logic that resubmits a task if it is not running.
> > The truth is that the cluster already has the same task and it is
> > waiting for resources.
> >
> > Pls advise..
> >
> > --Senthil
> >
>


Re: orphan executor

2017-10-27 Thread Vinod Kone
Can you share the agent and executor logs of an example orphaned executor?
That would help us diagnose the issue.

On Fri, Oct 27, 2017 at 8:19 PM, Mohit Jaggi  wrote:

> Folks,
> Often I see some orphaned executors in my cluster. These are cases where
> the framework was informed of task loss, so has forgotten about them as
> expected, but the container(docker) is still around. AFAIK, Mesos agent is
> the only entity that has knowledge of these containers. How do I ensure
> that they get cleaned up by the agent?
>
> Mohit.
>


Disallowing comma in attribute values

2017-10-18 Thread Vinod Kone
Hi folks,

I would like to propose that we enforce the character set used for
attribute values as per the documentation.

Currently, the documentation states that attribute values should be of the
form [a-zA-Z0-9_/.-] but we don't enforce it. This makes it hard for
frameworks to use certain delimiters (say comma) for parsing a list of
attributes.

I would like to know if there is anyone out there who is using attribute
values that have characters other than [a-zA-Z0-9_/.-]. If yes, please let
us know so we can plan for the breaking change. If no one is using an
undocumented character set, I propose that we make this change in the
upcoming 1.5 release of Mesos.
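
As an illustration of what enforcing the documented character set could look like, here is a small sketch. The function names are hypothetical and this is not the Mesos validation code; rejecting the empty string is also an assumption.

```python
import re

# Character set quoted in this thread: [a-zA-Z0-9_/.-]
_ATTRIBUTE_VALUE = re.compile(r"[a-zA-Z0-9_/.-]+")


def is_documented_attribute_value(value: str) -> bool:
    """Return True if every character of `value` is in the documented set."""
    return _ATTRIBUTE_VALUE.fullmatch(value) is not None


def split_attribute_list(raw: str) -> list:
    """Split a comma-delimited list; only safe if values never contain commas."""
    return [item for item in raw.split(",") if item]
```

Once values are restricted to that set, a framework can split on a comma without any escaping rules, which is the use case described above.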

Thoughts?

Vinod


Re: Adding the limited resource to TaskStatus messages

2017-10-09 Thread Vinod Kone
> In the case that a task is killed because it violated a resource
> constraint (ie. the reason field is REASON_CONTAINER_LIMITATION,
> REASON_CONTAINER_LIMITATION_DISK or REASON_CONTAINER_LIMITATION_MEMORY),
> this field may be populated with the resource that triggered the
> limitation. This is intended to give better information to schedulers about
> task resource failures, in the expectation that it will help them bubble
> useful information up to the user or a monitoring system.
>

Can you elaborate what schedulers are expected to do with this information?
Looking for some concrete use cases if you can.


Re: Collect feedbacks on TASK_FINISHED

2017-09-21 Thread Vinod Kone
I think it makes sense for `TASK_KILLED` to be sent in response to a KILL
call irrespective of the exit status. IIRC, that was the original intention.
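
A sketch of the rule being proposed, written as a standalone function. It mirrors the behavior described in this thread and is not taken from the executor source; the function name and the use of TASK_FAILED for a non-zero exit without a kill are assumptions for illustration.

```python
def terminal_state(exit_code: int, kill_requested: bool) -> str:
    """Pick the terminal task state an executor would report.

    Illustrative only: follows the proposal in this thread,
    not the actual executor implementation.
    """
    if kill_requested:
        # A KILL call was issued, so report TASK_KILLED
        # regardless of the exit status.
        return "TASK_KILLED"
    return "TASK_FINISHED" if exit_code == 0 else "TASK_FAILED"
```

Under this rule a task that handles SIGTERM gracefully and exits with 0 would go TASK_KILLING -> TASK_KILLED rather than TASK_KILLING -> TASK_FINISHED.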

On Thu, Sep 21, 2017 at 8:20 PM, Qian Zhang  wrote:

> Hi Folks,
>
> I'd like to collect feedback on the task state TASK_FINISHED.
> Currently the default and command executors will always send TASK_FINISHED
> as long as the exit code of the task is 0. This causes an issue: when the scheduler
> initiates a kill task, the executor will send SIGTERM to the task first,
> and if the task handles SIGTERM gracefully and exit with 0, the executor
> will send TASK_FINISHED for that task, so we will see the task state
> transition: TASK_KILLING -> TASK_FINISHED.
>
> This seems incorrect because we thought it should be TASK_KILLING ->
> TASK_KILLED, that's why we filed a ticket MESOS-7975 for it. However, I am
> not very sure if it is really a bug, because I think it depends on how we
> define the meaning of TASK_FINISHED, if it means the task is terminated
> successfully on its own without external interference, then I think it does
> not make sense for scheduler to receive a TASK_KILLING followed by a
> TASK_FINISHED since there is indeed an external interference (killing task
> is initiated by scheduler). However, if TASK_FINISHED means the task is
> terminated successfully for whatever reason (no matter it is killed or
> terminated on its own), then I think it is OK to receive a TASK_KILLING
> followed by a TASK_FINISHED.
>
> Please let us know your thoughts on this issue, thanks!
>
>
> Regards,
> Qian Zhang
>


Re: [ISSUE] Check failed: slave.maintenance.isSome()

2017-09-18 Thread Vinod Kone
This looks similar to https://issues.apache.org/jira/browse/MESOS-7966. Can
you add your information and logs to that ticket?

On Fri, Sep 15, 2017 at 3:18 AM, Qi Feng  wrote:

> My mesos version is 1.2.0. Sorry.
>
> --
> *From:* Qi Feng 
> *Sent:* Friday, September 15, 2017 10:14 AM
> *To:* user@mesos.apache.org; Bayou
> *Subject:* Re: [ISSUE] Check failed: slave.maintenance.isSome()
>
>
> This case can be reproduced by calling `for i in {1..8}; do python
> call.py; done` (call.py gist:
> https://gist.github.com/athlum/e2cd04bfb9f81a790d31643606252a49 ).
>
> It looks like something goes wrong when /maintenance/schedule is called
> concurrently.
>
> We hit this case because we wrote a service based on Ansible that
> manages the Mesos cluster. It creates a task that updates slave configs
> with a certain number of workers, like this:
>
>    1. Call schedule for 3 machines: a, b, c.
>    2. When machine a is done, the maintenance window updates to: b, c.
>    3. When another machine "d" is assigned immediately after a, the
>       window updates to: b, c, d.
>
> These changes sometimes happen within a short interval. Then we see the
> fatal log from Bayou's mail.
>
> What's the right way to update the maintenance window? Thanks for any
> reply.
>
>
> --
> *From:* Bayou 
> *Sent:* Thursday, September 14, 2017 12:06 PM
> *To:* user@mesos.apache.org
> *Subject:* [ISSUE] Check failed: slave.maintenance.isSome()
>
> Hi all,
>    I'm continuously running mesos-maintenance-schedule, machine-down,
> machine-up and mesos-maintenance-schedule-cancel over and over again in a
> three-slave cluster, with no other operations, just using the Mesos API to
> schedule the three slaves asynchronously. At the beginning it worked well,
> but after many attempts (hundreds of times) there was always a check
> failure of slave.maintenance.isSome() and the Mesos master crashed. The
> originating code is at
> https://github.com/apache/mesos/blob/2fe2bb26a425da9aaf1d7cf34019dd347d0cf9a4/src/master/allocator/mesos/hierarchical.cpp#L983
> Some log output from the Mesos master is below:
> 2017-09-12 16:39:07.394 err mesos-master[254491]: F0912 16:39:07.393944 254527 hierarchical.cpp:903] Check failed: slave.maintenance.isSome()
> 2017-09-12 16:39:07.394 err mesos-master[254491]: *** Check failure stack trace: ***
> 2017-09-12 16:39:07.402 err mesos-master[254491]: @ 0x7f4cf356fba6  google::LogMessage::Fail()
> 2017-09-12 16:39:07.413 err mesos-master[254491]: @ 0x7f4cf356fb05  google::LogMessage::SendToLog()
> 2017-09-12 16:39:07.420 err mesos-master[254491]: @ 0x7f4cf356f516  google::LogMessage::Flush()
> 2017-09-12 16:39:07.424 err mesos-master[254491]: @ 0x7f4cf357224a  google::LogMessageFatal::~LogMessageFatal()
> 2017-09-12 16:39:07.429 err mesos-master[254491]: @ 0x7f4cf2344a32  mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::updateInverseOffer()
> 2017-09-12 16:39:07.435 err mesos-master[254491]: @ 0x7f4cf1f8d9f9  _ZZN7process8dispatchIN5mesos8internal6master9allocator21MesosAllocatorProcessERKNS1_7SlaveIDERKNS1_11FrameworkIDERK6OptionINS1_20UnavailableResourcesEERKSC_INS1_9allocator18InverseOfferStatusEERKSC_INS1_7FiltersEES6_S9_SE_SJ_SN_EEvRKNS_3PIDIT_EEMSR_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES18_
> 2017-09-12 16:39:07.445 err mesos-master[254491]: @ 0x7f4cf1f938bb  _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal6master9allocator21MesosAllocatorProcessERKNS5_7SlaveIDERKNS5_11FrameworkIDERK6OptionINS5_20UnavailableResourcesEERKSG_INS5_9allocator18InverseOfferStatusEERKSG_INS5_7FiltersEESA_SD_SI_SN_SR_EEvRKNS0_3PIDIT_EEMSV_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> 2017-09-12 16:39:07.455 err mesos-master[254491]: @ 0x7f4cf34dd049  std::function<>::operator()()
> 2017-09-12 16:39:07.460 err mesos-master[254491]: @ 0x7f4cf34c1285  process::ProcessBase::visit()
> 2017-09-12 16:39:07.464 err mesos-master[254491]: @ 0x7f4cf34cc58a  process::DispatchEvent::visit()
> 2017-09-12 16:39:07.465 err mesos-master[254491]: @ 0x7f4cf4e4ad4e  process::ProcessBase::serve()
> 2017-09-12 16:39:07.469 err mesos-master[254491]: @ 0x7f4cf34bd281  process::ProcessManager::resume()
> 2017-09-12 16:39:07.471 err mesos-master[254491]: @ 0x7f4cf34b9a2c  _ZZN7process14ProcessManager12init_threadsEvENKUt_clEv
> 2017-09-12 16:39:07.473 err mesos-master[254491]: @ 0x7f4cf34cbbf2  _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
> 2017-09-12 16:39:07.475 err mesos-master[254491]: @ 0x7f4cf34cbb36  _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEclEv
> 2017-09-12 16:39:07.477 err mesos-master[254491]: @
> 

Re: [VOTE] Release Apache Mesos 1.4.0 (rc5)

2017-09-15 Thread Vinod Kone
Ok. Looks like a test issue per https://reviews.apache.org/r/60467/

+1(binding)

On Fri, Sep 15, 2017 at 12:16 PM, Michael Park <mp...@apache.org> wrote:

> Vinod, regarding MESOS-7729
> <https://issues.apache.org/jira/browse/MESOS-7729>:
>
> I found MESOS-6345 <https://issues.apache.org/jira/browse/MESOS-6345> related
> to persistent volume framework, which leads me to believe that this is not
> new.
>
> Thanks,
>
> MPark
>
> On Tue, Sep 12, 2017 at 12:01 PM Vinod Kone <vinodk...@apache.org> wrote:
>
>> Tested this on ASF CI.
>>
>> Saw 3 flaky tests.
>>
>> https://issues.apache.org/jira/browse/MESOS-7729
>> <https://issues.apache.org/jira/browse/MESOS-7972>
>>
>> https://issues.apache.org/jira/browse/MESOS-7971
>> https://issues.apache.org/jira/browse/MESOS-7972
>>
>> The first one was a known (since 1.4.0) flaky test with a double free
>> corruption. @Kapil and @MPark can you verify that this is an issue with
>> the
>> test and not the source code? Once verified, I'll give a +1.
>>
>> *Revision*: b3fd2e7ab26e118222fe18af4b92c53a3c01e6cc
>>
>>- refs/tags/1.4.0-rc5
>>
>> Configuration Matrix
>>
>> OS            Configuration                             Build tool  gcc      clang
>> centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
>> centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
>> centos:7      --verbose                                 autotools   Failed   Not run
>> centos:7      --verbose                                 cmake       Success  Not run
>> ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Success
>> ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
>> ubuntu:14.04  --verbose                                 autotools   Success  Success

Re: [VOTE] Release Apache Mesos 1.4.0 (rc5)

2017-09-12 Thread Vinod Kone
Tested this on ASF CI.

Saw 3 flaky tests.

https://issues.apache.org/jira/browse/MESOS-7729
https://issues.apache.org/jira/browse/MESOS-7971
https://issues.apache.org/jira/browse/MESOS-7972

The first one was a known (since 1.4.0) flaky test with a double free
corruption. @Kapil and @MPark can you verify that this is an issue with the
test and not the source code? Once verified, I'll give a +1.

*Revision*: b3fd2e7ab26e118222fe18af4b92c53a3c01e6cc

   - refs/tags/1.4.0-rc5

Configuration Matrix

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Failed   Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Success
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
ubuntu:14.04  --verbose                                 autotools   Success  Success
ubuntu:14.04  --verbose                                 cmake       Failed   Failed

On Sat, Sep 9, 2017 at 6:49 AM, Kapil Arya  wrote:

> Hi all,
>
> [NOTE: Starting with this RC candidate, we will not be "releasing" RC jar
> files in the Maven release channel. This prevents polluting of the Maven
> repositories with numerous RC tags. As before, you can continue to test the
> release candidate using the Maven staging repository provided below.]
>
> Please vote on releasing the following candidate as Apache Mesos 1.4.0.
>
> 1.4.0 includes the following:
> 
> 
>   * Ability to recover the agent ID after a host reboot.
>   * File-based and image-pull secrets.
>   * Linux ambient and bounding capabilities support.
>   * Ability to efficiently measure disk usage without enforcing usage
> constraints.
>   * Hierarchical resource 

Re: SchedulerDriver / libmesos API version compatibility

2017-09-07 Thread Vinod Kone
1.3.0 Scheduler Java driver will work. 

@vinodkone

> On Sep 7, 2017, at 7:02 PM, Eli Jordan <elias.k.jor...@gmail.com> wrote:
> 
> Ok, thanks Vinod
> 
> So we need to upgrade then. I’m just looking at the new API, and the new 
> version of the java driver.
> 
> I can see that the MesosSchedulerDriver class / Scheduler interface are still 
> available but there is also an additional v1 package for the new API.
> 
> If I upgrade to libmesos v1.3.0 + mesos v1.3.0 of the scheduler java driver, 
> but continue to use the MesosSchedulerDriver / Scheduler APIs, will this be 
> compatible with a 1.3.0 mesos master?
> Or, do I need to upgrade to the v1 scheduler API in order to be compatible?
> 
>> On 8 Sep 2017, at 10:41 am, Vinod Kone <vinodk...@gmail.com> wrote:
>> 
>> Could be. We don't guarantee backward compatibility between pre-1.0
>> components and post-1.0 components. Components here could be master/agent
>> binaries, scheduler/executor libs, jars and eggs.
>> 
>> @vinodkone
>> 
>>> On Sep 7, 2017, at 3:25 PM, Eli Jordan <elias.k.jor...@gmail.com> wrote:
>>> 
>>> Hi
>>> 
>>> Is there an API version compatibility guide somewhere, describing what 
>>> versions of the scheduler API are compatible with which mesos master 
>>> versions?
>>> 
>>> I'm asking because we have several frameworks using the 0.28.2 java based 
>>> scheduler driver, and recently upgraded our mesos masters and agents to 
>>> 1.3.0. 
>>> 
>>> The frameworks do work, but we have been seeing some issues around 
>>> reconciliation that we weren't seeing before. Could this be related to a 
>>> scheduler driver incompatibility?
>>> 
>>> Thanks
>>> Eli
> 


Re: SchedulerDriver / libmesos API version compatibility

2017-09-07 Thread Vinod Kone
Could be. We don't guarantee backward compatibility between pre-1.0
components and post-1.0 components. Components here could be master/agent
binaries, scheduler/executor libs, jars and eggs.
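
To make the rule concrete, here is a small sketch that flags a pair of component versions sitting on opposite sides of the 1.0 boundary. The helper names are hypothetical and the check only encodes the statement above; it says nothing about compatibility between two post-1.0 versions.

```python
def major_version(version: str) -> int:
    """Return the major number of a dotted version such as '0.28.2'."""
    return int(version.split(".", 1)[0])


def straddles_1_0(component_a: str, component_b: str) -> bool:
    """True if one component is pre-1.0 and the other is 1.0 or later."""
    a_is_pre = major_version(component_a) < 1
    b_is_pre = major_version(component_b) < 1
    return a_is_pre != b_is_pre
```

For the setup described in this thread, `straddles_1_0("0.28.2", "1.3.0")` is true for the scheduler driver and the master, which is the combination without a compatibility guarantee.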

@vinodkone

> On Sep 7, 2017, at 3:25 PM, Eli Jordan  wrote:
> 
> Hi
> 
> Is there an API version compatibility guide somewhere, describing what 
> versions of the scheduler API are compatible with which mesos master versions?
> 
> I'm asking because we have several frameworks using the 0.28.2 java based 
> scheduler driver, and recently upgraded our mesos masters and agents to 
> 1.3.0. 
> 
> The frameworks do work, but we have been seeing some issues around 
> reconciliation that we weren't seeing before. Could this be related to a 
> scheduler driver incompatibility?
> 
> Thanks
> Eli


Re: Welcome James Peach as a new committer and PMC member!

2017-09-06 Thread Vinod Kone
Congrats and welcome!

On Wed, Sep 6, 2017 at 2:22 PM, Jie Yu  wrote:

> Congrats James! Well deserved!
>
> On Wed, Sep 6, 2017 at 2:08 PM, Yan Xu  wrote:
>
>> Hi Mesos devs and users,
>>
>> Please welcome James Peach as a new Apache Mesos committer and PMC member.
>>
>> James has been an active contributor to Mesos for over two years now. He
>> has made many great contributions to the project which include XFS disk
>> isolator, improvement to Linux capabilities support and IPC namespace
>> isolator. He's super active on the mailing lists and slack channels, always
>> eager to help folks in the community and he has been helping with a lot of
>> Mesos reviews as well.
>>
>> Here is his formal committer candidate checklist:
>>
>> https://docs.google.com/document/d/19G5zSxhrRBdS6GXn9KjCznjX3cp0mUbck6Jy1Hgn3RY/edit?usp=sharing
>>
>> Congrats James!
>>
>> Yan
>>
>>
>


Re: [VOTE] Release Apache Mesos 1.1.3 (rc2)

2017-08-27 Thread Vinod Kone
+1 (binding)

Tested on ASF CI. The only red build was the known perf core dump issue.

*Revision*: ce77d91bd3a59227d5684ce0783b460c54ea311f

   - refs/tags/1.1.3-rc2

Configuration Matrix

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Success  Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Success
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
ubuntu:14.04  --verbose                                 autotools   Success  Failed
ubuntu:14.04  --verbose                                 cmake       Success  Success

On Fri, Aug 25, 2017 at 7:48 AM, Alex Rukletsov  wrote:

> Folks,
>
> Please vote on releasing the following candidate as Apache Mesos 1.1.3.
> Note that this will be the last 1.1.x release.
>
> 1.1.3 includes the following:
> 
> 
> ** Bug
>  * [MESOS-5187] - The filesystem/linux isolator does not set the
> permissions of the host_path.
>   * [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
>   * [MESOS-6950] - Launching two tasks with the same Docker image
> simultaneously may cause a staging dir never cleaned up.
>   * [MESOS-7540] - Add an agent flag for executor re-registration timeout.
>   * [MESOS-7569] - Allow "old" executors with half-open connections to be
> preserved during agent upgrade / restart.
>   * [MESOS-7689] - Libprocess can crash on malformed request paths for
> libprocess messages.
>   * [MESOS-7690] - The agent can crash when an unknown executor tries to
> register.
>   * [MESOS-7581] - Fix interference of external Boost installations when
> using some unbundled dependencies.
>   * [MESOS-7703] - Mesos fails to exec a custom executor when no shell is
> used.
>  

Re: [VOTE] Release Apache Mesos 1.4.0 (rc1)

2017-08-21 Thread Vinod Kone
Ran on ASF CI.

Found 3 issues with tests.

1) GarbageCollectorIntegrationTest.ExitedFramework: This one seems fairly
new. *@Kapil can you confirm if this is a test issue or something in the
code?*

2) DiskResource Persistent Volume tests seem to have interleaved output.
This is a known issue which we never got to the bottom of; I added logs from
the CI. This is a CMake build FWIW.

3) Double free corruption in python example framework test. Known issue
which we never got to the bottom of; added latest logs.


*Revision*: b9187d54a97206b4a09fb5cb1d0834ab5fa5abd3

   - refs/tags/1.4.0-rc1

Configuration Matrix

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Success  Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Failed
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Failed   Success
ubuntu:14.04  --verbose                                 autotools   Success  Failed
ubuntu:14.04  --verbose                                 cmake       Success  Success

On Mon, Aug 21, 2017 at 10:58 AM, Zhitao Li  wrote:

> +1 (nonbinding)
>
> Tested by running `make check` on a debian/jessie server on AWS.
>
> On Fri, Aug 18, 2017 at 12:27 PM, Kapil Arya  wrote:
>
>> Hi all,
>>
>> Please vote on releasing the following candidate as Apache Mesos 1.4.0.
>>
>> 1.4.0 includes the following:
>> 
>>
>>   * Ability to recover the agent ID after a host reboot.
>>   * File-based and image-pull secrets.
>>   * Linux ambient and bounding capabilities support.
>>   * Hierarchical 

Re: [VOTE] Release Apache Mesos 1.3.1 (rc1)

2017-08-01 Thread Vinod Kone
+1 (binding)

Tested on ASF CI. The 2 red builds are known flaky tests (health checks)
and a perf core dump issue that's fixed on HEAD.

*Revision*: 1beaede8c13f0832d4921121da34f924deec8950

   - refs/tags/1.3.1-rc1

Configuration Matrix

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Failed   Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Success  Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Success
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
ubuntu:14.04  --verbose                                 autotools   Success  Failed
ubuntu:14.04  --verbose                                 cmake       Success  Success

On Fri, Jul 28, 2017 at 5:45 PM, Michael Park  wrote:

> Hi all,
>
> Please vote on releasing the following candidate as Apache Mesos 1.3.1.
>
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.3.1-rc1
>
> The candidate for Mesos 1.3.1 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.1-rc1/mesos-1.3.1.tar.gz
>
> The tag to be voted on is 1.3.1-rc1:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.3.1-rc1
>
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.1-rc1/mesos-1.3.1.tar.gz.md5
>
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.1-rc1/mesos-1.3.1.tar.gz.asc
>
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
>
> The JAR is up in Maven in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1200
>
> Please vote on 

Re: Containerizers & Executors

2017-07-30 Thread Vinod Kone
See my answers inline.


> 1. Mesos Containerizer
> - posix isolators
> - cgroups isolators
>

The Mesos containerizer also allows you to use custom isolators.



> 2. Docker containerizer
> - docker isolators
>

Docker containerizer doesn't have a concept of isolator(s).



> 3. Custom containerizer
> - my isolators
>

It is up to the custom containerizer how it wants to do containerization;
it may or may not have a concept of isolators.


> - Executors:
> Generally: Each executor has the minimum resources assigned by default
> (0.01 CPU & 32MB MEM)
> Executor expands its resources when a task is assigned
> (executor default resources + task resources)
>

Only the built-in "default" executor needs to have a minimum amount of
resources. Other built-in executors and custom executors can technically
have zero resources.



> 1. Mesos commandExecutor
> - run shell commands or docker
> - Each executor is a container that can have only one task to
> execute, you can't specify group of tasks
> - Isolation between executors/containers so isolation between
> tasks because each task runs in one container
>

Note that the executor that runs shell commands is called the "command"
executor (run by the Mesos containerizer), whereas the one that runs docker
images is called the "docker" executor (run by the Docker containerizer).



> 2. Mesos defaultExecutor
> - can run shell commands or a custom executor file e.g
> TestExecutor.java (from tests)
> - can execute one task per executor/container or multiple tasks (1
> group).
> - No resource isolation between tasks of the same container
>

The "default" executor is another built-in executor. It can run a group of
tasks. It does not run any other (custom) executor.



> 3. Custom Executor
> - ?
>

You could write a custom executor that can run a single task or a group of
tasks. Totally up to you.



> So, i guess i can use one offer to run some tasks on the same agent with
> commandExecutor or with defaultExecutor….
> But how would somebody specify if the offer corresponds to one agent or
> multiple agents?
>

Each offer has an 'AgentId' which corresponds to one agent.
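
A small sketch of how a scheduler might use that: group incoming offers by the agent they came from, so that tasks meant to be co-located are launched against offers from a single agent. Offers are modeled here as plain dictionaries with hypothetical `id` and `agent_id` keys; the real offer message is a protobuf and its field names are not taken from this thread.

```python
from collections import defaultdict


def group_offers_by_agent(offers):
    """Map agent ID -> list of offer IDs. Offers are plain dicts here."""
    by_agent = defaultdict(list)
    for offer in offers:
        by_agent[offer["agent_id"]].append(offer["id"])
    return dict(by_agent)
```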

HTH,
Vinod


Re: dynamic resource reservations

2017-07-28 Thread Vinod Kone
Typically a framework with no role cannot use resources reserved for
another role. So, it would be interesting to see what happened.

Also, please be aware that directly upgrading from 0.28.0 to 1.3.0 is not
supported. You need to go from 0.28.0 to 1.0.0 and then jump from 1.0.0 to
1.3.0.

On Fri, Jul 28, 2017 at 4:01 AM, Hendrik Haddorp 
wrote:

> Hi,
>
> we did a migration from Mesos 0.28 to 1.3.0 and somehow it looks like one
> framework "stole" resources another framework had reserved earlier.
> Unfortunately I do not have any logs for the time frame so I'm not certain
> what exactly happened. Currently we have one framework running with a role
> and principal while the others are running with roles * and no principal.
> Would a framework running with no role be able to use a resource that
> another framework reserved for a specific role?
>
> regards,
> Hendrik
>


Re: How to determine Mesos Capabilities?

2017-07-05 Thread Vinod Kone
When a scheduler registers or re-registers with the master, `MasterInfo` is
passed to the callback. This includes the version information which can
be used to determine which capabilities a Master has. This is admittedly
not great; there is a ticket to introduce Master capabilities and include
them in MasterInfo. https://issues.apache.org/jira/browse/MESOS-5675
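
Until master capabilities exist, a framework can gate features on the reported version. The sketch below assumes the version arrives as a `MAJOR.MINOR.PATCH` string, optionally with a suffix such as `-rc1`. The function names are hypothetical, and the minimum version for any given feature has to come from the Mesos release notes; none is asserted here.

```python
def version_tuple(version: str) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH', ignoring any suffix such as '-rc1'."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def master_supports(master_version: str, minimum_version: str) -> bool:
    """True if the master's reported version is at least `minimum_version`."""
    return version_tuple(master_version) >= version_tuple(minimum_version)
```

Comparing integer tuples rather than raw strings matters once minor versions reach two digits: as strings, "1.10.0" sorts before "1.9.0".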

On Wed, Jul 5, 2017 at 3:58 AM, Tomek Janiszewski  wrote:

> Here is the context of this problem:
> https://github.com/mesosphere/marathon/pull/5406#discussion_r125454193
> I want to backport support for Mesos health checks to Marathon 1.3. How can
> I ensure that Mesos supports HTTP/TCP health checks from Marathon's
> perspective?
>
> On Tue, Jul 4, 2017 at 17:56, Tomek Janiszewski
> wrote:
>
>> Hi
>>
>> Mesos allows frameworks to declare their capabilities. How can I get Mesos
>> capabilities from a framework's perspective?
>>
>> For example, I'm developing a framework that would use Mesos
>> health checks. How can I determine whether the Mesos version supports them?
>> I think it should be part of the subscription response. Currently I need to
>> query the Mesos API after subscription to get the Mesos version and
>> configuration. What is the best practice for this?
>>
>> Thanks
>> Tomek
>>
>


Re: Agent Working Directory Best Practices

2017-06-26 Thread Vinod Kone
This is great information. Thanks for sharing Steven!

On Tue, Jun 27, 2017 at 7:05 AM, Steven Schlansker <
sschlans...@opentable.com> wrote:

>
> > On Jun 25, 2017, at 11:24 PM, Benjamin Mahler wrote:
> >
> > As a data point, as far as I'm aware, most users are using a local work
> > directory, not an NFS mounted one. Would love to hear from anyone on the
> > list if they are doing this, and if there are any subtleties that should
> > be documented.
>
> We don't run NFS in particular but we did originally use a SAN -- two
> observations:
>
> NFS (historically, maybe it's better now, but doubtful...) has really bad
> failure modes. Network failures can cause serious hangs both in user-space
> and kernel-space. Such hangs can be impossible to clear without rebooting
> the machine, and in some edge cases can even make it difficult or
> impossible to reboot the machine via normal means.
>
> Network attached drives (our SAN) are less reliable, slower, and more
> complex (read: more failure modes) than local disk. It's also a really big
> single point of failure. So far our only true cluster outages have been due
> to failure of the SAN, since it took down all nodes at once -- once we
> removed the SAN, future failures had islands of availability and any
> properly written application could continue running (obviously without
> network resources) through the incident.
>
> Maybe this isn't a huge deal for your use case, which might differ from
> ours. For us, it was enough of a problem that we now purchase local SSD
> scratch space for every node just so that we have some storage we can
> depend on a bit more than network attached storage.
>
> >
> > On Thu, Jun 22, 2017 at 11:13 PM, wrote:
> > Hi,
> >
> > We have a couple of server nodes mainly used for computational tasks in
> > our mesos cluster. These servers have beefy cpus, gpus etc. but only
> > limited ssd space. We also have a 40GbE network and a decently fast
> > file server.
> >
> > My question is simple but I didn't find an answer anywhere: What are the
> > best practices for the working directory on mesos-agent nodes? Should
> > we keep the working directory local or is it reasonable to use a nfs
> > mounted folder? We implemented both and they seem to work fine, but I
> > would rather like to follow "best practices".
> >
> > Thanks and cheers
> >
> > Tom
> >
>
>


Re: Work group on Community

2017-06-21 Thread Vinod Kone
Can we use http://doodle.com/ to arrive at consensus regarding time slot?

@vinodkone

> On Jun 22, 2017, at 8:07 AM, Judith Malnick <jmaln...@mesosphere.io> wrote:
> 
> Hi everyone, 
> 
> Thanks for the interest! I know many of you are in Asia for MesosCon, so I'm 
> just going to propose a few times (Pacific time) and see if anything works. 
>   * Monday, June 26th at 5 pm
>   * Wednesday, June 28th at 10 am
>   * Thursday, July 6th at 8 am
>   * Wednesday, July 19th at 10 am
> Tell me what you think about these, and if none of them work we can try some 
> others. 
> 
> All the best! 
> Judith 
> 
> 
>> On Wed, Jun 21, 2017 at 2:47 AM, Jörg Schad <jo...@mesosphere.io> wrote:
>> Very excited and happy to join!
>> 
>>> On Sat, Jun 17, 2017 at 1:38 AM, James Peach <jor...@gmail.com> wrote:
>>> 
>>> > On Jun 15, 2017, at 10:57 AM, Vinod Kone <vinodk...@apache.org> wrote:
>>> >
>>> > Hi folks,
>>> >
>>> > Seeing that our first official containerizer WG is off to a good start, we
>>> > want to use that momentum to start new WGs.
>>> >
>>> > I'm proposing that we start a new work group on community. The mission of
>>> > this work group would be to figure out ways to grow the size of our
>>> > community and improve the experience of community members (users, devs,
>>> > contributors, committers etc).
>>> >
>>> > In the first meeting, we can nail down what the charter of this work group
>>> > should be etc. My initial ideas for the topics/components this work group
>>> > could cover
>>> >
>>> > --> Releases
>>> > --> Roadmap
>>> > --> Reviews
>>> > --> JIRA
>>> > --> CI
>>> >
>>> > Over time, I'm hoping that new specific work groups will spring up that
>>> > can own some of these topics.
>>> >
>>> > If you are interested in joining this work group, please reply to this
>>> > thread and I'll add you to the invite.
>>> 
>>> I'm interested, but unlikely to have much bandwidth to contribute anything 
>>> substantial. One suggestion I have is that a Mesos Weekly news would be 
>>> pretty great. There is a lot of activity on reviewboard, slack and in 
>>> design documents and collecting that in a regular newsletter would give 
>>> that activity a lot more visibility.
>>> 
>>> J
>> 
> 
> 
> 
> -- 
> Judith Malnick
> DC/OS Community Manager
> 310-709-1517


Re: Docker support bug in Mesos Containerizer in 1.3.0

2017-06-20 Thread Vinod Kone
There is. It's monthly unless it's urgent, in which case it could be done on 
demand. 

@vinodkone

> On Jun 21, 2017, at 8:10 AM, Michael Park  wrote:
> 
> We have not set a 1.3.1 just yet. I'd be in charge of making that happen 
> though.
> I'm happy to cut one, I'm not sure if there's a timeline for point releases. 
> cc @vinodkone
> 
>> On Tue, Jun 20, 2017 at 5:03 PM, Akash Gangil  wrote:
>> Is there any release date set for 1.3.1?
>> 
>>> On Tue, Jun 20, 2017 at 3:53 PM, Jie Yu  wrote:
>>> Hi,
>>> 
>>> We missed a backport into the 1.3.0 release. This will cause missing 
>>> environment variables from the Docker image. The patch has now been 
>>> backported to the 1.3.x branch. If you are using Mesos Containerizer's 
>>> docker image support, please wait for 1.3.1 release.
>>> 
>>> Details can be found in this ticket:
>>> https://issues.apache.org/jira/browse/MESOS-7692
>>> 
>>> - Jie
>> 
>> 
>> 
>> -- 
>> Akash
> 


Re: On Apache Mesos release process

2017-06-17 Thread Vinod Kone
Thanks for starting the discussion on this, Alex! Much appreciated
and needed.

I agree with all the points here :) I'm a big proponent of predictable time
based releases.

As an aside, should we spin up a working group for releases? Given the
frequency of our releases and burn down meetings needed, I think it will be
an active and vibrant group. In addition to the tactical aspects of
releases, this group can also come up with guidelines for releases,
improvements etc.

Thoughts?

On Sat, Jun 17, 2017 at 10:25 AM, Alex Rukletsov 
wrote:

> Folks,
>
> for more than a year, Apache Mesos releases have been done according to our "then
> new" release policy [1]. It seems to work quite well, but today I would
> like to address things that can be improved.
>
> Let's start with pain points:
> * A minor bug can cancel a release vote, even for a patch release.
> * More canceled votes lead to more RCs and hence create more work for
> committers and voters.
> * Demotivation for release on a candidate unless other people vote.
> * Releases often run behind schedule.
>
> I would like to suggest some improvements to the process:
>
> 1. Stricter time releases. The next release should go into planning (with
> release managers elected) right after the current is cut. Feature owners
> work with the release managers prior to the cut to track progress (k8s
> community aims for 2-3 meetings per week discussing blockers and schedule).
> This way release managers should have a satisfactory understanding which
> new features are going in and what can slow down the release several days
> before the cut.
>
> 2. Written guideline for which issues can '-1' the release. Though it is
> up to the voter how to vote, a clear guideline will set reasonable
> expectations and hopefully help us decrease the number of RCs. Regressions
> (security, performance, compatibility, functional) can cause -1.
> Regressions of experimental features cannot cause -1. Patch releases can be
> -1'd in exceptional cases, e.g., critical bug fix missing in the last patch
> release. New features cannot block a release.
>
> Note: We love reasonable -1 votes! It is so much better to defer a release
> than discover a critical regression from a production user report!
>
> 3. Release managers decide what is backported to the RC branch once it
> is cut (same for patch releases). Feature owners and committers are
> encouraged to update the release managers timely on the status and
> importance of features and bug fixes.
>
> And of course, I encourage everyone using Mesos to test & vote on release
> candidates! Identical cluster configurations are rare; each new setup helps
> with finding bugs and hence building better software.
>
> [1] https://github.com/apache/mesos/blob/master/docs/versioning.md
>
> Alex.
>


Welcome Greg Mann as a new committer and PMC member!

2017-06-13 Thread Vinod Kone
Hi folks,

Please welcome Greg Mann as the newest committer and PMC member of the
Apache Mesos project.

Greg has been an active contributor to the Mesos project for close to 2
years now and has made many solid contributions. His biggest source code
contribution to the project has been around adding authentication support
for default executor. This was a major new feature that involved quite a
few moving parts. Additionally, he also worked on improving the scheduler
and executor APIs.

Here is his more formal checklist for your perusal.

https://docs.google.com/document/d/1S6U5OFVrl7ySmpJsfD4fJ3_R8JYRRc5spV0yKrpsGBw/edit

Thanks,
Vinod


Re: What is the scheduler for the Command Executor

2017-06-05 Thread Vinod Kone
Hey Wenzhao.

Sorry for the delay in response to your earlier email. Looks like this
email is a duplicate of that, so I'll just answer this one. Please feel
free to ask further questions on this email thread.

> I'm studying the Mesos code and have become very confused about the internal
> workflow of executing a simple docker image,
> such as  "mesos-execute  --master=XXX  --containerizer=docker  --name=test
>  --docker_image=XXX --shell=false".
> I believe "mesos-1.2.0/src/cli/*execute.cpp*" is the implementation of
> this "mesos-execute", which is called "Command Executor" in the official
> document.
>

Your understanding is partially correct. `cli/execute.cpp` which gets
compiled into `mesos-execute` is a scheduler. It is responsible for
registering with mesos master and launching tasks. `mesos-execute` uses
either 1) command executor 2) docker executor or 3) default executor
depending on the arguments passed to it to execute the tasks. For running a
simple docker image 2) is typically used.
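
As a rough illustration of the choice described above, here is a small
Python sketch that maps a set of mesos-execute-style arguments to the
executor that would end up running the task. The function name, the
argument dictionary and the exact conditions (a task group selecting the
default executor, the docker containerizer plus an image selecting the
docker executor) are assumptions made for illustration; they are not
copied from cli/execute.cpp.

```python
def pick_executor(args):
    """Return which executor a task would run under (illustrative only).

    `args` is a hypothetical dict of mesos-execute-style flags. The
    conditions below are assumptions, not the real selection logic.
    """
    if args.get("task_group"):
        # Assumption: task groups are run by the default executor.
        return "default executor"
    if args.get("containerizer") == "docker" and args.get("docker_image"):
        # Assumption: docker containerizer plus an image means docker executor.
        return "docker executor"
    # Everything else falls back to the command executor.
    return "command executor"
```

With the flags from the question (containerizer=docker and a docker
image), this sketch returns "docker executor", which matches case 2)
above.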



> I see "*execute.cpp*" internally sets up a "CommandScheduler", which has a
> "received()" function that listens for events from the master. If it
> receives an "*Event::OFFERS*", it will start the procedure of executing
> the tasks on the offered resources (slaves).
>
> However, I cannot find exactly where the resource is offered to the client
> executable.
> I see there is an "offer()" function in "mesos-1.2.0/src/master/
> *master.cpp*". But it sends a "*ResourceOffersMessage*", not an event,
> and I see nothing that transforms the message into an event.
>

Master currently supports old (v0; that understand `ResourceOffersMessage`
for example) and new (v1; that understand `Event::OFFERS` for example)
schedulers.

The transformation/evolution from `ResourceOffersMessage` to
`Event::OFFERS` (for new schedulers) happens in
https://github.com/apache/mesos/blob/1.2.0/src/master/master.hpp#L289 .

For more info see:
http://mesos.apache.org/documentation/latest/scheduler-http-api/
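
To make the idea of "evolving" a v0 message into a v1 event concrete,
here is a minimal Python sketch. It uses plain dicts instead of the real
protobuf types, and the function names and simplified field layout are
assumptions for illustration only. The one detail it tries to show is
that the offers are wrapped in an event of type OFFERS, with the v0
`slave_id` field carried over as `agent_id`.

```python
def evolve_offer(v0_offer):
    """Copy a v0-style offer dict, renaming slave_id to agent_id."""
    v1_offer = dict(v0_offer)
    if "slave_id" in v1_offer:
        v1_offer["agent_id"] = v1_offer.pop("slave_id")
    return v1_offer


def evolve_offers_message(resource_offers_message):
    """Wrap the offers of a v0-style message in a v1-style OFFERS event.

    Both shapes are simplified, hypothetical dicts, not Mesos protobufs.
    """
    offers = resource_offers_message.get("offers", [])
    return {
        "type": "OFFERS",
        "offers": {"offers": [evolve_offer(o) for o in offers]},
    }
```

A v1 scheduler would then only ever see the event shape, regardless of
how the master represents the offer internally.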


> I find that only "mesos-1.2.0/src/sched/*sched.cpp*" can receive and
> process this type of message. But I don't see how is "*sched.cpp*" used
> in other code
>

`mesos-execute` uses the v1 Mesos scheduler library to send Calls and
receive Events. The code for that is in scheduler/scheduler.cpp.
sched.cpp is used by old v0 schedulers.



> So, I cannot find the exact workflow of sending the offered resource (from
> master), to the Command Executor.  What's the scheduler for this Command
> Executor?
> Could someone help me to understand?
>

As described above, master sends offers to CommandScheduler. Command
scheduler then launches command executor or docker executor or default
executor depending on the args.

Hope that helps,

Vinod


Re: RFC: Partition Awareness

2017-06-01 Thread Vinod Kone
On Thu, Jun 1, 2017 at 2:22 PM, Benjamin Mahler  wrote:

> If I understood correctly, the proposal is to not kill the tasks for
> non-partition aware frameworks? That seems like a pretty big change for
> frameworks that are not partition aware and expect the old killing
> semantics.
>

Adding to what Neil said, I think most (if not all) non-PA frameworks
would've already rescheduled the task after seeing a TASK_LOST. The
difference is that previously such tasks could come back to TASK_RUNNING iff
the master failed over and a non-strict registry (the default) was used. Now,
we are saying tasks can come back to TASK_RUNNING irrespective of master
failover. The assumption/hope is that this shouldn't break existing
frameworks in a catastrophic way.
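
For framework authors wondering what tolerating this looks like, here is
a minimal Python sketch of a status-update handler that reschedules on
TASK_LOST and then copes with the same task reporting TASK_RUNNING
later. The class, the method and the returned action strings are
hypothetical; a real framework would issue the corresponding scheduler
calls instead of returning strings.

```python
class TaskTracker:
    """Tracks task states and decides how to react to status updates."""

    def __init__(self):
        self.states = {}          # task_id -> last reported state
        self.rescheduled = set()  # task_ids already replaced after TASK_LOST

    def on_status(self, task_id, state):
        """Return 'reschedule', 'kill_duplicate' or 'none' for an update."""
        self.states[task_id] = state
        if state == "TASK_LOST" and task_id not in self.rescheduled:
            # First time we hear the task is lost: launch a replacement.
            self.rescheduled.add(task_id)
            return "reschedule"
        if state == "TASK_RUNNING" and task_id in self.rescheduled:
            # The lost task came back although a replacement exists.
            return "kill_duplicate"
        return "none"
```

Killing the returning task is only one possible policy; a framework
could just as well keep it and kill the replacement.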


Re: [VOTE] Release Apache Mesos 1.3.0 (rc3)

2017-05-31 Thread Vinod Kone
Thanks for the triage.

+1 (binding)

On Wed, May 31, 2017 at 1:33 PM, Neil Conway  wrote:

> On Tue, May 30, 2017 at 3:43 PM, Neil Conway 
> wrote:
> > Attached is the test log for this failure. From a quick look, seems as
> > though the agent starts to launch the task, including forking the
> > child process, but no subsequent task status updates or error messages
> > are observed. Gaston, have you seen this before?
> >
> > I filed https://issues.apache.org/jira/browse/MESOS-7589 to track this.
>
> I wasn't able to repro this failure. Per Gaston's email, there isn't
> enough information in the logs to understand what is going on here,
> although it certainly seems weird that apparently the executor doesn't
> start.
>
> I think this doesn't justify blocking the release, but we should watch
> to see if the problem recurs.
>
> Neil
>


Re: [Req]Starting Japan User Group

2017-05-31 Thread Vinod Kone
Great to hear about the participation for the meetup!

On Tue, May 30, 2017 at 9:22 PM, Kitayama, Shingo 
wrote:

> If you are interested, please register via the following links!
>
> Slack: https://mesostokyo.slack.com/
>
> Slack Registration: https://mesostokyo.herokuapp.com/
>
>
If you want, you can create a "tokyo" slack channel for the user group on
mesos.slack.com. That way users have one slack team to go to for everything
mesos.


Re: [VOTE] Release Apache Mesos 1.3.0 (rc3)

2017-05-30 Thread Vinod Kone
Ran on ASF CI.

Found the following issues.

Failed test: CommandExecutorCheckTest.CommandCheckDeliveredAndReconciled

Failed test: OneWayPartitionTest.MasterToSlave

Can you confirm if these are known or new issues?

Thanks,

On Thu, May 25, 2017 at 2:20 AM, Michael Park  wrote:

> Hi all,
>
> Please vote on releasing the following candidate as Apache Mesos 1.3.0.
>
>
> 1.3.0 includes the following:
> 
> 
>   - Multi-role framework support
>   - Executor authentication support
>   - Allow frameworks to modify their roles.
>
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.3.0-rc3
> 
> 
>
> The candidate for Mesos 1.3.0 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.0-rc3/mesos-1.3.0.tar.gz
>
> The tag to be voted on is 1.3.0-rc3:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.3.0-rc3
>
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.0-rc3/mesos-1.3.0.tar.gz.md5
>
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.0-rc3/mesos-1.3.0.tar.gz.asc
>
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
>
> The JAR is up in Maven in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1198
>
> Please vote on releasing this package as Apache Mesos 1.3.0!
>
> The vote is open until Tue May 30 11:59:59 PDT 2017 and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Mesos 1.3.0
> [ ] -1 Do not release this package because ...
>
> Thanks,
>
> MPark & Neil
>
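
The passing rule quoted above ("a majority of at least 3 +1 PMC votes")
can be written down as a small check. This Python sketch assumes one
reading of that rule: only binding (PMC) votes count, there must be at
least three +1 votes, and +1 votes must outnumber -1 votes. The function
name and the list-of-strings input are made up for illustration.

```python
def vote_passes(binding_votes):
    """Return True if the binding votes satisfy the assumed passing rule.

    `binding_votes` is a list of strings such as "+1", "0" or "-1".
    """
    plus_ones = sum(1 for vote in binding_votes if vote == "+1")
    minus_ones = sum(1 for vote in binding_votes if vote == "-1")
    # Assumed rule: at least three +1 votes and more +1 than -1 votes.
    return plus_ones >= 3 and plus_ones > minus_ones
```

Non-binding votes are deliberately left out of the input, since under
this reading they do not affect the outcome.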


Re: [RESULT][VOTE] Release Apache Mesos 1.0.4 (rc2)

2017-05-25 Thread Vinod Kone
We are officially done with 1.0.x. 

@vinodkone

> On May 25, 2017, at 8:55 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> 
> Should we add a 1.0.5 release version to JIRA? Or are we done with 1.0 bug
> fix release support?
> 
>> On Thu, May 4, 2017 at 12:32 PM, Vinod Kone <vinodk...@apache.org> wrote:
>> 
>> Hi all,
>> 
>> 
>> The vote for Mesos 1.0.4 (rc2) has passed with the
>> 
>> following votes.
>> 
>> 
>> +1 (Binding)
>> 
>> --
>> 
>> Ben Mahler
>> 
>> Vinod Kone
>> 
>> Anand Mazumdar
>> 
>> +1 (Non-binding)
>> 
>> --
>> 
>> 
>> There were no 0 or -1 votes.
>> 
>> 
>> Please find the release at:
>> 
>> https://dist.apache.org/repos/dist/release/mesos/1.0.4
>> 
>> 
>> It is recommended to use a mirror to download the release:
>> 
>> http://www.apache.org/dyn/closer.cgi
>> 
>> 
>> The CHANGELOG for the release is available at:
>> 
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.4
>> 
>> 
>> The mesos-1.0.4.jar has been released to:
>> 
>> https://repository.apache.org
>> 
>> 
>> The website (http://mesos.apache.org) will be updated shortly to reflect
>> this release.
>> 
>> 
>> Thanks,
>> 
>>> On Wed, May 3, 2017 at 10:08 AM, Anand Mazumdar <an...@apache.org> wrote:
>>> 
>>> +1 (binding)
>>> 
>>> make check passed on Ubuntu 16.04 with clang 3.6
>>> 
>>> -anand
>>> 
>>>> On Wed, May 3, 2017 at 10:01 AM, Vinod Kone <vinodk...@apache.org> wrote:
>>>> 
>>>> +1 (binding)
>>>> 
>>>> *Revision*: 4154f66d6c6dde8fd2cf2bbf0bfa155f24ac55d4
>>>> 
>>>>   - refs/tags/1.0.4-rc2
>>>> 
>>>> Configuration Matrix gcc clang
>>>> centos:7 --verbose --enable-libevent --enable-ssl autotools
>>>> [image: Success]
>>>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--
>>> verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
>>> GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%
>>> 7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>>> [image: Not run]
>>>> cmake
>>>> [image: Success]
>>>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose
>>> %20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=
>>> 1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%
>>> 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>>> [image: Not run]
>>>> --verbose autotools
>>>> [image: Success]
>>>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--
>>> verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%
>>> 3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>>> [image: Not run]
>>>> cmake
>>>> [image: Success]
>>>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose
>>> ,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_
>>> exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>>> [image: Not run]
>>>> ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
>>>> [image: Success]
>>>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--
>>> verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
>>> GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
>>> docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>>> [image: Success]
>>>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/32/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=-
>>> -verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
>>> GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
>>> docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>>> cmake
>>>> [image: Success]
>>>> <https://builds.ap

Re: Welcome Gilbert Song as a new committer and PMC member!

2017-05-24 Thread Vinod Kone
Congrats Gilbert!

On Wed, May 24, 2017 at 1:32 PM, Neil Conway  wrote:

> Congratulations Gilbert! Well-deserved!
>
> Neil
>
> On Wed, May 24, 2017 at 10:32 AM, Jie Yu  wrote:
> > Hi folks,
> >
> > I'm happy to announce that the PMC has voted Gilbert Song as a new
> > committer and member of PMC for the Apache Mesos project. Please join me
> > to congratulate him!
> >
> > Gilbert has been working on Mesos project for 1.5 years now. His main
> > contribution is his work on unified containerizer, nested container (aka
> > Pod) support. He also helped a lot of folks in the community regarding
> > their patches, questions, etc. He also played an important role organizing
> > MesosCon Asia last year and this year!
> >
> > His formal committer checklist can be found here:
> > https://docs.google.com/document/d/1iSiqmtdX_0CU-YgpViA6r6PU_aMCVuxuNUZ458FR7Qw/edit?usp=sharing
> >
> > Welcome, Gilbert!
> >
> > - Jie
>


Re: Use of ACLs.RegisterAgent.agent

2017-05-24 Thread Vinod Kone
If it hasn't been released it should be ok for us to do the rename. There
are no backwards compatibility guarantees for such things. But a heads up is
always nice, so thanks for doing that.

On Wed, May 24, 2017 at 12:44 PM, Neil Conway  wrote:

> FYI, I merged the change to rename this field into the master and
> 1.3.x branches; it will be included in the next 1.3.0 release
> candidate.
>
> Neil
>
>
> On Mon, May 22, 2017 at 10:43 AM, Alexander Rojas
>  wrote:
> > Hey guys,
> >
> > We just noted that there was an error when the `RegisterAgent` act was
> > introduced. Namely, its object field is listed as `agent` when by
> > convention we have used plural, so it should be `agents`. This ACL hasn’t
> > been part of any released version of Mesos, so if no one is using it I
> > will try to push for a rename without going through any deprecation cycle.
> >
> > The big question is if any of you are using this particular ACL in
> > production right now?
> >
> > Alexander Rojas
> > alexan...@mesosphere.io
> >
> >
> >
> >
>


Re: [VOTE] Release Apache Mesos 1.2.1 (rc1)

2017-05-17 Thread Vinod Kone
Ran it on ASF CI and saw some issues.

Segfault in "MasterTest.MultipleExecutors" in two builds [1] [2], which is
concerning. Is this a known issue?

"ContentType/AgentAPIStreamingTest.AttachInputToNestedContainerSession"
test failed.




On Sun, May 14, 2017 at 12:55 AM, tommy xiao  wrote:

> +1
>
> 2017-05-12 7:33 GMT+08:00 Adam Bordelon :
>
> > Hi all,
> >
> > Please vote on releasing the following candidate as Apache Mesos 1.2.1.
> >
> > 1.2.1 is a bug fix release. The CHANGELOG for the release is available at:
> > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.2.1-rc1
> >
> > The candidate for Mesos 1.2.1 release is available at:
> > https://dist.apache.org/repos/dist/dev/mesos/1.2.1-rc1/mesos-1.2.1.tar.gz
> >
> > The tag to be voted on is 1.2.1-rc1:
> > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.2.1-rc1
> >
> > The MD5 checksum of the tarball can be found at:
> > https://dist.apache.org/repos/dist/dev/mesos/1.2.1-rc1/mesos-1.2.1.tar.gz.md5
> >
> > The signature of the tarball can be found at:
> > https://dist.apache.org/repos/dist/dev/mesos/1.2.1-rc1/mesos-1.2.1.tar.gz.asc
> >
> > The PGP key used to sign the release is here:
> > https://dist.apache.org/repos/dist/release/mesos/KEYS
> >
> > The JAR is up in Maven in a staging repository here:
> > https://repository.apache.org/content/repositories/orgapachemesos-1192
> >
> > Please vote on releasing this package as Apache Mesos 1.2.1!
> >
> > The vote is open until Tue May 16 17:00 PDT 2017 and passes if a majority
> > of at least 3 +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Mesos 1.2.1
> > [ ] -1 Do not release this package because ...
> >
> > Thanks,
> > -Adam-
> >
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>


Re: [VOTE] Release Apache Mesos 1.1.2 (rc2)

2017-05-12 Thread Vinod Kone
+1 (binding)

Ran on ASF CI. The one configuration that is in "red" is due to a known
flaky issue with perf core dump during test suite teardown.

*Revision*: 37d98c55e1f43d6734729b5cdbed242ebc3263ed

   - refs/tags/1.1.2-rc2

Configuration Matrix                                                 gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl   autotools   Success  Not run
              --verbose --enable-libevent --enable-ssl   cmake       Success  Not run
              --verbose                                  autotools   Success  Not run
              --verbose                                  cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl   autotools   Success  Failed
              --verbose --enable-libevent --enable-ssl   cmake       Success  Success
              --verbose                                  autotools   Success  Success
              --verbose                                  cmake       Success  Success

On Fri, May 12, 2017 at 8:07 AM, Alex Rukletsov  wrote:

> Folks,
>
> Please vote on releasing the following candidate as Apache Mesos 1.1.2.
>
> 1.1.2 includes the following:
> 
> 
> ** Bug
>   * [MESOS-2537] - AC_ARG_ENABLED checks are broken.
>   * [MESOS-5028] - Copy provisioner cannot replace directory with symlink.
>   * [MESOS-5172] - Registry puller cannot fetch blobs correctly from http
> Redirect 3xx urls.
>   * [MESOS-6327] - Large docker images causes container launch failures:
> Too many levels of symbolic links.
>   * [MESOS-7057] - Consider using the relink functionality of libprocess in
> the executor driver.
>   * [MESOS-7119] - Mesos master crash while accepting inverse offer.
>   * [MESOS-7152] - The agent may be flapping after the machine reboots due
> to provisioner recover.
>   * [MESOS-7197] - Requesting tiny amount of CPU crashes master.
>   * [MESOS-7210] - HTTP health check doesn't work when mesos runs with
> --docker_mesos_image.
>   * [MESOS-7237] - Enabling cgroups_limit_swap can lead to "invalid
> argument" error.
> 

Re: [Req]Starting Japan User Group

2017-05-09 Thread Vinod Kone
On Tue, May 9, 2017 at 8:10 AM, Kitayama, Shingo 
wrote:

> In Japan, usually engineers will take tech event information from
> CONNPASS, not from Meetup.com
>
> ※CONNPASS  https://connpass.com/ (Sorry all Japanese)
>
>
>
>
I think using whatever portal works for you is fine. If it's not too much
work though, I would suggest creating a meetup.com group as well and just
link to connpass event when you schedule meetups.


Re: [VOTE] Release Apache Mesos 1.1.2 (rc1)

2017-05-08 Thread Vinod Kone
I saw this on ASF CI.
Expected flaky test?

[ RUN  ] HTTPCommandExecutorTest.TerminateWithACK
I0504 15:43:05.341382 32064 cluster.cpp:158] Creating default 'local' authorizer
I0504 15:43:05.345090 32064 leveldb.cpp:174] Opened db in 3.444533ms
I0504 15:43:05.345728 32064 leveldb.cpp:181] Compacted db in 603462ns
I0504 15:43:05.345772 32064 leveldb.cpp:196] Created db iterator in 16838ns
I0504 15:43:05.345788 32064 leveldb.cpp:202] Seeked to beginning of db in 1987ns
I0504 15:43:05.345799 32064 leveldb.cpp:271] Iterated through 0 keys
in the db in 269ns
I0504 15:43:05.345834 32064 replica.cpp:776] Replica recovered with
log positions 0 -> 0 with 1 holes and 0 unlearned
I0504 15:43:05.346590 32091 recover.cpp:451] Starting replica recovery
I0504 15:43:05.346793 32091 recover.cpp:477] Replica is in EMPTY status
I0504 15:43:05.347823 32098 replica.cpp:673] Replica in EMPTY status
received a broadcasted recover request from
__req_res__(168)@172.17.0.3:41866
I0504 15:43:05.348352 32090 recover.cpp:197] Received a recover
response from a replica in EMPTY status
I0504 15:43:05.348784 32098 recover.cpp:568] Updating replica status to STARTING
I0504 15:43:05.349874 32095 leveldb.cpp:304] Persisting metadata (8
bytes) to leveldb took 840720ns
I0504 15:43:05.349900 32095 replica.cpp:320] Persisted replica status
to STARTING
I0504 15:43:05.350070 32088 recover.cpp:477] Replica is in STARTING status
I0504 15:43:05.350971 32102 master.cpp:380] Master
2075640b-b7dc-44f0-89b5-b0f9af99be7e (41c61dc99119) started on
172.17.0.3:41866
I0504 15:43:05.351112 32088 replica.cpp:673] Replica in STARTING
status received a broadcasted recover request from
__req_res__(169)@172.17.0.3:41866
I0504 15:43:05.350991 32102 master.cpp:382] Flags at startup:
--acls="" --agent_ping_timeout="15secs"
--agent_reregister_timeout="10mins" --allocation_interval="1secs"
--allocator="HierarchicalDRF" --authenticate_agents="true"
--authenticate_frameworks="true" --authenticate_http_frameworks="true"
--authenticate_http_readonly="true"
--authenticate_http_readwrite="true" --authenticators="crammd5"
--authorizers="local" --credentials="/tmp/t7Ea9P/credentials"
--framework_sorter="drf" --help="false" --hostname_lookup="true"
--http_authenticators="basic" --http_framework_authenticators="basic"
--initialize_driver_logging="true" --log_auto_initialize="true"
--logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5"
--max_completed_frameworks="50"
--max_completed_tasks_per_framework="1000" --quiet="false"
--recovery_agent_removal_limit="100%" --registry="replicated_log"
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins"
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400"
--registry_store_timeout="100secs" --registry_strict="false"
--root_submissions="true" --user_sorter="drf" --version="false"
--webui_dir="/mesos/mesos-1.1.2/_inst/share/mesos/webui"
--work_dir="/tmp/t7Ea9P/master" --zk_session_timeout="10secs"
I0504 15:43:05.351322 32102 master.cpp:432] Master only allowing
authenticated frameworks to register
I0504 15:43:05.351335 32102 master.cpp:446] Master only allowing
authenticated agents to register
I0504 15:43:05.351341 32102 master.cpp:459] Master only allowing
authenticated HTTP frameworks to register
I0504 15:43:05.351348 32102 credentials.hpp:37] Loading credentials
for authentication from '/tmp/t7Ea9P/credentials'
I0504 15:43:05.351394 32094 recover.cpp:197] Received a recover
response from a replica in STARTING status
I0504 15:43:05.351594 32102 master.cpp:504] Using default 'crammd5'
authenticator
I0504 15:43:05.351850 32102 http.cpp:887] Using default 'basic' HTTP
authenticator for realm 'mesos-master-readonly'
I0504 15:43:05.352252 32102 http.cpp:887] Using default 'basic' HTTP
authenticator for realm 'mesos-master-readwrite'
I0504 15:43:05.352270 32090 recover.cpp:568] Updating replica status to VOTING
I0504 15:43:05.352500 32102 http.cpp:887] Using default 'basic' HTTP
authenticator for realm 'mesos-master-scheduler'
I0504 15:43:05.352635 32092 leveldb.cpp:304] Persisting metadata (8
bytes) to leveldb took 225076ns
I0504 15:43:05.352660 32092 replica.cpp:320] Persisted replica status to VOTING
I0504 15:43:05.352707 32102 master.cpp:584] Authorization enabled
I0504 15:43:05.352778 32087 recover.cpp:582] Successfully joined the Paxos group
I0504 15:43:05.352880 32091 hierarchical.cpp:149] Initialized
hierarchical allocator process
I0504 15:43:05.352883 32089 whitelist_watcher.cpp:77] No whitelist given
I0504 15:43:05.353144 32087 recover.cpp:466] Recover process terminated
I0504 15:43:05.355403 32085 master.cpp:2017] Elected as the leading master!
I0504 15:43:05.355437 32085 master.cpp:1560] Recovering from registrar

[RESULT][VOTE] Release Apache Mesos 1.0.4 (rc2)

2017-05-04 Thread Vinod Kone
Hi all,


The vote for Mesos 1.0.4 (rc2) has passed with the

following votes.


+1 (Binding)

--

Ben Mahler

Vinod Kone

Anand Mazumdar

+1 (Non-binding)

--


There were no 0 or -1 votes.


Please find the release at:

https://dist.apache.org/repos/dist/release/mesos/1.0.4


It is recommended to use a mirror to download the release:

http://www.apache.org/dyn/closer.cgi


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.4


The mesos-1.0.4.jar has been released to:

https://repository.apache.org


The website (http://mesos.apache.org) will be updated shortly to reflect
this release.


Thanks,

On Wed, May 3, 2017 at 10:08 AM, Anand Mazumdar <an...@apache.org> wrote:

> +1 (binding)
>
> make check passed on Ubuntu 16.04 with clang 3.6
>
> -anand
>
> On Wed, May 3, 2017 at 10:01 AM, Vinod Kone <vinodk...@apache.org> wrote:
>
> > +1 (binding)
> >
> > *Revision*: 4154f66d6c6dde8fd2cf2bbf0bfa155f24ac55d4
> >
> >- refs/tags/1.0.4-rc2
> >
> > Configuration Matrix gcc clang
> > centos:7 --verbose --enable-libevent --enable-ssl autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> 20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%
> 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%
> 7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > --verbose autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,
> ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_
> exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%
> 3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> 20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%
> 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose%20--
> enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> 20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%
> 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=-
> -verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > --verbose autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,
> ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,
> label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> Release/32/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose,
> ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,
> label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-e

Re: [VOTE] Release Apache Mesos 1.0.4 (rc2)

2017-05-03 Thread Vinod Kone
+1 (binding)

*Revision*: 4154f66d6c6dde8fd2cf2bbf0bfa155f24ac55d4

   - refs/tags/1.0.4-rc2

Configuration Matrix gcc clang
centos:7 --verbose --enable-libevent --enable-ssl autotools
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Not run]
cmake
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Not run]
--verbose autotools
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Not run]
cmake
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Not run]
ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
cmake
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
--verbose autotools
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
cmake
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
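
The build links above pack the whole matrix configuration into the last path segment, percent-encoded. As a reading aid, here is a minimal Python sketch that decodes one of those links into its axes. The function name `parse_matrix_axes` is made up for illustration, and the sketch assumes that axes are comma-separated and that no axis value itself contains a comma, which holds for the links in this thread.

```python
from urllib.parse import unquote


def parse_matrix_axes(url):
    """Split the last path segment of a matrix build link into axis -> value."""
    # The axes live in the final path segment, e.g. "BUILDTOOL=cmake,COMPILER=gcc,...".
    segment = url.rstrip("/").rsplit("/", 1)[-1]
    axes = {}
    for part in segment.split(","):
        # Split on the first "=" only: values such as "GLOG_v=1 MESOS_VERBOSE=1"
        # contain further "=" characters.
        key, _, value = part.partition("=")
        axes[key] = unquote(value)
    return axes


if __name__ == "__main__":
    link = (
        "https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/"
        "BUILDTOOL=autotools,COMPILER=gcc,"
        "CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,"
        "ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,"
        "label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/"
    )
    for axis, value in parse_matrix_axes(link).items():
        print(f"{axis}: {value}")
```
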

On Tue, May 2, 2017 at 4:03 PM, Benjamin Mahler <bmah...@apache.org> wrote:

> +1 make check passes on macOS 10.12.4 with clang
>
> On Tue, May 2, 2017 at 12:04 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
> > Hi all,
> >
> >
> > Please vote on releasing the following candidate as Apache Mesos 1.0.4.
> >
> >
> > 1.0.4 includes the following:
> >
> > 
> > 
> >
> > * [MESOS-2537] - AC_ARG_ENABLED checks are broken
> >
> >
> > * [MESOS-6606] - Reject optimized builds with libcxx before 3.9
> >
> >
> > * [MESOS-7008] - Quota not recovered from registry in empty cluster.
> >
> >
> > * [MESOS-7265] - Containerizer startup may cause sensitive data to leak
> > into sandbox logs.
> >
> > * [MESOS-7366] - Agent 

[VOTE] Release Apache Mesos 1.0.4 (rc2)

2017-05-02 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.4.


1.0.4 includes the following:



* [MESOS-2537] - AC_ARG_ENABLED checks are broken


* [MESOS-6606] - Reject optimized builds with libcxx before 3.9


* [MESOS-7008] - Quota not recovered from registry in empty cluster.


* [MESOS-7265] - Containerizer startup may cause sensitive data to leak
into sandbox logs.

* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire
persistent volume content.

* [MESOS-7383] - Docker executor logs possibly sensitive parameters.


* [MESOS-7422] - Docker containerizer should not leak possibly
sensitive data to agent log.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.4-rc2




The candidate for Mesos 1.0.4 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc2/mesos-1.0.4.tar.gz


The tag to be voted on is 1.0.4-rc2:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.4-rc2


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc2/mesos-1.0.4.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc2/mesos-1.0.4.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1186
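
For anyone checking the candidate before voting, here is a minimal Python sketch of the MD5 comparison. It assumes the tarball and its .md5 file have already been downloaded to local paths, and that the .md5 file contains the digest as one contiguous 32-character hex string; if the file uses a grouped format, the pattern would need adjusting. The function names are made up for illustration. This only checks integrity; the signature still has to be verified separately against the KEYS file.

```python
import hashlib
import re


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tarball_path, md5_path):
    """Compare the tarball's MD5 with the first 32-hex-digit string in the .md5 file."""
    with open(md5_path, "r", encoding="utf-8") as handle:
        published = handle.read()
    # Assumption: the digest appears as one contiguous run of 32 hex digits.
    match = re.search(r"\b[0-9a-fA-F]{32}\b", published)
    if match is None:
        raise ValueError("no 32-character hex digest found in " + md5_path)
    return md5_of_file(tarball_path) == match.group(0).lower()


if __name__ == "__main__":
    # Local file names are assumptions; adjust to wherever the files were saved.
    ok = checksum_matches("mesos-1.0.4.tar.gz", "mesos-1.0.4.tar.gz.md5")
    print("MD5 matches" if ok else "MD5 MISMATCH")
```
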


Please vote on releasing this package as Apache Mesos 1.0.4!


The vote is open until Fri May  5 12:02:42 PDT 2017 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.4

[ ] -1 Do not release this package because ...
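
To make the passing rule concrete, here is a minimal Python sketch of a tally. It reads the rule as: only binding (PMC) votes count, at least three of them must be +1, and the binding +1 votes must outnumber the binding -1 votes. That reading, the `Vote` type and the voter names are assumptions for illustration, not an official definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str     # hypothetical voter name
    value: int     # +1, 0 or -1
    binding: bool  # True when cast by a PMC member


def vote_passes(votes, minimum_binding_plus_ones=3):
    """Return True if the binding votes satisfy the assumed passing rule."""
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value > 0)
    minus = sum(1 for v in binding if v.value < 0)
    # At least three binding +1 votes, and more binding +1 than binding -1.
    return plus >= minimum_binding_plus_ones and plus > minus


if __name__ == "__main__":
    votes = [
        Vote("pmc-a", +1, True),
        Vote("pmc-b", +1, True),
        Vote("pmc-c", +1, True),
        Vote("contributor-d", +1, False),  # non-binding, not counted
    ]
    print("passes" if vote_passes(votes) else "does not pass")
```
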


Thanks,


Re: [VOTE] Release Apache Mesos 1.0.4 (rc1)

2017-05-01 Thread Vinod Kone
This vote has been cancelled. I'll cut another RC with the fix for
MESOS-7265 as Adam requested.

On Tue, Apr 25, 2017 at 5:38 PM, Greg Mann <g...@mesosphere.io> wrote:

> +1 (non-binding)
>
> Ran `sudo make check` on CentOS 7 with Docker 1.12.1. The only test
> failure was: ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_Whiteout.
> While I haven't had a chance to look deeply into this, it seems that the
> whiteout handling was not correct at the time of 1.0, and these changes
> were not backported to 1.0 so the failure is not surprising:
> https://issues.apache.org/jira/browse/MESOS-6360
>
> Also successfully ran the `test-upgrade.py` script both from 0.28.3 ->
> 1.0.4-rc1 and from 1.0.4-rc1 -> 1.1.1
>
> Cheers,
> Greg
>
>
> On Mon, Apr 24, 2017 at 3:23 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> +1 (binding)
>>
>> Tested on ASF CI.
>>
>> *Revision*: 71e41f166f671c988e36c1bf04728ec3589eb509
>>
>>- refs/tags/1.0.4-rc1
>>
>> Configuration Matrix gcc clang
>> centos:7 --verbose --enable-libevent --enable-ssl autotools
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> [image: Not run]
>> cmake
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> [image: Not run]
>> --verbose autotools
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> [image: Not run]
>> cmake
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> [image: Not run]
>> ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> cmake
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> --verbose autotools
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> cmake
>> [image: Success]
>> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>> [image: Success]

Fwd: Github's disappearing mirrors

2017-04-28 Thread Vinod Kone
FYI

-- Forwarded message --
From: Chris Lambertus 
Date: Fri, Apr 28, 2017 at 12:22 PM
Subject: Github's disappearing mirrors
To: committers 


Hello committers,

We have received quite a few reports of github mirrors gone missing. We’ve
tracked this down to an errant process at Github which appears to be deleting
not only ours but also other orgs’ mirrors. We contacted Github but have yet
to receive a reply. Another organization also contacted github and received
the following reply:

"Hi there, Sorry for the trouble! We've now had a couple of reports of this
problem, and we've opened an issue internally to investigate. I don't have an
ETA on a fix, but we'll be in touch if we need more information from you or if
we have any information to share. Regards, Laura GitHub Support"


We have no further information at this time. We have been restoring the
mirrors wherever possible, but until the root cause is resolved on Github’s
side, we expect mirrors to continue to be erroneously removed.

Access to the repos via the usual https://git-wip-us.apache.org/ channel
remains functional.

-Chris
ASF Infra




Re: [VOTE] Release Apache Mesos 1.0.4 (rc1)

2017-04-24 Thread Vinod Kone
+1 (binding)

Tested on ASF CI.

*Revision*: 71e41f166f671c988e36c1bf04728ec3589eb509

   - refs/tags/1.0.4-rc1

Configuration Matrix gcc clang
centos:7 --verbose --enable-libevent --enable-ssl autotools
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Not run]
cmake
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Not run]
--verbose autotools
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Not run]
cmake
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Not run]
ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
cmake
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
--verbose autotools
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
cmake
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
[image: Success]
<https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

On Mon, Apr 17, 2017 at 4:49 PM, Adam Bordelon <a...@mesosphere.io> wrote:

> -0, wish we could include the fix for
> https://issues.apache.org/jira/browse/MESOS-7265 in 1.0.4, but I won't hold
> the release for it.
>
> On Mon, Apr 17, 2017 at 3:44 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> Hi all,
>>
>> Please vote on releasing the following candidate as Apache Mesos 1.0.4.
>>
>>
>> 1.0.4 includes the following:
>>
>> 
>> 
>>
>> * [MESOS-2537] - AC_ARG_ENABLED checks are broken
>>
>>
>> * [MESOS-6606] - Reject optimized builds with libcxx before 3.9
>>
>>
>> * [MESOS-7008] - Quota not recovered from registry in empty cluster.
>>
>>
>> * [MESOS-7366] - Agent sandbox gc could accidentally delete the
>> entire persistent volume content.
>>

[VOTE] Release Apache Mesos 1.0.4 (rc1)

2017-04-17 Thread Vinod Kone
Hi all,

Please vote on releasing the following candidate as Apache Mesos 1.0.4.


1.0.4 includes the following:



* [MESOS-2537] - AC_ARG_ENABLED checks are broken


* [MESOS-6606] - Reject optimized builds with libcxx before 3.9


* [MESOS-7008] - Quota not recovered from registry in empty cluster.


* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire
persistent volume content.

* [MESOS-7383] - Docker executor logs possibly sensitive parameters.



The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.4-rc1




The candidate for Mesos 1.0.4 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc1/mesos-1.0.4.tar.gz


The tag to be voted on is 1.0.4-rc1:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.4-rc1


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc1/mesos-1.0.4.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc1/mesos-1.0.4.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1184


Please vote on releasing this package as Apache Mesos 1.0.4!


The vote is open until Thu Apr 20 15:42:56 PDT 2017 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.4

[ ] -1 Do not release this package because ...


Thanks,


Re: resourceOffer

2017-03-07 Thread Vinod Kone
Hmm. These logs do not have enough information. All I see is a master
starting up and an agent re-registering with a bunch of orphan tasks.  I
don't see the framework re-registering with the master at all.

On Tue, Mar 7, 2017 at 9:41 AM, Oeg Bizz <oegb...@yahoo.com> wrote:

> Sure, there they are.
>
>
> On Tuesday, March 7, 2017 12:34 PM, Vinod Kone <vinodk...@gmail.com>
> wrote:
>
>
> Can you share master log?
>
> @vinodkone
>
> On Mar 7, 2017, at 2:54 AM, Oeg Bizz <oegb...@yahoo.com> wrote:
>
> Hi,
> I am new to Mesos and have started exploring its usability for a new project
> I will be involved in. I wrote a scheduler and an executor, and I am able to
> send one task, which is executed properly. After the first task finishes,
> I no longer get resourceOffer() invocations on my Scheduler. What am I
> missing? If I do not send a task, I get the resourceOffer calls
> consistently every 5 seconds or so. Also, does Mesos send all of the
> resources every time or just a partial list? Thanks in advance for any
> help,
>
> Oscar
>
>
>
>


Re: [VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-03 Thread Vinod Kone
+1 (binding)

Since the perf issue I reported earlier doesn't seem to be a blocker.

On Fri, Mar 3, 2017 at 12:14 AM, Alex Rukletsov <a...@mesosphere.com> wrote:

> Was this perf issue introduced by one of the fixes included in 1.1.1-rc2?
> If not, I would suggest we vote for 1.1.1-rc2 and back port the perf fix
> into 1.1.2. IIUC, time based patch releases should *not be worse*, hence if
> the perf issue was already in 1.1.0 it is *fine* to fix it in 1.1.2. I
> would like to avoid postponing already belated 1.1.1 for even longer.
>
> On Wed, Mar 1, 2017 at 8:02 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
> > Tested on ASF CI.
> >
> > Saw 2 configurations fail with
> > https://issues.apache.org/jira/browse/MESOS-7160
> >
> > I think @jpeach and @bbannier were looking into this. Not sure about the
> > severity of the issue, so withholding my vote.
> >
> >
> > *Revision*: b9d8202a7444d0d1e49476bfc9817eb4583beaff
> >
> >- refs/tags/1.1.1-rc2
> >
> > Configuration Matrix gcc clang
> > centos:7 --verbose --enable-libevent --enable-ssl autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%
> > 7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > --verbose autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,
> > ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_
> > exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%
> > 3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Failed]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> > docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=-
> > -verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> > docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > --verbose autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,
> > ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,
> > label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Failed]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose,
> > ENVIRONMENT=GLOG_v=1%20MESOS_VERBO

Re: [VOTE] Release Apache Mesos 1.2.0 (rc2)

2017-03-03 Thread Vinod Kone
+1 (binding)

Since the perf issue and the flaky test that I reported earlier don't seem to
be blockers.

On Fri, Mar 3, 2017 at 4:01 PM, Adam Bordelon <a...@mesosphere.io> wrote:

> I haven't heard any -1's so I'm going to go ahead and vote myself, from a
> DC/OS perspective:
>
> +1 (binding)
>
> I ran 1.2.0-rc2 through the DC/OS integration tests on top of the
> 1.9.0-rc1, which covers many Mesos features and tests multiple frameworks.
> See CI results of https://github.com/dcos/dcos/pull/1295
>
> This was then merged into DC/OS 1.9.0-rc2 which passed another suite of
> integration tests. Available for testing at
> https://dcos.io/releases/1.9.0-rc2/
>
>
> On Thu, Mar 2, 2017 at 12:02 AM, Adam Bordelon <a...@mesosphere.io> wrote:
>
>> TL;DR: No consensus yet. Let's extend the vote for a day or two, until we
>> have 3 +1s or a legit -1.
>> During that time we can test further, and investigate any issues that
>> have shown up.
>>
>> Here's a summary of what's been reported on the 1.2.0-rc2 vote thread:
>>
>> - There was a perf core dump on ASF CI, which is not necessarily a
>> blocker:
>> MESOS-7160  Parsing of perf version segfaults
>>   Perhaps fixed by backporting MESOS-6982: PerfTest.Version fails on
>> recent Arch Linux
>>
>> - There were a couple of (known/unsurprising) flaky tests:
>> MESOS-7185  DockerRuntimeIsolatorTest.ROOT_INTERNET_CURL_DockerDefaultEntryptRegistryPuller
>> is flaky
>> MESOS-4570  DockerFetcherPluginTest.INTERNET_CURL_FetchImage seems flaky.
>>
>> - If we were to have an rc3, the following Critical bugs could be
>> included:
>> MESOS-7050  IOSwitchboard FDs leaked when containerizer launch fails --
>> leads to deadlock
>> MESOS-6982  PerfTest.Version fails on recent Arch Linux
>>
>> - Plus doc updates:
>> MESOS-7188 Add documentation for Debug APIs to Operator API doc
>> MESOS-7189 Add nested container launch/wait/kill APIs to agent API
>> docs.
>>
>>
>> On Wed, Mar 1, 2017 at 11:30 AM, Neil Conway <neil.con...@gmail.com>
>> wrote:
>>
>>> The perf core dump might be addressed if we backport this change:
>>>
>>> https://reviews.apache.org/r/56611/
>>>
>>> Although my guess is that this isn't a severe problem: for some
>>> as-yet-unknown reason, running `perf` on the host segfaulted, which
>>> causes the test to fail.
>>>
>>> Neil
>>>
>>> On Wed, Mar 1, 2017 at 11:09 AM, Vinod Kone <vinodk...@apache.org>
>>> wrote:
>>> > Tested on ASF CI.
>>> >
>>> > Saw 2 configurations fail. One was the perf core dump issue
>>> > <https://issues.apache.org/jira/browse/MESOS-7160>. Other is a known
>>> > (since 0.28.0) flaky test with Docker fetcher plugin
>>> > <https://issues.apache.org/jira/browse/MESOS-4570>.
>>> >
>>> > Withholding the vote until we know the severity of the perf core dump.
>>> >
>>> >
>>> > *Revision*: b9d8202a7444d0d1e49476bfc9817eb4583beaff
>>> >
>>> >- refs/tags/1.1.1-rc2
>>> >
>>> > Configuration Matrix gcc clang
>>> > centos:7 --verbose --enable-libevent --enable-ssl autotools
>>> > [image: Success]
>>> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--ver
>>> bose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=
>>> 1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%
>>> 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>> > [image: Not run]
>>> > cmake
>>> > [image: Success]
>>> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose
>>> %20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
>>> 20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoo
>>> p)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>> > [image: Not run]
>>> > --verbose autotools
>>> > [image: Success]
>>> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--ver
>>> bose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,
>>> label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>> > [image: Not run]
>>> > cmake
>>> > [image: Success]
>>> > <h

Re: [VOTE] Release Apache Mesos 1.2.0 (rc2)

2017-03-01 Thread Vinod Kone
Tested on ASF CI.

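Saw 2 configurations fail. One was the perf core dump issue
<https://issues.apache.org/jira/browse/MESOS-7160>. The other is a known
(since 0.28.0) flaky test with the Docker fetcher plugin
<https://issues.apache.org/jira/browse/MESOS-4570>.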

Withholding the vote until we know the severity of the perf core dump.


*Revision*: b9d8202a7444d0d1e49476bfc9817eb4583beaff

   - refs/tags/1.1.1-rc2

Configuration Matrix

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Success  Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Failed
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
ubuntu:14.04  --verbose                                 autotools   Success  Failed
ubuntu:14.04  --verbose                                 cmake       Success  Success


On Wed, Mar 1, 2017 at 9:24 AM, Greg Mann  wrote:

> I wanted to give a heads up on a flaky test failure I've encountered while
> testing this RC:
> 'DockerRuntimeIsolatorTest.ROOT_INTERNET_CURL_DockerDefaultEntryptRegistryPuller'.
> One issue related to this test was resolved recently
> (https://issues.apache.org/jira/browse/MESOS-6001), but this seems to be a
> separate issue (https://issues.apache.org/jira/browse/MESOS-7185). I haven't
> had time to triage yet so I'm not sure if this represents a legitimate bug,
> but I thought I'd email here to increase visibility while the vote is out.
>
> Cheers,
> Greg
>
>
> On Fri, Feb 24, 2017 at 1:14 AM, Adam Bordelon  wrote:
>
> > Dear Mesos developers and users,
> >
> > Please vote on releasing the following candidate as Apache Mesos 1.2.0.
> >
> > 1.2.0 includes the following:
> > 
> > 
> >   * 

Re: [VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-01 Thread Vinod Kone
Tested on ASF CI.

Saw 2 configurations fail with
https://issues.apache.org/jira/browse/MESOS-7160

I think @jpeach and @bbannier were looking into this. Not sure about the
severity of the issue, so withholding my vote.


*Revision*: b9d8202a7444d0d1e49476bfc9817eb4583beaff

   - refs/tags/1.1.1-rc2

Configuration Matrix

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Success  Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Failed
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
ubuntu:14.04  --verbose                                 autotools   Success  Failed
ubuntu:14.04  --verbose                                 cmake       Success  Success


On Mon, Feb 27, 2017 at 5:54 AM, Alex Rukletsov  wrote:

> Hi all,
>
> Please vote on releasing the following candidate as Apache Mesos 1.1.1.
>
> 1.1.1 includes the following:
> 
> 
> ** Bug
>   * [MESOS-6002] - The whiteout file cannot be removed correctly using
> aufs backend.
>   * [MESOS-6010] - Docker registry puller shows decode error "No response
> decoded".
>   * [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
>   * [MESOS-6360] - The handling of whiteout files in provisioner is not
> correct.
>   * [MESOS-6411] - Add documentation for CNI port-mapper plugin.
>   * [MESOS-6526] - `mesos-containerizer launch --environment` exposes
> executor env vars in `ps`.
>   * [MESOS-6571] - Add "--task" flag to mesos-execute.
>   * [MESOS-6597] - Include v1 Operator API protos in generated JAR and
> python packages.
>   * [MESOS-6606] - Reject optimized builds with libcxx before 3.9.
>   * [MESOS-6621] - SSL downgrade path will CHECK-fail when using both
> 

Re: Question: Modify mesos agent to add custom resources that change dynamically

2017-02-09 Thread Vinod Kone
Don't think that's possible today and I cannot think of easy workarounds
for it.

On Thu, Feb 9, 2017 at 1:39 AM, Carnero Iglesias, Javier <
javier.carn...@atos.net> wrote:

> Hi guys, I’ve posted a *question* on StackOverflow that has not been
> answered by anyone. I thought I'd share it with you so that maybe I can
> reach someone who has the answer:
>
> I'm developing a new mesos-slurm framework where jobs from outside mesos
> can also be pushed to slurm queues.
>
> The Mesos agent runs on the same machine as a Slurm workload manager that
> orchestrates jobs in an HPC. This Slurm instance receives jobs both from
> the Mesos executor and by other means (for example, third-party users
> sending jobs directly to Slurm through ssh).
>
> Therefore I'd like the agent to know, before sending offers to Mesos,
> the state of the Slurm queues (number of jobs running and waiting to run),
> and to offer resources accordingly. This cannot be achieved only by knowing
> the tasks accepted by the executor, as other resources of the HPC could
> have been taken by third-party users using Slurm directly.
>
> In other words, what I'd like to do is customize the way the agent
> determines the resources available to offer, to take into account the
> current state of the Slurm queues.
>
> Is this possible? If so, how could it be achieved?
>
> Thanks in advance.
>
> Javier Carnero
> Software Architect
> Research and Innovation Group
> *ARI booklet*
> 
> Atos IT Solutions and Services Iberia SL
> *javier.carnero**@atos.net*
> 
> +34 955 25 41 03 <+34%20955%2025%2041%2003>
>
>


Re: [VOTE] Release Apache Mesos 1.1.1 (rc1)

2017-02-08 Thread Vinod Kone
+1 (binding)

Tested on ASF CI.

*Revision*: 5d4c9962930c3f5c08e802caff40b670424cb091

   - refs/tags/1.1.1-rc1

Configuration Matrix

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Success  Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Success
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
ubuntu:14.04  --verbose                                 autotools   Success  Success
ubuntu:14.04  --verbose                                 cmake       Success  Success


On Wed, Feb 8, 2017 at 9:09 AM, Kapil Arya  wrote:

> +1 binding.
>
> Internal CI to build deb/rpm packages.
>
> The deb/rpm binary packages are available at:
> http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-1.1.1-rc1
>
>
> On Tue, Feb 7, 2017 at 5:39 PM, Alex R  wrote:
>
>> Hi all,
>>
>> Please vote on releasing the following candidate as Apache Mesos 1.1.1.
>>
>> 1.1.1 includes the following:
>> 
>> 
>> ** Bug
>>   * [MESOS-6002] - The whiteout file cannot be removed correctly using
>> aufs backend.
>>   * [MESOS-6010] - Docker registry puller shows decode error "No response
>> decoded".
>>   * [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
>>   * [MESOS-6360] - The handling of whiteout files in provisioner is not
>> correct.
>>   * [MESOS-6411] - Add documentation for CNI port-mapper plugin.
>>   * [MESOS-6526] - `mesos-containerizer launch --environment` exposes
>> executor env vars in `ps`.
>>   * [MESOS-6571] - Add "--task" flag to mesos-execute.
>>   * [MESOS-6597] - Include v1 Operator API protos in generated JAR and
>> python packages.
>>   * [MESOS-6621] - SSL downgrade path will CHECK-fail when 

[RESULT][VOTE] Release Apache Mesos 1.0.3 (rc2)

2017-02-06 Thread Vinod Kone
Hi all,


The vote for Mesos 1.0.3 (rc2) has passed with the following votes.


+1 (Binding)

--

Vinod Kone

Adam Bordelon

Kapil Arya


There were no 0 or -1 votes.


Please find the release at:

https://dist.apache.org/repos/dist/release/mesos/1.0.3


It is recommended to use a mirror to download the release:

http://www.apache.org/dyn/closer.cgi


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.3


The mesos-1.0.3.jar has been released to:

https://repository.apache.org


The website (http://mesos.apache.org) will be updated shortly to reflect
this release.


Thanks,

On Mon, Feb 6, 2017 at 12:07 PM, Kapil Arya <ka...@mesosphere.io> wrote:

> +1 binding.
>
> Built rpm/deb packages on internal build jobs. Packages are available here:
>  http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-1.0.3-rc2
>
> On Mon, Feb 6, 2017 at 3:01 PM, Adam Bordelon <a...@mesosphere.io> wrote:
>
> > +1 binding
> >
> > Tests passed against DC/OS 1.8.8 (prerelease), which is based on the
> Apache
> > Mesos 1.0.x branch.
> > https://github.com/dcos/dcos/pull/1210
> >
> > On Wed, Feb 1, 2017 at 10:01 AM, Vinod Kone <vinodk...@apache.org>
> wrote:
> >
> > > +1 (binding)
> > >
> > > Tested on ASF CI
> > >
> > >
> > > *Revision*: c673fdd00e7f93ab7844965435d57fd691fb4d8d
> > >
> > >- refs/tags/1.0.3-rc2
> > >
> > > > Configuration Matrix gcc clang
> > > > centos:7 --verbose --enable-libevent --enable-ssl autotools
> > > > [image: Success]
> > > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/26/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > > [image: Not run]
> > > > cmake
> > > > [image: Success]
> > > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/26/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > > [image: Not run]
> > > > --verbose autotools
> > > > [image: Success]
> > > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/26/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > > [image: Not run]
> > > > cmake
> > > > [image: Success]
> > > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/26/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > > [image: Not run]
> > > > ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
> > > > [image: Success]
> > > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/26/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > > [image: Success]
> > > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/26/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > > cmake
> > > > [image: Success]
> > > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/26/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > > [image: Success]
> > > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/26/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> >

[VOTE] Release Apache Mesos 1.0.3 (rc2)

2017-01-31 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.3.


1.0.3 includes the following:



* [MESOS-6052] - Unable to launch containers on CNI networks on CoreOS


* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.


* [MESOS-6621] - SSL downgrade path will CHECK-fail when using both
temporary and persistent sockets

* [MESOS-6676] - Always re-link with scheduler during re-registration.


* [MESOS-6917] - Segfault when the executor sets an invalid UUID when
sending a status update.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.3-rc2




The candidate for Mesos 1.0.3 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.3-rc2/mesos-1.0.3.tar.gz


The tag to be voted on is 1.0.3-rc2:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.3-rc2


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.3-rc2/mesos-1.0.3.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.3-rc2/mesos-1.0.3.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1174


Please vote on releasing this package as Apache Mesos 1.0.3!


The vote is open until Fri Feb  3 11:45:37 PST 2017 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.3

[ ] -1 Do not release this package because ...


Thanks,
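
For voters who want to check the tarball against the published MD5 before casting a vote, a minimal sketch follows. The helper names `md5_of_file` and `checksum_matches` are hypothetical, and the layout of the `.md5` file is an assumption (the sketch tolerates both a plain hex digest and a digest printed as space-separated byte pairs). It only complements the PGP signature check; it does not replace it.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, published_text):
    """Compare a local file with the text of a published .md5 file.

    Assumption: the published text contains the digest as hex, possibly
    upper-case and possibly split into space-separated byte pairs, so
    whitespace is stripped and case is ignored before comparing.
    """
    normalized = "".join(published_text.split()).lower()
    return md5_of_file(path) in normalized
```

Typical use would be to download `mesos-1.0.3.tar.gz` and `mesos-1.0.3.tar.gz.md5` from the candidate directory above, read the `.md5` file as text, and call `checksum_matches("mesos-1.0.3.tar.gz", text)`.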


Re: Framework stops to receive the heartbeats and events and gets removed from master

2017-01-23 Thread Vinod Kone
No problem. Glad you figured it out.

@vinodkone

> On Jan 23, 2017, at 8:38 AM, Vova Shelgunov  wrote:
> 
> Yes, it works. Sorry for the trouble; the first time I looked at the logs
> I did not notice that failover_timeout is zero.
> 
> 2017-01-23 19:27 GMT+03:00 Vova Shelgunov :
>> Logs from mesos master:
>> 
>> I0123 15:53:44.523613 7 http.cpp:391] HTTP POST for 
>> /master/api/v1/scheduler from 172.18.0.1:58864 with User-Agent='AHC/2.0'
>> I0123 15:53:44.524159 7 master.cpp:4827] Processing ACKNOWLEDGE call 
>> ac9a6e5e-67b3-490a-930f-0024eab734b4 for task 10336 of framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework) on agent 
>> 16c100c1-13fe-47b8-a2a0-aed9bafbbf8c-S0
>> I0123 15:53:44.524849 7 master.cpp:7744] Removing task 10336 with 
>> resources cpus(*):0.1; mem(*):32 of framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 on agent 
>> 16c100c1-13fe-47b8-a2a0-aed9bafbbf8c-S0 at slave(1)@172.18.0.3:5051 
>> (mesos-slave)
>> I0123 15:53:44.529033 7 master.cpp:1297] Framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework) disconnected
>> I0123 15:53:44.529636 7 master.cpp:2902] Disconnecting framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework)
>> I0123 15:53:44.529974 7 master.cpp:2926] Deactivating framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework)
>> I0123 15:53:44.530299 7 master.cpp:1310] Giving framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework) 0ns to 
>> failover
>> I0123 15:53:44.530594 7 hierarchical.cpp:386] Deactivated framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005
>> I0123 15:53:44.531962 7 master.cpp:6369] Framework failover timeout, 
>> removing framework 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP 
>> Framework)
>> I0123 15:53:44.534992 7 master.cpp:7103] Removing framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework)
>> 
>> It seems the failover timeout is set to zero for the framework.
>> 
>> It may be a coding error on my side that shows up when the framework loses
>> its connection to the master multiple times (I see that I do not pass the
>> failover_timeout value during reconnection).
>> I will fix that and see whether it solves my issue.
>> 
>> Thanks
>> 
>> 2017-01-23 19:05 GMT+03:00 Vova Shelgunov :
>>> Hi,
>>> 
>>> I have run into a very strange situation with my framework, which talks
>>> to the Mesos master via the Scheduler HTTP API:
>>> 
>>> Sometimes my framework stops receiving heartbeats and task updates
>>> from the master.
>>> I read the "Network partitions" section of the Mesos documentation
>>> (http://mesos.apache.org/documentation/latest/scheduler-http-api/)
>>> and I see that if a framework does not receive heartbeats within some
>>> time, it should reconnect to the master.
>>> 
>>> I have written a heartbeat monitor that reconnects if there were no
>>> heartbeats in the last n seconds, but after the reconnection I always
>>> receive an ERROR from the Mesos master saying that my framework has been
>>> removed.
>>> 
>>> Why is this happening?
>>> 
>>> Regards,
>>> Uladzimir
>> 
> 


Re: Framework stops to receive the heartbeats and events and gets removed from master

2017-01-23 Thread Vinod Kone
Can you paste the logs of master and framework?

@vinodkone

> On Jan 23, 2017, at 8:05 AM, Vova Shelgunov  wrote:
> 
> Hi,
> 
> I have run into a very strange situation with my framework, which talks
> to the Mesos master via the Scheduler HTTP API:
> 
> Sometimes my framework stops receiving heartbeats and task updates from
> the master.
> I read the "Network partitions" section of the Mesos documentation
> (http://mesos.apache.org/documentation/latest/scheduler-http-api/)
> and I see that if a framework does not receive heartbeats within some
> time, it should reconnect to the master.
> 
> I have written a heartbeat monitor that reconnects if there were no
> heartbeats in the last n seconds, but after the reconnection I always
> receive an ERROR from the Mesos master saying that my framework has been
> removed.
> 
> Why is this happening?
> 
> Regards,
> Uladzimir
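
A heartbeat monitor of the kind described above might look like the sketch below. It is an illustration under assumptions, not the poster's implementation: the class name `HeartbeatMonitor` and the `missed_allowed` threshold are hypothetical, and the interval is expected to come from the `heartbeat_interval_seconds` value the master sends in the SUBSCRIBED event.

```python
import time


class HeartbeatMonitor:
    """Tracks when the last event arrived on the subscription stream."""

    def __init__(self, interval_secs, missed_allowed=5, clock=time.monotonic):
        # Reconnect only after several intervals have passed without any
        # event; the multiplier is an assumed tuning knob.
        self._timeout = interval_secs * missed_allowed
        self._clock = clock
        self._last_seen = clock()

    def record_event(self):
        """Call for every event received, heartbeats included."""
        self._last_seen = self._clock()

    def should_reconnect(self):
        """True once the stream has been silent for longer than the timeout."""
        return self._clock() - self._last_seen > self._timeout
```

The clock is injectable so the timeout logic can be tested without sleeping; a scheduler loop would call `record_event()` on each received event and poll `should_reconnect()` from a timer.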


Welcome Neil Conway as Mesos Committer and PMC member!

2017-01-20 Thread Vinod Kone
Hi folks,

Please welcome Neil Conway as the newest committer and PMC member of the
Apache Mesos project.

Neil has been an active contributor to Mesos for more than a year now. As
part of his work, he has contributed some major features (Partition aware
frameworks, floating point operations for resources). Neil also took the
initiative to improve the documentation of our project and shepherded
several improvements over time. Doing that even without being a committer
shows that he takes ownership of the project seriously.

Here is his more formal checklist for your perusal.

https://docs.google.com/document/d/137MYwxEw9QCZRH09CXfn1544p1LuMuoj9LxS-sk2_F4/edit

Thanks,
Vinod


Mesos 1.0.3 release

2017-01-16 Thread Vinod Kone
Hi folks,

I'm planning to cut 1.0.3 release tomorrow. If you need anything that needs
to be backported, please mark the tickets as such.

Release dashboard:
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12330112

Thanks,
Vinod


Re: Mesos YouTube Channel

2017-01-09 Thread Vinod Kone
Thanks for doing this MPark!

On Mon, Jan 9, 2017 at 6:21 PM, Michael Park  wrote:

> I've created a brand channel for Mesos on YouTube for community activities:
> https://www.youtube.com/channel/UC0wxLxgX8ilUn0m31lCpzAw.
>
> The only community activities currently captured in the channel are:
>   - Developer Community Meetings, and
>   - MesosCon presentations I've collected as "Saved Playlists".
>
> Going forward, I think we can use this channel for work group meetings as
> well
> once those are a little more fleshed out.
>
> Thanks,
>
> MPark
>


Re: Proposal for evaluating Mesos scalability and robustness through stress test.

2017-01-06 Thread Vinod Kone
Great to hear!

Haven't looked at the doc yet, but I know some folks from Twitter were also
interested in this: https://issues.apache.org/jira/browse/MESOS-6768

Probably worth seeing whether the ideas can be consolidated?

On Fri, Jan 6, 2017 at 6:57 PM, Zhitao Li  wrote:

> (sending this again since the previous attempt seems to have bounced)
>
> Hi folks,
>
> Like all of you, we are super excited to use Mesos to manage thousands of
> different applications on our large-scale clusters. As the number of
> applications and hosts keeps increasing, we are getting more and more
> curious about the potential scalability limits and bottlenecks of Mesos'
> centralized architecture and about its robustness in the face of various
> failures. If we can identify them in advance, we can probably manage and
> optimize them before we suffer any performance degradation.
>
> To explore Mesos' capabilities and close the knowledge gap, we have a
> proposal to evaluate Mesos scalability and robustness through stress testing,
> the draft of which can be found at: draft_link
>  qpXzHYFQAZGWjCdS3cZA/edit?usp=sharing>.
> Please
> feel free to provide your suggestions and feedback through comment on the
> draft.
>
> Many of you probably have questions similar to ours. We will be happy to
> share our findings from these experiments with the Mesos community. Please
> stay tuned.
>
> --
> Cheers,
>
> Ao Ma & Zhitao Li
>


Re: Mesos 1.1.1 release dashboard

2016-12-22 Thread Vinod Kone
Same deal with the next patch release for 1.0.x ;)

@vinodkone

> On Dec 22, 2016, at 10:15 AM, Alex Rukletsov  wrote:
> 
> Folks,
> 
> We are planning to cut the 1.1.1 release early next week. If you have any
> patches that need to get into 1.1.1, please make sure that either it is
> already in the 1.1.x branch or the corresponding ticket has a target
> version including 1.1.1 *by Monday* Dec 26.
> 
> The release dashboard:
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12329892
> 
> AlexR & Till.


Re: Welcome Guangya Liu as Mesos Committer and PMC member!

2016-12-16 Thread Vinod Kone
Congrats Guangya! Welcome to the PMC!

On Fri, Dec 16, 2016 at 7:03 PM, Sam  wrote:

> congratulations Guangya
>
> Sent from my iPhone
>
> On 17 Dec 2016, at 3:23 AM, Avinash Sridharan 
> wrote:
>
> Congrats Guangya !!
>
> On Fri, Dec 16, 2016 at 11:20 AM, Greg Mann  wrote:
>
>> Congratulations Guangya!!! :D
>>
>> On Fri, Dec 16, 2016 at 11:10 AM, Jie Yu  wrote:
>>
>>> Hi folks,
>>>
>>> Please join me in formally welcoming Guangya Liu as Mesos Committer and
>>> PMC
>>> member.
>>>
>>> Guangya has worked on the project for more than a year now and has been a
>>> very active contributor to the project. I think one of the most important
>>> contributions he has made to the community is that he helped grow the
>>> Mesos community in China. He initiated the Xian-Mesos-User-Group and
>>> successfully organized two meetups which attracted more than 100 people
>>> from Xi’an, China. He wrote a handful of blogs and articles in Chinese
>>> tech media which attracted a lot of interest in Mesos. He has given
>>> several talks about Mesos at conferences in China.
>>>
>>> His major coding contribution to the project was the docker volume driver
>>> isolator. He has also been involved in allocator performance improvement,
>>> gpu support for docker containerizer, Mesos Tiers/Optimistic Offer
>>> design,
>>> scarce resources discussion, and many others.
>>>
>>> His formal checklist is here:
>>> https://docs.google.com/document/d/1tot79kyJCTTgJHBhzStFKrVkDK4pXqfl-LHCLOovNtI/edit?usp=sharing
>>>
>>> Thanks,
>>> - Jie
>>>
>>
>>
>
>
> --
> Avinash Sridharan, Mesosphere
> +1 (323) 702 5245 <(323)%20702-5245>
>
>


Welcome Haosdent Huang as Mesos Committer and PMC member!

2016-12-16 Thread Vinod Kone
Hi folks,

Please join me in formally welcoming Haosdent Huang as Mesos Committer and
PMC member.

Haosdent has been an active contributor to the project for more than a year
now. He has contributed a number of patches and features to the Mesos code
base, most notably the unified cgroups isolator and health check
improvements. The most impressive thing about him is that he always
volunteers to help out people in the community, be it on slack/IRC or
mailing lists. The fact that he does all this even though working on Mesos
is not part of his day job is even more impressive.

Here is his more formal checklist for your perusal.

Thanks,
Vinod

P.S: Sorry for the delay in sending the welcome email.


Re: Quota

2016-12-09 Thread Vinod Kone
And how many resources does spark need?

On Fri, Dec 9, 2016 at 4:05 PM, Vijay Srinivasaraghavan <
vijikar...@yahoo.com> wrote:

> Here is the slave state info. I see marathon is registered as
> "slave_public" role and is configured with "default_accepted_resource_roles"
> as "*"
>
> "slaves":[
>   {
>     "id":"69356344-e2c4-453d-baaf-22df4a4cc430-S0",
>     "pid":"slave(1)@xxx.xxx.xxx.100:5051",
>     "hostname":"xxx.xxx.xxx.100",
>     "registered_time":1481267726.19244,
>     "resources":{
>       "disk":12099.0,
>       "mem":14863.0,
>       "gpus":0.0,
>       "cpus":4.0,
>       "ports":"[1025-2180, 2182-3887, 3889-5049, 5052-8079, 8082-8180, 8182-32000]"
>     },
>     "used_resources":{
>       "disk":0.0,
>       "mem":0.0,
>       "gpus":0.0,
>       "cpus":0.0
>     },
>     "offered_resources":{
>       "disk":0.0,
>       "mem":0.0,
>       "gpus":0.0,
>       "cpus":0.0
>     },
>     "reserved_resources":{},
>     "unreserved_resources":{
>       "disk":12099.0,
>       "mem":14863.0,
>       "gpus":0.0,
>       "cpus":4.0,
>       "ports":"[1025-2180, 2182-3887, 3889-5049, 5052-8079, 8082-8180, 8182-32000]"
>     },
>     "attributes":{},
>     "active":true,
>     "version":"1.0.1"
>   }
> ],
>
> Regards
> Vijay
> On Friday, December 9, 2016 3:48 PM, Vinod Kone <vinodk...@apache.org>
> wrote:
>
>
> How many resources does the agent register with the master? How many
> resources does spark task need?
>
> I'm guessing marathon is not registered with "test" role so it is only
> getting un-reserved resources which are not enough for spark task?
>
> On Fri, Dec 9, 2016 at 2:54 PM, Vijay Srinivasaraghavan <
> vijikar...@yahoo.com> wrote:
>
> I have a standalone DCOS setup (Single node Vagrant VM running DCOS
> v.1.9-dev build + Mesos 1.0.1 + Marathon 1.3.0). Both master and agent are
> running on same VM.
>
> Resource: 4 CPU, 16GB Memory, 20G Disk
>
> I have created a quota using new V1 API which creates a role "test" with
> resource constraints of 0.5 CPU and 1G Memory.
>
> When I try to deploy Spark package, Marathon receives the request but the
> task is in "waiting" state since it did not receive any offers from Master
> though I don't see any resource constraints from the hardware perspective.
>
> However, when I deleted the quota, Marathon is able to move forward with
> the deployment and Spark was deployed/up and running. I could see from the
> Mesos master logs that it had sent an offer to the Marathon framework.
>
> To debug the issue, I was trying to create a quota but this time did not
> provide any CPU and Memory (0 cpu and 0 mem). After this, when I try to
> deploy Spark from DCOS UI, I could see Marathon getting offer from Master
> and able to deploy Spark without the need to delete the quota this time.
>
> Did anyone notice similar behavior?
>
> Regards
> Vijay
>
>
>
>
>


Re: Quota

2016-12-09 Thread Vinod Kone
How many resources does the agent register with the master? How many
resources does spark task need?

I'm guessing marathon is not registered with "test" role so it is only
getting un-reserved resources which are not enough for spark task?

On Fri, Dec 9, 2016 at 2:54 PM, Vijay Srinivasaraghavan <
vijikar...@yahoo.com> wrote:

> I have a standalone DCOS setup (Single node Vagrant VM running DCOS
> v.1.9-dev build + Mesos 1.0.1 + Marathon 1.3.0). Both master and agent are
> running on same VM.
>
> Resource: 4 CPU, 16GB Memory, 20G Disk
>
> I have created a quota using new V1 API which creates a role "test" with
> resource constraints of 0.5 CPU and 1G Memory.
>
> When I try to deploy Spark package, Marathon receives the request but the
> task is in "waiting" state since it did not receive any offers from Master
> though I don't see any resource constraints from the hardware perspective.
>
> However, when I deleted the quota, Marathon is able to move forward with
> the deployment and Spark was deployed/up and running. I could see from the
> Mesos master logs that it had sent an offer to the Marathon framework.
>
> To debug the issue, I was trying to create a quota but this time did not
> provide any CPU and Memory (0 cpu and 0 mem). After this, when I try to
> deploy Spark from DCOS UI, I could see Marathon getting offer from Master
> and able to deploy Spark without the need to delete the quota this time.
>
> Did anyone notice similar behavior?
>
> Regards
> Vijay
>
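
The thread does not show the quota request that was submitted. For reference, a request of the kind described (role "test", 0.5 CPU, 1G memory) could be built as in the sketch below. It follows the JSON shape documented for the master's `/quota` endpoint; the helper names are hypothetical, expressing 1G as 1024 MB is an assumption, and the V1 Operator API mentioned in the message carries an equivalent request inside its own call envelope, which is not reproduced here.

```python
import json


def scalar(name, value):
    """One scalar resource entry in the quota guarantee."""
    return {"name": name, "type": "SCALAR", "scalar": {"value": value}}


def build_quota_request(role, cpus, mem_mb):
    """JSON body for a quota request guaranteeing cpus and mem to a role."""
    return {"role": role, "guarantee": [scalar("cpus", cpus), scalar("mem", mem_mb)]}


if __name__ == "__main__":
    # Values taken from the message: 0.5 CPU and 1G (assumed 1024 MB) for "test".
    print(json.dumps(build_quota_request("test", 0.5, 1024), indent=2))
```

Comparing such a request with the agent's `unreserved_resources` and with what the Spark task asks for is one way to answer the two questions raised in the reply.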


Re: Authentication module

2016-12-04 Thread Vinod Kone
Authentication is enabled for Mesos APIs used by schedulers (to talk to
master), operators (to talk to master/agent) and agents (to talk to
master). Executor to agent communication is not currently authenticated.

This might shed some light:
https://github.com/apache/mesos/blob/master/docs/authentication.md

On Fri, Dec 2, 2016 at 11:48 AM, Alexander Gallego 
wrote:

>
> For the authentication module
> (http://mesos.apache.org/documentation/latest/modules/): does it mean
> Kerberos, LDAP, etc. for tasks, for framework registration, or for machine
> registration?
>
> Are there any more docs on this?
>
>
>


Re: Force offer from all of the slaves

2016-11-28 Thread Vinod Kone
Once you set GLOG_v, you should be able to see lines like these "Framework
 filtered agent   for <123> seconds"

On Sun, Nov 27, 2016 at 8:18 AM, haosdent  wrote:

> > I choose the right offer and decline the rest.
> Hi @krishnanvr, do you use up all available resources in that agent's
> offer? If so, that agent cannot provide offers anymore until the
> resources are released.
>
> You may also consider starting the master with the `GLOG_v=1` environment
> variable, which prints more detailed logs to help you debug this.
>
> On Sat, Nov 26, 2016 at 5:05 PM, Krishnanarayanan VR <
> krishna...@phonepe.com> wrote:
>
>> Hello:
>>
>> Is there a way to force ResourceOffers to get offers from all available
>> slaves ?
>>
>> Let me clarify:
>>
>> I have a single framework in my cluster. Each time ResourceOffers gets
>> the list of offers, I choose the right offer and decline the rest. But I
>> notice that next time a callback to ResourceOffers occurs, only a subset of
>> slaves is present in the offer. The slave from offer that was chosen in the
>> previous iteration is invariably absent.
>>
>> I also tried to set refuse_seconds to 0 in both LaunchTasks and
>> Decline (example below):
>>
>> driver.DeclineOffer(offer.Id, {RefuseSeconds: proto.Float64(0)})
>>
>> ^^ but that didn't seem to help.
>>
>> Any pointers on how I can make sure I am presented with offers from all
>> the slaves all the time?
>>
>> Thanks
>>
>>
>>
>>
>
>
> --
> Best Regards,
> Haosdent Huang
>
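
The thread uses the Go driver; as a language-neutral illustration, the sketch below builds the equivalent Scheduler HTTP API calls in Python. The helper names are hypothetical. DECLINE carries a filter whose `refuse_seconds` controls how long the declined resources are withheld from the framework (5 seconds when no filter is given), and REVIVE asks the master to drop filters set by earlier calls.

```python
import json


def build_decline_call(framework_id, offer_ids, refuse_seconds=0.0):
    """DECLINE the given offers, withholding them for `refuse_seconds`."""
    return {
        "framework_id": {"value": framework_id},
        "type": "DECLINE",
        "decline": {
            "offer_ids": [{"value": offer_id} for offer_id in offer_ids],
            "filters": {"refuse_seconds": float(refuse_seconds)},
        },
    }


def build_revive_call(framework_id):
    """REVIVE clears filters previously set by this framework."""
    return {"framework_id": {"value": framework_id}, "type": "REVIVE"}


if __name__ == "__main__":
    # Placeholder IDs, for illustration only.
    print(json.dumps(build_decline_call("framework-1", ["offer-1", "offer-2"])))
    print(json.dumps(build_revive_call("framework-1")))
```

Neither call brings back an agent whose resources are fully in use by a launched task, which is the case raised in the reply above.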


Re: Mesos making offers for no CPU

2016-11-27 Thread Vinod Kone
On Sun, Nov 27, 2016 at 7:53 PM, Christopher Hunt <
christopher.h...@lightbend.com> wrote:

> My question here though is in the event of receiving a resource offer with
> no CPU in it, and then declining it, why shouldn’t my framework receive
> offers regarding other nodes with CPU? Surely a resource offer declination
> is an indicator that the particular node in question isn’t suitable and
> Mesos should move on…
>

You should receive offers from other nodes. Are there other frameworks in
your cluster that are starving out this framework? Can you check the master
logs (and paste them here) to see which frameworks are being sent offers for
the other nodes?


[RESULT][VOTE] Release Apache Mesos 1.0.2 (rc3)

2016-11-15 Thread Vinod Kone
Hi all,


The vote for Mesos 1.0.2 (rc3) has passed with the following votes.


+1 (Binding)

--

Alex Rukletsov

Till Toenshoff

Yan Xu


There were no 0 or -1 votes.


Please find the release at:

https://dist.apache.org/repos/dist/release/mesos/1.0.2


It is recommended to use a mirror to download the release:

http://www.apache.org/dyn/closer.cgi


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.2


The mesos-1.0.2.jar has been released to:

https://repository.apache.org


The website (http://mesos.apache.org) will be updated shortly to reflect
this release.


Thanks,


[VOTE] Release Apache Mesos 1.0.2 (rc3)

2016-11-07 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.2.


This is a bug fix release.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.2-rc3




The candidate for Mesos 1.0.2 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc3/mesos-1.0.2.tar.gz


The tag to be voted on is 1.0.2-rc3:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.2-rc3


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc3/mesos-1.0.2.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc3/mesos-1.0.2.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1168


Please vote on releasing this package as Apache Mesos 1.0.2!


The vote is open until Thu Nov 10 11:22:30 PST 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.2

[ ] -1 Do not release this package because ...


Thanks,


[VOTE] Release Apache Mesos 1.0.2 (rc2)

2016-10-31 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.2.


This is a bug fix release.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.2-rc2




The candidate for Mesos 1.0.2 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc2/mesos-1.0.2.tar.gz


The tag to be voted on is 1.0.2-rc2:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.2-rc2


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc2/mesos-1.0.2.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc2/mesos-1.0.2.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1164


Please vote on releasing this package as Apache Mesos 1.0.2!


The vote is open until Thu Nov  3 16:34:20 PDT 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.2

[ ] -1 Do not release this package because ...


Thanks,


Re: outstanding offers

2016-10-31 Thread Vinod Kone
Are you running a custom framework?

Can you see in scheduler logs which offers you are receiving? Am I
understanding your question correctly that Mesos thinks offers are being
sent to your framework but (you think) your framework hasn't received them?

Note that you can increase logging on the framework (driver) and Mesos
master by setting GLOG_v=1 in the environment.
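
A minimal sketch of what "setting GLOG_v=1 in the environment" can look like when the framework is started from a launcher script. The helper name and the placeholder child command are assumptions for illustration; substitute whatever actually starts your scheduler or master.

```python
import os
import subprocess
import sys


def env_with_verbose_glog(level=1, base_env=None):
    """Return a copy of the environment with GLOG_v set to `level`."""
    env = dict(os.environ if base_env is None else base_env)
    env["GLOG_v"] = str(level)
    return env


if __name__ == "__main__":
    # Placeholder child process that only echoes the variable; replace the
    # command with the one that launches your framework or the master.
    subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['GLOG_v'])"],
        env=env_with_verbose_glog(1),
        check=True,
    )
```

The variable has to be present before the process starts, which is why the sketch passes it through the child's environment instead of setting it afterwards.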

On Mon, Oct 31, 2016 at 12:42 AM, Hendrik Haddorp 
wrote:

> Hi,
>
> I have a Mesos 0.28.2 system and generally things seem to run fine. The
> "Outstanding Offers" normally shows nothing, which I believe is normal.
> However at some point my framework gets disconnected for some odd reason,
> might be due to some high load or so. A few seconds later I receive a
> reregistered call from Mesos. However it looks like around this time offers
> start to get listed on the "Outstanding Offers" page. Even more strangely no
> Mesos log file contains any information for the offer IDs shown.
> Unfortunately the default logging does not show what offer IDs are being
> sent out while it shows the IDs that are being declined or got accepted. So
> I don't know when these offers actually got sent out.
>
> How can I deal with such situation? Should I:
> Stop the SchedulerDriver when I get disconnected instead of waiting
> for a reregistered call?
> Is it advised to set --offer_timeout to recover from such a situation?
> Is there any way to reconcile offers like one can do for tasks?
>
> thanks,
> Hendrik
>


Re: On Mesos versioning and deprecation policy

2016-10-28 Thread Vinod Kone
We had an extended discussion around this in the last community sync.
Thanks to those who participated!

To sum up the discussion:

--> As mesos devs, we should strive to not make incompatible changes in
APIs, flags, environment variables.

--> In the rare case where an incompatible change is preferred (e.g., to
reduce code complexity), we should give users a clear 6-month heads-up that a
breaking change is going to take place.

--> Breaking changes do not necessitate a major version bump. This is
because we want to allow live upgrades between major versions (e.g., 1.10
to 2.0).

--> Compatibility guarantees do not apply to experimental features (incl.
APIs).

--> We need to have clear documentation about the procedure that devs can
follow when deprecating/removing stable features and adding experimental
features.

--> We need to improve upgrades.md to make it easy for operators to know
what features are deprecated/removed between versions X and Y.

--> We should decouple internal protos used by Mesos from the unversioned
protos used by driver based frameworks.
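
To make the 6-month heads-up concrete, here is a small sketch that computes the earliest date a breaking change could land after it has been announced. The thread does not say how the six months are counted; calendar months are assumed here, and the function names are illustrative only.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return `d` shifted by `months` calendar months, clamping the day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def earliest_removal(announced, notice_months=6):
    """Earliest date a deprecated feature could be removed (assumed rule)."""
    return add_months(announced, notice_months)
```

For example, a deprecation announced on `date(2016, 10, 28)` would give `earliest_removal(...) == date(2017, 4, 28)` under this reading.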

I will spend some time in the next few weeks to create/update the
documentation reflecting these points.

Anything else I missed?

Thanks,

On Sat, Oct 15, 2016 at 11:47 AM, haosdent <haosd...@gmail.com> wrote:

> Thanks for @yan's great input! I agree with almost all of it.
>
> > Also the API is not just what the machine reads but all the documentation
> associated with it, right? It depends on what the documentation says; what
> the user _should_ expect.
>
> I think different users may have different expectations. And the person who
> developed the APIs may have a different understanding from some users as well.
> Our documentation should cover most cases.
>
> But in case we didn't or forgot to write it explicitly in the
> document, should we give up on updating the API? For example, user Alice says
> this is a bug while user Bob says this is a feature. I think we still need
> to raise it case by case to ensure most users are not affected by the
> breaking API changes.
>
> On Sat, Oct 15, 2016 at 6:55 AM, Vinod Kone <vinodk...@apache.org> wrote:
>
> > We will chat about this in the upcoming community sync (thursday 3 PM).
> > So, please make sure to attend if you are interested.
> >
> > On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu <xuj...@apple.com> wrote:
> >
> >>
> >> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xuj...@apple.com> wrote:
> >>
> >>> Thanks Alex for starting this!
> >>>
> >>> In addition to comments below, I think it'll be helpful to keep the
> >>> existing versioning doc concise and user-friendly while having a
> dedicated
> >>> doc for the "implementation details" where precise requirements and
> >>> procedures go. Maybe some duplication/cross-referencing is needed but
> Mesos
> >>> developers will find the latter much more helpful while the
> users/framework
> >>> developer will find the former easy to read.
> >>>
> >>> e.g., a similar split:
> >>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
> >>> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
> >>> (which has a lot of details on how the kubernetes community is
> >>> thinking about similar issues, which we can learn from)
> >>>
> >>> Jiang Yan Xu 
> >>>
> >>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <a...@mesosphere.com>
> >>> wrote:
> >>>
> >>>> Folks,
> >>>>
> >>>> There have been a bunch of online [1, 2] and offline discussions about
> >>>> our
> >>>> deprecation and versioning policy. I found that people—including
> >>>> myself—read the versioning doc [3] differently; moreover some aspects
> >>>> are
> >>>> not captured there. I would like to start a discussion around this
> >>>> topic by
> >>>> sharing my confusions and suggestions. This will hopefully help us
> stay
> >>>> on
> >>>> the same page and have similar expectations. The second goal is to
> >>>> eliminate ambiguities from the versioning doc (thanks Vinod for
> >>>> volunteering to update it).
> >>>>
> >>>
> >>> +1 Let me know if there are things I can help with.
> >>>
> >>>
> >>>>
> >>>> 1. API vs. semantic changes.
> >>>> The current versioning guide treats features (e.g. flags, metrics,
> endpoints)
> >>>> and API dif

Re: Does libprocess support multi-port?

2016-10-26 Thread Vinod Kone
No it doesn't.

On Wed, Oct 26, 2016 at 1:10 AM, Suteng  wrote:

> Hi,
>
> Does libprocess support multi port? Some process bind to a port, and some
> other process bind to another port in the same OS process.
>
>
>
> Thanks,
>
> Teng
>
>
>
>
>
>
>
>
>
> Su Teng  00241668
>
>
>
> Distributed and Parallel Software Lab
>
> Huawei Technologies Co., Ltd.
>
> Email:sut...@huawei.com
>
>
>
>
>


Re: On Mesos versioning and deprecation policy

2016-10-14 Thread Vinod Kone
We will chat about this in the upcoming community sync (Thursday 3 PM). So,
please make sure to attend if you are interested.

On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu  wrote:

>
> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu  wrote:
>
>> Thanks Alex for starting this!
>>
>> In addition to comments below, I think it'll be helpful to keep the
>> existing versioning doc concise and user-friendly while having a dedicated
>> doc for the "implementation details" where precise requirements and
>> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
>> developers will find the latter much more helpful while the users/framework
>> developer will find the former easy to read.
>>
>> e.g., a similar split:
>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
>> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
>> (which has a lot of details on how the kubernetes community is
>> thinking about similar issues, which we can learn from)
>>
>> Jiang Yan Xu 
>>
>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov 
>> wrote:
>>
>>> Folks,
>>>
>>> There have been a bunch of online [1, 2] and offline discussions about
>>> our
>>> deprecation and versioning policy. I found that people—including
>>> myself—read the versioning doc [3] differently; moreover some aspects are
>>> not captured there. I would like to start a discussion around this topic
>>> by
>>> sharing my confusions and suggestions. This will hopefully help us stay
>>> on
>>> the same page and have similar expectations. The second goal is to
>>> eliminate ambiguities from the versioning doc (thanks Vinod for
>>> volunteering to update it).
>>>
>>
>> +1 Let me know if there are things I can help with.
>>
>>
>>>
>>> 1. API vs. semantic changes.
>>> The current versioning guide treats features (e.g. flags, metrics, endpoints)
>>> and API differently: incompatible changes for the former are allowed
>>> after a 6-month deprecation cycle, while for the latter they require
>>> bumping a major version. I suggest we consolidate these policies.
>>>
>>
>> I feel that the distinction is not API vs. semantic changes. A backwards
>> compatible API guarantee should imply backwards compatible semantics (of
>> the API).
>> i.e., if a change in API doesn't cause the message to be dropped to the
>> floor but leads to behavior change that causes problems in the system, it
>> still breaks compatibility.
>>
>> IMO the distinction is more between:
>> - Compatibility between components that are impossible/very unpleasant to
>> upgrade in lockstep - high priority for compatibility guarantee.
>> - Compatibility between components that are generally bundled (modules)
>> or things that usually aren't built into automated tooling (e.g., the
>> /state endpoint) - more relaxed for now but we should explicitly exclude
>> them from the guarantee.
>>
>>
>>>
>>> We should also define and clearly explain what changes require bumping
>>> the
>>> major version. I have no strong opinion here and would love to hear what
>>> people think. The original motivation for maintaining backwards
>>> compatibility is to make sure vN schedulers can correctly work with vN
>>> API
>>> without being updated. But what about semantic changes that do not touch
>>> the API? For example, what if we decide to send fewer task health updates
>>> to schedulers based on some health policy? It influences the flow of task
>>> status updates; should such a change be considered compatible? Taking it to
>>> an extreme, we may not even be able to fix some bugs because someone may
>>> already rely on this behaviour!
>>>
>>
>> API changes should warrant a major version bump. Also the API is not just
>> what the machine reads but all the documentation associated with it, right?
>> It depends on what the documentation says; what the user _should_ expect.
>>
>> That said, I feel that these things are hard to be talked about in the
>> abstract. Even with a guideline, we still need to make case-by-case
>> decisions. (e.g., has the documentation precisely defined this precise
>> behavior? If not, is it reasonable for the users to expect some behavior
>> because it's common sense? How bad is it if some behavior just changes a
>> tiny bit?) Therefore we need to make sure the process for API changes is
>> more rigorously defined.
>>
>> Whether something is a bug depends on whether the API does what it says
>> it'll do. The line may sometimes be blurry but in general I don't feel it's
>> a problem. If someone is relying on the behavior that is a bug, we should
>> still help them fix it but the bug shouldn't count as "our guarantee".
>>
>>
>>>
>>> Another tightly related thing we should explicitly call out is
>>> upgradability and rollback capabilities inside a major release.
>>> Committing
>>> to this may significantly limit what we can change within a major
>>> release;
>>> on the other side it will give users more time and a 

Re: 1.1.0 release

2016-10-07 Thread Vinod Kone
I think you need to clean up the JIRA a bit.

1) Make sure unresolved tickets do not have fix version (1.1.0) set.
2) Move "Fix version 1.1.0" to "Target version 1.1.0".

2) might obviate the need for 1).



On Fri, Oct 7, 2016 at 7:24 AM, Till Toenshoff  wrote:

> Hi everyone!
>
> It's us who will be the Release Managers for 1.1.0 - Alex and Till!
>
> We are planning to cut the next release (1.1.0) within three workdays -
> that would be Wednesday next week. So, if you have any patches that need to
> get into 1.1.0 make sure that either they are already in the master branch or
> the corresponding ticket has a target version set to 1.1.0.
>
> The release dashboard:
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12329720
>
> Alex & Till
>


Re: 1.0.2 release

2016-10-05 Thread Vinod Kone
Release dashboard:
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12329719

I'm waiting for 2 issues to be resolved. Once that's done, I'll start
prepping the release.

On Wed, Oct 5, 2016 at 4:11 PM, Vinod Kone <vinodk...@apache.org> wrote:

> Hi,
>
> As the Release Manager for 1.0, I'm responsible for all subsequent patch
> releases.
>
> I'm planning to cut the next patch release (1.0.2) within a week. So, if
> you have any patches that need to get into 1.0.2 make sure that either it
> is already in the 1.0.x branch or the corresponding ticket has a target
> version set to 1.0.2.
>
> I'll send a link to the release dashboard shortly.
>
> Thanks,
> -- Vinod
>


1.0.2 release

2016-10-05 Thread Vinod Kone
Hi,

As the Release Manager for 1.0, I'm responsible for all subsequent patch
releases.

I'm planning to cut the next patch release (1.0.2) within a week. So, if
you have any patches that need to get into 1.0.2 make sure that either it
is already in the 1.0.x branch or the corresponding ticket has a target
version set to 1.0.2.

I'll send a link to the release dashboard shortly.

Thanks,
-- Vinod


Re: Target version vs Fixed Version

2016-10-03 Thread Vinod Kone
Yes.

On Mon, Oct 3, 2016 at 7:58 PM, haosdent <haosd...@gmail.com> wrote:

> For resolved issues, is it OK to do something similar? For example, this issue
> https://issues.apache.org/jira/browse/MESOS-5613 makes mesos-local not
> work in 1.0.x, and I think it would be better to cherry-pick the fix into
> 1.0.x.
>
> On Tue, Oct 4, 2016 at 9:17 AM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> Hi,
>>
>> Going forward, if you want an unresolved issue to be targeted for a
>> specific version please set the "Target Version". The committer that
>> commits the fix and resolves the ticket will set the appropriate "Fix
>> Version".
>> This applies to backports as well.
>>
>> Thanks,
>> Vinod
>>
>> -- Forwarded message --
>> From: Vinod Kone (JIRA) <j...@apache.org>
>> Date: Mon, Oct 3, 2016 at 6:13 PM
>> Subject: [jira] [Updated] (MESOS-6026) Tasks mistakenly marked as FAILED
>> due to race b/w sendExecutorTerminatedStatusUpdate() and
>> _statusUpdate()
>> To: iss...@mesos.apache.org
>>
>>
>>
>>  [ https://issues.apache.org/jira/browse/MESOS-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Vinod Kone updated MESOS-6026:
>> --
>> Target Version/s: 1.0.2
>>Fix Version/s: (was: 1.0.2)
>>
>
>
>
> --
> Best Regards,
> Haosdent Huang
>


Re: Mesos 1.1.0 release date

2016-10-03 Thread Vinod Kone
We are planning to release it in a week or so.

Till has agreed to be the release manager for the release and will be
supported by AlexR.

@Till: Can you create a release dashboard and reply to this thread?


Re: Updating ExecutorInfo after framework failover or best practice

2016-09-29 Thread Vinod Kone
We cannot easily make ExecutorInfo mutable because there might be existing
tasks with executors with the old ExecutorInfo. If there are two different
ExecutorInfos for the same ExecutorID it gets confusing for Mesos (e.g.,
SHUTDOWN executor id 'foo' kills which executor?).

One possible solution is to not re-use ExecutorID, but that depends on what
semantics you want for your executor.
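
As a rough sketch of the "do not re-use ExecutorID" option, the helper below builds an ExecutorInfo-shaped dictionary with a fresh ID each time, so a new jar URL after failover never collides with an existing executor. The field names mirror the `executor_id` and `command.uris` fields of ExecutorInfo, but the helper itself, the name prefix and the URL are hypothetical, and whether a fresh executor per scheduler incarnation is acceptable depends on your framework.

```python
import uuid


def fresh_executor_info(jar_url, name="my-executor"):
    """Build an ExecutorInfo-shaped dict with a never-reused executor ID.

    `jar_url` is wherever the current scheduler instance serves the
    executor jar from; it may change after a failover.
    """
    return {
        "executor_id": {"value": f"{name}-{uuid.uuid4()}"},
        "command": {"uris": [{"value": jar_url}]},
    }
```

The trade-off is that tasks launched after failover get a new executor instead of joining one that is already running.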

On Thu, Sep 29, 2016 at 3:01 AM, Kota UENISHI <
ueni...@nautilus-technologies.com> wrote:

> Hi there,
>
> I'm going to implement scheduler failover into my framework, and hit
> an issue - while I know it's how Mesos works for now:
>
> My framework lets Mesos agents fetch my custom executor jar file from
> the scheduler process's HTTP endpoint. Suppose the framework process is
> restarted by Marathon or whatever on a different machine after a failure;
> the URL of the HTTP endpoint to download the executor jar file from then
> changes to that of the new scheduler process. This causes an ExecutorInfo
> validation failure, like [1]. And I think this is why Spark's
> MesosClusterDispatcher is not ready for HA yet.
>
> As a (major?) workaround, [1] avoids this by assuming URL identity by
> DNS or load balancer-ish stuff. Another short-sighted kludge
> workaround would be relaxing the ExecutorInfo validation for the
> failover case - which I believe solves many framework developers'
> headache.
>
> Also, the best workaround in Mesos code would be just clearing
> ExecutorInfo after the master detects a scheduler failover. I think
> ExecutorInfo must be 1:1 with FrameworkInfo, but it does not have to be
> immutable. Under partition, it may diverge across masters but an LWW
> merge after the partition heals would be enough to keep it unique.
>
> Thoughts?
>
> [1] https://github.com/mesosphere/kubernetes-mesos/issues/15
>
> Kota UENISHI
>


Re: Threshold-based CPU and Memory Oversubscription

2016-09-21 Thread Vinod Kone
Awesome. Great to see this!

Looking forward to the blog post on how this helped utilization in
production :P

On Wed, Sep 21, 2016 at 10:26 AM, Erb, Stephan 
wrote:

> Hi everyone,
>
>
>
> we are happy to announce that we have open sourced two simple
> threshold-based oversubscription modules for Mesos. We use them for CPU and
> memory oversubscription and have them running in production.
>
>
>
> https://github.com/blue-yonder/mesos-threshold-oversubscription
>
>
>
> The threshold-based approach enabled us to double the peak CPU and peak
> memory utilization in our Mesos/Aurora clusters. Your mileage may vary, so
> please take this statement with a grain of salt.
>
>
>
> Best Regards,
>
> Stephan Erb
>
> PS: Retweets welcome :-)
> https://twitter.com/BlueYonderTech/status/778630174996893696
>


Re: Setting log path for mesos java client library

2016-09-12 Thread Vinod Kone
Looks like the Mesos logging flags for these override the corresponding
GLOG-related flags.

Try setting "MESOS_LOG_DIR=" and "MESOS_QUIET=true"
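
A minimal sketch of preparing those two variables for the process that hosts the scheduler driver. The value for MESOS_LOG_DIR is missing from the message above, so the directory in the usage note is only a placeholder; the helper name is likewise an assumption.

```python
import os


def mesos_logging_env(log_dir, quiet=True, base_env=None):
    """Return a copy of the environment with the Mesos logging variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESOS_LOG_DIR"] = log_dir
    env["MESOS_QUIET"] = "true" if quiet else "false"
    return env
```

For instance, `mesos_logging_env("/tmp/scheduler-logs")` (placeholder path) could be passed as the `env` argument when starting the JVM with `subprocess.run`, so the variables are in place before the driver is created.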

On Mon, Sep 12, 2016 at 12:09 PM, Wil Yegelwel  wrote:

> I’m trying to set the log path (and later the log format) for the mesos
> java lib. From the docs in
> http://mesos.apache.org/api/latest/java/org/apache/mesos/MesosSchedulerDriver.html
> it appears I need to set the correct GLOG environment variable in order to
> get this to work, but I can’t seem to get it. I’ve tried setting the
> environment variables: “GLOG_log_dir=…”, “GLOG_logtostderr=0” but neither
> seems to change the behavior and it is still logging to stderr. Has anyone
> been able to set the path the mesos java client library writes to and, if
> so, how?
>
>


Fwd: REMINDER: MesosCon Asia’s CFP Deadline is September 9! Submit your Proposal Today

2016-09-08 Thread Vinod Kone
Hi folks,

Just a friendly reminder that the CFP deadline for MesosCon Asia is fast
approaching! If you were planning to submit a talk please do so ASAP. If
you weren't, please do :)

Thanks,
Vinod

-- Forwarded message --
From: Linux Foundation Events 
Date: Fri, Aug 26, 2016 at 2:09 AM
Subject: REMINDER: MesosCon Asia’s CFP Deadline is September 9! Submit your
Proposal Today
To: vi...@mesosphere.io



MesosCon Asia fosters greater collaboration around the Apache Mesos community,
bringing together users and developers to learn and share while accelerating
growth of the project's ecosystem. The Apache Mesos community wants to hear
from you! Share lessons learned, best practices or pitch an idea for a hands-on
workshop or in-depth tutorial.

Check out the list of suggested topics for MesosCon Asia.

Don’t delay - submit your proposal now. The deadline to submit proposals is
September 9.


Thank You to Our Sponsors

*COMMUNITY PARTNER*
[image: The Apache Software Foundation]


*Apache, Apache Mesos, and Mesos are either registered trademarks or
trademarks of the Apache Software Foundation (ASF) in the United States
and/or other countries. MesosCon is run in partnership with the ASF.*




Re: mesos libraries

2016-08-23 Thread Vinod Kone
If you are writing a new scheduler, I would highly recommend using the new
HTTP API instead of the Java bindings. This would eliminate the dep on the
native library.

If you still want to use the old bindings, the easiest way might be to
install mesos deb package in your docker image.
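
To give a feel for why the HTTP API removes the native dependency, here is a sketch that builds the initial SUBSCRIBE call as plain JSON. The endpoint path and message shape follow the v1 scheduler HTTP API as I recall it from the docs, so please check them against the documentation for your Mesos version; the function name and the user/framework names are placeholders. The sketch only constructs the request and does not send it.

```python
import json


def subscribe_request(user, framework_name):
    """Return (path, headers, body) for a v1 scheduler SUBSCRIBE call."""
    body = {
        "type": "SUBSCRIBE",
        "subscribe": {
            "framework_info": {"user": user, "name": framework_name},
        },
    }
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return "/api/v1/scheduler", headers, json.dumps(body)
```

Any HTTP client that can keep a streaming response open can then POST this to the master, which is all a scheduler container needs instead of the native library.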

On Tue, Aug 23, 2016 at 11:27 AM, Hendrik Haddorp 
wrote:

> Hi,
>
> I wrote a Mesos scheduler using the Java bindings, which worked great so
> far. Now I would like to run my scheduler as a docker container on
> Marathon. The problem is now that I'm missing the required native
> libraries. What is the best way to install them (in Ubuntu) without
> pulling heaps of other stuff?
>
> Thanks,
> Hendrik
>


Re: Mesos logging

2016-08-21 Thread Vinod Kone
Did you figure this out? AFAICT, the LOG(INFO) line should be printed in
agent logs. What agent flags are you using?

On Tue, Aug 9, 2016 at 8:19 AM, Hendrik Haddorp 
wrote:

> I saw a few "Running ..." log entries from the docker support code but
> they seem to be all from VLOG(1) calls while for some reason the code
> that does the actual "docker run" call uses LOG(INFO) and that does not
> seem to come out by default, or I don't see it. But I'll try on ...
>
> On 09/08/16 11:38, haosdent wrote:
> > Hi, @Hendrik You could see INFO log when running Mesos Agent in
> > default level. Some docker run logs may exist in the stdout/stderr of
> > executor.
> >
> > On Tue, Aug 9, 2016 at 12:27 PM, Hendrik Haddorp wrote:
> >
> > I would like to see the "docker run" trace from
> > https://github.com/apache/mesos/blob/master/src/docker/docker.cpp
> > line 811.
> > What verbosity level does INFO map to?
> >
> > On 09/08/16 05:06, Charles Allen wrote:
> > > Which glog are you trying to capture? You can set the verbosity
> > level
> > > with the environment variable GLOG_v
> > >
> > > And you can also set it through things like Spark. So if you want a
> > > lot of ZK chatter at the mesos level in your spark logs, add
> > >
> > > spark.executorEnv.GLOG_v=9
> > >
> > > to your spark context
> > >
> > > On Mon, Aug 8, 2016 at 2:53 PM Hendrik Haddorp wrote:
> > >
> > > Hi,
> > >
> > > the Mesos code contains log statements using LOG(INFO) and
> > > VLOG(1), for
> > > example. So far I found that Mesos is using the Google Logging
> > > Library.
> > > Looking in the logs I only seem to be able to find output
> > from VLOG
> > > statements. What do I need to do to get the output from the LOG
> > > statements? Where would I typically find the output? I'm
> > using CentOS
> > > 7.2 and found the output so far in the files below
> > /var/log/mesos.
> > >
> > > thanks,
> > > Hendrik
> > >
> >
> >
> >
> >
> > --
> > Best Regards,
> > Haosdent Huang
>
>


[VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-10 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.1.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.1-rc1




The candidate for Mesos 1.0.1 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz


The tag to be voted on is 1.0.1-rc1:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.1-rc1


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1155


Please vote on releasing this package as Apache Mesos 1.0.1!


The vote is open until Mon Aug 15 17:29:33 PDT 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.1

[ ] -1 Do not release this package because ...


Thanks,


Re: Support for tasks groups aka pods in Mesos

2016-08-08 Thread Vinod Kone
Sorry, sent the wrong link earlier for design doc.

Design doc: https://issues.apache.org/jira/browse/MESOS-6009

Direct link:
https://docs.google.com/document/d/1FtcyQkDfGp-bPHTW4pUoqQCgVlPde936bo-IIENO_ho/edit#heading=h.ip4t59nlogfz


Support for tasks groups aka pods in Mesos

2016-08-08 Thread Vinod Kone
Hi folks,

One of the most requested features in Mesos has been first class support
for managing pod-like containers. We finally have some time to focus on and
shepherd this work.

The epic tracking this work is:
https://issues.apache.org/jira/browse/MESOS-2449

Design doc: https://issues.apache.org/jira/browse/MESOS-2449

Your feedback on the design will be most welcome. Once we get agreement on
the design, we can start breaking down the epic into tickets.

Thanks,
Vinod & Jie


1.0.1 release

2016-08-01 Thread Vinod Kone
Hi,

As discussed on the 1.0 voting thread, we plan to cut a 1.0.1 as early as
this week. So if you have anything that needs to absolutely go into the
patch release, please work with your shepherd and get it landed on trunk
and backported to the 1.0.x branch.

Thanks,


Re: Enabling basic access authentication

2016-08-01 Thread Vinod Kone
We separated out the default authentication mode for read-only (default: no
authn) and read-write (default: authn) endpoints. Since the webui only
depends on the read-only endpoints you need to explicitly enable authn for
read-only endpoints if you need authn. See
https://github.com/apache/mesos/blob/master/docs/upgrades.md for more
details.

On Mon, Aug 1, 2016 at 12:20 PM, Douglas Nelson  wrote:

> It was working for me with mesos 1.0.0-rc2. Now that I made the switch to
> 1.0.0 the feature is missing for user/pass prompt at the WebUI. Was another
> flag added or was it decided that this feature wasn't necessary?
>
> On Tue, Jul 12, 2016 at 6:26 PM, Douglas Nelson 
> wrote:
>
>> Ah, I missed that in the vote message. That makes sense. I'm running
>> version 0.28.2 so that would be why.
>>
>> On Tue, Jul 12, 2016 at 6:22 PM, Zhitao Li  wrote:
>>
>>> Just went through this: I think the necessary endpoint `/master/state`
>>> is only authenticated after 1.0.0, which is still going through release
>>> vote.
>>>
>>> Can you share which version of Mesos you are running?
>>>
>>> On Tue, Jul 12, 2016 at 5:18 PM, Douglas Nelson 
>>> wrote:
>>>
>>>> With marathon you can enable basic access authentication to the WebUI
>>>> with the flag --http_credentials.
>>>>
>>>> I expected something similar with the flag --authenticate_http in mesos
>>>> but when I hit the WebUI I'm not prompted to give a username/pass. Is that
>>>> feature not included in mesos or is there a different configuration I need
>>>> to set?
>>>>
>>>> Thanks!
>>>>
>>>
>>>
>>>
>>> --
>>> Cheers,
>>>
>>> Zhitao Li
>>>
>>
>>
>


Re: [RESULT][VOTE] Release Apache Mesos 1.0.0 (rc4)

2016-07-27 Thread Vinod Kone
The 1.0 blog post is up: http://mesos.apache.org/blog/mesos-1-0-0-released/

Thank you all for making this possible!

@vinodkone

> On Jul 27, 2016, at 7:39 AM, Vinod Kone <vinodk...@apache.org> wrote:
> 
> Hi all,
> 
> The vote for Mesos 1.0.0 (rc4) has passed with the following votes.
> 
> 
> 
> +1 (Binding)
> 
> --
> 
> Kapil Arya
> 
> Jie Yu
> 
> Benjamin Mahler
> 
> 
> 
> +1 (Non-binding)
> 
> --
> 
> Haosdent
> 
> Greg Mann
> 
> Zhitao Li
> 
> 
> 
> +0
> 
> -
> 
> Yan Xu
> 
> 
> 
> There were no  -1 votes.
> 
> 
> 
> NOTE: There were a couple known issues [MESOS-5911, MESOS-5913] that couldn't 
> be fixed in time for the 1.0. We plan to do a patch release to fix these ASAP.
> 
> 
> 
> Please find the release at:
> 
> https://dist.apache.org/repos/dist/release/mesos/1.0.0
> 
> 
> 
> It is recommended to use a mirror to download the release:
> 
> http://www.apache.org/dyn/closer.cgi
> 
> 
> 
> The CHANGELOG for the release is available at:
> 
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.0
> 
> 
> 
> The mesos-1.0.0.jar has been released to:
> 
> https://repository.apache.org
> 
> 
> 
> The website (http://mesos.apache.org) will be updated shortly to reflect this 
> release.
> 
> 
> 
> Thanks,
> 
> 
>> On Fri, Jul 22, 2016 at 10:40 PM, Vinod Kone <vinodk...@apache.org> wrote:
>> Hi all,
>> 
>> 
>> 
>> Please vote on releasing the following candidate as Apache Mesos 1.0.0.
>> 
>> The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a majority 
>> of at least 3 +1 PMC votes are cast.
>> 
>> 
>> 1.0.0 includes the following:
>>
>>   * Scheduler and Executor v1 HTTP APIs are now considered stable.
>>
>>   * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
>>     These APIs let operators and services (monitoring, load balancers) send
>>     HTTP requests to '/api/v1' endpoint on master or agent. See
>>     `docs/operator-http-api.md` for details.
>>
>>   * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
>>     has been added to isolate disk resources more efficiently. Please refer
>>     to docs/mesos-containerizer.md for more details.
>>
>>   * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
>>     added a new isolator 'docker/volume' which allows users to use external
>>     volumes in Mesos containerizer. Currently, the isolator interacts with
>>     the Docker volume plugins using a tool called 'dvdcli'. By speaking the
>>     Docker volume plugin API, most of the Docker volume plugins are supported.
>>
>>   * [MESOS-4641] - **Experimental** A new network isolator, the
>>     `network/cni` isolator, has been introduced in the `MesosContainerizer`.
>>     The `network/cni` isolator implements the Container Network Interface
>>     (CNI) specification proposed by CoreOS.  With CNI the `network/cni`
>>     isolator is able to allocate a network namespace to Mesos containers and
>>     attach the container to different types of IP networks by invoking
>>     network drivers called CNI plugins.
>>
>>   * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored
>>     in

[RESULT][VOTE] Release Apache Mesos 1.0.0 (rc4)

2016-07-27 Thread Vinod Kone
Hi all,

The vote for Mesos 1.0.0 (rc4) has passed with the following votes.


+1 (Binding)

--

Kapil Arya

Jie Yu

Benjamin Mahler


+1 (Non-binding)

--

Haosdent

Greg Mann

Zhitao Li


+0

-

Yan Xu


There were no  -1 votes.
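
The pass condition quoted in the vote emails ("a majority of at least 3 +1 PMC votes") can be written down as a tiny check. This is one reading of that sentence, assuming only binding (PMC) votes count and that +1 votes must outnumber -1 votes; the function name is illustrative.

```python
def vote_passes(binding_plus_one, binding_minus_one, minimum_plus_one=3):
    """True if the release vote passes under the assumed rule.

    Only binding (PMC) votes are counted; +0 and non-binding votes are
    ignored for the purpose of this check.
    """
    return (binding_plus_one >= minimum_plus_one
            and binding_plus_one > binding_minus_one)
```

With the tally above (three binding +1 votes and no -1 votes), `vote_passes(3, 0)` is `True`.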


*NOTE: There were a couple known issues [MESOS-5911
<https://issues.apache.org/jira/browse/MESOS-5911>, MESOS-5913
<https://issues.apache.org/jira/browse/MESOS-5913>] that couldn't be fixed
in time for the 1.0. We plan to do a patch release to fix these ASAP.*


Please find the release at:

https://dist.apache.org/repos/dist/release/mesos/1.0.0


It is recommended to use a mirror to download the release:

http://www.apache.org/dyn/closer.cgi


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.0


The mesos-1.0.0.jar has been released to:

https://repository.apache.org


The website (http://mesos.apache.org) will be updated shortly to reflect
this release.


Thanks,

On Fri, Jul 22, 2016 at 10:40 PM, Vinod Kone <vinodk...@apache.org> wrote:

> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.0.
>
> *The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
> majority of at least 3 +1 PMC votes are cast.*
>
> 1.0.0 includes the following:
>
>
> 
>
>   * Scheduler and Executor v1 HTTP APIs are now considered stable.
>
>
>
>
>
>   * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
> These
>
> APIs let operators and services (monitoring, load balancers) send
> HTTP
>
> requests to '/api/v1' endpoint on master or agent. See
>
>
> `docs/operator-http-api.md` for details.
>
>
>
>
>
>   * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
>
>
> has been added to isolate disk resources more efficiently. Please
> refer to
>
> docs/mesos-containerizer.md for more details.
>
>
>
>
>
>   * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
> added a
>
> new isolator 'docker/volume' which allows users to use external
> volumes in
>
> Mesos containerizer. Currently, the isolator interacts with the
> Docker
>
> volume plugins using a tool called 'dvdcli'. By speaking the Docker
> volume
>
> plugin API, most of the Docker volume plugins are supported.
>
>
>
>
>
>   * [MESOS-4641] - **Experimental** A new network isolator, the
>
>
> `network/cni` isolator, has been introduced in the
> `MesosContainerizer`. The
>
> `network/cni` isolator implements the Container Network Interface
> (CNI)
>
> specification proposed by CoreOS.  With CNI the `network/cni` isolator
> is
>
> able to allocate a network namespace to Mesos containers and attach
> the
>
> container to different types of IP networks by invoking network
> drivers
>
> called CNI plugins.
>
>
>
>
>
>   * [MESOS-2948, MESOS-5403] - The authorizer interface has been
> refactored in
>
> order to decouple the ACLs definition language from the interface.
>
>
> It additionally includes the option of retrieving `ObjectApprover`.
> An
>
> `ObjectApprover` can be used to synchronously check authorizations for
> a
>
> given object and is hence useful when authorizing a large number of
> objects
>
> and/or large objects (which need to be copied using request based
>
>
> authorization). NOTE: This is a **breaking change** for authorizer
> modules.
>
>
>
>
>   * [MESOS-5405] - The `subject` and `object` fields in
> authorization::Request
>
> have been changed from required to optional. If either of these fields
> is
>
> not set, the request should only be authorized if any subject/object
> should
>
> be allowed.
>
>
> NOTE: This is a semantic change for authorizer modules.
>
>
>
>
>
>   * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP
> endpoint
>
> filtering enables operators to restrict what part of the cluster state
> a
>
> user is authorized to see.
>
>
> Consider for example the `/state` master endpoint: an operator can
> now
>
> authorize users to only see a subset of the running frameworks, tasks,
> or
>
> executors. The following endpoints support HTTP endpoint f

Re: [VOTE] Release Apache Mesos 1.0.0 (rc4)

2016-07-26 Thread Vinod Kone
We have the ASF press wire and other community blog posts lined up to be
posted tomorrow, so it will be really hard to tell all those folks to
postpone this late. I have a couple of options that I want to propose:

1) Fix the webui bug in 1.0.1 which we will cut as soon as we fix this bug.

2) Try to fix the bug in the next couple hours, cut rc5, and vote it in
tonight without doing the typical 72 hour voting period.


I'm personally leaning towards 1) given the timing and the nature of the
bug. What do others think? PMC?

On Tue, Jul 26, 2016 at 4:08 PM, Yan Xu <xuj...@apple.com> wrote:

> I don't mind if it's shepherded by folks with more front-end expertise.
> Actually my original suggested solution on
> https://issues.apache.org/jira/browse/MESOS-5911 seemed incorrect. Let's
> discuss the actual fix on the ticket, I feel that a short term fix
> shouldn't be more than a few lines to unblock the release.
>
> On Jul 26, 2016, at 3:26 PM, Jie Yu <yujie@gmail.com> wrote:
>
> Yan, are you going to shepherd the fix for this one? If yes, when do you
> think it can be done?
>
> - Jie
>
> On Tue, Jul 26, 2016 at 3:05 PM, Yan Xu <xuj...@apple.com> wrote:
>
> -1
>
> We tested it in our testing environment but webUI redirect didn't work. We
> filed: https://issues.apache.org/jira/browse/MESOS-5911
>
> Given that webUI is the portal for Mesos clusters I feel that we should at
> least have a basic fix (more context in the JIRA) before release 1.0.
> Thoughts?
>
> On Jul 26, 2016, at 2:52 PM, Kapil Arya <ka...@mesosphere.io> wrote:
>
> +1 (binding)
>
> OpenSUSE Tumbleweed:
>     ./configure --disable-java --disable-python && make check
>
> On Tue, Jul 26, 2016 at 4:44 PM, Zhitao Li <zhitaoli...@gmail.com> wrote:
>
> Also tested:
>
> make check passes on OS X
>
> One thing I found when testing the RC4 Debian package with the Aurora
> integration test suite (on its master) is that a scheduler that previously
> expected GPU resources will not receive offers without the new
> `GPU_RESOURCES` capability, even if it's the only scheduler.
>
> Given that GPU support is not technically released until 1.0, I don't
> consider this a blocker, but it might be surprising to people already
> testing GPU support.
>
> On Tue, Jul 26, 2016 at 12:45 PM, Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> +1 (binding)
>
> OS X 10.11.6
> ./configure --disable-python --disable-java
> make check
>
> On Tue, Jul 26, 2016 at 10:24 AM, Greg Mann <g...@mesosphere.io> wrote:
>
> +1 (non-binding)
>
> * Ran `sudo make distcheck` successfully on CentOS 7.1 with only one test
> failure: ExamplesTest.PythonFramework fails for me the first time it's
> executed as part of the whole test suite, and then succeeds on subsequent
> executions. I'm investigating further, and will file a ticket if necessary.
>
> * Ran the upgrade testing script successfully from 0.28.2 -> 1.0.0-rc4
>
> Cheers,
> Greg
>
> On Tue, Jul 26, 2016 at 1:58 AM, haosdent <haosd...@gmail.com> wrote:
>
> +1
>
> * make check in CentOS 7.2
> * make check in Ubuntu 14.04
> * test upgrade from 0.28.2 to 1.0.0-rc4
>
>
> On Tue, Jul 26, 2016 at 8:33 AM, Kapil Arya <ka...@mesosphere.io> wrote:
>
> One can find the deb/rpm packages here:
>
> http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-1.0.0-rc4
>
> And here are the corresponding docker images based off of Ubuntu 14.04:
>
>     mesosphere/mesos:1.0.0-rc4
>     mesosphere/mesos-master:1.0.0-rc4
>     mesosphere/mesos-slave:1.0.0-rc4
>
> Kapil
>
> On Sat, Jul 23, 2016 at 1:40 AM, Vinod Kone <vinodk...@apache.org> wrote:
>
> Hi all,
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.0.
>
> *The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
> majority of at least 3 +1 PMC votes are cast.*
>
> 1.0.0 includes the following:
>
>  * Scheduler and Executor v1 HTTP APIs are now considered stable.
>
>  * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
>    These APIs let operators and services (monitoring, load balancers) send
>    HTTP requests to '/api/v1' endpoint on master or agent. See
>    `docs/operator-http-api.md` for details.
>
>  * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
>    has been added t

[VOTE] Release Apache Mesos 1.0.0 (rc4)

2016-07-22 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.0.

*The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.*

1.0.0 includes the following:



  * Scheduler and Executor v1 HTTP APIs are now considered stable.

  * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
    These APIs let operators and services (monitoring, load balancers) send
    HTTP requests to '/api/v1' endpoint on master or agent. See
    `docs/operator-http-api.md` for details.

  * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
    has been added to isolate disk resources more efficiently. Please refer
    to docs/mesos-containerizer.md for more details.

  * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
    added a new isolator 'docker/volume' which allows users to use external
    volumes in Mesos containerizer. Currently, the isolator interacts with
    the Docker volume plugins using a tool called 'dvdcli'. By speaking the
    Docker volume plugin API, most of the Docker volume plugins are
    supported.

  * [MESOS-4641] - **Experimental** A new network isolator, the
    `network/cni` isolator, has been introduced in the `MesosContainerizer`.
    The `network/cni` isolator implements the Container Network Interface
    (CNI) specification proposed by CoreOS. With CNI the `network/cni`
    isolator is able to allocate a network namespace to Mesos containers and
    attach the container to different types of IP networks by invoking
    network drivers called CNI plugins.

  * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored
    in order to decouple the ACLs definition language from the interface.
    It additionally includes the option of retrieving `ObjectApprover`. An
    `ObjectApprover` can be used to synchronously check authorizations for a
    given object and is hence useful when authorizing a large number of
    objects and/or large objects (which need to be copied using request
    based authorization). NOTE: This is a **breaking change** for authorizer
    modules.

  * [MESOS-5405] - The `subject` and `object` fields in
    authorization::Request have been changed from required to optional. If
    either of these fields is not set, the request should only be authorized
    if any subject/object should be allowed.
    NOTE: This is a semantic change for authorizer modules.

  * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
    filtering enables operators to restrict what part of the cluster state a
    user is authorized to see.
    Consider for example the `/state` master endpoint: an operator can now
    authorize users to only see a subset of the running frameworks, tasks,
    or executors. The following endpoints support HTTP endpoint filtering:
    '/state', '/state-summary', '/tasks', '/frameworks', '/weights',
    and '/roles'. Additionally the following v1 API calls support filtering:
    'GET_ROLES', 'GET_WEIGHTS', 'GET_FRAMEWORKS', 'GET_STATE', and
    'GET_TASKS'.

  * [MESOS-4909] - Tasks can now specify a kill policy. They are
    best-effort, because machine failures or forcible terminations may
    occur. Currently, the only available kill policy is how long to wait
    between graceful and forcible task kill. In the future, more policies
    may be available (e.g. hitting an HTTP endpoint, running a command,
    etc). Note that it is the executor's responsibility to enforce kill
    policies. For executor-less command-based tasks, the kill is performed
    via sending a signal to the task process: SIGTERM for the graceful kill
    and SIGKILL for the forcible kill. For docker executor-less tasks the
    grace period is passed to 'docker stop --time'. This feature supersedes
    the '--docker_stop_timeout', which is now deprecated.

  * [MESOS-4908] - The task kill policy defined within 'TaskInfo' can now be
    overridden when the scheduler kills the task. This can be used by
    schedulers to forcefully kill a task which is already being killed, e.g.
    if something went wrong during a graceful kill and a forcible kill is
    desired. Note that it is the executor's responsibility to honor the
    'Event.kill.kill_policy' field and override the task's kill policy and
    kill policy from a previous kill task request. To use this feature,
    schedulers and executors must support HTTP API; use the
    '--http_command_executor' agent flag to ensure the agent launches the
    HTTP API based command executor.

  * [MESOS-4949] - The executor shutdown grace period can now be configured
    in

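The [MESOS-4791] item above says operators can send HTTP requests to the
'/api/v1' endpoint and names calls such as 'GET_STATE'. A minimal sketch of
building such a request follows. It assumes the call is a POST with a JSON
body carrying a "type" field, which is my reading of
`docs/operator-http-api.md` and is not stated in this message; the host name
and the helper `build_operator_call` are made up for illustration.

```python
import json
import urllib.request


def build_operator_call(base_url, call_type):
    """Build (but do not send) a request to the '/api/v1' endpoint.

    Assumption: the call is a POST whose JSON body names the call in "type".
    """
    body = json.dumps({"type": call_type}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # "master.example:5050" is a placeholder, not a real master address.
    req = build_operator_call("http://master.example:5050", "GET_STATE")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; with
endpoint filtering enabled, the response would presumably contain only what
the authenticated user is authorized to see.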

Re: [VOTE] Release Apache Mesos 1.0.0 (rc3)

2016-07-22 Thread Vinod Kone
Looks like we missed a cherry pick. I'm cancelling this vote and spinning
up rc4.

On Fri, Jul 22, 2016 at 2:24 PM, Vinod Kone <vinodk...@apache.org> wrote:

> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.0.
>
> *The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
> majority of at least 3 +1 PMC votes are cast.*
>
> 1.0.0 includes the following:
>
>
> 
>
>   * Scheduler and Executor v1 HTTP APIs are now considered stable.
>
>
>
>
>
>   * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
> These
>
> APIs let operators and services (monitoring, load balancers) send
> HTTP
>
> requests to '/api/v1' endpoint on master or agent. See
>
>
> `docs/operator-http-api.md` for details.
>
>
>
>
>
>   * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
>
>
> has been added to isolate disk resources more efficiently. Please
> refer to
>
> docs/mesos-containerizer.md for more details.
>
>
>
>
>
>   * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
> added a
>
> new isolator 'docker/volume' which allows users to use external
> volumes in
>
> Mesos containerizer. Currently, the isolator interacts with the
> Docker
>
> volume plugins using a tool called 'dvdcli'. By speaking the Docker
> volume
>
> plugin API, most of the Docker volume plugins are supported.
>
>
>
>
>
>   * [MESOS-4641] - **Experimental** A new network isolator, the
>
>
> `network/cni` isolator, has been introduced in the
> `MesosContainerizer`. The
>
> `network/cni` isolator implements the Container Network Interface
> (CNI)
>
> specification proposed by CoreOS.  With CNI the `network/cni` isolator
> is
>
> able to allocate a network namespace to Mesos containers and attach
> the
>
> container to different types of IP networks by invoking network
> drivers
>
> called CNI plugins.
>
>
>
>
>
>   * [MESOS-2948, MESOS-5403] - The authorizer interface has been
> refactored in
>
> order to decouple the ACLs definition language from the interface.
>
>
> It additionally includes the option of retrieving `ObjectApprover`.
> An
>
> `ObjectApprover` can be used to synchronously check authorizations for
> a
>
> given object and is hence useful when authorizing a large number of
> objects
>
> and/or large objects (which need to be copied using request based
>
>
> authorization). NOTE: This is a **breaking change** for authorizer
> modules.
>
>
>
>
>   * [MESOS-5405] - The `subject` and `object` fields in
> authorization::Request
>
> have been changed from required to optional. If either of these fields
> is
>
> not set, the request should only be authorized if any subject/object
> should
>
> be allowed.
>
>
> NOTE: This is a semantic change for authorizer modules.
>
>
>
>
>
>   * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP
> endpoint
>
> filtering enables operators to restrict what part of the cluster state
> a
>
> user is authorized to see.
>
>
> Consider for example the `/state` master endpoint: an operator can
> now
>
> authorize users to only see a subset of the running frameworks, tasks,
> or
>
> executors. The following endpoints support HTTP endpoint filtering:
>
>
> '/state', '/state-summary', '/tasks', '/frameworks','/weights',
>
>
> and '/roles'. Additonally the following v1 API calls support
> filtering:
>
> 'GET_ROLES','GET_WEIGHTS','GET_FRAMEWORKS', 'GET_STATE', and
> 'GET_TASKS'.
>
>
>
>
>   * [MESOS-4909] - Tasks can now specify a kill policy. They are
> best-effort,
>
> because machine failures or forcible terminations may occur.
> Currently, the
>
> only available kill policy is how long to wait between graceful and
> forcible
>
> task kill. In the future, more policies may be available (e.g. hitting
> an
>
> HTTP endpoint, running a command, etc). Note that it is the
> executor's
>
> responsibility to enforce kill policies. For executor-less
> command-based
>
> tasks, the kill is performed via sending a signal to the task
> process:
>
> SIGTERM for the g

[VOTE] Release Apache Mesos 1.0.0 (rc3)

2016-07-22 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.0.

*The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.*

1.0.0 includes the following:



  * Scheduler and Executor v1 HTTP APIs are now considered stable.

  * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
    These APIs let operators and services (monitoring, load balancers) send
    HTTP requests to '/api/v1' endpoint on master or agent. See
    `docs/operator-http-api.md` for details.

  * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
    has been added to isolate disk resources more efficiently. Please refer
    to docs/mesos-containerizer.md for more details.

  * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
    added a new isolator 'docker/volume' which allows users to use external
    volumes in Mesos containerizer. Currently, the isolator interacts with
    the Docker volume plugins using a tool called 'dvdcli'. By speaking the
    Docker volume plugin API, most of the Docker volume plugins are
    supported.

  * [MESOS-4641] - **Experimental** A new network isolator, the
    `network/cni` isolator, has been introduced in the `MesosContainerizer`.
    The `network/cni` isolator implements the Container Network Interface
    (CNI) specification proposed by CoreOS. With CNI the `network/cni`
    isolator is able to allocate a network namespace to Mesos containers and
    attach the container to different types of IP networks by invoking
    network drivers called CNI plugins.

  * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored
    in order to decouple the ACLs definition language from the interface.
    It additionally includes the option of retrieving `ObjectApprover`. An
    `ObjectApprover` can be used to synchronously check authorizations for a
    given object and is hence useful when authorizing a large number of
    objects and/or large objects (which need to be copied using request
    based authorization). NOTE: This is a **breaking change** for authorizer
    modules.

  * [MESOS-5405] - The `subject` and `object` fields in
    authorization::Request have been changed from required to optional. If
    either of these fields is not set, the request should only be authorized
    if any subject/object should be allowed.
    NOTE: This is a semantic change for authorizer modules.

  * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
    filtering enables operators to restrict what part of the cluster state a
    user is authorized to see.
    Consider for example the `/state` master endpoint: an operator can now
    authorize users to only see a subset of the running frameworks, tasks,
    or executors. The following endpoints support HTTP endpoint filtering:
    '/state', '/state-summary', '/tasks', '/frameworks', '/weights',
    and '/roles'. Additionally the following v1 API calls support filtering:
    'GET_ROLES', 'GET_WEIGHTS', 'GET_FRAMEWORKS', 'GET_STATE', and
    'GET_TASKS'.

  * [MESOS-4909] - Tasks can now specify a kill policy. They are
    best-effort, because machine failures or forcible terminations may
    occur. Currently, the only available kill policy is how long to wait
    between graceful and forcible task kill. In the future, more policies
    may be available (e.g. hitting an HTTP endpoint, running a command,
    etc). Note that it is the executor's responsibility to enforce kill
    policies. For executor-less command-based tasks, the kill is performed
    via sending a signal to the task process: SIGTERM for the graceful kill
    and SIGKILL for the forcible kill. For docker executor-less tasks the
    grace period is passed to 'docker stop --time'. This feature supersedes
    the '--docker_stop_timeout', which is now deprecated.

  * [MESOS-4908] - The task kill policy defined within 'TaskInfo' can now be
    overridden when the scheduler kills the task. This can be used by
    schedulers to forcefully kill a task which is already being killed, e.g.
    if something went wrong during a graceful kill and a forcible kill is
    desired. Note that it is the executor's responsibility to honor the
    'Event.kill.kill_policy' field and override the task's kill policy and
    kill policy from a previous kill task request. To use this feature,
    schedulers and executors must support HTTP API; use the
    '--http_command_executor' agent flag to ensure the agent launches the
    HTTP API based command executor.

  * [MESOS-4949] - The executor shutdown grace period can now be configured
    in


Re: Possible authentication bug

2016-07-21 Thread Vinod Kone
On Thu, Jul 21, 2016 at 4:49 PM, Douglas Nelson  wrote:

> Just out of curiosity, is there a rough ETA for the stable release of
> 1.0.0? Or is anyone currently using rc2 in production?
>

I'm hoping to cut RC3 later today or tomorrow and, barring any negative
votes, do the official release early next week.


Re: Possible authentication bug

2016-07-18 Thread Vinod Kone
Might be related to MESOS-2043?

Can you paste master and agent logs?

On Mon, Jul 18, 2016 at 3:13 PM, Douglas Nelson  wrote:

> I have SSL enabled for mesos and for the most part everything seems to be
> working fine. But when I stop a slave node for long enough that it shows up
> with status LOST then I start up the slave again, registration with the
> master fails:
>
> I0718 15:51:45.646260 16791 master.cpp:5495] Authenticating slave(1)@10.5.7.5:5051
> I0718 15:51:45.646960 16791 authenticator.cpp:98] Creating new server SASL connection
> I0718 15:51:50.648329 16790 master.cpp:5481] Queuing up authentication request from slave(1)@10.5.7.5:5051 because authentication is still in progress
> W0718 15:51:50.648696 16790 master.cpp:5522] Failed to authenticate slave(1)@10.5.7.5:5051: Authentication discarded
>
> It cycles through this over and over again until I restart the master
> node. Is restarting the master the only way to handle re-authentication? I
> expected it to be more automatic. Thanks!
>


Re: mesos agent not recovering after ZK init failure

2016-07-15 Thread Vinod Kone
On Fri, Jul 15, 2016 at 11:31 AM, Sharma Podila  wrote:

> We had this issue happen again and were able to debug further. The cause
> for agent not being able to restart is that one of the resources (disk)
> changed its total size since the last restart. However, this error does not
> show up in INFO/WARN/ERROR files. We saw it in stdout only when manually
> restarting the agent. It would be good to have all messages going to
> stdout/stderr show up in the logs. Is there a config setting for it that I
> missed?
>

When the master or agent exits due to an unrecoverable error, it uses the
stout library function `EXIT`, which only prints to stderr. Agreed that this
is not great UX; mind filing a ticket? Note that even if we fix this in
Mesos, we can't easily fix this behavior in the 3rd party libraries that we
use (e.g., ZooKeeper). The way we dealt with this in production, at my
previous company, was to redirect stdout/stderr to a
mesos-{master,agent}.log. You can disable "--log_dir" to avoid double
logging.
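
As a rough sketch of the redirect described above (not the wrapper we
actually ran), the following launches a child process with both stdout and
stderr appended to one log file. The helper name `run_with_combined_log` is
made up, and the child here is a stand-in Python one-liner; a real wrapper
would pass the master or agent command line instead.

```python
import os
import subprocess
import sys
import tempfile


def run_with_combined_log(argv, log_path):
    """Run `argv`, appending both stdout and stderr to `log_path`.

    Returns the child's exit code.
    """
    with open(log_path, "ab") as log:
        # stderr=subprocess.STDOUT sends stderr to the same file as stdout,
        # so messages printed only to stderr still end up in the log.
        return subprocess.call(argv, stdout=log, stderr=subprocess.STDOUT)


if __name__ == "__main__":
    # Stand-in child that writes one line to each stream.
    child = [
        sys.executable,
        "-c",
        "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
    ]
    log_path = os.path.join(tempfile.mkdtemp(), "wrapper-demo.log")
    print("exit code:", run_with_combined_log(child, log_path))
    print("log written to:", log_path)
```

Opening the file in append mode keeps output from earlier runs, which is
usually what you want when the process is restarted by a supervisor.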



> The disk size total is changing sometimes on our agents. It is off by a
> few bytes (seeing ~10 bytes difference out of, say, 600 GB). We use ZFS on
> our agents to manage the disk partition. From my colleague, Andrew (copied
> here):
>
>> The current Mesos approach (i.e., `statvfs()` for total blocks and assume
>> that never changes) won’t work reliably on ZFS
>
>
As Jie alluded to, one strategy is to have a startup wrapper script that
calculates the resources and calls `mesos-agent` binary with `--resources`
flag set. This is what we used to do in production.
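
A minimal sketch of such a wrapper's calculation, under a few assumptions of
mine: the flag value uses the `name:value;name:value` form with `mem` and
`disk` in megabytes, and the disk total is rounded down to a coarse multiple
so that a difference of a few bytes does not change the advertised value. The
rounding step and the helper names are illustrative, not something Mesos
provides.

```python
import os
import shutil


def stable_disk_mb(total_bytes, granularity_mb=1024):
    """Round a disk size down to a multiple of `granularity_mb` megabytes."""
    total_mb = total_bytes // (1024 * 1024)
    return (total_mb // granularity_mb) * granularity_mb


def resources_flag(cpus, mem_mb, disk_mb):
    """Format a value for the agent's --resources flag."""
    return "cpus:{};mem:{};disk:{}".format(cpus, mem_mb, disk_mb)


if __name__ == "__main__":
    work_dir = "/"  # placeholder; use the filesystem backing the work dir
    disk_mb = stable_disk_mb(shutil.disk_usage(work_dir).total)
    # The memory figure is hard-coded here purely for illustration.
    flag = resources_flag(os.cpu_count() or 1, 4096, disk_mb)
    print("--resources=" + flag)
```

Rounding only narrows the problem: a total that sits right at a multiple of
the granularity can still flip between two values, so pinning the number in
configuration is the more robust choice when the disk size is known.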


Re: test

2016-07-13 Thread Vinod Kone
Don't sweat about the test email. Not a big deal. Welcome to the community!

On Wed, Jul 13, 2016 at 1:51 PM, Rahul Palamuttam 
wrote:

> I'm truly sorry.
> Just kept getting several message denied errors, until I realized I needed
> to send a reply to user-subscribe.
> I will not do that again.
>
>
> On Wed, Jul 13, 2016 at 11:57 AM, daemeon reiydelle 
> wrote:
>
>> Why are you wasting our time with this? Lame.
>>
>>
>> *...*
>>
>> *Daemeon C.M. Reiydelle
>> USA (+1) 415.501.0198
>> London (+44) (0) 20 8144 9872*
>>
>> On Wed, Jul 13, 2016 at 11:56 AM, Rahul Palamuttam <
>> rahulpala...@gmail.com> wrote:
>>
>>>
>>>
>>
>


Re: How to send a task to a running framework?

2016-07-08 Thread Vinod Kone
Are you asking how users can submit their tasks to your custom framework?
Your framework should probably expose an API for that.
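
To make that concrete, here is a small sketch of what such an API could look
like: an HTTP endpoint that accepts task descriptions and puts them on a
queue the scheduler would drain when it receives offers. Everything in it
(the '/tasks' route, the JSON shape, the names `pending_tasks` and
`start_submit_api`) is hypothetical and independent of Mesos itself.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Submitted tasks wait here until the scheduler has resources for them.
pending_tasks = queue.Queue()


class SubmitHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/tasks":  # hypothetical route
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            task = json.loads(self.rfile.read(length).decode("utf-8"))
        except ValueError:
            self.send_error(400, "body must be JSON")
            return
        pending_tasks.put(task)
        self.send_response(202)  # accepted, but not launched yet
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def start_submit_api(port=0):
    """Serve the endpoint on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), SubmitHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A client on another machine would POST a JSON task description to this
endpoint (bound to a routable address rather than 127.0.0.1), and the
framework's scheduler loop would pop from `pending_tasks` as offers arrive.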

On Fri, Jul 8, 2016 at 4:30 AM, Bryan Fok  wrote:

> Hi all
>
> After I have my custom framework running in, for instance, a Python
> process, how do I submit a task to it through another Python process on
> another machine? Through the framework name? Is there any documentation
> on this?
>
> BR
> Bryan
>


[VOTE] Release Apache Mesos 1.0.0 (rc2)

2016-07-07 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.0.


1.0.0 includes the following:



  * Scheduler and Executor v1 HTTP APIs are now considered stable.

  * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
    These APIs let operators and services (monitoring, load balancers) send
    HTTP requests to '/api/v1' endpoint on master or agent. See
    `docs/operator-http-api.md` for details.

  * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
    has been added to isolate disk resources more efficiently. Please refer
    to docs/mesos-containerizer.md for more details.

  * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
    added a new isolator 'docker/volume' which allows users to use external
    volumes in Mesos containerizer. Currently, the isolator interacts with
    the Docker volume plugins using a tool called 'dvdcli'. By speaking the
    Docker volume plugin API, most of the Docker volume plugins are
    supported.

  * [MESOS-4641] - **Experimental** A new network isolator, the
    `network/cni` isolator, has been introduced in the `MesosContainerizer`.
    The `network/cni` isolator implements the Container Network Interface
    (CNI) specification proposed by CoreOS. With CNI the `network/cni`
    isolator is able to allocate a network namespace to Mesos containers and
    attach the container to different types of IP networks by invoking
    network drivers called CNI plugins.

  * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored
    in order to decouple the ACLs definition language from the interface.
    It additionally includes the option of retrieving `ObjectApprover`. An
    `ObjectApprover` can be used to synchronously check authorizations for a
    given object and is hence useful when authorizing a large number of
    objects and/or large objects (which need to be copied using request
    based authorization). NOTE: This is a **breaking change** for authorizer
    modules.

  * [MESOS-5405] - The `subject` and `object` fields in
    authorization::Request have been changed from required to optional. If
    either of these fields is not set, the request should only be authorized
    if any subject/object should be allowed.
    NOTE: This is a semantic change for authorizer modules.

  * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
    filtering enables operators to restrict what part of the cluster state a
    user is authorized to see.
    Consider for example the `/state` master endpoint: an operator can now
    authorize users to only see a subset of the running frameworks, tasks,
    or executors. The following endpoints support HTTP endpoint filtering:
    '/state', '/state-summary', '/tasks', '/frameworks', '/weights',
    and '/roles'. Additionally the following v1 API calls support filtering:
    'GET_ROLES', 'GET_WEIGHTS', 'GET_FRAMEWORKS', 'GET_STATE', and
    'GET_TASKS'.

  * [MESOS-4909] - Tasks can now specify a kill policy. They are
    best-effort, because machine failures or forcible terminations may
    occur. Currently, the only available kill policy is how long to wait
    between graceful and forcible task kill. In the future, more policies
    may be available (e.g. hitting an HTTP endpoint, running a command,
    etc). Note that it is the executor's responsibility to enforce kill
    policies. For executor-less command-based tasks, the kill is performed
    via sending a signal to the task process: SIGTERM for the graceful kill
    and SIGKILL for the forcible kill. For docker executor-less tasks the
    grace period is passed to 'docker stop --time'. This feature supersedes
    the '--docker_stop_timeout', which is now deprecated.

  * [MESOS-4908] - The task kill policy defined within 'TaskInfo' can now be
    overridden when the scheduler kills the task. This can be used by
    schedulers to forcefully kill a task which is already being killed, e.g.
    if something went wrong during a graceful kill and a forcible kill is
    desired. Note that it is the executor's responsibility to honor the
    'Event.kill.kill_policy' field and override the task's kill policy and
    kill policy from a previous kill task request. To use this feature,
    schedulers and executors must support HTTP API; use the
    '--http_command_executor' agent flag to ensure the agent launches the
    HTTP API based command executor.

  * [MESOS-4949] - The executor shutdown grace period can now be configured
    in `ExecutorInfo`, which overrides the agent flag. When shutting down an
    executor the agent will wait in a best-effort manner for the grace
    period specified here before forcibly destroying the container. The
    executor must not assume that it will always be

Re: 1.0.0 RC2

2016-06-30 Thread Vinod Kone
Update: We still have about 6 blockers for the RC2 cut :( The good news is
that all of them are either reviewable or in progress :). I'll cut RC2
whenever they land, whether that's tomorrow or the coming Tuesday.

Dashboard to track progress:
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12328715

On Tue, Jun 21, 2016 at 11:48 AM, Vinod Kone <vinodk...@apache.org> wrote:

> There are still 8 outstanding issues, including 1 blocker. We are waiting
> for these to land for RC2.
>
>
> On Fri, Jun 17, 2016 at 5:11 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> We still have 12 issues, including 1 blocker, targeted for 1.0.
>>
>> Dashboard: https://issues.apache.org/jira/secure/Dashboard.jspa
>>
>> So I'll wait until *Monday morning PST* to cut RC2, for the blocker to
>> get resolved and any other targeted issues to land.
>>
>> Also note that with RC2 we will create a 1.0.x branch and update the
>> version on trunk to 1.1.0. Any further fixes for RC2 will be cherry picked
>> on to that branch.
>>
>>
>> On Wed, Jun 15, 2016 at 4:09 PM, Vinod Kone <vinodk...@apache.org> wrote:
>>
>>> There are still 17 unresolved issues targeted for 1.0. We have only a
>>> couple more days left for the RC cut. Whoever is driving & shepherding
>>> these, please make sure to land them.
>>>
>>>
>>>
>>> On Mon, Jun 13, 2016 at 1:58 PM, Vinod Kone <vinodk...@apache.org>
>>> wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I'm planning to cut 1.0 RC2 later this week (likely Friday). So please
>>>> make sure to get any patches targeted for 1.0 (esp. blockers) upstreamed.
>>>>
>>>> The dashboard for the release is here:
>>>> https://issues.apache.org/jira/issues/?filter=12335793
>>>>
>>>> Thanks,
>>>> Vinod
>>>>
>>>
>>>
>>
>

