Re: Deactivating framework unexpectedly

2016-08-12 Thread 志昌 余
Hi Anindya,

The problem occurred again. The following is the scheduler driver log on the
Chronos side:


I0812 08:15:43.90271296 sched.cpp:1937] Asked to abort the driver
I0812 08:15:43.90276396 sched.cpp:981] Scheduler::statusUpdate took 1.436378441secs
I0812 08:15:43.90278896 sched.cpp:988] Not sending status update acknowledgment message because the driver is not running!
I0812 08:15:43.90286696 sched.cpp:919] Ignoring task status update message because the driver is not running!

However, the earlier log entries give no clue as to why the scheduler driver
was aborted.



Thanks,

Zhichang Yu




From: 志昌 余
Sent: August 9, 2016 18:03:31
To: user@mesos.apache.org
Subject: Re: Deactivating framework unexpectedly


Hi Anindya,

Thanks for the info. I'll enable the scheduler driver log to see what happens.

Regards,

Zhichang Yu


From: anindya_si...@apple.com on behalf of Anindya Sinha
Sent: August 8, 2016 23:50:10
To: user@mesos.apache.org
Subject: Re: Deactivating framework unexpectedly

It looks like your framework (Chronos) is sending a DeactivateFrameworkMessage
to the master. The scheduler driver also sends this message if it is aborted
(https://github.com/apache/mesos/blob/master/src/sched/sched.cpp#L1224).

Also, the master can deactivate your framework if it disconnects or fails
over. Please check the master logs, or see whether your framework received a
FrameworkErrorMessage.

Thanks
Anindya

On Aug 8, 2016, at 3:35 AM, 志昌 余 <yuzhichang_...@hotmail.com> wrote:

Hi,
I recently ran into a weird problem. I'm running Mesos + Chronos. Chronos
often (once every several days) stops scheduling tasks because Mesos has
deactivated the framework.
The following is the log from the leading Mesos master:


# grep -iP "activat|disconnected" /var/log/mesos/mesos-master.INFO
I0806 13:40:33.14365830 master.cpp:2551] Deactivating framework 90a6a7dc-7256-4e55-bd7e-573233c5df74- (chronos-2.5.0-SNAPSHOT) at scheduler-86a64d22-5201-4bb0-8a2c-70d3e97afae6@10.8.139.246:34544
I0806 13:40:33.14390823 hierarchical.cpp:375] Deactivated framework 90a6a7dc-7256-4e55-bd7e-573233c5df74-
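
For anyone who prefers to filter the master log outside a shell, the following is a minimal Python sketch of the same case-insensitive search as the grep command above. Only the pattern comes from that command; the function name and the sample lines are illustrative assumptions.

```python
import re

# Same pattern as the grep above: case-insensitive "activat" or "disconnected".
PATTERN = re.compile(r"activat|disconnected", re.IGNORECASE)


def filter_master_log(lines):
    """Return the log lines that mention (de)activation or a disconnect."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


if __name__ == "__main__":
    # Hypothetical sample input, not taken from a real master log.
    sample = [
        "Deactivating framework example-framework\n",
        "an unrelated log line\n",
        "Framework example-framework disconnected\n",
    ]
    for line in filter_master_log(sample):
        print(line)
```

To run it against the real log, pass an open file object, for example `filter_master_log(open("/var/log/mesos/mesos-master.INFO"))`.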

The workaround is to manually reboot the Chronos leader.


My environment:
There are 3 physical machines, each running a containerized Mesos master
and Chronos. When the issue occurred, the Mesos leader and the Chronos leader
were both running on the same machine.

Software Version:
mesos-master:0.28.0-2.0.16.ubuntu1404

chronos:2.5.0-ce4469d.ubuntu1404-mesos-0.28.0-2.0.16.ubuntu1404

Can anyone offer insight into this problem?
Thanks,
Zhichang Yu



Re: [VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-12 Thread Greg Mann
+1 (non-binding):

* Ran "sudo make distcheck" on CentOS 7 with libevent and SSL enabled. All
tests passed.
* Used "test-upgrade.py" to test upgrades from 0.28.2 -> 1.0.1 and 1.0.0 ->
1.0.1; both were successful.

Cheers,
Greg


On Wed, Aug 10, 2016 at 5:32 PM, Vinod Kone  wrote:

> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.1.
>
>
> The CHANGELOG for the release is available at:
>
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.1-rc1
>
> 
> 
>
>
> The candidate for Mesos 1.0.1 release is available at:
>
> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz
>
>
> The tag to be voted on is 1.0.1-rc1:
>
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.1-rc1
>
>
> The MD5 checksum of the tarball can be found at:
>
> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.md5
>
>
> The signature of the tarball can be found at:
>
> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.asc
>
>
> The PGP key used to sign the release is here:
>
> https://dist.apache.org/repos/dist/release/mesos/KEYS
>
>
> The JAR is up in Maven in a staging repository here:
>
> https://repository.apache.org/content/repositories/orgapachemesos-1155
>
>
> Please vote on releasing this package as Apache Mesos 1.0.1!
>
>
> The vote is open until Mon Aug 15 17:29:33 PDT 2016 and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
>
> [ ] +1 Release this package as Apache Mesos 1.0.1
>
> [ ] -1 Do not release this package because ...
>
>
> Thanks,
>
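
For voters who want to script the checksum step, here is a minimal Python sketch that compares a downloaded tarball against the text of the published .md5 file. It assumes the .md5 file contains the digest in hex, possibly upper-case or split into space-separated groups; the function names are illustrative. It does not cover the PGP signature, which still needs to be checked against the KEYS file with a tool such as gpg.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(tarball_path, md5_text):
    """Check whether the tarball's digest appears in the published .md5 text.

    Assumption: the digest is written in hex, possibly in upper case or
    split by whitespace, so whitespace is removed and case is folded
    before searching for it.
    """
    normalized = "".join(md5_text.split()).lower()
    return md5_hex(tarball_path) in normalized
```

A call would look like `checksum_matches("mesos-1.0.1.tar.gz", open("mesos-1.0.1.tar.gz.md5").read())`, with both files downloaded from the URLs above.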


Re: [VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-12 Thread Greg Mann
Whoops! Sorry y'all, my wires got crossed :) I ran these tests on Ubuntu
14.04.

G

On Fri, Aug 12, 2016 at 12:50 PM, Greg Mann  wrote:

> +1 (non-binding):
>
> * Ran "sudo make distcheck" on CentOS 7 with libevent and SSL enabled. All
> tests passed.
> * Used "test-upgrade.py" to test upgrades from 0.28.2 -> 1.0.1 and 1.0.0
> -> 1.0.1; both were successful.
>
> Cheers,
> Greg
>
>
> On Wed, Aug 10, 2016 at 5:32 PM, Vinod Kone  wrote:
>
>> Hi all,
>>
>>
>> Please vote on releasing the following candidate as Apache Mesos 1.0.1.
>>
>>
>> The CHANGELOG for the release is available at:
>>
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.1-rc1
>>
>> 
>> 
>>
>>
>> The candidate for Mesos 1.0.1 release is available at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz
>>
>>
>> The tag to be voted on is 1.0.1-rc1:
>>
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.1-rc1
>>
>>
>> The MD5 checksum of the tarball can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.md5
>>
>>
>> The signature of the tarball can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.asc
>>
>>
>> The PGP key used to sign the release is here:
>>
>> https://dist.apache.org/repos/dist/release/mesos/KEYS
>>
>>
>> The JAR is up in Maven in a staging repository here:
>>
>> https://repository.apache.org/content/repositories/orgapachemesos-1155
>>
>>
>> Please vote on releasing this package as Apache Mesos 1.0.1!
>>
>>
>> The vote is open until Mon Aug 15 17:29:33 PDT 2016 and passes if a
>> majority of at least 3 +1 PMC votes are cast.
>>
>>
>> [ ] +1 Release this package as Apache Mesos 1.0.1
>>
>> [ ] -1 Do not release this package because ...
>>
>>
>> Thanks,
>>
>
>


Re: [VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-12 Thread Radoslaw Gruchalski
I am trying to build Mesos 1.0.1 for Centos 7 in a Docker container but I'm
hitting this: https://issues.apache.org/jira/browse/MESOS-5925.

Kind regards,

Radek Gruchalski
ra...@gruchalski.com
+4917685656526

*Confidentiality:*
This communication is intended for the above-named person and may be
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor
must you copy or show it to anyone; please delete/destroy and inform the
sender immediately.

On Thu, Aug 11, 2016 at 2:32 AM, Vinod Kone  wrote:

> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.1.
>
>
> The CHANGELOG for the release is available at:
>
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.1-rc1
>
> 
> 
>
>
> The candidate for Mesos 1.0.1 release is available at:
>
> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz
>
>
> The tag to be voted on is 1.0.1-rc1:
>
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.1-rc1
>
>
> The MD5 checksum of the tarball can be found at:
>
> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.md5
>
>
> The signature of the tarball can be found at:
>
> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.asc
>
>
> The PGP key used to sign the release is here:
>
> https://dist.apache.org/repos/dist/release/mesos/KEYS
>
>
> The JAR is up in Maven in a staging repository here:
>
> https://repository.apache.org/content/repositories/orgapachemesos-1155
>
>
> Please vote on releasing this package as Apache Mesos 1.0.1!
>
>
> The vote is open until Mon Aug 15 17:29:33 PDT 2016 and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
>
> [ ] +1 Release this package as Apache Mesos 1.0.1
>
> [ ] -1 Do not release this package because ...
>
>
> Thanks,
>


Re: [VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-12 Thread Alex Rukletsov
+1 (binding)

make check on Mac OS 10.11.6 with apple clang-703.0.31.

DockerFetcherPluginTest.INTERNET_CURL_FetchImage is flaky (MESOS-4570), but
this does not seem to be a regression or a blocker.

On Fri, Aug 12, 2016 at 10:30 PM, Radoslaw Gruchalski 
wrote:

> I am trying to build Mesos 1.0.1 for Centos 7 in a Docker container but
> I'm hitting this: https://issues.apache.org/jira/browse/MESOS-5925.
>
> Kind regards,
>
> Radek Gruchalski
> ra...@gruchalski.com
> +4917685656526
>
> *Confidentiality:*
> This communication is intended for the above-named person and may be
> confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
> On Thu, Aug 11, 2016 at 2:32 AM, Vinod Kone  wrote:
>
>> Hi all,
>>
>>
>> Please vote on releasing the following candidate as Apache Mesos 1.0.1.
>>
>>
>> The CHANGELOG for the release is available at:
>>
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.1-rc1
>>
>> 
>> 
>>
>>
>> The candidate for Mesos 1.0.1 release is available at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz
>>
>>
>> The tag to be voted on is 1.0.1-rc1:
>>
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.1-rc1
>>
>>
>> The MD5 checksum of the tarball can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.md5
>>
>>
>> The signature of the tarball can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.asc
>>
>>
>> The PGP key used to sign the release is here:
>>
>> https://dist.apache.org/repos/dist/release/mesos/KEYS
>>
>>
>> The JAR is up in Maven in a staging repository here:
>>
>> https://repository.apache.org/content/repositories/orgapachemesos-1155
>>
>>
>> Please vote on releasing this package as Apache Mesos 1.0.1!
>>
>>
>> The vote is open until Mon Aug 15 17:29:33 PDT 2016 and passes if a
>> majority of at least 3 +1 PMC votes are cast.
>>
>>
>> [ ] +1 Release this package as Apache Mesos 1.0.1
>>
>> [ ] -1 Do not release this package because ...
>>
>>
>> Thanks,
>>
>
>


Re: [VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-12 Thread Kapil Arya
+1 (binding)

You can find the rpm/deb packages here:
  http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-1.0.1-rc1

The following docker tags (built off of ubuntu 14.04) are also available:
mesosphere/mesos:1.0.1-rc1
mesosphere/mesos-master:1.0.1-rc1
mesosphere/mesos-slave:1.0.1-rc1

Kapil

On Fri, Aug 12, 2016 at 4:39 PM, Alex Rukletsov  wrote:

> +1 (binding)
>
> make check on Mac OS 10.11.6 with apple clang-703.0.31.
>
> DockerFetcherPluginTest.INTERNET_CURL_FetchImage is flaky (MESOS-4570), but
> this does not seem to be a regression or a blocker.
>
> On Fri, Aug 12, 2016 at 10:30 PM, Radoslaw Gruchalski <ra...@gruchalski.com> wrote:
>
> > I am trying to build Mesos 1.0.1 for Centos 7 in a Docker container but
> > I'm hitting this: https://issues.apache.org/jira/browse/MESOS-5925.
> >
> > Kind regards,
> >
> > Radek Gruchalski
> > ra...@gruchalski.com
> > +4917685656526
> >
> > *Confidentiality:*
> > This communication is intended for the above-named person and may be
> > confidential and/or legally privileged.
> > If it has come to you in error you must take no action based on it, nor
> > must you copy or show it to anyone; please delete/destroy and inform the
> > sender immediately.
> >
> > On Thu, Aug 11, 2016 at 2:32 AM, Vinod Kone wrote:
> >
> >> Hi all,
> >>
> >>
> >> Please vote on releasing the following candidate as Apache Mesos 1.0.1.
> >>
> >>
> >> The CHANGELOG for the release is available at:
> >>
> >> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.1-rc1
> >>
> >> 
> >> 
> >>
> >>
> >> The candidate for Mesos 1.0.1 release is available at:
> >>
> >> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz
> >>
> >>
> >> The tag to be voted on is 1.0.1-rc1:
> >>
> >> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.1-rc1
> >>
> >>
> >> The MD5 checksum of the tarball can be found at:
> >>
> >> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.md5
> >>
> >>
> >> The signature of the tarball can be found at:
> >>
> >> https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.asc
> >>
> >>
> >> The PGP key used to sign the release is here:
> >>
> >> https://dist.apache.org/repos/dist/release/mesos/KEYS
> >>
> >>
> >> The JAR is up in Maven in a staging repository here:
> >>
> >> https://repository.apache.org/content/repositories/orgapachemesos-1155
> >>
> >>
> >> Please vote on releasing this package as Apache Mesos 1.0.1!
> >>
> >>
> >> The vote is open until Mon Aug 15 17:29:33 PDT 2016 and passes if a
> >> majority of at least 3 +1 PMC votes are cast.
> >>
> >>
> >> [ ] +1 Release this package as Apache Mesos 1.0.1
> >>
> >> [ ] -1 Do not release this package because ...
> >>
> >>
> >> Thanks,
> >>
> >
> >
>


Re: Programmatically retrieve stdout/stderr from a node

2016-08-12 Thread Benjamin Mahler
Also I believe the CLI work that Haris / Kevin have been doing would make
this easy to do via the Mesos CLI (it's not integrated into the project
yet).

On Wed, Aug 10, 2016 at 9:57 AM, Erik Weathers 
wrote:

> Just for completeness and to provide an alternative, you can also probably
> leverage the dcos command line tool (https://github.com/dcos/dcos-cli) to
> get all the info you would need in a JSON format.
>
> e.g.,
> 1. set up ~/.dcos/config.toml for your cluster
> 2. DCOS_CONFIG=~/.dcos/config.toml dcos task --json --completed
>
> You could process that output with jq (https://stedolan.github.io/jq/)
> and do all of this in a short script.  (Not that I have much luck using jq,
> I'm not very skilled at working with such arcane syntax, a la XPath.)
>
> - Erik
>
> On Wed, Aug 10, 2016 at 9:27 AM, June Taylor  wrote:
>
>> David,
>>
>> Thanks for the suggestions, this has the missing piece we needed!
>>
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>> On Wed, Aug 10, 2016 at 10:54 AM, David Greenberg wrote:
>>
>>> The Cook framework has examples of how to do this. See here
>>> (https://github.com/twosigma/Cook/blob/master/scheduler/src/cook/mesos/api.clj#L322-L324)
>>> for constructing the stem of the URL, here
>>> (https://github.com/twosigma/Cook/blob/7a49fbb98b281e3b23779cd88d1d2b73428a0447/scheduler/src/cook/mesos/api.clj#L281-L297)
>>> for finding the filesystem path, and here
>>> (https://github.com/twosigma/Cook/blob/master/scheduler/docs/scheduler-rest-api.asc#using_output_url)
>>> for getting data with the stem.
>>>
>>> Essentially, you need to scrape the Mesos master to get all the path
>>> info you need. LMK if you have questions!
>>>
>>> On Wed, Aug 10, 2016 at 9:34 AM June Taylor  wrote:
>>>
>>>> Tomek,
>>>>
>>>> I'm not sure I understand your suggestion. We know how to ask for a
>>>> file from an HTTP endpoint, but it is the construction of the correct URL
>>>> which is not currently clear.
>>>>
>>>> We are not sure how to determine the Run ID of the executor.
>>>>
>>>>
>>>> Thanks,
>>>> June Taylor
>>>> System Administrator, Minnesota Population Center
>>>> University of Minnesota
>>>>
>>>> On Wed, Aug 10, 2016 at 10:08 AM, Tomek Janiszewski wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> If you need the simplest method, then Python's SimpleHTTPServer could help.
>>>>> Just launch it in the background before the command you want to run, assign
>>>>> it a port, and query the sandbox with : that can be obtained from the
>>>>> state endpoint.
>>>>>
>>>>> -
>>>>> Tomek
>>>>>
>>>>> On Wed, 10.08.2016 at 16:53, June Taylor wrote:
>>>>>
>>>>>> We are trying to retrieve the stdout and stderr files from an
>>>>>> executor programmatically.
>>>>>>
>>>>>> It appears that these are available via HTTP request; however,
>>>>>> constructing the correct URL is proving to be a challenge.
>>>>>>
>>>>>> Our scenario is:
>>>>>>
>>>>>> 1. Use mesos-execute to submit a job. A framework ID is available at
>>>>>> this point.
>>>>>> 2. Using the framework ID, one can inquire with mesos-state to
>>>>>> determine which slave ID is executing the task.
>>>>>> 3. Using the slave ID, one can inquire with mesos-state to find the
>>>>>> hostname for that slave ID.
>>>>>> 4. HTTP can be used to ask the /browse/ endpoint for a file; however,
>>>>>> there is an Executor ID which we cannot programmatically determine, to
>>>>>> complete this URL.
>>>>>>
>>>>>> Please advise the simplest option for retrieving the sandbox files
>>>>>> given the scenario starts with mesos-execute commands.
>>>>>>
>>>>>> Thanks!
>>>>>> June Taylor
>>>>>> System Administrator, Minnesota Population Center
>>>>>> University of Minnesota
>>>>>>
>>>>>
>>>>
>>
>
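
The state-scraping approach discussed in this thread can be sketched in Python as follows. This is a sketch under stated assumptions, not a definitive implementation: it assumes the agent's state JSON lists each executor under frameworks -> executors with a "directory" field holding the sandbox path, and that the agent serves sandbox files at /files/download with a "path" query parameter. Both should be checked against the Mesos version in use. The function names and the sample data are hypothetical. Under those assumptions the run ID is already part of the directory path, so it would not need to be determined separately.

```python
from urllib.parse import urlencode


def find_sandbox_dirs(agent_state, framework_id):
    """Return {executor_id: sandbox_directory} for one framework.

    Assumption: agent_state is the parsed JSON of the agent's state
    endpoint, shaped as frameworks -> executors -> "directory".
    """
    dirs = {}
    for framework in agent_state.get("frameworks", []):
        if framework.get("id") != framework_id:
            continue
        executors = (framework.get("executors", [])
                     + framework.get("completed_executors", []))
        for executor in executors:
            dirs[executor["id"]] = executor["directory"]
    return dirs


def sandbox_file_url(agent_host, agent_port, directory, filename="stdout"):
    """Build a download URL for a file inside an executor sandbox."""
    query = urlencode({"path": directory.rstrip("/") + "/" + filename})
    return "http://{}:{}/files/download?{}".format(agent_host, agent_port, query)


if __name__ == "__main__":
    # Hypothetical state fragment, for illustration only.
    state = {
        "frameworks": [
            {
                "id": "fw-1",
                "executors": [
                    {"id": "ex-1", "directory": "/tmp/sandbox/runs/abc"},
                ],
            }
        ]
    }
    for executor_id, directory in find_sandbox_dirs(state, "fw-1").items():
        print(executor_id, sandbox_file_url("agent1", 5051, directory))
```

The resulting URL can then be fetched with any HTTP client; frameworks that have already terminated may appear under a separate list in the state JSON, which this sketch does not search.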