Re: Initial leader election

2015-11-26 Thread Guilherme Moro
The nodes are quite fast to come up, but I will try to increase that for a
test anyway.
Either way, shouldn't the system try again automatically instead of just
repeatedly logging "Replica in EMPTY status received a broadcasted recover
request" after a couple of failures?


Thanks for the answer.



On 25 November 2015 at 17:31, Marco Massenzio wrote:

> A quick glance at the logs doesn't show anything that stands out, apart
> from:
>
> --zk_session_timeout="10secs"
>
> which seems to lead to:
>
> Nov 23 16:50:13 node1 mesos-master[17501]: I1123 16:50:13.594151 17521
> recover.cpp:111] Unable to finish the recover protocol in 10secs,
> retrying
>
> That is the default value, but your setup may need longer than that
> (the time it takes for all master nodes to come up and reach quorum
> may be the issue).
>
> --
> *Marco Massenzio*
> Distributed Systems Engineer
> http://codetrips.com
>
> On Wed, Nov 25, 2015 at 3:06 AM, Guilherme Moro wrote:
>
> > https://issues.apache.org/jira/browse/MESOS-4010
> >
> > On 24 November 2015 at 13:55, Klaus Ma wrote:
> >
> > > I'd suggest opening a JIRA to track the issue; I think you can attach
> > > master.log & slave.log for the owner's reference.
> > >
> > > 
> > > Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
> > > Platform Symphony/DCOS Development & Support, STG, IBM GCG
> > > +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
> > >
> > > On Tue, Nov 24, 2015 at 8:45 PM, Guilherme Moro
> > > <guilherme.m...@ammeon.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm having a problem while trying to create the initial cluster: no
> > > > leader is elected.
> > > > For a start, let me explain my setup:
> > > > 3 nodes
> > > > 3 zookeepers
> > > > 3 mesos-master services, configured as initctl services and
> > > > controlled by puppet; the RPMs installed are from the RHEL repository
> > > > at mesosphere (installed through puppet as well), running on RHEL 6.6
> > > > Quorum is set to 2, as expected; all the remaining configs were
> > > > double-checked and appear to be correct.
> > > > Most of the time I can get the cluster to bootstrap after rebooting
> > > > the nodes (sometimes more than once).
> > > > The whole thing resembles a bit
> > > > https://issues.apache.org/jira/browse/MESOS-2148 and
> > > > https://issues.apache.org/jira/browse/MESOS-2014
> > > >
> > > > Even when I get the master elected, sometimes another couple of
> > > > reboots or restarts of the services are needed to get all the slave
> > > > nodes added (they are the same nodes as the masters).
> > > >
> > > > I can quite easily reproduce this behavior; if someone cares to look
> > > > at logs, tell me exactly what to collect and what logging flags I
> > > > should enable.
> > > >
> > > > So, should I open a bug, or is there a trick to bootstrapping the
> > > > cluster that I'm missing here?
> > > >
> > > > Regards,
> > > >
> > > > Guilherme Moro
> > > >
> > > > --
> > > > This email and any files transmitted with it are confidential and
> > > > intended solely for the use of the individual or entity to whom they
> > > > are addressed. If you have received this email in error please notify
> > > > the system manager. This message contains confidential information and
> > > > is intended only for the individual named. If you are not the named
> > > > addressee you should not disseminate, distribute or copy this e-mail.
> > > >
> > > >
> > >
> >
>




why I have to set LIBPROCESS_IP for my framework?

2015-11-26 Thread zhou weitao
Hi, list:

As the subject says: for Marathon, Chronos, Spark and so on, I have to set
the LIBPROCESS_IP environment variable to register them with mesos-master,
but I haven't figured out why yet. If possible, please point me to some
info.

thanks in advance.


Re: why I have to set LIBPROCESS_IP for my framework?

2015-11-26 Thread haosdent
Hi, Mesos uses libprocess, which needs LIBPROCESS_IP and LIBPROCESS_PORT
to be set in order to communicate.
https://github.com/apache/mesos/tree/master/3rdparty/libprocess

On Thu, Nov 26, 2015 at 5:47 PM, zhou weitao wrote:

> Hi, list:
>
> As the subject says: for Marathon, Chronos, Spark and so on, I have to set
> the LIBPROCESS_IP environment variable to register them with mesos-master,
> but I haven't figured out why yet. If possible, please point me to some
> info.
>
> thanks in advance.
>



-- 
Best Regards,
Haosdent Huang


Re: why I have to set LIBPROCESS_IP for my framework?

2015-11-26 Thread Chengwei Yang
Yes, mesos-master, mesos-slave and schedulers all use libprocess so far, and
for mesos-master/slave you can set LIBPROCESS_IP through the --ip option.

For a scheduler that doesn't expose an option to configure LIBPROCESS_IP, you
have to set it in the environment yourself.

BUT only when your host is not configured correctly.

`not configured correctly` means the local IP that libprocess derives from the
hostname is incorrect.
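
For a scheduler with no such option, "configuring the environment" just means
exporting LIBPROCESS_IP in the environment the scheduler is started from. A
minimal sketch in Python, under stated assumptions: `libprocess_env` and
`launch_with_libprocess_ip` are hypothetical helper names, 192.0.2.10 is a
documentation-range placeholder address, and the child command is a stand-in
for whatever actually starts your scheduler. None of this is part of Mesos.

```python
import os
import subprocess
import sys


def libprocess_env(ip, base_env=None):
    """Return a copy of the environment with LIBPROCESS_IP set to `ip`."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBPROCESS_IP"] = ip
    return env


def launch_with_libprocess_ip(command, ip):
    """Run `command` (a list of arguments) with LIBPROCESS_IP exported.

    Hypothetical helper: `command` would be whatever starts your scheduler.
    """
    return subprocess.run(
        command,
        env=libprocess_env(ip),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in for a real scheduler: a child process that only prints the
    # value it inherited from its parent's environment.
    child = [sys.executable, "-c",
             "import os; print(os.environ['LIBPROCESS_IP'])"]
    result = launch_with_libprocess_ip(child, "192.0.2.10")
    print(result.stdout.strip())
```

The variable is set only for the child process, so the rest of the host's
environment is left untouched.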

See code 3rdparty/libprocess/src/process.cpp:929-935, like below

```
$ sed -ne '929,935p' 3rdparty/libprocess/src/process.cpp
if (gethostname(hostname, sizeof(hostname)) < 0) {
  LOG(FATAL) << "Failed to initialize, gethostname: "
             << hstrerror(h_errno);
}

// Lookup IP address of local hostname.
Try<net::IP> ip = net::getIP(hostname, __address__.ip.family());
```
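
To see roughly what such a lookup returns on a given host, the same two steps
(get the hostname, resolve it) can be approximated outside of Mesos. The
sketch below is not libprocess code, and the system resolver used by Python
may not behave identically to `net::getIP`. The helper names are made up for
illustration, and treating loopback or unspecified addresses as "suspicious"
is my assumption about what a misconfigured host typically looks like.

```python
import ipaddress
import socket


def looks_unusable(ip_text):
    """True if `ip_text` is a loopback or unspecified address, which other
    hosts cannot use to reach this machine."""
    ip = ipaddress.ip_address(ip_text)
    return ip.is_loopback or ip.is_unspecified


def resolve_local_hostname():
    """Return (hostname, ip) as seen by a hostname-based lookup."""
    hostname = socket.gethostname()
    return hostname, socket.gethostbyname(hostname)


if __name__ == "__main__":
    try:
        hostname, ip = resolve_local_hostname()
    except OSError as error:
        # Raised when the local hostname cannot be resolved at all.
        print("hostname does not resolve:", error)
    else:
        verdict = "suspicious" if looks_unusable(ip) else "plausible"
        print("{} resolves to {} ({})".format(hostname, ip, verdict))
```

If the hostname resolves to a loopback address, other machines cannot connect
back to it, which is the situation where LIBPROCESS_IP has to be set by hand.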

Ensure you have your host configured correctly. You may have to add a line to
`/etc/hosts` like

 

-- 
Thanks,
Chengwei

On Thu, Nov 26, 2015 at 05:58:28PM +0800, haosdent wrote:
> Hi, Mesos uses libprocess, which needs LIBPROCESS_IP and LIBPROCESS_PORT
> to be set in order to communicate.
> https://github.com/apache/mesos/tree/master/3rdparty/libprocess
> 
> On Thu, Nov 26, 2015 at 5:47 PM, zhou weitao wrote:
> 
> > Hi, list:
> >
> > As the subject says: for Marathon, Chronos, Spark and so on, I have to set
> > the LIBPROCESS_IP environment variable to register them with mesos-master,
> > but I haven't figured out why yet. If possible, please point me to some
> > info.
> >
> > thanks in advance.
> >
> 
> 
> 
> -- 
> Best Regards,
> Haosdent Huang




Re: why I have to set LIBPROCESS_IP for my framework?

2015-11-26 Thread tommy xiao
Hi weitao,

Mesos components use libprocess to communicate with each other, that's all.

2015-11-26 17:47 GMT+08:00 zhou weitao:

> Hi, list:
>
> As the subject says: for Marathon, Chronos, Spark and so on, I have to set
> the LIBPROCESS_IP environment variable to register them with mesos-master,
> but I haven't figured out why yet. If possible, please point me to some
> info.
>
> thanks in advance.
>



-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com


Re: can anyone shepherd MESOS-3725

2015-11-26 Thread Till Toenshoff
Done - sorry for not noticing your request earlier.

> On Nov 19, 2015, at 1:46 AM, James Peach wrote:
> 
> Hi all,
> 
> Can anyone shepherd https://issues.apache.org/jira/browse/MESOS-3725?
> 
> thanks!
> 
> 



Re: Anyone successfully setup CLION with mesos?

2015-11-26 Thread Marco Massenzio
I did set it up with CLion - it's not perfect (still a few "false
positives" on compile errors and it gets confused in places) but it's
definitely better than the best I was able to achieve with Eclipse (see my
blog link below for details on that one).

I can't remember exactly what I did to make it all work, but I think you
can just point it to the top-level CMakeLists.txt file, then CLion will do
the rest and figure it out - Alex will correct me, but last time I tried, I
was only able to use it up to the `stouttests` target; I'm sure a lot more
works now.

At any rate, code navigation, auto-completion, etc. work just fine.

--
*Marco Massenzio*
Distributed Systems Engineer
http://codetrips.com

On Wed, Nov 25, 2015 at 10:41 PM, Alex Clemmer wrote:

> CMake support is not quite mature yet. I think we're at least a couple
> months out before it's really ready to rely on, but I'm quite happy to
> hear about your bugs! Feel free to file them against me (I'm
> `hausdorff` on the JIRA), and I'll make sure they get routed properly.
>
> On Wed, Nov 25, 2015 at 8:22 PM, haosdent wrote:
> > I set it up successfully, because Mesos has CMake support now. Import it
> > as a CMake project.
> >
> > On Thu, Nov 26, 2015 at 12:21 PM, Shiyao Ma wrote:
> >
> >> Hi,
> >>
> >> I'd like to browse the Mesos code with abilities such as jump to
> >> definition, etc.
> >>
> >> Ctags and YouCompleteMe fall short here.
> >>
> >> So I think CLion might be a good way to go. Has anybody managed to do
> >> that with CLion?
> >>
> >> What's the setup?
> >>
> >>
> >> Thanks.
> >>
> >>
> >> shiyao
> >>
> >
> >
> >
> > --
> > Best Regards,
> > Haosdent Huang
>
>
>
> --
> Alex
>
> Theory is the first term in the Taylor series of practice. -- Thomas M
> Cover (1992)
>