> > Please also add a note that one should specify IP addresses in ringX_addr
> > directives, not a domain name. Else corosync does not work properly in
> > UDPu mode, and at the same time it does not say anything significant in
> > its log files. I've spent 4 hours recently trying to figure this out.
> >
>
> As I replied to you on the PCMK list, ringX_addr resolving should work as
> expected (I'm using only this configuration, and the same applies to most
> of the clusters created by pcs). Even if ringX_addr resolving were
> broken, it would certainly not be something appropriate for
> "TROUBLESHOOTING"; it would be a bug to fix.
>
> Can you please attach corosync logs (ideally with debug enabled), so
> that we can find the root cause of the problem you are hitting?
>
Sure, here they are:
http://oss.clusterlabs.org/pipermail/pacemaker/2015-January/023320.html
The complete NON-WORKING corosync.conf is (note that instead of "a.b.c.d" I
have a plain IP address):
# THIS IS A NON-WORKING CONFIGURATION DUE TO non-IP addresses in ringX_addr!
totem {
    version: 2
    cluster_name: velvica
    secauth: on
    clear_node_high_bit: yes
    interface {
        ringnumber: 0
        bindnetaddr: a.b.c.d
        mcastport: 5405
        ttl: 1
    }
    transport: udpu
    heartbeat_failures_allowed: 3
}
logging {
    fileline: off
    to_logfile: no
    to_syslog: yes
    debug: off
    timestamp: off
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
nodelist {
    node {
        ring0_addr: node1 # <-- seems not working, IP address is needed
    }
    node {
        ring0_addr: node2
    }
    node {
        ring0_addr: node3
    }
}
quorum {
    provider: corosync_votequorum
}
If I then replace node1, node2, and node3 with their IP addresses,
everything starts working. See the /var/log/syslog output at
http://oss.clusterlabs.org/pipermail/pacemaker/2015-January/023320.html
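
To narrow down whether name resolution is involved at all, something along
these lines could be run on each node. It is only a rough sketch, assuming
Python 3 is available; the function names (ring_addrs, is_ip_literal,
resolve, report) are made up for illustration and are not part of corosync.
It shows what the local resolver returns for each ringX_addr value; whether
corosync ends up with the same answer is an assumption, not something
verified here.

```python
import ipaddress
import re
import socket

# Matches "ringN_addr: value" and ignores a trailing "# comment".
RING_ADDR = re.compile(r"^\s*ring\d+_addr:\s*([^\s#]+)", re.MULTILINE)


def ring_addrs(conf_text):
    """Return every ringX_addr value found in corosync.conf text, in order."""
    return RING_ADDR.findall(conf_text)


def is_ip_literal(value):
    """True if value is a literal IPv4/IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def resolve(name):
    """Return the sorted addresses the local resolver gives for name, or []."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def report(conf_text):
    """Build one human-readable line per ringX_addr entry."""
    lines = []
    for value in ring_addrs(conf_text):
        if is_ip_literal(value):
            lines.append(f"{value}: IP literal")
        else:
            addrs = resolve(value)
            lines.append(f"{value}: hostname -> {', '.join(addrs) or 'UNRESOLVED'}")
    return lines
```

Feeding it the contents of /etc/corosync/corosync.conf, for example
print("\n".join(report(open(path).read()))) with path pointing at that file,
lists each entry as either an IP literal or a hostname with the addresses it
resolves to. A name that resolves to a loopback address, or to different
addresses on different nodes, would be worth mentioning alongside the logs.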
> > On Monday, January 5, 2015, Jan Pokorný <[email protected]> wrote:
> >
> >> (if you let me, some more in-line)
> >>
> >> On 05/01/15 16:20 +0000, Christine Caulfield wrote:
> >>> Looks good to me, thanks. I've fixed a few typos and pointed out a
> >>> spurious capital inline below
> >>>
> >>> On 05/01/15 14:39, Steven Dake wrote:
> >>>> Add a troubleshooting guide. I'm sure other folks have some good stuff
> >>>> to put in here. These are just the ones I know about :)
> >>>>
> >>>> Signed-off-by: Steven Dake <[email protected]>
> >>>> ---
> >>>> man/corosync.conf.5 | 39 +++++++++++++++++++++++++++++++++++++++
> >>>> 1 file changed, 39 insertions(+)
> >>>>
> >>>> diff --git a/man/corosync.conf.5 b/man/corosync.conf.5
> >>>> index 8e774c1..16d84ca 100644
> >>>> --- a/man/corosync.conf.5
> >>>> +++ b/man/corosync.conf.5
> >>>> @@ -678,6 +678,45 @@ Native means one of shm or socket, depending on what is supported by OS. On syst
> >>>> with support for both, SHM is selected. SHM is generally faster, but need to allocate
> >>>> ring buffer file in /dev/shm.
> >>>>
> >>>> +.SH "TROUBLESHOOTING"
> >>>> +.TP
> >>>> +Ocassionally Corosync will not work with the default network. Here are some
> >> ^^^ Occasionally
> >>
> >>>> +common tips that people have used to find a working Corosync.
> >>>> +
> >>>> +.TP
> >>>> +Disable the firewall. The firwall could block Corosync packets from reaching
> >>> ^^firewall
> >>>> +the network.
> >>>> +
> >>>> +.TP
> >>>> +Force IGMP v2. Some modern switches do not support the kernel IGMP v3
> >>>> + protocol. As a result, They will not properly register the cluster. To do
> >> ^^^ they
> >>
> >>>> +this, simply run the command
> >>>> +
> >>>> +.BR sysctl -w net.ipv4.conf.all.force_igmp_version=2
> >>>> +
> >>>> +.TP
> >>>> +If on a routed network, set a larger ttl. The TTL tells the routers how long
> >>>> +to let the packet multicast before dropping it permanently. The Default ttl
> >>> ^^^ default
> >>
> >> (inconsistent casing of ttl/TTL)
> >>
> >>>> +is set to 1, which means the packet will drop after its first hop. This will
> >>>> +not work well on a routed network.
> >>>> +
> >>>> +.TP
> >>>> +I use a VLAN and Corosync doesn't work. If your using a VLAN, VLAN's shave the
> >>> ^^^ you're VLANs
> >>>
> >>>> +packet size available for Corosync to use in some cases. Corosync does not
> >>>> +automatically adjust to this change. Set netmtu appropriately when using a
> >>>> +VLAN.
> >>>> +
> >>>> +.TP
> >>>> +If all else fails, use UDPU. The authors implemented UDPU to solve the various
> >>>> +problems with multicast that plague modern switch implementations. The UDPU
> >>>> +protocol was initially believed to be much slower but the reality after
> >>>> +implementation is that it doesn't make much difference.
> >>>> +
> >>>> +Even with UDPU you would be hard pressed to find a faster group messaging
> >>>> +system than Corosync. The only downside of UDPU is it results in much more
> >>>> +packet copying across the network.
> >>>> +
> >>>> +
> >>>> .SH "FILES"
> >>>> .TP
> >>>> /etc/corosync/corosync.conf
> >>
> >> --
> >> Jan
> >>
> >
> >
> >
> > _______________________________________________
> > discuss mailing list
> > [email protected]
> > http://lists.corosync.org/mailman/listinfo/discuss
> >
>
>