On Wed, Apr 04, 2007 at 04:47:32PM -0700, Simon Horman wrote:
> On Wed, Apr 04, 2007 at 09:10:02PM +0200, Philipp Kolmann wrote:
> > Package: heartbeat
> > Version: 1.2.5-3
> > Severity: important
> > 
> > Hi,
> > 
> > today I upgraded my cluster to etch and wanted to test the failover settings,
> > but this failed completely. After some debugging I found that the
> > IPv6addr script fails.
> > 
> > Looking deeper into this issue, it seems that the function is_addr6_available
> > checks the reply for an ICMPV6_ECHO_REQUEST but fails here:
> > 
> >         if (0 != memcmp(&local, &addr.sin6_addr,sizeof(local))) {
> > 
> > It seems that in etch addr.sin6_addr isn't '::1'.
> > 
> > I tried it with a sarge test system and there it works. (addr.sin6_addr ==
> > '::1').
> > 
> > I am not sure if this is a heartbeat or kernel problem, but a fix would be
> > very welcome.
> > 
> > Tell me if you need any info to help fix this issue.
> 
> Hi Philipp,
> 
> thanks for pointing this out. I did a quick check of
> is_addr6_available() in the linux-ha 2.0 tree and it seems that this
> local check has been removed altogether - though I am not sure why. I
> wonder if this would also be a good approach to take to side-step the
> problem that you have found in 1.2.5.
> 
> The (untested) patch below shows what I am thinking about.
> Is it possible for you to see if this solves the problem that you are
> seeing?
> 
> I have CCed the linux-ha-dev list, so the linux-ha developers can
> pass their eyes over this. It's a subscriber-only list.
> 
> For the linux-ha people, this (Debian) bug can be seen at 
> http://bugs.debian.org/417835, and posting to
> [EMAIL PROTECTED] will log against it.

Hi Simon,

thanks for looking into this. I did some more testing yesterday and today and
came to the same conclusion as you.

My problem now, after disabling the local check, is that in send_ua, at
"if (libnet_write(l) == -1)", I get the following glibc error:

*** glibc detected *** free(): invalid next size (fast): 0x00000000005042e0 ***
Aborted

In gdb I get the following backtrace:

*** glibc detected *** free(): invalid next size (fast): 0x00000000005042e0 ***

Program received signal SIGABRT, Aborted.
[Switching to Thread 47816971899424 (LWP 27541)]
0x00002b7d413c907b in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00002b7d413c907b in raise () from /lib/libc.so.6
#1  0x00002b7d413ca84e in abort () from /lib/libc.so.6
#2  0x00002b7d413ff629 in __fsetlocking () from /lib/libc.so.6
#3  0x00002b7d41406193 in mallopt () from /lib/libc.so.6
#4  0x00002b7d4140621e in free () from /lib/libc.so.6
#5  0x00002b7d41290b6b in libnet_write () from /usr/lib/libnet.so.1
#6  0x0000000000401c74 in send_ua (src_ip=0x7fff69b7f360, if_name=<value optimized out>) at IPv6addr.c:372
#7  0x00000000004026a9 in main (argc=<value optimized out>, argv=<value optimized out>) at IPv6addr.c:261

The IP address is set correctly, but the verification checks fail.
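
To find out which source address etch reports instead of '::1' in the
original memcmp check, a small debugging helper along the following lines
might be useful. This is only a sketch: the names addr6_to_str and
addr6_equal_verbose are made up for illustration and do not exist in
IPv6addr.c, and it assumes that both operands of the memcmp are of type
struct in6_addr.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of IPv6addr.c): format an IPv6 address
 * into buf; returns buf, or "?" if inet_ntop fails. */
static const char *addr6_to_str(const struct in6_addr *a, char *buf, size_t len)
{
    const char *s = inet_ntop(AF_INET6, a, buf, (socklen_t)len);
    return s ? s : "?";
}

/* Hypothetical helper: byte-wise comparison like the memcmp in the report,
 * but on a mismatch both addresses are printed to stderr.
 * Returns 1 if the addresses are equal, 0 otherwise. */
static int addr6_equal_verbose(const struct in6_addr *expected,
                               const struct in6_addr *got)
{
    char e[INET6_ADDRSTRLEN];
    char g[INET6_ADDRSTRLEN];

    if (memcmp(expected, got, sizeof(*expected)) == 0)
        return 1;

    fprintf(stderr, "reply source mismatch: expected %s, got %s\n",
            addr6_to_str(expected, e, sizeof(e)),
            addr6_to_str(got, g, sizeof(g)));
    return 0;
}
```

Calling addr6_equal_verbose(&local, &addr.sin6_addr) in place of the bare
memcmp would print the address that actually comes back, which may help to
tell a heartbeat problem from a kernel one.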

For your information, here is the ldd output of the binary:

ldd .libs/IPv6addr 
        libplumb.so.0 => /usr/lib/libplumb.so.0 (0x00002b43ed141000)
        libglib-1.2.so.0 => /usr/lib/libglib-1.2.so.0 (0x00002b43ed255000)
        libnet.so.1 => /usr/lib/libnet.so.1 (0x00002b43ed380000)
        libc.so.6 => /lib/libc.so.6 (0x00002b43ed499000)
        libuuid.so.1 => /lib/libuuid.so.1 (0x00002b43ed6d6000)
        librt.so.1 => /lib/librt.so.1 (0x00002b43ed7da000)
        libdl.so.2 => /lib/libdl.so.2 (0x00002b43ed8e3000)
        /lib64/ld-linux-x86-64.so.2 (0x00002b43ed029000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00002b43ed9e6000)

Thanks for any help
Philipp Kolmann

-- 
The more I learn about people, the more I like my dog!