On Tue 12/18/2007 12:52 AM, Andrew Beekhof said:

>On Dec 17, 2007, at 11:28 PM, Scott Mann wrote:
>
>> On Mon 12/17/2007 1:36 AM, Andrew Beekhof said:
>>
>>> On Dec 14, 2007, at 6:31 PM, Scott Mann wrote:
>>>
>>>>
>>>> On Fri 12/14/2007 1:04 AM, Andrew Beekhof said:
>>>>
>>>>> On Dec 14, 2007, at 12:12 AM, Scott Mann wrote:
>>>>>
>>>>>>
>>>>>> On Thu 12/13/2007 3:09 PM, Andrew Beekhof said:
>>>>>>
>>>>>>> On Dec 13, 2007, at 8:11 PM, Scott Mann wrote:
>>>>>>
>>>>>>>>>>> I'm seeing about a 2.5minute delay between the time that heartbeat
>>>>>>>>>>> starts and the time that the IP address comes up on eth0:0 (if it
>>>>>>>>>>> were 5minutes, I'd at least have a clue).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> i depends on your configured deadtime IIRC.
>>>>>>>>> what does ha.cf look like?
>>>>>>>>
>>>>>>>> Here's my ha.cf:
>>>>>>>>
>>>>>>>> logfacility local0
>>>>>>>> keepalive 2
>>>>>>>> deadtime 30
>>>>>>>> warntime 10
>>>>>>>> initdead 120
>>>>>>>
>>>>>>> 120 - that's 2 of your 2.5 minutes right there
>>>>>>
>>>>>> Ah, interesting. So, in v2 (due to autojoin, perhaps?), initdead causes a
>>>>>> delay in startup, whereas in v1 mode it doesn't. Very good to know.
>>>>
>>>>> should do in both i'd have thought...
>>>>
>>>>> when are you measuring from?
>>>>
>>>> OK. More details.
>>>>
>>>> First, in both cases I am running 2.1.2-24.1.
>>>>
>>>> In the case of v1 mode, my ha.cf file looks identical to the one I sent,
>>>> except for the fact that I specify the two nodes (no autojoin) AND crm is off.
>>>> The haresources file has one line with the "preferred" node and the IP
>>>> address to manage.
>>>>
>>>> Heartbeat is started with the init script (/etc/init.d/heartbeat) and then
>>>> another init script is run that starts my API application. In v1 mode, I
>>>> can start my API application as soon as the init script completes and everything
>>>> works as expected.
>>>>
>>>> In v2 mode, I cannot start the API app as soon as the heartbeat init script completes
>>>> because I get a "Cannot signon" message because my app cannot connect to heartbeat.
>>>> Only after the election completes and the resource is "started" am I able to connect
>>>> to heartbeat via the API, which as you pointed out is delayed by initdead.
>>>
>>> That's really strange.
>>> In order for the election to take place a number of components have to
>>> be signed into heartbeat... so I have no idea why your app cant.
>>> Especially since nothing the CRM does (having elections or starting
>>> resources) should influence your ability to sign in.
>>>
>>> Unless the resource is an IP and you're using it to connect to the
>>> cluster in some way?
>>
>> The resource is an IP, but I'm using signon ((hb->llc_ops->signon).
>> The heartbeat API I wrote doesn't really depend on the IP resource,
>> it just wants to monitor it
>
>in v2 mode you can't monitor the resource using the HA API... only via
>the CIB.

Yes, right. I figured that out when my HA API call for resources failed the
first time ;-) The CIB API appears to be in /usr/include/heartbeat/crm/cib.h,
correct? It appears that I can get notified of resource status changes via a
CIB signon. I'm beginning to work on that part now.

>
>> (and a few other things) and pass messages
>> back and forth. But it is the signon that fails until everything is up
>> and running. It's not just my api, by the way, no other client can signon
>> either (e.g., cl_status).
>>
>>>
>>>>
>>>>
>>>> I am concluding that in v1 mode, since the nodes are known, there's no need to
>>>> delay initdead time. Whereas
>>>> in v2 mode with autojoin any, the initdead wait time is consumed because
>>>> there may be another node joining. Is that right?
>>>
>>> Its possible. I don't know how that code works.
>>> Have you tried v2 without autojoin?
>>>
>>
>> Yes. That's what I tried to explain above. In v1 mode with specific
>> nodes, I can signon right away. In v2 with autojoin, it takes 2.5 minutes.
>
>Its still not clear to me that you've tried the third option.... "v2
>mode with specific nodes"

Ah, sorry. Obvious oversight on my part. In v2, with the two nodes specified
in ha.cf:

node hostA
node hostB

and autojoin commented out, things are a bit different. In that case, I can
connect via the API as quickly as with v1; however, the resource is not
available for ~2.5 minutes. Any clues where I should look for that?

Thanks, again.

Scott Mann
Sr Software Engineer
Aztek Networks

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems