Hi,

On Thu, Feb 21, 2008 at 06:40:57PM +0100, Zoltan Boszormenyi wrote:
> Zoltan Boszormenyi wrote:
>> Hi,
>>
>> we have a problem with automatic IPaddr failback on a system.
>> There are two nodes, IPaddr is preferred running on the "master" node.
>> Static score for that is 20. Resource stickiness for IPaddr is 40.
>> Pingd is set up the same way the documentation mentions, ha.cf has this:
>>
>> respawn root /usr/lib64/heartbeat/pingd -m 100 -d 5s
>>
>> Also, the node that loses the network connection to the ping node
>> gives up its IPaddr, again from the docs:
>>
>>         <rule id="virt_ip_connected" score_attribute="pingd">
>>           <expression id="virt_ip_connected_defined" attribute="pingd" operation="defined"/>
>>         </rule>
>>         <rule id="virt_ip_unconnected" score="-INFINITY" boolean_op="or">
>>           <expression id="virt_ip_unconnected_undefined" attribute="pingd" operation="not_defined"/>
>>           <expression id="virt_ip_unconnected_zero" attribute="pingd" operation="lte" value="0"/>
>>         </rule>
>>
>> This _should_ mean the following scoring matrix and transition flow:
>>
>>                            master                        slave
>>                            static  stickiness  pingd     static  stickiness  pingd
>> IPaddr not running         20      0           100       0       0           100
>>
>> decision is to run IPaddr on master
>>
>> IPaddr running on master   20      40          100       0       0           100
>>
>> master loses connection
>>
>> IPaddr running on master   20      40          0         0       0           100
>>
>> IPaddr migrated to slave
>>
>> IPaddr running on slave    20      0           0         0       40          100
>>
>> master restores connection
>>
>> IPaddr running on slave    20      0           100       0       40          100
>>
>> So, at this point, master has 120 points, slave has 140 points.
>> So, it should stay on the slave. But it doesn't stay, it's migrated
>> back to master. With trial-and-error, I raised resource_stickiness
>> to 200 and now it's staying on the slave.
>
> This question still stands. Why doesn't it work with 
> resource_stickiness=40?
> Is my theory wrong? Does the scoring system work differently?
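
For reference, the additive model in your matrix can be written down
in a few lines of Python. This is only a sketch of the theory as you
describe it, not of what the policy engine actually does: the names
node_score and preferred are made up, and it assumes that stickiness
counts only on the node currently running the resource and that the
-INFINITY rule bans a node whose pingd is undefined or <= 0.

```python
import math

NEG_INF = -math.inf  # stands in for the CIB's -INFINITY


def node_score(static, stickiness, pingd, running_here):
    """Naive additive score for one node, as assumed in the quoted matrix."""
    # Rule virt_ip_unconnected: pingd undefined or <= 0 bans the node.
    if pingd is None or pingd <= 0:
        return NEG_INF
    # Assumption: stickiness only counts where the resource is running.
    return static + (stickiness if running_here else 0) + pingd


def preferred(master, slave):
    """Pick the node with the higher score (ties go to master, arbitrarily)."""
    return "master" if master >= slave else "slave"


if __name__ == "__main__":
    # Master has regained connectivity; IPaddr is running on the slave.
    for stickiness in (40, 200):
        master = node_score(20, stickiness, 100, running_here=False)
        slave = node_score(0, stickiness, 100, running_here=True)
        print(stickiness, master, slave, preferred(master, slave))
```

Under this model a stickiness of 40 already gives 120 against 140, so
if the resource still moves back, something other than these three
terms is contributing to the score.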

There's a script somebody posted on the list a few times which
calculates scores from the pe input files (the transition
graphs). Just found it here:

http://hg.clusterlabs.org/pacemaker/dev/raw-file/tip/contrib/showscores.sh

The pe inputs are in /var/lib/heartbeat/pengine, so you can
watch how the scores change between transitions. In particular,
there's a gotcha with groups: in order for a group to move, you'd
need to add scores for all resources in the group. Beyond that,
I'm not an expert on scores, so I can't give you more specific
advice.
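
To follow the inputs from one transition to the next, a small helper
along these lines could list them in the order they were written. It
is only a sketch: the function name list_pe_inputs is made up, and it
assumes nothing about the file names, only that the files live in
that directory.

```python
from pathlib import Path


def list_pe_inputs(directory="/var/lib/heartbeat/pengine"):
    """Return the regular files in the directory, oldest first."""
    d = Path(directory)
    if not d.is_dir():
        # Nothing to list, e.g. when run on a machine without heartbeat.
        return []
    files = [p for p in d.iterdir() if p.is_file()]
    # Sort by modification time so consecutive transitions are adjacent.
    return sorted(files, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    for path in list_pe_inputs():
        print(path.name)
```

Feeding two neighbouring files to the score script should then show
which term changed at the point where the resource moved.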

Thanks,

Dejan


>> But unfortunately only
>> on my testing setup. On the real machines IPaddr is migrated back
>> to the slave at both resource_stickiness values.
>
> This detail above was solved. On the production system IPaddr
> was migrated forcibly to the master once and the constraint that
> was automatically created by the migration wasn't deleted yet.
> Sorry for the noise.
>
>> The machines are running SLES10 SP1, heartbeat package is 2.1.3-0.6
>> coming from SuSE/Novell. It's a preview package from SLES10 SP2.
>>
>> Can someone explain it to me?
>>
>> Best regards,
>
> -- 
> ----------------------------------
> Zoltán Böszörményi
> Cybertec Schönig & Schönig GmbH
> http://www.postgresql.at/
>
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
