Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-08 Thread Rohit Saini
Hi Ondrej,
Yes, you are right. This issue was specific to the floating IPs, not to the
local IPs.

After becoming master, I was sending a "Neighbor Advertisement" (NA) message for
my floating IPs. This was a raw message crafted by me, so I was the one setting
the flags in it.
Please find attached "image1", which shows the format of the NA message.
Attached "image2" is a packet capture; as you can see, both the "Override" and
"Solicited" flags were set. As part of the fix, now only "Override" is set.
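For reference, here is a minimal sketch of how such an unsolicited NA (Override
set, Solicited clear) could be crafted with Python/scapy. This is only an
illustration, not the exact code we used; the interface name and MAC address
below are hypothetical, only the floating IP is taken from this setup:

    from scapy.all import (Ether, IPv6, ICMPv6ND_NA,
                           ICMPv6NDOptDstLLAddr, sendp)

    IFACE = "eth0"                       # hypothetical interface
    NEW_MAC = "52:54:00:aa:bb:cc"        # hypothetical MAC of the new master
    FLOATING_IP = "fd00:1061:37:9021::"  # floating IP from this setup

    # Unsolicited NA to all-nodes: R=0, S=0 (not a reply to an NS), O=1 so
    # receivers overwrite any cached link-layer address for the target.
    na = (Ether(src=NEW_MAC, dst="33:33:00:00:00:01") /
          IPv6(src=FLOATING_IP, dst="ff02::1") /
          ICMPv6ND_NA(tgt=FLOATING_IP, R=0, S=0, O=1) /
          ICMPv6NDOptDstLLAddr(lladdr=NEW_MAC))
    sendp(na, iface=IFACE)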

I hope this answers your questions. Please let me know if you have any further queries.

Thanks,
Rohit

On Mon, Apr 8, 2019 at 6:13 PM Ondrej  wrote:

> On 4/5/19 8:18 PM, Rohit Saini wrote:
> > *Further update on this:*
> > This issue is resolved now. The iLO was discarding the "Neighbor Advertisement"
> > (NA) because the Solicited flag was set in the NA message; hence it was not
> > updating its local neighbor table.
> > As per RFC 4861, the Solicited flag should be set in an NA message only when
> > it is a response to a Neighbor Solicitation.
> > After clearing the Solicited flag in the NA message, the iLO started updating
> > the local neighbor cache.
>
> Hi Rohit,
>
> Sounds great that after the change you get consistent behaviour. As I have
> not worked with IPv6 for quite some time, I wonder how you disabled the
> 'Solicited' flag. Was this done on the OS (cluster node) or on the iLO? My
> guess is the OS, but I have no idea how that can be accomplished.
> Can you share which setting you changed to accomplish this? :)
>
> One additional note: the observation here is that you are using the
> "floating IP" that relocates to the other machine, while the cluster
> configuration does not seem to contain any IPaddr2 resource representing
> this address. I would guess that a cluster without the floating address
> would not have this issue, as it would use the addresses assigned to the
> nodes, and therefore the mapping between IP address and MAC address would
> not change even when the fence_ilo4 resources move between nodes. If the
> intention is to use the floating address in this cluster, I would suggest
> checking whether the issue also disappears when the floating address is
> not used or is disabled, to see how fence_ilo4 communicates. I think there
> might be a way in the routing tables to set which IPv6 address should be
> used to communicate with the iLO IPv6 address, so you get consistent
> behaviour instead of using the floating IP address.
>
> Anyway, I'm glad the mystery is resolved.
>
> --
> Ondrej
>
> >
> > On Fri, Apr 5, 2019 at 2:23 PM Rohit Saini <rohitsaini111.fo...@gmail.com>
> > wrote:
> >
> > Hi Ondrej,
> > Finally found some lead on this. We started tcpdump on my machine
> > to understand the IPMI traffic. Attaching the capture for your
> > reference.
> > fd00:1061:37:9021:: is my floating IP and fd00:1061:37:9002:: is my
> > iLO IP.
> > When resource movement happens, we initiate the "Neighbor
> > Advertisement" for fd00:1061:37:9021:: (which is now on the new machine)
> > so that peers can update their neighbor tables and start
> > communicating with the new MAC address.
> > It looks like the iLO is not updating its neighbor table, as it is still
> > responding to the older MAC.
> > After some time, a "Neighbor Solicitation" happens and the iLO updates the
> > neighbor table. Now the iLO becomes reachable and starts responding
> > to the new MAC address.
> >
> > My iLO firmware is 2.60. We will retry after upgrading the firmware.
> >
> > To verify this theory, after resource movement I flushed the local
> > neighbor table, due to which the "Neighbor Solicitation" was initiated
> > early and the delay in getting the iLO response was not seen.
> > This fixed the issue.
> >
> > We are now more interested in understanding why the iLO could not update
> > its neighbor table on receiving the "Neighbor Advertisement". FYI, the
> > Override flag in the "Neighbor Advertisement" is already set.
> >
> > Thanks,
> > Rohit
> >
> > On Thu, Apr 4, 2019 at 8:37 AM Ondrej <ondrej-clusterl...@famera.cz> wrote:
> >
> > On 4/3/19 6:10 PM, Rohit Saini wrote:
> >  > Hi Ondrej,
> >  > Please find my reply below:
> >  >
> >  > 1.
> >  > *Stonith configuration:*
> >  > [root@orana ~]# pcs config
> >  >   Resource: fence-uc-orana (class=stonith type=fence_ilo4)
> >  >Attributes: delay=0 ipaddr=fd00:1061:37:9002:: lanplus=1
> > login=xyz
> >  > passwd=xyz pcmk_host_list=orana pcmk_reboot_action=off
> >  >Meta Attrs: failure-timeout=3s
> >  >Operations: monitor interval=5s on-fail=ignore
> >  > (fence-uc-orana-monitor-interval-5s)
> >  >start interval=0s on-fail=restart
> >  > (fence-uc-orana-start-interval-0s)
> >  >   Resource: fence-uc-tigana (class=stonith type=fence_ilo4)
> >  >

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-08 Thread Ondrej

On 4/5/19 8:18 PM, Rohit Saini wrote:

*Further update on this:*
This issue is resolved now. The iLO was discarding the "Neighbor Advertisement"
(NA) because the Solicited flag was set in the NA message; hence it was not
updating its local neighbor table.
As per RFC 4861, the Solicited flag should be set in an NA message only when it
is a response to a Neighbor Solicitation.
After clearing the Solicited flag in the NA message, the iLO started updating
the local neighbor cache.


Hi Rohit,

Sounds great that after the change you get consistent behaviour. As I have
not worked with IPv6 for quite some time, I wonder how you disabled the
'Solicited' flag. Was this done on the OS (cluster node) or on the iLO? My
guess is the OS, but I have no idea how that can be accomplished.

Can you share which setting you changed to accomplish this? :)

One additional note: the observation here is that you are using the
"floating IP" that relocates to the other machine, while the cluster
configuration does not seem to contain any IPaddr2 resource representing
this address. I would guess that a cluster without the floating address
would not have this issue, as it would use the addresses assigned to the
nodes, and therefore the mapping between IP address and MAC address would
not change even when the fence_ilo4 resources move between nodes. If the
intention is to use the floating address in this cluster, I would suggest
checking whether the issue also disappears when the floating address is
not used or is disabled, to see how fence_ilo4 communicates. I think there
might be a way in the routing tables to set which IPv6 address should be
used to communicate with the iLO IPv6 address, so you get consistent
behaviour instead of using the floating IP address.
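For example, pinning the preferred source address with a host route towards the
iLO should give that consistent behaviour (the CLI equivalent would be roughly
'ip -6 route add fd00:1061:37:9002::/128 dev eth0 src <static-node-address>').
A rough sketch assuming the pyroute2 library; the interface name and the node's
static address below are hypothetical, only the iLO address comes from your
configuration:

    import socket
    from pyroute2 import IPRoute

    ILO_ADDR = "fd00:1061:37:9002::"       # iLO address from your configuration
    NODE_STATIC = "fd00:1061:37:9002::10"  # hypothetical static node address
    IFACE = "eth0"                         # hypothetical interface

    ipr = IPRoute()
    idx = ipr.link_lookup(ifname=IFACE)[0]
    # Host route to the iLO with a pinned preferred source address, so that
    # packets to the iLO never originate from the floating IP.
    ipr.route("add",
              family=socket.AF_INET6,
              dst=ILO_ADDR + "/128",
              oif=idx,
              prefsrc=NODE_STATIC)
    ipr.close()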


Anyway, I'm glad the mystery is resolved.

--
Ondrej



On Fri, Apr 5, 2019 at 2:23 PM Rohit Saini <rohitsaini111.fo...@gmail.com>
wrote:


Hi Ondrej,
Finally found some lead on this. We started tcpdump on my machine
to understand the IPMI traffic. Attaching the capture for your
reference.
fd00:1061:37:9021:: is my floating IP and fd00:1061:37:9002:: is my
iLO IP.
When resource movement happens, we initiate the "Neighbor
Advertisement" for fd00:1061:37:9021:: (which is now on the new machine)
so that peers can update their neighbor tables and start
communicating with the new MAC address.
It looks like the iLO is not updating its neighbor table, as it is still
responding to the older MAC.
After some time, a "Neighbor Solicitation" happens and the iLO updates the
neighbor table. Now the iLO becomes reachable and starts responding
to the new MAC address.

My iLO firmware is 2.60. We will retry after upgrading the firmware.

To verify this theory, after resource movement I flushed the local
neighbor table, due to which the "Neighbor Solicitation" was initiated
early and the delay in getting the iLO response was not seen.
This fixed the issue.

We are now more interested in understanding why the iLO could not update
its neighbor table on receiving the "Neighbor Advertisement". FYI, the
Override flag in the "Neighbor Advertisement" is already set.

Thanks,
Rohit

On Thu, Apr 4, 2019 at 8:37 AM Ondrej <ondrej-clusterl...@famera.cz> wrote:

On 4/3/19 6:10 PM, Rohit Saini wrote:
 > Hi Ondrej,
 > Please find my reply below:
 >
 > 1.
 > *Stonith configuration:*
 > [root@orana ~]# pcs config
 >   Resource: fence-uc-orana (class=stonith type=fence_ilo4)
 >    Attributes: delay=0 ipaddr=fd00:1061:37:9002:: lanplus=1 login=xyz passwd=xyz pcmk_host_list=orana pcmk_reboot_action=off
 >    Meta Attrs: failure-timeout=3s
 >    Operations: monitor interval=5s on-fail=ignore (fence-uc-orana-monitor-interval-5s)
 >                start interval=0s on-fail=restart (fence-uc-orana-start-interval-0s)
 >   Resource: fence-uc-tigana (class=stonith type=fence_ilo4)
 >    Attributes: delay=10 ipaddr=fd00:1061:37:9001:: lanplus=1 login=xyz passwd=xyz pcmk_host_list=tigana pcmk_reboot_action=off
 >    Meta Attrs: failure-timeout=3s
 >    Operations: monitor interval=5s on-fail=ignore (fence-uc-tigana-monitor-interval-5s)
 >                start interval=0s on-fail=restart (fence-uc-tigana-start-interval-0s)
 >
 > Fencing Levels:
 >
 > Location Constraints:
 > Ordering Constraints:
 >    start fence-uc-orana then promote unicloud-master (kind:Mandatory)
 >    start fence-uc-tigana then promote unicloud-master (kind:Mandatory)
 > Colocation Constraints:
 >    fence-uc-orana with unicloud-master (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
 >    fence-uc-tigana with unicloud-master 

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-05 Thread Rohit Saini
*Further update on this:*
This issue is resolved now. The iLO was discarding the "Neighbor Advertisement"
(NA) because the Solicited flag was set in the NA message; hence it was not
updating its local neighbor table.
As per RFC 4861, the Solicited flag should be set in an NA message only when it
is a response to a Neighbor Solicitation.
After clearing the Solicited flag in the NA message, the iLO started updating
the local neighbor cache.

On Fri, Apr 5, 2019 at 2:23 PM Rohit Saini 
wrote:

> Hi Ondrej,
> Finally found some lead on this. We started tcpdump on my machine to
> understand the IPMI traffic. Attaching the capture for your reference.
> fd00:1061:37:9021:: is my floating IP and fd00:1061:37:9002:: is my iLO IP.
> When resource movement happens, we initiate the "Neighbor Advertisement" for
> fd00:1061:37:9021:: (which is now on the new machine) so that peers can update
> their neighbor tables and start communicating with the new MAC address.
> It looks like the iLO is not updating its neighbor table, as it is still
> responding to the older MAC.
> After some time, a "Neighbor Solicitation" happens and the iLO updates the
> neighbor table. Now the iLO becomes reachable and starts responding to the
> new MAC address.
>
> My iLO firmware is 2.60. We will retry after upgrading the firmware.
>
> To verify this theory, after resource movement I flushed the local
> neighbor table, due to which the "Neighbor Solicitation" was initiated early
> and the delay in getting the iLO response was not seen.
> This fixed the issue.
>
> We are now more interested in understanding why the iLO could not update its
> neighbor table on receiving the "Neighbor Advertisement". FYI, the Override
> flag in the "Neighbor Advertisement" is already set.
>
> Thanks,
> Rohit
>
> On Thu, Apr 4, 2019 at 8:37 AM Ondrej 
> wrote:
>
>> On 4/3/19 6:10 PM, Rohit Saini wrote:
>> > Hi Ondrej,
>> > Please find my reply below:
>> >
>> > 1.
>> > *Stonith configuration:*
>> > [root@orana ~]# pcs config
>> >   Resource: fence-uc-orana (class=stonith type=fence_ilo4)
>> >Attributes: delay=0 ipaddr=fd00:1061:37:9002:: lanplus=1 login=xyz
>> > passwd=xyz pcmk_host_list=orana pcmk_reboot_action=off
>> >Meta Attrs: failure-timeout=3s
>> >Operations: monitor interval=5s on-fail=ignore
>> > (fence-uc-orana-monitor-interval-5s)
>> >start interval=0s on-fail=restart
>> > (fence-uc-orana-start-interval-0s)
>> >   Resource: fence-uc-tigana (class=stonith type=fence_ilo4)
>> >Attributes: delay=10 ipaddr=fd00:1061:37:9001:: lanplus=1 login=xyz
>> > passwd=xyz pcmk_host_list=tigana pcmk_reboot_action=off
>> >Meta Attrs: failure-timeout=3s
>> >Operations: monitor interval=5s on-fail=ignore
>> > (fence-uc-tigana-monitor-interval-5s)
>> >start interval=0s on-fail=restart
>> > (fence-uc-tigana-start-interval-0s)
>> >
>> > Fencing Levels:
>> >
>> > Location Constraints:
>> > Ordering Constraints:
>> >start fence-uc-orana then promote unicloud-master (kind:Mandatory)
>> >start fence-uc-tigana then promote unicloud-master (kind:Mandatory)
>> > Colocation Constraints:
>> >fence-uc-orana with unicloud-master (score:INFINITY)
>> > (rsc-role:Started) (with-rsc-role:Master)
>> >fence-uc-tigana with unicloud-master (score:INFINITY)
>> > (rsc-role:Started) (with-rsc-role:Master)
>> >
>> >
>> > 2. This is seen randomly. Since I am using colocation, the stonith resources
>> > are stopped and started on the new master. At that time, starting the
>> > stonith resources takes a variable amount of time.
>> > No other IPv6 issues are seen in the cluster nodes.
>> >
>> > 3. fence_agent version
>> >
>> > [root@orana ~]#  rpm -qa|grep  fence-agents-ipmilan
>> > fence-agents-ipmilan-4.0.11-66.el7.x86_64
>> >
>> >
>> > *NOTE:*
>> > Both IPv4 and IPv6 are configured on my ILO, with "iLO Client
>> > Applications use IPv6 first" turned on.
>> > Attaching corosync logs also.
>> >
>> > Thanks, increasing the timeout to 60 worked, but that's not exactly what I am
>> > looking for. I need to know the exact reason behind the delay in starting
>> > these IPv6 stonith resources.
>> >
>> > Regards,
>> > Rohit
>>
>> Hi Rohit,
>>
>> Thank you for the response.
>>
>> From the configuration it is clear that we are using IP addresses directly,
>> so the DNS resolution issue can be ruled out. There are no messages from
>> fence_ilo4 that would indicate the reason why it timed out, so we cannot
>> tell yet what caused the issue. I see that you have most probably enabled
>> PCMK_debug=stonith-ng (or PCMK_debug=yes).
>>
>> It is nice that the increased timeout worked, but as said in the previous
>> email, it may just mask the real reason why the monitor/start operation
>> takes longer.
>>
>>  > Both IPv4 and IPv6 are configured on my ILO, with "iLO Client
>>  > Applications use IPv6 first" turned on.
>> This seems to me to be more related to SNMP communication, which we don't
>> use with fence_ilo4 as far as I know. We use ipmitool on port 623/udp.
>>
>> 

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-05 Thread Rohit Saini
Hi Ondrej,
Finally found some lead on this. We started tcpdump on my machine to
understand the IPMI traffic. Attaching the capture for your reference.
fd00:1061:37:9021:: is my floating IP and fd00:1061:37:9002:: is my iLO IP.
When resource movement happens, we initiate the "Neighbor Advertisement" for
fd00:1061:37:9021:: (which is now on the new machine) so that peers can update
their neighbor tables and start communicating with the new MAC address.
It looks like the iLO is not updating its neighbor table, as it is still
responding to the older MAC.
After some time, a "Neighbor Solicitation" happens and the iLO updates the
neighbor table. Now the iLO becomes reachable and starts responding to the
new MAC address.

My iLO firmware is 2.60. We will retry after upgrading the firmware.

To verify this theory, after resource movement I flushed the local
neighbor table, due to which the "Neighbor Solicitation" was initiated early
and the delay in getting the iLO response was not seen.
This fixed the issue.

We are now more interested in understanding why the iLO could not update its
neighbor table on receiving the "Neighbor Advertisement". FYI, the Override
flag in the "Neighbor Advertisement" is already set.

Thanks,
Rohit

On Thu, Apr 4, 2019 at 8:37 AM Ondrej  wrote:

> On 4/3/19 6:10 PM, Rohit Saini wrote:
> > Hi Ondrej,
> > Please find my reply below:
> >
> > 1.
> > *Stonith configuration:*
> > [root@orana ~]# pcs config
> >   Resource: fence-uc-orana (class=stonith type=fence_ilo4)
> >Attributes: delay=0 ipaddr=fd00:1061:37:9002:: lanplus=1 login=xyz
> > passwd=xyz pcmk_host_list=orana pcmk_reboot_action=off
> >Meta Attrs: failure-timeout=3s
> >Operations: monitor interval=5s on-fail=ignore
> > (fence-uc-orana-monitor-interval-5s)
> >start interval=0s on-fail=restart
> > (fence-uc-orana-start-interval-0s)
> >   Resource: fence-uc-tigana (class=stonith type=fence_ilo4)
> >Attributes: delay=10 ipaddr=fd00:1061:37:9001:: lanplus=1 login=xyz
> > passwd=xyz pcmk_host_list=tigana pcmk_reboot_action=off
> >Meta Attrs: failure-timeout=3s
> >Operations: monitor interval=5s on-fail=ignore
> > (fence-uc-tigana-monitor-interval-5s)
> >start interval=0s on-fail=restart
> > (fence-uc-tigana-start-interval-0s)
> >
> > Fencing Levels:
> >
> > Location Constraints:
> > Ordering Constraints:
> >start fence-uc-orana then promote unicloud-master (kind:Mandatory)
> >start fence-uc-tigana then promote unicloud-master (kind:Mandatory)
> > Colocation Constraints:
> >fence-uc-orana with unicloud-master (score:INFINITY)
> > (rsc-role:Started) (with-rsc-role:Master)
> >fence-uc-tigana with unicloud-master (score:INFINITY)
> > (rsc-role:Started) (with-rsc-role:Master)
> >
> >
> > 2. This is seen randomly. Since I am using colocation, the stonith resources
> > are stopped and started on the new master. At that time, starting the
> > stonith resources takes a variable amount of time.
> > No other IPv6 issues are seen in the cluster nodes.
> >
> > 3. fence_agent version
> >
> > [root@orana ~]#  rpm -qa|grep  fence-agents-ipmilan
> > fence-agents-ipmilan-4.0.11-66.el7.x86_64
> >
> >
> > *NOTE:*
> > Both IPv4 and IPv6 are configured on my ILO, with "iLO Client
> > Applications use IPv6 first" turned on.
> > Attaching corosync logs also.
> >
> > Thanks, increasing the timeout to 60 worked, but that's not exactly what I am
> > looking for. I need to know the exact reason behind the delay in starting
> > these IPv6 stonith resources.
> >
> > Regards,
> > Rohit
>
> Hi Rohit,
>
> Thank you for the response.
>
> From the configuration it is clear that we are using IP addresses directly,
> so the DNS resolution issue can be ruled out. There are no messages from
> fence_ilo4 that would indicate the reason why it timed out, so we cannot
> tell yet what caused the issue. I see that you have most probably enabled
> PCMK_debug=stonith-ng (or PCMK_debug=yes).
>
> It is nice that the increased timeout worked, but as said in the previous
> email, it may just mask the real reason why the monitor/start operation
> takes longer.
>
>  > Both IPv4 and IPv6 are configured on my ILO, with "iLO Client
>  > Applications use IPv6 first" turned on.
> This seems to me to be more related to SNMP communication, which we don't
> use with fence_ilo4 as far as I know. We use ipmitool on port 623/udp.
>
> https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-a00026111en_us=en_US#N104B2
>
>  > 2. This is seen randomly. Since I am using colocation, the stonith resources
>  > are stopped and started on the new master. At that time, starting the
>  > stonith resources takes a variable amount of time.
> This is a good observation, which leads me to question whether the iLO has
> any kind of session limit set for the user that is used here. If there is a
> session limit, it may be worth trying to increase it and testing whether the
> same delay can still be observed. One situation where this can happen is
> when one node is communicating with the iLO and, during that time, the
> communication from 

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-03 Thread Ondrej

On 4/3/19 6:10 PM, Rohit Saini wrote:

Hi Ondrej,
Please find my reply below:

1.
*Stonith configuration:*
[root@orana ~]# pcs config
  Resource: fence-uc-orana (class=stonith type=fence_ilo4)
   Attributes: delay=0 ipaddr=fd00:1061:37:9002:: lanplus=1 login=xyz passwd=xyz pcmk_host_list=orana pcmk_reboot_action=off
   Meta Attrs: failure-timeout=3s
   Operations: monitor interval=5s on-fail=ignore (fence-uc-orana-monitor-interval-5s)
               start interval=0s on-fail=restart (fence-uc-orana-start-interval-0s)
  Resource: fence-uc-tigana (class=stonith type=fence_ilo4)
   Attributes: delay=10 ipaddr=fd00:1061:37:9001:: lanplus=1 login=xyz passwd=xyz pcmk_host_list=tigana pcmk_reboot_action=off
   Meta Attrs: failure-timeout=3s
   Operations: monitor interval=5s on-fail=ignore (fence-uc-tigana-monitor-interval-5s)
               start interval=0s on-fail=restart (fence-uc-tigana-start-interval-0s)

Fencing Levels:

Location Constraints:
Ordering Constraints:
   start fence-uc-orana then promote unicloud-master (kind:Mandatory)
   start fence-uc-tigana then promote unicloud-master (kind:Mandatory)
Colocation Constraints:
   fence-uc-orana with unicloud-master (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
   fence-uc-tigana with unicloud-master (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)



2. This is seen randomly. Since I am using colocation, the stonith resources
are stopped and started on the new master. At that time, starting the stonith
resources takes a variable amount of time.

No other IPv6 issues are seen in the cluster nodes.

3. fence_agent version

[root@orana ~]#  rpm -qa|grep  fence-agents-ipmilan
fence-agents-ipmilan-4.0.11-66.el7.x86_64


*NOTE:*
Both IPv4 and IPv6 are configured on my ILO, with "iLO Client 
Applications use IPv6 first" turned on.

Attaching corosync logs also.

Thanks, increasing the timeout to 60 worked, but that's not exactly what I am
looking for. I need to know the exact reason behind the delay in starting
these IPv6 stonith resources.


Regards,
Rohit


Hi Rohit,

Thank you for the response.

From the configuration it is clear that we are using IP addresses directly,
so the DNS resolution issue can be ruled out. There are no messages from
fence_ilo4 that would indicate the reason why it timed out, so we cannot tell
yet what caused the issue. I see that you have most probably enabled
PCMK_debug=stonith-ng (or PCMK_debug=yes).


It is nice that the increased timeout worked, but as said in the previous
email, it may just mask the real reason why the monitor/start operation takes
longer.


> Both IPv4 and IPv6 are configured on my ILO, with "iLO Client
> Applications use IPv6 first" turned on.
This seems to me to be more related to SNMP communication, which we don't
use with fence_ilo4 as far as I know. We use ipmitool on port 623/udp.

https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-a00026111en_us=en_US#N104B2

> 2. This is seen randomly. Since I am using colocation, the stonith resources
> are stopped and started on the new master. At that time, starting the
> stonith resources takes a variable amount of time.
This is a good observation, which leads me to question whether the iLO has any
kind of session limit set for the user that is used here. If there is a session
limit, it may be worth trying to increase it and testing whether the same delay
can still be observed. One situation where this can happen is when one node is
communicating with the iLO and, during that time, communication from the other
node needs to happen while the limit is 1 connection. The relocation of the
resource from one node to another might fit this, but this is just speculation,
and the fastest way to prove or reject it would be to increase the limit, if
there is one, and test it.
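As a rough way to test that hypothesis, one could fire two status queries at
the iLO at (nearly) the same time, ideally one from each node, and compare how
long each takes. A minimal single-host sketch; the ipmitool invocation mirrors
the call fence_ilo4 itself makes (visible with verbose=1 below), and the
credentials are placeholders:

    import subprocess
    import time
    from concurrent.futures import ThreadPoolExecutor

    CMD = ["ipmitool", "-I", "lanplus", "-H", "fd00:1061:37:9002::",
           "-U", "xyz", "-P", "xyz", "chassis", "power", "status"]

    def timed_run(tag):
        start = time.time()
        proc = subprocess.run(CMD, capture_output=True, text=True)
        return tag, proc.returncode, time.time() - start

    # Run two queries concurrently; a large gap between the two durations
    # would hint at a per-user session limit on the iLO side.
    with ThreadPoolExecutor(max_workers=2) as pool:
        for tag, rc, elapsed in pool.map(timed_run, ["first", "second"]):
            print("%s: rc=%d elapsed=%.1fs" % (tag, rc, elapsed))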


# What more can be done to figure out what is causing the delay?

1. The fence_ilo4 agent can be configured with the attribute 'verbose=1' to
print additional information when it is run. The data looks similar to the
example below and seems to provide timestamps, which is great, as we should
be able to see when each command was run. I don't have a testing machine on
which to run fence_ilo4, so the example below just shows how it looks when it
fails with a connection timeout.


Apr 03 12:34:11 [4025] fastvm-centos-7-6-31 stonith-ng: notice:
stonith_action_async_done: Child process 4252 performing action
'monitor' timed out with signal 15
Apr 03 12:34:11 [4025] fastvm-centos-7-6-31 stonith-ng: warning:
log_action: fence_ilo4[4252] stderr: [ 2019-04-03 12:33:51,193 INFO:
Executing: /usr/bin/ipmitool -I lanplus -H fe80::f6bd:8a67:7eb5:214f -p
623 -U xyz -P [set] -L ADMINISTRATOR chassis power status ]
Apr 03 12:34:11 [4025] fastvm-centos-7-6-31 stonith-ng: warning:
log_action: fence_ilo4[4252] stderr: [ ]

# pcs stonith update fence-uc-orana verbose=1

Note: The above shows that some private data is included in the logs, so if
you have something there worth sharing, make sure to strip out the sensitive
data.
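On the same note, a quick way to measure how long the agent itself takes,
outside of pacemaker, is to run it by hand with the status action and time it.
A small sketch that assumes the agent's usual stdin key=value interface; the
credentials are placeholders and the parameter names mirror the stonith
resource attributes shown earlier in the thread:

    import subprocess
    import time

    # Placeholder credentials; ipaddr/login/passwd match the attribute names
    # used in the stonith resource configuration above.
    options = "\n".join([
        "action=status",
        "ipaddr=fd00:1061:37:9002::",
        "login=xyz",
        "passwd=xyz",
        "lanplus=1",
        "verbose=1",
    ]) + "\n"

    start = time.time()
    result = subprocess.run(["fence_ilo4"], input=options, text=True,
                            capture_output=True)
    print("exit code : %d" % result.returncode)
    print("elapsed   : %.1f s" % (time.time() - start))
    print(result.stderr)  # verbose output (ipmitool commands) goes to stderr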


2. The version of 

Re: [ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-04-01 Thread Rohit Saini
Looking for some help on this.

Thanks,
Rohit

On Thu, Mar 28, 2019 at 11:24 AM Rohit Saini 
wrote:

> Hi All,
> I am trying fence_ilo4 with the same iLO device having both an IPv4 and an
> IPv6 address. I see some discrepancy between the two behaviours:
>
> *1. When ILO has IPv4 address*
> This is working fine and stonith resources are started immediately.
>
> *2. When ILO has IPv6 address*
> Starting the stonith resources sometimes takes more than 20 seconds.
>
> *[root@tigana ~]# pcs status*
> Cluster name: ucc
> Stack: corosync
> Current DC: tigana (version 1.1.16-12.el7-94ff4df) - partition with quorum
> Last updated: Wed Mar 27 00:01:37 2019
> Last change: Wed Mar 27 00:01:19 2019 by root via cibadmin on orana
>
> 2 nodes configured
> 4 resources configured
>
> Online: [ orana tigana ]
>
> Full list of resources:
>
>  Master/Slave Set: unicloud-master [unicloud]
>  Masters: [ orana ]
>  Slaves: [ tigana ]
>  fence-uc-orana (stonith:fence_ilo4):   FAILED orana
>  fence-uc-tigana(stonith:fence_ilo4):   Started orana
>
> Failed Actions:
> * fence-uc-orana_start_0 on orana 'unknown error' (1): call=32,
> status=Timed Out, exitreason='none',
> last-rc-change='Wed Mar 27 00:01:17 2019', queued=0ms, exec=20006ms
> *<<<*
>
>
>
> *Queries:*
> 1. Why is this happening only for IPv6 iLO devices? Is this a known issue?
> 2. Can we increase the timeout period "exec=20006ms" to something else?
>
>
> Thanks,
> Rohit
>
>

[ClusterLabs] Issue in fence_ilo4 with IPv6 ILO IPs

2019-03-28 Thread Rohit Saini
Hi All,
I am trying fence_ilo4 with the same iLO device having both an IPv4 and an
IPv6 address. I see some discrepancy between the two behaviours:

*1. When ILO has IPv4 address*
This is working fine and stonith resources are started immediately.

*2. When ILO has IPv6 address*
Starting the stonith resources sometimes takes more than 20 seconds.

*[root@tigana ~]# pcs status*
Cluster name: ucc
Stack: corosync
Current DC: tigana (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Wed Mar 27 00:01:37 2019
Last change: Wed Mar 27 00:01:19 2019 by root via cibadmin on orana

2 nodes configured
4 resources configured

Online: [ orana tigana ]

Full list of resources:

 Master/Slave Set: unicloud-master [unicloud]
 Masters: [ orana ]
 Slaves: [ tigana ]
 fence-uc-orana (stonith:fence_ilo4):   FAILED orana
 fence-uc-tigana(stonith:fence_ilo4):   Started orana

Failed Actions:
* fence-uc-orana_start_0 on orana 'unknown error' (1): call=32,
status=Timed Out, exitreason='none',
last-rc-change='Wed Mar 27 00:01:17 2019', queued=0ms, exec=20006ms
*<<<*



*Queries:*
1. Why is this happening only for IPv6 iLO devices? Is this a known issue?
2. Can we increase the timeout period "exec=20006ms" to something else?


Thanks,
Rohit