Re: [ovirt-users] What recovers a VM from pause?
Please note that it is necessary to add the magic line '# VDSM PRIVATE' as the second line of /etc/multipath.conf. Otherwise vdsm will overwrite your settings. Thus, /etc/multipath.conf should start with the following two lines:

# VDSM REVISION 1.3
# VDSM PRIVATE

On Mon, 2016-05-30 at 22:09 +0300, Nir Soffer wrote:
> But you may modify multipath configuration on the host.
>
> We now use this multipath configuration (/etc/multipath.conf):
>
> # VDSM REVISION 1.3
>
> defaults {
>     polling_interval            5
>     no_path_retry               fail
>     user_friendly_names         no
>     flush_on_last_del           yes
>     fast_io_fail_tmo            5
>     dev_loss_tmo                30
>     max_fds                     4096
>     deferred_remove             yes
> }
>
> devices {
>     device {
>         all_devs                yes
>         no_path_retry           fail
>     }
> }
>
> This enforces failing of I/O requests on devices that by default would queue such requests for a long or unlimited time. Queuing requests is very bad for vdsm, and causes various commands to block for minutes during a storage outage, failing various flows in vdsm and the UI.
> See https://bugzilla.redhat.com/880738
>
> However, in your case, using queuing may be the best way to do the switch from one storage to another in the smoothest way.
>
> You may try this setting:
>
> devices {
>     device {
>         all_devs                yes
>         no_path_retry           30
>     }
> }
>
> This will queue I/O requests for 30 seconds before failing.
> Using this normally would be a bad idea with vdsm, since during a storage outage vdsm may block for 30 seconds when no path is available, and it is not designed for this behavior, but blocking from time to time for a short time should be OK.
>
> I think that modifying the configuration and reloading the multipathd service should be enough to use the new settings, but I'm not sure if this changes existing sessions or open devices.

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
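The two-line header described at the top of this message can be checked mechanically before editing the file. The following is only a minimal sketch: the function name is hypothetical, the revision string is taken verbatim from the message (other vdsm versions may use a different one), and the check mirrors the wording of the message, not vdsm's actual parsing logic.

```python
# Header lines quoted in the message above; the revision number is an
# assumption tied to that message and may differ on other vdsm versions.
REQUIRED_HEADER = ("# VDSM REVISION 1.3", "# VDSM PRIVATE")


def has_private_header(conf_text: str) -> bool:
    """Return True if the first two lines match the header from the message.

    Hypothetical helper for illustration only; it does not reproduce how
    vdsm itself decides whether to overwrite /etc/multipath.conf.
    """
    first_two = tuple(line.strip() for line in conf_text.splitlines()[:2])
    return first_two == REQUIRED_HEADER
```

To use it on a host, one would read /etc/multipath.conf into a string and pass it to has_private_header() before relying on local changes surviving.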
Re: [ovirt-users] What recovers a VM from pause?
> On 30 May 2016, at 21:21, Nicolas Ecarnot wrote:
>
> On 30/05/2016 21:09, Nir Soffer wrote...
> SOME VERY VALUABLE ANSWERS!
>
> Thank you very much Nir, as your answers will give me food for thought for the weeks to come.
>
> It's late here, I'll begin checking all this tomorrow, but just a note:
>
>> This enforces failing of I/O requests on devices that by default would queue such requests for a long or unlimited time. Queuing requests is very bad for vdsm, and causes various commands to block for minutes during a storage outage, failing various flows in vdsm and the UI.
>> See https://bugzilla.redhat.com/880738
>
> Though we have an active Red Hat customer subscription, I logged in and yet I cannot access the BZ above.
> I'm sure you can help :)

Hi Nicolas,
ugh, it seems there are procedural issues with that bug. But in a nutshell, the fix shipped with [1] (so in the 3.6 release), and the actual fixes are [2] and [3]. Alternatively, it's described in oVirt bug [4].

Thanks,
michal

[1] https://rhn.redhat.com/errata/RHBA-2016-0362.html
[2] https://gerrit.ovirt.org/#/c/44855/
[3] https://gerrit.ovirt.org/#/c/42189/
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1225162

> --
> Nicolas ECARNOT
Re: [ovirt-users] What recovers a VM from pause?
On 30/05/2016 21:09, Nir Soffer wrote...
SOME VERY VALUABLE ANSWERS!

Thank you very much Nir, as your answers will give me food for thought for the weeks to come.

It's late here, I'll begin checking all this tomorrow, but just a note:

> This enforces failing of I/O requests on devices that by default would queue such requests for a long or unlimited time. Queuing requests is very bad for vdsm, and causes various commands to block for minutes during a storage outage, failing various flows in vdsm and the UI.
> See https://bugzilla.redhat.com/880738

Though we have an active Red Hat customer subscription, I logged in and yet I cannot access the BZ above.
I'm sure you can help :)

--
Nicolas ECARNOT
Re: [ovirt-users] What recovers a VM from pause?
On Mon, May 30, 2016 at 4:07 PM, Nicolas Ecarnot wrote:
> Hello,
>
> We're planning a move from our old building to a new one a few meters away.
>
> Similarly to Martijn (https://www.mail-archive.com/users@ovirt.org/msg33182.html), I have maintenance planned on our storage side.
>
> Say an oVirt DC is using a SAN's LUN via iSCSI (Equallogic).
> This SAN allows me to set up block replication between two SANs, seen by oVirt as one (Dell calls it SyncRep).
> Then switch all the iSCSI accesses to the replicated LUN.
>
> When doing this, the iSCSI stack of each oVirt host notices the disconnection, tries to reconnect, and succeeds.
> Amongst our hosts, this takes between 4 and 15 seconds.
>
> When this happens fast enough, the oVirt engine and the VMs don't even notice, and they keep running happily.
>
> When this takes more than 4 seconds, there are 2 cases:
>
> 1 - The hosts and/or oVirt and/or the SPM (I actually don't know) notice that there is a storage failure, and pause the VMs.
> When the iSCSI stack reconnects, the VMs are automatically recovered from pause, and this all takes less than 30 seconds. That is very acceptable for us, as this action is extremely rare.
>
> 2 - Same storage failure, VMs paused, and some VMs stay in pause mode forever.
> A manual "run" action is mandatory.
> When done, everything recovers correctly.
> This is also quite acceptable, but here come my questions:
>
> My questions: (!)
> - *WHAT* process, piece of code, or part of oVirt is responsible for deciding when to UN-pause a VM, and under what conditions?

VMs get paused by qemu when you get ENOSPC or some other I/O error.

This probably happens when a VM is writing to storage and all paths to storage are faulty. With the current configuration, the SCSI layer will fail after 5 seconds, and if no path is available, the write will fail.

If the vdsm storage monitoring system detected the issue, the storage domain will become invalid. When the storage domain becomes valid again, we try to resume all VMs paused because of I/O errors.

Storage monitoring is done every 10 seconds in normal conditions, but in the current release there can be delays of up to a couple of minutes in extreme conditions, for example, with 50 storage domains and a lot of I/O. So basically, the storage domain monitor may miss an error on storage; the domain then never becomes invalid, so it never becomes valid again, and the VM will have to be resumed manually.
See https://bugzilla.redhat.com/1081962

In oVirt 4.0 monitoring should be improved, and will always monitor storage every 10 seconds, but even this cannot guarantee that we will detect all storage errors, for example, if the storage outage is shorter than 10 seconds. But I guess the chance that a storage outage is shorter than 10 seconds, yet long enough to cause a VM to pause, is very low.

> That would help me to understand why some cases are working even more smoothly than others.
> - Are there related timeouts I could play with in engine-config options?

Nothing on the engine side...

> - [a bit off-topic] Is it safe to increase some iSCSI timeouts or buffer sizes in the hope this kind of disconnection would go unnoticed?

But you may modify multipath configuration on the host.

We now use this multipath configuration (/etc/multipath.conf):

# VDSM REVISION 1.3

defaults {
    polling_interval            5
    no_path_retry               fail
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
    deferred_remove             yes
}

devices {
    device {
        all_devs                yes
        no_path_retry           fail
    }
}

This enforces failing of I/O requests on devices that by default would queue such requests for a long or unlimited time. Queuing requests is very bad for vdsm, and causes various commands to block for minutes during a storage outage, failing various flows in vdsm and the UI.
See https://bugzilla.redhat.com/880738

However, in your case, using queuing may be the best way to do the switch from one storage to another in the smoothest way.

You may try this setting:

devices {
    device {
        all_devs                yes
        no_path_retry           30
    }
}

This will queue I/O requests for 30 seconds before failing.
Using this normally would be a bad idea with vdsm, since during a storage outage vdsm may block for 30 seconds when no path is available, and it is not designed for this behavior, but blocking from time to time for a short time should be OK.

I think that modifying the configuration and reloading the multipathd service should be enough to use the new settings, but I'm not sure if this changes existing sessions or open devices.

Adding Ben to add more info about this.

Nir
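The monitoring behavior described in the message above (a paused VM is resumed automatically only if the domain monitor actually observed the outage) can be illustrated with a toy model. This is a rough sketch, not vdsm code: the function name and parameters are hypothetical, checks are assumed to run at fixed intervals, and a check that falls inside the outage is assumed to fail instantly.

```python
import math


def monitor_sees_outage(outage_start: float, outage_end: float,
                        interval: float = 10.0,
                        first_check: float = 0.0) -> bool:
    """Toy model: does a periodic check fall inside the outage window?

    Checks are assumed to happen at first_check + k * interval (seconds).
    If no check lands in [outage_start, outage_end), the domain never turns
    invalid in this model, so paused VMs would need a manual resume.
    """
    # Index of the first check at or after the start of the outage.
    k = max(0, math.ceil((outage_start - first_check) / interval))
    next_check = first_check + k * interval
    return next_check < outage_end


# A 5-second outage between two checks goes unnoticed in this model.
missed = not monitor_sees_outage(outage_start=12, outage_end=17)

# A 15-second outage always spans at least one 10-second check.
seen = monitor_sees_outage(outage_start=12, outage_end=27)

# With monitoring delayed to one check every 2 minutes, the same
# 15-second outage can be missed.
missed_when_delayed = not monitor_sees_outage(12, 27, interval=120)
```

In this simplified picture, any outage at least as long as the check interval is always observed, which matches the point that shorter outages and delayed monitoring are the cases that leave VMs paused.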
Re: [ovirt-users] What recovers a VM from pause?
On 5/30/2016 at 3:59 PM, Nicolas Ecarnot wrote:
> On 30/05/2016 15:30, InterNetX - Juergen Gotteswinter wrote:
>> Hi,
>>
>> You are aware of the fact that EQL sync replication is just about replication, with no single piece of high availability? I am not even sure if it does IP failover itself, so better think about minutes of interruption than seconds.
>
> Hi Juergen,
>
> I'm absolutely aware that there is no HA discussed here, at least in my mind.
> It does IP failover, but I don't trust it blindly, that's why I'm doing numerous tests and measurements.
> I'm pleasantly surprised by how the iSCSI stack is reacting, and its log files are readable enough for me to decide.
>
> Actually, I was more worried about the iSCSI reconnection storm, but googling about it does not seem to turn up any warnings.

This works pretty well with the EQL boxes, unless you use the EQL without the HIT Kit. With the HIT Kit installed on each client, I don't think this will cause problems.

>> Anyway, don't count on oVirt's pause/unpause. There's a real chance that it will go horribly wrong. A scheduled maintenance window where everything gets shut down would be best practice.
>
> Indeed, this would be the best choice, if I had it.
Re: [ovirt-users] What recovers a VM from pause?
On 30/05/2016 15:30, InterNetX - Juergen Gotteswinter wrote:
> Hi,
>
> You are aware of the fact that EQL sync replication is just about replication, with no single piece of high availability? I am not even sure if it does IP failover itself, so better think about minutes of interruption than seconds.

Hi Juergen,

I'm absolutely aware that there is no HA discussed here, at least in my mind.
It does IP failover, but I don't trust it blindly, that's why I'm doing numerous tests and measurements.
I'm pleasantly surprised by how the iSCSI stack is reacting, and its log files are readable enough for me to decide.

Actually, I was more worried about the iSCSI reconnection storm, but googling about it does not seem to turn up any warnings.

> Anyway, don't count on oVirt's pause/unpause. There's a real chance that it will go horribly wrong. A scheduled maintenance window where everything gets shut down would be best practice.

Indeed, this would be the best choice, if I had it.

--
Nicolas ECARNOT
Re: [ovirt-users] What recovers a VM from pause?
Hi,

You are aware of the fact that EQL sync replication is just about replication, with no single piece of high availability? I am not even sure if it does IP failover itself, so better think about minutes of interruption than seconds.

Anyway, don't count on oVirt's pause/unpause. There's a real chance that it will go horribly wrong. A scheduled maintenance window where everything gets shut down would be best practice.

Juergen

On 5/30/2016 at 3:07 PM, Nicolas Ecarnot wrote:
> Hello,
>
> We're planning a move from our old building to a new one a few meters away.
>
> Similarly to Martijn (https://www.mail-archive.com/users@ovirt.org/msg33182.html), I have maintenance planned on our storage side.
>
> Say an oVirt DC is using a SAN's LUN via iSCSI (Equallogic).
> This SAN allows me to set up block replication between two SANs, seen by oVirt as one (Dell calls it SyncRep).
> Then switch all the iSCSI accesses to the replicated LUN.
>
> When doing this, the iSCSI stack of each oVirt host notices the disconnection, tries to reconnect, and succeeds.
> Amongst our hosts, this takes between 4 and 15 seconds.
>
> When this happens fast enough, the oVirt engine and the VMs don't even notice, and they keep running happily.
>
> When this takes more than 4 seconds, there are 2 cases:
>
> 1 - The hosts and/or oVirt and/or the SPM (I actually don't know) notice that there is a storage failure, and pause the VMs.
> When the iSCSI stack reconnects, the VMs are automatically recovered from pause, and this all takes less than 30 seconds. That is very acceptable for us, as this action is extremely rare.
>
> 2 - Same storage failure, VMs paused, and some VMs stay in pause mode forever.
> A manual "run" action is mandatory.
> When done, everything recovers correctly.
> This is also quite acceptable, but here come my questions:
>
> My questions: (!)
> - *WHAT* process, piece of code, or part of oVirt is responsible for deciding when to UN-pause a VM, and under what conditions?
> That would help me to understand why some cases are working even more smoothly than others.
> - Are there related timeouts I could play with in engine-config options?
> - [a bit off-topic] Is it safe to increase some iSCSI timeouts or buffer sizes in the hope this kind of disconnection would go unnoticed?