Can you reply to my question?

Yaniv Dary
Technical Product Manager
Red Hat Israel Ltd.
34 Jerusalem Road
Building A, 4th floor
Ra'anana, Israel 4350109
Tel : +972 (9) 7692306
      8272306
Email: yd...@redhat.com
IRC : ydary


On Thu, May 26, 2016 at 9:14 AM, Yaniv Dary <yd...@redhat.com> wrote:

> What DR solution are you using?
>
> Yaniv Dary
> Technical Product Manager
> Red Hat Israel Ltd.
> 34 Jerusalem Road
> Building A, 4th floor
> Ra'anana, Israel 4350109
>
> Tel : +972 (9) 7692306
>       8272306
> Email: yd...@redhat.com
> IRC : ydary
>
>
> On Wed, Nov 25, 2015 at 1:15 PM, Simone Tiraboschi <stira...@redhat.com> wrote:
>
>> Adding Nir who knows it far better than me.
>>
>>
>> On Mon, Nov 23, 2015 at 8:37 PM, Duckworth, Douglas C <du...@tulane.edu> wrote:
>>
>>> Hello --
>>>
>>> Not sure if y'all can help with this issue we've been seeing with RHEV...
>>>
>>> On 11/13/2015, during a code upgrade of the Compellent SAN at our
>>> disaster recovery site, we failed over to the secondary SAN controller.
>>> Most virtual machines in our DR cluster resumed automatically after
>>> pausing, except VM "BADVM" on host "BADHOST."
>>>
>>> In engine.log you can see that BADVM was put into the "VM_PAUSED_EIO"
>>> state at 10:47:57:
>>>
>>> "VM BADVM has paused due to storage I/O problem."
>>>
>>> On this Red Hat Enterprise Virtualization Hypervisor 6.6
>>> (20150512.0.el6ev) host, two other VMs paused but then resumed
>>> automatically without system administrator intervention...
>>>
>>> In our DR cluster, 22 VMs also resumed automatically...
>>>
>>> None of these guest VMs are doing heavy I/O, as they are DR-site VMs
>>> that aren't currently doing anything.
>>>
>>> We sent this information to Dell. Their response:
>>>
>>> "The root cause may reside within your virtualization solution, not the
>>> parent OS (RHEV-Hypervisor disc) or Storage (Dell Compellent.)"
>>>
>>> We are doing this failover again on Sunday, November 29th, so we would
>>> like to know how to mitigate this issue, given that we have to manually
>>> resume any paused VMs that don't resume on their own.
>>>
>>> Before we initiated the SAN controller failover, all iSCSI paths to the
>>> targets were present on host tulhv2p03.
>>>
>>> The VM log on the host, /var/log/libvirt/qemu/badhost.log, shows that a
>>> storage error was reported:
>>>
>>> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>>> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>>> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>>> block I/O error in device 'drive-virtio-disk0': Input/output error (5)
>>>
>>> All disks used by this guest VM are provided by a single storage domain,
>>> COM_3TB4_DR, with serial "270." In syslog we do see that all paths for
>>> that storage domain failed:
>>>
>>> Nov 13 16:47:40 multipathd: 36000d310005caf000000000000000270: remaining
>>> active paths: 0
>>>
>>> though the paths recovered later:
>>>
>>> Nov 13 16:59:17 multipathd: 36000d310005caf000000000000000270: sdbg -
>>> tur checker reports path is up
>>> Nov 13 16:59:17 multipathd: 36000d310005caf000000000000000270: remaining
>>> active paths: 8
>>>
>>> Does anyone have an idea of why the VM would fail to resume
>>> automatically once the iSCSI paths used by its storage domain recovered?
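>>>
>>> Right now we resume the stragglers by hand from the Admin Portal. As a
>>> stopgap for Sunday I've been toying with a small script to run on the
>>> host after the failover. This is only a rough, untested sketch -- it
>>> assumes the libvirt Python bindings are installed on the hypervisor and
>>> that we can open a read-write connection to libvirtd there (on RHEV
>>> hosts libvirtd is managed by vdsm and normally requires SASL
>>> credentials, so this may not even be appropriate):
>>>
>>> import libvirt
>>>
>>> # Connect to the local libvirt daemon (may require SASL credentials on
>>> # a RHEV-H host, since vdsm configures libvirtd with authentication).
>>> conn = libvirt.open('qemu:///system')
>>>
>>> # Walk the domains that are currently paused.
>>> for dom in conn.listAllDomains(libvirt.VIR_CONNECT_LIST_DOMAINS_PAUSED):
>>>     state, reason = dom.state()
>>>     # Only touch guests paused because of an I/O error, i.e. the ones
>>>     # left stuck after the SAN controller failover.
>>>     if reason == libvirt.VIR_DOMAIN_PAUSED_IOERROR:
>>>         print('resuming %s' % dom.name())
>>>         dom.resume()
>>>
>>> conn.close()
>>>
>>> If there is a supported way to do the same thing through the engine
>>> instead, we'd rather do that.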
>>>
>>> Thanks
>>> Doug
>>>
>>> --
>>> Thanks
>>>
>>> Douglas Charles Duckworth
>>> Unix Administrator
>>> Tulane University
>>> Technology Services
>>> 1555 Poydras Ave
>>> NOLA -- 70112
>>>
>>> E: du...@tulane.edu
>>> O: 504-988-9341
>>> F: 504-988-8505
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users