[ovirt-users] Set disk profile
Hi all,

I'm trying to set a different profile on a disk. I open the dialog, select the new profile and click OK. Seems simple enough. The log says: "VM vmname vmname_Disk1 disk was updated by admin@internal." But when I reopen the same dialog, the profile is set to the previous one. Running the latest oVirt 3.6. Any idea?

-- Met vriendelijke groeten / With kind regards, Johan Kooijman

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] centos 7.1 and up & ixgbe
Hi Jeff, was the issue ever resolved? I don't have permissions to view the bugzilla. On Thu, Mar 17, 2016 at 4:34 PM, Jeff Spahr <spa...@gmail.com> wrote: > I had the same issue, and I also have a support case open. They > referenced https://bugzilla.redhat.com/show_bug.cgi?id=1288237 which is > private. I didn't have any success getting that bugzilla changed to > public. We couldn't keep waiting for the issue to be fixed, so we replaced > the NICs with Broadcom/Qlogic ones that we knew had no issues in other hosts. > > On Thu, Mar 17, 2016 at 11:27 AM, Sigbjorn Lie <sigbj...@nixtra.com> > wrote: > >> Hi, >> >> Is this on CentOS/RHEL 7.2? >> >> Log in as root and see if you can see any messages from ixgbe about "tx >> queue hung" in dmesg. I >> currently have an open support case for RHEL 7.2 and the ixgbe driver, >> where there is a driver >> issue causing the network adapter to reset continuously when there is >> network traffic. >> >> >> Regards, >> Siggi >> >> >> >> On Thu, March 17, 2016 12:52, Nir Soffer wrote: >> > On Thu, Mar 17, 2016 at 10:49 AM, Johan Kooijman < >> m...@johankooijman.com> wrote: >> > >> >> Hi all, >> >> >> >> >> >> Since we upgraded to the latest oVirt node running 7.2, we're seeing >> that >> >> nodes become unavailable after a while. A node runs fine, with a >> couple of VMs on it, until it >> >> becomes non-responsive. At that moment it doesn't even respond to >> ICMP. It'll come back by >> >> itself after a while, but oVirt fences the machine before that time >> and restarts VMs elsewhere. >> >> >> >> >> >> Engine tells me this message: >> >> >> >> >> >> VDSM host09 command failed: Message timeout which can be caused by >> >> communication issues >> >> >> >> Is anyone else experiencing these issues with ixgbe drivers? I'm >> running on >> >> Intel X540-AT2 cards. >> >> >> > >> > We will need engine and vdsm logs to understand this issue. >> > >> > >> > Can you file a bug and attach full logs?
>> > Nir -- Met vriendelijke groeten / With kind regards, Johan Kooijman
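Sigbjorn's suggestion above, checking dmesg for the ixgbe "tx queue hung" message, is easy to script when you have a batch of hosts to inspect. A minimal sketch; the grep pattern is an assumption based on the symptom quoted in this thread, so adjust it to the exact wording in your kernel logs:

```shell
#!/bin/bash
# check_ixgbe_hang: scan saved kernel-log text for the ixgbe driver
# reset symptom described in this thread. The pattern is an assumption
# based on the quoted "tx queue hung" message.
check_ixgbe_hang() {
  if grep -qiE 'tx.*(queue )?(hang|hung)|reset(ting)? adapter' "$1"; then
    echo "ixgbe hang detected"
  else
    echo "log clean"
  fi
}
# usage on a host:  dmesg > /tmp/kernlog && check_ixgbe_hang /tmp/kernlog
```

Run it on each host (or via ssh in a loop) right after a fencing event, before the log rotates away.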
Re: [ovirt-users] ovirt node 3.6 stable
Doesn't exist yet; you can use 3.5 node stable or the nightly builds: http://jenkins.ovirt.org/job/ovirt-node_ovirt-3.6_create-iso-el7_merged/ On Fri, Apr 8, 2016 at 11:04 PM, Marcelo Leandro <marcelol...@gmail.com> wrote: > Yes, but I can't see the oVirt node 3.6 > > Thanks. > On 08/04/2016 16:47, "Arman Khalatyan" <arm2...@gmail.com> wrote: > >> Are you looking for this? >> http://www.ovirt.org/download/ >> Or >> sudo yum localinstall >> http://resources.ovirt.org/pub/yum-repo/ovirt-release36.rpm ? >> hello, >> >> which link to download the stable version of oVirt node 3.6? >> >> Thanks >> > -- Met vriendelijke groeten / With kind regards, Johan Kooijman
Re: [ovirt-users] centos 7.1 and up & ixgbe
Hi Jurrien, I don't see anything in logs on the nodes itself. The only thing we see in logs are in engine log - it looses connectivity to the host. Definitely CentOS 7.1/7.2 related. Downgraded the hosts to ovirt-iso 3.5, this resolves the issue. On Fri, Mar 18, 2016 at 9:01 AM, Bloemen, Jurriën < jurrien.bloe...@dmc.amcnetworks.com> wrote: > Hi Johan, > > Could you check if you see the following in you dmesg or message log file? > > [1123306.014288] [ cut here ] > [1123306.014302] WARNING: at net/core/dev.c:2189 > skb_warn_bad_offload+0xcd/0xda() > [1123306.014306] : caps=(0x00024849, 0x) len=330 > data_len=276 gso_size=276 gso_type=1 ip_summed=1 > [1123306.014308] Modules linked in: vhost_net macvtap macvlan > ip6table_filter ip6_tables iptable_filter ip_tables ebt_arp ebtable_nat > ebtables tun scsi_transport_iscsi iTCO_wdt iTCO_vendor_support > dm_service_time intel_powerclamp coretemp intel_rapl kvm_intel kvm > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cryptd pcspkr sb_edac > edac_core i2c_i801 lpc_ich mfd_core mei_me mei wmi ioatdma shpchp > ipmi_devintf ipmi_si ipmi_msghandler acpi_power_meter acpi_pad 8021q garp > mrp bridge stp llc bonding dm_multipath xfs libcrc32c sd_mod crc_t10dif > crct10dif_common ast syscopyarea sysfillrect sysimgblt drm_kms_helper ttm > crc32c_intel igb drm ahci ixgbe i2c_algo_bit libahci libata mdio i2c_core > ptp megaraid_sas pps_core dca dm_mirror dm_region_hash dm_log dm_mod > [1123306.014360] CPU: 30 PID: 0 Comm: swapper/30 Tainted: GW > -- 3.10.0-229.1.2.el7.x86_64 #1 > [1123306.014362] Hardware name: Supermicro SYS-2028TP-HC1TR/X10DRT-PT, > BIOS 1.1 08/03/2015 > [1123306.014364] 881fffc439a8 5326fb90ad1041ea 881fffc43960 > 81604afa > [1123306.014371] 881fffc43998 8106e34b 881fcebb0500 > 881fce88c000 > [1123306.014376] 0001 0001 881fcebb0500 > 881fffc43a00 > [1123306.014381] Call Trace: > [1123306.014383][] dump_stack+0x19/0x1b > [1123306.014396] [] warn_slowpath_common+0x6b/0xb0 > [1123306.014399] [] 
warn_slowpath_fmt+0x5c/0x80 > [1123306.014405] [] ? ___ratelimit+0x93/0x100 > [1123306.014409] [] skb_warn_bad_offload+0xcd/0xda > [1123306.014425] [] __skb_gso_segment+0x79/0xb0 > [1123306.014429] [] dev_hard_start_xmit+0x1a2/0x580 > [1123306.014438] [] ? deliver_clone+0x50/0x50 [bridge] > [1123306.014443] [] sch_direct_xmit+0xee/0x1c0 > [1123306.014447] [] dev_queue_xmit+0x1f8/0x4a0 > [1123306.014453] [] br_dev_queue_push_xmit+0x7b/0xc0 > [bridge] > [1123306.014458] [] br_forward_finish+0x22/0x60 [bridge] > [1123306.014464] [] __br_forward+0x80/0xf0 [bridge] > [1123306.014469] [] br_forward+0x8b/0xa0 [bridge] > [1123306.014476] [] br_handle_frame_finish+0x175/0x410 > [bridge] > [1123306.014481] [] br_handle_frame+0x175/0x260 [bridge] > [1123306.014485] [] __netif_receive_skb_core+0x282/0x870 > [1123306.014490] [] ? read_tsc+0x9/0x10 > [1123306.014493] [] __netif_receive_skb+0x18/0x60 > [1123306.014497] [] netif_receive_skb+0x40/0xd0 > [1123306.014500] [] napi_gro_receive+0x80/0xb0 > [1123306.014512] [] ixgbe_clean_rx_irq+0x7ac/0xb30 > [ixgbe] > [1123306.014519] [] ixgbe_poll+0x4bb/0x930 [ixgbe] > [1123306.014524] [] net_rx_action+0x152/0x240 > [1123306.014528] [] __do_softirq+0xf7/0x290 > [1123306.014533] [] call_softirq+0x1c/0x30 > [1123306.014539] [] do_softirq+0x55/0x90 > [1123306.014543] [] irq_exit+0x115/0x120 > [1123306.014546] [] do_IRQ+0x58/0xf0 > [1123306.014551] [] common_interrupt+0x6d/0x6d > [1123306.014553][] ? > cpuidle_enter_state+0x52/0xc0 > [1123306.014561] [] ? 
cpuidle_enter_state+0x48/0xc0 > [1123306.014565] [] cpuidle_idle_call+0xc5/0x200 > [1123306.014569] [] arch_cpu_idle+0xe/0x30 > [1123306.014574] [] cpu_startup_entry+0xf5/0x290 > [1123306.014580] [] start_secondary+0x1ba/0x230 > [1123306.014582] ---[ end trace 4d5a1bc838e1fcc0 ]--- > > If so, then could you try the following: > > ethtool -K lro off > > Do this for all the 10G intel nics and check if the problems still exists > > > *Kind regards,* > > > > *Jurriën Bloemen* > > On 17-03-16 09:49, Johan Kooijman wrote: > > Hi all, > > Since we upgraded to the latest ovirt node running 7.2, we're seeing that > nodes become unavailable after a while. It's running fine, with a couple of > VM's on it, untill it becomes non responsive. At that moment it doesn't > even respond to ICMP. It'll come back by itself after a while, but oVirt > fences the machine before that time and restarts VM's elsewhere. > > Engine tells me this message: > > VDSM host09 command failed: Message timeout which can be caused by > co
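One note on the suggested workaround above: `ethtool -K` takes the device name as its first argument, so the command as quoted ("ethtool -K lro off") is missing the interface. A sketch that prints the corrected command for every ixgbe-driven port; the sysfs walk and dry-run style are my assumptions about how to apply the fix across all 10G ports, while `ethtool -K <dev> lro off` itself is the thread's suggestion:

```shell
#!/bin/bash
# Print the ethtool commands that would disable LRO on every port
# driven by ixgbe. Dry-run by design: review the output, then pipe
# it to "sh" to apply.
lro_off_cmds() {
  local root="${1:-/sys/class/net}"
  local drv dev
  for drv in "$root"/*/device/driver; do
    [ -e "$drv" ] || continue
    case "$(readlink -f "$drv")" in
      */ixgbe)
        dev=${drv%/device/driver}
        echo "ethtool -K ${dev##*/} lro off"
        ;;
    esac
  done
}
lro_off_cmds            # inspect first; then: lro_off_cmds | sh
```

Note the setting does not persist across reboots; on EL7 you would also add `ETHTOOL_OPTS` to the ifcfg file or a network dispatcher script.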
[ovirt-users] centos 7.1 and up & ixgbe
Hi all,

Since we upgraded to the latest oVirt node running 7.2, we're seeing that nodes become unavailable after a while. A node runs fine, with a couple of VMs on it, until it becomes non-responsive. At that moment it doesn't even respond to ICMP. It'll come back by itself after a while, but oVirt fences the machine before that time and restarts VMs elsewhere.

Engine tells me this message:

VDSM host09 command failed: Message timeout which can be caused by communication issues

Is anyone else experiencing these issues with ixgbe drivers? I'm running on Intel X540-AT2 cards.

-- Met vriendelijke groeten / With kind regards, Johan Kooijman
[ovirt-users] ovirt 3.6 node
Hi all,

I can't seem to find the oVirt node 3.6 ISO. Is there a specific reason for this? With 3.5 (ovirt-node-iso-3.5-0.201502231653.el7.iso) I have the issue that I can't do live merge on machines installed with oVirt node.

-- Met vriendelijke groeten / With kind regards, Johan Kooijman
Re: [ovirt-users] Guest OS wrong cpu count
My apologies, totally overlooked that! On Thu, Mar 3, 2016 at 12:35 PM, Alexandr Krivulya <shur...@shurik.kiev.ua> wrote: > Hi, Windows Server 2008 R2 Standard supports up to 4 sockets. > > On 03.03.16 13:30, Johan Kooijman wrote: > > Hi all, > > I created a VM on our oVirt 3.5 cluster with 8 CPUs and 16 GB of RAM ( http://imgur.com/aaXfbfq). The guest, however, only sees 4 CPUs: http://imgur.com/dEYVe0S. > > Any clues what may have caused this? > > -- > Met vriendelijke groeten / With kind regards, > Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman
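The underlying cause is the vCPU topology: if each vCPU is presented as its own socket (8 sockets x 1 core), a guest OS with a 4-socket license limit, like Windows Server 2008 R2 Standard here, only uses 4 of them. Reshaping the same 8 vCPUs as, say, 2 sockets x 4 cores in the VM's CPU settings avoids the cap. A small sketch, assuming the hosts run libvirt/qemu as oVirt nodes do, to confirm what topology the guest was actually given ("vmname" is a placeholder domain name):

```shell
#!/bin/bash
# show_topology: pull the vCPU count and socket/core/thread topology
# out of libvirt domain XML read on stdin.
show_topology() {
  grep -oE '<topology [^>]*/>|<vcpu[^>]*>[0-9]+</vcpu>' -
}
# usage on the host running the VM:
#   virsh dumpxml vmname | show_topology
```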
[ovirt-users] Guest OS wrong cpu count
Hi all,

I created a VM on our oVirt 3.5 cluster with 8 CPUs and 16 GB of RAM (http://imgur.com/aaXfbfq). The guest, however, only sees 4 CPUs: http://imgur.com/dEYVe0S.

Any clues what may have caused this?

-- Met vriendelijke groeten / With kind regards, Johan Kooijman
Re: [ovirt-users] Upgrade patch 3.5 -> 3.6
Ack, thx. On Tue, Feb 16, 2016 at 11:46 AM, Yedidyah Bar David <d...@redhat.com> wrote: > On Tue, Feb 16, 2016 at 11:58 AM, Johan Kooijman <m...@johankooijman.com> > wrote: > > Didi, > > > > Ok, the piece of information on engine still supported on 3.6 was > unknown to > > me. Makes my job a lot easier. > > > > What would be the recommended path: first upgrade the nodes to 3.6 and > then > > the engine? Or the other way around? > > Generally speaking, first engine then nodes. I am pretty certain it's > documented > on the wiki, didn't check. > > > > > On Tue, Feb 16, 2016 at 8:45 AM, Yedidyah Bar David <d...@redhat.com> > wrote: > >> > >> On Tue, Feb 16, 2016 at 9:30 AM, Johan Kooijman <m...@johankooijman.com > > > >> wrote: > >> > Yes. I pasted the information on AIO, that was wrong on my end. I have > >> > an > >> > engine running on dedicated hardware and about 20 nodes in this > cluster. > >> > I > >> > do like to upgrade without downtime :) I know how to achieve this on > the > >> > node-end, but since I have to go from C6 to C7, I wonder what the > >> > procedure > >> > would be for engine. > >> > >> Please explain exactly what you are trying to do. > >> > >> Note that engine is still supported on el6. > >> > >> (new) Hosts are not. > >> > >> all-in-one is running both together, thus is not supported on el6 > either. > >> > >> IIRC we do not have a tested procedure to upgrade the engine from C6 to > C7 > >> yet, see also: > >> > >> https://bugzilla.redhat.com/show_bug.cgi?id=1234257 > >> https://bugzilla.redhat.com/show_bug.cgi?id=1285743 > >> > >> Best, > >> > >> > > >> > On Mon, Feb 15, 2016 at 9:21 PM, Alexander Wels <aw...@redhat.com> > >> > wrote: > >> >> > >> >> On Monday, February 15, 2016 08:21:40 PM Johan Kooijman wrote: > >> >> > Hi Alexander, > >> >> > > >> >> > Thanks for the input! My 3.5 is running on C6 however: > >> >> > > >> >> > Upgrade of All-in-One on EL6 is not supported in 3.6. 
VDSM and the > >> >> > packages > >> >> > requiring it are not built anymore for EL6 > >> >> > > >> >> > >> >> Well that was a piece of information you forgot to mention in your > >> >> initial > >> >> email. So now I am not entirely sure what you are trying to do. Are > you > >> >> trying > >> >> to save your existing VMs when you reinstall your machine? > >> >> > >> >> > >> >> > On Mon, Feb 15, 2016 at 3:37 PM, Alexander Wels <aw...@redhat.com> > >> >> > wrote: > >> >> > > On Monday, February 15, 2016 02:40:47 PM Johan Kooijman wrote: > >> >> > > > Hi, > >> >> > > > > >> >> > > > Can anybody recommend me best practice upgrade path for an > >> >> > > > upgrade > >> >> > > > from > >> >> > > > oVirt 3.5 on C6 to 3.6 on C7.2? > >> >> > > > >> >> > > The answer sort of depends on what you want. Do you want no > >> >> > > downtime > >> >> > > on > >> >> > > your > >> >> > > VMs or is downtime acceptable. Also are you running hosted engine > >> >> > > or > >> >> > > not? > >> >> > > > >> >> > > This is the basic plan which can be adjusted based on what your > >> >> > > needs > >> >> > > are: > >> >> > > > >> >> > > 1. Update engine from 3.5 to 3.6 (if hosted engine might be > >> >> > > trickier, > >> >> > > not > >> >> > > sure > >> >> > > haven't played with hosted engine). > >> >> > > 2. Create a new 3.6 cluster. > >> >> > > 3. Put 1 host in maintenance (which will migrate the VMs to the > >> >> > > other > >> >> > > hosts). > >> >> > > 4. Remove the host from the DC. > >> >> > > 5. Install C7.2 on the host > >> >> > > 6. Add that host to the
Re: [ovirt-users] Upgrade patch 3.5 -> 3.6
Didi, Ok, the piece of information on engine still supported on 3.6 was unknown to me. Makes my job a lot easier. What would be the recommended path: first upgrade the nodes to 3.6 and then the engine? Or the other way around? On Tue, Feb 16, 2016 at 8:45 AM, Yedidyah Bar David <d...@redhat.com> wrote: > On Tue, Feb 16, 2016 at 9:30 AM, Johan Kooijman <m...@johankooijman.com> > wrote: > > Yes. I pasted the information on AIO, that was wrong on my end. I have an > > engine running on dedicated hardware and about 20 nodes in this cluster. > I > > do like to upgrade without downtime :) I know how to achieve this on the > > node-end, but since I have to go from C6 to C7, I wonder what the > procedure > > would be for engine. > > Please explain exactly what you are trying to do. > > Note that engine is still supported on el6. > > (new) Hosts are not. > > all-in-one is running both together, thus is not supported on el6 either. > > IIRC we do not have a tested procedure to upgrade the engine from C6 to C7 > yet, see also: > > https://bugzilla.redhat.com/show_bug.cgi?id=1234257 > https://bugzilla.redhat.com/show_bug.cgi?id=1285743 > > Best, > > > > > On Mon, Feb 15, 2016 at 9:21 PM, Alexander Wels <aw...@redhat.com> > wrote: > >> > >> On Monday, February 15, 2016 08:21:40 PM Johan Kooijman wrote: > >> > Hi Alexander, > >> > > >> > Thanks for the input! My 3.5 is running on C6 however: > >> > > >> > Upgrade of All-in-One on EL6 is not supported in 3.6. VDSM and the > >> > packages > >> > requiring it are not built anymore for EL6 > >> > > >> > >> Well that was a piece of information you forgot to mention in your > initial > >> email. So now I am not entirely sure what you are trying to do. Are you > >> trying > >> to save your existing VMs when you reinstall your machine? 
> >> > >> > >> > On Mon, Feb 15, 2016 at 3:37 PM, Alexander Wels <aw...@redhat.com> > >> > wrote: > >> > > On Monday, February 15, 2016 02:40:47 PM Johan Kooijman wrote: > >> > > > Hi, > >> > > > > >> > > > Can anybody recommend me best practice upgrade path for an upgrade > >> > > > from > >> > > > oVirt 3.5 on C6 to 3.6 on C7.2? > >> > > > >> > > The answer sort of depends on what you want. Do you want no downtime > >> > > on > >> > > your > >> > > VMs or is downtime acceptable. Also are you running hosted engine or > >> > > not? > >> > > > >> > > This is the basic plan which can be adjusted based on what your > needs > >> > > are: > >> > > > >> > > 1. Update engine from 3.5 to 3.6 (if hosted engine might be > trickier, > >> > > not > >> > > sure > >> > > haven't played with hosted engine). > >> > > 2. Create a new 3.6 cluster. > >> > > 3. Put 1 host in maintenance (which will migrate the VMs to the > other > >> > > hosts). > >> > > 4. Remove the host from the DC. > >> > > 5. Install C7.2 on the host > >> > > 6. Add that host to the new 3.6 cluster. > >> > > 7. Optional (you can cross cluster live migrate some VMs from 6 to 7 > >> > > (just > >> > > not > >> > > the other way around, so once the VM is moved its stuck in the new > >> > > cluster). > >> > > 8. Go to 3 until all hosts are moved. > >> > > 9. Your 3.5 cluster should now be empty, and can be removed. > >> > > 10. Upgrade your DC to 3.6 (Can't upgrade if any lower clusters > >> > > exist). > >> > > > >> > > If you can have downtime, then just shut down the VMs running on the > >> > > host > >> > > in > >> > > step 3 before putting it in maintenance. Once the host is moved to > the > >> > > new > >> > > cluster you can start the VMs. 
> >> > > > >> > > Alexander > >> > > ___ > >> > > Users mailing list > >> > > Users@ovirt.org > >> > > http://lists.ovirt.org/mailman/listinfo/users > >> > > > > > > > > -- > > Met vriendelijke groeten / With kind regards, > > Johan Kooijman > > > > ___ > > Users mailing list > > Users@ovirt.org > > http://lists.ovirt.org/mailman/listinfo/users > > > > > > -- > Didi > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Upgrade patch 3.5 -> 3.6
Yes. I pasted the information on AIO, that was wrong on my end. I have an engine running on dedicated hardware and about 20 nodes in this cluster. I'd like to upgrade without downtime :) I know how to achieve this on the node-end, but since I have to go from C6 to C7, I wonder what the procedure would be for engine. On Mon, Feb 15, 2016 at 9:21 PM, Alexander Wels <aw...@redhat.com> wrote: > On Monday, February 15, 2016 08:21:40 PM Johan Kooijman wrote: > > Hi Alexander, > > > > Thanks for the input! My 3.5 is running on C6 however: > > > > Upgrade of All-in-One on EL6 is not supported in 3.6. VDSM and the > packages > > requiring it are not built anymore for EL6 > > > > Well, that was a piece of information you forgot to mention in your initial > email. So now I am not entirely sure what you are trying to do. Are you > trying > to save your existing VMs when you reinstall your machine? > > > > On Mon, Feb 15, 2016 at 3:37 PM, Alexander Wels <aw...@redhat.com> > wrote: > > > On Monday, February 15, 2016 02:40:47 PM Johan Kooijman wrote: > > > > Hi, > > > > > > > > Can anybody recommend a best-practice upgrade path for an upgrade from > > > > oVirt 3.5 on C6 to 3.6 on C7.2? > > > > > > The answer sort of depends on what you want. Do you want no downtime on > > > your > > > VMs, or is downtime acceptable? Also, are you running hosted engine or not? > > > > > > This is the basic plan, which can be adjusted based on what your needs are: > > > > > > 1. Update the engine from 3.5 to 3.6 (hosted engine might be trickier; not > > > sure, > > > haven't played with hosted engine). > > > 2. Create a new 3.6 cluster. > > > 3. Put 1 host in maintenance (which will migrate the VMs to the other > > > hosts). > > > 4. Remove the host from the DC. > > > 5. Install C7.2 on the host. > > > 6. Add that host to the new 3.6 cluster. > > > 7.
Optional (you can cross cluster live migrate some VMs from 6 to 7 > (just > > > not > > > the other way around, so once the VM is moved its stuck in the new > > > cluster). > > > 8. Go to 3 until all hosts are moved. > > > 9. Your 3.5 cluster should now be empty, and can be removed. > > > 10. Upgrade your DC to 3.6 (Can't upgrade if any lower clusters exist). > > > > > > If you can have downtime, then just shut down the VMs running on the > host > > > in > > > step 3 before putting it in maintenance. Once the host is moved to the > new > > > cluster you can start the VMs. > > > > > > Alexander > > > ___ > > > Users mailing list > > > Users@ovirt.org > > > http://lists.ovirt.org/mailman/listinfo/users > > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Upgrade patch 3.5 -> 3.6
Hi Alexander, Thanks for the input! My 3.5 is running on C6 however: Upgrade of All-in-One on EL6 is not supported in 3.6. VDSM and the packages requiring it are not built anymore for EL6 On Mon, Feb 15, 2016 at 3:37 PM, Alexander Wels <aw...@redhat.com> wrote: > On Monday, February 15, 2016 02:40:47 PM Johan Kooijman wrote: > > Hi, > > > > Can anybody recommend me best practice upgrade path for an upgrade from > > oVirt 3.5 on C6 to 3.6 on C7.2? > > The answer sort of depends on what you want. Do you want no downtime on > your > VMs or is downtime acceptable. Also are you running hosted engine or not? > > This is the basic plan which can be adjusted based on what your needs are: > > 1. Update engine from 3.5 to 3.6 (if hosted engine might be trickier, not > sure > haven't played with hosted engine). > 2. Create a new 3.6 cluster. > 3. Put 1 host in maintenance (which will migrate the VMs to the other > hosts). > 4. Remove the host from the DC. > 5. Install C7.2 on the host > 6. Add that host to the new 3.6 cluster. > 7. Optional (you can cross cluster live migrate some VMs from 6 to 7 (just > not > the other way around, so once the VM is moved its stuck in the new > cluster). > 8. Go to 3 until all hosts are moved. > 9. Your 3.5 cluster should now be empty, and can be removed. > 10. Upgrade your DC to 3.6 (Can't upgrade if any lower clusters exist). > > If you can have downtime, then just shut down the VMs running on the host > in > step 3 before putting it in maintenance. Once the host is moved to the new > cluster you can start the VMs. > > Alexander > ___ > Users mailing list > Users@ovirt.org > http://lists.ovirt.org/mailman/listinfo/users > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
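With around 20 hosts to cycle through this plan, the per-host steps can also be driven through the oVirt REST API instead of clicking through webadmin. A hedged sketch for step 3 (putting a host into maintenance): the engine hostname, HOST_ID and password below are placeholders, while the `hosts/<id>/deactivate` action is part of the oVirt 3.x REST API:

```shell
#!/bin/bash
# Build the REST action URL for putting a host into maintenance.
# ENGINE/HOST_ID are placeholders you substitute per host.
deactivate_url() {
  echo "https://$1/api/hosts/$2/deactivate"
}
# apply with (run manually against your engine):
#   curl -k -u 'admin@internal:PASSWORD' -H 'Content-Type: application/xml' \
#        -d '<action/>' "$(deactivate_url engine.example.com HOST_ID)"
deactivate_url engine.example.com HOST_ID
```

The matching `activate` action brings a host back after the C7.2 reinstall and re-add.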
[ovirt-users] Upgrade patch 3.5 -> 3.6
Hi,

Can anybody recommend a best-practice upgrade path for an upgrade from oVirt 3.5 on C6 to 3.6 on C7.2?

-- Met vriendelijke groeten / With kind regards, Johan Kooijman
[ovirt-users] After updates - hosts become unresponsive once in a while
Hi all,

Yesterday I updated a couple of my hosts to the latest CentOS packages as well as the latest oVirt packages in the 3.5 series. All looks fine, but once every couple of hours a host becomes completely unresponsive and doesn't even ping. Engine takes care of this - fencing is done. Nothing to be found in the logs of the host itself; engine simply tells me the host became unresponsive.

Nothing else in the infra has changed. The other hosts, still at CentOS 7.1.1503, are fine; only the updated hosts at 7.2.1511 have this issue. I suspect a network driver issue somewhere - has anyone had the same experience so far? I'm using Intel X540-AT2 10 Gbit cards in all my nodes, set up with LACP bonding.

-- Met vriendelijke groeten / With kind regards, Johan Kooijman
[ovirt-users] Snapshot failed
Hi all,

I've been live migrating storage all day, but suddenly getting an error on one VM. I've put the logs here: https://plakbord.cloud.nl/p/VCqBg50032GZyrhz5QbDpjJM

oVirt versions:
ovirt-image-uploader-3.5.1-1.el6.noarch
ovirt-engine-setup-base-3.5.3.1-1.el6.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.5.3.1-1.el6.noarch
ovirt-engine-userportal-3.5.3.1-1.el6.noarch
ovirt-engine-backend-3.5.3.1-1.el6.noarch
ovirt-host-deploy-1.3.1-1.el6.noarch
ovirt-guest-tools-iso-3.5-7.noarch
ovirt-host-deploy-java-1.3.1-1.el6.noarch
ovirt-release35-004-1.noarch
ovirt-engine-lib-3.5.3.1-1.el6.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.5.3.1-1.el6.noarch
ovirt-engine-extensions-api-impl-3.5.3.1-1.el6.noarch
ovirt-engine-webadmin-portal-3.5.3.1-1.el6.noarch
ovirt-iso-uploader-3.5.2-1.el6.noarch
ovirt-engine-setup-3.5.3.1-1.el6.noarch
ovirt-engine-dbscripts-3.5.3.1-1.el6.noarch
ovirt-engine-3.5.3.1-1.el6.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.5.3.1-1.el6.noarch
ovirt-engine-websocket-proxy-3.5.3.1-1.el6.noarch
ovirt-engine-restapi-3.5.3.1-1.el6.noarch
ovirt-engine-tools-3.5.3.1-1.el6.noarch
ovirt-engine-cli-3.5.0.6-1.el6.noarch
ovirt-engine-jboss-as-7.1.1-1.el6.x86_64
ovirt-engine-sdk-python-3.5.2.1-1.el6.noarch

Any clues?

-- Met vriendelijke groeten / With kind regards, Johan Kooijman
Re: [ovirt-users] Delete disk references without deleting the disk
Hi all, Responding to an old thread - but I'm having an issue with this. I can't remove the disk from a VM, without deleting a snapshot. Since deleting contents on my storage domain will crash it (long story..), I need to find a way to remove it from ovirt inventory without deleting the image files on disk. - Can't delete the VM and uncheck "Remove disks", due to snapshots present; - Can't delete the disk without permanently deleting it, due to snapshots present; Are there any other options? What will happen if I put the storage domain in maintenance and destroy the entire storage domain? Is that even possible when VM's have disks associated to it that live on this storage domain? On Wed, Nov 25, 2015 at 8:54 AM, Liron Aravot <lara...@redhat.com> wrote: > > > - Original Message - > > From: "Johan Kooijman" <m...@johankooijman.com> > > To: "Liron Aravot" <lara...@redhat.com> > > Cc: "Nir Soffer" <nsof...@redhat.com>, "users" <users@ovirt.org> > > Sent: Tuesday, November 24, 2015 6:52:51 PM > > Subject: Re: [ovirt-users] Delete disk references without deleting the > disk > > > > When I deactivate the disk on a VM, I can click remove. It then offers me > > the dialog te remove it with a checkbox "Remove permanently". I don't > check > > that, the disk will be deleted from inventory, but won't be deleted from > > storage domain? > > Can't try for myself, it'll kill my storage domain. > > Yes, when you do that the disk will remain "floating" (you'll be able to > see it under the > Disks tab) and you'll be able to attach it to another vm later on. > > > > On Mon, Nov 23, 2015 at 12:46 PM, Johan Kooijman <m...@johankooijman.com > > > > wrote: > > > > > Ok. Any way to do it without? Because with snapshot deletion I end up > with > > > the same issue - I can't remove images form my storage. 
> > > > > > On Mon, Nov 23, 2015 at 12:18 PM, Liron Aravot <lara...@redhat.com> > wrote: > > > > > >> > > >> > > >> - Original Message - > > >> > From: "Johan Kooijman" <m...@johankooijman.com> > > >> > To: "Nir Soffer" <nsof...@redhat.com> > > >> > Cc: "users" <users@ovirt.org> > > >> > Sent: Monday, November 23, 2015 10:10:27 AM > > >> > Subject: Re: [ovirt-users] Delete disk references without deleting > the > > >> disk > > >> > > > >> > One weird thing though: when I try to remove the VM itself, it won't > > >> let me > > >> > uncheck the "Remove disks" checkbox. > > >> > > > >> > > >> That is because that there are snapshots for the disks, you can remove > > >> the snapshots and then you could > > >> leave your disks. Currently oVirt doesn't support snapshots for > floating > > >> disks. > > >> > > > >> > On Sun, Nov 22, 2015 at 9:00 PM, Nir Soffer < nsof...@redhat.com > > > >> wrote: > > >> > > > >> > > > >> > On Sun, Nov 22, 2015 at 6:14 PM, Johan Kooijman < > > >> m...@johankooijman.com > > > >> > wrote: > > >> > > Hi all, > > >> > > > > >> > > I have about 100 old VM's in my cluster. They're powered down, > ready > > >> for > > >> > > deletion. What I want to do is delete the VM's including disks > without > > >> > > actually deleting the disk images from the storage array itself. > Is > > >> that > > >> > > possible? > > >> > > > >> > Select the vm, click "remove", in the confirmation dialog, uncheck > the > > >> > "Delete disks" > > >> > checkbox, confirm. > > >> > > > >> > > At the end I want to be able to delete the storage domain (which > > >> > > then should not hold any data, as far as ovirt is concerned). > > >> > > > >> > Ovirt deleted the vms, but is keeping the disks, so the storage > domain > > >> > does hold all > > >> > the disks. > > >> > > > >> > > > > >> > > Reason for this: it's a ZFS pool with dedup enabled, deleting the > > >> images > > >> > > one > > >> > > by one will kill the array with 100% iowa for some time. > > >
Re: [ovirt-users] Delete disk references without deleting the disk
When I deactivate the disk on a VM, I can click remove. It then offers me the dialog to remove it with a checkbox "Remove permanently". If I don't check that, the disk will be deleted from inventory, but won't be deleted from the storage domain? Can't try it myself - it'll kill my storage domain. On Mon, Nov 23, 2015 at 12:46 PM, Johan Kooijman <m...@johankooijman.com> wrote: > Ok. Any way to do it without? Because with snapshot deletion I end up with > the same issue - I can't remove images from my storage. > > On Mon, Nov 23, 2015 at 12:18 PM, Liron Aravot <lara...@redhat.com> wrote: >> >> >> - Original Message - >> > From: "Johan Kooijman" <m...@johankooijman.com> >> > To: "Nir Soffer" <nsof...@redhat.com> >> > Cc: "users" <users@ovirt.org> >> > Sent: Monday, November 23, 2015 10:10:27 AM >> > Subject: Re: [ovirt-users] Delete disk references without deleting the >> disk >> > >> > One weird thing though: when I try to remove the VM itself, it won't >> let me >> > uncheck the "Remove disks" checkbox. >> > >> >> That is because there are snapshots for the disks; you can remove >> the snapshots and then you could >> leave your disks. Currently oVirt doesn't support snapshots for floating >> disks. >> > >> > On Sun, Nov 22, 2015 at 9:00 PM, Nir Soffer < nsof...@redhat.com > >> wrote: >> > >> > >> > On Sun, Nov 22, 2015 at 6:14 PM, Johan Kooijman < >> m...@johankooijman.com > >> > wrote: >> > > Hi all, >> > > >> > > I have about 100 old VMs in my cluster. They're powered down, ready >> for >> > > deletion. What I want to do is delete the VMs including disks without >> > > actually deleting the disk images from the storage array itself. Is >> that >> > > possible? >> > >> > Select the vm, click "remove", in the confirmation dialog, uncheck the >> > "Delete disks" >> > checkbox, confirm. >> > >> > > At the end I want to be able to delete the storage domain (which >> > > then should not hold any data, as far as ovirt is concerned).
>> > >> > Ovirt deleted the vms, but is keeping the disks, so the storage domain >> > does hold all >> > the disks. >> > >> > > >> > > Reason for this: it's a ZFS pool with dedup enabled, deleting the >> images >> > > one >> > > by one will kill the array with 100% iowa for some time. >> > >> > So what do you need is to destroy the storage domain, which will >> > remove all the entities >> > associated with it, but will keep the storage without any change. >> > >> > Do this: >> > 1. Select the storage tab >> > 2. select the domain >> > 3. In the data center sub tab, click "maintenance" >> > 4. When domain is in maintenance, click "detach" >> > 5. Right click the domain and choose "destroy" >> > >> > This will remove the storage domain from engine database, leaving >> > the contents of the domain. >> > >> > You can now delete the contents using favorite system tools. >> > >> > Now, if we want to add support for this in ovirt, how would you delete >> > the entire domain in a more efficient way? >> > >> > Nir >> > >> > >> > >> > -- >> > Met vriendelijke groeten / With kind regards, >> > Johan Kooijman >> > >> > ___ >> > Users mailing list >> > Users@ovirt.org >> > http://lists.ovirt.org/mailman/listinfo/users >> > >> > > > > -- > Met vriendelijke groeten / With kind regards, > Johan Kooijman > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Delete disk references without deleting the disk
One weird thing though: when I try to remove the VM itself, it won't let me uncheck the "Remove disks" checkbox. On Sun, Nov 22, 2015 at 9:00 PM, Nir Soffer <nsof...@redhat.com> wrote: > On Sun, Nov 22, 2015 at 6:14 PM, Johan Kooijman <m...@johankooijman.com> > wrote: > > Hi all, > > > > I have about 100 old VM's in my cluster. They're powered down, ready for > > deletion. What I want to do is delete the VM's including disks without > > actually deleting the disk images from the storage array itself. Is that > > possible? > > Select the vm, click "remove", in the confirmation dialog, uncheck the > "Delete disks" > checkbox, confirm. > > > At the end I want to be able to delete the storage domain (which > > then should not hold any data, as far as ovirt is concerned). > > Ovirt deleted the vms, but is keeping the disks, so the storage domain > does hold all > the disks. > > > > > Reason for this: it's a ZFS pool with dedup enabled, deleting the images > one > > by one will kill the array with 100% iowa for some time. > > So what do you need is to destroy the storage domain, which will > remove all the entities > associated with it, but will keep the storage without any change. > > Do this: > 1. Select the storage tab > 2. select the domain > 3. In the data center sub tab, click "maintenance" > 4. When domain is in maintenance, click "detach" > 5. Right click the domain and choose "destroy" > > This will remove the storage domain from engine database, leaving > the contents of the domain. > > You can now delete the contents using favorite system tools. > > Now, if we want to add support for this in ovirt, how would you delete > the entire domain in a more efficient way? > > Nir > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] ZFS record size for oVirt
Hi all, I'm using an NFS storage domain, backed by a ZFS cluster. I need to deploy a new storage domain; what would the recommended record size be for this? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
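For reference, recordsize is a per-dataset property and is best set when the dataset backing the new domain is created. A hedged sketch (dataset name hypothetical; the 64K figure assumes qcow2 images, whose default cluster size is 64K - benchmark with your own workload before committing):

```shell
# 64K tends to suit qcow2-backed VM images; raw images doing small random
# I/O sometimes do better at 8K-16K. The ZFS default is 128K.
zfs create -o recordsize=64K -o atime=off tank/ovirt-data2
```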
Re: [ovirt-users] Delete disk references without deleting the disk
Ok. Any way to do it without? Because with snapshot deletion I end up with the same issue - I can't remove images from my storage. On Mon, Nov 23, 2015 at 12:18 PM, Liron Aravot <lara...@redhat.com> wrote: > > > - Original Message ----- > > From: "Johan Kooijman" <m...@johankooijman.com> > > To: "Nir Soffer" <nsof...@redhat.com> > > Cc: "users" <users@ovirt.org> > > Sent: Monday, November 23, 2015 10:10:27 AM > > Subject: Re: [ovirt-users] Delete disk references without deleting the > disk > > > > One weird thing though: when I try to remove the VM itself, it won't let > me > > uncheck the "Remove disks" checkbox. > > > > That is because that there are snapshots for the disks, you can remove the > snapshots and then you could > leave your disks. Currently oVirt doesn't support snapshots for floating > disks. > > > > On Sun, Nov 22, 2015 at 9:00 PM, Nir Soffer < nsof...@redhat.com > > wrote: > > > > > > On Sun, Nov 22, 2015 at 6:14 PM, Johan Kooijman < m...@johankooijman.com > > > > wrote: > > > Hi all, > > > > > > I have about 100 old VM's in my cluster. They're powered down, ready > for > > > deletion. What I want to do is delete the VM's including disks without > > > actually deleting the disk images from the storage array itself. Is > that > > > possible? > > > > Select the vm, click "remove", in the confirmation dialog, uncheck the > > "Delete disks" > > checkbox, confirm. > > > > > At the end I want to be able to delete the storage domain (which > > > then should not hold any data, as far as ovirt is concerned). > > > > Ovirt deleted the vms, but is keeping the disks, so the storage domain > > does hold all > > the disks. > > > > > > > > Reason for this: it's a ZFS pool with dedup enabled, deleting the > images > > > one > > > by one will kill the array with 100% iowa for some time. > > > > So what do you need is to destroy the storage domain, which will > > remove all the entities > > associated with it, but will keep the storage without any change. 
> > > > Do this: > > 1. Select the storage tab > > 2. select the domain > > 3. In the data center sub tab, click "maintenance" > > 4. When domain is in maintenance, click "detach" > > 5. Right click the domain and choose "destroy" > > > > This will remove the storage domain from engine database, leaving > > the contents of the domain. > > > > You can now delete the contents using favorite system tools. > > > > Now, if we want to add support for this in ovirt, how would you delete > > the entire domain in a more efficient way? > > > > Nir > > > > > > > > -- > > Met vriendelijke groeten / With kind regards, > > Johan Kooijman > > > > ___ > > Users mailing list > > Users@ovirt.org > > http://lists.ovirt.org/mailman/listinfo/users > > > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Delete disk references without deleting the disk
Hi Nir, I wonder if it can be made any more efficient; I think this method is clear enough. The only thing not clear to me was that destroying the storage domain doesn't touch its contents. On Sun, Nov 22, 2015 at 9:00 PM, Nir Soffer <nsof...@redhat.com> wrote: > On Sun, Nov 22, 2015 at 6:14 PM, Johan Kooijman <m...@johankooijman.com> > wrote: > > Hi all, > > > > I have about 100 old VM's in my cluster. They're powered down, ready for > > deletion. What I want to do is delete the VM's including disks without > > actually deleting the disk images from the storage array itself. Is that > > possible? > > Select the vm, click "remove", in the confirmation dialog, uncheck the > "Delete disks" > checkbox, confirm. > > > At the end I want to be able to delete the storage domain (which > > then should not hold any data, as far as ovirt is concerned). > > Ovirt deleted the vms, but is keeping the disks, so the storage domain > does hold all > the disks. > > > > > Reason for this: it's a ZFS pool with dedup enabled, deleting the images > one > > by one will kill the array with 100% iowa for some time. > > So what do you need is to destroy the storage domain, which will > remove all the entities > associated with it, but will keep the storage without any change. > > Do this: > 1. Select the storage tab > 2. select the domain > 3. In the data center sub tab, click "maintenance" > 4. When domain is in maintenance, click "detach" > 5. Right click the domain and choose "destroy" > > This will remove the storage domain from engine database, leaving > the contents of the domain. > > You can now delete the contents using favorite system tools. > > Now, if we want to add support for this in ovirt, how would you delete > the entire domain in a more efficient way? > > Nir > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Delete disk references without deleting the disk
Hi all, I have about 100 old VM's in my cluster. They're powered down, ready for deletion. What I want to do is delete the VM's including disks without actually deleting the disk images from the storage array itself. Is that possible? At the end I want to be able to delete the storage domain (which then should not hold any data, as far as ovirt is concerned). Reason for this: it's a ZFS pool with dedup enabled, deleting the images one by one will kill the array with 100% iowa for some time. -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Disk stuck in locked status
Ah, more interesting: the disk lives half on storage domain #1, half on storage domain #2. I don't really need these disks, but I can't do anything with them at the moment. What to do? On Thu, Nov 5, 2015 at 4:41 PM, Johan Kooijman <m...@johankooijman.com> wrote: > Hi all, > > I was moving a disk from one storage domain to the other when the engine > was restarted. The VM the disk is on, is fine, but the disk stays in locked > status. > > How can I resolve this? > > -- > Met vriendelijke groeten / With kind regards, > Johan Kooijman > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Disk stuck in locked status
Hi all, I was moving a disk from one storage domain to the other when the engine was restarted. The VM the disk is on, is fine, but the disk stays in locked status. How can I resolve this? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
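When the engine restarts mid-move, the disk can stay LOCKED in the engine database even though the storage itself is fine. Later oVirt releases ship a dbutils unlock_entity.sh helper for exactly this; on 3.5-era setups the status is often reset by hand. A hedged sketch only - the status values (1 = OK, 2 = LOCKED) and column names are my assumption for this engine version, so verify against your schema and take a database backup first:

```sql
-- Run against the 'engine' database, with ovirt-engine stopped.
-- Replace <disk_id> with the image group id of the stuck disk.
UPDATE images
   SET imagestatus = 1
 WHERE imagestatus = 2
   AND image_group_id = '<disk_id>';
```

After restarting the engine the disk should show as OK again; any half-copied image left on the target domain would still need manual cleanup.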
[ovirt-users] Installation failed
Hi all, Trying to add a node to our setup, but since today I'm getting an error when adding. It looks like the vdsm-network service failed to start, due to this error: Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan python[27521]: DIGEST-MD5 parse_server_challenge() Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan python[27521]: DIGEST-MD5 ask_user_info() Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan python[27521]: DIGEST-MD5 make_client_response() Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan python[27521]: DIGEST-MD5 client step 3 Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan python[27521]: DIGEST-MD5 client mech dispose Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan python[27521]: DIGEST-MD5 common mech dispose Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan vdsm-tool[27521]: libvirt: error : no connection driver available for qemu:///system Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan systemd[1]: vdsm-network.service: control process exited, code=exited status=1 Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan systemd[1]: Failed to start Virtual Desktop Server Manager network restoration. Oct 26 16:46:59 hv01.ovirt.gs.cloud.lan systemd[1]: Unit vdsm-network.service entered failed state. Host deploy error can be found here: http://own.cloud.nl/index.php/s/Y49KTsv2vkt8L4a Any clue? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] VM's in wrong state
Hi all, For a couple of days now I've had one node with non-responsive VM's. The node is OK and the VM's are actually working - they just show the wrong state. I'll find out why it happened, but for now I need to correct the state. Any ideas on how to do that? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] VM's in wrong state
Ok, thanks. Turned out that restart vdsm did the trick. On Fri, Oct 9, 2015 at 1:23 PM, Michal Skrivanek < michal.skriva...@redhat.com> wrote: > > On 9 Oct 2015, at 09:48, Johan Kooijman wrote: > > > Hi all, > > > > Since a couple of days I have one node with non responsive VM's. The > node is OK and the VM's are working actually, just show the wrong state. > Hi, > so what state does it show? Not Responding host? Is there a problem of > only one VM or all the VMs on that host are Unknown and host state is Not > Responding? > > > > I'll find out why it happened, but for now I need to correct the state. > Any ideas on how to do that? > > depends on the above., If there is an issue with only one or few VMs it > typically means libvirt (or vdsm while talking to libvirt) have some issues > accessing monitoring data of that VM. > if it's the whole host then communication between engine and host is > problematic, or storage issues - you should see something in the log (or > post it here, both vdsm.log and engine.log) > > Thanks, > michal > > > > -- > > Met vriendelijke groeten / With kind regards, > > Johan Kooijman > > ___ > > Users mailing list > > Users@ovirt.org > > http://lists.ovirt.org/mailman/listinfo/users > > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Ovirtmgmt not on team device
That works, thx. On Thu, Oct 1, 2015 at 8:10 AM, Dan Kenigsberg <dan...@redhat.com> wrote: > On Wed, Sep 30, 2015 at 10:57:55AM +0200, Johan Kooijman wrote: > > Hi all, > > > > I'm adding my first CentOS 7 host to my cluster today, but running into > an > > issue. When setting up network for the new host I don't have the ability > to > > set ovirtmgmt to the team I created, see screenshot: > > http://imgur.com/k8GWwcK > > > > The team however, works perfectly fine: > > > > [root@hv15]# teamdctl team0 state view > > setup: > > runner: lacp > > ports: > > ens2f0 > > link watches: > > link summary: up > > instance[link_watch_0]: > > name: ethtool > > link: up > > runner: > > aggregator ID: 4, Selected > > selected: yes > > state: current > > ens2f1 > > link watches: > > link summary: up > > instance[link_watch_0]: > > name: ethtool > > link: up > > runner: > > aggregator ID: 4, Selected > > selected: yes > > state: current > > runner: > > active: yes > > fast rate: no > > > > With CentOS 6 and bonding I did not have this issue. Am I missing > something > > here? > > I'm afraid that ovirt does not support teamd devices as of yet. Only the > (good?) old "bonding" driver is supported. > > Please setup your management IP on top of a standard bond0 and try > again. > -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
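Since oVirt only supports the bonding driver, the LACP teamd setup above maps onto a mode-4 (802.3ad) bond. A hedged sketch of the CentOS 7 ifcfg files - the interface names are taken from the teamdctl output in this thread, everything else is an assumption to adapt:

```
# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
BONDING_OPTS="mode=802.3ad miimon=100"
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-ens2f0  (repeat for ens2f1)
DEVICE=ens2f0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```

With the bond up, ovirtmgmt can be attached to bond0 in the host's "Setup Host Networks" dialog.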
[ovirt-users] Ovirtmgmt not on team device
Hi all, I'm adding my first CentOS 7 host to my cluster today, but running into an issue. When setting up network for the new host I don't have the ability to set ovirtmgmt to the team I created, see screenshot: http://imgur.com/k8GWwcK The team however, works perfectly fine: [root@hv15]# teamdctl team0 state view setup: runner: lacp ports: ens2f0 link watches: link summary: up instance[link_watch_0]: name: ethtool link: up runner: aggregator ID: 4, Selected selected: yes state: current ens2f1 link watches: link summary: up instance[link_watch_0]: name: ethtool link: up runner: aggregator ID: 4, Selected selected: yes state: current runner: active: yes fast rate: no With CentOS 6 and bonding I did not have this issue. Am I missing something here? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] ovirt-guest-agent
Hi, Would it be possible to add the I/O stats of a VM to ovirt-guest-agent, to make that data available in the oVirt interface? I could help out by getting the stats into the client package; the server side would be a bigger issue for me. -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] IOPS per VM
It's a Zetavault server, ZFS on Linux. On Thu, Jul 23, 2015 at 11:00 PM, Karli Sjöberg karli.sjob...@slu.se wrote: Forgot reply all... Den 23 jul 2015 10:13 em skrev Johan Kooijman m...@johankooijman.com: iotop on NFS servers tells me it's NFS what's taking the load :) What server is it? /K On Thu, Jul 23, 2015 at 10:11 PM, Karli Sjöberg karli.sjob...@slu.se wrote: Den 23 jul 2015 9:53 em skrev Johan Kooijman m...@johankooijman.com: My bad - should've mentioned we're running on NFS, iotop doesn't show that. So run it on the NFS server to see what file is most demanding? /K On Thu, Jul 23, 2015 at 9:14 PM, Chris Adams c...@cmadams.net wrote: Once upon a time, Johan Kooijman m...@johankooijman.com said: We're having some storage issues at the moment, some piece of our ovirt setup is eating up all available write IOPS. Is there a way of finding out which VM it may be? It's not CPU and/or network related it seems, because all VM's look good from the interface. Try running iotop from the shell on the host. -- Chris Adams c...@cmadams.net ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users -- Met vriendelijke groeten / With kind regards, Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] IOPS per VM
My bad - should've mentioned we're running on NFS, iotop doesn't show that. On Thu, Jul 23, 2015 at 9:14 PM, Chris Adams c...@cmadams.net wrote: Once upon a time, Johan Kooijman m...@johankooijman.com said: We're having some storage issues at the moment, some piece of our ovirt setup is eating up all available write IOPS. Is there a way of finding out which VM it may be? It's not CPU and/or network related it seems, because all VM's look good from the interface. Try running iotop from the shell on the host. -- Chris Adams c...@cmadams.net ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] IOPS per VM
iotop on the NFS server tells me it's NFS that's taking the load :) On Thu, Jul 23, 2015 at 10:11 PM, Karli Sjöberg karli.sjob...@slu.se wrote: On 23 Jul 2015 9:53 PM, Johan Kooijman m...@johankooijman.com wrote: My bad - should've mentioned we're running on NFS, iotop doesn't show that. So run it on the NFS server to see what file is most demanding? /K On Thu, Jul 23, 2015 at 9:14 PM, Chris Adams c...@cmadams.net wrote: Once upon a time, Johan Kooijman m...@johankooijman.com said: We're having some storage issues at the moment, some piece of our ovirt setup is eating up all available write IOPS. Is there a way of finding out which VM it may be? It's not CPU and/or network related it seems, because all VM's look good from the interface. Try running iotop from the shell on the host. -- Chris Adams c...@cmadams.net ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users -- Met vriendelijke groeten / With kind regards, Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] IOPS per VM
Hi all, We're having some storage issues at the moment, some piece of our ovirt setup is eating up all available write IOPS. Is there a way of finding out which VM it may be? It's not CPU and/or network related it seems, because all VM's look good from the interface. -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
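Before reaching for the NFS server, the load can also be narrowed down per qemu process on the hosts by sampling /proc/&lt;pid&gt;/io (PIDs e.g. from pgrep -f qemu), which keeps counting even when iotop's block-device view shows little. A hedged sketch - the helper names are mine, not an oVirt API; on NFS-backed storage `wchar` (bytes passed to write()) is usually more telling than `write_bytes`, which only counts writes reaching the local block layer:

```python
# Rank processes (e.g. per-VM qemu-kvm PIDs) by write volume using two
# samples of /proc/<pid>/io taken `interval` seconds apart.

def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats

def read_proc_io(pid):
    """Read one live sample for a process (Linux only)."""
    with open("/proc/%d/io" % pid) as f:
        return parse_proc_io(f.read())

def write_rate(before, after, interval, key="write_bytes"):
    """Bytes per second written by one process between two samples."""
    return (after[key] - before[key]) / float(interval)

def top_writers(before, after, interval, key="write_bytes"):
    """Given {pid: sample} dicts taken `interval` seconds apart,
    return [(pid, bytes_per_sec), ...] sorted busiest-first."""
    rates = {pid: write_rate(before[pid], after[pid], interval, key)
             for pid in after if pid in before}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
```

Sampling all qemu PIDs twice, a few seconds apart, and printing top_writers(..., key="wchar") should point at the guilty VM.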
[ovirt-users] Upgrade path
Hi all, What would be the best upgrade path for upgrading a 3.5.2 cluster from C6 to C7? Am I right in understanding that a cluster can have mixed hosts, but once a VM is on a C7 host, it cannot be migrated back to C6? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Delete snapshot
Hi all, I created a snapshot on a VM, which I'd like to delete. No further snapshots were made, only this single one. The delete button is greyed out though, until I power off the VM. I thought with live merge this wasn't needed any more, or am I mistaken? Engine running on CentOS 6 x64, latest stable versions of all packages. Am I missing something? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] 3.5.0 to 3.5.1 Upgrade Steps
Hey Tim, Run yum update ovirt-engine-setup*, then engine-setup. That's enough for the engine upgrade :) Works like a charm. On Tue, Jan 27, 2015 at 5:35 PM, Tim Macy mac...@gmail.com wrote: What are the proper steps to upgrade the engine from 3.5.0.1-1.el6 to 3.5.1-1.el6? engine-upgrade or engine-setup after yum update ovirt-engine-setup? ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] iso uploader times out
Mikola, What version are you running? A bug in 3.5.0 caused the same behavior, it's fixed in 3.5.1. On Mon, Jan 26, 2015 at 4:36 AM, Mikola Rose mr...@power-soft.com wrote: Hi list members, Just finished deploying a self hosted engine. But i am having a problem uploading an iso. It seems to be timing out.. engine-iso-uploader -i iso upload rhel-server-6.6-x86_64-dvd.iso Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to abort): Uploading, please wait... ERROR: mount.nfs: Connection timed out I am a little confused as the host machine has access to the NFS shares but the engine vm does not yet I have created an iso storage item successfully so the engine can see the nfs share? Any idea on how I can trouble shoot this? ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Can't migrate/remove VM
Found the bug I'm hitting: https://bugzilla.redhat.com/show_bug.cgi?id=1145636 Workaround is to put the host in maintenance mode and uncheck the JSON checkbox under advanced settings for the node. On Wed, Jan 7, 2015 at 12:26 PM, Johan Kooijman m...@johankooijman.com wrote: All, I was able to reproduce this every time I did an install of ovirt 3.5 on CentOS 6. The issue does not occur when I move to the snapshot version of 3.5 On Tue, Jan 6, 2015 at 2:20 PM, Johan Kooijman m...@johankooijman.com wrote: Hi all, Been playing with an ovirt test setup for the last couple of days. Created some vm's, started to throw them around on the cluster, but now I'm stuck. The VM's are running, but when I try to stop them, I get errors like this: https://plakbord.cloud.nl/p/zvAEVPFeBBJSBeGspKNxJsqF When trying to migrate a VM, the node throws this error: https://plakbord.cloud.nl/p/4Syi9A7tEd8L3A2pQg6boVB6 Any clue on what's happening? -- Met vriendelijke groeten / With kind regards, Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Can't migrate/remove VM
All, I was able to reproduce this every time I did an install of ovirt 3.5 on CentOS 6. The issue does not occur when I move to the snapshot version of 3.5 On Tue, Jan 6, 2015 at 2:20 PM, Johan Kooijman m...@johankooijman.com wrote: Hi all, Been playing with an ovirt test setup for the last couple of days. Created some vm's, started to throw them around on the cluster, but now I'm stuck. The VM's are running, but when I try to stop them, I get errors like this: https://plakbord.cloud.nl/p/zvAEVPFeBBJSBeGspKNxJsqF When trying to migrate a VM, the node throws this error: https://plakbord.cloud.nl/p/4Syi9A7tEd8L3A2pQg6boVB6 Any clue on what's happening? -- Met vriendelijke groeten / With kind regards, Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Can't migrate/remove VM
Hi all, Been playing with an ovirt test setup for the last couple of days. Created some vm's, started to throw them around on the cluster, but now I'm stuck. The VM's are running, but when I try to stop them, I get errors like this: https://plakbord.cloud.nl/p/zvAEVPFeBBJSBeGspKNxJsqF When trying to migrate a VM, the node throws this error: https://plakbord.cloud.nl/p/4Syi9A7tEd8L3A2pQg6boVB6 Any clue on what's happening? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Override hostname novnc
Hey all, Is there a way to override the host parameter that novnc uses in its URL? It now tries to use its internal hostname (private LAN), while we'd also like to connect over the internet: https://ovirt.domain.com/ovirt-engine/services/novnc-main.html?host=engine.ovirt.gs.domain.lanport=6100 -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
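The host in that URL is the address of the engine's websocket proxy, so one knob worth trying - hedged, verify the key exists on your engine version with engine-config -l - is the WebSocketProxy setting, pointed at the public FQDN:

```shell
# Hypothetical sketch: make the noVNC URL embed the public name instead
# of the internal one, then restart the engine to pick it up.
engine-config -s WebSocketProxy=ovirt.domain.com:6100
service ovirt-engine restart
```

The proxy's TLS certificate would also need to match the public name for browsers to accept the websocket connection.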
[ovirt-users] Can't activate storage domain
Hi, I have a datacenter, 2 nodes and an NFS storage domain. The issue is that when I try to activate a storage domain it fails; logs are here: https://plakbord.cloud.nl/p/P1itMAEoTVIUHQv1U0AAZ5Xy. In the past I've seen issues with sanlock and selinux, but in this case I have no selinux running. When I look at the 2 nodes, they actually have the storage domains mounted: [root@hv2 mnt]# mount | grep nfs sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw) 10.0.24.30:/cloud/ovirt-data on /rhev/data-center/mnt/10.0.24.30:_cloud_ovirt-data type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.0.24.30) 10.0.24.30:/cloud/ovirt-iso on /rhev/data-center/mnt/10.0.24.30:_cloud_ovirt-iso type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.0.24.30) 10.0.24.30:/cloud/ovirt-export on /rhev/data-center/mnt/10.0.24.30:_cloud_ovirt-export type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.0.24.30) Am I missing something here? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Can't activate storage domain
And also, when creating the storage domain, some stuff is written: root@head1-1:/cloud/ovirt-data# ls -la total 10 drwxrwxrwx 3 36 36 4 Jan 2 21:24 . drwxr-xr-x 5 root root 5 Jan 2 19:24 .. drwxr-xr-x 4 36 36 4 Jan 2 21:24 24b198c4-41c1-4311-8e11-c4cd7a70db5f -rwxr-xr-x 1 36 36 0 Jan 2 21:24 __DIRECT_IO_TEST__ It just fails with AcquireHostIdFailure. On Fri, Jan 2, 2015 at 8:49 PM, Johan Kooijman m...@johankooijman.com wrote: Forgot to mention: engine nodes are all CentOS 6.6 On Fri, Jan 2, 2015 at 8:44 PM, Johan Kooijman m...@johankooijman.com wrote: Hi, I have a datacenter, 2 nodes and an NFS storage domain. The issue is when I try activate a storage domain, logs are here: https://plakbord.cloud.nl/p/P1itMAEoTVIUHQv1U0AAZ5Xy. In the past I've seen issues with sanlock selinux, in this case I have no selinux running. When look at the 2 nodes, they actually have the storage domains mounted: [root@hv2 mnt]# mount | grep nfs sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw) 10.0.24.30:/cloud/ovirt-data on /rhev/data-center/mnt/10.0.24.30:_cloud_ovirt-data type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.0.24.30) 10.0.24.30:/cloud/ovirt-iso on /rhev/data-center/mnt/10.0.24.30:_cloud_ovirt-iso type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.0.24.30) 10.0.24.30:/cloud/ovirt-export on /rhev/data-center/mnt/10.0.24.30:_cloud_ovirt-export type nfs (rw,soft,nosharecache,timeo=600,retrans=6,nfsvers=3,addr=10.0.24.30) Am I missing something here? -- Met vriendelijke groeten / With kind regards, Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
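AcquireHostIdFailure means sanlock could not take the host-id lease on the domain (the ids file under the domain's dom_md directory), even though the NFS mount itself succeeds - so /var/log/sanlock.log and the ownership/permissions of that lease file are the next things to check. When chasing this across several hosts it also helps to diff the actual mount options programmatically; a small sketch (parse_mount_line is a hypothetical helper, fed lines like the mount | grep nfs output quoted above):

```python
# Parse a line of `mount` output into its parts, so NFS options can be
# compared between hosts (e.g. soft vs hard, nfsvers mismatches).

def parse_mount_line(line):
    device, _, rest = line.partition(" on ")
    mountpoint, _, rest = rest.partition(" type ")
    fstype, _, opts = rest.partition(" ")
    options = {}
    for opt in opts.strip("()").split(","):
        key, _, value = opt.partition("=")
        options[key] = value or True   # flag options (rw, soft, ...) map to True
    return {"device": device, "mountpoint": mountpoint,
            "fstype": fstype, "options": options}

def diff_options(a, b):
    """Options that differ between two parsed mounts (host A vs host B)."""
    keys = set(a["options"]) | set(b["options"])
    return {k: (a["options"].get(k), b["options"].get(k))
            for k in keys if a["options"].get(k) != b["options"].get(k)}
```

Feeding each host's mount output through parse_mount_line and diff_options quickly shows whether one node mounted the domain with different options than the other.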
Re: [ovirt-users] [RFI] oVirt 3.6 Planning
Is there a final planning feature list yet? On Fri, Sep 12, 2014 at 2:22 PM, Itamar Heim ih...@redhat.com wrote: With oVirt 3.5 nearing GA, time to ask for what do you want to see in oVirt 3.6? Thanks, Itamar ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] [RFI] oVirt 3.6 Planning
+1 ! For those legacy storage systems that need it. On Thu, Sep 18, 2014 at 5:44 PM, Sven Kieske s.kie...@mittwald.de wrote: +1! (don't know if an agent is needed but this situation occurs quite often ) On 18/09/14 17:35, Robert Story wrote: I've always thought it would be a good idea to have a 'storage agent' to run on storage domains, which could perform some operations more optimally than the current system. For example, exporting/importing domains currently reads the whole VM image over the network, then writes it back. In my case, my storage is all located on the same NFS server, which could simply do an OS copy locally, instead of reading/writing gigabytes over the network. I don't know if other storage types could perform similar optimizations -- Mit freundlichen Grüßen / Regards Sven Kieske Systemadministrator Mittwald CM Service GmbH Co. KG Königsberger Straße 6 32339 Espelkamp T: +49-5772-293-100 F: +49-5772-293-333 https://www.mittwald.de ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] [RFI] oVirt 3.6 Planning
I would really like to see ceph support through libvirtd. Since cephfs is FAR from stable and having an NFS server in the middle really f*cks up performance, ceph is simply not an option right now. It's the main reason we have to move away from ovirt - can't wait to get back to ovirt! On Fri, Sep 12, 2014 at 2:37 PM, Itamar Heim ih...@redhat.com wrote: On 09/12/2014 03:26 PM, Cédric Buot de l'Epine wrote: Ceph support ;) care to provide more details on how you envision your use case? we're currently contemplating adding the ceph support via Cinder, which would simplify adding other technologies/plugins later. Regards, Cédric On 12/09/2014 14:22, Itamar Heim wrote: With oVirt 3.5 nearing GA, time to ask for what do you want to see in oVirt 3.6? Thanks, Itamar ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] Ceph support
Hi all, Does somebody have *any* idea on when oVirt will start to support Ceph through libvirtd? I could mount an RBD volume onto a server and then export it as NFS, but that kills my I/O throughput quite severely (write IOPS go down by 84%). Ceph is the way to go for storage needs if you ask me, but I'd rather not move away from ovirt because there's no support. -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[ovirt-users] RBD as local disk
Hi all, I'm a big fan of both Ceph and oVirt. But since there's no native support for Ceph in oVirt, I'm wondering if the following is possible: mount an RBD block device on the oVirt hosts as a local disk. I know oVirt can use local disks, but I couldn't find out whether I can still do failover etc. if the local disk is actually an RBD block device. Hope somebody can help me. -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] [Ann] oVirt 3.4 GA Releases
Hey Bob, Execute yum clean all first; that solved the same issue on this end. On Thu, Mar 27, 2014 at 5:19 PM, Bob Doolittle b...@doolittle.us.com wrote: I am for some reason unable to get an update for my RHEL6 system for the latest ovirt-release-el6 RPM. My version remains at 10.0.1-3 instead of 11.1.0-1. I assume this is why I have no ovirt-3.3-stable repo available, only ovirt-3.3.3. I've attached my repo file if interested. -Bob On 03/27/2014 08:37 AM, Sandro Bonazzola wrote: On 27/03/2014 11:57, Gianluca Cecchi wrote: On Thu, Mar 27, 2014 at 10:52 AM, Brian Proffitt bprof...@redhat.com wrote: The existing repository ovirt-stable has been updated for delivering this release without the need of enabling any other repository. Is there a way to migrate from 3.3.3 to 3.3.4 without directly passing to 3.4? Yes, just disable the 3.4 and stable repos, keeping the 3.3 repo enabled: yum-config-manager --disable ovirt-stable yum-config-manager --disable ovirt-3.4-stable yum-config-manager --enable ovirt-3.3-stable Thanks, Gianluca -- Met vriendelijke groeten / With kind regards, Johan Kooijman T +31(0) 6 43 44 45 27 F +31(0) 162 82 00 01 E m...@johankooijman.com ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[Users] snapshot
Hey all, Am I missing something or is it not possible to snapshot a virtual disk onto a different storage domain? -- Met vriendelijke groeten / With kind regards, Johan Kooijman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Changing IP address of NFS storage server
It's a rare occasion, but shouldn't this be editable from the web interface? On Wed, Mar 19, 2014 at 6:31 PM, John Taylor jtt77...@gmail.com wrote: Hi René, I think you can change it in the storage_server_connections table -John On Wed, Mar 19, 2014 at 12:55 PM, René Koch rk...@linuxland.at wrote: Hi, I have an old oVirt 3.2 setup and I have to change the IP address of the data storage domain (NFS). Even if I shut down all VMs I can't detach the data storage domain and reimport it, as VMs and templates are configured in oVirt which have their disks on this storage domain. Is there a way to change the IP? Maybe using the database? Would this be possible in 3.3 or 3.4? Thanks a lot for tips! -- Best Regards René Koch Senior Solution Architect LIS-Linuxland GmbH Brünner Straße 163, A-1210 Vienna Phone: +43 1 236 91 60 Mobile: +43 660 / 512 21 31 E-Mail: rk...@linuxland.at -- Met vriendelijke groeten / With kind regards, Johan Kooijman T +31(0) 6 43 44 45 27 F +31(0) 162 82 00 01 E m...@johankooijman.com ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
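John's suggestion, editing the storage_server_connections table directly, would look roughly like the statement built below. The table name comes from the thread; the "connection" column name and the address:path format are assumptions about the engine 3.2 schema, so verify them against your database (and take a backup) before touching anything.

```python
# Hedged sketch: build the SQL to repoint an NFS storage connection in the
# engine DB. Column name "connection" is an assumption about the schema.
def build_update(old_addr: str, new_addr: str, export: str) -> str:
    old_conn = f"{old_addr}:{export}"
    new_conn = f"{new_addr}:{export}"
    return (
        "UPDATE storage_server_connections "
        f"SET connection = '{new_conn}' "
        f"WHERE connection = '{old_conn}';"
    )

sql = build_update("10.0.0.1", "10.0.0.2", "/export/data")
```

You would run the resulting statement with psql against the engine database while the storage domain is in maintenance and the engine is stopped.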
Re: [Users] Nodes lose storage at random
Gents, I'm sending this email for archiving purposes: it's been a while since my last update on this topic. It turned out that although only one node at a time, at random, lost connection to storage, the issue was not at all with oVirt, but with the storage. I'd like to refer to these 2 topics for more information: http://lists.freebsd.org/pipermail/freebsd-net/2014-March/038061.html and http://lists.freebsd.org/pipermail/freebsd-net/2014-February/037967.html. Keywords: ovirt freebsd 9.2 zfs ixgbe intel 10gbit NFS On Mon, Feb 24, 2014 at 3:55 PM, Ronen Hod r...@redhat.com wrote: On 02/24/2014 11:48 AM, Nir Soffer wrote: - Original Message - From: Johan Kooijman m...@johankooijman.com To: Nir Soffer nsof...@redhat.com Cc: users users@ovirt.org Sent: Monday, February 24, 2014 2:45:59 AM Subject: Re: [Users] Nodes lose storage at random Interestingly enough - same thing happened today, around the same time. Logs from this host are attached. Around 1:10 AM stuff starts to go wrong again. Same pattern - we reboot the node and the node is fine again. So we made some progress: we know that it is not a problem with an old kernel. In messages we see the same picture: 1. sanlock fails to renew the lease 2. after 80 seconds, it kills vdsm 3. sanlock and vdsm cannot access the storage 4. the kernel complains about nfs server timeouts (explains why sanlock failed to renew the lease) 5. after reboot, nfs is accessible again 6. after a few days, goto step 1 This looks like a kernel nfs issue. Could also be a kvm issue (running bsd on one of the vms?) Could also be some incompatibility with the nfs server - maybe you are using esoteric configuration options? CCing Ronen, in case this is related to kvm. Does not seem to be related to KVM. Adding Ric Wheeler. Ronen. 
thread: http://lists.ovirt.org/pipermail/users/2014-February/021507.html Nir -- Met vriendelijke groeten / With kind regards, Johan Kooijman T +31(0) 6 43 44 45 27 F +31(0) 162 82 00 01 E m...@johankooijman.com ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] oVirt 3.5 planning - bandwidth accounting
Dan, How about storing the rx_byte per 5 minutes in the engine DB? That way a reset of the counters has minimal impact, and analytics such as traffic for VM x in month Y could be made. Another approach could be to have iptables keep the count? On Thu, Feb 27, 2014 at 1:03 PM, Dan Kenigsberg dan...@redhat.com wrote: There are users that would like to tell how much traffic each vnic of each VM has consumed in a period of time. Currently, we report only bitrate as a percentage of an estimated vnic speed. Integrating this value over time is inefficient and error prone. I suggest to have all the stack (Vdsm, Engine, dwh) report the actually-transmitted (and actually-received) byte count on each vnic, as well as the time when the sample was taken. Currently, Vdsm reports 'eth0': {'rxDropped': '0', 'rxErrors': '0', 'rxRate': '8.0', 'speed': '1000', 'state': 'up', 'txDropped': '0', 'txErrors': '0', 'txRate': '10.0'}, but it should add rxKiBytes, txKiBytes and time to the frill. GUI could still calculate the rate for illustration, based on the raw transmission and the sample time. Until we break backward compatibility, we'd keep reporting the flaky rxRate/txRate, too. I can think of only two problems with this approach: Linux byte counters would eventually reset when they overflow. This is currently hidden by Vdsm, but with the suggested change, would have to be handled by higher levels of the stack. A similar problem appears on migration: the counters would reset and Engine would need to know how to keep up the accounting properly. I've opened Bug 1066570 - [RFE] Report actual rx_byte instead of a false rxRate to track this request of mine. -- Met vriendelijke groeten / With kind regards, Johan Kooijman T +31(0) 6 43 44 45 27 F +31(0) 162 82 00 01 E m...@johankooijman.com ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
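Dan's proposal pushes wraparound handling up the stack: consumers would get raw byte counts plus sample times and compute deltas themselves. A minimal sketch of that consumer-side logic, assuming a 32-bit counter (real NICs may expose 64-bit counters, and this code cannot distinguish a migration reset from a wrap):

```python
# Delta between two raw byte-counter samples, tolerating one wraparound.
COUNTER_MAX = 2**32  # assumed 32-bit counter width

def delta(prev: int, curr: int, width: int = COUNTER_MAX) -> int:
    """Bytes transferred between two samples."""
    if curr >= prev:
        return curr - prev
    # Counter wrapped (or was reset) since the previous sample.
    return width - prev + curr

def rate_kibps(prev: int, curr: int, prev_t: float, curr_t: float) -> float:
    """Average KiB/s between samples, as a GUI might derive it for display."""
    return delta(prev, curr) / 1024 / (curr_t - prev_t)
```

A reset at migration time would be misattributed as a wrap by this logic, which is exactly the Engine-side bookkeeping problem Dan mentions.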
[Users] Stack trace caused by FreeBSD client
Hi all, Interesting thing I found out this afternoon. I have a FreeBSD 10 guest with virtio drivers, both disk and net. The VM works fine, but when I connect over SSH to the VM, I see this stack trace in messages on the node: Feb 23 19:19:42 hv3 kernel: [ cut here ] Feb 23 19:19:42 hv3 kernel: WARNING: at net/core/dev.c:1907 skb_warn_bad_offload+0xc2/0xf0() (Tainted: GW --- ) Feb 23 19:19:42 hv3 kernel: Hardware name: X9DR3-F Feb 23 19:19:42 hv3 kernel: igb: caps=(0x12114bb3, 0x0) len=5686 data_len=5620 ip_summed=0 Feb 23 19:19:42 hv3 kernel: Modules linked in: ebt_arp nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding 8021q garp ebtable_nat ebtables bridge stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_round_robin dm_multipath vhost_net macvtap macvlan tun kvm_intel kvm iTCO_wdt iTCO_vendor_support sg ixgbe mdio sb_edac edac_core lpc_ich mfd_core i2c_i801 ioatdma igb dca i2c_algo_bit i2c_core ptp pps_core ext4 jbd2 mbcache sd_mod crc_t10dif 3w_sas ahci isci libsas scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Feb 23 19:19:42 hv3 kernel: Pid: 15280, comm: vhost-15276 Tainted: G W ---2.6.32-431.5.1.el6.x86_64 #1 Feb 23 19:19:42 hv3 kernel: Call Trace: Feb 23 19:19:42 hv3 kernel: IRQ [81071e27] ? warn_slowpath_common+0x87/0xc0 Feb 23 19:19:42 hv3 kernel: [81071f16] ? warn_slowpath_fmt+0x46/0x50 Feb 23 19:19:42 hv3 kernel: [a016c862] ? igb_get_drvinfo+0x82/0xe0 [igb] Feb 23 19:19:42 hv3 kernel: [8145b1d2] ? skb_warn_bad_offload+0xc2/0xf0 Feb 23 19:19:42 hv3 kernel: [814602c1] ? __skb_gso_segment+0x71/0xc0 Feb 23 19:19:42 hv3 kernel: [81460323] ? skb_gso_segment+0x13/0x20 Feb 23 19:19:42 hv3 kernel: [814603cb] ? dev_hard_start_xmit+0x9b/0x480 Feb 23 19:19:42 hv3 kernel: [8147bf5a] ? sch_direct_xmit+0x15a/0x1c0 Feb 23 19:19:42 hv3 kernel: [81460a58] ? 
dev_queue_xmit+0x228/0x320 Feb 23 19:19:42 hv3 kernel: [a035a898] ? br_dev_queue_push_xmit+0x88/0xc0 [bridge] Feb 23 19:19:42 hv3 kernel: [a035a928] ? br_forward_finish+0x58/0x60 [bridge] Feb 23 19:19:42 hv3 kernel: [a035a9da] ? __br_forward+0xaa/0xd0 [bridge] Feb 23 19:19:42 hv3 kernel: [814897b6] ? nf_hook_slow+0x76/0x120 Feb 23 19:19:42 hv3 kernel: [a035aa5d] ? br_forward+0x5d/0x70 [bridge] Feb 23 19:19:42 hv3 kernel: [a035ba6b] ? br_handle_frame_finish+0x17b/0x2a0 [bridge] Feb 23 19:19:42 hv3 kernel: [a035bd3a] ? br_handle_frame+0x1aa/0x250 [bridge] Feb 23 19:19:42 hv3 kernel: [8145b7c9] ? __netif_receive_skb+0x529/0x750 Feb 23 19:19:42 hv3 kernel: [8145ba8a] ? process_backlog+0x9a/0x100 Feb 23 19:19:42 hv3 kernel: [81460d43] ? net_rx_action+0x103/0x2f0 Feb 23 19:19:42 hv3 kernel: [8107a8e1] ? __do_softirq+0xc1/0x1e0 Feb 23 19:19:42 hv3 kernel: [8100c30c] ? call_softirq+0x1c/0x30 Feb 23 19:19:42 hv3 kernel: EOI [8100fa75] ? do_softirq+0x65/0xa0 Feb 23 19:19:42 hv3 kernel: [814611c8] ? netif_rx_ni+0x28/0x30 Feb 23 19:19:42 hv3 kernel: [a01a0749] ? tun_sendmsg+0x229/0x4ec [tun] Feb 23 19:19:42 hv3 kernel: [a027bcf5] ? handle_tx+0x275/0x5e0 [vhost_net] Feb 23 19:19:42 hv3 kernel: [a027c095] ? handle_tx_kick+0x15/0x20 [vhost_net] Feb 23 19:19:42 hv3 kernel: [a027955c] ? vhost_worker+0xbc/0x140 [vhost_net] Feb 23 19:19:42 hv3 kernel: [a02794a0] ? vhost_worker+0x0/0x140 [vhost_net] Feb 23 19:19:42 hv3 kernel: [8109aee6] ? kthread+0x96/0xa0 Feb 23 19:19:42 hv3 kernel: [8100c20a] ? child_rip+0xa/0x20 Feb 23 19:19:42 hv3 kernel: [8109ae50] ? kthread+0x0/0xa0 Feb 23 19:19:42 hv3 kernel: [8100c200] ? child_rip+0x0/0x20 Feb 23 19:19:42 hv3 kernel: ---[ end trace e93142595d6ecfc7 ]--- This is 100% reproducable, every time. The login itself works just fine. 
Some more info: [root@hv3 ~]# uname -a Linux hv3.ovirt.gs.cloud.lan 2.6.32-431.5.1.el6.x86_64 #1 SMP Wed Feb 12 00:41:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux [root@hv3 ~]# rpm -qa | grep vdsm vdsm-4.13.3-3.el6.x86_64 vdsm-xmlrpc-4.13.3-3.el6.noarch vdsm-python-4.13.3-3.el6.x86_64 vdsm-cli-4.13.3-3.el6.noarch -- Met vriendelijke groeten / With kind regards, Johan Kooijman E m...@johankooijman.com ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Nodes lose storage at random
Thanks for looking into it. I've been running the oVirt ISO until now; will switch to stock C6.5 to see if it makes a difference. On Sat, Feb 22, 2014 at 8:57 PM, Nir Soffer nsof...@redhat.com wrote: - Original Message - From: Johan Kooijman m...@johankooijman.com To: Nir Soffer nsof...@redhat.com Cc: users users@ovirt.org Sent: Wednesday, February 19, 2014 2:34:36 PM Subject: Re: [Users] Nodes lose storage at random Messages: https://t-x.dignus.nl/messages.txt Sanlock: https://t-x.dignus.nl/sanlock.log.txt We can see in /var/log/messages that sanlock failed to write to the ids lockspace [1], which after 80 seconds [2] caused vdsm to lose its host id lease. In this case, sanlock kills vdsm [3], which dies after 11 retries [4]. Then vdsm is respawned again [5]. This is expected. We don't know why sanlock failed to write to the storage, but in [6] the kernel tells us that the nfs server is not responding. Since the nfs server is accessible from other machines, it means you have some issue with this host. Later the machine reboots [7], and the nfs server is still not accessible. Then you have a lot of WARN_ON call traces [8] that look related to network code. We can see that you are not running the most recent kernel [7]. We experienced various nfs issues during the 6.5 beta. I would try to get help from kernel folks about this. 
[1] Feb 18 10:47:46 hv5 sanlock[14753]: 2014-02-18 10:47:46+ 1251833 [21345]: s2 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/10.0.24.1: _santank_ovirt-data/e9f70496-f181-4c9b-9ecb-d7f780772b04/dom_md/ids [2] Feb 18 10:48:35 hv5 sanlock[14753]: 2014-02-18 10:48:35+ 1251882 [14753]: s2 check_our_lease failed 80 [3] Feb 18 10:48:35 hv5 sanlock[14753]: 2014-02-18 10:48:35+ 1251882 [14753]: s2 kill 19317 sig 15 count 1 [4] Feb 18 10:48:45 hv5 sanlock[14753]: 2014-02-18 10:48:45+ 1251892 [14753]: dead 19317 ci 3 count 11 [5] Feb 18 10:48:45 hv5 respawn: slave '/usr/share/vdsm/vdsm' died, respawning slave [6] Feb 18 10:57:36 hv5 kernel: nfs: server 10.0.24.1 not responding, timed out [7] Feb 18 11:03:01 hv5 kernel: imklog 5.8.10, log source = /proc/kmsg started. Feb 18 11:03:01 hv5 kernel: Linux version 2.6.32-358.18.1.el6.x86_64 ( mockbu...@c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Aug 28 17:19:38 UTC 2013 [8] Feb 18 18:29:53 hv5 kernel: [ cut here ] Feb 18 18:29:53 hv5 kernel: WARNING: at net/core/dev.c:1759 skb_gso_segment+0x1df/0x2b0() (Not tainted) Feb 18 18:29:53 hv5 kernel: Hardware name: X9DRW Feb 18 18:29:53 hv5 kernel: igb: caps=(0x12114bb3, 0x0) len=1596 data_len=0 ip_summed=0 Feb 18 18:29:53 hv5 kernel: Modules linked in: ebt_arp nfs fscache auth_rpcgss nfs_acl bonding softdog ebtable_nat ebtables bnx2fc fcoe libfcoe libfc scsi_transport_fc scsi_tgt lockd sunrpc bridge ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables xt_physdev ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack xt_multi port ip6table_filter ip6_tables ext4 jbd2 8021q garp stp llc sha256_generic cbc cryptoloop dm_crypt aesni_intel cryptd aes_x86_64 aes_generic vhost_net macvtap macvlan tun kvm_ intel kvm sg sb_edac edac_core iTCO_wdt iTCO_vendor_support ioatdma shpchp dm_snapshot squashfs ext2 mbcache dm_round_robin sd_mod crc_t10dif isci libsas scsi_transport_sas 3w_ sas ahci ixgbe igb dca ptp 
pps_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xx x iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] Feb 18 18:29:53 hv5 kernel: Pid: 5462, comm: vhost-5458 Not tainted 2.6.32-358.18.1.el6.x86_64 #1 Feb 18 18:29:53 hv5 kernel: Call Trace: Feb 18 18:29:53 hv5 kernel: IRQ [8106e3e7] ? warn_slowpath_common+0x87/0xc0 Feb 18 18:29:53 hv5 kernel: [8106e4d6] ? warn_slowpath_fmt+0x46/0x50 Feb 18 18:29:53 hv5 kernel: [a020bd62] ? igb_get_drvinfo+0x82/0xe0 [igb] Feb 18 18:29:53 hv5 kernel: [81448e7f] ? skb_gso_segment+0x1df/0x2b0 Feb 18 18:29:53 hv5 kernel: [81449260] ? dev_hard_start_xmit+0x1b0/0x530 Feb 18 18:29:53 hv5 kernel: [8146773a] ? sch_direct_xmit+0x15a/0x1c0 Feb 18 18:29:53 hv5 kernel: [8144d0c0] ? dev_queue_xmit+0x3b0/0x550 Feb 18 18:29:53 hv5 kernel: [a04af65c] ? br_dev_queue_push_xmit+0x6c/0xa0 [bridge] Feb 18 18:29:53 hv5 kernel: [a04af6e8] ? br_forward_finish+0x58/0x60 [bridge] Feb 18 18:29:53 hv5 kernel: [a04af79a] ? __br_forward+0xaa/0xd0 [bridge] Feb 18 18:29:53 hv5 kernel: [81474f34] ? nf_hook_slow+0x74/0x110 Feb 18 18:29:53 hv5 kernel: [a04af81d] ? br_forward+0x5d/0x70 [bridge] Feb 18 18:29:53 hv5 kernel: [a04b0609] ? br_handle_frame_finish+0x179/0x2a0 [bridge] Feb 18 18:29:53 hv5 kernel: [a04b08da] ? br_handle_frame
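The failure sequence Nir reconstructs from excerpts [1]-[5] can be spotted mechanically: a delta_renew read error, then check_our_lease failing at the 80-second limit, then sanlock killing vdsm. A small illustrative classifier over the sanlock log lines quoted above (a reading aid only, not a vdsm tool):

```python
# Classify sanlock log lines into the stages of the lease-loss sequence.
import re

RENEW_FAIL = re.compile(r"delta_renew read rv (-\d+)")   # storage write failed
LEASE_FAIL = re.compile(r"check_our_lease failed (\d+)") # 80s lease timeout
KILL       = re.compile(r"kill (\d+) sig (\d+)")         # sanlock kills vdsm

def classify(line: str) -> str:
    if RENEW_FAIL.search(line):
        return "renewal-error"
    if LEASE_FAIL.search(line):
        return "lease-lost"
    if KILL.search(line):
        return "vdsm-killed"
    return "other"

# Lines abbreviated from the excerpts above.
events = [classify(l) for l in [
    "2014-02-18 10:47:46+ 1251833 [21345]: s2 delta_renew read rv -202 offset 0",
    "2014-02-18 10:48:35+ 1251882 [14753]: s2 check_our_lease failed 80",
    "2014-02-18 10:48:35+ 1251882 [14753]: s2 kill 19317 sig 15 count 1",
]]
```

Seeing this exact progression in sanlock.log, as here, points at the storage path rather than at vdsm itself.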
Re: [Users] Nodes lose storage at random
Reinstalled with stock CentOS 6.5 last night, all successful. Until roughly midnight GMT, 2 out of 4 hosts were showing the same errors. Any more suggestions? On Sat, Feb 22, 2014 at 8:57 PM, Nir Soffer nsof...@redhat.com wrote: - Original Message - From: Johan Kooijman m...@johankooijman.com To: Nir Soffer nsof...@redhat.com Cc: users users@ovirt.org Sent: Wednesday, February 19, 2014 2:34:36 PM Subject: Re: [Users] Nodes lose storage at random Messages: https://t-x.dignus.nl/messages.txt Sanlock: https://t-x.dignus.nl/sanlock.log.txt We can see in /var/log/messages that sanlock failed to write to the ids lockspace [1], which after 80 seconds [2] caused vdsm to lose its host id lease. In this case, sanlock kills vdsm [3], which dies after 11 retries [4]. Then vdsm is respawned again [5]. This is expected. We don't know why sanlock failed to write to the storage, but in [6] the kernel tells us that the nfs server is not responding. Since the nfs server is accessible from other machines, it means you have some issue with this host. Later the machine reboots [7], and the nfs server is still not accessible. Then you have a lot of WARN_ON call traces [8] that look related to network code. We can see that you are not running the most recent kernel [7]. We experienced various nfs issues during the 6.5 beta. I would try to get help from kernel folks about this. 
[1] Feb 18 10:47:46 hv5 sanlock[14753]: 2014-02-18 10:47:46+ 1251833 [21345]: s2 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/10.0.24.1: _santank_ovirt-data/e9f70496-f181-4c9b-9ecb-d7f780772b04/dom_md/ids [2] Feb 18 10:48:35 hv5 sanlock[14753]: 2014-02-18 10:48:35+ 1251882 [14753]: s2 check_our_lease failed 80 [3] Feb 18 10:48:35 hv5 sanlock[14753]: 2014-02-18 10:48:35+ 1251882 [14753]: s2 kill 19317 sig 15 count 1 [4] Feb 18 10:48:45 hv5 sanlock[14753]: 2014-02-18 10:48:45+ 1251892 [14753]: dead 19317 ci 3 count 11 [5] Feb 18 10:48:45 hv5 respawn: slave '/usr/share/vdsm/vdsm' died, respawning slave [6] Feb 18 10:57:36 hv5 kernel: nfs: server 10.0.24.1 not responding, timed out [7] Feb 18 11:03:01 hv5 kernel: imklog 5.8.10, log source = /proc/kmsg started. Feb 18 11:03:01 hv5 kernel: Linux version 2.6.32-358.18.1.el6.x86_64 ( mockbu...@c6b10.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Aug 28 17:19:38 UTC 2013 [8] Feb 18 18:29:53 hv5 kernel: [ cut here ] Feb 18 18:29:53 hv5 kernel: WARNING: at net/core/dev.c:1759 skb_gso_segment+0x1df/0x2b0() (Not tainted) Feb 18 18:29:53 hv5 kernel: Hardware name: X9DRW Feb 18 18:29:53 hv5 kernel: igb: caps=(0x12114bb3, 0x0) len=1596 data_len=0 ip_summed=0 Feb 18 18:29:53 hv5 kernel: Modules linked in: ebt_arp nfs fscache auth_rpcgss nfs_acl bonding softdog ebtable_nat ebtables bnx2fc fcoe libfcoe libfc scsi_transport_fc scsi_tgt lockd sunrpc bridge ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables xt_physdev ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack xt_multi port ip6table_filter ip6_tables ext4 jbd2 8021q garp stp llc sha256_generic cbc cryptoloop dm_crypt aesni_intel cryptd aes_x86_64 aes_generic vhost_net macvtap macvlan tun kvm_ intel kvm sg sb_edac edac_core iTCO_wdt iTCO_vendor_support ioatdma shpchp dm_snapshot squashfs ext2 mbcache dm_round_robin sd_mod crc_t10dif isci libsas scsi_transport_sas 3w_ sas ahci ixgbe igb dca ptp 
pps_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xx x iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan] Feb 18 18:29:53 hv5 kernel: Pid: 5462, comm: vhost-5458 Not tainted 2.6.32-358.18.1.el6.x86_64 #1 Feb 18 18:29:53 hv5 kernel: Call Trace: Feb 18 18:29:53 hv5 kernel: IRQ [8106e3e7] ? warn_slowpath_common+0x87/0xc0 Feb 18 18:29:53 hv5 kernel: [8106e4d6] ? warn_slowpath_fmt+0x46/0x50 Feb 18 18:29:53 hv5 kernel: [a020bd62] ? igb_get_drvinfo+0x82/0xe0 [igb] Feb 18 18:29:53 hv5 kernel: [81448e7f] ? skb_gso_segment+0x1df/0x2b0 Feb 18 18:29:53 hv5 kernel: [81449260] ? dev_hard_start_xmit+0x1b0/0x530 Feb 18 18:29:53 hv5 kernel: [8146773a] ? sch_direct_xmit+0x15a/0x1c0 Feb 18 18:29:53 hv5 kernel: [8144d0c0] ? dev_queue_xmit+0x3b0/0x550 Feb 18 18:29:53 hv5 kernel: [a04af65c] ? br_dev_queue_push_xmit+0x6c/0xa0 [bridge] Feb 18 18:29:53 hv5 kernel: [a04af6e8] ? br_forward_finish+0x58/0x60 [bridge] Feb 18 18:29:53 hv5 kernel: [a04af79a] ? __br_forward+0xaa/0xd0 [bridge] Feb 18 18:29:53 hv5 kernel: [81474f34] ? nf_hook_slow+0x74/0x110 Feb 18 18:29:53 hv5 kernel: [a04af81d] ? br_forward+0x5d/0x70 [bridge] Feb 18 18:29:53 hv5 kernel: [a04b0609] ? br_handle_frame_finish+0x179/0x2a0 [bridge] Feb 18 18:29:53 hv5 kernel
Re: [Users] Nodes lose storage at random
Meital, It's been 4 days since the last crash - but 5 minutes ago one of the nodes had the same issues. I've been running the script on the SPM as you mentioned. It turns out that at the time the node went down, the SPM didn't have more remoteFileHandler processes than before or after the crash - 29. I'm not sure what to make of this piece of information. On Tue, Feb 18, 2014 at 2:56 PM, Meital Bourvine mbour...@redhat.com wrote: Hi Johan, Can you please run something like this on the spm node? while true; do echo "`date` `ps ax | grep -i remotefilehandler | wc -l`" >> /tmp/handler_num.txt; sleep 1; done When it happens again, please stop the script, and write here the maximum number and the time that it happened. Also, please check if process_pool_max_slots_per_domain is defined in /etc/vdsm/vdsm.conf, and if so, what's the value? (if it's not defined there, the default is 10) Thanks! -- *From: *Johan Kooijman m...@johankooijman.com *To: *Meital Bourvine mbour...@redhat.com *Cc: *users users@ovirt.org *Sent: *Tuesday, February 18, 2014 2:55:11 PM *Subject: *Re: [Users] Nodes lose storage at random To follow up on this: The setup has only ~80 VMs active right now. The 2 bug reports are not in scope for this setup; the issues occur at random, even when there's no activity (create/delete VMs), and there are only 4 directories in /rhev/data-center/mnt/. On Tue, Feb 18, 2014 at 1:51 PM, Johan Kooijman m...@johankooijman.com wrote: Meital, I'm running the latest stable oVirt, 3.3.3 on CentOS 6.5. For my nodes I use the node ISO CentOS 6 oVirt Node - 3.0.1 - 1.0.2.el6. I have no way of reproducing just yet. I can confirm that it's happening on all nodes in the cluster. And every time a node goes offline, this error pops up. Could the fact that lockd and statd were not running on the NFS host cause this error? Is there a workaround available that we know of? 
On Tue, Feb 18, 2014 at 12:57 PM, Meital Bourvine mbour...@redhat.comwrote: Hi Johan, Please take a look at this error (from vdsm.log): Thread-636938::DEBUG::2014-02-18 10:48:06,374::task::579::TaskManager.Task::(_updateState) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::moving from state init - state preparing Thread-636938::INFO::2014-02-18 10:48:06,375::logUtils::44::dispatcher::(wrapper) Run and protect: getVolumeSize(sdUUID='e9f70496-f181-4c9b-9ecb-d7f780772b04', spUUID='59980e09-b329-4254-b66e-790abd69e194', imgUUID='d50ecfbb-dc98-40cf-9b19-4bd402952aeb', volUUID='68fefe24-0346-4d0d-b377-ddd7be7be29c', options=None) Thread-636938::ERROR::2014-02-18 10:48:06,376::task::850::TaskManager.Task::(_setError) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::Unexpected error Thread-636938::DEBUG::2014-02-18 10:48:06,415::task::869::TaskManager.Task::(_run) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::Task._run: f4ce9a6e-0292-4071-9a24-a8d8fba7222b ('e9f70496-f181-4c9b-9ecb-d7f780772b04', '59980e09-b329-4254-b66e-790abd69e194', 'd50ecfbb-dc98-40cf-9b19-4bd402952aeb', '68fefe24-0346-4d0d-b377-ddd7be7be29c') {} failed - stopping task Thread-636938::DEBUG::2014-02-18 10:48:06,416::task::1194::TaskManager.Task::(stop) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::stopping in state preparing (force False) Thread-636938::DEBUG::2014-02-18 10:48:06,416::task::974::TaskManager.Task::(_decref) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::ref 1 aborting True Thread-636938::INFO::2014-02-18 10:48:06,416::task::1151::TaskManager.Task::(prepare) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::aborting: Task is aborted: u'No free file handlers in pool' - code 100 Thread-636938::DEBUG::2014-02-18 10:48:06,417::task::1156::TaskManager.Task::(prepare) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::Prepare: aborted: No free file handlers in pool And then you can see after a few seconds: MainThread::INFO::2014-02-18 10:48:45,258::vdsm::101::vds::(run) (PID: 1450) I am the actual vdsm 4.12.1-2.el6 
hv5.ovirt.gs.cloud.lan (2.6.32-358.18.1.el6.x86_64) Meaning that vdsm was restarted. Which oVirt version are you using? I see that there are a few old bugs that describes the same behaviour, but with different reproduction steps, for example [1], [2]. Can you think of any reproduction steps that might be causing this issue? [1] https://bugzilla.redhat.com/show_bug.cgi?id=948210 [2] https://bugzilla.redhat.com/show_bug.cgi?id=853011 -- *From: *Johan Kooijman m...@johankooijman.com *To: *users users@ovirt.org *Sent: *Tuesday, February 18, 2014 1:32:56 PM *Subject: *[Users] Nodes lose storage at random Hi All, We're seeing some weird issues in our ovirt setup. We have 4 nodes connected and an NFS (v3) filestore (FreeBSD/ZFS). Once in a while, it seems at random, a node loses their connection to storage, recovers it a minute later. The other nodes usually don't lose their storage
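For reference, the process_pool_max_slots_per_domain knob Meital mentions lives in /etc/vdsm/vdsm.conf. A hedged example of raising it follows; the [irs] section name should be verified against your vdsm version's defaults, and enlarging the pool only postpones the "No free file handlers in pool" symptom if the underlying NFS storage is stalling:

```ini
# /etc/vdsm/vdsm.conf -- illustrative fragment only; check the section name
# and default against your vdsm version before applying.
[irs]
# Max concurrent remote file handler slots per storage domain (default: 10).
# Raising it can delay "No free file handlers in pool" errors, but the root
# cause in this thread is storage latency, not the pool size.
process_pool_max_slots_per_domain = 20
```

vdsm must be restarted for the change to take effect.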
Re: [Users] Nodes lose storage at random
Nir, Messages: https://t-x.dignus.nl/messages.txt Sanlock: https://t-x.dignus.nl/sanlock.log.txt Any input is more than welcome! On Wed, Feb 19, 2014 at 10:38 AM, Nir Soffer nsof...@redhat.com wrote: - Original Message - From: Johan Kooijman m...@johankooijman.com To: users users@ovirt.org Sent: Tuesday, February 18, 2014 1:32:56 PM Subject: [Users] Nodes lose storage at random Hi All, We're seeing some weird issues in our ovirt setup. We have 4 nodes connected and an NFS (v3) filestore (FreeBSD/ZFS). Once in a while, it seems at random, a node loses their connection to storage, recovers it a minute later. The other nodes usually don't lose their storage at that moment. Just one, or two at a time. We've setup extra tooling to verify the storage performance at those moments and the availability for other systems. It's always online, just the nodes don't think so. In the logs, we see that vdsm was restarted: MainThread::DEBUG::2014-02-18 10:48:35,809::vdsm::45::vds::(sigtermHandler) Received signal 15 But we don't know why it happened. Please attach also /var/log/messages and /var/log/sanlock.log around the time that vdsm was restarted. Thanks, Nir -- Met vriendelijke groeten / With kind regards, Johan Kooijman T +31(0) 6 43 44 45 27 F +31(0) 162 82 00 01 E m...@johankooijman.com ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Nodes lose storage at random
Meital, I'm running the latest stable oVirt, 3.3.3 on Centos 6.5. For my nodes I use the node iso CentOS 6 oVirt Node - 3.0.1 - 1.0.2.el6. I have no ways of reproducing just yet. I can confirm that it's happening on all nodes in the cluster. And every time a node goes offline, this error pops up. Could the fact that lockd statd were not running on the NFS host cause this error? Is there a workaround available that we know of? On Tue, Feb 18, 2014 at 12:57 PM, Meital Bourvine mbour...@redhat.comwrote: Hi Johan, Please take a look at this error (from vdsm.log): Thread-636938::DEBUG::2014-02-18 10:48:06,374::task::579::TaskManager.Task::(_updateState) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::moving from state init - state preparing Thread-636938::INFO::2014-02-18 10:48:06,375::logUtils::44::dispatcher::(wrapper) Run and protect: getVolumeSize(sdUUID='e9f70496-f181-4c9b-9ecb-d7f780772b04', spUUID='59980e09-b329-4254-b66e-790abd69e194', imgUUID='d50ecfbb-dc98-40cf-9b19-4bd402952aeb', volUUID='68fefe24-0346-4d0d-b377-ddd7be7be29c', options=None) Thread-636938::ERROR::2014-02-18 10:48:06,376::task::850::TaskManager.Task::(_setError) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::Unexpected error Thread-636938::DEBUG::2014-02-18 10:48:06,415::task::869::TaskManager.Task::(_run) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::Task._run: f4ce9a6e-0292-4071-9a24-a8d8fba7222b ('e9f70496-f181-4c9b-9ecb-d7f780772b04', '59980e09-b329-4254-b66e-790abd69e194', 'd50ecfbb-dc98-40cf-9b19-4bd402952aeb', '68fefe24-0346-4d0d-b377-ddd7be7be29c') {} failed - stopping task Thread-636938::DEBUG::2014-02-18 10:48:06,416::task::1194::TaskManager.Task::(stop) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::stopping in state preparing (force False) Thread-636938::DEBUG::2014-02-18 10:48:06,416::task::974::TaskManager.Task::(_decref) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::ref 1 aborting True Thread-636938::INFO::2014-02-18 10:48:06,416::task::1151::TaskManager.Task::(prepare) 
Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::aborting: Task is aborted: u'No free file handlers in pool' - code 100 Thread-636938::DEBUG::2014-02-18 10:48:06,417::task::1156::TaskManager.Task::(prepare) Task=`f4ce9a6e-0292-4071-9a24-a8d8fba7222b`::Prepare: aborted: No free file handlers in pool And then you can see after a few seconds: MainThread::INFO::2014-02-18 10:48:45,258::vdsm::101::vds::(run) (PID: 1450) I am the actual vdsm 4.12.1-2.el6 hv5.ovirt.gs.cloud.lan (2.6.32-358.18.1.el6.x86_64) Meaning that vdsm was restarted. Which oVirt version are you using? I see that there are a few old bugs that describes the same behaviour, but with different reproduction steps, for example [1], [2]. Can you think of any reproduction steps that might be causing this issue? [1] https://bugzilla.redhat.com/show_bug.cgi?id=948210 [2] https://bugzilla.redhat.com/show_bug.cgi?id=853011 -- *From: *Johan Kooijman m...@johankooijman.com *To: *users users@ovirt.org *Sent: *Tuesday, February 18, 2014 1:32:56 PM *Subject: *[Users] Nodes lose storage at random Hi All, We're seeing some weird issues in our ovirt setup. We have 4 nodes connected and an NFS (v3) filestore (FreeBSD/ZFS). Once in a while, it seems at random, a node loses their connection to storage, recovers it a minute later. The other nodes usually don't lose their storage at that moment. Just one, or two at a time. We've setup extra tooling to verify the storage performance at those moments and the availability for other systems. It's always online, just the nodes don't think so. The engine tells me this: 2014-02-18 11:48:03,598 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-6-thread-48) domain d88764c8-ecc3-4f22-967e-2ce225ac4498:Export in problem. vds: hv5 2014-02-18 11:48:18,909 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-6-thread-48) domain e9f70496-f181-4c9b-9ecb-d7f780772b04:Data in problem. 
vds: hv5 2014-02-18 11:48:45,021 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-18) [46683672] Failed to refresh VDS , vds = 66e6aace-e51d-4006-bb2f-d85c2f1fd8d2 : hv5, VDS Network Error, continuing. 2014-02-18 11:48:45,070 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-41) [2ef1a894] Correlation ID: 2ef1a894, Call Stack: null, Custom Event ID: -1, Message: Invalid status on Data Center GS. Setting Data Center status to Non Responsive (On host hv5, Error: Network error during communication with the Host.). The export and data domain live over NFS. There's another domain, ISO, that lives on the engine machine, also shared over NFS. That domain doesn't have any issue at all. Attached are the logfiles for the relevant time period for both the engine server and the node. The node by the way, is a deployment
Re: [Users] Nodes lose storage at random
To follow up on this: the setup has only ~80 VMs active right now. The two bug reports are not in scope for this setup; the issues occur at random, even when there's no activity (creating/deleting VMs), and there are only 4 directories in /rhev/data-center/mnt/.

On Tue, Feb 18, 2014 at 1:51 PM, Johan Kooijman m...@johankooijman.com wrote:

Meital,

I'm running the latest stable oVirt, 3.3.3, on CentOS 6.5. For my nodes I use the node ISO CentOS 6 oVirt Node - 3.0.1 - 1.0.2.el6. I have no way of reproducing it just yet. I can confirm that it's happening on all nodes in the cluster, and every time a node goes offline this error pops up. Could the fact that lockd and statd were not running on the NFS host cause this error? Is there a workaround available that we know of?
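The "in problem" warnings that engine.log emits for this setup (the IrsBrokerCommand lines quoted in this thread) can be tallied per storage domain to see whether the same domains keep failing. A hedged sketch; the line format is copied from the quoted log lines and may differ in other engine versions:

```shell
# Print "count DomainName (UUID)" for each storage domain an engine log
# reports "in problem". Expects lines like:
#   ... domain <UUID>:<Name> in problem. vds: <host>
domains_in_problem() {
    grep 'in problem' "$1" \
        | sed -n 's/.*domain \([^:]*\):\([^ ]*\) in problem.*/\2 (\1)/p' \
        | sort | uniq -c
}

# On the engine (path is an assumption):
# domains_in_problem /var/log/ovirt-engine/engine.log
```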
Re: [Users] Nodes lose storage at random
Ok, will do. The process_pool_max_slots_per_domain is not defined; default node values.

On Tue, Feb 18, 2014 at 2:56 PM, Meital Bourvine mbour...@redhat.com wrote:

Hi Johan,

Can you please run something like this on the SPM node?

while true; do echo `date; ps ax | grep -i remotefilehandler | wc -l` >> /tmp/handler_num.txt; sleep 1; done

When it happens again, please stop the script and write here the maximum number and the time that it happened.

Also, please check whether process_pool_max_slots_per_domain is defined in /etc/vdsm/vdsm.conf, and if so, what its value is. (If it's not defined there, the default is 10.)

Thanks!
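Meital's sampling one-liner can be expanded into a small function. This is a sketch only, assuming (as her command does) that the handler helpers show up as "remotefilehandler" in ps output; the sample count parameter is added here to keep the loop bounded:

```shell
# Append one "timestamp count" line per sample to $1, taking $2 samples.
# The bracketed grep pattern keeps the grep process itself out of the count.
sample_handlers() {
    out=$1
    samples=$2
    i=0
    while [ "$i" -lt "$samples" ]; do
        n=$(ps ax | grep -i '[r]emotefilehandler' | wc -l | tr -d ' ')
        echo "$(date '+%Y-%m-%d %H:%M:%S') $n" >> "$out"
        i=$((i + 1))
        sleep 1
    done
}

# On the SPM node you would run it open-ended, e.g.:
#   while true; do sample_handlers /tmp/handler_num.txt 60; done
# and afterwards find the peak with something like:
#   sort -k3 -n /tmp/handler_num.txt | tail -1
```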
Re: [Users] Nodes lose storage at random
One other interesting fact is that each node has 4 NFS mountpoints: 2 (data and export) to the main SAN, 1 to the engine machine for ISO, and one to the legacy SAN. When this issue occurs, the only mountpoints in a problem state seem to be the 2 to the main SAN:

2014-02-18 11:48:03,598 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-6-thread-48) domain d88764c8-ecc3-4f22-967e-2ce225ac4498:Export in problem. vds: hv5
2014-02-18 11:48:18,909 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-6-thread-48) domain e9f70496-f181-4c9b-9ecb-d7f780772b04:Data in problem. vds: hv5
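To check which of a node's NFS mountpoints point at which server (e.g. to separate the main-SAN mounts from the ISO and legacy ones), the mounts vdsm holds can be read from /proc/mounts. A sketch, assuming Linux's /proc/mounts format and the /rhev/data-center/mnt layout mentioned earlier in the thread:

```shell
# Print "mountpoint server" for every NFS mount under /rhev/ found in a
# mounts table ($1, normally /proc/mounts; fields: device mountpoint fstype ...).
list_rhev_nfs_mounts() {
    awk '$3 ~ /^nfs/ && $2 ~ /\/rhev\// { split($1, src, ":"); print $2, src[1] }' "$1"
}

# On a node: list_rhev_nfs_mounts /proc/mounts
```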
Re: [Users] Fwd: Your stand proposal for oVirt has been accepted
Will be there at the meetup!

On Tue, Dec 17, 2013 at 10:00 AM, Dave Neary dne...@redhat.com wrote:

Hi everyone,

Great news! We will have an oVirt stand at FOSDEM in Brussels this year! Brian and I will be looking for volunteers to man the stand and spread the love about oVirt over the next few weeks. Please let us know if you plan to attend FOSDEM, we would love to see you there!

Also, I would love to have an oVirt community meet-up for beers on Saturday evening. If we did, would you be interested in attending? Let us know!

Thanks, Dave.

Original Message
Subject: Your stand proposal for oVirt has been accepted
Date: Mon, 16 Dec 2013 22:28:20 +0100 (CET)
From: FOSDEM Stands Team sta...@fosdem.org
To: Dave Neary dne...@redhat.com

Hi Dave,

The FOSDEM stands team is glad to inform you that your request for a stand for oVirt has been accepted. There will be one table reserved for you. You will receive further information about what's expected of you closer to the event date.

Looking forward to seeing you at FOSDEM 2014!

Kind regards,
Wynke Stulemeijer
FOSDEM stands team

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

--
Met vriendelijke groeten / With kind regards,
Johan Kooijman

T +31(0) 6 43 44 45 27
F +31(0) 162 82 00 01
E m...@johankooijman.com
[Users] Sanlock / selinux
Hey all,

When I try to do a migration from one node to another (oVirt 3.3 CentOS, ovirt-node 3.0.1 CentOS), I'm getting the following error:

2013-12-13 13:09:42.108+: 11887: error : virCommandHandshakeWait:2554 : internal error Failed to open socket to sanlock daemon: Permission denied

This can be resolved by executing:

[root@hv1 ~]# setsebool -P virt_use_sanlock=on
[root@hv1 ~]# setsebool -P virt_use_nfs=on

But it's not persistent after a reboot. How can I make that a persistent setting?

--
Met vriendelijke groeten / With kind regards,
Johan Kooijman

T +31(0) 6 43 44 45 27
F +31(0) 162 82 00 01
E m...@johankooijman.com
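Whatever the reason -P does not survive a reboot on this node image, one pragmatic workaround is to re-apply the booleans at boot, e.g. from /etc/rc.local. A hedged sketch; getsebool/setsebool are the standard SELinux tools, and the command indirection here is only so the logic can be dry-run on machines without SELinux:

```shell
# Re-enable an SELinux boolean if it is currently off.
# GETSEBOOL/SETSEBOOL default to the real tools but can be overridden,
# e.g. with stubs, for testing the logic without touching policy.
GETSEBOOL=${GETSEBOOL:-getsebool}
SETSEBOOL=${SETSEBOOL:-setsebool}

ensure_bool_on() {
    if $GETSEBOOL "$1" 2>/dev/null | grep -q 'off$'; then
        $SETSEBOOL -P "$1" on
    fi
}

# e.g. in rc.local:
#   ensure_bool_on virt_use_sanlock
#   ensure_bool_on virt_use_nfs
```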