[ovirt-users] HE (3.6) on gluster storage, chicken and egg status
I've managed to get hosted-engine running but could use some direction for what comes next.

Summary:
- 4 hosts, CentOS 7.2, oVirt 3.6 (from the ovirt.org repo)
- All hosts are in the default/default datacenter/cluster
- All 4 hosts are running glusterd
- HE is installed on a glusterfs volume, replica 3, on hosts 1, 2, and 3
  (started with replica 4, but removed the 4th brick to satisfy HE --deploy)

Status:
- All hosts are active
- Datacenter is not initialized

Issues:
- Attaching hosted_storage to the Default datacenter fails
- The "New volume" dialog doesn't allow me to select a datacenter (the list is empty)

Questions:
1. Why doesn't replica 4 work as a glusterfs volume? Is this just a limitation of the installer, or is there a more fundamental reason?
2. I assume the reason I can't create new volumes is that I don't have a data storage domain configured yet. I want all of my data storage to be glusterfs. How do I escape this chicken/egg puzzle?
3. What question should I be asking that I am not?

Thanks

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
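For context, the replica-4 to replica-3 step mentioned above would look roughly like this on the gluster CLI (a sketch; the volume name and brick paths are hypothetical, not the poster's actual values):

```shell
# Sketch: drop one replica copy from a replica-4 volume so that
# 'hosted-engine --deploy' accepts it as replica 3.
# Volume name and brick paths are made-up examples.
gluster volume remove-brick engine replica 3 \
    host4:/gluster/engine/brick force

# Verify the new replica count and remaining bricks
gluster volume info engine
```

Reducing the replica count removes a full copy rather than migrating data, which is why `force` is used here instead of the `start`/`commit` sequence used for distribute bricks.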
Re: [ovirt-users] HE (3.6) on gluster storage, chicken and egg status
On Thu, Dec 31, 2015 at 3:44 AM, Donny Davis wrote:
> I would say you would be much better off picking two hosts to do your HE on
> and setting up drbd for the HE storage. You will have fewer problems with
> your HE.

Is drbd an oVirt thing? Or are you suggesting a db backup or a replicated SQL db solution that I would cut over to manually?

On Thu, Dec 31, 2015 at 1:41 AM, Sahina Bose wrote:
>>
>> If you mean creating new gluster volumes - you need to make sure the
>> gluster service is enabled on the Default cluster. The cluster that HE
>> creates has only the virt service enabled by default. Engine should have
>> been installed in "Both" mode like Roy mentioned.

I went over those settings with limited success. I went through multiple iterations of trying to stand up HE on gluster storage. Then I gave up on that and tried NFS storage. Ultimately it always came down to sanlock errors (with both gluster and NFS storage). I tried restarting the sanlock service, which would lead to the watchdog rebooting the hosts. When a host came back up it almost seemed like things had started working: I could begin to see/create gluster volumes (see hosted_storage, or begin to create a data storage domain). But when I would try to activate the hosted_storage domain, things would start to fall apart again - sanlock, as far as I can tell.

I am currently running the engine on a physical system and things are working fine. I am considering taking a backup and attempting to use the HE physical-to-VM migration method, time permitting.

>> On 12/28/2015 12:43 AM, Roy Golan wrote:
>>
>> 3-way replica is the officially supported replica count for the VM store
>> use case. If you wish to work with replica 4, you can update the
>> supported_replica_count in vdsm.conf

Thanks for that insight. I think I just experienced the bad aspects of both quorum=auto and quorum=none. I don't like replica 3, because you can only have one brick offline at a time.
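Roy's vdsm.conf suggestion would look something like this (a sketch: the `supported_replica_count` key name comes from his reply, but the section name and exact syntax below are assumptions - verify against your vdsm version before relying on it):

```shell
# Sketch of Roy's suggestion: allow replica 4 for VM storage by
# overriding supported_replica_count in vdsm.conf on each host.
# The [gluster] section name is an assumption, not confirmed
# against vdsm's actual defaults - check your version.
cat >> /etc/vdsm/vdsm.conf <<'EOF'
[gluster]
supported_replica_count = 4
EOF

# Restart vdsm so the override is picked up
systemctl restart vdsmd
```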
I think N+2 should be the target for a production environment (so you have the capacity for a failure while doing maintenance). Would adding an arbiter affect the quorum status? Is 3x replica plus 1x arbiter considered replica 3 or replica 4?

>> No chicken and egg here I think. You want a volume to be used as your
>> master data domain, and creating a new volume in a new gluster cluster is
>> independent of your datacenter status.
>>
>> You mentioned your hosts are on the default cluster - so make sure your
>> cluster supports the gluster service (you should have picked gluster as a
>> service during engine install)

I chose "Both" during engine-setup, although I didn't have the gluster service enabled on the Default cluster at first. Also, the vdsm-gluster rpm was not installed (I sort of feel like 'hosted-engine --deploy' should take care of that). Adding a host from my current physical engine using the "Add Host" GUI didn't bring it in either.

Thanks for the input!
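On the arbiter question above: in gluster an arbiter volume is created as "replica 3 arbiter 1" - two full data bricks plus one metadata-only arbiter brick - and it counts as replica 3 for quorum purposes. A creation sketch, with hypothetical host names and brick paths:

```shell
# Sketch: replica 3 with an arbiter. The third brick stores only
# file metadata but still participates in quorum, so you keep
# split-brain protection at roughly 2/3 of the disk cost.
# Host names and brick paths are hypothetical.
gluster volume create data replica 3 arbiter 1 \
    host1:/gluster/data/brick \
    host2:/gluster/data/brick \
    host3:/gluster/data/arbiter

gluster volume start data
gluster volume info data   # shows the volume as replica 3 with arbiter
```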
[ovirt-users] Storage network clarification
I'm having trouble setting up a dedicated storage network.

I have a separate VLAN designated for storage, and configured separate IP addresses for each host that correspond to that subnet. I have tested this subnet extensively and it is working as expected.

Prior to adding the hosts, I configured a storage network and configured the cluster to use that network for storage and not the ovirtmgmt network. I was hoping that this would be recognized when the hosts were added, but it was not. I had to reconfigure the storage VLAN interface via oVirt's "manage host networks" just to bring the host networks into compliance. The IP is configured directly on the bond0., not on a bridge interface, which I assume is correct since it is not a "VM" network.

In this setup I was not able to activate any of the hosts due to VDSM gluster errors; I think it was because VDSM was trying to use the hostname/IP of the ovirtmgmt network. I manually set up the peers using "gluster peer probe" and was then able to activate the hosts, but they were not using the storage network (confirmed with tcpdump). I also tried adding DNS records for the storage network interfaces using different hostnames, but gluster still seemed to consider the ovirtmgmt interface the primary one.

With the hosts active, I couldn't create/activate any volumes until I changed the cluster network settings to use the ovirtmgmt network for storage. I ended up abandoning the dedicated storage subnet for the time being, and I'm starting to wonder if running virtualization and gluster on the same hosts is intended to work this way.

Assuming that it should work, what is the correct way to configure it? I can't find any docs that go into detail about storage networks. Is reverse DNS a factor? If I had a better understanding of what oVirt is expecting to see, that would be helpful.
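The manual peer setup described above looks roughly like this (a sketch; the storage-specific hostnames are hypothetical). In gluster of this era, the address used in the first probe tends to become the peer's canonical one, which matches the ovirtmgmt-stickiness described:

```shell
# Sketch: probe peers over the dedicated storage subnet using
# storage-specific hostnames (hypothetical names).
gluster peer probe host2-storage.example.com
gluster peer probe host3-storage.example.com

# Check which addresses/hostnames each peer is actually known by -
# this is where ovirtmgmt names can show up as the primary.
gluster peer status
```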
Re: [ovirt-users] Storage network clarification
Thanks, I will try this. I am running ovirt-engine 3.6.1.3-1.el7.centos.

In the configuration described, is oVirt able to manage gluster? I am confused, because if oVirt knows the nodes by their ovirtmgmt network IP/hostname, aren't all the VDSM commands going to fail?

On Mon, Jan 18, 2016 at 6:39 AM, combuster wrote:
> Hi Fil,
>
> this worked for me a couple of months back:
>
> http://lists.ovirt.org/pipermail/users/2015-November/036235.html
>
> I'll try to set this up again, and see if there are any issues. Which oVirt
> release are you running?
>
> Ivan
>
> On 01/18/2016 02:56 PM, Fil Di Noto wrote:
>>
>> I'm having trouble setting up a dedicated storage network.
>>
>> I have a separate VLAN designated for storage, and configured separate
>> IP addresses for each host that correspond to that subnet. I have
>> tested this subnet extensively and it is working as expected.
>>
>> Prior to adding the hosts, I configured a storage network and
>> configured the cluster to use that network for storage and not the
>> ovirtmgmt network. I was hoping that this would be recognized when
>> the hosts were added but it was not. I had to actually reconfigure the
>> storage VLAN interface via oVirt "manage host networks" just to bring
>> the host networks into compliance. The IP is configured directly on
>> the bond0., not on a bridge interface which I assume is
>> correct since it is not a "VM" network.
>>
>> In this setup I was not able to activate any of the hosts due to VDSM
>> gluster errors, I think it was because VDSM was trying to use the
>> hostname/IP of the ovirtmgmt network. I manually set up the peers
>> using "gluster peer probe" and I was able to activate the hosts but
>> they were not using the storage network (tcpdump). I also tried adding
>> DNS records for the storage network interfaces using different
>> hostnames but gluster seemed to still consider the ovirtmgmt interface
>> as the primary.
>>
>> With the hosts active, I couldn't create/activate any volumes until I
>> changed the cluster network settings to use the ovirtmgmt network for
>> storage. I ended up abandoning the dedicated storage subnet for the
>> time being and I'm starting to wonder if running virtualization and
>> gluster on the same hosts is intended to work this way.
>>
>> Assuming that it should work, what is the correct way to configure it?
>> I can't find any docs that go in detail about storage networks. Is
>> reverse DNS a factor? If I had a better understanding of what oVirt is
>> expecting to see that would be helpful.
[ovirt-users] NumaInfoMonitor error
This has been showing up in the logs on a couple of my hosts. When I put the host into maintenance mode the message stops, but it then appears on another host where the VMs were migrated to. Is this a serious problem? (And is it caused by hot-adding CPU cores or memory?)

periodic/21::ERROR::2016-01-24 09:34:41,884::executor::188::Executor::(_execute_task) Unhandled exception in
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 186, in _execute_task
    callable()
  File "/usr/share/vdsm/virt/periodic.py", line 271, in __call__
    self._execute()
  File "/usr/share/vdsm/virt/periodic.py", line 311, in _execute
    self._vm.updateNumaInfo()
  File "/usr/share/vdsm/virt/vm.py", line 5025, in updateNumaInfo
    self._numaInfo = numaUtils.getVmNumaNodeRuntimeInfo(self)
  File "/usr/share/vdsm/numaUtils.py", line 116, in getVmNumaNodeRuntimeInfo
    vnode_index = str(vcpu_to_vnode[vcpu_id])
KeyError: 1
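A minimal reconstruction of the failing lookup in that traceback (the function bodies below are simplified assumptions based on the traceback, not vdsm's actual code; the "safe" variant is only an illustration, not the vdsm fix):

```python
# Hypothetical, simplified sketch of the lookup that fails in
# vdsm's numaUtils.getVmNumaNodeRuntimeInfo.

def vnode_for_vcpu(vcpu_to_vnode, vcpu_id):
    # Mirrors the failing line: an unmapped vCPU id raises KeyError,
    # e.g. "KeyError: 1" if vCPU 1 exists at runtime but is missing
    # from the vcpu -> virtual-NUMA-node map built earlier.
    return str(vcpu_to_vnode[vcpu_id])

def vnode_for_vcpu_safe(vcpu_to_vnode, vcpu_id, default=0):
    # Illustrative defensive variant: fall back to a default node
    # for vCPUs the map does not know about.
    return str(vcpu_to_vnode.get(vcpu_id, default))

# Map built at VM start knows only vCPU 0; vCPU 1 appears later
# (e.g. after a CPU hot-add, as the question speculates).
mapping = {0: 0}
try:
    vnode_for_vcpu(mapping, 1)
except KeyError as exc:
    print("KeyError:", exc)          # the error seen in the vdsm log
print(vnode_for_vcpu_safe(mapping, 1))
```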
Re: [ovirt-users] NumaInfoMonitor error
By migrating VMs one at a time I noticed that the trouble seemed to be following one of them in particular. I powered that VM down and the problem stopped. Unlike all the other VMs, this one had the "random number generator enabled" box checked. I removed the setting and started the VM. It took longer than normal to start, and the host spit out some kernel errors, but then the VM started and everything now appears to be normal.

On Sun, Jan 24, 2016 at 1:43 AM, Fil Di Noto wrote:
> This has been showing up in the logs on a couple of my hosts. When I
> put the host into maintenance mode the message stops but then appears
> on another host where the VMs were migrated to. Is this a serious
> problem? ( and is it caused by hot-adding CPU cores or memory? )
>
> periodic/21::ERROR::2016-01-24
> 09:34:41,884::executor::188::Executor::(_execute_task) Unhandled
> exception in
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 186,
> in _execute_task
>     callable()
>   File "/usr/share/vdsm/virt/periodic.py", line 271, in __call__
>     self._execute()
>   File "/usr/share/vdsm/virt/periodic.py", line 311, in _execute
>     self._vm.updateNumaInfo()
>   File "/usr/share/vdsm/virt/vm.py", line 5025, in updateNumaInfo
>     self._numaInfo = numaUtils.getVmNumaNodeRuntimeInfo(self)
>   File "/usr/share/vdsm/numaUtils.py", line 116, in getVmNumaNodeRuntimeInfo
>     vnode_index = str(vcpu_to_vnode[vcpu_id])
> KeyError: 1
Re: [ovirt-users] Storage network clarification
Thanks, I will add those steps to my notes and try again some time in the future.

Trying to move storage network functions from ovirtmgmt to a dedicated logical network for an existing gluster volume did not go well for me. I did not put much effort into recovering, but I lost the storage domain completely. I did not collect any logs because I was pressed for time.

I am finding it easier to manage gluster outside of oVirt for now. Without knowing how VDSM is going to approach a task, it becomes more time consuming (and sometimes destructive) to work with VDSM than to just do things manually.

On Tue, Jan 19, 2016 at 12:44 AM, Sahina Bose wrote:
>
> On 01/19/2016 12:37 PM, Nicolas Ecarnot wrote:
>>
>> Hi Sahina,
>>
>> On 19/01/2016 07:02, Sahina Bose wrote:
>>>
>>> The steps to make sure gluster uses a separate network for data traffic:
>>>
>>> 1. Create a logical network (non-VM), and mark its role as "Gluster".
>>> 2. After adding the host via the ovirtmgmt hostname/IP address, assign
>>>    this gluster network to your interface (with the storage sub-net).
>>>    This step will initiate a peer probe of the host with this additional
>>>    IP address.
>>> 3. When creating a volume after an interface/bond is tagged with the
>>>    gluster network, the bricks are added using the gluster network's IP
>>>    address. Now when clients connect to the volume, traffic is routed
>>>    via the gluster network that's used by the bricks.
>>
>> Does that mean that there's no way to reach this goal with an existing
>> volume, previously created, and that was (alas) using the management
>> network?
>
> There's some ongoing work to replace the network used by a brick -
> http://review.gluster.org/#/c/12250/ and https://gerrit.ovirt.org/#/c/51685/
> [+Kaushal, Anuradha for further inputs and if there's any manual workaround
> possible]
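After following the steps Sahina lists, the result can be checked from the gluster side (a sketch; the volume name and hostnames are hypothetical):

```shell
# Sketch: verify that bricks use the gluster-network addresses
# rather than the ovirtmgmt ones. Names are hypothetical.
gluster peer status        # each peer should list the storage-net
                           # hostname among its known addresses

gluster volume info data | grep Brick
# Expect bricks like host1-storage:/gluster/data/brick, i.e. the
# storage-network hostname, not the ovirtmgmt one.
```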