[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-10 Thread Amit Bawer
On Mon, Feb 10, 2020 at 4:13 PM Jorick Astrego  wrote:

>
> On 2/10/20 1:27 PM, Jorick Astrego wrote:
>
> Hmm, I didn't notice that.
>
> I did a check on the NFS server and I found the
> "1ed0a635-67ee-4255-aad9-b70822350706" in the exportdom path
> (/data/exportdom).
>
> This was an old NFS export domain that has been deleted for a while now. I
> remember finding somewhere an issue with old domains still being active
> after removal but I cannot find it now.
>
> I unexported the directory on the NFS server and now I have the correct
> mount and it activates fine.
>
> Thanks!
>
> Still weird that it picks another NFS mount path that was removed from
> engine months ago.
>
This is because vdsm scans for domains on the storage itself, i.e. it looks
under /rhev/data-center/mnt/* in the case of nfs domains [1].


> It's not listed in the database on engine:
>
The table lists the valid domains known to engine; removals/additions of
storage domains update this table.

If you removed the old nfs domain while the nfs storage was not available at
the time (i.e. not mounted), then the storage format could fail silently [2]
and yet this table would still be updated for the SD removal [3].

I haven't tested this out, and it may require unmounting at a very specific
moment to hit [2], but looking around on the engine side with the kind
assistance of +Benny Zlotnik makes this assumption seem possible.

[1]
https://github.com/oVirt/vdsm/blob/821afbbc238ba379c12666922fc1ac80482ee383/lib/vdsm/storage/fileSD.py#L888
[2]
https://github.com/oVirt/vdsm/blob/master/lib/vdsm/storage/fileSD.py#L628
[3]
https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/storage/domain/RemoveStorageDomainCommand.java#L77
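
As a rough way to cross-check this (a hypothetical helper, not something
shipped with vdsm or engine), one could list the domain UUID directories
present under the NFS mounts on a host and flag the ones that engine no
longer knows about; the set of known ids would come from the
storage_domain_static query below.

# Hypothetical clean-up check, not vdsm/engine code: report domain
# directories on the host that engine no longer lists.
import glob
import os
import re

ENGINE_KNOWN_IDS = {
    "f5d2f7c6-093f-46d6-a844-224d92db5ef9",  # backupnfs, from the query below
    # ... the other ids returned by "select id from storage_domain_static"
}

UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")

for path in glob.glob("/rhev/data-center/mnt/*/*"):
    name = os.path.basename(path)
    if os.path.isdir(path) and UUID_RE.match(name) and name not in ENGINE_KNOWN_IDS:
        print("stale domain directory not known to engine:", path)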

engine=# select * from storage_domain_static ;
>   id  |
> storage|  storage_name  | storage_domain_type |
> storage_type | storage_domain_format_type | _create_date
> | _update_date  | reco
> verable | last_time_used_as_master |storage_description |
> storage_comment | wipe_after_delete | warning_low_space_indicator |
> critical_space_action_blocker | first_metadata_device | vg_metadata_device
> | discard_after_delet
> e | backup | warning_low_confirmed_space_indicator | block_size
>
> --+--++-+--++---+---+-
>
> +--++-+---+-+---+---++
> --++---+
>  782a61af-a520-44c4-8845-74bf92888552 |
> 640ab34d-aa5d-478b-97be-e3f810558628 | ISO_DOMAIN
> |   2 |1 | 0  |
> 2017-11-16 09:49:49.225478+01 |   | t
> |0 | ISO_DOMAIN
> | | f |
> |   |
> || f
>   | f  |   |512
>  072fbaa1-08f3-4a40-9f34-a5ca22dd1d74 |
> ceab03af-7220-4d42-8f5c-9b557f5d29af | ovirt-image-repository
> |   4 |8 | 0  |
> 2016-10-14 20:40:44.700381+02 | 2018-04-06 14:03:31.201898+02 | t
> |0 | Public Glance repository for oVirt
> | | f |
> |   |
> || f
>   | f  |   |512
>  b30bab9d-9a66-44ce-ad17-2eb4ee858d8f |
> 40d191b0-b7f8-48f9-bf6f-327275f51fef | ssd-6
> |   1 |7 | 4  |
> 2017-06-25 12:45:24.52974+02  | 2019-01-24 15:35:57.013832+01 | t
> |1498461838176 |
> | | f |  10
> | 5 |
> || f
>   | f  |   |512
>  95b4e5d2-2974-4d5f-91e4-351f75a15435 |
> f11fed97-513a-4a10-b85c-2afe68f42608 | ssd-3
> |   1 |7 | 4  |
> 2019-01-10 12:15:55.20347+01  | 2019-01-24 15:35:57.013832+01 | t
> |0 |
> | | f |  10
> | 5 |
> || f
>   | f  |10 |512
>  f5d2f7c6-093f-46d6-a844-224d92db5ef9 |
> b8b456f0-27c3-49b9-b5e9-9fa81fb3cdaa | backupnfs
> |   1 |1 | 4  |
> 2018-01-19 13:31:25.899738+01 | 2019-02-14 14:36:22.3171

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-10 Thread Jorick Astrego

On 2/10/20 1:27 PM, Jorick Astrego wrote:
>
> Hmm, I didn't notice that.
>
> I did a check on the NFS server and I found the
> "1ed0a635-67ee-4255-aad9-b70822350706" in the exportdom path
> (/data/exportdom).
>
> This was an old NFS export domain that has been deleted for a while
> now. I remember finding somewhere an issue with old domains still
> being active after removal but I cannot find it now.
>
> I unexported the directory on the NFS server and now I have the correct
> mount and it activates fine.
>
> Thanks!
>
Still weird that it picks another NFS mount path that was removed from
engine months ago.

It's not listed in the database on engine:

engine=# select * from storage_domain_static ;
  id  |  
storage    |  storage_name  |
storage_domain_type | storage_type | storage_domain_format_type
| _create_date  | _update_date  | reco
verable | last_time_used_as_master |   
storage_description | storage_comment | wipe_after_delete |
warning_low_space_indicator | critical_space_action_blocker |
first_metadata_device | vg_metadata_device | discard_after_delet
e | backup | warning_low_confirmed_space_indicator | block_size

--+--++-+--++---+---+-

+--++-+---+-+---+---++
--++---+
 782a61af-a520-44c4-8845-74bf92888552 |
640ab34d-aa5d-478b-97be-e3f810558628 | ISO_DOMAIN
|   2 |    1 | 0  |
2017-11-16 09:49:49.225478+01 |   | t  
    |    0 |
ISO_DOMAIN | |
f |
|   |  
|    | f 
  | f  |   |    512
 072fbaa1-08f3-4a40-9f34-a5ca22dd1d74 |
ceab03af-7220-4d42-8f5c-9b557f5d29af | ovirt-image-repository
|   4 |    8 | 0  |
2016-10-14 20:40:44.700381+02 | 2018-04-06 14:03:31.201898+02 | t  
    |    0 | Public Glance repository for
oVirt | | f
| |  
|   |    | f 
  | f  |   |    512
 b30bab9d-9a66-44ce-ad17-2eb4ee858d8f |
40d191b0-b7f8-48f9-bf6f-327275f51fef | ssd-6 
|   1 |    7 | 4  |
2017-06-25 12:45:24.52974+02  | 2019-01-24 15:35:57.013832+01 | t  
    |    1498461838176
|    | |
f |  10
| 5 |  
|    | f 
  | f  |   |    512
 95b4e5d2-2974-4d5f-91e4-351f75a15435 |
f11fed97-513a-4a10-b85c-2afe68f42608 | ssd-3 
|   1 |    7 | 4  |
2019-01-10 12:15:55.20347+01  | 2019-01-24 15:35:57.013832+01 | t  
    |    0
|    | |
f |  10
| 5 |  
|    | f 
  | f  |    10 |    512
 f5d2f7c6-093f-46d6-a844-224d92db5ef9 |
b8b456f0-27c3-49b9-b5e9-9fa81fb3cdaa | backupnfs 
|   1 |    1 | 4  |
2018-01-19 13:31:25.899738+01 | 2019-02-14 14:36:22.3171+01   | t  
    |    1530772724454
|    | |
f |  10
| 5 |  
|    | f 
  | f  | 0 |    512
 33f1ba00-6a16-4e58-b4c5-94426f1c4482 |
6b6b7899-c82b-4417-b453-0b3b0ac11deb | ssd-4 
|   1 |    7 | 4  |
2017-06-25 12:43:49.339884+02 | 2019-02-27 21:30:23.35823

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-10 Thread Amit Bawer
On Mon, Feb 10, 2020 at 2:27 PM Jorick Astrego  wrote:

>
> On 2/10/20 11:09 AM, Amit Bawer wrote:
>
> compared it with host having nfs domain working
> this
>
> On Mon, Feb 10, 2020 at 11:11 AM Jorick Astrego 
> wrote:
>
>>
>> On 2/9/20 10:27 AM, Amit Bawer wrote:
>>
>>
>>
>> On Thu, Feb 6, 2020 at 11:07 AM Jorick Astrego 
>> wrote:
>>
>>> Hi,
>>>
>>> Something weird is going on with our ovirt node 4.3.8 install mounting a
>>> nfs share.
>>>
>>> We have a NFS domain for a couple of backup disks and we have a couple
>>> of 4.2 nodes connected to it.
>>>
>>> Now I'm adding a fresh cluster of 4.3.8 nodes and the backupnfs mount
>>> doesn't work.
>>>
>>> (annoying you cannot copy the text from the events view)
>>>
>>> The domain is up and working
>>>
>>> ID:f5d2f7c6-093f-46d6-a844-224d92db5ef9
>>> Size: 10238 GiB
>>> Available:2491 GiB
>>> Used:7747 GiB
>>> Allocated: 3302 GiB
>>> Over Allocation Ratio:37%
>>> Images:7
>>> Path:*.*.*.*:/data/ovirt
>>> NFS Version: AUTO
>>> Warning Low Space Indicator:10% (1023 GiB)
>>> Critical Space Action Blocker:5 GiB
>>>
>>> But somehow the node appears to think it's an LVM volume? It tries
>>> to find the volume group (VG) but fails... which is not so strange as it is
>>> an NFS volume:
>>>
>>> 2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c) [storage.LVM]
>>> Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5
>>> out=[] err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not
>>> found', '  Cannot process volume group
>>> f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
>>> 2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c) [storage.Monitor]
>>> Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed
>>> (monitor:330)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>>> 327, in _setupLoop
>>> self._setupMonitor()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>>> 349, in _setupMonitor
>>> self._produceDomain()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in
>>> wrapper
>>> value = meth(self, *a, **kw)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>>> 367, in _produceDomain
>>> self.domain = sdCache.produce(self.sdUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110,
>>> in produce
>>> domain.getRealDomain()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51,
>>> in getRealDomain
>>> return self._cache._realProduce(self._sdUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134,
>>> in _realProduce
>>> domain = self._findDomain(sdUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151,
>>> in _findDomain
>>> return findMethod(sdUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176,
>>> in _findUnfetchedDomain
>>> raise se.StorageDomainDoesNotExist(sdUUID)
>>> StorageDomainDoesNotExist: Storage domain does not exist:
>>> (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
>>>
>>> The volume is actually mounted fine on the node:
>>>
>>> On NFS server
>>>
>>> Feb  5 15:47:09 back1en rpc.mountd[4899]: authenticated mount request
>>> from *.*.*.*:673 for /data/ovirt (/data/ovirt)
>>>
>>> On the host
>>>
>>> mount|grep nfs
>>>
>>> *.*.*.*:/data/ovirt on /rhev/data-center/mnt/*.*.*.*:_data_ovirt type
>>> nfs
>>> (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=*.*.*.*,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=*.*.*.*)
>>>
>>> And I can see the files:
>>>
>>> ls -alrt /rhev/data-center/mnt/*.*.*.*:_data_ovirt
>>> total 4
>>> drwxr-xr-x. 5 vdsm kvm61 Oct 26  2016
>>> 1ed0a635-67ee-4255-aad9-b70822350706
>>>
>>>
>> What ls -lart for 1ed0a635-67ee-4255-aad9-b70822350706 is showing?
>>
>> ls -arlt 1ed0a635-67ee-4255-aad9-b70822350706/
>> total 4
>> drwxr-xr-x. 2 vdsm kvm93 Oct 26  2016 dom_md
>> drwxr-xr-x. 5 vdsm kvm61 Oct 26  2016 .
>> drwxr-xr-x. 4 vdsm kvm40 Oct 26  2016 master
>> drwxr-xr-x. 5 vdsm kvm  4096 Oct 26  2016 images
>> drwxrwxrwx. 3 root root   86 Feb  5 14:37 ..
>>
> On a working nfs domain host we have the following storage hierarchy;
> feece142-9e8d-42dc-9873-d154f60d0aac is the nfs domain in my case:
>
> /rhev/data-center/
> ├── edefe626-3ada-11ea-9877-525400b37767
> ...
> │   ├── feece142-9e8d-42dc-9873-d154f60d0aac ->
> /rhev/data-center/mnt/10.35.18.45:
> _exports_data/feece142-9e8d-42dc-9873-d154f60d0aac
> │   └── mastersd ->
> /rhev/data-center/mnt/blockSD/a6a14714-6eaa-4054-9503-0ea3fcc38531
> └── mnt
> ├── 10.35.18.45:_exports_data
> │   └── feece142-9e8d-42dc-9873-d154f60d0aac
> │   ├── dom_md
> │   │   ├── ids
> │   │   ├── inbox
> │   │   ├── leases
> │   │   ├── metadata
> │   │   ├── outbox
> │   │   └── xleases
> 

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-10 Thread Jorick Astrego

On 2/10/20 11:09 AM, Amit Bawer wrote:
> compared it with host having nfs domain working
> this
>
> On Mon, Feb 10, 2020 at 11:11 AM Jorick Astrego  > wrote:
>
>
> On 2/9/20 10:27 AM, Amit Bawer wrote:
>>
>>
>> On Thu, Feb 6, 2020 at 11:07 AM Jorick Astrego
>> mailto:jor...@netbulae.eu>> wrote:
>>
>> Hi,
>>
>> Something weird is going on with our ovirt node 4.3.8 install
>> mounting a nfs share.
>>
>> We have a NFS domain for a couple of backup disks and we have
>> a couple of 4.2 nodes connected to it.
>>
>> Now I'm adding a fresh cluster of 4.3.8 nodes and the
>> backupnfs mount doesn't work.
>>
>> (annoying you cannot copy the text from the events view)
>>
>> The domain is up and working
>>
>> ID:f5d2f7c6-093f-46d6-a844-224d92db5ef9
>> Size:10238 GiB
>> Available:2491 GiB
>> Used:7747 GiB
>> Allocated:3302 GiB
>> Over Allocation Ratio:37%
>> Images:7
>> Path:*.*.*.*:/data/ovirt
>> NFS Version:AUTO
>> Warning Low Space Indicator:10% (1023 GiB)
>> Critical Space Action Blocker:5 GiB
>>
>> But somehow the node appears to think it's an LVM
>> volume? It tries to find the volume group (VG) but fails...
>> which is not so strange as it is an NFS volume:
>>
>> 2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c)
>> [storage.LVM] Reloading VGs failed
>> (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5
>> out=[] err=['  Volume group
>> "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not found', ' 
>> Cannot process volume group
>> f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
>> 2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c)
>> [storage.Monitor] Setting up monitor for
>> f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed (monitor:330)
>> Traceback (most recent call last):
>>   File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>> line 327, in _setupLoop
>>     self._setupMonitor()
>>   File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>> line 349, in _setupMonitor
>>     self._produceDomain()
>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py",
>> line 159, in wrapper
>>     value = meth(self, *a, **kw)
>>   File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
>> line 367, in _produceDomain
>>     self.domain = sdCache.produce(self.sdUUID)
>>   File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>> line 110, in produce
>>     domain.getRealDomain()
>>   File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>> line 51, in getRealDomain
>>     return self._cache._realProduce(self._sdUUID)
>>   File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>> line 134, in _realProduce
>>     domain = self._findDomain(sdUUID)
>>   File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>> line 151, in _findDomain
>>     return findMethod(sdUUID)
>>   File
>> "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
>> line 176, in _findUnfetchedDomain
>>     raise se.StorageDomainDoesNotExist(sdUUID)
>> StorageDomainDoesNotExist: Storage domain does not exist:
>> (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
>>
>> The volume is actually mounted fine on the node:
>>
>> On NFS server
>>
>> Feb  5 15:47:09 back1en rpc.mountd[4899]: authenticated
>> mount request from *.*.*.*:673 for /data/ovirt (/data/ovirt)
>>
>> On the host
>>
>> mount|grep nfs
>>
>> *.*.*.*:/data/ovirt on
>> /rhev/data-center/mnt/*.*.*.*:_data_ovirt type nfs
>> 
>> (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=*.*.*.*,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=*.*.*.*)
>>
>> And I can see the files:
>>
>> ls -alrt /rhev/data-center/mnt/*.*.*.*:_data_ovirt
>> total 4
>> drwxr-xr-x. 5 vdsm kvm    61 Oct 26  2016
>> 1ed0a635-67ee-4255-aad9-b70822350706
>>
>>
>> What ls -lart for 1ed0a635-67ee-4255-aad9-b70822350706 is showing?
>
> ls -arlt 1ed0a635-67ee-4255-aad9-b70822350706/
> total 4
> drwxr-xr-x. 2 vdsm kvm    93 Oct 26  2016 dom_md
> drwxr-xr-x. 5 vdsm kvm

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-10 Thread Amit Bawer
Compared it with a host having a working nfs domain; see the storage
hierarchy below.

On Mon, Feb 10, 2020 at 11:11 AM Jorick Astrego  wrote:

>
> On 2/9/20 10:27 AM, Amit Bawer wrote:
>
>
>
> On Thu, Feb 6, 2020 at 11:07 AM Jorick Astrego  wrote:
>
>> Hi,
>>
>> Something weird is going on with our ovirt node 4.3.8 install mounting a
>> nfs share.
>>
>> We have a NFS domain for a couple of backup disks and we have a couple of
>> 4.2 nodes connected to it.
>>
>> Now I'm adding a fresh cluster of 4.3.8 nodes and the backupnfs mount
>> doesn't work.
>>
>> (annoying you cannot copy the text from the events view)
>>
>> The domain is up and working
>>
>> ID:f5d2f7c6-093f-46d6-a844-224d92db5ef9
>> Size: 10238 GiB
>> Available:2491 GiB
>> Used:7747 GiB
>> Allocated: 3302 GiB
>> Over Allocation Ratio:37%
>> Images:7
>> Path:*.*.*.*:/data/ovirt
>> NFS Version: AUTO
>> Warning Low Space Indicator:10% (1023 GiB)
>> Critical Space Action Blocker:5 GiB
>>
>> But somehow the node appears to think it's an LVM volume? It tries
>> to find the volume group (VG) but fails... which is not so strange as it is
>> an NFS volume:
>>
>> 2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c) [storage.LVM]
>> Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5
>> out=[] err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not
>> found', '  Cannot process volume group
>> f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
>> 2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c) [storage.Monitor]
>> Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed
>> (monitor:330)
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>> 327, in _setupLoop
>> self._setupMonitor()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>> 349, in _setupMonitor
>> self._produceDomain()
>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in
>> wrapper
>> value = meth(self, *a, **kw)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
>> 367, in _produceDomain
>> self.domain = sdCache.produce(self.sdUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110,
>> in produce
>> domain.getRealDomain()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51,
>> in getRealDomain
>> return self._cache._realProduce(self._sdUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134,
>> in _realProduce
>> domain = self._findDomain(sdUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151,
>> in _findDomain
>> return findMethod(sdUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176,
>> in _findUnfetchedDomain
>> raise se.StorageDomainDoesNotExist(sdUUID)
>> StorageDomainDoesNotExist: Storage domain does not exist:
>> (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
>>
>> The volume is actually mounted fine on the node:
>>
>> On NFS server
>>
>> Feb  5 15:47:09 back1en rpc.mountd[4899]: authenticated mount request
>> from *.*.*.*:673 for /data/ovirt (/data/ovirt)
>>
>> On the host
>>
>> mount|grep nfs
>>
>> *.*.*.*:/data/ovirt on /rhev/data-center/mnt/*.*.*.*:_data_ovirt type nfs
>> (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=*.*.*.*,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=*.*.*.*)
>>
>> And I can see the files:
>>
>> ls -alrt /rhev/data-center/mnt/*.*.*.*:_data_ovirt
>> total 4
>> drwxr-xr-x. 5 vdsm kvm61 Oct 26  2016
>> 1ed0a635-67ee-4255-aad9-b70822350706
>>
>>
> What ls -lart for 1ed0a635-67ee-4255-aad9-b70822350706 is showing?
>
> ls -arlt 1ed0a635-67ee-4255-aad9-b70822350706/
> total 4
> drwxr-xr-x. 2 vdsm kvm93 Oct 26  2016 dom_md
> drwxr-xr-x. 5 vdsm kvm61 Oct 26  2016 .
> drwxr-xr-x. 4 vdsm kvm40 Oct 26  2016 master
> drwxr-xr-x. 5 vdsm kvm  4096 Oct 26  2016 images
> drwxrwxrwx. 3 root root   86 Feb  5 14:37 ..
>
On a working nfs domain host we have the following storage hierarchy;
feece142-9e8d-42dc-9873-d154f60d0aac is the nfs domain in my case:

/rhev/data-center/
├── edefe626-3ada-11ea-9877-525400b37767
...
│   ├── feece142-9e8d-42dc-9873-d154f60d0aac ->
/rhev/data-center/mnt/10.35.18.45:
_exports_data/feece142-9e8d-42dc-9873-d154f60d0aac
│   └── mastersd ->
/rhev/data-center/mnt/blockSD/a6a14714-6eaa-4054-9503-0ea3fcc38531
└── mnt
├── 10.35.18.45:_exports_data
│   └── feece142-9e8d-42dc-9873-d154f60d0aac
│   ├── dom_md
│   │   ├── ids
│   │   ├── inbox
│   │   ├── leases
│   │   ├── metadata
│   │   ├── outbox
│   │   └── xleases
│   └── images
│   ├── 915e6f45-ea13-428c-aab2-fb27798668e5
│   │   ├── b83843d7-4c5a-4872-87a4-d0fe27a2c3d2
│   │   ├── b83843d7-4c5a-4872-87a4-d0fe27a2c3d2.lease
│   │   └── b83843d7-4c5a-4872-87a4-d0fe27a2c3d2.meta

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-10 Thread Jorick Astrego

On 2/9/20 10:27 AM, Amit Bawer wrote:
>
>
> On Thu, Feb 6, 2020 at 11:07 AM Jorick Astrego  > wrote:
>
> Hi,
>
> Something weird is going on with our ovirt node 4.3.8 install
> mounting a nfs share.
>
> We have a NFS domain for a couple of backup disks and we have a
> couple of 4.2 nodes connected to it.
>
> Now I'm adding a fresh cluster of 4.3.8 nodes and the backupnfs
> mount doesn't work.
>
> (annoying you cannot copy the text from the events view)
>
> The domain is up and working
>
> ID:f5d2f7c6-093f-46d6-a844-224d92db5ef9
> Size:10238 GiB
> Available:2491 GiB
> Used:7747 GiB
> Allocated:3302 GiB
> Over Allocation Ratio:37%
> Images:7
> Path:*.*.*.*:/data/ovirt
> NFS Version:AUTO
> Warning Low Space Indicator:10% (1023 GiB)
> Critical Space Action Blocker:5 GiB
>
> But somehow the node appears to think it's an LVM volume? It
> tries to find the volume group (VG) but fails... which is not so
> strange as it is an NFS volume:
>
> 2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c)
> [storage.LVM] Reloading VGs failed
> (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5 out=[]
> err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9"
> not found', '  Cannot process volume group
> f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
> 2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c)
> [storage.Monitor] Setting up monitor for
> f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed (monitor:330)
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 327, in _setupLoop
>     self._setupMonitor()
>   File
> "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 349, in _setupMonitor
>     self._produceDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line
> 159, in wrapper
>     value = meth(self, *a, **kw)
>   File
> "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 367, in _produceDomain
>     self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
> line 110, in produce
>     domain.getRealDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
> line 51, in getRealDomain
>     return self._cache._realProduce(self._sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
> line 134, in _realProduce
>     domain = self._findDomain(sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
> line 151, in _findDomain
>     return findMethod(sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py",
> line 176, in _findUnfetchedDomain
>     raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist:
> (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
>
> The volume is actually mounted fine on the node:
>
> On NFS server
>
> Feb  5 15:47:09 back1en rpc.mountd[4899]: authenticated mount
> request from *.*.*.*:673 for /data/ovirt (/data/ovirt)
>
> On the host
>
> mount|grep nfs
>
> *.*.*.*:/data/ovirt on
> /rhev/data-center/mnt/*.*.*.*:_data_ovirt type nfs
> 
> (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=*.*.*.*,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=*.*.*.*)
>
> And I can see the files:
>
> ls -alrt /rhev/data-center/mnt/*.*.*.*:_data_ovirt
> total 4
> drwxr-xr-x. 5 vdsm kvm    61 Oct 26  2016
> 1ed0a635-67ee-4255-aad9-b70822350706
>
>
> What ls -lart for 1ed0a635-67ee-4255-aad9-b70822350706 is showing?

ls -arlt 1ed0a635-67ee-4255-aad9-b70822350706/
total 4
drwxr-xr-x. 2 vdsm kvm    93 Oct 26  2016 dom_md
drwxr-xr-x. 5 vdsm kvm    61 Oct 26  2016 .
drwxr-xr-x. 4 vdsm kvm    40 Oct 26  2016 master
drwxr-xr-x. 5 vdsm kvm  4096 Oct 26  2016 images
drwxrwxrwx. 3 root root   86 Feb  5 14:37 ..

Regards,

Jorick Astrego





Met vriendelijke groet, With kind regards,

Jorick Astrego

Netbulae Virtualization Experts 



Tel: 053 20 30 270  i...@netbulae.euStaalsteden 4-3A
KvK 08198180
Fax: 053 20 30 271  www.netbulae.eu 7547 TA Enschede
BTW NL821234584B01



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Condu

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-09 Thread Amit Bawer
On Thu, Feb 6, 2020 at 11:07 AM Jorick Astrego  wrote:

> Hi,
>
> Something weird is going on with our ovirt node 4.3.8 install mounting a
> nfs share.
>
> We have a NFS domain for a couple of backup disks and we have a couple of
> 4.2 nodes connected to it.
>
> Now I'm adding a fresh cluster of 4.3.8 nodes and the backupnfs mount
> doesn't work.
>
> (annoying you cannot copy the text from the events view)
>
> The domain is up and working
>
> ID:f5d2f7c6-093f-46d6-a844-224d92db5ef9
> Size: 10238 GiB
> Available:2491 GiB
> Used:7747 GiB
> Allocated: 3302 GiB
> Over Allocation Ratio:37%
> Images:7
> Path:*.*.*.*:/data/ovirt
> NFS Version: AUTO
> Warning Low Space Indicator:10% (1023 GiB)
> Critical Space Action Blocker:5 GiB
>
> But somehow the node appears to think it's an LVM volume? It tries
> to find the volume group (VG) but fails... which is not so strange as it is
> an NFS volume:
>
> 2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c) [storage.LVM]
> Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5
> out=[] err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not
> found', '  Cannot process volume group
> f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
> 2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c) [storage.Monitor]
> Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed
> (monitor:330)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 327, in _setupLoop
> self._setupMonitor()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 349, in _setupMonitor
> self._produceDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in
> wrapper
> value = meth(self, *a, **kw)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 367, in _produceDomain
> self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110,
> in produce
> domain.getRealDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in
> getRealDomain
> return self._cache._realProduce(self._sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134,
> in _realProduce
> domain = self._findDomain(sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151,
> in _findDomain
> return findMethod(sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176,
> in _findUnfetchedDomain
> raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist:
> (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
>
> The volume is actually mounted fine on the node:
>
> On NFS server
>
> Feb  5 15:47:09 back1en rpc.mountd[4899]: authenticated mount request from
> *.*.*.*:673 for /data/ovirt (/data/ovirt)
>
> On the host
>
> mount|grep nfs
>
> *.*.*.*:/data/ovirt on /rhev/data-center/mnt/*.*.*.*:_data_ovirt type nfs
> (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=*.*.*.*,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=*.*.*.*)
>
> And I can see the files:
>
> ls -alrt /rhev/data-center/mnt/*.*.*.*:_data_ovirt
> total 4
> drwxr-xr-x. 5 vdsm kvm61 Oct 26  2016
> 1ed0a635-67ee-4255-aad9-b70822350706
>
>
What ls -lart for 1ed0a635-67ee-4255-aad9-b70822350706 is showing?


> -rwxr-xr-x. 1 vdsm kvm 0 Feb  5 14:37 __DIRECT_IO_TEST__
> drwxrwxrwx. 3 root root   86 Feb  5 14:37 .
> drwxr-xr-x. 5 vdsm kvm  4096 Feb  5 14:37 ..
>
>
>
>
>
> Met vriendelijke groet, With kind regards,
>
> Jorick Astrego
>
> *Netbulae Virtualization Experts *
> --
> Tel: 053 20 30 270 i...@netbulae.eu Staalsteden 4-3A KvK 08198180
> Fax: 053 20 30 271 www.netbulae.eu 7547 TA Enschede BTW NL821234584B01
> --
>
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/IFTO5WBLVLGTVWKYN3BGLOHAC453UBD5/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6VO6WD7MJWBJJXG4FTW7PUSOXHIZDHD3/


[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-08 Thread Jorick Astrego
Dear Amit.

That is not exactly the issue we're facing; I have two clusters: an old
4.2 cluster and a fresh 4.3 cluster. The DC is set to 4.2 because of this.

I wish to bring the 4.3 cluster online and migrate everything to it,
then remove the 4.2 cluster.

This should be a path that can be taken imho.

Otherwise I could upgrade the current 4.2 cluster although it is getting
decommissioned, but then I have the same problem of hosts upgraded to
4.3 not being able to connect to the NFS domain or be activated, until
I upgrade them all and set the cluster and DC to 4.3.

I could detach the NFS domain for a bit, but we use this mount for
backup disks of several VMs and I'd rather not migrate everything in a
hurry or do without these backups for a longer time.

Regards,

Jorick Astrego

On 2/8/20 11:51 AM, Amit Bawer wrote:
> I doubt you can use 4.3.8 nodes with a 4.2 cluster without
> upgrading it first. But maybe members of this list could say differently.
>
> On Friday, February 7, 2020, Jorick Astrego  > wrote:
>
>
> On 2/6/20 6:22 PM, Amit Bawer wrote:
>>
>>
>> On Thu, Feb 6, 2020 at 2:54 PM Jorick Astrego > > wrote:
>>
>>
>> On 2/6/20 1:44 PM, Amit Bawer wrote:
>>>
>>>
>>> On Thu, Feb 6, 2020 at 1:07 PM Jorick Astrego
>>> mailto:jor...@netbulae.eu>> wrote:
>>>
>>> Here you go, this is from the activation I just did a
>>> couple of minutes ago.
>>>
> I was hoping to see how it was first connected to the host, but
> it doesn't go that far back. Anyway, the storage domain type
> is set from engine and vdsm never tries to guess it, as far as
> I saw.
>>
>> I put the host in maintenance and activated it again, this
>> should give you some more info. See attached log.
>>
>>> Could you query the engine db about the misbehaving domain
>>> and paste the results?
>>>
>>> # su - postgres
>>> Last login: Thu Feb  6 07:17:52 EST 2020 on pts/0
>>> -bash-4.2$
>>> LD_LIBRARY_PATH=/opt/rh/rh-postgresql10/root/lib64/  
>>> /opt/rh/rh-postgresql10/root/usr/bin/psql engine
>>> psql (10.6)
>>> Type "help" for help.
>>> engine=# select * from storage_domain_static where id =
>>> 'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>>
>>
>> engine=# select * from storage_domain_static where id =
>> 'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>>   id  |  
>> storage    | storage_name |
>> storage_domain_type | storage_type |
>> storage_domain_format_type |
>> _create_date  |    _update_date |
>> recoverable | la
>> st_time_used_as_master | storage_description |
>> storage_comment | wipe_after_delete |
>> warning_low_space_indicator |
>> critical_space_action_blocker | first_metadata_device |
>> vg_metadata_device | discard_after_delete | backup |
>> warning_low_co
>> nfirmed_space_indicator | block_size
>> 
>> --+--+--+-+--++---+-+-+---
>> 
>> ---+-+-+---+-+---+---++--++---
>> +
>>  f5d2f7c6-093f-46d6-a844-224d92db5ef9 |
>> b8b456f0-27c3-49b9-b5e9-9fa81fb3cdaa | backupnfs   
>> |   1 |    1 |
>> 4  | 2018-01-19
>> 13:31:25.899738+01 | 2019-02-14 14:36:22.3171+01 |
>> t   |  
>>  1530772724454 |
>> | | f
>> |  10
>> | 5 |  
>> |    | f    | f 
>> |  
>>   0 |    512
>> (1 row)
>>
>>
>>
>> Thanks for sharing,
>>
>> The storage_type in the db is indeed NFS (1) and
>> storage_domain_format_type is 4 - for oVirt 4.3 the
>> storage_domain_format_type is 5 by default, and usually a datacenter
>> upgrade is required for the 4.2 to 4.3 migration, which I'm not sure is
>> possible in your current setup since you have 4.2 nodes using
>> this storage as well.
>>
>> Regarding the repeating monitor failure for the SD:
>>
>> 2020-02

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-08 Thread Amit Bawer
I doubt you can use 4.3.8 nodes with a 4.2 cluster without upgrading it
first. But maybe members of this list could say differently.

On Friday, February 7, 2020, Jorick Astrego  wrote:

>
> On 2/6/20 6:22 PM, Amit Bawer wrote:
>
>
>
> On Thu, Feb 6, 2020 at 2:54 PM Jorick Astrego  wrote:
>
>>
>> On 2/6/20 1:44 PM, Amit Bawer wrote:
>>
>>
>>
>> On Thu, Feb 6, 2020 at 1:07 PM Jorick Astrego  wrote:
>>
>>> Here you go, this is from the activation I just did a couple of minutes
>>> ago.
>>>
>> I was hoping to see how it was first connected to the host, but it doesn't go
>> that far back. Anyway, the storage domain type is set from engine and vdsm
>> never tries to guess it, as far as I saw.
>>
>> I put the host in maintenance and activated it again, this should give
>> you some more info. See attached log.
>>
>> Could you query the engine db about the misbehaving domain and paste the
>> results?
>>
>> # su - postgres
>> Last login: Thu Feb  6 07:17:52 EST 2020 on pts/0
>> -bash-4.2$ LD_LIBRARY_PATH=/opt/rh/rh-postgresql10/root/lib64/
>> /opt/rh/rh-postgresql10/root/usr/bin/psql engine
>> psql (10.6)
>> Type "help" for help.
>> engine=# select * from storage_domain_static where id = '
>> f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>>
>>
>> engine=# select * from storage_domain_static where id =
>> 'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>>   id  |
>> storage| storage_name | storage_domain_type | storage_type
>> | storage_domain_format_type | _create_date  |
>> _update_date | recoverable | la
>> st_time_used_as_master | storage_description | storage_comment |
>> wipe_after_delete | warning_low_space_indicator |
>> critical_space_action_blocker | first_metadata_device | vg_metadata_device
>> | discard_after_delete | backup | warning_low_co
>> nfirmed_space_indicator | block_size
>> --+-
>> -+--+-+-
>> -++-
>> --+-+-+---
>> ---+-+--
>> ---+---+-+--
>> -+---+--
>> --+--++---
>> +
>>  f5d2f7c6-093f-46d6-a844-224d92db5ef9 | b8b456f0-27c3-49b9-b5e9-9fa81fb3cdaa
>> | backupnfs|   1 |1 |
>> 4  | 2018-01-19 13:31:25.899738+01 | 2019-02-14
>> 14:36:22.3171+01 | t   |
>>  1530772724454 | | |
>> f |  10
>> | 5 |
>> || f| f  |
>>   0 |512
>> (1 row)
>>
>>
>>
> Thanks for sharing,
>
> The storage_type in the db is indeed NFS (1) and storage_domain_format_type
> is 4 - for oVirt 4.3 the storage_domain_format_type is 5 by default, and
> usually a datacenter upgrade is required for the 4.2 to 4.3 migration, which
> I'm not sure is possible in your current setup since you have 4.2 nodes using
> this storage as well.
>
> Regarding the repeating monitor failure for the SD:
>
> 2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c) [storage.LVM]
> Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5
> out=[] err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not
> found', '  Cannot process volume group f5d2f7c6-093f-46d6-a844-224d92db5ef9'])
> (lvm:470)
>
> This error means that the monitor has tried to query the SD as a VG first
> and failed; this is expected for the fallback code called when looking for
> a domain missing from the SD cache:
>
> def _findUnfetchedDomain(self, sdUUID):
>     ...
>     for mod in (blockSD, glusterSD, localFsSD, nfsSD):
>         try:
>             return mod.findDomain(sdUUID)
>         except se.StorageDomainDoesNotExist:
>             pass
>         except Exception:
>             self.log.error(
>                 "Error while looking for domain `%s`",
>                 sdUUID, exc_info=True)
>
>     raise se.StorageDomainDoesNotExist(sdUUID)
>
> 2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c) [storage.Monitor]
> Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed
> (monitor:330)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 327, in _setupLoop
> self._setupMonitor()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 349, in _setupMonitor
> self._produceDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in
> wrapper
> value = meth(self, *a, **kw)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 367, in _produceDomain
> self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110,
> in produce
> domain.getRealDomain()

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-07 Thread Jorick Astrego

On 2/6/20 6:22 PM, Amit Bawer wrote:
>
>
> On Thu, Feb 6, 2020 at 2:54 PM Jorick Astrego  > wrote:
>
>
> On 2/6/20 1:44 PM, Amit Bawer wrote:
>>
>>
>> On Thu, Feb 6, 2020 at 1:07 PM Jorick Astrego > > wrote:
>>
>> Here you go, this is from the activation I just did a couple
>> of minutes ago.
>>
>> I was hoping to see how it was first connected to the host, but it
>> doesn't go that far back. Anyway, the storage domain type is set
>> from engine and vdsm never tries to guess it, as far as I saw.
>
> I put the host in maintenance and activated it again, this should
> give you some more info. See attached log.
>
>> Could you query the engine db about the misbehaving domain and
>> paste the results?
>>
>> # su - postgres
>> Last login: Thu Feb  6 07:17:52 EST 2020 on pts/0
>> -bash-4.2$ LD_LIBRARY_PATH=/opt/rh/rh-postgresql10/root/lib64/  
>> /opt/rh/rh-postgresql10/root/usr/bin/psql engine
>> psql (10.6)
>> Type "help" for help.
>> engine=# select * from storage_domain_static where id =
>> 'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>
>
> engine=# select * from storage_domain_static where id =
> 'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>   id  |  
> storage    | storage_name | storage_domain_type |
> storage_type | storage_domain_format_type |
> _create_date  |    _update_date |
> recoverable | la
> st_time_used_as_master | storage_description | storage_comment
> | wipe_after_delete | warning_low_space_indicator |
> critical_space_action_blocker | first_metadata_device |
> vg_metadata_device | discard_after_delete | backup |
> warning_low_co
> nfirmed_space_indicator | block_size
> 
> --+--+--+-+--++---+-+-+---
> 
> ---+-+-+---+-+---+---++--++---
> +
>  f5d2f7c6-093f-46d6-a844-224d92db5ef9 |
> b8b456f0-27c3-49b9-b5e9-9fa81fb3cdaa | backupnfs   
> |   1 |    1 |
> 4  | 2018-01-19 13:31:25.899738+01 |
> 2019-02-14 14:36:22.3171+01 | t   |  
>  1530772724454 | |
> | f |  10
> | 5 |  
> |    | f    | f 
> |  
>   0 |    512
> (1 row)
>
>
>
> Thanks for sharing,
>
> The storage_type in the db is indeed NFS (1) and storage_domain_format_type
> is 4 - for oVirt 4.3 the storage_domain_format_type is 5 by default,
> and usually a datacenter upgrade is required for the 4.2 to 4.3 migration,
> which I'm not sure is possible in your current setup since you have 4.2
> nodes using this storage as well.
>
> Regarding the repeating monitor failure for the SD:
>
> 2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c) [storage.LVM]
> Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9']
> rc=5 out=[] err=['  Volume group
> "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not found', '  Cannot process
> volume group f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
>
> This error means that the monitor has tried to query the SD as a VG
> first and failed; this is expected for the fallback code called when
> looking for a domain missing from the SD cache:
>
> def _findUnfetchedDomain(self, sdUUID):
>     ...
>     for mod in (blockSD, glusterSD, localFsSD, nfsSD):
>         try:
>             return mod.findDomain(sdUUID)
>         except se.StorageDomainDoesNotExist:
>             pass
>         except Exception:
>             self.log.error(
>                 "Error while looking for domain `%s`",
>                 sdUUID, exc_info=True)
>
>     raise se.StorageDomainDoesNotExist(sdUUID)
>
> 2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c) [storage.Monitor]
> Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed
> (monitor:330)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 327, in _setupLoop
>     self._setupMonitor()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 349, in _setupMonitor
>     self._produceDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in
> wrapper
>     value = meth(self, *a, **kw)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 367, in _pro

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-06 Thread Amit Bawer
On Thu, Feb 6, 2020 at 2:54 PM Jorick Astrego  wrote:

>
> On 2/6/20 1:44 PM, Amit Bawer wrote:
>
>
>
> On Thu, Feb 6, 2020 at 1:07 PM Jorick Astrego  wrote:
>
>> Here you go, this is from the activation I just did a couple of minutes
>> ago.
>>
> I was hoping to see how it was first connected to the host, but it doesn't go
> that far back. Anyway, the storage domain type is set from engine and vdsm
> never tries to guess it, as far as I saw.
>
> I put the host in maintenance and activated it again, this should give you
> some more info. See attached log.
>
> Could you query the engine db about the misbehaving domain and paste the
> results?
>
> # su - postgres
> Last login: Thu Feb  6 07:17:52 EST 2020 on pts/0
> -bash-4.2$ LD_LIBRARY_PATH=/opt/rh/rh-postgresql10/root/lib64/
> /opt/rh/rh-postgresql10/root/usr/bin/psql engine
> psql (10.6)
> Type "help" for help.
> engine=# select * from storage_domain_static where id = '
> f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>
>
> engine=# select * from storage_domain_static where id =
> 'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>   id  |
> storage| storage_name | storage_domain_type | storage_type
> | storage_domain_format_type | _create_date  |
> _update_date | recoverable | la
> st_time_used_as_master | storage_description | storage_comment |
> wipe_after_delete | warning_low_space_indicator |
> critical_space_action_blocker | first_metadata_device | vg_metadata_device
> | discard_after_delete | backup | warning_low_co
> nfirmed_space_indicator | block_size
>
> --+--+--+-+--++---+-+-+---
>
> ---+-+-+---+-+---+---++--++---
> +
>  f5d2f7c6-093f-46d6-a844-224d92db5ef9 |
> b8b456f0-27c3-49b9-b5e9-9fa81fb3cdaa | backupnfs|   1
> |1 | 4  | 2018-01-19 13:31:25.899738+01
> | 2019-02-14 14:36:22.3171+01 | t   |
>  1530772724454 | | |
> f |  10
> | 5 |
> || f| f  |
>   0 |512
> (1 row)
>
>
>
Thanks for sharing,

The storage_type in the db is indeed NFS (1) and storage_domain_format_type is
4 - for oVirt 4.3 the storage_domain_format_type is 5 by default, and usually a
datacenter upgrade is required for the 4.2 to 4.3 migration, which I'm not sure
is possible in your current setup since you have 4.2 nodes using this storage
as well.
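
If you want to double check what engine itself reports, something along these
lines should work with the Python SDK (only a sketch - the url/credentials are
placeholders and the attribute names are my assumption based on the REST API
representation):

# Sketch using the oVirt Python SDK (ovirtsdk4); url and credentials are
# placeholders. Prints each storage domain with its storage type and
# storage format (e.g. v4 vs v5) as reported by engine.
import ovirtsdk4 as sdk

connection = sdk.Connection(
    url="https://engine_fqdn/ovirt-engine/api",
    username="admin@internal",
    password="secret",
    ca_file="/etc/pki/ovirt-engine/ca.pem",
)
try:
    sds_service = connection.system_service().storage_domains_service()
    for sd in sds_service.list():
        storage_type = sd.storage.type if sd.storage else None
        print(sd.name, sd.id, storage_type, sd.storage_format)
finally:
    connection.close()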

Regarding the repeating monitor failure for the SD:

2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c) [storage.LVM]
Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5
out=[] err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not
found', '  Cannot process volume group
f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)

This error means that the monitor has tried to query the SD as a VG first
and failed; this is expected for the fallback code called when looking for a
domain missing from the SD cache:

def _findUnfetchedDomain(self, sdUUID):
    ...
    for mod in (blockSD, glusterSD, localFsSD, nfsSD):
        try:
            return mod.findDomain(sdUUID)
        except se.StorageDomainDoesNotExist:
            pass
        except Exception:
            self.log.error(
                "Error while looking for domain `%s`",
                sdUUID, exc_info=True)

    raise se.StorageDomainDoesNotExist(sdUUID)

2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c) [storage.Monitor]
Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed
(monitor:330)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
327, in _setupLoop
self._setupMonitor()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
349, in _setupMonitor
self._produceDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in
wrapper
value = meth(self, *a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
367, in _produceDomain
self.domain = sdCache.produce(self.sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in
produce
domain.getRealDomain()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in
getRealDomain
return self._cache._realProduce(self._sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in
_realProduce
domain = self._findDomain(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in
_findDomain
return findMethod(sdUUID)
  File "/usr/lib/python2.7/site

[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-06 Thread Gianluca Cecchi
On Thu, Feb 6, 2020 at 1:19 PM Jorick Astrego  wrote:

>
> On 2/6/20 12:08 PM, Gianluca Cecchi wrote:
>
> On Thu, Feb 6, 2020 at 10:07 AM Jorick Astrego  wrote:
>
>> Hi,
>>
> [snip]
>
>> (annoying you cannot copy the text from the events view)
>>
>>
> I don't know if I understood correctly your concern, but in my case if I
> go in Hosts --> Events
> and double click on a line I have a pop-up window named "Event Details"
> where I can copy  and paste "ID" "Time" and "Message" fields' values.
> Not so friendly but it works
>
> [snip]

> Hi Gianluca,
>
> Thanks, it's a bit counter intuitive but that works!
>
> Regards,
>
> Jorick
>

In fact, I remember I casually bumped into this... "feature" ;-)
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Y5BX5ZUIKR43OMSKQBV2DMKJONWSETBV/


[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-06 Thread Jorick Astrego

On 2/6/20 12:08 PM, Gianluca Cecchi wrote:
> On Thu, Feb 6, 2020 at 10:07 AM Jorick Astrego  > wrote:
>
> Hi,
>
> [snip] 
>
> (annoying you cannot copy the text from the events view)
>
>
>
> I don't know if I understood correctly your concern, but in my case if
> I go in Hosts --> Events 
> and double click on a line I have a pop-up window named "Event
> Details" where I can copy  and paste "ID" "Time" and "Message" fields'
> values.
> Not so friendly but it works
>
> I don't know if from the Rest API you can also filter the kind of
> events to display, but at least you can do something like:
>
> curl -X GET -H "Accept: application/xml" -u $(cat ident_file )
> --cacert /etc/pki/ovirt-engine/ca.pem
> https://engine_fqdn:443/ovirt-engine/api/events | grep description
>
>
> HIH,
> Gianluca

Hi Gianluca,

Thanks, it's a bit counter intuitive but that works!

Regards,

Jorick





Met vriendelijke groet, With kind regards,

Jorick Astrego

Netbulae Virtualization Experts 



Tel: 053 20 30 270  i...@netbulae.euStaalsteden 4-3A
KvK 08198180
Fax: 053 20 30 271  www.netbulae.eu 7547 TA Enschede
BTW NL821234584B01



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PARQXHWTUNOCA6RFWCMB3NIZZ4XKHX2Y/


[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-06 Thread Gianluca Cecchi
On Thu, Feb 6, 2020 at 10:07 AM Jorick Astrego  wrote:

> Hi,
>
[snip]

> (annoying you cannot copy the text from the events view)
>
>
I don't know if I understood your concern correctly, but in my case if I go
to Hosts --> Events
and double-click on a line I get a pop-up window named "Event Details"
where I can copy and paste the "ID", "Time" and "Message" fields' values.
Not so friendly, but it works.

I don't know if you can also filter the kind of events to display from the
Rest API, but at least you can do something like:

curl -X GET -H "Accept: application/xml" -u $(cat ident_file ) --cacert
/etc/pki/ovirt-engine/ca.pem https://engine_fqdn:443/ovirt-engine/api/events
| grep description
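
A rough Python equivalent of the same thing (engine_fqdn and the credentials
are placeholders, and the JSON field names are my assumption based on the
API's JSON representation), filtering the events on the client side just like
the grep does:

# Minimal sketch with python-requests; engine_fqdn and credentials are
# placeholders. Lists recent events and keeps the storage-related ones.
import requests

resp = requests.get(
    "https://engine_fqdn/ovirt-engine/api/events",
    headers={"Accept": "application/json"},
    auth=("admin@internal", "secret"),
    verify="/etc/pki/ovirt-engine/ca.pem",
)
resp.raise_for_status()

for event in resp.json().get("event", []):
    description = event.get("description", "")
    if "storage domain" in description.lower():
        print(event.get("time"), description)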


HIH,
Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZCK5ZF46I3SAJWSU4URABEMXC2J3FFVZ/


[ovirt-users] Re: issue connecting 4.3.8 node to nfs domain

2020-02-06 Thread Amit Bawer
On Thu, Feb 6, 2020 at 11:07 AM Jorick Astrego  wrote:

> Hi,
>
> Something weird is going on with our ovirt node 4.3.8 install mounting a
> nfs share.
>
> We have a NFS domain for a couple of backup disks and we have a couple of
> 4.2 nodes connected to it.
>
> Now I'm adding a fresh cluster of 4.3.8 nodes and the backupnfs mount
> doesn't work.
>
> (annoying you cannot copy the text from the events view)
>
> The domain is up and working
>
> ID:f5d2f7c6-093f-46d6-a844-224d92db5ef9
> Size: 10238 GiB
> Available:2491 GiB
> Used:7747 GiB
> Allocated: 3302 GiB
> Over Allocation Ratio:37%
> Images:7
> Path:*.*.*.*:/data/ovirt
> NFS Version: AUTO
> Warning Low Space Indicator:10% (1023 GiB)
> Critical Space Action Blocker:5 GiB
>
> But somehow the node appears to think it's an LVM volume? It tries
> to find the volume group (VG) but fails... which is not so strange as it is
> an NFS volume:
>

Could you provide full vdsm.log file with this flow?


> 2020-02-05 14:17:54,190+ WARN  (monitor/f5d2f7c) [storage.LVM]
> Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9'] rc=5
> out=[] err=['  Volume group "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not
> found', '  Cannot process volume group
> f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
> 2020-02-05 14:17:54,201+ ERROR (monitor/f5d2f7c) [storage.Monitor]
> Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed
> (monitor:330)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 327, in _setupLoop
> self._setupMonitor()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 349, in _setupMonitor
> self._produceDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in
> wrapper
> value = meth(self, *a, **kw)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line
> 367, in _produceDomain
> self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110,
> in produce
> domain.getRealDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in
> getRealDomain
> return self._cache._realProduce(self._sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134,
> in _realProduce
> domain = self._findDomain(sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151,
> in _findDomain
> return findMethod(sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176,
> in _findUnfetchedDomain
> raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist:
> (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
>
> The volume is actually mounted fine on the node:
>
> On NFS server
>
> Feb  5 15:47:09 back1en rpc.mountd[4899]: authenticated mount request from
> *.*.*.*:673 for /data/ovirt (/data/ovirt)
>
> On the host
>
> mount|grep nfs
>
> *.*.*.*:/data/ovirt on /rhev/data-center/mnt/*.*.*.*:_data_ovirt type nfs
> (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,nolock,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=*.*.*.*,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=*.*.*.*)
>
> And I can see the files:
>
> ls -alrt /rhev/data-center/mnt/*.*.*.*:_data_ovirt
> total 4
> drwxr-xr-x. 5 vdsm kvm61 Oct 26  2016
> 1ed0a635-67ee-4255-aad9-b70822350706
> -rwxr-xr-x. 1 vdsm kvm 0 Feb  5 14:37 __DIRECT_IO_TEST__
> drwxrwxrwx. 3 root root   86 Feb  5 14:37 .
> drwxr-xr-x. 5 vdsm kvm  4096 Feb  5 14:37 ..
>
>
>
>
>
> Met vriendelijke groet, With kind regards,
>
> Jorick Astrego
>
> *Netbulae Virtualization Experts *
> --
> Tel: 053 20 30 270 i...@netbulae.eu Staalsteden 4-3A KvK 08198180
> Fax: 053 20 30 271 www.netbulae.eu 7547 TA Enschede BTW NL821234584B01
> --
>
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/IFTO5WBLVLGTVWKYN3BGLOHAC453UBD5/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/URHZXVEL4N3DS6JS2JJRFRL37R24OSGX/