On 2/6/20 6:22 PM, Amit Bawer wrote:
>
>
> On Thu, Feb 6, 2020 at 2:54 PM Jorick Astrego <jor...@netbulae.eu
> <mailto:jor...@netbulae.eu>> wrote:
>
>
>     On 2/6/20 1:44 PM, Amit Bawer wrote:
>>
>>
>>     On Thu, Feb 6, 2020 at 1:07 PM Jorick Astrego <jor...@netbulae.eu
>>     <mailto:jor...@netbulae.eu>> wrote:
>>
>>         Here you go, this is from the activation I just did a couple
>>         of minutes ago.
>>
>>     I was hoping to see how it was first connected to the host, but the
>>     log doesn't go that far back. Anyway, the storage domain type is set
>>     by the engine, and as far as I saw vdsm never tries to guess it.
>
>     I put the host in maintenance and activated it again; this should
>     give you some more info. See the attached log.
>
>>     Could you query the engine db about the misbehaving domain and
>>     paste the results?
>>
>>     # su - postgres
>>     Last login: Thu Feb  6 07:17:52 EST 2020 on pts/0
>>     -bash-4.2$ LD_LIBRARY_PATH=/opt/rh/rh-postgresql10/root/lib64/  
>>     /opt/rh/rh-postgresql10/root/usr/bin/psql engine
>>     psql (10.6)
>>     Type "help" for help.
>>     engine=# select * from storage_domain_static where id =
>>     'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>
>
>         engine=# select * from storage_domain_static where id =
>         'f5d2f7c6-093f-46d6-a844-224d92db5ef9' ;
>         id                                    | f5d2f7c6-093f-46d6-a844-224d92db5ef9
>         storage                               | b8b456f0-27c3-49b9-b5e9-9fa81fb3cdaa
>         storage_name                          | backupnfs
>         storage_domain_type                   | 1
>         storage_type                          | 1
>         storage_domain_format_type            | 4
>         _create_date                          | 2018-01-19 13:31:25.899738+01
>         _update_date                          | 2019-02-14 14:36:22.3171+01
>         recoverable                           | t
>         last_time_used_as_master              | 1530772724454
>         storage_description                   |
>         storage_comment                       |
>         wipe_after_delete                     | f
>         warning_low_space_indicator           | 10
>         critical_space_action_blocker         | 5
>         first_metadata_device                 |
>         vg_metadata_device                    |
>         discard_after_delete                  | f
>         backup                                | f
>         warning_low_confirmed_space_indicator | 0
>         block_size                            | 512
>         (1 row)
>
>
>
> Thanks for sharing,
>
> The storage_type in the db is indeed NFS (1), and the storage_domain_format_type
> is 4. For oVirt 4.3 the storage_domain_format_type is 5 by default, and a
> datacenter upgrade is usually required for the 4.2 to 4.3 migration; I'm not
> sure that is possible in your current setup, since you have 4.2 nodes using
> this storage as well.
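>
> If you want to double-check the datacenter side, here is a minimal sketch
> (not something the engine ships) that lists each domain's format next to the
> compatibility version of the pool it is attached to. It assumes it runs on the
> engine host as a user that can reach the engine db over the local socket (e.g.
> postgres, as in your psql session), that psycopg2 is installed, and that the
> storage_pool_iso_map / storage_pool tables carry the storage_id,
> storage_pool_id and compatibility_version columns as in the 4.x schema -
> please verify against your own schema before relying on it:
>
> import psycopg2
>
> QUERY = """
> SELECT sds.storage_name,
>        sds.storage_domain_format_type,
>        sp.name AS pool_name,
>        sp.compatibility_version
>   FROM storage_domain_static sds
>   LEFT JOIN storage_pool_iso_map m ON m.storage_id = sds.id
>   LEFT JOIN storage_pool sp ON sp.id = m.storage_pool_id
> """
>
> # Connect over the local socket as the current db user (peer auth).
> conn = psycopg2.connect(dbname="engine")
> with conn, conn.cursor() as cur:
>     cur.execute(QUERY)
>     for name, fmt, pool, compat in cur.fetchall():
>         print("%s: format=%s pool=%s compatibility=%s" % (name, fmt, pool, compat))
> conn.close()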
>
> Regarding the repeating monitor failure for the SD:
>
> 2020-02-05 14:17:54,190+0000 WARN  (monitor/f5d2f7c) [storage.LVM]
> Reloading VGs failed (vgs=[u'f5d2f7c6-093f-46d6-a844-224d92db5ef9']
> rc=5 out=[] err=['  Volume group
> "f5d2f7c6-093f-46d6-a844-224d92db5ef9" not found', '  Cannot process
> volume group f5d2f7c6-093f-46d6-a844-224d92db5ef9']) (lvm:470)
>
> This error means that the monitor first tried to query the SD as a VG and
> failed; this is expected, as it comes from the fallback code used to find a
> domain missing from the SD cache:
>
> def _findUnfetchedDomain(self, sdUUID):
>     ...
>     for mod in (blockSD, glusterSD, localFsSD, nfsSD):
>         try:
>             return mod.findDomain(sdUUID)
>         except se.StorageDomainDoesNotExist:
>             pass
>         except Exception:
>             self.log.error(
>                 "Error while looking for domain `%s`",
>                 sdUUID, exc_info=True)
>
>     raise se.StorageDomainDoesNotExist(sdUUID)
>
> 2020-02-05 14:17:54,201+0000 ERROR (monitor/f5d2f7c) [storage.Monitor]
> Setting up monitor for f5d2f7c6-093f-46d6-a844-224d92db5ef9 failed
> (monitor:330)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 327, in _setupLoop
>     self._setupMonitor()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 349, in _setupMonitor
>     self._produceDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 159, in
> wrapper
>     value = meth(self, *a, **kw)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py",
> line 367, in _produceDomain
>     self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line
> 110, in produce
>     domain.getRealDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line
> 51, in getRealDomain
>     return self._cache._realProduce(self._sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line
> 134, in _realProduce
>     domain = self._findDomain(sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line
> 151, in _findDomain
>     return findMethod(sdUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line
> 176, in _findUnfetchedDomain
>     raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist:
> (u'f5d2f7c6-093f-46d6-a844-224d92db5ef9',)
>
> This part of the error means the domain could not be found as any of the
> possible domain types, including NFS.
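>
> As a quick sanity check on the host itself, a small sketch like the following
> (not vdsm code; it only assumes vdsm's default mount root of
> /rhev/data-center/mnt and the usual <mount>/<sdUUID>/dom_md/metadata layout of
> file-based domains) can show whether the domain's metadata is visible under
> any mounted export:
>
> import glob
> import os
>
> SD_UUID = "f5d2f7c6-093f-46d6-a844-224d92db5ef9"
> MNT_ROOT = "/rhev/data-center/mnt"  # assumption: default vdsm mount root
>
> # File-based (NFS) domains keep their metadata in <mount>/<sdUUID>/dom_md/metadata.
> pattern = os.path.join(MNT_ROOT, "*", SD_UUID, "dom_md", "metadata")
> matches = glob.glob(pattern)
> if matches:
>     for path in matches:
>         print("found domain metadata at: %s" % path)
> else:
>     print("no domain metadata matching %s - the export is probably not "
>           "mounted or is mounted elsewhere" % pattern)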
>
> Are you able to create a new NFS storage domain on the same storage
> server (on another export path, so as not to harm the existing one)?
> If you succeed in connecting to it from the 4.3 datacenter, it could
> mean the v4 format is the issue; otherwise it could mean there is an
> issue with different NFS settings required for 4.3.

Well, this will be a problem either way: when I add a new NFS domain it will
not be storage_domain_format_type 5, as the DC is still on 4.2.

Also, if I do add a format 5 NFS domain, won't the 4.2 nodes try to mount it,
fail, and then become non-responsive, taking the whole running cluster down?


Regards,

Jorick Astrego







Met vriendelijke groet, With kind regards,

Jorick Astrego

Netbulae Virtualization Experts 

----------------

        Tel: 053 20 30 270 | i...@netbulae.eu | Staalsteden 4-3A  | KvK 08198180
        Fax: 053 20 30 271 | www.netbulae.eu  | 7547 TA Enschede | BTW NL821234584B01

----------------

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JPINMHNIYGVCW5UAFAHFQEN3GNF5VZSB/
