Yes, OK to attach the needed logs to the bug report...

From: Simone Tiraboschi [mailto:stira...@redhat.com]
Sent: Tuesday, December 22, 2015 9:27 AM
To: Will Dennis; users
Cc: Sahina Bose; Yedidyah Bar David; Nir Soffer
Subject: Re: [ovirt-users] Cannot retrieve answer file from 1st HE host when 
setting up 2nd host



On Tue, Dec 22, 2015 at 3:06 PM, Will Dennis <wden...@nec-labs.com> wrote:
See attached for requested logs


Thanks, the issue is here:
Dec 21 19:40:53 ovirt-node-03 etc-glusterfs-glusterd.vol[1079]: [2015-12-22 
00:40:53.496109] C [MSGID: 106002] 
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action] 0-management: 
Server quorum lost for volume engine. Stopping local bricks.
Dec 21 19:40:53 ovirt-node-03 etc-glusterfs-glusterd.vol[1079]: [2015-12-22 
00:40:53.496410] C [MSGID: 106002] 
[glusterd-server-quorum.c:351:glusterd_do_volume_quorum_action] 0-management: 
Server quorum lost for volume vmdata. Stopping local bricks.

So at that point Gluster lost its quorum and the file system became read-only.
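
(Side note: a quick way to see what VDSM's directTouch probe runs into is to try creating a file on the mountpoint yourself. A minimal sketch, where the probe file name is just a hypothetical placeholder:

import errno
import os

def is_writable(mountpoint):
    # Try to create and remove a scratch file, much like the directTouch probe.
    probe = os.path.join(mountpoint, '__readonly_probe__')  # hypothetical name
    try:
        fd = os.open(probe, os.O_CREAT | os.O_WRONLY)
        os.close(fd)
        os.unlink(probe)
        return True
    except OSError as e:
        if e.errno == errno.EROFS:  # Errno 30: Read-only file system
            return False
        raise

print(is_writable('/rhev/data-center/mnt/glusterSD/localhost:_engine'))
)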

On getStorageDomainsList, VDSM internally raises an exception because the file 
system is read-only:

Thread-141::DEBUG::2015-12-21 
11:29:59,666::fileSD::157::Storage.StorageDomainManifest::(__init__) Reading 
domain in path 
/rhev/data-center/mnt/glusterSD/localhost:_engine/e89b6e64-bd7d-4846-b970-9af32a3295ee
Thread-141::DEBUG::2015-12-21 
11:29:59,666::__init__::320::IOProcessClient::(_run) Starting IOProcess...
Thread-141::DEBUG::2015-12-21 
11:29:59,680::persistentDict::192::Storage.PersistentDict::(__init__) Created a 
persistent dict with FileMetadataRW backend
Thread-141::ERROR::2015-12-21 
11:29:59,686::hsm::2898::Storage.HSM::(getStorageDomainsList) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2882, in getStorageDomainsList
    dom = sdCache.produce(sdUUID=sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 100, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 124, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 143, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/glusterSD.py", line 32, in findDomain
    return GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
  File "/usr/share/vdsm/storage/fileSD.py", line 198, in __init__
    validateFileSystemFeatures(manifest.sdUUID, manifest.mountpoint)
  File "/usr/share/vdsm/storage/fileSD.py", line 93, in 
validateFileSystemFeatures
    oop.getProcessPool(sdUUID).directTouch(testFilePath)
  File "/usr/share/vdsm/storage/outOfProcess.py", line 350, in directTouch
    ioproc.touch(path, flags, mode)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 543, in 
touch
    self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 427, in 
_sendCommand
    raise OSError(errcode, errstr)
OSError: [Errno 30] Read-only file system

But instead of reporting a failure to hosted-engine-setup, it reported a 
successful execution in which it wasn't able to find any storage domain there 
(this one is a real bug; I'm going to open a bug on that, can I attach your 
logs there?):

Thread-141::INFO::2015-12-21 11:29:59,702::logUtils::51::dispatcher::(wrapper) 
Run and protect: getStorageDomainsList, Return response: {'domlist': []}
Thread-141::DEBUG::2015-12-21 
11:29:59,702::task::1191::Storage.TaskManager.Task::(prepare) 
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::finished: {'domlist': []}
Thread-141::DEBUG::2015-12-21 
11:29:59,702::task::595::Storage.TaskManager.Task::(_updateState) 
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::moving from state preparing -> 
state finished
Thread-141::DEBUG::2015-12-21 
11:29:59,703::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) 
Owner.releaseAll requests {} resources {}
Thread-141::DEBUG::2015-12-21 
11:29:59,703::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) 
Owner.cancelAll requests {}
Thread-141::DEBUG::2015-12-21 
11:29:59,703::task::993::Storage.TaskManager.Task::(_decref) 
Task=`96a9ea03-dc13-483e-9b17-b55a759c9b44`::ref 0 aborting False
Thread-141::INFO::2015-12-21 
11:29:59,704::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler 
for 127.0.0.1:39718 stopped

And so, because VDSM doesn't report any existing storage domain, 
hosted-engine-setup assumes that you are going to deploy the first host, hence 
your original issue.
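
To make the bug concrete, here is a minimal sketch (NOT VDSM's actual code; 
produce_domain is a hypothetical stand-in for sdCache.produce) of the 
error-handling pattern that yields an empty domain list instead of a failure:

def produce_domain(sd_uuid):
    # Hypothetical stand-in: on a read-only mount the directTouch probe
    # raises OSError(30), as in the traceback above.
    raise OSError(30, 'Read-only file system')

def get_storage_domains_list(sd_uuids):
    domains = []
    for sd_uuid in sd_uuids:
        try:
            domains.append(produce_domain(sd_uuid))
        except OSError:
            # Swallowing the error here is what yields {'domlist': []};
            # hosted-engine-setup then assumes a first-host deployment.
            continue
    return domains

print(get_storage_domains_list(['e89b6e64-bd7d-4846-b970-9af32a3295ee']))
# -> [] instead of an error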



From: Simone Tiraboschi [mailto:stira...@redhat.com]
Sent: Tuesday, December 22, 2015 8:56 AM
To: Will Dennis
Cc: Sahina Bose; Yedidyah Bar David

Subject: Re: [ovirt-users] Cannot retrieve answer file from 1st HE host when 
setting up 2nd host


On Tue, Dec 22, 2015 at 2:44 PM, Will Dennis <wden...@nec-labs.com> wrote:
Which logs are needed?

Let's start with vdsm.log and /var/log/messages.
Then, it's quite strange that you have that amount of data in mom.log, so that 
one could also be interesting.


/var/log/vdsm
total 24M
drwxr-xr-x   3 vdsm kvm  4.0K Dec 18 20:10 .
drwxr-xr-x. 13 root root 4.0K Dec 20 03:15 ..
drwxr-xr-x   2 vdsm kvm     6 Dec  9 03:24 backup
-rw-r--r--   1 vdsm kvm  2.5K Dec 21 11:29 connectivity.log
-rw-r--r--   1 vdsm kvm  173K Dec 21 11:21 mom.log
-rw-r--r--   1 vdsm kvm  2.0M Dec 17 10:09 mom.log.1
-rw-r--r--   1 vdsm kvm  2.0M Dec 17 04:06 mom.log.2
-rw-r--r--   1 vdsm kvm  2.0M Dec 16 22:03 mom.log.3
-rw-r--r--   1 vdsm kvm  2.0M Dec 16 16:00 mom.log.4
-rw-r--r--   1 vdsm kvm  2.0M Dec 16 09:57 mom.log.5
-rw-r--r--   1 root root 115K Dec 21 11:29 supervdsm.log
-rw-r--r--   1 root root 2.7K Oct 16 11:38 upgrade.log
-rw-r--r--   1 vdsm kvm   13M Dec 22 08:44 vdsm.log


From: Simone Tiraboschi [mailto:stira...@redhat.com]
Sent: Tuesday, December 22, 2015 3:58 AM
To: Will Dennis; Sahina Bose
Cc: Yedidyah Bar David; users
Subject: Re: [ovirt-users] Cannot retrieve answer file from 1st HE host when 
setting up 2nd host



On Tue, Dec 22, 2015 at 2:09 AM, Will Dennis <wden...@nec-labs.com> wrote:
http://ur1.ca/ocstf


2015-12-21 11:28:39 DEBUG otopi.plugins.otopi.dialog.human 
dialog.__logString:219 DIALOG:SEND                 Please specify the full 
shared storage connection path to use (example: host:/path):
2015-12-21 11:28:55 DEBUG otopi.plugins.otopi.dialog.human 
dialog.__logString:219 DIALOG:RECEIVE    localhost:/engine

OK, so you are trying to deploy hosted-engine on GlusterFS in a hyper-converged 
way (using the same hosts for virtualization and for serving GlusterFS). 
Unfortunately I have to advise you that this is not a supported configuration 
on oVirt 3.6 due to several open bugs.
So I'm glad you can help us test it, but I have to point out that today this 
schema is not production ready.

In your case it seems that VDSM correctly connects the GlusterFS volume, seeing 
all the bricks:

2015-12-21 11:28:55 DEBUG otopi.plugins.ovirt_hosted_engine_setup.storage.nfs 
plugin.execute:936 execute-output: ('/sbin/gluster', '--mode=script', '--xml', 
'volume', 'info', 'engine', '--remote-host=localhost') stdout:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
  <opRet>0</opRet>
  <opErrno>0</opErrno>
  <opErrstr/>
  <volInfo>
    <volumes>
      <volume>
        <name>engine</name>
        <id>974c9da4-b236-4fc1-b26a-645f14601db8</id>
        <status>1</status>
        <statusStr>Started</statusStr>
        <brickCount>6</brickCount>
        <distCount>3</distCount>
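
(As an aside, that XML can be checked programmatically; a minimal sketch, 
assuming the same gluster CLI invocation as logged above:

import subprocess
import xml.etree.ElementTree as ET

out = subprocess.check_output([
    '/sbin/gluster', '--mode=script', '--xml',
    'volume', 'info', 'engine', '--remote-host=localhost',
])
vol = ET.fromstring(out).find('./volInfo/volumes/volume')
# Expected here: engine Started 6
print(vol.findtext('name'), vol.findtext('statusStr'), vol.findtext('brickCount'))
)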

but then VDSM doesn't find any storage domain there:

otopi.plugins.ovirt_hosted_engine_setup.storage.storage.Plugin._late_customization
2015-12-21 11:29:58 DEBUG 
otopi.plugins.ovirt_hosted_engine_setup.storage.storage 
storage._getExistingDomain:476 _getExistingDomain
2015-12-21 11:29:58 DEBUG 
otopi.plugins.ovirt_hosted_engine_setup.storage.storage 
storage._storageServerConnection:638 connectStorageServer
2015-12-21 11:29:58 DEBUG 
otopi.plugins.ovirt_hosted_engine_setup.storage.storage 
storage._storageServerConnection:701 {'status': {'message': 'OK', 'code': 0}, 
'statuslist': [{'status': 0, 'id': '67ece152-dd66-444c-8d18-4249d1b8f488'}]}
2015-12-21 11:29:58 DEBUG 
otopi.plugins.ovirt_hosted_engine_setup.storage.storage 
storage._getStorageDomainsList:595 getStorageDomainsList
2015-12-21 11:29:59 DEBUG 
otopi.plugins.ovirt_hosted_engine_setup.storage.storage 
storage._getStorageDomainsList:598 {'status': {'message': 'OK', 'code': 0}, 
'domlist': []}

Can you please also attach the corresponding VDSM logs?

Adding Sahina here.


On Dec 21, 2015, at 11:58 AM, Simone Tiraboschi <stira...@redhat.com> wrote:

On Mon, Dec 21, 2015 at 5:52 PM, Will Dennis <wden...@nec-labs.com> wrote:

However, when I went to the 3rd host and did the setup, I selected 'glusterfs' 
and gave the path of the engine volume; it came back and incorrectly identified 
it as the first host instead of an additional host... How does setup determine 
that? I confirmed on this 3rd host that the engine volume is available and has 
the GUID subfolder of the hosted engine...


Can you please attach a log of hosted-engine-setup also from there?



_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
