Re: [Users] Data Center Non Responsive / Contending

2014-03-05 Thread Giorgio Bersano
2014-03-05 14:58 GMT+01:00 Liron Aravot :
>
> Giorgio,
> I've added the following patch to resolve the issue: 
> http://gerrit.ovirt.org/#/c/25424/
> Have you opened a bug for it? If so, please send me the bug number so I
> can assign the patch to it.
> Thanks.

Good that you already have a patch. The bug is BZ 1072900.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Data Center Non Responsive / Contending

2014-03-05 Thread Liron Aravot


- Original Message -
> From: "Giorgio Bersano" 
> To: "Liron Aravot" 
> Cc: "Meital Bourvine" , "users@ovirt.org" 
> , fsimo...@redhat.com
> Sent: Wednesday, March 5, 2014 1:19:34 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 22:35 GMT+01:00 Liron Aravot :
> >
> >
> > - Original Message -
> >> From: "Giorgio Bersano" 
> >> To: "Liron Aravot" 
> >> Cc: "Meital Bourvine" , "users@ovirt.org"
> >> , fsimo...@redhat.com
> >> Sent: Tuesday, March 4, 2014 6:10:27 PM
> >> Subject: Re: [Users] Data Center Non Responsive / Contending
> >>
> >> 2014-03-04 16:37 GMT+01:00 Liron Aravot :
> >> >
> >> >
> >> > - Original Message -
> >> >> From: "Giorgio Bersano" 
> >> >> To: "Liron Aravot" 
> >> >> Cc: "Meital Bourvine" , "users@ovirt.org"
> >> >> , fsimo...@redhat.com
> >> >> Sent: Tuesday, March 4, 2014 5:31:01 PM
> >> >> Subject: Re: [Users] Data Center Non Responsive / Contending
> >> >>
> >> >> 2014-03-04 16:03 GMT+01:00 Liron Aravot :
> >> >> > Hi Giorgio,
> >> >> > Apparently the issue is caused because there is no connectivity to
> >> >> > the export domain and then we fail on spmStart - that's obviously a
> >> >> > bug that shouldn't happen.
> >> >>
> >> >> Hi Liron,
> >> >> we are reaching the same conclusion.
> >> >>
> >> >> > can you open a bug for the issue?
> >> >> Surely I will
> >> >>
> >> >> > in the meanwhile, as it seems to still exist - seems to me like the
> >> >> > way
> >> >> > for
> >> >> > solving it would be either to fix the connectivity issue between vdsm
> >> >> > and
> >> >> > the storage domain or to downgrade your vdsm version to before this
> >> >> > issue
> >> >> > was introduced.
> >> >>
> >> >>
> >> >> I have some problems with your suggestion(s):
> >> >> - I cannot fix the connectivity between vdsm and the storage domain
> >> >> because, as I already said, it is exposed by a VM in this very same
> >> >> DataCenter, and if the DC doesn't come up, the NFS server can't either.
> >> >> - I don't understand what it means to downgrade vdsm: to which
> >> >> point in time?
> >> >>
> >> >> It seems I've put myself - again - in a "chicken or the egg"
> >> >> situation, where the SD depends on THIS export domain but
> >> >> the export domain isn't available if the DC isn't running.
> >> >>
> >> >> This export domain isn't that important to me. I can throw it away
> >> >> without any problem.
> >> >>
> >> >> What if we edit the DB and remove any instances related to it? Any
> >> >> adverse consequences?
> >> >>
> >> >
> >> > Ok, please perform a full db backup before attempting the following:
> >> > 1. right-click on the domain and choose "Destroy"
> >> > 2. move all hosts to maintenance
> >> > 3. log into the database and run the following SQL command:
> >> > update storage_pool where id = '{you id goes here}' set
> >> > master_domain_version = master_domain_version + 1;
> >> > 4. activate a host.
> >>
> >> Ok Liron, that did the trick!
> >>
> 
> Just for the record, the correct command was this one:
> 
> update storage_pool  set master_domain_version = master_domain_version + 1
> where id = '{your id goes here}' ;
> 
> Best regards,
> Giorgio.

Giorgio,
I've added the following patch to resolve the issue: 
http://gerrit.ovirt.org/#/c/25424/
Have you opened a bug for it? If so, please send me the bug number so I
can assign the patch to it.
Thanks.
> 
> >> Up and running again, even the VM that is supposed to be the server
> >> acting as export domain.
> >>
> >> Now I've to run away as I'm late to a meeting but tomorrow I'll file a
> >> bug regarding this.
> >>
> >> Thanks to you and Meital for your assistance,
> >> Giorgio.
> >
> > Sure, happy that everything is fine!
> >>
> 


Re: [Users] Data Center Non Responsive / Contending

2014-03-05 Thread Giorgio Bersano
2014-03-04 22:35 GMT+01:00 Liron Aravot :
>
>
> - Original Message -
>> From: "Giorgio Bersano" 
>> To: "Liron Aravot" 
>> Cc: "Meital Bourvine" , "users@ovirt.org" 
>> , fsimo...@redhat.com
>> Sent: Tuesday, March 4, 2014 6:10:27 PM
>> Subject: Re: [Users] Data Center Non Responsive / Contending
>>
>> 2014-03-04 16:37 GMT+01:00 Liron Aravot :
>> >
>> >
>> > - Original Message -
>> >> From: "Giorgio Bersano" 
>> >> To: "Liron Aravot" 
>> >> Cc: "Meital Bourvine" , "users@ovirt.org"
>> >> , fsimo...@redhat.com
>> >> Sent: Tuesday, March 4, 2014 5:31:01 PM
>> >> Subject: Re: [Users] Data Center Non Responsive / Contending
>> >>
>> >> 2014-03-04 16:03 GMT+01:00 Liron Aravot :
>> >> > Hi Giorgio,
>> >> > Apparently the issue is caused because there is no connectivity to the
>> >> > export domain and then we fail on spmStart - that's obviously a bug that
>> >> > shouldn't happen.
>> >>
>> >> Hi Liron,
>> >> we are reaching the same conclusion.
>> >>
>> >> > can you open a bug for the issue?
>> >> Surely I will
>> >>
>> >> > in the meanwhile, as it seems to still exist - seems to me like the way
>> >> > for
>> >> > solving it would be either to fix the connectivity issue between vdsm
>> >> > and
>> >> > the storage domain or to downgrade your vdsm version to before this
>> >> > issue
>> >> > was introduced.
>> >>
>> >>
>> >> I have some problems with your suggestion(s):
>> >> - I cannot fix the connectivity between vdsm and the storage domain
>> >> because, as I already said, it is exposed by a VM in this very same
>> >> DataCenter, and if the DC doesn't come up, the NFS server can't either.
>> >> - I don't understand what it means to downgrade vdsm: to which
>> >> point in time?
>> >>
>> >> It seems I've put myself - again - in a "chicken or the egg"
>> >> situation, where the SD depends on THIS export domain but
>> >> the export domain isn't available if the DC isn't running.
>> >>
>> >> This export domain isn't that important to me. I can throw it away
>> >> without any problem.
>> >>
>> >> What if we edit the DB and remove any instances related to it? Any
>> >> adverse consequences?
>> >>
>> >
>> > Ok, please perform a full db backup before attempting the following:
>> > 1. right-click on the domain and choose "Destroy"
>> > 2. move all hosts to maintenance
>> > 3. log into the database and run the following SQL command:
>> > update storage_pool where id = '{you id goes here}' set
>> > master_domain_version = master_domain_version + 1;
>> > 4. activate a host.
>>
>> Ok Liron, that did the trick!
>>

Just for the record, the correct command was this one:

update storage_pool  set master_domain_version = master_domain_version + 1
where id = '{your id goes here}' ;
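As an aside, the difference between the two statements can be checked
without touching the engine database. The sketch below is only an
illustration under stated assumptions: it uses Python's built-in sqlite3
module with an in-memory table as a stand-in for the real engine database
(which is PostgreSQL), the table holds just the two columns mentioned in
this thread, the pool id is made up, and the helper names are hypothetical.

```python
import sqlite3

# Made-up pool id, for illustration only.
POOL_ID = "00000000-0000-0000-0000-000000000001"


def make_demo_db():
    """Build an in-memory stand-in with only the two columns named in the thread."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE storage_pool ("
        "id TEXT PRIMARY KEY, master_domain_version INTEGER)"
    )
    conn.execute("INSERT INTO storage_pool VALUES (?, ?)", (POOL_ID, 3))
    conn.commit()
    return conn


def bump_master_version(conn, pool_id):
    """Run the corrected UPDATE (SET before WHERE) and return the new version."""
    cur = conn.execute(
        "UPDATE storage_pool "
        "SET master_domain_version = master_domain_version + 1 "
        "WHERE id = ?",
        (pool_id,),
    )
    if cur.rowcount != 1:
        # Zero rows means the id did not match; undo and refuse to continue.
        conn.rollback()
        raise ValueError("expected exactly one storage pool, got %d" % cur.rowcount)
    conn.commit()
    row = conn.execute(
        "SELECT master_domain_version FROM storage_pool WHERE id = ?",
        (pool_id,),
    ).fetchone()
    return row[0]


def where_before_set_is_rejected(conn, pool_id):
    """The originally posted word order (WHERE before SET) is a syntax error."""
    try:
        conn.execute(
            "UPDATE storage_pool WHERE id = ? "
            "SET master_domain_version = master_domain_version + 1",
            (pool_id,),
        )
    except sqlite3.OperationalError:
        return True
    return False
```

Checking that exactly one row was changed before committing is a cheap
guard against a mistyped id; on the real database the same idea would
apply inside a transaction, after the full backup mentioned earlier.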

Best regards,
Giorgio.

>> Up and running again, even the VM that is supposed to be the server
>> acting as export domain.
>>
>> Now I've to run away as I'm late to a meeting but tomorrow I'll file a
>> bug regarding this.
>>
>> Thanks to you and Meital for your assistance,
>> Giorgio.
>
> Sure, happy that everything is fine!
>>


Re: [Users] Data Center Non Responsive / Contending

2014-03-05 Thread Giorgio Bersano
2014-03-04 17:32 GMT+01:00 Sven Kieske :
> Would you mind sharing the link to it?
> I didn't find it.
>
> Thanks!
>

Here it is: BZ 1072900 ( https://bugzilla.redhat.com/show_bug.cgi?id=1072900 )


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Liron Aravot


- Original Message -
> From: "Giorgio Bersano" 
> To: "Liron Aravot" 
> Cc: "Meital Bourvine" , "users@ovirt.org" 
> , fsimo...@redhat.com
> Sent: Tuesday, March 4, 2014 6:10:27 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 16:37 GMT+01:00 Liron Aravot :
> >
> >
> > - Original Message -
> >> From: "Giorgio Bersano" 
> >> To: "Liron Aravot" 
> >> Cc: "Meital Bourvine" , "users@ovirt.org"
> >> , fsimo...@redhat.com
> >> Sent: Tuesday, March 4, 2014 5:31:01 PM
> >> Subject: Re: [Users] Data Center Non Responsive / Contending
> >>
> >> 2014-03-04 16:03 GMT+01:00 Liron Aravot :
> >> > Hi Giorgio,
> >> > Apparently the issue is caused because there is no connectivity to the
> >> > export domain and then we fail on spmStart - that's obviously a bug that
> >> > shouldn't happen.
> >>
> >> Hi Liron,
> >> we are reaching the same conclusion.
> >>
> >> > can you open a bug for the issue?
> >> Surely I will
> >>
> >> > in the meanwhile, as it seems to still exist - seems to me like the way
> >> > for
> >> > solving it would be either to fix the connectivity issue between vdsm
> >> > and
> >> > the storage domain or to downgrade your vdsm version to before this
> >> > issue
> >> > was introduced.
> >>
> >>
> >> I have some problems with your suggestion(s):
> >> - I cannot fix the connectivity between vdsm and the storage domain
> >> because, as I already said, it is exposed by a VM in this very same
> >> DataCenter, and if the DC doesn't come up, the NFS server can't either.
> >> - I don't understand what it means to downgrade vdsm: to which
> >> point in time?
> >>
> >> It seems I've put myself - again - in a "chicken or the egg"
> >> situation, where the SD depends on THIS export domain but
> >> the export domain isn't available if the DC isn't running.
> >>
> >> This export domain isn't that important to me. I can throw it away
> >> without any problem.
> >>
> >> What if we edit the DB and remove any instances related to it? Any
> >> adverse consequences?
> >>
> >
> > Ok, please perform a full db backup before attempting the following:
> > 1. right-click on the domain and choose "Destroy"
> > 2. move all hosts to maintenance
> > 3. log into the database and run the following SQL command:
> > update storage_pool where id = '{you id goes here}' set
> > master_domain_version = master_domain_version + 1;
> > 4. activate a host.
> 
> Ok Liron, that did the trick!
> 
> Up and running again, even the VM that is supposed to be the server
> acting as export domain.
> 
> Now I've to run away as I'm late to a meeting but tomorrow I'll file a
> bug regarding this.
> 
> Thanks to you and Meital for your assistance,
> Giorgio.

Sure, happy that everything is fine!
> 


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Sven Kieske
Would you mind sharing the link to it?
I didn't find it.

Thanks!

On 04.03.2014 16:31, Giorgio Bersano wrote:
>> can you open a bug for the issue?
> Surely I will
> 

-- 
Kind regards

Sven Kieske

System administrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Managing director: Robert Meyer
Tax no.: 331/5721/1033, VAT ID: DE814773217, HRA 6640, AG Bad Oeynhausen
General partner: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Giorgio Bersano
2014-03-04 16:37 GMT+01:00 Liron Aravot :
>
>
> - Original Message -
>> From: "Giorgio Bersano" 
>> To: "Liron Aravot" 
>> Cc: "Meital Bourvine" , "users@ovirt.org" 
>> , fsimo...@redhat.com
>> Sent: Tuesday, March 4, 2014 5:31:01 PM
>> Subject: Re: [Users] Data Center Non Responsive / Contending
>>
>> 2014-03-04 16:03 GMT+01:00 Liron Aravot :
>> > Hi Giorgio,
>> > Apparently the issue is caused because there is no connectivity to the
>> > export domain and then we fail on spmStart - that's obviously a bug that
>> > shouldn't happen.
>>
>> Hi Liron,
>> we are reaching the same conclusion.
>>
>> > can you open a bug for the issue?
>> Surely I will
>>
>> > in the meanwhile, as it seems to still exist - seems to me like the way for
>> > solving it would be either to fix the connectivity issue between vdsm and
>> > the storage domain or to downgrade your vdsm version to before this issue
>> > was introduced.
>>
>>
>> I have some problems with your suggestion(s):
>> - I cannot fix the connectivity between vdsm and the storage domain
>> because, as I already said, it is exposed by a VM in this very same
>> DataCenter, and if the DC doesn't come up, the NFS server can't either.
>> - I don't understand what it means to downgrade vdsm: to which
>> point in time?
>>
>> It seems I've put myself - again - in a "chicken or the egg"
>> situation, where the SD depends on THIS export domain but
>> the export domain isn't available if the DC isn't running.
>>
>> This export domain isn't that important to me. I can throw it away
>> without any problem.
>>
>> What if we edit the DB and remove any instances related to it? Any
>> adverse consequences?
>>
>
> Ok, please perform a full db backup before attempting the following:
> 1. right-click on the domain and choose "Destroy"
> 2. move all hosts to maintenance
> 3. log into the database and run the following SQL command:
> update storage_pool where id = '{you id goes here}' set master_domain_version 
> = master_domain_version + 1;
> 4. activate a host.

Ok Liron, that did the trick!

Up and running again, even the VM that is supposed to be the server
acting as export domain.

Now I've to run away as I'm late to a meeting but tomorrow I'll file a
bug regarding this.

Thanks to you and Meital for your assistance,
Giorgio.


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Liron Aravot


- Original Message -
> From: "Giorgio Bersano" 
> To: "Liron Aravot" 
> Cc: "Meital Bourvine" , "users@ovirt.org" 
> , fsimo...@redhat.com
> Sent: Tuesday, March 4, 2014 5:31:01 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 16:03 GMT+01:00 Liron Aravot :
> > Hi Giorgio,
> > Apparently the issue is caused because there is no connectivity to the
> > export domain and then we fail on spmStart - that's obviously a bug that
> > shouldn't happen.
> 
> Hi Liron,
> we are reaching the same conclusion.
> 
> > can you open a bug for the issue?
> Surely I will
> 
> > in the meanwhile, as it seems to still exist - seems to me like the way for
> > solving it would be either to fix the connectivity issue between vdsm and
> > the storage domain or to downgrade your vdsm version to before this issue
> > was introduced.
> 
> 
> I have some problems with your suggestion(s):
> - I cannot fix the connectivity between vdsm and the storage domain
> because, as I already said, it is exposed by a VM in this very same
> DataCenter, and if the DC doesn't come up, the NFS server can't either.
> - I don't understand what it means to downgrade vdsm: to which
> point in time?
> 
> It seems I've put myself - again - in a "chicken or the egg"
> situation, where the SD depends on THIS export domain but
> the export domain isn't available if the DC isn't running.
> 
> This export domain isn't that important to me. I can throw it away
> without any problem.
> 
> What if we edit the DB and remove any instances related to it? Any
> adverse consequences?
> 

Ok, please perform a full db backup before attempting the following:
1. right-click on the domain and choose "Destroy"
2. move all hosts to maintenance
3. log into the database and run the following SQL command:
update storage_pool where id = '{you id goes here}' set master_domain_version = 
master_domain_version + 1;
4. activate a host.
> 
> 
> >
> > 6a519e95-62ef-445b-9a98-f05c81592c85::WARNING::2014-03-04 13:05:31,489::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "1810e5eb-9eb6-4797-ac50-8023a939f312" not found', '  Skipping volume group 1810e5eb-9eb6-4797-ac50-8023a939f312']
> > 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04 13:05:31,499::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 1810e5eb-9eb6-4797-ac50-8023a939f312 not found
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> > dom = findMethod(sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
> > raise se.StorageDomainDoesNotExist(sdUUID)
> > StorageDomainDoesNotExist: Storage domain does not exist:
> > (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
> > 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04
> > 13:05:31,500::sp::329::Storage.StoragePool::(startSpm) Unexpected error
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/sp.py", line 296, in startSpm
> > self._updateDomainsRole()
> >   File "/usr/share/vdsm/storage/securable.py", line 75, in wrapper
> > return method(self, *args, **kwargs)
> >   File "/usr/share/vdsm/storage/sp.py", line 205, in _updateDomainsRole
> > domain = sdCache.produce(sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
> > domain.getRealDomain()
> >   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
> > return self._cache._realProduce(self._sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
> > domain = self._findDomain(sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> > dom = findMethod(sdUUID)
> >   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
> > raise se.StorageDomainDoesNotExist(sdUUID)
> >
> >
> >
> >
> > - Original Message -
> >> From: "Giorgio Bersano" 
> >> To: "Meital Bourvine" 
> >> Cc: "users@ovirt.org" 
> >> Sent: Tuesday, March 4, 2014 4:35:07 PM
> >> Subject: Re: [Users] Data Center Non Responsive / Contending
> >>
> >> 2014-03-04 15:23 GMT+01:00 Meital Bourvine :
> >> > Master data domain mu

Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Giorgio Bersano
2014-03-04 16:25 GMT+01:00 Liron Aravot :
>
>
> - Original Message -
>> From: "Liron Aravot" 
>> To: "Giorgio Bersano" 
>> Cc: "users@ovirt.org" 
>> Sent: Tuesday, March 4, 2014 5:03:44 PM
>> Subject: Re: [Users] Data Center Non Responsive / Contending
>>
>> Hi Giorgio,
>> Apparently the issue is caused because there is no connectivity to the export
>> domain and then we fail on spmStart - that's obviously a bug that shouldn't
>> happen.
>> can you open a bug for the issue?
>> in the meanwhile, as it seems to still exist - seems to me like the way for
>> solving it would be either to fix the connectivity issue between vdsm and
>> the storage domain or to downgrade your vdsm version to before this issue
>> was introduced.
>
> By the way, a solution we can go with is to remove the domain manually
> from the engine and force a reconstruction of the pool metadata, which
> should resolve the issue.
>

Do you mean "Destroy" from the webadmin?



> Note that if this happens with other domains in the future, the same
> procedure will be required.
> It's your choice how we proceed - let me know which way you'd like
> to go.
>>
>> 6a519e95-62ef-445b-9a98-f05c81592c85::WARNING::2014-03-04 13:05:31,489::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "1810e5eb-9eb6-4797-ac50-8023a939f312" not found', '  Skipping volume group 1810e5eb-9eb6-4797-ac50-8023a939f312']
>> 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04 13:05:31,499::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 1810e5eb-9eb6-4797-ac50-8023a939f312 not found
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>> dom = findMethod(sdUUID)
>>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
>> raise se.StorageDomainDoesNotExist(sdUUID)
>> StorageDomainDoesNotExist: Storage domain does not exist:
>> (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
>> 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04
>> 13:05:31,500::sp::329::Storage.StoragePool::(startSpm) Unexpected error
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/storage/sp.py", line 296, in startSpm
>> self._updateDomainsRole()
>>   File "/usr/share/vdsm/storage/securable.py", line 75, in wrapper
>> return method(self, *args, **kwargs)
>>   File "/usr/share/vdsm/storage/sp.py", line 205, in _updateDomainsRole
>> domain = sdCache.produce(sdUUID)
>>   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
>> domain.getRealDomain()
>>   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>> return self._cache._realProduce(self._sdUUID)
>>   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
>> domain = self._findDomain(sdUUID)
>>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>> dom = findMethod(sdUUID)
>>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
>> raise se.StorageDomainDoesNotExist(sdUUID)
>>
>>
>>
>>
>> - Original Message -
>> > From: "Giorgio Bersano" 
>> > To: "Meital Bourvine" 
>> > Cc: "users@ovirt.org" 
>> > Sent: Tuesday, March 4, 2014 4:35:07 PM
>> > Subject: Re: [Users] Data Center Non Responsive / Contending
>> >
>> > 2014-03-04 15:23 GMT+01:00 Meital Bourvine :
>> > > Master data domain must be reachable in order for the DC to be up.
>> > > Export domain shouldn't affect the dc status.
>> > > Are you sure that you've created the export domain as an export domain,
>> > > and
>> > > not as a regular nfs?
>> > >
>> >
>> > Yes, I am.
>> >
>> > I don't know how to extract this info from the DB, but in webadmin, in
>> > the storage list, I see the following:
>> >
>> > Domain Name: nfs02EXPORT
>> > Domain Type: Export
>> > Storage Type: NFS
>> > Format: V1
>> > Cross Data-Center Status: Inactive
>> > Total Space: [N/A]
>> > Free Space: [N/A]
>> >
>> > ATM my only "Data" Domain is based on iSCSI, no NFS.
>> >
>> >
>> >
>> >
>> >
>> &

Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Giorgio Bersano
2014-03-04 16:03 GMT+01:00 Liron Aravot :
> Hi Giorgio,
> Apparently the issue is caused because there is no connectivity to the export
> domain and then we fail on spmStart - that's obviously a bug that shouldn't
> happen.

Hi Liron,
we are reaching the same conclusion.
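
To make the failure mode concrete, here is a toy model of it. This is not
vdsm code and not the content of any patch; the class and function names
(DomainCache, update_roles_strict, update_roles_tolerant) are invented for
illustration. Only the exception name and the two UUIDs come from this
thread, and treating the UUID from the log as the export domain is an
assumption. It shows how one unreachable non-master domain can abort the
whole role update, and one possible way such a loop could skip it instead.

```python
# UUIDs as they appear in this thread; their roles here are assumptions.
MASTER = "a689cb30-743e-4261-bfd1-b8b194dc85db"
EXPORT = "1810e5eb-9eb6-4797-ac50-8023a939f312"


class StorageDomainDoesNotExist(Exception):
    """Toy stand-in for the exception named in the traceback."""


class DomainCache:
    """Hypothetical cache that only knows the reachable domains."""

    def __init__(self, reachable):
        self._reachable = set(reachable)

    def produce(self, sd_uuid):
        if sd_uuid not in self._reachable:
            raise StorageDomainDoesNotExist(sd_uuid)
        return sd_uuid


def update_roles_strict(cache, master, domains):
    """Any unreachable domain aborts the whole loop."""
    roles = {}
    for sd in domains:
        cache.produce(sd)  # raises for the unreachable export domain
        roles[sd] = "Master" if sd == master else "Regular"
    return roles


def update_roles_tolerant(cache, master, domains):
    """Skip unreachable non-master domains; still fail for the master."""
    roles, skipped = {}, []
    for sd in domains:
        try:
            cache.produce(sd)
        except StorageDomainDoesNotExist:
            if sd == master:
                raise  # the master domain really is required
            skipped.append(sd)
            continue
        roles[sd] = "Master" if sd == master else "Regular"
    return roles, skipped
```

With only the master reachable, the strict variant raises on the export
domain while the tolerant one returns the master's role and reports the
export domain as skipped.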

> can you open a bug for the issue?
Surely I will

> in the meanwhile, as it seems to still exist - seems to me like the way for 
> solving it would be either to fix the connectivity issue between vdsm and the 
> storage domain or to downgrade your vdsm version to before this issue was 
> introduced.


I have some problems with your suggestion(s):
- I cannot fix the connectivity between vdsm and the storage domain
because, as I already said, it is exposed by a VM in this very same
DataCenter, and if the DC doesn't come up, the NFS server can't either.
- I don't understand what it means to downgrade vdsm: to which
point in time?

It seems I've put myself - again - in a "chicken or the egg"
situation, where the SD depends on THIS export domain but
the export domain isn't available if the DC isn't running.

This export domain isn't that important to me. I can throw it away
without any problem.

What if we edit the DB and remove any instances related to it? Any
adverse consequences?



>
> 6a519e95-62ef-445b-9a98-f05c81592c85::WARNING::2014-03-04 13:05:31,489::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "1810e5eb-9eb6-4797-ac50-8023a939f312" not found', '  Skipping volume group 1810e5eb-9eb6-4797-ac50-8023a939f312']
> 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04 13:05:31,499::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 1810e5eb-9eb6-4797-ac50-8023a939f312 not found
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
> raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist: 
> (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
> 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04 
> 13:05:31,500::sp::329::Storage.StoragePool::(startSpm) Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sp.py", line 296, in startSpm
> self._updateDomainsRole()
>   File "/usr/share/vdsm/storage/securable.py", line 75, in wrapper
> return method(self, *args, **kwargs)
>   File "/usr/share/vdsm/storage/sp.py", line 205, in _updateDomainsRole
> domain = sdCache.produce(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
> domain.getRealDomain()
>   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
> return self._cache._realProduce(self._sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
> domain = self._findDomain(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
> raise se.StorageDomainDoesNotExist(sdUUID)
>
>
>
>
> - Original Message -
>> From: "Giorgio Bersano" 
>> To: "Meital Bourvine" 
>> Cc: "users@ovirt.org" 
>> Sent: Tuesday, March 4, 2014 4:35:07 PM
>> Subject: Re: [Users] Data Center Non Responsive / Contending
>>
>> 2014-03-04 15:23 GMT+01:00 Meital Bourvine :
>> > Master data domain must be reachable in order for the DC to be up.
>> > Export domain shouldn't affect the dc status.
>> > Are you sure that you've created the export domain as an export domain, and
>> > not as a regular nfs?
>> >
>>
>> Yes, I am.
>>
>> I don't know how to extract this info from the DB, but in webadmin, in
>> the storage list, I see the following:
>>
>> Domain Name: nfs02EXPORT
>> Domain Type: Export
>> Storage Type: NFS
>> Format: V1
>> Cross Data-Center Status: Inactive
>> Total Space: [N/A]
>> Free Space: [N/A]
>>
>> ATM my only "Data" Domain is based on iSCSI, no NFS.
>>
>>
>>
>>
>>
>> > - Original Message -
>> >> From: "Giorgio Bersano" 
>> >> To: "Meital Bourvine" 
>> >> Cc: "users@ovirt.org" 
>> >> Sent: Tuesday, March 4, 2014 4:16:19 PM
>> >> Subject: Re: [Users] Data Center Non Responsive / Contending
>>

Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Liron Aravot


- Original Message -
> From: "Liron Aravot" 
> To: "Giorgio Bersano" 
> Cc: "users@ovirt.org" 
> Sent: Tuesday, March 4, 2014 5:03:44 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> Hi Giorgio,
> Apparently the issue is caused because there is no connectivity to the export
> domain and then we fail on spmStart - that's obviously a bug that shouldn't
> happen.
> can you open a bug for the issue?
> in the meanwhile, as it seems to still exist - seems to me like the way for
> solving it would be either to fix the connectivity issue between vdsm and
> the storage domain or to downgrade your vdsm version to before this issue
> was introduced.

By the way, a solution we can go with is to remove the domain manually from
the engine and force a reconstruction of the pool metadata, which should
resolve the issue.

Note that if this happens with other domains in the future, the same procedure
will be required.
It's your choice how we proceed - let me know which way you'd like to go.
> 
> 6a519e95-62ef-445b-9a98-f05c81592c85::WARNING::2014-03-04 13:05:31,489::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "1810e5eb-9eb6-4797-ac50-8023a939f312" not found', '  Skipping volume group 1810e5eb-9eb6-4797-ac50-8023a939f312']
> 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04 13:05:31,499::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 1810e5eb-9eb6-4797-ac50-8023a939f312 not found
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
> raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist:
> (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
> 6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04
> 13:05:31,500::sp::329::Storage.StoragePool::(startSpm) Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sp.py", line 296, in startSpm
> self._updateDomainsRole()
>   File "/usr/share/vdsm/storage/securable.py", line 75, in wrapper
> return method(self, *args, **kwargs)
>   File "/usr/share/vdsm/storage/sp.py", line 205, in _updateDomainsRole
> domain = sdCache.produce(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
> domain.getRealDomain()
>   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
> return self._cache._realProduce(self._sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
> domain = self._findDomain(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
> dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
> raise se.StorageDomainDoesNotExist(sdUUID)
> 
> 
> 
> 
> - Original Message -
> > From: "Giorgio Bersano" 
> > To: "Meital Bourvine" 
> > Cc: "users@ovirt.org" 
> > Sent: Tuesday, March 4, 2014 4:35:07 PM
> > Subject: Re: [Users] Data Center Non Responsive / Contending
> > 
> > 2014-03-04 15:23 GMT+01:00 Meital Bourvine :
> > > Master data domain must be reachable in order for the DC to be up.
> > > Export domain shouldn't affect the dc status.
> > > Are you sure that you've created the export domain as an export domain,
> > > and
> > > not as a regular nfs?
> > >
> > 
> > Yes, I am.
> > 
> > I don't know how to extract this info from the DB, but in webadmin, in
> > the storage list, I see the following:
> > 
> > Domain Name: nfs02EXPORT
> > Domain Type: Export
> > Storage Type: NFS
> > Format: V1
> > Cross Data-Center Status: Inactive
> > Total Space: [N/A]
> > Free Space: [N/A]
> > 
> > ATM my only "Data" Domain is based on iSCSI, no NFS.
> > 
> > 
> > 
> > 
> > 
> > > - Original Message -
> > >> From: "Giorgio Bersano" 
> > >> To: "Meital Bourvine" 
> > >> Cc: "users@ovirt.org" 
> > >> Sent: Tuesday, March 4, 2014 4:16:19 PM
> > >> Subject: Re: [Users] Data Center Non Responsive / Contending
> > >>
> > >> 2014-03-04 14:48 GMT+01:00 Meital Bourvine :
> > >> > StorageDomainDoesNotExist: Storage domain does not exist:
>

Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Liron Aravot
Adding Federico.
- Original Message -
> From: "Liron Aravot" 
> To: "Giorgio Bersano" 
> Cc: "users@ovirt.org" 
> Sent: Tuesday, March 4, 2014 5:11:36 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 
> 
> - Original Message -
> > From: "Giorgio Bersano" 
> > To: "Meital Bourvine" 
> > Cc: "users@ovirt.org" 
> > Sent: Tuesday, March 4, 2014 5:06:13 PM
> > Subject: Re: [Users] Data Center Non Responsive / Contending
> > 
> > 2014-03-04 15:38 GMT+01:00 Meital Bourvine :
> > > Ok, and is the iscsi functional at the moment?
> > >
> > 
> > I think so.
> > For example I see in the DB that the id of my Master Data Domain ,
> > dt02clu6070,  is  "a689cb30-743e-4261-bfd1-b8b194dc85db" then
> > 
> > [root@vbox70 ~]# lvs a689cb30-743e-4261-bfd1-b8b194dc85db
> >   LV                                   VG                                   Attr   LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
> >   4a1be3d8-ac7d-46cf-ae1c-ba154bc9a400 a689cb30-743e-4261-bfd1-b8b194dc85db -wi---   3,62g
> >   5c8bb733-4b0c-43a9-9471-0fde3d159fb2 a689cb30-743e-4261-bfd1-b8b194dc85db -wi---  11,00g
> >   7b617ab1-70c1-42ea-9303-ceffac1da72d a689cb30-743e-4261-bfd1-b8b194dc85db -wi---   3,88g
> >   e4b86b91-80ec-4bba-8372-10522046ee6b a689cb30-743e-4261-bfd1-b8b194dc85db -wi---   9,00g
> >   ids                                  a689cb30-743e-4261-bfd1-b8b194dc85db -wi-ao 128,00m
> >   inbox                                a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a- 128,00m
> >   leases                               a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a-   2,00g
> >   master                               a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a-   1,00g
> >   metadata                             a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a- 512,00m
> >   outbox                               a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a- 128,00m
> > 
> > I can read from the LVs that have the LVM Available bit set:
> > 
> > [root@vbox70 ~]# dd if=/dev/a689cb30-743e-4261-bfd1-b8b194dc85db/ids
> > bs=1M of=/dev/null
> > 128+0 records in
> > 128+0 records out
> > 134217728 bytes (134 MB) copied, 0,0323692 s, 4,1 GB/s
> > 
> > [root@vbox70 ~]# dd if=/dev/a689cb30-743e-4261-bfd1-b8b194dc85db/ids
> > bs=1M |od -xc |head -20
> > 00020101221000200030200
> > 020   ! 022 002  \0 003  \0  \0  \0  \0  \0  \0 002  \0  \0
> > 0200001
> >  \0  \0  \0  \0  \0  \0  \0  \0 001  \0  \0  \0  \0  \0  \0  \0
> > 04000010007
> > 001  \0  \0  \0  \0  \0  \0  \0  \a  \0  \0  \0  \0  \0  \0  \0
> > 0603661393862633033
> >  \0  \0  \0  \0  \0  \0  \0  \0   a   6   8   9   c   b   3   0
> > 100372d33342d6532343136622d64662d31
> >   -   7   4   3   e   -   4   2   6   1   -   b   f   d   1   -
> > 120386231623439636435386264
> >   b   8   b   1   9   4   d   c   8   5   d   b  \0  \0  \0  \0
> > 1403638343839663932
> >  \0  \0  \0  \0  \0  \0  \0  \0   8   6   8   4   f   9   2   9
> > 160612d62372d6638346564622d38302d35
> >   -   a   7   b   f   -   4   8   d   e   -   b   0   8   5   -
> > 200656363306539353766306364762e6f62
> >   c   e   0   c   9   e   7   5   0   f   d   c   .   v   b   o
> > 22037782e307270006926de
> >   x   7   0   .   p   r   i  \0 336   &  \0  \0  \0  \0  \0  \0
> > [root@vbox70 ~]#
> > 
> > Obviously I can't read from LVs that aren't available:
> > 
> > [root@vbox70 ~]# dd
> > if=/dev/a689cb30-743e-4261-bfd1-b8b194dc85db/4a1be3d8-ac7d-46cf-ae1c-ba154bc9a400
> > bs=1M of=/dev/null
> > dd: apertura di
> > `/dev/a689cb30-743e-4261-bfd1-b8b194dc85db/4a1be3d8-ac7d-46cf-ae1c-ba154bc9a400':
> > No such file or directory
> > [root@vbox70 ~]#
> > 
> > But those LV are the VM's disks and I suppose it's availability is
> > managed by oVirt
> > 
> 
> please see my previous mail on this thread, the issue seems to be with the
> connectivity to the nfs path, not the iscsi.
> 2014-03-04 13:15:41,167 INFO
> [org.ovirt.engine.core.dal.dbbroker.auditloghandl

Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Liron Aravot


- Original Message -
> From: "Giorgio Bersano" 
> To: "Meital Bourvine" 
> Cc: "users@ovirt.org" 
> Sent: Tuesday, March 4, 2014 5:06:13 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 15:38 GMT+01:00 Meital Bourvine :
> > Ok, and is the iscsi functional at the moment?
> >
> 
> I think so.

Please see my previous mail on this thread; the issue seems to be with the
connectivity to the NFS path, not the iSCSI.
2014-03-04 13:15:41,167 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-27) [1141851d] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to connect Host vbox70 to the Storage Domains nfs02EXPORT.

Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Giorgio Bersano
2014-03-04 15:38 GMT+01:00 Meital Bourvine :
> Ok, and is the iscsi functional at the moment?
>

I think so.
For example, I see in the DB that the id of my Master Data Domain,
dt02clu6070, is "a689cb30-743e-4261-bfd1-b8b194dc85db", so:

[root@vbox70 ~]# lvs a689cb30-743e-4261-bfd1-b8b194dc85db
  LV                                   VG                                   Attr   LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  4a1be3d8-ac7d-46cf-ae1c-ba154bc9a400 a689cb30-743e-4261-bfd1-b8b194dc85db -wi---   3,62g
  5c8bb733-4b0c-43a9-9471-0fde3d159fb2 a689cb30-743e-4261-bfd1-b8b194dc85db -wi---  11,00g
  7b617ab1-70c1-42ea-9303-ceffac1da72d a689cb30-743e-4261-bfd1-b8b194dc85db -wi---   3,88g
  e4b86b91-80ec-4bba-8372-10522046ee6b a689cb30-743e-4261-bfd1-b8b194dc85db -wi---   9,00g
  ids                                  a689cb30-743e-4261-bfd1-b8b194dc85db -wi-ao 128,00m
  inbox                                a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a- 128,00m
  leases                               a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a-   2,00g
  master                               a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a-   1,00g
  metadata                             a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a- 512,00m
  outbox                               a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a- 128,00m

I can read from the LVs that have the LVM Available bit set:

[root@vbox70 ~]# dd if=/dev/a689cb30-743e-4261-bfd1-b8b194dc85db/ids bs=1M of=/dev/null
128+0 records in
128+0 records out
134217728 bytes (134 MB) copied, 0,0323692 s, 4,1 GB/s
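
A minimal sketch of how the Attr column in the lvs listing can be checked programmatically. It assumes the rows are whitespace-separated in the order LV, VG, Attr, LSize, and relies on the fifth lv_attr character being the state flag ('a' for active). The names parse_lvs and is_active are hypothetical helpers, not part of LVM or vdsm, and the sample is a subset of the listing.

```python
from typing import List, NamedTuple


class LogicalVolume(NamedTuple):
    name: str
    vg: str
    attr: str
    size: str


def parse_lvs(text: str) -> List[LogicalVolume]:
    """Parse whitespace-separated 'LV VG Attr LSize' rows, skipping the header."""
    volumes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] == "LV":
            continue  # blank line, short line, or header row
        volumes.append(LogicalVolume(*fields[:4]))
    return volumes


def is_active(lv: LogicalVolume) -> bool:
    # The fifth lv_attr character is the state flag; 'a' means active.
    return len(lv.attr) >= 5 and lv.attr[4] == "a"


SAMPLE = """\
  LV                                   VG                                   Attr   LSize
  4a1be3d8-ac7d-46cf-ae1c-ba154bc9a400 a689cb30-743e-4261-bfd1-b8b194dc85db -wi---   3,62g
  ids                                  a689cb30-743e-4261-bfd1-b8b194dc85db -wi-ao 128,00m
  master                               a689cb30-743e-4261-bfd1-b8b194dc85db -wi-a-   1,00g
"""

active = [lv.name for lv in parse_lvs(SAMPLE) if is_active(lv)]
print(active)
```

With this sample only the special volumes are reported as active, which matches the observation that the VM disk LVs cannot be read until something activates them.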

[root@vbox70 ~]# dd if=/dev/a689cb30-743e-4261-bfd1-b8b194dc85db/ids bs=1M |od -xc |head -20
0000000    2010    1221    0002    0003    0000    0000    0200    0000
        020       ! 022 002  \0 003  \0  \0  \0  \0  \0  \0 002  \0  \0
0000020    0000    0000    0000    0000    0001    0000    0000    0000
         \0  \0  \0  \0  \0  \0  \0  \0 001  \0  \0  \0  \0  \0  \0  \0
0000040    0001    0000    0000    0000    0007    0000    0000    0000
        001  \0  \0  \0  \0  \0  \0  \0  \a  \0  \0  \0  \0  \0  \0  \0
0000060    0000    0000    0000    0000    3661    3938    6263    3033
         \0  \0  \0  \0  \0  \0  \0  \0   a   6   8   9   c   b   3   0
0000100    372d    3334    2d65    3234    3136    622d    6466    2d31
          -   7   4   3   e   -   4   2   6   1   -   b   f   d   1   -
0000120    3862    3162    3439    6364    3538    6264    0000    0000
          b   8   b   1   9   4   d   c   8   5   d   b  \0  \0  \0  \0
0000140    0000    0000    0000    0000    3638    3438    3966    3932
         \0  \0  \0  \0  \0  \0  \0  \0   8   6   8   4   f   9   2   9
0000160    612d    6237    2d66    3834    6564    622d    3830    2d35
          -   a   7   b   f   -   4   8   d   e   -   b   0   8   5   -
0000200    6563    6330    6539    3537    6630    6364    762e    6f62
          c   e   0   c   9   e   7   5   0   f   d   c   .   v   b   o
0000220    3778    2e30    7270    0069    26de    0000    0000    0000
          x   7   0   .   p   r   i  \0 336   &  \0  \0  \0  \0  \0  \0
[root@vbox70 ~]#
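
The readable parts of that dump can also be pulled out without reading od output by eye. The following is only a sketch: ascii_runs is a hypothetical helper that behaves roughly like strings(1), and SAMPLE is the first 152 bytes of the dump transcribed by hand from the character rows, not data read from the device.

```python
import re
from typing import List


def ascii_runs(blob: bytes, min_len: int = 8) -> List[str]:
    """Return printable-ASCII runs of at least min_len bytes, in order."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


# First 152 bytes of the ids volume, transcribed from the od dump (assumption:
# the transcription is faithful; nothing here is read from a real device).
SAMPLE = (
    bytes.fromhex("10202112" "02000300" "00000000" "00020000")
    + bytes(8) + (1).to_bytes(8, "little")
    + (1).to_bytes(8, "little") + (7).to_bytes(8, "little")
    + bytes(8)
    + b"a689cb30-743e-4261-bfd1-b8b194dc85db".ljust(48, b"\0")
    + b"8684f929-a7bf-48de-b085-ce0c9e750fdc.vbox70.pri".ljust(48, b"\0")
)

for text in ascii_runs(SAMPLE):
    print(text)
```

The first string found is the id of the Master Data Domain, and the second one ends with the host name vbox70, which suggests the volume is readable and carries the expected identifiers.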

Obviously I can't read from LVs that aren't available:

[root@vbox70 ~]# dd if=/dev/a689cb30-743e-4261-bfd1-b8b194dc85db/4a1be3d8-ac7d-46cf-ae1c-ba154bc9a400 bs=1M of=/dev/null
dd: opening `/dev/a689cb30-743e-4261-bfd1-b8b194dc85db/4a1be3d8-ac7d-46cf-ae1c-ba154bc9a400': No such file or directory
[root@vbox70 ~]#

But those LVs are the VMs' disks, and I suppose their availability is
managed by oVirt.



> - Original Message -
>> From: "Giorgio Bersano" 
>> To: "Meital Bourvine" 
>> Cc: "users@ovirt.org" 
>> Sent: Tuesday, March 4, 2014 4:35:07 PM
>> Subject: Re: [Users] Data Center Non Responsive / Contending
>>
>> 2014-03-04 15:23 GMT+01:00 Meital Bourvine :
>> > Master data domain must be reachable in order for the DC to be up.
>> > Export domain shouldn't affect the dc status.
>> > Are you sure that you've created the export domain as an export domain, and
>> > not as a regular nfs?
>> >
>>
>> Yes, I am.
>>
>> Don't know how to extract this info from DB, but in webadmin, in the
>> storage list, I have these info:
>>
>> Domain Name: nfs02EXPORT
>> Domain Type: Export
>> Storage Type: NFS
>> Format: V1
>> Cross Data-Center Status: Inactive
>> Total Space: [N/A]
>> Free Space: [N/A]
>>
>> ATM my only "Data" Domain is based on iSCSI, no NFS.
>>
>>
>>
>>
>>
>> > - Original Message -
>> >> From: "Giorgio Bersano" 
>> >> To: "Meital Bourvine" 
>> >> Cc: "users@ovirt.org" 
>> >> Sent: Tuesday, March 4, 2014 4:16:19 PM
>> >> Subject: Re: [Users] Data Center Non Responsive / Contending
>> >>
>> >> 2014-03-04 14:48 GMT+01:00 Meital Bourvine :
>>

Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Liron Aravot
Hi Giorgio,
Apparently the issue is caused by the lack of connectivity to the export
domain, which then makes us fail on spmStart. That's obviously a bug that
shouldn't happen.
Can you open a bug for the issue?
In the meantime, as the problem seems to persist, the way to solve it would
be either to fix the connectivity issue between vdsm and the storage domain,
or to downgrade your vdsm version to one from before this issue was
introduced.

6a519e95-62ef-445b-9a98-f05c81592c85::WARNING::2014-03-04 13:05:31,489::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] ['  Volume group "1810e5eb-9eb6-4797-ac50-8023a939f312" not found', '  Skipping volume group 1810e5eb-9eb6-4797-ac50-8023a939f312']
6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04 13:05:31,499::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 1810e5eb-9eb6-4797-ac50-8023a939f312 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
6a519e95-62ef-445b-9a98-f05c81592c85::ERROR::2014-03-04 13:05:31,500::sp::329::Storage.StoragePool::(startSpm) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 296, in startSpm
    self._updateDomainsRole()
  File "/usr/share/vdsm/storage/securable.py", line 75, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 205, in _updateDomainsRole
    domain = sdCache.produce(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
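
For anyone wanting to spot this condition in their own vdsm.log, here is a rough sketch that collects the ids of domains reported as missing. It assumes the exception line has exactly the shape shown in the excerpt above; missing_domains is a hypothetical helper, not a vdsm function, and SAMPLE is copied from the log rather than read from a file.

```python
import re
from typing import List

# Matches the final exception line of the traceback, e.g.
# StorageDomainDoesNotExist: Storage domain does not exist: (u'<uuid>',)
MISSING_DOMAIN = re.compile(
    r"StorageDomainDoesNotExist: Storage domain does not exist:\s*"
    r"\(u?'([0-9a-f-]{36})',\)"
)


def missing_domains(log_text: str) -> List[str]:
    """Return the distinct storage domain ids reported as missing, in order."""
    seen: List[str] = []
    for sd_uuid in MISSING_DOMAIN.findall(log_text):
        if sd_uuid not in seen:
            seen.append(sd_uuid)
    return seen


SAMPLE = (
    "Traceback (most recent call last):\n"
    '  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain\n'
    "    raise se.StorageDomainDoesNotExist(sdUUID)\n"
    "StorageDomainDoesNotExist: Storage domain does not exist: "
    "(u'1810e5eb-9eb6-4797-ac50-8023a939f312',)\n"
)

print(missing_domains(SAMPLE))
```

The id it reports can then be compared with the storage domain list in the engine to see which domain (here the export domain) is the unreachable one.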




- Original Message -----
> From: "Giorgio Bersano" 
> To: "Meital Bourvine" 
> Cc: "users@ovirt.org" 
> Sent: Tuesday, March 4, 2014 4:35:07 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 15:23 GMT+01:00 Meital Bourvine :
> > Master data domain must be reachable in order for the DC to be up.
> > Export domain shouldn't affect the dc status.
> > Are you sure that you've created the export domain as an export domain, and
> > not as a regular nfs?
> >
> 
> Yes, I am.
> 
> Don't know how to extract this info from DB, but in webadmin, in the
> storage list, I have these info:
> 
> Domain Name: nfs02EXPORT
> Domain Type: Export
> Storage Type: NFS
> Format: V1
> Cross Data-Center Status: Inactive
> Total Space: [N/A]
> Free Space: [N/A]
> 
> ATM my only "Data" Domain is based on iSCSI, no NFS.
> 
> 
> 
> 
> 
> > - Original Message -
> >> From: "Giorgio Bersano" 
> >> To: "Meital Bourvine" 
> >> Cc: "users@ovirt.org" 
> >> Sent: Tuesday, March 4, 2014 4:16:19 PM
> >> Subject: Re: [Users] Data Center Non Responsive / Contending
> >>
> >> 2014-03-04 14:48 GMT+01:00 Meital Bourvine :
> >> > StorageDomainDoesNotExist: Storage domain does not exist:
> >> > (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
> >> >
> >> > What's the output of:
> >> > lvs
> >> > vdsClient -s 0 getStorageDomainsList
> >> >
> >> > If it exists in the list, please run:
> >> > vdsClient -s 0 getStorageDomainInfo 1810e5eb-9eb6-4797-ac50-8023a939f312
> >> >
> >>
> >> I'm attaching a compressed archive to avoid mangling by googlemail client.
> >>
> >> Indeed the NFS storage with that id is not in the list of available
> >> storage as it is brought up by a VM that has to be run in this very
> >> same cluster. Obviously it isn't running at the moment.
> >>
> >> You find this in the DB:
> >>
> >> COPY storage_domain_static (id, storage, storage_name,
> >> storage_domain_type, storage_type, storage_domain_format_type,
> >> _create_date, _update_date, recoverable, last_time_used_as_master,
> >> storage

Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Nicolas Ecarnot

Off-topic but...



On 04/03/2014 15:23, Meital Bourvine wrote:

> Master data domain must be reachable in order for the DC to be up.
> Export domain shouldn't affect the dc status.


Last month we went through a planned and controlled complete electrical
shutdown of our whole datacenter.
When switching everything back on, we saw that no matter how long we
waited or what actions we tried, our oVirt 3.3 wasn't able to start (hosts
were responsive but could not be activated) as long as our NFS server
(used only for the export domain) wasn't up and running.


I did not find the time to report it, as I thought I was the only one
seeing that, but today I see I'm not.


--
Nicolas Ecarnot
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Meital Bourvine
Ok, and is the iscsi functional at the moment?

- Original Message -
> From: "Giorgio Bersano" 
> To: "Meital Bourvine" 
> Cc: "users@ovirt.org" 
> Sent: Tuesday, March 4, 2014 4:35:07 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 15:23 GMT+01:00 Meital Bourvine :
> > Master data domain must be reachable in order for the DC to be up.
> > Export domain shouldn't affect the dc status.
> > Are you sure that you've created the export domain as an export domain, and
> > not as a regular nfs?
> >
> 
> Yes, I am.
> 
> Don't know how to extract this info from DB, but in webadmin, in the
> storage list, I have these info:
> 
> Domain Name: nfs02EXPORT
> Domain Type: Export
> Storage Type: NFS
> Format: V1
> Cross Data-Center Status: Inactive
> Total Space: [N/A]
> Free Space: [N/A]
> 
> ATM my only "Data" Domain is based on iSCSI, no NFS.
> 
> 
> 
> 
> 
> > - Original Message -----
> >> From: "Giorgio Bersano" 
> >> To: "Meital Bourvine" 
> >> Cc: "users@ovirt.org" 
> >> Sent: Tuesday, March 4, 2014 4:16:19 PM
> >> Subject: Re: [Users] Data Center Non Responsive / Contending
> >>
> >> 2014-03-04 14:48 GMT+01:00 Meital Bourvine :
> >> > StorageDomainDoesNotExist: Storage domain does not exist:
> >> > (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
> >> >
> >> > What's the output of:
> >> > lvs
> >> > vdsClient -s 0 getStorageDomainsList
> >> >
> >> > If it exists in the list, please run:
> >> > vdsClient -s 0 getStorageDomainInfo 1810e5eb-9eb6-4797-ac50-8023a939f312
> >> >
> >>
> >> I'm attaching a compressed archive to avoid mangling by googlemail client.
> >>
> >> Indeed the NFS storage with that id is not in the list of available
> >> storage as it is brought up by a VM that has to be run in this very
> >> same cluster. Obviously it isn't running at the moment.
> >>
> >> You find this in the DB:
> >>
> >> COPY storage_domain_static (id, storage, storage_name,
> >> storage_domain_type, storage_type, storage_domain_format_type,
> >> _create_date, _update_date, recoverable, last_time_used_as_master,
> >> storage_description, storage_comment) FROM stdin;
> >> ...
> >> 1810e5eb-9eb6-4797-ac50-8023a939f312
> >> 11d4972d-f227-49ed-b997-f33cf4b2aa26nfs02EXPORT 3   1
> >>  0   2014-02-28 18:11:23.17092+01\N  t   0   \N
> >>   \N
> >> ...
> >>
> >> Also, disks for that VM are carved from the Master Data Domain that is
> >> not available ATM.
> >>
> >> To say in other words: I thought that availability of an export domain
> >> wasn't critical to switch on a Data Center. Am I wrong?
> >>
> >> Thanks,
> >> Giorgio.
> >>
> 


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Giorgio Bersano
2014-03-04 15:23 GMT+01:00 Meital Bourvine :
> Master data domain must be reachable in order for the DC to be up.
> Export domain shouldn't affect the dc status.
> Are you sure that you've created the export domain as an export domain, and 
> not as a regular nfs?
>

Yes, I am.

I don't know how to extract this info from the DB, but in webadmin, in the
storage list, I see this information:

Domain Name: nfs02EXPORT
Domain Type: Export
Storage Type: NFS
Format: V1
Cross Data-Center Status: Inactive
Total Space: [N/A]
Free Space: [N/A]

ATM my only "Data" Domain is based on iSCSI, no NFS.





> - Original Message -
>> From: "Giorgio Bersano" 
>> To: "Meital Bourvine" 
>> Cc: "users@ovirt.org" 
>> Sent: Tuesday, March 4, 2014 4:16:19 PM
>> Subject: Re: [Users] Data Center Non Responsive / Contending
>>
>> 2014-03-04 14:48 GMT+01:00 Meital Bourvine :
>> > StorageDomainDoesNotExist: Storage domain does not exist:
>> > (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
>> >
>> > What's the output of:
>> > lvs
>> > vdsClient -s 0 getStorageDomainsList
>> >
>> > If it exists in the list, please run:
>> > vdsClient -s 0 getStorageDomainInfo 1810e5eb-9eb6-4797-ac50-8023a939f312
>> >
>>
>> I'm attaching a compressed archive to avoid mangling by googlemail client.
>>
>> Indeed the NFS storage with that id is not in the list of available
>> storage as it is brought up by a VM that has to be run in this very
>> same cluster. Obviously it isn't running at the moment.
>>
>> You find this in the DB:
>>
>> COPY storage_domain_static (id, storage, storage_name,
>> storage_domain_type, storage_type, storage_domain_format_type,
>> _create_date, _update_date, recoverable, last_time_used_as_master,
>> storage_description, storage_comment) FROM stdin;
>> ...
>> 1810e5eb-9eb6-4797-ac50-8023a939f312
>> 11d4972d-f227-49ed-b997-f33cf4b2aa26nfs02EXPORT 3   1
>>  0   2014-02-28 18:11:23.17092+01\N  t   0   \N
>>   \N
>> ...
>>
>> Also, disks for that VM are carved from the Master Data Domain that is
>> not available ATM.
>>
>> To say in other words: I thought that availability of an export domain
>> wasn't critical to switch on a Data Center. Am I wrong?
>>
>> Thanks,
>> Giorgio.
>>


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Meital Bourvine
Master data domain must be reachable in order for the DC to be up.
Export domain shouldn't affect the dc status.
Are you sure that you've created the export domain as an export domain, and not 
as a regular nfs?

- Original Message -
> From: "Giorgio Bersano" 
> To: "Meital Bourvine" 
> Cc: "users@ovirt.org" 
> Sent: Tuesday, March 4, 2014 4:16:19 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 14:48 GMT+01:00 Meital Bourvine :
> > StorageDomainDoesNotExist: Storage domain does not exist:
> > (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
> >
> > What's the output of:
> > lvs
> > vdsClient -s 0 getStorageDomainsList
> >
> > If it exists in the list, please run:
> > vdsClient -s 0 getStorageDomainInfo 1810e5eb-9eb6-4797-ac50-8023a939f312
> >
> 
> I'm attaching a compressed archive to avoid mangling by googlemail client.
> 
> Indeed the NFS storage with that id is not in the list of available
> storage as it is brought up by a VM that has to be run in this very
> same cluster. Obviously it isn't running at the moment.
> 
> You find this in the DB:
> 
> COPY storage_domain_static (id, storage, storage_name,
> storage_domain_type, storage_type, storage_domain_format_type,
> _create_date, _update_date, recoverable, last_time_used_as_master,
> storage_description, storage_comment) FROM stdin;
> ...
> 1810e5eb-9eb6-4797-ac50-8023a939f312
> 11d4972d-f227-49ed-b997-f33cf4b2aa26nfs02EXPORT 3   1
>  0   2014-02-28 18:11:23.17092+01\N  t   0   \N
>   \N
> ...
> 
> Also, disks for that VM are carved from the Master Data Domain that is
> not available ATM.
> 
> To say in other words: I thought that availability of an export domain
> wasn't critical to switch on a Data Center. Am I wrong?
> 
> Thanks,
> Giorgio.
> 


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Giorgio Bersano
2014-03-04 14:48 GMT+01:00 Meital Bourvine :
> StorageDomainDoesNotExist: Storage domain does not exist: 
> (u'1810e5eb-9eb6-4797-ac50-8023a939f312',)
>
> What's the output of:
> lvs
> vdsClient -s 0 getStorageDomainsList
>
> If it exists in the list, please run:
> vdsClient -s 0 getStorageDomainInfo 1810e5eb-9eb6-4797-ac50-8023a939f312
>

I'm attaching a compressed archive to avoid mangling by googlemail client.

Indeed the NFS storage with that id is not in the list of available
storage as it is brought up by a VM that has to be run in this very
same cluster. Obviously it isn't running at the moment.

This is what the DB contains:

COPY storage_domain_static (id, storage, storage_name,
storage_domain_type, storage_type, storage_domain_format_type,
_create_date, _update_date, recoverable, last_time_used_as_master,
storage_description, storage_comment) FROM stdin;
...
1810e5eb-9eb6-4797-ac50-8023a939f312  11d4972d-f227-49ed-b997-f33cf4b2aa26  nfs02EXPORT  3  1  0  2014-02-28 18:11:23.17092+01  \N  t  0  \N  \N
...
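
To make that row easier to read, here is a small sketch that pairs each value with its column name. It assumes the dump uses the default text format of COPY (tab-separated fields, \N for NULL) and does not handle backslash escapes inside values. parse_copy_row is a hypothetical helper, and ROW is the line above with its tabs restored by hand.

```python
from typing import Dict, Optional

# Column order taken from the COPY statement above.
COLUMNS = [
    "id", "storage", "storage_name", "storage_domain_type", "storage_type",
    "storage_domain_format_type", "_create_date", "_update_date",
    "recoverable", "last_time_used_as_master", "storage_description",
    "storage_comment",
]


def parse_copy_row(line: str) -> Dict[str, Optional[str]]:
    """Map one tab-separated COPY text row to column names; \\N becomes None."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError(
            "expected %d fields, got %d" % (len(COLUMNS), len(values))
        )
    return {
        column: (None if value == r"\N" else value)
        for column, value in zip(COLUMNS, values)
    }


ROW = "\t".join([
    "1810e5eb-9eb6-4797-ac50-8023a939f312",
    "11d4972d-f227-49ed-b997-f33cf4b2aa26",
    "nfs02EXPORT", "3", "1", "0",
    "2014-02-28 18:11:23.17092+01",
    r"\N", "t", "0", r"\N", r"\N",
])

row = parse_copy_row(ROW)
print(row["storage_name"], row["storage_domain_type"], row["storage_type"])
```

The values stay strings here on purpose; what the numeric type codes mean is left to the engine's schema.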

Also, disks for that VM are carved from the Master Data Domain that is
not available ATM.

In other words: I thought that the availability of an export domain
wasn't critical for bringing up a Data Center. Am I wrong?

Thanks,
Giorgio.


lvs+vdsclient.txt.gz
Description: GNU Zip compressed data


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Meital Bourvine
StorageDomainDoesNotExist: Storage domain does not exist: 
(u'1810e5eb-9eb6-4797-ac50-8023a939f312',)

What's the output of:
lvs
vdsClient -s 0 getStorageDomainsList

If it exists in the list, please run:
vdsClient -s 0 getStorageDomainInfo 1810e5eb-9eb6-4797-ac50-8023a939f312



- Original Message -
> From: "Giorgio Bersano" 
> To: "Meital Bourvine" 
> Cc: "users@ovirt.org" 
> Sent: Tuesday, March 4, 2014 3:18:24 PM
> Subject: Re: [Users] Data Center Non Responsive / Contending
> 
> 2014-03-04 12:18 GMT+01:00 Meital Bourvine :
> > Hi Giorgio,
> >
> > Can you please attach vdsm.log from both hosts, and engine.log?
> >
> > Try maybe moving both hosts to maintenance, "confirm host had been
> > rebooted", and activate again. See if it helps.
> >
> 
> Hi Meital,
> tried but no positive outcome.
> 
> I'm attaching logs regarding these last operations as I'm beginning to
> think the problem is in the unreachable Export Domain. Also there is
> an emended copy of the DB taken this last night.
> 
> If not enough I'll go to look for yesterday's relevant logs.
> 
> Thank you for taking care,
> Giorgio.
> 


Re: [Users] Data Center Non Responsive / Contending

2014-03-04 Thread Meital Bourvine
Hi Giorgio,

Can you please attach vdsm.log from both hosts, and engine.log?

Maybe try moving both hosts to maintenance, "confirm host had been rebooted",
and activate again. See if it helps.

- Original Message -
> From: "Giorgio Bersano" 
> To: "users@ovirt.org" 
> Sent: Tuesday, March 4, 2014 12:34:49 PM
> Subject: [Users] Data Center Non Responsive / Contending
> 
> Hi everyone,
> I'm asking for help again as testing my setup I put myself in a
> situation in which I can't get out.
> 
> Layout: two hosts, an iSCSI storage, the engine installed as a
> "regular" KVM guest on another host (external to the oVirt setup). All
> CentOS 6.5, oVirt 3.4.0beta3.
> One DC ("Default"), one Cluster ("Default"), default storage type iSCSI.
> Moreover, ISO domain is another "external" KVM guest exposing an NFS
> share; Export Domain is in fact a VM in this Cluster.
> 
> This is a preproduction setup.
> Yesterday all was running fine until I needed to do some hardware
> maintenance.
> So I decided to put the two hosts in maintenance from the webadmin
> then shutdown them in the usual way. Engine was left operational.
> Later i booted again one host, waited some time then tried to activate
> it (from the webadmin) without success. Even switching on the second
> host didn't change anything.
> Now the Data Center status toggles between "Contending" and "Non Responsive".
> 
> None of the hosts is SPM and if I choose "Select as SPM" in the
> webadmin the result is this popup:
> --
>  Operation Canceled
> --
> Error while executing action: Cannot force select SPM: Storage Domain
> cannot be accessed.
> -Please check that at least one Host is operational and Data Center state is
> up.
> --
> 
> The two hosts are operational but the DC isn't up (downpointing red
> arrow - "Non Responsive", hourglass - "Contending").
> 
> Storage domains are in "Unknown" state and if I try to "Activate" the
> Master Data Domain its status becomes "Locked" and then it fails with
> these events:
> . Invalid status on Data Center Default. Setting status to Non Responsive
> . Failed to activate Storage Domain dt02clu6070 (Data Center Default) by
> admin
> 
> Connectivity seems OK. iSCSI connectivity seems OK.
> 
> I'm almost certain that if I had left one host active I would have had
> zero problems.
> But I also think shutting down the full system should not be a problem.
> 
> In the end I restarted the ovirt-engine service without results (as
> expected).
> 
> Any thought? Logs needed?
> 
> TIA,
> Giorgio.
> 


[Users] Data Center Non Responsive / Contending

2014-03-04 Thread Giorgio Bersano
Hi everyone,
I'm asking for help again because, while testing my setup, I got myself
into a situation I can't get out of.

Layout: two hosts, an iSCSI storage, the engine installed as a
"regular" KVM guest on another host (external to the oVirt setup). All
CentOS 6.5, oVirt 3.4.0beta3.
One DC ("Default"), one Cluster ("Default"), default storage type iSCSI.
Moreover, the ISO domain is an NFS share exposed by another "external"
KVM guest; the Export Domain is in fact served by a VM in this Cluster.

This is a preproduction setup.
Yesterday all was running fine until I needed to do some hardware maintenance.
So I decided to put the two hosts into maintenance from the webadmin
and then shut them down in the usual way. The engine was left operational.
Later I booted one host again, waited some time, then tried to activate
it (from the webadmin) without success. Even switching on the second
host didn't change anything.
Now the Data Center status toggles between "Contending" and "Non Responsive".

None of the hosts is SPM and if I choose "Select as SPM" in the
webadmin the result is this popup:
--
 Operation Canceled
--
Error while executing action: Cannot force select SPM: Storage Domain
cannot be accessed.
-Please check that at least one Host is operational and Data Center state is up.
--

The two hosts are operational but the DC isn't up (down-pointing red
arrow for "Non Responsive", hourglass for "Contending").

Storage domains are in "Unknown" state and if I try to "Activate" the
Master Data Domain its status becomes "Locked" and then it fails with
these events:
. Invalid status on Data Center Default. Setting status to Non Responsive
. Failed to activate Storage Domain dt02clu6070 (Data Center Default) by admin

Connectivity seems OK. iSCSI connectivity seems OK.

I'm almost certain that if I had left one host active I would have had
zero problems.
But I also think shutting down the full system should not be a problem.

In the end I restarted the ovirt-engine service without results (as expected).

Any thoughts? Logs needed?

TIA,
Giorgio.