[ovirt-devel] Re: execution failed: javax.net.ssl.SSLPeerUnverifiedException (was: vdsm.storage.exception.UnknownTask: Task id unknown (was: [oVirt Jenkins] ovirt-system-tests_he-basic-suite-master -

2020-07-21 Thread Martin Perina
On Tue, Jul 21, 2020 at 3:44 PM Yedidyah Bar David  wrote:

> We had several different threads about this failure. I'll use the current
> one to send a summary:
>
> - Our wildfly build was patched to include newer httpclient/core:
> https://gerrit.ovirt.org/110324
>
> - The appliance was patched to use master snapshot:
> https://gerrit.ovirt.org/110397
>
> - Now he-basic-suite succeeded, first time after many months:
>
> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1680/


So that's great news, and from now on we should make sure that it stays
green (meaning everybody needs to pay it the same attention as the basic
suite).


>
> Thanks to everyone involved!
>
> Best regards,
>
> On Tue, Jul 7, 2020 at 3:41 PM Martin Perina  wrote:
> >
> > Hi,
> >
> > I'm not aware of change regarding certificates recently. So is this
> error reproducible outside Jenkins? Or even better is it reproducible on
> some easier flow other than HE installation so we can debug what
> certificate is loaded in VDSM?
> >
> > Thanks,
> > Martin
> >
> > On Tue, Jul 7, 2020 at 2:07 PM Yedidyah Bar David 
> wrote:
> >>
> >> On Tue, Jul 7, 2020 at 12:50 PM Yedidyah Bar David 
> wrote:
> >> >
> >> > On Wed, Jun 24, 2020 at 2:14 PM Evgeny Slutsky 
> wrote:
> >> > >
> >> > > Hi,
> >> > > changing the hostname to include also the domain name fixed the
> cert deployment issue:
> >> > > https://gerrit.ovirt.org/#/c/109842/
> >> > >
> >> > > not sure how it affects the engine certificate content.
> >> > > from my offline discussion with @Martin Perina  this was that
> change that could cause it:
> >> > > https://gerrit.ovirt.org/#/c/109636/
> >> > >
> >> > > any thoughts?
> >> >
> >> > Above two patches are merged, but we still fail the same way:
> >> >
> >> >
> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/
> >> >
> >> >
> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/ovirt-hosted-engine-setup/engine-logs-2020-07-07T03%3A15%3A01Z/ovirt-engine/engine.log
> >> >
> >> > 2020-07-06 23:04:25,555-04 ERROR
> >> > [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
> >> >
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-38)
> >> > [fb28ce9] Command 'UploadStreamVDSCommand(HostName =
> >> > lago-he-basic-suite-master-host-0.lago.local,
> >> >
> UploadStreamVDSCommandParameters:{hostId='e096650f-a7d6-4383-b1bb-f2e61327aac0'})'
> >> > execution failed: javax.net.ssl.SSLPeerUnverifiedException:
> >> > Certificate for  doesn't
> >> > match any of the subject alternative names:
> >> > [lago-he-basic-suite-master-host-0.lago.local]
> >> >
> >> > Any idea?
> >>
> >> And I now see this is indeed what's failing hosted-engine deploy at:
> >>
> >> 2020-07-07 05:51:58,573-0400 INFO ansible task start {'status': 'OK',
> >> 'ansible_type': 'task', 'ansible_playbook':
> >> '/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml',
> >> 'ansible_task': 'ovirt.hosted_engine_setup : Check OVF_STORE volume
> >> status'}
> >>
> >> (See other thread: [oVirt Jenkins]
> >> ovirt-system-tests_he-basic-suite-master - Build # 1655 - Still
> >> Failing! )
> >>
> >> On a successful run, engine.log has:
> >>
> >> 2020-07-02 18:01:55,527+03 INFO
> >> [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
> >> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> >> [2b0721d8] Running command: ProcessOvfUpdateForStorageDomainCommand
> >> internal: true. Entities affected :  ID:
> >> e102d7b5-1a37-490f-a3e7-20e56c37791f Type: StorageAction group
> >> MANIPULATE_STORAGE_DOMAIN with role type ADMIN
> >> 2020-07-02 18:01:55,607+03 INFO
> >> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> >> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> >> [2b0721d8] START, SetVolumeDescriptionVDSCommand(
> >> SetVolumeDescriptionVDSCommandParameters:{storagePoolId='b9dccefe-bc61-11ea-8ebe-001a4a231728',
> >> ignoreFailoverLimit='false',
> >> storageDomainId='e102d7b5-1a37-490f-a3e7-20e56c37791f',
> >> imageGroupId='db934a98-4111-4faf-8cb9-6b36928cd61c',
> >> imageId='f898c40e-1f88-48db-b59b-f2c73162ddb7'}), log id: e203e51
> >> 2020-07-02 18:01:55,609+03 INFO
> >> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> >> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> >> [2b0721d8] -- executeIrsBrokerCommand: calling 'setVolumeDescription', parameters:
> >> 2020-07-02 18:01:55,609+03 INFO
> >> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> >> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> >> [2b0721d8] ++ spUUID=b9dccefe-bc61-11ea-8ebe-001a4a231728
> >> 2020-07-02 18:01:55,609+03 INFO
> >>
> 

[ovirt-devel] Re: execution failed: javax.net.ssl.SSLPeerUnverifiedException (was: vdsm.storage.exception.UnknownTask: Task id unknown (was: [oVirt Jenkins] ovirt-system-tests_he-basic-suite-master -

2020-07-21 Thread Yedidyah Bar David
We had several different threads about this failure. I'll use the current
one to send a summary:

- Our wildfly build was patched to include newer httpclient/core:
https://gerrit.ovirt.org/110324

- The appliance was patched to use master snapshot:
https://gerrit.ovirt.org/110397

- Now he-basic-suite succeeded, first time after many months:
https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1680/

Thanks to everyone involved!

Best regards,

On Tue, Jul 7, 2020 at 3:41 PM Martin Perina  wrote:
>
> Hi,
>
> I'm not aware of change regarding certificates recently. So is this error 
> reproducible outside Jenkins? Or even better is it reproducible on some 
> easier flow other than HE installation so we can debug what certificate is 
> loaded in VDSM?
>
> Thanks,
> Martin
>
> On Tue, Jul 7, 2020 at 2:07 PM Yedidyah Bar David  wrote:
>>
>> On Tue, Jul 7, 2020 at 12:50 PM Yedidyah Bar David  wrote:
>> >
>> > On Wed, Jun 24, 2020 at 2:14 PM Evgeny Slutsky  wrote:
>> > >
>> > > Hi,
>> > > changing the hostname to also include the domain name fixed the cert
>> > > deployment issue:
>> > > https://gerrit.ovirt.org/#/c/109842/
>> > >
>> > > not sure how it affects the engine certificate content.
>> > > from my offline discussion with @Martin Perina, this was the change
>> > > that could have caused it:
>> > > https://gerrit.ovirt.org/#/c/109636/
>> > >
>> > > any thoughts?
>> >
>> > Above two patches are merged, but we still fail the same way:
>> >
>> > https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/
>> >
>> > https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/ovirt-hosted-engine-setup/engine-logs-2020-07-07T03%3A15%3A01Z/ovirt-engine/engine.log
>> >
>> > 2020-07-06 23:04:25,555-04 ERROR
>> > [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
>> > (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-38)
>> > [fb28ce9] Command 'UploadStreamVDSCommand(HostName =
>> > lago-he-basic-suite-master-host-0.lago.local,
>> > UploadStreamVDSCommandParameters:{hostId='e096650f-a7d6-4383-b1bb-f2e61327aac0'})'
>> > execution failed: javax.net.ssl.SSLPeerUnverifiedException:
>> > Certificate for  doesn't
>> > match any of the subject alternative names:
>> > [lago-he-basic-suite-master-host-0.lago.local]
>> >
>> > Any idea?
>>
>> And I now see this is indeed what's failing hosted-engine deploy at:
>>
>> 2020-07-07 05:51:58,573-0400 INFO ansible task start {'status': 'OK',
>> 'ansible_type': 'task', 'ansible_playbook':
>> '/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml',
>> 'ansible_task': 'ovirt.hosted_engine_setup : Check OVF_STORE volume
>> status'}
>>
>> (See other thread: [oVirt Jenkins]
>> ovirt-system-tests_he-basic-suite-master - Build # 1655 - Still
>> Failing! )
>>
>> On a successful run, engine.log has:
>>
>> 2020-07-02 18:01:55,527+03 INFO
>> [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
>> [2b0721d8] Running command: ProcessOvfUpdateForStorageDomainCommand
>> internal: true. Entities affected :  ID:
>> e102d7b5-1a37-490f-a3e7-20e56c37791f Type: StorageAction group
>> MANIPULATE_STORAGE_DOMAIN with role type ADMIN
>> 2020-07-02 18:01:55,607+03 INFO
>> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
>> [2b0721d8] START, SetVolumeDescriptionVDSCommand(
>> SetVolumeDescriptionVDSCommandParameters:{storagePoolId='b9dccefe-bc61-11ea-8ebe-001a4a231728',
>> ignoreFailoverLimit='false',
>> storageDomainId='e102d7b5-1a37-490f-a3e7-20e56c37791f',
>> imageGroupId='db934a98-4111-4faf-8cb9-6b36928cd61c',
>> imageId='f898c40e-1f88-48db-b59b-f2c73162ddb7'}), log id: e203e51
>> 2020-07-02 18:01:55,609+03 INFO
>> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
>> [2b0721d8] -- executeIrsBrokerCommand: calling 'setVolumeDescription', parameters:
>> 2020-07-02 18:01:55,609+03 INFO
>> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
>> [2b0721d8] ++ spUUID=b9dccefe-bc61-11ea-8ebe-001a4a231728
>> 2020-07-02 18:01:55,609+03 INFO
>> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
>> [2b0721d8] ++ sdUUID=e102d7b5-1a37-490f-a3e7-20e56c37791f
>> 2020-07-02 18:01:55,609+03 INFO
>> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
>> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
>> [2b0721d8] ++

[ovirt-devel] Re: execution failed: javax.net.ssl.SSLPeerUnverifiedException

2020-07-20 Thread Marcin Sobczyk

Hi,

the problem was most probably caused by a regression in httpclient:

https://issues.apache.org/jira/browse/HTTPCLIENT-2047

and should be fixed by now with:

https://gerrit.ovirt.org/#/c/110324/
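
For readers following the log excerpt: the failing check is the client-side comparison of the requested hostname against the certificate's subject alternative names (SANs). The sketch below is only an illustration of that comparison in Python, not httpclient's actual verifier; the function name `matches_san` is made up for this example. It shows that under a plain case-insensitive comparison the host name from the log matches its single SAN, which is why the rejection pointed at the client library rather than at the certificate.

```python
from typing import Iterable


def matches_san(hostname: str, sans: Iterable[str]) -> bool:
    """Illustrative SAN check: exact match, or a '*.' wildcard that
    covers exactly one leftmost label. Hypothetical helper, not the
    algorithm used by Apache HttpClient."""
    host = hostname.rstrip(".").lower()
    for san in sans:
        pattern = san.rstrip(".").lower()
        if pattern == host:
            return True
        if pattern.startswith("*."):
            # Drop the leftmost label of the host and compare the rest.
            _, _, host_rest = host.partition(".")
            if host_rest and host_rest == pattern[2:]:
                return True
    return False


# Host name and SAN list as they appear in the engine.log error.
host = "lago-he-basic-suite-master-host-0.lago.local"
print(matches_san(host, [host]))
```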

Regards, Marcin

On 7/15/20 10:05 AM, Martin Perina wrote:

Artur,

could you please add some additional logging into the engine HTTP 
client to find out why apache-http-client complains about the certificate?


Thanks,
Martin


On Wed, Jul 15, 2020 at 9:43 AM Yedidyah Bar David wrote:

On Thu, Jul 9, 2020 at 12:32 PM Marcin Sobczyk wrote:
>
> Hi,
>
> On 7/8/20 3:34 PM, Yedidyah Bar David wrote:
> > Did you also get in engine.log "javax.net.ssl.SSLPeerUnverifiedException"?
> I was also able to reproduce this on my server, but I'm baffled by this
> one...
> I enabled debug logs on the engine with [1] and got this stack trace [2],
> but the certs seem ok to me:
>
> 1. I verified the hostname in the certificate by running on the host:
>
> openssl s_client \
>      -connect 127.0.0.1:54321  \
>      -CAfile /etc/pki/vdsm/certs/cacert.pem \
>      -cert /etc/pki/vdsm/certs/vdsmcert.pem \
>      -key /etc/pki/vdsm/keys/vdsmkey.pem \
>      -verify_hostname lago-he-basic-suite-master-host-0.lago.local
>
> 2. curl is also happy:
>
> curl \
>      --cacert /etc/pki/vdsm/certs/cacert.pem \
>      --cert /etc/pki/vdsm/certs/vdsmcert.pem \
>      --key /etc/pki/vdsm/keys/vdsmkey.pem \
> https://lago-he-basic-suite-master-host-0.lago.local:54321
>
> 3. on the hosted engine there is proper entry in '/etc/hosts':
>
> [root@lago-he-basic-suite-master-engine certs]# cat /etc/hosts
> 127.0.0.1   localhost localhost.localdomain localhost4
> localhost4.localdomain4
> ::1         localhost localhost.localdomain localhost6
> localhost6.localdomain6
> 192.168.200.3 lago-he-basic-suite-master-host-0.lago.local
> 192.168.222.76 lago-he-basic-suite-master-engine.lago.local #
> hosted-engine-setup-/var/tmp/localvm9k3eqtf7
>
> 4. and dig -x seems to resolve properly:
>
> [root@lago-he-basic-suite-master-engine certs]# dig +short -x 192.168.200.3
> lago-he-basic-suite-master-host-0.lago.local.
>
> If anyone else has some ideas what else could be checked then please
> ping me.
>
> Marcin
>
> [1] https://gerrit.ovirt.org/110211
> [2] http://pastebin.test.redhat.com/882851
>
> >
> > On Wed, Jul 8, 2020 at 4:25 PM Artem Hrechanychenko wrote:
> >> Reproduced locally without using Jenkins
> >>
> >>> [ INFO  ] TASK [ovirt.hosted_engine_setup : Add HE disks]
> >>> [ ERROR ] {'msg': 'Timeout exceed while waiting on result
state of the entity.', 'exception': 'Traceback (most recent call
last):\n  File

"/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/modules/ovirt_disk_28.py",
line 678, in main\n  File

"/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
line 646, in create\n
poll_interval=self._module.params[\'poll_interval\'],\n File

"/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
line 364, in wait\n    raise Exception("Timeout exceed while
waiting on result state of the entity.")\nException: Timeout
exceed while waiting on result state of the entity.\n', 'failed':
True, 'invocation': {'module_args': {'name':
'HostedEngineConfigurationImage', 'size': '1GiB', 'format': 'raw',
'sparse': False, 'description': 'Hosted-Engine configuration
disk', 'content_type': 'hosted_engine_configuration', 'interface':
'virtio', 'storage_domain': 'hosted_storage', 'wait': True,
'timeout': 600, 'auth': {'token':

'rAqX1OJIbJyMrA1aWVR-AR54T2lsiBbalN80dWugpfHFBqwiCe4rz3porngvlFSE90k-FEqagPPFboU6ew1hPw',
'url':
'https://lago-he-basic-suite-master-engine.lago.local/ovirt-engine/api',
'ca_file': None, 'insecure': True, 'timeout': 0, 'compress': True,
'kerberos': False, 'headers': None}, 'poll_interval': 3,
'fetch_nested': False, 'nested_attributes': [], 'state':
'present', 'force': False, 'id': None, 'vm_name': None, 'vm_id':
None, 'storage_domains': None, 'profile': None, 'quota_id': None,
'bootable': None, 'shareable': None, 'logical_unit': None,
'download_image_path': None, 'upload_image_path': None,
'sparsify': None, 'openstack_volume_type': None, 'image_provider':
None, 'host': None, 'wipe_after_delete': None, 'activate': None}},
'_ansible_no_log': False, 'changed': False, 'item': {'name':
'HostedEngineConfigurationImage', 'description': 'Hosted-Engine
configuration disk', 'size': '1GiB', 'format': 'raw', 'sparse':
   

[ovirt-devel] Re: execution failed: javax.net.ssl.SSLPeerUnverifiedException (was: vdsm.storage.exception.UnknownTask: Task id unknown

2020-07-15 Thread Martin Perina
Artur,

could you please add some additional logging into the engine HTTP client to
find out why apache-http-client complains about the certificate?

Thanks,
Martin


On Wed, Jul 15, 2020 at 9:43 AM Yedidyah Bar David  wrote:

> On Thu, Jul 9, 2020 at 12:32 PM Marcin Sobczyk 
> wrote:
> >
> > Hi,
> >
> > On 7/8/20 3:34 PM, Yedidyah Bar David wrote:
> > > Did you also get in engine.log "javax.net.ssl.SSLPeerUnverifiedException"?
> > I was also able to reproduce this on my server, but I'm baffled by this
> > one...
> > I enabled debug logs on the engine with [1] and got this stack trace [2],
> > but the certs seem ok to me:
> >
> > 1. I verified the hostname in the certificate by running on the host:
> >
> > openssl s_client \
> >  -connect 127.0.0.1:54321 \
> >  -CAfile /etc/pki/vdsm/certs/cacert.pem \
> >  -cert /etc/pki/vdsm/certs/vdsmcert.pem \
> >  -key /etc/pki/vdsm/keys/vdsmkey.pem \
> >  -verify_hostname lago-he-basic-suite-master-host-0.lago.local
> >
> > 2. curl is also happy:
> >
> > curl \
> >  --cacert /etc/pki/vdsm/certs/cacert.pem \
> >  --cert /etc/pki/vdsm/certs/vdsmcert.pem \
> >  --key /etc/pki/vdsm/keys/vdsmkey.pem \
> >  https://lago-he-basic-suite-master-host-0.lago.local:54321
> >
> > 3. on the hosted engine there is proper entry in '/etc/hosts':
> >
> > [root@lago-he-basic-suite-master-engine certs]# cat /etc/hosts
> > 127.0.0.1   localhost localhost.localdomain localhost4
> > localhost4.localdomain4
> > ::1 localhost localhost.localdomain localhost6
> > localhost6.localdomain6
> > 192.168.200.3 lago-he-basic-suite-master-host-0.lago.local
> > 192.168.222.76 lago-he-basic-suite-master-engine.lago.local #
> > hosted-engine-setup-/var/tmp/localvm9k3eqtf7
> >
> > 4. and dig -x seems to resolve properly:
> >
> > [root@lago-he-basic-suite-master-engine certs]# dig +short -x 192.168.200.3
> > lago-he-basic-suite-master-host-0.lago.local.
> >
> > If anyone else has some ideas what else could be checked then please
> > ping me.
> >
> > Marcin
> >
> > [1] https://gerrit.ovirt.org/110211
> > [2] http://pastebin.test.redhat.com/882851
> >
> > >
> > > On Wed, Jul 8, 2020 at 4:25 PM Artem Hrechanychenko <
> ahrec...@redhat.com> wrote:
> > >> Reproduced locally without using Jenkins
> > >>
> > >>> [ INFO  ] TASK [ovirt.hosted_engine_setup : Add HE disks]
> > >>> [ ERROR ] {'msg': 'Timeout exceed while waiting on result state of
> the entity.', 'exception': 'Traceback (most recent call last):\n  File
> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/modules/ovirt_disk_28.py",
> line 678, in main\n  File
> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
> line 646, in create\n
> poll_interval=self._module.params[\'poll_interval\'],\n  File
> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
> line 364, in wait\nraise Exception("Timeout exceed while waiting on
> result state of the entity.")\nException: Timeout exceed while waiting on
> result state of the entity.\n', 'failed': True, 'invocation':
> {'module_args': {'name': 'HostedEngineConfigurationImage', 'size': '1GiB',
> 'format': 'raw', 'sparse': False, 'description': 'Hosted-Engine
> configuration disk', 'content_type': 'hosted_engine_configuration',
> 'interface': 'virtio', 'storage_domain': 'hosted_storage', 'wait': True,
> 'timeout': 600, 'auth': {'token':
> 'rAqX1OJIbJyMrA1aWVR-AR54T2lsiBbalN80dWugpfHFBqwiCe4rz3porngvlFSE90k-FEqagPPFboU6ew1hPw',
> 'url': '
> https://lago-he-basic-suite-master-engine.lago.local/ovirt-engine/api',
> 'ca_file': None, 'insecure': True, 'timeout': 0, 'compress': True,
> 'kerberos': False, 'headers': None}, 'poll_interval': 3, 'fetch_nested':
> False, 'nested_attributes': [], 'state': 'present', 'force': False, 'id':
> None, 'vm_name': None, 'vm_id': None, 'storage_domains': None, 'profile':
> None, 'quota_id': None, 'bootable': None, 'shareable': None,
> 'logical_unit': None, 'download_image_path': None, 'upload_image_path':
> None, 'sparsify': None, 'openstack_volume_type': None, 'image_provider':
> None, 'host': None, 'wipe_after_delete': None, 'activate': None}},
> '_ansible_no_log': False, 'changed': False, 'item': {'name':
> 'HostedEngineConfigurationImage', 'description': 'Hosted-Engine
> configuration disk', 'size': '1GiB', 'format': 'raw', 'sparse': False,
> 'content': 'hosted_engine_configuration'}, 'ansible_loop_var': 'item',
> '_ansible_item_label': {'name': 'HostedEngineConfigurationImage',
> 'description': 'Hosted-Engine configuration disk', 'size': '1GiB',
> 'format': 'raw', 'sparse': False, 'content': 'hosted_engine_configuration'}}
> > >>
> > >> On Tue, Jul 7, 2020 at 4:22 PM Martin Perina 
> wrote:
> > >>> Hi,
> > >>>
> > >>> I'm not aware of change regarding certificates recently. So is this
> error reproducible outside Jenkins? Or even better is it reproducible on
> some easier flow 

[ovirt-devel] Re: execution failed: javax.net.ssl.SSLPeerUnverifiedException (was: vdsm.storage.exception.UnknownTask: Task id unknown

2020-07-15 Thread Yedidyah Bar David
On Thu, Jul 9, 2020 at 12:32 PM Marcin Sobczyk  wrote:
>
> Hi,
>
> On 7/8/20 3:34 PM, Yedidyah Bar David wrote:
> > Did you also get in engine.log "javax.net.ssl.SSLPeerUnverifiedException"?
> I was also able to reproduce this on my server, but I'm baffled by this
> one...
> I enabled debug logs on the engine with [1] and got this stack trace [2],
> but the certs seem ok to me:
>
> 1. I verified the hostname in the certificate by running on the host:
>
> openssl s_client \
>  -connect 127.0.0.1:54321 \
>  -CAfile /etc/pki/vdsm/certs/cacert.pem \
>  -cert /etc/pki/vdsm/certs/vdsmcert.pem \
>  -key /etc/pki/vdsm/keys/vdsmkey.pem \
>  -verify_hostname lago-he-basic-suite-master-host-0.lago.local
>
> 2. curl is also happy:
>
> curl \
>  --cacert /etc/pki/vdsm/certs/cacert.pem \
>  --cert /etc/pki/vdsm/certs/vdsmcert.pem \
>  --key /etc/pki/vdsm/keys/vdsmkey.pem \
>  https://lago-he-basic-suite-master-host-0.lago.local:54321
>
> 3. on the hosted engine there is proper entry in '/etc/hosts':
>
> [root@lago-he-basic-suite-master-engine certs]# cat /etc/hosts
> 127.0.0.1   localhost localhost.localdomain localhost4
> localhost4.localdomain4
> ::1 localhost localhost.localdomain localhost6
> localhost6.localdomain6
> 192.168.200.3 lago-he-basic-suite-master-host-0.lago.local
> 192.168.222.76 lago-he-basic-suite-master-engine.lago.local #
> hosted-engine-setup-/var/tmp/localvm9k3eqtf7
>
> 4. and dig -x seems to resolve properly:
>
> [root@lago-he-basic-suite-master-engine certs]# dig +short -x 192.168.200.3
> lago-he-basic-suite-master-host-0.lago.local.
>
> If anyone else has some ideas what else could be checked then please
> ping me.
>
> Marcin
>
> [1] https://gerrit.ovirt.org/110211
> [2] http://pastebin.test.redhat.com/882851
>
> >
> > On Wed, Jul 8, 2020 at 4:25 PM Artem Hrechanychenko  
> > wrote:
> >> Reproduced locally without using Jenkins
> >>
> >>> [ INFO  ] TASK [ovirt.hosted_engine_setup : Add HE disks]
> >>> [ ERROR ] {'msg': 'Timeout exceed while waiting on result state of the 
> >>> entity.', 'exception': 'Traceback (most recent call last):\n  File 
> >>> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/modules/ovirt_disk_28.py",
> >>>  line 678, in main\n  File 
> >>> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
> >>>  line 646, in create\n
> >>> poll_interval=self._module.params[\'poll_interval\'],\n  File 
> >>> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
> >>>  line 364, in wait\nraise Exception("Timeout exceed while waiting on 
> >>> result state of the entity.")\nException: Timeout exceed while waiting on 
> >>> result state of the entity.\n', 'failed': True, 'invocation': 
> >>> {'module_args': {'name': 'HostedEngineConfigurationImage', 'size': 
> >>> '1GiB', 'format': 'raw', 'sparse': False, 'description': 'Hosted-Engine 
> >>> configuration disk', 'content_type': 'hosted_engine_configuration', 
> >>> 'interface': 'virtio', 'storage_domain': 'hosted_storage', 'wait': True, 
> >>> 'timeout': 600, 'auth': {'token': 
> >>> 'rAqX1OJIbJyMrA1aWVR-AR54T2lsiBbalN80dWugpfHFBqwiCe4rz3porngvlFSE90k-FEqagPPFboU6ew1hPw',
> >>>  'url': 
> >>> 'https://lago-he-basic-suite-master-engine.lago.local/ovirt-engine/api', 
> >>> 'ca_file': None, 'insecure': True, 'timeout': 0, 'compress': True, 
> >>> 'kerberos': False, 'headers': None}, 'poll_interval': 3, 'fetch_nested': 
> >>> False, 'nested_attributes': [], 'state': 'present', 'force': False, 'id': 
> >>> None, 'vm_name': None, 'vm_id': None, 'storage_domains': None, 'profile': 
> >>> None, 'quota_id': None, 'bootable': None, 'shareable': None, 
> >>> 'logical_unit': None, 'download_image_path': None, 'upload_image_path': 
> >>> None, 'sparsify': None, 'openstack_volume_type': None, 'image_provider': 
> >>> None, 'host': None, 'wipe_after_delete': None, 'activate': None}}, 
> >>> '_ansible_no_log': False, 'changed': False, 'item': {'name': 
> >>> 'HostedEngineConfigurationImage', 'description': 'Hosted-Engine 
> >>> configuration disk', 'size': '1GiB', 'format': 'raw', 'sparse': False, 
> >>> 'content': 'hosted_engine_configuration'}, 'ansible_loop_var': 'item', 
> >>> '_ansible_item_label': {'name': 'HostedEngineConfigurationImage', 
> >>> 'description': 'Hosted-Engine configuration disk', 'size': '1GiB', 
> >>> 'format': 'raw', 'sparse': False, 'content': 
> >>> 'hosted_engine_configuration'}}
> >>
> >> On Tue, Jul 7, 2020 at 4:22 PM Martin Perina  wrote:
> >>> Hi,
> >>>
> >>> I'm not aware of change regarding certificates recently. So is this error 
> >>> reproducible outside Jenkins? Or even better is it reproducible on some 
> >>> easier flow other than HE installation so we can debug what certificate 
> >>> is loaded in VDSM?

It (or probably something similar) was now reported also on users@, see thread:

[ovirt-users] Ovirt 

[ovirt-devel] Re: execution failed: javax.net.ssl.SSLPeerUnverifiedException (was: vdsm.storage.exception.UnknownTask: Task id unknown

2020-07-09 Thread Marcin Sobczyk

Hi,

On 7/8/20 3:34 PM, Yedidyah Bar David wrote:

Did you also get in engine.log "javax.net.ssl.SSLPeerUnverifiedException"?
I was also able to reproduce this on my server, but I'm baffled by this 
one...

I enabled debug logs on the engine with [1] and got this stack trace [2],
but the certs seem ok to me:

1. I verified the hostname in the certificate by running on the host:

openssl s_client \
    -connect 127.0.0.1:54321 \
    -CAfile /etc/pki/vdsm/certs/cacert.pem \
    -cert /etc/pki/vdsm/certs/vdsmcert.pem \
    -key /etc/pki/vdsm/keys/vdsmkey.pem \
    -verify_hostname lago-he-basic-suite-master-host-0.lago.local

2. curl is also happy:

curl \
    --cacert /etc/pki/vdsm/certs/cacert.pem \
    --cert /etc/pki/vdsm/certs/vdsmcert.pem \
    --key /etc/pki/vdsm/keys/vdsmkey.pem \
    https://lago-he-basic-suite-master-host-0.lago.local:54321

3. on the hosted engine there is proper entry in '/etc/hosts':

[root@lago-he-basic-suite-master-engine certs]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.200.3 lago-he-basic-suite-master-host-0.lago.local
192.168.222.76 lago-he-basic-suite-master-engine.lago.local # hosted-engine-setup-/var/tmp/localvm9k3eqtf7


4. and dig -x seems to resolve properly:

[root@lago-he-basic-suite-master-engine certs]# dig +short -x 192.168.200.3
lago-he-basic-suite-master-host-0.lago.local.
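
The same certificate check as in steps 1 and 2 can also be sketched in Python, which makes it easy to print the SAN entries the peer actually presents. This is a minimal sketch, assuming Python 3.7+ and the PKI paths used in the commands above; the helper names `dns_sans` and `fetch_peer_cert` are made up for this example and are not part of VDSM or the engine.

```python
import socket
import ssl


def dns_sans(cert: dict) -> list:
    """Return the DNS subjectAltName entries from a dict shaped like the
    return value of ssl.SSLSocket.getpeercert()."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def fetch_peer_cert(host, port, cafile, certfile, keyfile):
    """Connect with a client certificate and return the verified peer cert.

    create_default_context() enables hostname checking, so wrap_socket()
    raises ssl.SSLCertVerificationError if `host` is not covered by the
    certificate."""
    ctx = ssl.create_default_context(cafile=cafile)
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# Example call on the host (paths taken from the openssl/curl commands above):
# cert = fetch_peer_cert(
#     "lago-he-basic-suite-master-host-0.lago.local", 54321,
#     "/etc/pki/vdsm/certs/cacert.pem",
#     "/etc/pki/vdsm/certs/vdsmcert.pem",
#     "/etc/pki/vdsm/keys/vdsmkey.pem",
# )
# print(dns_sans(cert))
```

If this succeeds as well, it is one more client that accepts the certificate, which narrows the problem down to the engine's HTTP client.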

If anyone else has some ideas what else could be checked then please 
ping me.


Marcin

[1] https://gerrit.ovirt.org/110211
[2] http://pastebin.test.redhat.com/882851



On Wed, Jul 8, 2020 at 4:25 PM Artem Hrechanychenko  wrote:

Reproduced locally without using Jenkins


[ INFO  ] TASK [ovirt.hosted_engine_setup : Add HE disks]
[ ERROR ] {'msg': 'Timeout exceed while waiting on result state of the entity.', 'exception': 'Traceback (most recent 
call last):\n  File 
"/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/modules/ovirt_disk_28.py",
 line 678, in main\n  File 
"/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
 line 646, in create\npoll_interval=self._module.params[\'poll_interval\'],\n  File 
"/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
 line 364, in wait\nraise Exception("Timeout exceed while waiting on result state of the 
entity.")\nException: Timeout exceed while waiting on result state of the entity.\n', 'failed': True, 
'invocation': {'module_args': {'name': 'HostedEngineConfigurationImage', 'size': '1GiB', 'format': 'raw', 'sparse': 
False, 'description': 'Hosted-Engine configuration disk', 'content_type': 'hosted_engine_configuration', 'interface': 
'virtio', 'storage_domain': 'hosted_storage', 'wait': True, 'timeout': 600, 'auth': {'token': 
'rAqX1OJIbJyMrA1aWVR-AR54T2lsiBbalN80dWugpfHFBqwiCe4rz3porngvlFSE90k-FEqagPPFboU6ew1hPw', 'url': 
'https://lago-he-basic-suite-master-engine.lago.local/ovirt-engine/api', 'ca_file': None, 'insecure': True, 'timeout': 
0, 'compress': True, 'kerberos': False, 'headers': None}, 'poll_interval': 3, 'fetch_nested': False, 
'nested_attributes': [], 'state': 'present', 'force': False, 'id': None, 'vm_name': None, 'vm_id': None, 
'storage_domains': None, 'profile': None, 'quota_id': None, 'bootable': None, 'shareable': None, 'logical_unit': None, 
'download_image_path': None, 'upload_image_path': None, 'sparsify': None, 'openstack_volume_type': None, 
'image_provider': None, 'host': None, 'wipe_after_delete': None, 'activate': None}}, '_ansible_no_log': False, 
'changed': False, 'item': {'name': 'HostedEngineConfigurationImage', 'description': 'Hosted-Engine configuration disk', 
'size': '1GiB', 'format': 'raw', 'sparse': False, 'content': 'hosted_engine_configuration'}, 'ansible_loop_var': 
'item', '_ansible_item_label': {'name': 'HostedEngineConfigurationImage', 'description': 'Hosted-Engine configuration 
disk', 'size': '1GiB', 'format': 'raw', 'sparse': False, 'content': 'hosted_engine_configuration'}}


On Tue, Jul 7, 2020 at 4:22 PM Martin Perina  wrote:

Hi,

I'm not aware of change regarding certificates recently. So is this error 
reproducible outside Jenkins? Or even better is it reproducible on some easier 
flow other than HE installation so we can debug what certificate is loaded in 
VDSM?

Thanks,
Martin

On Tue, Jul 7, 2020 at 2:07 PM Yedidyah Bar David  wrote:

On Tue, Jul 7, 2020 at 12:50 PM Yedidyah Bar David  wrote:

On Wed, Jun 24, 2020 at 2:14 PM Evgeny Slutsky  wrote:

Hi,
changing the hostname to also include the domain name fixed the cert
deployment issue:
https://gerrit.ovirt.org/#/c/109842/

not sure how it affects the engine certificate content.
from my offline discussion with @Martin Perina, this was the change that could
have caused it:
https://gerrit.ovirt.org/#/c/109636/

any thoughts?

Above two patches 

[ovirt-devel] Re: execution failed: javax.net.ssl.SSLPeerUnverifiedException (was: vdsm.storage.exception.UnknownTask: Task id unknown (was: [oVirt Jenkins] ovirt-system-tests_he-basic-suite-master -

2020-07-08 Thread Yedidyah Bar David
Did you also get in engine.log "javax.net.ssl.SSLPeerUnverifiedException"?

On Wed, Jul 8, 2020 at 4:25 PM Artem Hrechanychenko  wrote:
>
> Reproduced locally without using Jenkins
>
>> [ INFO  ] TASK [ovirt.hosted_engine_setup : Add HE disks]
>> [ ERROR ] {'msg': 'Timeout exceed while waiting on result state of the 
>> entity.', 'exception': 'Traceback (most recent call last):\n  File 
>> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/modules/ovirt_disk_28.py",
>>  line 678, in main\n  File 
>> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
>>  line 646, in create\n
>> poll_interval=self._module.params[\'poll_interval\'],\n  File 
>> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
>>  line 364, in wait\nraise Exception("Timeout exceed while waiting on 
>> result state of the entity.")\nException: Timeout exceed while waiting on 
>> result state of the entity.\n', 'failed': True, 'invocation': 
>> {'module_args': {'name': 'HostedEngineConfigurationImage', 'size': '1GiB', 
>> 'format': 'raw', 'sparse': False, 'description': 'Hosted-Engine 
>> configuration disk', 'content_type': 'hosted_engine_configuration', 
>> 'interface': 'virtio', 'storage_domain': 'hosted_storage', 'wait': True, 
>> 'timeout': 600, 'auth': {'token': 
>> 'rAqX1OJIbJyMrA1aWVR-AR54T2lsiBbalN80dWugpfHFBqwiCe4rz3porngvlFSE90k-FEqagPPFboU6ew1hPw',
>>  'url': 
>> 'https://lago-he-basic-suite-master-engine.lago.local/ovirt-engine/api', 
>> 'ca_file': None, 'insecure': True, 'timeout': 0, 'compress': True, 
>> 'kerberos': False, 'headers': None}, 'poll_interval': 3, 'fetch_nested': 
>> False, 'nested_attributes': [], 'state': 'present', 'force': False, 'id': 
>> None, 'vm_name': None, 'vm_id': None, 'storage_domains': None, 'profile': 
>> None, 'quota_id': None, 'bootable': None, 'shareable': None, 'logical_unit': 
>> None, 'download_image_path': None, 'upload_image_path': None, 'sparsify': 
>> None, 'openstack_volume_type': None, 'image_provider': None, 'host': None, 
>> 'wipe_after_delete': None, 'activate': None}}, '_ansible_no_log': False, 
>> 'changed': False, 'item': {'name': 'HostedEngineConfigurationImage', 
>> 'description': 'Hosted-Engine configuration disk', 'size': '1GiB', 'format': 
>> 'raw', 'sparse': False, 'content': 'hosted_engine_configuration'}, 
>> 'ansible_loop_var': 'item', '_ansible_item_label': {'name': 
>> 'HostedEngineConfigurationImage', 'description': 'Hosted-Engine 
>> configuration disk', 'size': '1GiB', 'format': 'raw', 'sparse': False, 
>> 'content': 'hosted_engine_configuration'}}
>
>
> On Tue, Jul 7, 2020 at 4:22 PM Martin Perina  wrote:
>>
>> Hi,
>>
>> I'm not aware of change regarding certificates recently. So is this error 
>> reproducible outside Jenkins? Or even better is it reproducible on some 
>> easier flow other than HE installation so we can debug what certificate is 
>> loaded in VDSM?
>>
>> Thanks,
>> Martin
>>
>> On Tue, Jul 7, 2020 at 2:07 PM Yedidyah Bar David  wrote:
>>>
>>> On Tue, Jul 7, 2020 at 12:50 PM Yedidyah Bar David  wrote:
>>> >
>>> > On Wed, Jun 24, 2020 at 2:14 PM Evgeny Slutsky  
>>> > wrote:
>>> > >
>>> > > Hi,
>>> > > changing the hostname to include also the domain name fixed the  cert 
>>> > > deployment issue:
>>> > > https://gerrit.ovirt.org/#/c/109842/
>>> > >
>>> > > not sure how it affects the engine certificate content.
>>> > > from my offline discussion with @Martin Perina  this was that change 
>>> > > that could cause it:
>>> > > https://gerrit.ovirt.org/#/c/109636/
>>> > >
>>> > > any thoughts?
>>> >
>>> > Above two patches are merged, but we still fail the same way:
>>> >
>>> > https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/
>>> >
>>> > https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/ovirt-hosted-engine-setup/engine-logs-2020-07-07T03%3A15%3A01Z/ovirt-engine/engine.log
>>> >
>>> > 2020-07-06 23:04:25,555-04 ERROR
>>> > [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
>>> > (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-38)
>>> > [fb28ce9] Command 'UploadStreamVDSCommand(HostName =
>>> > lago-he-basic-suite-master-host-0.lago.local,
>>> > UploadStreamVDSCommandParameters:{hostId='e096650f-a7d6-4383-b1bb-f2e61327aac0'})'
>>> > execution failed: javax.net.ssl.SSLPeerUnverifiedException:
>>> > Certificate for  doesn't
>>> > match any of the subject alternative names:
>>> > [lago-he-basic-suite-master-host-0.lago.local]
>>> >
>>> > Any idea?
>>>
>>> And I now see this is indeed what's failing hosted-engine deploy at:
>>>
>>> 2020-07-07 05:51:58,573-0400 INFO ansible task start {'status': 'OK',
>>> 'ansible_type': 'task', 'ansible_playbook':
>>> 

2020-07-08 Thread Artem Hrechanychenko
Reproduced locally without using Jenkins

> [ INFO  ] TASK [ovirt.hosted_engine_setup : Add HE disks]
> [ ERROR ] {'msg': 'Timeout exceed while waiting on result state of the
> entity.', 'exception': 'Traceback (most recent call last):\n  File
> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/modules/ovirt_disk_28.py",
> line 678, in main\n  File
> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
> line 646, in create\n
>  poll_interval=self._module.params[\'poll_interval\'],\n  File
> "/tmp/ansible_ovirt_disk_28_payload_vtqyyibx/ansible_ovirt_disk_28_payload.zip/ansible/module_utils/ovirt.py",
> line 364, in wait\nraise Exception("Timeout exceed while waiting on
> result state of the entity.")\nException: Timeout exceed while waiting on
> result state of the entity.\n', 'failed': True, 'invocation':
> {'module_args': {'name': 'HostedEngineConfigurationImage', 'size': '1GiB',
> 'format': 'raw', 'sparse': False, 'description': 'Hosted-Engine
> configuration disk', 'content_type': 'hosted_engine_configuration',
> 'interface': 'virtio', 'storage_domain': 'hosted_storage', 'wait': True,
> 'timeout': 600, 'auth': {'token':
> 'rAqX1OJIbJyMrA1aWVR-AR54T2lsiBbalN80dWugpfHFBqwiCe4rz3porngvlFSE90k-FEqagPPFboU6ew1hPw',
> 'url': '
> https://lago-he-basic-suite-master-engine.lago.local/ovirt-engine/api',
> 'ca_file': None, 'insecure': True, 'timeout': 0, 'compress': True,
> 'kerberos': False, 'headers': None}, 'poll_interval': 3, 'fetch_nested':
> False, 'nested_attributes': [], 'state': 'present', 'force': False, 'id':
> None, 'vm_name': None, 'vm_id': None, 'storage_domains': None, 'profile':
> None, 'quota_id': None, 'bootable': None, 'shareable': None,
> 'logical_unit': None, 'download_image_path': None, 'upload_image_path':
> None, 'sparsify': None, 'openstack_volume_type': None, 'image_provider':
> None, 'host': None, 'wipe_after_delete': None, 'activate': None}},
> '_ansible_no_log': False, 'changed': False, 'item': {'name':
> 'HostedEngineConfigurationImage', 'description': 'Hosted-Engine
> configuration disk', 'size': '1GiB', 'format': 'raw', 'sparse': False,
> 'content': 'hosted_engine_configuration'}, 'ansible_loop_var': 'item',
> '_ansible_item_label': {'name': 'HostedEngineConfigurationImage',
> 'description': 'Hosted-Engine configuration disk', 'size': '1GiB',
> 'format': 'raw', 'sparse': False, 'content': 'hosted_engine_configuration'}}
>
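
For readers unfamiliar with where this message comes from: the traceback shows a `wait` helper in `module_utils/ovirt.py` giving up after the module's `timeout` (600 seconds here), polling every `poll_interval` (3 seconds here). The following is only a minimal sketch of that general poll-until-timeout pattern, not the actual oVirt Ansible code; the function name `wait_for` and the `clock`/`sleep` parameters are assumptions introduced for illustration.

```python
import time


def wait_for(condition, timeout=600, poll_interval=3,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or `timeout` seconds elapse.

    Hypothetical helper: the defaults mirror the 'timeout' and
    'poll_interval' values shown in the log above, nothing more.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if condition():
            return
        sleep(poll_interval)
    # Same message text as in the traceback, raised once the deadline passes.
    raise Exception("Timeout exceed while waiting on result state of the entity.")
```

The point of the sketch is that the timeout itself says nothing about the cause: the disk simply never reached the expected state within 600 seconds, so the underlying failure has to be found in engine.log.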

On Tue, Jul 7, 2020 at 4:22 PM Martin Perina  wrote:

> Hi,
>
> I'm not aware of change regarding certificates recently. So is this error
> reproducible outside Jenkins? Or even better is it reproducible on some
> easier flow other than HE installation so we can debug what certificate is
> loaded in VDSM?
>
> Thanks,
> Martin
>
> On Tue, Jul 7, 2020 at 2:07 PM Yedidyah Bar David  wrote:
>
>> On Tue, Jul 7, 2020 at 12:50 PM Yedidyah Bar David 
>> wrote:
>> >
>> > On Wed, Jun 24, 2020 at 2:14 PM Evgeny Slutsky 
>> wrote:
>> > >
>> > > Hi,
>> > > changing the hostname to include also the domain name fixed the  cert
>> deployment issue:
>> > > https://gerrit.ovirt.org/#/c/109842/
>> > >
>> > > not sure how it affects the engine certificate content.
>> > > from my offline discussion with @Martin Perina  this was that change
>> that could cause it:
>> > > https://gerrit.ovirt.org/#/c/109636/
>> > >
>> > > any thoughts?
>> >
>> > Above two patches are merged, but we still fail the same way:
>> >
>> >
>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/
>> >
>> >
>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/ovirt-hosted-engine-setup/engine-logs-2020-07-07T03%3A15%3A01Z/ovirt-engine/engine.log
>> >
>> > 2020-07-06 23:04:25,555-04 ERROR
>> > [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
>> > (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-38)
>> > [fb28ce9] Command 'UploadStreamVDSCommand(HostName =
>> > lago-he-basic-suite-master-host-0.lago.local,
>> >
>> UploadStreamVDSCommandParameters:{hostId='e096650f-a7d6-4383-b1bb-f2e61327aac0'})'
>> > execution failed: javax.net.ssl.SSLPeerUnverifiedException:
>> > Certificate for  doesn't
>> > match any of the subject alternative names:
>> > [lago-he-basic-suite-master-host-0.lago.local]
>> >
>> > Any idea?
>>
>> And I now see this is indeed what's failing hosted-engine deploy at:
>>
>> 2020-07-07 05:51:58,573-0400 INFO ansible task start {'status': 'OK',
>> 'ansible_type': 'task', 'ansible_playbook':
>> '/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml',
>> 'ansible_task': 'ovirt.hosted_engine_setup : Check OVF_STORE volume
>> status'}
>>
>> (See other thread: [oVirt Jenkins]
>> ovirt-system-tests_he-basic-suite-master - Build # 1655 - Still
>> Failing! )
>>
>> On a successful run, engine.log has:
>>
>> 

2020-07-07 Thread Martin Perina
Hi,

I'm not aware of change regarding certificates recently. So is this error
reproducible outside Jenkins? Or even better is it reproducible on some
easier flow other than HE installation so we can debug what certificate is
loaded in VDSM?

Thanks,
Martin
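
A minimal sketch of how the certificate question above could be checked once the certificate has been obtained: it compares a requested host name against the DNS entries in the certificate's subject alternative names. It assumes the certificate is available as a dict in the shape returned by Python's `ssl.SSLSocket.getpeercert()`; the function names are hypothetical, and the comparison is deliberately simpler than what a real TLS client does (exact match only, no wildcards).

```python
def san_dns_names(cert):
    """Return the DNS entries of subjectAltName from a getpeercert()-style dict."""
    return [value for kind, value in cert.get("subjectAltName", ())
            if kind == "DNS"]


def hostname_in_sans(hostname, cert):
    """Exact, case-insensitive match against the DNS SANs.

    Assumption for illustration: wildcard entries and IP SANs are ignored.
    """
    wanted = hostname.lower().rstrip(".")
    return any(wanted == name.lower().rstrip(".")
               for name in san_dns_names(cert))
```

For example, with a certificate dict of `{"subjectAltName": (("DNS", "host-0.example.test"),)}` (placeholder names, not taken from the failing job), `hostname_in_sans("host-0", cert)` is false while `hostname_in_sans("host-0.example.test", cert)` is true, which is the kind of mismatch the SSLPeerUnverifiedException reports.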

On Tue, Jul 7, 2020 at 2:07 PM Yedidyah Bar David  wrote:

> On Tue, Jul 7, 2020 at 12:50 PM Yedidyah Bar David 
> wrote:
> >
> > On Wed, Jun 24, 2020 at 2:14 PM Evgeny Slutsky 
> wrote:
> > >
> > > Hi,
> > > changing the hostname to include also the domain name fixed the  cert
> deployment issue:
> > > https://gerrit.ovirt.org/#/c/109842/
> > >
> > > not sure how it affects the engine certificate content.
> > > from my offline discussion with @Martin Perina  this was that change
> that could cause it:
> > > https://gerrit.ovirt.org/#/c/109636/
> > >
> > > any thoughts?
> >
> > Above two patches are merged, but we still fail the same way:
> >
> >
> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/
> >
> >
> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1664/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/ovirt-hosted-engine-setup/engine-logs-2020-07-07T03%3A15%3A01Z/ovirt-engine/engine.log
> >
> > 2020-07-06 23:04:25,555-04 ERROR
> > [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand]
> > (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-38)
> > [fb28ce9] Command 'UploadStreamVDSCommand(HostName =
> > lago-he-basic-suite-master-host-0.lago.local,
> >
> UploadStreamVDSCommandParameters:{hostId='e096650f-a7d6-4383-b1bb-f2e61327aac0'})'
> > execution failed: javax.net.ssl.SSLPeerUnverifiedException:
> > Certificate for  doesn't
> > match any of the subject alternative names:
> > [lago-he-basic-suite-master-host-0.lago.local]
> >
> > Any idea?
>
> And I now see this is indeed what's failing hosted-engine deploy at:
>
> 2020-07-07 05:51:58,573-0400 INFO ansible task start {'status': 'OK',
> 'ansible_type': 'task', 'ansible_playbook':
> '/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml',
> 'ansible_task': 'ovirt.hosted_engine_setup : Check OVF_STORE volume
> status'}
>
> (See other thread: [oVirt Jenkins]
> ovirt-system-tests_he-basic-suite-master - Build # 1655 - Still
> Failing! )
>
> On a successful run, engine.log has:
>
> 2020-07-02 18:01:55,527+03 INFO
> [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> [2b0721d8] Running command: ProcessOvfUpdateForStorageDomainCommand
> internal: true. Entities affected :  ID:
> e102d7b5-1a37-490f-a3e7-20e56c37791f Type: StorageAction group
> MANIPULATE_STORAGE_DOMAIN with role type ADMIN
> 2020-07-02 18:01:55,607+03 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> [2b0721d8] START, SetVolumeDescriptionVDSCommand(
> SetVolumeDescriptionVDSCommandParameters:{storagePoolId='b9dccefe-bc61-11ea-8ebe-001a4a231728',
> ignoreFailoverLimit='false',
> storageDomainId='e102d7b5-1a37-490f-a3e7-20e56c37791f',
> imageGroupId='db934a98-4111-4faf-8cb9-6b36928cd61c',
> imageId='f898c40e-1f88-48db-b59b-f2c73162ddb7'}), log id: e203e51
> 2020-07-02 18:01:55,609+03 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> [2b0721d8] -- executeIrsBrokerCommand: calling 'setVolumeDescription', parameters:
> 2020-07-02 18:01:55,609+03 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> [2b0721d8] ++ spUUID=b9dccefe-bc61-11ea-8ebe-001a4a231728
> 2020-07-02 18:01:55,609+03 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> [2b0721d8] ++ sdUUID=e102d7b5-1a37-490f-a3e7-20e56c37791f
> 2020-07-02 18:01:55,609+03 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> [2b0721d8] ++ imageGroupGUID=db934a98-4111-4faf-8cb9-6b36928cd61c
> 2020-07-02 18:01:55,610+03 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> [2b0721d8] ++ volUUID=f898c40e-1f88-48db-b59b-f2c73162ddb7
> 2020-07-02 18:01:55,610+03 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84)
> [2b0721d8] ++ description={"Updated":false,"Last Updated":"Thu Jul 02 17:35:07
> IDT 2020","Storage
>