[ovirt-users] Can't connect "https://FQDN/ovirt-engine" after reinstall ovirt-engine

2019-03-19 Thread exclim89
On 19.03.2019, on the server hosting the oVirt engine, I reinstalled the GUI and 
this apparently removed the ovirt-engine; in any case, the ovirt-engine packages 
were no longer installed. I reinstalled the ovirt-engine package, but I cannot 
connect to the web interface. /var/log/ovirt-engine/engine.log shows errors such 
as:

2019-03-20 11:37:12,660+09 INFO  
[org.ovirt.engine.core.extensions.mgr.ExtensionsManager] (ServerService Thread 
Pool -- 58) [] Initializing extension 'internal-authz'
2019-03-20 11:37:12,661+09 ERROR 
[org.ovirt.engine.extension.aaa.jdbc.binding.api.AuthzExtension] (ServerService 
Thread Pool -- 58) [] Unexpected Exception invoking: 
EXTENSION_INITIALIZE[e5ae1b7f-9104-4f23-a444-7b9175ff68d2]
2019-03-20 11:37:12,662+09 ERROR 
[org.ovirt.engine.core.extensions.mgr.ExtensionsManager] (ServerService Thread 
Pool -- 58) [] Error in activating extension 'internal-authz': Connection 
refused. Check that the hostname and port are correct and that the postmaster 
is accepting TCP/IP connections.
2019-03-20 11:37:12,662+09 ERROR 
[org.ovirt.engine.core.sso.utils.SsoExtensionsManager] (ServerService Thread 
Pool -- 58) [] Could not initialize extension 'internal-authz'. Exception 
message is: Class: class 
org.ovirt.engine.core.extensions.mgr.ExtensionInvokeCommandFailedException
...
2019-03-20 11:37:13,046+09 ERROR 
[org.ovirt.engine.ui.frontend.server.dashboard.DashboardDataServlet] 
(ServerService Thread Pool -- 44) [] Could not access engine's DWH 
configuration table: java.sql.SQLException: javax.resource.ResourceException: 
IJ000453: Unable to get managed connection for java:/ENGINEDataSource
...
2019-03-20 11:37:13,049+09 WARN  
[org.ovirt.engine.ui.frontend.server.dashboard.DashboardDataServlet] 
(ServerService Thread Pool -- 44) [] No valid DWH configurations were found, 
assuming DWH database isn't setup.
2019-03-20 11:37:13,049+09 INFO  
[org.ovirt.engine.ui.frontend.server.dashboard.DashboardDataServlet] 
(ServerService Thread Pool -- 44) [] Dashboard DB query cache has been disabled.

ovirt-engine-4.2.8.2-1
CentOS 7,  3.10.0-693.2.2.el7.x86_64
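
The "Connection refused ... postmaster is accepting TCP/IP connections" lines above point at the
engine's PostgreSQL database not running. A minimal first check, assuming the default local
database that engine-setup creates on 4.2 (the SCL service name may differ on other setups):

systemctl status rh-postgresql95-postgresql
systemctl start rh-postgresql95-postgresql
su - postgres -c 'scl enable rh-postgresql95 -- psql -l'   # the 'engine' database should be listed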

I have a backup file, but it is from version 4.1, so restoring it with 
engine-backup is not possible.
I also have a backup of /etc taken before the ovirt-engine was removed.

How can I restore the ovirt-engine?
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/4SIGKFFHKBYFWMLNPJODEK6SV7PLYGLP/


[ovirt-users] Re: vender_id syntax UserDefinedVMProperties

2019-03-19 Thread Strahil
Can't you make the script check whether it's Windows or Linux and skip if it's Linux?

Best Regards,
Strahil Nikolov

On Mar 19, 2019 23:02, Darin Schmidt  wrote:
>
> You also need to have this code hooked in:
> cd /usr/libexec/vdsm/hooks/before_vm_start/
> vi 99_mask_kvm
>
> #!/usr/bin/python2
>
> import hooking
> domxml = hooking.read_domxml()
>
> hyperv = domxml.getElementsByTagName('hyperv')[0]
> smm = domxml.createElement('vendor_id')
> smm.setAttribute('state', 'on')
> smm.setAttribute('value', '1234567890ab')
> hyperv.appendChild(smm)
>
> features = domxml.getElementsByTagName('features')[0]
> kvm = domxml.createElement('kvm')
> hidden = domxml.createElement('hidden')
> hidden.setAttribute('state', 'on')
> kvm.appendChild(hidden)
> features.appendChild(kvm)
>
> hooking.write_domxml(domxml)
>
>
> The only problem now is that I can't boot a Linux VM with the vendor_id portion 
> there.
>
> On Mon, Mar 18, 2019 at 3:30 PM Darin Schmidt  wrote:
>>
>> Seems that the system has to be running with the Q35 UEFI BIOS. A standard BIOS 
>> does not work. The system is operational now. 
>>
>> On Mon, Mar 18, 2019, 6:30 AM Darin Schmidt  wrote:
>>>
>>> Still no luck getting the GTX 1080 to enable inside the VM. I see the code 
>>> is being generated in the XML by the hook, but I still get error code 43. 
>>> Someone mentioned doing it with a UEFI BIOS and that worked for them. So when 
>>> I get back from work today, perhaps I'll give that a try. 
>>>
>>> On Mon, Mar 18, 2019, 6:10 AM Darin Schmidt  wrote:

 I have gotten the system to see the card; it's in Device Manager. The 
 problem seems to be that I cannot use it in the VM because, from what I 
 have been finding out, it gets an error code 43. Nvidia drivers 
 disable the card if they detect that it's being used in a VM. I have found 
 some code to hook into the XML in before_vm_starts.

 99_mask_kvm
 #!/usr/bin/python2

 import hooking
 domxml = hooking.read_domxml()

 hyperv = domxml.getElementsByTagName('hyperv')[0]
 smm = domxml.createElement('vendor_id')
 smm.setAttribute('state', 'on')
 smm.setAttribute('value', '1234567890ab')
 hyperv.appendChild(smm)

 features = domxml.getElementsByTagName('features')[0]
 kvm = domxml.createElement('kvm')
 hidden = domxml.createElement('hidden')
 hidden.setAttribute('state', 'on')
 kvm.appendChild(hidden)
 features.appendChild(kvm)

 hooking.write_domxml(domxml)


 I am currently reinstalling the drivers to see if this helps. 

 kvm off and vendor_id are now in the XML that gets generated when the 
 VM is started. I'm going off of examples I'm finding online. Perhaps I just 
 need to add the 10de to it instead of the generic # others are using. 

 On Mon, Mar 18, 2019 at 6:02 AM Nisim Simsolo  wrote:
>
> Hi
>
> Vendor ID of Nvidia is usually 10de.
> You can locate 'vendor ID:___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/VH27ZEYJHUE5XQ72FPE26YLOQBNHIXW4/


[ovirt-users] Re: vender_id syntax UserDefinedVMProperties

2019-03-19 Thread Darin Schmidt
You also need to have this code hooked in:
cd /usr/libexec/vdsm/hooks/before_vm_start/
vi 99_mask_kvm

#!/usr/bin/python2

import hooking

# Read the domain XML that VDSM is about to hand to libvirt
domxml = hooking.read_domxml()

# Add <vendor_id state='on' value='1234567890ab'/> under <hyperv>
# so the guest sees a non-KVM hypervisor vendor id
hyperv = domxml.getElementsByTagName('hyperv')[0]
smm = domxml.createElement('vendor_id')
smm.setAttribute('state', 'on')
smm.setAttribute('value', '1234567890ab')
hyperv.appendChild(smm)

# Add <kvm><hidden state='on'/></kvm> under <features> to hide the KVM signature
features = domxml.getElementsByTagName('features')[0]
kvm = domxml.createElement('kvm')
hidden = domxml.createElement('hidden')
hidden.setAttribute('state', 'on')
kvm.appendChild(hidden)
features.appendChild(kvm)

# Write the modified XML back so the VM starts with these settings
hooking.write_domxml(domxml)


The only problem now is that I can't boot a Linux VM with the vendor_id portion
there.

On Mon, Mar 18, 2019 at 3:30 PM Darin Schmidt 
wrote:

> Seems that the system has to be running with the Q35 UEFI BIOS. A standard BIOS
> does not work. The system is operational now.
>
> On Mon, Mar 18, 2019, 6:30 AM Darin Schmidt 
> wrote:
>
>> Still no luck getting the GTX 1080 to enable inside the VM. I see the
>> code is being generated in the XML by the hook, but I still get error
>> code 43. Someone mentioned doing it with a UEFI BIOS and that worked for
>> them. So when I get back from work today, perhaps I'll give that a try.
>>
>> On Mon, Mar 18, 2019, 6:10 AM Darin Schmidt 
>> wrote:
>>
>>> I have gotten the system to see the card; it's in Device Manager. The
>>> problem seems to be that I cannot use it in the VM because, from what I have
>>> been finding out, it gets an error code 43. Nvidia drivers disable
>>> the card if they detect that it's being used in a VM. I have found some code
>>> to hook into the XML in before_vm_starts.
>>>
>>> 99_mask_kvm
>>> #!/usr/bin/python2
>>>
>>> import hooking
>>> domxml = hooking.read_domxml()
>>>
>>> hyperv = domxml.getElementsByTagName('hyperv')[0]
>>> smm = domxml.createElement('vendor_id')
>>> smm.setAttribute('state', 'on')
>>> smm.setAttribute('value', '1234567890ab')
>>> hyperv.appendChild(smm)
>>>
>>> features = domxml.getElementsByTagName('features')[0]
>>> kvm = domxml.createElement('kvm')
>>> hidden = domxml.createElement('hidden')
>>> hidden.setAttribute('state', 'on')
>>> kvm.appendChild(hidden)
>>> features.appendChild(kvm)
>>>
>>> hooking.write_domxml(domxml)
>>>
>>>
>>> I am currently reinstalling the drivers to see if this helps.
>>>
>>> kvm off and vendor_id are now in the XML that gets generated when the
>>> VM is started. I'm going off of examples I'm finding online. Perhaps I just
>>> need to add the 10de to it instead of the generic # others are using.
>>>
>>> On Mon, Mar 18, 2019 at 6:02 AM Nisim Simsolo 
>>> wrote:
>>>
 Hi

 Vendor ID of Nvidia is usually 10de.
 You can locate 'vendor ID:product ID' by running lspci command, for
 example:
 [root@intel-vfio ~]# lspci -Dnn | grep -i nvidia
 :03:00.0 VGA compatible controller [0300]: NVIDIA Corporation
 GK104GL [Quadro K4200] [10de:11b4] (rev a1)
 :03:00.1 Audio device [0403]: NVIDIA Corporation GK104 HDMI Audio
 Controller [10de:0e0a] (rev a1)
 [root@intel-vfio ~]#

 In this example, the vendor ID of VGA controller is 10de and the
 product ID is 11b4

 Please bear in mind that you need to enable IOMMU, add pci-stub
 (to prevent the host driver from grabbing the GPU device) and disable the default
 nouveau driver on the host kernel command line.
 To do that:
 1. Edit host /etc/sysconfig/grub and add the next to GRUB_CMDLINE_LINUX:

- intel_iommu=on or amd_iommu=on
- pci-stub.ids=10de:11b4,10de:0e0a
- rdblacklist=nouveau

 2. Regenerate the boot loader configuration using grub2-mkconfig
 command:
 # grub2-mkconfig -o /etc/grub2.cfg
 3. Reboot the host.
 4. Verify configuration:
 [root@intel-vfio ~]# cat /proc/cmdline
 BOOT_IMAGE=/vmlinuz-3.10.0-957.5.1.el7.x86_64
 root=/dev/mapper/vg0-lv_root ro crashkernel=auto rd.lvm.lv=vg0/lv_root
 rd.lvm.lv=vg0/lv_swap rhgb quiet pci-stub.ids=10de:11b4,10de:0e0a
 intel_iommu=on rdblacklist=nouveau LANG=en_US.UTF-8
 [root@intel-vfio ~]#


 After running this, you should be able to pass the GPU through to the VM.
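
 A quick way to confirm the stub driver actually claimed the card before starting
 the VM (a sketch using the example IDs above; substitute your own vendor:product pair):

 lspci -nnk -d 10de:11b4

 The "Kernel driver in use:" line should show pci-stub (or vfio-pci), not nouveau.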

 BTW, why are you using engine-config and not doing it from oVirt UI or
 using virsh edit command?

 Thanks


 On Mon, Mar 18, 2019 at 1:52 AM Darin Schmidt 
 wrote:

> Hello all, I'm trying to figure out how to configure the custom
> properties to enable my NVIDIA card to work in the VM. It's my
> understanding that the drivers don't work because they detect it's in a VM.
>
> Im trying to do something like this:
>
> engine-config -s
> UserDefinedVMProperties="kvmhidden=^(true|false)$;{type=vendor_id;state={^(on|off)$;value=^([0-9])$}}"
>
>
> But that's clearly not working. If I do this:
>
> engine-config -s
> UserDefinedVMProperties="kvmhidden=^(true|false)$;vendor_id={state=^(on|off)$;value=^([0-9])$}"
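
(For reference: oVirt custom properties are flat name=regex pairs separated by semicolons, so
nested structures like the above are not parsed. A rough sketch of a definition that should at
least be accepted - the exact value regex and config version are assumptions, and ovirt-engine
must be restarted afterwards:)

engine-config -s UserDefinedVMProperties='kvmhidden=^(true|false)$;vendor_id=^[0-9a-zA-Z]{1,12}$' --cver=4.2
systemctl restart ovirt-engine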

[ovirt-users] oVirt 4.3 - create windows2012 vm failed

2019-03-19 Thread jingjie . jiang
Hi,

oVirt 4.3 is installed on CentOS 7.6.
Creating a Windows 2012 VM failed with the following error messages:

2019-03-18 09:19:12,941-04 ERROR 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(ForkJoinPool-1-worker-15) [] EVENT_ID: VM_DOWN_ERROR(119), VM win12_nfs is 
down with error. Exit message: internal error: qemu unexpectedly closed the 
monitor: 2019-03-18T13:19:11.837850Z qemu-kvm: warning: All CPU(s) up to 
maxcpus should be described in NUMA config, ability to start up with partial 
NUMA mappings is obsoleted and will be removed in future
Hyper-V SynIC is not supported by kernel
2019-03-18T13:19:11.861071Z qemu-kvm: kvm_init_vcpu failed: Function not 
implemented.

Any suggestions?

Thanks,
Jingjie
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GHUBLKYNM4G5QOIEFY5H7QKPLBDU5H7Z/


[ovirt-users] Re: oVirt 4.3 - create windows2012 failed due to ovirt-imageio-proxy

2019-03-19 Thread Jingjie Jiang

Hi Edward,

The problem is fixed.


Thanks,

Jingjie


On 3/18/19 5:07 PM, Edward Berger wrote:
That is completely normal if you didn't download and install the CA 
certificate from your ovirt engine GUI.

There's a download link for it on the page before you login?
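
If the link is hard to spot, the CA certificate can also be fetched straight from the
engine's PKI service (a sketch; replace the FQDN with your engine's):

curl -k -o ca.pem 'https://<engine-fqdn>/ovirt-engine/services/pki-resource?resource=ca-certificate&format=X509-PEM-CA'

and then imported into the browser as a trusted authority.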

On Mon, Mar 18, 2019 at 5:01 PM > wrote:


Hi,

I tried to create a Windows 2012 VM on an NFS data domain, but the disk
was locked.
I found the following error message:

Connection to ovirt-imageio-proxy service has failed. Make sure
the service is installed, configured, and ovirt-engine certificate
is registered as a valid CA in the browser.

Is this a known issue?

Thanks,
Jingjie
___
Users mailing list -- users@ovirt.org 
To unsubscribe send an email to users-le...@ovirt.org

Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:

https://lists.ovirt.org/archives/list/users@ovirt.org/message/QBZQG553L7WYVHQFDRWUAYKYZ2HLSJKW/

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/6KI3N22RY6UCXM6OR73T7WPCXLUOWA5X/


[ovirt-users] Re: Change cluster cpu type with hosted engine

2019-03-19 Thread Fabrice SOLER

On 12/03/2019 at 09:08, Juhani Rautiainen wrote:


On Tue, Mar 12, 2019 at 2:02 PM Fabrice SOLER 
> wrote:


Hello,

I need to create a Windows 10 virtual machine but I get an error:

I have a fresh oVirt installation (version 4.2.8) with a hosted
engine. During the hosted-engine installation there was no question
about the cluster CPU type; it would be great if a future
version could ask for it.

To move a host to another cluster, the host needs to be in
maintenance mode, and the hosted engine will be powered off.

I have created another cluster with a SandyBridge Family CPU
type, but to move the hosted engine to this new cluster the engine
has to be powered off.

Is there someone who can help ?


Hi!

This is modified from original but could work:
- create new cluster with new CPU type
- set HE global maintenance mode
- set one of the hosted-engine hosts into maintenance mode

Hi,
Thank you for your answer.
At this point the hosted-engine hosts cannot be put into
maintenance mode.
Do you know if it is possible to change the cluster from the CLI? Then I could
stop the hosted engine and move the hosts to the new cluster.

Sincerely,
Fabrice

- move it to a different cluster
- shutdown the engine VM
- manually restart the engine VM on the host on the custom cluster 
directly executing on that host: hosted-engine --vm-start

- connect again to the engine
- set all the hosts of the initial cluster into maintenance mode
- change CPU type in original cluster
- shut down again the engine VM
- manually restart the engine VM on one of the hosts of the initial 
cluster
- move back the host that got into a temporary cluster to its initial 
cluster
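
The engine-VM side of those steps maps onto these hosted-engine commands (a rough sketch of
the sequence; the cluster moves themselves are done in the Administration Portal):

hosted-engine --set-maintenance --mode=global
hosted-engine --vm-shutdown
hosted-engine --vm-start       # run on the host that now sits in the target cluster
hosted-engine --set-maintenance --mode=none    # once everything is back in place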


Sincerely,

-- 
___

Users mailing list -- users@ovirt.org 
To unsubscribe send an email to users-le...@ovirt.org

Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:

https://lists.ovirt.org/archives/list/users@ovirt.org/message/KFH3ZLPA7KZSSJG3DGOGW2F4OMXE4KZK/


-Juhani
--
Juhani Rautiainen jra...@iki.fi 



--
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QZCARTDLTKXJ5YTFQ7EX4T3BB4RFGD64/


[ovirt-users] Re: Ovirt 4.3.1 problem with HA agent

2019-03-19 Thread Strahil
>> >> 1.2 All bricks healed (gluster volume heal data info summary) and no 
>> >> split-brain
>> >
>> >  
>> >  
>> > gluster volume heal data info
>> >  
>> > Brick node-msk-gluster203:/opt/gluster/data
>> > Status: Connected
>> > Number of entries: 0
>> >  
>> > Brick node-msk-gluster205:/opt/gluster/data
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > Status: Connected
>> > Number of entries: 7
>> >  
>> > Brick node-msk-gluster201:/opt/gluster/data
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > Status: Connected
>> > Number of entries: 7
>> >  
>>
>> Data needs healing.
>> Run: cluster volume heal data full
>
> This does not work.

Yeah, that's because my phone autocorrects 'gluster' to 'cluster'.

Usually the gluster daemons detect the need for healing on their own, but with 'gluster volume heal data 
full && sleep 5 && gluster volume heal data info summary && sleep 5 && gluster 
volume heal data info summary' you can force syncing and check the result.
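
Spelled out, that one-liner is simply (assuming the volume is named 'data'):

gluster volume heal data full
sleep 5
gluster volume heal data info summary
sleep 5
gluster volume heal data info summary
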
Let's see what happens with DNS.

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BR4Y4X5AGRUWGYOSKNQPRR6XHCOMQXZG/


[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Gianluca Cecchi
On Tue, Mar 19, 2019 at 4:44 PM Gianluca Cecchi 
wrote:

> On Tue, Mar 19, 2019 at 4:31 PM Miguel Duarte de Mora Barroso <
> mdbarr...@redhat.com> wrote:
>
> [snip]
>
>
>> >> >> >> @Gianluca Cecchi , I notice that one of your duplicate networks -
>> >> >> >> 'ovn192'  - has no ports attached. That makes it the perfect
>> candidate
>> >> >> >> to be deleted, and see if it becomes 'listable' on engine. That
>> would
>> >> >> >> help rule out the 'duplicate name' theory.
>> >> >> >
>> >> >> >
>> >> >> >  I can try. Can you give me the command to be run?
>> >> >> > It is a test oVirt so It would be not a big problem in case of
>> failures in this respect.
>> >> >>
>> >> >> You can delete it via the UI; just be sure to delete the one without
>> >> >> ports - it's external ID is 6110649a-db2b-4de7-8fbc-601095cfe510.
>> >> >>
>> >> >> It will ask you if you also want to delete it from the external
>> >> >> provider, say yes.
>> >> >
>> >> >
>> >> >
>> >> > Inside the GUI I see only one ovn192 network and one ovn172 network
>> and their external ids don't match the ones without ports...
>> >> >
>> >> > - ovn192
>> >> > Id: 8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5
>> >> > External ID: 32367d8a-460f-4447-b35a-abe9ea5187e0
>> >> >
>> >> > - ovn172
>> >> > Id: 7546d5d3-a0e3-40d5-9d22-cf355da47d3a
>> >> > External ID: 64c4c17f-cd67-4e29-939e-2b952495159f
>> >> >
>> >> > So I think I have to delete from command line
>> >>
>> >> Check pastebin [0],  with it you can safely delete those 2 networks.
>> >> Last course of action would be to delete via ovn-nbctl - e.g.
>> >> ovn-nbctl destroy logical_switch  - but hopefully it won't
>> >> come to that.
>> >>
>> >> [0] - https://paste.fedoraproject.org/paste/mxVUEJZWxG-QHX0mJO1VhA
>> >>
>>
>>
> I get "not found" for both:
>
>  [root@ovmgr1 ~]# curl -k -X DELETE   '
> https://localhost:9696/v2/networks/6110649a-db2b-4de7-8fbc-601095cfe510'
>  -H 'X-Auth-Token:
> WyutJuakjpSzJ4nj7drptpDfbAb3sKcZWvhF3NqRVXRyUpIHz9QGG_ZeeLi7u7trv7Er2D3vAcSX9LIFpXzz7w'
> {
>   "error": {
> "message": "Cannot find Logical_Switch with
> name=6110649a-db2b-4de7-8fbc-601095cfe510",
> "code": 404,
> "title": "Not Found"
>   }
> }
> [root@ovmgr1 ~]# curl -k -X DELETE   '
> https://localhost:9696/v2/networks/8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5'
>  -H 'X-Auth-Token:
> WyutJuakjpSzJ4nj7drptpDfbAb3sKcZWvhF3NqRVXRyUpIHz9QGG_ZeeLi7u7trv7Er2D3vAcSX9LIFpXzz7w'
> {
>   "error": {
> "message": "Cannot find Logical_Switch with
> name=8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5",
> "code": 404,
> "title": "Not Found"
>   }
> }
> [root@ovmgr1 ~]#
>
> Is there a command to get the supposed list?
>
> Thanks for your help.
> I'm also available to completely reset the OVN config if there is a way
> for it...
>
> Gianluca
>


A GET call outputs this information :
 [root@ovmgr1 ~]# curl -k -X GET 'https://localhost:9696/v2/networks' -H
'X-Auth-Token:
WyutJuakjpSzJ4nj7drptpDfbAb3sKcZWvhF3NqRVXRyUpIHz9QGG_ZeeLi7u7trv7Er2D3vAcSX9LIFpXzz7w'
{"networks": [{"status": "ACTIVE", "name": "ovn172", "tenant_id":
"0001", "mtu": 1442, "port_security_enabled":
false, "id": "64c4c17f-cd67-4e29-939e-2b952495159f"}, {"status": "ACTIVE",
"name": "ovn172", "tenant_id": "0001", "mtu":
1442, "port_security_enabled": false, "id":
"04501f6b-3977-4ba1-9ead-7096768d796d"}, {"status": "ACTIVE", "name":
"ovn192", "tenant_id": "0001", "mtu": 1442,
"port_security_enabled": false, "id":
"32367d8a-460f-4447-b35a-abe9ea5187e0"}]}[root@ovmgr1 ~]#
[root@ovmgr1 ~]#
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GOG3NESKLWF4MFWP5CJRV6COUZF24FL2/


[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Gianluca Cecchi
On Tue, Mar 19, 2019 at 4:31 PM Miguel Duarte de Mora Barroso <
mdbarr...@redhat.com> wrote:

[snip]


> >> >> >> @Gianluca Cecchi , I notice that one of your duplicate networks -
> >> >> >> 'ovn192'  - has no ports attached. That makes it the perfect
> candidate
> >> >> >> to be deleted, and see if it becomes 'listable' on engine. That
> would
> >> >> >> help rule out the 'duplicate name' theory.
> >> >> >
> >> >> >
> >> >> >  I can try. Can you give me the command to be run?
> >> >> > It is a test oVirt so It would be not a big problem in case of
> failures in this respect.
> >> >>
> >> >> You can delete it via the UI; just be sure to delete the one without
> >> >> ports - it's external ID is 6110649a-db2b-4de7-8fbc-601095cfe510.
> >> >>
> >> >> It will ask you if you also want to delete it from the external
> >> >> provider, say yes.
> >> >
> >> >
> >> >
> >> > Inside the GUI I see only one ovn192 network and one ovn172 network
> and their external ids don't match the ones without ports...
> >> >
> >> > - ovn192
> >> > Id: 8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5
> >> > External ID: 32367d8a-460f-4447-b35a-abe9ea5187e0
> >> >
> >> > - ovn172
> >> > Id: 7546d5d3-a0e3-40d5-9d22-cf355da47d3a
> >> > External ID: 64c4c17f-cd67-4e29-939e-2b952495159f
> >> >
> >> > So I think I have to delete from command line
> >>
> >> Check pastebin [0],  with it you can safely delete those 2 networks.
> >> Last course of action would be to delete via ovn-nbctl - e.g.
> >> ovn-nbctl destroy logical_switch  - but hopefully it won't
> >> come to that.
> >>
> >> [0] - https://paste.fedoraproject.org/paste/mxVUEJZWxG-QHX0mJO1VhA
> >>
>
>
I get "not found" for both:

 [root@ovmgr1 ~]# curl -k -X DELETE   '
https://localhost:9696/v2/networks/6110649a-db2b-4de7-8fbc-601095cfe510'
 -H 'X-Auth-Token:
WyutJuakjpSzJ4nj7drptpDfbAb3sKcZWvhF3NqRVXRyUpIHz9QGG_ZeeLi7u7trv7Er2D3vAcSX9LIFpXzz7w'
{
  "error": {
"message": "Cannot find Logical_Switch with
name=6110649a-db2b-4de7-8fbc-601095cfe510",
"code": 404,
"title": "Not Found"
  }
}
[root@ovmgr1 ~]# curl -k -X DELETE   '
https://localhost:9696/v2/networks/8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5'
 -H 'X-Auth-Token:
WyutJuakjpSzJ4nj7drptpDfbAb3sKcZWvhF3NqRVXRyUpIHz9QGG_ZeeLi7u7trv7Er2D3vAcSX9LIFpXzz7w'
{
  "error": {
"message": "Cannot find Logical_Switch with
name=8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5",
"code": 404,
"title": "Not Found"
  }
}
[root@ovmgr1 ~]#

Is there a command to get the supposed list?

Thanks for your help.
I'm also available to completely reset the OVN config if there is a way for
it...

Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/Q4HGOPNRBWRALML6A4UNFR7B4LYX643N/


[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Miguel Duarte de Mora Barroso
On Tue, Mar 19, 2019 at 3:09 PM Gianluca Cecchi
 wrote:
>
>
>
> On Tue, Mar 19, 2019 at 2:51 PM Miguel Duarte de Mora Barroso 
>  wrote:
>>
>> On Tue, Mar 19, 2019 at 2:15 PM Gianluca Cecchi
>>  wrote:
>> >
>> > On Tue, Mar 19, 2019 at 10:25 AM Miguel Duarte de Mora Barroso 
>> >  wrote:
>> >>
>> >> >>
>> >> >>
>> >> >> @Gianluca Cecchi , I notice that one of your duplicate networks -
>> >> >> 'ovn192'  - has no ports attached. That makes it the perfect candidate
>> >> >> to be deleted, and see if it becomes 'listable' on engine. That would
>> >> >> help rule out the 'duplicate name' theory.
>> >> >
>> >> >
>> >> >  I can try. Can you give me the command to be run?
>> >> > It is a test oVirt so It would be not a big problem in case of failures 
>> >> > in this respect.
>> >>
>> >> You can delete it via the UI; just be sure to delete the one without
>> >> ports - it's external ID is 6110649a-db2b-4de7-8fbc-601095cfe510.
>> >>
>> >> It will ask you if you also want to delete it from the external
>> >> provider, say yes.
>> >
>> >
>> >
>> > Inside the GUI I see only one ovn192 network and one ovn172 network and 
>> > their external ids don't match the ones without ports...
>> >
>> > - ovn192
>> > Id: 8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5
>> > External ID: 32367d8a-460f-4447-b35a-abe9ea5187e0
>> >
>> > - ovn172
>> > Id: 7546d5d3-a0e3-40d5-9d22-cf355da47d3a
>> > External ID: 64c4c17f-cd67-4e29-939e-2b952495159f
>> >
>> > So I think I have to delete from command line
>>
>> Check pastebin [0],  with it you can safely delete those 2 networks.
>> Last course of action would be to delete via ovn-nbctl - e.g.
>> ovn-nbctl destroy logical_switch  - but hopefully it won't
>> come to that.
>>
>> [0] - https://paste.fedoraproject.org/paste/mxVUEJZWxG-QHX0mJO1VhA
>>
>> >
>> > Gianluca Cecchi
>> >
>
>
>
> I get this error from the first part where I should get  the token id
> {
>   "error": {
> "message": "No JSON object could be decoded",
> "code": 400,
> "title": "Bad Request"
>   }
> }
>
> In your command there is:
>
>   -H 'Postman-Token: 87fa50fd-0d06-497d-b2ac-b66b78ad90b8' \

Remove that, sorry for not noticing it before. Also get rid of the
'Cache-Control: no-cache' header.

The request thus becomes:
curl -k -X POST \
  https://localhost:35357/v2.0/tokens \
  -H 'Content-Type: application/json' \
  -d '{
    "auth": {
        "passwordCredentials": {
            "username": "<username>",
            "password": "<password>"
        }
    }
}'

>
> what is that sequence? where did you get it?
> Also, inside the credential section
>
> "username": ,
> "password": YYY
>
> do I have to put my username and password inside single/double quotes or 
> nothing?
> eg admin@internal or "admin@internal" or what?
>

Between quotes - e.g. "admin@internal" and "whatever-password-you-have".

> Thanks,
> Gianluca
>
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IBMCNPUMP25FMLLHREJPOAOOFSYWARQB/


[ovirt-users] Re: [ANN] oVirt 4.3.2 is now generally available

2019-03-19 Thread Sandro Bonazzola
Il giorno mar 19 mar 2019 alle ore 10:59 Sandro Bonazzola <
sbona...@redhat.com> ha scritto:

> The oVirt Project is pleased to announce the general availability of oVirt
> 4.3.2, as of March 19th, 2019.
>
> This update is the second in a series of stabilization updates to the 4.3
> series.
>
> This release is available now on x86_64 architecture for:
> * Red Hat Enterprise Linux 7.6 or later
> * CentOS Linux (or similar) 7.6 or later
>
> This release supports Hypervisor Hosts on x86_64 and ppc64le architectures
> for:
> * Red Hat Enterprise Linux 7.6 or later
> * CentOS Linux (or similar) 7.6 or later
> * oVirt Node 4.3 (available for x86_64 only)
>
> Experimental tech preview for x86_64 and s390x architectures for Fedora 28
> is also included.
>
> See the release notes [1] for installation / upgrade instructions and
> a list of new features and bugs fixed.
>
> Notes:
> - oVirt Appliance is already available
> - oVirt Node is already available[2]
>
> oVirt Node has been updated including:
> - oVirt 4.3.2: http://www.ovirt.org/release/4.3.2/
> - Latest CentOS updates (no relevant errata available up to now on
> https://lists.centos.org/pipermail/centos-announce )
>

Relevant errata have been published:
CESA-2019:0512 Important CentOS 7 kernel Security Update
CESA-2019:0483 Moderate CentOS 7 openssl Security Update

>
> Additional Resources:
> * Read more about the oVirt 4.3.2 release highlights:
> http://www.ovirt.org/release/4.3.2/
> * Get more oVirt Project updates on Twitter: https://twitter.com/ovirt
> * Check out the latest project news on the oVirt blog:
> http://www.ovirt.org/blog/
>
> [1] http://www.ovirt.org/release/4.3.2/
> [2] http://resources.ovirt.org/pub/ovirt-4.3/iso/
>
>
> --
>
> SANDRO BONAZZOLA
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>
> Red Hat EMEA 
>
> sbona...@redhat.com
> 
>


-- 

SANDRO BONAZZOLA

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3M4OWHSARGRXWJTXMYVGECDY5VXFKSWB/


[ovirt-users] Re: Hosted -engine is down and cannot be restarted

2019-03-19 Thread ada per
Should I still upgrade it? I am currently on version 4.3.1.1-1.el7.

On Tue, Mar 19, 2019 at 4:17 PM ada per  wrote:

> After changing the ownership the engine is up!!
>
> thanks for your help!!!:)
>
> On Tue, Mar 19, 2019 at 3:25 PM Simone Tiraboschi 
> wrote:
>
>>
>>
>> On Tue, Mar 19, 2019 at 2:21 PM ada per  wrote:
>>
>>> Thanks for you reply.
>>>
>>> Can you please provide step by step instructions on how to upgrade the
>>> vdsm from a node command line?
>>>
>>
>> Can you please report the version of vdsm you are using?
>>
>> then check the ownership of
>>
>> /rhev/data-center/----/05b2b2d5-a80e-4622-9410-8e1e9d362f3f/images/bb890447-f1f7-46af-8e57-543d61f0bd08/81685d19-0060-4f5d-a4cd-c5efa24aecfe
>>
>> if it's not vdsm:kvm, change it and then try again with hosted-engine
>> --vm-start
>>
>>
>>>
>>> On Tue, Mar 19, 2019 at 2:49 PM Simone Tiraboschi 
>>> wrote:
>>>
 Hi Ada,
 here the error:

 2019-03-19 14:08:25,833+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer]
 RPC call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
 2019-03-19 14:08:25,839+0200 INFO  (vm/a492d2eb) [vdsm.api] FINISH
 prepareImage error=Volume does not exist:
 (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal,
 task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
 2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb)
 [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
 Unexpected error (task:875)
 Traceback (most recent call last):
   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line
 882, in _run
 return fn(*args, **kargs)
   File "", line 2, in prepareImage
   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50,
 in method
 ret = func(*args, **kwargs)
   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line
 3199, in prepareImage
 legality = dom.produceVolume(imgUUID, volUUID).getLegality()
   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 822,
 in produceVolume
 volUUID)
   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
 801, in __init__
 self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID,
 volUUID)
   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
 line 71, in __init__
 volUUID)
   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
 86, in __init__
 self.validate()
   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
 112, in validate
 self.validateVolumePath()
   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
 line 131, in validateVolumePath
 raise se.VolumeDoesNotExist(self.volUUID)
 VolumeDoesNotExist: Volume does not exist:
 (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
 2019-03-19 14:08:25,840+0200 INFO  (vm/a492d2eb)
 [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
 aborting: Task is aborted: "Volume does not exist:
 (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code 201 (task:1181)
 2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher]
 FINISH prepareImage error=Volume does not exist:
 (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)

 I think it's still https://bugzilla.redhat.com/1666795
 

 Can you please try updating vdsm to vdsm-4.30.10 since the bug is
 reported as solved in that version?




 On Tue, Mar 19, 2019 at 12:30 PM ada per  wrote:

> an vdsm:
>
>
>
>
>
> On Tue, Mar 19, 2019 at 1:24 PM ada per  wrote:
>
>> Thank you! please see attached files:
>>
>> On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi <
>> stira...@redhat.com> wrote:
>>
>>> Can you please check/attach also
>>> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log ?
>>>
>>> On Tue, Mar 19, 2019 at 11:36 AM ada per  wrote:
>>>
 Hello everyone,

 For a strange reason the hosted engine went down and I cannot
 restart it.  I tried manually restarting it without any success can you
 please advice?

 For all the nodes the engine status is the same as the one below.
 --== Host nodex. (id: 6) status ==--
 conf_on_shared_storage : True
 Status up-to-date  : True
 Hostname   : nodex
 Host ID: 6
 Engine status  : {"reason": "bad vm status",
 "health": "bad", "vm": "down_unexpected", "detail": "Down"}
 Score  : 3400
 stopped: False
 Local maintenance 

[ovirt-users] Re: Hosted -engine is down and cannot be restarted

2019-03-19 Thread ada per
After changing the ownership the engine is up!!

thanks for your help!!!:)

On Tue, Mar 19, 2019 at 3:25 PM Simone Tiraboschi 
wrote:

>
>
> On Tue, Mar 19, 2019 at 2:21 PM ada per  wrote:
>
>> Thanks for you reply.
>>
>> Can you please provide step by step instructions on how to upgrade the
>> vdsm from a node command line?
>>
>
> Can you please report the version of vdsm you are using?
>
> then check the ownership of
>
> /rhev/data-center/----/05b2b2d5-a80e-4622-9410-8e1e9d362f3f/images/bb890447-f1f7-46af-8e57-543d61f0bd08/81685d19-0060-4f5d-a4cd-c5efa24aecfe
>
> if it's not vdsm:kvm, change it and then try again with hosted-engine
> --vm-start
>
>
>>
>> On Tue, Mar 19, 2019 at 2:49 PM Simone Tiraboschi 
>> wrote:
>>
>>> Hi Ada,
>>> here the error:
>>>
>>> 2019-03-19 14:08:25,833+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer]
>>> RPC call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
>>> 2019-03-19 14:08:25,839+0200 INFO  (vm/a492d2eb) [vdsm.api] FINISH
>>> prepareImage error=Volume does not exist:
>>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal,
>>> task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
>>> 2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb)
>>> [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
>>> Unexpected error (task:875)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line
>>> 882, in _run
>>> return fn(*args, **kargs)
>>>   File "", line 2, in prepareImage
>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50,
>>> in method
>>> ret = func(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line
>>> 3199, in prepareImage
>>> legality = dom.produceVolume(imgUUID, volUUID).getLegality()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 822,
>>> in produceVolume
>>> volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>>> 801, in __init__
>>> self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID,
>>> volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>>> line 71, in __init__
>>> volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>>> 86, in __init__
>>> self.validate()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>>> 112, in validate
>>> self.validateVolumePath()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>>> line 131, in validateVolumePath
>>> raise se.VolumeDoesNotExist(self.volUUID)
>>> VolumeDoesNotExist: Volume does not exist:
>>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
>>> 2019-03-19 14:08:25,840+0200 INFO  (vm/a492d2eb)
>>> [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
>>> aborting: Task is aborted: "Volume does not exist:
>>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code 201 (task:1181)
>>> 2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher]
>>> FINISH prepareImage error=Volume does not exist:
>>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)
>>>
>>> I think it's still https://bugzilla.redhat.com/1666795
>>> 
>>>
>>> Can you please try updating vdsm to vdsm-4.30.10 since the bug is
>>> reported as solved in that version?
>>>
>>>
>>>
>>>
>>> On Tue, Mar 19, 2019 at 12:30 PM ada per  wrote:
>>>
 an vdsm:





 On Tue, Mar 19, 2019 at 1:24 PM ada per  wrote:

> Thank you! please see attached files:
>
> On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi <
> stira...@redhat.com> wrote:
>
>> Can you please check/attach also
>> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log ?
>>
>> On Tue, Mar 19, 2019 at 11:36 AM ada per  wrote:
>>
>>> Hello everyone,
>>>
>>> For a strange reason the hosted engine went down and I cannot
>>> restart it.  I tried manually restarting it without any success can you
>>> please advice?
>>>
>>> For all the nodes the engine status is the same as the one below.
>>> --== Host nodex. (id: 6) status ==--
>>> conf_on_shared_storage : True
>>> Status up-to-date  : True
>>> Hostname   : nodex
>>> Host ID: 6
>>> Engine status  : {"reason": "bad vm status",
>>> "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>>> Score  : 3400
>>> stopped: False
>>> Local maintenance  : False
>>> crc32  : 323a9f45
>>> local_conf_timestamp   : 2648874
>>> Host timestamp : 2648874
>>> Extra metadata (valid at timestamp):
>>> 

[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Gianluca Cecchi
On Tue, Mar 19, 2019 at 2:51 PM Miguel Duarte de Mora Barroso <
mdbarr...@redhat.com> wrote:

> On Tue, Mar 19, 2019 at 2:15 PM Gianluca Cecchi
>  wrote:
> >
> > On Tue, Mar 19, 2019 at 10:25 AM Miguel Duarte de Mora Barroso <
> mdbarr...@redhat.com> wrote:
> >>
> >> >>
> >> >>
> >> >> @Gianluca Cecchi , I notice that one of your duplicate networks -
> >> >> 'ovn192'  - has no ports attached. That makes it the perfect
> candidate
> >> >> to be deleted, and see if it becomes 'listable' on engine. That would
> >> >> help rule out the 'duplicate name' theory.
> >> >
> >> >
> >> >  I can try. Can you give me the command to be run?
> >> > It is a test oVirt so It would be not a big problem in case of
> failures in this respect.
> >>
> >> You can delete it via the UI; just be sure to delete the one without
> >> ports - it's external ID is 6110649a-db2b-4de7-8fbc-601095cfe510.
> >>
> >> It will ask you if you also want to delete it from the external
> >> provider, say yes.
> >
> >
> >
> > Inside the GUI I see only one ovn192 network and one ovn172 network and
> their external ids don't match the ones without ports...
> >
> > - ovn192
> > Id: 8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5
> > External ID: 32367d8a-460f-4447-b35a-abe9ea5187e0
> >
> > - ovn172
> > Id: 7546d5d3-a0e3-40d5-9d22-cf355da47d3a
> > External ID: 64c4c17f-cd67-4e29-939e-2b952495159f
> >
> > So I think I have to delete from command line
>
> Check pastebin [0],  with it you can safely delete those 2 networks.
> Last course of action would be to delete via ovn-nbctl - e.g.
> ovn-nbctl destroy logical_switch  - but hopefully it won't
> come to that.
>
> [0] - https://paste.fedoraproject.org/paste/mxVUEJZWxG-QHX0mJO1VhA
>
> >
> > Gianluca Cecchi
> >
>


I get this error from the first part, where I should get the token id:
{
  "error": {
"message": "No JSON object could be decoded",
"code": 400,
"title": "Bad Request"
  }
}

In your command there is:

  -H 'Postman-Token: 87fa50fd-0d06-497d-b2ac-b66b78ad90b8' \

what is that sequence? where did you get it?
Also, inside the credential section

"username": ,
"password": YYY

do I have to put my username and password inside single/double quotes or
nothing?
eg admin@internal or "admin@internal" or what?

Thanks,
Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/5WYYBO5D6XTG7ZVQQOM3IH2ILSMVMNJJ/


[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Miguel Duarte de Mora Barroso
On Tue, Mar 19, 2019 at 2:15 PM Gianluca Cecchi
 wrote:
>
> On Tue, Mar 19, 2019 at 10:25 AM Miguel Duarte de Mora Barroso 
>  wrote:
>>
>> >>
>> >>
>> >> @Gianluca Cecchi , I notice that one of your duplicate networks -
>> >> 'ovn192'  - has no ports attached. That makes it the perfect candidate
>> >> to be deleted, and see if it becomes 'listable' on engine. That would
>> >> help rule out the 'duplicate name' theory.
>> >
>> >
>> >  I can try. Can you give me the command to be run?
>> > It is a test oVirt so It would be not a big problem in case of failures in 
>> > this respect.
>>
>> You can delete it via the UI; just be sure to delete the one without
>> ports - it's external ID is 6110649a-db2b-4de7-8fbc-601095cfe510.
>>
>> It will ask you if you also want to delete it from the external
>> provider, say yes.
>
>
>
> Inside the GUI I see only one ovn192 network and one ovn172 network and their 
> external ids don't match the ones without ports...
>
> - ovn192
> Id: 8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5
> External ID: 32367d8a-460f-4447-b35a-abe9ea5187e0
>
> - ovn172
> Id: 7546d5d3-a0e3-40d5-9d22-cf355da47d3a
> External ID: 64c4c17f-cd67-4e29-939e-2b952495159f
>
> So I think I have to delete from command line

Check pastebin [0],  with it you can safely delete those 2 networks.
Last course of action would be to delete via ovn-nbctl - e.g.
ovn-nbctl destroy logical_switch  - but hopefully it won't
come to that.

[0] - https://paste.fedoraproject.org/paste/mxVUEJZWxG-QHX0mJO1VhA
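
For that ovn-nbctl fallback, a sketch of what it would look like on the host running the OVN
northbound database (the UUID is whatever ls-list reports for the stale switch):

ovn-nbctl ls-list
ovn-nbctl destroy Logical_Switch <uuid-of-stale-switch>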

>
> Gianluca Cecchi
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IKKBNNW5KLNBBHH2AFWYRYW3NVIJP3QJ/


[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-19 Thread Juhani Rautiainen
On Tue, Mar 19, 2019 at 3:01 PM Simone Tiraboschi  wrote:
>

>>
>> No failed pings to be seen. So how that ping.py decides that 4 out of 5 
>> failed??
>
>
> It's just calling the system ping utility as an external process checking the 
> exit code.
> I don't see any issue with that approach.

I was looking at the same thing, but I can also see that the packets reach
the host NIC. I just read the times again and it seems that the first ping
was delayed (took over 2 secs). So is that '4 out of 5' the number of
pings that succeeded? Because I read it the other way.

> Can you please try executing:
>
> while true;
>do ping -c 1 -W 2 10.168.8.1 > /dev/null; echo $?; sleep 0.5;
> done

I'll try this tomorrow during the expected failure time.

Thanks,
Juhani
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/T4BR74HPP6ADXEQTC7SHK36DXUUY4UDB/


[ovirt-users] Re: Hosted -engine is down and cannot be restarted

2019-03-19 Thread Simone Tiraboschi
On Tue, Mar 19, 2019 at 2:21 PM ada per  wrote:

> Thanks for you reply.
>
> Can you please provide step by step instructions on how to upgrade the
> vdsm from a node command line?
>

Can you please report the version of vdsm you are using?

then check the ownership of
/rhev/data-center/----/05b2b2d5-a80e-4622-9410-8e1e9d362f3f/images/bb890447-f1f7-46af-8e57-543d61f0bd08/81685d19-0060-4f5d-a4cd-c5efa24aecfe

if it's not vdsm:kvm, change it and then try again with hosted-engine
--vm-start
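
Concretely, something along these lines (a sketch; use the full volume path from the log above
in place of the placeholder):

rpm -q vdsm                                        # report the vdsm version in use
ls -l <path-to-the-volume-from-the-log>            # check current ownership
chown vdsm:kvm <path-to-the-volume-from-the-log>
hosted-engine --vm-start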


>
> On Tue, Mar 19, 2019 at 2:49 PM Simone Tiraboschi 
> wrote:
>
>> Hi Ada,
>> here the error:
>>
>> 2019-03-19 14:08:25,833+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer]
>> RPC call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
>> 2019-03-19 14:08:25,839+0200 INFO  (vm/a492d2eb) [vdsm.api] FINISH
>> prepareImage error=Volume does not exist:
>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal,
>> task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
>> 2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb)
>> [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
>> Unexpected error (task:875)
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882,
>> in _run
>> return fn(*args, **kargs)
>>   File "", line 2, in prepareImage
>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in
>> method
>> ret = func(*args, **kwargs)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3199,
>> in prepareImage
>> legality = dom.produceVolume(imgUUID, volUUID).getLegality()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 822,
>> in produceVolume
>> volUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 801, in __init__
>> self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID,
>> volUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>> line 71, in __init__
>> volUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 86, in __init__
>> self.validate()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 112, in validate
>> self.validateVolumePath()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>> line 131, in validateVolumePath
>> raise se.VolumeDoesNotExist(self.volUUID)
>> VolumeDoesNotExist: Volume does not exist:
>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
>> 2019-03-19 14:08:25,840+0200 INFO  (vm/a492d2eb)
>> [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
>> aborting: Task is aborted: "Volume does not exist:
>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code 201 (task:1181)
>> 2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher]
>> FINISH prepareImage error=Volume does not exist:
>> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)
>>
>> I think it's still https://bugzilla.redhat.com/1666795
>> 
>>
>> Can you please try updating vdsm to vdsm-4.30.10 since the bug is
>> reported as solved in that version?
>>
>>
>>
>>
>> On Tue, Mar 19, 2019 at 12:30 PM ada per  wrote:
>>
>>> an vdsm:
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Mar 19, 2019 at 1:24 PM ada per  wrote:
>>>
 Thank you! please see attached files:

 On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi 
 wrote:

> Can you please check/attach also
> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log ?
>
> On Tue, Mar 19, 2019 at 11:36 AM ada per  wrote:
>
>> Hello everyone,
>>
>> For a strange reason the hosted engine went down and I cannot restart
>> it.  I tried manually restarting it without any success can you please
>> advice?
>>
>> For all the nodes the engine status is the same as the one below.
>> --== Host nodex. (id: 6) status ==--
>> conf_on_shared_storage : True
>> Status up-to-date  : True
>> Hostname   : nodex
>> Host ID: 6
>> Engine status  : {"reason": "bad vm status",
>> "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>> Score  : 3400
>> stopped: False
>> Local maintenance  : False
>> crc32  : 323a9f45
>> local_conf_timestamp   : 2648874
>> Host timestamp : 2648874
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=2648874 (Tue Mar 19 12:25:44 2019)
>> host-id=6
>> score=3400
>> vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
>> conf_on_shared_storage=True
>>

[ovirt-users] Re: Hosted -engine is down and cannot be restarted

2019-03-19 Thread ada per
Thanks for your reply.

Can you please provide step-by-step instructions on how to upgrade vdsm
from a node's command line?

On Tue, Mar 19, 2019 at 2:49 PM Simone Tiraboschi 
wrote:

> Hi Ada,
> here the error:
>
> 2019-03-19 14:08:25,833+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
> call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
> 2019-03-19 14:08:25,839+0200 INFO  (vm/a492d2eb) [vdsm.api] FINISH
> prepareImage error=Volume does not exist:
> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal,
> task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
> 2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb)
> [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
> Unexpected error (task:875)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882,
> in _run
> return fn(*args, **kargs)
>   File "", line 2, in prepareImage
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in
> method
> ret = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3199,
> in prepareImage
> legality = dom.produceVolume(imgUUID, volUUID).getLegality()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 822, in
> produceVolume
> volUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
> 801, in __init__
> self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line
> 71, in __init__
> volUUID)
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 86,
> in __init__
> self.validate()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
> 112, in validate
> self.validateVolumePath()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line
> 131, in validateVolumePath
> raise se.VolumeDoesNotExist(self.volUUID)
> VolumeDoesNotExist: Volume does not exist:
> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
> 2019-03-19 14:08:25,840+0200 INFO  (vm/a492d2eb)
> [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257')
> aborting: Task is aborted: "Volume does not exist:
> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code 201 (task:1181)
> 2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher]
> FINISH prepareImage error=Volume does not exist:
> (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)
>
> I think it's still https://bugzilla.redhat.com/1666795
> 
>
> Can you please try updating vdsm to vdsm-4.30.10 since the bug is reported
> as solved in that version?
>
>
>
>
> On Tue, Mar 19, 2019 at 12:30 PM ada per  wrote:
>
>> an vdsm:
>>
>>
>>
>>
>>
>> On Tue, Mar 19, 2019 at 1:24 PM ada per  wrote:
>>
>>> Thank you! please see attached files:
>>>
>>> On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi 
>>> wrote:
>>>
 Can you please check/attach also
 /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log ?

 On Tue, Mar 19, 2019 at 11:36 AM ada per  wrote:

> Hello everyone,
>
> For a strange reason the hosted engine went down and I cannot restart
> it.  I tried manually restarting it without any success can you please
> advice?
>
> For all the nodes the engine status is the same as the one below.
> --== Host nodex. (id: 6) status ==--
> conf_on_shared_storage : True
> Status up-to-date  : True
> Hostname   : nodex
> Host ID: 6
> Engine status  : {"reason": "bad vm status",
> "health": "bad", "vm": "down_unexpected", "detail": "Down"}
> Score  : 3400
> stopped: False
> Local maintenance  : False
> crc32  : 323a9f45
> local_conf_timestamp   : 2648874
> Host timestamp : 2648874
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=2648874 (Tue Mar 19 12:25:44 2019)
> host-id=6
> score=3400
> vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
> conf_on_shared_storage=True
> maintenance=False
> state=GlobalMaintenance
> stopped=False
>
> When I try the commands
> root@node5# hosted-engine --vm-shutdown
> I ge the response:
> root@node5# Command VM.shutdown with args {'delay': '120', 'message':
> 'VM is shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'}
> failed:(code=1, message=Virtual machine does not exist)
>
> But when I run  : hosted-engine --vm-start
> I get the response: VM exists and is down, 

[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Gianluca Cecchi
On Tue, Mar 19, 2019 at 9:37 AM Marcin Mirecki  wrote:

[snip]


> I think it could be related to the situation described here (it is the
>> same environment, in the meantime updated also from 4.2.8 to 4.3.1) and
>> previous configuration not backed up at that time:
>>
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/32S5L4JKHGPHE2XIQMLRIVLOXRG4CHW3/
>>
>> and some steps not done correctly by me.
>> After following indications, I tried to import ovn but probably I did it
>> wrong.
>>
>
>
> Is it possible that you added new networks, instead of importing the old
> ones?
> If so the old networks would just stay in the database, and we would have
> duplicated networks like you have now.
>
>
It could be, but I don't remember now, sorry
Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WGEBOHHETTY5K5GDBKZMW3SUCOBHAPNB/


[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Gianluca Cecchi
On Tue, Mar 19, 2019 at 10:25 AM Miguel Duarte de Mora Barroso <
mdbarr...@redhat.com> wrote:

> >>
> >>
> >> @Gianluca Cecchi , I notice that one of your duplicate networks -
> >> 'ovn192'  - has no ports attached. That makes it the perfect candidate
> >> to be deleted, and see if it becomes 'listable' on engine. That would
> >> help rule out the 'duplicate name' theory.
> >
> >
> >  I can try. Can you give me the command to be run?
> > It is a test oVirt so It would be not a big problem in case of failures
> in this respect.
>
> You can delete it via the UI; just be sure to delete the one without
> ports - it's external ID is 6110649a-db2b-4de7-8fbc-601095cfe510.
>
> It will ask you if you also want to delete it from the external
> provider, say yes.
>


Inside the GUI I see only one ovn192 network and one ovn172 network and
their external ids don't match the ones without ports...

- ovn192
Id: 8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5
External ID: 32367d8a-460f-4447-b35a-abe9ea5187e0

- ovn172
Id: 7546d5d3-a0e3-40d5-9d22-cf355da47d3a
External ID: 64c4c17f-cd67-4e29-939e-2b952495159f

So I think I have to delete from command line

Gianluca Cecchi
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YALFNP4M6DZLNDC2THZNF5VYJNIKQBEC/


[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-19 Thread Simone Tiraboschi
On Tue, Mar 19, 2019 at 1:32 PM Juhani Rautiainen <
juhani.rautiai...@gmail.com> wrote:

> On Tue, Mar 19, 2019 at 1:33 PM Juhani Rautiainen
>  wrote:
> >
> > On Tue, Mar 19, 2019 at 12:46 PM Juhani Rautiainen
> >
> > It seems that either our firewall is not responding to pings or
> > something else is wrong. Looking at the broker.log this can be seen.
> The curious thing is that the reboot happens even when the ping comes back in
> a couple of seconds. Is there a timeout in ping, or does it fire them in
> quick succession?
>
> I don't know much of Python, but I think there is a problem with
> broker/ping.py. I noticed that these ping failures happen every
> fifteen minutes:
>
> [root@ovirt01 ~]# grep Failed /var/log/ovirt-hosted-engine-ha/broker.log
> Thread-1::WARNING::2019-03-19
> 14:04:44,898::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1,
> (4 out of 5)
> Thread-1::WARNING::2019-03-19
> 14:19:38,891::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1,
> (4 out of 5)
>
> I monitored the firewall and network traffic in host and ping works
> but that ping.py somehow thinks that it did not get replies. I can't
> see anything obvious in the code. But this is from tcpdump from that
> last failure time frame:
>
> 14:19:22.598518 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19055, seq 1, length 64
> 14:19:22.598705 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19055, seq 1, length 64
> 14:19:23.126800 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19056, seq 1, length 64
> 14:19:23.126978 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19056, seq 1, length 64
> 14:19:23.653544 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19057, seq 1, length 64
> 14:19:23.653731 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19057, seq 1, length 64
> 14:19:24.180846 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19058, seq 1, length 64
> 14:19:24.181042 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19058, seq 1, length 64
> 14:19:24.708083 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19065, seq 1, length 64
> 14:19:24.708274 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19065, seq 1, length 64
> 14:19:32.743986 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19141, seq 1, length 64
> 14:19:35.160398 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19141, seq 1, length 64
> 14:19:35.271171 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19152, seq 1, length 64
> 14:19:35.365315 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19152, seq 1, length 64
> 14:19:35.892716 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19154, seq 1, length 64
> 14:19:36.002087 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19154, seq 1, length 64
> 14:19:36.529263 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19156, seq 1, length 64
> 14:19:38.359281 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19156, seq 1, length 64
> 14:19:38.887231 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19201, seq 1, length 64
> 14:19:38.889774 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19201, seq 1, length 64
> 14:19:42.923684 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19234, seq 1, length 64
> 14:19:42.923951 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19234, seq 1, length 64
> 14:19:43.450788 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19235, seq 1, length 64
> 14:19:43.450968 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19235, seq 1, length 64
> 14:19:43.977791 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19237, seq 1, length 64
> 14:19:43.977965 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19237, seq 1, length 64
> 14:19:44.504541 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19238, seq 1, length 64
> 14:19:44.504715 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19238, seq 1, length 64
> 14:19:45.031570 IP ovirt01.virt.local > gateway: ICMP echo request, id
> 19244, seq 1, length 64
> 14:19:45.031752 IP gateway > ovirt01.virt.local: ICMP echo reply, id
> 19244, seq 1, length 64
>
> No failed pings to be seen. So how does that ping.py decide that 4 out
> of 5 failed?
>

It's just calling the system ping utility as an external process and
checking the exit code.
I don't see any issue with that approach.

Can you please try executing:

while true;
   do ping -c 1 -W 2 10.168.8.1 > /dev/null; echo $?; sleep 0.5;
done
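
And, if you want to reproduce the broker's "N out of 5" counting, a rough shell
equivalent (assuming a non-zero exit code is what counts as a failed attempt; the
actual options and retry logic in broker/ping.py may differ):

failed=0
for i in 1 2 3 4 5; do
   # count the attempts where ping exits non-zero
   ping -c 1 -W 2 10.168.8.1 > /dev/null || failed=$((failed+1))
done
echo "$failed out of 5 failed"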


>
> Thanks,
>   Juhani
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/UH7MKGQECM2VSI77DNRHQB56C76FJBTY/
>
___

[ovirt-users] Host unresponsive after upgrade 4.2.8 -> 4.3.2 failed

2019-03-19 Thread Artem Tambovskiy
Hello,

Just started upgrading my small cluster from 4.2.8 to 4.3.2 and ended up in
a situation where one of the hosts is not working after the upgrade.
For some reason vdsmd is not starting up; I have tried to restart it
manually with no luck:

Any ideas on what could be the reason?

[root@ovirt2 log]# systemctl restart vdsmd
A dependency job for vdsmd.service failed. See 'journalctl -xe' for details.
[root@ovirt2 log]# journalctl -xe
-- Unit ovirt-ha-agent.service has finished shutting down.
Mar 19 15:47:47 ovirt2.domain.org systemd[1]: Starting Virtual Desktop
Server Manager...
-- Subject: Unit vdsmd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit vdsmd.service has begun starting up.
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm:
Running mkdirs
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm:
Running configure_coredump
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm:
Running configure_vdsm_logs
Mar 19 15:47:47 ovirt2.domain.org vdsmd_init_common.sh[56717]: vdsm:
Running wait_for_network
Mar 19 15:47:47 ovirt2.domain.org supervdsmd[56716]: Supervdsm failed to
start: 'module' object has no attribute 'Accounting'
Mar 19 15:47:47 ovirt2.domain.org python2[56716]: detected unhandled Python
exception in '/usr/share/vdsm/supervdsmd'
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: Duplicate: core
backtrace
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: DUP_OF_DIR:
/var/tmp/abrt/Python-2019-03-19-14:23:04-17292
Mar 19 15:47:48 ovirt2.domain.org abrt-server[56745]: Deleting problem
directory Python-2019-03-19-15:47:47-56716 (dup of
Python-2019-03-19-14:23:04-17292
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: Traceback (most
recent call last):
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File
"/usr/share/vdsm/supervdsmd", line 26, in 
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]:
supervdsm_server.main(sys.argv[1:])
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File
"/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 294, in
main
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: module_name))
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File
"/usr/lib64/python2.7/importlib/__init__.py", line 37, in import_module
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: __import__(name)
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: File
"/usr/lib/python2.7/site-packages/vdsm/supervdsm_api/systemd.py", line 34,
in 
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]:
cmdutils.Accounting.CPU,
Mar 19 15:47:49 ovirt2.domain.org daemonAdapter[56716]: AttributeError:
'module' object has no attribute 'Accounting'
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service: main
process exited, code=exited, status=1/FAILURE
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Unit supervdsmd.service
entered failed state.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service failed.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: supervdsmd.service holdoff
time over, scheduling restart.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Cannot add dependency job for
unit lvm2-lvmetad.socket, ignoring: Unit is masked.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Stopped Auxiliary vdsm
service for running helper functions as root.
-- Subject: Unit supervdsmd.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit supervdsmd.service has finished shutting down.
Mar 19 15:47:49 ovirt2.domain.org systemd[1]: Started Auxiliary vdsm
service for running helper functions as root.
-- Subject: Unit supervdsmd.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit supervdsmd.service has finished starting up.
-- 
-- The start-up result is done.
Mar 19 15:47:50 ovirt2.domain.org supervdsmd[56757]: Supervdsm failed to
start: 'module' object has no attribute 'Accounting'
Mar 19 15:47:50 ovirt2.domain.org python2[56757]: detected unhandled Python
exception in '/usr/share/vdsm/supervdsmd'


-- 
Regards,
Artem
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RXQ7ZH2EZ74CO3VID7PXAXO6CHK4BXH3/


[ovirt-users] Re: Hosted -engine is down and cannot be restarted

2019-03-19 Thread Simone Tiraboschi
Hi Ada,
here is the error:

2019-03-19 14:08:25,833+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
2019-03-19 14:08:25,839+0200 INFO  (vm/a492d2eb) [vdsm.api] FINISH
prepareImage error=Volume does not exist:
(u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal,
task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb) [storage.TaskManager.Task]
(Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882,
in _run
return fn(*args, **kargs)
  File "", line 2, in prepareImage
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in
method
ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3199,
in prepareImage
legality = dom.produceVolume(imgUUID, volUUID).getLegality()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 822, in
produceVolume
volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 801,
in __init__
self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line
71, in __init__
volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 86,
in __init__
self.validate()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 112,
in validate
self.validateVolumePath()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line
131, in validateVolumePath
raise se.VolumeDoesNotExist(self.volUUID)
VolumeDoesNotExist: Volume does not exist:
(u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
2019-03-19 14:08:25,840+0200 INFO  (vm/a492d2eb) [storage.TaskManager.Task]
(Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257') aborting: Task is aborted:
"Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code
201 (task:1181)
2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher]
FINISH prepareImage error=Volume does not exist:
(u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)

I think it's still https://bugzilla.redhat.com/1666795


Can you please try updating vdsm to vdsm-4.30.10 since the bug is reported
as solved in that version?
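
Something along these lines on the affected host should be enough (a rough sketch; the
exact repositories depend on your setup):

yum clean metadata
yum update 'vdsm*'        # should bring in vdsm-4.30.10 or newer from the oVirt 4.3 repos
rpm -q vdsm               # confirm the installed version
systemctl restart vdsmd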




On Tue, Mar 19, 2019 at 12:30 PM ada per  wrote:

> and vdsm:
>
>
>
>
>
> On Tue, Mar 19, 2019 at 1:24 PM ada per  wrote:
>
>> Thank you! please see attached files:
>>
>> On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi 
>> wrote:
>>
>>> Can you please check/attach also
>>> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log ?
>>>
>>> On Tue, Mar 19, 2019 at 11:36 AM ada per  wrote:
>>>
 Hello everyone,

 For a strange reason the hosted engine went down and I cannot restart
 it.  I tried manually restarting it without any success; can you please
 advise?

 For all the nodes the engine status is the same as the one below.
 --== Host nodex. (id: 6) status ==--
 conf_on_shared_storage : True
 Status up-to-date  : True
 Hostname   : nodex
 Host ID: 6
 Engine status  : {"reason": "bad vm status",
 "health": "bad", "vm": "down_unexpected", "detail": "Down"}
 Score  : 3400
 stopped: False
 Local maintenance  : False
 crc32  : 323a9f45
 local_conf_timestamp   : 2648874
 Host timestamp : 2648874
 Extra metadata (valid at timestamp):
 metadata_parse_version=1
 metadata_feature_version=1
 timestamp=2648874 (Tue Mar 19 12:25:44 2019)
 host-id=6
 score=3400
 vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
 conf_on_shared_storage=True
 maintenance=False
 state=GlobalMaintenance
 stopped=False

 When I try the commands
 root@node5# hosted-engine --vm-shutdown
 I get the response:
 root@node5# Command VM.shutdown with args {'delay': '120', 'message':
 'VM is shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'}
 failed:(code=1, message=Virtual machine does not exist)

 But when I run  : hosted-engine --vm-start
 I get the response: VM exists and is down, cleaning up and restarting



 Below you can see the # journalctl -u ovirt-ha-agent logs

 Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent
 ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled
 monitoring loop exception
   Traceback
 

[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-19 Thread Juhani Rautiainen
On Tue, Mar 19, 2019 at 1:33 PM Juhani Rautiainen
 wrote:
>
> On Tue, Mar 19, 2019 at 12:46 PM Juhani Rautiainen
>
> It seems that either our firewall is not responding to pings or
> something else is wrong. Looking at the broker.log this can be seen.
> The curious thing is that the reboot happens even when the ping comes back in
> a couple of seconds. Is there a timeout in ping, or does it fire them in
> quick succession?

I don't know much of Python, but I think there is a problem with
broker/ping.py. I noticed that these ping failures happen every
fifteen minutes:

[root@ovirt01 ~]# grep Failed /var/log/ovirt-hosted-engine-ha/broker.log
Thread-1::WARNING::2019-03-19
14:04:44,898::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1,
(4 out of 5)
Thread-1::WARNING::2019-03-19
14:19:38,891::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1,
(4 out of 5)

I monitored the firewall and network traffic in host and ping works
but that ping.py somehow thinks that it did not get replies. I can't
see anything obvious in the code. But this is from tcpdump from that
last failure time frame:

14:19:22.598518 IP ovirt01.virt.local > gateway: ICMP echo request, id
19055, seq 1, length 64
14:19:22.598705 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19055, seq 1, length 64
14:19:23.126800 IP ovirt01.virt.local > gateway: ICMP echo request, id
19056, seq 1, length 64
14:19:23.126978 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19056, seq 1, length 64
14:19:23.653544 IP ovirt01.virt.local > gateway: ICMP echo request, id
19057, seq 1, length 64
14:19:23.653731 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19057, seq 1, length 64
14:19:24.180846 IP ovirt01.virt.local > gateway: ICMP echo request, id
19058, seq 1, length 64
14:19:24.181042 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19058, seq 1, length 64
14:19:24.708083 IP ovirt01.virt.local > gateway: ICMP echo request, id
19065, seq 1, length 64
14:19:24.708274 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19065, seq 1, length 64
14:19:32.743986 IP ovirt01.virt.local > gateway: ICMP echo request, id
19141, seq 1, length 64
14:19:35.160398 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19141, seq 1, length 64
14:19:35.271171 IP ovirt01.virt.local > gateway: ICMP echo request, id
19152, seq 1, length 64
14:19:35.365315 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19152, seq 1, length 64
14:19:35.892716 IP ovirt01.virt.local > gateway: ICMP echo request, id
19154, seq 1, length 64
14:19:36.002087 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19154, seq 1, length 64
14:19:36.529263 IP ovirt01.virt.local > gateway: ICMP echo request, id
19156, seq 1, length 64
14:19:38.359281 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19156, seq 1, length 64
14:19:38.887231 IP ovirt01.virt.local > gateway: ICMP echo request, id
19201, seq 1, length 64
14:19:38.889774 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19201, seq 1, length 64
14:19:42.923684 IP ovirt01.virt.local > gateway: ICMP echo request, id
19234, seq 1, length 64
14:19:42.923951 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19234, seq 1, length 64
14:19:43.450788 IP ovirt01.virt.local > gateway: ICMP echo request, id
19235, seq 1, length 64
14:19:43.450968 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19235, seq 1, length 64
14:19:43.977791 IP ovirt01.virt.local > gateway: ICMP echo request, id
19237, seq 1, length 64
14:19:43.977965 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19237, seq 1, length 64
14:19:44.504541 IP ovirt01.virt.local > gateway: ICMP echo request, id
19238, seq 1, length 64
14:19:44.504715 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19238, seq 1, length 64
14:19:45.031570 IP ovirt01.virt.local > gateway: ICMP echo request, id
19244, seq 1, length 64
14:19:45.031752 IP gateway > ovirt01.virt.local: ICMP echo reply, id
19244, seq 1, length 64

No failed pings to be seen. So how does that ping.py decide that 4 out of 5 failed?

Thanks,
  Juhani
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UH7MKGQECM2VSI77DNRHQB56C76FJBTY/


[ovirt-users] Re: Ovirt 4.3.1 problem with HA agent

2019-03-19 Thread Николаев Алексей
Thx for your help, Strahil!
Hmmm, I see DNS resolution failed in hostname without FQDN. I'll try to fix it.

19.03.2019, 09:43, "Strahil" :

Hi Alexei,

>> 1.2 All bricks healed (gluster volume heal data info summary) and no split-brain
>
> gluster volume heal data info
>
> Brick node-msk-gluster203:/opt/gluster/data
> Status: Connected
> Number of entries: 0
>
> Brick node-msk-gluster205:/opt/gluster/data
> 78043-0943-48f8-a4fe-9b23e2ba3404
> 7-1746-471b-a49d-8d824db9fd72
> 8-4370-46ce-b976-ac22d2f680ee
> 9142-7843fd260c70
> Status: Connected
> Number of entries: 7
>
> Brick node-msk-gluster201:/opt/gluster/data
> 78043-0943-48f8-a4fe-9b23e2ba3404
> 7-1746-471b-a49d-8d824db9fd72
> 8-4370-46ce-b976-ac22d2f680ee
> 9142-7843fd260c70
> Status: Connected
> Number of entries: 7

Data needs healing.
Run: gluster volume heal data full

This does not work.

If it still doesn't heal (check in 5 min), go to
/rhev/data-center/mnt/glusterSD/msk-gluster-facility._data
and run 'find . -exec stat {} \;' without the quotes.

Done. https://yadi.sk/i/nXu0RV646YpD6Q

As I have understood you, ovirt Hosted Engine is running and can be started on all nodes except 1.

Ovirt Hosted Engine works and can be run on all nodes with no exceptions.
The Hosted Engine volume /rhev/data-center/mnt/glusterSD/msk-gluster-facility._engine can be mounted by all nodes without problems.

>> 2. Go to the problematic host and check the mount point is there
>
> No mount point on problematic node /rhev/data-center/mnt/glusterSD/msk-gluster-facility.:_data
> If I create a mount point manually, it is deleted after the node is activated.
>
> Other nodes can mount this volume without problems. Only this node has connection problems after the update.
>
> Here is a part of the log at the time of activation of the node:
>
> vdsm log
>
> 2019-03-18 16:46:00,548+0300 INFO  (jsonrpc/5) [vds] Setting Hosted Engine HA local maintenance to False (API:1630)
> 2019-03-18 16:46:00,549+0300 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call Host.setHaMaintenanceMode succeeded in 0.00 seconds (__init__:573)
> 2019-03-18 16:46:00,581+0300 INFO  (jsonrpc/7) [vdsm.api] START connectStorageServer(domType=7, spUUID=u'5a5cca91-01f8-01af-0297-025f', conList=[{u'id': u'5799806e-7969-45da-b17d-b47a63e6a8e4', u'connection': u'msk-gluster-facility.:/data', u'iqn': u'', u'user': u'', u'tpgt': u'1', u'vfs_type': u'glusterfs', u'password': '', u'port': u''}], options=None) from=:::10.77.253.210,56630, flow_id=81524ed, task_id=5f353993-95de-480d-afea-d32dc94fd146 (api:46)
> 2019-03-18 16:46:00,621+0300 INFO  (jsonrpc/7) [storage.StorageServer.MountConnection] Creating directory u'/rhev/data-center/mnt/glusterSD/msk-gluster-facility.:_data' (storageServer:167)
> 2019-03-18 16:46:00,622+0300 INFO  (jsonrpc/7) [storage.fileUtils] Creating directory: /rhev/data-center/mnt/glusterSD/msk-gluster-facility.:_data mode: None (fileUtils:197)
> 2019-03-18 16:46:00,622+0300 WARN  (jsonrpc/7) [storage.StorageServer.MountConnection] gluster server u'msk-gluster-facility.' is not in bricks ['node-msk-gluster203', 'node-msk-gluster205', 'node-msk-gluster201'], possibly mounting duplicate servers (storageServer:317)

This seems very strange. As you have hidden the hostname, I'm not sure which one this is.
Check that DNS can be resolved from all hosts and that the hostname of this host is resolvable.

Name resolution works without problems.

dig msk-gluster-facility.

;; ANSWER SECTION:
msk-gluster-facility.. 1786 IN A    10.77.253.205 # <-- node-msk-gluster205.
msk-gluster-facility.. 1786 IN A    10.77.253.201 # <-- node-msk-gluster201.
msk-gluster-facility.. 1786 IN A    10.77.253.203 # <-- node-msk-gluster203.

;; Query time: 5 msec
;; SERVER: 10.77.16.155#53(10.77.16.155)
;; WHEN: Tue Mar 19 14:55:10 MSK 2019
;; MSG SIZE  rcvd: 110

Also check if it is in the peer list.

msk-gluster-facility. is just an A type record in DNS. It is used on the webUI for mounting gluster volumes and gluster storage HA.

Try to manually mount the gluster volume:
mount -t glusterfs msk-gluster-facility.:/data /mnt

Well, the mount works from hypervisor node77-202.
And it does not work from hypervisor node77-204 (the problematic node).

node77-204 /var/log/glusterfs/mnt.log

[2019-03-19 12:15:11.106226] I [MSGID: 100030] [glusterfsd.c:2511:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.12.15 (args: /usr/sbin/glusterfs --volfile-server=msk-gluster-facility. --volfile-id=/data /mnt)
[2019-03-19 12:15:11.109577] W [MSGID: 101002] [options.c:995:xl_opt_validate] 0-glusterfs: option 'address-family' is deprecated, preferred is 'transport.address-family', continuing with correction
[2019-03-19 12:15:11.129652] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-03-19 12:15:11.135384] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2019-03-19

[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-19 Thread Juhani Rautiainen
On Tue, Mar 19, 2019 at 12:46 PM Juhani Rautiainen
 wrote:
>
>
> Couldn't find anything that jumps as problem but another post in list
> made me check ha-agent logs. This is the reason for reboot:
>
> MainThread::INFO::2019-03-19
> 12:04:41,262::states::135::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Penalizing score by 1600 due to gateway status
> MainThread::INFO::2019-03-19
> 12:04:41,263::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
> Current state EngineUp (score: 1800)
> MainThread::ERROR::2019-03-19
> 12:04:51,283::states::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
> Host ovirt02.virt.local (id 2) score is significantly better than
> local score, shutting down VM on this host
> MainThread::INFO::2019-03-19
> 12:04:51,467::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
> Success, was notification of state_transition (EngineUp-EngineStop)
> sent? sent
> MainThread::INFO::2019-03-19
> 12:04:51,624::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
> Current state EngineStop (score: 3400)
>
> So the HA agent does the reboot. Now the question is: what does that
> 'Penalizing score by 1600 due to gateway status' mean? Other HA VMs
> don't seem to have any problems.

It seems that either our firewall is not responding to pings or
something else is wrong. Looking at the broker.log this can be seen.
Curious thing is that the reboot happens even when ping comes back in
couple of seconds. Is there timeout in ping or does it fire them in
quick succession?

Thread-1::INFO::2019-03-19 12:04:20,244::ping::60::ping.Ping::(action)
Successfully pinged 10.168.8.1
Thread-2::INFO::2019-03-19
12:04:20,567::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-5::INFO::2019-03-19
12:04:24,729::engine_health::242::engine_health.EngineHealth::(_result_from_stats)
VM is up on this host with healthy engine
Thread-2::INFO::2019-03-19
12:04:29,745::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-3::INFO::2019-03-19
12:04:30,166::mem_free::51::mem_free.MemFree::(action) memFree: 340451
Thread-5::INFO::2019-03-19
12:04:34,843::engine_health::242::engine_health.EngineHealth::(_result_from_stats)
VM is up on this host with healthy engine
Thread-2::INFO::2019-03-19
12:04:39,926::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-3::INFO::2019-03-19
12:04:40,287::mem_free::51::mem_free.MemFree::(action) memFree: 340450
Thread-1::WARNING::2019-03-19
12:04:40,389::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1,
(0 out of 5)
Thread-1::INFO::2019-03-19 12:04:43,474::ping::60::ping.Ping::(action)
Successfully pinged 10.168.8.1
Thread-5::INFO::2019-03-19
12:04:44,961::engine_health::242::engine_health.EngineHealth::(_result_from_stats)
VM is up on this host with healthy engine
Thread-2::INFO::2019-03-19
12:04:50,154::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-3::INFO::2019-03-19
12:04:50,415::mem_free::51::mem_free.MemFree::(action) memFree: 340454
Thread-1::INFO::2019-03-19 12:04:51,616::ping::60::ping.Ping::(action)
Successfully pinged 10.168.8.1
Thread-5::INFO::2019-03-19
12:04:55,076::engine_health::242::engine_health.EngineHealth::(_result_from_stats)
VM is up on this host with healthy engine
Thread-4::INFO::2019-03-19
12:04:59,197::cpu_load_no_engine::126::cpu_load_no_engine.CpuLoadNoEngine::(calculate_load)
System load total=0.0247, engine=0.0004, non-engine=0.0243
Thread-2::INFO::2019-03-19
12:05:00,434::mgmt_bridge::62::mgmt_bridge.MgmtBridge::(action) Found
bridge ovirtmgmt with ports
Thread-3::INFO::2019-03-19
12:05:00,541::mem_free::51::mem_free.MemFree::(action) memFree: 340433
Thread-1::INFO::2019-03-19 12:05:01,763::ping::60::ping.Ping::(action)
Successfully pinged 10.168.8.1
Thread-7::INFO::2019-03-19
12:05:06,692::engine_health::203::engine_health.EngineHealth::(_result_from_stats)
VM not running on this host, status Down

Thanks,
Juhani
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PCIGAKWR6OZZTOEQ33P2QUA6RTJM5WQY/


[ovirt-users] Re: Hosted -engine is down and cannot be restarted

2019-03-19 Thread Simone Tiraboschi
Can you please check/attach also /var/log/ovirt-hosted-engine-ha/broker.log
and /var/log/vdsm/vdsm.log ?

On Tue, Mar 19, 2019 at 11:36 AM ada per  wrote:

> Hello everyone,
>
> For a strange reason the hosted engine went down and I cannot restart it.
> I tried manually restarting it without any success; can you please advise?
>
> For all the nodes the engine status is the same as the one below.
> --== Host nodex. (id: 6) status ==--
> conf_on_shared_storage : True
> Status up-to-date  : True
> Hostname   : nodex
> Host ID: 6
> Engine status  : {"reason": "bad vm status", "health":
> "bad", "vm": "down_unexpected", "detail": "Down"}
> Score  : 3400
> stopped: False
> Local maintenance  : False
> crc32  : 323a9f45
> local_conf_timestamp   : 2648874
> Host timestamp : 2648874
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=2648874 (Tue Mar 19 12:25:44 2019)
> host-id=6
> score=3400
> vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
> conf_on_shared_storage=True
> maintenance=False
> state=GlobalMaintenance
> stopped=False
>
> When I try the commands
> root@node5# hosted-engine --vm-shutdown
> I get the response:
> root@node5# Command VM.shutdown with args {'delay': '120', 'message': 'VM
> is shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'}
> failed:(code=1, message=Virtual machine does not exist)
>
> But when I run  : hosted-engine --vm-start
> I get the response: VM exists and is down, cleaning up and restarting
>
>
>
> Below you can see the # journalctl -u ovirt-ha-agent logs
>
> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled
> monitoring loop exception
>   Traceback (most
> recent call last):
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 430, in start_monitoring
>
> self._monitoring_loop()
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 449, in _monitoring_loop
>   for
> old_state, state, delay in self.fsm:
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py",
> line 127, in next
>   new_data =
> self.refresh(self._state.data)
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py",
> line 81, in refresh
>
> stats.update(self.hosted_engine.collect_stats())
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 737, in collect_stats
>   all_stats =
> self._broker.get_stats_from_storage()
> File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 143, in get_stats_from_storage
>   result =
> self._proxy.get_stats()
> File
> "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>   return
> self.__send(self.__name, args)
> File
> "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>
> verbose=self.__verbose
> File
> "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>   return
> self.single_request(host, handler, request_body, verbose)
> File
> "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>
> self.send_content(h, request_body)
> File
> "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>
> connection.endheaders(request_body)
> File
> "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>
> self._send_output(message_body)
> File
> 

[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-19 Thread Juhani Rautiainen
On Tue, Mar 19, 2019 at 12:39 PM Kaustav Majumder  wrote:
>
>

> It should not affect.
>>
>> Can
>> this cause problems? I noticed that this message was in events hour
>> before reboot:
>>
> @Sahina Bose what can cause such?
>>
>> Invalid status on Data Center Default. Setting status to Non Responsive.
>>
>> Same event happened just after reboot.
>
>> -Juhani
>
>
> Can you also check the vdsm logs for any anomaly around the time of reboot .

Couldn't find anything that jumps out as a problem, but another post in the list
made me check the ha-agent logs. This is the reason for the reboot:

MainThread::INFO::2019-03-19
12:04:41,262::states::135::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
Penalizing score by 1600 due to gateway status
MainThread::INFO::2019-03-19
12:04:41,263::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
Current state EngineUp (score: 1800)
MainThread::ERROR::2019-03-19
12:04:51,283::states::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Host ovirt02.virt.local (id 2) score is significantly better than
local score, shutting down VM on this host
MainThread::INFO::2019-03-19
12:04:51,467::brokerlink::68::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineUp-EngineStop)
sent? sent
MainThread::INFO::2019-03-19
12:04:51,624::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
Current state EngineStop (score: 3400)

So the HA agent does the reboot. Now the question is: what does that
'Penalizing score by 1600 due to gateway status' mean? Other HA VMs
don't seem to have any problems.

Thanks,
Juhani
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FYM4IONIT5K7NOYLZ3S2GIEDCSIFKXQI/


[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-19 Thread Kaustav Majumder
On Tue, Mar 19, 2019 at 4:00 PM Juhani Rautiainen <
juhani.rautiai...@gmail.com> wrote:

> On Tue, Mar 19, 2019 at 12:21 PM Kaustav Majumder 
> wrote:
> >
> > Hi,
> > Can you check if the HE VM FQDN resolves to its IP from all the hosts?
>
> I checked both hosts and DNS resolving works fine. Just occurred to me
> that I also added addresses to /etc/hosts just in case DNS fails.

It should not affect.

> Can
> this cause problems? I noticed that this message was in events hour
> before reboot:
>
> @Sahina Bose  what can cause such?

> Invalid status on Data Center Default. Setting status to Non Responsive.
>
> Same event happened just after reboot.
>
> -Juhani
>

Can you also check the vdsm logs for any anomaly around the time of the reboot.

-- 

Thanks,

Kaustav Majumder
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/EN6KQSD6DYFMSA5K3LXNFR4AM56Y2WHO/


[ovirt-users] Hosted -engine is down and cannot be restarted

2019-03-19 Thread ada per
Hello everyone, 

For a strange reason the hosted engine went down and I cannot restart it.  I 
tried manually restarting it without any success; can you please advise?

For all the nodes the engine status is the same as the one below. 
--== Host nodex. (id: 6) status ==--
conf_on_shared_storage : True
Status up-to-date  : True
Hostname   : nodex
Host ID: 6
Engine status  : {"reason": "bad vm status", "health": 
"bad", "vm": "down_unexpected", "detail": "Down"}
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 323a9f45
local_conf_timestamp   : 2648874
Host timestamp : 2648874
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=2648874 (Tue Mar 19 12:25:44 2019)
host-id=6
score=3400
vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
conf_on_shared_storage=True
maintenance=False
state=GlobalMaintenance
stopped=False

When I try the commands
root@node5# hosted-engine --vm-shutdown
I get the response:
root@node5# Command VM.shutdown with args {'delay': '120', 'message': 'VM is 
shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'} 
failed:(code=1, message=Virtual machine does not exist) 

But when I run  : hosted-engine --vm-start 
I get the response: VM exists and is down, cleaning up and restarting



Below you can see the # journalctl -u ovirt-ha-agent logs

Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent 
ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled 
monitoring loop exception
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 430, in start_monitoring
        self._monitoring_loop()
      File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 449, in _monitoring_loop
        for old_state, state, delay in self.fsm:
      File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
        new_data = self.refresh(self._state.data)
      File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 81, in refresh
        stats.update(self.hosted_engine.collect_stats())
      File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 737, in collect_stats
        all_stats = self._broker.get_stats_from_storage()
      File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
        result = self._proxy.get_stats()
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
        return self.__send(self.__name, args)
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
        verbose=self.__verbose
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
        return self.single_request(host, handler, request_body, verbose)
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
        self.send_content(h, request_body)
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
        connection.endheaders(request_body)
      File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
        self._send_output(message_body)

[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-19 Thread Juhani Rautiainen
On Tue, Mar 19, 2019 at 12:21 PM Kaustav Majumder  wrote:
>
> Hi,
> Can you check if the HE VM FQDN resolves to its IP from all the hosts?

I checked both hosts and DNS resolving works fine. Just occurred to me
that I also added addresses to /etc/hosts just in case DNS fails. Can
this cause problems? I noticed that this message was in the events an hour
before the reboot:

Invalid status on Data Center Default. Setting status to Non Responsive.

Same event happened just after reboot.

-Juhani
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/742ETPAXWBALM443VUOYFV6UITK3YKXH/


[ovirt-users] Re: Daily reboots of Hosted Engine?

2019-03-19 Thread Kaustav Majumder
Hi,
Can you check if the HE VM FQDN resolves to its IP from all the hosts?
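For example, something like this on every host (engine.example.com is just a placeholder
for the actual hosted-engine FQDN):

getent hosts engine.example.com    # resolves through /etc/hosts + DNS, like most services do
dig +short engine.example.com      # DNS only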

On Tue, Mar 19, 2019 at 3:48 PM Juhani Rautiainen <
juhani.rautiai...@gmail.com> wrote:

> Hi!
>
> Hosted engine reboots itself almost daily. Is this by design? If not,
> where should I be searching for the clues why it shuts down? Someone
> is giving reboot order to HE because /var/log/messages in contains
> this:
> Mar 19 12:05:00 ovirtmgr qemu-ga: info: guest-shutdown called, mode:
> powerdown
> Mar 19 12:05:00 ovirtmgr systemd: Started Delayed Shutdown Service.
>
> And I'm still running v4.3.0 because upgrade to that was bit painful
> and haven't dared to new round.
>
> Thanks,
> -Juhani
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WIJXBQZQT4HDWAQ4IVLOIFGHKAKTT76O/
>


Thanks,
Kaustav
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2BHBIMFFGWMXGKVU5MFMQ36NJGHPS7EE/


[ovirt-users] Daily reboots of Hosted Engine?

2019-03-19 Thread Juhani Rautiainen
Hi!

Hosted engine reboots itself almost daily. Is this by design? If not,
where should I be searching for the clues why it shuts down? Someone
is giving reboot order to HE because /var/log/messages in contains
this:
Mar 19 12:05:00 ovirtmgr qemu-ga: info: guest-shutdown called, mode: powerdown
Mar 19 12:05:00 ovirtmgr systemd: Started Delayed Shutdown Service.

And I'm still running v4.3.0 because upgrade to that was bit painful
and haven't dared to new round.

Thanks,
-Juhani
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WIJXBQZQT4HDWAQ4IVLOIFGHKAKTT76O/


[ovirt-users] Re: Migrate HE beetwen hosts failed.

2019-03-19 Thread Ryan Barry
Just to confirm, the entire cluster is set to Westmere as the CPU type?

Can you please attach vdsm logs and libvirt logs from the host you are
trying to migrate to?
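The usual locations would be something like the following (the qemu log file is named after
the VM, so the HostedEngine name below is a guess):

/var/log/vdsm/vdsm.log
/var/log/libvirt/qemu/HostedEngine.log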

On Mon, Mar 18, 2019 at 4:20 AM  wrote:
>
> All my VM's, including VM with HE:
>
> Guest CPU Type: Intel Westmere Family
>
> All VM migrating, excluding VM with HE.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/UJNYD54C4727TXF6MN26LADGEETIYMQ4/



-- 

Ryan Barry

Associate Manager - RHV Virt/SLA

rba...@redhat.comM: +16518159306 IM: rbarry
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WXRXFQPF3MSCPXB7A4MA5GR5RDVQZBVA/


[ovirt-users] Re: Live migration failed

2019-03-19 Thread Ryan Barry
Can you please attach the relevant lines from vdsm.log on the hosts?
libvirt logs and qemu logs would also be helpful

On Tue, Mar 12, 2019 at 2:50 AM Bong Shau Fui  wrote:

> Hi:
> I deployed 2 oVirt hosts and an oVirt engine in a nested KVM server.
> I have a Windows VM set up and tried to perform live migration, but it failed.  I
> checked on the hosts and found them meeting the live migration
> requirements, or at least that's what I thought.  I took the requirements
> from the document below.
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.5/html/Administration_Guide/sect-Migrating_Virtual_Machines_Between_Hosts.html
> The hosts, both source and destination are quite empty, with only the
> hosted engine, 1 centos vm and the windows VM in the cluster.  I can do a
> live migration for the centos vm successfully.  But when I tried live
> migration on the hosted-engine vm it failed immediately with a message "No
> available host to migrate VMs to".  When I tried to migrate the windows VM
> the message box that let me choose the destination host popped up but
> failed after a while.
>I'd like to ask where can I get more information with regards to
> live-migration apart from /var/log/ovirt-engine/engine.log ?  I also
> checked on the ovirt hosts' /var/log/vdsm/vdsm.log but found nothing
> pointing to the reason why it failed.
>Below is the extract from /var/log/ovirt-engine/engine.log when the
> live-migration took place
>
> 2019-03-12 14:37:58,159+08 INFO
> [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default task-131) []
> User admin@internal successfully logged in with scopes: ovirt-app-api
> ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search
> ovirt-ext=token-info:validate ovirt-ext=token:password-access
> 2019-03-12 14:37:58,450+08 INFO
> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
> (EE-ManagedThreadFactory-engineScheduled-Thread-59) [7730] Lock freed
> to object
> 'EngineLock:{exclusiveLocks='[d113be83-2740-4246-a1f2-b9344889c3cf=PROVIDER]',
> sharedLocks=''}'
> 2019-03-12 14:38:02,544+08 INFO
> [org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
> (EE-ManagedThreadFactory-engineScheduled-Thread-50) []
> BaseAsyncTask::onTaskEndSuccess: Task
> '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
> successfully.
> 2019-03-12 14:38:12,677+08 INFO
> [org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
> (EE-ManagedThreadFactory-engineScheduled-Thread-16) []
> BaseAsyncTask::onTaskEndSuccess: Task
> '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
> successfully.
> 2019-03-12 14:38:21,650+08 INFO
> [org.ovirt.engine.core.bll.aaa.SessionDataContainer]
> (EE-ManagedThreadFactory-engineScheduled-Thread-51) [] Not removing session
> 'xDiHqqa6l+g8cngM26TTCfW7NeLN3WgWChsx28wUM391vAngSxwtyCkLbQxZR1AbJ5I+2bkPZNQijMUk0jLZcA==',
> session has running commands for user 'admin@internal-authz'.
> 2019-03-12 14:38:22,782+08 INFO
> [org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
> (EE-ManagedThreadFactory-engineScheduled-Thread-49) []
> BaseAsyncTask::onTaskEndSuccess: Task
> '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
> successfully.
> 2019-03-12 14:38:33,018+08 INFO
> [org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
> (EE-ManagedThreadFactory-engineScheduled-Thread-74) []
> BaseAsyncTask::onTaskEndSuccess: Task
> '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
> successfully.
> 2019-03-12 14:38:43,261+08 INFO
> [org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
> (EE-ManagedThreadFactory-engineScheduled-Thread-59) [7730]
> BaseAsyncTask::onTaskEndSuccess: Task
> '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
> successfully.
> 2019-03-12 14:38:53,528+08 INFO
> [org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
> (EE-ManagedThreadFactory-engineScheduled-Thread-13) []
> BaseAsyncTask::onTaskEndSuccess: Task
> '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
> successfully.
> 2019-03-12 14:39:03,759+08 INFO
> [org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
> (EE-ManagedThreadFactory-engineScheduled-Thread-43) []
> BaseAsyncTask::onTaskEndSuccess: Task
> '67631cf6-4c75-4681-88ef-fd4af56c0363' (Parent Command 'RemoveDisk',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
> successfully.
> 2019-03-12 14:39:14,011+08 INFO
> 

[ovirt-users] [ANN] oVirt 4.3.2 is now generally available

2019-03-19 Thread Sandro Bonazzola
The oVirt Project is pleased to announce the general availability of oVirt
4.3.2, as of March 19th, 2019.

This update is the second in a series of stabilization updates to the 4.3
series.

This release is available now on x86_64 architecture for:
* Red Hat Enterprise Linux 7.6 or later
* CentOS Linux (or similar) 7.6 or later

This release supports Hypervisor Hosts on x86_64 and ppc64le architectures
for:
* Red Hat Enterprise Linux 7.6 or later
* CentOS Linux (or similar) 7.6 or later
* oVirt Node 4.3 (available for x86_64 only)

Experimental tech preview for x86_64 and s390x architectures for Fedora 28
is also included.

See the release notes [1] for installation / upgrade instructions and
a list of new features and bugs fixed.

Notes:
- oVirt Appliance is already available
- oVirt Node is already available[2]

oVirt Node has been updated including:
- oVirt 4.3.2: http://www.ovirt.org/release/4.3.2/
- Latest CentOS updates (no relevant errata available up to now on
https://lists.centos.org/pipermail/centos-announce )

Additional Resources:
* Read more about the oVirt 4.3.2 release highlights:
http://www.ovirt.org/release/4.3.2/
* Get more oVirt Project updates on Twitter: https://twitter.com/ovirt
* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/

[1] http://www.ovirt.org/release/4.3.2/
[2] http://resources.ovirt.org/pub/ovirt-4.3/iso/


-- 

SANDRO BONAZZOLA

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/K6VSXJEB6Z2D3DBWWYDEPPQWYC4NLQXK/


[ovirt-users] vm_network not sync

2019-03-19 Thread fz
Hi, I've installed a Self Hosted Engine on oVirt 4.3; everything is configured and 
working well, but when I add a vm_network (non-mgmt) and assign it a DHCP, 
this network goes out of sync. The host is OK, but the DC is not.
I have read a lot of forum threads about this, but I don't understand how to resolve the issue. 
If I try to sync all networks, I lose the host; I need to delete ifcfg-eno* and 
restart the node to re-initialize it.
How can I configure the vm_network correctly?
Thanks.

Fabio.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WETTNAWRGEJZLKW4RBMX2CXBYSM757WF/


[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Miguel Duarte de Mora Barroso
On Mon, Mar 18, 2019 at 5:08 PM Gianluca Cecchi
 wrote:
>
> On Mon, Mar 18, 2019 at 4:40 PM Miguel Duarte de Mora Barroso 
>  wrote:
>>
>> On Mon, Mar 18, 2019 at 2:20 PM Gianluca Cecchi
>>  wrote:
>> >
>> > Hello,
>> > passing from old manual to current OVN in 4.3.1 it seems I have some 
>> > problems with OVN now.
>> > I cannot assign network on OVN to VM (powered on or off doesn't change).
>> > When I add//edit a vnic, they are not on the possible choices
>> > Environment composed by three hosts and one engine (external on vSphere).
>> > The mgmt network during time has been configured on network named 
>> > ovirtmgmntZ2Z3
>> > On engine it seems there are 2 switches for every defined ovn network 
>> > (ovn192 and ovn172)
>> > Below some output of commands in case any inconsistency has remained and I 
>> > can purge it.
>> > Thanks in advance.
>> >
>>
>> I'm very confused here; you mention that on engine there are 2
>> switches for every ovn network, but, on your ovn-nbctl list
>> logical_switch output I can clearly see the 2 logical switches where
>> the OVN logical networks are stored. Who created those ?
>
>
> I think it could be related to the situation described here (it is the same 
> environment, in the meantime updated also from 4.2.8 to 4.3.1) and previous 
> configuration not backed up at that time:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/32S5L4JKHGPHE2XIQMLRIVLOXRG4CHW3/
>
> and some steps not done correctly by me.
> After following indications, I tried to import ovn but probably I did it 
> wrong.
>
>>
>>
>> Could you show us the properties of those 2 networks ? (e.g. ovn-nbctl
>> list logical_switch 32367d8a-460f-4447-b35a-abe9ea5187e0 & ovn-nbctl
>> list logical_switch 64c4c17f-cd67-4e29-939e-2b952495159f)
>>
>
> [root@ovmgr1 ~]# ovn-nbctl list logical_switch 
> 32367d8a-460f-4447-b35a-abe9ea5187e0
> _uuid   : 32367d8a-460f-4447-b35a-abe9ea5187e0
> acls: []
> dns_records : []
> external_ids: {}
> load_balancer   : []
> name: "ovn192"
> other_config: {subnet="192.168.10.0/24"}
> ports   : [affc5570-3e5a-439c-9fdf-d75d6810e3a3, 
> f639d541-2118-4c24-b478-b7a586eb170c]
> qos_rules   : []
> [root@ovmgr1 ~]#
>
> [root@ovmgr1 ~]# ovn-nbctl list logical_switch 
> 64c4c17f-cd67-4e29-939e-2b952495159f
> _uuid   : 64c4c17f-cd67-4e29-939e-2b952495159f
> acls: []
> dns_records : []
> external_ids: {}
> load_balancer   : []
> name: "ovn172"
> other_config: {subnet="172.16.10.0/24"}
> ports   : [32c348d9-12e9-4bcf-a43f-69338c887cfc, 
> 3c77c2ea-de00-43f9-a5c5-9b3ffea5ec69]
> qos_rules   : []
> [root@ovmgr1 ~]#
>
>
>>
>>
>> @Gianluca Cecchi , I notice that one of your duplicate networks -
>> 'ovn192'  - has no ports attached. That makes it the perfect candidate
>> to be deleted, and see if it becomes 'listable' on engine. That would
>> help rule out the 'duplicate name' theory.
>
>
>  I can try. Can you give me the command to be run?
> It is a test oVirt so It would be not a big problem in case of failures in 
> this respect.

You can delete it via the UI; just be sure to delete the one without
ports - it's external ID is 6110649a-db2b-4de7-8fbc-601095cfe510.

It will ask you if you also want to delete it from the external
provider, say yes.



>
>>
>> At the moment, I can't think of a better alternative. Let's see if
>> Marcin comes up with a better test / idea / alternative.
>>
>> Also, please let us know the version of the ovirt-provider-ovn,
>> openvswitch-ovn-central, and openvswitch-ovn-host.
>
>
> On engine:
> [root@ovmgr1 ~]# rpm -q ovirt-provider-ovn openvswitch-ovn-central 
> openvswitch-ovn-host
> ovirt-provider-ovn-1.2.20-1.el7.noarch
> openvswitch-ovn-central-2.10.1-3.el7.x86_64
> package openvswitch-ovn-host is not installed
> [root@ovmgr1 ~]#
>
> On the 3 hosts I only have this package installed:
> openvswitch-ovn-host-2.10.1-3.el7.x86_64
>
>  Thanks
> Gianluca
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/HPASIKY52XE7LPRYDQBKGTCHADP35YRS/


[ovirt-users] Re: VM has been paused due to a storage I/O error

2019-03-19 Thread Sahina Bose
Can you check the gluster mount logs to see if there are storage-related
errors?
For the VM that's paused, check which storage domain and gluster
volume the OS disk is on. For instance, if the name of the gluster
volume is data, check the logs under
/var/log/glusterfs/rhev-data-center-mnt-glusterSD-**_data.log
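
A quick way to pull out just the warnings/errors from that log would be something like
this (adjust the file name to match your gluster server and volume):

grep -E '\] [EW] \[' /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*_data.log | tail -n 50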

On Thu, Mar 14, 2019 at 12:30 PM xil...@126.com  wrote:
>
> There was a shutdown due to a fault of the physical machine. After shutdown, 
> HostedEngine was suspended. If I try to manually start HostedEngine, I/O 
> error will occur. When I restarted HostedEngine, it was back to normal.
>
> 
> xil...@126.com
>
>
> From: Strahil
> Date: 2019-03-14 14:36
> To: xilazz; Gianluca
> CC: users
> Subject: Re: [ovirt-users] Re: VM has been paused due to a storage I/O error
>
> This should not happen with replica 3  volumes.
> Are you sure you don't have a gluster brick out of sync/disconnected?
>
> Best Regards,
> Strahil Nikolov
>
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WREBDQWXVGKDCX2SFO7PGLIHRZRIFSM3/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MTTTBN3PTE3XZM5HQGT5PSAL4D76U5XK/


[ovirt-users] Re: How to fix ovn apparent inconsistency?

2019-03-19 Thread Marcin Mirecki
On Mon, Mar 18, 2019 at 5:08 PM Gianluca Cecchi 
wrote:

> On Mon, Mar 18, 2019 at 4:40 PM Miguel Duarte de Mora Barroso <
> mdbarr...@redhat.com> wrote:
>
>> On Mon, Mar 18, 2019 at 2:20 PM Gianluca Cecchi
>>  wrote:
>> >
>> > Hello,
>> > Passing from the old manual OVN setup to the current one in 4.3.1, it seems
>> I have some problems with OVN now.
>> > I cannot assign an OVN network to a VM (powered on or off doesn't change).
>> > When I add/edit a vNIC, the OVN networks are not among the possible choices.
>> > The environment is composed of three hosts and one engine (external, on
>> vSphere).
>> > Over time, the mgmt network has been configured on a network named
>> ovirtmgmntZ2Z3.
>> > On the engine it seems there are 2 switches for every defined OVN network
>> (ovn192 and ovn172).
>> > Below is some output of commands, in case any inconsistency has remained
>> that I can purge.
>> > Thanks in advance.
>> >
>>
>> I'm very confused here; you mention that on the engine there are 2
>> switches for every OVN network, but in your ovn-nbctl list
>> logical_switch output I can clearly see the 2 logical switches where
>> the OVN logical networks are stored. Who created those?
>>
>
> I think it could be related to the situation described here (it is the
> same environment, in the meantime also updated from 4.2.8 to 4.3.1), and to
> the previous configuration not having been backed up at that time:
>
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/32S5L4JKHGPHE2XIQMLRIVLOXRG4CHW3/
>
> and to some steps not done correctly by me.
> After following the indications there, I tried to import OVN but probably did
> it wrong.
>


Is it possible that you added new networks, instead of importing the old
ones?
If so, the old networks would just stay in the database, and we would end up
with duplicated networks like you have now.
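A quick way to spot such leftovers on the OVN side (a sketch, run on the
engine where ovn-central is installed) is to list the logical switches with
their UUIDs and compare the names against the networks the engine shows:

# prints one line per logical switch as '<uuid> (<name>)'; duplicate names show up twice
ovn-nbctl ls-list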


>
>
>>
>> Could you show us the properties of those 2 networks ? (e.g. ovn-nbctl
>> list logical_switch 32367d8a-460f-4447-b35a-abe9ea5187e0 & ovn-nbctl
>> list logical_switch 64c4c17f-cd67-4e29-939e-2b952495159f)
>>
>>
> [root@ovmgr1 ~]# ovn-nbctl list logical_switch
> 32367d8a-460f-4447-b35a-abe9ea5187e0
> _uuid   : 32367d8a-460f-4447-b35a-abe9ea5187e0
> acls: []
> dns_records : []
> external_ids: {}
> load_balancer   : []
> name: "ovn192"
> other_config: {subnet="192.168.10.0/24"}
> ports   : [affc5570-3e5a-439c-9fdf-d75d6810e3a3,
> f639d541-2118-4c24-b478-b7a586eb170c]
> qos_rules   : []
> [root@ovmgr1 ~]#
>
> [root@ovmgr1 ~]# ovn-nbctl list logical_switch
> 64c4c17f-cd67-4e29-939e-2b952495159f
> _uuid   : 64c4c17f-cd67-4e29-939e-2b952495159f
> acls: []
> dns_records : []
> external_ids: {}
> load_balancer   : []
> name: "ovn172"
> other_config: {subnet="172.16.10.0/24"}
> ports   : [32c348d9-12e9-4bcf-a43f-69338c887cfc,
> 3c77c2ea-de00-43f9-a5c5-9b3ffea5ec69]
> qos_rules   : []
> [root@ovmgr1 ~]#
>
>
>
>>
>> @Gianluca Cecchi , I notice that one of your duplicate networks -
>> 'ovn192'  - has no ports attached. That makes it the perfect candidate
>> to be deleted, and see if it becomes 'listable' on engine. That would
>> help rule out the 'duplicate name' theory.
>>
>
>  I can try. Can you give me the command to be run?
> It is a test oVirt environment, so it would not be a big problem in case of
> failures in this respect.
>
>
>> At the moment, I can't think of a better alternative. Let's see if
>> Marcin comes up with a better test / idea / alternative.
>>
>> Also, please let us know the version of the ovirt-provider-ovn,
>> openvswitch-ovn-central, and openvswitch-ovn-host.
>>
>
> On engine:
> [root@ovmgr1 ~]# rpm -q ovirt-provider-ovn openvswitch-ovn-central
> openvswitch-ovn-host
> ovirt-provider-ovn-1.2.20-1.el7.noarch
> openvswitch-ovn-central-2.10.1-3.el7.x86_64
> package openvswitch-ovn-host is not installed
> [root@ovmgr1 ~]#
>
> On the 3 hosts I only have this package installed:
> openvswitch-ovn-host-2.10.1-3.el7.x86_64
>
>  Thanks
> Gianluca
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/I734KDXNXY3PCO4VMFTK6LO7PDR2VHZR/


[ovirt-users] Re: Ovirt 4.3.1 problem with HA agent

2019-03-19 Thread Strahil
Hi Alexei,

>> 1.2 All bricks healed (gluster volume heal data info summary) and no 
>> split-brain
>
>  
>  
> gluster volume heal data info
>  
> Brick node-msk-gluster203:/opt/gluster/data
> Status: Connected
> Number of entries: 0
>  
> Brick node-msk-gluster205:/opt/gluster/data
> 
> 
> 
> 
> 
> 
> 
> Status: Connected
> Number of entries: 7
>  
> Brick node-msk-gluster201:/opt/gluster/data
> 
> 
> 
> 
> 
> 
> 
> Status: Connected
> Number of entries: 7
>  

Data needs healing.
Run: gluster volume heal data full
If it still doesn't heal (check again in 5 minutes), go to
/rhev/data-center/mnt/glusterSD/msk-gluster-facility._data
and run 'find . -exec stat {} \;' without the quotes.
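Put together, a minimal sketch of that heal-and-trigger sequence (assuming the
volume is named 'data' and the mount point is the one above):

gluster volume heal data full
# wait ~5 minutes, then re-check
gluster volume heal data info summary
# if entries remain, force a lookup on every file to trigger self-heal
cd /rhev/data-center/mnt/glusterSD/msk-gluster-facility._data
find . -exec stat {} \; > /dev/null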

As I understand it, the oVirt Hosted Engine is running and can be started on
all nodes except one.


>>  
>> 2. Go to the problematic host and check the mount point is there
>
>  
>  
> There is no mount point on the problematic node at
> /rhev/data-center/mnt/glusterSD/msk-gluster-facility.:_data
> If I create the mount point manually, it is deleted after the node is activated.
>  
> Other nodes can mount this volume without problems. Only this node has
> connection problems after the update.
>  
> Here is a part of the log at the time of activation of the node:
>  
> vdsm log
>  
> 2019-03-18 16:46:00,548+0300 INFO  (jsonrpc/5) [vds] Setting Hosted Engine HA 
> local maintenance to False (API:1630)
> 2019-03-18 16:46:00,549+0300 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC 
> call Host.setHaMaintenanceMode succeeded in 0.00 seconds (__init__:573)
> 2019-03-18 16:46:00,581+0300 INFO  (jsonrpc/7) [vdsm.api] START 
> connectStorageServer(domType=7, 
> spUUID=u'5a5cca91-01f8-01af-0297-025f', conList=[{u'id': 
> u'5799806e-7969-45da-b17d-b47a63e6a8e4', u'connection': 
> u'msk-gluster-facility.:/data', u'iqn': u'', u'user': u'', u'tpgt': u'1', 
> u'vfs_type': u'glusterfs', u'password': '', u'port': u''}], 
> options=None) from=:::10.77.253.210,56630, flow_id=81524ed, 
> task_id=5f353993-95de-480d-afea-d32dc94fd146 (api:46)
> 2019-03-18 16:46:00,621+0300 INFO  (jsonrpc/7) 
> [storage.StorageServer.MountConnection] Creating directory 
> u'/rhev/data-center/mnt/glusterSD/msk-gluster-facility.:_data' 
> (storageServer:167)
> 2019-03-18 16:46:00,622+0300 INFO  (jsonrpc/7) [storage.fileUtils] Creating 
> directory: /rhev/data-center/mnt/glusterSD/msk-gluster-facility.:_data 
> mode: None (fileUtils:197)
> 2019-03-18 16:46:00,622+0300 WARN  (jsonrpc/7) 
> [storage.StorageServer.MountConnection] gluster server 
> u'msk-gluster-facility.' is not in bricks ['node-msk-gluster203', 
> 'node-msk-gluster205', 'node-msk-gluster201'], possibly mounting duplicate 
> servers (storageServer:317)


This seems very strange. As you have hidden the hostname, I'm not sure which
one this is.
Check that DNS resolution works from all hosts and that the hostname of this
host is resolvable.
Also check if it is in the peer list.
Try to manually mount the gluster volume:
mount -t glusterfs msk-gluster-facility.:/data /mnt

Is this a second FQDN/IP of this server?
If so, you can make gluster aware of it via 'gluster peer probe IP2'.
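Concretely, those checks could look like this on the problematic host (a
sketch; the storage FQDN is truncated in this thread, so substitute the real
one):

# name resolution for the storage FQDN
getent hosts msk-gluster-facility.<your-domain>
# is it part of the gluster trusted pool?
gluster peer status
gluster pool list
# manual test mount of the volume
mount -t glusterfs msk-gluster-facility.<your-domain>:/data /mnt && ls /mnt && umount /mnt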


>> 2.1. Check permissions (should be vdsm:kvm) and fix with chown -R if needed

>> 2.2. Check the OVF_STORE from the logs that it exists
>
>  
> How can I do this?

Go to /rhev/data-center/mnt/glusterSD/host_engine and use find inside the
storage domain UUID directory for files that are not owned by vdsm:kvm.
I usually run 'chown -R vdsm:kvm 823xx---zzz' and that fixes any
misconfiguration.
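For example, a minimal sketch of that check and fix (the storage-domain UUID
below is a placeholder, not the real one from this environment):

cd /rhev/data-center/mnt/glusterSD/<host>_engine
# list anything inside the storage domain that is not owned by vdsm:kvm
find <storage-domain-uuid> \( ! -user vdsm -o ! -group kvm \) -ls
# reset ownership recursively
chown -R vdsm:kvm <storage-domain-uuid>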

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OFDI4CH3REYGWAD7V36K4SW64MALACAV/