[ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when migrating

2020-04-11 Thread Maton, Brett
Problem with signed / unsigned sounds about right.
Not having much luck with addr2line though.

I just manually migrated the VM to cause the problem again, not sure if
this partial NUMA config warning could be contributing:

2020-04-09T23:54:05.537028Z qemu-kvm: warning: All CPU(s) up to maxcpus
should be described in NUMA config, ability to start up with partial NUMA
mappings is obsoleted and will be removed in future
2020-04-11 07:25:04.146+: initiating migration
tcmalloc: large alloc 562949953421312 bytes == (nil) @  0x7f4a0bb464ef
0x7f4a0bb66367 0x7f4a2364b736 0x561527438ac8 0x5615274398e5 0x5615273e9bae
0x5615273f07b6 0x5615275b8de5 0x5615275b4bdf 0x7f4a0aaeee65 0x7f4a0a81788d

(process:32202): GLib-ERROR **: 08:25:04.151: gmem.c:135: failed to
allocate 562949953421312 bytes
2020-04-11 07:25:08.408+: shutting down, reason=crashed


Attempt to use addr2line

# addr2line -e /usr/libexec/qemu-kvm
0x7f4a0bb464ef 0x7f4a0bb66367 0x7f4a2364b736 0x561527438ac8 0x5615274398e5
0x5615273e9bae 0x5615273f07b6 0x5615275b8de5 0x5615275b4bdf 0x7f4a0aaeee65
0x7f4a0a81788d
??:0
??:0

Single addresses give the same:

0x7f4a0bb464ef
??:0

0x7f4a0a81788d
??:0

Maybe I need the debug packages?
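
One possible reason for the ??:0 output, offered as a guess rather than a diagnosis: the addresses in the tcmalloc trace are runtime virtual addresses, whereas addr2line expects addresses relative to the binary on disk and also needs debug symbols to resolve them. The frames starting 0x7f4a... most likely sit in shared libraries rather than in qemu-kvm itself. Below is a minimal Python sketch of the address translation, assuming qemu-kvm is built as a position-independent executable and that its load base can be read from /proc/<pid>/maps while the process is still running. The LOAD_BASE value is hypothetical and only there for illustration.

```python
# Sketch: turn runtime addresses from the tcmalloc trace into offsets
# that could be passed to `addr2line -e /usr/libexec/qemu-kvm`.
# ASSUMPTION: LOAD_BASE is hypothetical; the real value would be the start
# of the first qemu-kvm mapping in /proc/<pid>/maps of the live process.

LOAD_BASE = 0x561527000000  # hypothetical, not taken from the thread

# Frames from the log that fall inside the 0x5615... range.
frames = [0x561527438ac8, 0x5615274398e5, 0x5615273e9bae,
          0x5615273f07b6, 0x5615275b8de5, 0x5615275b4bdf]


def to_offsets(addresses, load_base):
    """Subtract the load base from each runtime address."""
    return [addr - load_base for addr in addresses]


offsets = to_offsets(frames, LOAD_BASE)
for addr, off in zip(frames, offsets):
    print(f"{addr:#x} -> {off:#x}")
```

The resulting offsets are only meaningful if the base matches the crashed process, so they would have to be recomputed for each run.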

On Fri, 10 Apr 2020 at 22:23,  wrote:

> I found this thread on Stack Overflow:
>
>
> https://stackoverflow.com/questions/9077457/how-to-trace-tcmalloc-large-alloc
>
>
>
> See
> http://code.google.com/p/gperftools/source/browse/trunk/src/tcmalloc.cc?r=80&redir=1
> line 843
>
> Depending on your application - the large allocation may or may not be a
> bug.
>
> In any case - the part after the @ mark is a stack trace and can be used
> to locate the source of the message
>
> The repeating number (4294488064 which seems to be equal to 4G-479232 or
> 0x100000000-0x75000) makes me suspect the original allocation call got a
> negative signed value and used it as an unsigned value.
>
> It also had this to trace the memory leak:
>
> To trace the memory address to a line in your code, use the addr2line
> command-line tool: run addr2line -e <executable>, press Enter, then paste
> an address and press Enter.
>
>
>
> I’m not sure if this is helpful but it does sound like a memory leak.
>
>
>
> In a related Microsoft doc it stated:
>
>
>
> 1073741824 Allocations larger than this value cause a stack trace
> to be dumped to stderr. The threshold for dumping stack traces is increased
> by a factor of 1.125 every time we print a message so that the threshold
> automatically goes up by a factor of ~1000 every 60 messages. This bounds
> the amount of extra logging generated by this flag. Default value of this
> flag is very large and therefore you should see no extra logging unless the
> flag is overridden.
>
>
>
> The default in Windows is 1 GB. I’m not sure about Linux.
>
>
>
> I hope this is helpful.
>
>
>
> Eric Evans
>
> Digital Data Services LLC.
>
> 304.660.9080
>
>
>
> *From:* Maton, Brett 
> *Sent:* Friday, April 10, 2020 4:53 PM
> *To:* eev...@digitaldatatechs.com
> *Cc:* Ovirt Users 
> *Subject:* [ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when
> migrating
>
>
>
> The hosts are identical, and yes I'm sure about the 563 terabytes, which
> is obviously wrong, and why I mentioned it. Possibly an overflow?
>
>
>
> On Fri, 10 Apr 2020, 21:31 ,  wrote:
>
> I have a Windows 10 guest and a Server 2016 guest that migrate without an
> issue.
> Are your CPU architectures comparable between the hosts?
> BTW, 562949953421312 bytes is roughly 563 terabytes. Are you sure that's correct?
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/D74C7MSRJEQOTPNDB55XTRDSNT2WK6ST/


[ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when migrating

2020-04-10 Thread eevans
I found this thread on Stack Overflow:

https://stackoverflow.com/questions/9077457/how-to-trace-tcmalloc-large-alloc

See http://code.google.com/p/gperftools/source/browse/trunk/src/tcmalloc.cc?r=80&redir=1
line 843

Depending on your application - the large allocation may or may not be a bug.

In any case - the part after the @ mark is a stack trace and can be used to 
locate the source of the message

The repeating number (4294488064 which seems to be equal to 4G-479232 or 
0x100000000-0x75000) makes me suspect the original allocation call got a 
negative signed value and used it as an unsigned value.
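
To illustrate the signed/unsigned mix-up that answer describes (the numbers come from the Stack Overflow post, not from the qemu log): a small negative 32-bit value reinterpreted as unsigned turns into a value just under 4 GiB. A minimal Python sketch using ctypes, as an illustration only:

```python
import ctypes

# A negative size, e.g. the result of a subtraction that went below zero.
signed_size = -0x75000  # -479232

# Reinterpret the same 32 bits as an unsigned integer, as happens when a
# signed value is passed where an unsigned size is expected.
as_unsigned = ctypes.c_uint32(signed_size).value

print(as_unsigned)
print(as_unsigned == 2**32 - 0x75000)
```

The 562949953421312 figure in the qemu log is a 64-bit quantity, so this shows the general mechanism rather than the exact calculation that went wrong there.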

It also had this to trace the memory leak:

To trace the memory address to a line in your code, use the addr2line 
command-line tool: run addr2line -e <executable>, press Enter, then paste 
an address and press Enter.

 

I’m not sure if this is helpful but it does sound like a memory leak.

 

In a related Microsoft doc it stated:

 

1073741824 Allocations larger than this value cause a stack trace to be 
dumped to stderr. The threshold for dumping stack traces is increased by a 
factor of 1.125 every time we print a message so that the threshold 
automatically goes up by a factor of ~1000 every 60 messages. This bounds the 
amount of extra logging generated by this flag. Default value of this flag is 
very large and therefore you should see no extra logging unless the flag is 
overridden.
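
The growth rate quoted there can be checked with a little arithmetic. This sketch assumes nothing beyond the 1.125 factor and the 1073741824-byte starting value from the quoted text:

```python
# Reporting threshold behaviour as described in the quoted documentation.
initial_threshold = 1073741824  # bytes, i.e. 2**30 (1 GiB)
factor = 1.125                  # multiplier applied after each report

# Cumulative growth after 60 reports; the doc rounds this to "~1000".
growth_after_60 = factor ** 60
threshold_after_60 = initial_threshold * growth_after_60

print(round(growth_after_60))
print(f"{threshold_after_60 / 2**40:.2f} TiB")
```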

 

The default in Windows is 1 GB. I’m not sure about Linux.

 

I hope this is helpful.

 

Eric Evans

Digital Data Services LLC.

304.660.9080



 

From: Maton, Brett  
Sent: Friday, April 10, 2020 4:53 PM
To: eev...@digitaldatatechs.com
Cc: Ovirt Users 
Subject: [ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when migrating

 

The hosts are identical, and yes I'm sure about the 563 terabytes, which is 
obviously wrong, and why I mentioned it. Possibly an overflow?

 

On Fri, 10 Apr 2020, 21:31, <eev...@digitaldatatechs.com> wrote:

I have a Windows 10 guest and a Server 2016 guest that migrate without an issue.
Are your CPU architectures comparable between the hosts? 
BTW, 562949953421312 bytes is roughly 563 terabytes. Are you sure that's correct?


[ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when migrating

2020-04-10 Thread Maton, Brett
The hosts are identical, and yes I'm sure about the 563 terabytes, which
is obviously wrong, and why I mentioned it. Possibly an overflow?

On Fri, 10 Apr 2020, 21:31 ,  wrote:

> I have a Windows 10 guest and a Server 2016 guest that migrate without an
> issue.
> Are your CPU architectures comparable between the hosts?
> BTW, 562949953421312 bytes is roughly 563 terabytes. Are you sure that's correct?


[ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when migrating

2020-04-10 Thread eevans
I have a Windows 10 guest and a Server 2016 guest that migrate without an issue.
Are your CPU architectures comparable between the hosts? 
BTW, 562949953421312 bytes is roughly 563 terabytes. Are you sure that's correct?


[ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when migrating

2020-04-08 Thread Maton, Brett
Any other suggestions?
I'm already running a later version of qemu (qemu-kvm-ev-2.12.0-33.1.el7_7.4)
than the one referenced (qemu-kvm-rhev-2.9.0-16) in
https://access.redhat.com/solutions/3423481 (from what I can see on that
page without a subscription).

Regards,
Brett

On Tue, 7 Apr 2020 at 11:56, Maton, Brett  wrote:

> I haven't got an active RHEL subscription so I can't view that solution
> unfortunately.
>
>
> Thanks for the log pointers though, looking in the qemu log I'm not
> surprised it's crashing...
>
> tcmalloc: large alloc 562949953421312 bytes == (nil) @  0x7f93c080b4ef
> 0x7f93c082b367 0x7f93d8310736 0x55efa0670ac8 0x55efa06718e5 0x55efa0621bae
> 0x55efa06287b6 0x55efa07f0de5 0x55efa07ecbdf 0x7f93bf7b3e65 0x7f93bf4dc88d
>
> (process:1374): GLib-ERROR **: 09:26:39.525: gmem.c:135: *failed to
> allocate 562949953421312 bytes*
> 2020-04-06 08:26:43.036+: shutting down, reason=crashed
> ...
> libvirt version: 4.5.0, package: 23.el7_7.6 (CentOS BuildSystem <
> http://bugs.centos.org>, 2020-03-17-23:39:10, x86-01.bsys.centos.org),
> qemu version: 2.12.0qemu-kvm-ev-2.12.0-33.1.el7_7.4, kernel:
> 3.10.0-1062.18.1.el7.x86_64
>
> 562949953421312 bytes is mighty big, nigh on 563 TB!
> The VM in question is allocated 4GB RAM and has a 60GB disk...
>
> Couldn't see any errors in the VDSM log at the time that qemu failed.
>
> On Tue, 7 Apr 2020 at 10:52, Shani Leviim  wrote:
>
>> Hi Brett,
>> According to [1], you can try to update the package qemu-kvm-rhev.
>> (Or yum update if there're more packages related need to be upgraded).
>>
>> You may also find some more information about that error on the vdsm log
>> (/var/log/vdsm/vdsm.log)
>> and the qemu log (/var/log/libvirt/qemu/vm_name.log)
>>
>> [1] https://access.redhat.com/solutions/3423481
>>
>>
>> *Regards,*
>>
>> *Shani Leviim*
>>
>>
>> On Mon, Apr 6, 2020 at 12:09 PM Maton, Brett 
>> wrote:
>>
>>> I recently added a Windows 10 Pro 64 bit (release 1909) VM, and I'm
>>> seeing a lot of failures when oVirt tries to move the VM to another host
>>> (triggered by load balancing),
>>>
>>> These errors are showing up in the UI event log
>>>
>>> Migration failed  (VM: , Source: , Destination: >> 2>).
>>>
>>> Followed by:
>>>
>>> VM  is down with error. Exit message: Lost connection with qemu
>>> process.
>>>
>>> Google returned some references to 'options kvm ignore_msrs=1' which
>>> I've added to /etc/modprobe.d/kvm.conf and restarted the hosts but that
>>> doesn't appear to have made a difference.
>>>
>>> Is this a known issue with Windows 10 guests?


[ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when migrating

2020-04-07 Thread Maton, Brett
I haven't got an active RHEL subscription so I can't view that solution
unfortunately.


Thanks for the log pointers though, looking in the qemu log I'm not
surprised it's crashing...

tcmalloc: large alloc 562949953421312 bytes == (nil) @  0x7f93c080b4ef
0x7f93c082b367 0x7f93d8310736 0x55efa0670ac8 0x55efa06718e5 0x55efa0621bae
0x55efa06287b6 0x55efa07f0de5 0x55efa07ecbdf 0x7f93bf7b3e65 0x7f93bf4dc88d

(process:1374): GLib-ERROR **: 09:26:39.525: gmem.c:135: *failed to
allocate 562949953421312 bytes*
2020-04-06 08:26:43.036+: shutting down, reason=crashed
...
libvirt version: 4.5.0, package: 23.el7_7.6 (CentOS BuildSystem <
http://bugs.centos.org>, 2020-03-17-23:39:10, x86-01.bsys.centos.org), qemu
version: 2.12.0qemu-kvm-ev-2.12.0-33.1.el7_7.4, kernel:
3.10.0-1062.18.1.el7.x86_64

562949953421312 bytes is mighty big, nigh on 563 TB!
The VM in question is allocated 4GB RAM and has a 60GB disk...
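
As a quick sanity check on that figure (a sketch, not something taken from the logs): 562949953421312 is exactly 2^49, i.e. 512 TiB, which looks more like a stray bit or a miscalculated size than a genuine memory request.

```python
size = 562949953421312  # bytes, from the tcmalloc and GLib messages

# A power of two has exactly one bit set, so n & (n - 1) is zero.
is_power_of_two = size & (size - 1) == 0
exponent = size.bit_length() - 1

tib = size / 2**40   # binary units (TiB)
tb = size / 10**12   # decimal units (TB)

print(is_power_of_two, exponent)
print(f"{tib:.0f} TiB, {tb:.1f} TB")
```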

Couldn't see any errors in the VDSM log at the time that qemu failed.

On Tue, 7 Apr 2020 at 10:52, Shani Leviim  wrote:

> Hi Brett,
> According to [1], you can try to update the package qemu-kvm-rhev.
> (Or yum update if there're more packages related need to be upgraded).
>
> You may also find some more information about that error on the vdsm log
> (/var/log/vdsm/vdsm.log)
> and the qemu log (/var/log/libvirt/qemu/vm_name.log)
>
> [1] https://access.redhat.com/solutions/3423481
>
>
> *Regards,*
>
> *Shani Leviim*
>
>
> On Mon, Apr 6, 2020 at 12:09 PM Maton, Brett 
> wrote:
>
>> I recently added a Windows 10 Pro 64 bit (release 1909) VM, and I'm
>> seeing a lot of failures when oVirt tries to move the VM to another host
>> (triggered by load balancing),
>>
>> These errors are showing up in the UI event log
>>
>> Migration failed  (VM: , Source: , Destination: > 2>).
>>
>> Followed by:
>>
>> VM  is down with error. Exit message: Lost connection with qemu
>> process.
>>
>> Google returned some references to 'options kvm ignore_msrs=1' which
>> I've added to /etc/modprobe.d/kvm.conf and restarted the hosts but that
>> doesn't appear to have made a difference.
>>
>> Is this a known issue with Windows 10 guests?


[ovirt-users] Re: Windows 10 Pro 64 (1909) crashes when migrating

2020-04-07 Thread Shani Leviim
Hi Brett,
According to [1], you can try updating the qemu-kvm-rhev package
(or run yum update if other related packages also need upgrading).

You may also find more information about that error in the vdsm log
(/var/log/vdsm/vdsm.log)
and the qemu log (/var/log/libvirt/qemu/vm_name.log).

[1] https://access.redhat.com/solutions/3423481


*Regards,*

*Shani Leviim*


On Mon, Apr 6, 2020 at 12:09 PM Maton, Brett 
wrote:

> I recently added a Windows 10 Pro 64 bit (release 1909) VM, and I'm seeing
> a lot of failures when oVirt tries to move the VM to another host
> (triggered by load balancing),
>
> These errors are showing up in the UI event log
>
> Migration failed  (VM: , Source: , Destination:  2>).
>
> Followed by:
>
> VM  is down with error. Exit message: Lost connection with qemu
> process.
>
> Google returned some references to 'options kvm ignore_msrs=1' which I've
> added to /etc/modprobe.d/kvm.conf and restarted the hosts but that doesn't
> appear to have made a difference.
>
> Is this a known issue with Windows 10 guests?