Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included) - patch backport propose

2015-10-19 Thread Stefano Stabellini
On Fri, 16 Oct 2015, Stefano Stabellini wrote:
> On Fri, 16 Oct 2015, Fabio Fantoni wrote:
> > On 09/10/2015 09:56, Fabio Fantoni wrote:
> > > On 08/10/2015 17:58, Andreas Kinzler wrote:
> > > > Is this still current? I made an interesting observation:
> > > > 
> > > > I had no problems with SPICE and vanilla Xen 4.5.1 when using it on
> > > > Gentoo with glibc 2.19/gcc 4.6.4.
> > > > Segfaults started when I switched to glibc 2.20/gcc 4.9.3 - I did not
> > > > change Xen source code at all.
> > > > All this might be related to:
> > > > https://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg02764.html
> > > > 
> > > > Andreas
> > > 
> > > Thanks for your mail.
> > > The problem I had seems different. I did not find the exact cause, but
> > > I solved it by using a newer qemu (2.2 from unstable, now 4.6) with
> > > xen 4.5.1.
> > > The big distros I looked at already use a newer qemu and should be OK.
> > > I am still using glibc < 2.20 on my Debian servers, which is probably
> > > why I did not hit your problem, but I think backporting the patch you
> > > linked could be useful to fix a qemu crash case. I'll test it in my
> > > next build and, if I find no regression, I'll request the backport
> > > for the xen qemu git trees.
> > > 
> > > > 
> > > > > https://github.com/Fantu/Xen/commits/rebase/m2r-staging
> > > > > The latest test showing the regression was based on the latest
> > > > > stable-4.5, more exactly:
> > > > > https://github.com/Fantu/Xen/commits/rebase/m2r-testing
> > > > > Some days ago, on the same dom0 and domU, I tried the latest stable
> > > > > version (which I use on only 2 production servers for now, and did
> > > > > not see the regression), more exactly:
> > > > > https://github.com/Fantu/Xen/commits/rebase/m2r-stable-4.5
> > > > > Dom0 is Debian 7 with kernel 3.16 from backports, seabios 1.8.1-2
> > > > > from unstable, and this xen configure:
> > > > > ./configure --prefix=/usr --disable-blktap1 \
> > > > >   --disable-qemu-traditional --disable-rombios \
> > > > >   --with-system-seabios=/usr/share/seabios/bios-256k.bin \
> > > > >   --with-extra-qemuu-configure-args="--enable-spice --enable-usb-redir" \
> > > > >   --disable-blktap2
> > > > > 
> > > > > I suppose there is an unexpected case caused by a backport, or by
> > > > > one or more patches missing from the backports from unstable.
> > > > > A quick look did not turn up a relevant patch to try reverting. Can
> > > > > anyone suggest the most probable point(s) to bisect and/or a patch
> > > > > to revert, or must I try a full bisect 4.5.0->stable-4.5?
> > > > > 
> > > > 
> > > 
> > 
> > I tried xen 4.6 with its qemu plus a cherry-pick of this patch:
> > http://git.qemu.org/?p=qemu.git;a=commit;h=c6e484707f28b3e115e64122a0570f6b3c585489
> > - spice-display: fix segfault in qemu_spice_create_update
> > I used it for some days without seeing any regression.
> > It is probably useful to apply it to qemu-xen unstable and 4.6 (I don't
> > know about older versions; I have not tested them).
> > @Stefano Stabellini: can you take a look at it please?
> 
> It looks like a reasonable effort.  My test machines are unavailable at
> the moment, when they are back I'll commit (assuming it passes the
> tests).

I have applied the commit to staging and 4.6-testing

 
> > As for the qemu used in unstable, I think it would be good to update to
> > 2.4.0.1 now that xen 4.7 development has started. I think we pick up
> > upstream qemu with too much latency, and that raises the risk of bugs
> > with xen for the majority of distros, which use a newer qemu version
> > (following the stable versions of both).
> 
> I'll upgrade QEMU as soon as possible.
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included) - patch backport propose

2015-10-16 Thread Stefano Stabellini
On Fri, 16 Oct 2015, Fabio Fantoni wrote:
> On 09/10/2015 09:56, Fabio Fantoni wrote:
> > On 08/10/2015 17:58, Andreas Kinzler wrote:
> > > Is this still current? I made an interesting observation:
> > > 
> > > I had no problems with SPICE and vanilla Xen 4.5.1 when using it on Gentoo
> > > with glibc 2.19/gcc 4.6.4.
> > > Segfaults started when I switched to glibc 2.20/gcc 4.9.3 - I did not
> > > change Xen source code at all.
> > > All this might be related to:
> > > https://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg02764.html
> > > 
> > > Andreas
> > 
> > Thanks for your mail.
> > The problem I had seems different. I did not find the exact cause, but I
> > solved it by using a newer qemu (2.2 from unstable, now 4.6) with
> > xen 4.5.1.
> > The big distros I looked at already use a newer qemu and should be OK.
> > I am still using glibc < 2.20 on my Debian servers, which is probably
> > why I did not hit your problem, but I think backporting the patch you
> > linked could be useful to fix a qemu crash case. I'll test it in my next
> > build and, if I find no regression, I'll request the backport for the
> > xen qemu git trees.
> > 
> > > 
> > > > https://github.com/Fantu/Xen/commits/rebase/m2r-staging
> > > > The latest test showing the regression was based on the latest
> > > > stable-4.5, more exactly:
> > > > https://github.com/Fantu/Xen/commits/rebase/m2r-testing
> > > > Some days ago, on the same dom0 and domU, I tried the latest stable
> > > > version (which I use on only 2 production servers for now, and did
> > > > not see the regression), more exactly:
> > > > https://github.com/Fantu/Xen/commits/rebase/m2r-stable-4.5
> > > > Dom0 is Debian 7 with kernel 3.16 from backports, seabios 1.8.1-2
> > > > from unstable, and this xen configure:
> > > > ./configure --prefix=/usr --disable-blktap1 \
> > > >   --disable-qemu-traditional --disable-rombios \
> > > >   --with-system-seabios=/usr/share/seabios/bios-256k.bin \
> > > >   --with-extra-qemuu-configure-args="--enable-spice --enable-usb-redir" \
> > > >   --disable-blktap2
> > > > 
> > > > I suppose there is an unexpected case caused by a backport, or by one
> > > > or more patches missing from the backports from unstable.
> > > > A quick look did not turn up a relevant patch to try reverting. Can
> > > > anyone suggest the most probable point(s) to bisect and/or a patch to
> > > > revert, or must I try a full bisect 4.5.0->stable-4.5?
> > > > 
> > > 
> > 
> 
> I tried xen 4.6 with its qemu plus a cherry-pick of this patch:
> http://git.qemu.org/?p=qemu.git;a=commit;h=c6e484707f28b3e115e64122a0570f6b3c585489
> - spice-display: fix segfault in qemu_spice_create_update
> I used it for some days without seeing any regression.
> It is probably useful to apply it to qemu-xen unstable and 4.6 (I don't
> know about older versions; I have not tested them).
> @Stefano Stabellini: can you take a look at it please?

It looks like a reasonable effort.  My test machines are unavailable at
the moment, when they are back I'll commit (assuming it passes the
tests).


> As for the qemu used in unstable, I think it would be good to update to
> 2.4.0.1 now that xen 4.7 development has started. I think we pick up
> upstream qemu with too much latency, and that raises the risk of bugs with
> xen for the majority of distros, which use a newer qemu version (following
> the stable versions of both).

I'll upgrade QEMU as soon as possible.



Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included) - patch backport propose

2015-10-16 Thread Fabio Fantoni

On 09/10/2015 09:56, Fabio Fantoni wrote:

On 08/10/2015 17:58, Andreas Kinzler wrote:

Is this still current? I made an interesting observation:

I had no problems with SPICE and vanilla Xen 4.5.1 when using it on 
Gentoo with glibc 2.19/gcc 4.6.4.
Segfaults started when I switched to glibc 2.20/gcc 4.9.3 - I did not 
change Xen source code at all.
All this might be related to: 
https://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg02764.html


Andreas


Thanks for your mail.
The problem I had seems different. I did not find the exact cause, but I
solved it by using a newer qemu (2.2 from unstable, now 4.6) with xen 4.5.1.

The big distros I looked at already use a newer qemu and should be OK.
I am still using glibc < 2.20 on my Debian servers, which is probably why
I did not hit your problem, but I think backporting the patch you linked
could be useful to fix a qemu crash case. I'll test it in my next build
and, if I find no regression, I'll request the backport for the xen qemu
git trees.





https://github.com/Fantu/Xen/commits/rebase/m2r-staging
The latest test showing the regression was based on the latest stable-4.5,
more exactly:
https://github.com/Fantu/Xen/commits/rebase/m2r-testing
Some days ago, on the same dom0 and domU, I tried the latest stable
version (which I use on only 2 production servers for now, and did not
see the regression), more exactly:

https://github.com/Fantu/Xen/commits/rebase/m2r-stable-4.5
Dom0 is Debian 7 with kernel 3.16 from backports, seabios 1.8.1-2 from
unstable, and this xen configure:
./configure --prefix=/usr --disable-blktap1 \
  --disable-qemu-traditional --disable-rombios \
  --with-system-seabios=/usr/share/seabios/bios-256k.bin \
  --with-extra-qemuu-configure-args="--enable-spice --enable-usb-redir" \
  --disable-blktap2


I suppose there is an unexpected case caused by a backport, or by one or
more patches missing from the backports from unstable.
A quick look did not turn up a relevant patch to try reverting. Can
anyone suggest the most probable point(s) to bisect and/or a patch to
revert, or must I try a full bisect 4.5.0->stable-4.5?








I tried xen 4.6 with its qemu plus a cherry-pick of this patch:
http://git.qemu.org/?p=qemu.git;a=commit;h=c6e484707f28b3e115e64122a0570f6b3c585489
- spice-display: fix segfault in qemu_spice_create_update

I used it for some days without seeing any regression.
It is probably useful to apply it to qemu-xen unstable and 4.6 (I don't
know about older versions; I have not tested them).
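
For readers unfamiliar with the backport step described here, the following is a minimal sketch of how a single upstream fix can be cherry-picked onto a stable branch. It runs in a throwaway repository: the branch name "upstream", the files display.c and feature.c, and the commit messages are all hypothetical stand-ins, not the real qemu tree or the real patch.

```shell
#!/bin/sh
# Sketch only: demonstrates "git cherry-pick" in a temporary repository.
# Every file and branch name below is a hypothetical stand-in.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "demo"

# The stable branch starts with the buggy file.
echo "base" > display.c
git add display.c
git commit -q -m "base"
stable=$(git symbolic-ref --short HEAD)

# The upstream branch gains an unrelated feature, then the fix.
git checkout -q -b upstream
echo "feature" > feature.c
git add feature.c
git commit -q -m "unrelated upstream feature"
echo "fixed" > display.c
git add display.c
git commit -q -m "fix segfault (demo)"
fix=$(git rev-parse HEAD)

# Backport only the fix; -x records the original commit id in the message.
git checkout -q "$stable"
git cherry-pick -x "$fix" >/dev/null

picked=$(cat display.c)
if [ -e feature.c ]; then has_feature=yes; else has_feature=no; fi
echo "display.c on $stable: $picked (feature.c present: $has_feature)"

# Clean up the throwaway repository.
cd /
rm -rf "$repo"
```

In a real qemu-xen checkout the equivalent step would presumably be a single `git cherry-pick -x c6e484707f28b3e115e64122a0570f6b3c585489`, assuming the upstream qemu history has been fetched into that repository.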

@Stefano Stabellini: can you take a look at it please?

As for the qemu used in unstable, I think it would be good to update to
2.4.0.1 now that xen 4.7 development has started. I think we pick up
upstream qemu with too much latency, and that raises the risk of bugs with
xen for the majority of distros, which use a newer qemu version (following
the stable versions of both).




Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-10-09 Thread Fabio Fantoni

On 08/10/2015 17:58, Andreas Kinzler wrote:

Is this still current? I made an interesting observation:

I had no problems with SPICE and vanilla Xen 4.5.1 when using it on 
Gentoo with glibc 2.19/gcc 4.6.4.
Segfaults started when I switched to glibc 2.20/gcc 4.9.3 - I did not 
change Xen source code at all.
All this might be related to: 
https://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg02764.html


Andreas


Thanks for your mail.
The problem I had seems different. I did not find the exact cause, but I
solved it by using a newer qemu (2.2 from unstable, now 4.6) with xen 4.5.1.

The big distros I looked at already use a newer qemu and should be OK.
I am still using glibc < 2.20 on my Debian servers, which is probably why
I did not hit your problem, but I think backporting the patch you linked
could be useful to fix a qemu crash case. I'll test it in my next build
and, if I find no regression, I'll request the backport for the xen qemu
git trees.





https://github.com/Fantu/Xen/commits/rebase/m2r-staging
The latest test showing the regression was based on the latest stable-4.5,
more exactly:
https://github.com/Fantu/Xen/commits/rebase/m2r-testing
Some days ago, on the same dom0 and domU, I tried the latest stable
version (which I use on only 2 production servers for now, and did not
see the regression), more exactly:

https://github.com/Fantu/Xen/commits/rebase/m2r-stable-4.5
Dom0 is Debian 7 with kernel 3.16 from backports, seabios 1.8.1-2 from
unstable, and this xen configure:
./configure --prefix=/usr --disable-blktap1 \
  --disable-qemu-traditional --disable-rombios \
  --with-system-seabios=/usr/share/seabios/bios-256k.bin \
  --with-extra-qemuu-configure-args="--enable-spice --enable-usb-redir" \
  --disable-blktap2


I suppose there is an unexpected case caused by a backport, or by one or
more patches missing from the backports from unstable.
A quick look did not turn up a relevant patch to try reverting. Can
anyone suggest the most probable point(s) to bisect and/or a patch to
revert, or must I try a full bisect 4.5.0->stable-4.5?









Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-10-08 Thread Andreas Kinzler

Is this still current?

I made an interesting observation: I had no problems with SPICE and 
vanilla Xen 4.5.1 when using it on Gentoo with glibc 2.19/gcc 4.6.4.
Segfaults started when I switched to glibc 2.20/gcc 4.9.3 - I did not 
change Xen source code at all.
All this might be related to: 
https://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg02764.html


Andreas




Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-15 Thread Stefano Stabellini
On Wed, 13 May 2015, Fabio Fantoni wrote:
> On 12/05/2015 16:44, Stefano Stabellini wrote:
> > On Tue, 12 May 2015, Stefano Stabellini wrote:
> > > On Tue, 12 May 2015, Fabio Fantoni wrote:
> > > > On 12/05/2015 12:26, Fabio Fantoni wrote:
> > > > > On 12/05/2015 11:23, Fabio Fantoni wrote:
> > > > > > On 11/05/2015 17:04, Fabio Fantoni wrote:
> > > > > > > On 21/04/2015 14:53, Stefano Stabellini wrote:
> > > > > > > > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > > > > > > > On 21/04/2015 12:49, Stefano Stabellini wrote:
> > > > > > > > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > > > > > > > I updated xen and qemu from xen 4.5.0 with its upstream
> > > > > > > > > > > qemu
> > > > > > > > > > > included to
> > > > > > > > > > > xen
> > > > > > > > > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed
> > > > > > > > > > > Config.mk
> > > > > > > > > > > to use
> > > > > > > > > > > revision "master").
> > > > > > > > > > > After few minutes I booted windows 7 64 bit domU qemu
> > > > > > > > > > > crash,
> > > > > > > > > > > tried 2 times
> > > > > > > > > > > with same result.
> > > > > > > > > > > 
> > > > > > > > > > > In the domU's qemu log:
> > > > > > > > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion
> > > > > > > > > > > > `(old_top ==
> > > > > > > > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > > > > > > > > > __builtin_offsetof
> > > > > > > > > > > > (struct malloc_chunk, fd && old_size == 0) ||
> > > > > > > > > > > > ((unsigned
> > > > > > > > > > > > long)
> > > > > > > > > > > > (old_size) >= (unsigned long)__builtin_offsetof
> > > > > > > > > > > > (struct
> > > > > > > > > > > > malloc_chunk,
> > > > > > > > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 *
> > > > > > > > > > > > (sizeof(size_t))) -
> > > > > > > > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned
> > > > > > > > > > > > long)old_end &
> > > > > > > > > > > > pagemask)
> > > > > > > > > > > > ==
> > > > > > > > > > > > 0)' failed.
> > > > > > > > > > > > Killing all inferiors
> > > > > > > > > > > In attachment the full backtrace of qemu crash.
> > > > > > > > > > > 
> > > > > > > > > > > With a fast search after I saw the backtrace I found a
> > > > > > > > > > > probable
> > > > > > > > > > > cause of
> > > > > > > > > > > regression (I'm not sure):
> > > > > > > > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > > > > > > > spice: make sure we don't overflow ssd->buf
> > > > > > > > > > > 
> > > > > > > > > > > Added also qemu-devel and spice-devel as cc.
> > > > > > > > > > > 
> > > > > > > > > > > If you need more informations/tests tell me and I'll post
> > > > > > > > > > > them.
> > > > > > > > > > Maybe you could try to revert the offending commit
> > > > > > > > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better
> > > > > > > > > > bisect
> > > > > > > > > > the
> > > > > > > > > > crash?
> > > > > > > > > Thanks for your reply.
> > > > > > > > > 
> > > > > > > > > I reverted to 4.5.0 on dom0 for now on that system because I'm
> > > > > > > > > busy
> > > > > > > > > trying to
> > > > > > > > > found another problem that cause very bad performance without
> > > > > > > > > errors
> > > > > > > > > or
> > > > > > > > > nothing in logs :( I don't know if if xen related, kernel
> > > > > > > > > related or
> > > > > > > > > other for
> > > > > > > > > now.
> > > > > > > > > 
> > > > > > > > > About this regression with spice I'll do further tests in next
> > > > > > > > > days
> > > > > > > > > (probably
> > > > > > > > > starting reverting the spice patch in qemu) but any help is
> > > > > > > > > appreciated.
> > > > > > > > > Based on data I have for now is possible that the problem is
> > > > > > > > > that
> > > > > > > > > qemu try to
> > > > > > > > > allocate other ram or videoram after domU create but with xen
> > > > > > > > > is not
> > > > > > > > > possible?
> > > > > > > > > In the spice related patch I saw something about dynamic
> > > > > > > > > allocation
> > > > > > > > > for
> > > > > > > > > example.
> > > > > > > > It is probably caused by a commit in the range:
> > > > > > > > 
> > > > > > > > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> > > > > > > > 
> > > > > > > > there are only 10 commits in that range. By using git bisect you
> > > > > > > > should
> > > > > > > > be able to narrow it down in just 3 tests.
> > > > > > > Sorry for delay, I was busy with many things, today I retried with
> > > > > > > updated stable-4.5 and also reverting "spice: make sure we don't
> > > > > > > overflow ssd->buf" (in a second test) but in both case regression
> > > > > > > remain
> > > > > > > :(
> > > > > > > Tomorrow probably I'll do other tests.
> > > > > > I did another test, reverting this instead:
> > > > > > http://xenbits.xen.org/gitweb/?p

Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-13 Thread Fabio Fantoni

On 12/05/2015 16:44, Stefano Stabellini wrote:

On Tue, 12 May 2015, Stefano Stabellini wrote:

On Tue, 12 May 2015, Fabio Fantoni wrote:

On 12/05/2015 12:26, Fabio Fantoni wrote:

On 12/05/2015 11:23, Fabio Fantoni wrote:

On 11/05/2015 17:04, Fabio Fantoni wrote:

On 21/04/2015 14:53, Stefano Stabellini wrote:

On Tue, 21 Apr 2015, Fabio Fantoni wrote:

On 21/04/2015 12:49, Stefano Stabellini wrote:

On Mon, 20 Apr 2015, Fabio Fantoni wrote:

I updated xen and qemu from xen 4.5.0, with its included upstream qemu, to
xen 4.5.1-pre with upstream qemu from stable-4.5 (I changed Config.mk to
use revision "master").
A few minutes after I booted a Windows 7 64-bit domU, qemu crashed; I tried
2 times with the same result.

In the domU's qemu log:

qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
(((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
(struct malloc_chunk, fd && old_size == 0) || ((unsigned long) (old_size) >=
(unsigned long)__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *
(sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) &&
((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
Killing all inferiors

The full backtrace of the qemu crash is attached.

After seeing the backtrace, a quick search turned up a probable cause of
the regression (I'm not sure):
http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
spice: make sure we don't overflow ssd->buf

I also added qemu-devel and spice-devel in Cc.

If you need more information or tests, tell me and I'll post them.

Maybe you could try to revert the offending commit
(5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
crash?

Thanks for your reply.

I have reverted dom0 on that system to 4.5.0 for now, because I'm busy
trying to track down another problem that causes very bad performance
without any errors or anything else in the logs :( For now I don't know
whether it is xen related, kernel related or something else.

I'll do further tests on this spice regression in the next few days
(probably starting by reverting the spice patch in qemu), but any help is
appreciated.
Based on the data I have so far, is it possible that the problem is that
qemu tries to allocate more RAM or video RAM after the domU has been
created, which is not possible with xen?
For example, I saw something about dynamic allocation in the spice-related
patch.

It is probably caused by a commit in the range:

1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4

There are only 10 commits in that range. By using git bisect you should
be able to narrow it down in just 3 tests.
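
As an illustration of the bisect workflow suggested here, the sketch below builds a throwaway repository with a known-good base plus 10 commits and lets `git bisect run` find the commit that introduces a fake "crash". Everything in it is a stand-in: the file names, the position of the bad commit (7) and the grep-based test are assumptions for the demo. In the real case each step would instead mean rebuilding qemu, booting the domU, and marking the result by hand with `git bisect good` or `git bisect bad`.

```shell
#!/bin/sh
# Sketch only: automated bisect over a 10-commit range in a temporary repo.
# In the real tree the session would presumably start with
#   git bisect start 0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4 \
#                    1ebb75b1fee779621b63e84fefa7b07354c43a99
# assuming the older end of the range is good and the newer end is bad.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "demo"

# Known-good base commit.
echo ok > state
git add state
git commit -q -m "base (known good)"
good=$(git rev-parse HEAD)

# Ten follow-up commits; commit 7 (arbitrary) introduces the "crash".
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -eq 7 ]; then echo crash > state; fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done
bad=$(git rev-parse HEAD)

# Start with <bad> then <good>; "run" marks each step from the exit status
# of the test command (0 = good, 1 = bad).
git bisect start "$bad" "$good" >/dev/null
git bisect run sh -c 'grep -q ok state' >/dev/null 2>&1

# After a completed bisect, refs/bisect/bad names the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"

# Clean up the throwaway repository.
cd /
rm -rf "$repo"
```

Because bisect halves the candidate range at every step, a range of 10 commits needs only a handful of build-and-boot cycles rather than one per commit.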

Sorry for the delay, I was busy with many things. Today I retried with an
updated stable-4.5, and (in a second test) also with "spice: make sure we
don't overflow ssd->buf" reverted, but in both cases the regression
remains :(
I'll probably do more tests tomorrow.

I did another test, reverting this instead:
http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
Now I seem to be unable to reproduce the regression: before, it happened
after anywhere from a few seconds to 1-2 minutes; now I have used the same
domU for 15-20 minutes without problems.
This is probably the cause of the regression, although it seems strange
that it did not happen in my tests of some days ago on unstable, which has
the same patch.

Any ideas?

Thanks for any reply, and sorry for my bad English.

Bad news: the qemu crash still happens, although this time the qemu log
shows different output (see attachment).
Looking at the other patches, I saw:
http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commitdiff;h=7154fba0e51ec985ef621965d1b7120ad424fcbf
Since it has "Conflicts: hw/display/vga.c" in its description, I'll try
reverting it instead.

Or can someone suggest another likely test I can try?

I also tried reverting the patch above, with the same result, so I retried
with qemu from 4.5.0 and the crash seems to happen in that case too... I'm
going crazy :(

Sorry, I missed this bit before. The only thing I could suggest at this
point, would be to make sure that you have a clean test environment.
Usually this happens when you have some "leftovers" from previous broken
tests.


I use make debball so that all files are tracked and removed on package
update.
Now I retried with the latest xen-unstable and the qemu crash did not
happen; more exactly, I used this:

https://github.com/Fantu/Xen/commits/rebase/m2r-staging
The latest test showing the regression was based on the latest stable-4.5,
more exactly:
https://github.com/Fantu/Xen/commits/rebase/m2r-testing
Some days ago, on the same dom0 and domU, I tried the latest stable
version (which I use on only 2 production servers for now, and did not
see the regression), more exactly:

https://github.com/Fantu/Xen/commits/rebase/m2r-stable-4.5
Dom0 is Debian 7 with kernel 3.16 from backports, seabios 1.8.1-2 from
unstable, and this xen configure:
./configure --prefix=/usr --disable-blktap1 --disable-qemu-tradit

Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-12 Thread Stefano Stabellini
On Tue, 12 May 2015, Stefano Stabellini wrote:
> On Tue, 12 May 2015, Fabio Fantoni wrote:
> > On 12/05/2015 12:26, Fabio Fantoni wrote:
> > > On 12/05/2015 11:23, Fabio Fantoni wrote:
> > > > On 11/05/2015 17:04, Fabio Fantoni wrote:
> > > > > On 21/04/2015 14:53, Stefano Stabellini wrote:
> > > > > > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > > > > > On 21/04/2015 12:49, Stefano Stabellini wrote:
> > > > > > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > > > > > I updated xen and qemu from xen 4.5.0 with its upstream qemu
> > > > > > > > > included to
> > > > > > > > > xen
> > > > > > > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed 
> > > > > > > > > Config.mk
> > > > > > > > > to use
> > > > > > > > > revision "master").
> > > > > > > > > After few minutes I booted windows 7 64 bit domU qemu crash,
> > > > > > > > > tried 2 times
> > > > > > > > > with same result.
> > > > > > > > > 
> > > > > > > > > In the domU's qemu log:
> > > > > > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion
> > > > > > > > > > `(old_top ==
> > > > > > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > > > > > > > __builtin_offsetof
> > > > > > > > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned
> > > > > > > > > > long)
> > > > > > > > > > (old_size) >= (unsigned long)__builtin_offsetof (struct
> > > > > > > > > > malloc_chunk,
> > > > > > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 *
> > > > > > > > > > (sizeof(size_t))) -
> > > > > > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end &
> > > > > > > > > > pagemask)
> > > > > > > > > > ==
> > > > > > > > > > 0)' failed.
> > > > > > > > > > Killing all inferiors
> > > > > > > > > In attachment the full backtrace of qemu crash.
> > > > > > > > > 
> > > > > > > > > With a fast search after I saw the backtrace I found a 
> > > > > > > > > probable
> > > > > > > > > cause of
> > > > > > > > > regression (I'm not sure):
> > > > > > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > > > > >  
> > > > > > > > > spice: make sure we don't overflow ssd->buf
> > > > > > > > > 
> > > > > > > > > Added also qemu-devel and spice-devel as cc.
> > > > > > > > > 
> > > > > > > > > If you need more informations/tests tell me and I'll post 
> > > > > > > > > them.
> > > > > > > >Maybe you could try to revert the offending commit
> > > > > > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better 
> > > > > > > > bisect
> > > > > > > > the
> > > > > > > > crash?
> > > > > > > Thanks for your reply.
> > > > > > > 
> > > > > > > I reverted to 4.5.0 on dom0 for now on that system because I'm 
> > > > > > > busy
> > > > > > > trying to
> > > > > > > found another problem that cause very bad performance without 
> > > > > > > errors
> > > > > > > or
> > > > > > > nothing in logs :( I don't know if if xen related, kernel related 
> > > > > > > or
> > > > > > > other for
> > > > > > > now.
> > > > > > > 
> > > > > > > About this regression with spice I'll do further tests in next 
> > > > > > > days
> > > > > > > (probably
> > > > > > > starting reverting the spice patch in qemu) but any help is
> > > > > > > appreciated.
> > > > > > > Based on data I have for now is possible that the problem is that
> > > > > > > qemu try to
> > > > > > > allocate other ram or videoram after domU create but with xen is 
> > > > > > > not
> > > > > > > possible?
> > > > > > > In the spice related patch I saw something about dynamic 
> > > > > > > allocation
> > > > > > > for
> > > > > > > example.
> > > > > > It is probably caused by a commit in the range:
> > > > > > 
> > > > > > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> > > > > >  
> > > > > > 
> > > > > > there are only 10 commits in that range. By using git bisect you
> > > > > > should
> > > > > > be able to narrow it down in just 3 tests.
> > > > > 
> > > > > Sorry for delay, I was busy with many things, today I retried with
> > > > > updated stable-4.5 and also reverting "spice: make sure we don't
> > > > > overflow ssd->buf" (in a second test) but in both case regression 
> > > > > remain
> > > > > :(
> > > > > Tomorrow probably I'll do other tests.
> > > > 
> > > > I did another test, reverting this instead:
> > > > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
> > > >  
> > > > And now seems I'm unable to reproduce the regression, before happen 
> > > > after
> > > > few seconds up to 1-2 minutes, now I use the same domU 15-20 minutes
> > > > without problem.
> > > > Probably is the cause of regression even if seems strange that on 
> > > > unstable
> > > > with same patch on tests of some days ago didn't happen.
> > > > 
> > > > Any ideas?
> > > > 
> > > > Thanks for any reply and sorry for my bad english.
> > > 
> > 

Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-12 Thread Stefano Stabellini
On Tue, 12 May 2015, Fabio Fantoni wrote:
> On 12/05/2015 12:26, Fabio Fantoni wrote:
> > On 12/05/2015 11:23, Fabio Fantoni wrote:
> > > On 11/05/2015 17:04, Fabio Fantoni wrote:
> > > > On 21/04/2015 14:53, Stefano Stabellini wrote:
> > > > > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > > > > On 21/04/2015 12:49, Stefano Stabellini wrote:
> > > > > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > > > > I updated xen and qemu from xen 4.5.0 with its upstream qemu
> > > > > > > > included to
> > > > > > > > xen
> > > > > > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk
> > > > > > > > to use
> > > > > > > > revision "master").
> > > > > > > > After few minutes I booted windows 7 64 bit domU qemu crash,
> > > > > > > > tried 2 times
> > > > > > > > with same result.
> > > > > > > > 
> > > > > > > > In the domU's qemu log:
> > > > > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion
> > > > > > > > > `(old_top ==
> > > > > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > > > > > > __builtin_offsetof
> > > > > > > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned
> > > > > > > > > long)
> > > > > > > > > (old_size) >= (unsigned long)__builtin_offsetof (struct
> > > > > > > > > malloc_chunk,
> > > > > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 *
> > > > > > > > > (sizeof(size_t))) -
> > > > > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end &
> > > > > > > > > pagemask)
> > > > > > > > > ==
> > > > > > > > > 0)' failed.
> > > > > > > > > Killing all inferiors
> > > > > > > > In attachment the full backtrace of qemu crash.
> > > > > > > > 
> > > > > > > > With a fast search after I saw the backtrace I found a probable
> > > > > > > > cause of
> > > > > > > > regression (I'm not sure):
> > > > > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > > > >  
> > > > > > > > spice: make sure we don't overflow ssd->buf
> > > > > > > > 
> > > > > > > > Added also qemu-devel and spice-devel as cc.
> > > > > > > > 
> > > > > > > > If you need more informations/tests tell me and I'll post them.
> > > > > > >Maybe you could try to revert the offending commit
> > > > > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect
> > > > > > > the
> > > > > > > crash?
> > > > > > Thanks for your reply.
> > > > > > 
> > > > > > I reverted to 4.5.0 on dom0 for now on that system because I'm busy
> > > > > > trying to
> > > > > > found another problem that cause very bad performance without errors
> > > > > > or
> > > > > > nothing in logs :( I don't know if if xen related, kernel related or
> > > > > > other for
> > > > > > now.
> > > > > > 
> > > > > > About this regression with spice I'll do further tests in next days
> > > > > > (probably
> > > > > > starting reverting the spice patch in qemu) but any help is
> > > > > > appreciated.
> > > > > > Based on data I have for now is possible that the problem is that
> > > > > > qemu try to
> > > > > > allocate other ram or videoram after domU create but with xen is not
> > > > > > possible?
> > > > > > In the spice related patch I saw something about dynamic allocation
> > > > > > for
> > > > > > example.
> > > > > It is probably caused by a commit in the range:
> > > > > 
> > > > > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> > > > >  
> > > > > 
> > > > > there are only 10 commits in that range. By using git bisect you
> > > > > should
> > > > > be able to narrow it down in just 3 tests.
> > > > 
> > > > Sorry for delay, I was busy with many things, today I retried with
> > > > updated stable-4.5 and also reverting "spice: make sure we don't
> > > > overflow ssd->buf" (in a second test) but in both case regression remain
> > > > :(
> > > > Tomorrow probably I'll do other tests.
> > > 
> > > I did another test, reverting this instead:
> > > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
> > >  
> > > And now seems I'm unable to reproduce the regression, before happen after
> > > few seconds up to 1-2 minutes, now I use the same domU 15-20 minutes
> > > without problem.
> > > Probably is the cause of regression even if seems strange that on unstable
> > > with same patch on tests of some days ago didn't happen.
> > > 
> > > Any ideas?
> > > 
> > > Thanks for any reply and sorry for my bad english.
> > 
> > Bad news, qemu crash still happen even if this time in qemu log there is
> > another output, see attachment.
> > After take a look on the other patches I saw:
> > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commitdiff;h=7154fba0e51ec985ef621965d1b7120ad424fcbf
> >  
> > With "Conflicts: hw/display/vga.c" in description I'll try to revert it
> > instead.
> > 
> > Or someone can tell me another probable test I can t

Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-12 Thread Fabio Fantoni

On 12/05/2015 12:26, Fabio Fantoni wrote:
> On 12/05/2015 11:23, Fabio Fantoni wrote:
> > On 11/05/2015 17:04, Fabio Fantoni wrote:
> > > On 21/04/2015 14:53, Stefano Stabellini wrote:
> > > > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > > > On 21/04/2015 12:49, Stefano Stabellini wrote:
> > > > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > > > I updated xen and qemu from xen 4.5.0 with its upstream qemu included to xen
> > > > > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use
> > > > > > > revision "master").
> > > > > > > After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times
> > > > > > > with same result.
> > > > > > >
> > > > > > > In the domU's qemu log:
> > > > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
> > > > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
> > > > > > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned long)
> > > > > > > > (old_size) >= (unsigned long)__builtin_offsetof (struct malloc_chunk,
> > > > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
> > > > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) ==
> > > > > > > > 0)' failed.
> > > > > > > > Killing all inferiors
> > > > > > > In attachment the full backtrace of qemu crash.
> > > > > > >
> > > > > > > With a fast search after I saw the backtrace I found a probable cause of
> > > > > > > regression (I'm not sure):
> > > > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > > > spice: make sure we don't overflow ssd->buf
> > > > > > >
> > > > > > > Added also qemu-devel and spice-devel as cc.
> > > > > > >
> > > > > > > If you need more informations/tests tell me and I'll post them.
> > > > > > Maybe you could try to revert the offending commit
> > > > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
> > > > > > crash?
> > > > > Thanks for your reply.
> > > > >
> > > > > I reverted to 4.5.0 on dom0 for now on that system because I'm busy trying to
> > > > > found another problem that cause very bad performance without errors or
> > > > > nothing in logs :( I don't know if if xen related, kernel related or other for
> > > > > now.
> > > > >
> > > > > About this regression with spice I'll do further tests in next days (probably
> > > > > starting reverting the spice patch in qemu) but any help is appreciated.
> > > > > Based on data I have for now is possible that the problem is that qemu try to
> > > > > allocate other ram or videoram after domU create but with xen is not possible?
> > > > > In the spice related patch I saw something about dynamic allocation for
> > > > > example.
> > > > It is probably caused by a commit in the range:
> > > >
> > > > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> > > >
> > > > there are only 10 commits in that range. By using git bisect you should
> > > > be able to narrow it down in just 3 tests.
> > >
> > > Sorry for delay, I was busy with many things, today I retried with
> > > updated stable-4.5 and also reverting "spice: make sure we don't
> > > overflow ssd->buf" (in a second test) but in both case regression remain :(
> > >
> > > Tomorrow probably I'll do other tests.
> >
> > I did another test, reverting this instead:
> > http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
> > And now seems I'm unable to reproduce the regression, before happen
> > after few seconds up to 1-2 minutes, now I use the same domU 15-20
> > minutes without problem.
> > Probably is the cause of regression even if seems strange that on
> > unstable with same patch on tests of some days ago didn't happen.
> >
> > Any ideas?
> >
> > Thanks for any reply and sorry for my bad english.
>
> Bad news, qemu crash still happen even if this time in qemu log there
> is another output, see attachment.
>
> After take a look on the other patches I saw:
> http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commitdiff;h=7154fba0e51ec985ef621965d1b7120ad424fcbf
> With "Conflicts: hw/display/vga.c" in description I'll try to revert
> it instead.
>
> Or someone can tell me another probable test I can try?


I also tried reverting the patch above, with the same result, so I retried
with qemu from 4.5.0 and the crash seems to happen in that case too... I'm
going crazy :(

The full gdb log is attached.

Any ideas on how to find the problem, please?

Thanks for any reply and sorry for my bad English.
Full backtrace:
#0  0x736e8165 in *__GI_raise (sig=) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
pid = 
selftid = 
#1  0x736eb3e0 in *__GI_abort () at abort.c:92
act = {__sigaction_handler = {sa_handler = 0x58ddeba0, sa_sigaction 
= 0x58ddeba0}, sa_mask = {__val = {140737278660816, 140737014337136, 4, 
140737014337376, 140737277706678, 206158430256, 140737014337416, 
140737014337168, 87, 226653584, 140737351936019, 140737488348083, 
140737278647399, 140737278651152, 3096, 140737277299604}}, sa_flags = 
-474017696, sa_restorer = 0x736b9c60}
sigs = {__val = {32, 0 }}
#2  0x7372bdea in __malloc_assert (assertion=, 
file=, line=, function=) at 
malloc.c:351
No locals.
#3  0x7372ed13 in sYSMALLOc (av=, nb=) at 
malloc.c:3093
snd_brk = 
front_misalign = 
remainder = 
tried_mmap = false
old_size = 
size = 
   

Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-12 Thread Fabio Fantoni

On 12/05/2015 11:23, Fabio Fantoni wrote:
> On 11/05/2015 17:04, Fabio Fantoni wrote:
> > On 21/04/2015 14:53, Stefano Stabellini wrote:
> > > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > > On 21/04/2015 12:49, Stefano Stabellini wrote:
> > > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > > I updated xen and qemu from xen 4.5.0 with its upstream qemu included to xen
> > > > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use
> > > > > > revision "master").
> > > > > > After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times
> > > > > > with same result.
> > > > > >
> > > > > > In the domU's qemu log:
> > > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
> > > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
> > > > > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned long)
> > > > > > > (old_size) >= (unsigned long)__builtin_offsetof (struct malloc_chunk,
> > > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
> > > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) ==
> > > > > > > 0)' failed.
> > > > > > > Killing all inferiors
> > > > > > In attachment the full backtrace of qemu crash.
> > > > > >
> > > > > > With a fast search after I saw the backtrace I found a probable cause of
> > > > > > regression (I'm not sure):
> > > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > > spice: make sure we don't overflow ssd->buf
> > > > > >
> > > > > > Added also qemu-devel and spice-devel as cc.
> > > > > >
> > > > > > If you need more informations/tests tell me and I'll post them.
> > > > > Maybe you could try to revert the offending commit
> > > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
> > > > > crash?
> > > > Thanks for your reply.
> > > >
> > > > I reverted to 4.5.0 on dom0 for now on that system because I'm busy trying to
> > > > found another problem that cause very bad performance without errors or
> > > > nothing in logs :( I don't know if if xen related, kernel related or other for
> > > > now.
> > > >
> > > > About this regression with spice I'll do further tests in next days (probably
> > > > starting reverting the spice patch in qemu) but any help is appreciated.
> > > > Based on data I have for now is possible that the problem is that qemu try to
> > > > allocate other ram or videoram after domU create but with xen is not possible?
> > > > In the spice related patch I saw something about dynamic allocation for
> > > > example.
> > > It is probably caused by a commit in the range:
> > >
> > > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> > >
> > > there are only 10 commits in that range. By using git bisect you should
> > > be able to narrow it down in just 3 tests.
> >
> > Sorry for delay, I was busy with many things, today I retried with
> > updated stable-4.5 and also reverting "spice: make sure we don't
> > overflow ssd->buf" (in a second test) but in both case regression remain :(
> >
> > Tomorrow probably I'll do other tests.
>
> I did another test, reverting this instead:
> http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
> And now seems I'm unable to reproduce the regression, before happen
> after few seconds up to 1-2 minutes, now I use the same domU 15-20
> minutes without problem.
> Probably is the cause of regression even if seems strange that on
> unstable with same patch on tests of some days ago didn't happen.
>
> Any ideas?
>
> Thanks for any reply and sorry for my bad english.


Bad news: the qemu crash still happens, although this time the qemu log
shows different output (see attachment).

After looking at the other patches I saw:
http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commitdiff;h=7154fba0e51ec985ef621965d1b7120ad424fcbf
Its description contains "Conflicts: hw/display/vga.c", so I'll try
reverting it instead.


Or can someone suggest another test I can try?
main_channel_link: add main channel client
main_channel_handle_parsed: net test: latency 7.814000 ms, bitrate 5417989417 bps (5166.997354 Mbps)
inputs_connect: inputs channel client create
red_dispatcher_set_cursor_peer: 
main_channel_handle_parsed: agent start
main_channel_handle_parsed: agent start
red_channel_client_disconnect: rcc=0x7fa861b2eb60 (channel=0x7fa861b07e80 type=3 id=0)
red_channel_client_disconnect: rcc=0x7fa862349720 (channel=0x7fa861bc8b80 type=2 id=0)
red_channel_client_disconnect: rcc=0x7fa861ca6580 (channel=0x7fa861bb3990 type=4 id=0)
red_channel_client_disconnect_dummy: rcc=0x7fa861ecae20 (channel=0x7fa861c22c40 type=6 id=0)
snd_channel_put: SndChannel=0x7fa861e6f620 freed
red_channel_client_disconnect_dummy: rcc=0x7fa861ec6be0 (channel=0x7fa861b8c890 type=5 id=0)
snd_channel_put: SndChannel=0x7fa861d97ab0 freed
red_channel_client_disconnect: rcc=0x7fa861fbf120 (channel=0x7fa861bd57b0 type=9 id=0)
red_channel_client_disconnect: rcc=0x7fa861fbaee0 (channel=0x7fa861baff50 type=9 id=1)
red_channel_client_disconnect: rcc=0x7fa861eff170 (channel=0x7fa861c79f90 type=9 id=2)
red_channel_client_disconnect: rcc=0x7fa861f044b0 (channel=0x7fa861c7a7b0 type=9 id=3)
red_channel_client_disconnect: rcc=0x7fa861b52ad0 (channel=0x7fa861afc3f0 type=1

Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-12 Thread Fabio Fantoni

On 11/05/2015 17:04, Fabio Fantoni wrote:
> On 21/04/2015 14:53, Stefano Stabellini wrote:
> > On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > > On 21/04/2015 12:49, Stefano Stabellini wrote:
> > > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > > I updated xen and qemu from xen 4.5.0 with its upstream qemu included to xen
> > > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use
> > > > > revision "master").
> > > > > After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times
> > > > > with same result.
> > > > >
> > > > > In the domU's qemu log:
> > > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
> > > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
> > > > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned long)
> > > > > > (old_size) >= (unsigned long)__builtin_offsetof (struct malloc_chunk,
> > > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
> > > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) ==
> > > > > > 0)' failed.
> > > > > > Killing all inferiors
> > > > > In attachment the full backtrace of qemu crash.
> > > > >
> > > > > With a fast search after I saw the backtrace I found a probable cause of
> > > > > regression (I'm not sure):
> > > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > > spice: make sure we don't overflow ssd->buf
> > > > >
> > > > > Added also qemu-devel and spice-devel as cc.
> > > > >
> > > > > If you need more informations/tests tell me and I'll post them.
> > > > Maybe you could try to revert the offending commit
> > > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
> > > > crash?
> > > Thanks for your reply.
> > >
> > > I reverted to 4.5.0 on dom0 for now on that system because I'm busy trying to
> > > found another problem that cause very bad performance without errors or
> > > nothing in logs :( I don't know if if xen related, kernel related or other for
> > > now.
> > >
> > > About this regression with spice I'll do further tests in next days (probably
> > > starting reverting the spice patch in qemu) but any help is appreciated.
> > > Based on data I have for now is possible that the problem is that qemu try to
> > > allocate other ram or videoram after domU create but with xen is not possible?
> > > In the spice related patch I saw something about dynamic allocation for
> > > example.
> > It is probably caused by a commit in the range:
> >
> > 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
> >
> > there are only 10 commits in that range. By using git bisect you should
> > be able to narrow it down in just 3 tests.
>
> Sorry for delay, I was busy with many things, today I retried with
> updated stable-4.5 and also reverting "spice: make sure we don't
> overflow ssd->buf" (in a second test) but in both case regression remain :(
>
> Tomorrow probably I'll do other tests.


I did another test, reverting this commit instead:
http://xenbits.xen.org/gitweb/?p=qemu-upstream-4.5-testing.git;a=commit;h=c9ac5f816bf3a8b56f836b078711dcef6e5c90b8
Now I seem unable to reproduce the regression: before, the crash happened
after a few seconds up to 1-2 minutes, while now I have used the same domU
for 15-20 minutes without problems.
This commit is probably the cause of the regression, although it seems
strange that the crash didn't happen on unstable, with the same patch, in
my tests some days ago.


Any ideas?

Thanks for any reply and sorry for my bad English.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-05-11 Thread Fabio Fantoni

On 21/04/2015 14:53, Stefano Stabellini wrote:
> On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> > On 21/04/2015 12:49, Stefano Stabellini wrote:
> > > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > > I updated xen and qemu from xen 4.5.0 with its upstream qemu included to
> > > > xen
> > > > 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use
> > > > revision "master").
> > > > After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times
> > > > with same result.
> > > >
> > > > In the domU's qemu log:
> > > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
> > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > > __builtin_offsetof
> > > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned long)
> > > > > (old_size) >= (unsigned long)__builtin_offsetof (struct
> > > > > malloc_chunk,
> > > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
> > > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask)
> > > > > ==
> > > > > 0)' failed.
> > > > > Killing all inferiors
> > > > In attachment the full backtrace of qemu crash.
> > > >
> > > > With a fast search after I saw the backtrace I found a probable cause of
> > > > regression (I'm not sure):
> > > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > > spice: make sure we don't overflow ssd->buf
> > > >
> > > > Added also qemu-devel and spice-devel as cc.
> > > >
> > > > If you need more informations/tests tell me and I'll post them.
> > > Maybe you could try to revert the offending commit
> > > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
> > > crash?
> > Thanks for your reply.
> >
> > I reverted to 4.5.0 on dom0 for now on that system because I'm busy trying to
> > found another problem that cause very bad performance without errors or
> > nothing in logs :( I don't know if if xen related, kernel related or other for
> > now.
> >
> > About this regression with spice I'll do further tests in next days (probably
> > starting reverting the spice patch in qemu) but any help is appreciated.
> > Based on data I have for now is possible that the problem is that qemu try to
> > allocate other ram or videoram after domU create but with xen is not possible?
> > In the spice related patch I saw something about dynamic allocation for
> > example.
> It is probably caused by a commit in the range:
>
> 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4
>
> there are only 10 commits in that range. By using git bisect you should
> be able to narrow it down in just 3 tests.


Sorry for the delay, I was busy with many things. Today I retried with
updated stable-4.5, and (in a second test) also with "spice: make sure we
don't overflow ssd->buf" reverted, but in both cases the regression remains :(

I'll probably do other tests tomorrow.


Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-04-21 Thread Stefano Stabellini
On Tue, 21 Apr 2015, Fabio Fantoni wrote:
> On 21/04/2015 12:49, Stefano Stabellini wrote:
> > On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > > I updated xen and qemu from xen 4.5.0 with its upstream qemu included to
> > > xen
> > > 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use
> > > revision "master").
> > > After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times
> > > with same result.
> > > 
> > > In the domU's qemu log:
> > > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
> > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > __builtin_offsetof
> > > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned long)
> > > > (old_size) >= (unsigned long)__builtin_offsetof (struct
> > > > malloc_chunk,
> > > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
> > > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask)
> > > > ==
> > > > 0)' failed.
> > > > Killing all inferiors
> > > In attachment the full backtrace of qemu crash.
> > > 
> > > With a fast search after I saw the backtrace I found a probable cause of
> > > regression (I'm not sure):
> > > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > > spice: make sure we don't overflow ssd->buf
> > > 
> > > Added also qemu-devel and spice-devel as cc.
> > > 
> > > If you need more informations/tests tell me and I'll post them.
> >   Maybe you could try to revert the offending commit
> > (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
> > crash?
> Thanks for your reply.
> 
> I reverted to 4.5.0 on dom0 for now on that system because I'm busy trying to
> found another problem that cause very bad performance without errors or
> nothing in logs :( I don't know if if xen related, kernel related or other for
> now.
> 
> About this regression with spice I'll do further tests in next days (probably
> starting reverting the spice patch in qemu) but any help is appreciated.
> Based on data I have for now is possible that the problem is that qemu try to
> allocate other ram or videoram after domU create but with xen is not possible?
> In the spice related patch I saw something about dynamic allocation for
> example.

It is probably caused by a commit in the range:

1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4

there are only 10 commits in that range. By using git bisect you should
be able to narrow it down in just 3 tests.


Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-04-21 Thread Fabio Fantoni

On 21/04/2015 12:49, Stefano Stabellini wrote:
> On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> > I updated xen and qemu from xen 4.5.0 with its upstream qemu included to xen
> > 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use
> > revision "master").
> > After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times
> > with same result.
> >
> > In the domU's qemu log:
> > > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
> > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
> > > (struct malloc_chunk, fd && old_size == 0) || ((unsigned long)
> > > (old_size) >= (unsigned long)__builtin_offsetof (struct malloc_chunk,
> > > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
> > > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) ==
> > > 0)' failed.
> > > Killing all inferiors
> >
> > In attachment the full backtrace of qemu crash.
> >
> > With a fast search after I saw the backtrace I found a probable cause of
> > regression (I'm not sure):
> > http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> > spice: make sure we don't overflow ssd->buf
> >
> > Added also qemu-devel and spice-devel as cc.
> >
> > If you need more informations/tests tell me and I'll post them.
>
> Maybe you could try to revert the offending commit
> (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
> crash?

Thanks for your reply.

For now I have reverted dom0 on that system to 4.5.0, because I'm busy
trying to find another problem that causes very bad performance with no
errors or anything else in the logs :( For now I don't know whether it is
Xen related, kernel related or something else.


I'll do further tests on this spice regression in the next days (probably
starting by reverting the spice patch in qemu), but any help is
appreciated.
Based on the data I have so far, could the problem be that qemu tries to
allocate additional RAM or video RAM after the domU has been created, which
is not possible with Xen?
For example, I saw something about dynamic allocation in the spice-related
patch.



Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-04-21 Thread Stefano Stabellini
On Mon, 20 Apr 2015, Fabio Fantoni wrote:
> I updated xen and qemu from xen 4.5.0 with its upstream qemu included to xen
> 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use
> revision "master").
> After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times
> with same result.
> 
> In the domU's qemu log:
> > qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
> > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
> > (struct malloc_chunk, fd && old_size == 0) || ((unsigned long)
> > (old_size) >= (unsigned long)__builtin_offsetof (struct malloc_chunk,
> > fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
> > 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) ==
> > 0)' failed.
> > Killing all inferiors
> 
> In attachment the full backtrace of qemu crash.
> 
> With a fast search after I saw the backtrace I found a probable cause of
> regression (I'm not sure):
> http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
> spice: make sure we don't overflow ssd->buf
> 
> Added also qemu-devel and spice-devel as cc.
> 
> If you need more informations/tests tell me and I'll post them.
 
Maybe you could try to revert the offending commit
(5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
crash?


[Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-04-20 Thread Fabio Fantoni
I updated Xen and qemu from Xen 4.5.0 (with its bundled upstream qemu) to
Xen 4.5.1-pre with upstream qemu from stable-4.5 (I changed Config.mk to
use revision "master").
A few minutes after I booted a Windows 7 64-bit domU, qemu crashed; I tried
twice with the same result.


In the domU's qemu log:
qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top == 
(((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - 
__builtin_offsetof (struct malloc_chunk, fd && old_size == 0) || 
((unsigned long) (old_size) >= (unsigned long)__builtin_offsetof 
(struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & 
~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && 
((unsigned long)old_end & pagemask) == 0)' failed.

Killing all inferiors


The full backtrace of the qemu crash is attached.

With a quick search after looking at the backtrace, I found a probable
cause of the regression (I'm not sure):

http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
spice: make sure we don't overflow ssd->buf

I also added qemu-devel and spice-devel in Cc.

If you need more information or tests, tell me and I'll post them.

Thanks for any reply and sorry for my bad English.


Program received signal SIGABRT, Aborted.
[Switching to Thread 5234]
0x73905165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt full
#0  0x73905165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x739083e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2  0x73948dea in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#3  0x7394bd13 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#4  0x7394da70 in malloc () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#5  0x74d38550 in spice_malloc (n_bytes=1184900) at mem.c:93
mem = 
__FUNCTION__ = "spice_malloc"
#6  0x74d389be in spice_chunks_linearize (chunks=0x7fffdc1fb6b0)
at mem.c:226
data = 
p = 
i = 
#7  0x74d16b56 in canvas_bitmap_to_surface (
canvas=canvas@entry=0x56719de0, bitmap=bitmap@entry=0x7fffdc1a2c08, 
palette=0x0, want_original=1) at ../spice-common/common/canvas_base.c:635
src = 
image = 
format = 
__FUNCTION__ = "canvas_bitmap_to_surface"
#8  0x74d16ce2 in canvas_get_bits (want_original=, 
bitmap=0x7fffdc1a2c08, canvas=0x56719de0)
at ../spice-common/common/canvas_base.c:964
palette = 
#9  canvas_get_image_internal (canvas=canvas@entry=0x56719de0, 
image=0x7fffdc1a2bf0, want_original=, 
want_original@entry=0, real_get=real_get@entry=1)
at ../spice-common/common/canvas_base.c:1141
descriptor = 0x7fffdc1a2bf0
surface = 
converted = 
wanted_format = 1
surface_format = 
saved_want_original = 
__FUNCTION__ = "canvas_get_image_internal"
#10 0x74d173ba in canvas_get_image (
canvas=canvas@entry=0x56719de0, image=, 
want_original=want_original@entry=0)
at ../spice-common/common/canvas_base.c:1285
No locals.
#11 0x74d1970e in canvas_draw_copy (spice_canvas=0x56719de0, 
bbox=0x7fffdc207a50, clip=, copy=0x7fffe4dfc320)
at ../spice-common/common/canvas_base.c:2258
canvas = 0x56719de0
dest_region = {extents = {x1 = 0, y1 = 708, x2 = 425, y2 = 728}, 
  data = 0x0}
surface_canvas = 
src_image = 
rop = SPICE_ROP_COPY
__FUNCTION__ = "canvas_draw_copy"
#12 0x74cecffc in red_draw_qxl_drawable (
worker=worker@entry=0x7fffe4423010, 
drawable=drawable@entry=0x7fffe45d6a88) at red_worker.c:4394
copy = {src_bitmap = 0x7fffdc1a2bf0, src_area = {left = 0, top = 677, 
right = 425, bottom = 697}, rop_descriptor = 8, 
  scale_mode = 1 '\001', mask = {flags = 245 '\365', pos = {
  x = -173079809, y = -173079809}, bitmap = 0x0}}
img1 = {descriptor = {id = 93825007287960, type = 48 '0', 
flags = 193 '\301', width = 21845, height = 4210421981}, u = {
bitmap = {format = 55 '7', flags = 10 '\n', x = 0, 
  y = 3867565524, stride = 32767, palette = 0x7fffe805fffc, 
  palette_id = 606579, data = 0x7fffdc78}, quic = {
  data_size = 2615, data = 0x7fffe6865dd4}, surface = {
  surface_id = 2615}, lz_rgb = {data_size = 2615, 
  data = 0x7fffe6865dd4}, lz_plt = {flags = 55 '7', 
  data_size = 0, palette = 0x7fffe6865dd4, 
  palette_id = 140737086095356, data = 0x94173}, jpeg = {
  data_size = 2615, data = 0x7fffe6865dd4}, zlib_glz = {
  glz_data_size = 2615, data_size = 0, data = 0x7fffe