Bug#1028251: New Patch (Was: Re: Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64)

2023-01-14 Thread Chuck Zmudzinski
On 1/13/2023 9:08 PM, Chuck Zmudzinski wrote:
> On 1/13/23 6:59 PM, Hans van Kranenburg wrote:
> > Hi,
> > 
> > On 1/13/23 22:45, Chuck Zmudzinski wrote:
> >> On 1/13/23 7:39 AM, Marek Marczykowski-Górecki wrote:
> >>> On Fri, Jan 13, 2023 at 12:58:29AM -0500, Chuck Zmudzinski wrote:
> >>>> On 1/11/2023 10:58 PM, Chuck Zmudzinski wrote:
> >>>>> On 1/9/23 12:55 PM, Hans van Kranenburg wrote:
> >>>>>> Hi!
> > [...]
> > Yolo style cutting out lines here...
> > [...]
> >>>>
>
> >> 
> >> Perhaps this is an opportunity for you to try to fix 922033 again.
> >> I see it has been sitting there for a few years now. Let's see
> >> what Hans thinks.
> > 
> > Yeah, well, so, the thing here is...
> > 
> > When Debian started to package Xen (thanks! Bastian, in 200X), the
> > upstream init scripts were copy pasted, and adjusted to have the ability
> > to have different Hypervisor-ABI-incompatible versions installed at the
> > same time. Also, this is related to the collection of Makefile patches
> > we carry around to have ABI-incompatible stuff end up in a directory
> > like /usr/lib/xen-4.14/ and /usr/lib/xen-4.17/ !
>
> That is a nice feature of the Xen Debian packages, to have the ability
> to manage guests on different versions of the hypervisor.
>
> > 
> > What does this mean? Well, in the most basic sense it means that you
> > could apt-get (dist-)upgrade and then still be able to xl shutdown a
> > domU afterwards before doing reboot, because it will choose the right
> > tools which match with the ABI of the *now* running hypervisor instead
> > of being left with a dumpster fire, which in the end causes you to shout
> > curse words and cause you to have to go to the machine and hold the
> > power button for 5 seconds to force power it off.
> > 
> > This is the thing about where you upgrade from Xen 4.14 to Xen 4.17
> > during the upgrade from Debian 11/Bullseye to Debian 12/Bookworm, it
> > will allow you, if booting the whole new thing is a huge failure, to
> > reset the computer, and in grub, choose to use the previous Xen (and
> > possibly do that in combination with previous Debian linux kernel) and
> > then have a system where you again at least can start your domUs again
> > *) and first have a good rest, night of sleep before starting to dig
> > into what's going wrong.
> > 
> > So, this is exactly the same way of doing stuff like how you can also
> > reboot back into the previous Linux kernel (ABI-compatible) one during a
> > system upgrade, even if you're not using Xen at all!
> > 
> > I like this very much. This is the kind of thing that helps admins of
> > systems that have just local disks and a few domUs. Like, the case where
> > you support some non-profit organization with their server stuff running
> > on donated hardware. (Yes, I also do some of those, I do!) And, in case
> > something does fail (there could always be something like a misbehaving
> > mpt3sas card in the hardware or anything that no one else spotted yet),
> > the admin does not have to end up in total panic mode after doing the
> > upgrade on a Friday afternoon lying upside down inside a broom closet,
> > but they can just at least recover from the situation and have something
> > that's running again, and then a day later, or 2 or 3 days or a week
> > later return on another planned moment to fix it, after asking around.
> > 
> > Upstream Xen stuff doesn't have anything like that.
> > 
> > But, they actually look at us, and they think, ooh, this is actually
> > nice, we should have that also by default.
> > 
> > The fact that we have these changed/altered/divergent init scripts in
> > Debian is the main reason that we cannot just enable systemd things
> > which will put upstream whatever on the system.
>
> I understand the problem here.
>
> > 
> > So, what could we do about this?
> > 
> > The project plan (that could be drafted on an A4 paper) could look like,
> > gather around all distro maintainers of Linux distro's that are shipping
> > Xen, and then search for a 'Project owner', which we totally need to be
> > someone that is actually employed at a company that actually cares about
> > getting the results of this.

"Totally need to be someone that is actually employed at a company." I am 
curious about that statement. Has Debian given up on the idea that members of
the FLOSS community can band together and solve a problem like this without
corporate backing? I don't think other d

Bug#1028251: New Patch (Was: Re: Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64)

2023-01-13 Thread Chuck Zmudzinski
On 1/13/23 6:59 PM, Hans van Kranenburg wrote:
> Hi,
> 
> On 1/13/23 22:45, Chuck Zmudzinski wrote:
>> On 1/13/23 7:39 AM, Marek Marczykowski-Górecki wrote:
>>> On Fri, Jan 13, 2023 at 12:58:29AM -0500, Chuck Zmudzinski wrote:
>>>> On 1/11/2023 10:58 PM, Chuck Zmudzinski wrote:
>>>>> On 1/9/23 12:55 PM, Hans van Kranenburg wrote:
>>>>>> Hi!
> [...]
> Yolo style cutting out lines here...
> [...]
>>>>
>>>> Regarding the systemd files causing ftbfs, this explains it:
>>>>
>>>> https://salsa.debian.org/xen-team/debian-xen/-/blob/master/m4/systemd.m4#L119
>>>>
>>>> and this:
>>>>
>>>> https://salsa.debian.org/xen-team/debian-xen/-/blob/master/tools/configure.ac#L480
>>>>
>>>> The comments indicate that using AX_AVAILABLE_SYSTEMD() will
>>>> by default enable systemd if systemd development files are on the
>>>> build system, and AX_ALLOW_SYSTEMD() means --enable-systemd
>>>> must explicitly be passed to tools/configure to enable it. Upstream
>>>> uses the former, so build systems with systemd development files
>>>> by default will ftbfs because that produces missing files that dh_missing
>>>> in debian/rules does not like.
>>>>
>>>> So the reason there is ftbfs on my system is that my system has
>>>> the systemd development package installed.
>>>
>>> By the way, maybe a better fix would be to pass --enable-systemd, add
>>> libsystemd-dev build-dep and list them in the package? They might require
>>> patching to support Debian-specific upgrade machinery, though...
>>>
>>> Not installing xendriverdomain.service is one of things missing for
>>> driver domains support
>>> (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=922033).
>>>
>> 
>> Hi Marek,
>> 
>> I wouldn't be against fixing it that way. In fact, I would prefer
>> that Debian packaged Xen with full support for native systemd units.
>> I am willing to wait until if/when the package maintainers have
>> full systemd support in the Xen packages.
>> 
>> Perhaps this is an opportunity for you to try to fix 922033 again.
>> I see it has been sitting there for a few years now. Let's see
>> what Hans thinks.
> 
> Yeah, well, so, the thing here is...
> 
> When Debian started to package Xen (thanks! Bastian, in 200X), the
> upstream init scripts were copy pasted, and adjusted to have the ability
> to have different Hypervisor-ABI-incompatible versions installed at the
> same time. Also, this is related to the collection of Makefile patches
> we carry around to have ABI-incompatible stuff end up in a directory
> like /usr/lib/xen-4.14/ and /usr/lib/xen-4.17/ !

That is a nice feature of the Xen Debian packages, to have the ability
to manage guests on different versions of the hypervisor.
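
As an illustration of the versioned layout quoted above (/usr/lib/xen-4.14/,
/usr/lib/xen-4.17/), here is a minimal sketch of how a wrapper might map a
hypervisor version string to its tools directory. The function name
xen_tools_dir is hypothetical, and this is not the actual Debian wrapper code,
only a sketch of the idea under those assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not the real Debian wrapper): map a Xen version
# string such as "4.17.0" or "4.14.3-1" to the versioned tools directory
# /usr/lib/xen-X.Y described in the thread.
xen_tools_dir() {
    # Keep only the leading "major.minor" part of the version string.
    major_minor=$(printf '%s\n' "$1" |
        sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*$/\1/p')
    # Fail if the argument does not start with "major.minor".
    [ -n "$major_minor" ] || return 1
    printf '/usr/lib/xen-%s\n' "$major_minor"
}
```

On a running dom0 the version of the active hypervisor could presumably be
read from /sys/hypervisor/version/major and /sys/hypervisor/version/minor and
passed to such a helper, so the tools matching the running hypervisor are
used even after a newer version has been installed.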

> 
> What does this mean? Well, in the most basic sense it means that you
> could apt-get (dist-)upgrade and then still be able to xl shutdown a
> domU afterwards before doing reboot, because it will choose the right
> tools which match with the ABI of the *now* running hypervisor instead
> of being left with a dumpster fire, which in the end causes you to shout
> curse words and cause you to have to go to the machine and hold the
> power button for 5 seconds to force power it off.
> 
> This is the thing about where you upgrade from Xen 4.14 to Xen 4.17
> during the upgrade from Debian 11/Bullseye to Debian 12/Bookworm, it
> will allow you, if booting the whole new thing is a huge failure, to
> reset the computer, and in grub, choose to use the previous Xen (and
> possibly do that in combination with previous Debian linux kernel) and
> then have a system where you again at least can start your domUs again
> *) and first have a good rest, night of sleep before starting to dig
> into what's going wrong.
> 
> So, this is exactly the same way of doing stuff like how you can also
> reboot back into the previous Linux kernel (ABI-compatible) one during a
> system upgrade, even if you're not using Xen at all!
> 
> I like this very much. This is the kind of thing that helps admins of
> systems that have just local disks and a few domUs. Like, the case where
> you support some non-profit organization with their server stuff running
> on donated hardware. (Yes, I also do some of those, I do!) And, in case
> something does fail (there could always be something like a misbehaving
> mpt3sas card in the hardware or anything that no one else spotted yet),
> the admin does not have to end up in total panic mode after doi

Bug#1028251: [Pkg-xen-devel] Bug#1028251: New Patch (Was: Re: Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64)

2023-01-13 Thread Chuck Zmudzinski
On 1/13/23 7:39 AM, Marek Marczykowski-Górecki wrote:
> On Fri, Jan 13, 2023 at 12:58:29AM -0500, Chuck Zmudzinski wrote:
>> On 1/11/2023 10:58 PM, Chuck Zmudzinski wrote:
>> > On 1/9/23 12:55 PM, Hans van Kranenburg wrote:
>> > > Hi!
>> > > 
>> > > On 09/01/2023 18:44, Chuck Zmudzinski wrote:
>> > >> Control: tag -1 + moreinfo
>> > >> 
>> > >> thanks
>> > >> 
>> > >> On 1/9/23 8:09 AM, Hans van Kranenburg wrote:
>> > >>> Hi Chuck,
>> > >>>
>> > >>> On 1/8/23 23:18, Chuck Zmudzinski wrote:
>> > >>>> [...]
>> > >>>>
>> > >>>> The build failed:
>> > >>>>
>> > >>>>    debian/rules override_dh_missing
>> > >>>> make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0'
>> > >>>> dh_missing --list-missing
>> > >>>> dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in 
>> > >>>> debian/tmp but is not installed to anywhere
>> > >>>> dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in 
>> > >>>> debian/tmp but is not installed to anywhere
>> > >>>> dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service 
>> > >>>> exists in debian/tmp but is not installed to anywhere
>> > >>>> dh_missing: warning: 
>> > >>>> usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in 
>> > >>>> debian/tmp but is not installed to anywhere
>> > >>>> dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service 
>> > >>>> exists in debian/tmp but is not installed to anywhere
>> > >>>> dh_missing: warning: usr/lib/systemd/system/xenconsoled.service 
>> > >>>> exists in debian/tmp but is not installed to anywhere
>> > >>>> dh_missing: warning: usr/lib/systemd/system/xendomains.service exists 
>> > >>>> in debian/tmp but is not installed to anywhere
>> > >>>> dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service 
>> > >>>> exists in debian/tmp but is not installed to anywhere
>> > >>>> dh_missing: warning: usr/lib/systemd/system/xenstored.service exists 
>> > >>>> in debian/tmp but is not installed to anywhere
>> > >>>
>> > >>> I cannot reproduce this error here locally and the CI build also 
>> > >>> succeeds:
>> > >>>
>> > >>> https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577
>> > >> 
>> > >> I thought I had a fairly clean sid install, but I think the problem
>> > >> on my system could be caused by some obscure grandfathered in
>> > >> setting because the sid I am using was updated from all the way back to
>> > >> an original install of jessie many years ago...
>> > >> 
>> > >> It might be time for me to refresh my sid with a clean installation.
>> > >> 
>> > >> Out of curiosity and if you have time, can you answer a couple of
>> > >> questions if you know the answers?
>> > >> 
>> > >> 1. Do the builds on a clean environment produce the missing files
>> > >> listed in my build?
>> > > 
>> > > No, after my local package build, there's no such things in there:
>> > > 
>> > > ~/build/xen/debian-xen/debian/tmp/usr/lib m (master) 1-$ ll
>> > > total 0
>> > > drwxr-xr-x 1 knorrie knorrie  110 Jan  8 23:51 debug
>> > > drwxr-xr-x 1 knorrie knorrie 2048 Jan  8 23:50 x86_64-linux-gnu
>> > > drwxr-xr-x 1 knorrie knorrie   20 Jan  8 23:51 xen-4.17
>> > > 
>> > >> 
>> > >> 2. Are those systemd service files installed anywhere in the xen
>> > >> binary packages, either in arch=x86_64 packages or for the arch=all
>> > >> packages such as xen-utils-common?
>> > > 
>> > > No, they are not:
>> > > 
>> > > https://packages.debian.org/search?searchon=contents&keywords=xenconsoled.service&mode=path&suite=unstable&arch=any
>> > > 
>> > >> If you don't know the answer to these questions I will investigate
>> > >> myself to find the answers, so you can work on more important things.
>> > >> 
>> > >>>
>> > >

Bug#1028251: New Patch (Was: Re: [Pkg-xen-devel] Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64)

2023-01-12 Thread Chuck Zmudzinski
On 1/11/2023 10:58 PM, Chuck Zmudzinski wrote:
> On 1/9/23 12:55 PM, Hans van Kranenburg wrote:
> > Hi!
> > 
> > On 09/01/2023 18:44, Chuck Zmudzinski wrote:
> >> Control: tag -1 + moreinfo
> >> 
> >> thanks
> >> 
> >> On 1/9/23 8:09 AM, Hans van Kranenburg wrote:
> >>> Hi Chuck,
> >>>
> >>> On 1/8/23 23:18, Chuck Zmudzinski wrote:
> >>>> [...]
> >>>>
> >>>> The build failed:
> >>>>
> >>>>    debian/rules override_dh_missing
> >>>> make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0'
> >>>> dh_missing --list-missing
> >>>> dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in 
> >>>> debian/tmp but is not installed to anywhere
> >>>> dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in 
> >>>> debian/tmp but is not installed to anywhere
> >>>> dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists 
> >>>> in debian/tmp but is not installed to anywhere
> >>>> dh_missing: warning: 
> >>>> usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in 
> >>>> debian/tmp but is not installed to anywhere
> >>>> dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists 
> >>>> in debian/tmp but is not installed to anywhere
> >>>> dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists 
> >>>> in debian/tmp but is not installed to anywhere
> >>>> dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in 
> >>>> debian/tmp but is not installed to anywhere
> >>>> dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service 
> >>>> exists in debian/tmp but is not installed to anywhere
> >>>> dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in 
> >>>> debian/tmp but is not installed to anywhere
> >>>
> >>> I cannot reproduce this error here locally and the CI build also succeeds:
> >>>
> >>> https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577
> >> 
> >> I thought I had a fairly clean sid install, but I think the problem
> >> on my system could be caused by some obscure grandfathered in
> >> setting because the sid I am using was updated from all the way back to
> >> an original install of jessie many years ago...
> >> 
> >> It might be time for me to refresh my sid with a clean installation.
> >> 
> >> Out of curiosity and if you have time, can you answer a couple of
> >> questions if you know the answers?
> >> 
> >> 1. Do the builds on a clean environment produce the missing files
> >> listed in my build?
> > 
> > No, after my local package build, there's no such things in there:
> > 
> > ~/build/xen/debian-xen/debian/tmp/usr/lib m (master) 1-$ ll
> > total 0
> > drwxr-xr-x 1 knorrie knorrie  110 Jan  8 23:51 debug
> > drwxr-xr-x 1 knorrie knorrie 2048 Jan  8 23:50 x86_64-linux-gnu
> > drwxr-xr-x 1 knorrie knorrie   20 Jan  8 23:51 xen-4.17
> > 
> >> 
> >> 2. Are those systemd service files installed anywhere in the xen
> >> binary packages, either in arch=x86_64 packages or for the arch=all
> >> packages such as xen-utils-common?
> > 
> > No, they are not:
> > 
> > https://packages.debian.org/search?searchon=contents&keywords=xenconsoled.service&mode=path&suite=unstable&arch=any
> > 
> >> If you don't know the answer to these questions I will investigate
> >> myself to find the answers, so you can work on more important things.
> >> 
> >>>
> >>> How are you building the packages? In a clean build environment, using
> >>> for example sbuild or pbuilder, or in an environment where unrelated
> >>> other build dependencies could be present, that are not included in the
> >>> xen list, but maybe 'wake up and do something' if they're present?
> >> 
> >> As I said, I am building on a sid install that might have some
> >> stuff grandfathered in from old releases going back to jessie.
> >> I also might have some stale stuff around from my private builds
> >> of the traditional device model available from xen that is not
> >> part of the Debian packages. I will investigate these possible causes.
> >> 
> >> I use debuild as a frontend to dpkg-buildpackage to build the packages.

Bug#1028557: general: The Debian Social Contract (DSC) is meaningless

2023-01-12 Thread Chuck Zmudzinski
Package: general
Severity: normal

Dear Maintainer,

It is a bug that Debian considers the DSC so important, yet,
the concept of a contract is totally meaningless outside of
the context of a legal system where the obligations and rights
that arise from the terms of the contract can be enforced.

Proposed Fix: Replace the DSC with a disclaimer that admits the
document formerly known as the Debian Social Contract is totally
meaningless because it cannot be legally enforced and refers
the reader to the actual software licenses of the software
in the distribution that *can* be enforced in a legal system.

Thanks



Bug#1028251: [Pkg-xen-devel] Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64

2023-01-11 Thread Chuck Zmudzinski
On 1/9/23 12:55 PM, Hans van Kranenburg wrote:
> Hi!
> 
> On 09/01/2023 18:44, Chuck Zmudzinski wrote:
>> Control: tag -1 + moreinfo
>> 
>> thanks
>> 
>> On 1/9/23 8:09 AM, Hans van Kranenburg wrote:
>>> Hi Chuck,
>>>
>>> On 1/8/23 23:18, Chuck Zmudzinski wrote:
>>>> [...]
>>>>
>>>> The build failed:
>>>>
>>>>    debian/rules override_dh_missing
>>>> make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0'
>>>> dh_missing --list-missing
>>>> dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in debian/tmp 
>>>> but is not installed to anywhere
>>>> dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in 
>>>> debian/tmp but is not installed to anywhere
>>>> dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists 
>>>> in debian/tmp but is not installed to anywhere
>>>> dh_missing: warning: 
>>>> usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in 
>>>> debian/tmp but is not installed to anywhere
>>>> dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists in 
>>>> debian/tmp but is not installed to anywhere
>>>> dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists in 
>>>> debian/tmp but is not installed to anywhere
>>>> dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in 
>>>> debian/tmp but is not installed to anywhere
>>>> dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service exists 
>>>> in debian/tmp but is not installed to anywhere
>>>> dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in 
>>>> debian/tmp but is not installed to anywhere
>>>
>>> I cannot reproduce this error here locally and the CI build also succeeds:
>>>
>>> https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577
>> 
>> I thought I had a fairly clean sid install, but I think the problem
>> on my system could be caused by some obscure grandfathered in
>> setting because the sid I am using was updated from all the way back to
>> an original install of jessie many years ago...
>> 
>> It might be time for me to refresh my sid with a clean installation.
>> 
>> Out of curiosity and if you have time, can you answer a couple of
>> questions if you know the answers?
>> 
>> 1. Do the builds on a clean environment produce the missing files
>> listed in my build?
> 
> No, after my local package build, there's no such things in there:
> 
> ~/build/xen/debian-xen/debian/tmp/usr/lib m (master) 1-$ ll
> total 0
> drwxr-xr-x 1 knorrie knorrie  110 Jan  8 23:51 debug
> drwxr-xr-x 1 knorrie knorrie 2048 Jan  8 23:50 x86_64-linux-gnu
> drwxr-xr-x 1 knorrie knorrie   20 Jan  8 23:51 xen-4.17
> 
>> 
>> 2. Are those systemd service files installed anywhere in the xen
>> binary packages, either in arch=x86_64 packages or for the arch=all
>> packages such as xen-utils-common?
> 
> No, they are not:
> 
> https://packages.debian.org/search?searchon=contents&keywords=xenconsoled.service&mode=path&suite=unstable&arch=any
> 
>> If you don't know the answer to these questions I will investigate
>> myself to find the answers, so you can work on more important things.
>> 
>>>
>>> How are you building the packages? In a clean build environment, using
>>> for example sbuild or pbuilder, or in an environment where unrelated
>>> other build dependencies could be present, that are not included in the
>>> xen list, but maybe 'wake up and do something' if they're present?
>> 
>> As I said, I am building on a sid install that might have some
>> stuff grandfathered in from old releases going back to jessie.
>> I also might have some stale stuff around from my private builds
>> of the traditional device model available from xen that is not
>> part of the Debian packages. I will investigate these possible causes.
>> 
>> I use debuild as a frontend to dpkg-buildpackage to build the packages.
> 
> Yes. So (I'm not entirely sure how it works, but as example, just making
> something up here): After doing something else first, you might end up
> with a system that has for example dh-systemd-yolo-all-the-things-helper
> installed. And, it might be that only it being present means that the
> package build process changes. It might even be a 'feature' of that
> helper... "just add it to your build depends, and it will automatically
> do all the things for you

Bug#1028251: [Pkg-xen-devel] Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64

2023-01-09 Thread Chuck Zmudzinski
On 1/9/23 12:55 PM, Hans van Kranenburg wrote:
> Hi!
> 
> On 09/01/2023 18:44, Chuck Zmudzinski wrote:
> ...
> This is why it is very much recommended to build the packages using
> something like sbuild, so that you can be sure that every time it will
> start with a super minimal chroot which only has some essential things,
> and that the only build dependencies used will be the ones that are
> explicitly defined in the debian/control of the package.

Thanks for the advice - it is now on my TODO list to learn to use sbuild
or some other tool that makes it easy to do builds in a minimal chroot.

Kind regards,

Chuck



Bug#1028251: [Pkg-xen-devel] Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64

2023-01-09 Thread Chuck Zmudzinski
Control: tag -1 + moreinfo

thanks

On 1/9/23 8:09 AM, Hans van Kranenburg wrote:
> Hi Chuck,
> 
> On 1/8/23 23:18, Chuck Zmudzinski wrote:
>> [...]
>> 
>> The build failed:
>> 
>>    debian/rules override_dh_missing
>> make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0'
>> dh_missing --list-missing
>> dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in debian/tmp 
>> but is not installed to anywhere
>> dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in 
>> debian/tmp but is not installed to anywhere
>> dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists in 
>> debian/tmp but is not installed to anywhere
>> dh_missing: warning: 
>> usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in 
>> debian/tmp but is not installed to anywhere
>> dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists in 
>> debian/tmp but is not installed to anywhere
>> dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists in 
>> debian/tmp but is not installed to anywhere
>> dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in 
>> debian/tmp but is not installed to anywhere
>> dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service exists 
>> in debian/tmp but is not installed to anywhere
>> dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in 
>> debian/tmp but is not installed to anywhere
> 
> I cannot reproduce this error here locally and the CI build also succeeds:
> 
> https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577

I thought I had a fairly clean sid install, but I think the problem
on my system could be caused by some obscure grandfathered in
setting because the sid I am using was updated from all the way back to
an original install of jessie many years ago...

It might be time for me to refresh my sid with a clean installation.

Out of curiosity and if you have time, can you answer a couple of
questions if you know the answers?

1. Do the builds on a clean environment produce the missing files
listed in my build?

2. Are those systemd service files installed anywhere in the xen
binary packages, either in arch=x86_64 packages or for the arch=all
packages such as xen-utils-common?

If you don't know the answer to these questions I will investigate
myself to find the answers, so you can work on more important things.

> 
> How are you building the packages? In a clean build environment, using
> for example sbuild or pbuilder, or in an environment where unrelated
> other build dependencies could be present, that are not included in the
> xen list, but maybe 'wake up and do something' if they're present?

As I said, I am building on a sid install that might have some
stuff grandfathered in from old releases going back to jessie.
I also might have some stale stuff around from my private builds
of the traditional device model available from xen that is not
part of the Debian packages. I will investigate these possible causes.

I use debuild as a frontend to dpkg-buildpackage to build the packages.

> 
> You can also compare your own build output with the full one from the CI
> job:
> 
> https://salsa.debian.org/xen-team/debian-xen/-/jobs/3767564/raw

I will take a look at that when I get a chance.

This is not a real high priority for me, so I am content to let this
be until I get a chance to investigate the quirks of my current
installation of sid, and I also added the moreinfo tag, so you can
ignore this bug if you wish until I do some further research. 

Cheers,

Chuck



Bug#1028251: Updated Patch (Was: xen: FTBFS when building xen binary packages for sid on x86_64)

2023-01-08 Thread Chuck Zmudzinski
Sorry, the patch I posted in the original message will not apply properly.
I forgot I also edited the comment:

Here is the correct patch:

--- rules    2022-12-21 16:34:51.0 -0500
+++ rules.new    2023-01-08 05:31:24.0 -0500
@@ -327,9 +327,9 @@
     | xargs -0r gzip -9vn
 
 # By default, files in debian/tmp which are not handled by anything
-# in rules are ignored.  This makes them into errors.
+# in rules are ignored.  This lists them.
 override_dh_missing:
-    dh_missing --fail-missing
+    dh_missing --list-missing
 
 
 # We are dropping the config file /etc/default/xen which appeared in
--snip--

Thanks for all your work. Apart from this little problem, it
appears Xen 4.17 will work well on Bookworm.

Kind regards,

Chuck



Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64

2023-01-08 Thread Chuck Zmudzinski
Source: xen
Version: 4.17.0-1
Severity: normal
Tags: ftbfs patch

Dear Maintainer,

Hi,

I needed to test a patch to libxl so I started by trying to build
xen from source on an up-to-date sid installation.

The build failed:

   debian/rules override_dh_missing
make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0'
dh_missing --list-missing
dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in debian/tmp but is not installed to anywhere

Please note that this output is after replacing the
line in debian/rules that is currently

dh_missing --fail-missing

with

dh_missing --list-missing

so the missing files only induce a warning instead of FTBFS.
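
To make such warnings easier to act on, here is a minimal sketch that pulls
just the unpackaged paths out of a build log. The function name
list_unpackaged is hypothetical, and it assumes each dh_missing warning sits
on a single line, as it does in an unwrapped build log.

```shell
#!/bin/sh
# Hypothetical helper: read a build log on stdin and print the paths that
# dh_missing reported as present in debian/tmp but not installed anywhere,
# one path per line. All other log lines are ignored.
list_unpackaged() {
    sed -n 's|^dh_missing: warning: \(.*\) exists in debian/tmp but is not installed to anywhere$|\1|p'
}
```

It could be used as `list_unpackaged < build.log`, where build.log is a
placeholder for wherever the build output was saved.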

So the workaround is this patch to debian/rules:

--- a/debian/rules  2023-01-08 16:36:01.605863417 -0500
+++ b/debian/rules  2023-01-08 05:31:24.0 -0500
@@ -329,7 +329,7 @@
 # By default, files in debian/tmp which are not handled by anything
 # in rules are ignored.  This lists them.
 override_dh_missing:
-   dh_missing --fail-missing
+   dh_missing --list-missing


 # We are dropping the config file /etc/default/xen which appeared in
---snip-

I presume you know about this and plan to fix it before the
next upload, but perhaps a recent systemd update is causing
this so I am reporting it here.

I also request that if the missing systemd files cannot be
installed properly before the next upload of a new version
you apply a workaround such as this patch or another workaround
until the missing systemd files are installed and configured
correctly.

Kind regards,

Chuck

-- System Information:
Debian Release: bookworm/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.0.0-6-amd64 (SMP w/4 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM:

AppArmor: enabled



Bug#988333: [Pkg-xen-devel] xen/4.14.3-1: VGA Intel IGD Passthrough to Debian Xen HVM DomUs not working: xl -vvv create log

2021-10-27 Thread Chuck Zmudzinski

On 10/26/2021 10:06 AM, Chuck Zmudzinski wrote:

On 10/25/2021 4:45 PM, Chuck Zmudzinski wrote:

On 10/23/2021 11:11 AM, Hans van Kranenburg wrote:

Hi!


On 5/10/2021 1:33 PM, Chuck Zmudzinski wrote:
[...] with buster and bullseye running as the Dom0, I can only get 
the VGA/Passthrough feature to work with Windows Xen HVMs. I would 
expect both Windows and Linux HVMs to work comparably well.


A possible time-saver that I can recommend is to send a post to the
upstream xen-users list [0] about this already. Like "Hi all, I'm
starting a HVM Linux domU with Linux 5.10.70 on a Xen 4.14.3 system with
also 5.10.70 dom0 kernel, with this and this domU config file. It fails
to start, this is the xl -vvv create output, and this error (the irq
stuff) appears in the dom0 kernel log.". Try to keep it simple and not
too long initially, without the surrounding stories, to increase chance
of it being fully read.


I can do this soon - I have some more interesting tests to share
here and with the Xen developers upstream.


I will need to think a little about how to present this bug to
the Xen upstream developers in a short and simple enough way
for them to be likely to read it initially. For now, I will report here
some results from the journal log entries of both Bullseye dom0
and Bullseye domU for two different configurations. These logs
are not generated with the -vvv option, but they do provide
quite a bit of interesting information and are already
somewhat overwhelming, even without the -vvv option. So
I will hold off for now before making the logs even more verbose
with -vvv.


Now I add output of xl create with -vvv option:

chuckz@debian:~$ sudo xl -vvv create bullseye-hvm.cfg
Parsing config from bullseye-hvm.cfg
libxl: debug: libxl_create.c:2017:do_domain_create: ao 0x55c97f27e180: create: how=(nil) callback=(nil) poller=0x55c97f27e220
libxl: detail: libxl_create.c:622:libxl__domain_make: passthrough: sync_pt
libxl: debug: libxl_device.c:379:libxl__device_disk_set_backend: Disk vdev=xvda spec.backend=unknown
libxl: debug: libxl_device.c:413:libxl__device_disk_set_backend: Disk vdev=xvda, using backend phy
libxl: debug: libxl_device.c:379:libxl__device_disk_set_backend: Disk vdev=xvdb spec.backend=unknown
libxl: debug: libxl_device.c:413:libxl__device_disk_set_backend: Disk vdev=xvdb, using backend phy
libxl: debug: libxl_create.c:1279:initiate_domain_create: Domain 2:running bootloader
libxl: debug: libxl_bootloader.c:328:libxl__bootloader_run: Domain 2:not a PV/PVH domain, skipping bootloader
libxl: debug: libxl_event.c:864:libxl__ev_xswatch_deregister: watch w=0x55c97f284148: deregister unregistered
libxl: detail: libxl_x86.c:338:hvm_set_viridian_features: base group enabled
libxl: detail: libxl_x86.c:338:hvm_set_viridian_features: freq group enabled
libxl: detail: libxl_x86.c:338:hvm_set_viridian_features: time_ref_count group enabled
libxl: detail: libxl_x86.c:338:hvm_set_viridian_features: apic_assist group enabled
libxl: detail: libxl_x86.c:338:hvm_set_viridian_features: crash_ctl group enabled
domainbuilder: detail: xc_dom_allocate: cmdline="", features=""
domainbuilder: detail: xc_dom_kernel_file: filename="/usr/lib/xen-4.14/boot/hvmloader"
domainbuilder: detail: xc_dom_malloc_filemap    : 329 kB
libxl: debug: libxl_dom.c:829:libxl__load_hvm_firmware_module: Loading BIOS: /usr/share/seabios/bios-256k.bin
domainbuilder: detail: xc_dom_boot_xen_init: ver 4.14, caps xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
domainbuilder: detail: xc_dom_parse_image: called
domainbuilder: detail: xc_dom_find_loader: trying multiboot-binary loader ...
domainbuilder: detail: loader probe failed
domainbuilder: detail: xc_dom_find_loader: trying HVM-generic loader ...
domainbuilder: detail: loader probe OK
xc: detail: ELF: phdr: paddr=0x10 memsz=0x5bfc4
xc: detail: ELF: memory: 0x10 -> 0x15bfc4
domainbuilder: detail: xc_dom_mem_init: mem 3072 MB, pages 0xc pages, 4k each
domainbuilder: detail: xc_dom_mem_init: 0xc pages
domainbuilder: detail: xc_dom_boot_mem_init: called
domainbuilder: detail: range: start=0x0 end=0xc000
xc: detail: PHYSICAL MEMORY ALLOCATION:
xc: detail:   4KB PAGES: 0x0200
xc: detail:   2MB PAGES: 0x01ff
xc: detail:   1GB PAGES: 0x0002
domainbuilder: detail: xc_dom_build_image: called
domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x100+0x5c at 0x7f20de301000
domainbuilder: detail: xc_dom_alloc_segment:   kernel   : 0x10 -> 0x15c000  (pfn 0x100 + 0x5c pages)
xc: detail: ELF: phdr 0 at 0x7f20de2a5000 -> 0x7f20de2f7420
domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x15c+0x40 at 0x7f20de2c1000
domainbuilder: detail: xc_dom_alloc_segment:   System Firmware module : 0x15c000 -> 0x19c000  (pfn 0x15c + 0x40 pages)
domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x19c+0x1 at 0x7f20

Bug#988333: [Pkg-xen-devel] linux-image-5.10.0-6-amd64: VGA Intel IGD Passthrough to Debian Xen HVM DomUs not working, but Windows Xen HVMs do work

2021-10-26 Thread Chuck Zmudzinski

On 10/26/2021 10:06 AM, Chuck Zmudzinski wrote:

On 10/25/2021 4:45 PM, Chuck Zmudzinski wrote:

On 10/23/2021 11:11 AM, Hans van Kranenburg wrote:

Hi!


On 5/10/2021 1:33 PM, Chuck Zmudzinski wrote:
[...] with buster and bullseye running as the Dom0, I can only get 
the VGA/Passthrough feature to work with Windows Xen HVMs. I would 
expect both Windows and Linux HVMs to work comparably well.


A possible time-saver that I can recommend is to send a post to the
upstream xen-users list [0] about this already. Like "Hi all, I'm
starting a HVM Linux domU with Linux 5.10.70 on a Xen 4.14.3 system with
also 5.10.70 dom0 kernel, with this and this domU config file. It fails
to start, this is the xl -vvv create output, and this error (the irq
stuff) appears in the dom0 kernel log.". Try to keep it simple and not
too long initially, without the surrounding stories, to increase chance
of it being fully read.


I can do this soon - I have some more interesting tests to share
here and with the Xen developers upstream.


I will need to think a little about how to present this bug to
the Xen upstream developers in a short and simple enough way
for them to be likely to read it initially. For now, I will report here
some results from the journal log entries of both Bullseye dom0
and Bullseye domU for two different configurations. These logs
are not generated with the -vvv option, but they do provide
quite a bit of interesting information and are already
somewhat overwhelming, even without the -vvv option. So
I will hold off for now before making the logs even more verbose
with -vvv.

The intention of this message is to provide detailed logs for a
detailed analysis of the problem, not to describe the problem
in simple terms.

A few days ago I ran two tests, and I have four different log
files attached from those tests. In both tests, the Bullseye
HVM was configured for PCI/IGD passthrough using the
domain config file and preparation for passthrough in dom0
described in the earlier message #31:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=988333#31

The two tests were:

1. Bullseye dom0, Debian 11.1 / Bullseye HVM domU, Debian 11.1

This first test essentially confirmed that the updated versions
of the packages for both Bullseye dom0 and Bullseye domU
since the original report five months ago do not fix the
problem. In this test case, I am using all the official packages
of Debian 11.1 (Bullseye).

It is important to note that the version of the device
model used in this test is the official upstream version
of qemu for Bullseye. On Debian, Xen uses by default the
qemu-system-i386 binary from the qemu-system-x86
package, and Bullseye currently uses qemu version
5.2+dfsg-11+deb11u1 as the default device model.

I attached two log files from this test:
qemu-upstream-hvm.txt and qemu-upstream-dom0.txt.
They are the logged journal entries for the Bullseye HVM
and Bullseye dom0 domains, respectively. They are fairly
complete logs, showing the kernel version running in both
the dom0 and the HVM, the kernel command line for both
the dom0 and the domU, the command that was used to
create the HVM domain, etc.

One might recall that in the original report I said it was
difficult to capture logs from the domU. This time I was able
to capture the log by waiting a few minutes before shutting
the domU down. I also discovered, in contrast to what I said in
the earlier report, that it is possible to gracefully shut down
the domU using xl shutdown, provided one waits long enough
before trying to shut it down. The shutdown then takes a few
minutes instead of the normal few seconds because of the
problems caused by this configuration. By waiting for the
graceful shutdown instead of using xl destroy, I was able to
view the log of the attempted boot in the domU on a subsequent
normal boot (without PCI passthrough) using journalctl, and
capture some useful Call Traces.

For this first test, although the shutdown succeeds, the domain
never gets to the point where one can log in, either at the
terminal or remotely via ssh. The boot messages were displayed
on the passed-through video device, but only very slowly: it
took almost two minutes before the boot messages started to
appear, and it took a couple of minutes after issuing the
xl shutdown command in dom0 before the passed-through video
device indicated that the HVM domain had shut down and
powered off.

The second test:

2. Same as first test, except use the qemu traditional device
model instead of the qemu upstream model which on Debian
comes from the qemu-system-x86 package.

I also attached two log files from this test:
qemu-traditional-hvm.txt and qemu-traditional-dom0.txt,
and these also are fairly complete logs showing the kernel
version in use, etc.

Since Debian does not provide the traditional device model,
I had to build it from xenbits.xen.org:

https://xenbits.xen.org/gitweb/?p=qemu-xen-traditional.git;a=shortlog;h=refs/heads/stable-

Bug#988333: [Pkg-xen-devel] linux-image-5.10.0-6-amd64: VGA Intel IGD Passthrough to Debian Xen HVM DomUs not working, but Windows Xen HVMs do work

2021-10-26 Thread Chuck Zmudzinski

On 10/23/2021 11:11 AM, Hans van Kranenburg wrote:


Can you share the domU config file?


Yes, here it is:

builder = 'hvm'
memory = '3072'
vcpus = '4'
device_model_version = 'qemu-xen'
# device_model_version = 'qemu-xen-traditional'
# This is now bullseye
disk = ['/dev/systems/linux,,xvda,w','/dev/data/linuxdata,,xvdb,w']
name = 'bullseye-hvm'
vif = [ 
'mac=00:16:3E:27:2C:AA,model=e1000,script=vif-route.hvm,ip=192.168.1.4' ]

on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
boot = 'c'
acpi = '1'
apic = '1'
viridian = '1'
xen_platform_pci = '1'
serial = 'pty'
vga = 'none'
sdl = '0'
vnc = '0'
gfx_passthru = '1'
pci = [ '00:1b.0', '00:14.0,rdm_policy=relaxed', '00:02.0' ]



And, other configs you need to have in place to exclude the devices from
being seen as normal devices directly in dom0? (I haven't used
passthrough myself yet, but I read that this is needed.)


I run this script in Dom0 before starting the domain:

#!/bin/bash
modprobe xen-pciback
xl pci-assignable-add 00:02.0
xl pci-assignable-add 00:14.0
xl pci-assignable-add 00:1b.0
xl pci-assignable-list

The script makes the Intel IGD, USB 3.0 controller, and
sound device available to an unprivileged domain. The pci = ...
statement in the domain config corresponds to these same
three PCI devices.
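
One way to sanity-check this step is to compare the entries of the
pci = ... statement against the text printed by xl pci-assignable-list.
The Python sketch below is only an illustration and is not part of the
setup described in this thread. The helper names are made up, and it
assumes that xl pci-assignable-list prints one device per line in
domain:bus:slot.function form (for example 0000:00:02.0). It handles
only plain addresses with optional comma-separated options, not every
form that the xl config syntax accepts.

```python
import re

# Matches "00:02.0" or "0000:00:02.0"; the 4-digit domain part is optional.
_BDF = re.compile(
    r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def normalize_bdf(spec):
    """Return 'dddd:bb:ss.f' for an entry such as '00:14.0,rdm_policy=relaxed'."""
    bdf = spec.split(",", 1)[0].strip()  # drop options like rdm_policy=...
    match = _BDF.match(bdf)
    if match is None:
        raise ValueError(f"not a PCI address: {spec!r}")
    domain, bus, slot, func = match.groups()
    return f"{(domain or '0000').lower()}:{bus.lower()}:{slot.lower()}.{func}"


def missing_assignable(pci_entries, assignable_output):
    """List the pci = entries that do not appear in the assignable-list text."""
    assignable = {
        normalize_bdf(line)
        for line in assignable_output.splitlines()
        if line.strip()
    }
    return [e for e in pci_entries if normalize_bdf(e) not in assignable]


if __name__ == "__main__":
    pci = ['00:1b.0', '00:14.0,rdm_policy=relaxed', '00:02.0']
    # Hypothetical listing in which 00:1b.0 has not been made assignable yet.
    listed = "0000:00:02.0\n0000:00:14.0\n"
    print(missing_assignable(pci, listed))
```

An empty result means every device named in the domain config is in the
assignable list. Any entry that is returned still needs an
xl pci-assignable-add before the domain is created.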


I forgot to add that you need to run lspci in dom0 to get the
PCI bus, slot and function numbers of the PCI devices you want to
pass through to the unprivileged domain. On my system,
this is what I got:

$ lspci
00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-V (rev 05)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5)
00:1c.3 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d5)
00:1c.4 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #5 (rev d5)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation B85 Express LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05)
00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 05)
02:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 04)
04:00.0 Network controller: Qualcomm Atheros AR9287 Wireless Network Adapter (PCI-Express) (rev 01)


The PCI slot and function numbers might be different for the USB
controller and sound card on other systems, but it is always 00:02.0
for the Intel IGD from what I have read. On my system, from the output
of lspci listed above the USB 3.0 controller (xHCI as opposed to EHCI
which indicates an older and slower USB 2.0 controller) is 00:14.0
and the sound card is 00:1b.0, and this is where the arguments for the
pci = ... statement in the domU config file and the xl pci-assignable-add
commands come from.
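
The lookup described above can also be done mechanically. The
following Python sketch is only an illustration with made-up helper
names. It assumes the default lspci text format, with one device per
line in the form "bus:slot.function class: description", and the
sample lines are copied from the listing in this message.

```python
def parse_lspci(text):
    """Map each bus:slot.function address to (device class, description)."""
    devices = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        bdf, _, rest = line.partition(" ")
        dev_class, _, description = rest.partition(": ")
        devices[bdf] = (dev_class, description)
    return devices


def find_devices(devices, dev_class, keyword=""):
    """Return addresses whose class matches and whose description has keyword."""
    return sorted(
        bdf
        for bdf, (cls, desc) in devices.items()
        if cls == dev_class and keyword.lower() in desc.lower()
    )


SAMPLE = """\
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06)
00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
"""

if __name__ == "__main__":
    devs = parse_lspci(SAMPLE)
    print(find_devices(devs, "VGA compatible controller"))
    print(find_devices(devs, "USB controller", "xHCI"))
    print(find_devices(devs, "Audio device", "High Definition Audio"))
```

The keyword argument matters because a class alone is often ambiguous:
this system has two audio devices and three USB controllers, and only
one of each is passed through.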

Also, it was necessary to add the rdm_policy=relaxed option to the USB
card in the pci = statement, and YMMV with different hardware as far
as how compatible your PCI devices are with PCI passthrough. From what
I have read, the relaxed rdm_policy setting is needed because my USB
card's memory overlaps with other devices, and I think Intel does a
better job isolating the PCI devices with newer hardware. My box is
now almost seven years old and I think newer hardware might not
need that relaxed rdm_policy setting. It would be better to have hardware
that works without this relaxed rdm_policy because allowing passthrough
of devices that overlap with other devices' memory is obviously a
security concern, but my setup does not involve any untrusted domains
so I am comfortable using it in my environment.

All the best,

Chuck



Bug#988333: [Pkg-xen-devel] linux-image-5.10.0-6-amd64: VGA Intel IGD Passthrough to Debian Xen HVM DomUs not working, but Windows Xen HVMs do work

2021-10-25 Thread Chuck Zmudzinski

On 10/23/2021 11:11 AM, Hans van Kranenburg wrote:

Hi!

On 10/19/21 5:44 AM, Chuck Zmudzinski wrote:

On 5/10/2021 1:33 PM, Chuck Zmudzinski wrote:

[...] with buster and bullseye running as the Dom0, I can only get the 
VGA/Passthrough feature to work with Windows Xen HVMs. I would expect both 
Windows and Linux HVMs to work comparably well.

You don't mention the used Xen version (Debian package version) for
buster and bullseye anywhere, so I'll assume it's the latest
4.14.3-1(~deb11u1) one.


Yes, that's the version. The original report from five months ago was for
an earlier version, but the latest version still behaves the same way. I
just tested it a couple of days ago.




[...]

The biggest problems were that the Dom0 reported problems
with IRQ 16 being disabled after starting the bullseye HVM DomU,
and only xl destroy could be used to stop the corrupted process.

Well, at least we have an error somewhere already. That's a starting point.

Can you share the domU config file?


Yes, here it is:

builder = 'hvm'
memory = '3072'
vcpus = '4'
device_model_version = 'qemu-xen'
# device_model_version = 'qemu-xen-traditional'
# This is now bullseye
disk = ['/dev/systems/linux,,xvda,w','/dev/data/linuxdata,,xvdb,w']
name = 'bullseye-hvm'
vif = [ 
'mac=00:16:3E:27:2C:AA,model=e1000,script=vif-route.hvm,ip=192.168.1.4' ]

on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
boot = 'c'
acpi = '1'
apic = '1'
viridian = '1'
xen_platform_pci = '1'
serial = 'pty'
vga = 'none'
sdl = '0'
vnc = '0'
gfx_passthru = '1'
pci = [ '00:1b.0', '00:14.0,rdm_policy=relaxed', '00:02.0' ]



And, other configs you need to have in place to exclude the devices from
being seen as normal devices directly in dom0? (I haven't used
passthrough myself yet, but I read that this is needed.)


I run this script in Dom0 before starting the domain:

#!/bin/bash
modprobe xen-pciback
xl pci-assignable-add 00:02.0
xl pci-assignable-add 00:14.0
xl pci-assignable-add 00:1b.0
xl pci-assignable-list

The script makes the Intel IGD, USB 3.0 controller, and
sound device available to the domain. The pci = ...
statement in the domain config corresponds to these same
three PCI devices.



Can you share more verbose logging done by xl create when using xl -vvv
create ?


I don't have time now, but will do this and report tomorrow.



But, AFAIK what you want to do should be possible yes.


The bullseye HVM DomU still fails to boot on an up-to-date
bullseye Xen Dom0 configured to pass through the same PCI/IGD
devices. The bullseye HVM DomU with IGD passthrough has so
far only been verified to work on an old, slightly modified
jessie Xen Dom0.

More Details: These latest tests are with linux version 5.10.70-1
for bullseye stable. For the jessie Dom0, which worked with the
unmodified bullseye HVM DomU, I had to add a few patches to
the old jessie Xen packages so the unmodified bullseye Xen HVM

Ok, yes, clear, that makes the domU kernel not the primary suspect.


These tests demonstrate that a fix for this bug is possible in src:xen
rather than in src:linux, but the patches needed to fix this bug in
Xen 4.14, which is the version of Xen on bullseye, are not yet
identified.

It might also be possible (just a wild guess) that for Xen 4.14, the
options in the domU config file need to be different than for Xen 4.4.


They are a little different already, 4.4 did not need the rdm_policy
setting. But you are right, there are other settings I haven't checked
yet. I will report on some more tests I have done tomorrow when I have
more ti




I will continue to investigate this issue and try to bisect the problem
as it recurs in Dom0 for some version of Xen > 4.4 and <= 4.14. It
will obviously take some time since there are so many differences
between Xen 4.4 and 4.14.

If you can make progress on that, and find an actual commit that changes
the behavior, then we're probably at 95% towards finding a cause and
solution. :) That'd be great.

A possible time-saver that I can recommend is to send a post to the
upstream xen-users list [0] about this already. Like "Hi all, I'm
starting a HVM Linux domU with Linux 5.10.70 on a Xen 4.14.3 system with
also 5.10.70 dom0 kernel, with this and this domU config file. It fails
to start, this is the xl -vvv create output, and this error (the irq
stuff) appears in the dom0 kernel log.". Try to keep it simple and not
too long initially, without the surrounding stories, to increase chance
of it being fully read.


I can do this soon - I have some more interesting tests to share
here and with the Xen developers upstream.




If I find a fix in src:xen for Xen >=4.14 Dom0 on bullseye or sid, I will
reassign #988333 to src:xen myself. Until then, I will leave it to the
discretion of the Debian Kernel Team to decide whether or not to
reassign it to src:xen now.

Yes, that makes sense indeed, I'll do it in a minute. Even while we
don't know if it has to do with the Xen or dom0 kernel code, it's more
li

Bug#988333: linux-image-5.10.0-6-amd64: VGA Intel IGD Passthrough to Debian Xen HVM DomUs not working, but Windows Xen HVMs do work

2021-10-19 Thread Chuck Zmudzinski

On 5/10/2021 1:33 PM, Chuck Zmudzinski wrote:

Package: src:linux
Version: 5.10.28-1
Severity: normal
Tags: upstream

Dear Maintainer,

I have been using Xen's PCI and VGA passthrough feature since wheezy and jessie 
were the stable versions, and back then both Windows HVMs and Linux HVMs would 
function with the Intel Integrated Graphics Device (IGD), the audio device, and 
the USB 3 controller passed to them. But with buster and bullseye running as 
the Dom0, I can only get the VGA/Passthrough feature to work with Windows Xen 
HVMs. I would expect both Windows and Linux HVMs to work comparably well.




Dear Debian Kernel Team and Debian Xen Team,

I originally reported this bug in src:linux, as described above, but
recent tests indicate a fix can be made in src:xen without any
modifications to src:linux, so I suggest reassigning it from
src:linux to src:xen. My explanation follows:

On my system which is an ASRock B85M Pro4 (Haswell), with BIOS
P2.50 12/11/2015, and with a jessie Xen Dom0 with a few patches
to the old jessie Xen packages, I was able to successfully pass
through the USB 3.0 controller, the sound card, and the Intel IGD
to an unmodified bullseye HVM DomU without any of the
problems I reported in the original bug report (message #5):

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=988333#5

The biggest problems were that the Dom0 reported problems
with IRQ 16 being disabled after starting the bullseye HVM DomU,
and only xl destroy could be used to stop the corrupted process.

The bullseye HVM DomU still fails to boot on an up-to-date
bullseye Xen Dom0 configured to pass through the same PCI/IGD
devices. The bullseye HVM DomU with IGD passthrough has so
far only been verified to work on an old, slightly modified
jessie Xen Dom0.

More Details: These latest tests are with linux version 5.10.70-1
for bullseye stable. For the jessie Dom0, which worked with the
unmodified bullseye HVM DomU, I had to add a few patches to
the old jessie Xen packages so the unmodified bullseye Xen HVM
DomU would boot on the jessie Xen Dom0 that uses a fairly
old version of Xen (version 4.4). Specifically, it was necessary to
add two upstream Xen patches to the old jessie Xen-4.4 packages:

1. https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=98297f0
2. https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=09a4ef8

The first patch is needed to support booting Linux kernels >= 4.10 in
a Xen HVM DomU on a Xen 4.4 Dom0, since that is when the Linux
kernel started validating the timestamp counter adjust msr for hvm
guests, and the validation fails on Xen 4.4 without the first patch.

Modern versions of Linux expect the Dom0 to provide feature-XXX
flags in xenstore for the DomU. The 4.4 version of libxl does not
provide this so the second patch provides it for version 4.4 of libxl.

Neither of these two patches specifically solves the problem with
IGD passthrough, they are simply needed to enable the old Xen 4.4
hypervisor and tools to boot modern Linux kernels in a Xen HVM
DomU, with or without PCI/IGD passthrough.

It was also necessary to use the ancient Xen qemu traditional device
model since the old Xen 4.4 does not support IGD passthrough with
the older upstream qemu device model for jessie. I did not do
anything special here. I just compiled the qemu-xen-traditional
binary from the xenbits git repository:

https://xenbits.xen.org/gitweb/?p=qemu-xen-traditional.git;a=commit;h=204a7fc1

and recompiled hvmloader with rombios as required by
qemu-xen-traditional and integrated these with the Debian
packaging of Xen 4.4 for jessie the way it was done with
Xen 4.1 on wheezy before Debian removed qemu-xen-traditional
from the Debian Xen packages.

On the jessie Dom0, no other changes were made. For example, it
used the latest 3.16.0-11 kernel for jessie.

These tests demonstrate that a fix for this bug is possible in src:xen
rather than in src:linux, but the patches needed to fix this bug in
Xen 4.14, which is the version of Xen on bullseye, are not yet
identified.

I will continue to investigate this issue and try to bisect the problem
as it recurs in Dom0 for some version of Xen > 4.4 and <= 4.14. It
will obviously take some time since there are so many differences
between Xen 4.4 and 4.14.
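
The search between a known-good and a known-bad release can be
organized as a binary search. The Python sketch below is a generic
illustration, not a record of any test that was run. The function
name and the is_bad callback are hypothetical, and it assumes that
the first release in the list works, the last one fails, and that
once the regression appears it stays in all later releases.

```python
def first_bad(versions, is_bad):
    """Binary search for the first version for which is_bad() is true.

    Assumes versions[0] is good, versions[-1] is bad, and that every
    version after the first bad one is also bad.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


if __name__ == "__main__":
    releases = ["4.4", "4.5", "4.6", "4.7", "4.8", "4.9",
                "4.10", "4.11", "4.12", "4.13", "4.14"]
    tested = []

    def pretend_test(version):
        # Stand-in for building that release and trying IGD passthrough.
        # The cut-off used here is made up for the demonstration; the
        # real first bad release has not been identified.
        tested.append(version)
        return releases.index(version) >= 7

    print(first_bad(releases, pretend_test), "after testing", tested)
```

With eleven releases the loop needs at most four test builds. The same
halving can then be repeated over the commits between the last good
and the first bad release.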

If I find a fix in src:xen for Xen >=4.14 Dom0 on bullseye or sid, I will
reassign #988333 to src:xen myself. Until then, I will leave it to the
discretion of the Debian Kernel Team to decide whether or not to
reassign it to src:xen now.

Regards,

Chuck



Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-10-04 Thread Chuck Zmudzinski

On 10/4/2021 1:51 PM, Diederik de Haas wrote:

On Monday, 4 October 2021 17:27:22 CEST Chuck Zmudzinski wrote:

  I can confirm these 4 fix the bug on my hardware.

\o/
Thanks for testing and reporting back :-)

Cheers,
   Diederik


Thank you, Diederik, for your good work finding the commits
from upstream that fix the bug. And also thanks to you, Andy,
for helping fix this bug on IRC and for your interest in and
support of the Debian Xen Team's work.

Cheers,

Chuck



Bug#994899: Bug#991967: Simply ACPI powerdown/reset issue?

2021-10-04 Thread Chuck Zmudzinski
As discussed in message #91, the submitter of this bug accepts the 
package maintainer's fix which will close this bug.




Bug#994899: Bug#991967: Simply ACPI powerdown/reset issue?

2021-10-04 Thread Chuck Zmudzinski

On 10/4/2021 6:57 AM, Diederik de Haas wrote:

On Monday, 4 October 2021 11:46:54 CEST Hans van Kranenburg wrote:

The 4th one is not explicitly tagged with Fixes: 1c4aa69ca1e1, but I
agree with Diederik that we should keep them all together.

Context: Those 4 are part of 1 patch-set posted here:
https://lists.xen.org/archives/html/xen-devel/2020-11/msg01516.html

The 5th was already debatable and I chose to include it in my MR, but I'm fine
with not including that one.

Cheers,
   Diederik


As the submitter of #994899, I can confirm these 4 fix the bug
on my hardware. I agree this fix can close #994899 and #995341,
since as Hans noted, they are part of the upstream stable 4.15 branch
and I presume that will make them stable enough for bullseye.

Thank you Hans, Diederik, and Elliott.

All the best,

Chuck



Bug#995341: Highly inappropriate behavior which the RT should be aware of

2021-10-03 Thread Chuck Zmudzinski

On 10/3/2021 11:21 AM, Chuck Zmudzinski wrote:

On 10/1/2021 5:48 AM, Diederik de Haas wrote:
We've already identified a possible fix, which I can point to if so desired,


I think the fix referred to is here:

https://salsa.debian.org/xen-team/debian-xen/-/tree/knorrie/for-diederik-3-fixes 



AFAICT, this fix involves adding three more commits


Slight correction - the branch name makes it look like this proposal
involves 3 fixes for Diederik, but it actually involves five new commits
from the upstream unstable Xen 4.16 branch, as indicated by the five new
patches in the debian/patches directory.


from the
unstable upstream Xen 4.16 branch in addition to the nine
commits already added from unstable upstream Xen 4.16
to provide better support for the Raspberry Pi 4 but with
the unintended side effect of #994899.

I do not object to this fix for Debian's current unstable
distribution. However, this bug concerns Debian's
current stable version, bullseye, not sid/unstable.

I would respectfully disagree with the Release Team's
decision to migrate the aforementioned fix from the
unstable release to bullseye unless the fix is accepted
by the upstream Xen project in its stable 4.14 branch and
its future stable point releases 4.14.x.

IMO, the debdiff attached to message #30:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995341#30

is a better suited fix more in accordance with the stability/security
requirements of a typical Debian stable release.

Regards,

Chuck Zmudzinski




Bug#995341: release.debian.org: Xen dom0 does not power off on bullseye (stable)

2021-10-03 Thread Chuck Zmudzinski
The original submitter has proposed a fix (see messages #30 and #35).
Another contributor to this report has indicated that the package
maintainer does not endorse the submitter's proposed fix and is working
on another fix (see messages #40 and #65). The original submitter
thinks the possible solution proposed in messages #40 and #65 does not
meet the stability/security requirements of a typical Debian stable
release.




Bug#995341: Highly inappropriate behavior which the RT should be aware of

2021-10-03 Thread Chuck Zmudzinski

On 10/1/2021 5:48 AM, Diederik de Haas wrote:

We've already identified a possible fix, which I can point to if so desired,


I think the fix referred to is here:

https://salsa.debian.org/xen-team/debian-xen/-/tree/knorrie/for-diederik-3-fixes

AFAICT, this fix involves adding three more commits from the
unstable upstream Xen 4.16 branch in addition to the nine
commits already added from unstable upstream Xen 4.16
to provide better support for the Raspberry Pi 4 but with
the unintended side effect of #994899.

I do not object to this fix for Debian's current unstable
distribution. However, this bug concerns Debian's
current stable version, bullseye, not sid/unstable.

I would respectfully disagree with the Release Team's
decision to migrate the aforementioned fix from the
unstable release to bullseye unless the fix is accepted
by the upstream Xen project in its stable 4.14 branch and
its future stable point releases 4.14.x.

IMO, the debdiff attached to message #30:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995341#30

is a better suited fix more in accordance with the stability/security
requirements of a typical Debian stable release.

Regards,

Chuck Zmudzinski



Bug#995341: Highly inappropriate behavior which the RT should be aware of

2021-10-02 Thread Chuck Zmudzinski

On 10/1/2021 5:48 AM, Diederik de Haas wrote:

Hi Release Team,

I want to make sure that you're aware of what I consider HIGHLY inappropriate
behavior by Chuck where he is trying to sidestep/override the Xen maintainers
by filing this bug directly to the release.debian.org pseudo package.

This only appeared on the Debian Xen maintainers' ML because Chuck went on a
severity-dance where he *also* changed the severity of bug #994899, which _is_
assigned to the Xen package and therefore the Xen maintainers could see it.

In https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994899#10 I tried to
steer the efforts to getting the issue fixed in a more constructive manner.
I failed at that.

We've already identified a possible fix, which I can point to if so desired, but
I don't think the RT should be bothered with this (dispute).


I respectfully disagree because, as I have mentioned repeatedly both
in this report and in #994899, the process of migrating the stable
package of the xen hypervisor for amd64 broke: when bullseye was
released on August 14, 2021, that package still contained patches
from the unstable upstream Xen 4.16 branch, whereas the version
advertised by Debian for the stable release was, and still is, the
stable upstream version of Xen, 4.14.


It may be news to Chuck, but not the RT, but a package maintainer has the
prerogative to include additional patches in the package that gets uploaded to
the Debian archive. (It happens all the time)
And that can introduce bugs. Shit happens. You learn from that. And then you
go on fixing those bugs *in coordination with* the package maintainers.


Perhaps it may be news to Diederik that the Release Team does have
the prerogative to review and either accept or reject a series of
patches from an unstable upstream branch into the stable release
and respond to a request from a user/volunteer to review such
patches that obviously can and in fact did cause bug #994899 in
this case.

It may have just been an oversight, but in this case, IMO, the package
maintainers *should* have notified the release team of the unstable
patches from Xen 4.16 that were in the supposedly stable 4.14 xen
hypervisor package for amd64 sometime BEFORE bullseye was released
as the new stable version on August 14, 2021, so the Release Team
could decide if the unstable patches could stay in the formal release of
bullseye. IMHO, it is up to the Release Team, not the package maintainers,
to decide if Debian specific patches from an UNSTABLE upstream branch can
remain in a package of the STABLE upstream version at the time of the stable
release. The package maintainers never gave the Release Team a chance to
review the upstream unstable patches before bullseye was released.

It is also for the Release Team, not the package maintainers, to
decide if those unstable patches can remain after a user/volunteer
requests that they be removed as the appropriate way to fix a bug
in the stable release that is caused by the presence of the unstable
patches in the stable release.

I would be much less inclined to request that the Release Team review
the unstable patches that are causing #994899 if there was some
evidence that upstream plans to eventually backport those patches from
Xen 4.16 to Xen 4.14. At present no such evidence exists, and perhaps
a way to resolve this controversy is for Debian to submit a pull
request to the Xen project to merge the unstable patches in Debian's
current Xen 4.14 packages into Xen's stable Xen 4.14 branch. If upstream
endorses the unstable patches as suitable for their 4.14 stable release and
eventually commits them to their 4.14 branch and subsequent upstream
point releases, then I would also accept them as appropriate for the Debian
package of the upstream stable 4.14 version of Xen that targets the stable
version, currently bullseye.

Regards,

Chuck Zmudzinski



What you don't do, is try to go above/around them by addressing the RT
directly.
One should have at least the decency to directly To/CC the package maintainer
when you do, which in 99.99+% of cases you REALLY should not do.

Regards,
   Diederik




Bug#995341: Highly inappropriate behavior which the RT should be aware of

2021-10-02 Thread Chuck Zmudzinski

On 10/1/2021 5:48 AM, Diederik de Haas wrote:

Hi Release Team,

I want to make sure that you're aware of what I consider HIGHLY inappropriate
behavior by Chuck where he is trying to sidestep/override the Xen maintainers
by filing this bug directly to the release.debian.org pseudo package.


I consider it also highly inappropriate for one volunteer to criticize
a newcomer volunteer without at least a Cc to the volunteer he is
criticizing, to give the volunteer under attack an opportunity to
respond and defend herself/himself.



This only appeared on the Debian Xen maintainers' ML because Chuck went on a
severity-dance where he *also* changed the severity of bug #994899, which _is_
assigned to the Xen package and therefore the Xen maintainers could see it.

In https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994899#10 I tried to
steer the efforts to getting the issue fixed in a more constructive manner.
I failed at that.

We've already identified a possible fix, which I can point to if so desired, but
I don't think the RT should be bothered with this (dispute).
It may be news to Chuck, though not to the RT, that a package maintainer has the
prerogative to include additional patches in the package that gets uploaded to
the Debian archive. (It happens all the time)
And that can introduce bugs. Shit happens. You learn from that. And then you
go on fixing those bugs *in coordination with* the package maintainers.


I agree package maintainers must have a say, and as far as I can tell the
Release Team has the final say on what goes into the stable release.
I have tried to cooperate with volunteers for the package maintainers,
but they refused to cooperate with me. When volunteers for the
package maintainers are uncooperative and excessively critical and
unfair to a newcomer volunteer, what is a newcomer to do?
Does Debian really consider this the best way to sustain the community
and acquire new developers as veterans move on or quit? IMHO, following
Diederik's approach toward newcomers will result in a slow and painful
death for Debian as competent developers move on and no one is there
to replace them because it is just not worth the personal attacks one
must endure when trying to contribute to Debian.



What you don't do, is try to go above/around them by addressing the RT
directly.
One should have at least the decency to directly To/CC the package maintainer
when you do, which in 99.99+% of cases you REALLY should not do.


One should also have the decency to Cc a person one is criticizing,
something I have done by sending a Cc to the person I am criticizing
(in my defense of his attack on me). Diederik did not have this decency
when he criticized me.

Respectfully,

Chuck Zmudzinski



Regards,
   Diederik




Bug#995341: release.debian.org: Xen dom0 does not power off on bullseye (stable)

2021-10-01 Thread Chuck Zmudzinski

On 9/29/2021 7:26 PM, Chuck Zmudzinski wrote:



Special instructions for applying the debdiff:


Please note that an updated debdiff has been provided
to target the correct distribution and use the correct
version number for the updated package at the following
link:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995341#30

Also, a clarification:

The special instructions below (the quilt pop -a and
quilt push -a commands) are only required to test the debdiff
on a local build. No special processing should be required
on automated test builds.

Chuck




1. Download the attached debdiff, bug#994899.diff, into a working directory


Then in the working directory with the attached bug#994899.diff and the
archived/compressed source file archives for xen_4.14.3-1~deb11u1 in it:

2. dpkg-source -x xen_4.14.3-1~deb11u1.dsc
3. cd xen-4.14.3
4. quilt pop -a
5. patch -p1 < ../bug#994899.diff
6. quilt push -a

After these 6 steps, the tree is ready
to build the source/binary packages.
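
The steps above can be sketched as one script. This is only a sketch:
the DRY_RUN switch and the run helper are assumptions added for
illustration and are not part of the original instructions. By default
the commands are printed, not executed.

```sh
#!/bin/sh
# Sketch of steps 2-6 above. Step 1 (downloading the debdiff) is manual.
# Assumption: DRY_RUN and run() are illustrative helpers, not Debian tools.
DSC="xen_4.14.3-1~deb11u1.dsc"
SRCDIR="xen-4.14.3"
DEBDIFF="bug#994899.diff"   # quoted wherever used, because of the '#'

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

apply_debdiff() {
    run dpkg-source -x "$DSC" || return 1       # unpack the source package
    run cd "$SRCDIR" || return 1
    run quilt pop -a || return 1                # unapply all Debian patches
    run patch -p1 -i "../$DEBDIFF" || return 1  # apply the debdiff
    run quilt push -a                           # reapply the patch series
}

apply_debdiff
```

Running it with DRY_RUN=0 in the working directory executes the same
commands for real; it stops at the first command that fails.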





Bug#995341: release.debian.org: Xen dom0 does not power off on bullseye (stable)

2021-09-30 Thread Chuck Zmudzinski

On 9/29/2021 7:26 PM, Chuck Zmudzinski wrote:


Ordinarily, as I understand the process, a bug in the
stable version is first fixed in the unstable release
and then the fix is migrated (backported) to the
stable release. But it appears to me a fix in the
unstable release will not be forthcoming soon, or
it might be a different bug (see #991967, affecting the
unstable release, sid, for more details).


Another way to look at this unusual situation:

At the present time the Xen packages targeting
the unstable version are identical to the
Xen packages targeting the stable version. In other
words, either the stable version is not really stable
or the unstable version is actually stable. I argue it
is the former and somewhere along the line the
process for migrating a stable version of Xen into
bullseye broke. I have identified when the process
broke. It was when patches from unstable upstream
Xen 4.16 were migrated to bullseye even though
the upstream version of Xen for both stable and
unstable was stable upstream Xen 4.14. In other words,
the current Debian version of Xen targeting the
stable distribution is actually an unstable version
of Debian Xen that is a mixture of mostly stable Xen 4.14
and nine unstable patches from upstream
Xen 4.16. This is causing instabilities and bugs such as
#994899 and #991967 on amd64 and also likely i386.
This upload to stable fixes this by removing the instabilities
on amd64 and i386 without removing the good work
done by the Debian Xen Team improving support
for arm devices. So it is a win-win to accept this upload
to stable.

Going forward, work on Debian's unstable version
of Xen can continue with investigating and fixing
#991967 and eventually updating to a newer upstream
version, probably at least Xen 4.16. My tests already
show that #994899 is fixed upstream in Xen 4.16, as
discussed here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994899#5

To quote (with corrections and clarifications) from the
aforementioned message:

I also tested the current unstable (master) branch
from Xen upstream, which is xen-c76cfad, which
(upstream) calls Xen-4.16-unstable. I tested
the current bullseye kernel (5.10.46-4) as a
dom0 on that upstream Xen-4.16 hypervisor
and did not see the bug, so this most definitely
is NOT an upstream bug.

So for all practical purposes, #994899 *is* fixed upstream
in upstream's unstable version Xen 4.16. So we should be
free to patch it in stable now.

Chuck



Bug#995341: release.debian.org: Xen dom0 does not power off on bullseye (stable)

2021-09-30 Thread Chuck Zmudzinski

On 9/30/2021 2:57 PM, Paul Gevers wrote:

Hi Chuck,

On 30-09-2021 18:15, Chuck Zmudzinski wrote:


... the debdiff I uploaded to BTS has UNRELEASED
rather than bullseye for the distribution field of the changelog,
and the new target version is ...deb11u1.1 instead of deb11u2.
That is how dch formatted the changelog when I generated
the debdiff. Let me know if I need to fix the debdiff filename and
changelog with a new debdiff that corrects those, and I will
upload it to 995...@bugs.debian.org.

You could indeed send a follow-up with a debdiff that fixes these issues
as that saves the stable release managers a round trip to you and less
brain power. You'd need to fix it locally anyways to actually do the upload.


Updated debdiff with these fixes to the changelog is attached to this message.
Changelog also Closes #994899.
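
For anyone who wants to check the two fields discussed above before
sending a debdiff, here is a minimal sketch that parses a changelog
header line with sed. The function names are made up for illustration.
On a real tree, dpkg-parsechangelog -S Distribution and
dpkg-parsechangelog -S Version report the same fields.

```sh
#!/bin/sh
# Sketch: extract version and distribution from a Debian changelog header
# such as:  xen (4.14.3-1~deb11u2) bullseye; urgency=medium
# Assumption: these helper names are illustrative, not an existing tool.

changelog_version() {
    printf '%s\n' "$1" | sed -n 's/^[^ ]* (\([^)]*\)) .*$/\1/p'
}

changelog_dist() {
    printf '%s\n' "$1" | sed -n 's/^[^ ]* ([^)]*) \([^;]*\);.*$/\1/p'
}

# ready_for_stable HEADER EXPECTED_DIST
ready_for_stable() {
    dist=$(changelog_dist "$1")
    if [ "$dist" = "$2" ]; then
        echo "ok: targets $dist"
    else
        echo "not ready: targets $dist, expected $2"
    fi
}

# The header dch generated first, then the corrected one.
ready_for_stable 'xen (4.14.3-1~deb11u1.1) UNRELEASED; urgency=medium' bullseye
ready_for_stable 'xen (4.14.3-1~deb11u2) bullseye; urgency=medium' bullseye
```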

Chuck
diff -Nru xen-4.14.3/debian/changelog xen-4.14.3/debian/changelog
--- xen-4.14.3/debian/changelog 2021-09-13 10:28:21.0 -0400
+++ xen-4.14.3/debian/changelog 2021-09-30 16:06:50.0 -0400
@@ -1,3 +1,12 @@
+xen (4.14.3-1~deb11u2) bullseye; urgency=medium
+
+  * Non-maintainer upload.
+  * debian/patches - move RPI4 patches into a separate directory
+  * debian/rules - disable RPI4 patches on amd64|i386 (Closes: #994899)
+  * debian/control - add Build Dependency quilt
+
+ -- Chuck Zmudzinski   Thu, 30 Sep 2021 16:06:50 -0400
+
 xen (4.14.3-1~deb11u1) bullseye-security; urgency=medium
 
   * Rebuild for bullseye-security
diff -Nru xen-4.14.3/debian/control xen-4.14.3/debian/control
--- xen-4.14.3/debian/control   2021-07-10 08:01:39.0 -0400
+++ xen-4.14.3/debian/control   2021-09-26 22:21:51.0 -0400
@@ -34,6 +34,7 @@
markdown,
ocaml-native-compilers | ocaml-nox,
ocaml-findlib,
+   quilt,
 Homepage: https://xenproject.org/
 Vcs-Browser: https://salsa.debian.org/xen-team/debian-xen
 Vcs-Git: https://salsa.debian.org/xen-team/debian-xen.git
diff -Nru xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch
--- xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch 2021-09-13 10:25:25.0 -0400
+++ xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch 1969-12-31 19:00:00.0 -0500
@@ -1,105 +0,0 @@
-From: Stefano Stabellini 
-Date: Fri, 2 Oct 2020 13:47:17 -0700
-Subject: xen/rpi4: implement watchdog-based reset
-
-The preferred method to reboot RPi4 is PSCI. If it is not available,
-touching the watchdog is required to be able to reboot the board.
-
-The implementation is based on
-drivers/watchdog/bcm2835_wdt.c:__bcm2835_restart in Linux v5.9-rc7.
-
-Signed-off-by: Stefano Stabellini 
-Acked-by: Julien Grall 
-Reviewed-by: Bertrand Marquis 
-Tested-by: Roman Shaposhnik 
-CC: ro...@zededa.com
-(cherry picked from commit 25849c8b16f2a5b7fcd0a823e80a5f1b590291f9)

- xen/arch/arm/platforms/brcm-raspberry-pi.c | 61 ++
- 1 file changed, 61 insertions(+)
-
-diff --git a/xen/arch/arm/platforms/brcm-raspberry-pi.c b/xen/arch/arm/platforms/brcm-raspberry-pi.c
-index f5ae58a..811b40b 100644
---- a/xen/arch/arm/platforms/brcm-raspberry-pi.c
-+++ b/xen/arch/arm/platforms/brcm-raspberry-pi.c
-@@ -17,6 +17,10 @@
-  * GNU General Public License for more details.
-  */
- 
-+#include 
-+#include 
-+#include 
-+#include 
- #include 
- 
- static const char *const rpi4_dt_compat[] __initconst =
-@@ -37,12 +41,69 @@ static const struct dt_device_match rpi4_blacklist_dev[] __initconst =
-  * The aux peripheral also shares a page with the aux UART.
-  */
- DT_MATCH_COMPATIBLE("brcm,bcm2835-aux"),
-+/* Special device used for rebooting */
-+DT_MATCH_COMPATIBLE("brcm,bcm2835-pm"),
- { /* sentinel */ },
- };
- 
-+
-+#define PM_PASSWORD 0x5a00
-+#define PM_RSTC 0x1c
-+#define PM_WDOG 0x24
-+#define PM_RSTC_WRCFG_FULL_RESET0x0020
-+#define PM_RSTC_WRCFG_CLR   0xffcf
-+
-+static void __iomem *rpi4_map_watchdog(void)
-+{
-+void __iomem *base;
-+struct dt_device_node *node;
-+paddr_t start, len;
-+int ret;
-+
-+node = dt_find_compatible_node(NULL, NULL, "brcm,bcm2835-pm");
-+if ( !node )
-+return NULL;
-+
-+ret = dt_device_get_address(node, 0, &start, &len);
-+if ( ret )
-+{
-+printk("Cannot read watchdog register address\n");
-+return NULL;
-+}
-+
-+base = ioremap_nocache(start & PAGE_MASK, PAGE_SIZE);
-+if ( !base )
-+{
-+printk("Unable to map watchdog register!\n");
-+return NULL;
-+}
-+
-+return base;
-+}
-+
-+static void rpi4_reset(void)
-+{
-+uint32_t val;
-+void __iomem *base = rpi4_map_watchdog();
-+
-+if ( !base )
-+return;
-+
-+/* use a timeout of 10 ticks (~150us) */
-+writel(10 | PM_PASSW

Bug#995341: release.debian.org: Xen dom0 does not power off on bullseye (stable)

2021-09-30 Thread Chuck Zmudzinski

On 9/30/2021 2:57 PM, Paul Gevers wrote:

Hi Chuck,

On 30-09-2021 18:15, Chuck Zmudzinski wrote:


... the debdiff I uploaded to BTS has UNRELEASED
rather than bullseye for the distribution field of the changelog,
and the new target version is ...deb11u1.1 instead of deb11u2.
That is how dch formatted the changelog when I generated
the debdiff. Let me know if I need to fix the debdiff filename and
changelog with a new debdiff that corrects those, and I will
upload it to 995...@bugs.debian.org.

You could indeed send a follow-up with a debdiff that fixes these issues
as that saves the stable release managers a round trip to you and less
brain power. You'd need to fix it locally anyways to actually do the upload.

Two notes:
1) https://lists.debian.org/debian-devel-announce/2019/08/msg0.html
(under "Workflow").
2) https://lists.debian.org/debian-live/2021/09/msg00027.html

Paul



I will prepare the updated debdiff.

I understand I am asking for an exception to the Release
Team's first "usual" criterion for acceptance:

   * The bug you want to fix in stable must be fixed in unstable
 already (and not waiting in NEW or the delayed queue)

I don't know if they will be willing to make an exception. Probably
not before 11.1 comes out as I am sure they are busy dealing with
the upcoming point release. I do hope they will read my whole
report before deciding. What happened here is very unusual
for Debian, and IMHO Debian would not be renowned for
stability if it happened more often.

Chuck



Bug#995341: release.debian.org: Xen dom0 does not power off on bullseye (stable)

2021-09-30 Thread Chuck Zmudzinski

Control: severity -1 normal

After reading some other bug reports, I now think this
bug's severity should be normal, not important.

Regards,

Chuck



Bug#995341: release.debian.org: Xen dom0 does not power off on bullseye (stable)

2021-09-29 Thread Chuck Zmudzinski
just a user, not a developer, so I could only test
the patch on my amd64 system for the amd64 package.
Other architectures (i386, arm64, etc.) and
crossbuilding (if this package is crossbuilt on
buildd) need to be tested/verified before uploading.

If you upload this patch (or another patch that does the same)
you can close this bug and #994899.

For more information about this problem, please see the
messages in #994899:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994899

and in #991967:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967

All the best,

Chuck Zmudzinski
diff -Nru xen-4.14.3/debian/changelog xen-4.14.3/debian/changelog
--- xen-4.14.3/debian/changelog 2021-09-13 10:28:21.0 -0400
+++ xen-4.14.3/debian/changelog 2021-09-27 11:51:02.0 -0400
@@ -1,3 +1,12 @@
+xen (4.14.3-1~deb11u1.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * debian/patches - move RPI4 patches into a separate directory
+  * debian/rules - disable RPI4 patches on amd64|i386 to fix #994899
+  * debian/control - add Build Dependency quilt
+
+ -- Chuck Zmudzinski   Mon, 27 Sep 2021 11:51:04 -0400
+
 xen (4.14.3-1~deb11u1) bullseye-security; urgency=medium
 
   * Rebuild for bullseye-security
diff -Nru xen-4.14.3/debian/control xen-4.14.3/debian/control
--- xen-4.14.3/debian/control   2021-07-10 08:01:39.0 -0400
+++ xen-4.14.3/debian/control   2021-09-26 22:21:51.0 -0400
@@ -34,6 +34,7 @@
markdown,
ocaml-native-compilers | ocaml-nox,
ocaml-findlib,
+   quilt,
 Homepage: https://xenproject.org/
 Vcs-Browser: https://salsa.debian.org/xen-team/debian-xen
 Vcs-Git: https://salsa.debian.org/xen-team/debian-xen.git
diff -Nru xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch
--- xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch 2021-09-13 10:25:25.0 -0400
+++ xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch 1969-12-31 19:00:00.0 -0500
@@ -1,105 +0,0 @@
-From: Stefano Stabellini 
-Date: Fri, 2 Oct 2020 13:47:17 -0700
-Subject: xen/rpi4: implement watchdog-based reset
-
-The preferred method to reboot RPi4 is PSCI. If it is not available,
-touching the watchdog is required to be able to reboot the board.
-
-The implementation is based on
-drivers/watchdog/bcm2835_wdt.c:__bcm2835_restart in Linux v5.9-rc7.
-
-Signed-off-by: Stefano Stabellini 
-Acked-by: Julien Grall 
-Reviewed-by: Bertrand Marquis 
-Tested-by: Roman Shaposhnik 
-CC: ro...@zededa.com
-(cherry picked from commit 25849c8b16f2a5b7fcd0a823e80a5f1b590291f9)

- xen/arch/arm/platforms/brcm-raspberry-pi.c | 61 ++
- 1 file changed, 61 insertions(+)
-
-diff --git a/xen/arch/arm/platforms/brcm-raspberry-pi.c b/xen/arch/arm/platforms/brcm-raspberry-pi.c
-index f5ae58a..811b40b 100644
---- a/xen/arch/arm/platforms/brcm-raspberry-pi.c
-+++ b/xen/arch/arm/platforms/brcm-raspberry-pi.c
-@@ -17,6 +17,10 @@
-  * GNU General Public License for more details.
-  */
- 
-+#include 
-+#include 
-+#include 
-+#include 
- #include 
- 
- static const char *const rpi4_dt_compat[] __initconst =
-@@ -37,12 +41,69 @@ static const struct dt_device_match rpi4_blacklist_dev[] __initconst =
-  * The aux peripheral also shares a page with the aux UART.
-  */
- DT_MATCH_COMPATIBLE("brcm,bcm2835-aux"),
-+/* Special device used for rebooting */
-+DT_MATCH_COMPATIBLE("brcm,bcm2835-pm"),
- { /* sentinel */ },
- };
- 
-+
-+#define PM_PASSWORD 0x5a00
-+#define PM_RSTC 0x1c
-+#define PM_WDOG 0x24
-+#define PM_RSTC_WRCFG_FULL_RESET0x0020
-+#define PM_RSTC_WRCFG_CLR   0xffcf
-+
-+static void __iomem *rpi4_map_watchdog(void)
-+{
-+void __iomem *base;
-+struct dt_device_node *node;
-+paddr_t start, len;
-+int ret;
-+
-+node = dt_find_compatible_node(NULL, NULL, "brcm,bcm2835-pm");
-+if ( !node )
-+return NULL;
-+
-+ret = dt_device_get_address(node, 0, &start, &len);
-+if ( ret )
-+{
-+printk("Cannot read watchdog register address\n");
-+return NULL;
-+}
-+
-+base = ioremap_nocache(start & PAGE_MASK, PAGE_SIZE);
-+if ( !base )
-+{
-+printk("Unable to map watchdog register!\n");
-+return NULL;
-+}
-+
-+return base;
-+}
-+
-+static void rpi4_reset(void)
-+{
-+uint32_t val;
-+void __iomem *base = rpi4_map_watchdog();
-+
-+if ( !base )
-+return;
-+
-+/* use a timeout of 10 ticks (~150us) */
-+writel(10 | PM_PASSWORD, base + PM_WDOG);
-+val = readl(base + PM_RSTC);
-+val &= PM_RSTC_WRCFG_CLR;
-+val |= PM_PASSWORD | PM_RSTC_WRCFG_FULL_RESET;
-+writel(val, base + PM_RSTC);
-+
-+/* No sleeping, possibly atomic. */
-+mdelay(1);
-+}
-+

Bug#991967: Simply ACPI powerdown/reset issue?

2021-09-29 Thread Chuck Zmudzinski

This corrects typos - I referenced the wrong bug # in
a few places.

On 9/25/2021 11:27 PM, Elliott Mitchell wrote:


Since the purpose of the bug reports is to find and diagnose bugs, I did
a bit of experimentation and made some observations.

I checked out the Debian Xen source via git.  I got the current
"master" branch which is presently the candidate 4.14.3-1 version,
which includes urgent fixes.  The hash is:
e7a17db0305c8de891b366ad3528e5a43015

On top of this I cherry-picked 3 commits from Xen's main branch:
5a4087004d1adbbb223925f3306db0e5824a2bdc
0f089bbf43ecce6f27576cb548ba4341d0ec46a8
bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b


By main branch, I presume you mean the unstable
4.16 branch of Xen. Correct?

(these can be retrieved via Xen's gitweb at
https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=<$hash> which is
suitable for the `git am` command)

With these I built 4.14.3-1 and then tried kernels 4.19.181-1 and
4.19.194-3 (this system is presently mostly on oldstable).  The results
were:

Xen 4.14.3-1 with Linux 4.19.181-1: system reboots were successful

Xen 4.14.3-1 with Linux 4.19.194-3: system reboots hung



Interesting. Looks like you are homing in on solving this bug. I notice
at the beginning of this message you quoted an older message of mine
which does not take into account that I have reported a new bug
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994899
because I did come to the conclusion, as you did, that there are
in fact two bugs.

I wonder if the results of your modified Xen 4.14.3-1 with
4.19.181-1 and 4.19.194-3 on my hardware would be of help.
I have, as you might recall, older (Haswell) intel, EFI boot
system, and systemd for init/shutdown services.
If I get the same result, then I would agree we are seeing a
regression between those two versions of Linux. Otherwise,
there may also be some tests involving EFI vs. BIOS to
do. Or, based on what I have learned at #994899, we possibly
also need to check systemd vs. sysv-init. Do you want me to
do the test on my hardware?


Unfortunately I was too quick at installing the rebuilt 4.14.3-1 and I
missed trying the vanilla Debian 4.14.2+25-gb6a8c4f72d-2 with
Linux 4.19.181-1. I believe this combination would have hung during
reboot.


I can confirm it did hang on my hardware with this combination of
Xen and Linux versions.

As such, I believe there are in fact two distinct bugs being observed.
The presence of EITHER of these is sufficient to cause hangs during
powerdown or reboot.


And we already have two distinct bugs on BTS.

First, some patch originally from Linux's main branch that breaks Xen reboots
was backported somewhere between 4.19.181-1 and 4.19.194-3.  This may
either have been introduced before 5.10 diverged from main, or may also
have been backported to 5.10.  THIS is Debian bug #991967.


I agree. I believe you.

Second, the Xen patch 3c428e9ecb1f290689080c11e0c37b793425bef1 which is
valuable to ARM devices breaks reboots and powerdowns on x86.  This is
correctly fixed by 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.

Presently
this has no Debian bug report.


That looks a lot like #994899. Have you ruled out the possibility that
this bug is #994899 in disguise? If so, how? Or do you think #994899
is a third bug?

The first is presently unidentified, someone enthusiastic either needs to
read git logs/source code, or bisect and build to find where it got
broken.


Yeah, that's a lot of work. That's how I found my solution for #994899.
For that bug, since the working version was Xen 4.11 and the broken
version was Xen 4.14, the cause could have been in 4.12, 4.13, or 4.14.
So that required a bit of detective work studying git logs, but in the
end, I just tested 4.12, and it was good, then 4.13 and it was good.
I also tested the first Debian version of 4.14, which was actually
experimental on Debian if I recall correctly. It did not include the
RPI4 patches, and it was good too. So I knew the bug was introduced
sometime after that, and I soon identified the RPI4 patches as the place
where the bug (#994899) first appeared on my hardware.
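
The search described above is a bisection over an ordered list of
versions. A minimal sketch in plain shell follows. The is_good function
is a stand-in for "build, boot, and try to power off", and the version
labels (4.14-exp, 4.14+rpi4) are made-up names for the builds mentioned
in this message, not real package versions.

```sh
#!/bin/sh
# Sketch: find the first bad version in a chronologically ordered list.
# Assumption: is_good is faked here so the sketch runs anywhere; in real
# use it would build and test the given version.
is_good() {
    case "$1" in
        4.11|4.12|4.13|4.14-exp) return 0 ;;  # powered off correctly
        *) return 1 ;;                        # hung on poweroff
    esac
}

# first_bad V1 V2 ... Vn
# V1 must be known good and Vn known bad. Prints the first bad version.
first_bad() {
    lo=1        # index of a known-good version
    hi=$#       # index of a known-bad version
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if is_good "$candidate"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    eval "echo \"\${$hi}\""
}

first_bad 4.11 4.12 4.13 4.14-exp 4.14+rpi4 4.14.3
```

With a git checkout of the source, git bisect start, git bisect good
and git bisect bad automate the same search over commits.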

For the second we seem to have a fix.  The only question is how many patches
to cherry pick?  bc141e8ca562 is non-urgent as it is merely superficial
and not needed for functionality.
5a4087004d1a is a workaround for Linux kernel breakage, but how likely
are we to see that fixed in the Linux kernel packages?  The fix is
well-contained and needed for some highly popular ARM devices.




When you decide what to do here, I would like to check it to
see if it works on my hardware and if you don't hear anything
from me, you can assume it worked fine on my hardware.

Cheers,

Chuck



Bug#991967: Simply ACPI powerdown/reset issue?

2021-09-27 Thread Chuck Zmudzinski

On 9/25/2021 11:27 PM, Elliott Mitchell wrote:


I checked out the Debian Xen source via git.  I got the current
"master" branch which is presently the candidate 4.14.3-1 version,
which includes urgent fixes.  The hash is:
e7a17db0305c8de891b366ad3528e5a43015

On top of this I cherry-picked 3 commits from Xen's main branch:
5a4087004d1adbbb223925f3306db0e5824a2bdc
0f089bbf43ecce6f27576cb548ba4341d0ec46a8
bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b

(these can be retrieved via Xen's gitweb at
https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=<$hash> which is
suitable for the `git am` command)
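
As a sketch of that retrieval, the loop below prints the gitweb patch
URL for each of the three hashes. The fetch_and_apply helper is an
assumption for illustration: it presumes curl is installed and that the
current directory is a git checkout of the Xen source, and it is defined
but not called here.

```sh
#!/bin/sh
# Sketch: build the gitweb patch URL for each cherry-picked commit.
XEN_GITWEB='https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h='

patch_url() {
    printf '%s%s\n' "$XEN_GITWEB" "$1"
}

# fetch_and_apply HASH
# Assumption: needs network access, curl, and a Xen git checkout.
# git am reads the patch in mbox format from standard input.
fetch_and_apply() {
    curl -fsSL "$(patch_url "$1")" | git am
}

for hash in \
    5a4087004d1adbbb223925f3306db0e5824a2bdc \
    0f089bbf43ecce6f27576cb548ba4341d0ec46a8 \
    bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b
do
    patch_url "$hash"
done
```

The URL has to be quoted on the command line because it contains
semicolons, which the shell would otherwise treat as command separators.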

With these I built 4.14.3-1 and then tried kernels 4.19.181-1 and
4.19.194-3 (this system is presently mostly on oldstable).  The results
were:

Xen 4.14.3-1 with Linux 4.19.181-1: system reboots were successful

Xen 4.14.3-1 with Linux 4.19.194-3: system reboots hung


I presume the Xen 4.14.3-1 you are referring to is not the
official version, but the one patched with the three extra
aforementioned commits. Note: I use quilt to manage the
packages, and quilt rejected the last commit because the
context within three lines of the patched code was changed.
A goto bad was changed to goto done by another commit
on the Xen unstable branch, so I fixed the patch file
and changed the 'done' to 'bad' to get the third patch to succeed.
Let's call this patched version of Xen version 4.14.3-1.1

I tried these on my hardware, which is a Haswell processor, EFI
boot, and systemd for init, and my results are:

Xen 4.14.3-1.1 with Linux 4.19.181-1: system reboots hung
Xen 4.14.3-1.1 with Linux 4.19.194-3: system reboots hung
Xen 4.14.3-1.1 with Linux 5.10.46-4: system reboots hung

I still cannot reproduce this result, not even with the extra three
commits. Perhaps it depends on differences in the BIOS or EFI, or
maybe systemd vs. sysv.

I share this result in case it is of help to you.

Regards,

Chuck Zmudzinski



Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-09-27 Thread Chuck Zmudzinski
A patch has been uploaded (see message #67). For more information, see 
message #34.




Bug#994899: patch

2021-09-27 Thread Chuck Zmudzinski

Patch is attached. (An improved patch from the one in message #55)

Changes from the patch in message #55: debian/rules now uses quilt pop 0026-...
instead of quilt pop 14. This ensures builds succeed when the number of patches
in debian/patches increases.
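
To illustrate why popping to a named patch is more robust than popping
a fixed count, here is a small simulation. It does not call quilt. It
models the applied-patch stack as a list of words, and the patch names
are hypothetical. It relies on quilt's documented behavior that
"quilt pop N" removes N patches while "quilt pop NAME" removes patches
until NAME is on top.

```sh
#!/bin/sh
# Simulation only: no quilt involved, patch names are hypothetical.

# top_after_pop_count COUNT PATCH...
# Top of the stack after removing COUNT patches from the end.
top_after_pop_count() {
    count=$1; shift
    keep=$(( $# - count ))
    [ "$keep" -ge 1 ] || { echo "(none)"; return; }
    eval "echo \"\${$keep}\""
}

# top_after_pop_to NAME PATCH...
# Popping to a name leaves NAME on top, provided it is applied.
top_after_pop_to() {
    name=$1; shift
    for p in "$@"; do
        [ "$p" = "$name" ] && { echo "$p"; return 0; }
    done
    echo "(not applied)"
    return 1
}

# Original series: popping 2 leaves 0002-b on top, as intended.
top_after_pop_count 2 0001-a 0002-b 0003-rpi4-x 0004-rpi4-y
# One patch added later: popping 2 now leaves an RPI4 patch applied.
top_after_pop_count 2 0001-a 0002-b 0003-rpi4-x 0004-rpi4-y 0005-new
# Popping to a name gives the same top regardless of the series length.
top_after_pop_to 0002-b 0001-a 0002-b 0003-rpi4-x 0004-rpi4-y 0005-new
```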


Patch generated by: debdiff xen_4.14.3-1~deb11u1.dsc 
xen_4.14.3-1~deb11u1.1.dsc > bug#994899.diff


What it does: Rebuilds the xen packages without the RPI4 patches on 
amd64 and i386


Tested on: Native amd64 build

Fixes this bug on my amd64 system

Build Instructions:

Since the .pc directory is changed in the new package, we need quilt to 
rebuild it correctly.


So the following commands should work to build the packages. Start in an 
empty directory.


if [ ! -e /usr/bin/quilt ]; then sudo apt install quilt; fi
dget -x https://snapshot.debian.org/archive/debian-security/20210920T191155Z/pool/updates/main/x/xen/xen_4.14.3-1~deb11u1.dsc

cd xen-4.14.3
dpkg-checkbuilddeps
quilt pop -a
cd ..
patch -p0 < bug#994899.diff
cd xen-4.14.3
quilt push -a
debuild -i -us -uc -b

To build the source package:

debuild -i -us -uc -S
diff -Nru xen-4.14.3/debian/changelog xen-4.14.3/debian/changelog
--- xen-4.14.3/debian/changelog 2021-09-13 10:28:21.0 -0400
+++ xen-4.14.3/debian/changelog 2021-09-27 11:51:02.0 -0400
@@ -1,3 +1,12 @@
+xen (4.14.3-1~deb11u1.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * debian/patches - move RPI4 patches into a separate directory
+  * debian/rules - disable RPI4 patches on amd64|i386 to fix #994899
+  * debian/control - add Build Dependency quilt
+
+ -- Chuck Zmudzinski   Mon, 27 Sep 2021 11:51:04 -0400
+
 xen (4.14.3-1~deb11u1) bullseye-security; urgency=medium
 
   * Rebuild for bullseye-security
diff -Nru xen-4.14.3/debian/control xen-4.14.3/debian/control
--- xen-4.14.3/debian/control   2021-07-10 08:01:39.0 -0400
+++ xen-4.14.3/debian/control   2021-09-26 22:21:51.0 -0400
@@ -34,6 +34,7 @@
markdown,
ocaml-native-compilers | ocaml-nox,
ocaml-findlib,
+   quilt,
 Homepage: https://xenproject.org/
 Vcs-Browser: https://salsa.debian.org/xen-team/debian-xen
 Vcs-Git: https://salsa.debian.org/xen-team/debian-xen.git
diff -Nru xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch
--- xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch 2021-09-13 10:25:25.0 -0400
+++ xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch 1969-12-31 19:00:00.0 -0500
@@ -1,105 +0,0 @@
-From: Stefano Stabellini 
-Date: Fri, 2 Oct 2020 13:47:17 -0700
-Subject: xen/rpi4: implement watchdog-based reset
-
-The preferred method to reboot RPi4 is PSCI. If it is not available,
-touching the watchdog is required to be able to reboot the board.
-
-The implementation is based on
-drivers/watchdog/bcm2835_wdt.c:__bcm2835_restart in Linux v5.9-rc7.
-
-Signed-off-by: Stefano Stabellini 
-Acked-by: Julien Grall 
-Reviewed-by: Bertrand Marquis 
-Tested-by: Roman Shaposhnik 
-CC: ro...@zededa.com
-(cherry picked from commit 25849c8b16f2a5b7fcd0a823e80a5f1b590291f9)

- xen/arch/arm/platforms/brcm-raspberry-pi.c | 61 ++
- 1 file changed, 61 insertions(+)
-
-diff --git a/xen/arch/arm/platforms/brcm-raspberry-pi.c b/xen/arch/arm/platforms/brcm-raspberry-pi.c
-index f5ae58a..811b40b 100644
---- a/xen/arch/arm/platforms/brcm-raspberry-pi.c
-+++ b/xen/arch/arm/platforms/brcm-raspberry-pi.c
-@@ -17,6 +17,10 @@
-  * GNU General Public License for more details.
-  */
- 
-+#include 
-+#include 
-+#include 
-+#include 
- #include 
- 
- static const char *const rpi4_dt_compat[] __initconst =
-@@ -37,12 +41,69 @@ static const struct dt_device_match rpi4_blacklist_dev[] __initconst =
-  * The aux peripheral also shares a page with the aux UART.
-  */
- DT_MATCH_COMPATIBLE("brcm,bcm2835-aux"),
-+/* Special device used for rebooting */
-+DT_MATCH_COMPATIBLE("brcm,bcm2835-pm"),
- { /* sentinel */ },
- };
- 
-+
-+#define PM_PASSWORD 0x5a00
-+#define PM_RSTC 0x1c
-+#define PM_WDOG 0x24
-+#define PM_RSTC_WRCFG_FULL_RESET0x0020
-+#define PM_RSTC_WRCFG_CLR   0xffcf
-+
-+static void __iomem *rpi4_map_watchdog(void)
-+{
-+void __iomem *base;
-+struct dt_device_node *node;
-+paddr_t start, len;
-+int ret;
-+
-+node = dt_find_compatible_node(NULL, NULL, "brcm,bcm2835-pm");
-+if ( !node )
-+return NULL;
-+
-+ret = dt_device_get_address(node, 0, &start, &len);
-+if ( ret )
-+{
-+printk("Cannot read watchdog register address\n");
-+return NULL;
-+}
-+
-+base = ioremap_nocache(start & PAGE_MASK, PAGE_SIZE);
-+if ( !base )
-+{
-+printk("Unable to map watchdog register!\n");
-+return NULL;
-

Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-09-27 Thread Chuck Zmudzinski
A patch has been uploaded (message #55). For more information, see 
message #34.




Bug#994899: patch

2021-09-27 Thread Chuck Zmudzinski

Patch is attached.

Patch generated by: debdiff xen_4.14.3-1~deb11u1.dsc 
xen_4.14.3-1~deb11u1.1.dsc > bug#994899.diff


What it does: Rebuilds the xen packages without the RPI4 patches on 
amd64 and i386
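
The per-architecture decision can be sketched as follows. This is not
the code from the debdiff's debian/rules, which is not reproduced in
this message. wants_rpi4_patches is a made-up helper that only shows the
amd64|i386 test. In a real build the architecture would come from
dpkg-architecture -qDEB_HOST_ARCH.

```sh
#!/bin/sh
# Sketch: decide per architecture whether the RPI4 patches are applied.
# Assumption: wants_rpi4_patches is hypothetical, not taken from debian/rules.
wants_rpi4_patches() {
    case "$1" in
        amd64|i386) return 1 ;;  # x86: build without the RPI4 patches
        *)          return 0 ;;  # other architectures keep them
    esac
}

for arch in amd64 i386 arm64 armhf; do
    if wants_rpi4_patches "$arch"; then
        echo "$arch: apply RPI4 patches"
    else
        echo "$arch: skip RPI4 patches"
    fi
done
```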


Tested on: Native amd64 build

Fixes this bug on my amd64 system

Build Instructions:

Since the .pc directory is changed in the new package, we need quilt to 
rebuild it correctly.


So the following commands should work to build the packages. Start in an 
empty directory.


if [ ! -e /usr/bin/quilt ]; then sudo apt install quilt; fi
dget -x https://snapshot.debian.org/archive/debian-security/20210920T191155Z/pool/updates/main/x/xen/xen_4.14.3-1~deb11u1.dsc

cd xen-4.14.3
dpkg-checkbuilddeps
quilt pop -a
cd ..
patch -p0 < bug#994899.diff
cd xen-4.14.3
debuild -i -us -uc -b

To build the source package:

debuild -i -us -uc -S
diff -Nru xen-4.14.3/debian/changelog xen-4.14.3/debian/changelog
--- xen-4.14.3/debian/changelog 2021-09-13 10:28:21.0 -0400
+++ xen-4.14.3/debian/changelog 2021-09-26 22:22:56.0 -0400
@@ -1,3 +1,12 @@
+xen (4.14.3-1~deb11u1.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * debian/patches - move RPI4 patches into a separate directory
+  * debian/rules - disable RPI4 patches on amd64|i386 to fix #994899
+  * debian/control - add Build Dependency quilt
+
+ -- Chuck Zmudzinski   Sun, 26 Sep 2021 22:23:21 -0400
+
 xen (4.14.3-1~deb11u1) bullseye-security; urgency=medium
 
   * Rebuild for bullseye-security
diff -Nru xen-4.14.3/debian/control xen-4.14.3/debian/control
--- xen-4.14.3/debian/control   2021-07-10 08:01:39.0 -0400
+++ xen-4.14.3/debian/control   2021-09-26 22:21:51.0 -0400
@@ -34,6 +34,7 @@
markdown,
ocaml-native-compilers | ocaml-nox,
ocaml-findlib,
+   quilt,
 Homepage: https://xenproject.org/
 Vcs-Browser: https://salsa.debian.org/xen-team/debian-xen
 Vcs-Git: https://salsa.debian.org/xen-team/debian-xen.git
diff -Nru xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch
--- xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch 2021-09-13 10:25:25.0 -0400
+++ xen-4.14.3/debian/patches/0027-xen-rpi4-implement-watchdog-based-reset.patch 1969-12-31 19:00:00.0 -0500
@@ -1,105 +0,0 @@
-From: Stefano Stabellini 
-Date: Fri, 2 Oct 2020 13:47:17 -0700
-Subject: xen/rpi4: implement watchdog-based reset
-
-The preferred method to reboot RPi4 is PSCI. If it is not available,
-touching the watchdog is required to be able to reboot the board.
-
-The implementation is based on
-drivers/watchdog/bcm2835_wdt.c:__bcm2835_restart in Linux v5.9-rc7.
-
-Signed-off-by: Stefano Stabellini 
-Acked-by: Julien Grall 
-Reviewed-by: Bertrand Marquis 
-Tested-by: Roman Shaposhnik 
-CC: ro...@zededa.com
-(cherry picked from commit 25849c8b16f2a5b7fcd0a823e80a5f1b590291f9)
----
- xen/arch/arm/platforms/brcm-raspberry-pi.c | 61 ++
- 1 file changed, 61 insertions(+)
-
-diff --git a/xen/arch/arm/platforms/brcm-raspberry-pi.c b/xen/arch/arm/platforms/brcm-raspberry-pi.c
-index f5ae58a..811b40b 100644
---- a/xen/arch/arm/platforms/brcm-raspberry-pi.c
-+++ b/xen/arch/arm/platforms/brcm-raspberry-pi.c
-@@ -17,6 +17,10 @@
-  * GNU General Public License for more details.
-  */
- 
-+#include 
-+#include 
-+#include 
-+#include 
- #include 
- 
- static const char *const rpi4_dt_compat[] __initconst =
-@@ -37,12 +41,69 @@ static const struct dt_device_match rpi4_blacklist_dev[] __initconst =
-  * The aux peripheral also shares a page with the aux UART.
-  */
- DT_MATCH_COMPATIBLE("brcm,bcm2835-aux"),
-+/* Special device used for rebooting */
-+DT_MATCH_COMPATIBLE("brcm,bcm2835-pm"),
- { /* sentinel */ },
- };
- 
-+
-+#define PM_PASSWORD                 0x5a000000
-+#define PM_RSTC                     0x1c
-+#define PM_WDOG                     0x24
-+#define PM_RSTC_WRCFG_FULL_RESET    0x00000020
-+#define PM_RSTC_WRCFG_CLR           0xffffffcf
-+
-+static void __iomem *rpi4_map_watchdog(void)
-+{
-+void __iomem *base;
-+struct dt_device_node *node;
-+paddr_t start, len;
-+int ret;
-+
-+node = dt_find_compatible_node(NULL, NULL, "brcm,bcm2835-pm");
-+if ( !node )
-+return NULL;
-+
-+ret = dt_device_get_address(node, 0, &start, &len);
-+if ( ret )
-+{
-+printk("Cannot read watchdog register address\n");
-+return NULL;
-+}
-+
-+base = ioremap_nocache(start & PAGE_MASK, PAGE_SIZE);
-+if ( !base )
-+{
-+printk("Unable to map watchdog register!\n");
-+return NULL;
-+}
-+
-+return base;
-+}
-+
-+static void rpi4_reset(void)
-+{
-+uint32_t val;
-+void __iomem *base = rpi4_map_watchdog();
-+
-+if ( !base )
-+return;
-+
-+/* use a timeout of 10 ticks (~150us) */
-+writ

Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-09-26 Thread Chuck Zmudzinski

Added tag upstream. Explanation is in discussion at
related bug #991967 here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967#169

and here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967#174

Briefly: we are currently shipping what amounts to a fork of
Xen-4.14 in unstable, testing, and stable in order to better
support arm devices, and these versions also carry an annoying
bug on x86 (amd64). IMO we should therefore

1) Notify upstream of the fork we are doing and
2) Notify our users, especially on the stable branch,
that our version of Xen is actually a fork of Xen-4.14.

I know this can be discovered by reading the changelog,
but to find it one must go back to the unstable version
that was released back in December of 2020 to find
where the fork started. Not many people (including me)
would look there to try to find such a significant change
to the package, especially on stable where ordinarily only
vanilla security patches from upstream are in the changelog.
So, as a courtesy to our users, I think the visibility of
this change needs to be elevated to the status of at least
a README.Debian file, if not an actual notification
to the user by dpkg when installing. Of course
the changelog should also note explicitly that this is
a fork of Xen 4.14 in all the released versions that
have patches from Xen upstream 4.16. Perhaps there is
a way to also indicate this in the version name and number
of the packages, but I do not know if there are conventions
or policies to handle a version change that is really the start
of a fork. If so, we should follow them.



Bug#991967: Simply ACPI powerdown/reset issue?

2021-09-26 Thread Chuck Zmudzinski

On 9/26/2021 8:46 AM, Chuck Zmudzinski wrote:

On 9/25/2021 11:27 PM, Elliott Mitchell wrote:


Unfortunately I was too quick at installing the rebuilt 4.14.3-1 and I
missed trying the vanilla Debian 4.14.2+25-gb6a8c4f72d-2 with
Linux 4.19.181-1.  I believe this combination would have hung during
reboot.




In light of what I discovered while investigating the cause of
bug #994899, I tend to think that calling
Debian 4.14.2+25-gb6a8c4f72d-2 "vanilla" is an interesting
choice of words. To me, vanilla connotes boring,
uninteresting. But that version of Debian Xen, and
also the current version in the stable distribution,
bullseye, are not boring or uninteresting as I have
studied these versions and concluded they actually
are now a fork of upstream Xen's 4.14 version, since
they contain patches from upstream Xen's 4.16 unstable
branch to better support the Raspberry Pi 4, as noted
in the changelogs of those versions.

So I am adding the tag upstream,


Actually, I will add the upstream tag to the bug I reported in
Xen, #994899, since we are talking about upstream Xen, not
upstream Linux.



Bug#991967: Simply ACPI powerdown/reset issue?

2021-09-26 Thread Chuck Zmudzinski

On 9/25/2021 11:27 PM, Elliott Mitchell wrote:


Unfortunately I was too quick at installing the rebuilt 4.14.3-1 and I
missed trying the vanilla Debian 4.14.2+25-gb6a8c4f72d-2 with
Linux 4.19.181-1.  I believe this combination would have hung during
reboot.




In light of what I discovered while investigating the cause of
bug #994899, I tend to think that calling
Debian 4.14.2+25-gb6a8c4f72d-2 "vanilla" is an interesting
choice of words. To me, vanilla connotes boring,
uninteresting. But that version of Debian Xen, and
also the current version in the stable distribution,
bullseye, are not boring or uninteresting as I have
studied these versions and concluded they actually
are now a fork of upstream Xen's 4.14 version, since
they contain patches from upstream Xen's 4.16 unstable
branch to better support the Raspberry Pi 4, as noted
in the changelogs of those versions.

So I am adding the tag upstream, and I suggest that
the Debian Xen Team notify upstream Xen that we
are planning a fork of Xen to better support popular
arm devices and we are already shipping a testing
version of it in our current bullseye release. We could
tell upstream we are willing to stop this fork if they
could assist us with backporting the reworking of the
xen/arm/acpi and xen/x86/acpi code that is in upstream
Xen 4.16 unstable to xen 4.14. We can tell
them if they are interested in what we are doing, they
can take a look at the work we are doing on our
public development servers (salsa).

For our own users, especially in the stable version,
we should make a note of this fact in a README.Debian
file and place it in an appropriate place of the binary
packages. We should also note that there are encouraging
results with this version for improved support on arm,
but some tests indicate that an annoying bug causing
problems shutting down Domain 0 appears to have
surfaced on x86 (amd64). For details, see bugs #991967
and #994899 on the Debian Bug Tracking System.

I think this is the BEST way to truly proceed in accordance
with the Debian Social Policy of courtesy and cooperation
with the free software projects that are available to the
public in our main repositories, and to properly inform
our users what we are doing in our current Xen packages
for unstable, testing, and stable.



Bug#991967: Simply ACPI powerdown/reset issue?

2021-09-26 Thread Chuck Zmudzinski

On 9/25/2021 11:27 PM, Elliott Mitchell wrote:


The second we seem to have a fix.  The only question is how many patches
to cherry pick?  bc141e8ca562 is non-urgent as it is merely superficial
and not needed for functionality.
5a4087004d1a is a workaround for Linux kernel breakage, but how likely
are we to see that fixed in the Linux kernel packages?  The fix is
well-contained and needed for some highly popular ARM devices.


I suspect that depends on how highly motivated Debian is
to support those highly popular ARM devices not just with
Linux, but with Linux as a Xen Dom0 on those devices. Even
if they are highly popular devices, what matters, ultimately,
I think, is if there is a reason for them to be popular as
devices that run a Xen dom0. Then maybe there is a chance
to get some patches into the Linux kernel for this purpose.
Just my two cents, FWIW.



Bug#991967: Simply ACPI powerdown/reset issue?

2021-09-25 Thread Fr. Chuck Zmudzinski, C.P.M.

On 9/25/2021 11:27 PM, Elliott Mitchell wrote:

On Tue, Sep 21, 2021 at 06:33:20AM -0400, Chuck Zmudzinski wrote:

I presume you are suggesting I try booting 4.19.181-1 on the
current version of Xen-4.14 for bullseye as a dom0. I am not
inclined to try it until an official Debian developer endorses
your opinion that the bug I am seeing is distinct
from #991967, at which point I will report the bug I am
seeing as a new bug.

Chuck Zmudzinski you are getting rather close to my threshold for calling
harassment.  You're not /quite/ there, but I'm concerned.


Sorry if I offended you in some way, I didn't mean to.


Since the purpose of the bug reports is to find and diagnose bugs, I did
a bit of experimentation and made some observations.

I checked out the Debian Xen source via git.  I got the current
"master" branch which is presently the candidate 4.14.3-1 version,
which includes urgent fixes.  The hash is:
e7a17db0305c8de891b366ad3528e5a43015

On top of this I cherry-picked 3 commits from Xen's main branch:
5a4087004d1adbbb223925f3306db0e5824a2bdc
0f089bbf43ecce6f27576cb548ba4341d0ec46a8
bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b


By main branch, I presume you mean the unstable
4.16 branch of Xen. Correct?

(these can be retrieved via Xen's gitweb at
https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=<$hash> which is
suitable for the `git am` command)
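
The retrieval step described above can be sketched roughly as follows,
assuming a POSIX shell. The helper name xen_patch_url is made up for
illustration; the URL pattern and the three hashes are the ones quoted
in this message.

```shell
#!/bin/sh
# Build the gitweb "patch" URL for a given Xen commit hash.
# xen_patch_url is a hypothetical helper, not part of any Xen tooling.
xen_patch_url() {
    printf 'https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=%s\n' "$1"
}

# Print the URLs for the three commits, in the order listed above.
for hash in \
    5a4087004d1adbbb223925f3306db0e5824a2bdc \
    0f089bbf43ecce6f27576cb548ba4341d0ec46a8 \
    bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b
do
    xen_patch_url "$hash"
done
```

Inside a checkout of the packaging repository, each patch could then be
applied with something like: curl -fsSL "$(xen_patch_url HASH)" | git am
That step needs network access and is left out of the sketch.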

With these I built 4.14.3-1 and then tried kernels 4.19.181-1 and
4.19.194-3 (this system is presently mostly on oldstable).  The results
were:

Xen 4.14.3-1 with Linux 4.19.181-1: system reboots were successful

Xen 4.14.3-1 with Linux 4.19.194-3: system reboots hung



Interesting. Looks like you are homing in on solving this bug. I notice
at the beginning of this message you quoted an older message of mine
which does not take into account that I have reported a new bug
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994899
because I did come to the conclusion, as you did, that there are
in fact two bugs.

I wonder if the results of your modified Xen 4.14.3-1 with
4.19.181-1 and 4.19.194-3 on my hardware would be of help.
I have, as you might recall, an older (Haswell) Intel system
with EFI boot and systemd for init/shutdown services.
If I get the same result, then I would agree we are seeing a
regression between those two versions of Linux. Otherwise,
then there may also be some tests involving EFI vs. BIOS to
do. Or, based on what I have learned at #994899, also possibly
we need to check systemd vs. sysv-init. Do you want me to
do the test on my hardware?


Unfortunately I was too quick at installing the rebuilt 4.14.3-1 and I
missed trying the vanilla Debian 4.14.2+25-gb6a8c4f72d-2 with
Linux 4.19.181-1. I believe this combination would have hung during
reboot.


I can confirm it did hang on my hardware with this combination of
Xen and Linux versions.


As such, I believe there are in fact two distinct bugs being observed.
The presence of EITHER of these is sufficient to cause hangs during
powerdown or reboot.


And we already have two distinct bugs on BTS.

First, some patch originally from Linux's main branch which breaks Xen
reboots was backported somewhere between 4.19.181-1 and 4.19.194-3.  This may
either have been introduced before 5.10 diverged from main, or may also
have been backported to 5.10.  THIS is Debian bug #991967.


I agree. I believe you.

Second, the Xen patch 3c428e9ecb1f290689080c11e0c37b793425bef1 which is
valuable to ARM devices breaks reboots and powerdowns on x86.  This is
correctly fixed by 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.

Presently
this has no Debian bug report.


That looks a lot like #994899. Have you ruled out the possibility that
this bug is #994899 in disguise? If so, how? Or do you think #994899
is a third bug?


The first is presently unidentified, someone enthusiastic either needs to
read git logs/source code, or bisect and build to find where it got
broken.


Yeah, that's a lot of work. That's how I found my solution for #994899.
For that bug, since the working version was Xen 4.11 and the broken
version was Xen 4.14, the cause could have been in 4.12, 4.13, or 4.14.
So that required a bit of detective work studying git logs, but in the
end, I just tested 4.12, and it was good, then 4.13 and it was good.
I also tested the first Debian version of 4.14, which was actually
experimental on Debian if I recall correctly. It did not include the
RPI4 patches, and it was good too. So I knew the bug was introduced
sometime after that, and I soon identified the RPI4 patches as the place
where the bug (#994899) first appeared on my hardware.

The second we seem to have a fix.  The only question is how many patches
to cherry pick?  bc141e8ca562 is non-urgent as it is merely superficial
and not needed for functionality.
5a4087004d1a is a workaround for Linux kernel breakage, but how likely
are we to see that fixed in the Linux kernel packages?  The fix is
well-contained and needed for some highly pop

Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-09-24 Thread Chuck Zmudzinski

Lowered severity to minor because the information so
far indicates the bug may only affect a limited set of
hardware/software combinations. It does affect my
system, but I have found a solution for it in accord
with the Debian principles of free software.

I understand the free software development world
cannot stop everything it is doing to work on this
little bug.

Nevertheless, if Debian wishes to try to implement
a true and full solution that fixes both this bug
and #991967, I am willing to cooperate with anyone
who will not accuse me of wrongdoing in this
public forum without first discussing the matter with
me in a private email.

Cheers,

Chuck Zmudzinski



Bug#994899: [Pkg-xen-devel] Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-09-24 Thread Chuck Zmudzinski

Based on the technical information so far provided by the
Debian community in this report and in the related bug
#991967, I consider this bug closed. For me it is fixed. I
found the solution for it on my hardware and shared
it with the Debian community. I do not care if the
official Debian developers implement my suggestion
or not. I will not run Debian's version which is really
a mix of Xen-4.14 stable and Xen-4.16 unstable. Instead
I will run the version with the patch I suggested in this
bug report, and AFAICT I will have a more stable and
bug-free version of Xen than anyone who runs the current
so-called stable version of Xen for Debian.

Since the information about this bug is scattered in various
places in this bug report and in #991967, I will say this
bug concerned the following hardware/software configuration:

Motherboard/CPU: ASRock B85M Pro4, BIOS P2.50 12/11/2015,
with a Haswell CPU (core i5-4590S)

Boot system: EFI, not using secure boot, booting xen
hypervisor and dom0 bullseye with grub-efi package for
bullseye, and it boots the xen-4.14-amd64.gz file, not
the xen-4.14-amd64.efi file.

Init system: systemd

Xen domain type: Domain 0

Linux Kernel Versions: all Linux kernel versions of both
buster and bullseye running as a dom0 I tested exhibit
the bug, but no Debian stable Linux kernel version since
4.19.0-16 running as a dom0 exhibited the bug with
the Xen hypervisor for buster.

If you are experiencing the symptoms described in this
bug report, the solution I proposed for Debian in both
this bug report and in #991967 might fix the bug if your
hardware and software configuration is similar to
what I have described above. However, to fix it you
will have to test it yourself, and that would
involve building the Xen package for bullseye from
source, unless and until Debian decides to implement
the fix I proposed in my original bug report.

If you use BIOS boot and/or sysv-init instead of EFI
boot and/or systemd, it is likely the fix I have described
here will not fix the bug. If you have a newer Intel CPU
or an AMD CPU, the fix also might not work on your
hardware, but it might work if you use EFI and systemd
in those latter cases.

If you try the solution I proposed here and it does not
solve your issue, I would suggest that you look at #991967
before reporting a new bug.

I will not take action to close this bug; that is up to
the Debian developers to decide. Instead, I will select
this message to be a summary of the bug.

Happy computing on Debian,

Chuck Zmudzinski



Bug#994899: [Pkg-xen-devel] Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-09-23 Thread Chuck Zmudzinski

On 9/23/2021 5:50 PM, Diederik de Haas wrote:

On donderdag 23 september 2021 21:54:49 CEST Chuck Zmudzinski wrote:

While I did respond point by point privately to the author

Don't do that. Any discussion relevant to the bug should be sent to the bug
itself so that everyone has all the relevant information.

I actually taught myself how to build Xen packages so I could assist you as
well as possible. You won't see any more effort or participation on my part.

Bye.


Sorry to hear that.

Cheers,

Chuck Zmudzinski



Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-09-23 Thread Chuck Zmudzinski

On 9/23/2021 12:49 PM, Diederik de Haas wrote:

Control: tag -1 -newcomer
Control: tag -1 -upstream

On woensdag 22 september 2021 21:50:16 CEST Chuck Zmudzinski wrote:

Finally, I tag the bug newcomer simply because there is a known solution but

That's what the 'patch' tag is for. 'newcomer' is similar to 'good first issue',
which this is not. Hence removing the 'newcomer' tag.


the Debian Xen package maintainer seems to want the Debian Kernel Team to
find a way to fix the bug in the Linux kernel, as evidenced by the recent
discussion over at #991967, instead of implementing the fix in the Xen
hypervisor as proposed here.

You're claiming, possibly correctly, that the issue is with the Debian Xen
package, not with the upstream code, so removing the 'upstream' tag as well.


It's good that you filed this bug against the Debian Xen package because it's
(quite) possible that there is both an issue with the Linux kernel, which
#991967 is about, and with the Xen package, which is what this issue will be about.

The way you went about it ... not so good.

By filing a bug you want others to spend their free time to (help) fix an issue
you are having (and in this case, me too). To make the best use of their time
and your chances of it being fixed, you should state the problem as briefly
and succinctly as possible.
And in the case of a 'patch', as may be the case here, include the actual patch.

You did neither.

You did go on a rant where you made (incorrect) claims and accusations.
I don't think that helps your goal, which is getting this issue fixed. Do you?

F.e. you make claims on the Debian Xen package maintainers' position, while
this is the first time they've been made (explicitly) aware of this issue.
So they did not have a chance to (formulate and) state their position.

I had written a point-by-point description of what *I* think was wrong with
your bug report, but that would only keep the negative cycle in place.
FTR: I'm just a contributor to Debian (by participating in this bug), just
like you are (by submitting a bug). And so is Elliott.

For uploading packages to the Debian archive you *do* need special
permissions. For almost all other things, everyone can contribute.


Package: xen-hypervisor-4.14-amd64
Version: 4.14.3-1~deb11u1
Severity: important
Since I am not a developer, I only tagged this bug important, but if I were
a developer, I would tag it serious and implement a fix that does what I will
propose below.

https://www.debian.org/Bugs/Developer#severities explains what the severity
levels entail. There is no correlation between severity and some (claimed) role
within the project. IMO this bug is *at most* important.
Let's leave it to the Debian Xen package maintainers to change the severity
if they think that's appropriate.


I refer you here for my first description of the problem
to the Debian Bug System:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967#34

IMO, this is just wrong.
You've filed a new bug so make the exact problem the primary part of this bug.
Don't ask of others to read a '50 page document' and expect them to distill
YOUR problem themselves.
Doing a copy+paste of the *relevant* part is absolutely fine.

So please reply to this with the following (minimal) info:
- Your hardware
- Whether you use BIOS or UEFI
- Your init system
- What you did and what the result of that was.

Item 2 & 3 may seem 'odd' at first, but should become clear later on.


another Debian user confirmed the bug on bullseye:

Yep. If I do 'poweroff' on my Xen server, it looks like it does the whole
shutdown procedure correctly, but it doesn't actually poweroff the machine.


I can drill down the cause of this bug in the stable version of debian to
a series of nine commits from upstream in order to improve Raspberry Pi 4
support in version 4.14.0+88-g1d1d1f5391-1.

Do those 9 commits correspond to 9 patches from the
/debian/patches/ directory? If so, which 9?
If you add the 'patch' tag, do indeed include the patch in the bug report.

I built Xen packages based on 4.14.3-1~deb11u1 but with patches 0029-0034 removed,
but after installing those packages and rebooting into my patched version, my
Xen server still did NOT power off. Other patches didn't seem relevant *to me*,
but I can be wrong. If you share your changes, I can try whether that will fix
the problem with me (too).
My Xen server uses BIOS (not UEFI, which I think you do) and has sysv-init
as init system. That may be relevant as well.


So the bug was introduced in the Debian Xen unstable/testing package on
15 Dec 2020 according to the changelog.

You *think* that was the case? Or did your Bullseye system actually poweroff
correctly when installing version 4.14.0+80-gd101b417b7-1 or earlier?
That was the version before the RPi4 related patches were added.
Version N-1=good, version N=bad is very useful and relevant info.


I also tested the current unstable branch from Xen upstream, which is
xen-c76cfad, which unstable calls Xen-4.16-unstable. I te

Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-09-22 Thread Chuck Zmudzinski

Package: xen-hypervisor-4.14-amd64
Version: 4.14.3-1~deb11u1
Severity: important
Tags: patch newcomer upstream

Dear Maintainer,

This bug is related to #991967, reported by Elliott Mitchell,
who happens to be the person who requested the patches
that are causing this bug. I understand he is a Debian Xen
developer.

Since I am not a developer, I only tagged this bug
important, but if I were a developer, I would tag it
serious and implement a fix that does what I will
propose below.

I hereby humbly request that you elevate this bug to
serious, since it is entirely wrong to release software
that causes a modern workstation/server to not power down
properly and renders it unable to be managed remotely,
which is what this bug does. A bug like this is
normal on an unstable or testing distribution, but
unacceptable/serious on the current stable release.

I refer you here for my first description of the problem
to the Debian Bug System:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967#34

I also point out that another Debian user confirmed the
bug on bullseye:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967#54

Elliott is of the opinion that what I am seeing is a
bug distinct from #991967. I am inclined to agree, as I
have not been able to reproduce it the way he describes
it, although the symptoms he sees are very similar.
That is why I am submitting a new report.

I can drill down the cause of this bug in the stable
version of debian to the Debian Xen Team's decision to
include a series of nine commits from upstream in order
to improve Raspberry Pi 4 support in version
4.14.0+88-g1d1d1f5391-1. So the bug was introduced in
the Debian Xen unstable/testing package on 15 Dec 2020
according to the changelog.

I understand the original reporter of #991967 wants to
keep these patches in the stable version of Xen to
better support the Raspberry Pi 4, and that he is
a Debian Xen developer. But I will strenuously and
respectfully disagree with any decision by the Debian
Xen Team to not apply a very reasonable compromise
solution.

Over on #991967, I argued passionately for removing
the nine Raspberry Pi 4 patches from the stable Xen
version because it was, and still is, my opinion that
experimenting with patches from unstable upstream
branches is not appropriate for a package in a
stable version. That is why I would expect the
release team to tag this bug as serious even if
the Debian Xen Team refuses to tag it as serious.

Nevertheless, I propose the following compromise:

Simply ship a package for the stable version that
omits the nine Raspberry Pi 4 patches from unstable
upstream while building for the amd64|i386
architectures. I was able to implement such a fix
even though I am not an official developer in
just a few hours. It is really a trivial fix:
all I did was add a rule in debian/rules to use
quilt to disable the nine patches on amd64|i386.
I made it easy by moving the nine RPI4 patches
from debian/patches to debian/patches/rpi4
so that I could use sed s/rpi4/#rpi4/ debian/series
to disable the patches for the amd64|i386 case.
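
A minimal sketch of that idea, assuming a POSIX shell with GNU sed and
a quilt series file whose Raspberry Pi 4 entries start with rpi4/. The
function name disable_rpi4_patches and the first series entry are made
up for illustration; this is not the rule from the attached patch. The
expression is anchored on the directory prefix, which is slightly
stricter than the unanchored s/rpi4/#rpi4/ mentioned above.

```shell
#!/bin/sh
# Sketch: comment out every series entry under rpi4/ on x86 builds.
# Usage: disable_rpi4_patches SERIES_FILE DEB_ARCH
# (hypothetical helper; quilt skips series lines that start with '#')
disable_rpi4_patches() {
    case "$2" in
        amd64|i386)
            # Only touch entries that live in the rpi4/ subdirectory.
            sed -i 's|^rpi4/|#rpi4/|' "$1"
            ;;
    esac
}

# Demonstration on a throwaway series file; the first entry is invented.
demo_series=$(mktemp)
printf '%s\n' \
    '0001-hypothetical-unrelated.patch' \
    'rpi4/0027-xen-rpi4-implement-watchdog-based-reset.patch' > "$demo_series"
disable_rpi4_patches "$demo_series" amd64
cat "$demo_series"
rm -f "$demo_series"
```

In a real build the architecture argument would presumably come from
dpkg-architecture -qDEB_HOST_ARCH rather than being hard-coded.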

I am sure there are other ways to implement the
fix. It really is trivial, and it would fix
this bug and still allow for the Raspberry Pi 4
patches to be included where they are needed,
which I believe is on the arm64 architecture.

I also tested the current unstable branch from
Xen upstream, which is xen-c76cfad, which
unstable calls Xen-4.16-unstable. I tested
the current bullseye kernel as a dom0 on that
upstream Xen-4.16 hypervisor and did not
see the bug, so this most definitely is
NOT an upstream bug. It is a Debian Xen
packaging bug. I expect that perhaps some
commits on the Xen-4.16 upstream branch
that are missing on the Xen-4.14 branch might
also fix this bug, but until such a solution
is found, I suggest the aforementioned solution
as a workaround. The reason I tagged this bug
as upstream is that I think it would be
advisable to make upstream aware that our
current xen-4.14 package is not really a
true Xen-4.14 but one with some patches from
Xen-unstable that are causing this bug, and
perhaps they can help eventually find the best
solution for their Xen-4.14 stable branch.

Finally, I tag the bug newcomer simply because
there is a known solution but the Debian Xen
package maintainer seems to want the Debian
Kernel Team to find a way to fix the bug
in the Linux kernel, as evidenced by the
recent discussion over at #991967, instead
of implementing the fix in the Xen hypervisor
as proposed here.

Regards,

Chuck Zmudzinski



-- System Information:
Debian Release: 11.0
  APT prefers stable-updates
  APT policy: (500, 'stable

Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-09-21 Thread Chuck Zmudzinski

On 8/24/2021 7:12 PM, Ben Hutchings wrote:

On Tue, Aug 24, 2021 at 03:27:19PM -0400, Phillip Susi wrote:

Ben Hutchings  writes:


I think a proper fix would be one of:

a. If the Xen virtual keyboard driver is advertising capabilities it
doesn't have, stop it doing that.
b. Change the implementation of modalias attributes to allow longer
values.

It's not clear to me whether the Xen driver is advertising correctly or
not.  If it is, then the solution should be b, but that may be too
disruptive a change to the kernel.  So a reasonable workaround might
be:

c. Change the input subsystem to limit the length of the
capabilities part of the modalias.

The problem with a) is that the Xen keyboard is not a physical keyboard
and so it has no way of knowing what keys it actually has.  It is a fake
input device designed to pass through whatever input the Xen hypervisor
sends down.  As such, any key could come in.  If it doesn't advertise
that it has all of these keys, then they would not be accepted by
libinput when the hypervisor sends them down.

Right, that's what I feared.

xen-kbdfront is setting the bits for keys in the ranges [KEY_ESC,
KEY_UNKNOWN) and [KEY_OK, KEY_MAX), which I think works out to 654
keys and 2362 bytes in the modalias.
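
The two figures above can be reproduced with a small shell sketch. It
assumes the key code values from linux/input-event-codes.h (KEY_ESC=1,
KEY_UNKNOWN=240, KEY_OK=0x160, KEY_MAX=0x2ff) and assumes that every
key in the two ranges is written as uppercase hex followed by a comma.
That encoding is an assumption about how the estimate was derived, not
a statement of what the kernel actually emits.

```shell
#!/bin/sh
# Key code constants, as defined in linux/input-event-codes.h.
KEY_ESC=1
KEY_UNKNOWN=240
KEY_OK=352    # 0x160
KEY_MAX=767   # 0x2ff

count=0   # number of keys in [KEY_ESC, KEY_UNKNOWN) and [KEY_OK, KEY_MAX)
bytes=0   # total length of the "HEX," entries for those keys
code=$KEY_ESC
while [ "$code" -lt "$KEY_MAX" ]; do
    if [ "$code" -lt "$KEY_UNKNOWN" ] || [ "$code" -ge "$KEY_OK" ]; then
        entry=$(printf '%X,' "$code")   # e.g. 1 -> "1,", 352 -> "160,"
        count=$((count + 1))
        bytes=$((bytes + ${#entry}))
    fi
    code=$((code + 1))
done
echo "keys=$count bytes=$bytes"   # prints "keys=654 bytes=2362"
```

Under those assumptions the key list alone is longer than 2048 bytes.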


This seems to be the heart of the problem: libinput was designed
assuming that all keyboards can and must report what keys are actually
present, and then libinput tries to cram that information into the
modalias rather than some other sysfs attribute as it should ( or not at
all... I still don't see how this information is actually supposed to be
useful to userspace ).

I think modaliases aren't intended to be interpreted by user-space,
other than processing wildcards when matching to modules.

For input devices, the same information is available through other
variables in the uevent, in a more compact form.  The information *is*
useful for user-space; e.g. in initramfs-tools we recognise keyboard
devices and add their drivers to the initramfs but ignore other input
devices.


As for b), the problem isn't with the modalias attribute itself, but
when the kernel tries to copy it into the environment block for the udev
callout.  The environment block is only a single page, and so limited to
4 KB.  And that's for everything else that goes into the environment,
not just the modalias.

Text-based sysfs attributes are limited to a page, but udev receives
uevents through netlink, not sysfs.

The current limit on the environment of a uevent appears to be 2 KB
(UEVENT_BUFFER_SIZE defined in <linux/kobject.h>).  That seems like it
*might* be easier to change, so long as user-space doesn't have a
similar limit.

I looked into systemd/udev, and it seems to use an 8 KB buffer for
receiving uevents:

https://sources.debian.org/src/systemd/247.9-1/src/libsystemd/sd-device/device-monitor.c/?hl=390#L390

But as a first step I think increasing the kernel buffer size to 4 KB
would be enough.  Perhaps someone could test whether this patch to the
domU kernel makes udev happier:

--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -30,7 +30,7 @@
  
  #define UEVENT_HELPER_PATH_LEN		256

  #define UEVENT_NUM_ENVP   64  /* number of env pointers */
-#define UEVENT_BUFFER_SIZE 2048/* buffer for the variables */
+#define UEVENT_BUFFER_SIZE 4096/* buffer for the variables */
  
  #ifdef CONFIG_UEVENT_HELPER

  /* path to the userspace helper executed on an event */
--- END ---

?

Ben.



Even though this patch has been tested to apparently fix this bug and
the bug has been elevated to important and tagged patch and upstream,
AFAICT there is no action yet upstream or anywhere else after more than
three weeks. Is this patch dead as a possible fix for this bug?

Best wishes,

Chuck



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-21 Thread Chuck Zmudzinski

On 9/21/2021 9:13 AM, Chuck Zmudzinski wrote:

On 9/20/2021 10:37 PM, Elliott Mitchell wrote:

On Mon, Sep 20, 2021 at 10:23:39PM -0400, Chuck Zmudzinski wrote:

On 9/20/21 7:39 PM, Diederik de Haas wrote:

On dinsdag 21 september 2021 01:15:15 CEST Elliott Mitchell wrote:

Merely having the path is a sufficiently strong indicator for me to
simply wave it past.  I would, though, suggest Debian should instead
cherry-pick commit 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.

This is available as a patch at:

https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=0f089bbf43ecce6f27576cb548ba4341d0ec46a8 

You probably then also want the following commit, which is a fix on that patch:
https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b 



Found that via the following url/query:
https://xenbits.xen.org/gitweb/?p=xen.git=search=HEAD=commit=x86%2FACPI 



I don't know whether others should be used from that as well.

I tried these two commits (adapted for the xen-4.14 branch) but this
approach did not fix the bug - with these patches applied the dom0
did not power down.

My advice for the Debian Xen Team is to consult with upstream and
get their advice on whether or not it is advisable for Debian to
retain the patches from the Xen-4.16 branch that have been
added to the Debian 4.14 package in an attempt to support
some arm devices that panic on an unpatched Xen-4.14.
If upstream cannot help Debian backport fixes for arm panics
from Xen-4.16/unstable to Xen-4.14 stable, I think the Debian
Xen team should remove aggressive patches that really have now
turned the Debian Xen-4.14 package into a Frankenstein version
that is a mixture of Xen-4.14 and Xen-4.16, and decide that support
for those arm devices must wait until Debian gets Xen 4.16 up
and running on the unstable and, hopefully soon, the testing distribution.

It is still not established you're running into #991967.  Unless the one
you're pointing towards was backported to the Xen 4.11 packages (which I
doubt) it cannot explain #991967, since at the time 4.11 was in use.

Could be this is a second bug with symptoms similar to #991967. Now
that a fix for the second bug has been identified, you might try a
4.19.181-1 kernel and see whether that fixes things.




FWIW, I tried this.

Sorry, not only does this not fix things, when I shut down the dom0
running with the official Debian 4.19.181-1 kernel on the current
official Debian Xen-4.14 hypervisor, the dom0 not only did not
power off, it did not even reach the systemd poweroff target. 


Slight correction - after a few minutes, it did finally reach the
systemd poweroff target, but the power did not turn off.
Yet, it works perfectly on the official Debian Xen-4.11 hypervisor. Again,
my tests cannot confirm that there is a bug in src:linux, the only
common denominator for this bug in all my testing is src:xen, and
it appears in all the 4.14 Xen versions for bullseye, for every single
Linux version tested.

Chuck




Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-21 Thread Chuck Zmudzinski

On 9/20/2021 10:37 PM, Elliott Mitchell wrote:

On Mon, Sep 20, 2021 at 10:23:39PM -0400, Chuck Zmudzinski wrote:

On 9/20/21 7:39 PM, Diederik de Haas wrote:

On dinsdag 21 september 2021 01:15:15 CEST Elliott Mitchell wrote:

Merely having the path is a sufficiently strong indicator for me to
simply wave it past.  I would, though, suggest Debian should instead
cherry-pick commit 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.

This is available as a patch at:

https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=0f089bbf43ecce6f27576cb548ba4341d0ec46a8

You probably then also want the following commit, which is a fix on that patch:
https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b

Found that via the following url/query:
https://xenbits.xen.org/gitweb/?p=xen.git=search=HEAD=commit=x86%2FACPI

I don't know whether others should be used from that as well.

I tried these two commits (adapted for the xen-4.14 branch) but this
approach did not fix the bug - with these patches applied the dom0
did not power down.

My advice for the Debian Xen Team is to consult with upstream and
get their advice on whether or not it is advisable for Debian to
retain the patches from the Xen-4.16 branch that have been
added to the Debian 4.14 package in an attempt to support
some arm devices that panic on an unpatched Xen-4.14.
If upstream cannot help Debian backport fixes for arm panics
from Xen-4.16/unstable to Xen-4.14 stable, I think the Debian
Xen team should remove aggressive patches that really have now
turned the Debian Xen-4.14 package into a Frankenstein version
that is a mixture of Xen-4.14 and Xen-4.16, and decide that support
for those arm devices must wait until Debian gets Xen 4.16 up
and running on the unstable and, hopefully soon, the testing distribution.

It is still not established you're running into #991967.  Unless the one
you're pointing towards was backported to the Xen 4.11 packages (which I
doubt) it cannot explain #991967, since at the time 4.11 was in use.

Could be this is a second bug with symptoms similar to #991967.  Now
that a fix for the second bug has been identified, you might try a
4.19.181-1 kernel and see whether that fixes things.




FWIW, I tried this.

Sorry, not only does this not fix things, when I shut down the dom0
running with the official Debian 4.19.181-1 kernel on the current
official Debian Xen-4.14 hypervisor, the dom0 not only did not
power off, it did not even reach the systemd poweroff target. Yet,
it works perfectly on the official Debian Xen-4.11 hypervisor. Again,
my tests cannot confirm that there is a bug in src:linux, the only
common denominator for this bug in all my testing is src:xen,
and it appears in all the 4.14 Xen versions for bullseye, for every single
Linux version tested.

Chuck



Bug#991967: linux-src 4.19.194-3 breaks Xen Dom0 powerdown and reboot

2021-09-21 Thread Chuck Zmudzinski



On 9/21/21 7:22 AM, Fr. Chuck Zmudzinski, C.P.M. wrote:
On Sat, 7 Aug 2021 08:40:14 +0200 Salvatore Bonaccorso wrote:

> Control: tags -1 + moreinfo
>
> Hi,
>
> On Fri, Aug 06, 2021 at 11:50:54AM -0700, Elliott Mitchell wrote:
> > Package: src:linux
> > Version: 4.19.194-3
> > Control: affects -1 src:xen
> >
> > SSIA. Previous versions of 4.19 had no issues (4.19.181-1 according to
> > notes), but this cropped up with 4.19.194-3 (-1 and -2 weren't tested).
> >
> > When a Xen domain 0 tries to reboot or powerdown the computer, it hangs
> > with the display off, but the power supply is active.
> >
> > I'm rebuilding from source, so I imagine this also affects
> > linux-image-4.19.0-17-amd64.
>
> Can you please try to bisect which commit introduced the issue? Does
> it affect as well current upstream 4.19.201?
>
> Regards,
> Salvatore
>
>

Dear Salvatore,

As you have noticed, much more information about this bug
has been added to this bug report, but the original reporter
is of the opinion that much of that new information concerns
a bug related to but distinct from the bug he reported. Both
bugs have the same symptom: dom0 does not power down
when shutting down the system, and it is clear that both
bugs are related to x86 acpi code in either the Linux kernel
or in the Xen hypervisor. But I cannot reproduce his original
bug which occurred in Linux 4.19.194-3 on Xen-4.11 from buster.
I have only seen the bug in Xen-4.14 for bullseye, and I always
see it with Xen-4.14 regardless of the Linux kernel version.
As far as I can tell, another participant in this bug report
has reproduced the behavior I am seeing, but not the behavior
the original reporter is seeing:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967#54

Would you endorse the original reporter's belief that there are
two distinct bugs being discussed here? If so, I would be inclined
to report the bug I am seeing as a distinct but related bug in the
bullseye version of src:xen. Otherwise, I respectfully ask that
you reclassify this as a bug in src:xen, since the original reporter
has not been able to identify a commit in src:linux that caused
the bug and no one has been able to reproduce the bug on
Xen-4.11/Linux-4.19.194-3.

Regards,

Chuck Zmudzinski



Allow me to propose the following arguments in favor of
changing this to a bug in src:xen:

1) The original report of this bug in src:linux version 4.19.194-3
with Xen 4.11 has not been reproduced by anyone.

2) The same symptom has been reproduced in recent versions
of src:xen for bullseye, with Xen version 4.14.x

3) For the future, what is the point of trying to fix a bug in
oldstable? Why not concern ourselves with fixing the
bug as it now appears in stable?

Regards,

Chuck Zmudzinski



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-21 Thread Chuck Zmudzinski



On 9/20/21 10:12 PM, Chuck Zmudzinski wrote:

On 9/20/21 6:29 PM, Chuck Zmudzinski wrote:

On 9/20/21 1:43 PM, Chuck Zmudzinski wrote:


On 9/20/21 12:27 AM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:


I suspect the following patch is the culprit for problems
shutting down on the amd64 architecture:

0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
This patch does affect amd64 acpi code, and is probably causing
the problem on my amd64 system, so my build of the xen-4.14
hypervisor without this patch fixed the problem.

Of the ones listed that is the only one which has any overlap with x86
code.  The next reproduction step is `apt-get source xen &&
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
&& dpkg-buildpackage -b`.  Then try with this to confirm that patch
is what does it.

Thing is that delta is rather small.  I don't have a simulator, but that
is rather small to be the culprit.


I just tested the build with
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
applied before building the package and I can confirm that this is the patch
causing the trouble for dom0 poweroff on x86/amd64. Reverting this patch
fixes it on my amd64 system. But this would probably break the arm build.


I think one possible fix would require modifying
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
so it only applies at runtime to the arm architecture. I will try some
modifications to the patch instead of removing it, and if I get something
that works on amd64 and also might work on arm, I will post it
for Elliott to try.


I have an encouraging result. I found a very simple patch
to xen/arch/x86/acpi/lib.c that fixes the dom0 poweroff
bug on my system and it should not affect the arm patches
at all:
--
This patch partially reverts previous patch
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch

This hopefully fixes #991967

--- a/xen/arch/x86/acpi/lib.c    2021-09-20 16:49:08.0 -0400
+++ b/xen/arch/x86/acpi/lib.c    2021-09-20 16:25:05.572038000 -0400
@@ -46,10 +46,6 @@
 if ((phys + size) <= (1 * 1024 * 1024))
     return __va(phys);

-    /* No further arch specific implementation after early boot */
-    if (system_state >= SYS_STATE_boot)
-        return NULL;
-
 offset = phys & (PAGE_SIZE - 1);
 mapped_size = PAGE_SIZE - offset;
 set_fixmap(FIX_ACPI_END, phys);
--
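
The effect of the removed lines can be sketched with a toy model of the control flow shown in the diff context. This is only an illustration, assuming the function behaves as the hunk suggests; the state constants are stand-ins where only the ordering matters, and the returned labels are invented, not real mappings.

```python
PAGE_SIZE = 4096
ONE_MB = 1 * 1024 * 1024

# Stand-ins for Xen's system_state values; only their ordering matters here.
SYS_STATE_early_boot, SYS_STATE_boot, SYS_STATE_active = range(3)


def map_table(phys, size, system_state, early_return=True):
    """Toy model of the control flow touched by the patch.

    Returns a label describing which path is taken, not a real mapping.
    """
    if phys + size <= ONE_MB:
        return "direct"          # low memory: __va(phys) in the real code
    if early_return and system_state >= SYS_STATE_boot:
        return None              # the lines the patch removes
    offset = phys & (PAGE_SIZE - 1)
    mapped_size = PAGE_SIZE - offset
    return ("fixmap", offset, mapped_size)


# After boot, an address above 1 MB: unmapped before the patch, fixmap after.
print(map_table(0xFEE01234, 64, SYS_STATE_active))
print(map_table(0xFEE01234, 64, SYS_STATE_active, early_return=False))
```

In this model, a request made after early boot for an address above 1 MB gets NULL with the check in place and falls through to the fixmap path once the check is removed, which is the behavior change the patch relies on.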




Further testing with this patch revealed a problem. Although
this simple patch causes dom0 to poweroff when shutting
down, on the next reboot the system dropped to single-user
shell because it mixed up my ssd and my hard disk. Normally
the system assigns my SSD as /dev/sda and my hard disk
as /dev/sdb. But on the first reboot after running the Xen
hypervisor, the system reversed them so my SSD was /dev/sdb
and my hard disk was /dev/sda. Since the EFI partition, which
is a vfat partition, is on the SSD and in /etc/fstab I ask to mount
it from the /dev/sda1 partition, it is now at /dev/sdb1, and
the first partition is not a vfat partition on the hard disk so
the system drops to a root shell for system maintenance.

This switching of the devices on the subsequent reboot is
another symptom of this bug I have seen in the past, and
usually the ordinary behavior is restored on the next reboot
or after resetting and powering off or unplugging from power.
So this patch does not really fix the bug reliably.
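
Separately from the poweroff bug, the drop to a maintenance shell depends on /etc/fstab naming the EFI partition by kernel device name. A minimal sketch for spotting such entries follows, assuming that /dev/sdX names may be reordered between boots while persistent identifiers such as UUID= or LABEL= are not. The helper name and the sample fstab contents are hypothetical.

```python
import re

# Matches SCSI/SATA-style names such as /dev/sda1, whose order can change
# between boots; persistent identifiers (UUID=, LABEL=, ...) do not match.
UNSTABLE = re.compile(r"^/dev/sd[a-z]+\d*$")


def unstable_fstab_entries(fstab_text):
    """Return (device, mountpoint) pairs that use /dev/sdX-style names."""
    hits = []
    for line in fstab_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then split
        if len(fields) >= 2 and UNSTABLE.match(fields[0]):
            hits.append((fields[0], fields[1]))
    return hits


sample = """\
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/sda1       /boot/efi     vfat   umask=0077 0 1
UUID=0000-0000  /data         ext4   defaults   0 2
"""
print(unstable_fstab_entries(sample))  # [('/dev/sda1', '/boot/efi')]
```

Entries flagged this way would be affected by the sda/sdb swap described above, whatever its cause.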


To clarify things, I saw this strange behavior of the system
switching the disk devices with this patch under the following
conditions:

1) Boot using this simple patch - dom0 shuts down properly

2) Boot using Elliott's suggested patch in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967#94

3) It was when booting using Elliott's suggested patch that
I saw the drop to single-user root for system maintenance.
Moreover, Elliott's suggested patch did not fix the dom0
power off bug.

So it might be the case that this simple patch would work
for both amd64 and arm devices nicely, but Elliott refuses
to test it with his arm devices. Sigh.



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-21 Thread Chuck Zmudzinski

On 9/20/21 10:37 PM, Elliott Mitchell wrote:

On Mon, Sep 20, 2021 at 10:23:39PM -0400, Chuck Zmudzinski wrote:

On 9/20/21 7:39 PM, Diederik de Haas wrote:

On dinsdag 21 september 2021 01:15:15 CEST Elliott Mitchell wrote:

Merely having the path is a sufficiently strong indicator for me to
simply wave it past.  I would, though, suggest Debian should instead
cherry-pick commit 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.

This is available as a patch at:

https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=0f089bbf43ecce6f27576cb548ba4341d0ec46a8

You probably then also want the following commit, which is a fix on that patch:
https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b

Found that via the following url/query:
https://xenbits.xen.org/gitweb/?p=xen.git=search=HEAD=commit=x86%2FACPI

I don't know whether others should be used from that as well.

I tried these two commits (adapted for the xen-4.14 branch) but this
approach did not fix the bug - with these patches applied the dom0
did not power down.

My advice for the Debian Xen Team is to consult with upstream and
get their advice on whether or not it is advisable for Debian to
retain the patches from the Xen-4.16 branch that have been
added to the Debian 4.14 package in an attempt to support
some arm devices that panic on an unpatched Xen-4.14.
If upstream cannot help Debian backport fixes for arm panics
from Xen-4.16/unstable to Xen-4.14 stable, I think the Debian
Xen team should remove aggressive patches that really have now
turned the Debian Xen-4.14 package into a Frankenstein version
that is a mixture of Xen-4.14 and Xen-4.16, and decide that support
for those arm devices must wait until Debian gets Xen 4.16 up
and running on the unstable and, hopefully soon, the testing distribution.

It is still not established you're running into #991967.  Unless the one
you're pointing towards was backported to the Xen 4.11 packages (which I
doubt) it cannot explain #991967, since at the time 4.11 was in use.

Could be this is a second bug with symptoms similar to #991967.  Now
that a fix for the second bug has been identified, you might try a
4.19.181-1 kernel and see whether that fixes things.




I presume you are suggesting I try booting 4.19.181-1 on the
current version of Xen-4.14 for bullseye as a dom0. I am not
inclined to try it until an official Debian developer endorses
your opinion that the bug I am seeing is distinct
from #991967, at which point I will report the bug I am
seeing as a new bug.

Regards,

Chuck Zmudzinski



Bug#991967: linux-src 4.19.194-3 breaks Xen Dom0 powerdown and reboot

2021-09-21 Thread Fr. Chuck Zmudzinski, C.P.M.
On Sat, 7 Aug 2021 08:40:14 +0200 Salvatore Bonaccorso wrote:

> Control: tags -1 + moreinfo
>
> Hi,
>
> On Fri, Aug 06, 2021 at 11:50:54AM -0700, Elliott Mitchell wrote:
> > Package: src:linux
> > Version: 4.19.194-3
> > Control: affects -1 src:xen
> >
> > SSIA. Previous versions of 4.19 had no issues (4.19.181-1 according to
> > notes), but this cropped up with 4.19.194-3 (-1 and -2 weren't tested).
> >
> > When a Xen domain 0 tries to reboot or powerdown the computer, it hangs
> > with the display off, but the power supply is active.
> >
> > I'm rebuilding from source, so I imagine this also affects
> > linux-image-4.19.0-17-amd64.
>
> Can you please try to bisect which commit introduced the issue? Does
> it affect as well current upstream 4.19.201?
>
> Regards,
> Salvatore
>
>

Dear Salvatore,

As you have noticed, much more information about this bug
has been added to this bug report, but the original reporter
is of the opinion that much of that new information concerns
a bug related to but distinct from the bug he reported. Both
bugs have the same symptom: dom0 does not power down
when shutting down the system, and it is clear that both
bugs are related to x86 acpi code in either the Linux kernel
or in the Xen hypervisor. But I cannot reproduce his original
bug which occurred in Linux 4.19.194-3 on Xen-4.11 from buster.
I have only seen the bug in Xen-4.14 for bullseye, and I always
see it with Xen-4.14 regardless of the Linux kernel version.
As far as I can tell, another participant in this bug report
has reproduced the behavior I am seeing, but not the behavior
the original reporter is seeing:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991967#54

Would you endorse the original reporter's belief that there are
two distinct bugs being discussed here? If so, I would be inclined
to report the bug I am seeing as a distinct but related bug in the
bullseye version of src:xen. Otherwise, I respectfully ask that
you reclassify this as a bug in src:xen, since the original reporter
has not been able to identify a commit in src:linux that caused
the bug and no one has been able to reproduce the bug on
Xen-4.11/Linux-4.19.194-3.

Regards,

Chuck Zmudzinski



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-21 Thread Chuck Zmudzinski



On 9/20/21 7:39 PM, Diederik de Haas wrote:

On dinsdag 21 september 2021 01:15:15 CEST Elliott Mitchell wrote:

Merely having the path is a sufficiently strong indicator for me to
simply wave it past.  I would, though, suggest Debian should instead
cherry-pick commit 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.

This is available as a patch at:

https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=0f089bbf43ecce6f27576cb548ba4341d0ec46a8

You probably then also want the following commit, which is a fix on that patch:
https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b

Found that via the following url/query:
https://xenbits.xen.org/gitweb/?p=xen.git=search=HEAD=commit=x86%2FACPI

I don't know whether others should be used from that as well.


I tried these two commits (adapted for the xen-4.14 branch) but this
approach did not fix the bug - with these patches applied the dom0
did not power down.

My advice for the Debian Xen Team is to consult with upstream and
get their advice on whether or not it is advisable for Debian to
retain the patches from the Xen-4.16 branch that have been
added to the Debian 4.14 package in an attempt to support
some arm devices that panic on an unpatched Xen-4.14.
If upstream cannot help Debian backport fixes for arm panics
from Xen-4.16/unstable to Xen-4.14 stable, I think the Debian
Xen team should remove aggressive patches that really have now
turned the Debian Xen-4.14 package into a Frankenstein version
that is a mixture of Xen-4.14 and Xen-4.16, and decide that support
for those arm devices must wait until Debian gets Xen 4.16 up
and running on the unstable and, hopefully soon, the testing distribution.



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-21 Thread Chuck Zmudzinski

On 9/20/21 6:29 PM, Chuck Zmudzinski wrote:

On 9/20/21 1:43 PM, Chuck Zmudzinski wrote:


On 9/20/21 12:27 AM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:


I suspect the following patch is the culprit for problems
shutting down on the amd64 architecture:

0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
This patch does affect amd64 acpi code, and is probably causing
the problem on my amd64 system, so my build of the xen-4.14
hypervisor without this patch fixed the problem.

Of the ones listed that is the only one which has any overlap with x86
code.  The next reproduction step is `apt-get source xen &&
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
&& dpkg-buildpackage -b`.  Then try with this to confirm that patch
is what does it.

Thing is that delta is rather small.  I don't have a simulator, but that
is rather small to be the culprit.


I just tested the build with
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
applied before building the package and I can confirm that this is the patch
causing the trouble for dom0 poweroff on x86/amd64. Reverting this patch
fixes it on my amd64 system. But this would probably break the arm build.


I think one possible fix would require modifying
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
so it only applies at runtime to the arm architecture. I will try some
modifications to the patch instead of removing it, and if I get something
that works on amd64 and also might work on arm, I will post it
for Elliott to try.


I have an encouraging result. I found a very simple patch
to xen/arch/x86/acpi/lib.c that fixes the dom0 poweroff
bug on my system and it should not affect the arm patches
at all:
--
This patch partially reverts previous patch
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch

This hopefully fixes #991967

--- a/xen/arch/x86/acpi/lib.c    2021-09-20 16:49:08.0 -0400
+++ b/xen/arch/x86/acpi/lib.c    2021-09-20 16:25:05.572038000 -0400
@@ -46,10 +46,6 @@
 if ((phys + size) <= (1 * 1024 * 1024))
     return __va(phys);

-    /* No further arch specific implementation after early boot */
-    if (system_state >= SYS_STATE_boot)
-        return NULL;
-
 offset = phys & (PAGE_SIZE - 1);
 mapped_size = PAGE_SIZE - offset;
 set_fixmap(FIX_ACPI_END, phys);
--




Further testing with this patch revealed a problem. Although
this simple patch causes dom0 to poweroff when shutting
down, on the next reboot the system dropped to single-user
shell because it mixed up my ssd and my hard disk. Normally
the system assigns my SSD as /dev/sda and my hard disk
as /dev/sdb. But on the first reboot after running the Xen
hypervisor, the system reversed them so my SSD was /dev/sdb
and my hard disk was /dev/sda. Since the EFI partition, which
is a vfat partition, is on the SSD and in /etc/fstab I ask to mount
it from the /dev/sda1 partition, it is now at /dev/sdb1, and
the first partition is not a vfat partition on the hard disk so
the system drops to a root shell for system maintenance.

This switching of the devices on the subsequent reboot is
another symptom of this bug I have seen in the past, and
usually the ordinary behavior is restored on the next reboot
or after resetting and powering off or unplugging from power.

So this patch does not really fix the bug reliably.



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-20 Thread Chuck Zmudzinski

On 9/20/21 1:43 PM, Chuck Zmudzinski wrote:


On 9/20/21 12:27 AM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:


I suspect the following patch is the culprit for problems
shutting down on the amd64 architecture:

0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
This patch does affect amd64 acpi code, and is probably causing
the problem on my amd64 system, so my build of the xen-4.14
hypervisor without this patch fixed the problem.

Of the ones listed that is the only one which has any overlap with x86
code.  The next reproduction step is `apt-get source xen &&
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
&& dpkg-buildpackage -b`.  Then try with this to confirm that patch
is what does it.

Thing is that delta is rather small.  I don't have a simulator, but that
is rather small to be the culprit.


I just tested the build with
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
applied before building the package and I can confirm that this is the patch
causing the trouble for dom0 poweroff on x86/amd64. Reverting this patch
fixes it on my amd64 system. But this would probably break the arm build.

I think one possible fix would require modifying
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
so it only applies at runtime to the arm architecture. I will try some
modifications to the patch instead of removing it, and if I get something
that works on amd64 and also might work on arm, I will post it
for Elliott to try.


I have an encouraging result. I found a very simple patch
to xen/arch/x86/acpi/lib.c that fixes the dom0 poweroff
bug on my system and it should not affect the arm patches
at all:
--
This patch partially reverts previous patch
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch

This hopefully fixes #991967

--- a/xen/arch/x86/acpi/lib.c    2021-09-20 16:49:08.0 -0400
+++ b/xen/arch/x86/acpi/lib.c    2021-09-20 16:25:05.572038000 -0400
@@ -46,10 +46,6 @@
 if ((phys + size) <= (1 * 1024 * 1024))
     return __va(phys);

-    /* No further arch specific implementation after early boot */
-    if (system_state >= SYS_STATE_boot)
-        return NULL;
-
 offset = phys & (PAGE_SIZE - 1);
 mapped_size = PAGE_SIZE - offset;
 set_fixmap(FIX_ACPI_END, phys);
--

Can you try this patch to src:xen and see if your
arm devices are OK with it?



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-20 Thread Chuck Zmudzinski



On 9/20/21 12:27 AM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:


I suspect the following patch is the culprit for problems
shutting down on the amd64 architecture:

0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
This patch does affect amd64 acpi code, and is probably causing
the problem on my amd64 system, so my build of the xen-4.14
hypervisor without this patch fixed the problem.

Of the ones listed that is the only one which has any overlap with x86
code.  The next reproduction step is `apt-get source xen &&
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
&& dpkg-buildpackage -b`.  Then try with this to confirm that patch
is what does it.

Thing is that delta is rather small.  I don't have a simulator, but that
is rather small to be the culprit.


I just tested the build with
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
applied before building the package and I can confirm that this is the patch
causing the trouble for dom0 poweroff on x86/amd64. Reverting this patch
fixes it on my amd64 system. But this would probably break the arm build.

I think one possible fix would require modifying
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
so it only applies at runtime to the arm architecture. I will try some
modifications to the patch instead of removing it, and if I get something
that works on amd64 and also might work on arm, I will post it
for Elliott to try.



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-20 Thread Chuck Zmudzinski



On 9/20/21 12:27 AM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:

xen hypervisor version: 4.14.2+25-gb6a8c4f72d-2, amd64

linux kernel version: 5.10.46-4 (the current amd64 kernel
for bullseye)

Boot system: EFI, not using secure boot, booting xen
hypervisor and dom0 bullseye with grub-efi package for
bullseye, and it boots the xen-4.14-amd64.gz file, not
the xen-4.14-amd64.efi file.
I also tested a buster dom0 with the 4.19 series kernel
on the xen-4.14 hypervisor from bullseye and saw the
problem, but I did not see the problem with either
a buster (linux 4.19) or bullseye (linux 5.10) dom0 on
the xen-4.11 hypervisor, so I think the problem is
with the Debian version of the xen-4.14 hypervisor,
not with src:linux.

You're referencing several software versions which are mismatches for
#991967.  #991967 was observed with Xen 4.11 and Linux kernel 4.19.194-3,
but not Linux kernel 4.19.181.

The fact it correlates with a Linux kernel update rather strongly points
to the Linux kernel.  I could believe the situation is partially the
fault of both though.


I don't see it with Xen-4.11 and Linux kernel 4.19.194-3 which is
the current default dom0 configuration on Debian buster, but I
do see it with Debian's version of Xen-4.14 and either Linux
kernel 4.19.194-3 from buster or Linux kernel 5.10.46-4 from
bullseye as the dom0. So I only saw it with the update of the
Xen hypervisor from 4.11 to 4.14. Of course you have different
hardware and a different acpi implementation which is also likely
to be a factor that determines whether or not the dom0 poweroff
bug manifests itself.




I suspect the following patch is the culprit for problems
shutting down on the amd64 architecture:

0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
This patch does affect amd64 acpi code, and is probably causing
the problem on my amd64 system, so my build of the xen-4.14
hypervisor without this patch fixed the problem.

Of the ones listed that is the only one which has any overlap with x86
code.  The next reproduction step is `apt-get source xen &&
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
&& dpkg-buildpackage -b`.  Then try with this to confirm that patch
is what does it.

Thing is that delta is rather small.  I don't have a simulator, but that
is rather small to be the culprit.


I did try to remove this single patch from the xen build using
quilt, but quilt was not happy when it tried to apply the
subsequent arm patch, so I just removed all the subsequent
arm patches to keep quilt happy with my modified xen
src tree. I will try it now, though.

If it is this small a delta that is causing the problem
on x86/amd64, then maybe we can come up with a workaround
in src:xen that is acceptable for both arm and x86/amd64.




I think this bug should be re-classified as a bug in src:xen.

There could be a separate bug in src:xen, but that is not #991967.


I also would inquire with the Debian Xen Team about why they
are backporting patches from the upstream xen unstable
branch into Debian's 4.14 package that is currently shipping
on Debian stable (bullseye). IMHO, the aforementioned
patches that are not in the stable 4.14 branch upstream
should not be included in the xen package for Debian stable.

It was requested since someone trying to have Xen operational on a device
needed those for operation.  Rather a lot of bugfix or very small
standalone feature patches get cherry-picked.


Presently I haven't been convinced this is a Xen bug (though it does
affect Xen installations).

Any chance you've got the tools to build and try a 5.5.0 or 5.10.0 Linux
kernel?  I suspect something got incorrectly backported on the Linux side
(alternatively the Xen project seems a bit poor at keeping needed patches
in Linux).




Yes, I recently built and tested a slightly modified Debian
bullseye kernel to test a fix for #983357:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983357

If you have a patch for Debian's 5.10 bullseye kernel that
might fix the dom0 poweroff bug I am seeing on bullseye with
Debian's current Xen 4.14, I am willing to try it out on my
system as an alternate fix from the fix I discovered in
src:xen that unfortunately removes arm patches that are
needed by some devices.



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-20 Thread Chuck Zmudzinski

On 9/19/2021 9:30 PM, Chuck Zmudzinski wrote:

On 9/19/2021 4:53 PM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 03:54:01PM -0400, Chuck Zmudzinski wrote:

On 9/19/2021 1:29 PM, Elliott Mitchell wrote:

Have you tried memory ballooning with PVH or HVM domains?

That combination has been reliably crashing Xen for me for a while.
Apparently few others have run into it, yet it is reliable for me.  Have
you tried the combination?  Works?  Panics?

I have not tried ballooning HVM or PVH domains. If the Xen
hypervisor is crashing when ballooning unprivileged domains,
doesn't that support my belief that there are bugs in src:xen
rather than in src:linux?

No.


I still think the patches to fix a panic on devices using the arm
architecture are a bit aggressive for the Debian Xen package for Debian
stable. Those
patches upstream are intended for Xen unstable, which is currently
Xen 4.16. Such patches do not belong in a stable Xen 4.14 package for
Debian stable, especially after it can be proven they cause a regression
for Xen users of amd64 devices, the regression being that they break the
proper shutdown functioning of amd64 devices.

I think the correct Debian way to support the arm devices that
panic on a true upstream Xen 4.14 hypervisor without the
patches for arm that cause dom0 to not power off properly on
amd64 is by first testing the arm patches as part of a new Xen 4.16
unstable Xen package for Debian unstable, then follow ordinary
development procedures for porting Xen 4.16 to bookworm/testing,
and then finally a backport of Xen 4.16 to bullseye. That is the
only way I can see this being done without causing grief to
Xen users who want a stable Xen on a stable Debian, unless
upstream can help with porting the arm patches back to Xen 4.14
in such a way that they don't break things on amd64.

This was also deliberately not copied to #991967 since this is unrelated.
I'm concerned this second one might be Debian, but the small delta makes
me think it likely originates from upstream Xen.  I was wondering whether
you had seen it since I haven't found other reports.

(note, if you try recreating, this is a Xen panic, all domains get lost)




This is off-topic for bug #991968.

Regards,

Chuck


Also off-topic for bug #991967 - sorry about the typo.

Chuck



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-19 Thread Chuck Zmudzinski

On 9/19/2021 4:53 PM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 03:54:01PM -0400, Chuck Zmudzinski wrote:

On 9/19/2021 1:29 PM, Elliott Mitchell wrote:

Have you tried memory ballooning with PVH or HVM domains?

That combination has been reliably crashing Xen for me for a while.
Apparently few others have run into it, yet it is reliable for me.  Have
you tried the combination?  Works?  Panics?

I have not tried ballooning HVM or PVH domains. If the Xen
hypervisor is crashing when ballooning unprivileged domains,
doesn't that support my belief that there are bugs in src:xen
rather than in src:linux?

No.


I still think the patches to fix a panic on devices using the arm
architecture are a bit aggressive for the Debian Xen package for Debian
stable. Those
patches upstream are intended for Xen unstable, which is currently
Xen 4.16. Such patches do not belong in a stable Xen 4.14 package for
Debian stable, especially after it can be proven they cause a regression
for Xen users of amd64 devices, the regression being that they break the
proper shutdown functioning of amd64 devices.

I think the correct Debian way to support the arm devices that
panic on a true upstream Xen 4.14 hypervisor without the
patches for arm that cause dom0 to not power off properly on
amd64 is by first testing the arm patches as part of a new Xen 4.16
unstable Xen package for Debian unstable, then follow ordinary
development procedures for porting Xen 4.16 to bookworm/testing,
and then finally a backport of Xen 4.16 to bullseye. That is the
only way I can see this being done without causing grief to
Xen users who want a stable Xen on a stable Debian, unless
upstream can help with porting the arm patches back to Xen 4.14
in such a way that they don't break things on amd64.


This was also deliberately not copied to #991967 since this is unrelated.
I'm concerned this second one might be Debian, but the small delta makes
me think it likely originates from upstream Xen.  I was wondering whether
you had seen it since I haven't found other reports.

(note, if you try recreating, this is a Xen panic, all domains get lost)




This is off-topic for bug #991968.

Regards,

Chuck



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-19 Thread Chuck Zmudzinski

On 9/19/2021 1:29 PM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:

I noticed this bug on bullseye ever since I have been
running bullseye as a dom0, but my testing indicates
there is no problem with src:linux but the problem
appeared in src:xen with the 4.14 version of xen on
bullseye.

I ask Elliott if you are only seeing the problem on Debian's
xen-4.14 hypervisor? Also, which architecture, arm or
amd64? I only see the problem on the Debian xen-4.14
hypervisor, and I have only tested on amd64, and I
have found a fix for my amd64 system which is as
follows:

Motherboard: ASRock B85M Pro4, BIOS P2.50 12/11/2015,
with a Haswell CPU (core i5-4590S)

xen hypervisor version: 4.14.2+25-gb6a8c4f72d-2, amd64

linux kernel version: 5.10.46-4 (the current amd64 kernel
for bullseye)

Boot system: EFI, not using secure boot, booting xen
hypervisor and dom0 bullseye with grub-efi package for
bullseye, and it boots the xen-4.14-amd64.gz file, not
the xen-4.14-amd64.efi file.

Actually hardware which is pretty different from mine, so you may run
into distinct bugs.

Have you tried PVH or HVM domains?


HVM domains: Yes, and they work normally on all Debian versions
I have tried.

PVH domains: No, I have not tried these on Debian.


Have you tried memory ballooning with PVH or HVM domains?

That combination has been reliably crashing Xen for me for a while.
Apparently few others have run into it, yet it is reliable for me.  Have
you tried the combination?  Works?  Panics?


I have not tried ballooning HVM or PVH domains. If the Xen
hypervisor is crashing when ballooning unprivileged domains,
doesn't that support my belief that there are bugs in src:xen
rather than in src:linux?

Regards,

Chuck



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-19 Thread Chuck Zmudzinski

On 9/19/2021 10:56 AM, Elliott Mitchell wrote:

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:

On Sat, 11 Sep 2021 13:29:12 +0200 Salvatore Bonaccorso
 wrote:
  >
  > On Fri, Sep 10, 2021 at 06:47:12PM -0700, Elliott Mitchell wrote:
  > > An experiment led to a potential alternative explanation for #991967.
  > > The issue may be ACPI (non-UEFI) powerdown/reset was broken at
  > > 4.19.194-3. Presence of Xen on the system may be unrelated.
  > >
  > > Failing that, it could be Xen and non-UEFI systems are affected. (Xen
  > > was tried on a UEFI system and the issue wasn't observed)
  >
  > Following up on https://bugs.debian.org/991967#12
  >
  > Did you succeed in bisecting the issue as you seem to have it
  > reproducible?

I noticed this bug on bullseye ever since I have been
running bullseye as a dom0, but my testing indicates
there is no problem with src:linux but the problem
appeared in src:xen with the 4.14 version of xen on
bullseye.

I ask Elliott if you are only seeing the problem on Debian's
xen-4.14 hypervisor? Also, which architecture, arm or
amd64? I only see the problem on the Debian xen-4.14
hypervisor, and I have only tested on amd64, and I
have found a fix for my amd64 system which is as
follows:

Motherboard: ASRock B85M Pro4, BIOS P2.50 12/11/2015,
with a Haswell CPU (core i5-4590S)

xen hypervisor version: 4.14.2+25-gb6a8c4f72d-2, amd64

linux kernel version: 5.10.46-4 (the current amd64 kernel
for bullseye)

Nope.  As per the report the problem appeared with kernel 4.19.194-3 and
at the time using Xen 4.11.

The kernel you're listing is rather more recent, which might suggest a
patch which had been backported from 5.x to 4.19.

I could believe a Xen security update being the trigger though (I don't
recall there being one at the right time, but I wouldn't rule it out).



Boot system: EFI, not using secure boot, booting xen
hypervisor and dom0 bullseye with grub-efi package for
bullseye, and it boots the xen-4.14-amd64.gz file, not
the xen-4.14-amd64.efi file.

I also tested a buster dom0 with the 4.19 series kernel
on the xen-4.14 hypervisor from bullseye and saw the
problem, but I did not see the problem with either
a buster (linux 4.19) or bullseye (linux 5.10) dom0 on
the xen-4.11 hypervisor, so I think the problem is
with the Debian version of the xen-4.14 hypervisor,
not with src:linux.

Just to make sure, the kernel you were testing was 4.19.194-3?  The
issue didn't manifest with kernels earlier than that.


I will check again with a buster dom0 when I get a chance,
probably late tonight or tomorrow. I think it was 4.19.194-3
if that is the latest buster kernel because I don't think there
has been an update to the buster kernel since I tested it.


Could be we're seeing distinct bugs.


I could agree if the problem shows up on my system
with the 4.19.194-3 kernel dom0 on xen-4.11, but if not,
then it is probably the same bug, a bug that is in src:xen,
not src:linux.




This patch does affect amd64 acpi code, and is probably causing
the problem on my amd64 system, so my build of the xen-4.14
hypervisor without this patch fixed the problem.

While that commit modifies the code path the processor takes, the
modified path appears identical.



I also would inquire with the Debian Xen Team about why they
are backporting patches from the upstream xen unstable
branch into Debian's 4.14 package that is currently shipping
on Debian stable (bullseye). IMHO, the aforementioned
patches that are not in the stable 4.14 branch upstream
should not be included in the xen package for Debian stable.

Some people are asking for those.  Those are bugfixes for an extremely
popular device which panics on boot without the patches.


The raspberry pi, I presume.



Meanwhile it turned out between 5.10.0 and 5.10.30 the ARM64 device-trees
were modified in a way which broke Xen 4.14 on ARM64.  The change
violated Linux's own standards for device-trees, yet still appeared in a
stable branch.

In other news, if you see device-trees compared to ACPI tables, they're
not very comparable.  99% of ACPI tables work for all versions of all
OSes.  Any given device-tree is only likely to work for a single version
of a single OS.  While a useful abstraction for portions of kernel code,
device-trees are utter garbage compared to ACPI tables.




Well, now we are at Debian stable with 5.10.x for linux and 4.14.x for xen,
so we are kind of stuck with these versions on Debian stable now. I am all
for tweaking the Debian stable packages to support raspberry and amd64. The
question is, what is the quickest and least disturbing way to fix it now?

All the best,

Chuck



Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-19 Thread Chuck Zmudzinski

On 9/19/2021 1:05 AM, Chuck Zmudzinski wrote:


Hello Elliott and Salvatore,

I noticed this bug on bullseye ever since I have been
running bullseye as a dom0, but my testing indicates
there is no problem with src:linux but the problem
appeared in src:xen with the 4.14 version of xen on
bullseye.

I ask Elliott if you are only seeing the problem on Debian's
xen-4.14 hypervisor? Also, which architecture, arm or
amd64? I only see the problem on the Debian xen-4.14
hypervisor, and I have only tested on amd64, and I
have found a fix for my amd64 system which is as
follows:

Motherboard: ASRock B85M Pro4, BIOS P2.50 12/11/2015,
with a Haswell CPU (core i5-4590S)

xen hypervisor version: 4.14.2+25-gb6a8c4f72d-2, amd64

linux kernel version: 5.10.46-4 (the current amd64 kernel
for bullseye)

Boot system: EFI, not using secure boot, booting xen
hypervisor and dom0 bullseye with grub-efi package for
bullseye, and it boots the xen-4.14-amd64.gz file, not
the xen-4.14-amd64.efi file.

I also tested a buster dom0 with the 4.19 series kernel
on the xen-4.14 hypervisor from bullseye and saw the
problem, but I did not see the problem with either
a buster (linux 4.19) or bullseye (linux 5.10) dom0 on
the xen-4.11 hypervisor, so I think the problem is
with the Debian version of the xen-4.14 hypervisor,
not with src:linux.
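
The reasoning in the paragraph above can be written down as a small
table check. This is only a sketch in Python using the four results
reported here from one machine; the function name and the data layout
are made up for illustration, and a result on different hardware could
change the picture.

```python
# (dom0 kernel series, Xen hypervisor series, poweroff works?)
# Values are the four combinations reported in this message.
OBSERVATIONS = [
    ("4.19", "4.14", False),
    ("5.10", "4.14", False),
    ("4.19", "4.11", True),
    ("5.10", "4.11", True),
]


def factors_explaining(observations):
    """Return the factors whose value alone predicts the outcome."""
    explaining = []
    for column, label in enumerate(("kernel", "hypervisor")):
        outcomes = {}
        for row in observations:
            # Collect every outcome seen for each value of this factor.
            outcomes.setdefault(row[column], set()).add(row[2])
        # The factor explains the results if each value maps to one outcome.
        if all(len(seen) == 1 for seen in outcomes.values()):
            explaining.append(label)
    return explaining


print(factors_explaining(OBSERVATIONS))  # prints ['hypervisor']
```

With these four data points only the hypervisor version separates the
working and failing cases, which is why I point at src:xen.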

I also found a fix in src:xen:

I noticed the series of patches in debian/patches of the
4.14.2+25-gb6a8c4f72d-2 version of src:xen (and
earlier versions of xen-4.14 on Debian) have several patches
backported from the unstable branch of xen upstream. By
removing some of these patches from the patches
series of the src:xen package, the dom0 shuts down
as expected on my ASRock Haswell motherboard.

I rebuilt the src:xen package after removing the following
patches from the debian/patches series and the result
was that the computer shuts down as expected if I boot
using the patched hypervisor:

0027-xen-rpi4-implement-watchdog-based-reset.patch
0028-tools-python-Pass-linker-to-Python-build-process.patch
0029-xen-arm-acpi-Don-t-fail-if-SPCR-table-is-absent.patch
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
0031-xen-arm-acpi-The-fixmap-area-should-always-be-cleare.patch
0032-xen-arm-Check-if-the-platform-is-not-using-ACPI-befo.patch
0033-xen-arm-Introduce-fw_unreserved_regions-and-use-it.patch
0034-xen-arm-acpi-add-BAD_MADT_GICC_ENTRY-macro.patch
0035-xen-arm-traps-Don-t-panic-when-receiving-an-unknown-.patch

Most of these patches seem unrelated to the amd64
architecture and instead affect the arm architecture, and
removing all these patches is probably more than is needed to
fix this bug, but I removed them all because I could not find
them upstream on the 4.14 branch but instead only saw them
on the xen unstable branch upstream (I did not check if they are
on the 4.15 branch upstream), and I wanted to test
a true upstream 4.14 version without these seemingly
aggressive patches added by Debian from the unstable
branch of xen upstream, and I discovered that being
more conservative and not adding these patches from the
unstable branch upstream fixed the problem!
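
For anyone who wants to repeat the experiment, removing entries from
debian/patches/series can be scripted. The following Python sketch is
an illustration only: the helper name is hypothetical, it works on the
text of a series file rather than touching the source tree, and it
assumes the usual quilt layout of one patch name per line with optional
arguments and "#" comments. Later patches may depend on a removed one,
so the remaining series is not guaranteed to apply.

```python
def drop_patches(series_text, unwanted):
    """Return (new series text, names removed) for a quilt series file."""
    kept, removed = [], []
    for line in series_text.splitlines():
        # Ignore comments; the first word on the line is the patch name.
        words = line.split("#", 1)[0].split()
        name = words[0] if words else ""
        if name in unwanted:
            removed.append(name)
        else:
            kept.append(line)
    return "\n".join(kept) + "\n", removed


# Three entries taken from the list above.
SERIES = """\
0029-xen-arm-acpi-Don-t-fail-if-SPCR-table-is-absent.patch
0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
0031-xen-arm-acpi-The-fixmap-area-should-always-be-cleare.patch
"""

new_series, removed = drop_patches(
    SERIES,
    {"0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch"},
)
print(removed)
```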

I suspect the following patch is the culprit for problems
shutting down on the amd64 architecture:

0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch

The commit log for this patch states:

From: Julien Grall 
Date: Sat, 26 Sep 2020 17:44:29 +0100
Subject: xen/acpi: Rework acpi_os_map_memory() and acpi_os_unmap_memory()

The functions acpi_os_{un,}map_memory() are meant to be arch-agnostic
while the __acpi_os_{un,}map_memory() are meant to be arch-specific.

Currently, the former are still containing x86 specific code.

To avoid this rather strange split, the generic helpers are reworked so
they are arch-agnostic. This requires the introduction of a new helper
__acpi_os_unmap_memory() that will undo any mapping done by
__acpi_os_map_memory().

Currently, the arch-helper for unmap is basically a no-op so it only
returns whether the mapping was arch specific. But this will change
in the future.

Note that the x86 version of acpi_os_map_memory() was already able to
able the 1MB region. Hence why there is no addition of new code.

Signed-off-by: Julien Grall 
Reviewed-by: Rahul Singh 
Reviewed-by: Jan Beulich 
Acked-by: Stefano Stabellini 
Tested-by: Rahul Singh 
Tested-by: Elliott Mitchell 
(cherry picked from commit 1c4aa69ca1e1fad20b2158051eb152276d1eb973)
---

This patch does affect amd64 acpi code, and is probably causing
the problem on my amd64 system, so my build of the xen-4.14
hypervisor without this patch fixed the problem.

I think this bug should be re-classified as a bug in src:xen.

I also would inquire with the Debian Xen Team about why they
are backporting patches from the upstream xen unstable
branch into Debian's 4.14 package that is currently shipping
on Debian stable (bullseye). IMHO, the aforementioned
patches that are not in the stable 4.14 branch upstream
should not be included in the xen package for Debian stable.

Regards,

Chuck Zmudzinski



Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-26 Thread Chuck Zmudzinski

On 8/26/2021 8:01 AM, Chuck Zmudzinski wrote:

On 8/24/2021 7:12 PM, Ben Hutchings wrote:


The current limit on the environment of a uevent appears to be 2 KB
(UEVENT_BUFFER_SIZE defined in ).  That seems like it
*might* be easier to change, so long as user-space doesn't have a
similar limit.

I looked into systemd/udev, and it seems to use an 8 KB buffer for
receiving uevents:

https://sources.debian.org/src/systemd/247.9-1/src/libsystemd/sd-device/device-monitor.c/?hl=390#L390 



But as a first step I think increasing the kernel buffer size to 4 KB
would be enough.  Perhaps someone could test whether this patch to the
domU kernel makes udev happier:

--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -30,7 +30,7 @@
 
 #define UEVENT_HELPER_PATH_LEN		256
 #define UEVENT_NUM_ENVP			64	/* number of env pointers */
-#define UEVENT_BUFFER_SIZE		2048	/* buffer for the variables */
+#define UEVENT_BUFFER_SIZE		4096	/* buffer for the variables */
 
 #ifdef CONFIG_UEVENT_HELPER
 /* path to the userspace helper executed on an event */
--- END ---

?

Ben.



I tested this patch on my Xen HVM bullseye system and
it appears 4k is enough for the UEVENT_BUFFER_SIZE
to accommodate the Xen Virtual Keyboard's large
modalias. I needed to follow the instructions in
the Kernel team's handbook for changing the ABI
name of the kernel for the build to succeed with
the patch. I just bumped it from 8 to 8.1.

Results:

1. No coldplug failure reported at boot time.

2. With the patch the system can write uevent
data to sysfs for the Xen Virtual Keyboard device.

With the current 5.10.0-8 kernel:

chuckz@debian:~$ cat /sys/devices/virtual/input/input2/uevent
chuckz@debian:~$

With the patched kernel with a change to the ABI version from 8 to 8.1:

chuckz@debian:~$ uname -r
5.10.0-8.1-amd64
chuckz@debian:~$ cat /sys/devices/virtual/input/input2/uevent
PRODUCT=1/5853//0
NAME="Xen Virtual Keyboard"
PHYS="xenbus/device/vkbd/0"
PROP=0
EV=3
KEY=7fff  ...
MODALIAS=input:b0001v5853pe-e0,1,k71,72... really long MODALIAS
--- 



So I think a test of the installation media in a Xen HVM with the
4k buffer in the kernel is the next step.

I would also like to test a live CD in a Xen HVM with this patch.
It was also reported to fail to boot in a Xen HVM on the
debian-user list.

BTW, my compliments to the Debian Kernel Team for the
excellent handbook on building kernels for Debian. It is
easy to understand and made it very easy for me to
build and test the patch even though I have not built
a Linux kernel in many years, and I never built a Debian
kernel before.

All the best,

Chuck



Results of more tests with the patched kernel:

1. Boot on dom0 - works normally, can create VMs, run Linux containers, etc.
2. Boot in Xen PV - works normally
3. Boot on bare hardware - works normally

I do not see any issues with the patched kernel on my system.

Cheers,

Chuck



Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-26 Thread Chuck Zmudzinski

On 8/24/2021 7:12 PM, Ben Hutchings wrote:


The current limit on the environment of a uevent appears to be 2 KB
(UEVENT_BUFFER_SIZE defined in ).  That seems like it
*might* be easier to change, so long as user-space doesn't have a
similar limit.

I looked into systemd/udev, and it seems to use an 8 KB buffer for
receiving uevents:

https://sources.debian.org/src/systemd/247.9-1/src/libsystemd/sd-device/device-monitor.c/?hl=390#L390

But as a first step I think increasing the kernel buffer size to 4 KB
would be enough.  Perhaps someone could test whether this patch to the
domU kernel makes udev happier:

--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -30,7 +30,7 @@
 
 #define UEVENT_HELPER_PATH_LEN		256
 #define UEVENT_NUM_ENVP			64	/* number of env pointers */
-#define UEVENT_BUFFER_SIZE		2048	/* buffer for the variables */
+#define UEVENT_BUFFER_SIZE		4096	/* buffer for the variables */
 
 #ifdef CONFIG_UEVENT_HELPER
 /* path to the userspace helper executed on an event */
--- END ---

?

Ben.



I tested this patch on my Xen HVM bullseye system and
it appears 4k is enough for the UEVENT_BUFFER_SIZE
to accommodate the Xen Virtual Keyboard's large
modalias. I needed to follow the instructions in
the Kernel team's handbook for changing the ABI
name of the kernel for the build to succeed with
the patch. I just bumped it from 8 to 8.1.

Results:

1. No coldplug failure reported at boot time.

2. With the patch the system can write uevent
data to sysfs for the Xen Virtual Keyboard device.

With the current 5.10.0-8 kernel:

chuckz@debian:~$ cat /sys/devices/virtual/input/input2/uevent
chuckz@debian:~$

With the patched kernel with a change to the ABI version from 8 to 8.1:

chuckz@debian:~$ uname -r
5.10.0-8.1-amd64
chuckz@debian:~$ cat /sys/devices/virtual/input/input2/uevent
PRODUCT=1/5853//0
NAME="Xen Virtual Keyboard"
PHYS="xenbus/device/vkbd/0"
PROP=0
EV=3
KEY=7fff  ...
MODALIAS=input:b0001v5853pe-e0,1,k71,72... really long MODALIAS
---

So I think a test of the installation media in a Xen HVM with the
4k buffer in the kernel is the next step.

I would also like to test a live CD in a Xen HVM with this patch.
It was also reported to fail to boot in a Xen HVM on the
debian-user list.

BTW, my compliments to the Debian Kernel Team for the
excellent handbook on building kernels for Debian. It is
easy to understand and made it very easy for me to
build and test the patch even though I have not built
a Linux kernel in many years, and I never built a Debian
kernel before.

All the best,

Chuck



Bug#988776: Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-25 Thread Chuck Zmudzinski

On 8/25/2021 4:16 PM, Phillip Susi wrote:

Chuck Zmudzinski  writes:

If it doesn't work, I am also willing to try approach a by patching
the Linux kernel xen-kbdfront driver by removing the for loops that
advertise those 654 keys. I tend to agree with Phillip that this is
totally unnecessary, but I suppose I could be wrong about that.
I read the discussion Phillip had with the Xen developers and they
seemed to want to keep the Xen keyboard driver as it is.

That was the first thing I tried and the libinput maintainer pointed out
that if you don't advertise the keys, you can't use the keys.  In other
words, somebody presses that key on their keyboard and the domU won't
recognize it.



Well, good news: it looks like Ben's patch works. I just tested it in my
full install in a Xen HVM domU and all looks good. I did not see the Coldplug
failure at the beginning of the boot - it is hard to miss in the bright red
letters on the console, and even more convincing is the fact that another
symptom of the bug is gone. This bug manifests itself in udev not being
able to write uevent data to sysfs for the Xen Virtual Keyboard. With
Ben's patch of increasing the UEVENT_BUFFER_SIZE from 2048 to 4096,
udev can write its uevent data to sysfs for the Xen Virtual Keyboard:

With the current 5.10.0-8 kernel:

chuckz@debian:~$ cat /sys/devices/virtual/input/input2/uevent
chuckz@debian:~$

With the patched kernel with a change to the ABI version from 8 to 8.1:

chuckz@debian:~$ uname -r
5.10.0-8.1-amd64
chuckz@debian:~$ cat /sys/devices/virtual/input/input2/uevent
PRODUCT=1/5853//0
NAME="Xen Virtual Keyboard"
PHYS="xenbus/device/vkbd/0"
PROP=0
EV=3
KEY=7fff  ...
MODALIAS=input:b0001v5853pe-e0,1,k71,72... really long MODALIAS
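
The check I did by hand with cat can also be scripted. Below is a
minimal Python sketch, not part of any Debian tool: the function names
are hypothetical, and the sample text contains only the lines from the
output above that are shown in full (the KEY and MODALIAS lines are
abbreviated above, so they are left out).

```python
def parse_uevent(text):
    """Parse KEY=value lines from a sysfs uevent file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without "="
            fields[key] = value
    return fields


def looks_affected(text):
    """An empty uevent file matches the symptom seen on the 5.10.0-8 kernel."""
    return not parse_uevent(text)


def read_uevent(path="/sys/devices/virtual/input/input2/uevent"):
    """Read the uevent file; the default path is the one used in this mail."""
    with open(path) as handle:
        return handle.read()


SAMPLE = """\
PRODUCT=1/5853//0
NAME="Xen Virtual Keyboard"
PHYS="xenbus/device/vkbd/0"
PROP=0
EV=3
"""

print(looks_affected(""), looks_affected(SAMPLE))  # prints True False
```

The input2 number may differ on other systems, so the path passed to
read_uevent() should be adjusted as needed.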

I expect with that patch the installation media will work
in a Xen HVM domU.

Cheers,

Chuck



Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-25 Thread Chuck Zmudzinski

On 8/24/2021 7:12 PM, Ben Hutchings wrote:


Text-based sysfs attributes are limited to a page, but udev receives
uevents through netlink, not sysfs.

The current limit on the environment of a uevent appears to be 2 KB
(UEVENT_BUFFER_SIZE defined in ).  That seems like it
*might* be easier to change, so long as user-space doesn't have a
similar limit.

I looked into systemd/udev, and it seems to use an 8 KB buffer for
receiving uevents:

https://sources.debian.org/src/systemd/247.9-1/src/libsystemd/sd-device/device-monitor.c/?hl=390#L390

But as a first step I think increasing the kernel buffer size to 4 KB
would be enough.  Perhaps someone could test whether this patch to the
domU kernel makes udev happier:

--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -30,7 +30,7 @@
 
 #define UEVENT_HELPER_PATH_LEN		256
 #define UEVENT_NUM_ENVP			64	/* number of env pointers */
-#define UEVENT_BUFFER_SIZE		2048	/* buffer for the variables */
+#define UEVENT_BUFFER_SIZE		4096	/* buffer for the variables */
 
 #ifdef CONFIG_UEVENT_HELPER
 /* path to the userspace helper executed on an event */
--- END ---

?

Ben.
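
To illustrate why 2 KB is too small and 4 KB should be enough, here is
a simplified Python model of a fixed-size uevent environment buffer.
It is a sketch under assumptions, not the kernel code: the class and
method names are invented, each variable is assumed to cost its length
plus one terminating NUL byte, and the MODALIAS value is a placeholder
whose key part has the 2362-byte length quoted elsewhere in this thread.

```python
class UeventEnv:
    """Simplified model of a uevent environment with a fixed byte budget."""

    def __init__(self, buffer_size, max_vars=64):
        self.buffer_size = buffer_size  # models UEVENT_BUFFER_SIZE
        self.max_vars = max_vars        # models UEVENT_NUM_ENVP
        self.vars = []
        self.used = 0

    def add_var(self, name, value):
        """Append NAME=value; return False if it does not fit."""
        entry = f"{name}={value}"
        needed = len(entry.encode()) + 1  # +1 for the terminating NUL
        if len(self.vars) >= self.max_vars:
            return False
        if needed > self.buffer_size - self.used:
            return False
        self.vars.append(entry)
        self.used += needed
        return True


KEY_PART = "x" * 2362  # placeholder of the quoted length, not a real modalias
for size in (2048, 4096):
    env = UeventEnv(size)
    print(size, env.add_var("MODALIAS", "input:" + KEY_PART))
# prints "2048 False" then "4096 True"
```

The remaining headroom in the 4 KB case still has to hold every other
variable in the event, so this only shows that the modalias alone fits.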



I tried this patch but the build failed - it ran for over an hour. I am not
sure why as I have not built a Linux kernel in many years. So I will do
this:

1) Try to build the unmodified kernel on my system just to be sure I
am building the kernel correctly and that my hardware is OK. Once
I could not build the Linux kernel until I replaced a bad memory
card.

2) If that succeeds, I will try the patch with a bump to the abi version.

From the output of the failed build and what I read in the section on
the Debian kernel ABI name, I think that the system detected an
ABI change and so it failed. The build was checking symbols when
it failed.

This will take a little while because it takes over an hour to build the
kernel on my system.

Chuck



Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-25 Thread Chuck Zmudzinski

On 8/25/2021 12:45 PM, Chuck Zmudzinski wrote:

On 8/24/2021 7:12 PM, Ben Hutchings wrote:

On Tue, Aug 24, 2021 at 03:27:19PM -0400, Phillip Susi wrote:

Ben Hutchings  writes:


I think a proper fix would be one of:

a. If the Xen virtual keyboard driver is advertising capabilities it
   doesn't have, stop it doing that.
b. Change the implementation of modalias attributes to allow longer
   values.

It's not clear to me whether the Xen driver is advertising correctly or
not.  If it is, then the solution should be b, but that may be too
disruptive a change to the kernel.  So a reasonable workaround might
be:

c. Change the input subsystem to limit the length of the
   capabilities part of the modalias.

The problem with a) is that the Xen keyboard is not a physical keyboard
and so it has no way of knowing what keys it actually has.  It is a fake
input device designed to pass through whatever input the Xen hypervisor
sends down.  As such, any key could come in.  If it doesn't advertise
that it has all of these keys, then they would not be accepted by
libinput when the hypervisor sends them down.

Right, that's what I feared.

xen-kbdfront is setting the bits for keys in the ranges [KEY_ESC,
KEY_UNKNOWN) and [KEY_OK, KEY_MAX), which I think works out to 654
keys and 2362 bytes in the modalias.
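
The two figures above can be reproduced with a few lines of Python.
This is only a worked check of the arithmetic: the key code values are
the ones from the kernel's input event code header, and the assumption
that each advertised key costs its uppercase hex code plus a comma is
a simplification of how the key bitmap is printed into the modalias.
The function names are made up for the example.

```python
# Key codes as defined in the kernel's input event codes header.
KEY_ESC = 1
KEY_UNKNOWN = 240
KEY_OK = 0x160
KEY_MAX = 0x2FF


def advertised_keys():
    """Key codes in the half-open ranges [KEY_ESC, KEY_UNKNOWN) and [KEY_OK, KEY_MAX)."""
    return list(range(KEY_ESC, KEY_UNKNOWN)) + list(range(KEY_OK, KEY_MAX))


def modalias_key_bytes(keys):
    """Bytes used when each key is written as uppercase hex followed by a comma."""
    return sum(len(f"{code:X},") for code in keys)


keys = advertised_keys()
print(len(keys), modalias_key_bytes(keys))  # prints 654 2362
```

The byte count covers only the key list; the rest of the modalias and
the other uevent variables come on top of it.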


This seems to be the heart of the problem: libinput was designed
assuming that all keyboards can and must report what keys are actually
present, and then libinput tries to cram that information into the
modalias rather than some other sysfs attribute as it should ( or not at
all... I still don't see how this information is actually supposed to be
useful to userspace ).

I think modaliases aren't intended to be interpreted by user-space,
other than processing wildcards when matching to modules.

For input devices, the same information is available through other
variables in the uevent, in a more compact form.  The information *is*
useful for user-space; e.g. in initramfs-tools we recognise keyboard
devices and add their drivers to the initramfs but ignore other input
devices.


As for b), the problem isn't with the modalias attribute itself, but
when the kernel tries to copy it into the environment block for the udev
callout.  The environment block is only a single page, and so limited to
4 KB.  And that's for everything else that goes into the environment,
not just the modalias.

Text-based sysfs attributes are limited to a page, but udev receives
uevents through netlink, not sysfs.

The current limit on the environment of a uevent appears to be 2 KB
(UEVENT_BUFFER_SIZE defined in ).  That seems like it
*might* be easier to change, so long as user-space doesn't have a
similar limit.

I looked into systemd/udev, and it seems to use an 8 KB buffer for
receiving uevents:

https://sources.debian.org/src/systemd/247.9-1/src/libsystemd/sd-device/device-monitor.c/?hl=390#L390 



But as a first step I think increasing the kernel buffer size to 4 KB
would be enough.  Perhaps someone could test whether this patch to the
domU kernel makes udev happier:

--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -30,7 +30,7 @@
 
 #define UEVENT_HELPER_PATH_LEN		256
 #define UEVENT_NUM_ENVP			64	/* number of env pointers */
-#define UEVENT_BUFFER_SIZE		2048	/* buffer for the variables */
+#define UEVENT_BUFFER_SIZE		4096	/* buffer for the variables */
 
 #ifdef CONFIG_UEVENT_HELPER
 /* path to the userspace helper executed on an event */
--- END ---

?

Ben.




I will try it in my bullseye Xen HVM DomU.

I am not sure how to rebuild the installation media with a patched
systemd, but I can patch my installed Xen HVM DomU system
with a patched systemd with the increased buffer size and see if the
Coldplug failure early in the boot process goes away. If so, then it
is likely this patch to systemd would also fix the installation media.

If it doesn't work, I am also willing to try approach a by patching
the Linux kernel xen-kbdfront driver by removing the for loops that
advertise those 654 keys. I tend to agree with Phillip that this is
totally unnecessary, but I suppose I could be wrong about that.
I read the discussion Phillip had with the Xen developers and they
seemed to want to keep the Xen keyboard driver as it is.

Chuck


The build failed with an error. I used the test-patches script to start
the build:

chuckz@debian:~/linuxdata/sources-bullseye/kernel/linux-5.10.46$ bash debian/bin/test-patches ../patch

with Ben's patch to UEVENT_BUFFER_SIZE in ../patch.

The build was running for over an hour and then failed with the last few
lines on the console as:

RT_SYMBOL
zl10039_attach    module: drivers/media/dvb-frontends/zl10039

Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-25 Thread Chuck Zmudzinski

On 8/24/2021 7:12 PM, Ben Hutchings wrote:

On Tue, Aug 24, 2021 at 03:27:19PM -0400, Phillip Susi wrote:

Ben Hutchings  writes:


I think a proper fix would be one of:

a. If the Xen virtual keyboard driver is advertising capabilities it
doesn't have, stop it doing that.
b. Change the implementation of modalias attributes to allow longer
values.

It's not clear to me whether the Xen driver is advertising correctly or
not.  If it is, then the solution should be b, but that may be too
disruptive a change to the kernel.  So a reasonable workaround might
be:

c. Change the input subsystem to limit the length of the
capabilities part of the modalias.

The problem with a) is that the Xen keyboard is not a physical keyboard
and so it has no way of knowing what keys it actually has.  It is a fake
input device designed to pass through whatever input the Xen hypervisor
sends down.  As such, any key could come in.  If it doesn't advertise
that it has all of these keys, then they would not be accepted by
libinput when the hypervisor sends them down.

Right, that's what I feared.

xen-kbdfront is setting the bits for keys in the ranges [KEY_ESC,
KEY_UNKNOWN) and [KEY_OK, KEY_MAX), which I think works out to 654
keys and 2362 bytes in the modalias.
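
The two figures above can be checked with a little arithmetic. The sketch
below is not kernel code: the four key-code constants are assumed to match
the kernel's input-event-codes.h, and it assumes that each advertised key is
written into the modalias as an uppercase hexadecimal number followed by a
comma. The function names are made up for the example.

```python
# Key codes assumed to match include/uapi/linux/input-event-codes.h.
KEY_ESC = 1
KEY_UNKNOWN = 240
KEY_OK = 0x160
KEY_MAX = 0x2FF


def advertised_keys():
    """Return the key codes covered by the two half-open ranges."""
    return list(range(KEY_ESC, KEY_UNKNOWN)) + list(range(KEY_OK, KEY_MAX))


def modalias_key_bytes(keys):
    """Bytes needed if every key is written as uppercase hex plus a comma."""
    return sum(len(f"{key:X},") for key in keys)


if __name__ == "__main__":
    keys = advertised_keys()
    print(len(keys), modalias_key_bytes(keys))
```

Under those assumptions the script prints 654 and 2362, which agrees with
the numbers quoted above.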


This seems to be the heart of the problem: libinput was designed
assuming that all keyboards can and must report what keys are actually
present, and then libinput tries to cram that information into the
modalias rather than some other sysfs attribute as it should ( or not at
all... I still don't see how this information is actually supposed to be
useful to userspace ).

I think modaliases aren't intended to be interpreted by user-space,
other than processing wildcards when matching to modules.

For input devices, the same information is available through other
variables in the uevent, in a more compact form.  The information *is*
useful for user-space; e.g. in initramfs-tools we recognise keyboard
devices and add their drivers to the initramfs but ignore other input
devices.


As for b), the problem isn't with the modalias attribute itself, but
when the kernel tries to copy it into the environment block for the udev
callout.  The environment block is only a single page, and so limited to
4 KB.  And that's for everything else that goes into the environment,
not just the modalias.

Text-based sysfs attributes are limited to a page, but udev receives
uevents through netlink, not sysfs.

The current limit on the environment of a uevent appears to be 2 KB
(UEVENT_BUFFER_SIZE defined in <linux/kobject.h>).  That seems like it
*might* be easier to change, so long as user-space doesn't have a
similar limit.
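
As a rough illustration of why the buffer size matters, the following sketch
adds up the space a set of uevent variables would need, assuming each one is
stored as KEY=value followed by a terminating NUL byte. The variable names
and values are invented for the example; only the 2048 and 4096 byte sizes
and the 2362 byte modalias estimate come from this discussion.

```python
def uevent_env_size(variables):
    """Total bytes for KEY=value strings, each followed by a NUL byte."""
    return sum(len(f"{key}={value}") + 1 for key, value in variables.items())


def fits(variables, buffer_size):
    """True if all variables fit into a buffer of buffer_size bytes."""
    return uevent_env_size(variables) <= buffer_size


if __name__ == "__main__":
    # Hypothetical environment: a short product string and a long modalias.
    env = {"PRODUCT": "0/0/0/0", "MODALIAS": "x" * 2362}
    for size in (2048, 4096):
        print(size, fits(env, size))
```

With a modalias of that length, the environment does not fit into 2048
bytes but does fit into 4096, which is the point of the proposed patch.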

I looked into systemd/udev, and it seems to use an 8 KB buffer for
receiving uevents:

https://sources.debian.org/src/systemd/247.9-1/src/libsystemd/sd-device/device-monitor.c/?hl=390#L390

But as a first step I think increasing the kernel buffer size to 4 KB
would be enough.  Perhaps someone could test whether this patch to the
domU kernel makes udev happier:

--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -30,7 +30,7 @@
 
 #define UEVENT_HELPER_PATH_LEN		256
 #define UEVENT_NUM_ENVP			64	/* number of env pointers */
-#define UEVENT_BUFFER_SIZE		2048	/* buffer for the variables */
+#define UEVENT_BUFFER_SIZE		4096	/* buffer for the variables */
 
 #ifdef CONFIG_UEVENT_HELPER
 /* path to the userspace helper executed on an event */
--- END ---

?

Ben.




I will try it in my bullseye Xen HVM DomU.

I am not sure how to rebuild the installation media with a patched
systemd, but I can patch my installed Xen HVM DomU system
with a patched systemd with the increased buffer size and see if the
Coldplug failure early in the boot process goes away. If so, then it
is likely this patch to systemd would also fix the installation media.

If it doesn't work, I am also willing to try approach a by patching
the Linux kernel xen-kbdfront driver by removing the for loops that
advertise those 654 keys. I tend to agree with Phillip that this is
totally unnecessary, but I suppose I could be wrong about that.
I read the discussion Phillip had with the Xen developers and they
seemed to want to keep the Xen keyboard driver as it is.

Chuck



Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-25 Thread Chuck Zmudzinski

On 8/25/2021 10:54 AM, Ben Hutchings wrote:

On Tue, 2021-08-24 at 15:19 -0400, Chuck Zmudzinski wrote:

On 8/24/2021 1:12 PM, Ben Hutchings wrote:

[...]


I think a proper fix would be one of:

a. If the Xen virtual keyboard driver is advertising capabilities it
 doesn't have, stop it doing that.
b. Change the implementation of modalias attributes to allow longer
 values.

It's not clear to me whether the Xen driver is advertising correctly or
not.  If it is, then the solution should be b, but that may be too
disruptive a change to the kernel.  So a reasonable workaround might
be:

c. Change the input subsystem to limit the length of the
 capabilities part of the modalias.


Ben.


So workaround c would not involve disruptions to the kernel or
systemd? Workaround c seems too disruptive for stable to me,
but maybe could go into unstable and eventually into testing.

I don't think it would be very disruptive.  It might require a kernel
ABI bump, but we do those regularly during a stable release.  And this
bug is severe enough that I think a fix would be suitable for Debian
stable.


A problem with the approach of fixing this bug in the Xen
keyboard driver is that the fix must be implemented in the underlying
Dom0 system, which could be almost anything - another Linux distro
or Debian stable or oldstable. Any fix upstream would probably get into
a bullseye Dom0, but not oldstable Dom0, but perhaps it could be
provided as a backport for anyone who is still on oldstable for their
Xen Dom0.

[...]

I agree that we need to fix this for domU independently of any protocol
change to allow discovery of which keys the underlying input device
has.  So we can't solve this with approach a.


Ben.



Actually, now I think my comments about approach a are wrong. I was thinking
the Linux kernel was reading the modalias of the Xen Virtual Keyboard
through some interface provided by Xen - the hypervisor or libxl or some
such component running in Dom0. After further investigation, now I think the
modalias of the Xen Virtual Keyboard is coming from here:

https://github.com/torvalds/linux/blob/6e764bcd1cf72a2846c0e53d3975a09b242c04c9/drivers/input/misc/xen-kbdfront.c#L257

This is the xen-kbdfront.c driver, which is part of the Linux kernel.

At line 257 of that driver, we have:

        for (i = KEY_ESC; i < KEY_UNKNOWN; i++)
            __set_bit(i, kbd->keybit);
        for (i = KEY_OK; i < KEY_MAX; i++)
            __set_bit(i, kbd->keybit);

This is advertising too many keys, making the modalias absurdly large.
The Xen virtual keyboard driver in the Linux kernel has been doing
this at least since 2011 when the Xen virtual keyboard driver was
moved to its current location in the Linux kernel source tree.

So this can probably be fixed in the Linux kernel without any patches
to the Xen hypervisor or libxl running in Dom0. Probably just
removing those two for loops would fix it.

Chuck



Bug#983357: Bug#988776: Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-24 Thread Chuck Zmudzinski

On 8/24/2021 1:12 PM, Ben Hutchings wrote:

On Tue, 2021-08-24 at 10:56 -0400, Chuck Zmudzinski wrote:

On 5/24/2021 3:30 AM, Michael Biebl wrote:

Hi Phillip

Am 24.05.2021 um 06:19 schrieb Cyril Brulebois:

trigger to cold plug all devices.  Both scripts are set -e.  The Xen
Virtual Keyboard driver and at least one other driver have always failed
to trigger due to having absurdly long modalias, but the error used to
be ignored.  The kernel now returns the error to udevadm

So this is a change in behaviour in the kernel?
What happens if you boot the installed system? Does udevadm trigger
fail there as well?

I feel a bit uneasy changing the udev start script this late in the
release cycle (especially when it appears like covering up an issue
someplace else).

I'll let Marco make the judgement on this though, as he has the most
experience with those udev udeb start scripts as the original author.

Michael


After reviewing Phillip's message at

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983357#43

which seems to point to the root cause of this bug, I can add:

On my Xen HVM DomU I see the absurdly long modalias for the Xen
Virtual keyboard that seems to be causing this crash in sysfs at

/sys/devices/virtual/input/input2/modalias

But at /sys/devices/vkbd-0/modalias, I see just 'xen:vkbd', which would
probably not result in an error in the udev script if this was also
written as the modalias at /sys/devices/virtual/input/input2/modalias

So the Xen virtual keyboard appears more than once in sysfs, and
modalias is not the same in the different places. This seems
to be a problem.

They are two different devices, and they should have different
modaliases.

Linux has code for discovering devices on each kind of bus, including
virtual buses, and that code creates "bus devices" such as vkbd-0.  At
this point the kernel doesn't know what the device is capable of.  The
modalias for a bus device carries some identifying information that can
be used to select a driver module for it.

The driver does know what the device is capable of, and how to use it.
It will normally create one or more "class devices" that support a
particular set of operations; in this case input device operations.
Class devices typically don't have modaliases, since they don't need
another layer of drivers on top.  However, for input devices the
modalias carries information about the device's capabilities.  These
may trigger loading of the evdev or joydev module.
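
To compare the modaliases of bus devices and class devices on a running
system, something like the following sketch could be used. It simply walks
a directory tree and records the length of every file named modalias. The
function names and the 2048 threshold are choices made for this example,
and /sys/devices is assumed to be where sysfs exposes the device tree.

```python
import os


def find_modaliases(root):
    """Map each modalias file under root to the length of its contents."""
    lengths = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if "modalias" in filenames:
            path = os.path.join(dirpath, "modalias")
            try:
                with open(path, "r", errors="replace") as handle:
                    lengths[path] = len(handle.read().strip())
            except OSError:
                # Some sysfs attributes cannot be read; skip them.
                continue
    return lengths


def longer_than(lengths, limit):
    """Return the paths whose modalias exceeds limit characters, sorted."""
    return sorted(path for path, size in lengths.items() if size > limit)


if __name__ == "__main__":
    found = find_modaliases("/sys/devices")
    for path in longer_than(found, 2048):
        print(found[path], path)
```

On a guest like the one described here, the short bus-device modalias
would be left out and only the long input-device modalias would be listed.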


I understand the correct way to fix this bug is by modifying the
Xen virtual keyboard (and any other devices that might cause
this crash) and not the start-udev script on the netinst
installation media, which is so far the only available workaround.
Hopefully Xen will accept a fix if we can come up with one.

[...]

I think a proper fix would be one of:

a. If the Xen virtual keyboard driver is advertising capabilities it
doesn't have, stop it doing that.
b. Change the implementation of modalias attributes to allow longer
values.

It's not clear to me whether the Xen driver is advertising correctly or
not.  If it is, then the solution should be b, but that may be too
disruptive a change to the kernel.  So a reasonable workaround might
be:

c. Change the input subsystem to limit the length of the
capabilities part of the modalias.


Ben.



So workaround c would not involve disruptions to the kernel or
systemd? Workaround c seems too disruptive for stable to me,
but maybe could go into unstable and eventually into testing.

A problem with the approach of fixing this bug in the Xen
keyboard driver is that the fix must be implemented in the underlying
Dom0 system, which could be almost anything - another Linux distro
or Debian stable or oldstable. Any fix upstream would probably get into
a bullseye Dom0, but not oldstable Dom0, but perhaps it could be
provided as a backport for anyone who is still on oldstable for their
Xen Dom0.

Anyway, I will look into the Xen virtual keyboard capabilities. The
only capability I can think of that would be useful in this context is that
it supports live migration of a VM through some sort of hot-swapping
capability. If it has that capability, a workaround to support it would be
good. But if it does not have that capability or if such a capability is
not needed for a keyboard, then it should probably stop advertising
itself as being able or needing to do that. Ultimately, it is up to Xen to
decide if they are going to make changes to its virtual keyboard.

Chuck



Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-24 Thread Chuck Zmudzinski

On 5/24/2021 3:30 AM, Michael Biebl wrote:

Hi Phillip

Am 24.05.2021 um 06:19 schrieb Cyril Brulebois:

trigger to cold plug all devices.  Both scripts are set -e.  The Xen
Virtual Keyboard driver and at least one other driver have always failed
to trigger due to having absurdly long modalias, but the error used to
be ignored.  The kernel now returns the error to udevadm


So this is a change in behaviour in the kernel?
What happens if you boot the installed system? Does udevadm trigger
fail there as well?

I feel a bit uneasy changing the udev start script this late in the
release cycle (especially when it appears like covering up an issue
someplace else).

I'll let Marco make the judgement on this though, as he has the most
experience with those udev udeb start scripts as the original author.


Michael



After reviewing Phillip's message at

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983357#43

which seems to point to the root cause of this bug, I can add:

On my Xen HVM DomU I see the absurdly long modalias for the Xen
Virtual keyboard that seems to be causing this crash in sysfs at

/sys/devices/virtual/input/input2/modalias

But at /sys/devices/vkbd-0/modalias, I see just 'xen:vkbd', which would
probably not result in an error in the udev script if this was also
written as the modalias at /sys/devices/virtual/input/input2/modalias

So the Xen virtual keyboard appears more than once in sysfs, and
modalias is not the same in the different places. This seems
to be a problem.

I understand the correct way to fix this bug is by modifying the
Xen virtual keyboard (and any other devices that might cause
this crash) and not the start-udev script on the netinst
installation media, which is so far the only available workaround.
Hopefully Xen will accept a fix if we can come up with one.

I am willing to try to debug this by testing patches to the Xen
virtual keyboard, and any tips on how udev works would be helpful.
Is there documentation in udev for device developers that explains
how to update old device drivers so they are compatible with the
modern version? Does the Xen virtual keyboard need to be
managed by udev? Is there a simple way to disable incompatible
devices so udev ignores them?

Chuck Zmudzinski



Bug#983357: Netinst crashes xen domU when loading kernel

2021-08-24 Thread Chuck Zmudzinski

On 5/25/2021 2:38 PM, Phillip Susi wrote:

Michael Biebl writes:


So this is a change in behaviour in the kernel?

Yes, this commit fixed the kernel to report the error instead of
silently failing:

commit df44b479654f62b478c18ee4d8bc4e9f897a9844
Author: Peter Rajnoha 
Date:   Wed Dec 5 12:27:44 2018 +0100

    kobject: return error code if writing /sys/.../uevent fails

    Propagate error code back to userspace if writing the /sys/.../uevent
    file fails. Before, the write operation always returned with success,
    even if we failed to recognize the input string or if we failed to
    generate the uevent itself.

    With the error codes properly propagated back to userspace, we are
    able to react in userspace accordingly by not assuming and awaiting
    a uevent that is not delivered.

    Signed-off-by: Peter Rajnoha
    Signed-off-by: Greg Kroah-Hartman


What happens if you boot the installed system? Does udevadm trigger fail
there as well?

Yes, it does; that is how I was able to track down the problem.


I feel a bit uneasy changing the udev start script this late in the
release cycle (especially when it appears like covering up an issue
someplace else).

I'll let Marco make the judgement on this though, as he has the most
experience with those udev udeb start scripts as the original author.

So far I have been removing the -e from the shebang line in the
start-udev script and remastering the iso so I can get it to boot.  It
would probably be a better idea to just add a || true to the udevadm
trigger call.  I feel fairly certain that no matter what the cause of
the coldplug failure, the user is going to be better off ignoring it and
trying to proceed than hitting a kernel panic.
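
The idea of tolerating a failed trigger can be sketched as follows. This is
not how start-udev or udevadm is implemented; it only assumes that coldplug
amounts to writing an action string to each device's uevent file, and the
function names are invented for the example. A failed write is recorded and
the loop carries on, rather than aborting the way a set -e script would.

```python
def trigger(uevent_path, action="add"):
    """Write action to a uevent file; return None on success or the error."""
    try:
        with open(uevent_path, "w") as handle:
            handle.write(action)
        return None
    except OSError as error:
        return error


def coldplug(uevent_paths):
    """Trigger every device and collect failures instead of aborting."""
    failures = {}
    for path in uevent_paths:
        error = trigger(path)
        if error is not None:
            failures[path] = error
    return failures
```

A caller could log the returned failures and continue booting, so a single
device with an oversized modalias would not stop the other devices from
being triggered.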



Hello,

This bug was noticed on the debian-user list recently, and I have
been testing various workarounds. Instead of removing -e from
the shebang line, I came up with prepending the udevadm trigger call
in the start-udev script with

dmesg | grep DMI: | grep 'Xen HVM domU' ||

This causes the offending udevadm trigger call to never be invoked
when running in a Xen HVM DomU. On all other systems, the call
should be invoked like normal. With this hack, I was able to create
a modified ISO and run the bullseye installer from it in a Xen HVM
DomU and complete an install without the crash and reboot.
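
The logic of that guard can be written out as a small sketch. It assumes
the kernel log is available as text; the sample line in the usage part is
made up for illustration, and the function names are not taken from any
existing script. As with the shell "||", the trigger runs only when the
check for a Xen HVM DomU fails.

```python
def is_xen_hvm_domu(dmesg_text):
    """True if a kernel log line containing 'DMI:' names a Xen HVM domU."""
    return any(
        "DMI:" in line and "Xen HVM domU" in line
        for line in dmesg_text.splitlines()
    )


def should_trigger(dmesg_text):
    """Mirror the shell '||': run the trigger only when the check fails."""
    return not is_xen_hvm_domu(dmesg_text)


if __name__ == "__main__":
    # Hypothetical kernel log line, for illustration only.
    sample = "[    0.000000] DMI: Xen HVM domU, BIOS (example)"
    print(should_trigger(sample))
```

The drawback is the same as for the shell version: inside such a guest no
device is coldplugged at all, not only the virtual keyboard.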

I also can confirm that I always see the coldplug failure on the installed
system in a Xen HVM DomU, but in that case the failure does not
cause a crash and the system boots normally after reporting the failure.

I also do not see the problem in a Xen PV DomU, which I think is
what the /install.amd/xen folder on the installation media is for.

Chuck Zmudzinski



Bug#990055: qemu-system-x86: Cannot set PCI slot 2 for Intel IGD Passthrough using Xen

2021-06-18 Thread Chuck Zmudzinski
Package: qemu-system-x86
Version: 6.0+dfsg-1~exp0 and 5.2+dfsg-10
Severity: normal
Tags: patch upstream

Dear Maintainer,

I find that when using qemu with a Windows Xen HVM DomU and also passing 
through the Intel integrated graphics device (IGD) to the Windows Xen HVM DomU, 
it is much more reliable if the Intel IGD is at PCI slot 2 in the HVM DomU. 
When using the ancient qemu-xen-traditional device model provided by the Xen 
project, the Intel IGD always grabs slot 2 when it is passed through to the Xen 
HVM DomU using the gfx_passthru option in xl.cfg, but not when using what the 
Xen project refers to as the upstream qemu device model, which is the device 
model provided by the qemu-system-x86 package. Intel says the IGD device needs 
to be at PCI slot 2, but it will be at a different slot when using the qemu
version provided by the qemu-system-x86 package.

One problem that occurs is that Windows sometimes reports code 43 errors in the 
Windows Device Manager when the Intel IGD is not set to PCI slot 2, and this 
prevents IGD passthrough from working because the Windows code 43 error causes 
Windows to disable the affected device. Other times, the screen is a little 
fuzzy at first but it usually clears up later.

I investigated and found out how the ancient qemu-xen-traditional model ensures
the IGD grabs PCI slot 2: it patches the hw/pci/pci.c file. However, the
patch in qemu-xen-traditional is not appropriate for this version of qemu
because, unlike qemu-xen-traditional, this version is designed to support more
configurations than just Xen.

I was able to develop a patch that causes the Intel IGD to grab slot 2 with 
these versions of qemu when qemu is running with xen as the accelerator and 
when using the xenlight (xl/libxl) toolstack to build the Xen HVM. The patch is 
designed to only affect Xen HVMs with IGD passthrough, that is, when using the 
xenlight toolstack and setting gfx_passthru to '1' or 'igd' in the Xen HVM 
DomU's xl.cfg file.

The following patch is for the 6.0+dfsg-1~exp0 package, but it also applies to
the 5.2+dfsg-10 package with some fuzz. I used it as the last patch in the
series of patches in debian/patches, and it works well. It uses CONFIG_DEVICES
so that it only compiles on platforms with CONFIG_XEN_IGD_PASSTHROUGH set by
the meson build system, it only takes effect at runtime if the gfx_passthru
option is set, and the whole patch is confined to the Xen part of the qemu
code.
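
The idea behind reserving a slot can be shown with a toy model. This is not
QEMU code and does not reproduce the patch. It assumes that reserved slots
are tracked as a bitmask with one bit per slot, and that automatic placement
picks the lowest slot that is neither occupied nor reserved. All names are
invented for the example.

```python
IGD_SLOT = 2  # slot the Intel IGD is supposed to occupy
NUM_SLOTS = 32  # a conventional PCI bus has 32 device slots


def reserve(mask, slot):
    """Return mask with the bit for slot set."""
    return mask | (1 << slot)


def release(mask, slot):
    """Return mask with the bit for slot cleared."""
    return mask & ~(1 << slot)


def first_free_slot(used, reserved_mask):
    """Lowest slot that is neither occupied nor reserved, or None."""
    for slot in range(NUM_SLOTS):
        if slot not in used and not reserved_mask & (1 << slot):
            return slot
    return None
```

With slots 0 and 1 occupied and slot 2 reserved, automatic placement in this
model yields slot 3. Once the reservation is released for the IGD, slot 2
becomes the lowest free slot again.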

-Start of Patch

--- a/hw/i386/xen/xen-hvm.c 2021-04-29 13:18:58.0 -0400
+++ b/hw/i386/xen/xen-hvm.c 2021-06-18 09:44:58.0 -0400
@@ -9,6 +9,7 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES
 #include "qemu/units.h"
 
 #include "cpu.h"
@@ -38,6 +39,11 @@
 #include 
 #include 
 
+#ifdef CONFIG_XEN_IGD_PASSTHROUGH
+#include "hw/pci/pci_bus.h"
+#include "hw/xen/xen_pt.h"
+#endif
+
 //#define DEBUG_XEN_HVM
 
 #ifdef DEBUG_XEN_HVM
@@ -1530,6 +1536,21 @@
 exit(1);
 }
 
+#ifdef CONFIG_XEN_IGD_PASSTHROUGH
+/* Reserve pci slot 2 for the Intel IGD */
+void xen_hvm_reserve_igd_slot(PCIBus *pci_bus)
+{
+DPRINTF("Checking if igd-passthrough is set...\n");
+if (xen_igd_gfx_pt_enabled()) {
+DPRINTF("Reserving PCI slot 0x02 for IGD...\n");
+pci_bus->slot_reserved_mask = XEN_IGD_PCI_SLOT;
+}
+else {
+DPRINTF("IGD passthrough is not set\n");
+}
+}
+#endif
+
 void destroy_hvm_domain(bool reboot)
 {
 xc_interface *xc_handle;
--- a/hw/i386/pc_piix.c 2021-06-18 09:39:56.0 -0400
+++ b/hw/i386/pc_piix.c 2021-06-18 09:49:15.0 -0400
@@ -208,6 +208,13 @@
   pci_memory, ram_memory);
 pcms->bus = pci_bus;
 
+#ifdef CONFIG_XEN_IGD_PASSTHROUGH
+/* This function checks if igd-passthru is enabled and 
+ * if so, reserve slot 2 for it on the PCI Bus */
+if (xen_enabled()) {
+xen_hvm_reserve_igd_slot(pci_bus);
+}
+#endif
 piix3 = piix3_create(pci_bus, &isa_bus);
 piix3->pic = x86ms->gsi;
 piix3_devfn = piix3->dev.devfn;
--- a/include/hw/xen/xen-x86.h  2021-04-29 13:18:58.0 -0400
+++ b/include/hw/xen/xen-x86.h  2021-06-18 09:54:05.0 -0400
@@ -12,4 +12,8 @@
 
 void xen_hvm_init_pc(PCMachineState *pcms, MemoryRegion **ram_memory);
 
+#ifdef CONFIG_XEN_IGD_PASSTHROUGH
+void xen_hvm_reserve_igd_slot(PCIBus *pci_bus);
+#endif
+
 #endif /* QEMU_HW_XEN_X86_H */
--- a/hw/xen/xen_pt.c   2021-04-29 13:18:58.0 -0400
+++ b/hw/xen/xen_pt.c   2021-06-18 10:07:42.0 -0400
@@ -53,6 +53,7 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES
 #include "qapi/error.h"
 #include 
 
@@ -65,6 +66,10 @@
 #include "xen_pt.h"
 #include "qemu/range.h"
 #include "exec/address-spaces.h"
+#ifdef CONFIG_XEN_IGD_PASSTHROUGH
+#include "hw/pci/pci_bus.h"
+static void xen_pt_clear_igd_slot(DeviceState *qdev, Error **errp);
+#endif
 
 static bool has_igd_gfx_passthru;
 
@@ 

Bug#988333: linux-image-5.10.0-6-amd64: VGA Intel IGD Passthrough to Debian Xen HVM DomUs not working, but Windows Xen HVMs do work

2021-05-10 Thread Chuck Zmudzinski
Package: src:linux
Version: 5.10.28-1
Severity: normal
Tags: upstream

Dear Maintainer,

I have been using Xen's PCI and VGA passthrough feature since wheezy and jessie 
were the stable versions, and back then both Windows HVMs and Linux HVMs would 
function with the Intel Integrated Graphics Device (IGD), the audio device, and 
the USB 3 controller passed to them. But with buster and bullseye running as 
the Dom0, I can only get the VGA/Passthrough feature to work with Windows Xen 
HVMs. I would expect both Windows and Linux HVMs to work comparably well. 


-- Package-specific info:
Linux version 5.10.0-6-amd64 (debian-ker...@lists.debian.org) (gcc-10 (Debian 
10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP 
Debian 5.10.28-1 (2021-04-09)

BOOT_IMAGE=/boot/vmlinuz-5.10.0-6-amd64 
root=UUID=332b3875-d57c-4083-9d46-3faa28d60691 ro xen-fbfront.video=24,1368,768 
quiet - this is what I have on the bullseye DomU.

On the Dom0, I have
BOOT_IMAGE=/boot/vmlinuz-5.10.0-6-amd64 root=/dev/debian/bullseye ro 
reboot=bios quiet console=tty1 console=hvc0
 
On Dom0, the Xen commandline and version (from xl dmesg):
dom0_mem=2G,max:2G smt=false pv-l1tf=false iommu=1 no-real-mode edd=off
Xen version 4.14.2-pre (Debian 4.14.1+11-gb0b734a8b3-1) 
(pkg-xen-de...@lists.alioth.debian.org) (x86_64-linux-gnu-gcc (Debian 10.2.1-6) 
10.2.1 20210110) debug=n  Sun Feb 28 18:49:45 UTC 2021
Bootloader: GRUB 2.02+dfsg1-20+deb10u2

kernel logs (problems reported in Dom0's syslog when trying to start this 
Debian bullseye Xen HVM DomU with Xen VGA/PCI  passthrough configured):

May  9 10:52:20 bullseye kernel: [0.00] Linux version 5.10.0-6-amd64 
(debian-ker...@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU 
ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.28-1 (2021-04-09)
May  9 10:52:20 bullseye kernel: [0.00] Command line: placeholder 
root=/dev/debian/bullseye ro reboot=bios quiet console=tty1 console=hvc0
.
.
.
Start a bullseye Xen HVM configured for PCI/VGA passthrough using the bullseye 
Xen and Qemu packages for bullseye on Dom0 (Haswell Intel IGD + audio device + 
USB 3.0 controller):

May 10 08:50:03 bullseye kernel: [79077.644346] pciback :00:1b.0: 
xen_pciback: vpci: assign to virtual slot 0
May 10 08:50:03 bullseye kernel: [79077.644478] pciback :00:1b.0: 
registering for 16
May 10 08:50:03 bullseye kernel: [79077.644732] pciback :00:14.0: 
xen_pciback: vpci: assign to virtual slot 1
May 10 08:50:03 bullseye kernel: [79077.644874] pciback :00:14.0: 
registering for 16
May 10 08:50:03 bullseye kernel: [79077.645024] pciback :00:02.0: 
xen_pciback: vpci: assign to virtual slot 2
May 10 08:50:03 bullseye kernel: [79077.645107] pciback :00:02.0: 
registering for 16
May 10 08:50:30 bullseye kernel: [79105.273876] vif vif-16-0 vif16.0: Guest Rx 
ready
May 10 08:50:30 bullseye kernel: [79105.273893] IPv6: ADDRCONF(NETDEV_CHANGE): 
vif16.0: link becomes ready
May 10 08:50:30 bullseye kernel: [79105.278023] xen-blkback: 
backend/vbd/16/51712: using 4 queues, protocol 1 (x86_64-abi) persistent grants
May 10 08:50:44 bullseye kernel: [79119.104937] irq 16: nobody cared (try 
booting with the "irqpoll" option)
May 10 08:50:44 bullseye kernel: [79119.104973] CPU: 0 PID: 0 Comm: swapper/0 
Not tainted 5.10.0-6-amd64 #1 Debian 5.10.28-1
May 10 08:50:44 bullseye kernel: [79119.104976] Hardware name: To Be Filled By 
O.E.M. To Be Filled By O.E.M./B85M Pro4, BIOS P2.50 12/11/2015
May 10 08:50:44 bullseye kernel: [79119.104979] Call Trace:
May 10 08:50:44 bullseye kernel: [79119.104984]  
May 10 08:50:44 bullseye kernel: [79119.104998]  dump_stack+0x6b/0x83
May 10 08:50:44 bullseye kernel: [79119.105008]  __report_bad_irq+0x35/0xa7
May 10 08:50:44 bullseye kernel: [79119.105014]  note_interrupt.cold+0xb/0x61
May 10 08:50:44 bullseye kernel: [79119.105024]  handle_irq_event+0xa8/0xb0
May 10 08:50:44 bullseye kernel: [79119.105030]  handle_fasteoi_irq+0x78/0x1c0
May 10 08:50:44 bullseye kernel: [79119.105037]  generic_handle_irq+0x47/0x50
May 10 08:50:44 bullseye kernel: [79119.105044]  
__evtchn_fifo_handle_events+0x175/0x190
May 10 08:50:44 bullseye kernel: [79119.105054]  
__xen_evtchn_do_upcall+0x66/0xb0
May 10 08:50:44 bullseye kernel: [79119.105063]  
__xen_pv_evtchn_do_upcall+0x11/0x20
May 10 08:50:44 bullseye kernel: [79119.105069]  asm_call_irq_on_stack+0x12/0x20
May 10 08:50:44 bullseye kernel: [79119.105072]  
May 10 08:50:44 bullseye kernel: [79119.105079]  
xen_pv_evtchn_do_upcall+0xa2/0xc0
May 10 08:50:44 bullseye kernel: [79119.105084]  
exc_xen_hypervisor_callback+0x8/0x10
May 10 08:50:44 bullseye kernel: [79119.105091] RIP: 
e030:xen_hypercall_sched_op+0xa/0x20
May 10 08:50:44 bullseye kernel: [79119.105097] Code: 51 41 53 b8 1c 00 00 00 
0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 
53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
cc cc cc cc
May 10 08:50:44 bullseye kernel: 

Bug#776742: Solved for jessie: Bug#776742: xen-utils-common: no support for VGA Passthrough

2017-02-23 Thread Chuck Zmudzinski
 in any of the binary packages 
produced by my builds.


6. I am willing to contribute a fix based on one of my nmu packages to 
the Debian project if the Debian Xen team is interested in adding it to 
the Debian packages archive, either in main, contrib, or non-free as 
appropriate. I am not including any debdiff patches here because they 
are too large.


Chuck Zmudzinski



Bug#776742: xen-utils-common: no support for VGA Passthrough

2017-02-12 Thread Chuck Zmudzinski

Here are my hardware specs:

Motherboard: ASRock B85M Pro4 LGA 1150 Intel B85 HDMI SATA 6Gb/s USB 3.0
Micro ATX Intel (about two and a half years old)

CPU: Intel Core i5-4590S (Haswell, 4th generation)

Chipset: Intel B85

Intel Integrated Graphics: HD4600. No external video card; I pass through
the Intel integrated graphics card to my DomUs.

Sound: Realtek integrated on the motherboard. This can also be passed
through to DomUs using PCI passthrough, as can the USB on the motherboard,
although I could not get the USB 3 ports to work with passthrough. The
USB 2 ports work fine in DomUs, though.

I also have 16 GB RAM and a 240 GB SSD, with a 1 TB HD. I bought the system
specifically because it had hardware specs that were known to support VGA
passthrough at that time. I don't know what the best options are now if you
are looking for hardware that supports VGA passthrough.

About Debian stretch:

As I said I have not tried it. But I think it is more likely to support
VGA passthrough out of the box than Jessie because the version of Xen on
stretch is 4.8, and version 4.8 supports VGA passthrough using either
the newer upstream qemu or the traditional qemu, but the version
of Xen on jessie is 4.4, and in that version VGA passthrough is only
supported using the older traditional version of qemu which is included in
Wheezy but is not available out of the box on jessie. Compare the man
page of xl.cfg on jessie with the man page of xl.cfg on stretch. Look at
what each says about supporting gfx passthrough:

On jessie: gfx_passthru is currently only supported with the qemu-xen-
traditional device-model. Upstream qemu-xen device-model currently does
not have support for gfx_passthru.

On stretch: gfx_passthru is currently supported both with the
qemu-xen-traditional device-model and upstream qemu-xen device-model.

The reason gfx passthrough does not work on jessie is that Debian took
the traditional qemu device model out of its xen package, but that
component is required for that version of Xen to function with VGA
passthrough. Unless Debian backports version 4.7 or 4.8 to jessie
so the upstream qemu device model will work with VGA passthrough,
or unless Debian provides a supported way to install the traditional qemu
device model on jessie, I don't think there will ever be a supported
configuration on jessie that supports VGA passthrough. I got VGA
passthrough working on my jessie system by hacking the xen source package
for jessie, but I don't think it is possible to get VGA passthrough
working with the current version of Debian's xen package for jessie, no
matter what hardware you have. You should try it on stretch and wheezy to
test your hardware for VGA passthrough functionality. Wheezy also has a
better chance of working, and it works on my system, but it is a
little flaky and I had to hold back upgrades of the hypervisor for it to
continue to work. Wheezy has the traditional qemu device model, whereas
jessie doesn't. For these reasons, you are better off trying wheezy
or stretch for VGA passthrough until Debian provides a solution for jessie.

Chuck

On 02/12/2017 01:37 PM, Juergen Schinker wrote:

can you add your hw specs and also what about Debian-Stretch?
J

- On 12 Feb, 2017, at 18:05, Chuck Zmudzinski brchu...@netscape.net wrote:


This bug, at its core, is that currently there is no supported solution
for VGA
passthrough on Xen for stable version Jessie from Debian.

After browsing Xen's repositories, I found out that Xen did not claim to
support VGA passthrough with the upstream qemu-xen device model until
Sep 25, 2015, the date the xl.cfg man page was updated to indicate support
for VGA passthrough with upstream qemu-xen. This change to the xl.cfg man
page was only made on the Xen version 4.7 and 4.8 branches, so if you want
to use VGA passthrough without the traditional qemu-dm binary, you must
upgrade to at least Xen version 4.7. Debian testing (currently stretch)
uses Xen 4.8 and it presumably supports VGA passthrough without
qemu-xen-traditional but I have not tried it.

This situation leaves users of Debian stable (currently Jessie) with no
supported solution from Debian for VGA passthrough on Xen. Obviously
there are two solutions: backport Xen 4.7 or greater to Jessie, or restore
the traditional qemu-dm binary to the Xen 4.4.x package for Jessie.

A couple of months ago I decided to try and rebuild the Xen source package
for Jessie with support for qemu-xen-traditional from upstream included.
It did not take long to get a working package that solves this bug. I
discovered the following facts:

1. Adding qemu-xen-traditional in a way supported by Xen also requires
rombios which, like qemu-xen-traditional, is disabled in Debian's official
build of Xen for Jessie.

2. After configuring the build for qemu-xen-traditional and rombios, the
only binary package that is modified significantly is xen-utils-4.4, which
is where the qemu-dm binary

Bug#776742: xen-utils-common: no support for VGA Passthrough

2017-02-12 Thread Chuck Zmudzinski
As far as I can tell, Xen still maintains the traditional qemu-dm, and I was
able to recently rebuild the xen package for Jessie and get VGA Passthrough
working on Jessie with the most recent version of the traditional device
model that is available from Xen for the stable 4.4 release of Xen.

So I ask again, why can't the Debian Xen team restore qemu-dm to its
official Xen package for Jessie?

The only plausible reasons I can think of are that there is some free
software licensing issue with the rombios modules that are statically
linked to hvmloader or with some necessary component of qemu-dm, or that
the Debian Xen team has too few resources and is devoting its efforts to
developing Xen for stretch rather than adding features that did not make
it into Jessie when it was released. But why should oldstable Wheezy have
a feature that stable Jessie does not have?

In any case, I hope the Debian Xen team can explain why qemu-dm cannot be
restored to Debian Jessie's official xen package.

Thank you for your consideration of my question.

Sincerely,

Chuck Zmudzinski