Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-24 Thread Mathias Gibbens
On Tue, 2023-09-19 at 07:17 +0200, Salvatore Bonaccorso wrote:
> On Sun, Sep 17, 2023 at 12:01:37PM +0530, intrigeri wrote:
> > In the last month or so, a number of people from various Debian teams
> > and other distributions have been tracking down a regression that
> > affects systems upgraded to Bookworm: services that use certain
> > systemd facilities such as PrivateNetwork=yes fail to start in LXC/LXD
> > containers. Among other things, this breaks the autopkgtests of many
> > packages, such as systemd, on ci.debian.net (#1050256). This was
> > tracked down to a kernel regression, for which a fix landed in Linux
> > 6.2:
> > 
> >   1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets
> > 
> > Work is ongoing to backport the fix to linux-stable/linux-6.1.y.
> > I'm Cc'ing John and Mathias who have been working on this.
> > 
> > FYI, ideally this would be fixed in the upcoming Bookworm
> > point-release (12.2, early October).
> 
> Thanks for the details. Has this already been sent to the stable
> maintainers? I do not see it yet on the stable list.

  I believe that John has been working on the fix for the 6.1 branch,
although I don't know what the status is. I don't have the necessary
familiarity with apparmor internals to attempt to backport the fix
myself, but I'll be very happy to test once it's available.

Mathias




Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-18 Thread Salvatore Bonaccorso
Control: tags -1 + confirmed moreinfo

Hi,

On Sun, Sep 17, 2023 at 12:01:37PM +0530, intrigeri wrote:
> Control: reassign -1 src:linux
> Control: retitle -1 AppArmor breaks locking non-fs Unix sockets
> Control: affects -1 src:apparmor src:lxc src:systemd src:pdns src:policykit-1
> Control: found -1 6.1.38-1
> Control: found -1 6.1.38-2
> Control: notfound -1 6.3.1-1~exp1
> 
> Hi Debian Kernel Team,
> 
> In the last month or so, a number of people from various Debian teams
> and other distributions have been tracking down a regression that
> affects systems upgraded to Bookworm: services that use certain
> systemd facilities such as PrivateNetwork=yes fail to start in LXC/LXD
> containers. Among other things, this breaks the autopkgtests of many
> packages, such as systemd, on ci.debian.net (#1050256). This was
> tracked down to a kernel regression, for which a fix landed in Linux
> 6.2:
> 
>   1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets
> 
> Work is ongoing to backport the fix to linux-stable/linux-6.1.y.
> I'm Cc'ing John and Mathias who have been working on this.
> 
> FYI, ideally this would be fixed in the upcoming Bookworm
> point-release (12.2, early October).

Thanks for the details. Has this already been sent to the stable
maintainers? I do not see it yet on the stable list.

Regards,
Salvatore



Bug#1050256: autopkgtest fails on debci

2023-09-18 Thread Paul Gevers

Hi all,

On 09-09-2023 13:06, Paul Gevers wrote:
> All ci.d.n workers (except riscv64) now run the kernel from
> bookworm-backports. systemd passes its autopkgtest again in unstable,
> testing and stable.


We're having issues [1] with the (backports and) unstable kernel on our 
main amd64 host, so we reverted back to the stable kernel for amd64.


Paul

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052130




Bug#1038315: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-17 Thread intrigeri
Dear lxd and systemd maintainers,

Michael Biebl (2023-09-11):
> When you do the reassignment, you should probably merge this bug report 
> with #1038315 and #1042880, now that we know what the root cause is.

FTR I did not dare merging these myself: perhaps you want to keep
separate bug reports to track workarounds on top of #1050256 that's
tracking the root cause, or something.

Cheers,
-- 
intrigeri



Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-17 Thread intrigeri
Control: reassign -1 src:linux
Control: retitle -1 AppArmor breaks locking non-fs Unix sockets
Control: affects -1 src:apparmor src:lxc src:systemd src:pdns src:policykit-1
Control: found -1 6.1.38-1
Control: found -1 6.1.38-2
Control: notfound -1 6.3.1-1~exp1

Hi Debian Kernel Team,

In the last month or so, a number of people from various Debian teams
and other distributions have been tracking down a regression that
affects systems upgraded to Bookworm: services that use certain
systemd facilities such as PrivateNetwork=yes fail to start in LXC/LXD
containers. Among other things, this breaks the autopkgtests of many
packages, such as systemd, on ci.debian.net (#1050256). This was
tracked down to a kernel regression, for which a fix landed in Linux
6.2:

  1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets

Work is ongoing to backport the fix to linux-stable/linux-6.1.y.
I'm Cc'ing John and Mathias who have been working on this.

FYI, ideally this would be fixed in the upcoming Bookworm
point-release (12.2, early October).

Current workarounds:

 - ci.debian.net was upgraded to the bookworm-backports kernel
 - various package maintainers have added workarounds such as disabling
   PrivateNetwork=yes for autopkgtests

Cheers,
-- 
intrigeri



Bug#1050256: autopkgtest fails on debci

2023-09-14 Thread Mathias Gibbens
On Mon, 2023-09-11 at 13:45 +0200, Michael Biebl wrote:
> On 09.09.23 at 14:20, intrigeri wrote:
> 
> > At this stage it seems clear that the bug and the corresponding
> > ideal fix are in the AppArmor part of src:linux, and the bug
> > affects at least src:apparmor and src:lxc. I'd like to reflect this
> > in the metadata of #1050256 by reassigning the bug to Linux, and
> > adding "affects" indications. I'll do so in the next few days
> > unless someone objects soon.
> 
> It also affects at least
> src:systemd, src:pdns, src:policykit-1
> All those packages have added workarounds for this issue.
> I'll revert the workaround in systemd and notify the maintainers of
> pdns and policykit-1.
> 
> > Doing so will also be an opportunity for me to sum up the problem
> > for the maintainers of src:linux, and let them know about our
> > desired timeline: ideally this would be fixed in the upcoming
> > Bookworm point-release.

  Not having heard any objections, please feel free to reassign this
bug. As you said, this will give the src:linux maintainers a heads up,
even if the patch isn't quite ready yet (but hopefully in time for the
12.2 point release).

Mathias




Bug#1050256: autopkgtest fails on debci

2023-09-14 Thread Mathias Gibbens
On Mon, 2023-09-04 at 12:39 -0700, John Johansen wrote:
> On 9/4/23 12:32, Michael Biebl wrote:
> > John, could you help with getting this fix into 6.1.x?
> 
> yes, I am working on a patch.

Hi John,

  I wanted to check in to see if you've had a chance to work on that
patch for the 6.1 kernel. The deadline for package updates being
included in the 12.2 point release is in roughly two weeks, but given
this will be a patch for the kernel I'd really like to have something
tested and handed over to the src:linux team well before then.

Thanks,
Mathias




Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-11 Thread Michael Biebl

Control: severity -1 important

On 09.09.23 at 14:20, intrigeri wrote:
> Hi again,
>
> Thank you all for working both on workarounds for Debian CI and on
> a proper upstream Linux kernel fix. Impressive cross-team work! :)

+1

> At this stage it seems clear that the bug and the corresponding ideal
> fix are in the AppArmor part of src:linux, and the bug affects at
> least src:apparmor and src:lxc. I'd like to reflect this in the
> metadata of #1050256 by reassigning the bug to Linux, and adding
> "affects" indications. I'll do so in the next few days unless someone
> objects soon.

It also affects at least
src:systemd, src:pdns, src:policykit-1
All those packages have added workarounds for this issue.
I'll revert the workaround in systemd and notify the maintainers of pdns
and policykit-1.

> Doing so will also be an opportunity for me to sum up the problem for
> the maintainers of src:linux, and let them know about our desired
> timeline: ideally this would be fixed in the upcoming Bookworm
> point-release.
>
> This being said, if said timeline can't be met in src:linux, it'll be
> up to the maintainers of LXC in Debian to decide what they want to do
> in the upcoming Bookworm point-release.
>
> If I misunderstood something important, please let me know.

Sounds good to me.

For now, given that all the debci hosts are running the backports
kernel, I'm downgrading the severity again.

When you do the reassignment, you should probably merge this bug report
with #1038315 and #1042880, now that we know what the root cause is.



Regards,
Michael




Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-09 Thread intrigeri
Hi again,

Thank you all for working both on workarounds for Debian CI and on
a proper upstream Linux kernel fix. Impressive cross-team work! :)

At this stage it seems clear that the bug and the corresponding ideal
fix are in the AppArmor part of src:linux, and the bug affects at
least src:apparmor and src:lxc. I'd like to reflect this in the
metadata of #1050256 by reassigning the bug to Linux, and adding
"affects" indications. I'll do so in the next few days unless someone
objects soon.

Doing so will also be an opportunity for me to sum up the problem for
the maintainers of src:linux, and let them know about our desired
timeline: ideally this would be fixed in the upcoming Bookworm
point-release.

This being said, if said timeline can't be met in src:linux, it'll be
up to the maintainers of LXC in Debian to decide what they want to do
in the upcoming Bookworm point-release.

If I misunderstood something important, please let me know.

Cheers,
-- 
intrigeri



Bug#1050256: autopkgtest fails on debci

2023-09-09 Thread Paul Gevers

Hi,

On 03-09-2023 10:50, Paul Gevers wrote:
> I have manually upgraded the s390x host and
> rebooted, so that can serve as a test arch.

All ci.d.n workers (except riscv64) now run the kernel from
bookworm-backports. systemd passes its autopkgtest again in unstable,
testing and stable.


Paul




Bug#1050256: autopkgtest fails on debci

2023-09-04 Thread John Johansen

On 9/4/23 12:32, Michael Biebl wrote:
> On 04.09.23 at 20:23, Mathias Gibbens wrote:
> > On Mon, 2023-09-04 at 01:00 -0700, John Johansen wrote:
> > > I took a quick look through v6.1..v6.3.1
> > >
> > > there is a patch that I think is the likely fix, it first landed in v6.2
> > >
> > > 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets
> >
> >   Thanks for the pointer John -- I think that is the fix we've been
> > looking for!
> >
> >   Commit 1cf26c3d2c4c doesn't apply cleanly to the v6.1 tree due to the
> > other commits from the patchset of Oct 3, 2022 that modified a bunch of
> > the apparmor code. Because I couldn't quickly cherry-pick all the
> > changes without amassing a large diff, I made the small
> > proof-of-concept patch at the end of this message and applied it to the
> > 6.1.38-4 kernel from bookworm. Booting with the patched kernel allows
> > services to start up in containers without any issues. :)
> >
> >   So, I think the next step should be to get that commit properly
> > backported to the v6.1 longterm tree and included in an upstream
> > release. Hopefully that would be able to happen in enough time so that
> > it is bundled with the kernel updates for bookworm's point release next
> > month. If not, we should be sure to get it into Debian's packaging so
> > at least there's a proper fix available.
>
> Thanks for the update Mathias, this looks very promising.
> A stable update of the Linux 6.1.x kernel would obviously be the ideal solution.
>
> John, could you help with getting this fix into 6.1.x?

yes, I am working on a patch.



Bug#1050256: autopkgtest fails on debci

2023-09-04 Thread Michael Biebl

On 04.09.23 at 20:23, Mathias Gibbens wrote:
> On Mon, 2023-09-04 at 01:00 -0700, John Johansen wrote:
> > I took a quick look through v6.1..v6.3.1
> >
> > there is a patch that I think is the likely fix, it first landed in v6.2
> >
> > 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets
>
>   Thanks for the pointer John -- I think that is the fix we've been
> looking for!
>
>   Commit 1cf26c3d2c4c doesn't apply cleanly to the v6.1 tree due to the
> other commits from the patchset of Oct 3, 2022 that modified a bunch of
> the apparmor code. Because I couldn't quickly cherry-pick all the
> changes without amassing a large diff, I made the small
> proof-of-concept patch at the end of this message and applied it to the
> 6.1.38-4 kernel from bookworm. Booting with the patched kernel allows
> services to start up in containers without any issues. :)
>
>   So, I think the next step should be to get that commit properly
> backported to the v6.1 longterm tree and included in an upstream
> release. Hopefully that would be able to happen in enough time so that
> it is bundled with the kernel updates for bookworm's point release next
> month. If not, we should be sure to get it into Debian's packaging so
> at least there's a proper fix available.

Thanks for the update Mathias, this looks very promising.
A stable update of the Linux 6.1.x kernel would obviously be the ideal
solution.

John, could you help with getting this fix into 6.1.x?

Regards,
Michael




Bug#1050256: autopkgtest fails on debci

2023-09-04 Thread Mathias Gibbens
On Mon, 2023-09-04 at 01:00 -0700, John Johansen wrote:
> I took a quick look through v6.1..v6.3.1
> 
> there is a patch that I think is the likely fix, it first landed in v6.2
> 
> 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets

  Thanks for the pointer John -- I think that is the fix we've been
looking for!

  Commit 1cf26c3d2c4c doesn't apply cleanly to the v6.1 tree due to the
other commits from the patchset of Oct 3, 2022 that modified a bunch of
the apparmor code. Because I couldn't quickly cherry-pick all the
changes without amassing a large diff, I made the small
proof-of-concept patch at the end of this message and applied it to the
6.1.38-4 kernel from bookworm. Booting with the patched kernel allows services
to start up in containers without any issues. :)

  So, I think the next step should be to get that commit properly
backported to the v6.1 longterm tree and included in an upstream
release. Hopefully that would be able to happen in enough time so that
it is bundled with the kernel updates for bookworm's point release next
month. If not, we should be sure to get it into Debian's packaging so
at least there's a proper fix available.

  I'm happy to help test any proposed patch for this fix on my end.

Mathias

-

> --- a/security/apparmor/lib.c 2023-09-04 16:08:28.818066140 +
> +++ b/security/apparmor/lib.c 2023-09-04 16:09:17.56661 +
> @@ -355,6 +355,9 @@
>   perms->allow |= map_other(dfa_other_allow(dfa, state));
>   perms->audit |= map_other(dfa_other_audit(dfa, state));
>   perms->quiet |= map_other(dfa_other_quiet(dfa, state));
> +
> + // For testing only!
> + perms->allow |= AA_MAY_LOCK;
>  }
>  
>  /**




Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-04 Thread Christian Boltz
Hello,

On Saturday, 2 September 2023, 01:13:11 CEST, Mathias Gibbens wrote:
>   A minimal reproducer is to install bookworm and create a container
> with a systemd service using a hardening option like
> PrivateNetwork=yes. With the latest bookworm kernel (6.1.38-4), the
> service will fail. But, grab a kernel from testing (6.4.11-1) and then
> things work -- with no other changes required. I tried the "oldest"
> kernel on snapshot.d.o post 6.1 series (6.3.1+1~exp1 [1]) and the
> service works properly with that version as well. So, something
> changed in the kernel (either upstream or in Debian's packaging)
> between 6.1 and  6.3 that "unbreaks" services within lxc containers.

I asked in #apparmor, and John answered

[11:04:33]  can someone have a look at 
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1050256 ? Short version: 
Debian gets  unix  denials when running lxc with kernel 6.1.38 from bookworm, 
but things work with kernel 6.3.1
[19:19:41]  cboltz: ok, I will try and look at it today
[07:00:34]  cboltz: I didn't see anything that would cause unix 
failures in a first pass. I will take another pass at it tomorrow
[10:01:30]  cboltz: commit 1cf26c3d2c4c apparmor: fix apparmor 
mediating locking non-fs unix sockets

So you could test if the bookworm kernel with 1cf26c3d2c4c applied on 
top fixes the issue.



To answer a question from a later mail:

On Sunday, 3 September 2023, 02:56:05 CEST, Michael Biebl wrote:
> I also tested downgrading apparmor to 2.13.6-10 (i.e. the version from
>  oldstable) on a bookworm system.
> 
> This was also sufficient to unbreak lxc.
> 
> So it "looks" like apparmor 3.x makes assumptions about the kernel
> that are not fulfilled by the kernel 6.1.x in bookworm.

The difference is in the abi levels - without an abi/ include specified, 
unix rules don't get enforced (= allow everything), while with abi/3.0
and AppArmor >= 3.x userspace, unix rules get enforced.

abi/3.0 got introduced in AppArmor 3.0, and my guess is that the abi/3.0
include was also added to the lxc profile.

Actually the explanation might be slightly different (same result, but 
without abi/3.0 in the lxc profile):

It looks like the Debian AppArmor maintainers pinned the abi to
/etc/apparmor.d/abi/kernel-5.4-outoftree-network
which, like abi/3.0, includes enforcing unix rules.

(Note: I'm only looking at https://salsa.debian.org/apparmor-team/apparmor.git/
since I don't have a Debian machine running.)

For completeness: 2.13.x doesn't support abi at all (besides ignoring
abi/* includes if it finds them in a profile) so even if you have a
profile with abi/3.0, unix rules won't be enforced.

There's an exception:  Ubuntu kernels carry some patches to enable unix 
and some other rules even with older AppArmor versions.
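
The enforcement conditions described above can be summarized in a small sketch. The function and parameter names are hypothetical, and the sketch only restates the explanation in this message: it ignores the Ubuntu kernel exception and does not model what the running kernel actually supports.

```python
def unix_rules_enforced(userspace_major: int, abi_with_unix_in_effect: bool) -> bool:
    """Rough model of when AppArmor unix rules get enforced.

    userspace_major: major version of the AppArmor userspace (2 for 2.13.x, 3 for 3.x).
    abi_with_unix_in_effect: True if an abi that includes unix mediation applies,
        either via an abi/ include in the profile (such as abi/3.0) or via a
        distribution-wide pinned abi file.
    """
    # 2.13.x has no abi support: abi/* includes are ignored, so unix rules
    # are not enforced regardless of what the profile says.
    if userspace_major < 3:
        return False
    # With 3.x userspace, enforcement follows the abi that is in effect.
    return abi_with_unix_in_effect


if __name__ == "__main__":
    print(unix_rules_enforced(2, True))   # oldstable apparmor 2.13.x
    print(unix_rules_enforced(3, True))   # 3.x with a pinned abi
```

This matches the observation that downgrading apparmor to 2.13.6-10 was enough to unbreak lxc: with that userspace the unix rules are simply not enforced.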


Regards,

Christian Boltz
-- 
in my experience it's safe to assume developers never test
[Stephan Kulow in opensuse-factory]




Bug#1050256: autopkgtest fails on debci

2023-09-04 Thread John Johansen

I took a quick look through v6.1..v6.3.1

there is a patch that I think is the likely fix, it first landed in v6.2

1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets

it matches up with the reported audit logs. Unfortunately it does not have a Fixes
tag, but as best I can figure it should be applied all the way back to:

56974a6fcfef apparmor: add base infastructure for socket mediation

how/where this bug surfaces partly depends on the userspace policy and
compiler, which combines the feature set supported by the kernel with what
the policy claims to support. So it is possible to have an affected kernel
but not trigger the bug.
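
As a rough triage aid, the version information in this thread can be turned into a small check. This is only a sketch under stated assumptions: the helper names are hypothetical, it compares the upstream major.minor against v6.2 (where the fix first landed), and it would misclassify any older stable kernel that later receives a backport of the fix. As noted above, a kernel without the fix does not necessarily trigger the bug.

```python
import re
from typing import Tuple

# Upstream series in which commit 1cf26c3d2c4c first landed, per this thread.
FIX_FIRST_LANDED = (6, 2)


def upstream_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a kernel or Debian package version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse kernel version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def predates_fix(version: str) -> bool:
    """True if the version is older than the series where the fix first landed.

    Assumption: no backport of the fix is present in older series.
    """
    return upstream_major_minor(version) < FIX_FIRST_LANDED


if __name__ == "__main__":
    for v in ("6.1.38-4", "6.3.1-1~exp1", "6.4.11-1"):
        print(v, predates_fix(v))
```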



Bug#1050256: autopkgtest fails on debci

2023-09-03 Thread Michael Biebl

On 03.09.23 at 10:50, Paul Gevers wrote:
> Hi,
>
> On 03-09-2023 02:56, Michael Biebl wrote:
> > Do the debci maintainers / lxc maintainers / release team have any
> > preference regarding a/, b/ and c/ ?
>
> One part of me likes the ci.d.n infrastructure to run stable as an
> example of "eat your own dogfood". Another part of me agrees with
> Antonio that it makes sense if it would run a backports kernel to be as
> close as possible to testing as we reasonably (maintenance-wise) can
> get. Because we have a known issue at hand, the balance goes to
> backports for me. If Antonio doesn't beat me to it, I'll get to it
> (although I don't know yet how to do that in our configuration [1] and
> exclude riscv64 too). I have manually upgraded the s390x host and
> rebooted, so that can serve as a test arch.


Seems it worked, the latest run succeeded:
https://ci.debian.net/data/autopkgtest/testing/s390x/s/systemd/37374052/log.gz

Thanks!






Bug#1050256: autopkgtest fails on debci

2023-09-03 Thread Paul Gevers

Hi,

On 03-09-2023 02:56, Michael Biebl wrote:
> My main concern is to "stop the bleeding" quickly, so to speak,
> especially/mainly for debci.

I agree with you, but also consider that with this issue being there
since ~ April 2023 we don't need to rush.

> I guess we have three options here:
> a/ upgrade the kernels to the one from backports as suggested by Antonio
> b/ disable apparmor confinement for lxc on debci via some debci specific
> configuration
> c/ disable apparmor confinement for lxc in bookworm via a stable upload
> of the lxc package
>
> That said, I would be fine with a/ and b/ as well, as this would buy us
> time to investigate this issue without being under the pressure of
> causing debci failures.

What I fear a bit is that if we do any of the three, Debian infra is
not affected anymore, which removes some incentive to find the root cause.

> Those debci failures are hard to debug and I would like to avoid having
> individual maintainers waste time on it.

a, b, or c means that Debian maintainers don't need to dive into it
anymore, but who knows which downstream project (volunteers or paid
alike) will need to look into the problem in the future if we don't fix
it inside packaging?

> Do the debci maintainers / lxc maintainers / release team have any
> preference regarding a/, b/ and c/ ?

One part of me likes the ci.d.n infrastructure to run stable as an
example of "eat your own dogfood". Another part of me agrees with
Antonio that it makes sense if it would run a backports kernel to be as
close as possible to testing as we reasonably (maintenance-wise) can
get. Because we have a known issue at hand, the balance goes to
backports for me. If Antonio doesn't beat me to it, I'll get to it
(although I don't know yet how to do that in our configuration [1] and
exclude riscv64 too). I have manually upgraded the s390x host and
rebooted, so that can serve as a test arch.


Paul

[1] https://salsa.debian.org/ci-team/debian-ci-config




Bug#1050256: autopkgtest fails on debci

2023-09-02 Thread Michael Biebl

Control: severity -1 serious

I'm tentatively raising this to RC, mainly to make this issue more 
visible for other maintainers.







Bug#1050256: autopkgtest fails on debci

2023-09-02 Thread Michael Biebl

Hi everyone

On 02.09.23 at 13:09, Antonio Terceiro wrote:
> On Fri, Sep 01, 2023 at 11:13:11PM +, Mathias Gibbens wrote:
> >   I don't think we have a good understanding of the root cause of this
> > issue. Initially we thought this was a known upstream issue with
> > all-but very recent versions of apparmor and a corresponding lxc profile
> > fix [0]. However, it appears this is a different issue that somehow
> > depends on the interaction of bookworm's versions of the kernel,
> > apparmor, and/or lxc.

Nod

> >   A minimal reproducer is to install bookworm and create a container
> > with a systemd service using a hardening option like
> > PrivateNetwork=yes. With the latest bookworm kernel (6.1.38-4), the
> > service will fail. But, grab a kernel from testing (6.4.11-1) and then
> > things work -- with no other changes required. I tried the "oldest"
> > kernel on snapshot.d.o post 6.1 series (6.3.1+1~exp1 [1]) and the
> > service works properly with that version as well. So, something changed
> > in the kernel (either upstream or in Debian's packaging) between 6.1
> > and 6.3 that "unbreaks" services within lxc containers.

Right, these are my findings as well.

I also tested downgrading apparmor to 2.13.6-10 (i.e. the version from
oldstable) on a bookworm system.

This was also sufficient to unbreak lxc.

So it "looks" like apparmor 3.x makes assumptions about the kernel that
are not fulfilled by the kernel 6.1.x in bookworm.

> >   Given that simply installing a newer kernel fixes things, I am
> > hesitant to start making changes to lxc until we actually understand
> > what's changed when running the newer kernel and how it's affecting
> > lxc's behavior.


My main concern is to "stop the bleeding" quickly, so to speak, 
especially/mainly for debci.


I guess we have three options here:
a/ upgrade the kernels to the one from backports as suggested by Antonio
b/ disable apparmor confinement for lxc on debci via some debci specific 
configuration
c/ disable apparmor confinement for lxc in bookworm via a stable upload 
of the lxc package



The MR I proposed is c/, as I don't know how to implement a/ or b/.

That said, I would be fine with a/ and b/ as well, as this would buy us 
time to investigate this issue without being under the pressure of 
causing debci failures.
Those debci failures are hard to debug and I would like to avoid having 
individual maintainers waste time on it.


Do the debci maintainers  / lxc maintainers / release team have any 
preference regarding a/, b/ and c/ ?



Michael





Bug#1050256: autopkgtest fails on debci

2023-09-02 Thread Antonio Terceiro
On Fri, Sep 01, 2023 at 11:13:11PM +, Mathias Gibbens wrote:
> Control: block 1038315 by -1
> Control: block 1042880 by -1
> 
>   I don't think we have a good understanding of the root cause of this
> issue. Initially we thought this was a known upstream issue with
> all-but very recent versions of apparmor and a corresponding lxc profile
> fix [0]. However, it appears this is a different issue that somehow
> depends on the interaction of bookworm's versions of the kernel,
> apparmor, and/or lxc.
> 
>   A minimal reproducer is to install bookworm and create a container
> with a systemd service using a hardening option like
> PrivateNetwork=yes. With the latest bookworm kernel (6.1.38-4), the
> service will fail. But, grab a kernel from testing (6.4.11-1) and then
> things work -- with no other changes required. I tried the "oldest"
> kernel on snapshot.d.o post 6.1 series (6.3.1+1~exp1 [1]) and the
> service works properly with that version as well. So, something changed
> in the kernel (either upstream or in Debian's packaging) between 6.1
> and 6.3 that "unbreaks" services within lxc containers.
> 
>   Given that simply installing a newer kernel fixes things, I am
> hesitant to start making changes to lxc until we actually understand
> what's changed when running the newer kernel and how it's affecting
> lxc's behavior.

Thanks for the investigation. This led me to think of something that would
work around this issue, but maybe has bigger consequences.

I'm wondering whether we should, as a policy, run backports kernels on
the ci.debian.net workers. Given the most important use case is testing
testing¹, having a kernel that is closest to the one in testing might
make sense.

¹ pun intended

Of course, this does not prevent having QEMU workers, and I want to
provide that at some point. But since we won't be able to have QEMU for
all architectures, anyway, I still think running backports kernels in
the lxc workers might be a valid strategy.




Bug#1050256: autopkgtest fails on debci

2023-09-01 Thread Mathias Gibbens
Control: block 1038315 by -1
Control: block 1042880 by -1

  I don't think we have a good understanding of the root cause of this
issue. Initially we thought this was a known upstream issue with
all-but very recent versions of apparmor and a corresponding lxc profile
fix [0]. However, it appears this is a different issue that somehow
depends on the interaction of bookworm's versions of the kernel,
apparmor, and/or lxc.

  A minimal reproducer is to install bookworm and create a container
with a systemd service using a hardening option like
PrivateNetwork=yes. With the latest bookworm kernel (6.1.38-4), the
service will fail. But, grab a kernel from testing (6.4.11-1) and then
things work -- with no other changes required. I tried the "oldest"
kernel on snapshot.d.o post 6.1 series (6.3.1+1~exp1 [1]) and the
service works properly with that version as well. So, something changed
in the kernel (either upstream or in Debian's packaging) between 6.1
and 6.3 that "unbreaks" services within lxc containers.
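
A minimal sketch of the kind of unit this reproducer relies on, assuming a throwaway oneshot service. The function name, the unit description and the ExecStart command are hypothetical and not taken from the bug report; the helper only renders the unit text and leaves installing and starting it inside the container to the reader.

```python
def render_reproducer_unit(option: str = "PrivateNetwork") -> str:
    """Return the text of a throwaway systemd unit with one hardening option set."""
    lines = [
        "[Unit]",
        "Description=Hypothetical reproducer for Debian bug #1050256",
        "",
        "[Service]",
        "Type=oneshot",
        # Any trivial command will do; the service is expected to fail before
        # or while setting up the sandbox on an affected system.
        "ExecStart=/bin/true",
        f"{option}=yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_reproducer_unit(), end="")
```

Passing "PrivateIPC" instead of the default renders the other option that was reported to break under lxc.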

  Given that simply installing a newer kernel fixes things, I am
hesitant to start making changes to lxc until we actually understand
what's changed when running the newer kernel and how it's affecting
lxc's behavior.

On Thu, 2023-08-31 at 19:54 +0200, Christian Boltz wrote:
> That said - the DENIED log entry translates to
> 
> unix send type=dgram,
> 
> You could try if adding this rule to the lxc-autopkgtest-lxc-iomhit_*
> profile helps (but if the issue is really on the kernel side, my
> hope is limited).

  I have tried tweaking the apparmor profile that's generated for
containers (the relevant part is defined in the variable
AA_PROFILE_UNIX_SOCKETS in src/lxc/lsm/apparmor.c), but haven't had any
success in a workaround. I am not super familiar with apparmor, so
maybe I'm not specifying things right, but I've previously tried the
sort of rules Christian suggested, none of which have had any effect.

On Fri, 2023-09-01 at 13:23 +0200, Michael Biebl wrote:
> The only way to fix the container was to use the aforementioned 
> `lxc.apparmor.profile = unconfined`.
> I think we should do that as the breakage is rather widespread and I 
> already see individual packages trying to work around that to at
> least keep debci afloat.

  I strongly dislike the idea of blanketly disabling apparmor profiles
by default for all lxc installs, since apparmor is one of the ways of
helping to ensure isolation of containers. For the specific instance of
debci, /etc/lxc/default.conf can be modified post-lxc install to change
lxc.apparmor.profile from "generated" to "unconfined" for the time
being.
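
The debci-specific change described above could be scripted roughly as follows. This is a sketch, not a tested deployment step: the helper name is hypothetical, the sample configuration in the usage part is an assumed example rather than the exact contents of Debian's /etc/lxc/default.conf, and the function works on text so that reading and writing the real file stays under the caller's control.

```python
import re

KEY = "lxc.apparmor.profile"


def set_apparmor_profile(conf_text: str, value: str) -> str:
    """Set lxc.apparmor.profile in an lxc config text, appending it if absent."""
    pattern = re.compile(r"^[ \t]*" + re.escape(KEY) + r"[ \t]*=.*$", re.MULTILINE)
    new_line = f"{KEY} = {value}"
    if pattern.search(conf_text):
        # Replace every existing assignment of the key.
        return pattern.sub(new_line, conf_text)
    # Key not present: append it, making sure it starts on its own line.
    separator = "" if conf_text == "" or conf_text.endswith("\n") else "\n"
    return conf_text + separator + new_line + "\n"


if __name__ == "__main__":
    # Assumed sample content, for illustration only.
    sample = (
        "lxc.net.0.type = veth\n"
        "lxc.apparmor.profile = generated\n"
        "lxc.apparmor.allow_nesting = 1\n"
    )
    print(set_apparmor_profile(sample, "unconfined"), end="")
```

Calling the same helper with "generated" would restore confinement once a fixed kernel is in place.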

Mathias

---

[0] -- https://github.com/lxc/lxc/issues/4333
[1] -- https://snapshot.debian.org/package/linux-signed-amd64/6.3.1%2B1~exp1/




Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-01 Thread Michael Biebl

On 01.09.23 at 13:23, Michael Biebl wrote:
> The only way to fix the container was to use the aforementioned
> `lxc.apparmor.profile = unconfined`.
> I think we should do that as the breakage is rather widespread and I
> already see individual packages trying to work around that to at least
> keep debci afloat.
>
> See e.g.:
> https://salsa.debian.org/systemd-team/systemd/-/merge_requests/211
> https://salsa.debian.org/debian/pdns/-/commit/637e54ef73386541086da430553b82db78266bac
>
> or disabling the systemd hardening options completely:
> https://salsa.debian.org/utopia-team/polkit/-/blob/master/debian/patches/debian/Don-t-use-PrivateNetwork-yes-for-the-systemd-unit.patch
>
> This is not a good outcome of this and the problem will become more
> apparent with debci running on bookworm now.




I went ahead and submitted
https://salsa.debian.org/lxc-team/lxc/-/merge_requests/18
since I don't see another solution atm.

Looping in the release team as well for their input.


Regards,
Michael




Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-09-01 Thread Michael Biebl

On 31.08.23 at 19:54, Christian Boltz wrote:
> Hello,
>
> On Thursday, 31 August 2023, 08:41:59 CEST, Michael Biebl wrote:
> > What we found so far is, that the AppArmor policy of lxc breaks any
> > systemd service using PrivateNetwork=yes or PrivateIPC=yes when being
> > run under lxc (running under bookworm using the bookworm kernel).
> > I wonder what the best course of action is here.
> > Should we disable the AA policy of lxc via a stable upload of the lxc
> > package until the root cause is found?
> >
> > Unfortunately I know too little about AppArmor and lxc's AppArmor
> > policy and my attempts to ask around for help weren't successful so
> > far.
>
> Two quick hints, but let me warn you that I'm not familiar with lxc and
> also didn't check the content of the lxc-autopkgtest-lxc-iomhit_*
> profile.
>
> https://github.com/lxc/lxc/issues/4333 indicates that this issue was
> fixed in a (much) newer kernel - but that's probably not news to you
> since you wrote that comment ;-)
>
> That said - the DENIED log entry translates to
>
>  unix send type=dgram,
>
> You could try if adding this rule to the lxc-autopkgtest-lxc-iomhit_*
> profile helps (but if the issue is really on the kernel side, my hope is
> limited).
>
> For testing, you could also try with a broader
>  unix send,
> or even
>  unix,
> rule - but please don't add these broader rules to the production
> profile.


I have no idea where to add that or what specific syntax I should use.
The profile above seems to be autogenerated and I only found a binary
file with that name in /var/cache/apparmor.


The only way to fix the container was to use the aforementioned 
`lxc.apparmor.profile = unconfined`.
I think we should do that as the breakage is rather widespread and I 
already see individual packages trying to work around that to at least 
keep debci afloat.


See e.g.:
https://salsa.debian.org/systemd-team/systemd/-/merge_requests/211
https://salsa.debian.org/debian/pdns/-/commit/637e54ef73386541086da430553b82db78266bac

or disabling the systemd hardening options completely:
https://salsa.debian.org/utopia-team/polkit/-/blob/master/debian/patches/debian/Don-t-use-PrivateNetwork-yes-for-the-systemd-unit.patch

This is not a good outcome, and the problem will become more
apparent with debci running on bookworm now.



Regards,
Michael





Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci

2023-08-31 Thread Christian Boltz
Hello,

On Thursday, 31 August 2023, 08:41:59 CEST, Michael Biebl wrote:
> What we found so far is, that the AppArmor policy of lxc breaks any 
> systemd service using PrivateNetwork=yes or PrivateIPC=yes when being
>  run under lxc (running under bookworm using the bookworm kernel). 
> I wonder what the best course of action is here.
> Should we disable the AA policy of lxc via a stable upload of the lxc
>  package until the root cause is found?
> 
> Unfortunately I know too little about AppArmor and lxc's AppArmor
> policy  and my attempts to ask around for help weren't successful so
> far. 

Two quick hints, but let me warn you that I'm not familiar with lxc and 
also didn't check the content of the lxc-autopkgtest-lxc-iomhit_* 
profile.

https://github.com/lxc/lxc/issues/4333 indicates that this issue was
fixed in a (much) newer kernel - but that's probably not news to you
since you wrote that comment ;-)


That said - the DENIED log entry translates to

  unix send type=dgram,

You could check whether adding this rule to the
lxc-autopkgtest-lxc-iomhit_* profile helps (but if the issue is really
on the kernel side, my hope is limited).

For testing, you could also try a broader
  unix send,
or even
  unix,
rule - but please don't add these broader rules to the production
profile.
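
The translation from a DENIED entry to a candidate rule can be sketched
roughly as follows. This is only an illustration: the helper name
denial_to_rule is made up, it handles only family="unix" denials, and it
reads only the quoted key="value" fields of the audit line (family,
sock_type, requested_mask). The sample line is the journal entry from
this bug report, joined onto one line.

```python
import re
from typing import Optional


def denial_to_rule(log_line: str) -> Optional[str]:
    """Build a candidate AppArmor unix rule from one DENIED audit line."""
    # Collect the quoted key="value" pairs; unquoted ones (pid, protocol) are ignored.
    fields = dict(re.findall(r'(\w+)="([^"]*)"', log_line))
    if fields.get("apparmor") != "DENIED" or fields.get("family") != "unix":
        return None
    parts = ["unix"]
    perms = fields.get("requested_mask", "").split()
    if len(perms) == 1:
        parts.append(perms[0])
    elif perms:
        # Several permissions are written as a parenthesised list.
        parts.append("(" + ", ".join(perms) + ")")
    if fields.get("sock_type"):
        parts.append("type=" + fields["sock_type"])
    return " ".join(parts) + ","


entry = (
    'Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED" '
    'operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 '
    'comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 '
    'requested_mask="send"'
)
print(denial_to_rule(entry))
```

Any rule produced this way is a starting point for testing only; whether
it helps at all depends on whether the denial really comes from the
policy or from the kernel.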


Regards,

Christian Boltz
-- 
you need a certificate, nobody knows how to do that securely (including
the CAs ;-) [Bernd Paysan, https://bugs.kde.org/show_bug.cgi?id=131083]




Bug#1050256: autopkgtest fails on debci

2023-08-31 Thread Daniel Scharon
Hello everyone,

On Thu, 2023-08-31 at 08:55 +0200, Michael Biebl wrote:
> > 
> > What we found so far is, that the AppArmor policy of lxc breaks any
> > systemd service using PrivateNetwork=yes or PrivateIPC=yes when
> > being 
> > run under lxc (running under bookworm using the bookworm kernel).
> 
> 
> I.e. by setting `lxc.apparmor.profile = unconfined` in 
> /etc/lxc/default.conf and regenerating the autopkgtest container on 
> bookworm, the failures are gone.
> 


The same applies to systemd services using DynamicUser=yes.
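
To find out which units on a system might be hit, something along these
lines could help. It is a rough sketch: the directive list contains only
the three options reported in this bug so far and is probably not
exhaustive, the function name and the sample unit are hypothetical, and
drop-in files and other systemd merging rules are not handled.

```python
from typing import Dict, List

# Only the options named in this bug report; likely not a complete list.
SANDBOX_DIRECTIVES = ("PrivateNetwork", "PrivateIPC", "DynamicUser")
# Spellings that systemd accepts for a true boolean.
TRUE_VALUES = {"1", "yes", "true", "on"}


def enabled_sandbox_directives(unit_text: str) -> List[str]:
    """Return the listed directives that the unit text enables."""
    state: Dict[str, bool] = {}
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;" or "=" not in line:
            continue  # skip blanks, comments and section headers
        key, _, value = line.partition("=")
        key = key.strip()
        if key in SANDBOX_DIRECTIVES:
            # A later assignment overrides an earlier one.
            state[key] = value.strip().lower() in TRUE_VALUES
    return [name for name in SANDBOX_DIRECTIVES if state.get(name)]


# Hypothetical unit, for illustration only.
unit = """[Service]
ExecStart=/usr/bin/example-daemon
PrivateNetwork=yes
# DynamicUser=yes
ProtectHome=true
"""
print(enabled_sandbox_directives(unit))
```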

Kind regards,
Dan





Bug#1050256: autopkgtest fails on debci

2023-08-31 Thread Michael Biebl

On 2023-08-31 at 08:41, Michael Biebl wrote:
> On Tue, 22 Aug 2023 16:08:24 +0200 Michael Biebl wrote:
> > Source: systemd
> > Version: 254.1-2
> > Severity: important
> >
> > Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ ,
> > systemd has been failing on debci since about the beginning of May.
> >
> > Asking around on #debci, this might be kernel related, as the debci
> > related systems were upgraded to bookworm around that time.
>
> What we found so far is that the AppArmor policy of lxc breaks any
> systemd service using PrivateNetwork=yes or PrivateIPC=yes when being
> run under lxc (running under bookworm using the bookworm kernel).
>
> I wonder what the best course of action is here.
> Should we disable the AA policy of lxc via a stable upload of the lxc
> package until the root cause is found?
>
> Unfortunately I know too little about AppArmor and lxc's AppArmor policy
> and my attempts to ask around for help weren't successful so far.





I.e. by setting `lxc.apparmor.profile = unconfined` in 
/etc/lxc/default.conf and regenerating the autopkgtest container on 
bookworm, the failures are gone.







Bug#1050256: autopkgtest fails on debci

2023-08-31 Thread Michael Biebl

On Tue, 22 Aug 2023 16:08:24 +0200 Michael Biebl wrote:
> Source: systemd
> Version: 254.1-2
> Severity: important
>
> Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ ,
> systemd has been failing on debci since about the beginning of May.
>
> Asking around on #debci, this might be kernel related, as the debci
> related systems were upgraded to bookworm around that time.




What we have found so far is that the AppArmor policy of lxc breaks any
systemd service using PrivateNetwork=yes or PrivateIPC=yes when being
run under lxc (running under bookworm using the bookworm kernel).


I wonder what the best course of action is here.
Should we disable the AA policy of lxc via a stable upload of the lxc 
package until the root cause is found?


Unfortunately I know too little about AppArmor and lxc's AppArmor policy 
and my attempts to ask around for help weren't successful so far.




Regards,
Michael







Bug#1050256: autopkgtest fails on debci

2023-08-24 Thread Michael Biebl

On 2023-08-23 at 14:32, Michael Biebl wrote:
> I see the following error in the journal:
>
> Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED"
> operation="file_lock"
> profile="lxc-autopkgtest-lxc-iomhit_" pid=4096
> comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0
> requested_mask="send"
> Aug 23 14:23:50 debian kernel: audit: type=1400
> audit(1692793430.788:33): apparmor="DENIED" operation="file_lock"
> profile="lxc-autopkgtest-lxc-iomhit_" pid=4096
> comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0
> requested_mask="send"
> Aug 23 14:23:50 debian kernel: audit: type=1400
> audit(1692793430.788:34): apparmor="DENIED" operation="file_lock"
> profile="lxc-autopkgtest-lxc-iomhit_" pid=4096
> comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0
> requested_mask="send"
> Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED"
> operation="file_lock"
> profile="lxc-autopkgtest-lxc-iomhit_" pid=4096
> comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0
> requested_mask="send"
>
> With the 6.4 kernel, no such error happens.
>
> So, this looks to me like an AppArmor issue, thus reassigning to the
> apparmor package.




It appears this was already reported separately as
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1038315
and the corresponding upstream bug
https://github.com/lxc/lxc/issues/4333

Apparently any service using PrivateNetwork=yes and running inside lxc
will trigger this AppArmor violation.







Bug#1050256: autopkgtest fails on debci

2023-08-23 Thread Michael Biebl

Control: reassign -1 apparmor
Control: affects -1 src:systemd
Control: retitle -1 apparmor makes systemd autopkgtests fail on bookworm
Control: found -1 3.0.8-3

The plot thickens...

On 2023-08-23 at 13:20, Michael Biebl wrote:
> On Tue, 22 Aug 2023 16:08:24 +0200 Michael Biebl wrote:
> > Source: systemd
> > Version: 254.1-2
> > Severity: important
> >
> > Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ ,
> > systemd has been failing on debci since about the beginning of May.
> >
> > Asking around on #debci, this might be kernel related, as the debci
> > related systems were upgraded to bookworm around that time.
>
> Small update:
> I can reproduce the failures in a bookworm (qemu) VM, using LXC.
> Upgrading only the kernel to the one from trixie [1] is sufficient to
> make autopkgtest pass.




... so does disabling AppArmor with the bookworm kernel.

For completeness' sake, the failing tests are:

# autopkgtest systemd -- lxc autopkgtest-bookworm

784s hostnamed            FAIL non-zero exit status 1
784s localed-locale       FAIL non-zero exit status 1
784s localed-x11-keymap   FAIL non-zero exit status 1
784s networkd-test.py     FAIL non-zero exit status 1
784s boot-and-services    FAIL non-zero exit status 1
784s unit-tests           FAIL non-zero exit status 1


# autopkgtest systemd -- lxc autopkgtest-trixie

782s hostnamed            FAIL non-zero exit status 1
782s localed-locale       FAIL non-zero exit status 1
782s networkd-test.py     FAIL non-zero exit status 1
782s boot-and-services    FAIL non-zero exit status 1
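
To compare the two runs, the summaries can be reduced to sets of failing
test names. The snippet below is a sketch that assumes whitespace-separated
summary lines of the form "<time>s <test name> <result> <details>"; the
function name failing_tests is made up, and the input is retyped from the
results listed in this message.

```python
import re

# Optional "<seconds>s" prefix, then the test name, then the result keyword.
SUMMARY_RE = re.compile(r"^(?:\d+s\s+)?(\S+)\s+(PASS|FAIL|SKIP|FLAKY)\b")


def failing_tests(summary: str) -> set:
    """Return the names of all tests reported as FAIL in a summary."""
    failed = set()
    for line in summary.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match and match.group(2) == "FAIL":
            failed.add(match.group(1))
    return failed


bookworm = """\
784s hostnamed            FAIL non-zero exit status 1
784s localed-locale       FAIL non-zero exit status 1
784s localed-x11-keymap   FAIL non-zero exit status 1
784s networkd-test.py     FAIL non-zero exit status 1
784s boot-and-services    FAIL non-zero exit status 1
784s unit-tests           FAIL non-zero exit status 1
"""
trixie = """\
782s hostnamed            FAIL non-zero exit status 1
782s localed-locale       FAIL non-zero exit status 1
782s networkd-test.py     FAIL non-zero exit status 1
782s boot-and-services    FAIL non-zero exit status 1
"""

# Tests that fail in the bookworm container but not in the trixie one.
print(sorted(failing_tests(bookworm) - failing_tests(trixie)))
# Tests that fail in both containers.
print(sorted(failing_tests(bookworm) & failing_tests(trixie)))
```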


Running e.g.
# autopkgtest --test-name=hostnamed systemd -- lxc autopkgtest-trixie

I see the following error in the journal:

Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
Aug 23 14:23:50 debian kernel: audit: type=1400 audit(1692793430.788:33): apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
Aug 23 14:23:50 debian kernel: audit: type=1400 audit(1692793430.788:34): apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"




With the 6.4 kernel, no such error happens.

So, this looks to me like an AppArmor issue, thus reassigning to the 
apparmor package.



Dear AppArmor maintainers: can you please have a look? If you need 
further information, please let me know.


Regards,
Michael




Bug#1050256: autopkgtest fails on debci

2023-08-23 Thread Michael Biebl

On Tue, 22 Aug 2023 16:08:24 +0200 Michael Biebl wrote:
> Source: systemd
> Version: 254.1-2
> Severity: important
>
> Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ ,
> systemd has been failing on debci since about the beginning of May.
>
> Asking around on #debci, this might be kernel related, as the debci
> related systems were upgraded to bookworm around that time.





Small update:
I can reproduce the failures in a bookworm (qemu) VM, using LXC.
Upgrading only the kernel to the one from trixie [1] is sufficient to
make autopkgtest pass.




[1] 6.4.0-2-amd64




Bug#1050256: autopkgtest fails on debci

2023-08-22 Thread Michael Biebl
Source: systemd
Version: 254.1-2
Severity: important


Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ ,
systemd has been failing on debci since about the beginning of May.

According to people on #debci, this might be kernel-related, as the
debci-related systems were upgraded to bookworm around that time.