Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-26 Thread Luca Boccassi
> On Tue, Apr 13, 2021 at 8:09 AM Zbigniew Jędrzejewski-Szmek
>  
> The new metadata guarantees that the ELF data churns, though. For
> example, if I bump the Release in a spec file for something unrelated
> to the build, all the ELF blobs change. The current state means that
> this is deduplicated in RPM CoW and a very cheap upgrade, since the
> binaries weren't all touched.

The content of the key:value pairs is entirely under your control though - if 
you don't want to include the spec release field because of the case where that 
would be the only change to the binary, then don't.
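
To make that concrete, here is a minimal sketch (not the Fedora tooling) of how the choice of fields decides whether a release-only bump changes the embedded bytes. The field names loosely follow the package metadata spec as I understand it; the package name, version and release strings are made up, and folding the release into the version string is only one possible convention.

```python
import hashlib
import json


def note_payload(name, version, release=None):
    """Serialize package metadata as JSON; the release field is optional."""
    fields = {"type": "rpm", "name": name, "version": version}
    if release is not None:
        # Assumed convention for this sketch: fold the release into the version.
        fields["version"] = f"{version}-{release}"
    return json.dumps(fields, sort_keys=True).encode()


def digest(payload):
    return hashlib.sha256(payload).hexdigest()


# With the release included, a release-only bump changes the payload ...
with_release = {digest(note_payload("hello", "1.0", r)) for r in ("1.fc35", "2.fc35")}
# ... without it, both builds embed identical bytes.
without_release = {digest(note_payload("hello", "1.0")) for _ in range(2)}
print(len(with_release), len(without_release))
```

The trade-off is that a payload without the release can no longer tell two rebuilds of the same version apart.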
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-15 Thread Lennart Poettering
On Do, 15.04.21 10:20, Luca Boccassi (bl...@debian.org) wrote:

> > I'm confused about this - I had put forth an idea for how to make rpm
> > create this when installing packages (so it works with older or third
> > party packages) but the same xattr could be created for any packaging
> > system. Can you clarify what is rpm dependent here?
> >
> > Matthew.
>
> Hi,
>
> There are a few issues with using xattr, some minor and one major.
> The minor issue is that it's really not great when you are shipping
> stuff around - the source/transport/medium/archiving format might or
> might not support it. Having had to deal with this for cross-building
> Linux binaries from Windows with SELinux labels, I can assure you it's a
> massive headache I'd rather not replicate :-)

I think this might not just be a minor issue btw. One of the main
goals of this feature is to make coredumps reasonably useful when they
originate from a binary shipped as container image. But do all popular
container envs even ship xattrs in their deployment images? I mean,
it's an optional tar feature, and do they all enable it? iirc original
"aufs" backed Docker didn't support xattrs, simply because aufs
didn't. I figure that leaked into all later versions, too, no?
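
The "optional tar feature" part can be illustrated with Python's tarfile module. This is only a sketch: it stores an xattr as a pax extended-header record under the `SCHILY.xattr.` keyword prefix used by GNU tar and star, and shows that the plain ustar format has nowhere to put it. The attribute name and value are hypothetical, and whether a given container runtime applies such records when unpacking a layer is a separate question this does not answer.

```python
import io
import tarfile

# Hypothetical attribute name and value, for illustration only.
XATTR_KEY = "SCHILY.xattr.user.package"
XATTR_VALUE = '{"name":"hello"}'


def roundtrip(fmt):
    """Write one member carrying an xattr record, then read its pax headers back."""
    data = b"\x7fELF placeholder"
    info = tarfile.TarInfo("usr/bin/hello")
    info.size = len(data)
    info.pax_headers = {XATTR_KEY: XATTR_VALUE}

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        tar.addfile(info, io.BytesIO(data))

    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getmembers()[0].pax_headers.get(XATTR_KEY)


pax_result = roundtrip(tarfile.PAX_FORMAT)      # record survives
ustar_result = roundtrip(tarfile.USTAR_FORMAT)  # record is silently dropped
print(pax_result, ustar_result)
```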

Lennart

--
Lennart Poettering, Berlin


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-15 Thread Luca Boccassi
> On Wed, 2021-04-14 at 15:29 +, Zbigniew Jędrzejewski-Szmek wrote:
> 
> That's fair - if it were possible to get an fd during dump, we could
> use fgetxattr. If not, we can use /proc/$pid/exe - even when deleted
> you can interact with it:
> 
> [malmond@malmond-x1 ~]$ ls -l /proc/$$/exe
> lrwxrwxrwx. 1 malmond malmond 0 Apr 14 15:45 /proc/364665/exe ->
> '/home/malmond/testbash (deleted)'
> [malmond@malmond-x1 ~]$ attr -l /proc/$$/exe
> Attribute "selinux" has a 54 byte value for /proc/364665/exe
> 
> (this is me copying bash, executing it, then deleting it). My thinking
> is this could go in systemd-coredump as it's invoked when dumping core
> anyway. Libraries are accessible from /proc/$pid/map_files/$range.
> 
> 
> I'm confused about this - I had put forth an idea for how to make rpm
> create this when installing packages (so it works with older or third
> party packages) but the same xattr could be created for any packaging
> system. Can you clarify what is rpm dependent here?
> 
> Matthew.

Hi,

There are a few issues with using xattr, some minor and one major.
The minor issue is that it's really not great when you are shipping stuff
around - the source/transport/medium/archiving format might or might not
support it. Having had to deal with this for cross-building Linux binaries from
Windows with SELinux labels, I can assure you it's a massive headache I'd rather
not replicate :-)

The major issue though is a different one, if related: one of the central, 
nicest capabilities of this proposal is that the ELF note gets _automatically_ 
included in the generated core file. No change required anywhere for that to 
happen.
This means if you are running a fleet of headless systems, and you collect
corefiles on each node and ship them off for offline analysis, all the metadata
is nicely included, automagically. That's because the metadata is a property of
the binary, assigned at the source, not a property of the system it runs on,
applied at the destination.
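
As a sketch of what "a property of the binary" means in practice: an ELF note is a small self-describing record (name size, descriptor size, type, then padded name and descriptor), so any tool that can walk the note records of a binary or a core file can pick the package metadata out. The code below builds two notes in memory and parses them back. The owner string "FDO" and the type value 0xcafe1a7e are what I recall from the package metadata spec and should be treated as assumptions; the JSON content is made up, and this is not the systemd-coredump implementation.

```python
import json
import struct

NOTE_OWNER = b"FDO\0"      # assumed owner name, NUL-terminated
NOTE_TYPE = 0xCAFE1A7E     # assumed note type for package metadata


def pad4(data):
    """Pad to a 4-byte boundary, as ELF notes require."""
    return data + b"\0" * (-len(data) % 4)


def encode_note(owner, note_type, desc):
    header = struct.pack("<III", len(owner), len(desc), note_type)
    return header + pad4(owner) + pad4(desc)


def parse_notes(blob):
    """Yield (owner, type, desc) for each note in a little-endian note segment."""
    offset = 0
    while offset + 12 <= len(blob):
        namesz, descsz, note_type = struct.unpack_from("<III", blob, offset)
        offset += 12
        owner = blob[offset:offset + namesz]
        offset += namesz + (-namesz % 4)
        desc = blob[offset:offset + descsz]
        offset += descsz + (-descsz % 4)
        yield owner, note_type, desc


payload = json.dumps({"type": "rpm", "name": "hello", "version": "1.0-1"}).encode()
# A build-id style note (owner "GNU", type 3) followed by the package note.
segment = encode_note(b"GNU\0", 3, bytes(20)) + encode_note(NOTE_OWNER, NOTE_TYPE, payload)

found = [json.loads(d) for o, t, d in parse_notes(segment)
         if o == NOTE_OWNER and t == NOTE_TYPE]
print(found)
```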

If you use xattrs though, the metadata suddenly becomes a property of the
system the binary happens to run on. The core file won't have it. The
only way to see it is to have access to the system. It's no longer nicely
self-contained and "replicating".

Also you are no longer guaranteed to _always_ have the metadata available as 
long as you add it at build time - suddenly, if your binary ends up installed 
on the 'wrong' system/container/whatever, because they are too old or don't 
have a package manager or whatever else, your metadata is gone.

And just to clarify, these use cases are not theoretical - they are very much 
real and already working. And these are the main reasons the team that handles 
crashdumps at Microsoft suggested adding a '.note' to the ELF rather than a new 
header or other proposals. I realize this is less of a problem if you look at 
things exclusively from the closed-loop of a single distribution building its 
stuff and shipping its installations only, but with this spec and 
implementation we are trying hard to have a general, 'world-wide' solution that 
works across distros, versions, flavours, etc.

In this sense, the implementation I contributed to systemd-coredump is "just"
a reference implementation of how to parse and use the spec, but it's by no 
means intended to be a systemd-only affair and an internal-only protocol, 
ending at the new journald fields.


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-14 Thread Matthew Almond via devel
Sorry for not responding to this in my previous reply.

On Wed, 2021-04-14 at 15:29 +, Zbigniew Jędrzejewski-Szmek wrote:
> I wanted to investigate this, but unfortunately, it's hard to check
> right now, because all builds are non-reproducible (in the sense of
> reproducible-builds.org), because we include the mtime of build
> products in rpm metadata, so pretty much all binary rpms are
> different. 

I'm thinking this isn't that important. In most current RPMs, the
mtimes for files are in two places:

1. In the (main) rpm header
2. in the cpio header for the file in the payload.

I can talk about the effect on RPM CoW: the identity of a file is just a
content hash, and the mtime isn't part of it. When the files are
actually installed, the resulting inode is touch'ed to the right
time. Therefore I think it's moot (MOOt?) from a CoW perspective: the
reuse can still happen.
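
A small sketch of that idea, assuming SHA-256 over the file bytes as the identity (the thread does not say which hash RPM CoW uses): two files with identical content but different mtimes get the same identity, so the data can be shared and only the timestamp set afterwards.

```python
import hashlib
import os
import tempfile


def content_id(path):
    """Identify a file by its bytes only; metadata such as mtime is ignored."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    old, new = os.path.join(tmp, "old"), os.path.join(tmp, "new")
    for path, mtime in ((old, 1_600_000_000), (new, 1_618_000_000)):
        with open(path, "wb") as f:
            f.write(b"same payload bytes")
        os.utime(path, (mtime, mtime))
    same_content = content_id(old) == content_id(new)
    different_mtime = os.stat(old).st_mtime != os.stat(new).st_mtime

print(same_content, different_mtime)
```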

Matthew.


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-14 Thread Matthew Almond via devel
On Wed, 2021-04-14 at 15:29 +, Zbigniew Jędrzejewski-Szmek wrote:
> Unfortunately this doesn't work for two important cases:
> - when a binary or shared library has been replaced on disk. E.g.
>   it is fairly common for packages to crash on upgrade, and the crash
>   could be in the _old_ code. When the metadata is loaded in a section,
>   we get it all nice and dandy in the coredump. If it's in an xattr,
>   we don't or even worse, get outdated info.

That's fair - if it were possible to get an fd during dump, we could
use fgetxattr. If not, we can use /proc/$pid/exe - even when deleted
you can interact with it:

[malmond@malmond-x1 ~]$ ls -l /proc/$$/exe
lrwxrwxrwx. 1 malmond malmond 0 Apr 14 15:45 /proc/364665/exe ->
'/home/malmond/testbash (deleted)'
[malmond@malmond-x1 ~]$ attr -l /proc/$$/exe
Attribute "selinux" has a 54 byte value for /proc/364665/exe

(this is me copying bash, executing it, then deleting it). My thinking
is this could go in systemd-coredump as it's invoked when dumping core
anyway. Libraries are accessible from /proc/$pid/map_files/$range.
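
The same lookup can be sketched in Python. This is only an illustration of reading xattrs through the /proc link on Linux, not a proposal for how systemd-coredump would do it; the helper name is made up, and it returns None wherever the call is unavailable or fails.

```python
import os


def list_exe_xattrs(pid="self"):
    """Return xattr names of a process's executable, or None if unavailable."""
    try:
        # /proc/<pid>/exe resolves to the inode even if the path was deleted.
        return os.listxattr(f"/proc/{pid}/exe")
    except (OSError, AttributeError):
        # Not Linux, no permission, or the filesystem lacks xattr support.
        return None


print(list_exe_xattrs())
```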

> - it doesn't work for non-rpm stuff.

I'm confused about this - I had put forth an idea for how to make rpm
create this when installing packages (so it works with older or third
party packages) but the same xattr could be created for any packaging
system. Can you clarify what is rpm dependent here?

Matthew.


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-14 Thread Zbigniew Jędrzejewski-Szmek
On Wed, Apr 14, 2021 at 11:47:42AM -0400, Neal Gompa wrote:
> On Wed, Apr 14, 2021 at 11:30 AM Zbigniew Jędrzejewski-Szmek
>  wrote:
> >
> > On Tue, Apr 13, 2021 at 12:44:42AM +, Matthew Almond via devel wrote:
> > > On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> > > > Or in other words: packaging metadata are sources too. If they change
> > > > (and a version bump constitutes a change) the output might change, and
> > > > that's expected. What's key really is that the only things that can
> > > > affect generated output are the build/packaging environment and the
> > > > sources, but not parameters outside of that, such as the actual
> > > > wallclock.
> > >
> > > The main way that packaging "interferes" with the source is when
> > > patches are applied - the original timestamp of a tarball (for example)
> > > isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.
> > >
> > > >
> > > > > My concern centers around the Copy on Write (CoW) use case - when
> > > > > packages are updated, some files change, and some may stay the same.
> > > > > Where they are the same, we can save I/O and possibly download time
> > > > > long term.
> > > >
> > > > Reproducible builds the way they are defined do not address such
> > > > file-level CoW optimization so much. They do address CoW optimization
> > > > on a package level much more however: i.e. the same package build will
> > > > have the same files in them, no matter what.
> > > >
> > > > Or to say this differently: if you want reproducible to work the way
> > > > you think it should work, you'd have to start by convincing the upstream
> > > > maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
> > > > luck with that.
> > >
> > > I think we should be careful to de-couple these two things. Just
> > > because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> > > proof that it will affect all binaries. I remain concerned that this
> > > proposal forces the issue: every single version of every single ELF
> > > binary *must* be different, even if they really didn't change. The
> > > pattern I see is more automation and faster, smaller release cycles,
> > > and this forces downloads and writes of binaries that really didn't
> > > change their code.
> >
> > Yeah, that's definitely something to think about.
> >
> > The proposed change indeed "forces the issue". This could be a big drawback
> > or not, depending on how often identical binary builds happen for different
> > package versions. If it turns out that the answer is "only rarely", then
> > I wouldn't consider it too important. If the answer is "quite often", we
> > would a chance for a nice optimization.
> >
> > I wanted to investigate this, but unfortunately, it's hard to check
> > right now, because all builds are non-reproducible (in the sense of
> > reproducible-builds.org), because we include the mtime of build
> > products in rpm metadata, so pretty much all binary rpms are
> > different.  And in general other things make builds non-reproducible,
> > and it's not obvious if *this* change makes things worse. I didn't
> > want to dig into individual rpms to compare binaries. I *think* most
> > packages are not actually rebuilt that often without changes…, but real
> > data is definitely needed.
> >
> 
> We could start clamping times by default by adding the following to
> redhat-rpm-config:
> 
> %clamp_mtime_to_source_date_epoch 1

Oh, is this already a thing? Nice!
https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/126

Zbyszek


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-14 Thread Neal Gompa
On Wed, Apr 14, 2021 at 11:30 AM Zbigniew Jędrzejewski-Szmek
 wrote:
>
> On Tue, Apr 13, 2021 at 12:44:42AM +, Matthew Almond via devel wrote:
> > On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> > > Or in other words: packaging metadata are sources too. If they change
> > > (and a version bump constitutes a change) the output might change, and
> > > that's expected. What's key really is that the only things that can
> > > affect generated output are the build/packaging environment and the
> > > sources, but not parameters outside of that, such as the actual
> > > wallclock.
> >
> > The main way that packaging "interferes" with the source is when
> > patches are applied - the original timestamp of a tarball (for example)
> > isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.
> >
> > >
> > > > My concern centers around the Copy on Write (CoW) use case - when
> > > > packages are updated, some files change, and some may stay the same.
> > > > Where they are the same, we can save I/O and possibly download time
> > > > long term.
> > >
> > > Reproducible builds the way they are defined do not address such
> > > file-level CoW optimization so much. They do address CoW optimization
> > > on a package level much more however: i.e. the same package build will
> > > have the same files in them, no matter what.
> > >
> > > Or to say this differently: if you want reproducible to work the way
> > > you think it should work, you'd have to start by convincing the upstream
> > > maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
> > > luck with that.
> >
> > I think we should be careful to de-couple these two things. Just
> > because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> > proof that it will affect all binaries. I remain concerned that this
> > proposal forces the issue: every single version of every single ELF
> > binary *must* be different, even if they really didn't change. The
> > pattern I see is more automation and faster, smaller release cycles,
> > and this forces downloads and writes of binaries that really didn't
> > change their code.
>
> Yeah, that's definitely something to think about.
>
> The proposed change indeed "forces the issue". This could be a big drawback
> or not, depending on how often identical binary builds happen for different
> package versions. If it turns out that the answer is "only rarely", then
> I wouldn't consider it too important. If the answer is "quite often", we
> would a chance for a nice optimization.
>
> I wanted to investigate this, but unfortunately, it's hard to check
> right now, because all builds are non-reproducible (in the sense of
> reproducible-builds.org), because we include the mtime of build
> products in rpm metadata, so pretty much all binary rpms are
> different.  And in general other things make builds non-reproducible,
> and it's not obvious if *this* change makes things worse. I didn't
> want to dig into individual rpms to compare binaries. I *think* most
> packages are not actually rebuilt that often without changes…, but real
> data is definitely needed.
>

We could start clamping times by default by adding the following to
redhat-rpm-config:

%clamp_mtime_to_source_date_epoch 1
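
For readers unfamiliar with the macro, the clamping rule itself is simple: any timestamp later than $SOURCE_DATE_EPOCH is lowered to it, and older ones are left alone. The sketch below applies that rule to files on disk purely as an illustration; rpm applies it to the timestamps it records in the package, and the helper name and file names here are made up.

```python
import os
import tempfile


def clamp_mtimes(root, source_date_epoch):
    """Lower any mtime under root that is later than source_date_epoch."""
    clamped = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            if st.st_mtime > source_date_epoch:
                os.utime(path, (st.st_atime, source_date_epoch))
                clamped.append(name)
    return sorted(clamped)


with tempfile.TemporaryDirectory() as tmp:
    epoch = 1_618_000_000
    for name, mtime in (("old.txt", epoch - 100), ("fresh.bin", epoch + 100)):
        path = os.path.join(tmp, name)
        open(path, "wb").close()
        os.utime(path, (mtime, mtime))
    changed = clamp_mtimes(tmp, epoch)
    mtimes = {n: int(os.stat(os.path.join(tmp, n)).st_mtime)
              for n in ("old.txt", "fresh.bin")}

print(changed, mtimes)
```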

> > I have just thought of an alternative proposition: for ELF objects (and
> > ELF objects only): rpm could automatically, and systematically record
> > the metadata in an xattr. This would work on images without rpmdb,
> > works on most filesystem types, be serialized in archives. Most
> > interestingly this could be implemented as an rpm plugin, and would
> > work retroactively for packages that were built before this proposal.
> > It could also be made to work for other packaging systems, and the
> > tooling that reads it wouldn't need to know the original packaging
> > system.
> Unfortunately this doesn't work for two important cases:
> - when a binary or shared library has been replaced on disk. E.g.
>   it is fairly common for packages to crash on upgrade, and the crash
>   could be in the _old_ code. When the metadata is loaded in a section,
>   we get it all nice and dandy in the coredump. If it's in an xattr,
>   we don't or even worse, get outdated info.
> - it doesn't work for non-rpm stuff.
>
> Zbyszek


--
真実はいつも一つ!/ Always, there's only one truth!

Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-14 Thread Zbigniew Jędrzejewski-Szmek
On Tue, Apr 13, 2021 at 12:44:42AM +, Matthew Almond via devel wrote:
> On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> > Or in other words: packaging metadata are sources too. If they change
> > (and a version bump constitutes a change) the output might change, and
> > that's expected. What's key really is that the only things that can
> > affect generated output are the build/packaging environment and the
> > sources, but not parameters outside of that, such as the actual
> > wallclock.
> 
> The main way that packaging "interferes" with the source is when
> patches are applied - the original timestamp of a tarball (for example)
> isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.
> 
> > 
> > > My concern centers around the Copy on Write (CoW) use case - when
> > > packages are updated, some files change, and some may stay the same.
> > > Where they are the same, we can save I/O and possibly download time
> > > long term.
> >
> > Reproducible builds the way they are defined do not address such
> > file-level CoW optimization so much. They do address CoW optimization
> > on a package level much more however: i.e. the same package build will
> > have the same files in them, no matter what.
> >
> > Or to say this differently: if you want reproducible to work the way
> > you think it should work, you'd have to start by convincing the upstream
> > maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
> > luck with that.
>
> I think we should be careful to de-couple these two things. Just
> because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> proof that it will affect all binaries. I remain concerned that this
> proposal forces the issue: every single version of every single ELF
> binary *must* be different, even if they really didn't change. The
> pattern I see is more automation and faster, smaller release cycles,
> and this forces downloads and writes of binaries that really didn't
> change their code.

Yeah, that's definitely something to think about.

The proposed change indeed "forces the issue". This could be a big drawback
or not, depending on how often identical binary builds happen for different
package versions. If it turns out that the answer is "only rarely", then
I wouldn't consider it too important. If the answer is "quite often", we
would a chance for a nice optimization.

I wanted to investigate this, but unfortunately, it's hard to check
right now, because all builds are non-reproducible (in the sense of
reproducible-builds.org), because we include the mtime of build
products in rpm metadata, so pretty much all binary rpms are
different.  And in general other things make builds non-reproducible,
and it's not obvious if *this* change makes things worse. I didn't
want to dig into individual rpms to compare binaries. I *think* most
packages are not actually rebuilt that often without changes…, but real
data is definitely needed.

> I have just thought of an alternative proposition: for ELF objects (and
> ELF objects only): rpm could automatically, and systematically record
> the metadata in an xattr. This would work on images without rpmdb,
> works on most filesystem types, be serialized in archives. Most
> interestingly this could be implemented as an rpm plugin, and would
> work retroactively for packages that were built before this proposal.
> It could also be made to work for other packaging systems, and the
> tooling that reads it wouldn't need to know the original packaging
> system.
Unfortunately this doesn't work for two important cases:
- when a binary or shared library has been replaced on disk. E.g.
  it is fairly common for packages to crash on upgrade, and the crash
  could be in the _old_ code. When the metadata is loaded in a section,
  we get it all nice and dandy in the coredump. If it's in an xattr,
  we don't or even worse, get outdated info.
- it doesn't work for non-rpm stuff.
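
The first case can be reproduced in miniature. In this sketch an open file handle stands in for the old binary a process is still running, and a rename stands in for the package upgrade; anything looked up by path afterwards, xattrs included, belongs to the new file. The file names and contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "binary")
    with open(path, "wb") as f:
        f.write(b"old build")

    running = open(path, "rb")  # stands in for what a running process holds

    # Package upgrade: write the new file aside and rename it over the old one.
    staged = path + ".new"
    with open(staged, "wb") as f:
        f.write(b"new build")
    os.replace(staged, path)

    via_handle = running.read()   # what the crashing process actually runs
    with open(path, "rb") as f:
        via_path = f.read()       # what a path-based lookup finds afterwards
    running.close()

print(via_handle, via_path)
```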

Zbyszek


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-13 Thread Colin Walters


On Mon, Apr 12, 2021, at 8:44 PM, Matthew Almond via devel wrote:
> 
> I think we should be careful to de-couple these two things. Just
> because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> proof that all binaries will.

Agreed; it'd be interesting to gather some data here, particularly components 
with large binaries.

> I have just thought of an alternative proposition: for ELF objects (and
> ELF objects only): rpm could automatically, and systematically record
> the metadata in an xattr.

OSTree would be affected in the same way as your "RPM CoW" proposal by the 
approach of having it in the binary directly.  Unless we did this, because 
ostree is based on hardlinking which works on every filesystem, but shares an 
inode and hence the extended attributes are included in the ostree checksum.  
(There is some support for adding an additional "payload" i.e. content checksum 
in ostree but it adds another mapping and so we don't enable it by default).

But on reflink-capable filesystems in theory if this content is just in the ELF 
header we could skip it and reflink just the remainder which would be most of 
the binary.  (But, this would necessitate a strategy other than checksumming 
the whole binary of course, something more like rsync-style "rollsum" windows 
that we use for ostree static deltas, e.g.)
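
To illustrate the general idea behind such rolling windows (this is a toy, not ostree's or rsync's actual algorithm): chunk boundaries are chosen from a checksum over the last few bytes, so after a change near the start of a file the boundaries fall back into step and most chunks hash the same, even when the change shifts everything after it. The window size, mask and the fake "builds" below are arbitrary choices for the sketch.

```python
import hashlib
import random

WINDOW = 16   # bytes covered by the rolling sum
MASK = 63     # cut when the low 6 bits are all ones (chunks of roughly 64 bytes)


def chunk_digests(data):
    """Split data at content-defined boundaries and hash each chunk."""
    digests, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]   # slide the window forward
        if i + 1 >= WINDOW and rolling & MASK == MASK:
            digests.append(hashlib.sha256(data[start:i + 1]).hexdigest())
            start = i + 1
    if start < len(data):
        digests.append(hashlib.sha256(data[start:]).hexdigest())
    return digests


body = random.Random(0).randbytes(8192)   # stands in for unchanged code and data
build_a = b'{"version":"1.0-1"}' + bytes(32) + body
build_b = b'{"version":"1.0-22"}' + bytes(32) + body   # one byte longer

a, b = chunk_digests(build_a), chunk_digests(build_b)
shared = set(a) & set(b)
print(len(a), len(b), len(shared))
```

Hashing each whole file would report the two builds as entirely different; per-chunk digests show that only the chunks around the metadata differ.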


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-13 Thread Neal Gompa
On Tue, Apr 13, 2021 at 8:09 AM Zbigniew Jędrzejewski-Szmek
 wrote:
>
> On Mon, Apr 12, 2021 at 10:57:30PM +0200, Lennart Poettering wrote:
> > On Mo, 12.04.21 16:14, David Malcolm (dmalc...@redhat.com) wrote:
> >
> > > So I want to push back on the idea that a single package can be
> > > associated with a coredump, or be the one responsible for the crash:
> > > any or all of the ELF objects linked into the process could be at
> > > fault.
> >
> > The example in the feature page shows how we handle this: you'll see
> > the packaging metadata of all involved ELF objects in coredumpctl's
> > output. i.e. we should be nicely covered on this, and we are fully
> > > aware that the "main" ELF object is the culprit of crashes only in a
> > fraction of cases.
>
> This is true.
>
> OTOH, this new metadata doesn't really change the situation here.
> Before, we already had build-ids for all the packages "involved" in
> the stack trace. And our processing tools already could do the
> conversion to package nevras. (They have to have network access to
> create a report.) The only thing that changes is *how* this conversion
> happens, but for online reports such conversion was always possible.
>

The new metadata guarantees that the ELF data churns, though. For
example, if I bump the Release in a spec file for something unrelated
to the build, all the ELF blobs change. The current state means that
this is deduplicated in RPM CoW and a very cheap upgrade, since the
binaries weren't all touched.



-- 
真実はいつも一つ!/ Always, there's only one truth!


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-13 Thread Zbigniew Jędrzejewski-Szmek
On Mon, Apr 12, 2021 at 10:57:30PM +0200, Lennart Poettering wrote:
> On Mo, 12.04.21 16:14, David Malcolm (dmalc...@redhat.com) wrote:
> 
> > So I want to push back on the idea that a single package can be
> > associated with a coredump, or be the one responsible for the crash:
> > any or all of the ELF objects linked into the process could be at
> > fault.
> 
> The example in the feature page shows how we handle this: you'll see
> the packaging metadata of all involved ELF objects in coredumpctl's
> output. i.e. we should be nicely covered on this, and we are fully
> aware that the "main" ELF object is the culprit of crashes only in a
> fraction of cases.

This is true.

OTOH, this new metadata doesn't really change the situation here.
Before, we already had build-ids for all the packages "involved" in
the stack trace. And our processing tools already could do the
conversion to package nevras. (They have to have network access to
create a report.) The only thing that changes is *how* this conversion
happens, but for online reports such conversion was always possible.
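
As background on the build-id side, here is a sketch of the on-disk convention that maps a build-id to files when the package (or its debuginfo) is installed locally. It only computes the conventional symlink paths; it does not cover the online conversion to a package nevra discussed above, and the build-id value is made up.

```python
def build_id_paths(build_id):
    """Map a hex build-id to the conventional symlink locations."""
    build_id = build_id.lower()
    if len(build_id) < 3 or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError("build-id must be a hex string")
    head, tail = build_id[:2], build_id[2:]
    return {
        "binary": f"/usr/lib/.build-id/{head}/{tail}",
        "debuginfo": f"/usr/lib/debug/.build-id/{head}/{tail}.debug",
    }


paths = build_id_paths("0123456789abcdef0123456789abcdef01234567")
print(paths["binary"])
```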

That said, it *is* strange that abrt prints just one package nevra
in bugzilla reports [1].

Zbyszek

[1] completely arbitrary example I happened to have open in a tab:
https://bugzilla.redhat.com/show_bug.cgi?id=1895937


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-12 Thread Matthew Almond via devel
On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> Or in other words: packaging metadata are sources too. If they change
> (and a version bump constitutes a change) the output might change, and
> that's expected. What's key really is that the only things that can
> affect generated output are the build/packaging environment and the
> sources, but not parameters outside of that, such as the actual
> wallclock.

The main way that packaging "interferes" with the source is when
patches are applied - the original timestamp of a tarball (for example)
isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.

> 
> > My concern centers around the Copy on Write (CoW) use case - when
> > packages are updated, some files change, and some may stay the same.
> > Where they are the same, we can save I/O and possibly download time
> > long term.
>
> Reproducible builds the way they are defined do not address such
> file-level CoW optimization so much. They do address CoW optimization
> on a package level much more however: i.e. the same package build will
> have the same files in them, no matter what.
>
> Or to say this differently: if you want reproducible to work the way
> you think it should work, you'd have to start by convincing the upstream
> maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
> luck with that.

I think we should be careful to de-couple these two things. Just
because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
proof that it will affect all binaries. I remain concerned that this
proposal forces the issue: every single version of every single ELF
binary *must* be different, even if they really didn't change. The
pattern I see is more automation and faster, smaller release cycles,
and this forces downloads and writes of binaries that really didn't
change their code.

I have just thought of an alternative proposition: for ELF objects (and
ELF objects only), rpm could automatically and systematically record
the metadata in an xattr. This would work on images without an rpmdb,
work on most filesystem types, and could be serialized in archives. Most
interestingly, this could be implemented as an rpm plugin, and would
work retroactively for packages that were built before this proposal.
It could also be made to work for other packaging systems, and the
tooling that reads it wouldn't need to know the original packaging
system.
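
To make the xattr idea a bit more concrete, here is a rough sketch of what
the tagging step could look like. It is only an illustration under
assumptions: the attribute name `user.package`, the helper names and the
JSON layout are all hypothetical, and nothing here is an existing rpm
plugin interface. `os.setxattr` and `os.getxattr` are Linux-only and need
a filesystem with user xattr support.

```python
import json
import os

# Hypothetical attribute name, chosen for this sketch only.
XATTR_NAME = "user.package"


def encode_metadata(meta: dict) -> bytes:
    """Serialize package metadata as compact, key-sorted JSON bytes."""
    return json.dumps(meta, separators=(",", ":"), sort_keys=True).encode("utf-8")


def decode_metadata(raw: bytes) -> dict:
    """Inverse of encode_metadata."""
    return json.loads(raw.decode("utf-8"))


def tag_file(path: str, meta: dict) -> None:
    """Record the metadata on an installed file without touching its contents."""
    os.setxattr(path, XATTR_NAME, encode_metadata(meta))


def read_tag(path: str) -> dict:
    """Read the metadata back; raises OSError if the attribute is absent."""
    return decode_metadata(os.getxattr(path, XATTR_NAME))
```

The point of the sketch is that the file contents (and therefore any
content hash used for deduplication) stay the same; only the attribute
changes between package releases.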

Thoughts?

Matthew



Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-12 Thread Lennart Poettering
On Mo, 12.04.21 20:40, Fedora Development ML (devel@lists.fedoraproject.org) 
wrote:

> On Mon, 2021-04-12 at 15:46 -0400, Ben Cotton wrote:
> > https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
>
> Putting packaging info into a binary guarantees that each successive
> package containing ELF binaries will not contain exactly the same
> binaries, even if there are no changes.
>
> Now, what I just wrote there is predicated on "reproducible builds"
> where the same source (including deps, headers) and the same toolchain
> produce the same output. This may or may not be a thing. My concern is
> that we completely eliminate the possibility of binaries being
> unchanged.

I think this is a misunderstanding of how reproducible builds are
supposed to work. For example, consider $SOURCE_DATE_EPOCH as defined
here:

https://reproducible-builds.org/specs/source-date-epoch/

It's expressly defined to be used as the source timestamp when that
source timestamp is included in build output. It is also expressly
documented to be a value initialized from the packaging changelog
timestamps. Or in other words: the way the reproducible builds project
understands their own stuff, it's absolutely OK to generate different
output on package rebuilds that change the package versions.

Or in other words: packaging metadata are sources too. If they change
(and a version bump constitutes a change) the output might change, and
that's expected. What's key really is that the only things that can
affect generated output are the build/packaging environment and the
sources, but not parameters outside of that, such as the actual
wallclock.
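
As a minimal sketch of that convention (the helper name `build_timestamp`
is made up for this example, it is not taken from any tool discussed
here): a build step that wants to embed a date reads $SOURCE_DATE_EPOCH
and only falls back to the wallclock when the variable is unset.

```python
import os
import time


def build_timestamp(environ=os.environ) -> int:
    """Return the timestamp a build step should embed in its output.

    If SOURCE_DATE_EPOCH is set (seconds since the Unix epoch), use it so
    that rebuilding the same sources gives the same output. Otherwise fall
    back to the current time, which makes the output non-reproducible.
    """
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)
    return int(time.time())


def format_build_date(timestamp: int) -> str:
    """Render the timestamp in UTC so the result is timezone-independent."""
    return time.strftime("%Y-%m-%d", time.gmtime(timestamp))
```

With the variable derived from the changelog, a version bump moves the
embedded date, and that is a change in the sources, not in the wallclock.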

> My concern centers around the Copy on Write (CoW) use case - when
> packages are updated, some files change, and some may stay the same.
> Where they are the same, we can save I/O and possibly download time
> long term.

Reproducible builds the way they are defined do not address such
file-level CoW optimization so much. They do address CoW optimization
on a package level much more however: i.e. the same package build will
have the same files in them, no matter what.

Or to say this differently: if you want reproducible builds to work the
way you think they should work, you'd have to start by convincing the
upstream maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but
good luck with that.

Lennart

--
Lennart Poettering, Berlin


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-12 Thread Lennart Poettering
On Mo, 12.04.21 16:14, David Malcolm (dmalc...@redhat.com) wrote:

> So I want to push back on the idea that a single package can be
> associated with a coredump, or be the one responsible for the crash:
> any or all of the ELF objects linked into the process could be at
> fault.

The example in the feature page shows how we handle this: you'll see
the packaging metadata of all involved ELF objects in coredumpctl's
output. i.e. we should be nicely covered on this, and we are fully
aware that the "main" ELF object is the culprit of crashes only in a
fraction of cases.

Lennart

--
Lennart Poettering, Berlin


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-12 Thread Matthew Almond via devel
On Mon, 2021-04-12 at 15:46 -0400, Ben Cotton wrote:
> https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects

Putting packaging info into a binary guarantees that each successive
package containing ELF binaries will not contain exactly the same
binaries, even if there are no changes.

Now, what I just wrote there is predicated on "reproducible builds"
where the same source (including deps, headers) and the same toolchain
produce the same output. This may or may not be a thing. My concern is
that we completely eliminate the possibility of binaries being
unchanged.

My concern centers around the Copy on Write (CoW) use case - when
packages are updated, some files change, and some may stay the same.
Where they are the same, we can save I/O and possibly download time
long term.
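
A toy illustration of the concern, under loud assumptions: `toy_binary`
just appends a JSON blob to some bytes and is in no way a real ELF
writer, and comparing SHA-256 digests only stands in for whatever a
CoW-aware installer would use to decide that two files are identical.

```python
import hashlib
import json


def toy_binary(code: bytes, package_version: str) -> bytes:
    """Pretend binary: unchanged 'code' followed by embedded package metadata."""
    note = json.dumps(
        {"type": "rpm", "name": "hello", "version": package_version},
        separators=(",", ":"),
    ).encode("utf-8")
    return code + note


def digest(blob: bytes) -> str:
    """Content hash, used here as a stand-in for a deduplication key."""
    return hashlib.sha256(blob).hexdigest()


code = b"\x7fELF" + b"unchanged machine code"
release_1 = digest(toy_binary(code, "0-1.fc35.x86_64"))
release_2 = digest(toy_binary(code, "0-2.fc35.x86_64"))
rebuild_1 = digest(toy_binary(code, "0-1.fc35.x86_64"))

# Same code, different release: the content hash differs.
print(release_1 != release_2)
# Same code, same release: the content hash is identical.
print(release_1 == rebuild_1)
```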

My recommendation here is to (continue to?) log build ids, and resolve
remotely if you don't have an rpmdb to consult. Build ids are opaque
and meaningless to end users, but end users aren't the target. My
expectation is that any data collection around crashes needs to
aggregate, and build ids are good enough to identify packages, even
after the fact.

Matthew.


Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-12 Thread David Malcolm
On Mon, 2021-04-12 at 15:46 -0400, Ben Cotton wrote:
> https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
> 
> == Summary ==
> All binaries (executables and shared libraries) are annotated with an ELF
> note that identifies the rpm distributing this file.
> 
> == Owner ==
> * Name: [[User:Zbyszek|Zbigniew Jędrzejewski-Szmek]]
> * Email: zbys...@in.waw.pl
> * Name: Lennart Poettering
> * Email: mzsrq...@0pointer.net
> 
> 
> == Detailed Description ==
> See [https://github.com/systemd/systemd/issues/18433 systemd issue #18433]
> for discussion and implementation proposals.
> 
> Programs crash. And when they do, they dump core, and we want to tell the
> user which package, including the version, caused the failure.

This might be better as:
  "which packages [plural] could have been responsible for the failure"

I used to maintain the Fedora "python" package, and I kept receiving
bugzilla reports assigned to the "python" package filed via ABRT,
because /usr/bin/python had crashed.  It was almost never
/usr/bin/python at fault: it's a tiny 4k executable linked to a much
larger libpython.so (in a different subpackage) - but generally that
wasn't at fault either: the python extension API exposes the insides of
the virtual machine and its objects directly, and it's very easy for a
buggy extension to corrupt something in the process, sometimes at some
remove from where the segfault finally brings the process down.

I tried writing scripts to help update the bugs to use the correct bz
component.  In theory you could look at the deepest point in the
callstack, but it might e.g. be an assertion failure handler in a
shared library, rather than the "real" site of the crash.  Or some
object could have become corrupted at some point long before the crash
actually fires, so the blame can't be diagnosed just from the final
callstack.

Dealing with this deluge of misfiled bug reports is what got me
interested in static analysis and on maintaining GCC itself, fwiw.

So I want to push back on the idea that a single package can be
associated with a coredump, or be the one responsible for the crash:
any or all of the ELF objects linked into the process could be at
fault.

Hope this is constructive (sorry, the wording in the proposal touched a
nerve for me!)

Dave



F35 Change: Package information on ELF objects (System-Wide Change proposal)

2021-04-12 Thread Ben Cotton
https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects

== Summary ==
All binaries (executables and shared libraries) are annotated with an ELF
note that identifies the rpm distributing this file.

== Owner ==
* Name: [[User:Zbyszek|Zbigniew Jędrzejewski-Szmek]]
* Email: zbys...@in.waw.pl
* Name: Lennart Poettering
* Email: mzsrq...@0pointer.net


== Detailed Description ==
See [https://github.com/systemd/systemd/issues/18433 systemd issue #18433]
for discussion and implementation proposals.

Programs crash. And when they do, they dump core, and we want to tell the
user which package, including the version, caused the failure. ELF note
`.note.package` will be added to specify package nevra. By embedding
this information directly in the binary object, package nevra is
immediately available from a core dump.

=== Existing system: `.note.gnu.build-id` ===

We already have build-ids: every ELF object has a `.note.gnu.build-id`
note, and given a core file, we can read the build-id and look it up in the
rpm database (`dnf repoquery --whatprovides debuginfo(build-id) = …`) to
map it to a package name.
Build-ids are unique and compact and very generic and work as expected in
general. But they have some downsides:
* build-ids are not very informative for users. Before the build-id is
converted back to the appropriate package, it's completely opaque.
* build-ids require a working rpm database or an internet connection to map
to the package name.
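
As a small sketch of the existing lookup path, the following only
assembles the `dnf repoquery` invocation quoted above for a given
build-id. The helper name `build_id_lookup_command` is hypothetical.
Actually running the command (for example via `subprocess.run`) needs
repository metadata or a network connection, which is exactly the
downside listed above.

```python
import re
from typing import List


def build_id_lookup_command(build_id: str) -> List[str]:
    """Return the argv for mapping a build-id to a package via dnf."""
    build_id = build_id.strip().lower()
    # Build-ids are printed as hex strings; reject anything else early.
    if not re.fullmatch(r"[0-9a-f]+", build_id):
        raise ValueError(f"not a hex build-id: {build_id!r}")
    return [
        "dnf",
        "repoquery",
        "--whatprovides",
        f"debuginfo(build-id) = {build_id}",
    ]
```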

Three important cases:
* minimal containers: the rpm database is not installed in the containers.
The information about build-ids needs to be stored externally, so package
name information is not available immediately, but only after offline
processing. The new note doesn't depend on the rpm db in any way.
* handling of a core from a container, where the container and host have
different distros
* self-built and external packages: unless a lot of care is taken to keep
access to the debuginfo packages, this information may be lost. The new
note is available even if the repository metadata gets lost. Users can
easily provide equivalent information in a format that makes sense in their
own environment. It should work even when rpms and debs and other formats
are mixed, e.g. during container image creation.

=== New system: `.note.package` ===

The new note is created and propagated similarly to `.note.gnu.build-id`.
The difference is that we inject the information about package nevra from
the build system.

The implementation is very simple: `%{build_ldflags}` are extended with a
command to insert a custom note as a separate section in an ELF object. See
[https://github.com/systemd/package-notes/blob/main/hello.spec hello.spec]
for an example. This is done in the default macros, so all packages that
use the prescribed link flags will be affected.

The note is a compact json string. This makes the format trivially
extensible (new fields can be added at will) and easy to process (json is
extremely popular and parsers are widely available). Using a single field
is more space-efficient. With multiple fields the padding and alignment
requirements cause unnecessary overhead.
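
A minimal sketch of how such a note payload can be laid out, assuming the
standard ELF note structure (namesz, descsz, type, then name and
descriptor, each padded to 4 bytes) on a little-endian target. The owner
name `FDO` and the type value `0xcafe1a7e` are read off the objdump
output in the Examples section; the single trailing NUL after the json
is inferred from the descriptor size shown there. The function names are
made up for this sketch, and this is not the actual linker-script
mechanism used by the implementation.

```python
import json
import struct

NOTE_OWNER = b"FDO\0"    # owner name as seen in the example dump
NOTE_TYPE = 0xCAFE1A7E   # note type as seen in the example dump


def pad4(data: bytes) -> bytes:
    """Pad with NUL bytes up to the next multiple of 4."""
    return data + b"\0" * (-len(data) % 4)


def build_package_note(meta: dict) -> bytes:
    """Return the raw bytes of a .note.package-style ELF note."""
    # Compact json (no spaces), NUL-terminated.
    desc = json.dumps(meta, separators=(",", ":")).encode("utf-8") + b"\0"
    # namesz and descsz are the unpadded lengths.
    header = struct.pack("<III", len(NOTE_OWNER), len(desc), NOTE_TYPE)
    return header + pad4(NOTE_OWNER) + pad4(desc)


note = build_package_note({
    "type": "rpm",
    "name": "hello",
    "version": "0-1.fc35.x86_64",
    "osCpe": "cpe:/o:fedoraproject:fedora:33",
})
print(len(note), note[:16].hex())
```

The header is 12 bytes and the owner name 4 bytes, so the fixed cost per
note is 16 bytes; everything else is the json itself plus padding.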

The system was designed with cross-distro collaboration and is flexible
enough to identify binaries from different packaging formats and build
systems (rpms, debs, custom binaries).

The overhead is about 200 bytes for each ELF object.
If we do this only for executables, then for the whole distro, 5000 × 200 =
1 MB.
If we do it for shared libraries, then the cost will be maybe 4 times
higher.
Precise measurements TBD once we know the final implementation and figure
out the right repoquery magic.

=== Examples ===

$ objdump -s -j .note.package build/libhello.so

build/libhello.so: file format elf64-x86-64

Contents of section .note.package:
 02ec 04000000 63000000 7e1afeca 46444f00  ....c...~...FDO.
 02fc 7b227479 7065223a 2272706d 222c226e  {"type":"rpm","n
 030c 616d6522 3a226865 6c6c6f22 2c227665  ame":"hello","ve
 031c 7273696f 6e223a22 302d312e 66633335  rsion":"0-1.fc35
 032c 2e783836 5f363422 2c226f73 43706522  .x86_64","osCpe"
 033c 3a226370 653a2f6f 3a666564 6f726170  :"cpe:/o:fedorap
 034c 726f6a65 63743a66 65646f72 613a3333  roject:fedora:33
 035c 227d0000                             "}..



$ readelf --notes build/hello | grep "description data" | \
    sed -e "s/\s*description data: //g" -e "s/ //g" | xxd -p -r | jq
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10de
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10af
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x119f
{
  "type": "rpm",
  "name": "hello",
  "version": "0-1.fc35.x86_64",
  "osCpe": "cpe:/o:fedoraproject:fedora:33"
}
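
For comparison, the same decoding can be sketched without the shell
pipeline. This assumes the raw bytes of the note are already at hand
(for instance extracted from the section by some other tool), that the
layout is the standard little-endian ELF note structure, and that the
descriptor is NUL-terminated json. `parse_package_note` is a
hypothetical helper name, and the sample bytes are spelled out by hand
to match the example above.

```python
import json
import struct


def parse_package_note(note: bytes):
    """Split a raw ELF note into (owner, type, decoded json metadata)."""
    namesz, descsz, note_type = struct.unpack_from("<III", note, 0)
    name_start = 12                                # three 4-byte header words
    name = note[name_start:name_start + namesz]
    desc_start = name_start + ((namesz + 3) & ~3)  # name is padded to 4 bytes
    desc = note[desc_start:desc_start + descsz]
    owner = name.rstrip(b"\0").decode("ascii")
    meta = json.loads(desc.rstrip(b"\0").decode("utf-8"))
    return owner, note_type, meta


# Hand-written sample: header, owner "FDO", json, NUL terminator, padding.
sample = (
    bytes.fromhex("04000000" "63000000" "7e1afeca" "46444f00")
    + b'{"type":"rpm","name":"hello","version":"0-1.fc35.x86_64",'
    + b'"osCpe":"cpe:/o:fedoraproject:fedora:33"}'
    + b"\0\0"
)

owner, note_type, meta = parse_package_note(sample)
print(owner, hex(note_type), meta["name"], meta["version"])
```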



$ coredumpctl info
   PID: 44522 (fsverity)
...
   Package: fsverity-utils/1.3-1
  build-id: ac89bf7175b04d7eec7f6544a923f45be111f0be
   Message: Process 44522 (fsverity) of user 1000 dum