Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
> On Tue, Apr 13, 2021 at 8:09 AM Zbigniew Jędrzejewski-Szmek
>
> The new metadata guarantees that the ELF data churns, though. For
> example, if I bump the Release in a spec file for something unrelated
> to the build, all the ELF blobs change. The current state means that
> this is deduplicated in RPM CoW and a very cheap upgrade, since the
> binaries weren't all touched.

The content of the key:value pairs is entirely under your control though - if you don't want to include the spec release field because of the case where that would be the only change to the binary, then don't.

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Thu, 15.04.21 10:20, Luca Boccassi (bl...@debian.org) wrote:

> > I'm confused about this - I had put forth an idea for how to make rpm
> > create this when installing packages (so it works with older or third
> > party packages) but the same xattr could be created for any packaging
> > system. Can you clarify what is rpm dependent here?
> >
> > Matthew.
>
> Hi,
>
> There are a few issues with using xattr, some minor and one major.
> The minor issue is that it's really not great when you are shipping
> stuff around - the source/transport/medium/archiving format might or
> might not support it. Having had to deal with this for cross-building
> Linux binaries from Windows with SELinux labels, I can assure you it's a
> massive headache I'd rather not replicate :-)

I think this might not just be a minor issue btw. One of the main goals of this feature is to make coredumps reasonably useful when they originate from a binary shipped as a container image. But do all popular container envs even ship xattrs in their deployment images? I mean, it's an optional tar feature, and do they all enable it? iirc the original "aufs"-backed Docker didn't support xattrs, simply because aufs didn't. I figure that leaked into all later versions, too, no?

Lennart

--
Lennart Poettering, Berlin
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
> On Wed, 2021-04-14 at 15:29 +, Zbigniew Jędrzejewski-Szmek wrote:
>
> That's fair - if it were possible to get an fd during dump, we could
> use fgetxattr. If not, we can use /proc/$pid/exe - even when deleted
> you can interact with it:
>
> [malmond@malmond-x1 ~]$ ls -l /proc/$$/exe
> lrwxrwxrwx. 1 malmond malmond 0 Apr 14 15:45 /proc/364665/exe ->
> '/home/malmond/testbash (deleted)'
> [malmond@malmond-x1 ~]$ attr -l /proc/$$/exe
> Attribute "selinux" has a 54 byte value for /proc/364665/exe
>
> (this is me copying bash, executing it, then deleting it). My thinking
> is this could go in systemd-coredump as it's invoked when dumping core
> anyway. Libraries are accessible from /proc/$pid/map_files/$range.
>
> I'm confused about this - I had put forth an idea for how to make rpm
> create this when installing packages (so it works with older or third
> party packages) but the same xattr could be created for any packaging
> system. Can you clarify what is rpm dependent here?
>
> Matthew.

Hi,

There are a few issues with using xattr, some minor and one major. The minor issue is that it's really not great when you are shipping stuff around - the source/transport/medium/archiving format might or might not support it. Having had to deal with this for cross-building Linux binaries from Windows with SELinux labels, I can assure you it's a massive headache I'd rather not replicate :-)

The major issue though is a different one, if related: one of the central, nicest capabilities of this proposal is that the ELF note gets _automatically_ included in the generated core file. No change required anywhere for that to happen. This means if you are running a fleet of headless systems, and you collect corefiles on each node and ship them off for offline analysis, all the metadata is nicely included, automagically. That's because the metadata is a property of the binary and assigned at the source, not a property of the system where it runs that is applied at the destination.

If you use xattrs though, the metadata suddenly becomes a property of the system where the binary happens to run. The core file won't have it. The only way to see it is to have access to the system. It's no longer nicely self-contained and "replicating". Also you are no longer guaranteed to _always_ have the metadata available as long as you add it at build time - suddenly, if your binary ends up installed on the 'wrong' system/container/whatever, because they are too old or don't have a package manager or whatever else, your metadata is gone.

And just to clarify, these use cases are not theoretical - they are very much real and already working. And these are the main reasons the team that handles crashdumps at Microsoft suggested adding a '.note' to the ELF rather than a new header or other proposals.

I realize this is less of a problem if you look at things exclusively from the closed loop of a single distribution building its stuff and shipping its installations only, but with this spec and implementation we are trying hard to have a general, 'world-wide' solution that works across distros, versions, flavours, etc. In this sense, the implementation I contributed to systemd-coredump is "just" a reference implementation of how to parse and use the spec, but it's by no means intended to be a systemd-only affair and an internal-only protocol, ending at the new journald fields.
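[Editor's sketch] The point that the metadata travels with the binary (and therefore with the core file) can be illustrated with a small parser for the generic ELF note layout: three 32-bit fields (name size, descriptor size, type) followed by the name and the descriptor, each padded to 4 bytes. Everything specific below is an assumption made for illustration and is not taken from this thread: the owner string, the note type value, the little-endian byte order, the 4-byte alignment, and the JSON field names in the sample.

```python
import json
import struct

# Assumed values for illustration only; check the actual spec before relying on them.
ASSUMED_OWNER = b"FDO\0"
ASSUMED_TYPE = 0xCAFE1A7E


def _pad4(n):
    """Round n up to the next multiple of 4 (assumed note alignment)."""
    return (n + 3) & ~3


def iter_notes(data):
    """Yield (name, type, desc) for each note in a raw note section/segment."""
    offset = 0
    while offset + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from("<III", data, offset)
        offset += 12
        name = data[offset:offset + namesz]
        offset += _pad4(namesz)
        desc = data[offset:offset + descsz]
        offset += _pad4(descsz)
        yield name, ntype, desc


def package_metadata(data):
    """Return the decoded JSON payload of the first matching note, or None."""
    for name, ntype, desc in iter_notes(data):
        if name == ASSUMED_OWNER and ntype == ASSUMED_TYPE:
            return json.loads(desc.rstrip(b"\0").decode("utf-8"))
    return None


def build_note(name, ntype, desc):
    """Helper that serializes one note, used here only to make sample input."""
    return (struct.pack("<III", len(name), len(desc), ntype)
            + name.ljust(_pad4(len(name)), b"\0")
            + desc.ljust(_pad4(len(desc)), b"\0"))


# Sample input: an unrelated note followed by a package-metadata note.
sample = (build_note(b"GNU\0", 3, b"\x01\x02\x03\x04")
          + build_note(ASSUMED_OWNER, ASSUMED_TYPE, b'{"name":"hello","type":"rpm"}'))
metadata = package_metadata(sample)
print(metadata)
```

The same function would work on note bytes extracted from a core file on a different machine, which is the property an xattr cannot provide.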
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
Sorry for not responding to this in my previous reply.

On Wed, 2021-04-14 at 15:29 +, Zbigniew Jędrzejewski-Szmek wrote:
> I wanted to investigate this, but unfortunately, it's hard to check
> right now, because all builds are non-reproducible (in the sense of
> reproducible-builds.org), because we include the mtime of build
> products in rpm metadata, so pretty much all binary rpms are
> different.

I'm thinking this isn't that important. In most current RPMs, the mtimes for files are in two places:

1. In the (main) rpm header
2. In the cpio header for the file in the payload.

I can talk about the effect on RPM CoW: the mtime isn't part of the identity of the file - the identity is just a content hash. When the files are actually installed, the resulting inode is touch'ed to the right time. Therefore I think it's moot (MOOt?) from a CoW perspective: the reuse can happen.

Matthew.
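[Editor's sketch] A minimal illustration of the claim that a content-hash identity is unaffected by mtime. This is not RPM CoW's actual implementation; the choice of SHA-256 and the helper name content_id are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def content_id(path):
    """Identify a file purely by its bytes; timestamps play no part."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "binary")
    with open(path, "wb") as f:
        f.write(b"\x7fELF pretend payload")
    before = content_id(path)
    os.utime(path, (0, 0))  # rewrite atime/mtime only, contents untouched
    after = content_id(path)

print(before == after)
```

Under this model two builds that differ only in file timestamps still deduplicate, whereas a single changed byte inside the ELF file does not.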
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Wed, 2021-04-14 at 15:29 +, Zbigniew Jędrzejewski-Szmek wrote:
> Unfortunately this doesn't work for two important cases:
> - when a binary or shared library has been replaced on disk. E.g.
> it is fairly common for packages to crash on upgrade, and the crash
> could be in the _old_ code. When the metadata is loaded in a section,
> we get it all nice and dandy in the coredump. If it's in an xattr,
> we don't or even worse, get outdated info.

That's fair - if it were possible to get an fd during dump, we could use fgetxattr. If not, we can use /proc/$pid/exe - even when deleted you can interact with it:

[malmond@malmond-x1 ~]$ ls -l /proc/$$/exe
lrwxrwxrwx. 1 malmond malmond 0 Apr 14 15:45 /proc/364665/exe -> '/home/malmond/testbash (deleted)'
[malmond@malmond-x1 ~]$ attr -l /proc/$$/exe
Attribute "selinux" has a 54 byte value for /proc/364665/exe

(this is me copying bash, executing it, then deleting it). My thinking is this could go in systemd-coredump as it's invoked when dumping core anyway. Libraries are accessible from /proc/$pid/map_files/$range.

> - it doesn't work for non-rpm stuff.

I'm confused about this - I had put forth an idea for how to make rpm create this when installing packages (so it works with older or third party packages) but the same xattr could be created for any packaging system. Can you clarify what is rpm dependent here?

Matthew.
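[Editor's sketch] The /proc/$pid/exe lookup described above could look roughly like this from a crash handler. The attribute name "user.package" is hypothetical (the thread does not name one), and os.getxattr is only available on Linux, so the sketch returns None when the call or the attribute is missing.

```python
import os

# Hypothetical attribute name; the thread does not define one.
ASSUMED_ATTR = "user.package"


def read_package_xattr(pid, attr=ASSUMED_ATTR):
    """Read an xattr from a process's executable via /proc, or return None."""
    exe = f"/proc/{pid}/exe"
    try:
        # Follows the /proc symlink, so it can still reach a deleted file
        # for as long as the process keeps it mapped.
        return os.getxattr(exe, attr)
    except (OSError, AttributeError):
        # Missing attribute, missing process, unsupported filesystem,
        # or a platform without os.getxattr.
        return None


result = read_package_xattr(os.getpid())
print(result)
```

Note that this only works while the crashing process still exists on the machine that holds the xattr, which is the limitation raised in the replies.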
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Wed, Apr 14, 2021 at 11:47:42AM -0400, Neal Gompa wrote:
> On Wed, Apr 14, 2021 at 11:30 AM Zbigniew Jędrzejewski-Szmek
> wrote:
> >
> > On Tue, Apr 13, 2021 at 12:44:42AM +, Matthew Almond via devel wrote:
> > > On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> > > > Or in other words: packaging metadata are sources too. If they change
> > > > (and a version bump constitutes a change) the output might change, and
> > > > that's expected. What's key really is that the only things that can
> > > > affect generated output are the build/packaging environment and the
> > > > sources, but not parameters outside of that, such as the actual
> > > > wallclock.
> > >
> > > The main way that packaging "interferes" with the source is when
> > > patches are applied - the original timestamp of a tarball (for example)
> > > isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.
> > >
> > > > > My concern centers around the Copy on Write (CoW) use case - when
> > > > > packages are updated, some files change, and some may stay the same.
> > > > > Where they are the same, we can save I/O and possibly download time
> > > > > long term.
> > > >
> > > > Reproducible builds the way they are defined do not address such
> > > > file-level CoW optimization so much. They do address CoW optimization
> > > > on a package level much more however: i.e. the same package build will
> > > > have the same files in them, no matter what.
> > > >
> > > > Or to say this differently: if you want reproducible to work the way
> > > > you think it should work, you'd have to start by convincing the upstream
> > > > maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
> > > > luck with that.
> > >
> > > I think we should be careful to de-couple these two things. Just
> > > because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> > > proof that all binaries will. I remain concerned that this proposal
> > > forces the issue and every single version of every single ELF
> > > binary *must* be different, even if they really didn't change. The
> > > pattern I see is more automation and faster, smaller release cycles,
> > > and this forces downloads and writes of binaries that really didn't
> > > change their code.
> >
> > Yeah, that's definitely something to think about.
> >
> > The proposed change indeed "forces the issue". This could be a big drawback
> > or not, depending on how often identical binary builds happen for different
> > package versions. If it turns out that the answer is "only rarely", then
> > I wouldn't consider it too important. If the answer is "quite often", we
> > would lose a chance for a nice optimization.
> >
> > I wanted to investigate this, but unfortunately, it's hard to check
> > right now, because all builds are non-reproducible (in the sense of
> > reproducible-builds.org), because we include the mtime of build
> > products in rpm metadata, so pretty much all binary rpms are
> > different. And in general other things make builds non-reproducible,
> > and it's not obvious if *this* change makes things worse. I didn't
> > want to dig into individual rpms to compare binaries. I *think* most
> > packages are not actually rebuilt that often without changes…, but real
> > data is definitely needed.
> >
>
> We could start clamping times by default by adding the following to
> redhat-rpm-config:
>
> %clamp_mtime_to_source_date_epoch 1

Oh, is this already a thing? Nice!

https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/126

Zbyszek
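[Editor's sketch] As I understand the macro, clamping means that any file timestamp later than $SOURCE_DATE_EPOCH is replaced by $SOURCE_DATE_EPOCH, while older timestamps are left alone. The sketch below shows only that rule; it is not rpm's code, and the file names and timestamp values are invented for the example.

```python
import os


def source_date_epoch(environ=os.environ, default=0):
    """Read SOURCE_DATE_EPOCH (seconds since the Unix epoch) from an environment mapping."""
    return int(environ.get("SOURCE_DATE_EPOCH", default))


def clamp_mtime(mtime, epoch):
    """Never report a timestamp newer than the source date epoch."""
    return min(mtime, epoch)


# Invented sample data: one file older than the epoch, one produced during the build.
epoch = source_date_epoch({"SOURCE_DATE_EPOCH": "1618358400"})
build_products = {"old-doc.txt": 1600000000, "fresh-binary": 1618400000}
clamped = {name: clamp_mtime(mtime, epoch) for name, mtime in build_products.items()}
print(clamped)
```

With the clamp in place, rebuilding the same sources at a different wall-clock time yields the same recorded mtimes, which removes one source of differences between otherwise identical binary rpms.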
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Wed, Apr 14, 2021 at 11:30 AM Zbigniew Jędrzejewski-Szmek wrote:
>
> On Tue, Apr 13, 2021 at 12:44:42AM +, Matthew Almond via devel wrote:
> > On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> > > Or in other words: packaging metadata are sources too. If they change
> > > (and a version bump constitutes a change) the output might change, and
> > > that's expected. What's key really is that the only things that can
> > > affect generated output are the build/packaging environment and the
> > > sources, but not parameters outside of that, such as the actual
> > > wallclock.
> >
> > The main way that packaging "interferes" with the source is when
> > patches are applied - the original timestamp of a tarball (for example)
> > isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.
> >
> > > > My concern centers around the Copy on Write (CoW) use case - when
> > > > packages are updated, some files change, and some may stay the same.
> > > > Where they are the same, we can save I/O and possibly download time
> > > > long term.
> > >
> > > Reproducible builds the way they are defined do not address such
> > > file-level CoW optimization so much. They do address CoW optimization
> > > on a package level much more however: i.e. the same package build will
> > > have the same files in them, no matter what.
> > >
> > > Or to say this differently: if you want reproducible to work the way
> > > you think it should work, you'd have to start by convincing the upstream
> > > maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
> > > luck with that.
> >
> > I think we should be careful to de-couple these two things. Just
> > because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> > proof that all binaries will. I remain concerned that this proposal
> > forces the issue and every single version of every single ELF
> > binary *must* be different, even if they really didn't change. The
> > pattern I see is more automation and faster, smaller release cycles,
> > and this forces downloads and writes of binaries that really didn't
> > change their code.
>
> Yeah, that's definitely something to think about.
>
> The proposed change indeed "forces the issue". This could be a big drawback
> or not, depending on how often identical binary builds happen for different
> package versions. If it turns out that the answer is "only rarely", then
> I wouldn't consider it too important. If the answer is "quite often", we
> would lose a chance for a nice optimization.
>
> I wanted to investigate this, but unfortunately, it's hard to check
> right now, because all builds are non-reproducible (in the sense of
> reproducible-builds.org), because we include the mtime of build
> products in rpm metadata, so pretty much all binary rpms are
> different. And in general other things make builds non-reproducible,
> and it's not obvious if *this* change makes things worse. I didn't
> want to dig into individual rpms to compare binaries. I *think* most
> packages are not actually rebuilt that often without changes…, but real
> data is definitely needed.
>

We could start clamping times by default by adding the following to redhat-rpm-config:

%clamp_mtime_to_source_date_epoch 1

> > I have just thought of an alternative proposition: for ELF objects (and
> > ELF objects only): rpm could automatically, and systematically record
> > the metadata in an xattr. This would work on images without rpmdb,
> > work on most filesystem types, and be serialized in archives. Most
> > interestingly this could be implemented as an rpm plugin, and would
> > work retroactively for packages that were built before this proposal.
> > It could also be made to work for other packaging systems, and the
> > tooling that reads it wouldn't need to know the original packaging
> > system.
>
> Unfortunately this doesn't work for two important cases:
> - when a binary or shared library has been replaced on disk. E.g.
>   it is fairly common for packages to crash on upgrade, and the crash
>   could be in the _old_ code. When the metadata is loaded in a section,
>   we get it all nice and dandy in the coredump. If it's in an xattr,
>   we don't or even worse, get outdated info.
> - it doesn't work for non-rpm stuff.
>
> Zbyszek

--
真実はいつも一つ!/ Always, there's only one truth!
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Tue, Apr 13, 2021 at 12:44:42AM +, Matthew Almond via devel wrote:
> On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> > Or in other words: packaging metadata are sources too. If they change
> > (and a version bump constitutes a change) the output might change, and
> > that's expected. What's key really is that the only things that can
> > affect generated output are the build/packaging environment and the
> > sources, but not parameters outside of that, such as the actual
> > wallclock.
>
> The main way that packaging "interferes" with the source is when
> patches are applied - the original timestamp of a tarball (for example)
> isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.
>
> > > My concern centers around the Copy on Write (CoW) use case - when
> > > packages are updated, some files change, and some may stay the same.
> > > Where they are the same, we can save I/O and possibly download time
> > > long term.
> >
> > Reproducible builds the way they are defined do not address such
> > file-level CoW optimization so much. They do address CoW optimization
> > on a package level much more however: i.e. the same package build will
> > have the same files in them, no matter what.
> >
> > Or to say this differently: if you want reproducible to work the way
> > you think it should work, you'd have to start by convincing the upstream
> > maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
> > luck with that.
>
> I think we should be careful to de-couple these two things. Just
> because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> proof that all binaries will. I remain concerned that this proposal
> forces the issue and every single version of every single ELF
> binary *must* be different, even if they really didn't change. The
> pattern I see is more automation and faster, smaller release cycles,
> and this forces downloads and writes of binaries that really didn't
> change their code.

Yeah, that's definitely something to think about.

The proposed change indeed "forces the issue". This could be a big drawback or not, depending on how often identical binary builds happen for different package versions. If it turns out that the answer is "only rarely", then I wouldn't consider it too important. If the answer is "quite often", we would lose a chance for a nice optimization.

I wanted to investigate this, but unfortunately, it's hard to check right now, because all builds are non-reproducible (in the sense of reproducible-builds.org), because we include the mtime of build products in rpm metadata, so pretty much all binary rpms are different. And in general other things make builds non-reproducible, and it's not obvious if *this* change makes things worse. I didn't want to dig into individual rpms to compare binaries. I *think* most packages are not actually rebuilt that often without changes…, but real data is definitely needed.

> I have just thought of an alternative proposition: for ELF objects (and
> ELF objects only): rpm could automatically, and systematically record
> the metadata in an xattr. This would work on images without rpmdb,
> work on most filesystem types, and be serialized in archives. Most
> interestingly this could be implemented as an rpm plugin, and would
> work retroactively for packages that were built before this proposal.
> It could also be made to work for other packaging systems, and the
> tooling that reads it wouldn't need to know the original packaging
> system.

Unfortunately this doesn't work for two important cases:
- when a binary or shared library has been replaced on disk. E.g. it is fairly common for packages to crash on upgrade, and the crash could be in the _old_ code. When the metadata is loaded in a section, we get it all nice and dandy in the coredump. If it's in an xattr, we don't or, even worse, get outdated info.
- it doesn't work for non-rpm stuff.

Zbyszek
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Mon, Apr 12, 2021, at 8:44 PM, Matthew Almond via devel wrote:
>
> I think we should be careful to de-couple these two things. Just
> because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not
> proof that all binaries will.

Agreed; it'd be interesting to gather some data here, particularly for components with large binaries.

> I have just thought of an alternative proposition: for ELF objects (and
> ELF objects only): rpm could automatically, and systematically record
> the metadata in an xattr.

OSTree would be affected in the same way as your "RPM CoW" proposal by the approach of having it in the binary directly. Unless we did this, because ostree is based on hardlinking which works on every filesystem, but shares an inode and hence the extended attributes are included in the ostree checksum. (There is some support for adding an additional "payload" i.e. content checksum in ostree but it adds another mapping and so we don't enable it by default).

But on reflink-capable filesystems in theory if this content is just in the ELF header we could skip it and reflink just the remainder which would be most of the binary. (But, this would necessitate a strategy other than checksumming the whole binary of course, something more like rsync-style "rollsum" windows that we use for ostree static deltas, e.g.)
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Tue, Apr 13, 2021 at 8:09 AM Zbigniew Jędrzejewski-Szmek wrote:
>
> On Mon, Apr 12, 2021 at 10:57:30PM +0200, Lennart Poettering wrote:
> > On Mon, 12.04.21 16:14, David Malcolm (dmalc...@redhat.com) wrote:
> >
> > > So I want to push back on the idea that a single package can be
> > > associated with a coredump, or be the one responsible for the crash:
> > > any or all of the ELF objects linked into the process could be at
> > > fault.
> >
> > The example in the feature page shows how we handle this: you'll see
> > the packaging metadata of all involved ELF objects in coredumpctl's
> > output. i.e. we should be nicely covered on this, and we are fully
> > aware that the "main" ELF object is the culprit of crashes only in a
> > fraction of cases.
>
> This is true.
>
> OTOH, this new metadata doesn't really change the situation here.
> Before, we already had build-ids for all the packages "involved" in
> the stack trace. And our processing tools already could do the
> conversion to package nevras. (They have to have network access to
> create a report.) The only thing that changes is *how* this conversion
> happens, but for online reports such conversion was always possible.
>

The new metadata guarantees that the ELF data churns, though. For example, if I bump the Release in a spec file for something unrelated to the build, all the ELF blobs change. The current state means that this is deduplicated in RPM CoW and a very cheap upgrade, since the binaries weren't all touched.

--
真実はいつも一つ!/ Always, there's only one truth!
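[Editor's sketch] The churn concern can be made concrete: if the embedded metadata includes the Release, a release-only bump changes the embedded bytes, and if it is left out, the bytes stay identical. The field names, the JSON serialization, and the package values below are assumptions for illustration, not the format defined by the Change proposal.

```python
import hashlib
import json


def note_payload(name, version, release=None):
    """Serialize metadata deterministically (sorted keys, compact separators)."""
    fields = {"type": "rpm", "name": name, "version": version}
    if release is not None:
        fields["release"] = release
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(payload):
    """Stand-in for whatever content hash a deduplicating store would use."""
    return hashlib.sha256(payload).hexdigest()


# Invented package values: same version, release bumped from 1 to 2.
with_release_1 = digest(note_payload("hello", "2.10", release="1.fc35"))
with_release_2 = digest(note_payload("hello", "2.10", release="2.fc35"))
without_release_1 = digest(note_payload("hello", "2.10"))
without_release_2 = digest(note_payload("hello", "2.10"))

print(with_release_1 == with_release_2)
print(without_release_1 == without_release_2)
```

The trade-off is that a payload without the release can no longer distinguish two rebuilds of the same version, so which fields to include is a policy decision for whoever generates the note.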
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Mon, Apr 12, 2021 at 10:57:30PM +0200, Lennart Poettering wrote:
> On Mon, 12.04.21 16:14, David Malcolm (dmalc...@redhat.com) wrote:
>
> > So I want to push back on the idea that a single package can be
> > associated with a coredump, or be the one responsible for the crash:
> > any or all of the ELF objects linked into the process could be at
> > fault.
>
> The example in the feature page shows how we handle this: you'll see
> the packaging metadata of all involved ELF objects in coredumpctl's
> output. i.e. we should be nicely covered on this, and we are fully
> aware that the "main" ELF object is the culprit of crashes only in a
> fraction of cases.

This is true.

OTOH, this new metadata doesn't really change the situation here. Before, we already had build-ids for all the packages "involved" in the stack trace. And our processing tools already could do the conversion to package nevras. (They have to have network access to create a report.) The only thing that changes is *how* this conversion happens, but for online reports such conversion was always possible.

That said, it *is* strange that abrt prints just one package nevra in bugzilla reports [1].

Zbyszek

[1] completely arbitrary example I happened to have open in a tab:
https://bugzilla.redhat.com/show_bug.cgi?id=1895937
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Mon, 2021-04-12 at 23:10 +0200, Lennart Poettering wrote:
> Or in other words: packaging metadata are sources too. If they change
> (and a version bump constitutes a change) the output might change, and
> that's expected. What's key really is that the only things that can
> affect generated output are the build/packaging environment and the
> sources, but not parameters outside of that, such as the actual
> wallclock.

The main way that packaging "interferes" with the source is when patches are applied - the original timestamp of a tarball (for example) isn't complete enough to use for $SOURCE_DATE_EPOCH. That's fair.

> > My concern centers around the Copy on Write (CoW) use case - when
> > packages are updated, some files change, and some may stay the same.
> > Where they are the same, we can save I/O and possibly download time
> > long term.
>
> Reproducible builds the way they are defined do not address such
> file-level CoW optimization so much. They do address CoW optimization
> on a package level much more however: i.e. the same package build will
> have the same files in them, no matter what.
>
> Or to say this differently: if you want reproducible to work the way
> you think it should work, you'd have to start by convincing the upstream
> maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
> luck with that.

I think we should be careful to de-couple these two things. Just because $SOURCE_DATE_EPOCH is likely to affect a lot of binaries is not proof that all binaries will. I remain concerned that this proposal forces the issue and every single version of every single ELF binary *must* be different, even if they really didn't change. The pattern I see is more automation and faster, smaller release cycles, and this forces downloads and writes of binaries that really didn't change their code.

I have just thought of an alternative proposition: for ELF objects (and ELF objects only): rpm could automatically, and systematically record the metadata in an xattr. This would work on images without rpmdb, work on most filesystem types, and be serialized in archives. Most interestingly this could be implemented as an rpm plugin, and would work retroactively for packages that were built before this proposal. It could also be made to work for other packaging systems, and the tooling that reads it wouldn't need to know the original packaging system.

Thoughts?

Matthew
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Mo, 12.04.21 20:40, Fedora Development ML (devel@lists.fedoraproject.org) wrote:

> On Mon, 2021-04-12 at 15:46 -0400, Ben Cotton wrote:
> > https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
>
> Putting packaging info into a binary guarantees that each successive
> package containing ELF binaries will not contain exactly the same
> binaries, even if there are no changes.
>
> Now, what I just wrote there is predicated on "reproducible builds"
> where the same source (including deps, headers) and the same toolchain
> produce the same output. This may or may not be a thing. My concern is
> that we completely eliminate the possibility of binaries being
> unchanged.

I think this is a misunderstanding of how reproducible builds are
supposed to work. For example, consider $SOURCE_DATE_EPOCH as defined
here:

https://reproducible-builds.org/specs/source-date-epoch/

It's expressly defined to be used as the source timestamp when that
source timestamp is included in build output. It is also expressly
documented to be a value initialized from the packaging Changelog
timestamps.

Or in other words: the way the reproducible builds project understands
their own stuff it's absolutely OK to generate different output on
package rebuilds that change the package versions.

Or in other words: packaging metadata are sources too. If they change
(and a version bump constitutes a change) the output might change, and
that's expected. What's key really is that the only things that can
affect generated output are the build/packaging environment and the
sources, but not parameters outside of that, such as the actual
wallclock.

> My concern centers around the Copy on Write (CoW) use case - when
> packages are updated, some files change, and some may stay the same.
> Where they are the same, we can save I/O and possibly download time
> long term.

Reproducible builds the way they are defined do not address such
file-level CoW optimization so much.
They do address CoW optimization on a package level much more however:
i.e. the same package build will have the same files in them, no matter
what.

Or to say this differently: if you want reproducible to work the way you
think it should work, you'd have to start by convincing the upstream
maintainers to kill $SOURCE_DATE_EPOCH and similar concepts, but good
luck with that.

Lennart

--
Lennart Poettering, Berlin
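For readers unfamiliar with the mechanism discussed above, here is a minimal sketch of how a build step might honour $SOURCE_DATE_EPOCH instead of the wall clock. It assumes only what the linked specification states (the variable holds an integer number of seconds since the Unix epoch); the function names `build_timestamp` and `render_banner` are hypothetical and not taken from any real build tool.

```python
import os
import time


def build_timestamp(environ=None):
    """Return the timestamp a build step should embed in its output.

    Uses SOURCE_DATE_EPOCH (integer seconds since the Unix epoch) when it
    is set, and falls back to the wall clock otherwise.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is None:
        return int(time.time())  # non-reproducible fallback
    return int(value)


def render_banner(version, environ=None):
    """Hypothetical build step that stamps a version and date into output."""
    stamp = time.strftime("%Y-%m-%d", time.gmtime(build_timestamp(environ)))
    return f"hello {version} (built {stamp})"


# Same sources and same packaging metadata give the same output.
env = {"SOURCE_DATE_EPOCH": "1618185600"}
print(render_banner("0-1.fc35", env))
print(render_banner("0-1.fc35", env))
# A changelog entry with a later date legitimately changes the output.
print(render_banner("0-2.fc35", {"SOURCE_DATE_EPOCH": "1618272000"}))
```

The point of the sketch is that the output depends on the packaging metadata passed in, never on when the build happens to run.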
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Mo, 12.04.21 16:14, David Malcolm (dmalc...@redhat.com) wrote:

> So I want to push back on the idea that a single package can be
> associated with a coredump, or be the one responsible for the crash:
> any or all of the ELF objects linked into the process could be at
> fault.

The example in the feature page shows how we handle this: you'll see the
packaging metadata of all involved ELF objects in coredumpctl's output.
i.e. we should be nicely covered on this, and we are fully aware that
the "main" ELF object is the culprit of crashes only in a fraction of
cases.

Lennart

--
Lennart Poettering, Berlin
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Mon, 2021-04-12 at 15:46 -0400, Ben Cotton wrote:
> https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects

Putting packaging info into a binary guarantees that each successive
package containing ELF binaries will not contain exactly the same
binaries, even if there are no changes.

Now, what I just wrote there is predicated on "reproducible builds"
where the same source (including deps, headers) and the same toolchain
produce the same output. This may or may not be a thing. My concern is
that we completely eliminate the possibility of binaries being
unchanged.

My concern centers around the Copy on Write (CoW) use case - when
packages are updated, some files change, and some may stay the same.
Where they are the same, we can save I/O and possibly download time long
term.

My recommendation here is to (continue to?) log build ids, and resolve
remotely if you don't have an rpmdb to consult. Build ids are opaque and
meaningless to end users, but end users aren't the target. My
expectation is that any data collection around crashes needs to
aggregate, and build ids are good enough to identify packages, even
after the fact.

Matthew.
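The file-level concern above can be illustrated with a small sketch. This is not how rpm or any CoW implementation works internally; it assumes a simplified model in which a file can be shared between two package builds only if its content hash is identical. `fake_elf` and `unchanged_files` are hypothetical helpers, and the byte strings are stand-ins, not real ELF objects.

```python
import hashlib
import json


def fake_elf(code, note=None):
    """Stand-in for a binary: code bytes plus an optional JSON note."""
    blob = code
    if note is not None:
        blob += json.dumps(note, sort_keys=True, separators=(",", ":")).encode()
    return blob


def unchanged_files(old, new):
    """Names of files whose content hash is identical in both builds."""
    def digest(data):
        return hashlib.sha256(data).hexdigest()

    return sorted(
        name for name in old
        if name in new and digest(old[name]) == digest(new[name])
    )


code = b"\x7fELF...identical machine code..."

# Release bumped, code untouched, no embedded note: the file can be shared.
print(unchanged_files({"hello": fake_elf(code)}, {"hello": fake_elf(code)}))

# Same code, but the note carries the full version-release: the file differs.
old = {"hello": fake_elf(code, {"name": "hello", "version": "0-1.fc35"})}
new = {"hello": fake_elf(code, {"name": "hello", "version": "0-2.fc35"})}
print(unchanged_files(old, new))

# A note without the release field leaves the file identical again.
old = {"hello": fake_elf(code, {"name": "hello"})}
new = {"hello": fake_elf(code, {"name": "hello"})}
print(unchanged_files(old, new))
```

Under this model, whether an unrelated release bump churns every binary depends entirely on which fields end up in the note.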
Re: F35 Change: Package information on ELF objects (System-Wide Change proposal)
On Mon, 2021-04-12 at 15:46 -0400, Ben Cotton wrote:
> https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
>
> == Summary ==
> All binaries (executables and shared libraries) are annotated with an
> ELF note that identifies the rpm distributing this file.
>
> == Owner ==
> * Name: [[User:Zbyszek|Zbigniew Jędrzejewski-Szmek]]
> * Email: zbys...@in.waw.pl
> * Name: Lennart Poettering
> * Email: mzsrq...@0pointer.net
>
>
> == Detailed Description ==
> See [https://github.com/systemd/systemd/issues/18433 systemd issue
> #18433] for discussion and implementation proposals.
>
> Programs crash. And when they do, they dump core, and we want to tell
> the user which package, including the version, caused the failure.

This might be better as:

"which packages [plural] could have been responsible for the failure"

I used to maintain the Fedora "python" package, and I kept receiving
bugzilla reports assigned to the "python" package filed via ABRT,
because /usr/bin/python had crashed. It was almost never
/usr/bin/python at fault: it's a tiny 4k executable linked to a much
larger libpython.so (in a different subpackage) - but generally that
wasn't at fault either: the python extension API exposes the insides of
the virtual machine and its objects directly, and it's very easy for a
buggy extension to corrupt something in the process, sometimes at some
remove from where the segfault finally brings the process down.

I tried writing scripts to help update the bugs to use the correct bz
component. In theory you could look at the deepest point in the
callstack, but it might e.g. be an assertion failure handler in a shared
library, rather than the "real" site of the crash. Or some object could
have become corrupted at some point long before the crash actually
fires, so the blame can't be diagnosed just from the final callstack.

Dealing with this deluge of misfiled bug reports is what got me
interested in static analysis and in maintaining GCC itself, fwiw.
So I want to push back on the idea that a single package can be
associated with a coredump, or be the one responsible for the crash: any
or all of the ELF objects linked into the process could be at fault.

Hope this is constructive (sorry, the wording in the proposal touched a
nerve for me!)

Dave
F35 Change: Package information on ELF objects (System-Wide Change proposal)
https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects

== Summary ==
All binaries (executables and shared libraries) are annotated with an
ELF note that identifies the rpm distributing this file.

== Owner ==
* Name: [[User:Zbyszek|Zbigniew Jędrzejewski-Szmek]]
* Email: zbys...@in.waw.pl
* Name: Lennart Poettering
* Email: mzsrq...@0pointer.net

== Detailed Description ==
See [https://github.com/systemd/systemd/issues/18433 systemd issue #18433]
for discussion and implementation proposals.

Programs crash. And when they do, they dump core, and we want to tell
the user which package, including the version, caused the failure.

ELF note `.note.package` will be added to specify package nevra. By
embedding this information directly in the binary object, package nevra
is immediately available from a core dump.

=== Existing system: `.note.gnu.build-id` ===
We already have build-ids: every ELF object has a `.note.gnu.build-id`
note, and given a core file, we can read the build-id and look it up in
the rpm database (`dnf repoquery --whatprovides debuginfo(build-id) = …`)
to map it to a package name.

Build-ids are unique, compact and very generic, and work as expected in
general. But they have some downsides:

* build-ids are not very informative for users. Before the build-id is converted back to the appropriate package, it's completely opaque.
* build-ids require a working rpm database or an internet connection to map to the package name. Three important cases:
* minimal containers: the rpm database is not installed in the containers. The information about build-ids needs to be stored externally, so package name information is not available immediately, but only after offline processing. The new note doesn't depend on the rpm db in any way.
* handling of a core from a container, where the container and host have different distros
* self-built and external packages: unless a lot of care is taken to keep access to the debuginfo packages, this information may be lost. The new note is available even if the repository metadata gets lost. Users can easily provide equivalent information in a format that makes sense in their own environment. It should work even when rpms and debs and other formats are mixed, e.g. during container image creation.

=== New system: `.note.package` ===
The new note is created and propagated similarly to
`.note.gnu.build-id`. The difference is that we inject the information
about package nevra from the build system.

The implementation is very simple: `%{build_ldflags}` are extended with
a command to insert a custom note as a separate section in an ELF
object. See [https://github.com/systemd/package-notes/blob/main/hello.spec hello.spec]
for an example. This is done in the default macros, so all packages that
use the prescribed link flags will be affected.

The note is a compact JSON string. This makes the format trivially
extensible (new fields can be added at will) and easy to process (JSON
is extremely popular and parsers are widely available). Using a single
field is more space-efficient. With multiple fields the padding and
alignment requirements cause unnecessary overhead.

The system was designed with cross-distro collaboration in mind and is
flexible enough to identify binaries from different packaging formats
and build systems (rpms, debs, custom binaries).

The overhead is about 200 bytes for each ELF object. If we do this only
for executables, then for the whole distro, 5000 × 200 bytes = 1 MB. If
we do it for shared libraries, then the cost will be maybe 4 times
higher. Precise measurements TBD once we know the final implementation
and figure out the right repoquery magic.
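To make the shape of such a note concrete, here is a minimal sketch that builds and parses one note record. It assumes the standard ELF note layout (three 32-bit words for name size, descriptor size and type, followed by the name and the descriptor, each padded to a 4-byte boundary) and little-endian byte order. The owner string `FDO` and the type value `0xcafe1a7e` are read off the example dump in this proposal; `pad4`, `build_note` and `parse_note` are hypothetical helper names and not part of the package-notes tooling.

```python
import json
import struct

OWNER = b"FDO\x00"       # note owner, as seen in the example dump
NOTE_TYPE = 0xCAFE1A7E   # note type, as seen in the example dump


def pad4(data):
    """Pad to a 4-byte boundary (assumed alignment for note fields)."""
    return data + b"\x00" * (-len(data) % 4)


def build_note(fields):
    """Serialize fields as compact JSON and wrap them in a note record."""
    desc = json.dumps(fields, separators=(",", ":")).encode() + b"\x00"
    header = struct.pack("<III", len(OWNER), len(desc), NOTE_TYPE)
    return header + pad4(OWNER) + pad4(desc)


def parse_note(blob):
    """Decode one note record and return (owner, type, fields)."""
    namesz, descsz, ntype = struct.unpack_from("<III", blob, 0)
    owner = blob[12:12 + namesz].rstrip(b"\x00").decode()
    desc_start = 12 + (namesz + 3) // 4 * 4
    desc = blob[desc_start:desc_start + descsz].rstrip(b"\x00")
    return owner, ntype, json.loads(desc)


fields = {
    "type": "rpm",
    "name": "hello",
    "version": "0-1.fc35.x86_64",
    "osCpe": "cpe:/o:fedoraproject:fedora:33",
}
note = build_note(fields)
owner, ntype, decoded = parse_note(note)
print(len(note), owner, hex(ntype), decoded["name"])
```

For the `hello` fields this record comes to 116 bytes, which sits comfortably inside the estimate of about 200 bytes per ELF object; longer package names and versions grow it accordingly.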
=== Examples ===

$ objdump -s -j .note.package build/libhello.so

build/libhello.so:     file format elf64-x86-64

Contents of section .note.package:
 02ec 04000000 63000000 7e1afeca 46444f00  ....c...~...FDO.
 02fc 7b227479 7065223a 2272706d 222c226e  {"type":"rpm","n
 030c 616d6522 3a226865 6c6c6f22 2c227665  ame":"hello","ve
 031c 7273696f 6e223a22 302d312e 66633335  rsion":"0-1.fc35
 032c 2e783836 5f363422 2c226f73 43706522  .x86_64","osCpe"
 033c 3a226370 653a2f6f 3a666564 6f726170  :"cpe:/o:fedorap
 034c 726f6a65 63743a66 65646f72 613a3333  roject:fedora:33
 035c 227d0000                             "}..

$ readelf --notes build/hello | grep "description data" | sed -e "s/\s*description data: //g" -e "s/ //g" | xxd -p -r | jq
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10de
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10af
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x119f
{
  "type": "rpm",
  "name": "hello",
  "version": "0-1.fc35.x86_64",
  "osCpe": "cpe:/o:fedoraproject:fedora:33"
}

$ coredumpctl info
           PID: 44522 (fsverity)
...
       Package: fsverity-utils/1.3-1
      build-id: ac89bf7175b04d7eec7f6544a923f45be111f0be
       Message: Process 44522 (fsverity) of user 1000 dum