Bug#922744: Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
Hi.

Thanks for your reply. The links to bugs you included add much needed detail to this discussion.

> "Matthias" == Matthias Klose writes:

    Matthias> as pre-processing. So we know since about three years
    Matthias> that dwz doesn't support compressed debug symbols. Your
    Matthias> language about "claims", "might", and so on is not
    Matthias> appropriate.

No, we know that three years ago dwz didn't support compressed debug symbols. Since that information is three years out of date, you get my "might" and "claim" language.

You're the binutils maintainer! If you happen to know that dwz still doesn't support compressed symbols then *say that* and all my language about "might" and "claim" will go away. I absolutely trust your knowledge about what our ELF stack does. It's possible it's a language issue, but so far you've used rather vague language rather than making specific claims in an area where you are an expert.

If you don't know, that's fine. But if no one who would like to see us move away from compressed debug symbols has chosen to check and see whether dwz still requires uncompressed symbols, well, I think that is significant.

I think the primary burden of arguing for a change lies with those proposing that change. So, I do think that people proposing a change need to do things like find out what specific tools break. (Including pointers to bugs as you have done in the last mail also counts as providing that sort of justification. I'll admit that I don't see how the pointer to the rpm find-debuginfo script quite fits in, but I think I follow the valgrind issue.)
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
On 2021-03-06 11:11, Jakub Wilk wrote:
> * Elana Hashman, 2021-02-17, 11:06:
>> Would you be able to research some representative slice of popular
>> packages that would be affected by the policy change (at least 10)
>> and share the on-disk sizes with compression vs. without?
>
> Not exactly what you asked Niels for, but...
>
> A few months ago I recompressed whole buster/main/amd64 to see what the
> effect of ditching --compress-debug-sections would be. Raw data for this
> experiment is available here:
>
> https://github.com/jwilk/lets-shrink-dbgsym/releases/download/20200708/buster-main-amd64-20200708.tsv.xz
>
> The columns are:
>
> * file name
> * original .deb size
> * recompressed .deb size
> * original installed size
> * recompressed installed size
>
> Note that some of the .deb size savings might be caused by the fix for
> #868674 (for packages that haven't been rebuilt since the fix).

Hi Jakub,

This is very helpful. Using this file, I have calculated the following three aggregates:

1. recompressed .deb size as a percentage of the original .deb size
2. recompressed installed size as a percentage of the original installed size
3. difference in bytes between the recompressed and original installed size

I then performed a quartile analysis on it.

Recompressed size is X% of original .deb:

Min       3.97%
25%      65.45%
Median   74.73%
75%      82.64%
Max     105.01%

Installed size of recompressed is X% of original installed size:

Min      100.06%
25%      230.72%
Median   256.76%
75%      292.76%
Max     4267.38%

Size difference between recompressed and original installed size, is X bytes:

Min            20480 (20KB)
25%            89088 (90KB)
Median        404480 (404KB)
75%          2461184 (2.5MB)
Max       5728869376 (5.72GB)

So I think we can conclude the following:

- In essentially all cases, the recompressed .deb is smaller than the original.
- In all cases, the installed size of the debug symbols is larger, usually about 2-3x the original installed size.
- In all but the largest cases, that size difference is negligible. However, the large cases have quite an extreme difference.

Hence, I think the tail end of large packages will guide this decision, and Josh very helpfully provided an analysis of those already!

Because of the extreme difference for the large packages, and because many of those packages are relatively popular, I think I am inclined to maintain the status quo, i.e. with --compress-debug-sections enabled by default.

I am open to being convinced otherwise :)

Cheers,

- e
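The quartile analysis described in this message could be reproduced roughly as follows. This is only a sketch under stated assumptions: it assumes the TSV has no header row, that its five columns appear in the order Jakub listed, and that sizes are plain integers. The function names, the sample rows, and the choice of the "inclusive" quartile method are illustrative; the message does not say which tool or method was actually used.

```python
import csv
import io
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # The "inclusive" method is an assumption; the original analysis
    # does not say which quartile definition was used.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, median, q3, data[-1])


def summarize(tsv_text):
    """Compute the three aggregates per package, then summarize each."""
    deb_pct, inst_pct, inst_diff = [], [], []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        _name, deb_old, deb_new, inst_old, inst_new = row
        deb_old, deb_new = int(deb_old), int(deb_new)
        inst_old, inst_new = int(inst_old), int(inst_new)
        deb_pct.append(100.0 * deb_new / deb_old)      # aggregate 1
        inst_pct.append(100.0 * inst_new / inst_old)   # aggregate 2
        inst_diff.append(inst_new - inst_old)          # aggregate 3
    return {
        "deb_pct": five_number_summary(deb_pct),
        "inst_pct": five_number_summary(inst_pct),
        "inst_diff": five_number_summary(inst_diff),
    }


# Made-up sample rows for illustration; not taken from the real data set.
SAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ("a-dbgsym.deb", "100", "50", "1000", "2000"),
        ("b-dbgsym.deb", "100", "60", "1000", "3000"),
        ("c-dbgsym.deb", "100", "70", "1000", "2500"),
        ("d-dbgsym.deb", "100", "80", "1000", "4000"),
        ("e-dbgsym.deb", "100", "90", "1000", "1500"),
    ]
)

if __name__ == "__main__":
    for key, summary in summarize(SAMPLE).items():
        print(key, summary)
```

To run it against the real data, the decompressed TSV contents would be passed to summarize() in place of SAMPLE.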
Bug#922744: Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
On 3/8/21 6:27 PM, Sam Hartman wrote:
>> "Matthias" == Matthias Klose writes:
>
>     Matthias> Maybe you should be more specific about "those who can't
>     Matthias> use" uncompressed debug info in the first place.
>
> So, you've argued that the disk savings are not significant inside a
> package, because packages are themselves compressed.
>
> What people are arguing is that they want to have debug info for large
> programs like firefox or chromium installed, or really debug info for
> large parts of their system.
> They are in effect arguing that they care about the installed size not
> the package size.
> They have explicitly argued that having to uninstall and then later
> reinstall disadvantages their debug cycle.
>
> This situation is particularly unfortunate because it sounds like we
> have a conflict between two techniques for saving space.
>
> On one hand we have dwz which tries to optimize and reduce overall size
> of debug symbols
>
> which is incompatible (apparently--no one has explicitly confirmed this)
> with compressed debug symbols.
>
> Presumably we can still run dwz within a single package by doing so
> before debug symbols are compressed.
> But presumably this gets in the way of people running dwz themselves or
> something.
>
> I'll be blunt.
> The people who say that they want debug symbols installed on their
> system have made a simple, easy to understand argument.

Let me be blunt as well. The only reference I can find regarding the size on disk is #922744. Contrary to what you're saying "that they want to have debug info for large programs like firefox *or* chromium installed", they want to have debug symbols for firefox *and* chromium *and* more installed at the same time. #631985 speaks about a 10G space requirement for debugging KDE alone. The decision about the compressed debug symbols was made ten years ago. Maybe it's time to re-evaluate what expectations for a debug installation should be set.

> The argument that compressed debug symbols break things is still poorly
> stated.
> We've had a claim that dwz might not work with compressed debug symbols
> (and didn't used to).
> We've had no one explain how that creates a problem in practice or even
> confirm it's still the case.
> It felt like pulling teeth to even get an answer that might be a tool we
> care about.

#87 asked for "postprocessing in dh_strip", however it was implemented in

  * dh_dwz: Add new experimental tool to run dwz(1) to deduplicate
    ELF debugging symbols. It should be generally be run before
    dh_strip (as dh_strip compresses the debug symbols and dwz
    expects uncompressed debug symbols). (Closes: #87)

as pre-processing. So we know since about three years that dwz doesn't support compressed debug symbols. Your language about "claims", "might", and so on is not appropriate.

Upstreams are currently looking at issues seen with valgrind about the .gnu_debuglink section and the .gnu_debugaltlink section in

https://bugs.kde.org/show_bug.cgi?id=427969
https://bugs.kde.org/show_bug.cgi?id=396656

so apparently there are issues with another tool (valgrind), and with how the debug information is created and split in debhelper.

Also see https://github.com/rpm-software-management/rpm/blob/master/scripts/find-debuginfo.sh for how dwz is run *after* separating the debug info, not touching the stripped binaries.

Apparently the choice for compressed debug sections resulted later in an implementation for dh_dwz which is causing issues on its own.

Unrelated to that, but to not create conflicting dbg and dbgsym packages, there are #968710 and #981245, and it will be difficult to integrate within debhelper without introducing a new debhelper compat level. Also unrelated, there are #971724, #971680, and packages manually installing additional files in auto-generated dbgsym packages.

Maybe these decisions about dh_strip were made to the best knowledge at the time, but the current situation is a mess. Sticking to compressed debug sections is just one issue ...

Matthias
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
> "Matthias" == Matthias Klose writes:

    Matthias> Maybe you should be more specific about "those who can't
    Matthias> use" uncompressed debug info in the first place.

So, you've argued that the disk savings are not significant inside a package, because packages are themselves compressed.

What people are arguing is that they want to have debug info for large programs like firefox or chromium installed, or really debug info for large parts of their system. They are in effect arguing that they care about the installed size not the package size. They have explicitly argued that having to uninstall and then later reinstall disadvantages their debug cycle.

This situation is particularly unfortunate because it sounds like we have a conflict between two techniques for saving space.

On one hand we have dwz which tries to optimize and reduce overall size of debug symbols

which is incompatible (apparently--no one has explicitly confirmed this) with compressed debug symbols.

Presumably we can still run dwz within a single package by doing so before debug symbols are compressed. But presumably this gets in the way of people running dwz themselves or something.

I'll be blunt. The people who say that they want debug symbols installed on their system have made a simple, easy to understand argument.

The argument that compressed debug symbols break things is still poorly stated. We've had a claim that dwz might not work with compressed debug symbols (and didn't used to). We've had no one explain how that creates a problem in practice or even confirm it's still the case. It felt like pulling teeth to even get an answer that might be a tool we care about.

Please be less vague!

--Sam
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
On 3/7/21 11:51 PM, Sean Whitton wrote:
> Hello,
>
> On Sun 07 Mar 2021 at 03:50PM -07, Sean Whitton wrote:
>
>> This is not much good if your network is weak or you're offline, or you
>> don't want information on your debugging to go out to a web service.
>
> What I mean is: debuginfod is great in many scenarios, but we should
> probably care about those who can't or won't use it too. Sorry if the
> above is a bit blunt.

Your comment is unfocused. This was just another use case where uncompressed debug info could be harmful, and I pointed out a configuration that avoids it. Maybe you should be more specific about "those who can't use" uncompressed debug info in the first place.

Matthias
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
Hello,

On Sun 07 Mar 2021 at 03:50PM -07, Sean Whitton wrote:

> This is not much good if your network is weak or you're offline, or you
> don't want information on your debugging to go out to a web service.

What I mean is: debuginfod is great in many scenarios, but we should probably care about those who can't or won't use it too. Sorry if the above is a bit blunt.

--
Sean Whitton
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
Hello,

On Fri 05 Mar 2021 at 06:22PM +01, Matthias Klose wrote:

> yes, the rationale for uncompressed debug sections is that any tool can access
> them. On disk as deb/dbgsym package, there is no big difference in
> size.

The data elsewhere in this bug would suggest otherwise!

> Also a debuginfod server can be configured to send the debuginfo
> compressed on the fly. The "only" downside is to have the uncompressed
> debuginfo on the disk when doing the debugging.

This is not much good if your network is weak or you're offline, or you don't want information on your debugging to go out to a web service.

--
Sean Whitton
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
* Elana Hashman, 2021-02-17, 11:06:
> Would you be able to research some representative slice of popular
> packages that would be affected by the policy change (at least 10)
> and share the on-disk sizes with compression vs. without?

Not exactly what you asked Niels for, but...

A few months ago I recompressed whole buster/main/amd64 to see what the effect of ditching --compress-debug-sections would be. Raw data for this experiment is available here:

https://github.com/jwilk/lets-shrink-dbgsym/releases/download/20200708/buster-main-amd64-20200708.tsv.xz

The columns are:

* file name
* original .deb size
* recompressed .deb size
* original installed size
* recompressed installed size

Note that some of the .deb size savings might be caused by the fix for #868674 (for packages that haven't been rebuilt since the fix).

--
Jakub Wilk
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
Yes, the rationale for uncompressed debug sections is that any tool can access them. On disk as deb/dbgsym package, there is no big difference in size. Also a debuginfod server can be configured to send the debuginfo compressed on the fly. The "only" downside is to have the uncompressed debuginfo on the disk when doing the debugging.

Matthias
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
* Elana Hashman:

> You and the original report mention "tooling issues". Can you please
> provide some examples of tools that do not currently support working
> with compressed symbols and the resulting effects on developer workflow?

dwz still can't process compressed debuginfo sections, I think. It's the reason why Fedora uncompresses all debuginfo sections.

It's also not clear to me which compression approach is to be used, the GNU one or the ELF standard one. I expect support for GNU to be a bit more widespread in our world because it's been around for a bit longer.
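To make the difference between the two compression approaches concrete, here is a small sketch of how a tool might tell them apart from a section's name and flags. The function name and return values are hypothetical. It relies on two conventions: the older GNU style renames sections to .zdebug_*, while the ELF standard (gABI) style keeps the .debug_* name and sets the SHF_COMPRESSED section flag. Reading the section headers out of a real file is left out.

```python
# Section flag defined by the ELF gABI for compressed sections.
SHF_COMPRESSED = 0x800


def compression_style(section_name, section_flags):
    """Classify one debug section as 'gnu', 'elf', or 'none'."""
    if section_name.startswith(".zdebug_"):
        # Older GNU convention: the section is renamed, and its data
        # starts with the magic "ZLIB" followed by the uncompressed size.
        return "gnu"
    if section_name.startswith(".debug_") and section_flags & SHF_COMPRESSED:
        # ELF standard convention: the name is unchanged, and the data
        # starts with a compression header (Elf32_Chdr or Elf64_Chdr).
        return "elf"
    return "none"


if __name__ == "__main__":
    print(compression_style(".zdebug_info", 0))
    print(compression_style(".debug_info", SHF_COMPRESSED))
    print(compression_style(".debug_info", 0))
```

A tool that only checks for one of the two conventions will treat sections compressed the other way as unreadable or as uncompressed, which is why the choice matters for tooling support.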
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
Hi Niels,

Most of the arguments in this and previous bugs are anecdotal. It would be helpful to provide a concrete analysis of the disk space savings that compression provides, and whether it is a reasonable default. There is a discussion about KDE debug symbols requiring 10Gi of disk space a decade ago, but not what the original compressed size was, for example...

Would you be able to research some representative slice of popular packages that would be affected by the policy change (at least 10) and share the on-disk sizes with compression vs. without?

Personally, I think if there is not much difference in size, it would make sense to not compress as the default. If there are orders of magnitude in difference, the status quo probably still makes sense, as it does provide benefits.

Matthias,

You and the original report mention "tooling issues". Can you please provide some examples of tools that do not currently support working with compressed symbols and the resulting effects on developer workflow?

Thanks,

- e
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
Hello Niels,

On Sat 05 Dec 2020 at 01:12PM +01, Niels Thykier wrote:

> The underlying arguments for and against --compress-debug-sections appear
> to be "download size" vs. "installed disk usage" vs. "Tooling support".
> Though I ask you to please read the bugs #631985 and #922744 for the
> details of the arguments by both proponents.

Just had a look at these, thank you.

ISTM that the arguments in favour of compressing are more concrete right now: in #631985 there is the example of debugging KDE requiring more than 10G of disc space. (Nine years later perhaps it is more.) On the other hand, in #922744, there is only an unsubstantiated reference to tooling support.

I'm going to write to the submitter of #922744 asking for more info.

> Why punt it to you?
> ===================
>
> [...]

I think the reasons you give are all reasonable.

--
Sean Whitton
Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?
Package: tech-ctte

Dear members of the technical committee

I am asking you to provide advice or make a decision according to Constitution 6.1.3 on the matter of whether dbgsym files should use file level compression or not.

I intend to use the outcome of this bug to determine how to resolve #922744 (either by implementing the request or closing it as wontfix).

A bit of context:
=================

Since compat 9 (released on 2012-01-15), debhelper has compressed detached dbgsym files using objcopy's option --compress-debug-sections. This was implemented in debhelper/8.9.10 in order to resolve #631985.

When -dbgsym packages were implemented several years later, I used --compress-debug-sections in the implementation as it was the default for modern compat levels at the time.

Then on 2019-02-20, Matthias Klose filed the bug #922744, in which Matthias (in my reading of the subject) effectively requests that debhelper stop using --compress-debug-sections, which would overturn the request in #631985.

The underlying arguments for and against --compress-debug-sections appear to be "download size" vs. "installed disk usage" vs. "Tooling support". Though I ask you to please read the bugs #631985 and #922744 for the details of the arguments by both proponents.

I have _not_ involved any of the parties/stakeholders in this nor heard their arguments. Please see "Why punt it to you?" for why.

Why punt it to you?
===================

I am punting this to you because:

1. As stated in #922744, I am largely not emotionally invested in the result. Though I admit having reservations on how the solution is implemented (see "Non-solutions").

2. I am certain I do _not_ have the spoons and emotional capacity for resolving this in the best way for Debian (as opposed to the "least discussion/work for me" solution). This has kept me from opening the debate with relevant stakeholders.

3. I do not want #922744 to rot forever in the bug tracker (which is frankly what is happening now).
Given that the nature of the underlying problem is technical, I thought it was best to rely on you. My other alternative would be to throw it at debian-devel but given point 2 in my list above that seemed like it would be counterproductive for me.

My intentions for implementations:
==================================

If the advice/decision is to stop using --compress-debug-sections then I intend to retroactively undo the change for all compat levels (affecting compat 9+) after the next release (to avoid disrupting the current release).

It is my understanding that nothing relies on a 100% coverage of compressed dbgsyms as we never got to 100% in Debian yet. Furthermore, most compilers do not compress debug sections by default, which means that most tools will need to support uncompressed debug sections to be useful.

If the advice/decision is to keep using --compress-debug-sections, I am tempted to just leave the implementation as-is and slowly migrate the rest of the packages as the old compat levels are phased out.

I am open to changes/advice on alternative solutions for implementation (though please see "Non-solutions") - these alternatives can be presented by anyone (including members of the tech-ctte in their private capacity[1]).

Non-solutions:
=-=-=-=-=-=-=-

I do _not_ think we will be better served with compression being something you opt-in/opt-out from based on a command-line option (or a field in debian/control). I think debhelper has way too many "special-case" options or toggles where people have to do case-by-case decisions already and I would see this as "yet another one".

You may choose this as a solution but then I require you to overrule me as a maintainer using 6.1.4 with a 3:1 supermajority in the decision.

Thanks for your time,
~Niels

[1] I do not expect a full decision cycle/vote just to propose an alternative. But also, I do not want 6.3.5 getting in the way of a better solution than I thought of.
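For readers who want to see what the two outcomes mean at the objcopy level, here is a rough sketch of the command lines involved. It is an illustration only and not copied from debhelper: the function names are hypothetical, the file paths in the usage lines are made up, and the real dh_strip logic does more than this. The objcopy options used are --only-keep-debug, --compress-debug-sections and --decompress-debug-sections.

```python
def detach_debug_command(binary, debug_file, compress=True):
    """Build an objcopy command that writes detached debug info.

    With compress=True the debug sections in the output file are
    compressed; with compress=False they are left as the compiler
    emitted them.
    """
    cmd = ["objcopy", "--only-keep-debug"]
    if compress:
        cmd.append("--compress-debug-sections")
    cmd += [binary, debug_file]
    return cmd


def decompress_debug_command(debug_file):
    """Build an objcopy command that decompresses debug sections in place."""
    return ["objcopy", "--decompress-debug-sections", debug_file]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(detach_debug_command("usr/bin/foo", "foo.debug")))
    print(" ".join(detach_debug_command("usr/bin/foo", "foo.debug", compress=False)))
    print(" ".join(decompress_debug_command("foo.debug")))
```

The commands are only built and printed here; a caller that wants to execute them could pass the list to subprocess.run. The second function shows the step a user would need in order to run a tool that expects uncompressed sections on an already installed debug file.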