Bug#922744: Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-08 Thread Sam Hartman


Hi.
Thanks for your reply.
The links to bugs you included add much needed detail to this
discussion.


> "Matthias" == Matthias Klose  writes:

Matthias> as pre-processing.  So we know since about three years
Matthias> that dwz doesn't support compressed debug symbols.  Your
Matthias> language about "claims", "might", and so on is not
Matthias> appropriate.

No, we know that three years ago dwz didn't support compressed debug
symbols.
Since that information is three years out of date, you get my  "might"
and "claim" language.

You're the binutils maintainer!
If you happen to know that dwz still doesn't support compressed symbols
then *say that* and all my language about "might" and "claim" will go
away.
I absolutely trust your knowledge about what our elf stack does.
It's possible it's a language issue, but so far you've used rather vague
language rather than making specific claims in an area where you are an
expert.


If you don't know, that's fine.
But if no one who would like to see us move away from compressed debug
symbols has chosen to check and see whether dwz still requires
uncompressed symbols, well, I think that is significant.
I think the primary burden  of arguing for a change lies with those
proposing that change.
So, I do think that people proposing a change need to do things like
find out what specific tools break.

(Including pointers to bugs as you have done in the last mail also
counts as providing that sort of justification.
I'll admit that I don't see how the pointer to the rpm find-debuginfo
script quite fits in, but I think I follow the valgrind issue.)



Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-08 Thread Elana Hashman

On 2021-03-06 11:11, Jakub Wilk wrote:

* Elana Hashman , 2021-02-17, 11:06:
Would you be able to research some representative slice of popular 
packages that would be affected by the policy change (at least 10) and 
share the on-disk sizes with compression vs. without?


Not exactly what you asked Niels for, but...

A few months ago I recompressed whole buster/main/amd64 to see what
the effect of ditching --compress-debug-sections would be.
Raw data for this experiment is available here:
https://github.com/jwilk/lets-shrink-dbgsym/releases/download/20200708/buster-main-amd64-20200708.tsv.xz
The columns are:
* file name
* original .deb size
* recompressed .deb size
* original installed size
* recompressed installed size

Note that some of the .deb size savings might be caused by the fix for
#868674 (for packages that haven't been rebuilt since the fix).


Hi Jakub,

This is very helpful. Using this file, I have calculated the following 
three aggregates:


1. recompressed .deb size as a percentage of the original .deb size
2. recompressed installed size as a percentage of the original installed 
size
3. size difference in bytes between recompressed and original installed 
size


I then performed a quartile analysis on it.

Recompressed size is X% of original .deb:

Min 3.97%
25% 65.45%
Median 74.73%
75% 82.64%
Max 105.01%

Installed size of recompressed is X% of original installed size:

Min 100.06%
25% 230.72%
Median 256.76%
75% 292.76%
Max 4267.38%

Size difference between recompressed and original installed size, is X 
bytes:


Min 20480 (20KB)
25% 89088 (89KB)
Median 404480 (404KB)
75% 2461184 (2.5MB)
Max 5728869376 (5.73GB)
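
For anyone who wants to reproduce aggregates like these, here is a minimal
Python sketch. It assumes the decompressed TSV has exactly the five columns
listed above and that all sizes are integers in bytes. The function names are
hypothetical, and the "inclusive" quartile method from the standard library
is an assumption, so results may differ slightly from the figures above.

```python
import csv
import io
import statistics


def summarize(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def analyze(tsv_text):
    """Compute the three aggregates from five-column TSV text.

    Assumed columns: file name, original .deb size, recompressed .deb size,
    original installed size, recompressed installed size.
    """
    deb_pct, inst_pct, inst_diff = [], [], []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        try:
            deb_old, deb_new, inst_old, inst_new = (int(x) for x in row[1:])
        except ValueError:
            continue  # skip a header row, if the file has one
        deb_pct.append(100.0 * deb_new / deb_old)
        inst_pct.append(100.0 * inst_new / inst_old)
        inst_diff.append(inst_new - inst_old)
    return {
        "deb_pct": summarize(deb_pct),
        "installed_pct": summarize(inst_pct),
        "installed_diff_bytes": summarize(inst_diff),
    }
```

To use it, decompress the .tsv.xz file, read it into a string, and pass that
string to analyze().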


So I think we can conclude the following:

- In essentially all cases, recompressed deb has a size improvement over 
the original.
- In all cases, the installed size of the debug symbols is larger, 
usually about 2-3x the original installed size.
- In all but the largest cases, that size difference is negligible. 
However, the large cases have quite an extreme difference.


Hence, I think the tail end of large packages will guide this decision, 
and Josh very helpfully provided an analysis of those already!



Because of the extreme difference for the large packages, and because 
many of those packages are relatively popular, I think I am inclined to 
maintain the status quo, i.e. with --compress-debug-sections enabled by 
default. I am open to being convinced otherwise :)


Cheers,

- e



Bug#922744: Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-08 Thread Matthias Klose
On 3/8/21 6:27 PM, Sam Hartman wrote:
>> "Matthias" == Matthias Klose  writes:
> 
> Matthias> Maybe you should be more specific about "those who can't
> Matthias> use" uncompressed debug info in the first place.
> 
> So, you've argued that the disk savings are not significant inside a
> package, because packages are themselves compressed.
> 
> What people are arguing is that they want to have debug info for large
> programs like firefox or chromium installed, or really debug info for
> large parts of their system.
> They are in effect arguing that they care about the installed size not
> the package size.
> They have explicitly argued that having to uninstall and then later
> reinstall disadvantages their debug cycle.
> 
> This situation is particularly unfortunate because it sounds like we
> have a conflict between two techniques for saving space.
> 
> On one hand we have dwz which tries to optimize and reduce  overall size
> of debug symbols
> 
> which is incompatible (apparently--no one has explicitly confirmed this)
> with compressed debug symbols.
> 
> Presumably we can still run dwz within a single package by doing so
> before debug symbols are compressed.
> But presumably this gets in the way of people running dwz themselves  or
> something.
> 
> I'll be blunt.
> The people who say that they want debug symbols installed on their
> system have made a simple, easy to understand argument.

Let me be blunt as well.  The only reference I can find regarding the size on
disk is #922744.  Contrary to what you're saying "that they want to have debug
info for large programs like firefox *or* chromium installed", they want to have
debug symbols for firefox *and* chromium *and* more installed at the same time.

#631985 speaks about a 10G space requirement for debugging KDE alone.

The decision about the compressed debug symbols was made ten years ago.  Maybe
it's time to re-evaluate what expectations for a debug installation should be 
set.

> The argument that compressed debug symbols break things is still poorly
> stated.
> We've had a claim that dwz might not work with compressed debug symbols
> (and didn't used to).
> We've had no one explain how that creates a problem in practice or even
> confirm it's still the case.
> It felt like pulling teeth to even get an answer that might be a tool we
> care about.

#87 asked for "postprocessing in dh_strip", however it was implemented in

  * dh_dwz: Add new experimental tool to run dwz(1) to deduplicate
ELF debugging symbols.  It should be generally be run before
dh_strip (as dh_strip compresses the debug symbols and dwz
expects uncompressed debug symbols).  (Closes: #87)

as pre-processing.  So we know since about three years that dwz doesn't support
compressed debug symbols.  Your language about "claims", "might", and so on is
not appropriate.

Upstreams are currently looking at issues seen with valgrind about
.gnu_debuglink section and .gnu_debugaltlink section in

  https://bugs.kde.org/show_bug.cgi?id=427969
  https://bugs.kde.org/show_bug.cgi?id=396656

so apparently there are issues with another tool (valgrind), and how the debug
information is created and split in debhelper.

Also see
https://github.com/rpm-software-management/rpm/blob/master/scripts/find-debuginfo.sh
how dwz is run *after* separating the debug info, not touching the stripped
binaries.

Apparently the choice for compressed debug sections resulted later in an
implementation for dh_dwz which is causing issues on its own.

Unrelated to that, but to not create conflicting dbg and dbgsym packages, there
is #968710 and #981245, and it will be difficult to integrate within debhelper
without introducing a new debhelper compat level.

Also unrelated, there are #971724, #971680, and packages manually installing
additional files in auto-generated dbgsym packages.

Maybe each of these decisions about dh_strip was made to the best knowledge at
the time, but the current situation is a mess.  Sticking to compressed debug
sections is just one issue ...

Matthias



Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-08 Thread Sam Hartman
> "Matthias" == Matthias Klose  writes:

Matthias> Maybe you should be more specific about "those who can't
Matthias> use" uncompressed debug info in the first place.

So, you've argued that the disk savings are not significant inside a
package, because packages are themselves compressed.

What people are arguing is that they want to have debug info for large
programs like firefox or chromium installed, or really debug info for
large parts of their system.
They are in effect arguing that they care about the installed size not
the package size.
They have explicitly argued that having to uninstall and then later
reinstall disadvantages their debug cycle.

This situation is particularly unfortunate because it sounds like we
have a conflict between two techniques for saving space.

On one hand we have dwz which tries to optimize and reduce  overall size
of debug symbols

which is incompatible (apparently--no one has explicitly confirmed this)
with compressed debug symbols.

Presumably we can still run dwz within a single package by doing so
before debug symbols are compressed.
But presumably this gets in the way of people running dwz themselves  or
something.

I'll be blunt.
The people who say that they want debug symbols installed on their
system have made a simple, easy to understand argument.

The argument that compressed debug symbols break things is still poorly
stated.
We've had a claim that dwz might not work with compressed debug symbols
(and didn't used to).
We've had no one explain how that creates a problem in practice or even
confirm it's still the case.
It felt like pulling teeth to even get an answer that might be a tool we
care about.

Please be less vague!

--Sam



Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-08 Thread Matthias Klose
On 3/7/21 11:51 PM, Sean Whitton wrote:
> Hello,
> 
> On Sun 07 Mar 2021 at 03:50PM -07, Sean Whitton wrote:
> 
>> This is not much good if your network is weak or you're offline, or you
>> don't want information on your debugging to go out to a web service.
> 
> What I mean is: debuginfod is great in many scenarios, but we should
> probably care about those who can't or won't use it too.  Sorry if the
> above is a bit blunt.

Your comment is unfocused.  This was just another use case where uncompressed
debug info could be harmful, and I pointed out a configuration that avoids it.

Maybe you should be more specific about "those who can't use" uncompressed debug
info in the first place.

Matthias



Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-07 Thread Sean Whitton
Hello,

On Sun 07 Mar 2021 at 03:50PM -07, Sean Whitton wrote:

> This is not much good if your network is weak or you're offline, or you
> don't want information on your debugging to go out to a web service.

What I mean is: debuginfod is great in many scenarios, but we should
probably care about those who can't or won't use it too.  Sorry if the
above is a bit blunt.

-- 
Sean Whitton




Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-07 Thread Sean Whitton
Hello,

On Fri 05 Mar 2021 at 06:22PM +01, Matthias Klose wrote:

> yes, the rationale for uncompressed debug sections is that any tool can access
> them.  On disk as deb/dbgsym package, there is no big difference in
> size.

The data elsewhere in this bug would suggest otherwise!

> Also a debuginfod server can be configured to send the debuginfo
> compressed on the fly.  The "only" downside is to have the uncompressed
> debuginfo on the disk when doing the debugging.

This is not much good if your network is weak or you're offline, or you
don't want information on your debugging to go out to a web service.

-- 
Sean Whitton




Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-06 Thread Jakub Wilk

* Elana Hashman , 2021-02-17, 11:06:
Would you be able to research some representative slice of popular 
packages that would be affected by the policy change (at least 10) and 
share the on-disk sizes with compression vs. without?


Not exactly what you asked Niels for, but...

A few months ago I recompressed whole buster/main/amd64 to see what the 
effect of ditching --compress-debug-sections would be.

Raw data for this experiment is available here:
https://github.com/jwilk/lets-shrink-dbgsym/releases/download/20200708/buster-main-amd64-20200708.tsv.xz
The columns are:
* file name
* original .deb size
* recompressed .deb size
* original installed size
* recompressed installed size

Note that some of the .deb size savings might be caused by the fix for 
#868674 (for packages that haven't been rebuilt since the fix).


--
Jakub Wilk



Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-05 Thread Matthias Klose
yes, the rationale for uncompressed debug sections is that any tool can access
them.  On disk as deb/dbgsym package, there is no big difference in size.  Also
a debuginfod server can be configured to send the debuginfo compressed on the
fly.  The "only" downside is to have the uncompressed debuginfo on the disk when
doing the debugging.

Matthias



Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-03-03 Thread Florian Weimer
* Elana Hashman:

> You and the original report mention "tooling issues". Can you please
> provide some examples of tools that do not currently support working
> with compressed symbols and the resulting effects on developer workflow?

dwz still can't process compressed debuginfo sections, I think.  It's
the reason why Fedora uncompresses all debuginfo sections.

It's also not clear to me which compression approach is to be used,
the GNU one or the ELF standard one.  I expect support for GNU to be a
bit more widespread in our world because it's been around for a bit
longer.
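
As an illustration of the two approaches, here is a minimal Python sketch that
classifies a single ELF section. It relies on two conventions: GNU-style
compression renames sections to .zdebug_*, while the ELF standard (gABI) style
keeps the .debug_* name and sets the SHF_COMPRESSED section flag (0x800). The
function name is hypothetical, and reading the section name and flags out of
an actual ELF file is left out.

```python
SHF_COMPRESSED = 0x800  # ELF gABI section flag marking compressed contents


def compression_style(section_name, sh_flags):
    """Classify how a debug section is compressed.

    Returns "gnu" for the legacy .zdebug_* naming convention, "elf" for
    a .debug_* section carrying SHF_COMPRESSED, and "none" otherwise.
    """
    if section_name.startswith(".zdebug_"):
        return "gnu"
    if section_name.startswith(".debug_") and sh_flags & SHF_COMPRESSED:
        return "elf"
    return "none"
```

With binutils, the two styles correspond to objcopy's
--compress-debug-sections=zlib-gnu and --compress-debug-sections=zlib-gabi.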



Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2021-02-17 Thread Elana Hashman
Hi Niels,

Most of the arguments in this and previous bugs are anecdotal. It would
be helpful to provide a concrete analysis of the disk space savings that
compression provides, and whether it is a reasonable default.

There is a discussion about KDE debug symbols requiring 10Gi of disk
space a decade ago, but not what the original compressed size was, for
example...

Would you be able to research some representative slice of popular
packages that would be affected by the policy change (at least 10) and
share the on-disk sizes with compression vs. without?

Personally, I think if there is not much difference in size, it would
make sense to not compress as the default. If there are orders of
magnitude in difference, the status quo probably still makes sense, as
it does provide benefits.


Matthias,

You and the original report mention "tooling issues". Can you please
provide some examples of tools that do not currently support working
with compressed symbols and the resulting effects on developer workflow?

Thanks,

- e




Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2020-12-14 Thread Sean Whitton
Hello Niels,

On Sat 05 Dec 2020 at 01:12PM +01, Niels Thykier wrote:

> The underlying arguments for and against --compress-debug-sections appear
> to be "download size" vs. "installed disk usage" vs. "Tooling support".
> Though I ask you to please read the bugs #631985 and #922744 for the
> details of the arguments by both proponents.

Just had a look at these, thank you.

ISTM that the arguments in favour of compressing are more concrete right
now: in #631985 there is the example of debugging KDE requiring more
than 10G of disc space.  (Nine years later perhaps it is more.)  On the
other hand, in #922744, there is only an unsubstantiated reference to
tooling support.

I'm going to write to the submitter of #922744 asking for more info.

> Why punt it to you?
> ===
>
> [...]

I think the reasons you give are all reasonable.

-- 
Sean Whitton




Bug#976462: tech-ctte: Should dbgsym files be compressed via objcopy --compress-debug-section or not?

2020-12-05 Thread Niels Thykier
Package: tech-ctte

Dear members of the technical committee

I am asking you to provide advice or make a decision according to
Constitution 6.1.3 on the matter of whether dbgsym files should use
file-level compression or not.

I intend to use the outcome of this bug to determine how to resolve
#922744 (either by implementing the request or closing it as wontfix).


A bit of context:
=

Since compat 9 (released on 2012-01-15), debhelper has compressed
detached dbgsym files using objcopy's option --compress-debug-sections.
This was implemented in debhelper/8.9.10 in order to resolve #631985.
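
For readers unfamiliar with the step in question, the following Python sketch
builds an objcopy command line that detaches debug info, with compression as a
toggle. It is only an approximation for illustration and is not debhelper's
actual code. The function name and file paths are hypothetical;
--only-keep-debug and --compress-debug-sections are existing objcopy options.

```python
def detach_debug_command(binary, debug_file, compress=True):
    """Return an argv list that writes the debug info of `binary`
    to `debug_file`, optionally compressing the debug sections."""
    cmd = ["objcopy", "--only-keep-debug"]
    if compress:
        cmd.append("--compress-debug-sections")
    cmd += [binary, debug_file]
    return cmd
```

The returned list can be passed to subprocess.run() on a system with binutils
installed.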

When -dbgsym packages were implemented several years later, I used
--compress-debug-sections in the implementation as it was the default for
modern compat levels at the time.

Then on 2019-02-20, Matthias Klose filed the bug #922744, in which
Matthias (in my reading of the subject) effectively requests that
debhelper stop using --compress-debug-sections, which would overturn
the request in #631985.

The underlying arguments for and against --compress-debug-sections appear
to be "download size" vs. "installed disk usage" vs. "Tooling support".
Though I ask you to please read the bugs #631985 and #922744 for the
details of the arguments by both proponents.


I have _not_ involved any of the parties/stakeholders in this nor heard
their arguments.  Please see "Why punt it to you?" for why.


Why punt it to you?
===

I am punting this to you because:

 1. As stated in #922744, I am largely not emotionally invested in the
result.  Though I admit having reservations on how the solution is
implemented (see "Non-solutions").

 2. I am certain I do _not_ have the spoons and emotional capacity for
resolving this in the best way for Debian (as opposed to the "least
discussion/work for me" solution).  This has kept me from opening
the debate with relevant stakeholders.

 3. I do not want #922744 to rot forever in the bug tracker (which is
frankly what is happening now).

Given that the nature of the underlying problem is technical, I thought
it was best to rely on you.  My other alternative would be to throw it
at debian-devel but given point 2 in my list above that seemed like it
would be counterproductive for me.


My intentions for implementations:
==

If the advice/decision is to stop using --compress-debug-sections then I
intend to retroactively undo the change for all compat levels (affecting
compat 9+) after the next release (to avoid disrupting the current release).
  It is my understanding that nothing relies on a 100% coverage of
compressed dbgsyms as we never got to a 100% in Debian yet.
Furthermore, most compilers do not compress debug sections by default,
which means that most tools will need to support uncompressed debug
sections to be useful.


If the advice/decision is to keep using --compress-debug-sections, I am
tempted to just leave the implementation as-is and slowly migrate the rest
of the packages as the old compat levels are phased out.


I am open to changes/advice on alternative solutions for implementation
(though please see "Non-solutions") - these alternatives can be
presented by anyone (including members of the tech-ctte in their private
capacity[1]).


Non-solutions:
=-=-=-=-=-=-=-

I do _not_ think we will be better served with compression being
something you opt-in/opt-out from based on a command-line option (or a
field in debian/control).  I think debhelper has way too many
"special-case" options or toggles where people have to do case-by-case
decisions already and I would see this as "yet another one".

You may choose this as a solution but then I require you to overrule me
as a maintainer using 6.1.4 with a 3:1 supermajority in the decision.

Thanks for your time,
~Niels


[1] I do not expect a full decision cycle/vote just to propose an
alternative.  But also, I do not want 6.3.5 getting in the way of a
better solution than I thought of.