Re: [PATCH v2 5/6] btf: add -fprune-btf option

2024-05-03 Thread Indu Bhagat

On 5/2/24 10:11, David Faust wrote:

This patch adds a new option, -fprune-btf, to control BTF debug info
generation.

As the name implies, this option enables a kind of "pruning" of the BTF
information before it is emitted.  When enabled, rather than emitting
all type information translated from DWARF, only information for types
directly used in the source program is emitted.

The primary purpose of this pruning is to reduce the amount of
unnecessary BTF information emitted, especially for BPF programs.  It is
very common for BPF programs to include Linux kernel internal headers in
order to have access to kernel data structures.  However, doing so often
has the side effect of also adding type definitions for a large number
of types which are not actually used by nor relevant to the program.
In these cases, -fprune-btf commonly reduces the size of the resulting
BTF information by approximately 10x.  This both slims down the size of
the resulting object and reduces the time required by the BPF loader to
verify the program and its BTF information.



The 10x reduction is substantial.  Do you think it is worthwhile to 
also mention that this figure is the average observed for the kernel 
self-tests (I assume it is)?  That is useful context when reading the 
commit logs, especially when specific numbers are quoted.



Note that the pruning implemented in this patch follows the same rules
as the BTF pruning performed unconditionally by LLVM's BPF backend when
generating BTF.  In particular, the main sources of pruning are:

   1) Only generate BTF for types used by variables and functions at
  the file scope.



I don't recall whether emitting BTF_KIND_VAR for unused static vars is 
also a correctness issue for BTF.  (From PR debug/113566, we know that 
having BTF_KIND_DATASEC entries for optimized-away vars is an issue.)

It would be great to add some text about this here or elsewhere, for 
posterity.



   2) Avoid emitting full BTF for struct and union types which are only
  pointed-to by members of other struct/union types.  In these cases,
  the full BTF_KIND_STRUCT or BTF_KIND_UNION which would normally
  be emitted is replaced with a BTF_KIND_FWD, as though the
  underlying type was a forward-declared struct or union type.
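
To make the two rules concrete, here is a small illustrative translation
unit.  It is a sketch, not one of the tests added by this patch: every type
and variable name is hypothetical, and the per-type outcomes noted in the
comments are what the description above implies, not output captured from a
compiler.

```c
/* Illustrative only.  Something like
     gcc -gbtf -fprune-btf -c example.c
   would be the expected way to build it, using the option name proposed
   by this patch.  */
#include <stddef.h>

/* Never referenced by any file-scope variable or function:
   rule 1 implies no BTF is emitted for it.  */
struct unused_type
{
  int a;
  long b;
};

/* Only reachable through a pointer member of another struct:
   rule 2 implies a BTF_KIND_FWD instead of a full BTF_KIND_STRUCT.  */
struct big_pointee
{
  int data[64];
};

/* Used by a file-scope variable: rule 1 implies full BTF is kept.  */
struct used_type
{
  int id;
  struct big_pointee *next;
};

struct used_type file_scope_var = { 7, NULL };

int
get_id (void)
{
  return file_scope_var.id;
}
```

If struct big_pointee were also the type of a file-scope variable, it would
count as directly used and the full definition would be expected instead of
the forward.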

gcc/
* btfout.cc (btf_minimal_types): New hash set.
(struct btf_fixup): New.
(fixups, forwards): New vecs.
(btf_output): Calculate num_types depending on flag_prune_btf.
(btf_early_finish): New initialization for flag_prune_btf.
(btf_mark_full_type_used): Likewise.
(btf_minimal_add_type): New function.
(btf_minimal_type_list_cb): Likewise.
(btf_late_collect_pruned_types): Likewise.
(btf_late_add_vars): Handle special case for variables in ".maps"
section when generating BTF for BPF CO-RE target.
(btf_late_finish): Use btf_late_collect_pruned_types when
flag_prune_btf in effect.  Move some initialization to btf_early_finish.
(btf_finalize): Additional deallocation for flag_prune_btf.
* common.opt (fprune-btf): New flag.
* ctfc.cc (init_ctf_strtable): Make non-static.
* ctfc.h (struct ctf_dtdef): Add visited_children_p boolean flag.
(init_ctf_strtable, ctfc_delete_strtab): Make extern.
* doc/invoke.texi (Debugging Options): Document -fprune-btf.

gcc/testsuite/
* gcc.dg/debug/btf/btf-prune-1.c: New test.
* gcc.dg/debug/btf/btf-prune-2.c: Likewise.
* gcc.dg/debug/btf/btf-prune-3.c: Likewise.
---
  gcc/btfout.cc| 394 ++-
  gcc/common.opt   |   4 +
  gcc/ctfc.cc  |   2 +-
  gcc/ctfc.h   |   5 +
  gcc/doc/invoke.texi  |  20 +
  gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c |  25 ++
  gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c |  33 ++
  gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c |  35 ++
  8 files changed, 511 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c
  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 0af0bd39fc7..93d56492bbe 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -833,7 +833,10 @@ output_btf_types (ctf_container_ref ctfc)
  {
    size_t i;
    size_t num_types;
-  num_types = ctfc->ctfc_types->elements ();
+  if (flag_prune_btf)
+    num_types = max_translated_id;
+  else
+    num_types = ctfc->ctfc_types->elements ();
  
    if (num_types)
      {
@@ -962,6 +965,211 @@ btf_early_add_func_records (ctf_container_ref ctfc)
  }
  }
  
+/* The set of types used directly in the source program, and any types manually
+   marked as used.  This is the set of types which will be emitted when
+   pruning (-fprune-btf) is enabled.  */


Nit: emitted when 

[PATCH v2 5/6] btf: add -fprune-btf option

2024-05-02 Thread David Faust
This patch adds a new option, -fprune-btf, to control BTF debug info
generation.

As the name implies, this option enables a kind of "pruning" of the BTF
information before it is emitted.  When enabled, rather than emitting
all type information translated from DWARF, only information for types
directly used in the source program is emitted.

The primary purpose of this pruning is to reduce the amount of
unnecessary BTF information emitted, especially for BPF programs.  It is
very common for BPF programs to include Linux kernel internal headers in
order to have access to kernel data structures.  However, doing so often
has the side effect of also adding type definitions for a large number
of types which are not actually used by nor relevant to the program.
In these cases, -fprune-btf commonly reduces the size of the resulting
BTF information by approximately 10x.  This both slims down the size of
the resulting object and reduces the time required by the BPF loader to
verify the program and its BTF information.

Note that the pruning implemented in this patch follows the same rules
as the BTF pruning performed unconditionally by LLVM's BPF backend when
generating BTF.  In particular, the main sources of pruning are:

  1) Only generate BTF for types used by variables and functions at
 the file scope.

  2) Avoid emitting full BTF for struct and union types which are only
 pointed-to by members of other struct/union types.  In these cases,
 the full BTF_KIND_STRUCT or BTF_KIND_UNION which would normally
 be emitted is replaced with a BTF_KIND_FWD, as though the
 underlying type was a forward-declared struct or union type.
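
The marking pass that these two rules imply can be modelled outside the
compiler.  The following is a minimal standalone sketch under stated
assumptions, not the code in btfout.cc: struct type, mark_full, mark_member
and prune_example are hypothetical names, and the visited_children field is
only loosely modelled on the visited_children_p flag named in the ChangeLog
below.

```c
#include <assert.h>
#include <stdbool.h>

enum kind { K_INT, K_PTR, K_STRUCT };
enum emit { EMIT_NONE, EMIT_FWD, EMIT_FULL };

#define MAX_MEMBERS 4

/* Toy type node; types refer to each other by index into one array.  */
struct type
{
  enum kind kind;
  int ref;                   /* K_PTR: index of the pointee, else -1.  */
  int members[MAX_MEMBERS];  /* K_STRUCT: indices of the member types.  */
  int n_members;
  enum emit emit;            /* What would be emitted for this type.  */
  bool visited_children;     /* Children already walked by mark_full.  */
};

void mark_full (struct type *t, int id);

/* Mark a struct member type.  A pointer to a struct keeps the pointer
   itself but requests only a forward for the pointee (rule 2).  */
void
mark_member (struct type *t, int id)
{
  if (t[id].kind == K_PTR && t[t[id].ref].kind == K_STRUCT)
    {
      t[id].emit = EMIT_FULL;
      if (t[t[id].ref].emit == EMIT_NONE)
        t[t[id].ref].emit = EMIT_FWD;
      return;
    }
  mark_full (t, id);
}

/* Mark a type as directly used (rule 1) and walk its children once.  */
void
mark_full (struct type *t, int id)
{
  assert (id >= 0);
  t[id].emit = EMIT_FULL;
  if (t[id].visited_children)
    return;
  t[id].visited_children = true;
  if (t[id].kind == K_PTR)
    mark_full (t, t[id].ref);
  else if (t[id].kind == K_STRUCT)
    for (int i = 0; i < t[id].n_members; i++)
      mark_member (t, t[id].members[i]);
}

/* Fill T (five entries) with a small type graph and mark from one root,
   index 3, standing in for the type of a file-scope variable.  */
void
prune_example (struct type *t)
{
  static const struct type init[5] = {
    { K_INT, -1, { 0 }, 0, EMIT_NONE, false },       /* 0: int */
    { K_STRUCT, -1, { 0 }, 1, EMIT_NONE, false },    /* 1: struct big */
    { K_PTR, 1, { 0 }, 0, EMIT_NONE, false },        /* 2: struct big * */
    { K_STRUCT, -1, { 0, 2 }, 2, EMIT_NONE, false }, /* 3: struct used */
    { K_STRUCT, -1, { 0 }, 1, EMIT_NONE, false },    /* 4: struct unused */
  };
  for (int i = 0; i < 5; i++)
    t[i] = init[i];
  mark_full (t, 3);
}
```

With struct used (index 3) as the only root, struct big ends up as a forward
and struct unused is dropped.  Calling mark_full on index 1 afterwards, as
if struct big were also used directly, upgrades it to a full definition.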

gcc/
* btfout.cc (btf_minimal_types): New hash set.
(struct btf_fixup): New.
(fixups, forwards): New vecs.
(btf_output): Calculate num_types depending on flag_prune_btf.
(btf_early_finish): New initialization for flag_prune_btf.
(btf_mark_full_type_used): Likewise.
(btf_minimal_add_type): New function.
(btf_minimal_type_list_cb): Likewise.
(btf_late_collect_pruned_types): Likewise.
(btf_late_add_vars): Handle special case for variables in ".maps"
section when generating BTF for BPF CO-RE target.
(btf_late_finish): Use btf_late_collect_pruned_types when
flag_prune_btf in effect.  Move some initialization to btf_early_finish.
(btf_finalize): Additional deallocation for flag_prune_btf.
* common.opt (fprune-btf): New flag.
* ctfc.cc (init_ctf_strtable): Make non-static.
* ctfc.h (struct ctf_dtdef): Add visited_children_p boolean flag.
(init_ctf_strtable, ctfc_delete_strtab): Make extern.
* doc/invoke.texi (Debugging Options): Document -fprune-btf.

gcc/testsuite/
* gcc.dg/debug/btf/btf-prune-1.c: New test.
* gcc.dg/debug/btf/btf-prune-2.c: Likewise.
* gcc.dg/debug/btf/btf-prune-3.c: Likewise.
---
 gcc/btfout.cc| 394 ++-
 gcc/common.opt   |   4 +
 gcc/ctfc.cc  |   2 +-
 gcc/ctfc.h   |   5 +
 gcc/doc/invoke.texi  |  20 +
 gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c |  25 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c |  33 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c |  35 ++
 8 files changed, 511 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 0af0bd39fc7..93d56492bbe 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -833,7 +833,10 @@ output_btf_types (ctf_container_ref ctfc)
 {
   size_t i;
   size_t num_types;
-  num_types = ctfc->ctfc_types->elements ();
+  if (flag_prune_btf)
+    num_types = max_translated_id;
+  else
+    num_types = ctfc->ctfc_types->elements ();
 
   if (num_types)
     {
@@ -962,6 +965,211 @@ btf_early_add_func_records (ctf_container_ref ctfc)
 }
 }
 
+/* The set of types used directly in the source program, and any types manually
+   marked as used.  This is the set of types which will be emitted when
+   pruning (-fprune-btf) is enabled.  */
+static GTY (()) hash_set<ctf_dtdef_ref> *btf_minimal_types;
+
+/* Fixup used to avoid unnecessary pointer chasing for types.  A fixup is
+   created when a structure or union member is a pointer to another struct
+   or union type.  In such cases, avoid emitting full type information for
+   the pointee struct or union type (which may be quite large), unless that
+   type is used directly elsewhere.  */
+struct btf_fixup
+{
+  ctf_dtdef_ref pointer_dtd; /* Type node to which the fixup is applied.  */
+  ctf_dtdef_ref pointee_dtd; /* Original type node referred to by pointer_dtd.
+   If this concrete type is not otherwise used,
+