[PATCH] modpost: support arbitrary symbol length in modversion

2023-01-11 Thread Gary Guo
Currently modversion uses a fixed size array of size (64 - sizeof(long))
to store symbol names, thus placing a hard limit on the length of symbols.
Rust symbols (which encode crate and module names) can be quite a bit
longer. The length limit in kallsyms was increased to 512 for this reason.

It's a waste of space to simply expand the fixed array size to 512 in
modversion info entries. I therefore make it variably sized, with offset
to the next entry indicated by the initial "next" field.

In addition to supporting longer-than-56/60 byte symbols, this patch also
reduces the size for short symbols by getting rid of excessive zero padding.
Some zero padding remains to ensure that the "next" and "crc" fields are
properly aligned.
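
A minimal userspace sketch of the layout described above, not the kernel or
modpost code. The struct mirrors the fields in this patch; entry_size(),
append_entry(), find_crc() and the 4-byte ENTRY_ALIGN are hypothetical names
and assumptions made only for illustration. The walk follows the same
"advance by next" idea as the patched loops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the patched struct (u32 spelled as uint32_t). */
struct modversion_info {
	uint32_t next;	/* byte offset from this entry to the next one */
	uint32_t crc;
	char name[];	/* NUL-terminated, followed by zero padding */
};

#define HDR_SIZE offsetof(struct modversion_info, name)
/* ASSUMPTION: entries are padded to a multiple of 4 bytes. */
#define ENTRY_ALIGN 4u

/* Hypothetical helper: bytes occupied by one entry for a given name. */
static uint32_t entry_size(const char *name)
{
	size_t raw;

	assert(name != NULL);
	raw = HDR_SIZE + strlen(name) + 1;	/* header + name + NUL */
	return (uint32_t)((raw + ENTRY_ALIGN - 1) / ENTRY_ALIGN * ENTRY_ALIGN);
}

/* Hypothetical helper: write one entry at buf + off, return the new offset. */
static size_t append_entry(unsigned char *buf, size_t off,
			   const char *name, uint32_t crc)
{
	uint32_t next = entry_size(name);

	memset(buf + off, 0, next);		/* zero padding */
	memcpy(buf + off, &next, sizeof(next));
	memcpy(buf + off + sizeof(next), &crc, sizeof(crc));
	memcpy(buf + off + HDR_SIZE, name, strlen(name));
	return off + next;
}

/* Walk the entries by following "next"; returns 1 and stores the crc on a hit. */
static int find_crc(const unsigned char *sec, size_t size,
		    const char *symname, uint32_t *crc_out)
{
	size_t off = 0;

	while (off + HDR_SIZE < size) {
		uint32_t next, crc;

		memcpy(&next, sec + off, sizeof(next));
		memcpy(&crc, sec + off + sizeof(next), sizeof(crc));
		if (strcmp((const char *)sec + off + HDR_SIZE, symname) == 0) {
			*crc_out = crc;
			return 1;
		}
		if (next == 0)	/* malformed entry: avoid looping forever */
			break;
		off += next;
	}
	return 0;
}
```

Under the assumed 4-byte padding, a short name such as "symbol" takes 16
bytes instead of a full 64-byte slot, while a long name simply takes as many
bytes as it needs.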

This patch does have a small drawback: it makes the generated ".mod.c"
files a bit harder to read, as code like

"\x08\x00\x00\x00\x78\x56\x34\x12"
"symbol\0\0"

is generated as opposed to

{ 0x12345678, "symbol" },

because the structure is now variable-length. But hopefully nobody reads
the generated file :)
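
For anyone who does end up reading such a file, here is a small sketch of how
the byte string quoted above splits into fields. It assumes a little-endian
target, and get_le32() and the example_*() accessors are hypothetical helpers
written for this illustration, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The bytes quoted in the commit message, copied verbatim. */
static const char example_entry[] =
	"\x08\x00\x00\x00\x78\x56\x34\x12"
	"symbol\0\0";

/* Hypothetical helper: assemble a little-endian u32 from four bytes. */
static uint32_t get_le32(const char *p)
{
	const unsigned char *b = (const unsigned char *)p;

	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* First four bytes: the "next" field. */
static uint32_t example_next(void)
{
	return get_le32(example_entry);
}

/* Next four bytes: the "crc" field. */
static uint32_t example_crc(void)
{
	return get_le32(example_entry + 4);
}

/* Everything after the 8-byte header: the NUL-terminated name. */
static const char *example_name(void)
{
	return example_entry + 8;
}
```

Read this way, the second group of four bytes is the same 0x12345678 that the
old initializer spelled out directly, and the name starts right after the
8-byte header.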

Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512")
Link: https://github.com/Rust-for-Linux/linux/pull/379

Signed-off-by: Gary Guo 
---
 arch/powerpc/kernel/module_64.c |  3 ++-
 include/linux/module.h  |  6 --
 kernel/module/version.c | 21 +
 scripts/export_report.pl|  9 +
 scripts/mod/modpost.c   | 33 +++--
 5 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index ff045644f13f..eac23c11d579 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -236,10 +236,11 @@ static void dedotify_versions(struct modversion_info *vers,
 {
struct modversion_info *end;
 
-   for (end = (void *)vers + size; vers < end; vers++)
+   for (end = (void *)vers + size; vers < end; vers = (void *)vers + vers->next) {
if (vers->name[0] == '.') {
memmove(vers->name, vers->name+1, strlen(vers->name));
}
+   }
 }
 
 /*
diff --git a/include/linux/module.h b/include/linux/module.h
index 8c5909c0076c..37cb25af9099 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -34,8 +34,10 @@
 #define MODULE_NAME_LEN MAX_PARAM_PREFIX_LEN
 
 struct modversion_info {
-   unsigned long crc;
-   char name[MODULE_NAME_LEN];
+   /* Offset of the next modversion entry in relation to this one. */
+   u32 next;
+   u32 crc;
+   char name[0];
 };
 
 struct module;
diff --git a/kernel/module/version.c b/kernel/module/version.c
index 53f43ac5a73e..af7478dcc158 100644
--- a/kernel/module/version.c
+++ b/kernel/module/version.c
@@ -17,32 +17,29 @@ int check_version(const struct load_info *info,
 {
Elf_Shdr *sechdrs = info->sechdrs;
unsigned int versindex = info->index.vers;
-   unsigned int i, num_versions;
-   struct modversion_info *versions;
+   struct modversion_info *versions, *end;
+   u32 crcval;
 
/* Exporting module didn't supply crcs?  OK, we're already tainted. */
if (!crc)
return 1;
+   crcval = *crc;
 
/* No versions at all?  modprobe --force does this. */
if (versindex == 0)
return try_to_force_load(mod, symname) == 0;
 
versions = (void *)sechdrs[versindex].sh_addr;
-   num_versions = sechdrs[versindex].sh_size
-   / sizeof(struct modversion_info);
+   end = (void *)versions + sechdrs[versindex].sh_size;
 
-   for (i = 0; i < num_versions; i++) {
-   u32 crcval;
-
-   if (strcmp(versions[i].name, symname) != 0)
+   for (; versions < end; versions = (void *)versions + versions->next) {
+   if (strcmp(versions->name, symname) != 0)
continue;
 
-   crcval = *crc;
-   if (versions[i].crc == crcval)
+   if (versions->crc == crcval)
return 1;
-   pr_debug("Found checksum %X vs module %lX\n",
-crcval, versions[i].crc);
+   pr_debug("Found checksum %X vs module %X\n",
+crcval, versions->crc);
goto bad_version;
}
 
diff --git a/scripts/export_report.pl b/scripts/export_report.pl
index feb3d5542a62..1117646f3141 100755
--- a/scripts/export_report.pl
+++ b/scripts/export_report.pl
@@ -116,18 +116,19 @@ foreach my $thismod (@allcfiles) {
while ( <$module> ) {
chomp;
if ($state == 0) {
-   $state = 1 if ($_ =~ /static const struct modversion_info/);
+   

Re: [PATCH] modpost: support arbitrary symbol length in modversion

2023-01-13 Thread Gary Guo
On Thu, 12 Jan 2023 14:40:59 -0700
Lucas De Marchi  wrote:

> On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote:
> >
> > struct modversion_info {
> >-unsigned long crc;
> >-char name[MODULE_NAME_LEN];
> >+/* Offset of the next modversion entry in relation to this one. */
> >+u32 next;
> >+u32 crc;
> >+char name[0];  
> 
> although not really exported as uapi, this will break userspace as this is
> used in the elf file generated for the modules. I think
> this change must be made in a backward compatible way and kmod updated
> to deal with the variable name length:
> 
> kmod $ git grep "\[64"
> libkmod/libkmod-elf.c:  char name[64 - sizeof(uint32_t)];
> libkmod/libkmod-elf.c:  char name[64 - sizeof(uint64_t)];
> 
> in kmod we have both 32 and 64 because a 64-bit kmod can read both 32
> and 64 bit module, and vice versa.
> 

Hi Lucas,

Thanks for the information.

The change can't be "truly" backward compatible, in the sense that
regardless of the new format we choose, kmod would not be able to decode
symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves
is going to be incomplete, isn't it?
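
To make that limit concrete, here is a small sketch of the fixed-size layout
as an old reader sees it. The two array sizes are the ones from the kmod grep
quoted above; the struct names and helper functions are hypothetical, and the
"minus one" assumes the name must keep its terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; array sizes taken from the kmod grep output. */
struct old_modversion32 {
	uint32_t crc;
	char name[64 - sizeof(uint32_t)];
};

struct old_modversion64 {
	uint64_t crc;
	char name[64 - sizeof(uint64_t)];
};

/* Longest name (excluding the NUL) that fits in a 32-bit module's slot. */
static size_t max_name32(void)
{
	return sizeof(((struct old_modversion32 *)0)->name) - 1;
}

/* Longest name (excluding the NUL) that fits in a 64-bit module's slot. */
static size_t max_name64(void)
{
	return sizeof(((struct old_modversion64 *)0)->name) - 1;
}

/* An old-format reader derives the entry count from the section size alone. */
static size_t old_entry_count(size_t section_size)
{
	return section_size / sizeof(struct old_modversion64);
}
```

Both structs come out at 64 bytes, so a reader that only knows this layout has
no room for a longer name and no field telling it that an entry is bigger
than 64 bytes.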

What kind of backward compatibility should be expected? It could be:
* short symbols can still be found by old versions of kmod, but not
  long symbols;
* or, no symbols are found by old versions of kmod, but it does not
  fail;
* or, old versions of kmod would fail gracefully because they cannot
  recognise the format of the __versions section, but would not do
  anything crazy (e.g. decode it as the old format).

Also, do you think the current modversion format should stay forever,
or could we eventually migrate away from it and let old versions of
modprobe fail, given enough time?

Best,
Gary


Re: [PATCH] modpost: support arbitrary symbol length in modversion

2023-01-19 Thread Gary Guo
On Tue, 17 Jan 2023 11:22:45 -0800
Lucas De Marchi  wrote:

> On Tue, Jan 17, 2023 at 06:51:44PM +0100, Michal Suchánek wrote:
> >Hello,
> >
> >On Fri, Jan 13, 2023 at 06:18:41PM +0000, Gary Guo wrote:
> >> On Thu, 12 Jan 2023 14:40:59 -0700
> >> Lucas De Marchi  wrote:
> >>  
> >> > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote:
> >> > >
> >> > > struct modversion_info {
> >> > >-   unsigned long crc;
> >> > >-   char name[MODULE_NAME_LEN];
> >> > >+   /* Offset of the next modversion entry in relation to this one. */
> >> > >+   u32 next;
> >> > >+   u32 crc;
> >> > >+   char name[0];  
> >> >
> >> > although not really exported as uapi, this will break userspace as this is
> >> > used in the elf file generated for the modules. I think
> >> > this change must be made in a backward compatible way and kmod updated
> >> > to deal with the variable name length:
> >> >
> >> > kmod $ git grep "\[64"
> >> > libkmod/libkmod-elf.c:  char name[64 - sizeof(uint32_t)];
> >> > libkmod/libkmod-elf.c:  char name[64 - sizeof(uint64_t)];
> >> >
> >> > in kmod we have both 32 and 64 because a 64-bit kmod can read both 32
> >> > and 64 bit module, and vice versa.
> >> >  
> >>
> >> Hi Lucas,
> >>
> >> Thanks for the information.
> >>
> >> The change can't be "truly" backward compatible, in the sense that
> >> regardless of the new format we choose, kmod would not be able to decode
> >> symbols longer than "64 - sizeof(long)" bytes. So the list it retrieves
> >> is going to be incomplete, isn't it?
> >>
> >> What kind of backward compatibility should be expected? It could be:
> >> * short symbols can still be found by old versions of kmod, but not
> >>   long symbols;  
> >
> >That sounds good. Not everyone is using rust, and with this option
> >people who do will need to upgrade tooling, and people who don't care
> >don't need to do anything.  
> 
> that could be it indeed. My main worry here is:
> 
> "After the support is added in kmod, kmod needs to be able to output the
> correct information regardless if the module is from before/after the
> change in the kernel and also without relying on kernel version."
> Just changing the struct modversion_info doesn't make that possible.
> 
> Maybe adding the long symbols in another section?

Yeah, that's how I imagined it could be implemented when I said
"short symbols can still be found by old versions of kmod, but not long
symbols".

> Or maybe just increase to 512 and add the size to a
> "__versions_hdr" section. If we then output a max size per module,
> this would offset a little bit the additional size gained for the
> modules using rust.

That format isn't really elegant IMO. Symbol lengths can vary a lot, and
sizing every entry for the longest symbol doesn't sound like a good
approach.

> And the additional 0's should compress well
> so I'm not sure the additional size is that much relevant here.

I am not sure why compression is mentioned here. I don't think sections
in .ko files are compressed.

(Sorry, I forgot to reply-all; resending this email to the list.)

Best,
Gary


Re: [PATCH] modpost: support arbitrary symbol length in modversion

2023-01-19 Thread Gary Guo
On Thu, 19 Jan 2023 16:18:57 +0100
Michal Suchánek  wrote:

> On Thu, Jan 19, 2023 at 03:09:36PM +0000, Gary Guo wrote:
> > On Tue, 17 Jan 2023 11:22:45 -0800
> > Lucas De Marchi  wrote:
> >   
> > > And the additional 0's should compress well
> > > so I'm not sure the additional size is that much relevant here.  
> > 
> > I am not sure why compression is mentioned here. I don't think sections
> > in .ko files are compressed.
> 
> There is the option to compress the whole .ko files, and it's commonly
> used.

Hi Michal,

I am aware that there is an option but I am surprised to hear that it's
commonly used. I don't think that's enabled by default, and certainly
Debian/Ubuntu does not have it enabled.

Best,
Gary