Bug#783210: [PATCH] nscd_stat.c: make the build reproducible

2016-11-04 Thread Ximin Luo
Ximin Luo:
> Mike Frysinger:
>> On 28 Jul 2016 15:15, Florian Weimer wrote:
>>> On 03/09/2016 05:30 PM, Mike Frysinger wrote:
>>>> would it be so terrible to properly marshall this data ?
>>>
>>> Ximin Luo and I discussed this and I wonder if it is possible to read 
>>> out the libc.so.6 build ID if it is present.  It should indirectly call 
>>> all the layout dependencies and be reasonably easy to access because it 
>>> is in an allocated section (and we might want to print it from a
>>> libc.so.6 invocation, too).
>>>
>>> We still need the time-based approach if the build ID is not available, 
>>> but I expect most distributions will have something like it.
>>>
>>> The Debian bug is:
>>>
>>>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=783210
>>>
>>> (Also Cc:ed)
>>
>> agreed that build-id should be an acceptable replacement for what the
>> code is doing today, but in order to pull that off, i guess you'd have
>> to do a configure test to see if build-id is active ?  if you
>> leave the logic to runtime, you'd still need to include the datetime
>> stamp in the object which would still make the build unreproducible.
>>
>> this also doesn't really cover the quoted idea of marshalling the data
>> between client & server :).
>> -mike
>>
> 
> Hi all,
> 
> I've written a small program that prints out the Build IDs of all the objects 
> that are dynamically linked to it, plus itself.
> 
> It works well, although I'm not a C expert so I don't know if it is portable 
> enough. For example, I hard-code some >>2 <<2s in there, along with a uint8_t 
> - I didn't see a corresponding ElfW(xxx) type in elf.h
> 
> Another downside is it needs to be linked against libdl, which I think is not 
> the case currently with nscd. I'm not sure if this carries extra security 
> risk or whatever.
> 

Oh! Actually it doesn't need to be linked against libdl. That was from an 
earlier version of the code where I was using dlinfo instead of 
dl_iterate_phdr. But this latter function doesn't need extra libs. :)

> An alternative would be to detect the build-id *at build time* and then 
> monkey-patch it into the binary itself.
> 
> What do you all think? How shall I proceed?
> 

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git



Bug#783210: [PATCH] nscd_stat.c: make the build reproducible

2016-11-04 Thread Ximin Luo
Mike Frysinger:
> On 28 Jul 2016 15:15, Florian Weimer wrote:
>> On 03/09/2016 05:30 PM, Mike Frysinger wrote:
>>> would it be so terrible to properly marshall this data ?
>>
>> Ximin Luo and I discussed this and I wonder if it is possible to read 
>> out the libc.so.6 build ID if it is present.  It should indirectly call 
>> all the layout dependencies and be reasonably easy to access because it 
>> is in an allocated section (and we might want to print it from a
>> libc.so.6 invocation, too).
>>
>> We still need the time-based approach if the build ID is not available, 
>> but I expect most distributions will have something like it.
>>
>> The Debian bug is:
>>
>>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=783210
>>
>> (Also Cc:ed)
> 
> agreed that build-id should be an acceptable replacement for what the
> code is doing today, but in order to pull that off, i guess you'd have
> to do a configure test to see if build-id is active ?  if you
> leave the logic to runtime, you'd still need to include the datetime
> stamp in the object which would still make the build unreproducible.
> 
> this also doesn't really cover the quoted idea of marshalling the data
> between client & server :).
> -mike
> 

Hi all,

I've written a small program that prints out the Build IDs of all the objects 
that are dynamically linked to it, plus itself.

It works well, although I'm not a C expert so I don't know if it is portable 
enough. For example, I hard-code some >>2 <<2s in there, along with a uint8_t - 
I didn't see a corresponding ElfW(xxx) type in elf.h

Another downside is it needs to be linked against libdl, which I think is not 
the case currently with nscd. I'm not sure if this carries extra security risk 
or whatever.

An alternative would be to detect the build-id *at build time* and then 
monkey-patch it into the binary itself.

What do you all think? How shall I proceed?

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
#define _GNU_SOURCE
#include <link.h>    /* dl_iterate_phdr, ElfW(), ELF note definitions */
#include <stdint.h>
#include <stdio.h>

/* Called once per loaded object: print its name and any GNU build ID. */
int callback (struct dl_phdr_info *info, size_t size, void *data) {
  (void) size;
  (void) data;
  printf ("\nname: %s\n", info->dlpi_name);

  const ElfW(Phdr) *phdr = info->dlpi_phdr;
  for (ElfW(Half) i = 0; i < info->dlpi_phnum; i++, phdr++) {
    if (phdr->p_type != PT_NOTE)
      continue;

    ElfW(Addr) addr = info->dlpi_addr + phdr->p_vaddr;
    ElfW(Addr) nend = addr + phdr->p_memsz;
    //printf ("found NOTE segment at: %p to %p\n", (void *) addr, (void *) nend);

    while (addr < nend) {
      ElfW(Nhdr) *nhdr = (ElfW(Nhdr) *) addr;
      // According to the ELF spec, namesz and descsz do not include padding
      // but that's how they're laid out in memory; add the padding here
      // (round each size up to a multiple of 4).
      ElfW(Addr) nameoff = (((nhdr->n_namesz-1)>>2)+1)<<2;
      ElfW(Addr) descoff = (((nhdr->n_descsz-1)>>2)+1)<<2;

      if (nhdr->n_type == NT_GNU_BUILD_ID) {
        // The descriptor (the build ID bytes) follows the padded name.
        const uint8_t *buf = (const uint8_t *) ((ElfW(Addr))(nhdr + 1) + nameoff);
        printf("Build ID");
        for (ElfW(Word) j = 0; j < nhdr->n_descsz; j++)
          printf(":%02X", buf[j]);
        printf("\n");
      }

      //printf("skipping section type %02X\n", nhdr->n_type);
      addr = (ElfW(Addr))(nhdr + 1) + nameoff + descoff;
    }
  }

  return 0;
}

int main(void) {
  dl_iterate_phdr(callback, NULL);
  return 0;
}
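
A possible way to factor out the note walk so that it can be exercised without dl_iterate_phdr is sketched below. This is only an illustration under stated assumptions, not part of the attachment above: the helper names note_align and find_build_id are hypothetical, the padding is written as (n + 3) & ~3 instead of the shifts, and the sketch additionally checks the note name "GNU" and the buffer bounds.

```c
#include <assert.h>
#include <link.h>    /* ElfW(); pulls in <elf.h> for NT_GNU_BUILD_ID */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round a note field size up to the next multiple of 4 bytes. */
static size_t note_align(size_t n) {
  return (n + 3) & ~(size_t) 3;
}

/* Hypothetical helper: scan a buffer holding the contents of a PT_NOTE
   segment and return a pointer to the GNU build ID bytes, storing their
   length in *id_len.  Returns NULL if no build ID note is found or the
   buffer is truncated. */
static const uint8_t *find_build_id(const uint8_t *notes, size_t len,
                                    size_t *id_len) {
  size_t off = 0;
  assert(notes != NULL && id_len != NULL);

  while (off <= len && len - off >= sizeof (ElfW(Nhdr))) {
    ElfW(Nhdr) nhdr;
    memcpy(&nhdr, notes + off, sizeof nhdr);   /* avoid unaligned reads */

    size_t name_off = off + sizeof nhdr;
    if (nhdr.n_namesz > len - name_off)
      return NULL;                             /* truncated name */
    size_t desc_off = name_off + note_align(nhdr.n_namesz);
    if (desc_off > len || nhdr.n_descsz > len - desc_off)
      return NULL;                             /* truncated descriptor */

    if (nhdr.n_type == NT_GNU_BUILD_ID && nhdr.n_namesz == 4
        && memcmp(notes + name_off, "GNU", 4) == 0) {
      *id_len = nhdr.n_descsz;
      return notes + desc_off;
    }
    off = desc_off + note_align(nhdr.n_descsz);
  }
  return NULL;
}
```

Inside the callback, such a helper could be called with the address and p_memsz of each PT_NOTE segment; whether the extra name check is wanted for nscd is a design choice left open here.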


Bug#783210: [PATCH] nscd_stat.c: make the build reproducible

2016-07-31 Thread Mike Frysinger
On 28 Jul 2016 15:15, Florian Weimer wrote:
> On 03/09/2016 05:30 PM, Mike Frysinger wrote:
> > would it be so terrible to properly marshall this data ?
> 
> Ximin Luo and I discussed this and I wonder if it is possible to read 
> out the libc.so.6 build ID if it is present.  It should indirectly call 
> all the layout dependencies and be reasonably easy to access because it 
> is in an allocated section (and we might want to print it from a
> libc.so.6 invocation, too).
> 
> We still need the time-based approach if the build ID is not available, 
> but I expect most distributions will have something like it.
> 
> The Debian bug is:
> 
>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=783210
> 
> (Also Cc:ed)

agreed that build-id should be an acceptable replacement for what the
code is doing today, but in order to pull that off, i guess you'd have
to do a configure test to see if build-id is active ?  if you
leave the logic to runtime, you'd still need to include the datetime
stamp in the object which would still make the build unreproducible.

this also doesn't really cover the quoted idea of marshalling the data
between client & server :).
-mike
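
To illustrate what "properly marshalling" could mean here, the following is a minimal sketch in which each field is written in a fixed width and byte order, so that client and server no longer depend on sharing an in-memory struct layout. Everything in it is an assumption made for illustration: struct demo_stats, its fields, the wire size and the function names are hypothetical and are not taken from nscd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of statistics; field names are illustrative only. */
struct demo_stats {
  uint32_t version;
  uint64_t uptime_seconds;
  uint64_t cache_hits;
};

#define DEMO_STATS_WIRE_SIZE (4 + 8 + 8)

/* Write the low nbytes of v in big-endian order; return the next position. */
static uint8_t *put_be(uint8_t *p, uint64_t v, int nbytes) {
  for (int i = nbytes - 1; i >= 0; i--)
    *p++ = (uint8_t) (v >> (8 * i));
  return p;
}

/* Read nbytes in big-endian order into *v; return the next position. */
static const uint8_t *get_be(const uint8_t *p, uint64_t *v, int nbytes) {
  uint64_t acc = 0;
  for (int i = 0; i < nbytes; i++)
    acc = (acc << 8) | *p++;
  *v = acc;
  return p;
}

/* Serialize field by field instead of copying the raw struct. */
static void marshal_stats(const struct demo_stats *s,
                          uint8_t out[DEMO_STATS_WIRE_SIZE]) {
  assert(s != NULL && out != NULL);
  uint8_t *p = out;
  p = put_be(p, s->version, 4);
  p = put_be(p, s->uptime_seconds, 8);
  p = put_be(p, s->cache_hits, 8);
  assert(p == out + DEMO_STATS_WIRE_SIZE);
}

/* Returns 0 on success, -1 if the sender used a different format version. */
static int unmarshal_stats(const uint8_t in[DEMO_STATS_WIRE_SIZE],
                           uint32_t expected_version,
                           struct demo_stats *s) {
  uint64_t v;
  assert(in != NULL && s != NULL);
  const uint8_t *p = get_be(in, &v, 4);
  if (v != expected_version)
    return -1;
  s->version = (uint32_t) v;
  p = get_be(p, &v, 8);
  s->uptime_seconds = v;
  p = get_be(p, &v, 8);
  s->cache_hits = v;
  (void) p;
  return 0;
}
```

With an explicit version field like this, a mismatch between client and server is detected from the data itself, which is the property the embedded stamp is otherwise used for.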




Bug#783210: [PATCH] nscd_stat.c: make the build reproducible

2016-07-29 Thread Ludovic Courtès
Florian Weimer wrote:

> We still need the time-based approach if the build ID is not
> available, but I expect most distributions will have something like
> it.

FWIW in Guix we solve it by filling the ‘compilation’ array with a
substring of the installation prefix¹.

Since the installation prefix is something like
/gnu/store/5fx3vscv9pqjr1k0vyaqnpqlvvzl8rff-glibc-2.22, which comprises
a hash of all the source, build scripts, and dependencies used to build
it, we know that it uniquely identifies the result of this specific
glibc build.

The build ID should be a good approximation of this.

Ludo’.

¹ http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/base.scm#n603
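
As a rough illustration of the idea described above, the sketch below fills a fixed-size stamp from the last component of an installation prefix. It is an assumption-laden sketch, not the Guix recipe itself: the array size of 21 bytes, the choice of the leading 20 characters of the final path component, and the names STAMP_LEN and fill_stamp are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAMP_LEN 21   /* assumed size: 20 characters plus the NUL */

/* Fill stamp with the first STAMP_LEN - 1 characters of the last path
   component of prefix, zero-padding the remainder. */
static void fill_stamp(char stamp[STAMP_LEN], const char *prefix) {
  assert(stamp != NULL && prefix != NULL);

  const char *base = strrchr(prefix, '/');
  base = base ? base + 1 : prefix;

  size_t n = strlen(base);
  if (n > STAMP_LEN - 1)
    n = STAMP_LEN - 1;

  memset(stamp, 0, STAMP_LEN);
  memcpy(stamp, base, n);
}
```

For the example prefix quoted in the message, the stamp would start with the leading part of the store hash, so two builds with different inputs would get different stamps while a rebuild with identical inputs would get the same one.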



Bug#783210: [PATCH] nscd_stat.c: make the build reproducible

2016-07-28 Thread Florian Weimer

On 03/09/2016 05:30 PM, Mike Frysinger wrote:
> would it be so terrible to properly marshall this data ?

Ximin Luo and I discussed this and I wonder if it is possible to read
out the libc.so.6 build ID if it is present.  It should indirectly call
all the layout dependencies and be reasonably easy to access because it
is in an allocated section (and we might want to print it from a
libc.so.6 invocation, too).

We still need the time-based approach if the build ID is not available,
but I expect most distributions will have something like it.

The Debian bug is:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=783210

(Also Cc:ed)

Florian