Bug#794488: Re: Bug#794488: Piglit tests in Mesa crash, if radeonsi_dri.so is linked with libelf1 (libelfg0 works)
On Thu, Aug 06, 2015 at 05:14:25PM +0200, Kai Wasserbäch wrote:
> > Could you compile the following with:
> >   gcc -g -lelf -o elfrel elfrel.c
>
> this does not work for several reasons:
> 1. I certainly need -std=c99 for the inline initialisation of the counter
>    in the for() statement.

Ah, yes, this system has gcc 5.1 which defaults to gnu11.

> 2. *section (first used in »gelf_getshdr(section, &section_header)«) isn't
>    defined/filled anywhere:
> [...]
> Long story short: did you paste the entire/correct code?

Drat, so sorry. I must have copy/pasted an earlier version that didn't even compile. Attached is a version I double checked, which includes one extra check (the size of the .text section).

It gives the following output for me:

$ for i in 794488_elfs/libelf*/*; do ./elfrel $i; done
file: 794488_elfs/libelf1/dump.elf.EL5kJT
.text code size: 24
Nothing found
file: 794488_elfs/libelf1/dump.elf.J4EnbO
.text code size: 11c
symbols: 5
1: not global or undefined
2: not global or undefined
3: not global or undefined
4: not global or undefined
5: 0
relocations: 2
0: 10, SCRATCH_RSRC_DWORD1
1: 2c, SCRATCH_RSRC_DWORD0
file: 794488_elfs/libelfg0/dump.elf.7NnBvc
.text code size: 24
Nothing found
file: 794488_elfs/libelfg0/dump.elf.ahPsJJ
.text code size: 11c
symbols: 5
1: not global or undefined
2: not global or undefined
3: not global or undefined
4: not global or undefined
5: 0
relocations: 2
0: 10, SCRATCH_RSRC_DWORD1
1: 2c, SCRATCH_RSRC_DWORD0
file: 794488_elfs/libelfg0/dump.elf.DYTjdO
.text code size: 28
Nothing found
file: 794488_elfs/libelfg0/dump.elf.Lke6Xg
.text code size: 38
Nothing found

Could you run it against the old/new libelf to see if anything is different? If not, then I am looking for the bug in the wrong place.
Thanks,

Mark

#include <gelf.h>
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int
main (int argc, char **argv)
{
  elf_version(EV_CURRENT);
  printf ("file: %s\n", argv[1]);
  int fd = open (argv[1], O_RDONLY);
  Elf *elf = elf_begin (fd, ELF_C_READ, NULL);
  size_t section_str_index;
  elf_getshdrstrndx(elf, &section_str_index);

  /* Initialised so the "Nothing found" check below is well defined
     when a file has no .symtab or .rel.text section. */
  size_t reloc_count = 0, symbol_sh_link = 0, symbol_count = 0;
  Elf_Data *relocs = NULL, *symbols = NULL;

  /* Walk all sections, remembering the symbol table and the
     relocations for .text. */
  Elf_Scn *section = NULL;
  while ((section = elf_nextscn(elf, section))) {
    const char *name;
    GElf_Shdr section_header;
    if (gelf_getshdr(section, &section_header) != &section_header) {
      fprintf(stderr, "Failed to read ELF section header\n");
      return -1;
    }
    name = elf_strptr(elf, section_str_index, section_header.sh_name);
    if (strncmp(name, ".symtab", 7) == 0) {
      symbols = elf_getdata(section, NULL);
      symbol_sh_link = section_header.sh_link;
      symbol_count = section_header.sh_size / section_header.sh_entsize;
    } else if (strcmp (name, ".rel.text") == 0) {
      relocs = elf_getdata(section, NULL);
      reloc_count = section_header.sh_size / section_header.sh_entsize;
    } else if (strcmp (name, ".text") == 0) {
      Elf_Data *section_data = elf_getdata(section, NULL);
      printf (".text code size: %zx\n", section_data->d_size);
    }
  }

  if (!relocs || !symbols || !reloc_count) {
    printf("Nothing found\n");
    return -1;
  }

  printf ("symbols: %zd\n", symbol_count);
  GElf_Sym symbol;
  size_t i = 0;
  /* Note: i is already incremented when printed, so symbol 0 shows as 1. */
  while (gelf_getsym (symbols, i++, &symbol)) {
    if (GELF_ST_BIND(symbol.st_info) != STB_GLOBAL
        || symbol.st_shndx == 0) {
      printf ("%zd: not global or undefined\n", i);
      continue;
    }
    printf ("%zd: %" PRIx64 "\n", i, symbol.st_value);
  }

  printf ("relocations: %zd\n", reloc_count);
  for (size_t i = 0; i < reloc_count; i++) {
    GElf_Sym symbol;
    GElf_Rel rel;
    char *symbol_name;

    gelf_getrel(relocs, i, &rel);
    gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &symbol);
    symbol_name = elf_strptr(elf, symbol_sh_link, symbol.st_name);
    printf ("%zd: %" PRIx64 ", %s\n", i, rel.r_offset, symbol_name);
  }

  return 0;
}
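The relocation loop above relies on GELF_R_SYM(rel.r_info) to find the symbol each relocation refers to. As a rough illustration of what such a macro does, the sketch below unpacks r_info by hand for both ELF classes. The helper names (rel32_sym and friends) are made up for this example and are not libelf API; the bit layouts are the ones the ELF specification gives for ELF32_R_SYM/ELF32_R_TYPE and ELF64_R_SYM/ELF64_R_TYPE. Which class the radeonsi dumps use is not stated in this thread.

```c
#include <stdint.h>

/* Hypothetical helpers (not part of libelf) that split r_info into the
   symbol table index and the relocation type. */

/* ELF32: symbol index in the upper 24 bits, type in the low 8 bits. */
static uint32_t rel32_sym(uint32_t r_info)  { return r_info >> 8; }
static uint32_t rel32_type(uint32_t r_info) { return r_info & 0xffu; }

/* ELF64: symbol index in the upper 32 bits, type in the low 32 bits. */
static uint32_t rel64_sym(uint64_t r_info)  { return (uint32_t)(r_info >> 32); }
static uint32_t rel64_type(uint64_t r_info) { return (uint32_t)(r_info & 0xffffffffu); }
```

As far as I can tell, gelf_getrel() returns a class-independent GElf_Rel, so GELF_R_SYM presumably applies the 64-bit layout regardless of the class of the file on disk; treat that as an assumption to check against gelf.h.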
Bug#794488: Re: Bug#794488: Piglit tests in Mesa crash, if radeonsi_dri.so is linked with libelf1 (libelfg0 works)
Dear Mark,

Mark Wielaard wrote on 06.08.2015 00:29:
> On Wed, 2015-08-05 at 17:55 +0200, Kai Wasserbäch wrote:
>> So, if I've understood you correctly, you want an ELF dump of a Mesa
>> build linked against libelfg0 and one linked against libelf1. You can
>> find the generated files in the attached Tar archive. Please note, that
>> the run with libelf1 only produced two dumps before segfaulting.
>
> That seems to confirm that the generation is the same (at least for the
> first two files).
>
> I was hoping to find a difference between parsing some of these files
> with an old/new libelf. So I wrote a little program that is just the
> parsing as radeon_elf_read () does. But found no difference. Maybe I am
> not testing against the right versions though (they are my local builds).
>
> Could you compile the following with:
>   gcc -g -lelf -o elfrel elfrel.c

this does not work for several reasons:
1. I certainly need -std=c99 for the inline initialisation of the counter in the for() statement.
2. *section (first used in »gelf_getshdr(section, &section_header)«) isn't defined/filled anywhere:

# $ gcc -g -lelf -std=c99 -o elfrel elfrel.c
# elfrel.c: In function ‘main’:
# elfrel.c:26:24: error: ‘section’ undeclared (first use in this function)
#      if (gelf_getshdr(section, &section_header) != &section_header)
#                       ^
# elfrel.c:26:24: note: each undeclared identifier is reported only once for each function it appears in

While defining »Elf_Scn *section;« gives me a compiling elfrel.c, it unsurprisingly doesn't do anything afterwards, since section is empty.

Long story short: did you paste the entire/correct code? Maybe just attach the file and send an MD5 or something along with it.

Cheers,
Kai
Bug#794488: Re: Bug#794488: Piglit tests in Mesa crash, if radeonsi_dri.so is linked with libelf1 (libelfg0 works)
On Wed, 2015-08-05 at 11:13 +0900, Michel Dänzer wrote:
> Note that the ELF object is actually created in LLVM.

Do you happen to know whether that also uses libelf to generate the file?

I am assuming there is some bug in the relocation section parsing and that the generated ELF images are identical in the old/new situation. But maybe the ELF images generated are actually different?

Kai, would it be possible to run the tweaked code to dump the ELF images to disk against both the old/new libelf and see if they differ?

Thanks,

Mark
Bug#794488: Re: Bug#794488: Piglit tests in Mesa crash, if radeonsi_dri.so is linked with libelf1 (libelfg0 works)
On 05.08.2015 18:55, Mark Wielaard wrote:
> On Wed, 2015-08-05 at 11:13 +0900, Michel Dänzer wrote:
>> Note that the ELF object is actually created in LLVM.
>
> Do you happen to know whether that also uses libelf to generate the file?

AFAICT LLVM doesn't use libelf, presumably it has its own ELF code.

> I am assuming there is some bug in the relocation section parsing and
> that the generated ELF images are identical in the old/new situation.
> But maybe the ELF images generated are actually different?
>
> Kai, would it be possible to run the tweaked code to dump the ELF images
> to disk against both the old/new libelf and see if they differ?

I can't speak for Kai, but I didn't change anything about LLVM between the failing and working case, so the generated ELF objects should be identical in both cases.

-- 
Earthling Michel Dänzer   | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Bug#794488: Re: Bug#794488: Piglit tests in Mesa crash, if radeonsi_dri.so is linked with libelf1 (libelfg0 works)
Michel Dänzer wrote on 05.08.2015 12:08:
> On 05.08.2015 18:55, Mark Wielaard wrote:
>> On Wed, 2015-08-05 at 11:13 +0900, Michel Dänzer wrote:
>>> Note that the ELF object is actually created in LLVM.
>>
>> Do you happen to know whether that also uses libelf to generate the file?
>
> AFAICT LLVM doesn't use libelf, presumably it has its own ELF code.

That's correct. LLVM has its own ELF code and isn't linked against any libelf{1,g0}. You can find that code in the lib/MC directory of LLVM (online browsable version: https://github.com/llvm-mirror/llvm/tree/master/lib/MC). There are several ELF writing related files in there, like ELFObjectWriter.cpp (https://github.com/llvm-mirror/llvm/blob/master/lib/MC/ELFObjectWriter.cpp). The AMDGPU backend instantiates this in https://github.com/llvm-mirror/llvm/blob/master/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp

>> I am assuming there is some bug in the relocation section parsing and
>> that the generated ELF images are identical in the old/new situation.
>> But maybe the ELF images generated are actually different?
>>
>> Kai, would it be possible to run the tweaked code to dump the ELF images
>> to disk against both the old/new libelf and see if they differ?
>
> I can't speak for Kai, but I didn't change anything about LLVM between
> the failing and working case, so the generated ELF objects should be
> identical in both cases.

That's the same for me: I can take any Mesa build I've done and link it against libelf1 (0.163) and it fails, or link it against libelfg0 and it works. The code responsible for writing is unchanged between these two (i.e. they are linked against the same libLLVM-3.8). The radeonsi driver in Mesa only reads the generated ELF and passes it on to the kernel/GPU (AFAIK).

So, if I've understood you correctly, you want an ELF dump of a Mesa build linked against libelfg0 and one linked against libelf1. You can find the generated files in the attached Tar archive.

Please note that the run with libelf1 only produced two dumps before segfaulting.

Cheers,
Kai

[Attachment: 794488_elfs.tar.xz (application/xz)]
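The question in this part of the thread is whether the ELF images dumped with the two libelf builds differ at all. One simple way to check a pair of dumps is a byte-by-byte comparison; the sketch below is only an illustration of that idea, and files_identical is a name made up for this example, not something from Mesa or libelf. It uses nothing beyond the C standard library.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if both files have identical contents,
   0 if they differ, and -1 if either file cannot be opened.
   A read error is treated like end of file. */
static int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (!a || !b) {
        result = -1;
    } else {
        int ca, cb;
        do {
            ca = fgetc(a);
            cb = fgetc(b);
            if (ca != cb) {      /* differing byte, or one file is shorter */
                result = 0;
                break;
            }
        } while (ca != EOF);     /* both hit EOF together: identical */
    }

    if (a) fclose(a);
    if (b) fclose(b);
    return result;
}
```

The dump file names carry a random suffix, so the pairs have to be matched by hand; for instance, files_identical("794488_elfs/libelf1/dump.elf.J4EnbO", "794488_elfs/libelfg0/dump.elf.ahPsJJ") would compare the two dumps that report the same .text size (11c) elsewhere in this thread. Whether those two really correspond to the same shader is an assumption.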
Bug#794488: Re: Bug#794488: Piglit tests in Mesa crash, if radeonsi_dri.so is linked with libelf1 (libelfg0 works)
On Wed, 2015-08-05 at 17:55 +0200, Kai Wasserbäch wrote:
> So, if I've understood you correctly, you want an ELF dump of a Mesa
> build linked against libelfg0 and one linked against libelf1. You can
> find the generated files in the attached Tar archive. Please note, that
> the run with libelf1 only produced two dumps before segfaulting.

That seems to confirm that the generation is the same (at least for the first two files).

I was hoping to find a difference between parsing some of these files with an old/new libelf. So I wrote a little program that is just the parsing as radeon_elf_read () does. But found no difference. Maybe I am not testing against the right versions though (they are my local builds).

Could you compile the following with:

  gcc -g -lelf -o elfrel elfrel.c

and then run it against all these ELF files with the bad libelf, like:

  for i in 794488_elfs/libelf*/dump.elf.*; do ./elfrel $i; done

For me the output looks like:

file: 794488_elfs/libelf1/dump.elf.EL5kJT
Nothing found
file: 794488_elfs/libelf1/dump.elf.J4EnbO
symbols: 5
1: not global or undefined
2: not global or undefined
3: not global or undefined
4: not global or undefined
5: 0
relocations: 2
0: 10, SCRATCH_RSRC_DWORD1
1: 2c, SCRATCH_RSRC_DWORD0
file: 794488_elfs/libelfg0/dump.elf.7NnBvc
Nothing found
file: 794488_elfs/libelfg0/dump.elf.ahPsJJ
symbols: 5
1: not global or undefined
2: not global or undefined
3: not global or undefined
4: not global or undefined
5: 0
relocations: 2
0: 10, SCRATCH_RSRC_DWORD1
1: 2c, SCRATCH_RSRC_DWORD0
file: 794488_elfs/libelfg0/dump.elf.DYTjdO
Nothing found
file: 794488_elfs/libelfg0/dump.elf.Lke6Xg
Nothing found

Hopefully for you the output looks different with the bad (or good) libelf. Then I need to make sure I have the right bad/good version myself. Otherwise I need to dig a bit deeper to understand what is going wrong.
Thanks,

Mark

$ cat elfrel.c
#include <gelf.h>
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int
main (int argc, char **argv)
{
  elf_version(EV_CURRENT);
  printf ("file: %s\n", argv[1]);
  int fd = open (argv[1], O_RDONLY);
  Elf *elf = elf_begin (fd, ELF_C_READ, NULL);
  size_t section_str_index;
  elf_getshdrstrndx(elf, &section_str_index);

  size_t reloc_count, symbol_sh_link, symbol_count;
  Elf_Data *relocs, *symbols;

  {
    const char *name;
    GElf_Shdr section_header;
    if (gelf_getshdr(section, &section_header) != &section_header) {
      fprintf(stderr, "Failed to read ELF section header\n");
      return -1;
    }
    name = elf_strptr(elf, section_str_index, section_header.sh_name);
    if (strncmp(name, ".symtab", 7) == 0) {
      symbols = elf_getdata(section, NULL);
      symbol_sh_link = section_header.sh_link;
      symbol_count = section_header.sh_size / section_header.sh_entsize;
    } else if (strcmp (name, ".rel.text") == 0) {
      relocs = elf_getdata(section, NULL);
      reloc_count = section_header.sh_size / section_header.sh_entsize;
    }
  }

  if (!relocs || !symbols || !reloc_count) {
    printf("Nothing found\n");
    return -1;
  }

  printf ("symbols: %zd\n", symbol_count);
  GElf_Sym symbol;
  size_t i = 0;
  while (gelf_getsym (symbols, i++, &symbol)) {
    if (GELF_ST_BIND(symbol.st_info) != STB_GLOBAL
        || symbol.st_shndx == 0) {
      printf ("%zd: not global or undefined\n", i);
      continue;
    }
    printf ("%zd: %" PRIx64 "\n", i, symbol.st_value);
  }

  printf ("relocations: %zd\n", reloc_count);
  for (size_t i = 0; i < reloc_count; i++) {
    GElf_Sym symbol;
    GElf_Rel rel;
    char *symbol_name;

    gelf_getrel(relocs, i, &rel);
    gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &symbol);
    symbol_name = elf_strptr(elf, symbol_sh_link, symbol.st_name);
    printf ("%zd: %" PRIx64 ", %s\n", i, rel.r_offset, symbol_name);
  }

  return 0;
}
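The program derives the symbol and relocation counts as sh_size / sh_entsize. As a small aside, that division is undefined if a section header ever carries sh_entsize == 0, so a parser that may see malformed input might guard it. The sketch below shows one way to do that; entry_count is a hypothetical helper for illustration only and is not taken from Mesa, libelf, or the program above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of fixed-size entries in a section.
   Returns 0 instead of dividing when sh_entsize is 0. */
static size_t entry_count(uint64_t sh_size, uint64_t sh_entsize)
{
    if (sh_entsize == 0)
        return 0;
    return (size_t)(sh_size / sh_entsize);
}
```

Assuming 32-bit ELF entry sizes (8 bytes per Elf32_Rel, 16 bytes per Elf32_Sym), entry_count(16, 8) gives the 2 relocations and entry_count(80, 16) the 5 symbols seen in the output above; the actual section sizes in the dumps are not shown in this thread, so those inputs are illustrative.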
Bug#794488: Re: Bug#794488: Piglit tests in Mesa crash, if radeonsi_dri.so is linked with libelf1 (libelfg0 works)
On 04.08.2015 05:34, Kai Wasserbäch wrote:
> Dear Mark,
>
> Mark Wielaard wrote on 03.08.2015 21:47:
>> Could you point me to the source code that does the libelf calls to
>> create the ELF file? Maybe reading the source helps to figure out what
>> might go wrong. The stacktrace from the test doesn't immediately seem
>> to give a direct clue.
>
> I think all the ELF stuff is encapsulated in
> http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/radeon/radeon_elf_util.c
> (and the header for that). The functions defined therein are called from
> http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/radeonsi/si_shader.c
> and
> http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/radeonsi/si_compute.c
> if I haven't missed something.
>
> Michel can probably spot any mistakes in this, therefore I CCed him on
> this message.

Note that the ELF object is actually created in LLVM.

-- 
Earthling Michel Dänzer   | http://www.amd.com
Libre software enthusiast | Mesa and X developer