[Reproducible-builds] Hello!

2015-12-17 Thread Магда
___
Reproducible-builds mailing list
Reproducible-builds@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds

[Reproducible-builds] Bug#808207: diffoscope: Filter objdump --disassemble output before diffing it

2015-12-17 Thread Mike Hommey
Source: diffoscope
Version: 43
Severity: wishlist


When comparing large ELF binaries, some minor differences can end up hurting
the visibility of more important differences.

Specifically, objdump --disassemble displays symbols+offsets for addresses
it derives from IP-relative addressing, like the following:

   9d2be2: 48 8d 05 42 65 24 02    lea    0x2246542(%rip),%rax        # 2c1912b <_fini@@xul45a1+0x1d803>

In the particular case I'm looking at, though, some function ends up shifting
the rest of the .text section, so that the _fini symbol (and many others,
actually) moves.

So I end up with a *lot* of differences like:

<   9d2be2: 48 8d 05 42 65 24 02    lea    0x2246542(%rip),%rax        # 2c1912b <_fini@@xul45a1+0x1d803>
---
>   9d2be2: 48 8d 05 42 65 24 02    lea    0x2246542(%rip),%rax        # 2c1912b <_fini@@xul45a1+0x1d7e3>
(note: this is a diff I got manually, because it's easier to visualize than a
copy/paste of the HTML output I got from diffoscope)

The code is the same and the address is the same, but the pseudo-symbol doesn't
match. That doesn't matter, because the address actually points somewhere in
.rodata, and .rodata hasn't moved; only _fini and some earlier symbols have.
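
As a sanity check on the claim that the address is the same in both builds,
the target of the lea can be recomputed by hand: for RIP-relative addressing
it is the address of the next instruction plus the 32-bit displacement. A
small Python 3 sketch using the numbers from the listing above (the function
and variable names are made up for illustration):

```python
def rip_relative_target(insn_addr, insn_bytes):
    """Target of a 'lea disp32(%rip),reg' given its address and raw bytes."""
    # For this encoding the last four bytes are the little-endian signed
    # displacement.
    disp = int.from_bytes(insn_bytes[-4:], 'little', signed=True)
    # The displacement is relative to the address of the *next* instruction.
    return insn_addr + len(insn_bytes) + disp

insn = bytes.fromhex('488d0542652402')
target = rip_relative_target(0x9d2be2, insn)
print(hex(target))  # 0x2c1912b

# What the two objdump annotations imply for the address of _fini@@xul45a1:
fini_first = target - 0x1d803
fini_second = target - 0x1d7e3
print(hex(fini_second - fini_first))  # 0x20
```

So the target is identical; only the reference symbol sits 0x20 bytes further
along in the second binary, which is all the differing annotation says.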

In another case, the symbol between angle brackets is an actual symbol (on
non-stripped binaries) but the symbol name is different because GCC decided
to use a different suffix[1]. For example:

<   9d2f35: 48 8d 05 d1 5b 33 02    lea    0x2335bd1(%rip),%rax        # 2d08b0d <__FUNCTION__.10544+0x29d>
---
>   9d2f35: 48 8d 05 d1 5b 33 02    lea    0x2335bd1(%rip),%rax        # 2d08b0d <__FUNCTION__.10547+0x29d>

The difference might seem interesting to note, but in fact it's not, because it
will already appear in the `readelf --all` diff:

<  17956: 02d08870    21 OBJECT  LOCAL  DEFAULT   16 __FUNCTION__.10544
---
>  17956: 02d08870    21 OBJECT  LOCAL  DEFAULT   16 __FUNCTION__.10547

Anyway, those symbols between angle brackets just add noise that would be
better left out. I'm not sure, though, that objdump has an option to suppress
them (a quick glance at the binutils source suggests it doesn't). I can only
suggest piping the output of objdump through sed :-/

Something like (awful):

    @tool_required('objdump')
    @tool_required('sed')
    def cmdline(self):
        return ['sh', '-c',
                'objdump --disassemble --full-contents "%s" | sed "s/<.*>//"' % self.path]
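
The same filtering could also be done in Python on each output line, without
going through sh and sed. This is only a sketch: the helper name and both
regular expressions are hypothetical and not part of diffoscope. Unlike the
sed expression, it only touches instruction lines, so labels such as
"<_fini>:" at the start of a function are kept.

```python
import re

# Instruction lines look like "  9d2be2: 48 8d 05 ...  lea ...".
_INSN_LINE = re.compile(r'^\s*[0-9a-f]+:\s')
# Greedy on purpose, like the sed expression: C++ symbol names may
# themselves contain '<' and '>'.
_SYMBOL_REF = re.compile(r'\s*<.*>')

def strip_symbol_refs(line):
    """Drop the <symbol+offset> annotation from a disassembly line."""
    if _INSN_LINE.match(line):
        return _SYMBOL_REF.sub('', line)
    return line

a = '   9d2be2: 48 8d 05 42 65 24 02    lea    0x2246542(%rip),%rax        # 2c1912b <_fini@@xul45a1+0x1d803>'
b = '   9d2be2: 48 8d 05 42 65 24 02    lea    0x2246542(%rip),%rax        # 2c1912b <_fini@@xul45a1+0x1d7e3>'
print(strip_symbol_refs(a) == strip_symbol_refs(b))  # True
```

With the annotation gone, the two lines from the example above compare equal,
while the raw address in the comment is still there.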


Mike



1. Example of how this can happen:

$ cat > test.c 

[Reproducible-builds] Bug#808121: Bug#808121: diffoscope: HTML output is bloated

2015-12-17 Thread Esa Peuha
While we are at it, let's convert HTML character entity references
(which each use 6-8 characters and as many bytes in the HTML file)
to actual characters (which UTF-8 encodes as 2-3 bytes). Since all
diffoscope output files are peppered with abundant amounts of these
things, this could reduce the file sizes by a few percent at least.
I used Python string literals instead of the actual characters in
the Python file, because 1) the non-breaking and zero-width spaces
would be very hard to distinguish from ordinary space and missing
string content, respectively, and 2) it is impossible to be sure
that every piece of software that is ever going to be used to view
or edit the file would handle non-ASCII characters correctly.
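
To get a feel for the numbers, here is a rough Python 3 sketch comparing the
size of an entity reference with the UTF-8 size of the character it stands
for. The entity spellings in the list are assumptions, picked to match the
characters the patch writes (\xbb, \xa0, \xb7, \u200b, \xb6); the original
file may spell them differently.

```python
import html

# Assumed entity spellings; see the note above.
ENTITIES = ['&raquo;', '&nbsp;', '&middot;', '&#8203;', '&para;']

def utf8_savings(entity):
    """Bytes saved by writing the character itself instead of the entity."""
    char = html.unescape(entity)
    return len(entity.encode('utf-8')) - len(char.encode('utf-8'))

for entity in ENTITIES:
    char = html.unescape(entity)
    print('%-9s -> %-8s saves %d bytes'
          % (entity, ascii(char), utf8_savings(entity)))
```

Each replacement saves between 4 and 6 bytes under these assumptions; how much
that shrinks a whole report depends on how often the characters occur.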
--- presenters/html.py.orig 2015-12-16 19:42:25.0 +0200
+++ presenters/html.py  2015-12-17 15:10:53.654467937 +0200
@@ -290,9 +290,9 @@
 n = TABSIZE-(i%TABSIZE)
 if n == 0:
 n = TABSIZE
-t.write(''+''*(n-1))
+t.write('\xbb'+'\xa0'*(n-1))
 elif c == " " and ponct == 1:
-t.write('')
+t.write('\xb7')
 elif c == "\n" and ponct == 1:
 t.write('\')
 elif ord(c) < 32:
@@ -304,11 +304,11 @@
 i += 1
 
 if WORDBREAK.count(c) == 1:
-t.write('')
+t.write('\u200b')
 i = 0
 if i > LINESIZE:
 i = 0
-t.write("")
+t.write('\u200b')
 
 return t.getvalue()
 
@@ -353,7 +353,7 @@
 print_func(u'')
 else:
 s1 = ""
-print_func(u'')
+print_func(u'\xa0')
 
 if s2 is not None:
 print_func(u'%d ' % line2)
@@ -362,7 +362,7 @@
 print_func(u'')
 else:
 s2 = ""
-print_func(u'')
+print_func(u'\xa0')
 finally:
 print_func(u"\n", force=True)
 
@@ -522,7 +522,7 @@
 print_func(u"%s"
% escape(difference.source2))
 anchor = '/'.join(sources[1:])
-print_func(u" " % 
(anchor, anchor))
+print_func(u" \xb6" % 
(anchor, anchor))
 print_func(u"")
 if difference.comments:
 print_func(u"%s"

[Reproducible-builds] Bug#808121: Bug#808121: Bug#808121: diffoscope: HTML output is bloated

2015-12-17 Thread Jérémy Bobbio
Esa Peuha:
> While we are at it, let's convert HTML character entity references
> (which each use 6-8 characters and as many bytes in the HTML file)
> to actual characters (which UTF-8 encodes as 2-3 bytes). Since all
> diffoscope output files are peppered with abundant amounts of these
> things, this could reduce the file sizes by a few percent at least.
> I used Python string literals instead of the actual characters in
> the Python file, because 1) the non-breaking and zero-width spaces
> would be very hard to distinguish from ordinary space and missing
> string content, respectively, and 2) it is impossible to be sure
> that every piece of software that is ever going to be used to view
> or edit the file would handle non-ASCII characters correctly.

Thanks for the patch. It has been committed and pushed.

I would be grateful if you could submit ready-to-merge Git changes next
time (see git-format-patch(1)).

-- 
Lunar                                .''`.
lu...@debian.org                    : :Ⓐ  :  # apt-get install anarchism
                                    `. `'`
                                      `-



[Reproducible-builds] Bug#808267: diffoscope: Redundant information in ELF comparisons

2015-12-17 Thread Mike Hommey
Source: diffoscope
Version: 43
Severity: normal

When comparing ELF files, the following commands are used:
- readelf --all
- readelf --debug-dump
- objdump --disassemble --full-contents

objdump --disassemble --full-contents is redundant with itself: for example,
it dumps both a hexdump and a disassembly of the .text section. It's also
redundant with the output of readelf --debug-dump, because it hexdumps the
.debug_* sections that readelf --debug-dump already decodes as DWARF.

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.2.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=ja_JP.UTF-8, LC_CTYPE=ja_JP.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)



[Reproducible-builds] Bug#808267: diffoscope: Redundant information in ELF comparisons

2015-12-17 Thread Mike Hommey
On Fri, Dec 18, 2015 at 10:10:54AM +0900, Mike Hommey wrote:
> Source: diffoscope
> Version: 43
> Severity: normal
> 
> When comparing ELF files, the following commands are used:
> - readelf --all
> - readelf --debug-dump
> - objdump --disassemble --full-contents
> 
> objdump --disassemble --full-contents is actually redundant in itself. For
> example, it will dump both an hexdump and a disassembly of the .text section.
> It's also redundant with the output of readelf --debug-dump because it does an
> hexdump of the .debug_* sections that readelf --debug-dump does a dwarf dump
> of.

objdump --disassemble --full-contents also outputs a dump of e.g.
.note.gnu.build-id, which readelf --all already prints in a nicer form.
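
One possible way to act on this would be to decide per section whether a raw
hex dump still adds anything. The sketch below is purely illustrative: the
function name and its rules are hypothetical, not how diffoscope works, and
they only encode the three overlaps mentioned in this report.

```python
def needs_hexdump(section_name, is_executable=False):
    """Return True if a raw hex dump shows something no other command does."""
    if is_executable:
        # Already covered by objdump --disassemble.
        return False
    if section_name.startswith('.debug_'):
        # Already decoded by readelf --debug-dump.
        return False
    if section_name == '.note.gnu.build-id':
        # Already printed by readelf --all.
        return False
    return True

sections = [('.text', True), ('.rodata', False),
            ('.debug_info', False), ('.note.gnu.build-id', False)]
print([name for name, is_exec in sections if needs_hexdump(name, is_exec)])
# ['.rodata']
```

The remaining sections could then be dumped one at a time, for instance with
readelf --hex-dump=NAME, instead of relying on --full-contents.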

Mike
