Esa Peuha:
> While we are at it, let's convert HTML character entity references
> (which each use 6-8 characters and as many bytes in the HTML file)
> to actual characters (which UTF-8 encodes as 2-3 bytes). Since all
> diffoscope output files are peppered with abundant amounts of these
> things, this could reduce the file sizes by a few percent at least.
>
> I used Python string literals instead of the actual characters in
> the Python file, because 1) the non-breaking and zero-width spaces
> would be very hard to distinguish from ordinary space and missing
> string content, respectively, and 2) it is impossible to be sure
> that every piece of software that is ever going to be used to view
> or edit the file would handle non-ASCII characters correctly.
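
For illustration only, a minimal sketch of the idea described in the quoted text. This is not the actual diffoscope patch: the names ENTITY_TO_CHAR, replace_entities and utf8_size, and the choice of entities, are assumptions made up for this example. The characters are written as escape sequences in the Python source so the file stays pure ASCII and the invisible ones remain visible to whoever edits it.

```python
# Hypothetical sketch, not diffoscope's code: the mapping and the function
# names below are assumptions chosen for illustration.
ENTITY_TO_CHAR = {
    "&nbsp;": "\xa0",      # non-breaking space, 2 bytes in UTF-8
    "&#8203;": "\u200b",   # zero-width space, 3 bytes in UTF-8
    "&para;": "\xb6",      # pilcrow sign, 2 bytes in UTF-8
}


def replace_entities(html):
    """Replace each known entity reference with the character it names."""
    for entity, char in ENTITY_TO_CHAR.items():
        html = html.replace(entity, char)
    return html


def utf8_size(text):
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


sample = "a&nbsp;b&#8203;c"
converted = replace_entities(sample)
print(utf8_size(sample), utf8_size(converted))  # prints "16 8"
```

The saving on a real report depends on how densely the entities occur, so the figures above only describe this tiny sample.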

Thanks for the patch. It has been committed and pushed.

I would be grateful if you could submit ready-to-merge Git changes next
time (see git-format-patch(1)).

-- 
Lunar                                .''`.
lu...@debian.org                    : :Ⓐ  :  # apt-get install anarchism
                                    `. `'`
                                      `-