Thanks for the feedback.

On Sat, 8 Apr 2023, 12:32 Guillem Jover, <guil...@debian.org> wrote:
> Hi!
>
> On Sun, 2023-03-19 at 17:09:00 +0100, Juanmi Taboada wrote:
> > Checking documentation for deb packages, I read that the control file
> > should be UTF-8:
> >
> >    - Reference:
> >    https://www.debian.org/doc/debian-policy/ch-controlfields.html
> >    - 5.1 Syntax of control files at the end: *"All control files must be
> >    encoded in UTF-8."*
> >
> > I was able to build a non-utf8 package using *dpkg -b*.
> >
> > This was originally reported in Landscape-Client:
> > https://bugs.launchpad.net/landscape-client/+bug/1813442
> > Making reference to the first version, '1.0.0.944' of the package
> > "veeam".
> > The report points:
> > "The strange character is the U+FFFD � REPLACEMENT CHARACTER."
> >
> > I was able to reproduce the problem in Landscape Client, and I discovered
> > the error came from a wrong encoding used in the control file.
> > I made a wrong encoded description, which reproduced the error on our
> > side.
> >
> > Nevertheless, it is not a bug in Landscape but in dpkg, which allowed
> > building a deb package with a wrong encoded control file.
>
> The dpkg deb822(5) man page has similar wording, I think mostly
> because it was adapted from the Debian policy. So, while I think
> settling on UTF-8 for the only supported encoding makes sense, dpkg
> itself does not really care, and will work with pretty much any
> encoding thrown at it, for the things it cares it restricts itself
> to just ASCII and tries to validate that strictly.
>
> In this case I think there might be four (or more) potential bugs
> here:
>
>  1) The deb822(5) man page should probably be clarified to distinguish
>     what to expect about encodings.
>  2) The dpkg-source (et al), dpkg-deb and dpkg might perhaps need to be
>     improved to be more strict when parsing, and validating their
>     inputs, including encoding.
>  3) The affected packages with wrong encoding should get bugs filed
>     and fixed.
>  4) The landscape client software should ideally cope more gracefully,
>     and not fail when confronted with wrongly encoded files? Because
>     these can also be generated by something that is not dpkg-deb, as
>     people seem to be fond of creating their own .deb packers for their
>     build systems and other tooling.
>
> > The broken description package is attached for further study.
>
> Thanks, I've added an entry to my TODO to handle the above items from
> the dpkg side.
>
> Regards,
> Guillem
>
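
For points 2) and 4), here is a rough sketch of the kind of check I have in mind. It is plain Python using only the standard library; it is not what dpkg or Landscape actually do, and the function names and the sample Description text are made up for illustration. The strict check reports where the first non-UTF-8 byte sits, and the lenient decode shows how a U+FFFD can end up in a description when a reader substitutes bad bytes instead of failing.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None.

    Hypothetical helper: a strict check a packer could run on the raw
    control file before building the package.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def decode_leniently(data: bytes) -> str:
    """Decode without failing; invalid bytes become U+FFFD.

    Hypothetical helper: one way a consumer could cope with a wrongly
    encoded control file instead of raising an error.
    """
    return data.decode("utf-8", errors="replace")


# Made-up sample field, encoded correctly and then wrongly (Latin-1).
text = "Description: caf\u00e9 backup agent\n"
good = text.encode("utf-8")
bad = text.encode("latin-1")

print(first_invalid_utf8_offset(good))
print(first_invalid_utf8_offset(bad))
print(decode_leniently(bad))
```

The strict variant is what I would expect at build time, and the lenient one is only meant for tools that have to read whatever is already out there.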