On Fri, Aug 24, 2018 at 05:14:47PM +0200, Thomas Lamprecht wrote:
> On 8/24/18 4:56 PM, Dietmar Maurer wrote:
> > BTW, why do you think /etc/hosts may contain utf8 characters?
> > Is that defined/documented somewhere?
> >
>
> With his patch the whole file including comments gets returned as "raw",
> and comments can contain utf-8 (not defending the binmode)

I'm not *completely* opposed to patches that enforce utf-8 on *certain*
files, particularly ones whose non-comment content needs to be ASCII
compatible anyway.

Here's the issue with utf-8: If the file contains non-ASCII characters, our
code currently reads it as-is, that is, a byte '234' will become the code
point '234' in perl's internal string. This means a single utf-8 encoded
letter is treated as 2 or more letters in the range 128..255 (as is to be
expected).

When serializing this to JSON in the API's output code, we tell to_json to
produce utf-8. This produces the utf-8 representation for *each* of the
bytes that initially made up the character, separately. The GUI then decodes
that to produce the same string perl was using internally: something
containing 2 or more code points for what was initially a single utf-8
encoded letter. This of course shows up as garbage in the GUI.

One could argue we're currently using a file encoding of latin-1 /
iso-8859-1, as its code points 0..255 AFAIK map directly to unicode...
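
To make the round trip above concrete, here is a minimal sketch in Python.
It is not our Perl code, only an illustration of the mechanism: decoding as
latin-1 stands in for reading the file as-is (every byte becomes the code
point with the same value), and json.dumps plus an explicit utf-8 encode
stands in for the API's to_json call. The sample letter is an arbitrary
choice.

```python
import json

# One utf-8 encoded letter as stored in the file: U+00E9 is the bytes C3 A9.
raw = "\u00e9".encode("utf-8")

# Reading as-is: each byte becomes the code point of the same value.
# This is exactly what decoding as latin-1 does.
internal = raw.decode("latin-1")      # two code points: U+00C3, U+00A9

# API output: serialize to JSON and emit utf-8. Each of the two code
# points is encoded separately, giving four bytes for the payload.
wire = json.dumps(internal, ensure_ascii=False).encode("utf-8")

# GUI side: decoding the utf-8 JSON yields the same two code points,
# not the single letter that was in the file.
shown = json.loads(wire.decode("utf-8"))

# For contrast: decoding the file content as utf-8 when reading it
# keeps the letter as one code point through the whole round trip.
decoded = raw.decode("utf-8")
wire_ok = json.dumps(decoded, ensure_ascii=False).encode("utf-8")
shown_ok = json.loads(wire_ok.decode("utf-8"))
```

In the first path the GUI ends up with two characters in the 128..255 range
instead of one letter, which is the garbage described above. The second path
shows why enforcing utf-8 on read would avoid it, assuming the file really
is valid utf-8.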