On 5 August 2011 16:26, Ferenc Kovacs <tyr...@gmail.com> wrote:
>> If I tell you I'm on Windows ...
> my desktop PC also runs windows, but I'm running my development
> machine on linux inside vmware.
>
>>
>> Any XML file that contains only ASCII (0x20-0x7F), I've already
>> changed the xml encoding to UTF-8, as there is no difference in the
>> byte values.
>>
>> On the assumption that the XML encode="" value is accurate, would
>> using mb_convert_encoding() be enough?
>
> yeah, or you can also use iconv
>
>>
>> Find files NOT UTF-8, read XML encoding, use mb_convert_encoding() to
>> convert file and save.
>
> sounds good
>
>>
>> If that works, then there are 6913 XML files in phpdoc translations
>> NOT UTF-8 (3521 ISO-8859-1, 2901 ISO-8859-2, 425 ISO-8859-7, 65 BIG5
>> and 1 ISO-8859-8).
>>
>> In doing this, toggling between the two versions (ISO encoded and
>> UTF-8 encoded), my editor doesn't seem to show any differences.
>>
>> If I do a full file comparison (which is encoding aware), my editor
>> says the only difference is in the <?xml > line due to the encoding.
>>
>> Running a diff shows the entire file to be different (as expected).
>>
>> Is that what you'd expect?
>>
>
> yep.
>
> --
> Ferenc Kovács
> @Tyr43l - http://tyrael.hu
>
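For anyone following along, the "read XML encoding, convert, save" step quoted above can be sketched roughly as below. This is only an illustration under assumptions: it is written in Python rather than PHP, so bytes.decode()/str.encode() stand in for mb_convert_encoding() or iconv; the function name to_utf8 is made up for this example; and it assumes the file has no byte order mark and that the declared encoding is ASCII-compatible (true of the ISO-8859-x and BIG5 files counted above).

```python
import re

# Matches an XML declaration up to and including its encoding value, e.g.
#   <?xml version="1.0" encoding="ISO-8859-2"
# Group 1: text up to the opening quote, group 2: the quote character,
# group 3: the declared encoding name.
_DECL = re.compile(
    rb"""^(<\?xml[^>]*?encoding\s*=\s*)(["'])([A-Za-z0-9._-]+)\2"""
)


def to_utf8(raw: bytes) -> bytes:
    """Return raw re-encoded as UTF-8, with the XML declaration updated.

    Hypothetical helper: trusts the declared encoding, as the thread does.
    """
    m = _DECL.match(raw)
    if m is None:
        # No encoding declared: leave the file untouched.
        return raw
    declared = m.group(3).decode("ascii")
    if declared.lower() == "utf-8":
        return raw
    # The declaration itself is plain ASCII, so only its encoding value
    # needs rewriting; everything after it is transcoded.
    new_head = m.group(1) + m.group(2) + b"UTF-8" + m.group(2)
    body = raw[m.end():].decode(declared)  # raises if bytes do not match
    return new_head + body.encode("utf-8")
```

Reading each file in binary mode, passing the bytes through to_utf8() and writing the result back would cover the "convert file and save" part. A decode error here means the declared encoding was not accurate for that file.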
Then I've just converted all outstanding XML files to UTF-8.

I won't be pushing all of them in one go ... tempted, but I think
Hannes and every translator would be screaming down my neck for that.

So, on Monday, I'll be pushing them through in very small chunks
(probably on a per-language, per-extension basis).

I'll then look into a pre-commit hook for XML files that are not
UTF-8 encoded.

-- 
Richard Quadling
Twitter    : EE                    : Zend           : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea
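As a starting point for the pre-commit check mentioned above, the test itself could look something like this. It is a sketch under assumptions, not the actual hook: it is in Python, the name is_utf8_xml is invented for the example, and how the hook receives the file contents from the version control system is left out entirely.

```python
import re

# Captures the encoding name from an XML declaration at the start of a file.
_ENC = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z0-9._-]+)["']"""
)


def is_utf8_xml(raw: bytes) -> bool:
    """True if raw is valid UTF-8 and does not declare another encoding.

    Hypothetical check: a hook would reject the commit when this is False.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    m = _ENC.match(raw)
    # A file with no encoding declaration is accepted here, since it
    # already passed the UTF-8 decoding test.
    return m is None or m.group(1).lower() == b"utf-8"
```

Note that an ASCII-only file still fails this check if its declaration names a different encoding, which matches the earlier step of updating the declaration in ASCII-only files.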