On Thu, 2009-05-28 at 14:40 -0700, WP Blatchley wrote:
> >> Could an Iyonix owner tell me if RISC OS 5 can properly display
> >> UTF-8 encoded messages in windows and menus?
> >
> > The RO5 Wimp speaks Unicode. If you set your system alphabet to UTF8,
> > then it will display such strings correctly. The downside, however, is
> > that most applications have no knowledge of UTF-8, so their menu text
> > will likely come out garbled (invariably, it's the shift arrow which
> > confuses matters).
> >
> > My understanding is that for the embedded RO boxes shipped in Japan, the
> > system alphabet was set to UTF8 by default. I suspect it's likely that
> > you will need to softload ROOL International and InternationalKeyboard
> > modules to be able to set the system alphabet to UTF8. I suspect that
> > it's unwise to attempt to softload the ROOL Wimp on a ROL OS, however.
>
> So if I were to translate the necessary files, someone could at least give
> me a screenshot of it running on RO5?

I expect so, yes.

> That would be satisfying! I'll have to try to get RO5 running under
> emulation on Windows at least, so I can see my translation in action!

There's not yet a complete RO5 ROM image for emulation.

> It's a shame that setting the system alphabet to UTF8 will break a lot of
> applications' menus (and slightly ironic, seeing as UTF8 was designed to
> slot into non-Unicode aware systems without causing problems).

Top-bit-set characters will always be misinterpreted, in the general
case.

> Perhaps there could be a hack written that tries to assess whether a
> Messages file is UTF8-encoded or Latin-1-encoded, and transcode the
> strings accordingly on the fly... Still, that's a discussion for another
> mailing list, I suppose.

It's somewhat awkward as the thing doing the translation has no idea
where the strings will be used. Therefore, transcoding messages as
they're loaded from the Messages file will probably break things. Of
course, it's possible to trigger such transcoding by sticking some magic
value in the Messages file, but that then requires application authors
to do whatever work is necessary to cope with it.

Additionally, many applications a) don't use MessageTrans and b) don't
have Messages files, so you won't catch those by extending MessageTrans.

For strings drawn in icons and menus by the Wimp, it should be
relatively easy for the Wimp to distinguish between UTF-8 text and
legacy 8-bit text. The difficulty arises with working out which 8-bit
character set the text is in. In most cases, it'll be Acorn Latin 1, but
it's guaranteed that there are edge cases.

It occurs to me that it's likely that the Wimp doesn't even inspect
strings to be drawn -- it just throws them at the Font Manager, having
opened the desktop font without an explicit encoding specifier, so it'll
use whatever the system alphabet is set to. Perhaps, therefore, the Font
Manager should attempt to fix this case up.

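To make the detection idea above concrete, here is a minimal sketch in Python (not RISC OS code, and not how the Wimp or Font Manager actually behave). It tries strict UTF-8 decoding first and falls back to a legacy 8-bit encoding. The function name `decode_menu_text` is hypothetical, and Python's `latin-1` codec (ISO 8859-1) is used only as a stand-in for Acorn Latin 1, which differs from it in the 0x80-0x9F range.

```python
from typing import Tuple


def decode_menu_text(raw: bytes, legacy_encoding: str = "latin-1") -> Tuple[str, str]:
    """Guess the encoding of a byte string and return (text, encoding_used).

    Hypothetical helper for illustration. `legacy_encoding` is an assumed
    stand-in for whichever 8-bit alphabet the application really used.
    """
    # Pure 7-bit text reads the same in UTF-8 and in the legacy alphabets.
    if all(b < 0x80 for b in raw):
        return raw.decode("ascii"), "ascii"
    try:
        # Strict decoding rejects stray top-bit-set bytes, so most legacy
        # 8-bit strings fail here rather than being silently accepted.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: assume the legacy alphabet. This is a guess;
        # nothing in the bytes says which 8-bit character set was meant.
        return raw.decode(legacy_encoding), legacy_encoding


if __name__ == "__main__":
    for sample in (b"Open", b"Caf\xe9", "Caf\u00e9".encode("utf-8")):
        print(sample, "->", decode_menu_text(sample))
```

The heuristic has the edge cases mentioned above: a legacy string whose top-bit-set bytes happen to form a valid UTF-8 sequence (for example the two bytes 0xC3 0xA9) will be taken as UTF-8, and a string that fails the UTF-8 check is only assumed to be in the fallback alphabet.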
Finally, setting the system alphabet to UTF-8 changes the way in which
the Font Manager works when an application opens a font without an
explicit encoding. That's going to break many things because far too few
applications deal with character sets and font encodings at all, let
alone properly.

> Any pointers as to where to start with the translation files?

Copy !NetSurf.Resources.en.Messages to !NetSurf.Resources.ja.Messages,
translate the strings, and send us the resulting file. That's about it.
The text at the top of the Messages file describes what the format is.

Modern Zap can edit UTF-8 encoded files quite happily -- you may have to
tell it that a file is UTF-8 encoded; I can't remember, offhand.

J.
