Haxe wrote:
> I don't believe that there are still systems in use that use a
> non-unicode implementation of wchar_t. At least on GNU systems, it is
> always UCS-4. If anyone uses GTKG on a system with a non-unicode
> wchar_t, please raise your hand.

I don't think every user reads this list or this mail. There's a macro
to check at compile-time whether wchar_t uses Unicode. That check works
in only one direction, of course: if the macro is undefined, that proves
nothing. It's not defined here. Testing this at runtime seems difficult.

> > I really have no idea how to convert wchar_t * (or win_t *) strings
> > to UTF-8 without that knowledge. Even if we blindly assume it's
> > Unicode it may be UTF-8, UCS-2, UTF-16, UTF-32 maybe even a
> > system-specific non-standard encoding of Unicode codepoints.

> You don't need to know the internal encoding of wchar_t. iconv_open()
> understands "WCHAR_T" as encoding name.

That's specific to GNU iconv, but on Linux systems we don't have trouble
with the names of encodings anyway. We can probably detect this feature
at runtime, but we need a fall-back, which means keeping the current
handling nonetheless. Maybe it's better to make use of this feature when
it appears to be present, though.

> > Which is something "all others" do not. Should I remove it again?

> No, it's OK :-). That's not that big a point. I only said that to point
> out that there are already options that cause GTKG to store a file on
> disk with a name that is not exactly the name by which it was found on
> the net. And it also uses underscores, with all implications (loss of
> information, separation of gnutella keywords).

Well, actually many clients (e.g., BearShare and Morpheus, but possibly
all except Gtk-Gnutella and LimeWire) still don't convert strings to
UTF-8. So if you see underscores, they might as well be caused by trying
to convert a string to UTF-8 under the assumption that it's encoded in
your current locale. Of course, we could try a dozen different encodings
and use the first one that seems to be convertible, but I think that
would suck performance-wise. I considered adding a popup item that lets
you select a different encoding for each result, as Mozilla does, but
I'm not sure it's worth the effort.

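To make the iconv point above concrete, here is a minimal sketch of
probing for the "WCHAR_T" encoding name at runtime and telling the
caller to fall back when it is missing. The helper names
wchar_is_unicode_at_compile_time() and wcs_to_utf8() are made up for
illustration and are not taken from the Gtk-Gnutella sources. I'm also
assuming that the compile-time macro meant above is the standard C
__STDC_ISO_10646__, and that iconv() takes a char ** input pointer, as
it does in glibc.

```c
#include <iconv.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if the implementation promises that
 * wchar_t holds ISO 10646 (Unicode) code points, 0 if nothing is known.
 * Assumes the macro in question is __STDC_ISO_10646__. */
int wchar_is_unicode_at_compile_time(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;   /* undefined proves nothing; wchar_t may still be Unicode */
#endif
}

/* Hypothetical helper: converts a wide string to a malloc()ed UTF-8
 * string. Returns NULL if this iconv does not know "WCHAR_T" or if the
 * conversion fails, so that the caller can use its existing fall-back. */
char *wcs_to_utf8(const wchar_t *ws)
{
    iconv_t cd = iconv_open("UTF-8", "WCHAR_T");
    if (cd == (iconv_t) -1)
        return NULL;            /* feature not present at runtime */

    size_t len = wcslen(ws);
    size_t in_left = len * sizeof(wchar_t);
    size_t out_size = len * 4 + 1;  /* valid Unicode needs at most 4 bytes
                                       of UTF-8 per wchar_t */
    char *out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *) ws;     /* iconv() does not modify the input */
    char *out_ptr = out;
    size_t out_left = out_size - 1; /* keep room for the terminating NUL */

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);

    if (rc == (size_t) -1) {        /* invalid input or buffer too small */
        free(out);
        return NULL;
    }
    *out_ptr = '\0';
    return out;
}
```

A caller would try wcs_to_utf8() first and only use the current
conversion code when it returns NULL. Opening a new descriptor for every
call keeps the sketch short; real code would probably open it once and
reuse it.
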
That's a drawback of an open network: it can easily turn into anarchy.

> And during my experiments with a utf-8 environment, I already
> encountered the first problems in other applications. amaroK, for
> example, which is my preferred audio player, has a bug that causes
> problems if files with utf-8 encoded filenames are called from the
> command line.

AFAIK, it's still actively maintained, so I'd assume that if you submit
a bug report it'll be fixed fairly soon. That's just how things are:
unused stuff is almost always buggy. I'm sure others noticed the bugs
before you, but out of 100, maybe 1000, there's only one who reports the
problem.

On an (un)related note, I find it odd that people run into problems with
Gtk-Gnutella yet never ask here, but rather on a random forum or mailing
list where people know ish about it and then recommend some random crap
instead.

-- 
Christian
