Haxe wrote:
> I don't believe that there are still systems in use that use a 
> non-unicode implementation of wchar_t. At least on GNU systems, it is 
> always UCS-4. If anyone uses GTKG on a system with a non-unicode 
> wchar_t, please raise your hand.

I don't think every user reads this list or this mail. There's a
macro to check at compile-time whether wchar_t uses Unicode. That
check works in one direction only, of course: if the macro is
missing, wchar_t may still be Unicode. It's not defined here.
Testing this at runtime seems difficult.
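
A rough sketch of what such a compile-time check could look like,
assuming the macro in question is C99's __STDC_ISO_10646__ (the mail
does not name it). The function name wchar_to_utf8_char() is made up
for illustration and is not part of Gtk-Gnutella:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: encode one wchar_t as UTF-8 into buf (at least
 * 4 bytes). Returns the number of bytes written, or 0 if the value is
 * not a valid code point or if wchar_t is not known to hold Unicode.
 */
size_t
wchar_to_utf8_char(wchar_t wc, unsigned char *buf)
{
#ifdef __STDC_ISO_10646__
    /* Assumption: this macro means wchar_t values are ISO 10646 code points. */
    unsigned long uc = (unsigned long) wc;

    assert(buf != NULL);
    if (uc < 0x80UL) {
        buf[0] = (unsigned char) uc;
        return 1;
    } else if (uc < 0x800UL) {
        buf[0] = (unsigned char) (0xC0 | (uc >> 6));
        buf[1] = (unsigned char) (0x80 | (uc & 0x3F));
        return 2;
    } else if (uc < 0x10000UL) {
        if (uc >= 0xD800UL && uc <= 0xDFFFUL)
            return 0;   /* surrogates are not valid code points */
        buf[0] = (unsigned char) (0xE0 | (uc >> 12));
        buf[1] = (unsigned char) (0x80 | ((uc >> 6) & 0x3F));
        buf[2] = (unsigned char) (0x80 | (uc & 0x3F));
        return 3;
    } else if (uc <= 0x10FFFFUL) {
        buf[0] = (unsigned char) (0xF0 | (uc >> 18));
        buf[1] = (unsigned char) (0x80 | ((uc >> 12) & 0x3F));
        buf[2] = (unsigned char) (0x80 | ((uc >> 6) & 0x3F));
        buf[3] = (unsigned char) (0x80 | (uc & 0x3F));
        return 4;
    }
    return 0;           /* out of the Unicode range */
#else
    /* Macro absent: the encoding of wchar_t is unknown, not proven non-Unicode. */
    (void) wc;
    (void) buf;
    return 0;
#endif
}
```

Where the macro is missing the helper simply refuses to convert, which
is the "one direction" limitation: the caller still needs some other
way to handle the string.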
 
> > I really have no idea how to convert wchar_t * (or wint_t *) strings
> > to UTF-8 without that knowledge. Even if we blindly assume it's
> > Unicode it may be UTF-8, UCS-2, UTF-16, UTF-32 maybe even a
> > system-specific non-standard encoding of Unicode codepoints.
 
> You don't need to know the internal encoding of wchar_t. iconv_open() 
> understands "WCHAR_T" as encoding name.

That's specific to GNU iconv, but on Linux systems we don't have
trouble with encoding names anyway. We can probably detect
this feature at runtime, but we still need a fall-back, which means
keeping the current handling. Maybe it's better to make use of this
feature where it seems to be present, though.
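
Roughly what I have in mind for the runtime detection, as a sketch
only. It assumes an iconv() with the glibc-style prototype; the name
wcs_to_utf8_iconv() is hypothetical. If iconv_open() rejects "WCHAR_T",
the caller keeps using the existing code path:

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert a NUL-terminated wide string to UTF-8
 * using iconv's "WCHAR_T" encoding name. Returns 0 on success and -1
 * if the name is not supported or the conversion fails, in which case
 * the caller should use its fall-back.
 */
int
wcs_to_utf8_iconv(const wchar_t *src, char *dst, size_t dst_size)
{
    iconv_t cd;
    char *in = (char *) src;    /* iconv() takes a non-const pointer */
    char *out = dst;
    size_t in_left, out_left, ret;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return -1;

    cd = iconv_open("UTF-8", "WCHAR_T");
    if (cd == (iconv_t) -1)
        return -1;              /* feature not available here */

    in_left = wcslen(src) * sizeof *src;
    out_left = dst_size - 1;    /* keep room for the trailing NUL */
    ret = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);

    if (ret == (size_t) -1)
        return -1;              /* invalid input or buffer too small */

    *out = '\0';
    return 0;
}
```

Opening the descriptor once at startup and caching the result would
avoid paying for iconv_open() on every string.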

> > Which is something "all others" do not. Should I remove it again?
 
> No, it's OK :-). That's not that big a point. I only said that to point 
> out that there are already options that cause GTKG to store a file on 
> disk with a name that is not exactly the name by which it was found on 
> the net. And it also uses underscores, with all implications (loss of 
> information, separation of gnutella keywords).

Well actually many clients (e.g., BearShare, Morpheus but possibly
all except Gtk-Gnutella and LimeWire) still don't convert strings
to UTF-8. So if you see underscores that might as well be caused
by trying to convert a string to UTF-8 under the assumption that
it's encoded in your current locale. Of course, we could try a
dozen different encodings and use the first one which seems to
be convertible, but I think that would suck performance-wise.
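
One cheap step before guessing at encodings is to test whether the
string is already well-formed UTF-8 and only attempt a conversion if
it is not. A minimal sketch of such a check follows; utf8_is_valid()
is a name made up for this example, not the function Gtk-Gnutella
actually uses:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the len bytes at s are well-formed
 * UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF),
 * 0 otherwise.
 */
int
utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);
    while (i < len) {
        unsigned char c = s[i];
        unsigned long cp;
        size_t n, k;            /* n = number of continuation bytes */

        if (c < 0x80) {
            i++;                /* plain ASCII */
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07;
        } else {
            return 0;           /* stray continuation or invalid lead byte */
        }

        if (len - i <= n)
            return 0;           /* sequence truncated at end of buffer */

        for (k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;       /* not a continuation byte */
            cp = (cp << 6) | (unsigned long) (s[i + k] & 0x3F);
        }

        /* Reject overlong encodings, surrogates and out-of-range values. */
        if ((n == 1 && cp < 0x80UL) || (n == 2 && cp < 0x800UL) ||
            (n == 3 && cp < 0x10000UL))
            return 0;
        if ((cp >= 0xD800UL && cp <= 0xDFFFUL) || cp > 0x10FFFFUL)
            return 0;

        i += n + 1;
    }
    return 1;
}
```

A Latin-1 name such as "caf\xE9" fails this test, so only strings like
that would need the more expensive guessing.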

I considered adding a popup item that allows you to select a
different encoding for each result, as Mozilla does, but I'm not sure
it's worth the effort. That's a drawback of an open network, it can
easily turn into anarchy.

> And during my experiments with a utf-8 environment, I already 
> encountered the first problems in other applications. amaroK, for 
> example, which is my preferred audio player, has a bug that causes 
> problems if files with utf-8 encoded filenames are called from the 
> command line.

AFAIK, it's still actively maintained. So I'd assume if you submit
a bug report it'll be fixed fairly soon. That's just how things
are, unused stuff is almost always buggy. Though I'm sure others
noticed the bugs before you but out of 100 - maybe 1000 - there's
only one who reports the problem.

On a (un)related note, I find it odd if people run into problems
with Gtk-Gnutella but never ask here but rather on a random
forum or mailing list where people know nothing about it and then
recommend some random crap instead.

-- 
Christian
