Daichi Kawahata wrote:
> However where outgoing query is concerned, gtkg with GTK2 and ICU is
> dropping my search query (some Japanese, Chinese, Russian) automatically
> with '(WARNING): dropping invalid local query ""'.

This can happen if you use ICU but it doesn't support your locale settings.
That should be fixed in CVS now. However, I've noticed that this also
happens for some queries even when not using ICU. For example, if I search
for the Kanji that is pronounced "no" (it looks like the traffic sign for
"no parking"), I get the above error message.
There's something wrong in compact_query() and/or compact_query_utf8().

> > If a string is not UTF-8 encoded, gtk-gnutella can only guess the encoding
> > used, which means it falls back to the locale character set in use, boldly
> > assuming that the user is rather interested in search results from
> > users/machines using the same locale settings.
 
> Ah, my problem might be around here. If there were a feature to check the
> filenames I'm currently sharing (I know the number of files and their size
> are shown, and LimeWire has all of these features), or a notification when
> I try to share a file whose name is invalidly encoded, encoding problems
> would be less of an issue for those who, like me, are annoyed by bogus
> strings. Granted, that would only apply to outgoing hits from the local DB
> in gtkg...

It's a pity, but gtk-gnutella still doesn't have a feature to show you which
files you share. Well, maybe it's not a bad idea to share only files whose
names are either UTF-8 encoded or can be converted to UTF-8 based on your
locale settings. Since query matching cannot work properly otherwise,
sharing files with broken names might be pointless anyway.
I don't think gtk-gnutella emits (many) false positives at the moment.
At least, I don't see such behaviour from gtk-gnutella in the wild.
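
To illustrate the first half of such a check, here is a rough sketch of how
one could test whether a filename is well-formed UTF-8 before sharing it.
This is not code from gtk-gnutella; name_is_valid_utf8() is a made-up name
and the exact policy (rejecting overlong forms and surrogates) is my own
assumption about what "valid" should mean here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of gtk-gnutella: returns true if the
 * `len' bytes at `s' are well-formed UTF-8. Overlong encodings,
 * surrogates and code points above U+10FFFF are rejected.
 */
bool
name_is_valid_utf8(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t i = 0;

    assert(s != NULL);

    while (i < len) {
        unsigned char c = p[i];
        size_t n, k;          /* n: continuation bytes expected */
        unsigned long cp;     /* decoded code point */
        unsigned long min;    /* smallest code point for this length */

        if (c < 0x80) {       /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;     /* stray continuation or invalid lead byte */
        }

        if (len - i <= n)
            return false;     /* sequence is truncated */

        for (k = 1; k <= n; k++) {
            if ((p[i + k] & 0xC0) != 0x80)
                return false; /* not a continuation byte */
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;     /* overlong, out of range, or surrogate */

        i += n + 1;
    }
    return true;
}
```

A name that passes could be shared as-is; one that fails would be a
candidate for conversion from the locale character set, or for being
skipped with a warning.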

> My search query is 'limewire', 'japanese' which brings many Japanese filenames.

I copy-pasted some Kanji from other search results or Mozilla. That brings
up many more Japanese filenames, at least here.

> My encoding is ja_JP.EUC (LANG, LC_ALL). There are two versions of
> libiconv here; I'm using a locally installed GNU libiconv 1.9.2 to get
> the extra encodings. I'm a bit hesitant to force my whole environment to
> UTF-8, since my system doesn't have it. Well ok, I'll have to write a
> wrapper script.

Well, if you share files with names that are neither ASCII nor UTF-8
encoded, that's probably problematic. gtk-gnutella does not convert the
filenames of your shared files to UTF-8, as far as I can see. Usually,
the operating system won't even restrict filenames to any encoding and
handles them like pure binary data (except for / and \0). So it might
even be impossible to convert the filenames to UTF-8. At least, it's quite
difficult if you have a mix of differently encoded filenames.
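
For the simple case where all names are in one known character set (say,
the one from your locale), the conversion could be sketched with the POSIX
iconv() interface as below. Again, this is only an illustration and not
what gtk-gnutella does; name_to_utf8() and the buffer-size guess are my own
assumptions.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of gtk-gnutella: converts the
 * NUL-terminated `name' from charset `from' (e.g. "EUC-JP") to UTF-8.
 * Returns a malloc()ed string the caller must free(), or NULL if the
 * charset is unknown or the name is not valid in that charset.
 */
char *
name_to_utf8(const char *name, const char *from)
{
    iconv_t cd;
    size_t in_left, out_size, out_left;
    char *in, *out, *buf;

    assert(name != NULL && from != NULL);

    cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t) -1)
        return NULL;                /* this iconv doesn't know `from' */

    in_left = strlen(name);
    out_size = in_left * 4 + 1;     /* assumed to be generous enough */
    buf = malloc(out_size);
    if (buf == NULL) {
        iconv_close(cd);
        return NULL;
    }

    in = (char *) name;             /* iconv() wants a non-const pointer */
    out = buf;
    out_left = out_size - 1;        /* keep room for the trailing NUL */

    if (iconv(cd, &in, &in_left, &out, &out_left) == (size_t) -1) {
        /* Invalid or incomplete input, or the buffer was too small. */
        free(buf);
        buf = NULL;
    } else {
        *out = '\0';
    }

    iconv_close(cd);
    return buf;
}
```

With a mix of encodings this doesn't help much, of course: a failed
conversion only tells you the name isn't valid in the charset you guessed,
and a successful one doesn't prove the guess was right.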

The GTK+ developers encourage UTF-8 encoding for filenames and you'll
see that the file dialogs do not even show files with an invalid
encoding.

-- 
Christian
