Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-07-11 Thread Guy Harris
On Jul 11, 2011, at 4:00 PM, Stephen Fisher wrote: > The popular SecureCRT terminal emulator defaults to "default" (same as > local system) character encoding, at least on Windows systems. This is > not compatible with UTF-8 in my experience. Not surprising, given that "default"/"same as loca

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-07-11 Thread Stephen Fisher
On Tue, Jun 28, 2011 at 10:01:14AM -0700, Guy Harris wrote: > I don't know what the various terminal emulators for Windows, e.g. > cmd.exe, do. The popular SecureCRT terminal emulator defaults to "default" (same as local system) character encoding, at least on Windows systems. This is not com

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-29 Thread Guy Harris
On Jun 29, 2011, at 1:45 PM, Stig Bjørlykke wrote: > Ok, what about trying to convert back to locale when output error > messages from tshark? > Something like the attached patch, maybe? Something like that, but with a g_free() of "string" afterwards. :-)
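
The convert-then-free pattern discussed in this message can be sketched as follows. This is not the attached patch, which is not shown in this archive. It is a minimal illustration that swaps GLib's conversion routines for POSIX iconv() so that it builds without GLib. The helper names utf8_to_codeset() and report_error() and the output-size bound are assumptions made for the example. The caller owns the converted buffer and must free() it, which mirrors the g_free() of "string" mentioned above.

```c
#include <assert.h>
#include <iconv.h>
#include <langinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to `codeset`.
 * Returns a malloc()ed string that the caller must free(), or NULL if the
 * conversion cannot be performed. */
static char *utf8_to_codeset(const char *utf8, const char *codeset)
{
    assert(utf8 != NULL && codeset != NULL);

    iconv_t cd = iconv_open(codeset, "UTF-8");
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = strlen(utf8);
    /* Assumed bound: 4 output bytes per input byte, plus the terminator.
     * If it is ever too small, iconv() fails and we return NULL. */
    size_t out_size = in_left * 4 + 1;
    char *out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)utf8;   /* iconv() does not modify the input bytes */
    char *out_ptr = out;
    size_t out_left = out_size - 1;

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &out_ptr, &out_left);  /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t)-1) {
        free(out);
        return NULL;
    }
    *out_ptr = '\0';
    return out;
}

/* Hypothetical helper: write a UTF-8 message to `fp` in the locale's
 * encoding, falling back to the unconverted bytes if conversion fails.
 * Assumes setlocale(LC_ALL, "") was called at program start. */
static void report_error(FILE *fp, const char *utf8_msg)
{
    char *converted = utf8_to_codeset(utf8_msg, nl_langinfo(CODESET));
    fprintf(fp, "%s\n", converted != NULL ? converted : utf8_msg);
    free(converted);   /* free(NULL) is a no-op */
}
```

In a GLib-based program the same shape would presumably use g_locale_from_utf8() for the conversion and g_free() for the release. The point of the sketch is only that the converted copy is temporary and has to be released once the message has been written.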

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-29 Thread Stig Bjørlykke
On Tue, Jun 28, 2011 at 7:01 PM, Guy Harris wrote: > In any case, that means that using strerror() is probably not going to be > sufficient to fix the problem.  What we might want to do is use UTF-8 > everywhere we can, and, for non-GUI output, convert to the appropriate > character encoding -

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-29 Thread Guy Harris
On Jun 29, 2011, at 2:37 AM, Graham Bloice wrote: > For reference, here's the test executable output on Win7, using the SDK 7.0 > build environment (a cmd.prompt): Not surprisingly, it doesn't work. Microsoft introduced Unicode support when they introduced Win32; as they were introducing a ne

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-29 Thread Graham Bloice
On 28/06/2011 18:27, Guy Harris wrote: > On Jun 28, 2011, at 6:10 AM, Stig Bjørlykke wrote: > >> On Tue, Jun 28, 2011 at 2:58 AM, Guy Harris wrote: >>>1) UN*Xes where LANG etc. aren't set to a locale with UTF-8 as the >>> encoding (are you seeing the issue with Norwegian characters on you

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Stig Bjørlykke
On Tue, Jun 28, 2011 at 9:37 PM, Guy Harris wrote: > However, if LANG is blank, you presumably don't have Terminal set up to "Set > local environment variables on startup" (Preferences > Settings > Advanced, > at the bottom); Actually I have "Set local environment variables on startup" checked.

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 12:25 PM, Stig Bjørlykke wrote: > On Tue, Jun 28, 2011 at 7:27 PM, Guy Harris wrote: >> OK, what OS are you using? > > Snow:~ stig$ uname -a > Darwin ... Well, that answers *that* question. :-) So the locale's encoding should probably be UTF-8, given that it's OS X. Howev

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Stig Bjørlykke
On Tue, Jun 28, 2011 at 7:27 PM, Guy Harris wrote: > OK, what OS are you using? Snow:~ stig$ uname -a Darwin Snow.local 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386 Snow:~ stig$ echo $LANG Snow:~ stig$ gcc norsk.c -o norsk && ./norsk S

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 10:27 AM, Guy Harris wrote: > We have an issue regarding strings in packets in general. Strings might be > in a number of encodings, including ASCII (meaning that any byte with the 8th > bit set is something that shouldn't be there), other national variants of ISO > 646, U

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 10:43 AM, Guy Harris wrote: > On Jun 28, 2011, at 10:27 AM, Guy Harris wrote: > >> when putting them into a textual representation of the protocol tree or >> into columns or something else to be shown to humans, map them to UTF-8, >> with anything that can't be mapped

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 10:27 AM, Guy Harris wrote: > when putting them into a textual representation of the protocol tree or > into columns or something else to be shown to humans, map them to UTF-8, with > anything that can't be mapped to UTF-8 - including, if the encoding is > putatively

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 10:01 AM, Guy Harris wrote: > In any case, that means that using strerror() is probably not going to be > sufficient to fix the problem. What we might want to do is use UTF-8 > everywhere we can, and, for non-GUI output, convert to the appropriate > character encoding - wh

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 6:10 AM, Stig Bjørlykke wrote: > On Tue, Jun 28, 2011 at 2:58 AM, Guy Harris wrote: >>1) UN*Xes where LANG etc. aren't set to a locale with UTF-8 as the >> encoding (are you seeing the issue with Norwegian characters on your system? >> If so, what's the setting of

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 3:33 AM, Stig Bjørlykke wrote: > Do we always know where the error message is used? > I suspect file_open_error_message() is used both in GUI and tshark. Yes - it's in epan.

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 3:22 AM, Jakub Zawadzki wrote: > Btw. I know that nowadays I'm the only one who uses non-utf locales on > console, > but when we print on console (stdout/stderr) I think we should use strerror() > from libc, > i.e. strerror() which don't recode message to utf-8. It's more c

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Guy Harris
On Jun 28, 2011, at 2:25 AM, Graham Bloice wrote: > On 28/06/2011 01:58, Guy Harris wrote: >> >> 2) Windows, where "Unicode" generally means "UTF-16", and APIs that >> return strings encoded as sequences of octets rather than hexadectets >> probably return strings in the local code page.

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Stig Bjørlykke
On Tue, Jun 28, 2011 at 2:58 AM, Guy Harris wrote: > 1) UN*Xes where LANG etc. aren't set to a locale with UTF-8 as the > encoding (are you seeing the issue with Norwegian characters on your system? > If so, what's the setting of LANG?); I only had issues with Norwegian characters in fi

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Stig Bjørlykke
On Tue, Jun 28, 2011 at 12:22 PM, Jakub Zawadzki wrote: > Btw. I know that nowadays I'm the only one who uses non-utf locales on > console, > but when we print on console (stdout/stderr) I think we should use strerror() > from libc, > i.e. strerror() which don't recode message to utf-8. Do we a

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Jakub Zawadzki
On Tue, Jun 28, 2011 at 10:14:34AM +0200, Stig Bjørlykke wrote: > On Tue, Jun 28, 2011 at 9:35 AM, Jakub Zawadzki > wrote: > > g_strerror() ? > > Yes, of course :) Thank you. no problem ;-) Btw. I know that nowadays I'm the only one who uses non-utf locales on console, but when we print on con

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Graham Bloice
On 28/06/2011 01:58, Guy Harris wrote: > > 2) Windows, where "Unicode" generally means "UTF-16", and APIs that > return strings encoded as sequences of octets rather than hexadectets > probably return strings in the local code page. > Is this a first sighting of a new word "hexadectet"? Goo

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Stig Bjørlykke
On Tue, Jun 28, 2011 at 9:35 AM, Jakub Zawadzki wrote: > g_strerror() ? Yes, of course :) Thank you. -- Stig Bjørlykke

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-28 Thread Jakub Zawadzki
On Mon, Jun 27, 2011 at 05:58:35PM -0700, Guy Harris wrote: > > We have about 240 calls to strerror(). > > ...and, unfortunately, a variant that converts to UTF-8 and is API-compatible > is non-trivial, > as any version that allocates a buffer for the result of the conversion would > leak memor

Re: [Wireshark-dev] UTF8 vs. locale in error messages (bug 5715)

2011-06-27 Thread Guy Harris
On Jun 27, 2011, at 11:54 AM, Stig Bjørlykke wrote: > When looking at bug 5715 I found that we use both UTF8 (from file > names) and locale (from strerror()) in the error messages presented > from simple_dialog(). In vsimple_dialog() we convert all messages > with g_locale_to_utf8(), which will