I forgot to cc this on the original reply. --Larry
On 04 May 2001 11:49:26 -0400, Jon Trowbridge wrote:
> On Fri, May 04, 2001 at 05:03:00AM -0400, Christopher James Lahey wrote:
> > On 04 May 2001 15:50:10 +0930, Not Zed wrote:
> > > On 04 May 2001 01:45:30 -0400, Jon Trowbridge wrote:
> > > > Before using any of GAL's utf-8 utility functions (like g_utf8_strlen or
> > > > g_utf8_strncpy), you need to check that your string is actually a
> > > > well-formed chunk of utf-8 using g_utf8_validate.
> > >
> > > Well apparently it's worse, a post to g-h from Havoc says bad utf8 can
> > > even crash the gtk2 interfaces as well.
> >
> > Yeah, this is a major problem really since there will potentially be
> > utf8 coming from user generated sources, or even worse other people
> > generated sources. A first example that comes to mind actually is mail
> > marked as being utf8 but actually being broken. BOOM.
>
> Just to continue the festival of utf8 lamentation: there seem to be a lot of
> unsafe calls to the g_utf8_* functions in gtkhtml. I'm sure they are
> scattered through evo too, but gtkhtml is where I've been having
> problems with 100%-cpu-draining lock-ups.
>

Yeah, until a few days ago gtkhtml used the libunicode utf8 functions,
which seem to be a hell of a lot more robust than the g_utf8 functions.
We dropped the libunicode functions so that we could drop that
dependency, but it has ended up showing us how often evolution is
feeding us invalid utf-8 through interfaces that are supposed to be
utf-8 only, and how crappy g_utf8* are with respect to invalid input.

--Larry
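
For anyone who wants to see the "validate first" advice quoted above in concrete form, here is a rough sketch in plain C. It is not the GAL or gtkhtml code. The names utf8_is_valid and safe_utf8_strlen are made up for illustration, and the hand-rolled check only stands in for a real validator such as g_utf8_validate (in GLib 2 that is g_utf8_validate(str, max_len, &end)). The idea is simply to refuse to walk a buffer as utf-8 until it has been checked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a validator such as g_utf8_validate.
 * Returns true only if the len bytes at s are well-formed UTF-8. */
bool utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        unsigned long cp, min;
        size_t need, k;

        if (c < 0x80) {                 /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;               /* stray continuation or bad lead byte */
        }

        if (len - i - 1 < need)
            return false;               /* sequence truncated at end of buffer */

        for (k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;           /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return false;               /* overlong encoding */
        if (cp > 0x10FFFF)
            return false;               /* beyond the Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;               /* UTF-16 surrogate */

        i += need + 1;
    }
    return true;
}

/* Hypothetical wrapper: count characters only after validation.
 * Returns -1 for invalid input instead of walking it blindly. */
long safe_utf8_strlen(const char *s, size_t len)
{
    long n = 0;
    size_t i;

    if (!utf8_is_valid((const unsigned char *)s, len))
        return -1;

    /* In valid UTF-8, every byte that is not a continuation byte
     * starts exactly one character. */
    for (i = 0; i < len; i++)
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            n++;
    return n;
}
```

A caller would treat -1 as "reject or sanitize this string" rather than passing it on, which is the failure case the broken-mail example above is about.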