I forgot to cc this on the original reply.

--Larry


On 04 May 2001 11:49:26 -0400, Jon Trowbridge wrote:
> On Fri, May 04, 2001 at 05:03:00AM -0400, Christopher James Lahey wrote:
> > On 04 May 2001 15:50:10 +0930, Not Zed wrote:
> > > On 04 May 2001 01:45:30 -0400, Jon Trowbridge wrote:
> > > > Before using any of GAL's utf-8 utility functions (like g_utf8_strlen or
> > > > g_utf8_strncpy), you need to check that your string is actually a
> > > > well-formed chunk of utf-8 using g_utf8_validate.
> > > 
> > > Well, apparently it's worse: a post to g-h from Havoc says bad utf8 can
> > > even crash the gtk2 interfaces as well.
> > 
> > Yeah, this is a major problem really since there will potentially be
> > utf8 coming from user generated sources, or even worse other people
> > generated sources.  A first example that comes to mind actually is mail
> > marked as being utf8 but actually being broken.  BOOM.
> 
> Just to continue the festival of utf8 lamentation: there seem to be a lot of
> unsafe calls to the g_utf8_* functions in gtkhtml.  I'm sure they are
> scattered through evo too, but gtkhtml is where I've been having
> problems with 100%-cpu-draining lock-ups.
> 

Yeah, until a few days ago gtkhtml used the libunicode utf8 functions
that seem to be a hell of a lot more robust than the g_utf8 functions.
We dropped the libunicode functions so that we could drop that
dependency, but doing so has ended up showing us how often evolution is
feeding us invalid utf-8 through interfaces that are supposed to be
utf-8 only, and how crappy g_utf8* are with respect to invalid input.
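
For reference, a minimal sketch of the check-before-use pattern Jon
describes above, written in plain C so it does not depend on GLib or GAL.
The names utf8_is_well_formed and checked_utf8_strlen are hypothetical
helpers made up for illustration, not part of any of these libraries; in
real code the check would presumably be a call to g_utf8_validate before
touching any other g_utf8_* function.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the len bytes at s are well-formed
 * UTF-8, 0 otherwise.  Rejects truncated sequences, stray continuation
 * bytes, overlong encodings, surrogates and values above U+10FFFF. */
static int utf8_is_well_formed(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* number of continuation bytes expected */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point valid at this length */

        if (c < 0x80) {     /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;       /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < need)
            return 0;       /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;   /* expected a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF)
            return 0;       /* overlong encoding or out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;       /* UTF-16 surrogate, not valid in UTF-8 */

        i += need + 1;
    }
    return 1;
}

/* Hypothetical wrapper: counts characters only after validation.
 * Returns -1 instead of walking a malformed buffer. */
static long checked_utf8_strlen(const char *s, size_t len)
{
    const unsigned char *u = (const unsigned char *)s;
    long n = 0;

    if (!utf8_is_well_formed(u, len))
        return -1;
    for (size_t i = 0; i < len; i++)
        if ((u[i] & 0xC0) != 0x80)  /* count every non-continuation byte */
            n++;
    return n;
}
```

The point of the wrapper is that the caller gets an error it can handle
(drop the string, fall back to another charset, whatever) rather than a
loop that never terminates on broken input.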

--Larry

