Re: [pygtk] utf8 validating string
John Ehresman wrote:
> Yann Leboulanger wrote:
>> John Ehresman wrote:
>>> I'm confused here; I think your last example passes '\x0' to a gtk
>>> function, which does not work. Either remove the '\x0' or do something
>>> else with \x0 here. Or am I missing something?
>>
>> Removing the \x0 isn't a problem, a replace can do that, but is it the
>> only char that will cause this problem?
>
> Yes, as long as the rest is valid utf8. \x0 is a problem because it
> terminates C strings, so you can never have a C string with a \x0 in it
> (it's not quite that simple, but if you don't know C it's probably close
> enough). Python strings can contain \x0, so there's a problem when
> passing the length to the conversion function.
>
> Cheers,
>
> John

OK great, thanks. Python's greater than C ;)
Ok ok, I go out ->[] ;)

--
Yann
___
pygtk mailing list pygtk@daa.com.au
http://www.daa.com.au/mailman/listinfo/pygtk
Read the PyGTK FAQ: http://www.async.com.br/faq/pygtk/
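A small sketch of the point about C strings. It uses only the standard `ctypes` module instead of GTK (an assumption, so that it runs without PyGTK installed) and is written for Python 3, where byte strings are `bytes`; the helper name `as_c_string` is hypothetical. A C-style read of the buffer stops at the first NUL byte, while Python still counts every byte:

```python
import ctypes


def as_c_string(data: bytes) -> bytes:
    """Return what C code sees when it reads `data` as a NUL-terminated string."""
    buf = ctypes.create_string_buffer(data)  # copies data and appends a NUL
    return buf.value                         # reads up to the first NUL, like strlen()


payload = b"test\x00more"
print(len(payload))          # Python counts every byte, including the NUL
print(as_c_string(payload))  # the C view ends before the NUL
```

As far as I understand the GLib documentation, a length-aware check such as `g_utf8_validate (text, len, NULL)` fails when a NUL byte falls inside the given length, which fits the warning seen in this thread.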
Re: [pygtk] utf8 validating string
John Ehresman wrote:
> I'm confused here; I think your last example passes '\x0' to a gtk
> function, which does not work. Either remove the '\x0' or do something
> else with \x0 here. Or am I missing something?

Removing the \x0 isn't a problem, a replace can do that, but is it the
only char that will cause this problem?

--
Yann
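The replace mentioned here could look roughly like the sketch below. It is written for Python 3 (the thread itself uses Python 2), and both helper names are hypothetical. It keeps two cases apart: NULs only at the end of the string, which are probably safe to chop off, and NULs in the middle, where dropping them silently may hide a real problem.

```python
def strip_trailing_nuls(text: str) -> str:
    """Drop NUL characters only at the end of the string."""
    return text.rstrip("\x00")


def remove_all_nuls(text: str) -> str:
    """Drop every NUL character, wherever it appears."""
    return text.replace("\x00", "")


sample = "Let's check this out.\x00"
print(repr(strip_trailing_nuls(sample)))
print(repr(remove_all_nuls("a\x00b\x00")))
```

Which of the two is appropriate depends on where the NULs come from; the sketch only shows the mechanics.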
Re: [pygtk] utf8 validating string
John Ehresman wrote:
> Yann Leboulanger wrote:
>> I'd like not to have it. But I get this string by gpg-decoding a message
>> sent by Miranda IM. I think it's a bug in their GnuPG implementation,
>> but anyway I'd like my client to detect those bad strings and a) print
>> the message correctly if I can or b) not traceback and print a warning
>> message. But for that I need a function that tells me that
>> g_utf8_validate will fail ...
>
> You probably should explicitly decide how to handle \0. If it's always
> at the end, it's probably just a simple bug and can be chopped off, but
> it may be something more if valid text follows the \0.
>
> But in general, I think this'll work:
>
> def valid_glib_utf8(s):
>     try:
>         unicode(s, 'utf-8')
>     except Exception:
>         return False
>     else:
>         return '\x00' not in s
>
> In case you need it, s.replace('\x00', '') will remove the \0's.
>
> Cheers,
>
> John

That doesn't work:

>>> import gtk
>>> tv = gtk.TextView()
>>> b = tv.get_buffer()
>>> t = "test\x00"
>>> u = unicode(t, 'utf-8')
>>> b.set_text(t)
__main__:1: GtkWarning: gtk_text_buffer_emit_insert: assertion
`g_utf8_validate (text, len, NULL)' failed

It's the same if I try with the unicode:

>>> import gtk
>>> tv = gtk.TextView()
>>> b = tv.get_buffer()
>>> t = "test\x00"
>>> u = unicode(t, 'utf-8')
>>> b.set_text(u)
__main__:1: GtkWarning: gtk_text_buffer_emit_insert: assertion
`g_utf8_validate (text, len, NULL)' failed

--
Yann
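For reference, a Python 3 port of the quoted `valid_glib_utf8` might look like the sketch below. It assumes the input arrives as `bytes`, and it only approximates GLib's behaviour in pure Python; it does not call `g_utf8_validate` itself.

```python
def valid_glib_utf8(data: bytes) -> bool:
    """Approximate whether GLib would accept `data` as UTF-8 text."""
    try:
        data.decode("utf-8")    # strict decoding raises on malformed UTF-8
    except UnicodeDecodeError:
        return False
    return b"\x00" not in data  # embedded NUL bytes are rejected as well


for candidate in (b"test", b"test\x00", b"caf\xc3\xa9", b"\xff\xfe"):
    print(candidate, valid_glib_utf8(candidate))
```

For the `test\x00` string from the session above, the check returns False. The idea is to call it before `set_text` and then clean or reject the text, rather than passing the string through unchanged.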
Re: [pygtk] utf8 validating string
Yann Leboulanger wrote:
> >>> import gtk
> >>> tv = gtk.TextView()
> >>> b = tv.get_buffer()
> >>> t = "Let's check this out.\x00"
> >>> u = unicode(t, 'utf-8')
> >>> b.set_text(t)
> __main__:1: GtkWarning: gtk_text_buffer_emit_insert: assertion
> `g_utf8_validate (text, len, NULL)' failed
>
> but b.set_text(u) works ... is it the way to go?

Your mistake might be the final '\x00'. Is there a reason you're
including it? Python handles \x00 in strings, but gtk (& most C libs)
probably doesn't.

Cheers,

John
Re: [pygtk] utf8 validating string
Yann Leboulanger wrote:
> Hi,
>
> I have a string that a textview can't display. It contains invalid chars:
>
> >>> t = "Let's check this out.\x00"
> >>> import gtk
> >>> tv = gtk.TextView()
> >>> b = tv.get_buffer()
> >>> b.set_text(t)
> __main__:1: GtkWarning: gtk_text_buffer_emit_insert: assertion
> `g_utf8_validate (text, len, NULL)' failed
>
> but when I do that I have no problem:
>
> >>> t.decode('utf-8')
> u"Let's check this out.\x00"

try:
    u = unicode(t, 'utf-8')
except Exception:
    print 'not utf8'
else:
    b.set_text(t)

John
Re: [pygtk] utf8 validating string
John Ehresman wrote:
> Yann Leboulanger wrote:
>> Hi,
>>
>> I have a string that a textview can't display. It contains invalid chars:
>>
>> >>> t = "Let's check this out.\x00"
>> >>> import gtk
>> >>> tv = gtk.TextView()
>> >>> b = tv.get_buffer()
>> >>> b.set_text(t)
>> __main__:1: GtkWarning: gtk_text_buffer_emit_insert: assertion
>> `g_utf8_validate (text, len, NULL)' failed
>>
>> but when I do that I have no problem:
>>
>> >>> t.decode('utf-8')
>> u"Let's check this out.\x00"
>
> try:
>     u = unicode(t, 'utf-8')
> except Exception:
>     print 'not utf8'
> else:
>     b.set_text(t)
>
> John

>>> import gtk
>>> tv = gtk.TextView()
>>> b = tv.get_buffer()
>>> t = "Let's check this out.\x00"
>>> u = unicode(t, 'utf-8')
>>> b.set_text(t)
__main__:1: GtkWarning: gtk_text_buffer_emit_insert: assertion
`g_utf8_validate (text, len, NULL)' failed

but b.set_text(u) works ... is it the way to go?
[pygtk] utf8 validating string
Hi,

I have a string that a textview can't display. It contains invalid chars:

>>> t = "Let's check this out.\x00"
>>> import gtk
>>> tv = gtk.TextView()
>>> b = tv.get_buffer()
>>> b.set_text(t)
__main__:1: GtkWarning: gtk_text_buffer_emit_insert: assertion
`g_utf8_validate (text, len, NULL)' failed

but when I do that I have no problem:

>>> t.decode('utf-8')
u"Let's check this out.\x00"

So what could I do to validate the string before sending it to GTK?