We are working with IO, which returns the number of bytes written, not the
number of UTF-8 characters.

The following shows this fairly well:
init
  var utf8 = "𐑣𐑩𐑤𐑴, 𐑢𐑻𐑤𐑛!"
  print "%d", (int) utf8.length
11

Python (3.1.2) does the same:
>>> utf8 = "𐑣𐑩𐑤𐑴, 𐑢𐑻𐑤𐑛!"
>>> len(utf8)
11

However, we can see that this is not the number of bytes:
>>> buff = bytes(utf8, "utf-8")
>>> len(buff)
35
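
To make the mismatch concrete, here is a small sketch of what goes wrong when
a byte count from a write call is used as a string index. It assumes Python
3.3 or later (where len() counts code points on every build), and the value
of `sent` is made up for illustration:

```python
utf8 = "𐑣𐑩𐑤𐑴, 𐑢𐑻𐑤𐑛!"
buff = utf8.encode("utf-8")

sent = 4  # hypothetical: a write call reported 4 bytes written

# Slicing the str skips 4 characters, which is 16 bytes here.
wrong_rest = utf8[sent:]
# Slicing the bytes skips exactly the 4 bytes that were written.
right_rest = buff[sent:]

print(len(wrong_rest.encode("utf-8")))  # 19
print(len(right_rest))                  # 31
```

The str slice silently drops 12 bytes that were never sent, which is why the
buffer has to be indexed in bytes.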

Of course we can't do this with Genie since it has a broken to_utf8()
function:
  print "%d", (int) utf8.to_utf8().length
4196432

IO read/write functions return the number of bytes read or written.  For
this reason, I wrote an alternative mapping of GString that uses char[]
instead of string, and we avoid string whenever the data could contain
UTF-8 content.

The pointer solution seems to work in this case, but Genie does not
currently have slice support (I checked), whereas Vala does and implements
it in nearly the same way as Python.
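
For reference, this is roughly the behaviour we are after, sketched in Python
with a byte slice. It is only an illustration, not our Genie code:
`send_or_buffer` and `fake_write` are hypothetical names, and `fake_write`
stands in for a real non-blocking write that accepts only part of the data.

```python
def send_or_buffer(write, data, wbuff):
    """Try to write data; append whatever was not sent to wbuff.

    write is a hypothetical callable that takes bytes and returns the
    number of bytes it accepted.
    """
    sent = write(data)
    if sent < len(data):
        # Same idea as g_string_append_len(wbuff, str + sent, len - sent)
        wbuff.extend(data[sent:])
    return sent


def fake_write(chunk):
    # Stand-in for real IO: pretends only 10 bytes fit.
    return min(len(chunk), 10)


wbuff = bytearray()
data = "𐑣𐑩𐑤𐑴, 𐑢𐑻𐑤𐑛!".encode("utf-8")
sent = send_or_buffer(fake_write, data, wbuff)
print(sent, len(wbuff))  # 10 25
```

The buffer may start in the middle of a multi-byte character, which is fine
as long as it is treated as bytes and never as a string.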




On Sat, May 22, 2010 at 11:46 AM, jamie <jamie.mccr...@googlemail.com> wrote:

> On Sat, 2010-05-22 at 06:38 -0400, Arc Riley wrote:
> > I'm migrating some C code to Genie and ran into a troubling limitation;
> > the following line of code does not seem to have an equivalent in Genie:
> >
> >         session->wbuff = g_string_append_len(session->wbuff, str+sent,
> >                                              len-sent);
> >
> > What this means is "when you can't send all the data you'd like, append
> > the rest to our (known to be empty) GString buffer".
> >
> > The key part here is "the rest" - getting a slice of an array, such as
> > C's array + value syntax (Incompatible operand) or Python's array[start:]
> > syntax (error: syntax error, expected `]' but got `:' with previous
> > identifier).
> >
> > We can't use Genie strings for the buffer because UTF8 length != byte
> > length.
>
> You should be using the StringBuilder class for this
>
> public unowned StringBuilder append_len (string val, ssize_t len);
>
> (StringBuilder is a GString)
>
> Also string class has a size method which uses strlen - it would
> probably be clearer if it was called ByteLength
>
> See http://git.gnome.org/browse/vala/tree/vapi/glib-2.0.vapi
>
> If there are still problems let me know - I will be bug fixing genie
> tomorrow
>
> jamie
>
>
>
_______________________________________________
vala-list mailing list
vala-list@gnome.org
http://mail.gnome.org/mailman/listinfo/vala-list
