Re: g_malloc overhead
On Thu, Jan 29, 2009 at 5:02 PM, Xavier Bestel xavier.bes...@free.fr wrote:
> On Thu, 2009-01-29 at 16:51 +0200, Tor Lillqvist wrote:
>>> I think strncpy() is one of the few that needs a UTF-8 equivalent, because a char may span several bytes.
>> Well, he didn't say exactly what semantics he wanted his g_utf8_strncpy() and g_utf16_strncpy() to have. In the UTF-8 case, should the size mean characters or bytes? In the UTF-16 case, characters or 16-bit units? The existing g_utf8_strncpy() has it meaning characters. As such I think the name is a bit unfortunate, because of the similarity to strncpy() but then different semantics of the size parameter.
> Even if the meaning was bytes, I think a UTF-8-aware function that avoids cutting in the middle of a multibyte char is a plus.

Then the meaning wouldn't be bytes anymore. It would be bytes with some exceptions, which would be A LOT more confusing.

> Xav
> ___ gtk-devel-list mailing list gtk-devel-l...@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
___ gtk-app-devel-list mailing list gtk-app-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-app-devel-list
Re: g_malloc overhead
On Mon, 2009-01-26 at 22:30 +0100, Martin (OPENGeoMap) wrote:
> hi:
>> Well - what do you mean? Having 2 functions - one receiving utf-16 and one utf-8? To be honest - it doesn't make any sense to me (it would create much mess, double the code, make programming errors easier...). Converting? What's wrong with g_utf16_to_utf8?
> I was talking about a full utf16 and utf8 api in glib and using a macro to work with an intermediate string. For example, in windows they have these types:
> LPSTR = char *

char * is used for utf-8 AFAIR

> LPWSTR = utf16 windows char *

gunichar2

> perhaps in glib we could have utf16 and utf8 in that way or am i wrong?

I'm not a glib developer. As far as a module for operating on utf-16 strings is proposed, I'm not against it. However, I would prefer not to have 2 entries for each function.

Regards
> Regards.
Re: utf-16 and glib (was: g_malloc overhead)
On Mon, 2009-01-26 at 22:49 +0100, Martin (OPENGeoMap) wrote:
> Maciej Piechotka wrote:
>> On Mon, 2009-01-26 at 22:30 +0100, Martin (OPENGeoMap) wrote:
>>> hi:
>>>> Well - what do you mean? Having 2 functions - one receiving utf-16 and one utf-8? To be honest - it doesn't make any sense to me (it would create much mess, double the code, make programming errors easier...). Converting? What's wrong with g_utf16_to_utf8?
>>> I was talking about a full utf16 and utf8 api in glib and using a macro to work with an intermediate string. For example, in windows they have these types:
>>> LPSTR = char *
>> char * is used for utf-8 AFAIR
>>> LPWSTR = utf16 windows char *
>> gunichar2
>>> perhaps in glib we could have utf16 and utf8 in that way or am i wrong?
>> I'm not a glib developer. As far as a module for operating on utf-16 strings is proposed, I'm not against it. However, I would prefer not to have 2 entries for each function.
> Hi:
> What is wrong with:
> gchar* g_utf8_strncpy (gchar *dest, const gchar *src, gsize n);

That one is not needed, as strncpy should work.

> gunichar2* g_utf16_strncpy (gunichar2 *dest, const gunichar2 *src, gsize n);

That's the kind of support I'm not against.

> and the macro:
> gtext* g_text_strncpy (gtext *dest, const gtext *src, gsize n);
> regards.

With the entries - nothing. With the macro - it may be just me, but I perceive it as shooting oneself in the foot. Just imagine that some header assumes gtext to be utf-8. Another header (or user code) turns on the macro and changes it to utf-16. IMHO, having a magic switch which might change the ABI is not good.

Regards
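
The ABI concern raised above can be sketched in a few lines of C. This is only an illustration under stated assumptions: GLib has no gtext type, so the two typedefs below are hypothetical stand-ins for what "gtext" would mean to code built with the proposed switch off and on, and the helper names are made up for this note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: GLib has no gtext. These typedefs stand for what "gtext"
 * would mean to code compiled with the proposed macro off and on. */
typedef char     gtext_when_utf8;   /* plays the role of gchar */
typedef uint16_t gtext_when_utf16;  /* plays the role of gunichar2 */

/* Bytes a caller would allocate for n_units units plus a terminator,
 * if it believes gtext is UTF-8. */
size_t buffer_bytes_utf8(size_t n_units)
{
    assert(n_units < SIZE_MAX);
    return (n_units + 1) * sizeof(gtext_when_utf8);
}

/* The same computation in a module that believes gtext is UTF-16. */
size_t buffer_bytes_utf16(size_t n_units)
{
    assert(n_units < SIZE_MAX / sizeof(gtext_when_utf16));
    return (n_units + 1) * sizeof(gtext_when_utf16);
}
```

A library and an application that disagree on the switch would disagree on every buffer size and pointer stride for the same prototype, which is the kind of silent mismatch the message warns about.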
Re: g_malloc overhead
On Sun, 18 Jan 2009 17:43:57 +0100 Martín Vales mar...@opengeomap.org wrote:
> Other overhead i see is the open dir/file functions, where in windows we need to do the utf8 to utf16 conversion every time. If JAVA, .NET and Qt use utf16 by default, why in the gnome world do we use utf8 by default?

Probably one of the biggest reasons is that UTF-8 does not use \0 octets, whereas UTF-16 does. This means that UTF-8 data can transparently pass through all of the usual str*() functions in C, such as strlen(), strcpy(), etc...

--
Paul LeoNerd Evans leon...@leonerd.org.uk ICQ# 4135350 | Registered Linux# 179460 http://www.leonerd.org.uk/
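
The point about \0 octets can be seen with a minimal sketch. The sample text "Aé" and the helper name naive_length() are chosen here purely for illustration; the byte values are the standard UTF-8 and UTF-16LE encodings of those two characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The two characters "A" and "é" (U+00E9), encoded two ways. */
const char utf8_text[] = "A\xC3\xA9";                /* 41 C3 A9 00 */
const unsigned char utf16le_text[] = { 0x41, 0x00,   /* 'A' */
                                       0xE9, 0x00,   /* 'é' */
                                       0x00, 0x00 }; /* terminator */

/* Length in bytes as strlen() sees it: it stops at the first zero byte. */
size_t naive_length(const void *bytes)
{
    assert(bytes != NULL);
    return strlen((const char *)bytes);
}
```

With these inputs strlen() reports all 3 bytes of the UTF-8 text, but stops after the first byte of the UTF-16 text, because the high byte of 'A' is zero.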
Re: g_malloc overhead
On Monday, 26.01.2009 at 12:40 +0100, Martín Vales wrote:
> Paul LeoNerd Evans wrote:
>> On Sun, 18 Jan 2009 17:43:57 +0100 Martín Vales mar...@opengeomap.org wrote:
>>> Other overhead i see is the open dir/file functions, where in windows we need to do the utf8 to utf16 conversion every time. If JAVA, .NET and Qt use utf16 by default, why in the gnome world do we use utf8 by default?
>> Probably one of the biggest reasons is that UTF-8 does not use \0 octets, whereas UTF-16 does. This means that UTF-8 data can transparently pass through all of the usual str*() functions in C, such as strlen(), strcpy(), etc...
> I can see the advantages of using utf8, but the truth is that most people use utf16. I know gnome/linux/cairo/freedesktop promote utf8 but most people use utf16: http://unicode.org/notes/tn12/#Software_16

Currently C doesn't support UTF-16 literals. The wchar_t type is 32 bits on Linux. So instead of:

    do_something ("abc")

you'd suddenly have to write:

    const utf16_t abc_literal[] = { 65, 66, 67, 0 }; /* abc */
    do_something (abc_literal);

I really don't see how this would help.

Ciao,
Mathias
--
Mathias Hasselmann mathias.hasselm...@gmx.de Personal Blog: http://taschenorakel.de/mathias/ Openismus GmbH: http://www.openismus.com/
Re: g_malloc overhead
> What is wrong with:
> gchar* g_utf8_strncpy (gchar *dest, const gchar *src, gsize n);

It isn't needed. The nice thing about UTF-8 is that strings in UTF-8 can be handled with normal C str* functions just fine.

> gunichar2* g_utf16_strncpy (gunichar2 *dest, const gunichar2 *src, gsize n);

Such a function might well be useful in some circumstances dealing with interoperability or data formats, and I don't oppose adding it to GLib. (Together with g_utf16_strcpy(), g_utf16_strcat() etc.) But I don't think I have ever personally needed such a function in platform-independent GTK code ;) (And in code that is inside a Windows ifdef, such functions aren't needed either. The C library on Windows already has wcsncpy(), wcscpy(), wcscat() etc.)

> and the macro:
> gtext* g_text_strncpy (gtext *dest, const gtext *src, gsize n);

Never, ever. Didn't the previous replies get this across strongly enough? This idiocy is not something we want to copy from the stone age Windows programming style.

(In current-day Windows-specific programming in C, I see no reason to uglify your code with those TEXT() macros, TCHAR types, etc. Just use wchar_t for characters, wchar_t literals (L'A'), and wchar_t string literals (L"Foo"), and call the wide-char versions of C library and Win32 API functions explicitly. Win9x is dead. No reason not to use Unicode explicitly all the time.)

(And actually, why would one want to do Windows-specific programming in general in C (or C++) any more... C# and Java are so much nicer. And neither of them has any of this silly TEXT and TCHAR stuff.)

--tml
Re: g_malloc overhead
On Thu, 2009-01-29 at 16:51 +0200, Tor Lillqvist wrote:
>> I think strncpy() is one of the few that needs a UTF-8 equivalent, because a char may span several bytes.
> Well, he didn't say exactly what semantics he wanted his g_utf8_strncpy() and g_utf16_strncpy() to have. In the UTF-8 case, should the size mean characters or bytes? In the UTF-16 case, characters or 16-bit units? The existing g_utf8_strncpy() has it meaning characters. As such I think the name is a bit unfortunate, because of the similarity to strncpy() but then different semantics of the size parameter.

Even if the meaning was bytes, I think a UTF-8-aware function that avoids cutting in the middle of a multibyte char is a plus.

Xav
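
As a rough sketch of the byte-limited variant suggested above: the function below takes its limit in bytes but refuses to end the copy in the middle of a multi-byte sequence. The name utf8_copy_max_bytes() is hypothetical (it is not a GLib function), and the code assumes src is valid UTF-8 and that dest has room for max_bytes + 1 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most max_bytes bytes of the UTF-8 string src into dest, never
 * ending in the middle of a multi-byte sequence, and always NUL-terminate.
 * Returns the number of bytes copied, excluding the terminator.
 * Hypothetical helper; assumes valid UTF-8 and max_bytes + 1 bytes of room. */
size_t utf8_copy_max_bytes(char *dest, const char *src, size_t max_bytes)
{
    assert(dest != NULL && src != NULL);

    size_t len = strlen(src);
    if (len > max_bytes) {
        len = max_bytes;
        /* Continuation bytes look like 10xxxxxx. If the first byte we would
         * drop is one, the cut falls inside a sequence: back up until the
         * first dropped byte is a lead byte. */
        while (len > 0 && ((unsigned char)src[len] & 0xC0) == 0x80)
            len--;
    }
    memcpy(dest, src, len);
    dest[len] = '\0';
    return len;
}
```

Note that the result may be shorter than max_bytes, which is exactly the "bytes with some exceptions" behaviour debated elsewhere in this thread.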
Re: g_malloc overhead
Tor Lillqvist wrote:
>> What is wrong with:
>> gchar* g_utf8_strncpy (gchar *dest, const gchar *src, gsize n);
> It isn't needed. The nice thing about UTF-8 is that strings in UTF-8 can be handled with normal C str* functions just fine.

This function really exists :-[ .
http://library.gnome.org/devel/glib/unstable/glib-Unicode-Manipulation.html#g-utf8-strncpy

n is the number of real chars, not the number of bytes.

regards.
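
To make the "n counts characters, not bytes" semantics concrete, here is a plain-C sketch of a copy that counts code points. It is not GLib's implementation of g_utf8_strncpy(); the name utf8_copy_n_chars() is hypothetical, and the code assumes valid UTF-8 input and a dest buffer large enough for the result plus a terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the first n_chars code points of the UTF-8 string src into dest and
 * NUL-terminate. Returns the number of bytes copied, excluding the
 * terminator. Hypothetical helper, not GLib code. */
size_t utf8_copy_n_chars(char *dest, const char *src, size_t n_chars)
{
    assert(dest != NULL && src != NULL);

    size_t i = 0;
    while (src[i] != '\0') {
        /* Any byte that is not 10xxxxxx starts a new character. */
        if (((unsigned char)src[i] & 0xC0) != 0x80) {
            if (n_chars == 0)
                break;          /* the next character would exceed the limit */
            n_chars--;
        }
        i++;
    }
    memcpy(dest, src, i);
    dest[i] = '\0';
    return i;
}
```

Asking for 2 characters of "aé€" copies 3 bytes, which is why a caller sizing dest from n alone, as one would for strncpy(), can get it wrong.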
Re: g_malloc overhead
Hi Tor,

On Thu, 2009-01-29 at 16:37 +0200, Tor Lillqvist wrote:
>> What is wrong with:
>> gchar* g_utf8_strncpy (gchar *dest, const gchar *src, gsize n);
> It isn't needed. The nice thing about UTF-8 is that strings in UTF-8 can be handled with normal C str* functions just fine.

I think strncpy() is one of the few that needs a UTF-8 equivalent, because a char may span several bytes.

Xav
Re: g_malloc overhead
Tor Lillqvist wrote:
> The existing g_utf8_strncpy() has it meaning characters. As such I think the name is a bit unfortunate, because of the similarity to strncpy() but then different semantics of the size parameter.
> --tml

I don't think it is so confusing, since I think strncpy() expects ASCII characters, and ASCII characters are obviously 1 byte in size; this is even more true if you see there's a memcpy() function that is quite the same as what strncpy() is. Then, considering that both strncpy() and g_utf8_strncpy() take the number of chars as the size argument, the confusion goes away if each is used with what it was designed for (ASCII and UTF-8 respectively). And when handling UTF-8 strings, I think it is obvious that if there's a utf8_* function, it does the same as the C one does with an ASCII string, no?

Regards,
Colomban
Re: g_malloc overhead
> I don't think it is so confusing since I think strncpy() expects ASCII characters,

No. strncpy() expects C chars, half of which are not even in ASCII! In other words, bytes. It doesn't care at all whether the bytes represent ASCII, EBCDIC, or whatever. strncpy() works fine for C strings that represent text in whatever multi-byte codeset (as long as it lacks zero bytes), like UTF-8, Microsoft's double-byte codepages, etc. (Well, I exaggerate; obviously if you want to be sure that multi-byte characters don't get truncated you shouldn't use strncpy(), but some encoding-aware function.)

> this is even more true if you see there's a memcpy() function that is quite the same as what strncpy() is.

No it isn't. strncpy() stops when it encounters a zero char (byte). memcpy() always copies exactly the requested number of chars (bytes).

> Then considering both strncpy() and g_utf8_strncpy() takes the number of chars as the size argument

That is a quite misleading misuse of the term "char". g_utf8_strncpy() takes the number of Unicode characters (code points), each of which is represented by one or more bytes. Not chars. Please let's stick to using the term "char" to always mean what it means in C, i.e. byte or octet (as long as we ignore weird architectures). If you mean the more abstract concept "character", say so!

--tml
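
The strncpy()/memcpy() difference described above can be checked with a small probe. The helper names are made up for this note; each copies n bytes into a scratch buffer and returns the byte that ended up at a given index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n bytes from src with strncpy() and return the byte at dest[index].
 * strncpy() stops reading src at the first zero byte and pads the rest
 * of the n bytes with zeros. */
char probe_strncpy(const char *src, size_t n, size_t index)
{
    char dest[16];
    assert(n <= sizeof dest && index < n);
    strncpy(dest, src, n);
    return dest[index];
}

/* Same probe with memcpy(), which copies exactly n bytes regardless of
 * their values. src must really have n readable bytes. */
char probe_memcpy(const char *src, size_t n, size_t index)
{
    char dest[16];
    assert(n <= sizeof dest && index < n);
    memcpy(dest, src, n);
    return dest[index];
}
```

Given the 6-byte array "ab\0cd" and n = 5, the byte at index 3 is zero padding after strncpy() but 'c' after memcpy().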
Re: g_malloc overhead
Paul LeoNerd Evans wrote:
> On Sun, 18 Jan 2009 17:43:57 +0100 Martín Vales mar...@opengeomap.org wrote:
>> Other overhead i see is the open dir/file functions, where in windows we need to do the utf8 to utf16 conversion every time. If JAVA, .NET and Qt use utf16 by default, why in the gnome world do we use utf8 by default?
> Probably one of the biggest reasons is that UTF-8 does not use \0 octets, whereas UTF-16 does. This means that UTF-8 data can transparently pass through all of the usual str*() functions in C, such as strlen(), strcpy(), etc...

I can see the advantages of using utf8, but the truth is that most people use utf16. I know gnome/linux/cairo/freedesktop promote utf8 but most people use utf16: http://unicode.org/notes/tn12/#Software_16

Regards.
Re: g_malloc overhead
Martín Vales wrote:
> I can see the advantages of using utf8, but the truth is that most people use utf16. I know gnome/linux/cairo/freedesktop promote utf8 but most people use utf16: http://unicode.org/notes/tn12/#Software_16

This is a very baseless claim. One that actually turns out to be false. Most people don't write Windows code. Most people read and write content on the internet, and I bet more than 99% of the Unicode content on the net is in UTF-8.

As for the technical note you cite, it's a very biased document of its own. I once wrote a full critical review of it but can't find it. Let's just say that UTF-16 is at best an implementation detail of Firefox. I can't see how that can be relevant here. Moreover, it's plain wrong that Python uses UTF-16. Python APIs are encoding-agnostic, and while Python 2.x can be compiled with UCS-2, it's recommended that UCS-4 be enabled. And note the difference: I said UCS-2, not UTF-16.

UTF-16 is a disease. It's variable-width, so it doesn't have the benefits of UTF-32. It's sixteen bit, so it doesn't have the ASCII-compatibility of UTF-8. Tell me one good thing about it other than that everyone made the mistake of using it and now they have to keep doing that because they exposed it in their API.

behdad
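
The "variable-width" remark refers to surrogate pairs: code points above U+FFFF take two 16-bit units in UTF-16. A minimal encoder sketch follows; the function name utf16_encode() is hypothetical and the arithmetic follows the standard UTF-16 definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16. Returns the number of 16-bit
 * units written to out (1 or 2), or 0 if cp is not a valid scalar value
 * (a surrogate, or above U+10FFFF). Hypothetical helper for illustration. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;              /* BMP: one unit */
        return 1;
    }
    cp -= 0x10000;                          /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

U+0041 encodes as the single unit 0x0041, while U+1F600 becomes the pair 0xD83D 0xDE00, so counting 16-bit units is not the same as counting characters.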
Re: g_malloc overhead
On Mon, Jan 26, 2009 at 9:12 AM, Behdad Esfahbod beh...@behdad.org wrote:
> Let's just say that UTF-16 is at best an implementation detail of Firefox.

Well, JavaScript is notably UTF-16. Given that the Web, Java and .NET (i.e. all the most important platforms) are all UTF-16, it's likely to be with us for quite a while, so it's important to understand.

But yeah, there's no way POSIX/GNOME etc. could switch even if it made sense to do so (which it clearly doesn't).
Re: g_malloc overhead
Colin Walters wrote:
> On Mon, Jan 26, 2009 at 9:12 AM, Behdad Esfahbod beh...@behdad.org wrote:
>> Let's just say that UTF-16 is at best an implementation detail of Firefox.
> Well, JavaScript is notably UTF-16. Given that the Web, Java and .NET (i.e. all the most important platforms) are all UTF-16, it's likely to be with us for quite a while, so it's important to understand.

Yes, I only wanted to say that. For example, I work in C# and I would like to create glib libraries and use them in .NET, but the char in Mono/.NET is utf16 and therefore I have the same overhead there. There are 2 solutions:
1. conversion using glib: http://library.gnome.org/devel/glib/2.19/glib-Unicode-Manipulation.html#gunichar2
2. automatic .NET conversion on the P/Invoke side.
The 2 solutions have the same overhead.

> But yeah, there's no way POSIX/GNOME etc. could switch even if it made sense to do so (which it clearly doesn't).

Yes, I only talked about the overhead with utf8 outside of glib, only that. Perhaps the only solution is to add more support for utf16 in glib with more methods.

Regards.
Re: g_malloc overhead
On Mon, 2009-01-26 at 18:30 +0100, Martín Vales wrote:
> Colin Walters wrote:
>> On Mon, Jan 26, 2009 at 9:12 AM, Behdad Esfahbod beh...@behdad.org wrote:
>>> Let's just say that UTF-16 is at best an implementation detail of Firefox.
>> Well, JavaScript is notably UTF-16. Given that the Web, Java and .NET (i.e. all the most important platforms) are all UTF-16, it's likely to be with us for quite a while, so it's important to understand.
> Yes, I only wanted to say that. For example, I work in C# and I would like to create glib libraries and use them in .NET, but the char in Mono/.NET is utf16 and therefore I have the same overhead there. There are 2 solutions:
> 1. conversion using glib: http://library.gnome.org/devel/glib/2.19/glib-Unicode-Manipulation.html#gunichar2
> 2. automatic .NET conversion on the P/Invoke side.
> The 2 solutions have the same overhead.
>> But yeah, there's no way POSIX/GNOME etc. could switch even if it made sense to do so (which it clearly doesn't).
> Yes, I only talked about the overhead with utf8 outside of glib, only that. Perhaps the only solution is to add more support for utf16 in glib with more methods.

There's zero point in talking about a solution until you have profile data indicating that there is a problem.

- Owen
Re: g_malloc overhead
Martín Vales mar...@opengeomap.org writes:
> Colin Walters wrote:
>> On Mon, Jan 26, 2009 at 9:12 AM, Behdad Esfahbod beh...@behdad.org wrote:
>>> Let's just say that UTF-16 is at best an implementation detail of Firefox.
>> Well, JavaScript is notably UTF-16. Given that the Web, Java and .NET

To be honest - isn't the web currently XML-based (XHTML & co.)? And isn't UTF-8 the default encoding, and incidentally the most widely used one, for XML?

>> But yeah, there's no way POSIX/GNOME etc. could switch even if it made sense to do so (which it clearly doesn't).
> Yes, I only talked about the overhead with utf8 outside of glib, only that. Perhaps the only solution is to add more support for utf16 in glib with more methods.

Well - what do you mean? Having 2 functions - one receiving utf-16 and one utf-8? To be honest - it doesn't make any sense to me (it would create much mess, double the code, make programming errors easier...). Converting? What's wrong with g_utf16_to_utf8?

Regards
--
I've probably left my head... somewhere. Please wait until I find it. Homepage (pl_PL): http://uzytkownik.jogger.pl/ (GNU/)Linux User: #425935 (see http://counter.li.org/)
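
For readers wondering what converting at the boundary involves, here is a plain-C sketch of a UTF-16 to UTF-8 conversion. It is not GLib's g_utf16_to_utf8() and does not reproduce its signature or error reporting; the name utf16_to_utf8_sketch(), the caller-supplied output buffer and the (size_t)-1 error value are all assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert n_units UTF-16 code units to NUL-terminated UTF-8.
 * out must hold at least 3 * n_units + 1 bytes. Returns the number of
 * bytes written (excluding the terminator), or (size_t)-1 on an unpaired
 * surrogate. Hypothetical helper, not GLib code. */
size_t utf16_to_utf8_sketch(const uint16_t *in, size_t n_units, char *out)
{
    assert(in != NULL && out != NULL);

    size_t o = 0;
    for (size_t i = 0; i < n_units; i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {          /* high surrogate */
            if (i + 1 >= n_units || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00u);
            i++;                                     /* consumed the pair */
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                       /* stray low surrogate */
        }
        if (cp < 0x80) {
            out[o++] = (char)cp;
        } else if (cp < 0x800) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

The work is a single linear pass over the input, which fits the remark elsewhere in the thread that conversion cost should be measured before an alternative API is designed around it.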
Re: g_malloc overhead
hi:

> Well - what do you mean? Having 2 functions - one receiving utf-16 and one utf-8? To be honest - it doesn't make any sense to me (it would create much mess, double the code, make programming errors easier...). Converting? What's wrong with g_utf16_to_utf8?

I was talking about a full utf16 and utf8 api in glib and using a macro to work with an intermediate string. For example, in windows they have these types:

LPSTR = char *
LPWSTR = utf16 windows char *

... and the LPTSTR type: if the _UNICODE macro is defined it is LPWSTR, else LPSTR.

... after that they have a full api to manage utf16 and ansi strings (strcat, strcpy, etc.): http://msdn.microsoft.com/en-us/library/h1x0y282.aspx

... and finally macros to use strings in the same way: _TEXT, _T, etc.

_TEXT("did you define the _UNICODE macro? Perhaps i am ansi or perhaps utf16")

http://msdn.microsoft.com/en-us/library/dd374074(VS.85).aspx

perhaps in glib we could have utf16 and utf8 in that way or am i wrong?

Regards.
Re: g_malloc overhead
Maciej Piechotka wrote:
> On Mon, 2009-01-26 at 22:30 +0100, Martin (OPENGeoMap) wrote:
>> hi:
>>> Well - what do you mean? Having 2 functions - one receiving utf-16 and one utf-8? To be honest - it doesn't make any sense to me (it would create much mess, double the code, make programming errors easier...). Converting? What's wrong with g_utf16_to_utf8?
>> I was talking about a full utf16 and utf8 api in glib and using a macro to work with an intermediate string. For example, in windows they have these types:
>> LPSTR = char *
> char * is used for utf-8 AFAIR
>> LPWSTR = utf16 windows char *
> gunichar2
>> perhaps in glib we could have utf16 and utf8 in that way or am i wrong?
> I'm not a glib developer. As far as a module for operating on utf-16 strings is proposed, I'm not against it. However, I would prefer not to have 2 entries for each function.

Hi:

What is wrong with:

    gchar* g_utf8_strncpy (gchar *dest, const gchar *src, gsize n);
    gunichar2* g_utf16_strncpy (gunichar2 *dest, const gunichar2 *src, gsize n);

and the macro:

    gtext* g_text_strncpy (gtext *dest, const gtext *src, gsize n);

regards.

> Regards
>> Regards.
Re: g_malloc overhead
On Sun, 18 Jan 2009 17:43:57 +0100 Martín Vales mar...@opengeomap.org wrote: Other overhead i see is the open dir/file funtions, where in windows we need do the utf8 to utf16 everytime in windows. If JAVA,.NET and Qt use utf16 by default why in gnome world we use utf8 by default?. Probably one of the biggest reasons, is that UTF-8 does not use \0 octets, whereas UTF-16 does. This means that UTF-8 data can transparently pass through all of the usual str*() functions in C, such as strlen(), strcpy(), etc... -- Paul LeoNerd Evans leon...@leonerd.org.uk ICQ# 4135350 | Registered Linux# 179460 http://www.leonerd.org.uk/ signature.asc Description: PGP signature ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: g_malloc overhead
Paul LeoNerd Evans escribió: On Sun, 18 Jan 2009 17:43:57 +0100 Martín Vales mar...@opengeomap.org wrote: Other overhead i see is the open dir/file funtions, where in windows we need do the utf8 to utf16 everytime in windows. If JAVA,.NET and Qt use utf16 by default why in gnome world we use utf8 by default?. Probably one of the biggest reasons, is that UTF-8 does not use \0 octets, whereas UTF-16 does. This means that UTF-8 data can transparently pass through all of the usual str*() functions in C, such as strlen(), strcpy(), etc... I can see the advantages of use utf8 but the true it´s most of people use utf16. I know gnome/linux/cairo/freedesktop promote utf8 but most people use utf16: http://unicode.org/notes/tn12/#Software_16 Regards. ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: g_malloc overhead
Am Montag, den 26.01.2009, 12:40 +0100 schrieb Martín Vales: Paul LeoNerd Evans escribió: On Sun, 18 Jan 2009 17:43:57 +0100 Martín Vales mar...@opengeomap.org wrote: Other overhead i see is the open dir/file funtions, where in windows we need do the utf8 to utf16 everytime in windows. If JAVA,.NET and Qt use utf16 by default why in gnome world we use utf8 by default?. Probably one of the biggest reasons, is that UTF-8 does not use \0 octets, whereas UTF-16 does. This means that UTF-8 data can transparently pass through all of the usual str*() functions in C, such as strlen(), strcpy(), etc... I can see the advantages of use utf8 but the true it´s most of people use utf16. I know gnome/linux/cairo/freedesktop promote utf8 but most people use utf16: http://unicode.org/notes/tn12/#Software_16 Currently C doesn't support for UTF-16 literals. The wchar_t type is 32 bits on Linux. So instead of: do_something (abc) you'd suddenly have to write: const utf16_t abc_literal[] = { 65, 66, 67, 0 }; /* abc */ do_something (abc_literal); I really don't see how this would help. Ciao, Mathias -- Mathias Hasselmann mathias.hasselm...@gmx.de Personal Blog: http://taschenorakel.de/mathias/ Openismus GmbH: http://www.openismus.com/ ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: g_malloc overhead
Martín Vales wrote: I can see the advantages of using UTF-8, but the truth is that most people use UTF-16. I know GNOME/Linux/cairo/freedesktop promote UTF-8, but most people use UTF-16: http://unicode.org/notes/tn12/#Software_16 This is a baseless claim, and one that actually turns out to be false. Most people don't write Windows code. Most people read and write content on the internet, and I bet more than 99% of the Unicode content on the net is in UTF-8. As for the technical note you cite, it's a very biased document of its own. I once wrote a full critical review of it but can't find it. Let's just say that UTF-16 is at best an implementation detail of Firefox. I can't see how that can be relevant here. Moreover, it's plain wrong that Python uses UTF-16. Python APIs are encoding-agnostic, and while Python 2.x can be compiled with UCS-2, it's recommended that UCS-4 be enabled. And note the difference: I said UCS-2, not UTF-16. UTF-16 is a disease. It's variable-width, so it doesn't have the benefits of UTF-32. It's sixteen-bit, so it doesn't have the ASCII compatibility of UTF-8. Tell me one good thing about it other than that everyone made the mistake of using it and now they have to keep doing so because they exposed it in their API. behdad
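To make the "variable-width" remark concrete, here is a minimal sketch of the standard UTF-16 encoding rule. The function name `utf16_encode()` is invented for this example; only the bit arithmetic comes from the Unicode definition of surrogate pairs.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16.
 * Returns the number of 16-bit units written (1 or 2), or 0 if cp is
 * not a valid scalar value (out of range, or itself a surrogate). */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                    /* BMP: a single unit */
        return 1;
    }
    cp -= 0x10000;                                /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));     /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));   /* low surrogate */
    return 2;
}
```

U+0041 takes one unit, while U+1D11E (a code point outside the BMP) becomes the pair 0xD834 0xDD1E, so the n-th character of a UTF-16 string cannot be found by simple indexing.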
Re: g_malloc overhead
On Mon, Jan 26, 2009 at 9:12 AM, Behdad Esfahbod beh...@behdad.org wrote: Let's just say that UTF-16 is at best an implementation detail of Firefox. Well, JavaScript is notably UTF-16. Given that the Web, Java and .NET (i.e. all the most important platforms) are all UTF-16, it's likely to be with us for quite a while, so it's important to understand. But yeah, there's no way POSIX/GNOME etc. could switch even if it made sense to do so (which it clearly doesn't).
Re: g_malloc overhead
Colin Walters wrote: On Mon, Jan 26, 2009 at 9:12 AM, Behdad Esfahbod beh...@behdad.org wrote: Let's just say that UTF-16 is at best an implementation detail of Firefox. Well, JavaScript is notably UTF-16. Given that the Web, Java and .NET (i.e. all the most important platforms) are all UTF-16, it's likely to be with us for quite a while, so it's important to understand. Yes, that is all I wanted to say. For example, I work in C# and I would like to create GLib libraries and use them from .NET, but the char type in Mono/.NET is UTF-16, so I have the same overhead there. There are two solutions: 1. conversion using GLib: http://library.gnome.org/devel/glib/2.19/glib-Unicode-Manipulation.html#gunichar2 2. automatic .NET conversion on the P/Invoke side. Both solutions have the same overhead. But yeah, there's no way POSIX/GNOME etc. could switch even if it made sense to do so (which it clearly doesn't). Yes, I was only talking about the overhead with UTF-8 outside of GLib, only that. Perhaps the only solution is to add more UTF-16 support to GLib, with more methods. Regards.
Re: g_malloc overhead
On Mon, 2009-01-26 at 18:30 +0100, Martín Vales wrote: Colin Walters wrote: On Mon, Jan 26, 2009 at 9:12 AM, Behdad Esfahbod beh...@behdad.org wrote: Let's just say that UTF-16 is at best an implementation detail of Firefox. Well, JavaScript is notably UTF-16. Given that the Web, Java and .NET (i.e. all the most important platforms) are all UTF-16, it's likely to be with us for quite a while, so it's important to understand. Yes, that is all I wanted to say. For example, I work in C# and I would like to create GLib libraries and use them from .NET, but the char type in Mono/.NET is UTF-16, so I have the same overhead there. There are two solutions: 1. conversion using GLib: http://library.gnome.org/devel/glib/2.19/glib-Unicode-Manipulation.html#gunichar2 2. automatic .NET conversion on the P/Invoke side. Both solutions have the same overhead. But yeah, there's no way POSIX/GNOME etc. could switch even if it made sense to do so (which it clearly doesn't). Yes, I was only talking about the overhead with UTF-8 outside of GLib, only that. Perhaps the only solution is to add more UTF-16 support to GLib, with more methods. There's zero point in talking about a solution until you have profile data indicating that there is a problem. - Owen
Re: g_malloc overhead
On Mon, Jan 26, 2009 at 12:57:28PM -0500, Owen Taylor wrote: On Mon, 2009-01-26 at 18:30 +0100, Martín Vales wrote: Yes, I was only talking about the overhead with UTF-8 outside of GLib, only that. Perhaps the only solution is to add more UTF-16 support to GLib, with more methods. There's zero point in talking about a solution until you have profile data indicating that there is a problem.
Indeed. UTF-16 is horribly broken by design, and any attempt to migrate in the direction _towards_ it is a flawed one, and should be avoided. UTF-8 is backward-compatible with the legacy str*() functions in C, which, like it or not, will be around for a while yet.
* It makes sure not to embed any ASCII NUL ('\0') in the stream unless it means it, as U+0000, which makes it work with these functions.
* UTF-8 has nice properties in substring matches: grep can work on UTF-8 without knowing about it, because the encoding of one character never appears falsely inside the encoding of another.
* This also means that the only occurrence of '\n' in a UTF-8 stream is a real one, so cat, head/tail, awk, etc. can properly detect where the linefeeds are. 'head' can print, say, the first 3 lines of UTF-8 text without knowing it's UTF-8.
* UTF-8 can be sorted by sorting only the encoded bytes. sort can sort a UTF-8-encoded text file: the code point order of the Unicode strings is the same as the bytewise order of the raw bytes that encode them.
This list goes on. Meanwhile, on the other end of the spectrum, storing Unicode data as decoded 32-bit integers makes some sense. It means string indexing operations are constant-width: the substring between the 4th and 9th characters in such an array is known to lie between the 16th and 36th bytes. The presence of combining characters and double-width glyphs does make this transformation a bit harder, effectively reducing the advantage such a scheme has. Compared to that, UTF-16 offers NONE of these advantages. UTF-16 cannot be passed through any legacy str*() function, nor will it work in grep, sed, awk, cut, sort, head, tail, or in fact _any_ of the standard UNIX text tools. Nor can UTF-16 be array-indexed in constant time, because of the surrogate pairs used to encode code points outside of the BMP (Basic Multilingual Plane). In summary: UTF-16. Don't. Just don't. -- Paul LeoNerd Evans leon...@leonerd.org.uk ICQ# 4135350 | Registered Linux# 179460 http://www.leonerd.org.uk/
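The sorting property claimed above can be checked with a small sketch. The names `utf8_encode()` and `utf8_bytewise_cmp()` are invented for this example. It uses only the standard UTF-8 bit layout and `strcmp()`, which the C standard defines as comparing bytes as unsigned char.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode one code point as NUL-terminated UTF-8 into buf (5 bytes).
 * Returns the number of encoded bytes, or 0 for an invalid code point. */
size_t utf8_encode(uint32_t cp, char buf[5])
{
    unsigned char *b = (unsigned char *)buf;
    size_t n;

    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {
        b[0] = (unsigned char)cp;
        n = 1;
    } else if (cp < 0x800) {
        b[0] = (unsigned char)(0xC0 | (cp >> 6));
        b[1] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {
        b[0] = (unsigned char)(0xE0 | (cp >> 12));
        b[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        b[2] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 3;
    } else {
        b[0] = (unsigned char)(0xF0 | (cp >> 18));
        b[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        b[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        b[3] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 4;
    }
    b[n] = 0;
    return n;
}

/* Compare two code points by the raw bytes of their UTF-8 encodings.
 * Returns <0, 0 or >0, like strcmp(). */
int utf8_bytewise_cmp(uint32_t a, uint32_t b)
{
    char ea[5] = "", eb[5] = "";

    utf8_encode(a, ea);
    utf8_encode(b, eb);
    return strcmp(ea, eb);
}
```

For instance U+FFFF still sorts before U+10000 bytewise. In UTF-16 the same pair sorts the other way round by code unit, since 0xFFFF is greater than the surrogate 0xD800.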
Re: g_malloc overhead
Martín Vales mar...@opengeomap.org writes: Colin Walters wrote: On Mon, Jan 26, 2009 at 9:12 AM, Behdad Esfahbod beh...@behdad.org wrote: Let's just say that UTF-16 is at best an implementation detail of Firefox. Well, JavaScript is notably UTF-16. Given that the Web, Java and .NET To be honest - isn't the web currently XML-based (XHTML & co.)? And isn't UTF-8 the default encoding for XML, and incidentally the most widely used one? But yeah, there's no way POSIX/GNOME etc. could switch even if it made sense to do so (which it clearly doesn't). Yes, I was only talking about the overhead with UTF-8 outside of GLib, only that. Perhaps the only solution is to add more UTF-16 support to GLib, with more methods. Well - what do you mean? Having 2 functions - one receiving UTF-16 and one UTF-8? To be honest - it doesn't make any sense to me (it would create a lot of mess, double the code, make programming errors easier...). Converting? What's wrong with g_utf16_to_utf8? Regards -- I've probably left my head... somewhere. Please wait untill I find it. Homepage (pl_PL): http://uzytkownik.jogger.pl/ (GNU/)Linux User: #425935 (see http://counter.li.org/)
Re: g_malloc overhead
Hi: Well - what do you mean? Having 2 functions - one receiving UTF-16 and one UTF-8? To be honest - it doesn't make any sense to me (it would create a lot of mess, double the code, make programming errors easier...). Converting? What's wrong with g_utf16_to_utf8? I was talking about a full UTF-16 and UTF-8 API in GLib, with a macro to work with an intermediate string type. For example, Windows has these types:
    LPSTR  = char *
    LPWSTR = UTF-16 Windows char *
... and the LPTSTR type: if the _UNICODE macro is defined it is LPWSTR, otherwise LPSTR. Beyond that, they have a full API to manage UTF-16 and ANSI strings (strcat, strcpy, etc.), http://msdn.microsoft.com/en-us/library/h1x0y282.aspx ... and finally macros such as _TEXT and _T to write strings in the same way: _TEXT (did you define the _UNICODE macro? Then I am UTF-16, otherwise ANSI) http://msdn.microsoft.com/en-us/library/dd374074(VS.85).aspx Perhaps in GLib we could have UTF-16 and UTF-8 in that way, or am I wrong? Regards.
Re: g_malloc overhead
Maciej Piechotka wrote: On Mon, 2009-01-26 at 22:30 +0100, Martin (OPENGeoMap) wrote: Hi: Well - what do you mean? Having 2 functions - one receiving UTF-16 and one UTF-8? To be honest - it doesn't make any sense to me (it would create a lot of mess, double the code, make programming errors easier...). Converting? What's wrong with g_utf16_to_utf8? I was talking about a full UTF-16 and UTF-8 API in GLib, with a macro to work with an intermediate string type. For example, Windows has these types: LPSTR = char * char * is used for UTF-8, AFAIR. LPWSTR = UTF-16 Windows char * gunichar2 Perhaps in GLib we could have UTF-16 and UTF-8 in that way, or am I wrong? I'm not a GLib developer. As far as a module for operating on UTF-16 strings is proposed, I'm not against it. However, I would prefer not to have 2 entry points for each function. Hi: What is wrong with:
    gchar *g_utf8_strncpy (gchar *dest, const gchar *src, gsize n);
    gunichar2 *g_utf16_strncpy (gunichar2 *dest, const gunichar2 *src, gsize n);
and the macro:
    gtext *g_text_strncpy (gtext *dest, const gtext *src, gsize n);
Regards.
Re: g_malloc overhead
On Tue, 2009-01-20 at 09:01 +0100, Martín Vales wrote: BJörn Lindqvist wrote: Actually, a custom allocator could be useful even in the general case. Malloc is a system call and has quite bad performance on certain platforms (Windows in particular, I think). Something like the gslice allocator could probably improve performance a bit. I believe gslice uses malloc internally. I believe you always need malloc/new (C/C++) because you depend on the MS Windows API. I am not sure you can build your own malloc, because you depend on the operating system. Sure, you must malloc to get new memory, but you can malloc more than you need and hand out the extra memory later at a much lower cost. -Larry la...@yrral.net Regards. 2009/1/18, muppet sc...@asofyet.org: On Jan 18, 2009, at 11:43 AM, Martín Vales wrote: What are the advantages of using a glib_mem_vtable? I think we have the same malloc function in all operating systems? This vtable allows you to swap in a different allocator with next to no effort. Maybe it has special OOM handling, or uses a special pool or allocation algorithm tuned to your use-case, or does debugging logging work, or whatever. The fact that the default is the same everywhere is a bit beside the point of having the functionality. -- Me: What's that in your mouth? Zella: *swallows laboriously* Nothing. Me: What did you just swallow? Zella: A booger. Me: Baby girl, don't eat boogers. That's gross. Zella: But it was in my nose.
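Larry's "malloc more than you need and hand out the rest" idea is roughly what a bump (arena) allocator does. The following is a minimal sketch under that reading. `Arena`, `arena_init()`, `arena_alloc()` and `arena_destroy()` are hypothetical names, the 16-byte alignment is an assumption, and this is not how GSlice is actually implemented.

```c
#include <stddef.h>
#include <stdlib.h>

enum { ARENA_ALIGN = 16 };   /* assumed alignment for every block handed out */

/* One large malloc()ed block that is carved up by advancing an offset. */
typedef struct {
    unsigned char *base;   /* start of the big block */
    size_t size;           /* total bytes in the block */
    size_t used;           /* bytes handed out so far */
} Arena;

/* Returns 1 on success, 0 if the underlying malloc() failed. */
int arena_init(Arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = a->base ? size : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hand out n bytes from the block; no call into malloc() is made here. */
void *arena_alloc(Arena *a, size_t n)
{
    size_t rounded = (n + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    void *p;

    if (rounded < n || rounded > a->size - a->used)
        return NULL;            /* size overflow, or the arena is exhausted */
    p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Individual blocks are never freed; everything is released at once. */
void arena_destroy(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
```

The trade-off is the one raised elsewhere in the thread: allocation becomes a few arithmetic operations, but individual blocks cannot be returned, so this only pays off for workloads whose lifetimes are well understood.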
Re: g_malloc overhead
On Mon, 2009-01-19 at 18:43 +0100, BJörn Lindqvist wrote: Actually, a custom allocator could be useful even in the general case. Malloc is a system call and has quite bad performance on certain platforms (Windows in particular, I think). Something like the gslice allocator could probably improve performance a bit. malloc is a library call. It's not worth changing memory allocators unless you have a good solid understanding of how your program uses memory, and have done *very* detailed timings. The main trade-offs are between space, time and complexity of code. Errors in malloc() or other memory code can be very difficult to find and debug, so it's an area to avoid if at all possible. Having said all that, yes, using g_slice may help in some cases. But you need to do timings and profiling, of course. Liam Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/ Pictures from old books: http://fromoldbooks.org/ Ankh: irc.sorcery.net irc.gnome.org www.advogato.org
Re: g_malloc overhead
On Wed, 2009-01-21 at 10:21 +0100, BJörn Lindqvist wrote: 2009/1/21 Liam R E Quin l...@holoweb.net: On Mon, 2009-01-19 at 18:43 +0100, BJörn Lindqvist wrote: Actually, a custom allocator could be useful even in the general case. Malloc is a system call and has quite bad performance on certain platforms (windows in particular i think). Something like the gslice allocator could Probably improve performance a bit. malloc is a library call. On Linux, it is implemented using mmap() and brk() which are system calls. brk(2) is called to grow the heap, but not on every malloc() call; mmap(2) is used only for large objects, and then not always. If you malloc() a few megabytes and then call free, a program that allocates a lot of small objects may well go faster on some systems, and slower on others. Yes, g_slice was tested, but the program _calling_ g_slice is in the domain of the user, and errors in calling g_slice or malloc() can be hard to debug. No more from me on this. Liam -- Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/ Pictures from old books: http://fromoldbooks.org/ Ankh: irc.sorcery.net irc.gnome.org www.advogato.org ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: g_malloc overhead
2009/1/21 Liam R E Quin l...@holoweb.net: On Mon, 2009-01-19 at 18:43 +0100, BJörn Lindqvist wrote: Actually, a custom allocator could be useful even in the general case. Malloc is a system call and has quite bad performance on certain platforms (windows in particular i think). Something like the gslice allocator could Probably improve performance a bit. malloc is a library call. On Linux, it is implemented using mmap() and brk() which are system calls. The point is that malloc usually translates into one or more system calls which are expensive. With a custom allocator the system call part of malloc can be avoided. It's not worth changing memory allocators unless you have a good solid understanding of how your program uses memory, and have done *very* detailed timings. You are right of course. For GSlice in particular, it was tested thoroughly when it was merged to glib. See http://markmail.org/message/ohmuxdfyttuy4ipa. For gtk programs I believe we have quite good understanding on how applications use memory. Another example is Python which also uses a custom memory allocator. It works very well because Python uses lots of short-lived small objects. -- mvh Björn ___ gtk-app-devel-list mailing list gtk-app-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-app-devel-list
Re: g_malloc overhead
Martín Vales mar...@opengeomap.org writes: Hi: I am working with Visual C++ on Windows and I find GLib very useful for many C tasks, but I am worried about the g_malloc overhead. Do we really need a new malloc?
    gpointer g_malloc (gsize n_bytes)
    {
      if (G_UNLIKELY (!g_mem_initialized))
        g_mem_init_nomessage();
      if (G_LIKELY (n_bytes))
        {
          gpointer mem;
          mem = glib_mem_vtable.malloc (n_bytes);
          if (mem)
            return mem;
          g_error ("%s: failed to allocate %"G_GSIZE_FORMAT" bytes", G_STRLOC, n_bytes);
        }
      return NULL;
    }
What are the advantages of using a glib_mem_vtable? I think we have the same malloc function in all operating systems?
    static GMemVTable glib_mem_vtable = { standard_malloc, standard_realloc, standard_free, standard_calloc, standard_try_malloc, standard_try_realloc, };
g_malloc will abort the program when no more memory is available (programmers usually do not bother handling that case, since handling it would usually require... allocating memory). From g_try_malloc: Attempts to allocate n_bytes, and returns NULL on failure. Contrast with g_malloc(), which aborts the program on failure. Another overhead I see is in the open dir/file functions, where on Windows we need to do the UTF-8 to UTF-16 conversion every time. If Java, .NET and Qt use UTF-16 by default, why do we use UTF-8 by default in the GNOME world? I guess that:
1. Because UTF-8 is currently the main encoding for Unicode, I guess (see XML & co.)
2. Because most strings in the Latin alphabet will be nearly 2x smaller than in UTF-16 (on average in my mother language, AFAIR, UTF-8 is bigger by a few % than ISO-8859-2 - UTF-16 would be 100% bigger)
3. I guess that UTF-8 is a standard on the main GNOME platform - GNU/Linux. While I have seen xx_XX.UTF-8 locales generated in many places, I've never encountered UTF-16.
4. UTF-16 is not fixed-size, so this is not an advantage over UTF-8 (UTF-32 is).
Regards -- I've probably left my head... somewhere. Please wait untill I find it. Homepage (pl_PL): http://uzytkownik.jogger.pl/ (GNU/)Linux User: #425935 (see http://counter.li.org/)
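The difference between the two behaviours described above can be sketched in plain C. `alloc_or_abort()` and `try_alloc()` are hypothetical names that only mimic the documented contracts of g_malloc() and g_try_malloc(); they are not GLib's implementation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Mimics the g_try_malloc() contract: NULL on failure, caller must check. */
void *try_alloc(size_t n_bytes)
{
    return n_bytes ? malloc(n_bytes) : NULL;
}

/* Mimics the g_malloc() contract: a non-zero request either succeeds
 * or terminates the program, so callers never see a failed allocation. */
void *alloc_or_abort(size_t n_bytes)
{
    void *mem;

    if (n_bytes == 0)
        return NULL;
    mem = malloc(n_bytes);
    if (mem == NULL) {
        fprintf(stderr, "failed to allocate %zu bytes\n", n_bytes);
        abort();
    }
    return mem;
}
```

The extra cost over a bare malloc() is a couple of branches per call, which is why the replies in this thread ask for profile data before treating it as a problem.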
Re: g_malloc overhead
Malloc is a system call and has quite bad performance on certain platforms (windows in particular i think). Malloc is not a system call. And please don't make performance assumptions without having benchmark data to back it up. Note that it is not necessarily that clear what is a system call on Windows, as far as I know. Something like the gslice allocator could probably improve performance a bit. At least the g_slice_free() API requires passing the size of the block, so it is not possible to simply have g_malloc() call g_slice_alloc(), and g_free() and g_realloc() call g_slice_free(). If you start adding a bookkeeping layer to keep track of the size of each allocation, you end up with a bunch of code that might well correspond to what the C library's malloc, or the heap management code in the kernel32 library (which is code running at user level, not in the kernel, as far as I know) that it calls, already does anyway. --tml ___ gtk-app-devel-list mailing list gtk-app-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-app-devel-list
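The "bookkeeping layer" mentioned above would look roughly like the sketch below: each block gets a small header recording its size, so that a free() without a size argument can still recover it. All names here (`SizeHeader`, `sized_alloc()`, `sized_size()`, `sized_free()`) are invented for illustration, and the backing allocator is plain malloc() rather than g_slice.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every block. The union members other than
 * size exist only to keep the user pointer suitably aligned. */
typedef union {
    size_t size;
    long double ld;
    long long ll;
    void *ptr;
} SizeHeader;

/* Allocate n usable bytes and remember n just before them. */
void *sized_alloc(size_t n)
{
    SizeHeader *h;

    if (n > (size_t)-1 - sizeof *h)
        return NULL;                 /* header + n would overflow */
    h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    return h + 1;                    /* user memory starts after the header */
}

/* Recover the size that a size-taking free function would need. */
size_t sized_size(const void *p)
{
    return ((const SizeHeader *)p - 1)->size;
}

/* Free a block obtained from sized_alloc(); no size argument required. */
void sized_free(void *p)
{
    if (p != NULL)
        free((SizeHeader *)p - 1);
}
```

This costs one header per block and duplicates what a typical malloc() implementation already tracks internally, which is the objection made in the message.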
Re: g_malloc overhead
On Tue, Jan 20, 2009 at 12:48 PM, Larry Reaves la...@yrral.net wrote: On Tue, 2009-01-20 at 09:01 +0100, Martín Vales wrote: BJörn Lindqvist wrote: Actually, a custom allocator could be useful even in the general case. Malloc is a system call and has quite bad performance on certain platforms (Windows in particular, I think). Something like the gslice allocator could probably improve performance a bit. I believe gslice uses malloc internally. I believe you always need malloc/new (C/C++) because you depend on the MS Windows API. I am not sure you can build your own malloc, because you depend on the operating system. Sure, you must malloc to get new memory, but you can malloc more than you need and hand out the extra memory later at a much lower cost. I recall reading somewhere that mmap can be used to build custom memory allocators. If that's true, then one can bypass malloc. I think that you can request memory through mmap by using MAP_ANONYMOUS. Emmanuel Rodriguez
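For reference, requesting memory directly from the kernel with an anonymous mapping looks roughly like this on POSIX-like systems (this does not apply to Windows). `page_alloc()` and `page_free()` are hypothetical wrapper names; `mmap()`, `munmap()` and the flags are the real interfaces, although `MAP_ANONYMOUS` is spelled `MAP_ANON` on some systems.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD-style spelling */
#endif

/* Ask the kernel for len bytes of zero-filled, private memory that is
 * not backed by any file. Returns NULL on failure. */
void *page_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give the mapping back; len must match the length that was mapped.
 * Returns 0 on success, -1 on error. */
int page_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

Each call here really is a system call and works in whole pages, so a custom allocator built this way would still need to carve the pages into smaller blocks itself.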
Re: g_malloc overhead
On Jan 18, 2009, at 11:43 AM, Martín Vales wrote: What are the advantages of use a glib_mem_vtable ???. I think we have the same malloc function in all operating systems? This vtable allows you to swap in a different allocator with next to no effort. Maybe it has special OOM handling, or uses a special pool or allocation algorithm tuned to your use-case, or does debugging logging work, or whatever. The fact that the default is the same everywhere is a bit beside the point of having the functionality. -- Me: What's that in your mouth? Zella: *swallows laboriously* Nothing. Me: What did you just swallow? Zella: A booger. Me: Baby girl, don't eat boogers. That's gross. Zella: But it was in my nose. ___ gtk-app-devel-list mailing list gtk-app-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-app-devel-list
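A stripped-down sketch of the vtable idea follows. `MemVTable`, `mem_set_vtable()`, `mem_alloc()` and `mem_free()` are invented stand-ins that mirror the shape of GLib's GMemVTable and g_mem_set_vtable() without being their implementation. The replacement allocator here simply counts live blocks, as an example of the "debugging logging work" use-case.

```c
#include <stddef.h>
#include <stdlib.h>

/* Table of allocator entry points; every allocation goes through it. */
typedef struct {
    void *(*malloc_fn)(size_t n_bytes);
    void  (*free_fn)(void *mem);
} MemVTable;

/* Default table: the C library allocator. */
static MemVTable current_vtable = { malloc, free };

/* Swap in a different allocator for all later calls. */
void mem_set_vtable(const MemVTable *vt) { current_vtable = *vt; }

void *mem_alloc(size_t n_bytes) { return current_vtable.malloc_fn(n_bytes); }
void  mem_free(void *mem)       { current_vtable.free_fn(mem); }

/* Example replacement: counts blocks that were allocated but not yet freed. */
static long live_blocks = 0;

void *counting_malloc(size_t n_bytes)
{
    void *p = malloc(n_bytes);
    if (p != NULL)
        live_blocks++;
    return p;
}

void counting_free(void *mem)
{
    if (mem != NULL)
        live_blocks--;
    free(mem);
}

long mem_live_blocks(void) { return live_blocks; }

const MemVTable counting_vtable = { counting_malloc, counting_free };
```

After `mem_set_vtable(&counting_vtable)`, existing callers of `mem_alloc()` and `mem_free()` get leak counting without being changed, at the price of one indirect call per allocation.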
Re: g_malloc overhead
Actually, a custom allocator could be useful even in the general case. Malloc is a system call and has quite bad performance on certain platforms (windows in particular i think). Something like the gslice allocator could Probably improve performance a bit. 2009/1/18, muppet sc...@asofyet.org: On Jan 18, 2009, at 11:43 AM, Martín Vales wrote: What are the advantages of use a glib_mem_vtable ???. I think we have the same malloc function in all operating systems? This vtable allows you to swap in a different allocator with next to no effort. Maybe it has special OOM handling, or uses a special pool or allocation algorithm tuned to your use-case, or does debugging logging work, or whatever. The fact that the default is the same everywhere is a bit beside the point of having the functionality. -- Me: What's that in your mouth? Zella: *swallows laboriously* Nothing. Me: What did you just swallow? Zella: A booger. Me: Baby girl, don't eat boogers. That's gross. Zella: But it was in my nose. ___ gtk-devel-list mailing list gtk-devel-l...@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list -- mvh Björn ___ gtk-app-devel-list mailing list gtk-app-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-app-devel-list
Re: g_malloc overhead
BJörn Lindqvist wrote: Actually, a custom allocator could be useful even in the general case. Malloc is a system call and has quite bad performance on certain platforms (Windows in particular, I think). Something like the gslice allocator could probably improve performance a bit. I believe gslice uses malloc internally. I believe you always need malloc/new (C/C++) because you depend on the MS Windows API. I am not sure you can build your own malloc, because you depend on the operating system. Regards. 2009/1/18, muppet sc...@asofyet.org: On Jan 18, 2009, at 11:43 AM, Martín Vales wrote: What are the advantages of using a glib_mem_vtable? I think we have the same malloc function in all operating systems? This vtable allows you to swap in a different allocator with next to no effort. Maybe it has special OOM handling, or uses a special pool or allocation algorithm tuned to your use-case, or does debugging logging work, or whatever. The fact that the default is the same everywhere is a bit beside the point of having the functionality. -- Me: What's that in your mouth? Zella: *swallows laboriously* Nothing. Me: What did you just swallow? Zella: A booger. Me: Baby girl, don't eat boogers. That's gross. Zella: But it was in my nose.
g_malloc overhead
Hi: I am working with Visual C++ on Windows and I find GLib very useful for many C tasks, but I am worried about the g_malloc overhead. Do we really need a new malloc?
    gpointer g_malloc (gsize n_bytes)
    {
      if (G_UNLIKELY (!g_mem_initialized))
        g_mem_init_nomessage();
      if (G_LIKELY (n_bytes))
        {
          gpointer mem;
          mem = glib_mem_vtable.malloc (n_bytes);
          if (mem)
            return mem;
          g_error ("%s: failed to allocate %"G_GSIZE_FORMAT" bytes", G_STRLOC, n_bytes);
        }
      return NULL;
    }
What are the advantages of using a glib_mem_vtable? I think we have the same malloc function in all operating systems?
    static GMemVTable glib_mem_vtable = { standard_malloc, standard_realloc, standard_free, standard_calloc, standard_try_malloc, standard_try_realloc, };
Another overhead I see is in the open dir/file functions, where on Windows we need to do the UTF-8 to UTF-16 conversion every time. If Java, .NET and Qt use UTF-16 by default, why do we use UTF-8 by default in the GNOME world?
    ...
    #ifdef G_OS_WIN32
      wpath = g_utf8_to_utf16 (path, -1, NULL, NULL, error);
      if (wpath == NULL)
        return NULL;
      dir = g_new (GDir, 1);
      dir->wdirp = _wopendir (wpath);
      g_free (wpath);
      if (dir->wdirp)
        return dir;
    ...
Regards.
Re: g_malloc overhead
On Sun, Jan 18, 2009 at 11:43 AM, Martín Vales mar...@opengeomap.org wrote:
> Other overhead i see is the open dir/file funtions, where in windows we
> need do the utf8 to utf16 everytime in windows. If JAVA,.NET and Qt use
> utf16 by default why in gnome world we use utf8 by default?.

Historically, Unix was a late adopter of Unicode. And crucially, the Unicode designers originally thought 16 bits would be enough. So Java was explicitly designed around Unicode and specifically UTF-16, and Windows was a relatively early adopter. Only later did it become clear that more code point space was needed, and also that UTF-8 specifically had a number of advantages.

Strings and encodings are actually a pretty interesting subject, I think, and for any programmer it's worth taking some time to read at least the material available on the web.
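The point about 16 bits not being enough can be seen in a small sketch. Assuming the standard UTF-8 and UTF-16 encoding rules, the two functions below (encode_utf16 and encode_utf8 are hypothetical names, not GLib API) encode a single code point. Anything above U+FFFF needs a surrogate pair in UTF-16, so neither encoding has a fixed width per character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if cp is a valid Unicode scalar value (not a surrogate,
 * not above U+10FFFF). */
static int is_scalar_value(uint32_t cp)
{
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

/* Encode one code point as UTF-16. Returns the number of 16-bit
 * units written (1 or 2), or 0 if cp is not encodable. */
static size_t encode_utf16(uint32_t cp, uint16_t out[2])
{
    if (!is_scalar_value(cp))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}

/* Encode one code point as UTF-8. Returns the number of bytes
 * written (1 to 4), or 0 if cp is not encodable. */
static size_t encode_utf8(uint32_t cp, uint8_t out[4])
{
    if (!is_scalar_value(cp))
        return 0;
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}
```

For example, U+0041 is one unit in both encodings, while U+1D11E takes two 16-bit units in UTF-16 and four bytes in UTF-8.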
Re: g_malloc overhead
On Jan 18, 2009, at 11:43 AM, Martín Vales wrote:
> What are the advantages of use a glib_mem_vtable ???. I think we have the
> same malloc function in all operating systems?

This vtable allows you to swap in a different allocator with next to no effort. Maybe it has special OOM handling, or uses a special pool or allocation algorithm tuned to your use case, or does debugging or logging work, or whatever. The fact that the default is the same everywhere is a bit beside the point of having the functionality.

--
Me: What's that in your mouth?
Zella: *swallows laboriously* Nothing.
Me: What did you just swallow?
Zella: A booger.
Me: Baby girl, don't eat boogers. That's gross.
Zella: But it was in my nose.
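A rough sketch of the pattern described above, in plain C so it stands alone: every allocation goes through a table of function pointers, and replacing the table changes the allocator without touching any caller. MemVTable, mem_set_vtable, my_malloc and the counting allocator are hypothetical stand-ins for illustration; in GLib itself the table type is GMemVTable and it is installed with g_mem_set_vtable().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for GLib's GMemVTable. */
typedef struct {
    void *(*malloc_fn)(size_t n_bytes);
    void  (*free_fn)(void *mem);
} MemVTable;

/* Default table: forwards straight to the C library. */
static MemVTable mem_vtable = { malloc, free };

/* Counters kept by the replacement allocator below. */
static size_t alloc_count = 0;
static size_t free_count = 0;

/* Replacement allocator that counts calls, e.g. for debugging. */
static void *counting_malloc(size_t n_bytes)
{
    alloc_count++;
    return malloc(n_bytes);
}

static void counting_free(void *mem)
{
    free_count++;
    free(mem);
}

/* Swap in a different allocator; callers of my_malloc() do not change. */
static void mem_set_vtable(const MemVTable *vtable)
{
    assert(vtable != NULL);
    assert(vtable->malloc_fn != NULL && vtable->free_fn != NULL);
    mem_vtable = *vtable;
}

/* The cost over plain malloc() is one indirect call and two branches. */
static void *my_malloc(size_t n_bytes)
{
    void *mem;

    if (n_bytes == 0)
        return NULL;
    mem = mem_vtable.malloc_fn(n_bytes);
    if (mem == NULL)
        abort();  /* like g_malloc(), give up instead of returning NULL */
    return mem;
}

static void my_free(void *mem)
{
    if (mem != NULL)
        mem_vtable.free_fn(mem);
}
```

A program would call mem_set_vtable() once at startup with a table such as { counting_malloc, counting_free } and keep using my_malloc() and my_free() everywhere else. Memory obtained before the swap should not be released after it if the two allocators are incompatible.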