Re: inlining glib functions (Was: public barrier functions)
On Tue, 2005-12-13 at 22:13 +0100, Tim Janik wrote:
> On Tue, 13 Dec 2005, Gustavo J. A. M. Carneiro wrote:
> > On Tue, 2005-12-13 at 17:11 +0100, Tim Janik wrote:
> > IMHO, some functions are obvious candidates for inlining, regardless
> > of any profiling done on them. For instance:
> >
> >   gchar*
> >   g_strdup (const gchar *str)
> >   {
> >     gchar *new_str;
> >     gsize length;
> >
> >     if (str)
> >       {
> >         length = strlen (str) + 1;
> >         new_str = g_new (char, length);
> >         memcpy (new_str, str, length);
> >       }
> >     else
> >       new_str = NULL;
> >
> >     return new_str;
> >   }
> >
> > This function is trivial. I doubt you'll ever find any new bugs in
> > it. It is called in many places. So why pay a performance penalty
> > when you could easily avoid it?
>
> inlining doesn't automatically mean performance improvements and not
> inlining doesn't automatically cause performance penalties.
> if you start to inline lots of widely used small functions in non
> performance critical code sections, all you've gained is a bigger code
> section size and a lower likelihood of warm instruction caches (that
> becomes especially critical when starting to bloat tight loops due to
> inlining). now consider that 90% of a program's runtime is spent in
> 10% of its code, that means 90% of your inlining occurs in non
> performance critical sections.
> that's why modern compilers use tunable heuristics to decide about
> automated inlining and don't stupidly inline everything they can.

I personally would not say to inline every trivial function in GLib. I
was talking about single-instruction functions that are not inlined
right now. It might even be that the call instruction is longer than the
instruction it wraps, not to mention that functions linked in from
shared libraries jump twice to reach the actual body of the function
(first a call to a stub, which then jumps to the function itself), so
the instruction pipeline is effectively emptied twice.

Nevertheless, I'll shut up and post patches :)

Thanks for the information and sorry for the noise.
-- Bazsi ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: public barrier functions
Hi Paul,

> But I'd be interested to see some benchmarks; see how much this
> actually matters. Run a typical program twice; once with functions and
> once with some inlines/macros. It's quite likely that in a real-world
> program, the ratio of time it actually spends doing the atomic
> operation function calls, to the amount of CPU time in general, will
> actually be rather small indeed. Such an optimisation is likely to be
> of little actual benefit, for the cost it brings.

We don't have numbers of typical programs, but we have benchmarks. Look
at http://bugzilla.gnome.org/show_bug.cgi?id=63621

The numbers for i386 are:

  inline   :  4.376484 sec
  function :  7.325000 sec
  fallback : 23.984717 sec

for ppc64:

  inline   :  1.961480 sec
  function :  3.328593 sec
  fallback : 31.004492 sec

Regards,
Sebastian
-- 
Sebastian Wilhelmi       | up here above all the clouds
mailto:[EMAIL PROTECTED] | the sky is so wonderfully blue
http://seppi.de          |
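To put the timings above in relative terms, here is a small sketch that divides each figure by the inline figure for the same architecture. The helper name `demo_slowdown` and the constant names are invented for this illustration; only the numbers come from the message.

```c
/* Ratio of one timing to a baseline timing (both in seconds). */
double demo_slowdown(double seconds, double baseline_seconds)
{
    return seconds / baseline_seconds;
}

/* Figures copied from the benchmark message above (seconds). */
static const double i386_inline    = 4.376484;
static const double i386_function  = 7.325000;
static const double i386_fallback  = 23.984717;
static const double ppc64_inline   = 1.961480;
static const double ppc64_function = 3.328593;
static const double ppc64_fallback = 31.004492;
```

Calling `demo_slowdown(i386_function, i386_inline)` and the ppc64 equivalent gives roughly 1.7 in both cases, while the fallback comes out at about 5.5 times the inline figure on i386 and about 15.8 times on ppc64. These ratios describe the benchmark only; they say nothing about how much of a real program's time is spent in atomic operations.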
Re: public barrier functions
On Mon, 2005-12-12 at 22:41 +0100, Sebastian Wilhelmi wrote:
> Hi Balazs,
>
> > Is there a specific reason why the barrier functions implemented by
> > gatomic.c and gatomic.h are not exported APIs?
>
> We didn't want to create the Swiss army knife for high performance
> multithread programming, just atomic integers. As you are surely
> aware, using memory barriers is far from an easy topic and bugs are
> easily introduced.

Sure. I only wanted to avoid using locks on the fastpath of my
application, but I already solved it with atomic integers and I've read
a lot of articles on memory barriers in the meantime.

> > And while I am at it, would it be possible to change the atomic
> > operations to inline functions? I'd think it is much better to
> > inline single-instruction functions as otherwise the call overhead
> > is too great.
>
> That would make it impossible to fix the corresponding implementations
> also for already compiled programs, should bugs surface (which they
> already did) and it would also make it impossible to guarantee that
> all programs really use the same implementation, i.e. with inline
> functions one module could use the asm version (because gcc is used)
> and the second module would use the mutex versions (because another
> compiler is used). That would be very bad of course.

That's a valid point, but maybe it would be possible to request inlined
implementations of some functions by the use of a preprocessor symbol,
e.g.

  #define G_GLIB_USE_INLINE_FUNCS
  #include <glib.h>

Of course this would only be possible if it is not a maintenance
headache, e.g. the function bodies should not be copied by hand, but
autogenerated instead.

Would something like this be accepted?

-- 
Bazsi
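The opt-in idea in the message above could look roughly like the following single-file sketch. Everything here is hypothetical: `DEMO_USE_INLINE_FUNCS`, `demo_counter_add` and `demo_counter_add_inline` are invented names, not GLib API, and a plain addition stands in for the real atomic operation. The comments mark which part would live in a public header and which in the library.

```c
/* The application opts in before including the (hypothetical) header. */
#define DEMO_USE_INLINE_FUNCS 1

/* ---- would live in the public header ---- */

/* The exported function is always declared, so code that does not opt in
 * keeps calling into the shared library. */
void demo_counter_add(int *counter, int amount);

#ifdef DEMO_USE_INLINE_FUNCS
/* Inline body, only visible to code that asked for it. */
static inline void demo_counter_add_inline(int *counter, int amount)
{
    *counter += amount;
}
/* Redirect calls written as demo_counter_add(...) to the inline body. */
#define demo_counter_add(counter, amount) \
    demo_counter_add_inline((counter), (amount))
#endif

/* ---- would live in the library source ---- */

/* Parenthesising the name keeps the function-like macro from expanding,
 * so the exported symbol is still defined under its normal name. */
void (demo_counter_add)(int *counter, int amount)
{
    *counter += amount;
}
```

With the symbol defined, `demo_counter_add(&n, 2)` compiles to the inline body, while `(demo_counter_add)(&n, 2)` still reaches the exported function. Sebastian's objection applies to this sketch unchanged: a program built with the symbol defined keeps its own copy of the body and no longer picks up fixes made in the library.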
Re: inlining glib functions (Was: public barrier functions)
Gustavo J. A. M. Carneiro said:
> IMHO, some functions are obvious candidates for inlining, regardless
> of any profiling done on them. For instance:
>
>   gchar*
>   g_strdup (const gchar *str)
>   {
>     gchar *new_str;
>     gsize length;
>
>     if (str)
>       {
>         length = strlen (str) + 1;
>         new_str = g_new (char, length);
>         memcpy (new_str, str, length);
>       }
>     else
>       new_str = NULL;
>
>     return new_str;
>   }
>
> This function is trivial. I doubt you'll ever find any new bugs in it.
> It is called in many places. So why pay a performance penalty when you
> could easily avoid it?
>
> Glib has many such small functions.

g_strdup() is a *very* poor example of a function that could be inlined
for performance. The function is trivial, yes, but it calls strlen(),
malloc(), and memcpy() --- three operations which are going to swamp the
time spent in making a call to a real symbol for g_strdup().

This would be a misguided optimization, throwing away the ability to fix
bugs behind the scenes for a negligible speed improvement.

-- 
muppet scott at asofyet dot org
Re: inlining glib functions (Was: public barrier functions)
On Tue, 13 Dec 2005, Gustavo J. A. M. Carneiro wrote:
> On Tue, 2005-12-13 at 17:11 +0100, Tim Janik wrote:
> > more important than _how_ to inline is _what_ and _why_ to inline.
> > in general, things that can easily and reasonably be inlined have
> > already been provided as inlined functions or macros in the glib
> > headers. so for functions that are not inlined but you think
> > _should_ be inlined, a persuasive argument should be given, e.g. a
> > profiling scenario where the function in question shows up with
> > significant figures and significant timing improvements for using
> > the inlined version.
> > the g_atomic_* functions are a good example of this (see profiling
> > figures mentioned in the original thread), but they are still not
> > inlined for other reasons.
>
> IMHO, some functions are obvious candidates for inlining, regardless
> of any profiling done on them. For instance:
>
>   gchar*
>   g_strdup (const gchar *str)
>   {
>     gchar *new_str;
>     gsize length;
>
>     if (str)
>       {
>         length = strlen (str) + 1;
>         new_str = g_new (char, length);
>         memcpy (new_str, str, length);
>       }
>     else
>       new_str = NULL;
>
>     return new_str;
>   }
>
> This function is trivial. I doubt you'll ever find any new bugs in it.
> It is called in many places. So why pay a performance penalty when you
> could easily avoid it?

inlining doesn't automatically mean performance improvements and not
inlining doesn't automatically cause performance penalties.
if you start to inline lots of widely used small functions in non
performance critical code sections, all you've gained is a bigger code
section size and a lower likelihood of warm instruction caches (that
becomes especially critical when starting to bloat tight loops due to
inlining). now consider that 90% of a program's runtime is spent in 10%
of its code, that means 90% of your inlining occurs in non performance
critical sections.
that's why modern compilers use tunable heuristics to decide about
automated inlining and don't stupidly inline everything they can.

what you're suggesting is blind optimization; experienced programmers
will tell you that this will result in more harm than good.
profiling a critical section, and maybe inlining/optimizing a single
string copy in a critical loop, can gain you ten- or a hundredfold the
improvements that you could get from some shallow global optimization.
you don't even need to believe me, just start googling for "premature"
and "optimization" and read, there's enough stuff out there to fill your
christmas holidays ;)

> Glib has many such small functions.
>
> [ BTW, if (str) could be changed to if (G_LIKELY(str)) ]

yes it could, and it would make sense. the best thing you can do to make
sure such improvements are integrated is to submit complete patches for
such changes (including changelog entries) so we only need to apply,
compile and test them and are done.
(and if you have/need commit access, after a number of quality
submissions that is usually granted because it can save us additional
work.)

> One other thing; it is well known that inline functions are better
> than macros:
>   - They give you better type safety;
>   - Less cryptic warnings/errors when calling them with wrong types;
>   - For debugging, you can still disable inlining through the CFLAGS
>     in order to step into the inline functions in step by step
>     debugging;
>
> So why not start using fewer macros and more inline functions?

well, we're not completely unaware of the type safety inline functions
can offer over some macro uses ;)
why don't you start to submit patches for macros where you think we
really should have used an inlined function, and we discuss specific
cases then?
you can take a look at the existing glib headers to see how we do
inlined functions, and read the comments which describe how inlined
functions are still built in a non-inlined version into glib.

> --
> Gustavo J. A. M. Carneiro

---
ciaoTJ
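As an illustration of the `G_LIKELY` remark above, here is a minimal sketch of a branch hint applied to a g_strdup-like function. `DEMO_LIKELY` and `demo_strdup` are invented stand-ins rather than the GLib definitions, and the sketch assumes that on GCC-compatible compilers such a hint is expressed with `__builtin_expect`. It also uses plain `malloc` instead of `g_new`, so unlike the original it can return NULL on allocation failure.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical hint macro: tells the compiler the condition is usually
 * true; on other compilers it is just the expression itself. */
#if defined(__GNUC__)
#define DEMO_LIKELY(expr) (__builtin_expect(!!(expr), 1))
#else
#define DEMO_LIKELY(expr) (expr)
#endif

/* Copy a string; NULL in gives NULL out, as in the quoted g_strdup. */
char *demo_strdup(const char *str)
{
    if (DEMO_LIKELY(str != NULL)) {
        size_t length = strlen(str) + 1;   /* include the terminator */
        char *new_str = malloc(length);
        if (new_str != NULL)
            memcpy(new_str, str, length);
        return new_str;
    }
    return NULL;
}
```

The hint changes only how the compiler lays out the two branches, never the result, so whether it helps at all is something to measure rather than assume.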
Re: inlining glib functions (Was: public barrier functions)
On Tue, 2005-12-13 at 15:40 -0500, muppet wrote:
> Gustavo J. A. M. Carneiro said:
> > IMHO, some functions are obvious candidates for inlining, regardless
> > of any profiling done on them. For instance:
> >
> >   gchar*
> >   g_strdup (const gchar *str)
> >   {
> >     gchar *new_str;
> >     gsize length;
> >
> >     if (str)
> >       {
> >         length = strlen (str) + 1;
> >         new_str = g_new (char, length);
> >         memcpy (new_str, str, length);
> >       }
> >     else
> >       new_str = NULL;
> >
> >     return new_str;
> >   }
> >
> > This function is trivial. I doubt you'll ever find any new bugs in
> > it. It is called in many places. So why pay a performance penalty
> > when you could easily avoid it?
> >
> > Glib has many such small functions.
>
> g_strdup() is a *very* poor example of a function that could be
> inlined for performance. The function is trivial, yes, but it calls
> strlen(), malloc(), and memcpy() --- three operations which are going
> to swamp the time spent in making a call to a real symbol for
> g_strdup().
>
> This would be a misguided optimization,

OK, I just made one mistake: I forgot that strlen and memcpy are already
inlined, so that function in particular turned out not to be so small :P

Some examples of really tiny functions are g_list_find, g_list_length,
g_ascii_dtostr. About half a dozen assembly instructions (on x86) each.

> throwing away the ability to fix bugs behind the scenes

I meant this only for functions that are trivial; do you think there's
any chance of anyone ever spotting a bug in g_strdup?

Regards.

-- 
Gustavo J. A. M. Carneiro
[EMAIL PROTECTED] [EMAIL PROTECTED]
The universe is always one step beyond logic
public barrier functions
Hi,

Is there a specific reason why the barrier functions implemented by
gatomic.c and gatomic.h are not exported APIs?

I'd like to avoid locking in some situations where these memory barrier
instructions would come in handy.

One thread:

  ptr = NULL

Other thread:

  void *my_ptr = ptr;
  if (ptr)
    {
    }

Of course this would be a race condition if I was trying to use the
pointer, but if I add reference counting like this:

One thread:

  loc_ptr = ptr;
  ptr = NULL;
  g_data_unref(loc_ptr);

Other thread:

  void *my_ptr = g_data_ref(ptr);
  if (my_ptr)
    {
    }

Unless I miss something this should work, provided:

1) g_data_ref handles NULL pointers
2) the reference count of ptr itself is manipulated with atomic
   operations.
3) the CPU ensures proper read/write memory ordering

Now the third condition does not hold on some non-x86 CPUs, in which
case I'd need something like:

One thread:

  loc_ptr = ptr;
  ptr = NULL;
  wmb();
  g_data_unref(loc_ptr);

Other thread:

  void *my_ptr = g_data_ref(ptr);
  if (my_ptr)
    {
    }

Now the question is why the memory barrier macros are hidden in the
gatomic module and not exported.

And while I am at it, would it be possible to change the atomic
operations to inline functions? I'd think it is much better to inline
single-instruction functions as otherwise the call overhead is too
great.

Thanks in advance,

-- 
Bazsi
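For readers who have not met write barriers before, the following sketch shows the kind of ordering a `wmb()`-style barrier provides, using C11 `<stdatomic.h>` fences as a stand-in (an assumption of this note; the thread predates C11 and GLib's internal macros are not shown here). All names (`demo_data`, `demo_published`, `demo_publish`, `demo_consume`) are invented. The sketch covers the simpler publishing direction only: it is not the reference-counting scheme from the message, and it does not address whether an object can be freed between loading the pointer and taking a reference.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    int payload;
} demo_data;

/* Pointer shared between a writer thread and reader threads. */
static _Atomic(demo_data *) demo_published = NULL;

/* Writer: fill in the object, then make it visible. The release fence
 * plays the role of wmb(): the payload store may not be reordered
 * after the pointer store. */
void demo_publish(demo_data *data, int value)
{
    data->payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&demo_published, data, memory_order_relaxed);
}

/* Reader: returns the payload if something has been published,
 * otherwise the fallback. The acquire fence pairs with the writer's
 * release fence, so a reader that sees the pointer also sees the
 * payload written before it. */
int demo_consume(int fallback)
{
    demo_data *data =
        atomic_load_explicit(&demo_published, memory_order_relaxed);
    if (data == NULL)
        return fallback;
    atomic_thread_fence(memory_order_acquire);
    return data->payload;
}
```

Without the two fences, a CPU with weaker ordering than x86 may let a reader observe the new pointer before the payload store, which is the hazard behind condition 3) above.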
Re: inlining glib functions (Was: public barrier functions)
On Mon, 2005-12-12 at 18:44 +, Gustavo J. A. M. Carneiro wrote:
> On Mon, 2005-12-12 at 19:29 +0100, Balazs Scheidler wrote:
> > [...]
> > And while I am at it, would it be possible to change the atomic
> > operations to inline functions? I'd think it is much better to
> > inline single-instruction functions as otherwise the call overhead
> > is too great.
>
> I agree. Also many other glib functions could be static inline in the
> public header files. For instance, many of the functions in glist.c
> and gslist.c are really tiny, thus could easily be inlined, but aren't
> because the compiler has no access to their implementation, only to
> their prototype.

One problem I see with this is binary compatibility. The shared lib
version of glib has to provide the old non-inlined symbols, and simply
moving the functions to the header as static inline would remove those
symbols, even though I would not be surprised if this could be worked
around with some gcc trickery, something along the lines of:

gatomic.h:

  static inline void
  g_atomic_int_inc(gint *value)
  {
    ...
  }

ginlineimpls.c (probably auto-generated in some way):

  #define g_atomic_int_inc __inline_g_atomic_int_inc
  #include <gatomic.h>
  #undef g_atomic_int_inc

  void
  g_atomic_int_inc(gint *value)
  {
    __inline_g_atomic_int_inc(value);
  }

Other opinions?

-- 
Bazsi
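The renaming trick sketched in the message can be tried in a single file, with comments marking where the header and the generated source file would be. This is an illustration only: `demo_int_inc` and `demo_int_inc_inline_impl` are invented names (the `__inline_` prefix from the message is avoided because identifiers starting with two underscores are reserved in C), and a plain increment stands in for the atomic one.

```c
/* ---- start of what the generated .c file would do ---- */

/* Rename the function while the header is being read, so the inline
 * body gets a private name inside this file only. */
#define demo_int_inc demo_int_inc_inline_impl

/* ---- contents of the public header, shown in place of #include ---- */
static inline void demo_int_inc(int *value)
{
    ++*value;   /* stand-in for the real atomic increment */
}
/* ---- end of header contents ---- */

#undef demo_int_inc

/* The exported, non-inline symbol that already compiled programs still
 * link against; it simply forwards to the renamed inline body. */
void demo_int_inc(int *value)
{
    demo_int_inc_inline_impl(value);
}
```

Programs compiled against the new header would get the static inline body directly, while older binaries keep resolving `demo_int_inc` from the shared library. Both paths share one function body, so nothing has to be copied by hand.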
Re: public barrier functions
On Mon, Dec 12, 2005 at 10:41:55PM +0100, Sebastian Wilhelmi wrote:
> > And while I am at it, would it be possible to change the atomic
> > operations to inline functions? I'd think it is much better to
> > inline single-instruction functions as otherwise the call overhead
> > is too great.
>
> That would make it impossible to fix the corresponding implementations
> also for already compiled programs, should bugs surface (which they
> already did) and it would also make it impossible to guarantee that
> all programs really use the same implementation, i.e. with inline
> functions one module could use the asm version (because gcc is used)
> and the second module would use the mutex versions (because another
> compiler is used). That would be very bad of course.

Yes, I'd agree here. I think it's more important to have a consistent
library of such operations across all programs, and an easy way to apply
bugfixes, than it is to have down-to-the-cycle optimisation.

But I'd be interested to see some benchmarks; see how much this actually
matters. Run a typical program twice; once with functions and once with
some inlines/macros. It's quite likely that in a real-world program, the
ratio of time it actually spends doing the atomic operation function
calls, to the amount of CPU time in general, will actually be rather
small indeed. Such an optimisation is likely to be of little actual
benefit, for the cost it brings.

-- 
Paul LeoNerd Evans
[EMAIL PROTECTED]
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/