On Monday 08 October 2007 13:50, Heikki Linnakangas wrote:
> While profiling a test case of exporting data from PostgreSQL, I noticed
> that a lot of CPU time was spent in sprintf, formatting timestamps like
> "2007-10-01 12:34". I could speed that up by an order of magnitude by
> replacing the sprintf call with tailored code, but it occurred to me
> that we could do the same in a more generic way in GCC.

It is already done in gcc to some extent: for example, it replaces
printf("message\n") with puts("message"). That is too far-fetched for my
taste. I think gcc should not do it. How can gcc know what printf() and
puts() mean in *my* libc? I think such optimizations should be done in
glibc.

> The format string is almost always a string literal, so instead of
> parsing it at runtime in glibc sprintf, we can preparse it in gcc. We
> already rewrite two simple cases:
>
> sprintf(dest, "constant") -> strcpy(dest, "constant")
> sprintf(dest, "%s", ptr) -> strcpy(dest, ptr)

If done in glibc, it can work even for

sprintf(dest, variable_which_happens_to_have_no_percents)
sprintf(dest, fmt, ptr) /* fmt var may be "%s" here */

> To effectively preparse any common format string, I'm proposing that we
> add more rules to rewrite this kind of format strings as well:
>
> sprintf(dest, "%d", arg1); -> a new function that does the same thing,
> but without the overhead of parsing the format string. Like itoa on some
> platforms. We could inline it as well. That would allow further
> optimizations, if for example the compiler knows that arg1 is within a
> certain range (do we do that kind of optimizations?)
>
> sprintf(dest, "constant%...", args...) -> memcpy(dest, "constant", 8);
> sprintf(dest+8, "%...", args...);

Just make printf faster instead, by implementing this in glibc.

> sprintf(dest, "%dconstant%...", args1, args...) -> sprintf(dest, "%d",
> args1); memcpy(dest+X, "constant", 8); sprintf(dest+XX, "%...", args...);

How do you know that the user wants faster, not smaller, code in this
place?

> If the sprintf and memcpy calls generated in the last two rewrites are
> further simplified, format strings like "%d-%d-%d" wouldn't need to call
> the glibc sprintf at all. The last form of rewrite wouldn't likely be a
> win unless the resulting sprintf-calls can be converted into something
> cheaper, because otherwise we're just introducing more library call
> overhead.

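For illustration, here is roughly what a fully simplified rewrite of
sprintf(dest, "%d-%d-%d", a, b, c) from the quoted proposal could look like.
This is only a sketch: fmt_int() and format_date() are hypothetical names,
not existing gcc or glibc functions, and the code assumes a 32-bit int.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the decimal form of v at dest (no NUL)
   and return the number of characters written. Assumes 32-bit int. */
static int fmt_int(char *dest, int v)
{
    char tmp[12];               /* 10 digits is the most we can produce */
    int n = 0, len = 0;
    /* Negate in unsigned arithmetic so INT_MIN does not overflow. */
    unsigned int u = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;

    do {                        /* digits come out least significant first */
        tmp[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);

    if (v < 0)
        dest[len++] = '-';
    while (n > 0)               /* copy them back in reading order */
        dest[len++] = tmp[--n];
    return len;
}

/* What sprintf(dest, "%d-%d-%d", a, b, c) might be lowered to:
   no format string is parsed at run time. */
static int format_date(char *dest, int a, int b, int c)
{
    int pos = 0;
    pos += fmt_int(dest + pos, a);
    dest[pos++] = '-';
    pos += fmt_int(dest + pos, b);
    dest[pos++] = '-';
    pos += fmt_int(dest + pos, c);
    dest[pos] = '\0';
    return pos;                 /* same return value as sprintf */
}
```

Note that the expansion is clearly larger than the single sprintf call it
replaces, which is the size-versus-speed trade-off raised above.
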
Yes, printing dates and whatnot is very binary->decimal intensive.
In the Linux kernel, decimal conversion in vsprintf() is optimized with
custom conversion code. It is 3x faster, and no, it's not written in
assembly. You may port it to glibc.
--
vda
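
To show why timestamp formatting is mostly binary->decimal work, here is a
minimal sketch of tailored code for the "2007-10-01 12:34" layout mentioned
at the top of the thread. It is not the kernel's conversion code; put2(),
put4() and format_timestamp() are hypothetical names, and the field ranges
in the comments are assumptions that the caller must guarantee.

```c
#include <stddef.h>

/* Write v (assumed 0..99) as exactly two decimal digits. */
static char *put2(char *p, unsigned v)
{
    p[0] = (char)('0' + v / 10);
    p[1] = (char)('0' + v % 10);
    return p + 2;
}

/* Write v (assumed 0..9999) as exactly four decimal digits. */
static char *put4(char *p, unsigned v)
{
    p = put2(p, v / 100);
    return put2(p, v % 100);
}

/* Hypothetical tailored formatter for "YYYY-MM-DD HH:MM".
   dest must have room for 17 bytes, including the terminating NUL. */
static size_t format_timestamp(char *dest, unsigned year, unsigned mon,
                               unsigned day, unsigned hour, unsigned min)
{
    char *p = dest;
    p = put4(p, year); *p++ = '-';
    p = put2(p, mon);  *p++ = '-';
    p = put2(p, day);  *p++ = ' ';
    p = put2(p, hour); *p++ = ':';
    p = put2(p, min);
    *p = '\0';
    return (size_t)(p - dest);  /* characters written, excluding the NUL */
}
```

Almost every instruction here is a division or remainder by a constant,
so the cost is dominated by the decimal conversion itself rather than by
format-string parsing.
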