On Monday 08 October 2007 13:50, Heikki Linnakangas wrote:
> While profiling a test case of exporting data from PostgreSQL, I noticed
> that a lot of CPU time was spent in sprintf, formatting timestamps like
> "2007-10-01 12:34". I could speed that up by an order of magnitude by
> replacing the sprintf call with tailored code, but it occurred to me
> that we could do the same in a more generic way in GCC.

gcc already does this to some extent: for example, it
replaces printf("message\n") with puts("message").
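
To make the existing rewrite concrete, here is a minimal sketch. The function
names greet_printf and greet_puts are made up for illustration. The rewrite is
only possible when the format contains no '%' and ends in a newline, because
puts() appends the newline itself.

```c
#include <stdio.h>

/* What the programmer wrote. Returns the number of characters printed. */
int greet_printf(void)
{
    return printf("message\n");
}

/* What the compiler may emit instead when the printf result is unused.
 * puts() adds the trailing newline, so the literal loses its "\n". */
int greet_puts(void)
{
    return puts("message");
}
```

Both functions write the same bytes to stdout. Building with
-fno-builtin-printf should keep gcc from treating printf as a builtin, which
disables the substitution.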

It's too far-fetched for my taste. I think gcc should not do it.
How can gcc know what printf() and puts() mean in *my* libc?

I think such optimizations should be done in glibc.

> The format string is almost always a string literal, so instead of
> parsing it at runtime in glibc sprintf, we can preparse it in gcc. We
> already rewrite two simple cases:
> 
> sprintf(dest, "constant") -> strcpy(dest, "constant")
> sprintf(dest, "%s", ptr) -> strcpy(dest, ptr)

If done in glibc, it can work even for

sprintf(dest, variable_which_happens_to_have_no_percents)
sprintf(dest, fmt, ptr) /* fmt var may be "%s" here */
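
A rough sketch of what such a runtime check could look like. This is not
glibc code: fast_sprintf is a hypothetical wrapper name, and a real
implementation would put the checks inside the library's own vfprintf
machinery rather than in front of it.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: checks the format at run time, so it also helps
 * when fmt is a variable and not a string literal. */
int fast_sprintf(char *dest, const char *fmt, ...)
{
    va_list ap;
    int n;

    /* No '%' anywhere: the format string is the output. */
    if (strchr(fmt, '%') == NULL) {
        size_t len = strlen(fmt);
        memcpy(dest, fmt, len + 1);      /* +1 copies the terminating NUL */
        return (int)len;
    }

    va_start(ap, fmt);
    if (strcmp(fmt, "%s") == 0) {
        /* Exactly "%s": copy the string argument directly. */
        const char *s = va_arg(ap, const char *);
        size_t len = strlen(s);
        memcpy(dest, s, len + 1);
        n = (int)len;
    } else {
        /* Everything else falls back to the general parser. */
        n = vsprintf(dest, fmt, ap);
    }
    va_end(ap);
    return n;
}
```

The cost is one strchr() and possibly one strcmp() per call, which every
caller pays, including those that take the fallback path.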

> To effectively preparse any common format string, I'm proposing that we
> add more rules to rewrite this kind of format strings as well:
> 
> sprintf(dest, "%d", arg1); -> a new function that does the same thing,
> but without the overhead of parsing the format string. Like itoa on some
> platforms. We could inline it as well. That would allow further
> optimizations, if for example the compiler knows that arg1 is within a
> certain range (do we do that kind of optimizations?)
> 
> sprintf(dest, "constant%...", args...) -> memcpy(dest, "constant", 8);
> sprintf(dest+8, "%...", args...);

Just make printf itself faster by implementing this in glibc instead.

> sprintf(dest, "%dconstant%...", args1, args...) -> sprintf(dest, "%d",
> args1); memcpy(dest+X, "constant", 8); sprintf(dest+XX, "%...", args...);

How do you know that the user wants faster code here, rather than smaller code?

> If the sprintf and memcpy calls generated in the last two rewrites are
> further simplified, format strings like "%d-%d-%d" wouldn't need to call
> the glibc sprintf at all. The last form of rewrite wouldn't likely be a
> win unless the resulting sprintf-calls can be converted into something
> cheaper, because otherwise we're just introducing more library call
> overhead.

Yes, printing dates and whatnot is very binary->decimal intensive.

In the Linux kernel, decimal conversion in vsprintf() is optimized
with custom conversion code. It is 3x faster, and no, it's not written in assembly.
You could port it to glibc.
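
To show the general idea of converting without any format parsing, here is a
deliberately simple sketch. It is not the kernel's code, which is more
elaborate; put_decimal is a made-up name, and this version still uses plain
division by 10.

```c
#include <stddef.h>

/* Writes value in decimal to dest, NUL-terminated, and returns the number
 * of digits written. dest must have room for the digits plus the NUL. */
size_t put_decimal(char *dest, unsigned int value)
{
    /* Three decimal digits per byte is always enough (256 < 1000). */
    char tmp[3 * sizeof(unsigned int)];
    size_t n = 0;
    size_t i;

    /* Produce digits least significant first. */
    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    /* Reverse them into the destination buffer. */
    for (i = 0; i < n; i++)
        dest[i] = tmp[n - 1 - i];
    dest[n] = '\0';
    return n;
}
```

A format like "%d-%d-%d" then becomes three put_decimal() calls with a '-'
stored between them, with no format string to parse at run time.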
--
vda
