32-bit unsigned division A/B by compile-time constant B can be optimized by replacing it with multiplication and shift right. For example, division by 10 is done like this: (A*3435973837) >> 35, in i386 asm: movl $-858993459, %ecx movl 8(%ebp), %eax mull %ecx movl %edx, %esi shrl $3, %esi Choosing right constant by which we need to multiply is not trivial: large A values can give wrong results. I spent a few days researching this. I will attach the following test programs:
find_fast_div.c: reports fastdiv params for every B in [3..2^32-1] fast_div_bench.c: measures speed improvement when dividing by B=10. find_fast_div_random.c: there is fastdiv_params() function which is intended to go into gcc's optimizer, "constant division" department. It calculates fastdiv parameters (K,bits) for given B and max_A (max_A is to be supplied by gcc's value range analyser). main() calls fastdiv_params() with random parameters in the loop. I am not familiar with gcc internals. Please tell me what should I do next in order to integrate it inot gcc. -- Summary: Improved division-by-constant code Product: gcc Version: 4.2.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: c AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: vda dot linux at googlemail dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395