32-bit unsigned division A/B by compile-time constant B can be optimized by
replacing it with multiplication and shift right. For example, division by 10
is done like this: (A*3435973837) >> 35, in i386 asm:
        movl    $-858993459, %ecx
        movl    8(%ebp), %eax
        mull    %ecx
        movl    %edx, %esi
        shrl    $3, %esi
Choosing right constant by which we need to multiply is not trivial: large A
values can give wrong results. I spent a few days researching this. I will
attach the following test programs:

find_fast_div.c: reports fastdiv params for every B in [3..2^32-1]

fast_div_bench.c: measures speed improvement when dividing by B=10.

find_fast_div_random.c: there is fastdiv_params() function which is intended to
go into gcc's optimizer, "constant division" department. It calculates fastdiv
parameters (K,bits) for given B and max_A (max_A is to be supplied by gcc's
value range analyser). main() calls fastdiv_params() with random parameters in
the loop.

I am not familiar with gcc internals. Please tell me what should I do next in
order to integrate it inot gcc.


-- 
           Summary: Improved division-by-constant code
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: vda dot linux at googlemail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395

Reply via email to