dynamic code generation to superoptimize calendar calculations

Kragen Javier Sitaker Thu, 06 Mar 2008 00:37:06 -0800

In http://blog.plover.com/calendar/leapday.html Mark-Jason Dominus
suggests this algorithm for calculating leap years, as a proposed
replacement for the Gregorian system:


   1. Divide the year by 33. If the result [remainder?] is 0, it is not
      a leap year.  Otherwise,
   2. If the result is divisible by 4, it is a leap year. 

This Dominus Calendar has an average leap-day correction of
0.24242424... leap-days per year, against the Gregorian calendar's
0.2425, the tropical year's 0.24219 or so leap-days per year, and the
vernal equinox year's 0.2422 or so leap-days per year.

Perhaps it would be slightly simpler, and equally accurate, to do the
following:

   1. Divide the year by 33.
   2. Divide the remainder by 4.
   3. If the remainder is 1, it is a leap year.

Other values that would work in place of 1 are 2 and 3, but not 0.

I suspect there is some simpler algorithm (in the sense of not requiring
division by a large number such as 33) that is also more accurate.  You
could write the second one as \ y . ((y % 33) % 4) == 1 and the first
one as \ y . (\ r . (r != 0) && ((r % 4) == 0)) (y % 33), or if you're
into flat representations,
        r = y % 33
        r2 = r % 4
        result = r2 == 1
or
        r = y % 33
        r2 = r != 0
        r3 = r % 4
        r4 = r3 == 0
        result = r2 & r4

If our repertoire of primitives is %, ==, !=, <, &, and |, our
repertoire of operands is limited to integers in [0, 33] and previously
produced results, then the formula in each step has at most 39 * 6 * 39
= 12168 possibilities.  If our number of steps is limited to 5, then
there are only 12168^5 programs to search to guarantee that we find the
above two, which is unfortunately 266 744 826 599 558 381 568 programs.
That's on the edge of being practical to search by brute force; I think
it would be less than a thousand years on a thousand-CPU cluster.

The four-step programs should be practical to exhaustively search ---
they should take a few days at most on such a cluster.  A language
including + but only integers up to 17 would also include a form of my
short program, and would have at most (18+4) * 7 * (18+4) = 3388
possible formulas for a step, which gives less than 3388^4 programs to
search, which is only 131 756 972 359 936.  More exactly, there are 2527
possible first formulas, 2800 possible second ones, and then 3087, 3388.
The actual product is 74 001 973 953 600, almost twice as small.
Type-compatibility (the result has to be boolean; & and | can only apply
to two booleans; other operators can only apply to two numbers)
restricts the number of formulas further, but that number is difficult
to calculate exactly.

One difficulty is that measuring the accuracy of such a leap-day
calculation program presumably requires running it a number of times on
different year numbers, which adds an order of magnitude or two to the
search time.  If you have to run each one on average 30 times to find
out that it's unacceptable, you could end up running 2 x 10^15 steps or
so, which is maybe 10^16 instructions.  A modern quad-core CPU can
probably run around 5 x 10^9 instructions per second, so that leaves 2 x
10^5 seconds of testing if each primitive instruction above is a CPU
instruction.

A better approach may be to run the instructions SIMD-fashion, APL-like,
on a bunch of years at once.  This should avoid the need for dynamic
code generation to get reasonable performance, although you probably
still want to have the inner loops written in C or assembly.  (I don't
think you can use MMX or SSE for the division, but you can probably use
it for comparison and population count.)

dynamic code generation to superoptimize calendar calculations

Reply via email to