Re: How to make Python run as fast (or faster) than Julia

Ned Batchelder Mon, 26 Feb 2018 05:47:32 -0800

On 2/26/18 7:13 AM, bartc wrote:

On 26/02/2018 11:40, Chris Angelico wrote:

On Mon, Feb 26, 2018 at 10:13 PM, bartc <b...@freeuk.com> wrote:

Below is the first draft of a Python port of a program to do withrandom
numbers. (Ported from my language, which in turned ported it from a C
program by George Marsaglia, the random number guy.)
However, running it quickly exhausts the memory in my machine. Thereason isthat Python unhelpfully widens all results to bignums as needed. Thecode
relies on calculations being modulo 2^64.
Note that restricting integer ops to 64 bits probably still won'twork, as I
believe the numbers need to be unsigned.


No, it's because the original implementation assumed integer
wrap-around (at least, I think that's what's happening; I haven't
analyzed the code in great detail). That means all your integer
operations are doing two jobs: the one they claim to, and then a
masking to 64 bits signed. That's two abstract operations that happen,
due to the nature of the CPU, to work efficiently together. If you
don't implement both halves of that in your Python port, you have
failed at porting. What if you were porting a program from a 72-bit
chip that assumed Binary Coded Decimal? Would you complain that C's
integers are badly behaved?

And that's without even asking whether a C program that assumes
integer wrap-around counts as portable. At least with Python, you have
a guarantee that integer operations are going to behave the same way
on all compliant implementations of the language.

A C version is given below. (One I may have messed around with, whichI'm not sure works properly. For an original, google for Marsaglia andKISS64 or SUPRKISS64.)

Most integers are unsigned, which have well-defined overflow in C(they just wrap as expected). In C, a mixed signed/unsigned op isperformed as unsigned.


-----------------------------

/*   SUPRKISS64.c, period 5*2^1320480*(2^64-1)   */
#include <stdio.h>
#include <stdlib.h>
#include "timer.c"
 static signed long long Q[20632],
                 carry=36243678541LL,
                 xcng=12367890123456LL,
                 xs=521288629546311LL,
                 indx=20632;

 #define CNG ( xcng=6906969069LL*xcng+123 )
 #define XS  ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
 #define SUPR ( indx<20632 ? Q[indx++] : refill() )
 #define KISS SUPR+CNG+XS

 signed long long refill(void) {
  int i; signed long long z,h;
  for(i=0;i<20632;i++) {
    h = (carry&1);
    z = ((Q[i]<<41)>>1)+((Q[i]<<39)>>1)+(carry>>1);
    carry = (Q[i]>>23)+(Q[i]>>25)+(z>>63);
    Q[i] = ~((z<<1)+h);
  }
  indx=1;
  return (Q[0]);
 }

 int main() {
  int i; signed long long x;

  for(i=0;i<20632;i++) Q[i]=CNG+XS;

  for(i=0;i<1000000000;i++) x=KISS;

  printf("Does x=4013566000157423768\n     x=%llu.\n",x);
}

With proper 64-bit masking (def only64(x): return x &0xFFFFFFFFFFFFFFFF), the Python version produces the correct answerusing a reasonable amount of memory. Well, once you notice that thePython code had N=1e5, and the C code had N=1e9 :) If you want toexperiment, with N=1e5, the final number should be 5255210926702073855.

Also, I note that you said, "Most integers are unsigned", but the C codehas them all declared as signed? It doesn't seem to have mattered toyour result, but I'm not an expert on C portability guarantees.


--Ned.
--
https://mail.python.org/mailman/listinfo/python-list

Re: How to make Python run as fast (or faster) than Julia

Reply via email to