On 2/26/18 7:13 AM, bartc wrote:
On 26/02/2018 11:40, Chris Angelico wrote:
On Mon, Feb 26, 2018 at 10:13 PM, bartc <b...@freeuk.com> wrote:
Below is the first draft of a Python port of a program to do with random
numbers. (Ported from my language, which in turned ported it from a C
program by George Marsaglia, the random number guy.)

However, running it quickly exhausts the memory in my machine. The reason is that Python unhelpfully widens all results to bignums as needed. The code
relies on calculations being modulo 2^64.

Note that restricting integer ops to 64 bits probably still won't work, as I
believe the numbers need to be unsigned.

No, it's because the original implementation assumed integer
wrap-around (at least, I think that's what's happening; I haven't
analyzed the code in great detail). That means all your integer
operations are doing two jobs: the one they claim to, and then a
masking to 64 bits signed. That's two abstract operations that happen,
due to the nature of the CPU, to work efficiently together. If you
don't implement both halves of that in your Python port, you have
failed at porting. What if you were porting a program from a 72-bit
chip that assumed Binary Coded Decimal? Would you complain that C's
integers are badly behaved?

And that's without even asking whether a C program that assumes
integer wrap-around counts as portable. At least with Python, you have
a guarantee that integer operations are going to behave the same way
on all compliant implementations of the language.


A C version is given below. (One I may have messed around with, which I'm not sure works properly. For an original, google for Marsaglia and KISS64 or SUPRKISS64.)

Most integers are unsigned, which have well-defined overflow in C (they just wrap as expected). In C, a mixed signed/unsigned op is performed as unsigned.

-----------------------------

/*   SUPRKISS64.c, period 5*2^1320480*(2^64-1)   */
#include <stdio.h>
#include <stdlib.h>
#include "timer.c"
 static signed long long Q[20632],
                 carry=36243678541LL,
                 xcng=12367890123456LL,
                 xs=521288629546311LL,
                 indx=20632;

 #define CNG ( xcng=6906969069LL*xcng+123 )
 #define XS  ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
 #define SUPR ( indx<20632 ? Q[indx++] : refill() )
 #define KISS SUPR+CNG+XS

 signed long long refill(void) {
  int i; signed long long z,h;
  for(i=0;i<20632;i++) {
    h = (carry&1);
    z = ((Q[i]<<41)>>1)+((Q[i]<<39)>>1)+(carry>>1);
    carry = (Q[i]>>23)+(Q[i]>>25)+(z>>63);
    Q[i] = ~((z<<1)+h);
  }
  indx=1;
  return (Q[0]);
 }

 int main() {
  int i; signed long long x;

  for(i=0;i<20632;i++) Q[i]=CNG+XS;

  for(i=0;i<1000000000;i++) x=KISS;

  printf("Does x=4013566000157423768\n     x=%llu.\n",x);
}


With proper 64-bit masking (def only64(x): return x & 0xFFFFFFFFFFFFFFFF), the Python version produces the correct answer using a reasonable amount of memory. Well, once you notice that the Python code had N=1e5, and the C code had N=1e9 :)   If you want to experiment, with N=1e5, the final number should be 5255210926702073855.

Also, I note that you said, "Most integers are unsigned", but the C code has them all declared as signed?  It doesn't seem to have mattered to your result, but I'm not an expert on C portability guarantees.

--Ned.
--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to