> def t(n):
>    s = time.time()
>    for x in range(0,10000):
>        l = []
>        for i in range(1,10001):
>            l.append(i)
>    e = time.time()
>    print(e-s, len(l))
>    return l
> 
[…]
> So Pharo at 1.6 seconds (Array time) is respectable compared to Julia's 
> highly optimized preallocated arrays and thoroughly keeps Python in its place.

You have a super machine. Look at these figures:

Consider the Pharo version of your code:
[0 to: 10000 do: [ :x |
        l := OrderedCollection new.
        1 to: 10001 do: [ :i | l add: i ].
]] timeToRun

On my MacBook Air this takes 0:00:00:04.234. We can improve on that by 
preallocating the internal array:

[0 to: 10000 do: [ :x |
        l := OrderedCollection new: 10001.
        1 to: 10001 do: [ :i | l add: i ].
]] timeToRun
returns 0:00:00:03.959

We can also avoid all the additional checks the ordered collection performs.
Getting rid of the ordered collection and using an array gives me:

[0 to: 10000 do: [ :x |
        l := Array new: 10001.
        1 to: 10001 do: [ :i | l at: i put: i ].
]] timeToRun

returns 0:00:00:02.45

Nearly a factor of 2 (4.234 s down to 2.45 s), but that is the easy part. 
Let's __really__ speed this thing up by writing the code "1 to: 10001 do: [ :i 
| l at: i put: i ]." in C.

Create the file /tmp/testarray.c  with the following code:
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
#include <stdio.h>
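/* Store i at index i for every i in [0, size). C indices are 0-based,
   so the values written are 0 .. size-1. */
int fillArray(int *array, int size) {
  int i;
  for(i = 0; i < size; i ++ ) {
    array[i] = i;
  }
  return 0;
}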

int main() {
  int i, sum = 0;
  int table[5];
  
  fillArray(table, 5);
  
  for(i = 0; i < 5; i ++ )
    sum += table[i];
    
  printf("The sum is %d\n", sum);

  return 0;
}
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

The important part is the function fillArray. The main function is only there 
to try it out.
Compile this file with:
 gcc -c -m32  testarray.c
and then create a dynamic library:
gcc -shared -m32 -o testarray.dylib testarray.o

So, you should have the file testarray.dylib  in your /tmp. Here is what I have 
on my machine:
/tmp> ls -l testarray.dylib 
-rwxr-xr-x  1 alexandrebergel  wheel  8588 Jan 16 17:02 testarray.dylib

This library is accessible from Pharo using NativeBoost. 

Create a class called A and add these two methods:
A>>fillArray: array
        ^ self fillArray: array asWordArray size: array size 

A>>fillArray: array size: size
  <primitive: #primitiveNativeCall module: #NativeBoostPlugin error: errorCode>
    ^ self
        nbCall: #( int fillArray( int* array, int size ) )
        module: '/tmp/testarray.dylib'

You can check that you get the same result in Pharo as in C with the 
equivalent of the main function:
l := WordArray new: 5.
A new fillArray: l.
l sum
=> 10
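
A side note on the value 10: fillArray uses C's 0-based indices, so it writes 
0..4 here, whereas the Smalltalk loop "l at: i put: i" stored 1..n. If you want 
the C side to reproduce the Smalltalk contents exactly, a variant along these 
lines should do. This is only a sketch; fillArrayOneBased is a hypothetical 
name used for illustration and is not part of testarray.c:

```c
/* Hypothetical variant of fillArray: slot i (0-based) receives i + 1,
   which matches what the Smalltalk loop "l at: i put: i" stores. */
int fillArrayOneBased(int *array, int size) {
  int i;
  for (i = 0; i < size; i++) {
    array[i] = i + 1;
  }
  return 0;
}
```

With a 5-element array this stores 1 2 3 4 5, so the sum becomes 15 instead 
of 10.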


Ok, so now we are ready to re-write our code:

[0 to: 10000 do: [ :x |
        l := WordArray new: 10001.
        A new fillArray: l.
]] timeToRun

=> 0:00:00:00.463

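We have gained roughly a factor of 9 over the original OrderedCollection 
version (4.234 s down to 0.463 s), and about a factor of 5 over the Array 
version.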
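
To see how much of that remaining time is the C loop itself and how much is 
allocation and call overhead on the Pharo side, you can time the same work in 
plain C. Below is a minimal sketch, assuming the resolution of clock() is good 
enough for this. time_fill_rounds and its parameters are hypothetical names, 
not part of testarray.c, and no timings are given because they depend on the 
machine and the compiler flags:

```c
#include <stdlib.h>
#include <time.h>

/* Same loop as fillArray in testarray.c. */
static void fill_array(int *array, int size) {
  int i;
  for (i = 0; i < size; i++) {
    array[i] = i;
  }
}

/* Hypothetical helper: performs `rounds` allocate-and-fill passes over an
   array of `size` ints (size must be >= 1). Returns the elapsed CPU time in
   seconds, or -1.0 if an allocation fails. The last element of each pass is
   summed into *checksum so that the result of the loop is actually used. */
double time_fill_rounds(int rounds, int size, long *checksum) {
  long sum = 0;
  int r;
  clock_t start = clock();
  for (r = 0; r < rounds; r++) {
    int *array = malloc((size_t)size * sizeof *array);
    if (array == NULL) {
      return -1.0;
    }
    fill_array(array, size);
    sum += array[size - 1];
    free(array);
  }
  *checksum = sum;
  return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_fill_rounds(10001, 10001, &checksum) from a main of your own 
mirrors the shape of the Pharo snippet: 10001 rounds, each one allocating and 
filling a 10001-element array.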

Cheers,
Alexandre

-- 
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.



