On Wed, Mar 5, 2014 at 5:19 PM, Roger Hui <[email protected]> wrote:
> Good answers. For /:~x vs. g{x, the explanations are:
>
> - Indexing must check for index error. Sorting does not.
> - Indexing uses random read access over a large chunk of memory (i.e.
> x). Sort does not.
>
I was curious how much overhead the type conversion and index error
checking added. I was surprised it was virtually nothing.
I added a unsafe foreign, 128!:6 that maps the indices of a character
array into a new array:
(0 0 1 2 1) (128!:6) 'abcd'
aabcb
100 timer 's=: x{~g=: /:x'
0.0792123
100 timer 's=: x(128!:6)~g=: /:x'
0.0770512
100 timer 'S=: /:~x[G=: /:x'
0.060501
s-:S
1
G-:g
1
100 timer 's=: x{~g'
0.0292137
100 timer 's=: x(128!:6)~g'
0.0260004
As you can see, this seems to show very little overhead from those
operations even though the conversion appears to copy the array of
indices and bounds checking would seem to be another operation. I
can't think of any way to speed it up. It's extremely fast anyways.
I'm still somewhat surprised that jtsortc is faster..It seems that
indexing into the large character array is slower than the 256
character array in jtsortc. My guess is the large array cannot be held
in cache but the small array of character counts can. Stackoverflow
seems to suggest it could be the reason:
http://stackoverflow.com/questions/8004669/long-arrays-caching-issues.
I'm even more impressed with the sort implementation. I didn't
research it too extensively.
x.c
case XC(128, 6): R CDERIV(CIBEAM, 0, jtifromf, 1, RMAX, RMAX);
vfrom.c
F2(jtifromf){
A z; UC*wv,*v; I *av; I n, j;
n = AN(a);
GA(z, AT(w), AN(a), AR(a), AS(a));
v = (C*)AV(z);
av = AV(a); wv = UAV(w);
j = 0;
for (; j < n; j++) {
*v++ = wv[*av++];
}
R z;
}
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm