It's a data dependency:

   xxx1 =.  1  ,"1~ 10000 3?@$ 100
   6!:2'~. xxx1'
0.0068845
   xxx1 =.  1  ,"1~ 10000 3?@$ 10000
   6!:2'~. xxx1'
0.000578

Repetitions at lengths of 32 bits seem to be trouble (not surprising when you think about it).  One kludgy-smelling workaround would be to rotate the value of each word by a varying number of bits.  The advantage of that is that it requires very little setup and would still be fast for short arguments.

You have been looking into better hash functions.  Do you want to implement one here?  Preferably one that is very fast for short arguments.

hhr

On 11/24/2022 12:40 AM, Elijah Stone wrote:
That looks rather serious.  It seems some problem with the hashing function is causing an unreasonably high rate of collisions.  As a temporary workaround, you can try using ~.&.:(1&|."1), or use the 32-bit version.
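
A minimal sketch of how the workaround above might be checked, assuming the same test data as the short timing example at the top of this thread. The name nubrot is hypothetical and introduced only for illustration, and no timings are shown because they depend on the machine and the J version.

```j
NB. Sketch only: compare the plain nub with the rotate / nub / rotate-back workaround.
xxx1   =: 1 ,"1~ 10000 3 ?@$ 100   NB. 10000 rows of 3 random values in 0..99, each with a trailing 1
nubrot =: ~.&.:(1&|."1)            NB. hypothetical name for the workaround quoted above

(~. xxx1) -: nubrot xxx1           NB. result 1 means both return the same unique rows in the same order
6!:2 '~. xxx1'                     NB. seconds taken by the plain nub
6!:2 'nubrot xxx1'                 NB. seconds taken by the workaround
```

The sketch relies on &.: (Under) rotating every row back after ~. has run, so the result should keep the original column order.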

On Thu, 24 Nov 2022, Ben Gorte wrote:

G'day,


Still J-ing along, I believe I ran into a little performance issue.

I can strip it down into:


   A3=:?1000000 3$100 NB. three columns of random numbers 0 .. 99
   A31 =: A3,.1 NB. with an extra column of 1-s
   A4=:?1000000 4$100 NB. or four random columns

   6!:2 'echo $~.A3'
632783 3
0.037673

   6!:2 'echo $~.A31'
632783 4
24.6091

That's 600 times longer!

However,

   6!:2 'echo $~.A4'
995025 4
0.43277

is again much quicker.


This is consistent across versions 807, 903 and 904, all on Ubuntu Linux.


I do need the fourth column and it has many 1-s (not all).


Thanks,

Ben
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm