This problem, which Elijah has found can be attributed to my insouciance
about the robustness of CRC-32 as a hash function, afflicts the entire
i.-family and has no good workaround.
However, I think it will be easy to fix, and I hope we will get the fix
out in the next beta, which should be coming soon.
Henry Rich
On 11/25/2022 7:11 PM, Ben Gorte wrote:
Hi Henry, could you be a bit more explicit about these 32 bits +
workaround?
Elijah's suggestion helps, but doesn't seem to solve the entire issue.
For example, something similar occurs in dyadic i.
Thanks,
Ben
On Fri, 25 Nov 2022 at 02:57, Henry Rich <henryhr...@gmail.com> wrote:
It's a data dependency:
xxx1 =. 1 ,"1~ 10000 3?@$ 100
6!:2'~. xxx1'
0.0068845
xxx1 =. 1 ,"1~ 10000 3?@$ 10000
6!:2'~. xxx1'
0.000578
Repetitions at lengths of 32 bits seem to be trouble (not surprising
when you think about it). One kludgy-smelling workaround would be to
rotate the value of each word by a varying number of bits. The
advantage of that is that it requires very little setup and would still
be fast for short arguments.
You have been looking into better hash functions. Do you want to
implement one here? Preferably one that is very fast for short arguments.
hhr
On 11/24/2022 12:40 AM, Elijah Stone wrote:
That looks rather serious. It seems some problem with the hashing
function is causing an unreasonably high rate of collisions. As a
temporary workaround, you can try using ~.&.:(1&|."1), or use the
32-bit version.
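A minimal sketch of how the first workaround might be applied to an array shaped like Ben's A31, assuming a 64-bit J session affected by the issue. The name nubw is hypothetical and used only for illustration, and the row count is reduced so the slow case finishes sooner. No timings are shown because they depend on the machine and J version.

```j
NB. Test data shaped like Ben's A31: three random columns plus a constant column of 1s
A31 =: (? 100000 3 $ 100) ,. 1

NB. nubw is a hypothetical name for the suggested workaround:
NB. rotate each row by one position, take the nub, then rotate back
nubw =: ~.&.:(1&|."1)

6!:2 '~. A31'       NB. time the plain nub (the slow case reported above)
6!:2 'nubw A31'     NB. time the nub taken under row rotation

(~. A31) -: nubw A31   NB. check that both give the same result
```

The final line is expected to report a match: rotating every row by the same amount is a reversible reordering of columns, so it should change neither which rows are distinct nor the order in which they first appear.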
On Thu, 24 Nov 2022, Ben Gorte wrote:
G'day,
Still J-ing along, I believe I ran into a little performance issue.
I can strip it down into:
A3=:?1000000 3$100 NB. three columns of random numbers 0 .. 99
A31 =: A3,.1 NB. with an extra column of 1-s
A4=:?1000000 4$100 NB. or four random columns
6!:2 'echo $~.A3'
632783 3
0.037673
6!:2 'echo $~.A31'
632783 4
24.6091
That's 600 times longer!
However,
6!:2 'echo $~.A4'
995025 4
0.43277
is again much quicker.
This is consistent across versions 807, 903 and 904, all on Ubuntu
Linux.
I do need the fourth column and it has many 1-s (not all).
Thanks,
Ben
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm