On Sun, 24 Jul 2022, Raul Miller wrote:
> space/time tradeoff
There should not be a significant time penalty for blocking, compared with fully eager iteration.
You can actually implement the optimisation in pure J (and see the performance uplift); the methodology is similar to that used by the parallel implementations of primitive modifiers which I demonstrated a couple of months ago. The main difference is that it uses a variable number of fixed-size slices, rather than a fixed number of variably-sized slices. The trick is to pick a slice size which is small enough that everything you need fits in L1 cache, but large enough to amortise the dispatch overhead. Dispatch is not _that_ expensive, so this is quite doable.
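To make the shape of the idea concrete, here is a minimal sketch in Python rather than J. It is only an illustration of "a variable number of fixed-size slices", not the J implementation referred to above. The names eager, blocked and BLOCK, and the value 4096, are hypothetical; a real slice size would have to be tuned against the L1 cache of the machine, and pure Python will not show the cache effect itself.

```python
# Hypothetical slice length: small enough that the per-slice temporaries
# could fit in L1, large enough that per-slice overhead is amortised.
BLOCK = 4096


def eager(xs, ys):
    """Fully eager evaluation: every step builds a full-length temporary."""
    prod = [x * y for x, y in zip(xs, ys)]  # temporary as long as the input
    sq = [p * p for p in prod]              # second full-length temporary
    return sum(sq)


def blocked(xs, ys, block=BLOCK):
    """Same computation over a variable number of fixed-size slices."""
    total = 0
    for start in range(0, len(xs), block):
        bx = xs[start:start + block]        # last slice may be shorter
        by = ys[start:start + block]
        prod = [x * y for x, y in zip(bx, by)]  # at most `block` items
        sq = [p * p for p in prod]              # at most `block` items
        total += sum(sq)                        # combine per-slice results
    return total
```

Both functions return the same value for integer inputs of equal length; the only difference is that the temporaries in blocked never grow beyond block items, whatever the input length. The number of loop iterations, and hence the number of dispatches, is the input length divided by block, rounded up.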
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm