Some additional observations:

  * Running on tmpfs on Linux is ~30% faster than with ImDisk on Windows 10 for me.
  * WinFSP is still unreliable for any serious work. The accumulated CSV was 
silently truncated on write, with no errors reported.
  * Using channels for threading adds significant memory overhead. Some 
permutations even run out of memory on my 16GB laptop, even though there 
shouldn't really be much copying. It looks like memory isn't freed fast enough, 
but I didn't investigate. This is with my pathological data set (long tables, 
short seqs), of course.
  * Using a global table accessed from threads with a lock is just overhead. 
Interestingly, it's slower than the single-threaded version on Linux, but faster 
on Windows. Collecting intermediate per-thread tables and then merging them 
could be faster than immediate access to the shared table.
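
To make the last idea concrete, here is a rough sketch of the "per-thread tables, merge at the end" shape. It is written in Python only for illustration and is not my actual code. The names `count_chunk` and `count_parallel` are made up, and the "table" is assumed to be a simple key-to-count map. CPython's GIL means this particular version won't show a real speedup for CPU-bound work; it only shows the structure, where no lock is needed because each worker owns its table until the merge.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_chunk(rows):
    """Build a private table for one chunk; no lock is needed here."""
    local = Counter()
    for key in rows:
        local[key] += 1
    return local


def count_parallel(rows, workers=4):
    """Split rows into chunks, count each chunk in a worker, merge at the end."""
    size = max(1, -(-len(rows) // workers))  # ceiling division, at least 1
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_chunk, chunks):
            total.update(partial)  # merging happens in the calling thread only
    return total
```

The trade-off is that every worker holds its own table until the merge, so peak memory can go up with the number of threads, which matters for data like mine.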



All in all, adding threading with what's readily available turned out pretty 
meh. Maybe it's just _my_ code.
