I'm no threading expert, but 1000 threads, if those are indeed
real threads (i.e. preemptive OS threads), is way too many.
My understanding is that for CPU-bound work there's little
benefit to creating more threads than the number of hardware
threads (logical cores) your CPU supports (minus your main
thread?). So you'd want a work queue of all the available
work/images, and then some reasonable number of threads (4-12
depending on core count / hyperthreading) taking work as they
need it from that queue.
You should probably be looking at std.parallelism (TaskPool etc.)
for this. Perhaps somebody can provide a more detailed how-to ...
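
For what it's worth, here is a rough sketch of the work-queue idea
using std.parallelism. Treat it as an illustration rather than a
tested recipe: processImage and processAll are made-up names
standing in for the real per-image work, the ints stand in for
images, and the worker count of totalCPUs - 1 is just my assumption
about a sensible default.

```d
import std.parallelism : TaskPool, totalCPUs;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical stand-in for the real per-image work.
int processImage(int id)
{
    return id * 2;
}

// Processes `count` work items on a small, fixed pool of threads.
int[] processAll(int count)
{
    auto results = new int[count];

    // Assumption: one worker per logical core, minus one for the
    // calling thread, which also takes part in the parallel foreach.
    auto pool = new TaskPool(totalCPUs > 1 ? totalCPUs - 1 : 1);
    scope (exit) pool.finish(); // let the worker threads terminate

    // Work unit size 1: each thread grabs one item at a time as it
    // becomes free, which is effectively the work queue.
    foreach (i; pool.parallel(iota(count), 1))
        results[i] = processImage(i); // each index is written once

    return results;
}

void main()
{
    auto results = processAll(1000);
    writeln(results[0 .. 5]);
}
```

So 1000 pieces of work still get done, but only a handful of
threads ever exist. If the per-item work is very small, a larger
work unit size should cut down on the queue overhead.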