Why can't the Clojure bytecode compiler perform this optimisation itself, the
way functional languages do when compiling to native code? Is it to keep
the Clojure compiler fast (for dynamic runtime compilation), since
performing tail call optimisation presumably requires a bunch of extra
checks and more complex code?
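
For concreteness, here is a rough sketch of the rewrite in question, written in Java only because it maps directly onto JVM bytecode. The class and method names (`TailCallSketch`, `sumTo`, `sumToLoop`) are made up for illustration, and this only covers the simplest case, a method calling itself in tail position. Clojure's `recur` asks for this same rewrite explicitly.

```java
public class TailCallSketch {
    // Tail-recursive form: the recursive call is the last thing the method
    // does, but the JVM still pushes a new stack frame for every call.
    static long sumTo(long n, long acc) {
        if (n == 0) return acc;
        return sumTo(n - 1, acc + n);
    }

    // What self tail-call elimination would turn it into: rebind the
    // parameters and jump back to the top instead of calling again.
    static long sumToLoop(long n, long acc) {
        while (true) {
            if (n == 0) return acc;
            acc = acc + n;
            n = n - 1;
        }
    }

    public static void main(String[] args) {
        // The loop runs in constant stack space regardless of n.
        System.out.println(sumToLoop(1_000_000, 0));
        try {
            // The recursive form may exhaust the stack at this depth,
            // depending on the JVM's stack size settings.
            System.out.println(sumTo(1_000_000, 0));
        } catch (StackOverflowError e) {
            System.out.println("recursive version overflowed the stack");
        }
    }
}
```

Both methods compute the same value; the difference is only in how much stack they use, which is why the question is about what the compiler could do rather than what the programmer writes.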
There are more reasons to avoid using threads than memory use alone.
Besides the obvious cost of creating and destroying threads (which is
reduced or removed by using thread pools), you also have the cost of
time slicing once you have more software threads than hardware
threads: there is the obvious c