Paul Lalonde wrote:

CSP doesn't scale very well to hundreds of simultaneously executing threads (my claim, not, as far as I've found yet, anyone else's). It is very well suited to a small number of threads that need to communicate, and as a model of concurrency for tasks with few points of contact. For performance, the channel locks become a bottleneck as the number of cores scales up. As for expressiveness, there are still issues with composability and correctness as the number of interacting threads increases. Yes, you at least get local stacks, but the work seems to get exponentially harder as the number of systems in the simulation (um, game engine) increases.
Interesting.
I agree with you that paying attention to the memory hierarchy is becoming very important, especially with the upcoming ccNUMA systems (Opterons are already there, though). But the fact that it doesn't scale well is not about CSP itself; it is about the way it has been implemented. If the CSP system itself takes the memory hierarchy into account and uses no synchronisation (for example, using an inter-processor interrupt, IPI, to send a message to another core), CSP scales very well. Of course, the IPI mechanism requires a switch to kernel mode, which costs a lot. But this is necessary only if the destination thread is running on another core, and I don't think latency is very important in algorithms requiring a lot of CPUs.
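To make the "no locks on the channel" point concrete, here is a minimal sketch of a channel between exactly two threads that uses a ring buffer and two atomic indices instead of a mutex. It is a user-space stand-in, not the IPI mechanism itself. The names SpscChannel, try_send, try_recv and run_demo are made up for illustration, and the 64-byte cache line size is an assumption about the target CPU.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <thread>

// Hypothetical single-producer/single-consumer channel: exactly one thread
// may call try_send and exactly one thread may call try_recv.
template <typename T, std::size_t N>
class SpscChannel {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Producer side. Returns false when the ring is full.
    bool try_send(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail - head == N) return false;                // full
        slots_[tail & (N - 1)] = value;
        tail_.store(tail + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer side. Returns nullopt when the ring is empty.
    std::optional<T> try_recv() {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head == tail) return std::nullopt;             // empty
        T value = slots_[head & (N - 1)];
        head_.store(head + 1, std::memory_order_release);  // free the slot
        return value;
    }

private:
    std::array<T, N> slots_{};
    // Assumed 64-byte cache lines: keep the two indices apart so the
    // producer and the consumer do not false-share.
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
};

// Sends 0..count-1 from one thread to another and returns the sum received.
std::uint64_t run_demo(std::uint64_t count) {
    SpscChannel<std::uint64_t, 64> chan;
    std::uint64_t sum = 0;

    std::thread consumer([&] {
        for (std::uint64_t expected = 0; expected < count;) {
            if (auto v = chan.try_recv()) {
                assert(*v == expected);  // FIFO order is preserved
                sum += *v;
                ++expected;
            } else {
                std::this_thread::yield();
            }
        }
    });

    for (std::uint64_t i = 0; i < count; ++i) {
        while (!chan.try_send(i)) std::this_thread::yield();
    }
    consumer.join();
    return sum;
}
```

Each index is written by only one side, so neither thread ever waits on a lock; the remaining cost is the cache line traffic between the two cores, which is where the memory hierarchy comes back in.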
What do you think?

Phil;

