Hi,

I am using the PUB/SUB socket pattern to distribute commands from the
coordinator to many worker processes, and PUSH/PULL to have each worker
process push its processing results back to the coordinator. The
coordinator binds both the PUB socket and the PULL socket, and its context
is currently set to 1 I/O thread. In my test environment there is a single
coordinator process and up to 200 worker processes.

I have just started scalability testing. With 15 worker processes, the
end-to-end communication latency is about 15 ms for the coordinator to
distribute the commands (via PUB) and aggregate the results back from the
worker processes (via PULL). When I increased the number of worker
processes to 50, the end-to-end latency rose to about 80 ms. So the latency
grows as the number of worker processes grows, which points to a
scalability issue.
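
As a quick sanity check on the two figures above, the small Python
sketch below works out the per-worker cost and the growth ratios. It uses
only the two reported data points, so it is a rough indication rather than
a fitted trend.

```python
# Round-trip latency in ms per worker count, as reported above.
measured_ms = {15: 15.0, 50: 80.0}

# Average latency attributable to each worker at each scale.
per_worker_ms = {n: ms / n for n, ms in measured_ms.items()}

# How much the worker count grew versus how much the latency grew.
worker_ratio = 50 / 15
latency_ratio = measured_ms[50] / measured_ms[15]

print(per_worker_ms)
print(round(worker_ratio, 2), round(latency_ratio, 2))
```

The per-worker cost goes from about 1.0 ms to about 1.6 ms, and roughly
3.3 times the workers gives roughly 5.3 times the latency, so between these
two points the growth is somewhat faster than linear.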

The messages exchanged between the coordinator and the worker processes
are small, less than 100 bytes each.

While I am planning to measure the latency of each hop, I would like to
ask for suggestions:

*For a large number of worker processes to be handled by a single
coordinator with low latency, should the context at the coordinator be set
to more than 1 I/O thread?

*Should I use another socket pattern such as ROUTER/DEALER, instead of
PUB/SUB and PUSH/PULL, to address the scalability issue?
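
For the per-hop measurement mentioned above, one possible approach is to
carry timestamps inside the messages themselves. The Python sketch below
is only an illustration: the wire format and the function names are
hypothetical, not part of the current setup. It assumes the coordinator
and workers run on the same host, so that wall-clock timestamps are
comparable (across hosts the clocks would need to be synchronized). The
socket calls are left out; only the bookkeeping is shown.

```python
import struct
import time

# Hypothetical wire format (an assumption, not the existing protocol):
# a command id followed by three wall-clock timestamps in nanoseconds.
# This header is 28 bytes, so messages stay under 100 bytes.
STAMPS = struct.Struct("<IQQQ")


def make_command(cmd_id):
    """Coordinator: build the command right before sending it on PUB."""
    return STAMPS.pack(cmd_id, time.time_ns(), 0, 0)


def make_result(command, t_recv_ns, payload=b""):
    """Worker: t_recv_ns should be taken right after the SUB receive returns."""
    cmd_id, t_pub, _, _ = STAMPS.unpack_from(command)
    t_push = time.time_ns()  # taken right before the PUSH send
    return STAMPS.pack(cmd_id, t_pub, t_recv_ns, t_push) + payload


def hop_latencies_ns(result, t_pull_ns):
    """Coordinator: t_pull_ns should be taken right after the PULL receive."""
    cmd_id, t_pub, t_recv, t_push = STAMPS.unpack_from(result)
    return {
        "cmd_id": cmd_id,
        "pub_to_worker": t_recv - t_pub,
        "worker_processing": t_push - t_recv,
        "worker_to_pull": t_pull_ns - t_push,
        "end_to_end": t_pull_ns - t_pub,
    }


if __name__ == "__main__":
    # Local dry run without any sockets, just to exercise the bookkeeping.
    command = make_command(1)
    result = make_result(command, time.time_ns(), b"ok")
    print(hop_latencies_ns(result, time.time_ns()))
```

Collecting these per-hop figures for every worker at 15 and at 50 workers
should show whether the extra latency sits in the PUB fan-out, in the
workers, or in the PULL fan-in.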

Regards,

Jun
_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
