On 04:22 pm, m...@privacy.net wrote:
Hello,

What would be best practice for speeding up a large number of HTTP GET requests made via urllib? So far they are made in sequence, each request taking up to one second. The results must be merged into a list; the original order need not be kept.

I think speed could be improved by parallelizing. One could use multiple threads. Are there any Python best practices, or even existing modules, for creating and handling a task queue with a fixed number of concurrent threads?

Using multiple threads is one approach. There are a few thread pool implementations lying about; one is part of Twisted, <http://twistedmatrix.com/documents/current/api/twisted.python.threadpool.ThreadPool.html>.
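
For illustration, here is a minimal thread-pool sketch. It uses the standard library's concurrent.futures (Python 3.2+) and urllib.request rather than Twisted's ThreadPool, and the URL list is hypothetical:

    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen

    # Hypothetical list of URLs to fetch.
    urls = ["http://example.com/a", "http://example.com/b"]

    def fetch(url):
        # Each worker thread blocks on its own request.
        with urlopen(url) as response:
            return response.read()

    # A fixed number of worker threads; map() yields the results
    # in input order, which is harmless here since order doesn't
    # matter.
    with ThreadPoolExecutor(max_workers=10) as pool:
        results = list(pool.map(fetch, urls))

max_workers caps the number of requests in flight at once, which answers the "fixed number of concurrent threads" part of the question directly.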

Another approach is to use non-blocking or asynchronous I/O to make multiple requests without using multiple threads. Twisted can help you out with this, too. There are two async HTTP client APIs available. The older one:

http://twistedmatrix.com/documents/current/api/twisted.web.client.getPage.html
http://twistedmatrix.com/documents/current/api/twisted.web.client.HTTPClientFactory.html
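
A sketch with the older API might look like this (untested; getPage() returns a Deferred that fires with the response body, and DeferredList collects the lot -- the URLs are placeholders):

    from twisted.internet import defer, reactor
    from twisted.web.client import getPage

    # Hypothetical list of URLs to fetch.
    urls = ["http://example.com/a", "http://example.com/b"]

    def done(results):
        # DeferredList fires with (success, value) pairs; keep the
        # bodies of the requests that succeeded.
        bodies = [value for success, value in results if success]
        print(len(bodies), "responses fetched")
        reactor.stop()

    # Issue all requests at once; each getPage() returns a
    # Deferred that fires with the response body.
    dl = defer.DeferredList([getPage(url) for url in urls])
    dl.addCallback(done)
    reactor.run()

This issues every request immediately; to keep a fixed bound on the number of outstanding requests, twisted.internet.defer.DeferredSemaphore could be used to gate the calls.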

And the newer one, introduced in Twisted 9.0:

http://twistedmatrix.com/documents/current/api/twisted.web.client.Agent.html
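
With the newer API, something along these lines should work. Again a sketch: Agent.request() fires with a response object, and collecting the body takes an extra step. The readBody helper used here was added in a Twisted release later than 9.0; on 9.0 itself you would deliver the body to a Protocol by hand.

    from twisted.internet import defer, reactor
    from twisted.web.client import Agent, readBody

    agent = Agent(reactor)

    def fetch(url):
        # request() fires with a response object; readBody then
        # collects the full response body.
        d = agent.request(b"GET", url)
        d.addCallback(readBody)
        return d

    # Hypothetical list of URLs to fetch (bytes, as modern
    # Twisted expects).
    urls = [b"http://example.com/a", b"http://example.com/b"]

    # gatherResults fires with a list of all the bodies.
    d = defer.gatherResults([fetch(url) for url in urls])

    def done(bodies):
        print(len(bodies), "responses fetched")
        reactor.stop()

    def failed(failure):
        failure.printTraceback()
        reactor.stop()

    d.addCallbacks(done, failed)
    reactor.run()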

Jean-Paul