Item pipelines can return a (Twisted) Deferred, so you could run the
CPU-intensive task in a thread (see Twisted's deferToThread function). You'll
need to read a bit about Twisted Deferreds, but I think it's worth the
effort.
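
To make that concrete, here is a rough sketch of what such a pipeline could
look like. It is only an illustration: convert_document, ConversionPipeline
and the "body"/"converted" item fields are names I made up, so adapt them to
your own project.

```python
def convert_document(text):
    """Stand-in for the CPU-intensive conversion (hypothetical)."""
    return text.upper()


class ConversionPipeline(object):
    """Item pipeline that runs the conversion in Twisted's thread pool."""

    def process_item(self, item, spider):
        # Imported here so convert_document can be unit-tested on its own,
        # even where Twisted is not installed.
        from twisted.internet.threads import deferToThread

        # deferToThread returns a Deferred immediately; the conversion runs
        # in a worker thread while the reactor thread keeps going.
        d = deferToThread(convert_document, item["body"])
        # addCallback passes the thread's result first, then the extra args.
        d.addCallback(self._store_result, item)
        return d

    def _store_result(self, converted, item):
        # Runs back in the reactor thread once the worker has finished.
        item["converted"] = converted
        return item
```

The Deferred fires with the item itself, so the next pipeline stage receives
it as usual.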

That said, Python multi-threading is not well suited to CPU-intensive
tasks (unless the library you're using has special support for releasing
the GIL <https://wiki.python.org/moin/GlobalInterpreterLock>, which is
unlikely). If you plan to run this on a multi-core machine, it might be
better to consider using multiprocessing
<https://docs.python.org/2/library/multiprocessing.html>, or the analogous
Twisted facility for managing processes
<http://twistedmatrix.com/documents/13.2.0/core/howto/process.html>. Still,
from Scrapy's point of view, this would be triggered from a pipeline
returning a Deferred (one that fires when the sub-process task finishes),
which allows Scrapy's main thread to continue doing its work (downloading,
running callbacks, parsing pages).
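
For the multiprocessing route, the core of it might look something like the
sketch below. It only covers the standard-library side (handing work to a
pool and getting a callback when it finishes); convert_document and
convert_in_subprocess are hypothetical names, and wiring the callback to a
Deferred in your pipeline is left out.

```python
import multiprocessing


def convert_document(text):
    """Stand-in for the CPU-intensive conversion (hypothetical)."""
    return text.upper()


def convert_in_subprocess(pool, text, on_done):
    """Queue one conversion on the pool; on_done(result) is called later.

    The callback runs on a helper thread in the parent process, so a
    pipeline would have to pass the result back to the reactor thread
    (for instance with reactor.callFromThread) before firing its Deferred.
    """
    return pool.apply_async(convert_document, (text,), callback=on_done)


if __name__ == "__main__":
    results = []
    pool = multiprocessing.Pool(processes=2)
    try:
        handle = convert_in_subprocess(pool, "some document", results.append)
        handle.wait(timeout=30)  # block only for this demo
    finally:
        pool.close()
        pool.join()
    print(results)
```

Because the work happens in separate processes, each conversion gets its own
interpreter and its own GIL, which is the whole point of going this way
instead of threads.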

Hope this helps, good luck!


On Fri, May 23, 2014 at 12:16 PM, James Ford <[email protected]> wrote:

> Hello,
>
> Where in the Scrapy architecture is it best to conduct some long-running
> operation?
>
> Right now I'm doing some CPU-intensive document conversion in an
> item pipeline, but I don't like the performance. Maybe I should do it in the
> downloader?
>
> Thanks,
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>
