Item pipelines can return a (Twisted) Deferred, so you could run the CPU-intensive task in a thread (see Twisted's deferToThread function). You'll need to read a bit about Twisted Deferreds, but I think it's worth the effort.
That said, Python multi-threading is not well suited to CPU-intensive tasks (unless the library you're using has special support for releasing the GIL <https://wiki.python.org/moin/GlobalInterpreterLock>, which is unlikely). If you plan to run this on a multi-core machine, it might be better to use multiprocessing <https://docs.python.org/2/library/multiprocessing.html>, or the analogous Twisted facility for managing processes <http://twistedmatrix.com/documents/13.2.0/core/howto/process.html>.

Still, from Scrapy's point of view, this would be triggered from a pipeline returning a Deferred (one that fires when the sub-process task finishes), which allows Scrapy's main thread to continue doing its work (downloading, running callbacks, parsing pages).

Hope this helps, good luck!

On Fri, May 23, 2014 at 12:16 PM, James Ford <[email protected]> wrote:

> Hello,
>
> Where in the scrapy architecture is it best to conduct some long running
> operation?
>
> Right now i'm doing some cpu-intensive document conversion in an
> item-pipeline but I don't like the performance. Maybe I should do it in the
> downloader?
>
> Thanks,
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
