I think the code from snippets.scrapy.org causes log messages from the 
deferred to appear after the spider closes and the stats are printed. I 
cannot devote time to writing reproduction steps right now, but adding a 
delay to its deferred so that it finishes after the engine stops should do it.
Anyway, back to the topic: since all the middlewares (pipelines among them) 
run asynchronously, I thought process_item() does not need to create a 
deferred, because it is already called asynchronously.
However, I tried time.sleep() in process_item() and further yields from 
spider.parse() did block. Maybe it's a bug, or maybe CONCURRENT_ITEMS 
refers to something else.


On Wednesday, 28 May 2014 13:34:16 UTC+3, Dimitris Kouzis - Loukas wrote:
>
> Basically I assume that since the whole architecture is async, you 
> wouldn't want to block, e.g. to access a file or for any socket 
> operation. So if someone wants to do e.g. an API lookup, I guess it's 
> better to do it asynchronously. For example, this is the typical async 
> MySQL example: http://snipplr.com/view/66989/async-twisted-db-pipeline/ 
> ... it doesn't use Deferred but it is async. An example of a pipeline 
> that is itself async is the ImagesPipeline: 
> https://github.com/scrapy/scrapy/blob/master/scrapy/contrib/pipeline/media.py#L38
>
> On Wednesday, May 28, 2014 4:25:50 AM UTC-4, Nikolaos-Digenis Karagiannis 
> wrote:
>>
>> Why a deferred? Do you want to overcome this restriction 
>> http://doc.scrapy.org/en/latest/topics/settings.html#concurrent-items
>> in a specific pipeline or while processing a specific item?
>> I am asking because I inherited such a pipeline and I am still searching 
>> for a justification for deferring the item processing a second time.
>>
>>
>> On Wednesday, 28 May 2014 08:44:27 UTC+3, Dimitris Kouzis - Loukas wrote:
>>>
>>> Hello,
>>>
>>> Let's assume I have a middleware, e.g. a pipeline, that is async (uses 
>>> Deferred) and I would like to write some unit tests for it. What would 
>>> you suggest as a good way to organise the test code and reuse as much 
>>> of the Scrapy infrastructure as possible? Scrapy uses trial and I guess 
>>> it's a good idea to inherit from SiteTest, e.g. as in 
>>> scrapy/tests/test_command_fetch.py. Is this right?
>>>
>>> Thanks
>>>
>>>

-- 
You received this message because you are subscribed to the Google Groups 
"scrapy-users" group.