Sometimes I have to visit one URL first to populate a cookie, then fetch the 
other URLs.
The solution is to have only the first URL in start_urls, then have the 
parse method return Requests for the other URLs to visit.

On Monday, October 13, 2014 at 14:56:30 UTC-3, Nicolás 
Alejandro Ramírez Quiros wrote:
>
> If they are from different domains, override start_requests and use 
> meta['download_slot'] = <some_name>
>
> On Tuesday, October 7, 2014 at 18:17:11 UTC-2, [email protected] 
> wrote:
>>
>> It looks like Scrapy just fetches all start_urls at the same time. How do I 
>> tell Scrapy to start with url1, wait 30s, then fetch url2?
>>
>> Here is my setting:
>>
>> AUTOTHROTTLE_ENABLED = True
>> AUTOTHROTTLE_DEBUG = True
>>
>> DOWNLOAD_DELAY = 60
>> DOWNLOAD_TIMEOUT = 30
>> CONCURRENT_REQUESTS_PER_DOMAIN = 1
>> AUTOTHROTTLE_START_DELAY = 10
>>
>>  
>> And this is the spider:
>>
>>     start_urls = [
>>         "url1",
>>         "url2",
>>         "url3",
>>         "url4",
>>         "url5",
>>      ]
>>
>>
>> Here is the log:
>>
>> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET 
>> url1> (referer: None)
>> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET 
>> url2> (referer: None)
>> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET 
>> url3> (referer: None)
>> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET 
>> url4> (referer: None)
>> 2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET 
>> url5> (referer: None)
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"scrapy-users" group.
Visit this group at http://groups.google.com/group/scrapy-users.