It looks like Scrapy just runs all the start_urls at the same time. How do I
tell Scrapy to start with url1, wait 30s, then fetch url2?
Here are my settings:
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_DEBUG = True
DOWNLOAD_DELAY = 60
DOWNLOAD_TIMEOUT = 30
CONCURRENT_REQUESTS_PER_DOMAIN = 1
AUTOTHROTTLE_START_DELAY = 10
And this is the spider:
start_urls = [
    "url1",
    "url2",
    "url3",
    "url4",
    "url5",
]
Here is the log:
2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url1> (referer: None)
2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url2> (referer: None)
2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url3> (referer: None)
2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url4> (referer: None)
2014-10-07 14:04:53-0600 [craigslist_spider] DEBUG: Crawled (200) <GET url5> (referer: None)
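To make the timing I'm after concrete, here is a small standalone sketch in plain Python (it is not Scrapy code and does not use any Scrapy API). The helper name expected_schedule is made up for illustration, and the 30-second gap is the delay I would like to see, not something my current settings produce.

```python
from datetime import datetime, timedelta


def expected_schedule(urls, first_fetch, delay_seconds):
    """Return (url, fetch_time) pairs spaced delay_seconds apart.

    Hypothetical helper: it only describes the desired timing,
    it does not fetch anything.
    """
    return [
        (url, first_fetch + timedelta(seconds=i * delay_seconds))
        for i, url in enumerate(urls)
    ]


if __name__ == "__main__":
    urls = ["url1", "url2", "url3", "url4", "url5"]
    # Timestamp of the first line in the log above.
    first_fetch = datetime(2014, 10, 7, 14, 4, 53)
    for url, when in expected_schedule(urls, first_fetch, 30):
        print(when.strftime("%H:%M:%S"), url)
```

So instead of five requests all stamped 14:04:53, I would expect each one to start 30 seconds after the previous one.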