If I were tasked with writing spiders that scrape based on another spider's activity, I would let one spider run to completion, persist its data, then read that data into the next spider.
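For the persisted-data approach, a minimal sketch of the second spider could look like the following. It assumes the first spider wrote a JSON Lines feed export to first_spider_output.jl and that its items carry a detail_url field; both names are made up here.

import json

import scrapy


class SecondSpider(scrapy.Spider):
    # Reads the items the first spider persisted and builds its own
    # requests from them.
    name = "second_spider"

    def start_requests(self):
        with open("first_spider_output.jl") as f:
            for line in f:
                item = json.loads(line)
                # detail_url is a hypothetical field from the first spider's output
                yield scrapy.Request(item["detail_url"], callback=self.parse)

    def parse(self, response):
        yield {"url": response.url, "title": response.css("title::text").get()}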
If it were for some reason critical that each item be processed immediately, then I would write one spider, allow all the relevant domains, and use logic to route the requests: maybe have a bunch of methods that scrape the sites and call a router method when they finish. The router inspects the item and calls the next required scraping method; when the item isn't routed anywhere else, it finally gets sent to the pipeline. Nice and simple.
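Here is a rough sketch of that router pattern, with route() written as a plain helper the scraping callbacks delegate to. All domains, field names and selectors below are placeholders, not anything Scrapy-specific.

import scrapy


class RoutingSpider(scrapy.Spider):
    # One spider allowed on every relevant domain; each scraping method
    # hands its partially built item to the router, which decides whether
    # another site still needs to be scraped or the item is finished.
    name = "routing_spider"
    allowed_domains = ["site-a.example", "site-b.example"]
    start_urls = ["https://site-a.example/products"]

    def parse(self, response):
        # First site: start the item, then let the router decide the next hop.
        item = {"name": response.css("h1::text").get(), "price": None}
        yield from self.route(item)

    def parse_price(self, response, item):
        # Second site: enrich the item passed along via cb_kwargs, then route again.
        item["price"] = response.css(".price::text").get()
        yield from self.route(item)

    def route(self, item):
        # Inspect the item and pick the next scraping method, if any.
        if item.get("price") is None:
            yield scrapy.Request(
                f"https://site-b.example/prices?q={item['name']}",
                callback=self.parse_price,
                cb_kwargs={"item": item},
            )
        else:
            # Nothing left to fetch: the finished item goes to the item pipelines.
            yield item

cb_kwargs keeps the partially built item attached to each request, so every callback can keep filling it in until the router has nothing left to fetch and yields the item to the pipeline.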
