I am building a crawling spider to crawl a community of web sites, and I need to access and store the current depth of the crawl. For example, the start URLs would have a depth of 0, each URL crawled from a start URL would have a depth of 1, and so on up to DEPTH_LIMIT. The current depth needs to be stored with each item processed by the spider.
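For illustration, here is a rough sketch of what I'm after; the spider name and URL are made up, and the response.meta['depth'] lookup is just my best guess based on reading about DepthMiddleware:

import scrapy

class CommunitySpider(scrapy.Spider):
    name = "community"  # made-up name for illustration
    start_urls = ["http://example.com/"]  # these should be depth 0
    custom_settings = {"DEPTH_LIMIT": 3}

    def parse(self, response):
        # my assumption: DepthMiddleware records the depth of each
        # request in response.meta['depth'], with start URLs at 0
        depth = response.meta.get("depth", 0)
        # store the current crawl depth with every item
        yield {"url": response.url, "depth": depth}
        # followed links should presumably come back with depth + 1
        for href in response.css("a::attr(href)").getall():
            yield response.follow(href, callback=self.parse)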
Does DEPTH_STATS provide a current depth value, or is there another mechanism in place that I can leverage, or must this be implemented as part of the spider itself? Thanks
