Hello,

I have a method that drills down into menus and, when it reaches an index 
page, passes that page to a method that extracts the products from the 
index.  The problem is that sometimes I land on the last menu before the 
index, and sometimes on the index page itself.  I have tackled the issue 
this way:

  def drill_down(self, response):
    hxs = HtmlXPathSelector(response)

    xpath = (".//dl[@id='categories_menu']"
             "//dt[contains(text(),'Category')]"
             "/following-sibling::dd[count(preceding-sibling::dt) = 1]"
             "//a/@href")
    if hxs.select(xpath):
      for submenu in hxs.select(xpath).extract():
        self.log('dl index page: %s' % self.get_abs_url(submenu),
                 level=log.DEBUG)
        yield Request(url=self.get_abs_url(submenu),
                      callback=self.get_products_from_index)
    else:
      self.log('reg index page: %s' % response.url, level=log.DEBUG)

      # ERROR: Spider must return Request, BaseItem or None, got 'generator'
      yield self.get_products_from_index(response)
      # SyntaxError: 'return' with argument inside generator
      #return self.get_products_from_index(response)
      # gets filtered as duplicate request
      #yield Request(url=response.url,
      #              callback=self.get_products_from_index)


Anyone know the best way to deal with this?

Thanks,

Bill
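
One way this might be handled, sketched here without Scrapy so it runs on 
its own: rather than yielding the generator that get_products_from_index 
returns, loop over it and re-yield each element.  The page dictionaries and 
both function bodies below are hypothetical stand-ins (assumptions, not the 
real spider); only the delegation in the else branch is the point.

```python
def get_products_from_index(page):
    """Hypothetical stand-in for the real callback: one item per product."""
    for name in page["products"]:
        yield {"product": name, "source": page["url"]}


def drill_down(page):
    """Yield submenu URLs if there are any, otherwise the index page's items."""
    if page["submenus"]:
        for url in page["submenus"]:
            # In the real spider this would be a Request with a callback.
            yield url
    else:
        # Re-yield each element instead of yielding the generator object.
        for item in get_products_from_index(page):
            yield item


# Hypothetical sample pages, used only to exercise both branches.
menu_page = {"url": "http://example.com/menu",
             "submenus": ["http://example.com/index"],
             "products": []}
index_page = {"url": "http://example.com/index",
              "submenus": [],
              "products": ["widget", "gadget"]}

menu_results = list(drill_down(menu_page))
index_results = list(drill_down(index_page))
```

On Python 3.3 and later, "yield from get_products_from_index(page)" does 
the same thing in one line.  The third commented-out attempt would probably 
also get past the duplicate filter if dont_filter=True were passed to 
Request, though that fetches the same page a second time.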
