Hi Chetan,
No, that code should go into the parse function, effectively making it
recursive. The full parse() function (as you provided it, at least) should
look like this:
    def parse(self, response):
        hxs = Selector(response)
        path = hxs.xpath(".//div[@id='search_result_container']")
        for ids in path:
            item = ItemscountItem()  # create the item inside the loop so each yield is a fresh item
            gameIds = ids.xpath('.//a/@data-ds-appid').extract()  # extract all game ids
            item["GameID"] = str(gameIds)
            yield item  # use yield, because return and yield can't exist in the same function
        next_btn = hxs.xpath("//a[contains(text(), '>')]//@href")
        if next_btn:
            href = next_btn.extract()[0]  # note that I wouldn't use the [0] syntax; there's a better way, but I'm just working with the code you've got here
            yield Request(url=href, callback=self.parse)  # this should grab the next page if I'm not mistaken
That should do the trick. I've written spiders in this manner before and they
work fine. There are other ways to do it, but this one seems the most
straightforward given your code.
Hope this helps,
Enrico
On Tuesday, December 2, 2014 2:47:45 PM UTC+8, Chetan Motamarri wrote:
>
> Hi Enrico,
>
> I didn't get that explanation. Sorry about that. Should I write your code
> in a separate function ?
>
> Could you please help in writing code for this spider.
>
--
You received this message because you are subscribed to the Google Groups
"scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.