Hi Chetan,

So it looks like the next page can be accessed by sending a GET request to 
the URL specified by the next button. No special AJAX trickery needed.

You could try adding this to your parse method, but make sure to change the 
'return' in your for loop to a 'yield', otherwise it won't work.

    # 'hxs' is the Selector you already create at the top of parse()
    next_btn = hxs.xpath("//a[contains(text(), '>')]//@href")
    if next_btn:
        # Note that I wouldn't use the [0] syntax; there's a better way to
        # do this, but I'm just working with the code you've got here.
        href = next_btn.extract()[0]
        yield Request(url=href, callback=self.parse)

That ought to do the trick.
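
To see why the 'return' has to become a 'yield', here is a minimal sketch in plain Python 3 (no Scrapy involved). The function names, the dict items and the ("request", url) tuple are made-up stand-ins for your item and for Request; they are assumptions for illustration, not Scrapy API. Once a function contains a 'yield' it becomes a generator, and a 'return item' inside it just stops the generator without emitting anything. On Python 2 the same mix is rejected outright as a SyntaxError.

```python
def parse_mixed(containers, next_url):
    """'return' left in the loop, 'yield' added for the next page."""
    for container in containers:
        item = {"GameID": str(container)}
        return item  # inside a generator this only ends iteration
    if next_url:
        yield ("request", next_url)  # stand-in for Request(url=..., callback=...)


def parse_fixed(containers, next_url):
    """'yield' everywhere: every item is emitted, then the follow-up request."""
    for container in containers:
        item = {"GameID": str(container)}
        yield item
    if next_url:
        yield ("request", next_url)  # stand-in for Request(url=..., callback=...)


containers = [["10", "20"]]                    # hypothetical scraped ids
next_url = "http://example.com/search?page=2"  # hypothetical next-page link

broken = list(parse_mixed(containers, next_url))
fixed = list(parse_fixed(containers, next_url))

print(broken)
print(fixed)
```

With the mixed version nothing comes out at all, neither the item nor the next-page request, which is why the spider would appear to do nothing.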

Thanks,
Enrico

On Tuesday, December 2, 2014 9:07:31 AM UTC+8, Chetan Motamarri wrote:
>
> Hi All
>
> I need to extract the *ids of games* from "
> http://store.steampowered.com/search/?sort_by=Released_DESC&os=win#sort_by=Released_DESC
> ". 
>
> I was able to extract the game ids on the first page, but I have no idea 
> how to move to the next page and extract the ids on those pages. My 
> code is:
>
> class ScrapePriceSpider(BaseSpider):
>     
>     name = 'UpdateGames'     
>     allowed_domains = ['store.steampowered.com']  # domain only, no scheme
>     start_urls = [
>         'http://store.steampowered.com/search/?sort_by=Released_DESC&os=win#sort_by=Released_DESC&page=1'
>     ]
>     
>     def parse(self, response):
>         hxs = Selector(response)
>
>         path = hxs.xpath(".//div[@id='search_result_container']") 
>         item = ItemscountItem()
>  
>         for ids in path:
>             # extracting all game ids
>             gameIds = ids.xpath('.//a/@data-ds-appid').extract()
>
>             item["GameID"] = str(gameIds)
>             return item
>
> Like this, *my goal is to extract all game ids from the 353 pages listed 
> there.* I think Ajax is used for pagination. I was not able to extract game 
> ids from the 2nd page onwards. I tried putting 
> "http://store.steampowered.com/search/?sort_by=Released_DESC&os=win#sort_by=Released_DESC&page=2" 
> in start_urls, but it was no use.
>
>
> Please help me with this.
>
>
> Thanks
> Chetan Motamarri
