How to scrape Ajax content

Chetan Motamarri Mon, 01 Dec 2014 17:07:50 -0800

Hi All

I need to extract the *ids of games* listed at
http://store.steampowered.com/search/?sort_by=Released_DESC&os=win#sort_by=Released_DESC


I was able to extract the game ids from the first page, but I have no 
idea how to move to the next page and extract the ids there. My 
code is:

# Scrapy 0.24-era imports
from scrapy.spider import BaseSpider
from scrapy.selector import Selector
# ItemscountItem is the item class defined in this project's items.py


class ScrapePriceSpider(BaseSpider):

    name = 'UpdateGames'
    allowed_domains = ['store.steampowered.com']  # domain only, no scheme
    start_urls = [
        'http://store.steampowered.com/search/?sort_by=Released_DESC&os=win#sort_by=Released_DESC&page=1'
    ]

    def parse(self, response):
        hxs = Selector(response)

        # container that holds the search results
        path = hxs.xpath(".//div[@id='search_result_container']")
        item = ItemscountItem()

        for ids in path:
            # extracting all game ids
            gameIds = ids.xpath('.//a/@data-ds-appid').extract()

            item["GameID"] = str(gameIds)
            return item

*My goal is to extract all the game ids from the 353 pages listed there.* I 
think Ajax is used for pagination, because I was not able to extract game 
ids from the 2nd page onwards. I tried putting
"http://store.steampowered.com/search/?sort_by=Released_DESC&os=win#sort_by=Released_DESC&page=2"
in start_urls, but it made no difference.
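
One thing that can be checked without running the spider: in the URL above, 
"page=2" sits after the "#", so it is part of the URL fragment, and a client 
does not send the fragment to the server. The sketch below only illustrates 
that split with the standard library. The helper name page_url is made up 
for this example, and whether the Steam search page accepts "page" as an 
ordinary query parameter is an assumption that has not been verified here.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SEARCH_URL = ("http://store.steampowered.com/search/"
              "?sort_by=Released_DESC&os=win#sort_by=Released_DESC&page=2")


def page_url(url, page):
    """Hypothetical helper: drop the fragment and put `page` in the query string.

    Assumes the server reads `page` from the query string (not verified).
    """
    parts = urlsplit(url)
    # keep the existing query parameters, replacing any existing `page`
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "page"]
    query.append(("page", str(page)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), ""))


parts = urlsplit(SEARCH_URL)
print(parts.query)     # the part that is sent to the server
print(parts.fragment)  # the part that stays on the client side

# candidate URLs for the 353 pages mentioned above
urls = [page_url(SEARCH_URL, n) for n in range(1, 354)]
print(urls[1])
```

If the assumption holds, a list like urls could be used as start_urls. If it 
does not, the request the browser makes when the next page is clicked would 
have to be inspected instead.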


Please help me with this.


Thanks
Chetan Motamarri












    
