Re: Extracting data from a table with multiple pages

Chetan Motamarri Thu, 02 Oct 2014 20:06:12 -0700

Hi Travis,

Thanks for your reply.

Using PhantomJS, can I crawl the data without opening the webpage? What I
mean is, if I write an automated script in Selenium to pull the data, each
page will open in a browser and the data is pulled from it, right?

This method will take a lot of time to pull the data (as every page has to
be opened in a browser and then crawled). I need to pull the discussions for
all games like this, so I am worried about the time too.

Is PhantomJS also like this (i.e. like Selenium), or is it different?

Thanks again. 

On Thursday, October 2, 2014 7:38:56 PM UTC-7, Chetan Motamarri wrote:
>
> Hi,
>
> I need to extract the start date & time of all discussions (on all pages)
> from this URL: "http://steamcommunity.com/workshop/discussions/?appid=570".
> I have tried a lot of ways but could not do it.
>
> Here the discussions change dynamically (i.e. when page 2 is clicked, I
> was not able to find the 2nd page's discussions in the source code).
>
> My idea is that if there is a source file for the discussions (like
> .xml/.json), then we can pull the data directly from that source. But I
> was not able to find the location of such a source file.
>
> How do I use Scrapy here?
>

