Hi Chetan,

SteamCommunity.com appears to load the discussion data via AJAX, so it is
not in the HTML source that the server returns.  AFAIK, Scrapy won't get
you this on its own, since it does not execute JavaScript (at least, not
without serious modification).

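In order to scrape JavaScript-loaded data, you'll need something that
actually runs the page's scripts, such as a headless browser that you can
drive from Python.  Common choices are PhantomJS (a headless browser) and
Selenium (a browser automation tool with Python bindings).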
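
A rough sketch of the headless-browser approach, in case it helps.  It is
a starting point, not a tested scraper for Steam.  Assumptions on my part:
it uses Selenium with headless Firefox (swapped in for PhantomJS), the
helper names are made up for illustration, and "topic_date" is a
hypothetical class name; inspect the rendered page to find the real one.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def fetch_rendered_html(url):
    """Load url in headless Firefox and return the DOM after scripts ran."""
    # Imported lazily so the parsing helper below works without Selenium.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        # Content fetched by later AJAX calls may need an explicit wait here.
        return driver.page_source
    finally:
        driver.quit()


class ClassTextCollector(HTMLParser):
    """Collect the text inside every element carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0    # greater than 0 while inside a matching element
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1
        elif self.css_class in classes:
            self.depth = 1
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.texts[-1] += data


def extract_texts(html, css_class):
    """Return the stripped text of each element that has css_class."""
    parser = ClassTextCollector(css_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.texts]


# Usage (needs Selenium and Firefox installed; "topic_date" is a guess):
#   url = "http://steamcommunity.com/workshop/discussions/?appid=570"
#   print(extract_texts(fetch_rendered_html(url), "topic_date"))
```

To cover the other pages you would also have to make the browser click
through the pagination and repeat the extraction on each page.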

There is also a lightweight package, dryscrape, that seems a bit simpler
than rolling your own.  I've never used it, so I cannot speak to its
quality, but the documentation appears decent.  You might try this first.
Check out http://dryscrape.readthedocs.org/en/latest/ for more information.

-Travis

On Thu, Oct 2, 2014 at 7:38 PM, Chetan Motamarri <[email protected]> wrote:

> Hi,
>
> I need to extract the start date & time of all discussions (on all pages)
> from this URL: "http://steamcommunity.com/workshop/discussions/?appid=570".
> I have tried a lot of approaches but can't get it to work.
>
> Here the discussions are loaded dynamically (i.e. when page 2 is clicked,
> I was not able to find the 2nd page's discussions in the source code).
>
> My idea is that if there is a source file for the discussions (like
> .xml/.json), we could pull the data directly from it. But I was not able
> to find the location of such a file.
>
> How can I use Scrapy here?
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>
