Hi,

If you look at the docs and feature list for Scrapy, you'll see it has a
whole range of scraping features beyond just selecting DOM elements.
E.g.:

http://doc.scrapy.org/en/latest/intro/overview.html#what-else

So for us, this would also cover things like throttling, configuration,
and distributing and managing jobs.

PhantomJS seems to be the way to go, from other people's comments.

Most of the full-featured Node frameworks seem to be inactive. E.g.:

https://github.com/chriso/node.io
https://github.com/sylvinus/node-crawler
https://github.com/mape/node-scraper

The only one I've found that is still actively maintained is Noodle,
which Matthew Page mentioned above:

https://github.com/dharmafly/noodle

Cheers,
Victor


On Sat, Jan 25, 2014 at 3:56 PM, Jamie Popkin <popk...@gmail.com> wrote:

> At the risk of sounding stupid... Can't you just use jQuery? It's got
> everything you need for fetching and parsing web content.
>
> --
> --
> Job Board: http://jobs.nodejs.org/
> Posting guidelines:
> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
> You received this message because you are subscribed to the Google
> Groups "nodejs" group.
> To post to this group, send email to nodejs@googlegroups.com
> To unsubscribe from this group, send email to
> nodejs+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/nodejs?hl=en?hl=en
>
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "nodejs" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/nodejs/0E76dy0mgwI/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> nodejs+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
