Hi,
If you look at the docs and feature list for Scrapy, you'll see it has a
whole bunch of scraping features beyond just selecting a DOM element.
E.g.:
http://doc.scrapy.org/en/latest/intro/overview.html#what-else
So for us, this would also cover handling things like throttling,
configuratio
At the risk of sounding stupid... Can't you just use jQuery? It's got
everything you need for fetching and parsing web content.
On Jan 24, 2014, at 5:18 PM, Alexey Petrushin wrote:
> I meant interactive control of phantom.js via child_process (not just issuing
> one command by supplying argv when it starts). Is it possible?
>
I haven’t had the need, so this is not from experience, but you may be able to
use stdin/stdout
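
For the stdin/stdout idea, here is a minimal sketch of line-based control over a child process from Node. It is untested against PhantomJS itself: the child here is a stand-in Node script that just echoes each line back, and the names sendCommands and CHILD_SCRIPT are made up for illustration. A real setup would assume the PhantomJS script runs its own loop reading commands from stdin.

```javascript
const { spawn } = require('child_process');
const readline = require('readline');

// Stand-in child (assumption): a tiny Node script that echoes each stdin line.
// In practice this would be a PhantomJS script with its own read loop.
const CHILD_SCRIPT = `
  const rl = require('readline').createInterface({ input: process.stdin });
  rl.on('line', (line) => console.log('echo: ' + line));
`;

// Hypothetical helper: send commands one line at a time, waiting for one
// reply line from the child before sending the next command.
function sendCommands(commands) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', CHILD_SCRIPT], {
      stdio: ['pipe', 'pipe', 'inherit'],
    });
    const replies = [];
    const rl = readline.createInterface({ input: child.stdout });

    rl.on('line', (line) => {
      replies.push(line);
      if (replies.length < commands.length) {
        child.stdin.write(commands[replies.length] + '\n');
      } else {
        child.stdin.end(); // closing stdin lets the child exit
      }
    });
    child.on('error', reject);
    child.on('close', () => resolve(replies));

    if (commands.length === 0) {
      child.stdin.end();
    } else {
      child.stdin.write(commands[0] + '\n');
    }
  });
}
```

Calling sendCommands(['open', 'render']).then(console.log) would drive the child with two commands in sequence. The one-reply-per-command protocol is only a convention chosen for this sketch; a crashing child would still need restart handling on top of it.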
On Jan 16, 2014, at 11:19 AM, Matt wrote:
>
> node-phantom-simple on npm.
>
> This was developed after 12 months of trying different phantom modules,
> having weird failures with each (some don't work under cluster, some don't
> work under load, some just randomly fail). It's used in productio
On Thu, Jan 16, 2014 at 11:11 AM, // ravi wrote:
> On Jan 16, 2014, at 10:44 AM, Matt wrote:
>
> And PhantomJS crashes randomly a lot (well, "a lot" depends on how much
> you're doing), so you have to deal with that. And the libraries for
> controlling it all suck except for the one I wrote (obv
And PhantomJS crashes randomly a lot (well, "a lot" depends on how much
you're doing), so you have to deal with that. And the libraries for
controlling it all suck except for the one I wrote (obviously I state that
completely unbiasedly! /s).
But no, I don't know of anything that deals with all th
You can also use scraper for web scraping. Have a look at it.
https://nodejsmodules.org/tags/scraping
On Thu, Jan 16, 2014 at 9:14 PM, Matt wrote:
> And PhantomJS crashes randomly a lot (well, "a lot" depends on how much
> you're doing), so you have to deal with that. And the libraries for
>
On Jan 16, 2014, at 10:44 AM, Matt wrote:
> And PhantomJS crashes randomly a lot (well, "a lot" depends on how much
> you're doing), so you have to deal with that. And the libraries for
> controlling it all suck except for the one I wrote (obviously I state that
> completely unbiasedly! /s).
>
On Jan 15, 2014, at 9:09 PM, Victor Hooi wrote:
>
> I'm wondering if anybody knows of any web-scraping frameworks in Node.JS?
>
> Previously, there was node.io (https://github.com/chriso/node.io), however,
> the project was recently discontinued.
>
> Googling for Node.JS and web scraping, most
Hi,
I'm wondering if anybody knows of any web-scraping frameworks in Node.JS?
Previously, there was node.io (https://github.com/chriso/node.io), however,
the project was recently discontinued.
Googling for Node.JS and web scraping, most of the guides online just talk
about using requests and c