Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-02-09 Thread Victor Hooi
Hi, If you look at the docs and feature list for Scrapy, you'll see it has a whole bunch of scraping features more than just selecting a DOM element. E.g.: http://doc.scrapy.org/en/latest/intro/overview.html#what-else So for us, this would also cover handling things like throttling, configuratio
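For readers wondering what "throttling" could look like in plain Node, here is a minimal sketch. It is not Scrapy's algorithm and not taken from any library mentioned in this thread. The helper name `runThrottled` and its `concurrency` and `delayMs` options are made up for illustration. It caps how many tasks run at once and optionally pauses each worker after a task.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper (illustration only): run an array of functions that
// return promises, with at most `concurrency` in flight at any time.
// Results are returned in the same order as the input tasks.
async function runThrottled(tasks, { concurrency = 2, delayMs = 0 } = {}) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task nobody has claimed yet

  async function worker() {
    while (next < tasks.length) {
      const index = next++; // claimed synchronously, before any await
      results[index] = await tasks[index]();
      if (delayMs > 0) await sleep(delayMs); // politeness pause per worker
    }
  }

  const workerCount = Math.min(concurrency, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each task would typically wrap one HTTP request. A rejected task rejects the whole run, so a real scraper would probably want retries or per-task error handling on top of this.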

Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-02-07 Thread Jamie Popkin
At the risk of sounding stupid... Can't you just use jQuery? It's got everything you need for fetching and parsing web content.

Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-01-24 Thread // ravi
On Jan 24, 2014, at 5:18 PM, Alexey Petrushin wrote: > I meant interactive control of phantom.js via child_process (not issuing just > one command by supplying argv when it start) is it possible? > I haven’t had the need, so this is not from experience, but you may be able to use stdin/stdout

Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-01-16 Thread // ravi
On Jan 16, 2014, at 11:19 AM, Matt wrote: > > node-phantom-simple on npm. > > This was developed after 12 months of trying different phantom modules, > having weird failures with each (some don't work under cluster, some don't > work under load, some just randomly fail). It's used in productio

Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-01-16 Thread Matt
On Thu, Jan 16, 2014 at 11:11 AM, // ravi wrote: > On Jan 16, 2014, at 10:44 AM, Matt wrote: > > And PhantomJS crashes randomly a lot (well, "a lot" depends on how much > you're doing), so you have to deal with that. And the libraries for > controlling it all suck except for the one I wrote (obv

Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-01-16 Thread Matt
And PhantomJS crashes randomly a lot (well, "a lot" depends on how much you're doing), so you have to deal with that. And the libraries for controlling it all suck except for the one I wrote (obviously I state that completely unbiasedly! /s). But no, I don't know of anything that deals with all th

Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-01-16 Thread Arvind Gupta
You can also use scraper for web scraping. Have a look at it. https://nodejsmodules.org/tags/scraping On Thu, Jan 16, 2014 at 9:14 PM, Matt wrote: > And PhantomJS crashes randomly a lot (well, "a lot" depends on how much > you're doing), so you have to deal with that. And the libraries for >

Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-01-16 Thread // ravi
On Jan 16, 2014, at 10:44 AM, Matt wrote: > And PhantomJS crashes randomly a lot (well, "a lot" depends on how much > you're doing), so you have to deal with that. And the libraries for > controlling it all suck except for the one I wrote (obviously I state that > completely unbiasedly! /s). >

Re: [nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-01-15 Thread // ravi
On Jan 15, 2014, at 9:09 PM, Victor Hooi wrote: > > I'm wondering if anybody knows of any web-scraping frameworks in Node.JS? > > Previously, there was node.io (https://github.com/chriso/node.io), however, > the project was recently discontinued. > > Googling for Node.JS and web scraping, most

[nodejs] Web Scraping Frameworks for Node.JS? (e.g. like Python's Scrapy)

2014-01-15 Thread Victor Hooi
Hi, I'm wondering if anybody knows of any web-scraping frameworks in Node.JS? Previously, there was node.io (https://github.com/chriso/node.io), however, the project was recently discontinued. Googling for Node.JS and web scraping, most of the guides online just talk about using requests and c