Looking for pointers on what I need to dive into next. I suspect
something horrible is happening in node.io's htmlparser.js.

I am using node.io to scrape
http://www.reuters.com/article/2012/03/30/utilities-southern-kemper-idUSL2E8EUAHQ20120330?feedType=RSS&feedName=utilitiesSector

and I get a segfault each time.

Information:
valgrind and strace output: http://pastebin.com/McidkC0g

System info - Linux  2.6.18-194.3.1.el5 #1 SMP Thu May 13 13:09:10 EDT 2010 i686 athlon i386 GNU/Linux
$ free -m
             total       used       free     shared    buffers     cached
Mem:          3034       2809        224          0        531       1593
-/+ buffers/cache:        684       2349
Swap:         2047          0       2047

$ node -v
v0.6.15
$ npm -v
1.1.18

node.io scrape options:
var scrapeOptions = {
    silent: true,
    jsdom: true,                     // enable parsing of js files
    external_resources: ['script'],
    timeout: 10,                     // timeout after 10 seconds
    max: 1,
    retries: 3                       // threads can retry 3 times before failing
};
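
For context, here is a minimal sketch of how options like these would typically be handed to a node.io job, assuming the documented `new nodeio.Job(options, methods)` form and the `getHtml` helper. `buildJob` and `TARGET_URL` are hypothetical names used only for illustration, the node.io module is passed in as a parameter so the sketch runs without it installed, and this is not necessarily the exact job that triggers the crash.

```javascript
// Same options as listed above.
var scrapeOptions = {
    silent: true,
    jsdom: true,
    external_resources: ['script'],
    timeout: 10,
    max: 1,
    retries: 3
};

// The page being scraped (taken from the post).
var TARGET_URL = 'http://www.reuters.com/article/2012/03/30/' +
    'utilities-southern-kemper-idUSL2E8EUAHQ20120330' +
    '?feedType=RSS&feedName=utilitiesSector';

// Hypothetical helper: builds a one-URL job from a node.io-like module.
function buildJob(nodeio, url) {
    return new nodeio.Job(scrapeOptions, {
        input: [url],
        run: function (pageUrl) {
            // Assumption: node.io invokes this callback with the job as `this`.
            this.getHtml(pageUrl, function (err, $) {
                if (err) {
                    this.exit(err);
                    return;
                }
                // Only reports that fetching and parsing completed.
                this.emit(pageUrl + ' parsed OK');
            });
        }
    });
}

// With the real module this would be wired up as (not executed here):
// exports.job = buildJob(require('node.io'), TARGET_URL);
```

Stripping the job down to a single `getHtml` call like this is one way to check whether the segfault comes from fetching and parsing alone or from later processing.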

FWIW, I also had this error in node.js v0.4.11, and one of the first
steps I took was to upgrade node.js, npm and relevant npm modules.

