I'm learning Node and decided to build a web crawler. It works well enough, 
but when I try to crawl a large site like Reddit I start running into severe 
memory issues.

The goal of the crawler is to take a provided URL, crawl that page, gather 
all of its internal links and crawl those in turn, and store the HTML from 
every page in a MongoDB database or on the file system.
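Roughly, the flow looks like this (a simplified sketch, not the actual code 
from the repo -- 'request' and 'cheerio' here just stand in for whatever 
HTTP/parsing modules I'm using, and it writes to disk instead of Mongo):

var request = require('request');
var cheerio = require('cheerio');
var url = require('url');
var fs = require('fs');

var seen = {};  // URLs already queued, so pages aren't crawled twice

function crawl(pageUrl) {
  if (seen[pageUrl]) return;
  seen[pageUrl] = true;

  request(pageUrl, function (err, res, body) {
    if (err || res.statusCode !== 200) return;

    // store the raw HTML (file system here; a mongo insert in the real thing)
    fs.writeFile(encodeURIComponent(pageUrl) + '.html', body, function () {});

    // collect internal links and crawl them as well
    var $ = cheerio.load(body);
    $('a[href]').each(function () {
      var link = url.resolve(pageUrl, $(this).attr('href'));
      if (url.parse(link).host === url.parse(pageUrl).host) {
        crawl(link);
      }
    });
  });
}

crawl('http://www.reddit.com/');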

Since I'm working with large amounts of data, it's important that I 
understand garbage collection in Node, but no matter what I do I can't seem 
to improve the performance. Any chance someone with more expertise could 
take a look and help me figure out where the leaks are?

Git repo here: https://github.com/jcrowe206/crawler
