Dear all,

My client uses HTTrack with GDS (Google Desktop Search). While pages are fetched much quicker using Nutch (kudos to the Nutch engine developers), it doesn't seem to index the entire page the way HTTrack/GDS does. As a result, he says that if he searches for 'hbx' (a web analytics tool developed by Visual Sciences), GDS returns 26 hits and Nutch returns none. I found that the only places where 'hbx' appears in those documents are in the JavaScript that comes with the page.
Is there any way to get Nutch to index the JavaScript as a document too? Or is there any special configuration that I should have?

Thanks!!
