Dear all, 

My client uses HTTrack with GDS (Google Desktop Search). While pages are 
fetched much quicker using Nutch (kudos to the Nutch engine developers), it 
doesn't seem to index the entire page like HTTrack/GDS does. As a result, he 
reports that a search for 'hbx' (a web analytics tool developed by Visual 
Sciences) returns 26 hits in GDS and none in Nutch. I found out that the only 
places that contain hbx in those documents are all in the JavaScript that comes 
with the page. 
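
To illustrate what I think is happening, here is a minimal sketch in Python. It
is not Nutch's actual parser; the class, the function, and the sample page are
all made up for illustration. It only shows how a text extractor that skips
<script> bodies would never see a term that occurs solely inside the JavaScript.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes; skips <script> bodies unless told otherwise.

    Hypothetical helper for illustration only, not part of Nutch or GDS.
    """

    def __init__(self, include_scripts=False):
        super().__init__()
        self.include_scripts = include_scripts
        self._in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies arrive here as plain data; drop them by default.
        if self._in_script and not self.include_scripts:
            return
        self.chunks.append(data)


def extract_text(html, include_scripts=False):
    """Return the whitespace-normalized text that would be indexed."""
    parser = TextExtractor(include_scripts)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


# Made-up page: 'hbx' appears only inside the script block.
SAMPLE_PAGE = """<html><head><title>Products</title>
<script type="text/javascript">var hbx = {acct: "EXAMPLE"};</script>
</head><body><p>Welcome to our product page.</p></body></html>"""

if __name__ == "__main__":
    print("hbx" in extract_text(SAMPLE_PAGE))
    print("hbx" in extract_text(SAMPLE_PAGE, include_scripts=True))
```

With the script body skipped, a search for 'hbx' has nothing to match, which is
consistent with the zero hits we see; with it included, the term is found.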

Is there any way to get Nutch to index the JavaScript as a document too? Or is 
there any special configuration that I should have? 

Thanks!!
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
