Have you enabled the parse-js plugin in your nutch-site.xml?
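For reference, a minimal sketch of what that could look like in conf/nutch-site.xml. The property name plugin.includes is the standard one; the rest of the plugin list below is only an assumption for illustration, so keep whatever entries you already have and just make sure the parse group matches parse-js:

```xml
<!-- conf/nutch-site.xml: values here override nutch-default.xml -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Assumed plugin list for illustration only; the relevant part is
         the "js" alternative inside parse-(text|html|js). -->
    <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic</value>
  </property>
</configuration>
```

Pages fetched before the change would most likely need to be parsed and indexed again before a search for 'hbx' can return anything. How much of the script text ends up searchable may also depend on your Nutch version.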
Ann
----- Original Message ----
From: Joseph Chan <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Tuesday, June 12, 2007 11:28:40 AM
Subject: Can nutch index the javascript code too?
Dear all,
My client uses HTTrack with GDS (Google Desktop Search). While pages are
fetched much quicker using Nutch (kudos to the Nutch engine developers), it
doesn't seem to index the entire page like HTTrack/GDS does. As a result, he
claims that if he searches on 'hbx' (a web analytics tool developed by visual
science), GDS returns 26 hits and Nutch returns none. I found out that the only
places that contain hbx in those documents are all in the JavaScript that comes
with the page.
Is there any way to get Nutch to index the JavaScript as a document too? Or is
there any special configuration that I should have?
Thanks!!
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general