Hi,

I started running Nutch against my localhost web site. The application 
has JavaScript files that create dynamic URLs.

My question is: what should I configure so that Nutch recognizes these 
URLs and crawls the site completely?

Below is part of the log file that Nutch generates.

fetching https://www.localhost/script/ShockwaveFlash.ShockwaveFlash.
fetching https://www.localhost/script/
fetching https://www.localhost/script/webtv/2.6
fetching https://www.localhost/script/_level0/_root
fetching https://www.localhost/script/betslip.aspx
fetching https://www.localhost/shared/script/+s_c2fe(c.substring(o+1,e))+
fetching https://www.localhost/shared/script/)<0)||oc.indexOf(
fetching https://www.localhost/shared/script/+m).indexOf(
fetching https://www.localhost/shared/script/c.indexOf(\
fetching https://www.localhost/shared/script/);else{if(s.ismac&&s.u.indexOf(
fetching https://www.localhost/registration.aspx#
fetch of https://www.localhost/shared/script/)<0)||oc.indexOf( failed with: java.lang.IllegalArgumentException: Invalid uri 'https://www.localhost/shared/script/)<0)||oc.indexOf(': escaped absolute path not valid
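
To illustrate the failing entry: several of the fetched "URLs" look like fragments of JavaScript source, and the one that fails has a path containing characters such as `<` and `|`, which RFC 3986 does not allow unescaped in a URI path. The snippet below is only a rough sketch in Python, not Nutch code; the function name `has_valid_path` and the regular expression are my own assumptions for illustration, and Nutch's actual check may differ.

```python
import re
from urllib.parse import urlsplit

# Characters RFC 3986 permits in a path: unreserved, sub-delims, ":", "@", "/",
# plus percent-encoded octets such as %20. (Illustrative check only.)
_PATH_CHARS = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/]|%[0-9A-Fa-f]{2})*")


def has_valid_path(url: str) -> bool:
    """Return True if the path part of `url` uses only RFC 3986 path characters."""
    path = urlsplit(url).path
    return _PATH_CHARS.fullmatch(path) is not None


if __name__ == "__main__":
    # A few of the URLs from the log above.
    candidates = [
        "https://www.localhost/script/betslip.aspx",
        "https://www.localhost/shared/script/)<0)||oc.indexOf(",
        "https://www.localhost/shared/script/);else{if(s.ismac&&s.u.indexOf(",
    ]
    for url in candidates:
        print(has_valid_path(url), url)
```

Note that a check like this only catches fragments with illegal characters. Entries such as `+m).indexOf(` are syntactically legal paths even though they are clearly not real pages.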


Thanks.

Stjepan




 
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
