Normally, I'd be scouring the list archives for this info, but they
appear to be broken right now.
I'm enclosing the relevant htdig.conf lines and a snippet of rundig's
output.
I'm wondering why my index.jsp is getting indexed, but the links on it
are not. Specifically, the pages in the output below that I expected
to get indexed are http://www.foobar.org/site/about/about.jsp and
http://www.foobar.org/site/events/events.jsp.
The most frequent error message I'm getting is "Rejected: URL not in the
limits!"
Is it reading the JavaScript stuff (onMouseOver, onMouseOut, etc.) and getting
confused? Or is it something in my configuration file?
Any ideas?
Thanks.
---- Snipped from rundig -vvv output ---
Rejected: Extension is invalid!
url rejected: (level 1)http://www.foobar.org/site/sitestyle.css
image: http://www.foobar.org/site/pix/top_left_arc.gif
image: http://www.foobar.org/site/pix/top_blue_vert.gif
A tag: pos = 2, position = ="about/about.jsp"
onMouseOver="changeImage('top_about','top_about_roll')"
onMouseOut="changeImage('top_about','top_about')">
href: http://www.foobar/site/about/about.jsp (About Us )
Rejected: URL not in the limits!
url rejected: (level 1)http://www.foobar.org/site/about/about.jsp
image: http://www.foobar.org/site/pix/shim.gif
A tag: pos = 2, position = ="events/events.jsp"
onMouseOver="changeImage('top_events','top_events_roll')"
onMouseOut="changeImage('top_events','top_events')">
href: http://www.foobar.org/site/events/events.jsp (Events )
---- most of my htdig.conf ---
database_dir: /var/lib/htdig
limit_urls_to: ${start_url}
exclude_urls: /cgi-bin/ .cgi
bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
.jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi \
.css
maintainer: [EMAIL PROTECTED]
max_head_length: 50000
max_doc_size: 5000000
no_excerpt_show_top: true
search_algorithm: exact:1 synonyms:0.5 endings:0.1
no_next_page_text:
no_prev_page_text:
start_url: http://www.foobar.org/site/index.jsp
local_urls: http://www.foobar.com/=/home/kfish/www/kfish/
local_user_urls: http://www.foobar.com/=/home/,/public_html/
-------------
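
One possible explanation, sketched under an assumption: that limit_urls_to
is applied as a plain substring match against each fully resolved URL, which
is how I understand the ht://Dig documentation. With limit_urls_to set to
${start_url}, the only pattern would be the complete start URL, file name
included, so anything that does not contain ".../site/index.jsp" would fall
outside the limits. The function name in_limits and the wider pattern
"http://www.foobar.org/site/" are hypothetical, used only to illustrate the
idea; this is not htdig's actual code.

```python
def in_limits(url, limit_patterns):
    """Return True if the URL contains at least one limit pattern.

    Models the assumed behaviour of limit_urls_to: a case-sensitive
    substring match, where matching any single pattern is enough.
    """
    return any(pattern in url for pattern in limit_patterns)


start_url = "http://www.foobar.org/site/index.jsp"

# limit_urls_to: ${start_url} expands to the whole start URL,
# including the index.jsp file name.
current_limits = [start_url]

# Hypothetical alternative: limit to the directory prefix instead.
wider_limits = ["http://www.foobar.org/site/"]

candidates = [
    "http://www.foobar.org/site/index.jsp",
    "http://www.foobar.org/site/about/about.jsp",
    "http://www.foobar.org/site/events/events.jsp",
]

for url in candidates:
    # Show whether each URL passes under the current and the wider limits.
    print(url, in_limits(url, current_limits), in_limits(url, wider_limits))
```

If that assumption holds, only index.jsp passes under the current setting,
while all three URLs pass under the directory prefix. That would match the
"URL not in the limits!" rejections above without the JavaScript attributes
being involved at all.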
--
Richard Seymour : Anarchy Software, Inc.
- * - - * - - - * -+- * - - - * - - * -
`°º¤ø,¸ ¸,ø¤º°'
`°º¤ø,¸¸,ø¤º°