Hello,

  We have been using HTDig on our site for several years and are
currently running 3.2.0b5 with no real issues. However, for some reason
certain documents are being omitted from the database each night. After
hunting through the htdig.out file and various pages on our site, I
think I may have found a bug in HTDig.

  It seems that when multiple hrefs appear on a single line, only the
first one is considered and followed. I verified that many documents
that are NOT in the database at all (and are never hit by HTDig) are
linked on lines that contain two links: an anchor link followed by the
document link. Each link is properly coded and the necessary closing
</a> is present. I examined over 20 documents, and every time the code
was set up like this, the document was missing from the database.
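
  To make the pattern easier to check, here is a rough sketch of a
Python helper that lists every line of a page containing two or more
href attributes, so those lines can be compared against what shows up
in htdig.out. It is not part of HTDig and says nothing about how HTDig
itself parses links; the function name and the regular expression are
my own assumptions, and the regex is only a simple approximation of
HTML attribute syntax.

```python
import re

# Approximate match for href attributes: double-quoted, single-quoted,
# or unquoted values. This is an assumption, not HTDig's own parser.
HREF_RE = re.compile(
    r'''href\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))''',
    re.IGNORECASE,
)


def lines_with_multiple_hrefs(html_text):
    """Return (line_number, [href, ...]) for each line with 2+ hrefs."""
    results = []
    for lineno, line in enumerate(html_text.splitlines(), start=1):
        # findall yields one tuple per match; only one group is non-empty.
        hrefs = [a or b or c for a, b, c in HREF_RE.findall(line)]
        if len(hrefs) >= 2:
            results.append((lineno, hrefs))
    return results
```

  Running this over the source of a page and then searching htdig.out
for each reported URL should show whether the second link on such lines
is ever picked up.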

  So, my question is, does this make any sense? Is there something in
htdig that prevents parsing multiple links on a single line? Any advice
you can provide would be greatly appreciated.

Cheers,
Jonathan Schlackl


