I have two files on my test site being indexed: fruit.html and pineapple.html.
The word "large" appears on fruit.html. It does NOT appear anywhere within pineapple.html; however, when I run htsearch on the index, both documents show as a match. In fact, the base_score is very high for the query "large" on the document pineapple.html.

If I index pineapple.html alone, the query "large" yields no results. So there is definitely some type of relationship between the two documents and the word "large".

htdump revealed the relationship by outputting this:

    3 u:http://jtest/pineapple.htm t:pineapples a:0 m:1033563111 s:5824 H: Pineapple trees h: l:1033563111 L:12 b:5 c:1 g:0 e: n: S: d:large A:Stump^ATrunk

The key part is "d:large". According to the htdump doc page online, the "d" element is defined as:

    "The text of links pointing to this document. (e.g. <a href="docURL">description</a>)"

fruit.html, the other HTML file indexed, contains a hyperlink to pineapple.html that looks like this:

    <a href="pineapple.html" onMouseOver="MM_showHideLayers('Pine_apple','','show','large','','hide','apple','','hide','tomato','')"><img border="0" src="images/pineapple.jpg" width="80" height="60"></a>

Note that the link has no visible text of its own (only an image); the only place "large" occurs is inside the onMouseOver attribute.

What configuration attribute must be set to zero to keep that extra anchor text from being indexed and factored into pineapple.html's word list? I've basically tried setting all "documented" factors to zero (including backlink).

I used the latest htdig-3.2.0b4-092902 as well as a January 2002 release; both behave the same way.
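To double-check where "large" comes from in that link, here is a minimal sketch in Python. It is not htdig code and says nothing about how htdig parses HTML; the class name AnchorText and the variable names are made up for illustration. It uses the standard-library html.parser to separate the visible text of the anchor from its attribute values.

```python
from html.parser import HTMLParser


class AnchorText(HTMLParser):
    """Hypothetical helper: record attributes and visible text of each <a>."""

    def __init__(self):
        super().__init__()
        self.in_anchor = False
        self.links = []  # one dict per <a>: {"attrs": {...}, "text": "..."}

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True
            # html.parser lowercases attribute names (onMouseOver -> onmouseover)
            self.links.append({"attrs": dict(attrs), "text": ""})

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        # Only text nodes between <a> and </a> count as visible link text
        if self.in_anchor and self.links:
            self.links[-1]["text"] += data


# The link from fruit.html, copied from the message above
html = """<a href="pineapple.html" onMouseOver="MM_showHideLayers('Pine_apple','','show','large','','hide','apple','','hide','tomato','')"><img border="0" src="images/pineapple.jpg" width="80" height="60"></a>"""

parser = AnchorText()
parser.feed(html)
link = parser.links[0]

visible_text = link["text"].strip()
in_attribute = "large" in link["attrs"].get("onmouseover", "")

print("visible link text:", repr(visible_text))
print("'large' inside onMouseOver:", in_attribute)
```

If this sketch is right, the anchor has no visible text at all, and "large" exists only as an argument inside the onMouseOver attribute, which suggests the "d:large" entry is being picked up from the attribute value rather than from real link text.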
