Hi,

Thanks for your help. It makes sense; however, it does not seem to be working. I have set:

description_factor: 0
ignore_alt_text: true

in htdig.conf and have re-indexed the site. When I search for "bookmark" I still find the "Bookmark This Page" link, which appears as a standard <a href> link, e.g. <a href="blah">Bookmark This Page</a>, and when I search for "monkey" I still get "monkey" found in the alt text of one of my images.

Maybe there is another option in my config file that is overriding this? There doesn't appear to be.

I am using version 3.2.0 (I need phrase matching). Here are the config attributes that are set in htdig.conf:

****
description_factor: 0
ignore_alt_text: true
database_dir: /var/lib/htdig/htdig.db
start_url: http://www.thewebsiteiamindexing.com/
limit_urls_to: ${start_url}
exclude_urls: /cgi-bin/ .cgi
bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
    .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi .css
maintainer: [EMAIL PROTECTED]
max_head_length: 10000
max_doc_size: 200000
no_excerpt_show_top: true
search_algorithm: exact:1 substring:1 synonyms:0.5 endings:0.1 prefix:0.1
template_map: Long long ${common_dir}/long.html \
    Short short ${common_dir}/short.html
matches_per_page: 20
excerpt_length: 200
****

Any help appreciated.

Regards
John

----- Original Message -----
From: "Jim Cole" <[EMAIL PROTECTED]>
To: "Joe R. Jah" <[EMAIL PROTECTED]>
Cc: "Gabriele Bartolini" <[EMAIL PROTECTED]>; "John Legg" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Thursday, July 04, 2002 9:48 AM
Subject: Re: [htdig] Finding Referenced Pages not Referring Pages (link weighting)

> Hi - If you are only trying to remove descriptions of links from
> the index, then using description_factor is the way to go. See
> http://www.htdig.org/attrs.html#description_factor If instead you
> are trying to prevent the index.html page itself from being
> indexed, while following links on that page, then something like
> the META tag described below is probably the best approach.
>
> Jim
>
>
> Joe R. Jah's bits of Wed, 3 Jul 2002 translated to:
>
> >On Wed, 3 Jul 2002, Gabriele Bartolini wrote:
> >
> >> Date: Wed, 03 Jul 2002 20:45:56 +0200
> >> From: Gabriele Bartolini <[EMAIL PROTECTED]>
> >> To: John Legg <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
> >> Subject: Re: [htdig] Finding Referenced Pages not Referring Pages (link weighting)
> >>
> >>
> >> >Is it possible to make htDig only return referenced pages not referring pages?
> >> >
> >> >For example if you had a page called "index.html" which contains a link
> >> >called "monkey" that links (refers) to the page "monkey.html". -- I
> >> >would like htDig to return monkey.html in the results list but not
> >> >index.html, as that just contains a link to monkey.html. - Is this possible?
> >>
> >> I am not sure about this and unfortunately I can't try it right now. Maybe
> >> it is worthwhile giving it a go anyway.
> >>
> >> Try:
> >>
> >> description_factor: 0
> >>
> >> I hope I am not saying bs. What do you think, people in the list?
> >>
> >> Ciao
> >> -Gabriele
> >
> >I believe putting <META name="robots" content="follow,noindex"> in the
> >header of index.html file will achieve the desired results.
> >
> >Regards,
> >
> >Joe
> >
>
>
> -------------------------------------------------------
> This sf.net email is sponsored by:ThinkGeek
> Caffeinated soap. No kidding.
> http://thinkgeek.com/sf
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html

