This is from S. Budd <[EMAIL PROTECTED]> on the subject of ranking (in 3.1)

>Ranking pages and the use of Meta tags with Htdig
>
>1. How pages are ranked.
>
>The search program "htsearch" ranks the web pages which satisfy the
>search terms before they are returned in the results page. It uses a
>complex rule to rank the pages. This rule takes into account the
>following factors, which can be set either on the search form or in the
>site configuration file.
>
>
>description_factor
>Plain old "descriptions" are the text of a link pointing to a document.
>This factor gives weight to the words of these descriptions of the
>document. Not surprisingly, these can be pretty accurate summaries of a
>document's content.
>Default: 150
>Example: description_factor: 350
>
>
>heading_factor
>This is a factor which will be used to multiply the weight of words between
><h1> and </h1> tags, as well as headings of levels <h2> through <h6>. It
>is used to assign the level of importance to headings. Setting a factor to
>0 will cause words in these headings to be ignored. The number may be a
>floating point number.
>Default: 5
>Example: heading_factor: 20.9
>
>
>keywords_factor
>This is a factor which will be used to multiply the weight of words in the
>list of keywords of a document. The number may be a floating point number.
>Default: 10
>Example: keywords_factor: 12
>
>
>meta_description_factor
>This is a factor which will be used to multiply the weight of words in any
>META description tags in a document. The number may be a floating point
>number.
>Default: 50
>Example: meta_description_factor: 20
>
>
>text_factor
>This is a factor which will be used to multiply the weight of words that
>are not in any special part of a document. Setting a factor to 0 will
>cause normal words to be ignored. The number may be a floating point
>number.
>Default: 1
>Example: text_factor: 0
>
>
>title_factor
>This is a factor which will be used to multiply the weight of words in the
>title of a document.
>Setting a factor to 0 will cause words in the title to
>be ignored. The number may be a floating point number.
>Default: 100
>Example: title_factor: 12
>
>
>backlink_factor
>This is a weight of "how important" a page is, based on the number of
>URLs pointing to it. It's actually multiplied by the ratio of the incoming
>URLs (backlinks) and outgoing URLs, to balance out pages with lots of
>links to pages that link back to them. This factor can be changed
>without changing the database in any way. However, setting this value to
>something other than 0 incurs a slowdown on search results.
>Default: 1000
>Example: backlink_factor: 501.1
>
>
>date_factor
>This factor, like backlink_factor, can be changed without modifying the
>database. It gives higher rankings to newer documents and lower rankings
>to older documents. Before setting this factor, it's advised to make sure
>your servers are returning accurate dates (check the dates returned in
>the long format). Additionally, setting this to a nonzero value incurs a
>performance hit on searching.
>Default: 0
>Example: date_factor: 0.35
>
>2. Using <META ...> tags.
>
>In HTML, any number of <META> tags can be used between the <HEAD> and
></HEAD> tags of a document. There are three possible attributes to this
>tag, two of which are recognized by ht://Dig: one is NAME, which is used
>to name a specific property, and the other is CONTENT, which is used to
>supply the value for a named property.
>For example, a document could start with
>something like the following:
>
>  <HTML>
>  <HEAD>
>  <META NAME="htdig-keywords" CONTENT="phone telephone online electronic directory">
>  <META NAME="htdig-email" CONTENT="[EMAIL PROTECTED]">
>  <TITLE>Some document title</TITLE>
>  </HEAD>
>  <BODY>
>
>  Body of document
>
>  </BODY>
>  </HTML>
>
>
>Htdig recognizes the following values for NAME:
>
>NAME="htdig-keywords"
>The value of this property should be a blank separated list of keywords
>which will get a very high weight when searching. This can be used to get
>around some problems with common synonyms for words in the document. For
>example, if a document is a telephone directory, possible keywords could be
>"telephone phone directory book list". Now, regardless of what text is
>actually in the document, it can be found if these keywords are used in the
>search. The weight that words in the content string will have in search
>results is controlled by the keywords_factor attribute in your
>configuration.
>
>NAME="keywords"
>The value of this property should be a blank separated list of keywords,
>just as for the htdig-keywords property. They are treated as equivalent by
>htdig. The reason for two different properties is that the keywords property
>is used by other search engines as well, while the htdig-keywords property
>can be used for words you want indexed only by htdig. You can get htdig to
>treat other property names as equivalent to htdig-keywords, or disable the
>htdig-keywords or keywords properties, by changing the
>keywords_meta_tag_names attribute in your configuration.
>
>NAME="description"
>The value allows you to specify an alternate excerpt (description) of a
>page. If the config-file attribute use_meta_description is set to true,
>then any documents with descriptions will use them instead of the
>automatically generated excerpts. The weight that words in the content
>string will have in search results is controlled by the
>meta_description_factor attribute in your configuration.
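
As a rough illustration of the keyword handling described above, here is a minimal Python sketch of pulling blank-separated keywords out of META tags whose NAME appears in a configurable list. This is not htdig's actual code: the names MetaKeywordParser and extract_keywords are hypothetical, and the case-insensitive comparison of NAME values is an assumption made for the example.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collect keywords from META tags whose NAME is in a configured list.

    Hypothetical helper for illustration only; htdig itself may work differently.
    """

    def __init__(self, tag_names):
        super().__init__()
        # Assumption: NAME values are compared case-insensitively.
        self.tag_names = {name.lower() for name in tag_names}
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = {key: (value or "") for key, value in attrs}
        if attributes.get("name", "").lower() in self.tag_names:
            # CONTENT is treated as a blank-separated list of keywords.
            self.keywords.extend(attributes.get("content", "").split())


def extract_keywords(html_text, tag_names=("keywords", "htdig-keywords")):
    """Return the keywords found in html_text for the given META names."""
    parser = MetaKeywordParser(tag_names)
    parser.feed(html_text)
    parser.close()
    return parser.keywords


SAMPLE = """<HTML>
<HEAD>
<META NAME="htdig-keywords" CONTENT="phone telephone online electronic directory">
<TITLE>Some document title</TITLE>
</HEAD>
<BODY>Body of document</BODY>
</HTML>"""

print(extract_keywords(SAMPLE))
# → ['phone', 'telephone', 'online', 'electronic', 'directory']
```

The tag_names argument plays the role that keywords_meta_tag_names plays in the configuration: passing a different list changes which META names are treated as keyword sources.
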
>
>There is also the possibility of introducing arbitrary <META NAME="xxx">
>tags. For example:
>
>  <META NAME="dc.creator" CONTENT="Paul Wolstenholme">
>  <META NAME="dc.creator" CONTENT="Richard Smith">
>
>
>To do this you have to introduce the following two configuration entries:
>
>keywords_meta_tag_names (needed when digging is done)
>The words in this list are used to search for keywords in HTML META tags.
>This list can contain any number of strings, each of which will be seen as
>the name for whatever keyword convention is used. The META tags have the
>following format: <META NAME="somename" CONTENT="somevalue">
>Default: keywords htdig-keywords
>Example: keywords_meta_tag_names: keywords description
>
>In the above example you would use keywords_meta_tag_names: dc.creator
>
>
>max_meta_description_length (needed when digging is done)
>While gathering descriptions from meta description tags, htdig will
>truncate descriptions which are longer than this length. This is required
>in case a webmaster tries to swamp a search result by repeating a keyword
>many times.
>Default: 512
>Example: max_meta_description_length: 1000
>
>
>It is possible to have the NAME="description" CONTENT=" xxx ..... " meta
>tag used for the description of a found page instead of the usual excerpts.
>This is accomplished with the following configuration parameter:
>
>use_meta_description
>If set to true, any META description tags will be used as excerpts by
>htsearch. Any documents that do not have META descriptions will retain
>their normal excerpts.
>Default: false
>Example: use_meta_description: true
>

------------------------------------
To unsubscribe from the htdig3-dev mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.
