On Tue, 14 Dec 1999, Gilles Detillieux wrote:
> Date: Tue, 14 Dec 1999 13:22:26 -0600 (CST)
> From: Gilles Detillieux <[EMAIL PROTECTED]>
> To: "Joe R. Jah" <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: Re: max_keywords works as expected;)
>
> I'm not talking about the document body here. The 26 words in question
> were words that you said were in the meta description tag. If you include
> description in keywords_meta_tag_names, then the meta description tag will
> get parsed just like any meta keywords tag (or for that matter, any meta
> tag with a name that appears in keywords_meta_tag_names). That means the
> words in that tag will get indexed and counted as keywords, and use up
> part or all of the max_keywords quota for that document.
>
> Words in the document body have no bearing on this, as they're not treated
> as keywords.
Sorry, this stems from my confusion;(
> I thought I had answered that question yesterday, but I guess I wasn't
> clear enough. The HTML parser parses the HTML tags linearly, i.e. tags
> are processed in the order in which they appear in the document.
> It makes absolutely no difference in which order you list the names in
> the keywords_meta_tag_names attribute - what matters is the order in
> which they appear within any given document. Of course, one document
> may use a completely different order than the next.
>
> If you have keywords and htdig-keywords in keywords_meta_tag_names, then
> the two different meta tags will be treated as completely equivalent.
> If both appear in one document, the parser will parse both the same way.
> It will index and count the words in the first meta tag (whichever one
> that may be), then it will do the same with the second, just as though
> all the contents of the second tag were appended to the first. In either
> case, it counts and indexes the keywords until max_keywords is used up,
> after which it won't look at another keyword in that document.
You had answered it clearly the first time. Like I said, the problem was
my confusion;( It suddenly cleared up in my mind yesterday. I appreciate
the time you took to elaborate on your already clear explanation.
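For the archives, here is how I now picture the behavior you describe. This
is only a rough sketch, not htdig's actual parser: MetaTag and
collect_keywords are names I made up, words are split on whitespace only,
and treating a negative max_keywords as "no limit" is my assumption.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one <meta name="..." content="..."> tag:
// first = the tag's name, second = its content string.
using MetaTag = std::pair<std::string, std::string>;

// Walk the meta tags in the order they appear in the document and collect
// keywords until the quota is used up. The order of keyword_tag_names is
// irrelevant; it is only used as a membership test.
// Assumption: max_keywords < 0 means "no limit" in this sketch.
std::vector<std::string> collect_keywords(
    const std::vector<MetaTag>& tags_in_document_order,
    const std::vector<std::string>& keyword_tag_names,
    int max_keywords)
{
    std::vector<std::string> indexed;
    for (const auto& tag : tags_in_document_order) {
        // Skip meta tags whose name is not listed as a keyword tag.
        if (std::find(keyword_tag_names.begin(), keyword_tag_names.end(),
                      tag.first) == keyword_tag_names.end()) {
            continue;
        }
        std::istringstream words(tag.second);
        std::string word;
        while (words >> word) {
            // One quota is shared by every keyword tag in the document.
            if (max_keywords >= 0 &&
                static_cast<int>(indexed.size()) >= max_keywords) {
                return indexed;
            }
            indexed.push_back(word);
        }
    }
    return indexed;
}
```

With a description tag of three words followed by a keywords tag of two
words, both names listed, and a quota of 4, the sketch keeps the three
description words plus only the first keyword, which is the effect I was
seeing.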
This whole confusion had closed my eyes to the fact that htdig was dumping
core all the time and I was not noticing it;( Last night, after several
rounds of trial and error, I realized that your version of the local
duplicate suppression patch,
ftp://sol.ccsf.cc.ca.us/htdig-patches/3.1.4/Retriever.cc.0,
lacked one "return TRUE;" statement. That was causing my htdig to dig
almost all the documents and then silently dump core, leaving just two
dozen documents undug. Everything else appeared to work; the only signs
were an obscure error message among the usual bad links and a six meg
core file in the htdig directory;) I added the statement to the patch,
recompiled and ran the dig again; everything worked fine:
+ }
+ visited.Add(key,local_filename);
*+ return TRUE;*
+ }
I have patched the patch in the patch site;)
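In case it helps anyone hitting something similar, here is a minimal sketch
of the general pitfall as I understand it. It is not the code from
Retriever.cc: VisitedFiles, mark_if_new and the std::map are hypothetical
stand-ins, and I am only assuming the patched function returns a boolean.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the retriever's table of local files that
// have already been indexed.
class VisitedFiles {
public:
    // Returns true if the file is new and was recorded, false if it is a
    // duplicate. Every path ends in an explicit return: in C++, flowing off
    // the end of a non-void function is undefined behaviour, so a missing
    // return can compile and still crash at run time.
    bool mark_if_new(const std::string& key,
                     const std::string& local_filename) {
        if (visited_.count(key) != 0) {
            return false;                // already seen: suppress duplicate
        }
        visited_[key] = local_filename;  // plays the role of visited.Add(...)
        return true;                     // the kind of statement that was missing
    }

private:
    std::map<std::string, std::string> visited_;
};
```

Compiling with GCC's -Wall (which turns on -Wreturn-type) should flag a
non-void function that can fall off its end, which might have saved me an
evening.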
Regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah [EMAIL PROTECTED]