Hello,

I'm new to the list, this isn't in the FAQ, and the list archive at
http://wormhole.eosys.com/mail-archives/htdig/ doesn't seem to be
accessible, so I'm in search of some input. =)

The info at http://htdig.sdsu.edu/meta.html gives the htdig-syntax for
meta tag keywords as not having commas between keywords. So far so
good-- some bots/spiders use the commas, some don't.

The problem is, htdig doesn't merely ignore or strip any commas that are
there, but rather lumps them in as part of the keyword (according to the
debug output we've seen). That is, the tag

<META NAME="keywords" CONTENT="guestbook, register, newsletter">   

produces the three words

guestbook,
register,
newsletter

A search for the word 'newsletter' would have a positive result (no
trailing comma), but a search for 'guestbook' would not (because htdig
indexed it as 'guestbook,' complete with the comma).
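
To make the difference concrete, here is a minimal Python sketch of the
behavior we seem to be observing versus what we had hoped for. This is
NOT htdig's actual code (htdig is not written in Python, and I haven't
traced its parser); the function names are hypothetical and the
whitespace-only split is just my assumption based on the debug output.

```python
# Illustration only: hypothetical helpers, not htdig's real parser.

def split_on_whitespace(content):
    """Assumed current behavior: split on whitespace, commas stay attached."""
    return content.split()


def split_ignoring_commas(content):
    """Hoped-for behavior: treat commas as separators as well."""
    return content.replace(",", " ").split()


content = "guestbook, register, newsletter"

observed = split_on_whitespace(content)
hoped_for = split_ignoring_commas(content)

print(observed)    # ['guestbook,', 'register,', 'newsletter']
print(hoped_for)   # ['guestbook', 'register', 'newsletter']

# A search term only matches if it equals an indexed word exactly.
print("guestbook" in observed)    # False
print("newsletter" in observed)   # True
print("guestbook" in hoped_for)   # True
```

With the first function, only the last keyword in the list is indexed
without a comma, which matches what we see in our searches.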

A real live example can be found at http://www.centuryinter.net/. Use
the search tool on the front page to look for the word 'harold'-- no
match. Then, visit http://www.centuryinter.net/links_finance.html and
view the source-- 'harold' is one of the keywords listed (comma
delimited). Finally, use the search tool to search for 'dog'-- match.

We're running the latest version, obtained within the past couple of
weeks from http://www.htdig.org/files/htdig-3.0.8b2.tar.gz. It was
compiled and otherwise set up without incident on a Digital UNIX box.

Is this behavior considered buggy, or are most happy to leave their meta
tags comma-less? I could understand that if htdig simply ignored
the commas-- but as it is, the commas (which are commonly used in
keyword lists) break the search as demonstrated above.

Is there a patch for this, or any other option to get this corrected?
Any thoughts? Thanks in advance for any information!

Brad Shelton
[EMAIL PROTECTED]