I have found that if a page contains META tags whose content is just a
space, like so:

<html>
<head>
<title>Test</title>
<META name=keywords content=" ">
<META name=description content=" ">
</head>
<body>

then the page is indexed OK, but htsearch shows no excerpt with
Format=Long.

Just changing that to:

<html>
<head>
<title>Test</title>
<META name=keywords content="">
<META name=description content="">
</head>
<body>

clears the problem.

How did I find this, and why does it matter?

Well, I'm working on an external conversion script which tries to extract
the keywords and summary from WordPerfect documents.  In real life such
documents often have no summary or keywords and I was using a space as
the default.

I can work around this, so it's no great deal, but the bug may have other
consequences I haven't found yet.
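
A minimal sketch of one such workaround, written in Python purely for illustration: it trims the extracted value and falls back to an empty string, so content=" " is never written. The function name meta_tag is hypothetical and is not taken from conv_doc.pl or from any ht://Dig code.

```python
from html import escape


def meta_tag(name, value):
    """Build a META tag, emitting content="" when the value is blank.

    Hypothetical helper for illustration only.
    """
    # Treat None and whitespace-only strings as "no value", so a lone
    # space never ends up inside the content attribute.
    cleaned = (value or "").strip()
    # Escape quotes and ampersands so the attribute stays well-formed.
    return '<META name={} content="{}">'.format(name, escape(cleaned, quote=True))


if __name__ == "__main__":
    # A document with no keywords and a summary that is only a space.
    print(meta_tag("keywords", None))
    print(meta_tag("description", " "))
```

Whether an empty content attribute is always handled correctly is only known from the test page above, so a converter could equally choose to omit the tag altogether when nothing was extracted.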

By the way, my script is based on conv_doc.pl and can be used in its
place.  I hope to send it in when I've finished polishing it. 

---
 
David J Adams
<[EMAIL PROTECTED]>
Computing Services
University of Southampton
