Hello!

My problem is as follows:

 - I have a bunch of files (doc, pdf, html, txt, whatever) that need
   to be indexed.
 - I can't modify the files. 
 - I want to enter additional keywords or comments for each file.
 - A keyword or comment match should be ranked higher than
   a match in the file itself
 - The keywords should be usable as meta data

I first tried to write comments and keywords into files with the same
name and an added suffix (indexing starts with the auto-generated
directory index). If a comment match occurs, I just have to strip the
suffix to know the correct file. But this solution is not very good:
 - Metadata and data are treated separately. That means:
   * One document will probably generate two results (file + file with
     metadata)
   * A match in both the metadata file and the file doesn't rank higher
     than a match in only one of them
   etc.
 - A comment match does not count more than a regular match
 - The keywords are not available as meta data
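
To make the suffix scheme concrete, here is a minimal sketch of the
suffix-stripping step, together with the kind of merging that would be
needed outside the search engine to collapse the two results per
document. The ".meta" suffix, the function names and the additive
scoring are assumptions for illustration only; none of this is part of
ht://Dig.

```python
META_SUFFIX = ".meta"  # hypothetical suffix chosen for this example


def original_for(path, suffix=META_SUFFIX):
    """Map a hit on a metadata file back to the document it describes."""
    if path.endswith(suffix):
        return path[: -len(suffix)]
    return path


def merge_hits(hits):
    """Collapse (path, score) hits so each document appears only once.

    Scores are simply added, so a document matched in both the file and
    its metadata file ends up ahead of one matched in only one of them.
    """
    merged = {}
    for path, score in hits:
        doc = original_for(path)
        merged[doc] = merged.get(doc, 0.0) + score
    return merged
```

This only post-processes results; it does not make the keywords
available as meta data inside the index.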

Another approach was to present every file through a CGI script that
reads the data from the metafile and adds it as meta tags. But this
means that the CGI script has to convert every file to HTML, so I would
have to duplicate the entire existing filter functionality. Bad. And if
I only link to the file, it and its metadata would still be treated
separately.
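
As a rough sketch of the "link only" variant of that script: the
function below builds a small HTML page that carries the keywords and
comment as meta tags and links to the untouched file. The function name
and the choice of the "keywords" and "description" tag names are
assumptions for illustration. Since the file's own text is not
included, it shows exactly the limitation described above.

```python
from html import escape


def wrapper_page(doc_url, keywords, comment=""):
    """Return an HTML page with the metadata as meta tags and a link to the file."""
    name = doc_url.rsplit("/", 1)[-1]
    keyword_text = escape(", ".join(keywords), quote=True)
    lines = [
        "<html><head>",
        f"<title>{escape(name)}</title>",
        f'<meta name="keywords" content="{keyword_text}">',
    ]
    if comment:
        comment_text = escape(comment, quote=True)
        lines.append(f'<meta name="description" content="{comment_text}">')
    lines += [
        "</head><body>",
        # Only a link: the document itself is not converted to HTML here.
        f'<a href="{escape(doc_url, quote=True)}">{escape(name)}</a>',
        "</body></html>",
    ]
    return "\n".join(lines)
```

A CGI script would print a "Content-Type: text/html" header followed by
the returned string.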

Is there a better way to achieve my goals?

Thanks for all hints,

 - Nikolaus



_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
FAQ: http://htdig.sourceforge.net/FAQ.html
