Hello!
My problem is as follows:
- I have a bunch of files (doc, pdf, html, txt, whatever) that need
to be indexed.
- I can't modify the files.
- I want to enter additional keywords or comments for each file.
- A keyword or comment match should be ranked higher than
a match in the file itself
- The keywords should be usable as meta data
I first tried to write comments and keywords into files with the same
name and an added suffix (indexing starts with the auto-generated
directory index). If a comment match occurs, I just have to strip the
suffix to know the correct file. But this solution is not very good:
- Metadata and data are treated separately. That means:
  * One document will probably generate two results (file + file with
    metadata)
  * A match in both the metadata file and the file doesn't rank higher
    than a match in only one of them
  etc.
- A comment match does not count more than a regular match
- The keywords are not available as meta data
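To make the suffix scheme concrete, here is a minimal sketch in Python of the
mapping between a file and its metadata file. The suffix ".meta" and the
function names are only assumptions for illustration; any suffix would do.

```python
from pathlib import Path

# Hypothetical suffix for the metadata files; chosen only for this example.
META_SUFFIX = ".meta"


def sidecar_for(document: Path) -> Path:
    """Return the metadata file for a document, e.g. report.pdf -> report.pdf.meta."""
    return document.with_name(document.name + META_SUFFIX)


def document_for(sidecar: Path) -> Path:
    """Strip the suffix from a metadata file name to recover the real file."""
    if not sidecar.name.endswith(META_SUFFIX):
        raise ValueError(f"not a metadata file: {sidecar}")
    return sidecar.with_name(sidecar.name[: -len(META_SUFFIX)])
```

This only recovers the correct file from a comment match. It does not merge
the two results or change the ranking, which are exactly the problems listed
above.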
Another approach was to present every file using a CGI script that reads
the data from the metafile and adds it into meta tags. But this means
that the CGI script has to convert every file to HTML, so I would have to
duplicate the entire existing filter functionality. Bad. And if I
only link to the file, it and its metadata would be treated separately.
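For reference, a rough sketch in Python of the "only link to the file" variant
of such a script. The ".meta" suffix, the one-keyword-per-line metafile format
and the function name are assumptions made for this example, not anything
ht://Dig prescribes.

```python
from html import escape
from pathlib import Path
from urllib.parse import quote

META_SUFFIX = ".meta"  # hypothetical metafile suffix, chosen for this example


def wrapper_page(document: Path) -> str:
    """Build a small HTML page that carries the metafile keywords in a meta tag."""
    metafile = document.with_name(document.name + META_SUFFIX)
    # Assumed format: one keyword or comment per line, blank lines ignored.
    lines = metafile.read_text(encoding="utf-8").splitlines()
    keywords = [line.strip() for line in lines if line.strip()]
    content = escape(", ".join(keywords), quote=True)
    title = escape(document.name)
    href = escape(quote(document.name), quote=True)
    return (
        "<html><head>\n"
        f"<title>{title}</title>\n"
        f'<meta name="keywords" content="{content}">\n'
        "</head><body>\n"
        f'<a href="{href}">{title}</a>\n'
        "</body></html>\n"
    )
```

A CGI script would print a "Content-Type: text/html" header, a blank line and
then this page. The page itself contains none of the document text, so the
file and its metadata still end up as two separate results.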
Is there a better way to achieve my goals?
Thanks for all hints,
- Nikolaus
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html