I can suggest two possible solutions; either requires a fair amount of work:

1) Create an .html file for each file you need to index, each containing the metadata for indexing and a link to the file it refers to. Then index these files. Not a very good solution as far as end users are concerned, but simple to do.

2) Write a converter script which will add the appropriate metadata to the output from doc2html.pl, or whatever conversion script you are using, during the indexing process. A bit of a challenge to write, but it should give you exactly what you want and will be transparent to end users.

David Adams
Southampton University

----- Original Message -----
From: "Nikolaus Rath" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, December 17, 2002 11:30 AM
Subject: [htdig] Files and Metadate in other files

> Hello!
>
> My problem is as follows:
>
> - I have a bunch of files (doc, pdf, html, txt, whatever) that need
>   to be indexed.
> - I can't modify the files.
> - I want to enter additional keywords or comments for each file.
> - A keyword or comment match should be ranked higher than
>   a match in the file itself.
> - The keywords should be usable as meta data.
>
> I first tried to write comments and keywords into files with the same
> name and an added suffix (indexing starts with the auto-generated
> directory index). If a comment match occurs, I just have to strip the
> suffix to know the correct file. But this solution is not very good:
> - Metadata and data are treated separately. That means:
>   * One document will probably generate two results (file + file with
>     metadata)
>   * A match in both the metadata file and the file doesn't rank higher
>     than a match in one of them
>   etc.
> - A comment match does not count more than a regular match
> - The keywords are not available as meta data
>
> Another approach was to present every file using a CGI script that reads
> the data from the metafile and adds it into meta tags.
> But this means that the CGI script has to convert every file to HTML,
> so I would have to duplicate the entire existing filter functionality.
> Bad. And when I only link to the file, it and its metadata would be
> treated separately.
>
> Is there a better way to achieve my goals?
>
> Thanks for all hints,
>
> - Nikolaus
>
> -------------------------------------------------------
> This sf.net email is sponsored by:
> With Great Power, Comes Great Responsibility
> Learn to use your power at OSDN's High Performance Computing Channel
> http://hpc.devchannel.org/
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
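
For what it's worth, option 1 from the reply (one small .html file per document, carrying the metadata and a link to the real file) could be sketched roughly as below. This is only an illustration in Python, not something from the thread: the function names build_stub and write_stubs, the base URL, and the layout of the metadata dictionary are all made up for the example. It also assumes the indexer is configured to read the standard "keywords" and "description" meta tags, and that file names are flat (no subdirectories).

```python
from html import escape
from pathlib import Path
from urllib.parse import quote


def build_stub(target_url, keywords, comment):
    """Return a small HTML page with metadata and a link to target_url.

    All names here are hypothetical; the meta tag names are the common
    HTML ones, assumed to be what the indexer is set up to read.
    """
    title = escape(target_url.rsplit("/", 1)[-1])
    keyword_text = escape(", ".join(keywords), quote=True)
    return (
        "<html>\n<head>\n"
        f"<title>{title}</title>\n"
        f'<meta name="keywords" content="{keyword_text}">\n'
        f'<meta name="description" content="{escape(comment, quote=True)}">\n'
        "</head>\n<body>\n"
        f"<p>{escape(comment)}</p>\n"
        f'<p><a href="{escape(target_url, quote=True)}">{title}</a></p>\n'
        "</body>\n</html>\n"
    )


def write_stubs(stub_dir, base_url, metadata):
    """Write one stub per entry in metadata and return the paths written.

    metadata maps a file name to {"keywords": [...], "comment": "..."}.
    The original documents are never touched; only stub_dir is written to.
    """
    stub_dir = Path(stub_dir)
    stub_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, meta in sorted(metadata.items()):
        url = base_url.rstrip("/") + "/" + quote(name)
        stub = stub_dir / (name + ".html")
        page = build_stub(url, meta.get("keywords", []), meta.get("comment", ""))
        stub.write_text(page, encoding="utf-8")
        written.append(stub)
    return written
```

The indexer would then be pointed at the stub directory instead of the documents themselves. As the reply notes, the drawback is that a search hit lands on the stub page, and the user has to follow the link to reach the actual file.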

