On Fri, June 6, 2008 2:59 am, Brian Chabot said:
>
> I suppose one option would be a stand-alone Apache installation for each
> OS and htdig, but that only indexes HTML and TXT files...
>
You could always extend that setup with additional indexing filters for other formats. I've seen open source projects that incorporate plugins to index things like PDF files, MS doc/xls/ppt files, and JPEG EXIF data, with options for OCR to index scanned documents. One that I tried out was Docmgr:

http://www.docmgr.org/

I'm not sure how much work it would take to get htdig to use the same plugins, but it might be worth investigating.

-- 
John Abreau / Executive Director, Boston Linux & Unix
WWW http://www.abreau.net / PGP-Key-ID 0xD5C7B5D9
PGP-Key-Fingerprint 72 FB 39 4F 3C 3B D6 5B E0 C8 5A 6E F1 2C BE 99