On Thu, Jan 16, 2003 at 10:56AM, martin bower wrote:
> Ive had a quick look and found xpdf, but wondered what the pitfalls are
> writing my own indexer, and any advice would be welcome.

Well, I'll start off with some gratuitous self-promotion ;)

Have you tried looking at the Xapian search library ( http://www.xapian.org/ )? It's quite flexible, and rather impressively fast. A degree of bias from me is inevitable, however, as I am responsible for the Perl bindings ( http://search.cpan.org/author/KILINRAX/Search-Xapian-0.05/ ).

I've found that writing your own indexer can be quite a fun, challenging project. If you opt for that approach instead, you might find the following section of the Xapian docs useful. It gives a brief introduction to information retrieval concepts:

http://www.xapian.org/docs/intro_ir.html

-- 
Alex Bowley
http://hyperspeed.org/

"NETWORK. Anything reticulated or decussated at equal distances, with
interstices between the intersections."
  - Samuel Johnson, 'A Dictionary of the English Language', 1755