On Thu, Jan 16, 2003 at 10:56AM, martin bower wrote:
> I've had a quick look and found xpdf, but wondered what the pitfalls are
> in writing my own indexer, and any advice would be welcome.

Well, I'll start off with some gratuitous self-promotion ;)
Have you tried looking at the Xapian search library
( http://www.xapian.org/ )? It's quite flexible, and rather
impressively fast. A degree of bias from me is inevitable, however, as
I am responsible for the Perl bindings
( http://search.cpan.org/author/KILINRAX/Search-Xapian-0.05/ ).

I've found that writing your own indexer can be a fun, challenging
project. If you opt for that approach instead, you might find the
following section of the Xapian docs useful; it gives a brief
introduction to information retrieval concepts:
http://www.xapian.org/docs/intro_ir.html
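
To give a flavour of what the core of a home-grown indexer looks like,
here is a minimal sketch of an inverted index in Python. It is not how
Xapian works internally; the function names (tokenize, build_index,
search) and the sample documents are made up for illustration, and it
leaves out stemming, ranking and on-disk storage entirely.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Start from the postings of the first term, then intersect the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    # Hypothetical sample documents, keyed by an arbitrary id.
    docs = {
        1: "Xapian is a search library",
        2: "Writing an indexer is fun",
        3: "A search indexer",
    }
    index = build_index(docs)
    print(search(index, "search indexer"))
```

Most of the real pitfalls (extracting text from formats such as PDF,
stemming, ranking, keeping the index on disk) sit on either side of
this core, which is where an existing library tends to earn its keep.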

-- 
Alex Bowley                                           http://hyperspeed.org/
"NETWORK. Anything reticulated or decussated at equal distances, with
 interstices between the intersections."
              - Samuel Johnson, 'A Dictionary of the English Language', 1755
