On Jan 21, 2005, at 10:47 PM, Paul Smith wrote:
As a log4j developer, I've been toying with the idea of what Lucene could do for me, maybe as an excuse to play around with Lucene.

First off, let me thank you for your work with log4j! I've been using it at lucenebook.com with the SMTPAppender (once I learned that I needed a custom trigger to release e-mails when I wanted, not just on errors) and it's been working great.


Now, I could provide a Field to the LoggingEvent Document that has a sequence #, and once a user has chosen an appropriate matching event, do another search for the documents with a Sequence # between +/- the context size.
My question is, is that going to be an efficient way to do this? The sequence # would be treated as text, wouldn't it? Would the range search on an int be the most efficient way to do this?


I know from the Hits documentation that one can retrieve the Document ID of a matching entry. What is the contract on this Document ID? Is each Document added to the Index given an increasing number? Can one search an index by Document ID? Could one search for Document ID's between a range? (Hope you can see where I'm going here).


You wouldn't even need the sequence number. You'll certainly be adding the documents to the index in the proper sequence already (right?). It is easy to random access documents if you know Lucene's document ids. Here's the pseudo-code....

- construct an IndexReader
- open an IndexSearcher using the IndexReader
- search, getting Hits back
- for a hit you want to see the context, get hits.id(hit#)
- subtract context size from the id, grab documents using reader.document(id)


You don't "search" for a document by id, but rather jump right to it with IndexReader.

Many thanks for an excellent API, and kudos to Erik & Otis for a great eBook btw.

Thanks!

        Erik


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to