Well, a couple of things.... An IndexReader is opened on an already existing index, so you need to have built an index before you can open a reader on it. There isn't really a default reader.
Usually, one uses Lucene to build an index from some number of documents and then uses that index for searching. A major part of building an index is precisely reading in the raw text and deciding what goes into the index in which fields. That makes what you're trying to do a bit clumsy, or at least it would have been a release or two ago...

However, you're in great luck, because some kind soul created MemoryIndex, a simplified in-memory-only index intended for indexing a single document, doing some processing on it, and then throwing the index away. That may be just what you need. See the jar in C:\lucene-2.1.0\contrib\memory

But I'm still not sure this is what you need, because Lucene only gives you relevancy rankings relative to the query you submit. If you're doing simple term frequency counting, perhaps you can get creative with TermFreqVector and the other Term* classes...

Perhaps others will be able to chime in here....

Best
Erick

On 4/9/07, Ted Chen <[EMAIL PROTECTED]> wrote:
Hi,

I'm new to Lucene, but I did spend quite some time trying to find an answer to this problem before turning to you gurus for help.

I'm given a text file. My task is to extract some top keywords from the file so that these words can describe the document. Ideally, the terms with the highest TF-IDF should be returned.

I noticed Lucene provides a method called QueryTermExtractor.getIdfWeightedTerms(), but to use this method I need to provide a query and an IndexReader. In my case, I only have a text file and don't have a handle on any IndexReader. Is there any way to indicate that I want to use the "default IndexReader", where the default reader represents a typical collection of documents?

Also, do I need to construct a Query out of the text file? If so, is the best choice a multi-term query?

Any suggestions?

Thanks,
Ted
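
For readers following the thread, here is a minimal sketch of the TF-IDF ranking the question describes, written in plain Java with no Lucene calls. It is an illustration under assumptions, not the approach either poster used: the class `TfIdfKeywords`, its method names, the crude tokenizer, and the smoothed IDF formula `ln(numDocs / (df + 1)) + 1` are all choices made for this example. The sketch also assumes the caller can supply a document-frequency table taken from some background collection, which is the role the "typical collection of documents" would have to play.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Lucene.
public class TfIdfKeywords {

    /** Counts how often each lowercased token occurs in the text. */
    public static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        // Split on anything that is not a letter or digit (a crude stand-in for an analyzer).
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /**
     * Returns up to k terms of the text, ranked by tf * idf.
     * docFreq maps a term to the number of background documents containing it
     * (assumed to be supplied by the caller); numDocs is the size of that collection.
     */
    public static List<String> topKeywords(String text, Map<String, Integer> docFreq,
                                           int numDocs, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> e : termFrequencies(text).entrySet()) {
            int df = docFreq.getOrDefault(e.getKey(), 0);
            // Smoothed IDF (an assumption of this sketch): terms that are rare
            // in the background collection get a larger weight.
            double idf = Math.log((double) numDocs / (df + 1)) + 1.0;
            scores.put(e.getKey(), e.getValue() * idf);
        }
        List<String> terms = new ArrayList<>(scores.keySet());
        // Highest score first; ties are broken alphabetically for a stable order.
        terms.sort(Comparator.comparingDouble((String t) -> scores.get(t))
                .reversed()
                .thenComparing(Comparator.naturalOrder()));
        return terms.subList(0, Math.min(k, terms.size()));
    }

    public static void main(String[] args) {
        // Made-up document frequencies for a made-up collection of 1000 documents.
        Map<String, Integer> docFreq = new HashMap<>();
        docFreq.put("the", 1000);
        docFreq.put("of", 900);
        docFreq.put("index", 10);
        String text = "the index stores the terms of the index";
        System.out.println(topKeywords(text, docFreq, 1000, 3));
    }
}
```

The sketch shows why a single text file is not enough on its own: the term frequencies come from the file, but the IDF half of the score needs document frequencies from some larger collection, which in Lucene terms means an index that a reader can be opened on.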