Re: Order the index by timestamp field and Get n documents

2008-11-16 Thread Tomer Gabel
Possibly the fastest way to do this is to use a sortable timestamp field (e.g. padded long) and use a TermEnumerator, which always gives a lexicographically-sorted enumeration. Since you'd probably prefer a "most recent" policy you may need to come up with a reverse-timestamp scheme (e.g. instead
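
One possible reading of the reverse-timestamp idea above, as a minimal sketch rather than the scheme the poster had in mind (the snippet is cut off before it spells one out). It assumes timestamps are non-negative epoch milliseconds, subtracts them from `Long.MAX_VALUE`, and zero-pads the result to 19 digits so that lexicographic order of the keys matches newest-first order. The class name `ReverseTimestampKey` and its methods are hypothetical and not part of Lucene.

```java
public final class ReverseTimestampKey {
    // Long.MAX_VALUE (9223372036854775807) has 19 decimal digits,
    // so every encoded key is padded to this fixed width.
    private static final int WIDTH = 19;

    private ReverseTimestampKey() {}

    /**
     * Encodes a non-negative epoch-millis timestamp as a fixed-width decimal
     * string. Newer timestamps yield lexicographically smaller keys, so a
     * term enumeration in ascending order visits the most recent first.
     */
    public static String encode(long epochMillis) {
        if (epochMillis < 0) {
            throw new IllegalArgumentException("timestamp must be non-negative");
        }
        long reversed = Long.MAX_VALUE - epochMillis;
        return String.format("%0" + WIDTH + "d", reversed);
    }

    /** Recovers the original timestamp from an encoded key. */
    public static long decode(String key) {
        return Long.MAX_VALUE - Long.parseLong(key);
    }
}
```

The encoded string would be stored as an untokenized field value at indexing time; how the terms are then enumerated depends on the Lucene version in use.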

how to estimate how much memory is required to support the large index search

2008-11-16 Thread Zhibin Mai
Hello, I am a beginner with Lucene. We developed an application that creates and searches an index using Lucene 2.3.1. We would like to know how to estimate how much memory is required to support searching a given index. Recently, the size of the index has reached about 200GB with 197M of

InstantiatedIndex help

2008-11-16 Thread Darren Govoni
Hi gang, I am trying to trace the 2.4 API to create an InstantiatedIndex, but it's rather difficult to connect directory, reader, search, index, etc. just by reading the javadocs. I have a (POI - plain old index) directory already and want to create a faster InstantiatedIndex and IndexSearcher to q

Re: InstantiatedIndex help

2008-11-16 Thread Mark Miller
Check out the docs at: http://lucene.apache.org/java/2_4_0/api/contrib-instantiated/index.html There is a performance graph there to check out. The code should be fairly straightforward - you can make an InstantiatedIndex that's empty, or seed it with an IndexReader. Then you can make an Inst

Re: InstantiatedIndex help

2008-11-16 Thread Darren Govoni
Hi Mark, Thanks for the tips. Here's what I will try (pseudo-code) endirectory = RAMDirectory("index/dictionary.en") ensearcher = IndexSearcher(endirectory) // Adding these reader = ensearcher.getIndexReader() iindex = InstantiatedIndex(reader) ireader = iindex.indexReade

Re: InstantiatedIndex help

2008-11-16 Thread Mark Miller
Can you start with an empty index? Then how about: // Adding these iindex = InstantiatedIndex() ireader = iindex.indexReaderFactory() isearcher = IndexSearcher(ireader) If you want a copy from another IndexReader though, you have to get that reader from somewhere right? - Mark D

Re: InstantiatedIndex help

2008-11-16 Thread Darren Govoni
Yeah. That makes sense. It's not too hard to wrap those extra steps so I can end up with something simpler too. Like: iindex = InstantiatedIndex("path/to/my/index") I'm lazy so the intermediate hoops to jump through clutter my code. Hehe. :) Darren On Sun, 2008-11-16 at 11:46 -0500, Mark Miller

Re: InstantiatedIndex help + first impression

2008-11-16 Thread Darren Govoni
After I switched to InstantiatedIndex from RAMDirectory (but using the reader from my RAMDirectory to create the InstantiatedIndex), I see a less than 25% (.25) improvement in speed. Nowhere near the 100x (100.00) speed mentioned in the documentation. Probably I am doing something wrong. I am usi

Re: Scoped Search and Facets generation using Lucene

2008-11-16 Thread Alexander Aristov
If you mean using XPath then Nutch doesn't support this. You should develop it yourself. Alexander 2008/11/14 Otis Gospodnetic <[EMAIL PROTECTED]> > Hi Mayur, > > Solr has built-in support for facets. I don't understand what you mean by > scoped searches. Could you please give a concrete examp