Re: Inquiry on Lucene Stemming

2008-12-20 Thread Chris Hostetter
: Well, some clients have asked whether it's possible to expand such simple words, and does Lucene have an API for this logic? All I have read is that Lucene's stemming logic works the other way around: for example, "flashing" is trimmed to the root word "flash" when searched. ther

Re: Lucene SpellChecker returns no suggestions after changing Server

2008-12-20 Thread Chris Hostetter
: How can I speed it up? don't construct a new LuceneDictionary/IndexReader on every "suggest" call ... construct them once, and reuse them for each suggestion. : My temporary method: : public static Vector suggest(String query, String indexName, String field, : float accuracy) { :
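
The "construct them once, and reuse them" advice can be sketched in plain Java without Lucene on the classpath. Everything named here is hypothetical: `ExpensiveDictionary` is only a stand-in for the costly IndexReader/LuceneDictionary pair, the word list is made up, and the prefix match is a toy placeholder rather than real spell checking. The point is only that the expensive object is built once per index and field, then reused on every suggest call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class SuggesterCache {
    // Counts constructions of the expensive resource (for illustration only).
    static final AtomicInteger BUILDS = new AtomicInteger();

    // Hypothetical stand-in for an IndexReader + LuceneDictionary pair.
    static final class ExpensiveDictionary {
        final List<String> words;

        ExpensiveDictionary(String indexName, String field) {
            BUILDS.incrementAndGet();
            // Made-up data; a real dictionary would be read from the index.
            words = Arrays.asList("flash", "flashing", "flashes");
        }
    }

    // One cached dictionary per (index, field) pair, shared by all calls.
    private static final Map<String, ExpensiveDictionary> CACHE =
            new ConcurrentHashMap<>();

    static ExpensiveDictionary dictionaryFor(String indexName, String field) {
        return CACHE.computeIfAbsent(indexName + "/" + field,
                key -> new ExpensiveDictionary(indexName, field));
    }

    // Toy suggestion logic: returns dictionary words starting with the query.
    static List<String> suggest(String query, String indexName, String field) {
        List<String> matches = new ArrayList<>();
        for (String word : dictionaryFor(indexName, field).words) {
            if (word.startsWith(query)) {
                matches.add(word);
            }
        }
        return matches;
    }
}
```

Calling `SuggesterCache.suggest(...)` repeatedly for the same index and field builds the dictionary only on the first call. In a real application the cached object would also need to be refreshed or reopened when the underlying index changes, which this sketch does not attempt.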

Re: BooleanQuery Performance Help

2008-12-20 Thread Erick Erickson
What specifically are you measuring when you time the queries? I've been misled by including in my measurement, say, creating the response. I realize that throughput includes assembling the response, but the solution is different depending upon whether it's the actual search or what you do with the
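
One way to separate the two costs mentioned above is to time the search and the response assembly as distinct phases. This is a minimal sketch under stated assumptions: `PhaseTiming`, `measure`, and the two callbacks are hypothetical names, and the callbacks stand in for whatever actually runs the query and builds the response in a given application.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

class PhaseTiming {
    final long searchNanos;    // time spent in the search callback
    final long responseNanos;  // time spent assembling the response
    final String response;

    private PhaseTiming(long searchNanos, long responseNanos, String response) {
        this.searchNanos = searchNanos;
        this.responseNanos = responseNanos;
        this.response = response;
    }

    // Runs the two phases back to back and records each duration separately.
    static PhaseTiming measure(Supplier<List<String>> search,
                               Function<List<String>, String> buildResponse) {
        long start = System.nanoTime();
        List<String> hits = search.get();
        long afterSearch = System.nanoTime();
        String response = buildResponse.apply(hits);
        long afterResponse = System.nanoTime();
        return new PhaseTiming(afterSearch - start,
                               afterResponse - afterSearch,
                               response);
    }
}
```

Comparing `searchNanos` with `responseNanos` over many queries indicates which phase deserves attention. Single measurements are noisy, so averages over repeated runs after a warm-up are more trustworthy.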

Re: BooleanQuery Performance Help

2008-12-20 Thread Paul Elschot
On Saturday 20 December 2008 15:23:43, Prafulla Kiran wrote: > Hi Everyone, > > I have an index of relatively small size (400mb) , containing roughly > 0.7 million documents. The index is actually a copy of an existing > database table. Hence, most of my queries are of the form > > " +field1:value

BooleanQuery Performance Help

2008-12-20 Thread Prafulla Kiran
Hi Everyone, I have an index of relatively small size (400 MB), containing roughly 0.7 million documents. The index is actually a copy of an existing database table. Hence, most of my queries are of the form " +field1:value1 +field2:value2 +field3:value3. ~20 fields" I have been running

Re: Lucene and JSON

2008-12-20 Thread Paul Libbrecht
Thom, Lucene supports binary values and arbitrary strings... this is enough to store JSON or XML... (largely independent of its size!). E.g. we store some objects in JSON using xStream serialization (that is, we add a stored field containing the object serialization... it could be XM

Re: Default and optimal use of RAMDirectory

2008-12-20 Thread Michael McCandless
Actually, things have improved since LIA1 was written a few years ago: IndexWriter now does a good job managing the RAM buffer you assign to it, so you should not see much benefit by doing your own buffering with RAMDirectory (and if you somehow do, I'd like to know about it!). Instead you should