Re: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Michael McCandless
Unfortunately, the terms index (before 4.0) is not RAM efficient -- I wrote about this here: http://chbits.blogspot.com/2010/07/lucenes-ram-usage-for-searching.html Every indexed term that's loaded into RAM creates 4 objects (TermInfo, Term, String, char[]), as you see in your profiler

Re: Autocomplete with Filter Query

2010-09-11 Thread Ingo Renner
On 10.09.2010 at 17:14, David Yang wrote: Hi David, > Is there any way to provide autocomplete while filtering results? Yes, you can use facets to achieve that. Best, Ingo -- Ingo Renner TYPO3 Core Developer, Release Manager TYPO3 4.2, Admin Google Summer of Code TYPO3 - Open Source
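One way to read the facet suggestion above is prefix faceting: facet on a field, restrict the facet values with facet.prefix, and narrow the document set with fq. The sketch below only builds the request parameters. The field name title_autocomplete and the filter category:books are hypothetical, and this is an assumption about what Ingo had in mind, not something stated in the thread.

```python
from urllib.parse import urlencode


def autocomplete_params(prefix, filter_query, field="title_autocomplete", limit=10):
    """Build Solr parameters for facet-based autocomplete inside a filter.

    The field name is a hypothetical example; use a field whose indexed
    terms are the suggestions you want to show.
    """
    return {
        "q": "*:*",              # match everything, then narrow with fq
        "fq": filter_query,      # the filter the suggestions must respect
        "rows": 0,               # only facet counts are needed, not documents
        "facet": "true",
        "facet.field": field,
        "facet.prefix": prefix,  # what the user has typed so far
        "facet.limit": limit,
        "facet.mincount": 1,     # hide values with no matching documents
    }


params = autocomplete_params("sol", "category:books")
query_string = urlencode(params)
print("/select?" + query_string)
```

Because the facet counts are computed over the filtered result set, only completions that actually occur in matching documents come back.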

RE: multivalued fields in result

2010-09-11 Thread Markus Jelsma
Yes, you'll get what is stored and asked for. -Original message- From: Jason Chaffee jchaf...@ebates.com Sent: Sat 11-09-2010 05:27 To: solr-user@lucene.apache.org; Subject: multivalued fields in result Is it possible to return multivalued fields in the result? I would like to have

RE: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Burton-West, Tom
Thanks Mike, Do you use a terms index divisor? Setting that to 2 would halve the amount of RAM required but double (on average) the seek time to locate a given term (but, depending on your queries, that seek time may still be a negligible part of overall query time, i.e., the tradeoff could be very
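The divisor tradeoff described above can be put into rough numbers. This is a back-of-the-envelope sketch, not a measurement: it assumes Lucene's default term index interval of 128, treats "seek time" as the average number of terms scanned after the in-memory lookup, and uses a made-up bytes_per_entry figure for the per-term RAM cost.

```python
def terms_index_estimate(total_terms, divisor=1, index_interval=128,
                         bytes_per_entry=100):
    """Estimate terms-index RAM and scan cost for a given divisor.

    bytes_per_entry is a hypothetical placeholder for the cost of the
    objects held per loaded term; measure your own index to replace it.
    """
    step = index_interval * divisor        # only every step-th term is loaded
    loaded_terms = total_terms // step
    ram_bytes = loaded_terms * bytes_per_entry
    avg_terms_scanned = step / 2           # average scan to reach a term
    return loaded_terms, ram_bytes, avg_terms_scanned


baseline = terms_index_estimate(2_000_000_000, divisor=1)
halved = terms_index_estimate(2_000_000_000, divisor=2)
print("divisor=1:", baseline)
print("divisor=2:", halved)
```

Under these assumptions, going from divisor 1 to 2 halves the loaded terms and the RAM estimate, and doubles the average scan length, which matches the tradeoff as stated.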

mm=0?

2010-09-11 Thread Satish Kumar
Hi, We have a requirement to show at least one result every time -- i.e., even if the user-entered term is not found in any of the documents. I was hoping that setting mm to 0 would return results in all cases, but it does not. For example, if the user-entered term alpha is *not* in any of the documents

Re: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Lance Norskog
There is a trick: facets with only one occurrence tend to be misspellings or dirt. You write a program to fetch the terms (Lucene's CheckIndex is a great starting point) and create a stopwords file. Here's a data mining project: which languages are more vulnerable to dirty OCR? Burton-West, Tom
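The stopwords trick above could look roughly like this once the term counts have been pulled out of the index. The sketch deliberately skips the Lucene side: it assumes you already have a mapping from term to occurrence count (the sample data here is invented), and the function names are illustrative only.

```python
def singleton_terms(term_counts):
    """Return terms that occur exactly once, sorted for a stable file."""
    return sorted(term for term, count in term_counts.items() if count == 1)


def format_stopwords(terms):
    """Render terms one per line, the usual stopwords file layout."""
    return "".join(term + "\n" for term in terms)


# Invented sample counts standing in for terms fetched from the index.
sample_counts = {"library": 5231, "librarv": 1, "the": 90412, "tbe": 1, "ocr": 17}

singletons = singleton_terms(sample_counts)
stopwords_text = format_stopwords(singletons)
print(stopwords_text, end="")

# To save the result, write stopwords_text to a file, for example:
# with open("stopwords.txt", "w", encoding="utf-8") as f:
#     f.write(stopwords_text)
```

A threshold of exactly one occurrence follows the message above; on a real corpus it may be worth checking a sample of the singletons first, since rare but valid terms would be dropped too.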

Re: Solr memory use, jmap and TermInfos/tii

2010-09-11 Thread Michael McCandless
On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom tburt...@umich.edu wrote:  Is there an example of how to set up the divisor parameter in solrconfig.xml somewhere? Alas I don't know how to configure terms index divisor from Solr... In 4.0, w/ flex indexing, the RAM efficiency is much better

Re: Solr and jvm Garbage Collection tuning

2010-09-11 Thread Dennis Gearon
Thanks for the real life examples. You would have to do a LOT of sharding to get that to work better. Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Fri, 9/10/10,