Re: Which stemmer?

2012-11-21 Thread Elmer van Chastelet
ft.nl/wordsearch/ I can extend the app with more analyzers. Please let me know :) --Elmer Example On 11/14/2012 07:55 PM, Scott Smith wrote: Does anyone have any experience with the stemmers? I know that Porter is what "everyone" uses. Am I better off with KStemFilter (better

Re: PhoneticFilterFactory's inject parameter

2012-04-25 Thread Elmer van Chastelet
rdTokenizer? Remember that none of the queries or terms contain whitespace. Thanks again. -Elmer On 04/25/2012 02:53 PM, Ian Lea wrote: You seem to be quietly going round in circles, by yourself! I suggest a small self-contained program/test case with a RAM index created from scr

Re: PhoneticFilterFactory's inject parameter

2012-04-25 Thread Elmer van Chastelet
'm currently using. It contains various fields for the available phonetic filter encoders: https://www.box.com/s/34212e82227e102f6734 Can somebody explain this behavior? What's the real use of the inject parameter of the PhoneticFilterFactory? Thanks in advance. -Elmer On 04/25/2012 12:25

Re: PhoneticFilterFactory's inject parameter

2012-04-25 Thread Elmer van Chastelet
Problem solved. Long story short: for some reason I had deleted documents in the index and the non-deleted documents used the phonetic filter with inject set to false. Works fine now :) On 04/23/2012 09:27 PM, Elmer van Chastelet wrote: Hi all, (scroll to bottom for question) I was setting

Re: PhoneticFilterFactory's inject parameter

2012-04-24 Thread Elmer van Chastelet
'compete' is totally absent in the scoring*, and only the phonetic encoding seems to affect the ranking... * and present in the parsed query, as previously stated. -Elmer

PhoneticFilterFactory's inject parameter

2012-04-23 Thread Elmer van Chastelet
different than performing a query using the search tab in Luke. Q: Can somebody explain why the injected original terms seem to get ignored at query time? Or could it be related to the name of the search field ('value'), or something else? We use Lucene 3.1 with Solr analyzers (by Hibernate Search 3.4.2). -Elmer

Re: Autocompletion on large index

2011-07-07 Thread Elmer
Thanks. Your replies ended up in my spam box, so I missed your recommendation to use FST. I'll do more testing soon with FST instead of TST. And I'll surely take a look at that talk! BR, Elmer On Thu, 2011-07-07 at 11:09 +0200, Dawid Weiss wrote: > Elmer. Tst will have a l

Re: Autocompletion on large index

2011-07-07 Thread Elmer
't think that's possible without changing the inner workings of the TSTLookup? BR, Elmer On Wed, 2011-07-06 at 20:02 +0200, Elmer wrote: > > You could try storing your autocomplete index in a RAMDirectory? > > I forgot to mention. I tried this previously, but that also r

Re: Autocompletion on large index

2011-07-07 Thread Elmer
external HDD. I created a zip with the JAR and source code, available here: http://www.computer-tuning.nl/lucene/TSTLookupWithPrefixLimit.zip You still need the spellchecker for dependencies. BR, Elmer On Wed, 2011-07-06 at 20:52 +0200, Elmer wrote: > I just profiled the application

Re: Autocompletion on large index

2011-07-06 Thread Elmer
I just profiled the application and tst.TernaryTreeNode takes 99.99..% of the memory. I'll test further tomorrow and report on memory usage for smaller indexes that do run. I will email you privately about sharing the index to work with. BR, Elmer -Original message- From: Mi

Re: Autocompletion on large index

2011-07-06 Thread Elmer
You could try storing your autocomplete index in a RAMDirectory? I forgot to mention. I tried this previously, but that also resulted in heap space problems. That's why I was interested in using the new suggest classes :) BR, Elmer -Original message- From: Michael McCan

Re: Autocompletion on large index

2011-07-06 Thread Elmer
builds the tree/fsa/... faster from dictionary than from file (the lookup data file that can be stored and loaded through the .store and .load methods). But the larger set of publication titles is currently no-go with 2.5GB of heapspace, only having a main class that builds the Look

Autocompletion on large index

2011-07-06 Thread Elmer
ork in-memory on such a dataset without increasing Java's heap space? FTR, the 3.3.0 autocompletion modules use more than 2.5GB of RAM, where my own autocompleter index is stored on disk using about 300MB. BR, Elmer - To unsubscri

Re: Boosting a document at query time, based on a field value/range

2011-06-15 Thread Elmer
Hmm, something went wrong. My mail client swapped dates or displayed your initial question as new :? Threading fail ;) Sorry for this :) On Wed, 2011-06-15 at 12:28 +0200, Elmer wrote: > Let's try again ;) > > If I understand you correctly, you want the returned results to

Re: Boosting a document at query time, based on a field value/range

2011-06-15 Thread Elmer
matching the entered range will appear higher, because they match both clauses, while docs that match Q but fall outside that range match only the first clause. Br, Elmer On Thu, 2011-06-09 at 17:10 +0200, Sowmya V.B. wrote: > Hi All > > I have joined the group only today..and began working w

Improving disk efficiency for autocompleter / spellchecker

2011-06-10 Thread Elmer
completer.close() will only be invoked once (until the index is rebuilt/updated), which will probably reduce the IO reads/writes. Are there more ways to improve disk efficiency? BR, Elmer

Re: MultiFieldQueryParser with default AND and stopfilter

2011-06-09 Thread Elmer
ot to be") are still matched on the non-stopword fields. Moreover, the scoring will probably reflect relevance better. BR, Elmer On Thu, 2011-06-09 at 07:32 +1000, Trejkaz wrote: > On Wed, Jun 8, 2011 at 6:52 PM, Elmer wrote: > > the parsed query becomes: > > > > '+(ti

Re: MultiFieldQueryParser with default AND and stopfilter

2011-06-08 Thread Elmer
ly appears in the title, and 'retrieval' only in the description, there is a match (and there is, I just tested it ;)) Br, Elmer On Wed, 2011-06-08 at 16:19 +0100, Ian Lea wrote: > Then surely the stop word issue is a red herring. Using MFQP with AND > everywhere you'll neve

Re: MultiFieldQueryParser with default AND and stopfilter

2011-06-08 Thread Elmer
this solution won't work if the search query is changed to: 'the search project', since 'search' is not in the title field. Br, Elmer On Wed, 2011-06-08 at 16:35 +0200, Elmer wrote: > Thank you, > > I already use the PerFieldAnalyzerWrapper (by Hibernate Search

Re: MultiFieldQueryParser with default AND and stopfilter

2011-06-08 Thread Elmer
itle: 'Lucene project' and desc: 'the open source search software from Apache') I hope it's clear what I mean :) Otherwise, let me know! BR, Elmer On Wed, 2011-06-08 at 14:42 +0100, Ian Lea wrote: > Except that I think he has loads of other fields and wants to keep it

MultiFieldQueryParser with default AND and stopfilter

2011-06-08 Thread Elmer
the query that turns out to be a stopword in a field (i.e. some filter removes the token), I want it to be treated as matched rather than not. Does anyone know a way to deal with this? I would prefer to keep using the MFQP, since I need to support multiple fields, query-time boosting and Lucene syntax