I know it's documented that Lucene/Solr doesn't apply filters to queries with wildcards, but this seems to trip up a lot of users. I can also see why wildcards break a number of filters, but some filters (e.g. mapping charsets) could mostly or entirely work. The N-gram filter is another one that would be great to still run when there are wildcards. If you indexed 4-grams and the query is "*testp*", you currently won't get any results; but the N-gram filter could have a wildcard mode that, in this case, would return just the first 4-gram ("test") as a token.
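To make the idea concrete, here is a rough Python sketch of what such a wildcard mode might do, using the "Supertestplan" example from the quoted thread. This is not Lucene/Solr code: the function names `ngrams` and `wildcard_ngram` are hypothetical, and the sketch assumes lowercased terms, a fixed gram size of 4, and wildcards only at the start or end of the pattern.

```python
def ngrams(term, n=4):
    """Return all contiguous n-grams of the lowercased term, in order."""
    term = term.lower()
    return [term[i:i + n] for i in range(len(term) - n + 1)]


def wildcard_ngram(pattern, n=4):
    """Hypothetical 'wildcard mode': reduce a wildcard pattern to one n-gram.

    Assumption: only leading/trailing '*' are supported. Returns None when
    the literal part still contains a wildcard or is shorter than n, since
    no complete n-gram can be taken from it.
    """
    literal = pattern.strip("*").lower()
    if "*" in literal or "?" in literal or len(literal) < n:
        return None
    return literal[:n]


# Index-time tokens for the word in the document.
indexed = ngrams("Supertestplan")

# Query-time token produced by the hypothetical wildcard mode.
query_token = wildcard_ngram("*testp*")

print(indexed)
print(query_token, query_token in indexed)
```

Note that matching on a single gram is looser than the original pattern: "test" would also match words that contain "test" but not "testp", so a real implementation would probably need to emit more than one gram or check the match afterwards.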
Is this something you've considered? It would have to be enabled in the core, but disabled by default for existing filters; then it could be enabled one by one for existing filters. Apologies if the dev list is a better place for this.

Scott

> -----Original Message-----
> From: Ahmet Arslan [mailto:iori...@yahoo.com]
> Sent: Thursday, November 21, 2013 8:40 AM
> To: solr-user@lucene.apache.org
> Subject: Re: search with wildcard
>
> Hi Andreas,
>
> If you don't want to use wildcards at query time, an alternative is to
> use NGrams at indexing time. This will produce a lot of tokens, e.g.
> the 4-grams of your example: Supertestplan => supe uper pert
> erte rtes *test* estp stpl tpla plan
>
> Is that what you want? By the way, why do you want to search inside of words?
>
> <filter class="solr.NGramFilterFactory" minGramSize="3"
> maxGramSize="4"/>
>
> On Thursday, November 21, 2013 5:23 PM, Andreas Owen <a...@conx.ch>
> wrote:
>
> I suppose I have to create another field with different tokenizers and set
> the boost very low so it doesn't really mess with my ranking, because
> the word is now in 2 fields. What kind of tokenizer can do the job?
>
> From: Andreas Owen [mailto:a...@conx.ch]
> Sent: Thursday, November 21, 2013 16:13
> To: solr-user@lucene.apache.org
> Subject: search with wildcard
>
> I am querying "test" in Solr 4.3.1 over the field below and it's not finding
> all occurrences. It seems that if it is a substring of a word like
> "Supertestplan" it isn't found unless I use wildcards ("*test*"). This is
> right because of my tokenizer, but does someone know a way around this? I
> don't want to add wildcards because that messes up queries with multiple
> words.
>
> <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" enablePositionIncrements="true"/> <!-- remove common words -->
>     <filter class="solr.GermanNormalizationFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="German"/> <!-- remove noun/adjective inflections like plural endings -->
>   </analyzer>
> </fieldType>