I guess you did not bother clicking through the link then, because that's exactly the filter I was using. :-) I am glad you found it this way.
You can also find the full list of filters and tokenizers at:
http://www.solr-start.com/info/analyzers/

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On Thu, Jul 17, 2014 at 12:53 PM, Hayden Muhl <haydenm...@gmail.com> wrote:
> Thank you Jorge. I didn't know about that filter. It's just what I was
> looking for.
>
> - Hayden
>
>
> On Wed, Jul 16, 2014 at 4:35 PM, Jorge Luis Betancourt Gonzalez <
> jlbetanco...@uci.cu> wrote:
>
>> Perhaps what you're trying to do could be addressed by using the
>> EdgeNGramFilterFactory filter? For query suggestions I'm using a very
>> similar approach. This is an extract of the configuration I'm using:
>>
>> <tokenizer class="solr.StandardTokenizerFactory"/>
>> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>> generateNumberParts="1" catenateWords="0" catenateNumbers="0"
>> catenateAll="0" splitOnCaseChange="1"/>
>> <filter class="solr.LowerCaseFilterFactory"/>
>> <filter class="solr.EdgeNGramFilterFactory" maxGramSize="10"
>> minGramSize="1"/>
>>
>> Basically this allows you to get prefix matches on any word in the
>> string. Say the field gets this content at index time: "A brown fox";
>> this document will then be matched by the query "bro", for instance. My
>> personal recommendation is to use this in a separate field that gets
>> populated through a copyField; this way you can apply different boosts.
>>
>> Greetings,
>>
>> On Jul 16, 2014, at 2:00 PM, Hayden Muhl <haydenm...@gmail.com> wrote:
>>
>> > A copy field does not address my problem, and this has nothing to do
>> > with stored fields. This is a query parsing problem, not an indexing
>> > problem.
>> >
>> > Here's the use case.
>> >
>> > If someone has a username like "bob-smith", I would like it to match
>> > prefixes of "bo" and "sm". I tokenize the username into the tokens
>> > "bob" and "smith". Everything is fine so far.
>> >
>> > If someone enters "bo sm" as a search string, I would like "bob-smith"
>> > to be one of the results. The query to do this is straightforward:
>> > "username:bo* username:sm*". Here's the problem. In order to construct
>> > that query, I have to tokenize the search string "bo sm" **on the
>> > client**. I don't want to reimplement tokenization on the client. Is
>> > there any way to give Solr the string "bo sm", have Solr do the
>> > tokenization, then treat each token like a prefix?
>> >
>> >
>> > On Tue, Jul 15, 2014 at 4:55 PM, Alexandre Rafalovitch
>> > <arafa...@gmail.com> wrote:
>> >
>> >> So copyField it to another field and apply alternative processing
>> >> there. Use eDismax to search both. No need to store the copied field,
>> >> just index it.
>> >>
>> >> Regards,
>> >>     Alex
>> >> On 16/07/2014 2:46 am, "Hayden Muhl" <haydenm...@gmail.com> wrote:
>> >>
>> >>> Both fields? There is only one field here: username.
>> >>>
>> >>>
>> >>> On Mon, Jul 14, 2014 at 6:17 PM, Alexandre Rafalovitch
>> >>> <arafa...@gmail.com> wrote:
>> >>>
>> >>>> Search against both fields (one split, one not split)? Keep original
>> >>>> and tokenized form? I am doing something similar with class name
>> >>>> autocompletes here:
>> >>>>
>> >>>> https://github.com/arafalov/Solr-Javadoc/blob/master/JavadocIndex/JavadocCollection/conf/schema.xml#L24
>> >>>>
>> >>>> Regards,
>> >>>>    Alex.
>> >>>>
>> >>>>
>> >>>> On Tue, Jul 15, 2014 at 8:04 AM, Hayden Muhl <haydenm...@gmail.com>
>> >>>> wrote:
>> >>>>> I'm working on using Solr for autocompleting usernames. I'm running
>> >>>>> into a problem with the wildcard queries (e.g. username:al*).
>> >>>>>
>> >>>>> We are tokenizing usernames so that a username like "solr-user"
>> >>>>> will be tokenized into "solr" and "user", and will match both "sol"
>> >>>>> and "use" prefixes. The problem is when we get "solr-u" as a
>> >>>>> prefix, I'm having to split that up on the client side before I
>> >>>>> construct a query "username:solr* username:u*". I'm basically using
>> >>>>> a regex as a poor man's tokenizer.
>> >>>>>
>> >>>>> Is there a better way to approach this? Is there a way to tell Solr
>> >>>>> to tokenize a string and use the parts as prefixes?
>> >>>>>
>> >>>>> - Hayden
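
To illustrate the index-time edge n-gram approach discussed in this thread,
here is a minimal Python sketch. It is not Solr code. The functions tokenize,
edge_ngrams, index_terms and matches are hypothetical names made up for this
illustration; the regex split is only a rough stand-in for
StandardTokenizerFactory plus LowerCaseFilterFactory, and requiring every query
token to match (similar to q.op=AND) is an assumption. The sketch applies the
n-grams at index time only, so a search string such as "bo sm" can be
tokenized on the server and matched as plain terms, without wildcards and
without splitting the string on the client.

```python
import re


def tokenize(text):
    # Rough stand-in for a tokenizer plus lowercasing (an assumption, not
    # Solr's actual behaviour): split on anything that is not a-z or 0-9.
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def edge_ngrams(token, min_gram=1, max_gram=10):
    # Leading-edge n-grams: "bob" -> ["b", "bo", "bob"].
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_terms(value):
    # Terms stored for a field value at index time: every prefix of every token.
    terms = set()
    for tok in tokenize(value):
        terms.update(edge_ngrams(tok))
    return terms


def matches(query, value):
    # Query side: tokenize only, no n-grams. Each query token must equal
    # one of the indexed prefixes (assumed AND semantics).
    indexed = index_terms(value)
    return all(tok in indexed for tok in tokenize(query))


if __name__ == "__main__":
    print(matches("bo sm", "bob-smith"))
    print(matches("solr-u", "solr-user"))
    print(matches("mith", "bob-smith"))
```

In this sketch a query token longer than max_gram can never match, because no
indexed prefix is that long; the gram size limit has to be chosen with the
expected input length in mind.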