Hi Vishal,

Maybe the following pattern can help you (the conf attached by David is
really nice):

...pattern="(\s)+" replacement="" replace="all"/>
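
To show where that filter might sit, here is a minimal sketch based on the
text_general_edge_ngram type quoted below. It is not a tested configuration.
The query-side analyzer is my assumption: it applies the same whitespace
stripping and lowercasing as the index side, so the typed text is compared
in the same normalized form as the indexed grams. As far as I know,
LowerCaseTokenizerFactory splits on non-letters and drops digits, so a query
like "ABCE1" would otherwise be reduced to "abce". Widening the pattern to
[\s_]+ to drop underscores too is also an assumption, noted in a comment.

```xml
<fieldType name="text_general_edge_ngram" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <!-- Keep the whole field value as a single token. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Remove whitespace runs. Assumption: use pattern="[\s_]+" instead
         if underscores should be ignored as well. -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="(\s)+"
            replacement="" replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emit the leading prefixes of the normalized token. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
            maxGramSize="15"/>
  </analyzer>
  <!-- Assumed query side: same normalization, but no n-grams. -->
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="(\s)+"
            replacement="" replace="all"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a chain like this, "ABC E12" should be indexed as the grams a, ab, abc,
abce, abce1 and abce12, so typing "ABCE" would match. The Analysis screen in
the Solr admin UI is a convenient way to check what each step produces.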

Hope it helps.

On Wed, Jan 21, 2015 at 10:57 PM, David M Giannone <david.giann...@gm.com>
wrote:

> This is what we use for our autosuggest field in Solr 3.4.  It works for
> us as you describe below.
>
>
> <fieldType name="autocomplete_edge" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="mapping-ISOLatin1Accent.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="([\.,;:-_])" replacement=" " replace="all"/>
>     <filter class="solr.EdgeNGramFilterFactory"
>             maxGramSize="30" minGramSize="1"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="([^\w\d])" replacement="" replace="all"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="mapping-ISOLatin1Accent.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="([\.,;:-_])" replacement=" " replace="all"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="([^\w\d])" replacement="" replace="all"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
>   </analyzer>
> </fieldType>
>
>
>
> -----Original Message-----
> From: Vishal Swaroop [mailto:vishal....@gmail.com]
> Sent: Wednesday, January 21, 2015 4:40 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Ignore whitesapce, underscore using KeywordTokenizer...
> EdgeNGramFilter
>
> I tried adding *PatternReplaceFilterFactory* in the index section, but it
> is not working.
>
> Example itemName data can be :
> - "ABC E12" : if user types "ABCE" suggestion should be "ABC E12"
> - "ABCE_12" : if user types "ABCE1" suggestion should be "ABCE_12"
>
> <field name="itemName" type="text_general_edge_ngram" indexed="true"
> stored="true" multiValued="false" />
>
> <fieldType name="text_general_edge_ngram" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     *<filter class="solr.PatternReplaceFilterFactory" pattern="(\s+)"
>             replacement="" replace="all"/>*
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
>             maxGramSize="15" side="front"/>
>   </analyzer>
>
>   <analyzer type="query">
>     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>   </analyzer>
> </fieldType>
>
> On Wed, Jan 21, 2015 at 3:31 PM, Alvaro Cabrerizo <topor...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Not sure, but I think that the PatternReplaceFilterFactory or the
> > PatternReplaceCharFilterFactory could help you delete those
> > characters.
> >
> > Regards.
> > On Jan 21, 2015 7:59 PM, "Vishal Swaroop" <vishal....@gmail.com> wrote:
> >
> > > I am trying to implement type-ahead suggestions for a single field,
> > > which should ignore whitespace, underscores, and special characters
> > > in autosuggest.
> > >
> > > It works as suggested by Alex using KeywordTokenizerFactory, but how
> > > can I ignore whitespace, underscores, and so on?
> > >
> > > Example itemName data can be :
> > > "ABC E12" : if user types "ABCE" suggestion should be "ABC E12"
> > > "ABCE_12" : if user types "ABCE1" suggestion should be "ABCE_12"
> > >
> > > Schema.xml
> > > <field name="itemName" type="text_general_edge_ngram" indexed="true"
> > > stored="true" multiValued="false" />
> > >
> > > <fieldType name="text_general_edge_ngram" class="solr.TextField"
> > >            positionIncrementGap="100">
> > >   <analyzer type="index">
> > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
> > >             maxGramSize="15" side="front"/>
> > >   </analyzer>
> > >   <analyzer type="query">
> > >     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
> > >   </analyzer>
> > > </fieldType>
> > >
> >
>
>
