Hi Team,
Could you please clarify the following? My understanding is that a tokenizer
determines how the content is physically indexed on the file system, and that
filters are applied to the query results. The lines below are from my setup.
However, I have seen examples that include filters inside
<analyzer type="index"> and a tokenizer inside <analyzer type="query">, which
confused me.
<fieldType name="customSearch" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="2"/>
  </analyzer>
  <analyzer type="query">
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
  </analyzer>
</fieldType>
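For comparison, the examples I have seen look roughly like the sketch below:
each analyzer has a single tokenizer followed by one or more filters, and both
the index and query analyzers use the same chain. This is only my reconstruction
of that shape, not something from my working setup. The name
customSearchSketch and the particular choice of filters are my own assumptions.

```xml
<!-- Hypothetical field type; the name and the filter choices are assumptions. -->
<fieldType name="customSearchSketch" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- One tokenizer per analyzer: split the text into word tokens. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Filters then transform those tokens in order. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
  </analyzer>
  <analyzer type="query">
    <!-- Same chain at query time, so query terms become comparable bigrams. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
  </analyzer>
</fieldType>
```

If this shape is the correct one, then my setup above differs in two ways: it
declares three tokenizers in the index analyzer, and it has a filter but no
tokenizer in the query analyzer.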
My goal is to use Solr to find the best match among the technology names, e.g.
Actual tech names:
1. Microsoft Visual Studio
2. Microsoft Internet Explorer
3. Microsoft Visio
When a user types "Microsoft Visal Studio", they should get "Microsoft Visual
Studio". Basically, misspelled and jumbled words should match the closest tech
name.