Is your default operator OR? If so, change it to AND.
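A minimal sketch of how I would test that, in Python. The host and core name are placeholders I made up; /search is the handler mentioned in the thread below, and q.op, defType and debug are standard Solr request parameters. As far as I know, edismax takes q.op into account when mm is not set, but please confirm that from the parsedquery in the debug output on your 6.6.2 install.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, default_operator="AND"):
    """Build a Solr request URL with an explicit default operator.

    base_url is a hypothetical handler URL; adjust host, core and handler.
    """
    params = {
        "q": query,
        "defType": "edismax",      # same parser as the /search handler
        "q.op": default_operator,  # AND: every term is required; OR: any term may match
        "debug": "query",          # return only the parsed query, skip the explain output
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and core name, for illustration only.
    print(build_search_url(
        "http://localhost:8983/solr/mycore/search",
        "lymphoid and a non-lymphoid cell",
    ))
```

Run it once with "AND" and once with "OR", then compare the parsedquery and the result counts of the two responses.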
On Fri, Nov 8, 2019 at 11:30 AM Guilherme Viteri <gvit...@ebi.ac.uk> wrote:

> Hi Walter and Paras,
>
> I reindexed after removing all references to StopFilterFactory, and I went from 121 results to nearly 20K, because the search term q="Lymphoid and a non-Lymphoid cell" now matches entities such as "IFT A" or "Lamin A". So I don't think removing it completely is the way to go in our scenario, but I appreciate the suggestion...
>
> Yes, the response is using fl=*
> I am trying some combinations at the moment, but no success yet.
>
> defType=edismax
> q.alt=Lymphoid and a non-Lymphoid cell
> Number of results=1599
> Quite a considerable increase, though the results are reasonably meaningful.
>
> I am sorry, but I didn't understand what exactly you want me to do with the lst (??) and qf and bf.
>
> Thanks everyone for your input.
>
> > On 8 Nov 2019, at 06:45, Paras Lehana <paras.leh...@indiamart.com> wrote:
> >
> > Hi Guilherme
> >
> >> By accident, I ended up querying using the default handler (/select) and it worked.
> >
> > You've just found the culprit. Thanks for giving the material I requested. Your analysis chain is working as expected. I don't see any issue in either StopFilterFactory or your boosts. I also use a boost of 50 when boosting contextual suggestions (boosting "gold iphone" on a page of iphone), but I take Walter's suggestion and will try to optimize my weights. I agree that this value of 50 was not researched much by us either (we never faced performance or relevance issues).
> >
> > The major difference between the two handlers is edismax. I'm pretty sure that your problem lies in the parsing of queries (you can confirm that from the parsedquery key in the debug section of both JSON responses). I hope you have provided the response with fl=*. Replace q with q.alt in your /search handler query and I think you should start getting responses. That's because q.alt uses the standard parser.
> > If you want to keep using edismax, I suggest you test the responses after removing some combination of the lst parameters (qf, bf) to find what's preventing the documents from coming up. I'm out of office today - otherwise I would certainly have tried analyzing the field values of the document from the /select request and comparing them with qf/bq in the /search handler in solrconfig.xml. Do this for me and you'll certainly find something.
> >
> > On Thu, 7 Nov 2019 at 21:00, Walter Underwood <wun...@wunderwood.org> wrote:
> > I normally use a weight of 8 for the most important field, like title. Other fields might get a 4 or 2.
> >
> > I add a “pf” field with the weights doubled, so that phrase matches have a higher weight.
> >
> > The weight of 8 comes from experience at Infoseek and Inktomi, two early web search engines. With different relevance algorithms and totally different evaluation and tuning systems, they settled on weights of 8 and 7.5 for HTML titles. With the two radically different systems getting the same number, I decided that was a property of the documents, not of the search engines.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/ (my blog)
> >
> >> On Nov 7, 2019, at 9:03 AM, Guilherme Viteri <gvit...@ebi.ac.uk> wrote:
> >>
> >> Hi Wunder,
> >>
> >> My indexer takes quite a few hours to run. I am shortening it to run faster, but I also need to make sure it gives what we are expecting. This implementation has been there for >4y and is heavily used.
> >>
> >>> In your edismax handlers, weights of 20, 50, and 100 are extremely high. I don’t think I’ve ever used a weight higher than 16 in a dozen years of configuring Solr.
> >> I've inherited that implementation and I am really keen to adjust it. What would you recommend?
> >>
> >> Cheers
> >> Guilherme
> >>
> >>> On 7 Nov 2019, at 14:43, Walter Underwood <wun...@wunderwood.org> wrote:
> >>>
> >>> Thanks for posting the files. Looking at schema.xml, I see that you are still using StopFilterFactory. The first advice we gave you was to remove that.
> >>>
> >>> Remove StopFilterFactory everywhere and reindex.
> >>>
> >>> You will continue to have problems matching stopwords until you do that.
> >>>
> >>> In your edismax handlers, weights of 20, 50, and 100 are extremely high. I don’t think I’ve ever used a weight higher than 16 in a dozen years of configuring Solr.
> >>>
> >>> wunder
> >>> Walter Underwood
> >>> wun...@wunderwood.org
> >>> http://observer.wunderwood.org/ (my blog)
> >>>
> >>>> On Nov 7, 2019, at 6:56 AM, Guilherme Viteri <gvit...@ebi.ac.uk> wrote:
> >>>>
> >>>> Hi Paras, everyone
> >>>>
> >>>> Thank you again for your input and suggestions. I am sorry to hear you had trouble with the attachments; I will host them somewhere and share the links.
> >>>> I don't tweak my index. I get the data from the graph database, create the documents as they are, and save them to Solr.
> >>>>
> >>>> So, I am sending the new analysis screen, querying the way you suggested, along with the results with params and the Solr query URL.
> >>>>
> >>>> While running the queries you asked for, I found something really weird (at least for me). By accident, I ended up querying using the default handler (/select) and it worked. If I use the handler I must use, then sadly it doesn't work. I am posting both results and I will also post the handlers.
> >>>>
> >>>> Here is the link with all the files mentioned before:
> >>>> https://www.dropbox.com/sh/fymfm1q94zum1lx/AADwU1c9EUf2A4d7FtzSKR54a?dl=0
> >>>> If the link doesn't work: www dot dropbox dot com slash sh slash fymfm1q94zum1lx/AADwU1c9EUf2A4d7FtzSKR54a ? dl equals 0
> >>>>
> >>>> Thanks
> >>>>
> >>>>> On 7 Nov 2019, at 05:23, Paras Lehana <paras.leh...@indiamart.com> wrote:
> >>>>>
> >>>>> Hi Guilherme.
> >>>>>
> >>>>>> I am sending the analysis result and the json result as requested.
> >>>>>
> >>>>> Thanks for the effort. Luckily, I can see your attachments (low quality though).
> >>>>>
> >>>>> From the analysis screen, the analysis is working as expected. One of the reasons I can initially think of for query="lymphoid and *a* non-lymphoid cell" not matching a document containing "Lymphoid and a non-Lymphoid cell" is that the stopword "a" is probably present post-analysis in either the query or the index. Did you tweak your index-time analysis after indexing?
> >>>>>
> >>>>> Do two things:
> >>>>>
> >>>>> 1. Post the analysis screen for index=*"Immunoregulatory interactions between a Lymphoid and a non-Lymphoid cell"* and query=*"lymphoid and a non-lymphoid cell"*. Try hosting the image and providing the link here.
> >>>>> 2. Give the same JSON output as you have sent, but this time with *"echoParams=all"*. Also, post the exact Solr query URL.
> >>>>>
> >>>>> On Wed, 6 Nov 2019 at 21:07, Erick Erickson <erickerick...@gmail.com> wrote:
> >>>>>
> >>>>>> I don’t see the attachments, maybe I deleted old e-mails or some such.
> >>>>>> The Apache server is fairly aggressive about stripping attachments though, so it’s also possible they didn’t make it through.
> >>>>>>
> >>>>>>> On Nov 6, 2019, at 9:28 AM, Guilherme Viteri <gvit...@ebi.ac.uk> wrote:
> >>>>>>>
> >>>>>>> Thanks Erick.
> >>>>>>>
> >>>>>>>> First, your index and query analysis chains are considerably different; this can easily be a source of problems. In particular, using two different tokenizers is a huge red flag. I _strongly_ recommend against this unless you’re totally sure you understand the consequences. Additionally, your use of the length filter is suspicious, especially since your problem statement is about the addition of a single letter term and the min length allowed on that filter is 2. That said, it’s reasonable to suppose that the ’a’ is filtered out in both cases, but maybe you’ve found something odd about the interactions.
> >>>>>>> I will investigate the min length and post the results later.
> >>>>>>>
> >>>>>>>> Second, I have no idea what this will do. Are the equal signs typos? Used by custom code?
> >>>>>>> This is the URL in my application, not Solr params. That's the query string.
> >>>>>>>
> >>>>>>>> What does “species=“ do? That’s not Solr syntax, so it’s likely that all the params with an equal-sign are totally ignored unless it’s just a typo.
> >>>>>>> This is part of the application. Species will be used later on in Solr to filter the results. That's not Solr; those are my app's params.
> >>>>>>>
> >>>>>>>> Third, the easiest way to see what’s happening under the covers is to add “&debug=true” to the query and look at the parsed query. Ignore all the relevance calculations for the nonce, or specify “&debug=query” to skip that part.
> >>>>>>> The two JSON files I've sent were produced with debugQuery=on, and the explain tag is present.
> >>>>>>> I will try searching the way you mentioned.
> >>>>>>>
> >>>>>>> Thanks for your input
> >>>>>>>
> >>>>>>> Guilherme
> >>>>>>>
> >>>>>>>> On 6 Nov 2019, at 14:14, Erick Erickson <erickerick...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Fwd to another server
> >>>>>>>>
> >>>>>>>> First, your index and query analysis chains are considerably different; this can easily be a source of problems. In particular, using two different tokenizers is a huge red flag. I _strongly_ recommend against this unless you’re totally sure you understand the consequences. Additionally, your use of the length filter is suspicious, especially since your problem statement is about the addition of a single letter term and the min length allowed on that filter is 2. That said, it’s reasonable to suppose that the ’a’ is filtered out in both cases, but maybe you’ve found something odd about the interactions.
> >>>>>>>>
> >>>>>>>> Second, I have no idea what this will do. Are the equal signs typos? Used by custom code?
> >>>>>>>>
> >>>>>>>>>> https://dev.reactome.org/content/query?q=lymphoid+and+a+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
> >>>>>>>>
> >>>>>>>> What does “species=“ do? That’s not Solr syntax, so it’s likely that all the params with an equal-sign are totally ignored unless it’s just a typo.
> >>>>>>>>
> >>>>>>>> Third, the easiest way to see what’s happening under the covers is to add “&debug=true” to the query and look at the parsed query.
> >>>>>>>> Ignore all the relevance calculations for the nonce, or specify “&debug=query” to skip that part.
> >>>>>>>>
> >>>>>>>> 90% + of the time, the question “why didn’t this query do what I expect” is answered by looking at the “&debug=query” output and the analysis page in the admin UI. NOTE: for the analysis page be sure to look at _both_ the query and index output. Also, and very important about the analysis page (and this is confusing), it _assumes_ that what you put in the text boxes has made it through the query parser intact and is analyzed by the field selected. Consider the search “q=field:word1 word2”. Now you type “word1 word2” into the analysis text box and it looks like what you expect. That’s misleading because the query is _parsed_ as “field:word1 default_search_field:word2”. This is where “&debug=query” helps.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Erick
> >>>>>>>>
> >>>>>>>>> On Nov 6, 2019, at 2:36 AM, Paras Lehana <paras.leh...@indiamart.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Walter,
> >>>>>>>>>
> >>>>>>>>>> The solr.StopFilter removes all tokens that are stopwords. Those words will not be in the index, so they can never match a query.
> >>>>>>>>>
> >>>>>>>>> I think the OP's concern is getting different results when adding a stopword. I think he's using the filter factory correctly - the query chain includes the filter as well, so it should remove "a" while querying.
> >>>>>>>>>
> >>>>>>>>> *@Guilherme*, please post the results for the query and the document you are concerned about, plus the full output of the analysis screen (for both query and index).
> >>>>>>>>>
> >>>>>>>>> On Tue, 5 Nov 2019 at 21:38, Walter Underwood <wun...@wunderwood.org> wrote:
> >>>>>>>>>
> >>>>>>>>>> No.
> >>>>>>>>>>
> >>>>>>>>>> The solr.StopFilter removes all tokens that are stopwords. Those words will not be in the index, so they can never match a query.
> >>>>>>>>>>
> >>>>>>>>>> 1. Remove the lines with solr.StopFilter from every analysis chain in schema.xml.
> >>>>>>>>>> 2. Reload the collection, restart Solr, or whatever to read the new config.
> >>>>>>>>>> 3. Reindex all of the documents.
> >>>>>>>>>>
> >>>>>>>>>> When indexed with the new analysis chain, the stopwords will not be removed and they will be searchable.
> >>>>>>>>>>
> >>>>>>>>>> wunder
> >>>>>>>>>> Walter Underwood
> >>>>>>>>>> wun...@wunderwood.org
> >>>>>>>>>> http://observer.wunderwood.org/ (my blog)
> >>>>>>>>>>
> >>>>>>>>>>> On Nov 5, 2019, at 8:56 AM, Guilherme Viteri <gvit...@ebi.ac.uk> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Ok. I am kind of lost now.
> >>>>>>>>>>> If I open up the console > analysis and perform it, that's the final result.
> >>>>>>>>>>> <Screenshot 2019-11-05 at 14.54.16.png>
> >>>>>>>>>>>
> >>>>>>>>>>> Your suggestion is: get rid of the <filter stopword.txt> in the schema.xml and, during the index phase, replaceAll("in stopwords.txt"," "), then add to Solr. Is that correct?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks David
> >>>>>>>>>>>
> >>>>>>>>>>>> On 5 Nov 2019, at 14:48, David Hastings <hastings.recurs...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Fwd to another server
> >>>>>>>>>>>>
> >>>>>>>>>>>> no,
> >>>>>>>>>>>> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> >>>>>>>>>>>>
> >>>>>>>>>>>> is still using stopwords and should be removed. That is my opinion, of course, and your use case may be different, but I generally axe any reference to them at all.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Nov 5, 2019 at 9:47 AM Guilherme Viteri <gvit...@ebi.ac.uk> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks.
> >>>>>>>>>>>>> Haven't I done this here?
> >>>>>>>>>>>>> <fieldType name="text_field" class="solr.TextField" positionIncrementGap="100" omitNorms="false">
> >>>>>>>>>>>>>   <analyzer type="index">
> >>>>>>>>>>>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
> >>>>>>>>>>>>>     <filter class="solr.ClassicFilterFactory"/>
> >>>>>>>>>>>>>     <filter class="solr.LengthFilterFactory" min="2" max="20"/>
> >>>>>>>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>>>>>>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> >>>>>>>>>>>>>   </analyzer>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 5 Nov 2019, at 14:15, David Hastings <hastings.recurs...@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Fwd to another server
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The first thing you should do is remove any reference to stop words and never use them, then re-index your data and try it again.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Nov 5, 2019 at 9:14 AM Guilherme Viteri <gvit...@ebi.ac.uk> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I am performing a search to match a name (text_field); however, this term contains 'and' and 'a', and it doesn't return any records. If I remove 'a' then it works.
> >>>>>>>>>>>>>>> e.g.
> >>>>>>>>>>>>>>> Search term: lymphoid and a non-lymphoid cell
> >>>>>>>>>>>>>>> doesn't work:
> >>>>>>>>>>>>>>> https://dev.reactome.org/content/query?q=lymphoid+and+a+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Search term: lymphoid and non-lymphoid cell
> >>>>>>>>>>>>>>> works:
> >>>>>>>>>>>>>>> https://dev.reactome.org/content/query?q=lymphoid+and+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
> >>>>>>>>>>>>>>> interested in the first result
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> schema.xml
> >>>>>>>>>>>>>>> <field name="name" type="text_field" indexed="true" stored="true" omitNorms="false" required="true" multiValued="false"/>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> <analyzer type="query">
> >>>>>>>>>>>>>>>   <tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9/._:]"/>
> >>>>>>>>>>>>>>>   <filter class="solr.PatternReplaceFilterFactory" pattern="^[/._:]+" replacement=""/>
> >>>>>>>>>>>>>>>   <filter class="solr.PatternReplaceFilterFactory" pattern="[/._:]+$" replacement=""/>
> >>>>>>>>>>>>>>>   <filter class="solr.PatternReplaceFilterFactory" pattern="[_]" replacement=" "/>
> >>>>>>>>>>>>>>>   <filter class="solr.LengthFilterFactory" min="2" max="20"/>
> >>>>>>>>>>>>>>>   <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>>>>>>>>>>>   <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> >>>>>>>>>>>>>>> </analyzer>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> <fieldType name="text_field" class="solr.TextField" positionIncrementGap="100" omitNorms="false">
> >>>>>>>>>>>>>>>   <analyzer type="index">
> >>>>>>>>>>>>>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
> >>>>>>>>>>>>>>>     <filter class="solr.ClassicFilterFactory"/>
> >>>>>>>>>>>>>>>     <filter class="solr.LengthFilterFactory" min="2" max="20"/>
> >>>>>>>>>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>>>>>>>>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> >>>>>>>>>>>>>>>   </analyzer>
> >>>>>>>>>>>>>>>   <analyzer type="query">
> >>>>>>>>>>>>>>>     <tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9/._:]"/>
> >>>>>>>>>>>>>>>     <filter class="solr.PatternReplaceFilterFactory" pattern="^[/._:]+" replacement=""/>
> >>>>>>>>>>>>>>>     <filter class="solr.PatternReplaceFilterFactory" pattern="[/._:]+$" replacement=""/>
> >>>>>>>>>>>>>>>     <filter class="solr.PatternReplaceFilterFactory" pattern="[_]" replacement=" "/>
> >>>>>>>>>>>>>>>     <filter class="solr.LengthFilterFactory" min="2" max="20"/>
> >>>>>>>>>>>>>>>     <filter class="solr.LowerCaseFilterFactory"/>
> >>>>>>>>>>>>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> >>>>>>>>>>>>>>>   </analyzer>
> >>>>>>>>>>>>>>> </fieldType>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> stopwords.txt
> >>>>>>>>>>>>>>> #Standard english stop words taken from Lucene's StopAnalyzer
> >>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>> b
> >>>>>>>>>>>>>>> c
> >>>>>>>>>>>>>>> ....
> >>>>>>>>>>>>>>> an
> >>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>> are
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Running Solr 6.6.2.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Is there anything I could do to prevent this?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>>> Guilherme
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> *Paras Lehana* [65871]
> >>>>>>>>> Development Engineer, Auto-Suggest,
> >>>>>>>>> IndiaMART Intermesh Ltd.
> >>>>>>>>>
> >>>>>>>>> 8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
> >>>>>>>>> Noida, UP, IN - 201303
> >>>>>>>>>
> >>>>>>>>> Mob.: +91-9560911996
> >>>>>>>>> Work: 01203916600 | Extn: *8173*
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> IMPORTANT:
> >>>>>>>>> NEVER share your IndiaMART OTP/ Password with anyone.