Re: Problem with caps and star symbol
Thanks for the pointer. I was indeed tripping over that issue, but now I need a bit more help.

As far as I have noticed, for a value like role_delete, WordDelimiterFilterFactory indexes two words, role and delete, so a search for either role or delete returns that document. For a value like role_delete I want to index all four terms: [role_delete, roledelete, role, delete]. In other words, both the original word and the words produced by WordDelimiterFilterFactory should be indexed.

Is that possible? Can an additional filter combined with WordDelimiterFilterFactory do that, or is there any other filter that performs such an operation?

On Tue, May 31, 2011 at 8:07 PM, Erick Erickson erickerick...@gmail.com wrote:

I think you're tripping over the issue that wildcards aren't analyzed; they don't go through your analysis chain. So the casing matters. Try lowercasing the input and I believe you'll see more like what you expect...

Best
Erick

On Mon, May 30, 2011 at 12:07 AM, Saumitra Chowdhury saumi...@smartitengineering.com wrote:

I am sending some XML to illustrate the scenario.
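Erick's suggestion quoted in this message (lowercase the input, because wildcard terms skip the analysis chain) could be applied on the client side before the query is sent to Solr. The following is only a minimal Python sketch under that assumption; the helper name `lowercase_wildcards` is hypothetical and not part of Solr or any client library. It lowercases only the terms that contain `*` or `?` and leaves an optional `field:` prefix untouched, since field names are case-sensitive.

```python
def lowercase_wildcards(query: str) -> str:
    """Lowercase wildcard terms in a whitespace-separated query string.

    Hypothetical helper: wildcard terms are not analyzed, so the
    LowerCaseFilterFactory in the field's chain never sees them.
    """
    rewritten = []
    for term in query.split():
        # Split off an optional "field:" prefix; only the value is lowercased.
        field, sep, value = term.rpartition(":")
        if "*" in value or "?" in value:
            value = value.lower()
        rewritten.append(field + sep + value)
    return " ".join(rewritten)


print(lowercase_wildcards("name:Role*"))   # name:role*
print(lowercase_wildcards("ROLE_DELETE"))  # ROLE_DELETE (no wildcard, unchanged)
```

Lowercasing alone may not be enough for a term such as name_bill*: the word-delimiter filter is skipped for wildcard terms as well, so the underscore stays in the query while the indexed token may no longer contain it.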
Re: Problem with caps and star symbol
It's working as I was looking for. Thanks, Mr. Erick.

On Wed, Jun 1, 2011 at 8:29 PM, Erick Erickson erickerick...@gmail.com wrote:

Take a look here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

I think you want generateWordParts=1, catenateWords=1 and preserveOriginal=1, but check it out with the admin/analysis page.

Oh, and your index-time and query-time patterns for WDFF will probably be different; see the example schema.

Best
Erick

On Wed, Jun 1, 2011 at 7:40 AM, Saumitra Chowdhury saumi...@smartitengineering.com wrote:

Thanks for the pointer. I was indeed tripping over that issue, but now I need a bit more help.

As far as I have noticed, for a value like role_delete, WordDelimiterFilterFactory indexes two words, role and delete, so a search for either role or delete returns that document. For a value like role_delete I want to index all four terms: [role_delete, roledelete, role, delete]. In other words, both the original word and the words produced by WordDelimiterFilterFactory should be indexed.

Is that possible? Can an additional filter combined with WordDelimiterFilterFactory do that, or is there any other filter that performs such an operation?

On Tue, May 31, 2011 at 8:07 PM, Erick Erickson erickerick...@gmail.com wrote:

I think you're tripping over the issue that wildcards aren't analyzed; they don't go through your analysis chain. So the casing matters. Try lowercasing the input and I believe you'll see more like what you expect...

Best
Erick

On Mon, May 30, 2011 at 12:07 AM, Saumitra Chowdhury saumi...@smartitengineering.com wrote:

I am sending some XML to illustrate the scenario.
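To make the effect of the three options Erick names (generateWordParts, catenateWords, preserveOriginal) easier to picture, here is a toy Python sketch. It is not Solr's implementation: the function `word_delimiter_tokens` and its parameters are hypothetical, it only splits on non-alphanumeric characters (the real filter also handles case changes and letter/digit transitions), and it folds in the lowercasing step that a LowerCaseFilterFactory later in the chain would perform.

```python
import re


def word_delimiter_tokens(token, generate_word_parts=True,
                          catenate_words=True, preserve_original=True):
    """Toy model of a word-delimiter filter followed by lowercasing."""
    # Assumption: sub-words are separated by any run of non-alphanumerics.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]

    emitted = []
    if preserve_original:
        emitted.append(token)            # e.g. ROLE_DELETE
    if generate_word_parts:
        emitted.extend(parts)            # e.g. ROLE, DELETE
    if catenate_words and len(parts) > 1:
        emitted.append("".join(parts))   # e.g. ROLEDELETE

    # Lowercase and drop duplicates while keeping the emission order.
    result = []
    for t in emitted:
        t = t.lower()
        if t not in result:
            result.append(t)
    return result


print(word_delimiter_tokens("ROLE_DELETE"))
# ['role_delete', 'role', 'delete', 'roledelete']
```

With all three options on, the sketch yields the four terms asked for in the quoted message. The tokens a real schema produces should still be confirmed on the admin/analysis page, as Erick suggests.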
Re: Problem with caps and star symbol
I am sending some XML to illustrate the scenario.

Indexed term = ROLE_DELETE
Search term = roledelete

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">4</int>
        <lst name="params">
          <str name="indent">on</str>
          <str name="start">0</str>
          <str name="q">name : roledelete</str>
          <str name="version">2.2</str>
          <str name="rows">10</str>
        </lst>
      </lst>
      <result name="response" numFound="1" start="0">
        <doc>
          <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
          <str name="displayName">Global Role for Deletion</str>
          <str name="id">role:9223372036854775802</str>
          <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
          <str name="name">ROLE_DELETE</str>
        </doc>
      </result>
    </response>

Indexed term = ROLE_DELETE
Search term = role

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">5</int>
        <lst name="params">
          <str name="indent">on</str>
          <str name="start">0</str>
          <str name="q">name : role</str>
          <str name="version">2.2</str>
          <str name="rows">10</str>
        </lst>
      </lst>
      <result name="response" numFound="1" start="0">
        <doc>
          <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
          <str name="displayName">Global Role for Deletion</str>
          <str name="id">role:9223372036854775802</str>
          <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
          <str name="name">ROLE_DELETE</str>
        </doc>
      </result>
    </response>

Indexed term = ROLE_DELETE
Search term = role*

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">4</int>
        <lst name="params">
          <str name="indent">on</str>
          <str name="start">0</str>
          <str name="q">name : role*</str>
          <str name="version">2.2</str>
          <str name="rows">10</str>
        </lst>
      </lst>
      <result name="response" numFound="1" start="0">
        <doc>
          <str name="creationDate">Mon May 30 13:09:14 BDST 2011</str>
          <str name="displayName">Global Role for Deletion</str>
          <str name="id">role:9223372036854775802</str>
          <str name="lastModifiedDate">Mon May 30 13:09:14 BDST 2011</str>
          <str name="name">ROLE_DELETE</str>
        </doc>
      </result>
    </response>

Indexed term = ROLE_DELETE
Search term = Role*

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">4</int>
        <lst name="params">
          <str name="indent">on</str>
          <str name="start">0</str>
          <str name="q">name : Role*</str>
          <str name="version">2.2</str>
          <str name="rows">10</str>
        </lst>
      </lst>
      <result name="response" numFound="0" start="0"/>
    </response>

Indexed term = ROLE_DELETE
Search term = ROLE_DELETE*

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">4</int>
        <lst name="params">
          <str name="indent">on</str>
          <str name="start">0</str>
          <str name="q">name : ROLE_DELETE*</str>
          <str name="version">2.2</str>
          <str name="rows">10</str>
        </lst>
      </lst>
      <result name="response" numFound="0" start="0"/>
    </response>

I am also attaching an analysis HTML page.

On Mon, May 30, 2011 at 7:19 AM, Erick Erickson erickerick...@gmail.com wrote:

I'd start by looking at the analysis page from the Solr admin page. That will give you an idea of the transformations the various steps carry out; it's invaluable!

Best
Erick

On May 26, 2011 12:53 AM, Saumitra Chowdhury saumi...@smartitengineering.com wrote:

Hi all,

In my schema.xml I am using WordDelimiterFilterFactory, LowerCaseFilterFactory and StopFilterFactory for the index analyzer, plus an extra SynonymFilterFactory for the query analyzer. I am indexing a field named 'name'.

If a value in all caps like NAME_BILL is indexed, I am able to get it as a search result with the terms name_bill, NAME_BILL, namebill, namebill*, nameb* ...

But for terms like NAME_BILL*, name_bill*, namebill*, NAME* the result does not show this document.

Can anyone please explain why this is happening? In fact, the star (*) gives no result in many cases, especially when it is used after the full value of a field.

A portion of my schema is given below.
    <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>
    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      </analyzer>
    </fieldType>
    <fieldType name="textTight" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class