Re: Solr4.2 - Fuzzy Search Problems
Thanks Chris. For my second question (~1 returning words at edit distance 2), stemming may well be the cause. I am still looking into my last issue; I hope the JIRA ticket helps to resolve it.

Chris Hostetter-3 wrote:

: 2) although I set editing distance to 1 in my query (e.g. worde~1), solr
: returns me results having 2 editing distance (like WORDOES, WORHEE, WORKEE,
: .. ect. )

Fuzzy search works on *terms* in your index. If you use a stemmer when you index your data (your schema shows that you do), then a word in your input like WORDOES might wind up in your index as a term within the edit distance you specified (i.e. wordo or word or something similar).

: 3) Last and major issue, I had very few data at startup in my solr core (say
: around 1K - 2K ), at that time, when i was searching with worde~1 , it was
: returning many records (around 450).
:
: Then I ingested few more records in my solr core (say around 1K). It was
: ingested successfully , no errors or warning in Log. After that when I
: performed the same fuzzy search (worde~1) on previous records only, not in
: new ingested records , It did not return me previous results(around 450) as
: well, and return total 1 record only having highlight as WORD!N .

This sounds like the same issue as described in SOLR-4824:
https://issues.apache.org/jira/browse/SOLR-4824

-Hoss

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr4-2-Fuzzy-Search-Problems-tp4063199p4065576.html
Sent from the Solr - User mailing list archive at Nabble.com.
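To make the stemming explanation concrete, here is a minimal sketch that computes plain Levenshtein distance between the query term and a few candidate terms. It is only an illustration: `wordo` and `word` are the hypothetical stemmed forms mentioned in the reply, not verified Porter stemmer output, and the function does not model transpositions, which Lucene's fuzzy matching may count as a single edit.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


query = "worde"
# "wordoes" is the lowercased surface word; "wordo" and "word" are
# hypothetical stemmed index terms taken from the reply above.
for term in ["wordoes", "wordo", "word"]:
    print(term, edit_distance(query, term))
# wordoes 2
# wordo 1
# word 1
```

The surface word is two edits away from the query, but either assumed index term is only one edit away, so the document can match worde~1 even though the highlighted word looks too distant.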
Re: Issue with getting highlight with hl.maxAnalyzedChars = -1
Hi,

The query pasted in my post returns 1 record with 0 highlights. If I just remove hl.maxAnalyzedChars=-1 from the query, it returns the proper highlight. The same query with a different random id works fine and returns highlights properly, while a few records return 0 highlights with hl.maxAnalyzedChars=-1.

So hl.maxAnalyzedChars=-1 is causing the issue, and it fails seemingly at random for a few records. I cannot remove hl.maxAnalyzedChars=-1 from my search, because my text field is very long and I don't want the highlighter to limit the number of characters it scans. So I need to resolve this somehow without removing hl.maxAnalyzedChars=-1 from the search query.

Dmitry Kan-2 wrote:

You didn't say what exactly is going weird.

On Fri, May 10, 2013 at 2:19 PM, meghana <meghana.ravani@> wrote:

I am facing one weird issue while setting hl.maxAnalyzedChars to -1 to fetch highlights: it fails for some random records, while for other records it works fine. Below is my Solr query:

http://localhost:8080/solr/core0/select?q=(text:new year) AND (id:2343287)&hl=on&hl.fl=text&hl.fragsize=500&hl.maxAnalyzedChars=-1

If I remove hl.maxAnalyzedChars=-1 from the above query, or set it to some positive value (higher than the text field length), then it returns the record with the proper highlight. But my text field is very long and I do not want to limit it, so I need to set hl.maxAnalyzedChars to -1. Please help me to solve this.
Solr4.2 - Fuzzy Search Problems
I am using Solr 4.2 and have a few questions about the new fuzzy implementation in Solr 4+.

1) I have learned that Solr 4+ accepts a maximum edit distance of 2 (2 insertions, deletions or replacements). Is there any way I can configure this maximum edit distance limit?

2) Although I set the edit distance to 1 in my query (e.g. worde~1), Solr returns results at edit distance 2 (like WORDOES, WORHEE, WORKEE, etc.).

3) Last and major issue: I had very little data in my Solr core at startup (around 1K - 2K records). At that time, searching with worde~1 returned many records (around 450).

Then I ingested some more records into the core (around 1K). They were ingested successfully, with no errors or warnings in the log. After that, when I ran the same fuzzy search (worde~1) restricted to the previous records only, not the newly ingested ones, it no longer returned the previous results (around 450); it returned just 1 record in total, with the highlight WORD!N.

It seems the problem is introduced somewhere while ingesting the last 1K records, but I am not able to pin it down. Solr does not report any error or warning in the log, and I don't know how to debug this ingestion issue.

Below is the configuration for my text field type text_en_splitting.

<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" protected="protwords.txt" types="wdfftypes.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_extra_query.txt" enablePositionIncrements="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" protected="protwords.txt" types="wdfftypes.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

I also have one copy field on this text field, with field type text_general_preserved. Below is its configuration.

<fieldType name="text_general_preserved" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_ns.txt" enablePositionIncrements="false"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_extra_query.txt" enablePositionIncrements="false"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

I hope I have explained my questions clearly. Please help me with this.
Issue with getting highlight with hl.maxAnalyzedChars = -1
I am facing one weird issue while setting hl.maxAnalyzedChars to -1 to fetch highlights: it fails for some random records, while for other records it works fine. Below is my Solr query:

http://localhost:8080/solr/core0/select?q=(text:new year) AND (id:2343287)&hl=on&hl.fl=text&hl.fragsize=500&hl.maxAnalyzedChars=-1

If I remove hl.maxAnalyzedChars=-1 from the above query, or set it to some positive value (higher than the text field length), then it returns the record with the proper highlight. But my text field is very long and I do not want to limit it, so I need to set hl.maxAnalyzedChars to -1. Please help me to solve this.
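Because the query string contains spaces, parentheses and several hl.* parameters, it can help to build the URL programmatically so that every parameter is separated and encoded correctly. The following is a minimal sketch using only the Python standard library; the host, core name and parameter values are the ones from the post, and nothing here is specific to any Solr client library.

```python
from urllib.parse import urlencode, parse_qs

base_url = "http://localhost:8080/solr/core0/select"
params = {
    "q": "(text:new year) AND (id:2343287)",
    "hl": "on",
    "hl.fl": "text",
    "hl.fragsize": 500,
    "hl.maxAnalyzedChars": -1,  # the value under investigation
}

# urlencode joins the parameters with "&" and percent-encodes the values.
query_string = urlencode(params)
url = f"{base_url}?{query_string}"
print(url)

# Round-trip check: the decoded parameters match what was intended.
print(parse_qs(query_string)["hl.maxAnalyzedChars"])
```

Toggling only the hl.maxAnalyzedChars entry between runs keeps the rest of the request identical, which makes it easier to confirm that this parameter alone changes the highlighting result.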
Re: Issue with fuzzy search in Distributed Search
Please help me on this!!
How to get Term Vector Information on Distributed Search
Hi,

I am using a distributed query to fetch records. The Distributed Search document on the wiki says that term vectors are supported in distributed queries, but I am getting an error while querying. I am not sure whether I am doing anything wrong. Below is my query to fetch term vectors with distributed search:

http://localhost:8080/solr/core1/tvrh?q=id:3426545&tv.all=true&f.text.tv.tf_idf=false&f.text.tv.df=false&tv.fl=text&shards=localhost:8080/solr/core1,localhost:8080/solr/core2,localhost:8080/solr/core3&shards.qt=select&debugQuery=on

Below is the error I get:

java.lang.NullPointerException
    at org.apache.solr.handler.component.TermVectorComponent.finishStage(TermVectorComponent.java:437)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:317)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
    at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:877)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:671)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
    at java.lang.Thread.run(Unknown Source)

Please help me with this.

Thanks
Meghana
Re: Issue with fuzzy search in Distributed Search
To make sure all the records exist on a single node, I restricted the query to a specific duration, so the results of the shards core query and the simple core query should be similar.

As you suggested, I analyzed the debugQuery output for one specific search, *text:worde~1*, and I saw that the record returned by the shards core has highlights like *word*, *words*, *word!n*. But when I look at the debugQuery output, it is only processing *word!n* and not the other highlighted terms (words, word), although it shows them in the highlight for that record. As a result, the shards core does not return other records that contain *word* or *words* but not *word!n*.

The simple core, on the other hand, processes all of *word*, *words*, *word!n* and returns proper results. This seems like very weird behavior. Any suggestions?

Jack Krupansky-2 wrote:

A fuzzy query itself does not know about distributed search - Lucene simply scores the query results based on the local index. Solr then merges the query results from the different nodes. Try the query locally for each node and set debugQuery=true and see how each document gets scored.

I'm actually not sure what the specific problem (symptom) is that you are seeing. I mean, maybe there is only 1 result on that node - how do you know otherwise?? Or maybe one node has more exact matches.

-- Jack Krupansky

-----Original Message-----
From: meghana
Sent: Tuesday, April 30, 2013 7:51 AM
To: solr-user@.apache
Subject: Issue with fuzzy search in Distributed Search
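The advice to run the query locally on each node with debugQuery=true can be scripted. The sketch below only builds the per-core URLs; the core addresses are the ones from the original post, the helper name `debug_url` is made up for this illustration, and fetching or comparing the responses is left out.

```python
from urllib.parse import urlencode

# Core URLs as given in the original post.
shard_cores = [
    "http://192.168.1.91:8080/core0",
    "http://192.168.1.91:8080/core1",
    "http://192.168.1.91:8080/core2",
]


def debug_url(core_url, query):
    """Build a non-distributed select URL with debugQuery enabled."""
    params = {"q": query, "debugQuery": "true"}
    return f"{core_url}/select?{urlencode(params)}"


urls = [debug_url(core, "text:worde~1") for core in shard_cores]
for u in urls:
    print(u)
```

Comparing the parsed query and the per-document scoring explanation in each response should show whether every core expands the fuzzy term into the same set of index terms.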
Re: Solr - WordDelimiterFactory with Custom Tokenizer to split only on Boundires
Thanks Jack Krupansky, it's very helpful :)

Jack Krupansky-2 wrote:

The WDF types will treat a character the same regardless of where it appears.

For something conditional, like a dot between letters vs. a dot not preceded and followed by a letter, you either have to have a custom tokenizer or a character filter.

Interesting that although the standard tokenizer messes up embedded hyphens, it does handle the embedded dot vs. trailing dot case as you wish (but messes up U.S.A. by stripping the trailing dot) - but that doesn't help your case.

A character filter like the following might help your case:

<fieldType name="text_ws_dot" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([\w\d])[\._&amp;]+($|[^\w\d])" replacement="$1 $2"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(^|[^\w\d])[\._&amp;]+($|[^\w\d])" replacement="$1 $2"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="(^|[^\w\d])[\._&amp;]+([\w\d])" replacement="$1 $2"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

I'm not a regular expression expert, so I'm not sure whether/how those patterns could be combined.

Also, that doesn't allow the case of a single ., &, or _ as a word - but you didn't specify how that case should be handled.

-- Jack Krupansky

-----Original Message-----
From: meghana
Sent: Wednesday, April 24, 2013 6:49 AM
To: solr-user@.apache
Subject: Solr - WordDelimiterFactory with Custom Tokenizer to split only on Boundires
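The effect of the three character-filter patterns can be tried outside Solr. The sketch below is a rough Python approximation, not the Solr implementation: it applies the same three regular expressions in order (with Java's `$1 $2` written as `\1 \2`) and then splits on whitespace in place of WhitespaceTokenizerFactory. One assumed difference to keep in mind is that Python's `\w` is Unicode-aware by default, while Java's is ASCII-only unless configured otherwise.

```python
import re

# The three replacements from the suggested charFilter chain.
RULES = [
    re.compile(r"([\w\d])[\._&]+($|[^\w\d])"),     # delimiter run after a word
    re.compile(r"(^|[^\w\d])[\._&]+($|[^\w\d])"),  # standalone delimiter run
    re.compile(r"(^|[^\w\d])[\._&]+([\w\d])"),     # delimiter run before a word
]


def tokenize(text):
    """Apply the replacements in order, then split on whitespace."""
    for rule in RULES:
        text = rule.sub(r"\1 \2", text)
    return text.split()


print(tokenize("visit test.com. new_car newyear. .net"))
# ['visit', 'test.com', 'new_car', 'newyear', 'net']
```

Embedded delimiters (test.com, new_car) survive because a word character follows them, while leading and trailing delimiters are replaced by spaces, which matches the boundary-only behavior asked for in the original question.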
Issue with fuzzy search in Distributed Search
I have created 2 versions of a Solr core on different servers. One is a simple core holding all records in one core. The other is a shards core, distributed over 3 cores on one server.

Simple core:
http://localhost:8080/sorl/core0/select?q=text:hoers~1

Distributed core:
http://192.168.1.91:8080/core0/select?shards=http://192.168.1.91:8080/core0,http://192.168.1.91:8080/core1,http://192.168.1.91:8080/core2&q=text:hoers~1

The data, schema and other configuration are the same in both setups, but for a fuzzy search like hoers~1 one core returns many records (about 450), while the other returns only 1 record.

However, this issue does not seem to be related to distributed search: even when I do not use distributed search, the shard core does not return more rows, for example:

http://192.168.1.91:8080/core0/select?q=text:hoers~1

Below is the schema definition for my field.

<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" protected="protwords.txt" types="wdfftypes.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_extra_query.txt" enablePositionIncrements="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" protected="protwords.txt" types="wdfftypes.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

I am not sure what is wrong here. Can anybody help me with this?
Solr - WordDelimiterFactory with Custom Tokenizer to split only on Boundires
I have configured WordDelimiterFilterFactory with custom character types for '&' and '-', and for a few characters (like . _ :) we need to split on boundaries only, e.g.

test.com (should be tokenized to test.com)
newyear. (should be tokenized to newyear)
new_car (should be tokenized to new_car)
..
..

Below is the definition for the text field:

<fieldType name="text_general_preserved" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="0" splitOnNumerics="0" stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" preserveOriginal="0" protected="protwords_general.txt" types="wdfftypes_general.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="0" splitOnNumerics="0" stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" preserveOriginal="0" protected="protwords_general.txt" types="wdfftypes_general.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Below is the content of wdfftypes_general.txt:

& => ALPHA
- => ALPHA
_ => SUBWORD_DELIM
: => SUBWORD_DELIM
. => SUBWORD_DELIM

The types that can be used with the word delimiter filter are LOWER, UPPER, ALPHA, DIGIT, ALPHANUM and SUBWORD_DELIM. There is no description available of what each type does. Going by the name, I thought the type SUBWORD_DELIM might fulfill my need, but it doesn't seem to work.

Can anybody suggest how I can configure the word delimiter filter factory to meet this requirement?

Thanks.
Re: fuzzy search issue with PatternTokenizer Factory
Fuzzy search is supposed to be independent of the analyzer, but it does not seem to be independent of the tokenizer. If I just change my tokenizer to *solr.StandardTokenizerFactory*, fuzzy search starts working fine; if it were independent of the tokenizer, this should not happen. I have also analyzed my terms in the Admin UI Analysis page, and the terms come out exactly as expected. This is the only issue I am facing, but I can't analyze the fuzzy term itself in the Admin UI Analysis page, so I am not able to track the issue down.

Jack Krupansky-2 wrote:
Once again, fuzzy search is completely independent of your analyzer or pattern tokenizer. Please use the Solr Admin UI Analysis page to debug whether the terms are what you expect. And realize that fuzzy search has a maximum editing distance of 2 and that includes case changes.
-- Jack Krupansky

-----Original Message-----
From: meghana
Sent: Monday, April 22, 2013 3:25 AM
To: solr-user@.apache
Subject: Re: fuzzy search issue with PatternTokenizer Factory

-- View this message in context: http://lucene.472066.n3.nabble.com/fuzzy-search-issue-with-PatternTokenizer-Factory-tp4057275p4058267.html Sent from the Solr - User mailing list archive at Nabble.com.
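Jack's remark that the edit-distance limit includes case changes can be checked with a small sketch. The function below is a plain Levenshtein distance written for illustration; it is not Lucene's implementation (Lucene's fuzzy matching may additionally count a transposition as a single edit, which this sketch does not model). The sample terms are the ones mentioned in this thread.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


print(edit_distance("worde", "words"))   # 1: within worde~1
print(edit_distance("worde", "worded"))  # 1: within worde~1
print(edit_distance("worde", "WORDS"))   # 5: every character differs by case or letter
```

If the indexed terms are not lowercased the way the query term is, a term such as WORDS falls far outside an edit distance of 1, which is one thing worth checking in the Analysis page.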
Re: fuzzy search issue with PatternTokenizer Factory
Jack, the regex splits tokens on anything except letters, numbers, '&', '-' and ns: (where n is a number, e.g. 4323s:).

For example, say my text is like below.

*this is nice* day sun 53s: is risen. *

Then the pattern tokenizer should create the tokens *this is nice day sun is risen*.

The pattern seems to work fine with different text. Also, for the fuzzy search *worde~1*, I have checked the results returned with PatternTokenizerFactory; they include terms with punctuation marks like '*WORDS,*', *WORDED*, etc. One more weird thing: all the results are in uppercase letters, and no lowercase results come back, although it does not return all the uppercase results either. I am not sure why fuzzy search stopped working properly after this change.

Jack Krupansky-2 wrote:
Give us some examples of tokens that you are expecting that pattern to tokenize. And express the pattern in simple English as well. Show some actual input data.

I suspect that Solr is working fine - but you may not have precisely specified your pattern. But we don't know what your pattern is supposed to recognize.

Maybe some of your previous hits had punctuation adjacent to the terms that your pattern doesn't recognize.

And use the Solr Admin UI Analysis page to see how your sample input data is analyzed.

One other thing... without a group, the pattern specifies what delimiter sequence will split the rest of the input into tokens. I suspect you didn't mean this.
-- Jack Krupansky

-----Original Message-----
From: meghana
Sent: Friday, April 19, 2013 9:01 AM
To: solr-user@.apache
Subject: fuzzy search issue with PatternTokenizer Factory

-- View this message in context: http://lucene.472066.n3.nabble.com/fuzzy-search-issue-with-PatternTokenizer-Factory-tp4057275p4057831.html Sent from the Solr - User mailing list archive at Nabble.com.
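The splitting behavior described in this message can be tried outside Solr. This is a rough Python sketch, assuming Python's `re` module treats this particular pattern the same way as the Java regex engine behind PatternTokenizerFactory; dropping empty tokens is also an assumption about the tokenizer. The `&` is written literally because the `&amp;` escape is only needed inside schema.xml.

```python
import re

# Delimiter pattern from the schema in this thread: anything that is not a
# letter, digit, '&', '-' or single quote, or an "ns:" marker.
DELIMITERS = re.compile(r"[^a-zA-Z0-9&\-']|\d{0,4}s:")


def tokenize(text):
    """Split on the delimiter pattern and drop empty tokens."""
    return [token for token in DELIMITERS.split(text) if token]


print(tokenize("this is nice day sun 53s: is risen."))
print(tokenize("WORDS, WORDED"))
print(tokenize("words: ok"))
```

One detail the last call makes visible: because `\d{0,4}` also allows zero digits, a bare `s:` is a delimiter too, so `words:` is tokenized as `word`.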
fuzzy search issue with PatternTokenizer Factory
I am using Solr 4.2. I have changed my text field definition to use solr.PatternTokenizerFactory instead of solr.StandardTokenizerFactory, and changed my schema definition as below:

<fieldType name="text_token" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9&amp;\-']|\d{0,4}s:" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9&amp;\-']|\d{0,4}s:" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_extra_query.txt" enablePositionIncrements="false" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

After doing so, fuzzy search does not seem to work as it did before. I am searching with the term worde~1; before the change it returned around 300 records, but now it returns only 5. I am not sure what the issue can be. Can anybody help me make it work?

-- View this message in context: http://lucene.472066.n3.nabble.com/fuzzy-search-issue-with-PatternTokenizer-Factory-tp4057275.html Sent from the Solr - User mailing list archive at Nabble.com.
Pattern Tokenizer Factory not working with negation regular expression
Hi, I need my tokenizer factory to split on everything except numbers, letters, '&', ':' and the single quote character. I use PatternTokenizerFactory as below:

<tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9&amp;-:]" />

but it is splitting tokens on spaces only. I am not sure what I am doing wrong here. Can anybody help me with this? Thanks

-- View this message in context: http://lucene.472066.n3.nabble.com/Pattern-Tokenizer-Factory-not-working-with-negation-regular-expression-tp4056653.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Pattern Tokenizer Factory not working with negation regular expression
Jack Krupansky-2 wrote:
A hyphen indicates a character range (as in a-z), so if you want to include a hyphen as a character, escape it with a single backslash.
-- Jack Krupansky

-----Original Message-----
From: meghana
Sent: Wednesday, April 17, 2013 7:58 AM
To: solr-user@.apache
Subject: Pattern Tokenizer Factory not working with negation regular expression

Thanks Jack Krupansky, it was such a stupid mistake. You saved me!! :)

-- View this message in context: http://lucene.472066.n3.nabble.com/Pattern-Tokenizer-Factory-not-working-with-negation-regular-expression-tp4056653p4056659.html Sent from the Solr - User mailing list archive at Nabble.com.
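Jack's point about the hyphen can be seen directly by comparing the two character classes. This is a small Python sketch, assuming Python's `re` handles these two classes the same way as the Java regex engine used by PatternTokenizerFactory; the sample string is made up for illustration.

```python
import re

sample = "a,b.c d"

# Unescaped: "&-:" is read as a range from '&' to ':', which also covers
# ' ( ) * + , - . / and the digits, so none of those act as delimiters.
unescaped = re.split(r"[^a-zA-Z0-9&-:]", sample)

# Escaped: the hyphen is a literal character, so ',' and '.' are delimiters again.
escaped = re.split(r"[^a-zA-Z0-9&\-:]", sample)

print(unescaped)  # ['a,b.c', 'd']
print(escaped)    # ['a', 'b', 'c', 'd']
```

With the unescaped class, the space is one of the few common characters left outside the range, which matches the "splitting on spaces only" symptom in the original post.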
Re: Solr : Search with special character
Hi Jack, yes, it works with the whitespace tokenizer, but I cannot use that tokenizer. Solr has a good option in the pattern tokenizer, so I'll try that out. Hope that works. Thanks

Jack Krupansky-2 wrote:
Switch the field types from the standard tokenizer to the white space tokenizer and don't use the word delimiter filter. Or, you can sometimes add custom character mapping tables to some filters and indicate that your desired special characters should be mapped to type ALPHA.
-- Jack Krupansky

-----Original Message-----
From: meghana
Sent: Wednesday, April 10, 2013 6:25 AM
To: solr-user@.apache
Subject: Solr : Search with special character

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Search-with-special-character-with-phrase-search-tp4054994p4055237.html Sent from the Solr - User mailing list archive at Nabble.com.
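The difference Jack describes between the two tokenizers can be approximated in a few lines. This is only a rough illustration: `\w+` is a crude stand-in for the standard tokenizer, not its real algorithm, and `str.split()` stands in for the whitespace tokenizer; lowercasing mimics the lowercase filter.

```python
import re

phrase = "Success & Failure"

# Crude stand-in for a word-based tokenizer: keeps only runs of word characters.
word_tokens = [t.lower() for t in re.findall(r"\w+", phrase)]

# Stand-in for a whitespace tokenizer: keeps everything between spaces.
whitespace_tokens = [t.lower() for t in phrase.split()]

print(word_tokens)        # ['success', 'failure']
print(whitespace_tokens)  # ['success', '&', 'failure']
```

The first result lines up with the parsed query in the debug output of this thread, where the special character is gone before the phrase query is built.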
Solr : Stopwords at query time
In Solr, I have text in the format below.

1s: This is very nice day. 4s: Christmas is about to come 7s: and christmas preparation is just on 12s: this is awesome!!

I want words like '1s:', '4s:', anything like 'ns:', not to be indexed or searchable. To do so I have added a stop words filter to my text field definition. Below is my field type definition:

<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

and the stopwords.txt file contains the words:

1s: 2s: ... ... ... 1s:

When I search with q=109s: , it returns 0 results. If I search for 109s , it should also return 0 results, but surprisingly Solr does not do so and returns results having 190s: in the text.

I understand that if the word 109s: is not indexed, then 190s is also not indexed, and as the word 190s is not in the index, it should not return results for it. But Solr does not seem to behave this way. Can anybody explain this behavior, and tell me what changes I should make to fulfill my requirement? Thanks

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Stopwords-at-query-time-tp4055249.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr : Search with special character
We need Solr to search for phrases like "Success & Failure" or "Working 50%", but the Solr query parser eliminates all special characters from the search. My search query is as below:

http://localhost:8080/solr/core/select?q=%22Success%20%26%20Failure%22&hl=on&hl.snippets=99&debugQuery=on

Below is the debugQuery output for it.

<lst name="debug">
  <str name="rawquerystring">"Success &amp; Failure"</str>
  <str name="querystring">"Success &amp; Failure"</str>
  <str name="parsedquery">PhraseQuery(text:"success failure")</str>
  <str name="parsedquery_toString">text:"success failure"</str>
  <lst name="explain"/>
  <str name="QParser">LuceneQParser</str>
  <lst name="timing"></lst>
</lst>

We want Solr to search for "success & failure" and not eliminate the special character. Does anybody have any idea how to do this?

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Search-with-special-character-tp4054994.html Sent from the Solr - User mailing list archive at Nabble.com.
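For readers following the query string in this post, the percent-encoded `q` value can be decoded with the Python standard library. This is just a sanity check of what is sent to Solr, not part of any Solr API.

```python
from urllib.parse import quote, unquote

encoded = "%22Success%20%26%20Failure%22"

# Decode the q parameter exactly as it appears in the URL above.
decoded = unquote(encoded)
print(decoded)  # "Success & Failure"

# Encoding the phrase again gives back the same string, so the '&' does
# reach Solr; it is dropped later, during analysis of the field.
print(quote(decoded, safe=""))
```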
highlight on same field with different fragsize
We use Solr 4.2 in our application. We need to return highlights on the same field twice, with different fragsize values. Solr allows highlighting on different fields with different fragsize values as shown below, but this does not work for the same field:

http://localhost:8080/solr/select?q=my search&hl=on&hl.fl=content1,content2&f.content1.hl.fragsize=400&f.content2.hl.fragsize=100

Solr 4.0 also supports aliases for fields, but that does not seem to work for highlighting. Can anybody suggest how to achieve this?

-- View this message in context: http://lucene.472066.n3.nabble.com/highlight-on-same-field-with-different-fragsize-tp4052863.html Sent from the Solr - User mailing list archive at Nabble.com.
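The per-field override syntax used in this post (`f.<field>.hl.fragsize`) is easy to get wrong by hand, so here is a small sketch that builds such a URL with proper separators and encoding. The helper name `highlight_url` is made up for this example; the host, fields and sizes are the ones from the post.

```python
from urllib.parse import urlencode


def highlight_url(base, query, fragsizes):
    """Build a select URL with one hl.fragsize override per highlighted field."""
    params = {"q": query, "hl": "on", "hl.fl": ",".join(fragsizes)}
    for field, size in fragsizes.items():
        params["f.%s.hl.fragsize" % field] = size
    return base + "?" + urlencode(params)


url = highlight_url("http://localhost:8080/solr/select", "my search",
                    {"content1": 400, "content2": 100})
print(url)
```

This only covers the different-fields case that already works; it does not solve highlighting one field twice.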
Customize Solr Fragmeant
I want to use the regex fragmenter in Solr highlighting to customize my fragments. Our requirement is to return 25 words before and after the highlighted term. To do so, I have written the regular expression below:

((?:\w+\W*){25})\b(span class)\b((?:\W*\w+){25})

This regular expression works fine on a simple string (tested), but with Solr it does not seem to work properly. A few highlights come out fine, but for some the highlighted term appears at the very start of the fragment. I am not sure the regex fragmenter can do what I need. Is there any other way to fulfill this requirement? Can anybody suggest something? Thanks

-- View this message in context: http://lucene.472066.n3.nabble.com/Customize-Solr-Fragmeant-tp4051372.html Sent from the Solr - User mailing list archive at Nabble.com.
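One property of the pattern in this post can be checked in isolation, independent of how Solr applies it. The sketch below rebuilds the same shape with 3 repetitions instead of 25 and a sample term, and compares it with a variation that uses `\W+` between words. The helper `window` and the sample text are made up for illustration, and Python's `re` is assumed to behave like Java's regex engine for this pattern.

```python
import re


def window(term, n, sep):
    # sep is the part between words: r"\W*" (as in the post) or r"\W+".
    return re.compile(r"((?:\w+%s){%d})\b(%s)\b((?:%s\w+){%d})"
                      % (sep, n, re.escape(term), sep, n))


text = "black cat sat on the mat"

loose = window("cat", 3, r"\W*")   # same shape as the posted pattern
strict = window("cat", 3, r"\W+")  # variation: each repetition is a whole word

# With \W*, one "word" repetition may be a piece of a word, so the three
# repetitions before "cat" are satisfied by the single word "black".
print(repr(loose.search(text).group(1)))  # 'black '

# With \W+, three whole words are required before "cat", so nothing matches.
print(strict.search(text))  # None
```

So the posted pattern does not strictly count words on either side of the term, which may be worth keeping in mind when judging the fragments that come back.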
Re: Solr - Make Exact Search on Field with Fuzzy Query
Hi Erickson, thanks for your valuable reply. We had actually tried storing just one field and always highlighting on that field, whether we search on it or not. This sometimes causes an issue: if I search for the term 'hospitality' and highlight on a field that has stemming applied, it returns highlights on both 'hospital' and 'hospitality', whereas it should highlight only 'hospitality' since I am doing an exact term search. Can you suggest anything on this? Can we eliminate this issue while highlighting on the original field (which has stemming applied)? The other solutions sound really good, but as you said they are hard to implement, and at this point we want to use built-in solutions if possible. Please suggest how we can eliminate the highlighting issue explained above. Thanks.

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Make-Exact-Search-on-Field-with-Fuzzy-Query-tp4012888p4013067.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr - Make Exact Search on Field with Fuzzy Query
We are using Solr 3.6. We have a field named Description. We want search with stemming and also without stemming (exact word/phrase search), with highlighting in both. After a lot of research we concluded that we should use a copy field with a field type that has no stemming filter. This is working fine for now (the main field has stemming and the copy field does not). The data for that field is very large and we have millions of documents, and since we want both searching and highlighting on them, we need to keep this copy field both stored and indexed, which increases the index size a lot. We need to eliminate this duplication if at all possible. From recent research, we read that combining fuzzy search with dismax would fulfill our requirement (we have tried a bit, but without success). Please let me know if this is possible, or if there are other solutions. Thanks in advance

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Make-Exact-Search-on-Field-with-Fuzzy-Query-tp4012888.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr - Add Single node from XPathEntityProcessor in multiple fields
Hi, I have one date field in my data source in the format 'Apr 10 2012 10:30AM'. I want to add this field twice to my Solr index: once with the exact value as above, and once as the 'tdate' type of the Solr schema. I use XPathEntityProcessor for my data import XML, and I add this date field in my data import handler as shown below:

<entity name="NewsXML" onError="continue" rootEntity="true" transformer="DateFormatTransformer" processor="XPathEntityProcessor" forEach="/content/article/" url="${FilePath.FileLocation}" dataSource="FD">
  <field column="article_time_DT" xpath="/content/article/article_time" dateTimeFormat="MMM dd hh:mmaaa" />
  <field column="article_time" xpath="/content/article/article_time" />
</entity>

But this way, only the last mentioned field (in this case 'article_time') gets inserted into the Solr index, and no data is inserted for the article_time_DT field. Can anybody please suggest how I can achieve my requirement?

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Add-Single-node-from-XPathEntityProcessor-in-multiple-fields-tp4007456.html Sent from the Solr - User mailing list archive at Nabble.com.
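As a point of comparison for the date conversion this post is after, here is a hedged Python sketch that parses the sample value and prints it in the ISO form Solr date fields use. It is a client-side illustration, not DataImportHandler configuration; it assumes an English locale for the month and AM/PM names and treats the time as UTC. Note that the sample value contains a year, while the `dateTimeFormat` string as quoted in the post shows no year component.

```python
from datetime import datetime


def to_solr_date(raw):
    """Parse e.g. 'Apr 10 2012 10:30AM' and format it as an ISO-8601 UTC string."""
    parsed = datetime.strptime(raw, "%b %d %Y %I:%M%p")
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_solr_date("Apr 10 2012 10:30AM"))  # 2012-04-10T10:30:00Z
```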
Re: Solr - Add Single node from XPathEntityProcessor in multiple fields
Alexandre Rafalovitch wrote:
copyField?

Regards, Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Thu, Sep 13, 2012 at 9:40 AM, meghana <meghana.ravani@> wrote:

I have tried that. I inserted article_time_DT from the data import handler and made a copy field for article_time, but the issue is that the values I want in these 2 fields are in different formats. Is there any facility in copyField to store the data in a different format, like I mentioned in my question?

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Add-Single-node-from-XPathEntityProcessor-in-multiple-fields-tp4007456p4007465.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr - case-insensitive search do not work
I want to apply case-insensitive search to the field *myfield* in Solr. I googled a bit and found that I need to apply *LowerCaseFilterFactory* to the field type, and that the field should be a solr.TextField. I applied that in my *schema.xml* and re-indexed the data, but my search still seems to be case-sensitive. Below is the search that I perform:

http://localhost:8080/solr/select?q=myfield:"cloud university"&hl=on&hl.snippets=99&hl.fl=myfield

Below is the definition of the field type:

<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

and below is my field definition:

<field name="myfield" type="text_en_splitting" indexed="true" stored="true" />

I am not sure what is wrong with this. Please help me resolve it. Thanks

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - case-insensitive search do not work
@Ravish Bhagdev, yes, I am adding double quotes around my search, as shown in my post, like myfield:"cloud university"

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605p4002610.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - case-insensitive search do not work
Hi Ravish, the definition of text_en_splitting in the default Solr schema and in mine are the same, and it is still not working. Any idea?

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-case-insensitive-search-do-not-work-tp4002605p4002645.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr - customize Fragment using hl.fragmenter and hl.regex.pattern
I want Solr highlights in a specific format. Below is the format of the string for which I need to provide highlighting:

---
130s: LISTEN! LISTEN!
138s: [THUMP]
143s: WHAT IS THAT?
144s: HEAR THAT?
152s: EVERYBODY, SHH. SHH.
156s: STAY UP THERE.
163s: [BOAT CREAKING]
165s: WHAT IS THAT?
167s: [SCREAMING]
191s: COME ON!
192s: OH, GOD!
193s: AAH!
249s: OK. WE'VE HAD SOME PROBLEMS
253s: AT THE FACILITY.
253s: WHAT WE'RE ATTEMPTING TO ACHIEVE
256s: HERE HAS NEVER BEEN DONE.
256s: WE'RE THIS CLOSE
259s: TO THE REACTIVATION
259s: OF A HUMAN BRAIN CELL.
260s: DOCTOR, THE 200 MILLION
264s: I'VE SUNK INTO THIS COMPANY
264s: IS DUE IN GREAT PART
266s: TO YOUR RESEARCH.
---

After the user searches, I want to return fragments in the format below:

*Previous Line of Highlight + Line containing Highlight + Next Line of Highlight*

For example, if the user searched for the term hear, then one typical highlight fragment should be like below:

<str>143s: WHAT IS THAT? 144s: <em>HEAR</em> THAT? 152s: EVERYBODY, SHH. SHH.</str>

The above is my ultimate plan, but right now I am only trying to get fragments that start with ns:, where n is a number. I use hl.regex.slop=0.6 and hl.fragsize=120, and below is the regex for that:

\b(?=\s*\d{1,4}s:){50,200}

Using the above regular expression, my fragments do not always start with ns:. Please suggest how I can achieve the ultimate plan. Thanks

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-customize-Fragment-using-hl-fragmenter-and-hl-regex-pattern-tp3997693.html Sent from the Solr - User mailing list archive at Nabble.com.
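The "previous line + matching line + next line" format described in this post can be sketched outside Solr. This is a client-side illustration working on the stored text, not a configuration of Solr's regex fragmenter; the function name `context_fragments` and the `<em>` wrapping are choices made for this example.

```python
import re


def context_fragments(transcript, term):
    """Return one fragment per matching line: previous line, marked line, next line."""
    # Each line starts at an "ns:" marker; split just before every marker.
    lines = re.split(r"\s+(?=\d{1,4}s:)", transcript.strip())
    hit = re.compile(r"\b%s\b" % re.escape(term), re.IGNORECASE)
    fragments = []
    for i, line in enumerate(lines):
        if hit.search(line):
            marked = hit.sub(lambda m: "<em>" + m.group(0) + "</em>", line)
            window = lines[max(i - 1, 0):i] + [marked] + lines[i + 1:i + 2]
            fragments.append(" ".join(window))
    return fragments


sample = ("138s: [THUMP] 143s: WHAT IS THAT? 144s: HEAR THAT? "
          "152s: EVERYBODY, SHH. SHH. 156s: STAY UP THERE.")
print(context_fragments(sample, "hear"))
```

For the sample above this yields a single fragment running from the 143s line to the 152s line, with HEAR wrapped in `<em>` tags, which is the shape asked for in the post.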
how solr will apply regex fragmenter
I was looking at the regex fragmenter for customizing my highlight fragments. I wondered how the regex fragmenter works within Solr and googled for it, but didn't find any results. Can anybody tell me how the regex fragmenter works within Solr? When the regex fragmenter applies the regex, does it first build fragments using the default Solr behavior and then apply the regex to them, or does it apply the regex directly around the search term and then return the fragment?

-- View this message in context: http://lucene.472066.n3.nabble.com/how-solr-will-apply-regex-fragmenter-tp3997749.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr - hl.fragsize Issue
I am using Solr 3.5, and in my search query I set hl.fragsize=100, but my fragments do not contain exactly 100 characters; the average fragment size is 120. Does anybody have an idea about this issue? Thanks

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-hl-fragsize-Issue-tp3997457.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - hl.fragsize Issue
Hi @iorixxx, I use DefaultSolrHighlighter, and yes, the fragment size also includes the <em> tags. But even if we remove the <em> tags from the fragment, the average fragment size is 110 instead of 100.

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-hl-fragsize-Issue-tp3997457p3997656.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Highlighting issue with PlainTextEntityProcessor.
Hi Erik, thanks for your reply. And yes, the data was in the index. But I found the problem, and it was not with PlainTextEntityProcessor. Highlighting was returned in full for the multivalued field, while the non-multivalued field got fewer highlights, so I thought the problem might be in PlainTextEntityProcessor. The actual problem was that my search field is very big. I increased hl.maxAnalyzedChars and it started working.

-- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-issue-with-PlainTextEntityProcessor-tp3650004p3653708.html Sent from the Solr - User mailing list archive at Nabble.com.
Highlighting issue with PlainTextEntityProcessor.
Hi all, my Solr configuration had one multi-valued field which was imported using XPathEntityProcessor and TemplateTransformer. Then we had to convert it to a non-multivalued field, which we did using PlainTextEntityProcessor and ScriptTransformer. Search on my old configuration was working fine and returned documents with proper highlighting. After changing the field from multi-valued to non-multivalued, a search gives me fewer highlights for some documents than before. For example, if I search with the term 'cat', in the old configuration I get one result as below:

<arr name="XA">
  <str>Is that a <em>cat?</em></str>
  <str>Is that a real <em>cat</em>?</str>
  <str>Oh, my God, it's a German <em>cat</em>?!</str>
  <str>I won't rub this <em>cat</em></str>
  <str>Drop the <em>cat</em> or I rip</str>
  <str>Without <em>cats</em>?</str>
  <str>Forget the <em>cat</em>, rommy.</str>
  <str>You steal <em>cats</em>?</str>
</arr>

In the new configuration I get it as below:

<arr name="XA">
  <str>Is that a <em>cat?</em></str>
  <str>Is that a real <em>cat</em>?</str>
  <str>Oh, my God, it's a German <em>cat</em>?!</str>
  <str>I won't rub this <em>cat</em></str>
  <str>Drop the <em>cat</em> or I rip</str>
</arr>

Also, for a few documents I get the record but with 0 highlights. Does anybody have an idea about this? Thanks Meghana

-- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-issue-with-PlainTextEntityProcessor-tp3650004p3650004.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: prevent PlainTextEntityProcessor to encode text
Also, I found that XPathEntityProcessor doesn't encode text. But if I try to import data using XPathEntityProcessor, it does not import data for my Mfld field (non-multivalued). Below is what I have tried:

<entity name="x" onError="continue" processor="XPathEntityProcessor" forEach="/xa" url="${SRC.FileName}" dataSource="FS">
  <field column="Mfld" xpath="/xa" />
</entity>

The string looks like below:

<xa><xb impt="12">this is solr!!</xb><xb impt="18">Welcome</xb><xb impt="35">Hello test</xb></xa>

It works well if I provide the path /xa/xb, but it doesn't work for /xa. Does anybody have any idea?

-- View this message in context: http://lucene.472066.n3.nabble.com/prevent-PlainTextEntityProcessor-to-encode-text-tp3634744p3637402.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: XPathEntityProcessor append text in foreach
I have a string in the format below:

<xa><xb impt="12">this is solr!!</xb><xb impt="18">Welcome</xb><xb impt="35">Hello test</xb></xa>

I want to convert it to the format below using XPathEntityProcessor:

12: this is solr!! 18: Welcome 35: Hello test

I had used PlainTextEntityProcessor with ScriptTransformer to build a string like the above, but it encodes some of the text, so I want to do this using XPathEntityProcessor. Does anybody have any idea how to do that? Thanks Meghana

-- View this message in context: http://lucene.472066.n3.nabble.com/XPathEntityProcessor-append-text-in-foreach-tp3635022p3637412.html Sent from the Solr - User mailing list archive at Nabble.com.
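The target format in this post can be produced from the sample XML with a few lines of Python, which may help when comparing against whatever the import pipeline generates. This is a standalone preprocessing sketch, not a DataImportHandler transformer; the function name `flatten` and the space separator are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text, separator=" "):
    """Turn <xa><xb impt="12">text</xb>...</xa> into '12: text ...'."""
    root = ET.fromstring(xml_text)
    parts = ["%s: %s" % (xb.get("impt"), xb.text or "") for xb in root.findall("xb")]
    return separator.join(parts)


sample = ('<xa><xb impt="12">this is solr!!</xb>'
          '<xb impt="18">Welcome</xb><xb impt="35">Hello test</xb></xa>')
print(flatten(sample))  # 12: this is solr!! 18: Welcome 35: Hello test
```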
prevent PlainTextEntityProcessor to encode text
Hi all, I am importing one field into Solr using PlainTextEntityProcessor from a text file whose content is in XML format. After importing, some of the text gets encoded (e.g. it converts a quotation mark (") to &quot;). Can I prevent it from encoding the XML and keep the text as it is (i.e. not convert quotation marks to &quot;)? Please help me. Thanks Meghana

-- View this message in context: http://lucene.472066.n3.nabble.com/prevent-PlainTextEntityProcessor-to-encode-text-tp3634744p3634744.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr Search issue while making multivalued field to signle valued.
I have one text field. Previously it was a multivalued field imported using XPathEntityProcessor, and it was working fine. Now I have changed it to a single-valued field and use PlainTextEntityProcessor. When I search on it, the current setup (single-valued, PlainTextEntityProcessor) returns fewer documents than the previous one (multivalued, XPathEntityProcessor). What could be the cause of this, and what changes are needed so that it works as before? Thanks Meghana

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Search-issue-while-making-multivalued-field-to-signle-valued-tp3634764p3634764.html Sent from the Solr - User mailing list archive at Nabble.com.
XPathEntityProcessor append text in foreach
Hi all, I have one non-multivalued field that I want to import from a file using XPathEntityProcessor:
<entity name="x" onError="continue" processor="XPathEntityProcessor" transformer="TemplateTransformer" forEach="/tt/body/div/p" url="${SRC.FileName}" dataSource="FS">
  <field column="Mfld" xpath="/xa/xb"/>
</entity>
It works fine for a multivalued field, but for a single-valued field it assigns only one of the values. Can I append the text of every '/xa/xb' to 'Mfld'? Thanks, Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/XPathEntityProcessor-append-text-in-foreach-tp3635022p3635022.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: disable stemming on query parser.
Hi, can I make a stemmed match score lower than a non-stemmed (exact word) match? Thanks, Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/disable-stemming-on-query-parser-tp3591420p3631826.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - Mutivalue field search on different elements
Hi Koji, thanks for the reply. I tried adding a BoundaryScanner to my solrconfig.xml and set hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and termOffsets=on in my query, but it still had no effect on my highlighting. My solrconfig setting is as below:
<boundaryScanner name="default" default="true" class="solr.highlight.SimpleBoundaryScanner">
  <lst name="defaults">
    <str name="hl.bs.maxScan">10</str>
    <str name="hl.bs.chars">s:</str>
  </lst>
</boundaryScanner>
Am I missing anything, or doing anything wrong? Note that I am using Solr version 1.4. Thanks, Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Mutivalue-field-search-on-different-elements-tp3604213p3615937.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: hl.boundaryScanner and hl.bs.chars
Hi Koji, thanks for the reply. I tried adding a BoundaryScanner to my solrconfig.xml and set hl.useFastVectorHighlighter=true, termVectors=on, termPositions=on and termOffsets=on in my query, but it still had no effect on my highlighting. My solrconfig setting is as below:
<boundaryScanner name="default" default="true" class="solr.highlight.SimpleBoundaryScanner">
  <lst name="defaults">
    <str name="hl.bs.maxScan">10</str>
    <str name="hl.bs.chars">s:</str>
  </lst>
</boundaryScanner>
Am I missing anything, or doing anything wrong? Note that I am using Solr version 1.4. Thanks, Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/hl-boundaryScanner-and-hl-bs-chars-tp3615838p3615940.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: hl.boundaryScanner and hl.bs.chars
Thanks iorixxx and Koji for your replies. So can I fulfil my requirement by using hl.regex.pattern and setting hl.fragmenter=regex? I was looking at these parameters on the wiki, and I am thinking of using them to show my highlighted text in the format I want. My string is like below:
1s: This is very nice day. 3s: Christmas is about to come 4s: and christmas preparation is just on now
If I search for christmas, I want my fragments in the format below:
<str> 3s: <em>Christmas</em> is about to come </str>
<str> 4s: and <em>christmas</em> preparation is just on </str>
Can I achieve this using hl.regex.pattern, or in any other way? Thanks, Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/hl-boundaryScanner-and-hl-bs-chars-tp3615838p3616218.html Sent from the Solr - User mailing list archive at Nabble.com.
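As an illustration of the fragment boundaries being asked for, here is a small client-side sketch in Python. It is not a Solr highlighter configuration and says nothing about how hl.regex.pattern behaves; the marker pattern `\d+s:` and the helper name `fragments` are assumptions made for the example.

```python
import re

# Assumed marker format: one or more digits followed by "s:", e.g. "3s:".
MARKER = re.compile(r"\d+s:")


def fragments(text: str, term: str) -> list[str]:
    """Split text at each marker and return the segments containing term,
    with every case-insensitive occurrence wrapped in <em>...</em>."""
    starts = [m.start() for m in MARKER.finditer(text)]
    ends = starts[1:] + [len(text)]
    segments = [text[a:b].strip() for a, b in zip(starts, ends)]
    word = re.compile(re.escape(term), re.IGNORECASE)
    return [
        word.sub(lambda m: f"<em>{m.group(0)}</em>", seg)
        for seg in segments
        if word.search(seg)
    ]


text = ("1s: This is very nice day. 3s: Christmas is about to come "
        "4s: and christmas preparation is just on now")
for frag in fragments(text, "christmas"):
    print(frag)
```

Each returned fragment starts at its own marker and stops just before the next one, which is the shape of output the message describes.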
Re: Solr - Mutivalue field search on different elements
Hi iorixxx, I have changed my multivalued field to a single-valued field, and now my field appears as below: 1s: This is very nice day. 3s: Christmas is about come and christmas 4s:preparation is just on. But after doing this, the phrase christmas preparation is not matched by my search query, although I had set positionIncrementGap to 0. Any ideas why it is not matching? Please help me. Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Mutivalue-field-search-on-different-elements-tp3604213p3614313.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - Mutivalue field search on different elements
Hi iorixxx, sorry for the confusion in my question. Yes, 1s, 3s and 4s are part of my field value; my data is in this format, and the field is a non-multivalued (single-valued) field. Since positionIncrementGap only works for multivalued fields, I always have to apply slop in my search. Thanks for the reply. Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Mutivalue-field-search-on-different-elements-tp3604213p3614365.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - Mutivalue field search on different elements
I can't delete 1s, 2s, etc. from my field value; I have to keep the text in this format. So I will apply slop in my search to get the search I need. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Mutivalue-field-search-on-different-elements-tp3604213p3615816.html Sent from the Solr - User mailing list archive at Nabble.com.
hl.boundaryScanner and hl.bs.chars
Hi all, I have seen the hl.boundaryScanner and hl.bs.chars parameters in Solr's highlighting feature, but I did not understand exactly what they mean, what they are used for, or how I can use them in my search. What I need is for each fragment to start and end at a special character or string that I can specify, and to set the fragment length dynamically (if possible). Can I do this in any way? Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/hl-boundaryScanner-and-hl-bs-chars-tp3615838p3615838.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: PlainTextEntityProcessor and RegexTransformer in DataImport Handler
Thanks Matthew, it really helped a lot. I am almost done with this. -- View this message in context: http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3612674.html Sent from the Solr - User mailing list archive at Nabble.com.
PlainTexttransformer and RegexTransformer in DataImport Handler
Hi all, I need to import data from my text file (which contains HTML text) and apply some formatting to it. I want all the text within the p tags, with each piece preceded by one attribute of its p tag in my output, like below.
Original text:
<div><p myvar="12" myvar1="xyz">Hello World!!</p><p myvar="14" myvar1="abc">Welcome to Solr.</p><p myvar="15" myvar1="def">Enjoy</p></div>
Needed text after formatting:
12 : Hello World!! 14 : Welcome to Solr. 15 : Enjoy
I have applied a combination of PlainTextTransformer, RegexTransformer and TemplateTransformer for that as below, but I receive a ConfigurationError when I set it.
<entity name="xx" onError="continue" processor="PlainTextEntityProcessor,TemplateTransformer,RegexTransformer" url="${URL.MyTxtFile}" dataSource="MDataSource">
  <field column="plainText" name="FullText" />
  <field column="FullText" template="${xx.FullText}" regex='&lt;p (?:\s+[^]+)? myvar=([^]*) (?:\s+[^]+)?([^]*)/p' replaceWith="$2 : $4"/>
</entity>
I would like to add that I am able to do this using TemplateTransformer and a multivalued field, but I need the above format in a single-valued field, and there I have failed. Can anybody help me get my desired result, or tell me what I am doing wrong in the above transformer? Thanks, Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/PlainTexttransformer-and-RegexTransformer-in-DataImport-Handler-tp3608415p3608415.html Sent from the Solr - User mailing list archive at Nabble.com.
PlainTextEntityProcessor and RegexTransformer in DataImport Handler
Hi all, I need to import data from my text file (which contains HTML text) and apply some formatting to it. I want all the text within the p tags, with each piece preceded by one attribute of its p tag in my output, like below.
Original text:
<div><p myvar="12" myvar1="xyz">Hello World!!</p><p myvar="14" myvar1="abc">Welcome to Solr.</p><p myvar="15" myvar1="def">Enjoy</p></div>
Needed text after formatting:
12 : Hello World!! 14 : Welcome to Solr. 15 : Enjoy
I have applied a combination of PlainTextEntityProcessor with RegexTransformer and TemplateTransformer for that as below, but I receive a ConfigurationError when I set it.
<entity name="xx" onError="continue" processor="PlainTextEntityProcessor" transformer="TemplateTransformer,RegexTransformer" url="${URL.MyTxtFile}" dataSource="MDataSource">
  <field column="plainText" name="FullText" />
  <field column="FullText" template="${xx.FullText}" regex='&lt;p (?:\s+[^]+)? myvar=([^]*) (?:\s+[^]+)?([^]*)/p' replaceWith="$2 : $4"/>
</entity>
I would like to add that I am able to do this using TemplateTransformer and a multivalued field by setting forEach on the entity, but I need the above format in a single-valued field, and there I have failed. Can anybody help me get my desired result, or tell me what I am doing wrong in the above transformer? Thanks, Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3608449.html Sent from the Solr - User mailing list archive at Nabble.com.
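The text transformation itself (independent of how DataImportHandler applies it) can be sketched in Python as below. This is only an illustration of the regex idea under the assumption that each p tag carries a double-quoted `myvar` attribute; `format_paragraphs` is a hypothetical helper, Java regex syntax in RegexTransformer may differ in details, and regex-based HTML handling is fragile for anything beyond simple, regular input like the sample.

```python
import re

# Capture the myvar attribute value and the tag body of each <p ...>...</p>.
P_TAG = re.compile(r'<p\b[^>]*?\bmyvar="([^"]*)"[^>]*>(.*?)</p>', re.DOTALL)


def format_paragraphs(html: str) -> str:
    """Return 'myvar : text' for every <p> tag, joined into one string."""
    return " ".join(
        f"{num} : {body.strip()}" for num, body in P_TAG.findall(html)
    )


html = (
    '<div><p myvar="12" myvar1="xyz">Hello World!!</p>'
    '<p myvar="14" myvar1="abc">Welcome to Solr.</p>'
    '<p myvar="15" myvar1="def">Enjoy</p></div>'
)
print(format_paragraphs(html))
```

Because all matches are joined into one string, the result fits a single-valued field without needing a per-paragraph forEach.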
Re: PlainTextEntityProcessor and RegexTransformer in DataImport Handler
Hi, does anybody have any idea how I can achieve this? If it is possible to convert a multivalued field to a non-multivalued field, that would also work for me. I have a custom multivalued field ArrText, which has the value shown below:
<arr name="ArrText">
  <str>12 : Hello World!!</str>
  <str>14 : Welcome to Solr.</str>
  <str>15 : Enjoy</str>
</arr>
If we can convert this into my desired result, that would be great. Thanks in advance. Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3608726.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - Mutivalue field search on different elements
Thanks Erik. After trying a few things I have seen how it works with slop :), so I am happy with this now. But I still have one issue: when I search, I also need to show highlighting on that field. With positionIncrementGap set to 0, a phrase search does not return highlighting for the words of the phrase. Can I handle this with some configuration change? Thanks, Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Mutivalue-field-search-on-different-elements-tp3604213p3606597.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: disable stemming on query parser.
Hi, so we need to find that Solr patch which adds a special character to each word that I index. I would like to add that my copy field is a multivalued field with many sentences, so does it add the special character to each word of that? As for compression: Erik, yes, I am storing my copy field (stored=true), because we also want highlighting on that field when searching, so we have to store it. Even so, compression does not reduce my index file size (*.fdt); instead it increases it. Do you think there is any case in which compression increases the file size? I really need to do this. Thanks, Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/disable-stemming-on-query-parser-tp3591420p3604186.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr - Mutivalue field search on different elements
Hi all, I need to do a different kind of search on a multivalued field. For example, I have data as below:
<arr name="xx">
  <str>Michel</str>
  <str>Jackson</str>
  <str>is</str>
  <str>good</str>
  <str>singer and dancer</str>
</arr>
If I search for Michel Jackson, I want the record shown above to come back in the results (the search words are in consecutive elements). Does anybody have any idea? Thanks, Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Mutivalue-field-search-on-different-elements-tp3604213p3604213.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr - Mutivalue field search on different elements
Hi Tanguy, thanks for your reply, this is really useful. But I have one question about it. My multivalued field is not just simple text; it has values like below:
<str>1s:[This is very nice day.]</str>
<str>3s:[Christmas is about come and christmas]</str>
<str>4s:[preparation is just on ]</str>
Now if I search for christmas preparation, this should match. If I set positionIncrementGap to 0, will it match? Or how does the value of positionIncrementGap affect my search? Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Mutivalue-field-search-on-different-elements-tp3604213p3605938.html Sent from the Solr - User mailing list archive at Nabble.com.
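To make the question concrete, here is a simplified Python model of token positions in a multivalued field. It is a sketch, not Lucene's implementation: tokenization with `\w+` is an assumption (a real analyzer may treat "3s" or the brackets differently), and the slop check only handles in-order matches. Under those assumptions it shows that with a gap of 0 the marker token "4s" still sits between "christmas" and "preparation", so an exact phrase fails while a slop of 1 matches.

```python
import re


def token_positions(values, gap):
    """Assign a position to each token; add `gap` extra positions between values."""
    positions = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap                      # models positionIncrementGap
        for tok in re.findall(r"\w+", value.lower()):
            pos += 1
            positions.append((tok, pos))
    return positions


def phrase_matches(values, phrase, gap, slop=0):
    """In-order phrase check: extra positions between first and last word <= slop."""
    index = {}
    for tok, pos in token_positions(values, gap):
        index.setdefault(tok, []).append(pos)
    words = re.findall(r"\w+", phrase.lower())

    def extend(i, prev, start):
        if i == len(words):
            return prev - start - (len(words) - 1) <= slop
        return any(extend(i + 1, p, start)
                   for p in index.get(words[i], []) if p > prev)

    return any(extend(1, p, p) for p in index.get(words[0], []))


values = [
    "1s:[This is very nice day.]",
    "3s:[Christmas is about come and christmas]",
    "4s:[preparation is just on ]",
]
print(phrase_matches(values, "christmas preparation", gap=0, slop=0))
print(phrase_matches(values, "christmas preparation", gap=0, slop=1))
print(phrase_matches(values, "christmas preparation", gap=100, slop=1))
```

In this model a large gap keeps phrases from matching across values unless the slop is at least as large as the gap, while a gap of 0 makes the values behave like one continuous token stream.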
Re: disable stemming on query parser.
Hi Dmitry, if we add some unseen character sequence to the array, doesn't that remove my stemming every time? How can we manage stemmed and unstemmed words in the same field? I am a bit confused about this. I also tried enabling compression on the field that I use as the copy field. From what I read, compression on a field should make the index smaller while lowering query performance a bit. But when I tried it on my local Solr configuration (about 5000 records, with a copy field of more than 5000 characters, maybe much more), it behaved in exactly the opposite way: the index file size increased and performance did not decrease. Any idea why it behaves like this? Note that I tried this only on my local Solr configuration; in live Solr we have more than 10 lakh (1 million) records, and the copy field is very big (about 5000 characters or much more). Thanks in advance, Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/disable-stemming-on-query-parser-tp3591420p3603675.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: disable stemming on query parser.
Thanks Dmitry for your reply. I tried this out and it works fine, but we are still looking for a less costly solution, if there is one, because using a copy field almost doubles my index file size. I don't know whether it is applicable, but I am thinking of making a query parser in which stemming is disabled. Please let me know if you have any idea about it, or any other solution that can be applied. Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/disable-stemming-on-query-parser-tp3591420p3597597.html Sent from the Solr - User mailing list archive at Nabble.com.
disable stemming on query parser.
Hi all, I am using stemming in Solr, but I don't want to apply stemming to every search request. I am thinking of disabling stemming on one specific query parser. Can I do this? Any help is much appreciated. Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/disable-stemming-on-query-parser-tp3591420p3591420.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: make fuzzy search for phrase
Any solutions? I am just stuck on this. :( -- View this message in context: http://lucene.472066.n3.nabble.com/make-fuzzy-search-for-phrase-tp3542079p3551203.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: make fuzzy search for phrase
I installed ComplexPhraseQueryParser from https://issues.apache.org/jira/browse/SOLR-1604 as you suggested. With the latest version of it, I get this error: HTTP Status 500 - luceneMatchVersion java.lang.NoSuchFieldError: luceneMatchVersion at org.apache.solr.search.SolrComplexPhraseQueryParser. With an older version I do not get an error while querying, but when starting Solr I get the error below: 18:27:38,117 ERROR [ProfileDeployAction] Failed to add deployment: ComplexPhrase-1.0.jar: org.jboss.deployers.spi.DeploymentException: Failed to mount archive: /opt/jboss-6.0.0.Final/server/default/deploy/ComplexPhrase-1.0.jar *** DEPLOYMENTS MISSING DEPLOYERS: Name vfs:///opt/jboss-6.0.0.Final/server/default/deploy/ComplexPhrase-1.0.jar -- Any idea why this error occurs? Thanks, Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/make-fuzzy-search-for-phrase-tp3542079p3548238.html Sent from the Solr - User mailing list archive at Nabble.com.
Error running ComplexPhraseQueryParser
hi all, I want to use wild query and fuzzy search together in my query. i installed ComplexPhraseQueryParser for that, but when running url with defType=complexphrase , i ger below error. HTTP Status 500 - luceneMatchVersion java.lang.NoSuchFieldError: luceneMatchVersion at org.apache.solr.search.SolrComplexPhraseQueryParser.init(SolrComplexPhraseQueryParser.java:66) at org.apache.solr.search.SolrComplexPhraseQueryParser.init(SolrComplexPhraseQueryParser.java:62) at org.apache.solr.search.ComplexPhraseQParser.parse(ComplexPhraseQParserPlugin.java:95) at org.apache.solr.search.QParser.getQuery(QParser.java:131) at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:89) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:274) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:242) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:181) at org.jboss.modcluster.catalina.CatalinaContext$RequestListenerValve.event(CatalinaContext.java:285) at org.jboss.modcluster.catalina.CatalinaContext$RequestListenerValve.invoke(CatalinaContext.java:261) at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:88) at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.invoke(SecurityContextEstablishmentValve.java:100) at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:158) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:567) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.jboss.web.tomcat.service.request.ActiveRequestResponseCacheValve.invoke(ActiveRequestResponseCacheValve.java:53) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:362) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:877) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:654) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:951) at java.lang.Thread.run(Thread.java:636) anybody have any idea , how to solve this?? Thanks in Advance, Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/Error-running-ComplexPhraseQueryParser-tp3545155p3545155.html Sent from the Solr - User mailing list archive at Nabble.com.
fuzzy search with prefix
Hi all, I am doing fuzzy search in my Solr application like below: q=squre~0.6. I want a prefix of some length to be excluded from fuzzy matching. In this example, I want the fuzzy query to leave squ fixed and apply fuzzy matching only to the rest of the term. I am doing this by combining a wildcard query with the fuzzy query as below: q=squre~0.6 AND squ*. I want to know whether there is a better way of doing this. From what I have read, we can set a prefix length in a fuzzy query, i.e. the number of leading characters that should not be fuzzy-matched, but I did not find anything on how to set it in my Solr fuzzy query. Thanks in advance. Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/fuzzy-search-with-prefix-tp3542064p3542064.html Sent from the Solr - User mailing list archive at Nabble.com.
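The idea of a fixed prefix plus fuzzy matching on the remainder can be sketched in Python as follows. This is a conceptual illustration, not Solr or Lucene code: it uses an integer edit-distance limit rather than the 0.6-style similarity value in the message, and `fuzzy_match` with its `prefix_len` parameter is a hypothetical helper.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]


def fuzzy_match(term: str, candidate: str, max_edits: int, prefix_len: int) -> bool:
    """Require an exact match on the first prefix_len characters,
    then allow up to max_edits edits on the remainder."""
    if term[:prefix_len] != candidate[:prefix_len]:
        return False
    return levenshtein(term[prefix_len:], candidate[prefix_len:]) <= max_edits


print(fuzzy_match("squre", "square", max_edits=1, prefix_len=3))
print(fuzzy_match("squre", "spare", max_edits=2, prefix_len=3))
print(fuzzy_match("squre", "spare", max_edits=2, prefix_len=0))
```

This has the same effect as the `squre~0.6 AND squ*` workaround: candidates that do not share the prefix are rejected before any edit-distance work is done.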
make fuzzy search for phrase
Hi all, I am doing fuzzy search in Solr. It works well for a single term, but when searching for phrases I get either a huge amount of data or very little. Is there a good way to get a reasonable amount of data with good accuracy? 1) q:kenny zemanski : 9 records 2) keny~0.7 zemansi~0.7 AND ken* : 22948 records. I want results that are accurate and reasonably close to my actual results. With a similarity higher than 0.7 I get very little data, and none of it matches my desired result. Does anybody have any idea? Any help is much appreciated. Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/make-fuzzy-search-for-phrase-tp3542079p3542079.html Sent from the Solr - User mailing list archive at Nabble.com.
Fuzzy search with slop
Hi, can I apply a fuzzy query and slop together, like q=hello world~0.5~3? I get an error when I apply them like this. I want both fuzzy search and slop to work. How can I do this? Can anybody help me? Thanks in advance. Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/Fuzzy-search-with-slop-tp3542280p3542280.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: make fuzzy search for phrase
This seems to be the solution to my problem. I will definitely try it. Thanks for your reply. Meghana. -- View this message in context: http://lucene.472066.n3.nabble.com/make-fuzzy-search-for-phrase-tp3542079p3544239.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: need a way so that solr return result for misspelled terms
Okay, I am not very familiar with it. Can I use the Lucene query parser with Solr to make this fuzzy search possible?
Erik Hatcher-4 wrote: Sure... if you're using the lucene query parser and put a ~ after every term in the query :) But that would mean that either the users or your application do this. Erik
On Nov 23, 2011, at 09:03, meghana wrote: Hi Erik, thanks for your reply. I have learned that Lucene provides fuzzy search by appending a tilde (~) to the search term, like delll~0.8. Can we apply such fuzzy logic in Solr in any way? Thanks, Meghana
Erik Hatcher-4 wrote: Meghana - There's currently no facility in Solr to return results for suggestions automatically. You'll have to code this into your client to make another request to Solr for the suggestions returned from the first request. Erik
On Nov 23, 2011, at 07:58, meghana wrote: Hi, I have configured the spellchecker component in Solr. It works with a custom request handler (it is not working with the standard request handler, but that is not the concern right now). However, it returns suggestions for the misspelled term; instead we want to get results directly for the related spellings of the misspelled search term. Can we do this? Any help is much appreciated. Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/need-a-way-so-that-solr-return-result-for-misspelled-terms-tp3530584p3533046.html Sent from the Solr - User mailing list archive at Nabble.com.
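Erik's suggestion that the application put a ~ after every term could look roughly like the Python sketch below. The helper name `fuzzify` is hypothetical, and the sketch deliberately ignores quoted phrases, boolean operators and field prefixes, which a real application would need to handle.

```python
def fuzzify(query, similarity=None):
    """Append '~' (optionally with a similarity value) to each whitespace-separated term."""
    suffix = "~" if similarity is None else f"~{similarity}"
    return " ".join(term + suffix for term in query.split())


print(fuzzify("slor"))
print(fuzzify("delll laptop", 0.8))
```

The rewritten string would then be sent as the q parameter in place of the user's original input.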
Re: complex phrase plugin install
Is this for wildcard search and for searching misspelled words? I need to do the same in my application. -- View this message in context: http://lucene.472066.n3.nabble.com/complex-phrase-plugin-install-tp3533123p3533182.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: need a way so that solr return result for misspelled terms
Hi Erik, I am sorry, I did not understand you exactly. Are you saying that the tilde (~) works for a single term only? For example, if I have a sentence like "i like solr speed for searching" and I search with slor~, will it not work because the term is inside a phrase? Or did I misunderstand you? Please clarify.
Erik Hatcher-4 wrote: The default query parser in Solr is the lucene one. q=term~ But there is nothing that automatically makes terms fuzzy with the ~ at the end. (and fuzzy queries only work on individual terms, not terms inside of phrases). Erik
On Nov 24, 2011, at 03:08, meghana wrote: Okay, I am not very familiar with it. Can I use the Lucene query parser with Solr to make this fuzzy search possible?
Erik Hatcher-4 wrote: Sure... if you're using the lucene query parser and put a ~ after every term in the query :) But that would mean that either the users or your application do this. Erik
On Nov 23, 2011, at 09:03, meghana wrote: Hi Erik, thanks for your reply. I have learned that Lucene provides fuzzy search by appending a tilde (~) to the search term, like delll~0.8. Can we apply such fuzzy logic in Solr in any way? Thanks, Meghana
Erik Hatcher-4 wrote: Meghana - There's currently no facility in Solr to return results for suggestions automatically. You'll have to code this into your client to make another request to Solr for the suggestions returned from the first request. Erik
On Nov 23, 2011, at 07:58, meghana wrote: Hi, I have configured the spellchecker component in Solr. It works with a custom request handler (it is not working with the standard request handler, but that is not the concern right now). However, it returns suggestions for the misspelled term; instead we want to get results directly for the related spellings of the misspelled search term. Can we do this? Any help is much appreciated. Meghana
-- View this message in context: http://lucene.472066.n3.nabble.com/need-a-way-so-that-solr-return-result-for-misspelled-terms-tp3530584p3533198.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Search for misspelled search term
I have configured the spellchecker component in my solrconfig. Below is the configuration:
<requestHandler name="/spellcheck" class="solr.SearchHandler" lazy="true">
  <lst name="defaults">
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
With the above configuration it works with the URL below:
http://192.168.1.59:8080/solr/core0/spellcheck?q=sc:directry&spellcheck=true&spellcheck.build=true
But when I set the same config in my standard request handler, it does not work. Below is the config setting for that:
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
It does not work with the URL below:
http://192.168.1.59:8080/solr/core0/select?q=sc:directry&spellcheck=true&spellcheck.build=true
Does anybody have any idea?
neuron005 wrote: Do you mean stemming? For misspelled words you will have to edit your dictionary (stopwords.txt) i think where you can set solution for misspelled words! Hope So :)
-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Search-for-misspelled-search-term-tp3529961p3530526.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to make effective search with fq and q params
Hi Erik, actually right now almost everything is done in filtering, passing q as *:*, but we need to find a better way if there is one. So, following pravesh, I am thinking of passing the user-entered text in the query and the date and other fields in filter queries. Or, as you say, is q=*:* fast? I have the fields below to search:
Search Term : user-entered text field (passing it in q)
Title : user-entered text field (passing it in fq)
Desc : user-entered text field (passing it in fq)
Appearing : user-entered text field (passing it in fq)
Date Range : (passing it in fq)
Time Zone : (EST, CST, MST, PST) (passing it in fq)
Category : (multiple choice) (passing it in fq)
Market : (multiple choice) (passing it in fq)
Affiliate Network : (multiple choice) (passing it in fq)
I really appreciate your view. Meghana
Jeff Schmidt wrote: Hi Erik: When using [e]dismax, does configuring q.alt=*:* and not specifying q affect the performance/caching in any way? As a side note, a while back I configured q.alt=*:*, and the application (via SolrJ) still set q=*:* if no user input was provided (faceting). With both of them set that way, I got zero results. (Solr 3.4.0) Interesting. Thanks, Jeff
On Nov 22, 2011, at 7:06 AM, Erik Hatcher wrote: If all you're doing is filtering (browsing by facets perhaps), it's perfectly fine to have q=*:*. MatchAllDocsQuery is fast (and would be cached anyway), so use *:* as appropriate without worries. Erik
On Nov 22, 2011, at 07:18, pravesh wrote: Usually, Use the 'q' parameter to search for the free text values entered by the users (where you might want to parse the query and/or apply boosting/phrase-sloppy, minimum match,tie etc ) Use the 'fq' to limit the searches to certain criterias like location, date-ranges etc. Also, avoid using the q=*:* as it implicitly translates to matchalldocsquery Regds Pravesh
-- Jeff Schmidt 535 Consulting jas@ http://www.535consulting.com (650) 423-1068
-- View this message in context: http://lucene.472066.n3.nabble.com/how-to-make-effective-search-with-fq-and-q-params-tp3527217p3529876.html Sent from the Solr - User mailing list archive at Nabble.com.
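The split discussed in this thread (free text in q, restrictions in fq, and *:* when the user enters no text) can be sketched as a small Python helper that builds the request parameters. The field names `category`, `market` and `date` are hypothetical stand-ins for the fields listed in the message, and `build_params` is an illustrative helper, not a SolrJ or Solr API.

```python
from urllib.parse import urlencode, parse_qsl


def build_params(user_text, filters):
    """Put user-entered text in q (or *:* if empty) and each restriction in its own fq."""
    params = [("q", user_text if user_text else "*:*")]
    params += [("fq", f) for f in filters]   # repeated fq parameters
    return urlencode(params)


filters = [
    "category:(news OR sports)",                                 # hypothetical field
    "market:boston",                                             # hypothetical field
    "date:[2011-01-01T00:00:00Z TO 2011-12-31T23:59:59Z]",       # hypothetical field
]
query_string = build_params("new year", filters)
print(query_string)
print(parse_qsl(build_params("", filters))[0])
```

Sending each restriction as a separate fq keeps the filters independent of the scored query, which matches the division of labour pravesh and Erik describe.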
need a way so that solr return result for misspelled terms
Hi, I have configured the spellchecker component in Solr. It works with a custom request handler (it is not working with the standard request handler, but that is not the concern right now). However, it returns suggestions for the misspelled term; instead we want to get results directly for the related spellings of the misspelled search term. Can we do this? Any help is much appreciated. Meghana -- View this message in context: http://lucene.472066.n3.nabble.com/need-a-way-so-that-solr-return-result-for-misspelled-terms-tp3530584p3530584.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: need a way so that solr return result for misspelled terms
Hi Erik,

Thanks for your reply. I have learned that Lucene provides fuzzy search by appending a tilde (~) to the search term, as in delll~0.8. Can we apply such fuzzy logic in Solr in any way?

Thanks
Meghana

Erik Hatcher-4 wrote:

Meghana - There's currently no facility in Solr to return results for suggestions automatically. You'll have to code this into your client to make another request to Solr for the suggestions returned from the first request.

Erik

On Nov 23, 2011, at 07:58, meghana wrote:

Hi, I have configured the spellchecker component in my Solr. It works with a custom request handler (it is not working with the standard request handler, but that is not a concern for now). However, it only returns suggestions for the misspelled term. Instead, we want to get search results directly for the suggested spellings of the misspelled search term. Can we do this? Any help is much appreciated. Meghana

-- View this message in context: http://lucene.472066.n3.nabble.com/need-a-way-so-that-solr-return-result-for-misspelled-terms-tp3530584p3530769.html Sent from the Solr - User mailing list archive at Nabble.com.
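Erik's suggestion (have the client issue a second request using the suggestion from the first) might look roughly like the sketch below. Everything here is an assumption for illustration: `search` is a hypothetical callable supplied by the client that returns a (documents, suggestions) pair, and how it talks to Solr or parses the spellcheck section of the response is deliberately left out.

```python
def search_with_fallback(search, term):
    """Search for `term`; if nothing matches, retry once with the top suggestion.

    `search` is a hypothetical client-side callable: it takes a query string
    and returns (documents, suggestions), both plain lists.
    Returns the query that was finally used together with its documents.
    """
    docs, suggestions = search(term)
    # Keep the original results if there are any, or if there is nothing to retry with.
    if docs or not suggestions:
        return term, docs
    # Second request, using the first suggested spelling.
    corrected = suggestions[0]
    docs, _ = search(corrected)
    return corrected, docs


if __name__ == "__main__":
    # Stand-in for a real Solr call, used only to show the control flow.
    def fake_search(query):
        if query == "dell":
            return ["doc1"], []
        if query == "delll":
            return [], ["dell"]
        return [], []

    print(search_with_fallback(fake_search, "delll"))
```

Returning the corrected term along with the documents lets the UI tell the user which spelling was actually searched.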
Re: need a way so that solr return result for misspelled terms
We are using the Solr query parser. We just need some schema and/or solrconfig configuration so that a search on a misspelled term still finds results.

-- View this message in context: http://lucene.472066.n3.nabble.com/need-a-way-so-that-solr-return-result-for-misspelled-terms-tp3530584p3532979.html Sent from the Solr - User mailing list archive at Nabble.com.
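One option that needs no schema change is to have the client append the fuzzy operator mentioned earlier in this thread (delll~0.8) to each term before sending the query. A minimal sketch, assuming the query is parsed with the standard Lucene syntax; the helper name fuzzy_query is hypothetical, and the meaning of the number after the tilde depends on the Solr/Lucene version (a similarity such as 0.8 in this thread, an edit distance such as ~1 in the Solr 4.2 thread above).

```python
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def fuzzy_query(user_text, similarity=0.8):
    """Append ~<similarity> to every whitespace-separated term of `user_text`.

    Boolean operators are passed through untouched. No escaping of query
    syntax characters is done here; real input would need that as well.
    """
    terms = []
    for term in user_text.split():
        if term in BOOLEAN_OPERATORS:
            terms.append(term)
        else:
            terms.append(f"{term}~{similarity}")
    return " ".join(terms)


if __name__ == "__main__":
    print(fuzzy_query("delll laptop"))
```

Fuzzy matching works against the terms in the index, so analysis steps such as stemming can affect which words match.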
Re: need a way so that solr return result for misspelled terms
This looks good. If possible, I would prefer to do this through Solr features or configuration changes, but I can go with it if that is not possible or does not work well.

Thanks.

iorixxx wrote:

I have configured the spellchecker component in my Solr. It works with a custom request handler (it is not working with the standard request handler, but that is not a concern for now). However, it only returns suggestions for the misspelled term. Instead, we want to get search results directly for the suggested spellings of the misspelled search term.

You might be interested in this: http://sematext.com/products/dym-researcher/index.html

-- View this message in context: http://lucene.472066.n3.nabble.com/need-a-way-so-that-solr-return-result-for-misspelled-terms-tp3530584p3532983.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to make effective search with fq and q params
Thanks, Pravesh, for your reply. I will definitely try this. I hope it will improve Solr's response time.

pravesh wrote:

Usually, use the 'q' parameter to search for the free-text values entered by the users (where you might want to parse the query and/or apply boosting, phrase slop, minimum match, tie etc.). Use 'fq' to limit the search to certain criteria like location, date ranges etc. Also, avoid using q=*:* as it implicitly translates to MatchAllDocsQuery.

Regds Pravesh

-- View this message in context: http://lucene.472066.n3.nabble.com/how-to-make-effective-search-with-fq-and-q-params-tp3527217p3527654.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr Search for misspelled search term
Hi all,

I need to find a way for Solr to check for and return results when the search term is misspelled. Does anybody have any ideas?

Thank you!
Meghana

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Search-for-misspelled-search-term-tp3529961p3529961.html Sent from the Solr - User mailing list archive at Nabble.com.