Highlighting search result without using solrnet code with SOLR 4.1
Hi,

I want to highlight the search results without using the highlighting parameters provided by SolrNet. My configuration for highlighting is below.

Here is my schema.xml:

  <field name="guid" type="text_en" indexed="true" stored="true"/>
  <field name="title" type="text_en" indexed="true" stored="true"/>
  <field name="link" type="text_en" indexed="true" stored="true"/>
  <field name="fulltext" type="text_en" indexed="true" stored="true"/>
  <field name="scope" type="text_en" indexed="true" stored="true"/>

The following is the configuration in solrconfig.xml:

  <str name="hl">on</str>
  <str name="hl.fl">fulltext title</str>
  <str name="hl.encoder">html</str>
  <str name="hl.fragListBuilder">simple</str>
  <str name="hl.simple.pre">&lt;em&gt;</str>
  <str name="hl.simple.post">&lt;/em&gt;</str>
  <str name="f.title.hl.fragsize">0</str>
  <str name="f.title.hl.alternateField">title</str>
  <str name="f.name.hl.fragsize">0</str>
  <str name="f.name.hl.alternateField">title</str>
  <str name="f.content.hl.snippets">20</str>
  <str name="f.content.hl.fragsize">2000</str>
  <str name="f.content.hl.alternateField">fulltext</str>
  <str name="f.content.hl.maxAlternateFieldLength">2000</str>
  <str name="hl.fragmenter">regex</str>

When I search for fulltext:What rules Apply, I get the following highlighting response, which is correct:

  <lst name="highlighting">
    <lst name="E836D2CC-76EF-4EC2-AD00-00015074537E">
      <arr name="fulltext">
        <str>3538. <em>What</em> <em>rules</em> <em>apply</em> to correction of errors in nonqualified deferred compensation plans</str>
      </arr>
    </lst>
    <lst name="63DA3DDB-AAF1-435B-8AA4-00BB60F596A2">
      <arr name="fulltext">
        <str>3723. What is a Section 1042 election? <em>What</em> <em>rules</em> <em>apply</em> to qualified sales to an ESOP</str>
      </arr>
    </lst>
  </lst>

I want to display these highlighted results in my application. I am using C# and do not want to use the SolrNet DLL. Is it possible to show the highlighting without code? Please advise.
Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-search-result-without-using-solrnet-code-with-SOLR-4-1-tp414.html Sent from the Solr - User mailing list archive at Nabble.com.
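Since Solr already wraps the matched terms in <em> tags, the application mainly has to read the "highlighting" section of the response and print the snippets. The following is a minimal sketch of that step, written in Python for brevity; the same approach should carry over to C# with any HTTP client and JSON or XML parser. It assumes the response was requested with wt=json, and it uses a hard-coded sample payload (copied from the response quoted above) instead of a live HTTP call. The function name snippets_by_document is made up for illustration.

```python
import json

# Sample payload shaped like the "highlighting" section of a Solr JSON
# response (wt=json is an assumption). The snippets are copied from the
# response quoted in the message above.
SAMPLE_RESPONSE = json.dumps({
    "highlighting": {
        "E836D2CC-76EF-4EC2-AD00-00015074537E": {
            "fulltext": [
                "3538. <em>What</em> <em>rules</em> <em>apply</em> to correction "
                "of errors in nonqualified deferred compensation plans"
            ]
        },
        "63DA3DDB-AAF1-435B-8AA4-00BB60F596A2": {
            "fulltext": [
                "3723. What is a Section 1042 election? <em>What</em> "
                "<em>rules</em> <em>apply</em> to qualified sales to an ESOP"
            ]
        },
    }
})


def snippets_by_document(raw_json, field):
    """Map each document id to its highlighted snippets for `field`."""
    highlighting = json.loads(raw_json).get("highlighting", {})
    return {
        doc_id: fields.get(field, [])
        for doc_id, fields in highlighting.items()
    }


if __name__ == "__main__":
    for doc_id, snippets in snippets_by_document(SAMPLE_RESPONSE, "fulltext").items():
        for snippet in snippets:
            # The snippet already contains the <em> markup configured via
            # hl.simple.pre / hl.simple.post, so it can be written into the page.
            print(doc_id, "->", snippet)
```

The ids in the highlighting section are the unique keys of the documents, so they can be matched against the documents in the main result list.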
Fetch Unique Values
Hi,

I wish to execute a Solr query and fetch the first 10,000 unique records based on a field. How do I achieve this in Solr? I am using version 4.2.0.

Thanks in advance!

--
View this message in context: http://lucene.472066.n3.nabble.com/Fetch-Unique-Values-tp4095371.html
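One possible way to read "unique records based on a field" is Solr's result grouping, which returns one group per distinct field value. The sketch below only builds the request parameters; it is not a confirmed answer from this thread. The field name "category" and the function name are placeholders, and the HTTP call itself is left out.

```python
from urllib.parse import urlencode


def unique_by_field_params(field, limit=10000, query="*:*"):
    """Build query parameters that ask for one document per distinct value of `field`."""
    return {
        "q": query,
        "group": "true",       # switch on result grouping
        "group.field": field,  # one group per distinct value of this field
        "group.limit": 1,      # keep a single document from each group
        "rows": limit,         # with grouping on, rows caps the number of groups
        "wt": "json",
    }


# "category" is a hypothetical field name used only for illustration.
query_string = urlencode(unique_by_field_params("category"))

if __name__ == "__main__":
    print("/select?" + query_string)
```

If only the distinct values are needed rather than whole documents, faceting on the field may be a lighter alternative.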
Re: Search with punctuations
Hi Erick,

I modified the field type in the SOLR schema file as follows and re-indexed:

  <fieldType name="CustomStr" class="solr.TextField" positionIncrementGap="100" sortMissingLast="true">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="0" catenateAll="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
  </fieldType>

My previous scenario now works fine: when I search for INTL, I get both the records containing INTL and those containing INT'L.

However, I can no longer perform a STARTS WITH search. My field has values like INTERNATIONAL XYZ LOCAL and PLAY OF INTERNATIONAL XYZ. When I perform a STARTS WITH search for the keyword INTERNATIONAL, it returns both values, but ideally it should return only INTERNATIONAL XYZ LOCAL. To perform the STARTS WITH search I append * to the keyword, so in my case the keyword becomes INTERNATIONAL*. It seems that the STARTS WITH search has started behaving like a CONTAINS search.

Please suggest how I can achieve a STARTS WITH search on the same field type.

Thanks!

--
View this message in context: http://lucene.472066.n3.nabble.com/Search-with-punctuations-tp4077510p4078591.html
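A rough model may help show why INTERNATIONAL* now matches both values. The sketch below is a simplified stand-in for the index analyzer in the message above, not the real WordDelimiterFilterFactory: it only splits on non-alphanumeric characters, lowercases, and adds the catenated form, whereas the real filter has further rules (case changes, letter/digit transitions). The function names are invented for illustration.

```python
import re


def index_tokens(value):
    """Simplified model of the index analyzer: word parts plus the catenated form."""
    parts = [p.lower() for p in re.split(r"[^A-Za-z0-9]+", value) if p]
    tokens = set(parts)
    if len(parts) > 1:
        tokens.add("".join(parts))  # rough stand-in for catenateWords="1"
    return tokens


def prefix_matches(value, prefix):
    """A prefix query matches when ANY indexed token starts with the prefix."""
    prefix = prefix.lower()
    return any(token.startswith(prefix) for token in index_tokens(value))


def whole_value_prefix_matches(value, prefix):
    """Starts-with semantics when the whole value is kept as a single token."""
    return value.strip().lower().startswith(prefix.lower())


if __name__ == "__main__":
    for value in ("INTERNATIONAL XYZ LOCAL", "PLAY OF INTERNATIONAL XYZ"):
        print(value,
              "| split tokens:", prefix_matches(value, "INTERNATIONAL"),
              "| single token:", whole_value_prefix_matches(value, "INTERNATIONAL"))
```

In this model the word-part tokens are what make the prefix query behave like a CONTAINS search. One design option, stated here as an assumption rather than a tested fix, is to keep two fields: one with the word-delimiter analysis for the INTL / INT'L case and one that keeps the value as a single token for STARTS WITH.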
Re: Search with punctuations
Hi Erick,

Thanks for your reply! I have tried both of the suggestions that you mentioned:

1. Using WhitespaceTokenizerFactory
2. Using WordDelimiterFilterFactory with catenateWords=1

But I still face the same issue. Must the tokenizers and filter factories be the same for both the query and index analyzers?

In my scenario, when I search for INTL, I want SOLR to return both the records containing INTL and those containing INT'L. Please suggest other alternatives to achieve this.

Thanks!

--
View this message in context: http://lucene.472066.n3.nabble.com/Search-with-punctuations-tp4077510p4077973.html
Search with punctuations
Hi,

Scenario: Users who perform a search forget to type the punctuation mark (apostrophe). For example, when a user wants to search for a value like INT'L, they just key in INTL (with no punctuation). In this scenario, I wish to return both the INTL and INT'L values that are currently indexed on the SOLR instance. Currently, if I search for INTL it won't return the row with the value INT'L.

Schema configuration entry for the field type:

  <fieldType name="customStr" class="solr.TextField" positionIncrementGap="100" sortMissingLast="true">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="\s*[,.]\s*" replacement="" replace="all"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement="" replace="all"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="[';]" replacement="" replace="all"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="\s*[,.]\s*" replacement="" replace="all"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="\s+" replacement="" replace="all"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="[';]" replacement="" replace="all"/>
    </analyzer>
  </fieldType>

Please suggest what mechanism I should use to fetch both INTL and INT'L when the search is performed for INTL. Also, do the regular expressions in the analyzers look correct? Which other filters or tokenizers can be used to overcome this issue?

Thanks!

--
View this message in context: http://lucene.472066.n3.nabble.com/Search-with-punctuations-tp4077510.html
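As a quick check of the regular expressions, the filter chain above can be approximated outside Solr. The sketch below mimics the lowercase, trim, and three pattern-replace steps with Python's re module, assuming each replacement is the empty string (the replacement attribute appears empty in the posted config). It is a model of the chain, not Solr's own analysis; the Analysis page in the Solr admin UI remains the authoritative check.

```python
import re

# The three patterns from the posted field type, applied in the same order.
# Replacing with "" is an assumption based on the empty replacement attribute.
REPLACEMENTS = [
    (re.compile(r"\s*[,.]\s*"), ""),  # commas and periods, with surrounding spaces
    (re.compile(r"\s+"), ""),         # any remaining whitespace
    (re.compile(r"[';]"), ""),        # apostrophes and semicolons
]


def normalize(value):
    """Approximate the analyzer: lowercase, trim, then apply each replacement."""
    token = value.lower().strip()
    for pattern, replacement in REPLACEMENTS:
        token = pattern.sub(replacement, token)
    return token


if __name__ == "__main__":
    for raw in ("INT'L", "INTL", " Int'l. "):
        print(repr(raw), "->", repr(normalize(raw)))
```

In this model INT'L and INTL reduce to the same token, which suggests the patterns themselves do what is intended.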
Re: Searching for terms having embedded white spaces like word1 word2
Thank you so very much, Jack, for your prompt reply. Your solution worked for us.

I have another issue with querying fields that have values of the sort <string>This is good</string><string>This is also good</string><string>This is excellent</string>. I want to perform Starts With as well as Contains searches on this field. The field definition is as follows:

  <fieldType name="cust_str" class="solr.TextField" positionIncrementGap="100" sortMissingLast="true">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Please suggest how to perform the above-mentioned searches.

--
View this message in context: http://lucene.472066.n3.nabble.com/Searching-for-terms-having-embedded-white-spaces-like-word1-word2-tp4064170p4064355.html
Searching for terms having embedded white spaces like word1 word2
Hi Guys,

I have a field defined with the following custom field type:

  <fieldType name="cust_str" class="solr.TextField" positionIncrementGap="100" sortMissingLast="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

This field has values like SAN MIGUEL, SAN JUAN, SAN DIEGO, etc. I wish to perform Starts With and Contains searches on these values, and I query SOLR as follows:

- Starts With: field:SAN M*
- Contains: field:*SAN M*

But SOLR is not returning correct results because of the white space. What modifications do I need to make for the searches to work on values with embedded white spaces?

--
View this message in context: http://lucene.472066.n3.nabble.com/Searching-for-terms-having-embedded-white-spaces-like-word1-word2-tp4064170.html
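One commonly used approach for wildcard queries on values that contain spaces is to keep the whole value as a single token (for example with KeywordTokenizerFactory) and to escape the space in the query with a backslash, so the query parser does not split the term. The sketch below only builds such query strings. It is an illustration under those assumptions, not necessarily the solution given in this thread, and the helper names are made up.

```python
import re

# Characters with special meaning in the Lucene/Solr query syntax, plus
# whitespace. Each one is prefixed with a backslash.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|\s])')


def escape_term(text):
    """Backslash-escape query-syntax characters and whitespace in a user-supplied term."""
    return _SPECIAL.sub(r"\\\1", text)


def starts_with_query(field, text):
    """Prefix query: the escaped term followed by a trailing wildcard."""
    return "{}:{}*".format(field, escape_term(text))


def contains_query(field, text):
    """Substring query: wildcards on both sides (leading wildcards can be slow)."""
    return "{}:*{}*".format(field, escape_term(text))


if __name__ == "__main__":
    print(starts_with_query("field", "SAN M"))
    print(contains_query("field", "SAN M"))
```

With a tokenizer that splits on whitespace, such as StandardTokenizerFactory, no single indexed term contains the space, so an escaped query of this kind would have nothing to match.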
RE: SOLR - Unable to execute query error - DIH
Thanks James.

We have tried the following options *(individually)*, including the one you suggested:

1. selectMethod=cursor
2. batchSize=-1
3. responseBuffering=adaptive

But the indexing process doesn't seem to improve at all. When we index a set of 500 rows it works well and completes in 18 min. For 1000K rows it took 22 hours to index. But when we try to index the complete set of 750K rows, it doesn't show any progress and keeps on executing.

Currently both the SQL Server machine and the SOLR machine are running with 4 GB RAM. Is the behaviour above expected with this configuration? If we upgrade the RAM, which machine should it be, the SOLR machine or the SQL Server machine?

Are there any other efficient methods to import/index data from SQL Server into SOLR?

Thanks!

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Unable-to-execute-query-error-DIH-tp4051028p4051981.html
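For a sense of scale, the 500-row timing can be extrapolated to the full view. The sketch below is a back-of-the-envelope calculation that assumes indexing time grows linearly with the number of rows, which may well not hold in practice; the figures are the ones quoted in the message above.

```python
def estimated_minutes(total_rows, sample_rows, sample_minutes):
    """Linear extrapolation of indexing time from a timed sample."""
    return total_rows * sample_minutes / sample_rows


# Figures from the message: 500 rows took 18 minutes; the full view has 750K rows.
minutes = estimated_minutes(750_000, 500, 18)
days = minutes / (60 * 24)
seconds_per_row = 18 * 60 / 500

if __name__ == "__main__":
    print("seconds per row:", seconds_per_row)
    print("estimated minutes for 750K rows:", minutes)
    print("estimated days for 750K rows:", days)
```

At roughly two seconds per row, the per-row cost itself looks like the thing to investigate, independently of how much RAM either machine has.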
SOLR - Unable to execute query error - DIH
Hello All,

I am trying to index data from a SQL Server view into SOLR using the DIH with the full-import command. The view has 750K rows and 427 columns. During the first execution I indexed only the first 50 rows of the view, and the data got indexed in 10 min. But when I executed the same scenario to index the complete set of 750K rows, the execution continued for 2 days and then rolled back, giving me the following error:

  Unable to execute the query: select * from.

Following is my DIH configuration file:

  <dataConfig>
    <dataSource type="JdbcDataSource" driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://server1\sql2012;databaseName=DBName" user="x" password="x"/>
    <document name="Search" batchsize="1">
      <entity name="Search" query="select top 500 * from view">
        <field column="ID" name="Id"/>

As suggested in some of the posts, I did try batchsize=-1, but it didn't work out. Please suggest whether this is the correct approach or whether any parameter needs to be modified for tuning.

Thanks!

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Unable-to-execute-query-error-DIH-tp4051028.html
Re: SOLR - Unable to execute query error - DIH
In the context of the above scenario, when I try to index a set of 500 rows, it fetches and indexes around 400-odd rows and then shows no progress and keeps on executing. What could be the cause of this issue? If any of you have been through such a scenario, please share the details.

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Unable-to-execute-query-error-DIH-tp4051028p4051034.html
SOLR - Documents with large number of fields ~ 450
Hello All,

Scenario: My data model consists of approximately 450 fields with different types of data. We want to index every field, so each SOLR document will have *450 fields*. The total number of records in the data set is *755K*. We will be using features like faceting and sorting on approximately 50 fields. We are planning to use SOLR 4.1. Following is the hardware configuration of the web server that we plan to install SOLR on:

CPU: 2 x Dual Core (4 cores) | RAM: 12GB | Storage: 212 GB

Questions:

1) What is the best approach when dealing with documents with a large number of fields? What are the drawbacks of having a single document with a very large number of fields? Does SOLR support documents with as many fields as in my case?

2) Will there be any performance issue if I define all of the 450 fields for indexing? What about faceting on 50 fields when the documents have a large number of fields and there is a huge number of records?

3) The names of the fields in the data set are quite lengthy, around 60 characters. Will it be a problem to define fields with such long names in the schema file? Is there any best practice for naming conventions? Will long field names create problems during querying?

Thanks!

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-Documents-with-large-number-of-fields-450-tp4049633.html