RE: Unable to perform search query after changing uniqueKey
Gently walking into rough waters here, but if you use any API with GET, you're sending a URI which must be properly encoded. This has nothing to do with the programming language that generates key and store pairs on the browser or the one(s) used on the server. Lots and lots of good folks have tripped over this one. http://www.w3schools.com/tags/ref_urlencode.asp Play hard, but play safe!

Date: Wed, 1 Apr 2015 13:58:55 +0800 Subject: Re: Unable to perform search query after changing uniqueKey From: edwinye...@gmail.com To: solr-user@lucene.apache.org

Thanks Erick. Yes, it is able to work correctly if I do not use spaces for the field names, especially for the uniqueKey. Regards, Edwin

On 31 March 2015 at 13:58, Erick Erickson erickerick...@gmail.com wrote: I would never put spaces in my field names! Frankly I have no clue what Solr does with that, but it can't be good. Solr explicitly supports Java naming conventions: camel case, underscores and numbers. Special symbols are frowned upon; I never use anything but upper case, lower case and underscores. Actually, I don't use upper case either, but that's a personal preference. Other things might work, but only by chance. Best, Erick

On Mon, Mar 30, 2015 at 8:59 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: The latest information I've found is that the error only occurs for shard2. If I do a search for just shard1, the records assigned to shard1 can be displayed. Only when I search shard2 does the NullPointerException occur. Previously I was searching both shards. Are there any settings I need to apply to shard2 in order to solve this issue?
Currently I have not made any changes to the shards since I created it using http://localhost:8983/solr/admin/collections?action=CREATE&name=nps1&numShards=2&collection.configName=collection1 Regards, Edwin

On 31 March 2015 at 09:42, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I've changed the uniqueKey from id to Item No.

<uniqueKey>Item No</uniqueKey>

Below are my definitions for both the id and Item No fields:

<field name="id" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="Item No" type="text_general" indexed="true" stored="true"/>

Regards, Edwin

On 30 March 2015 at 23:05, Erick Erickson erickerick...@gmail.com wrote: Well, let's see the definition of your ID field, 'cause I'm puzzled. It's definitely A Bad Thing to have it be any kind of tokenized field though, but that's a shot in the dark. Best, Erick

On Mon, Mar 30, 2015 at 2:17 AM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Mostafa, Yes, I've defined all the fields in schema.xml. It is able to work on the version without SolrCloud, but it is not working for the one with SolrCloud. Both of them are using the same schema.xml. Regards, Edwin

On 30 March 2015 at 14:34, Mostafa Gomaa mostafa.goma...@gmail.com wrote: Hi Zheng, It's possible that there's a problem with your schema.xml. Are all fields defined and have appropriate options enabled? Regards, Mostafa.

On Mon, Mar 30, 2015 at 7:49 AM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I've tried that, and removed the data directory from both the shards. But the same problem still occurs, so we probably can rule out the memory issue. Regards, Edwin

On 30 March 2015 at 12:39, Erick Erickson erickerick...@gmail.com wrote: I meant shut down Solr and physically remove the entire data directory. Not saying this is the cure, but it can't hurt to rule out the index having memory...
Best, Erick

On Sun, Mar 29, 2015 at 6:35 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I used the following queries to delete all the index.

http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>
http://localhost:8983/solr/update?stream.body=<commit/>

Or is it better to physically delete the entire data directory? Regards, Edwin

On 28 March 2015 at 02:27, Erick Erickson erickerick...@gmail.com wrote: You say you re-indexed, did you _completely_ remove the data directory first, i.e. the parent of the index and, maybe, tlog directories? I've occasionally seen remnants of old definitions pollute the new one, and since the uniqueKey key is so fundamental I can see it being a problem. Best, Erick

On Fri, Mar 27, 2015 at 1:42 AM, Andrea Gazzarini
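The URI-encoding point made at the top of this thread can be illustrated with Python's standard library. The field name below is the one from this thread; the query parameter map is hypothetical:

```python
from urllib.parse import quote, urlencode

# A field name containing a space must be percent-encoded before it goes
# into a GET request URI -- this is independent of client language.
field = "Item No"
encoded = quote(field)
print(encoded)  # Item%20No

# urlencode handles whole parameter maps (the query value here is made up):
params = urlencode({"q": 'Item No:"ABC-123"', "wt": "json"})
print(params)
```

Note that urlencode escapes spaces as '+' (form encoding) while quote uses '%20'; both are standard ways to encode a space in a query string.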
RE: Spark-Solr in python
There is a Python package for SolrCloud, https://pypi.python.org/pypi/solrcloudpy, but I don't know if there is a possibility to connect it to Spark.

-Original Message- From: Timothy Potter [mailto:thelabd...@gmail.com] Sent: Tuesday, March 31, 2015 23:15 To: solr-user@lucene.apache.org Subject: Re: Spark-Solr in python

You'll need a python lib that uses a python ZooKeeper client to be SolrCloud-aware so that you can do RDD-like things, such as reading from all shards in a collection in parallel. I'm not aware of any Solr py libs that are cloud-aware yet, but it would be a good contribution to upgrade https://github.com/toastdriven/pysolr to be SolrCloud-aware.

On Mon, Mar 30, 2015 at 11:31 PM, Chaushu, Shani shani.chau...@intel.com wrote: Hi, I saw there is a tool for reading Solr into a Spark RDD in Java. I want to do something like this in Python; is there any package in Python for reading Solr into a Spark RDD? Thanks, Shani

- Intel Electronics Ltd. This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. -
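To make Tim's point concrete: a SolrCloud-aware client mostly needs to read the cluster state from ZooKeeper and fan out over shards. Below is a minimal, hypothetical sketch that parses a made-up clusterstate.json snippet (in practice you would fetch it with a ZooKeeper client such as kazoo); all hosts and core names are invented:

```python
import json

# Hypothetical snippet in the shape of a SolrCloud clusterstate.json;
# a cloud-aware Python client would fetch this from ZooKeeper and then
# read one replica URL per shard in parallel (the RDD-like behaviour
# Tim describes).
clusterstate = json.loads("""
{
  "collection1": {
    "shards": {
      "shard1": {"replicas": {"core_node1": {"base_url": "http://host1:8983/solr",
                 "core": "collection1_shard1_replica1", "state": "active"}}},
      "shard2": {"replicas": {"core_node2": {"base_url": "http://host2:8983/solr",
                 "core": "collection1_shard2_replica1", "state": "active"}}}
    }
  }
}
""")

def shard_urls(state, collection):
    """Return one active replica URL per shard, suitable for parallel reads."""
    urls = []
    for shard in state[collection]["shards"].values():
        for replica in shard["replicas"].values():
            if replica["state"] == "active":
                urls.append(replica["base_url"] + "/" + replica["core"])
                break  # one replica per shard is enough for a full read
    return urls

print(shard_urls(clusterstate, "collection1"))
```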
Solr Cloud Security not working for internal authentication
I am trying to use Solr Security on Solr 5.0 Cloud. This is the process I have used:

1. Modifying web.xml:

<security-constraint>
  <web-resource-collection>
    <web-resource-name>AdminAllowedQueries</web-resource-name>
    <url-pattern>/admin/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>admin</role-name>
  </auth-constraint>
</security-constraint>
<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Solr Realm</realm-name>
</login-config>
<security-role>
  <description>Admin</description>
  <role-name>admin</role-name>
</security-role>

2. Changes in jetty.xml:

<Call name="addBean">
  <Arg>
    <New class="org.eclipse.jetty.security.HashLoginService">
      <Set name="name">Solr Realm</Set>
      <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
      <Set name="refreshInterval">0</Set>
    </New>
  </Arg>
</Call>

3. Creating realm.properties:

solradmin: solradmin,admin

4. Set SOLR_OPTS in solr.in.sh:

SOLR_OPTS="$SOLR_OPTS -DinternalAuthCredentialsBasicAuthUsername=solradmin"
SOLR_OPTS="$SOLR_OPTS -DinternalAuthCredentialsBasicAuthPassword=solradmin"

I am getting an Unauthorized error while creating a collection using the following command:

curl -i -X GET \
  -H "Authorization: Basic c29scmFkbWluOnNvbHJhZG1pbg==" \
  'http://localhost:8080/solr/admin/collections?action=CREATE&name=test&collection.configName=testconf&numShards=1'

Kindly help or suggest the best way to get this done. Thanx in advance. Regards, Swaraj Kumar Senior Software Engineer I MakeMyTrip.com ✆ +91-9811774497
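One quick sanity check for a setup like this: the Authorization header in the curl command is plain HTTP Basic auth, i.e. base64 of "user:password". A small Python check that the header value above matches the solradmin/solradmin pair from realm.properties:

```python
import base64

# HTTP Basic auth: the header value is just base64("user:password").
creds = base64.b64encode(b"solradmin:solradmin").decode("ascii")
print(creds)  # c29scmFkbWluOnNvbHJhZG1pbg==
```

If this did not match the header in the request, a 401 would be expected regardless of the realm configuration.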
Re: Unable to perform search query after changing uniqueKey
Thanks Erick. Yes, it is able to work correctly if I do not use spaces for the field names, especially for the uniqueKey. Regards, Edwin

On 31 March 2015 at 13:58, Erick Erickson erickerick...@gmail.com wrote: I would never put spaces in my field names! Frankly I have no clue what Solr does with that, but it can't be good. Solr explicitly supports Java naming conventions: camel case, underscores and numbers. Special symbols are frowned upon; I never use anything but upper case, lower case and underscores. Actually, I don't use upper case either, but that's a personal preference. Other things might work, but only by chance. Best, Erick

On Mon, Mar 30, 2015 at 8:59 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: The latest information I've found is that the error only occurs for shard2. If I do a search for just shard1, the records assigned to shard1 can be displayed. Only when I search shard2 does the NullPointerException occur. Previously I was searching both shards. Are there any settings I need to apply to shard2 in order to solve this issue? Currently I have not made any changes to the shards since I created it using http://localhost:8983/solr/admin/collections?action=CREATE&name=nps1&numShards=2&collection.configName=collection1 Regards, Edwin

On 31 March 2015 at 09:42, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I've changed the uniqueKey from id to Item No.

<uniqueKey>Item No</uniqueKey>

Below are my definitions for both the id and Item No fields:

<field name="id" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="Item No" type="text_general" indexed="true" stored="true"/>

Regards, Edwin

On 30 March 2015 at 23:05, Erick Erickson erickerick...@gmail.com wrote: Well, let's see the definition of your ID field, 'cause I'm puzzled. It's definitely A Bad Thing to have it be any kind of tokenized field though, but that's a shot in the dark.
Best, Erick

On Mon, Mar 30, 2015 at 2:17 AM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Mostafa, Yes, I've defined all the fields in schema.xml. It is able to work on the version without SolrCloud, but it is not working for the one with SolrCloud. Both of them are using the same schema.xml. Regards, Edwin

On 30 March 2015 at 14:34, Mostafa Gomaa mostafa.goma...@gmail.com wrote: Hi Zheng, It's possible that there's a problem with your schema.xml. Are all fields defined and have appropriate options enabled? Regards, Mostafa.

On Mon, Mar 30, 2015 at 7:49 AM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I've tried that, and removed the data directory from both the shards. But the same problem still occurs, so we probably can rule out the memory issue. Regards, Edwin

On 30 March 2015 at 12:39, Erick Erickson erickerick...@gmail.com wrote: I meant shut down Solr and physically remove the entire data directory. Not saying this is the cure, but it can't hurt to rule out the index having memory... Best, Erick

On Sun, Mar 29, 2015 at 6:35 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I used the following queries to delete all the index.

http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>
http://localhost:8983/solr/update?stream.body=<commit/>

Or is it better to physically delete the entire data directory? Regards, Edwin

On 28 March 2015 at 02:27, Erick Erickson erickerick...@gmail.com wrote: You say you re-indexed, did you _completely_ remove the data directory first, i.e. the parent of the index and, maybe, tlog directories? I've occasionally seen remnants of old definitions pollute the new one, and since the uniqueKey key is so fundamental I can see it being a problem. Best, Erick

On Fri, Mar 27, 2015 at 1:42 AM, Andrea Gazzarini a.gazzar...@gmail.com wrote: Hi Edwin, please provide some other detail about your context (e.g. complete stacktrace, query you're issuing). Best, Andrea

On 03/27/2015 09:38 AM, Zheng Lin Edwin Yeo wrote: Hi everyone, I've changed my uniqueKey to another name, instead of using id, in the schema.xml. However, after I have done the indexing (the indexing is successful), I'm not able to perform a search query on it. It gives the error java.lang.NullPointerException. Is there any other place I need to configure, besides changing the uniqueKey field
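A side note on the delete-all URLs quoted in this thread: when sent via GET, the stream.body XML must itself be URL-encoded. A sketch using Python's standard library (the host and path are the ones from the thread):

```python
from urllib.parse import quote

# The stream.body value is XML, so it must be percent-encoded before
# being placed in the query string of a GET request.
base = "http://localhost:8983/solr/update"
delete_url = base + "?stream.body=" + quote("<delete><query>*:*</query></delete>")
commit_url = base + "?stream.body=" + quote("<commit/>")
print(delete_url)
print(commit_url)
```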
Re: Collapse and Expand behaviour on result with 1 document.
Hi Joel, Correct me if my understanding is wrong, using supplier id as the field to collapse on:

- If the collapse group heads in the main result set have only 1 document in each group, the expanded section will be empty since there are no documents to expand for each collapse group.
- To render the page, I need to iterate the main result set. For each document I have to check if there is an expanded group with the same supplier id.
- The facet counts are based on the number of collapse groups in the main result set (<result maxScore="6.470696" name="response" numFound="27" start="0">).

-Derek

On 3/31/2015 7:43 PM, Joel Bernstein wrote: The way that collapse/expand is designed to be used is as follows: The main result set will contain the collapsed group heads. The expanded section will contain the expanded groups for the page of results. To render the page you iterate the main result set. For each document check to see if there is an expanded group. Joel Bernstein http://joelsolr.blogspot.com/

On Tue, Mar 31, 2015 at 7:37 AM, Joel Bernstein joels...@gmail.com wrote: You should be able to use collapse/expand with one result. Does the document in the main result set have group members that aren't being expanded? Joel Bernstein http://joelsolr.blogspot.com/

On Tue, Mar 31, 2015 at 2:00 AM, Derek Poh d...@globalsources.com wrote: If I want to group the results (by a certain field) even if there is only 1 document, I should use the group parameter instead? The requirement is to group the result of product documents by their supplier id. group=true&group.field=P_SupplierId&group.limit=5 Is it true that the performance of collapse is better than the group parameter on a large data set, say 10-20 million documents? -Derek

On 3/31/2015 10:03 AM, Joel Bernstein wrote: The expanded section will only include groups that have expanded documents. So, if the document that is in the main result set has no documents to expand, then this is working as expected.
Joel Bernstein http://joelsolr.blogspot.com/

On Mon, Mar 30, 2015 at 8:43 PM, Derek Poh d...@globalsources.com wrote: Hi, I have a query which returns 1 document. When I add the collapse and expand parameters to it, expand=true&expand.rows=5&fq={!collapse field=P_SupplierId}, the expanded section is empty (<lst name="expanded"/>). Is this the behaviour of the collapse and expand parameters on a result which contains only 1 document? -Derek
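Joel's rendering recipe (iterate the main result set of group heads, then look up each head's expanded group) can be sketched as follows; the documents and supplier ids below are made up:

```python
# Collapsed group heads, as returned in the main result set:
main_docs = [
    {"id": "p1", "P_SupplierId": "s100"},
    {"id": "p2", "P_SupplierId": "s200"},
]
# The expanded section, keyed by the collapse field value. Only groups
# that actually have extra members appear here -- which is why a
# one-document result yields an empty expanded section.
expanded = {
    "s100": [{"id": "p3", "P_SupplierId": "s100"}],
}

rows = []
for head in main_docs:
    group = expanded.get(head["P_SupplierId"], [])  # may legitimately be empty
    rows.append((head["id"], [d["id"] for d in group]))
print(rows)
```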
solr 4.10.3 and index.xxxxxxxxxxx directory
Hi, Is it normal with Solr 4.10.3 that the data directory of replicas still contains directories like index.3636365667474747 and index.999080980976, and the files index.properties and replica.properties? If yes, why, and in which circumstances? Regards, Dominique
Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
<entity name="test1" processor="LineEntityProcessor" dataSource="fds" url="test.csv" rootEntity="true"
        transformer="RegexTransformer,TemplateTransformer">
  <field column="rawLine"
         regex="^(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*)$"
         groupNames="test,, ,,,is_frequency_cap_enabled,,,daily_spend_limit,,,"/>
  <field column="table_name" name="table_name" template="test1"/>
</entity>

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Collapse and Expand behaviour on result with 1 document.
Exactly correct. Joel Bernstein http://joelsolr.blogspot.com/

On Wed, Apr 1, 2015 at 5:44 AM, Derek Poh d...@globalsources.com wrote: Hi Joel, Correct me if my understanding is wrong, using supplier id as the field to collapse on:

- If the collapse group heads in the main result set have only 1 document in each group, the expanded section will be empty since there are no documents to expand for each collapse group.
- To render the page, I need to iterate the main result set. For each document I have to check if there is an expanded group with the same supplier id.
- The facet counts are based on the number of collapse groups in the main result set (<result maxScore="6.470696" name="response" numFound="27" start="0">).

-Derek

On 3/31/2015 7:43 PM, Joel Bernstein wrote: The way that collapse/expand is designed to be used is as follows: The main result set will contain the collapsed group heads. The expanded section will contain the expanded groups for the page of results. To render the page you iterate the main result set. For each document check to see if there is an expanded group. Joel Bernstein http://joelsolr.blogspot.com/

On Tue, Mar 31, 2015 at 7:37 AM, Joel Bernstein joels...@gmail.com wrote: You should be able to use collapse/expand with one result. Does the document in the main result set have group members that aren't being expanded? Joel Bernstein http://joelsolr.blogspot.com/

On Tue, Mar 31, 2015 at 2:00 AM, Derek Poh d...@globalsources.com wrote: If I want to group the results (by a certain field) even if there is only 1 document, I should use the group parameter instead? The requirement is to group the result of product documents by their supplier id. group=true&group.field=P_SupplierId&group.limit=5 Is it true that the performance of collapse is better than the group parameter on a large data set, say 10-20 million documents? -Derek

On 3/31/2015 10:03 AM, Joel Bernstein wrote: The expanded section will only include groups that have expanded documents.
So, if the document that is in the main result set has no documents to expand, then this is working as expected. Joel Bernstein http://joelsolr.blogspot.com/

On Mon, Mar 30, 2015 at 8:43 PM, Derek Poh d...@globalsources.com wrote: Hi, I have a query which returns 1 document. When I add the collapse and expand parameters to it, expand=true&expand.rows=5&fq={!collapse field=P_SupplierId}, the expanded section is empty (<lst name="expanded"/>). Is this the behaviour of the collapse and expand parameters on a result which contains only 1 document? -Derek
Customzing Solr Dedupe
I'm facing a challenge using de-duplication of Solr documents. De-duplication is done using TextProfileSignature with the following parameters:

<str name="fields">field1, field2, field3</str>
<str name="quantRate">0.5</str>
<str name="minTokenLen">3</str>

Here field3 is normal text with a few lines of data. field1 and field2 can contain up to 5 or 6 words of data. I want to de-duplicate when the data in field1 and field2 are exactly the same and 90% of the lines in field3 match those in another document. Is there any way to achieve this?

-- View this message in context: http://lucene.472066.n3.nabble.com/Customzing-Solr-Dedupe-tp4196879.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 3.6, Highlight and multi words?
Sorry to disturb you with the reminder, but does nobody use or have a problem with multi-term queries and highlighting? Regards,

Le 29/03/2015 21:15, Bruno Mannina wrote: Dear Solr Users, I am trying to work with highlighting. It works well, but only if I have a single keyword in my query. If my request is plastic AND bicycle then only plastic is highlighted. My request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5

Could you help me to understand? I have read the docs and googled without success, so I am posting here. My result is:

<lst name="DE202010012045U1">
  <arr name="aben">
    <str>(EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal body (10) made from <em>plastic</em> material</str>
    <str>, particularly for touring bike. #CMT#ADVANTAGE : #/CMT# The bicycle pedal has a pedal body made from <em>plastic</em></str>
  </arr>
</lst>
<lst name="JP2014091382A">
  <arr name="aben">
    <str> between <em>plastic</em> tapes 3 and 3 having two heat fusion layers, and the two <em>plastic</em> tapes 3 and 3 are stuck</str>
  </arr>
</lst>
<lst name="DE10201740A1">
  <arr name="aben">
    <str> elements. A connecting element is formed as a hinge, a flexible foil or a flexible <em>plastic</em> part. #CMT#USE</str>
  </arr>
</lst>
<lst name="US2008276751A1">
  <arr name="aben">
    <str>A bicycle handlebar grip includes an inner fiber layer and an outer <em>plastic</em> layer. Thus, the fiber</str>
    <str> handlebar grip, while the <em>plastic</em> layer is soft and has an adjustable thickness to provide a comfortable</str>
    <str> sensation to a user. In addition, the <em>plastic</em> layer includes a holding portion coated on the outer surface</str>
    <str> layer to enhance the combination strength between the fiber layer and the <em>plastic</em> layer and to enhance</str>
  </arr>
</lst>
shard splitting (solr 4.4.0)
Hello Solr Community, Greetings! This is my first post to this group. I am very new to Solr, so please do not mind if some of my questions below sound dumb :)

Let me explain my present setup:

Solr version: Solr_4.4.0
Zookeeper version: zookeeper-3.4.5

Present setup:
Unix_box_1: one Solr instance (Collection1: contains around 24 million indexed documents) running on port 8983

Target setup: Now as the number of users is going to increase and we are also looking for high availability, I am thinking of setting up SolrCloud as follows:

Unix_box_1: zookeeper 1 (master), Solr instance 1 (Shard 1 - leader node)
Unix_box_2: zookeeper 2, Solr instance 2 (Shard 2)
Unix_box_3: zookeeper 3, Solr instance 3 (replica for Shard 1)
Unix_box_4: Solr instance 4 (replica for Shard 2)

Now following are my queries:
1) Is it possible for me to split the present Solr running on one node with 24 million docs under Collection1 into 2 shards as shown above?
2) If yes, how can I achieve this, and approximately how long does it take?
3) For my application to fetch results from Solr, I need to give one Solr URL, meaning http://Unix_box_1:8983/solr. In this case, if I have some docs on shard2 (which is on Unix_box_2) and some on shard1 (Unix_box_1), will my search fetch docs from both shards and combine the result?

Thank you for your patience and time. Regards, Ashwin
Re: Customzing Solr Dedupe
Solr dedupe is based on the concept of a signature - some fields and rules that reduce a document into a discrete signature, and then checking if that signature exists as a document key that can be looked up quickly in the index. That's the conceptual basis. It is not based on any kind of field-by-field comparison to all existing documents. -- Jack Krupansky

On Wed, Apr 1, 2015 at 6:35 AM, thakkar.aayush thakkar.aay...@gmail.com wrote: I'm facing a challenge using de-duplication of Solr documents. De-duplication is done using TextProfileSignature with the following parameters:

<str name="fields">field1, field2, field3</str>
<str name="quantRate">0.5</str>
<str name="minTokenLen">3</str>

Here field3 is normal text with a few lines of data. field1 and field2 can contain up to 5 or 6 words of data. I want to de-duplicate when the data in field1 and field2 are exactly the same and 90% of the lines in field3 match those in another document. Is there any way to achieve this?

-- View this message in context: http://lucene.472066.n3.nabble.com/Customzing-Solr-Dedupe-tp4196879.html Sent from the Solr - User mailing list archive at Nabble.com.
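A minimal sketch of the signature idea Jack describes, using an exact hash for clarity. Note that TextProfileSignature is fuzzier than this, and this does not implement the 90%-of-lines matching the original question asks for; the documents below are invented:

```python
import hashlib

# Reduce the chosen fields to one signature and use it as a lookup key.
# Documents with the same signature are considered duplicates -- there is
# no field-by-field comparison against the rest of the index.
def signature(doc, fields):
    text = "|".join(doc.get(f, "").strip().lower() for f in fields)
    return hashlib.md5(text.encode("utf-8")).hexdigest()

fields = ["field1", "field2", "field3"]
a = {"field1": "Acme Pump", "field2": "Model X", "field3": "specs here"}
b = {"field1": "ACME PUMP ", "field2": "model x", "field3": "specs here"}
c = {"field1": "Acme Pump", "field2": "Model X", "field3": "different text"}
print(signature(a, fields) == signature(b, fields))  # duplicates
print(signature(a, fields) == signature(c, fields))  # not duplicates
```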
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
Solr actually has a CSV update handler. You could send the file to that directly. Have you tried that? Regards, Alex

On 1 Apr 2015 11:56 pm, avinash09 avinash.i...@gmail.com wrote:

<entity name="test1" processor="LineEntityProcessor" dataSource="fds" url="test.csv" rootEntity="true"
        transformer="RegexTransformer,TemplateTransformer">
  <field column="rawLine"
         regex="^(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*)$"
         groupNames="test,, ,,,is_frequency_cap_enabled,,,daily_spend_limit,,,"/>
  <field column="table_name" name="table_name" template="test1"/>
</entity>

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904.html Sent from the Solr - User mailing list archive at Nabble.com.
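For reference, a hedged sketch of how the same file might be pointed at the CSV update handler instead of DIH. The /update/csv path and its commit/header/fieldnames parameters follow the stock handler's conventions; the host and column names here are placeholders:

```python
from urllib.parse import urlencode

# Composing a CSV update handler request; Solr parses the columns itself,
# so no 28-group regex is needed. Column names are illustrative only.
params = urlencode({
    "commit": "true",
    "header": "false",  # the file is assumed to have no header row
    "fieldnames": "test,is_frequency_cap_enabled,daily_spend_limit",
})
url = "http://localhost:8983/solr/update/csv?" + params
print(url)
```

The CSV file itself would then be POSTed to this URL.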
Suspicious message with attachment
The following message addressed to you was quarantined because it likely contains a virus: Subject: Error while reading index From: Moshe Recanati mos...@kmslh.com However, if you know the sender and are expecting an attachment, please reply to this message, and we will forward the quarantined message to you.
RE: Error while reading index
Hi, I uploaded the log to drive. https://drive.google.com/file/d/0B0GR0M-lL5QHX1B2a2NZZXh3a1E/view?usp=sharing

Regards, Moshe Recanati SVP Engineering Office +972-73-2617564 Mobile +972-52-6194481 Skype: recanati
More at: www.kmslh.com | LinkedIn http://www.linkedin.com/company/kms-lighthouse | FB https://www.facebook.com/pages/KMS-lighthouse/123774257810917

From: Moshe Recanati [mailto:mos...@kmslh.com] Sent: Wednesday, April 01, 2015 5:22 PM To: solr-user@lucene.apache.org Subject: Error while reading index

Hi, We're running in a production environment with Solr 4.7.1 master and slave, with replication every 1 minute. During regular activity and index delta build we got the following error:

ERROR - 2015-03-30 04:06:12.318; java.lang.RuntimeException: [was class java.net.SocketException] Connection reset
  at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
  at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)

After an additional 2 minutes we got the following error:

ERROR - 2015-03-30 04:07:39.875; Unable to get file names for indexCommit generation: 638
java.io.FileNotFoundException: _tu.fdt
  at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
  at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:178)

And since then Solr didn't recover until we did a full rebuild of all documents. Detailed log attached. Let me know if you are familiar with such an issue, and what could create an issue that prevents recovery and requires rebuilding the index. This is a major issue for us.
Thank you in advance, Regards, Moshe Recanati
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
No. Could you please share an example? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196928.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Solr 3.6, Highlight and multi words?
Haven't used Solr 3.x in a long time, but with 4.10.x I haven't had any trouble with multiple terms. I'd look at a few things:
1. Do you have a typo in your query? Shouldn't it be q=aben:(plastic and bicycle)? ^^
2. Try removing the word and from the query. There may be some interaction with a stop word filter. If you want a phrase query, wrap it in quotes.
3. Also, be sure that the query and indexing analyzers for the aben field are compatible with each other.

-Original Message- From: Bruno Mannina [mailto:bmann...@free.fr] Sent: Wednesday, April 01, 2015 7:05 AM To: solr-user@lucene.apache.org Subject: Re: Solr 3.6, Highlight and multi words?

Sorry to disturb you with the reminder, but does nobody use or have a problem with multi-term queries and highlighting? Regards,

Le 29/03/2015 21:15, Bruno Mannina wrote: Dear Solr Users, I am trying to work with highlighting. It works well, but only if I have a single keyword in my query. If my request is plastic AND bicycle then only plastic is highlighted. My request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5

Could you help me to understand? I have read the docs and googled without success, so I am posting here. My result is:

<lst name="DE202010012045U1">
  <arr name="aben">
    <str>(EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal body (10) made from <em>plastic</em> material</str>
    <str>, particularly for touring bike. #CMT#ADVANTAGE : #/CMT# The bicycle pedal has a pedal body made from <em>plastic</em></str>
  </arr>
</lst>
<lst name="JP2014091382A">
  <arr name="aben">
    <str> between <em>plastic</em> tapes 3 and 3 having two heat fusion layers, and the two <em>plastic</em> tapes 3 and 3 are stuck</str>
  </arr>
</lst>
<lst name="DE10201740A1">
  <arr name="aben">
    <str> elements. A connecting element is formed as a hinge, a flexible foil or a flexible <em>plastic</em> part. #CMT#USE</str>
  </arr>
</lst>
<lst name="US2008276751A1">
  <arr name="aben">
    <str>A bicycle handlebar grip includes an inner fiber layer and an outer <em>plastic</em> layer. Thus, the fiber</str>
    <str> handlebar grip, while the <em>plastic</em> layer is soft and has an adjustable thickness to provide a comfortable</str>
    <str> sensation to a user. In addition, the <em>plastic</em> layer includes a holding portion coated on the outer surface</str>
    <str> layer to enhance the combination strength between the fiber layer and the <em>plastic</em> layer and to enhance</str>
  </arr>
</lst>

* This e-mail may contain confidential or privileged information. If you are not the intended recipient, please notify the sender immediately and then delete it. TIAA-CREF *
Re: shard splitting (solr 4.4.0)
Ashwin: First, if at all possible I would simply set up my new SolrCloud structure (2 shards, a leader and follower each) and re-index the entire corpus. 24M docs isn't really very many, and you'll have to have this capability sometime since someone, somewhere will want to change the schema in ways that require it. But to answer your questions: 1: Certainly. There's the SPLITSHARD command, see: https://cwiki.apache.org/confluence/display/solr/Collections+API. That said, Solr 4.4 used a relatively early version of SPLITSHARD and there have been many improvements, so make sure to back up first. 2: Not quite sure how long it takes, but I wouldn't expect it to take hours. A lot depends on what the docs are like. 3: Yes, sending a query (or update for that matter) to any node in the cluster will do the right thing. In a production environment, and assuming you're not using SolrJ, I'd put a load balancer in front of the cluster for queries. If you _are_ querying through SolrJ from the application, you only need to use the CloudSolrServer class as it includes a software load balancer by default. Otherwise, if you hard-code a single machine, that machine becomes a single point of failure. Best, Erick

On Wed, Apr 1, 2015 at 4:55 AM, Ashwin Kumar ashwins...@outlook.de wrote: Hello Solr Community, Greetings! This is my first post to this group.
I am very new to Solr, so please do not mind if some of my questions below sound dumb :)

Let me explain my present setup:

Solr version: Solr_4.4.0
Zookeeper version: zookeeper-3.4.5

Present setup:
Unix_box_1: one Solr instance (Collection1: contains around 24 million indexed documents) running on port 8983

Target setup: Now as the number of users is going to increase and we are also looking for high availability, I am thinking of setting up SolrCloud as follows:

Unix_box_1: zookeeper 1 (master), Solr instance 1 (Shard 1 - leader node)
Unix_box_2: zookeeper 2, Solr instance 2 (Shard 2)
Unix_box_3: zookeeper 3, Solr instance 3 (replica for Shard 1)
Unix_box_4: Solr instance 4 (replica for Shard 2)

Now following are my queries:
1) Is it possible for me to split the present Solr running on one node with 24 million docs under Collection1 into 2 shards as shown above?
2) If yes, how can I achieve this, and approximately how long does it take?
3) For my application to fetch results from Solr, I need to give one Solr URL, meaning http://Unix_box_1:8983/solr. In this case, if I have some docs on shard2 (which is on Unix_box_2) and some on shard1 (Unix_box_1), will my search fetch docs from both shards and combine the result?

Thank you for your patience and time. Regards, Ashwin
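The SPLITSHARD call Erick points to (question 1) is just a Collections API request. A hedged sketch of composing it, with the thread's host and placeholder collection/shard names:

```python
from urllib.parse import urlencode

# Collections API SPLITSHARD request; collection and shard names are
# placeholders for your own setup, and the call must be repeated per
# shard you want to split.
params = urlencode({
    "action": "SPLITSHARD",
    "collection": "collection1",
    "shard": "shard1",
})
url = "http://Unix_box_1:8983/solr/admin/collections?" + params
print(url)
```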
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
Sir, a silly question, I am confused here: what is the difference between the data import handler and update CSV? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196940.html Sent from the Solr - User mailing list archive at Nabble.com.
Information regarding This conf directory is not valid SolrException.
Hi, I'm working on upgrading a project from solr-4.10.3 to solr-5.0.0. As part of our JUnit tests we have a few tests for deleting/creating collections. Each test creates/deletes a collection with a different name, but they all share the same config in ZK. When running these tests in Eclipse everything works fine, but when running the same tests through Maven we get the following error, so I suspect this is a timing-related issue:
INFO org.apache.solr.rest.ManagedResourceStorage – Setting up ZooKeeper-based storage for the RestManager with znodeBase: /configs/SIMPLE_CONFIG
INFO org.apache.solr.rest.ManagedResourceStorage – Configured ZooKeeperStorageIO with znodeBase: /configs/SIMPLE_CONFIG
INFO org.apache.solr.rest.RestManager – Initializing RestManager with initArgs: {}
INFO org.apache.solr.rest.ManagedResourceStorage – Reading _rest_managed.json using ZooKeeperStorageIO:path=/configs/SIMPLE_CONFIG
INFO org.apache.solr.rest.ManagedResourceStorage – No data found for znode /configs/SIMPLE_CONFIG/_rest_managed.json
INFO org.apache.solr.rest.ManagedResourceStorage – Loaded null at path _rest_managed.json using ZooKeeperStorageIO:path=/configs/SIMPLE_CONFIG
INFO org.apache.solr.rest.RestManager – Initializing 0 registered ManagedResources
INFO org.apache.solr.handler.ReplicationHandler – Commits will be reserved for 1
INFO org.apache.solr.core.SolrCore – [mycollection1] Registered new searcher Searcher@3208a6c4[mycollection1] main{ExitableDirectoryReader(UninvertingDirectoryReader())}
ERROR org.apache.solr.core.CoreContainer – Error creating core [mycollection1]: This conf directory is not valid
org.apache.solr.common.SolrException: This conf directory is not valid
at org.apache.solr.cloud.ZkController.registerConfListenerForCore(ZkController.java:2229)
at org.apache.solr.core.SolrCore.registerConfListener(SolrCore.java:2633)
at org.apache.solr.core.SolrCore.init(SolrCore.java:936)
at org.apache.solr.core.SolrCore.init(SolrCore.java:662)
at
org.apache.solr.core.CoreContainer.create(CoreContainer.java:513)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:488)
at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:573)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:197)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:186)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:736)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:261)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:204)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at
Re: Unable to perform search query after changing uniqueKey
Steve: Totally agree. Even if you _do_ correctly escape the URL, though, there's no guarantee that Solr will do the right thing with field names with spaces. Plus endless chances for you to get it wrong when constructing the URL. Best, Erick On Wed, Apr 1, 2015 at 1:01 AM, steve sc_shep...@hotmail.com wrote: Gently walking into rough waters here, but if you use any API with GET, you're sending a URI which must be properly encoded. This has nothing to do with the programming language that generates key and store pairs on the browser or the one(s) used on the server. Lots and lots of good folks have tripped over this one. http://www.w3schools.com/tags/ref_urlencode.asp Play hard, but play safe! Date: Wed, 1 Apr 2015 13:58:55 +0800 Subject: Re: Unable to perform search query after changing uniqueKey From: edwinye...@gmail.com To: solr-user@lucene.apache.org Thanks Erick. Yes, it is able to work correctly if I do not use spaces for the field names, especially for the uniqueKey. Regards, Edwin On 31 March 2015 at 13:58, Erick Erickson erickerick...@gmail.com wrote: I would never put spaces in my field names! Frankly I have no clue what Solr does with that, but it can't be good. Solr explicitly supports Java naming conventions: camel case, underscores and numbers. Special symbols are frowned upon; I never use anything but upper case, lower case and underscores. Actually, I don't use upper case either, but that's a personal preference. Other things might work, but only by chance. Best, Erick On Mon, Mar 30, 2015 at 8:59 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Latest information that I've found for this is that the error only occurs for shard2. If I do a search for just shard1, those records that are assigned to shard1 will be able to be displayed. Only when I search for shard2 will the NullPointerException error occur. Previously I was doing a search for both shards. Are there any settings that I need to do for shard2 in order to solve this issue?
Currently I have not made any changes to the shards since I created it using http://localhost:8983/solr/admin/collections?action=CREATE&name=nps1&numShards=2&collection.configName=collection1 Regards, Edwin On 31 March 2015 at 09:42, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I've changed the uniqueKey from id to Item No. <uniqueKey>Item No</uniqueKey> Below are my definitions for both the id and Item No. <field name="id" type="string" indexed="true" stored="true" required="false" multiValued="false" /> <field name="Item No" type="text_general" indexed="true" stored="true"/> Regards, Edwin On 30 March 2015 at 23:05, Erick Erickson erickerick...@gmail.com wrote: Well, let's see the definition of your ID field, 'cause I'm puzzled. It's definitely A Bad Thing to have it be any kind of tokenized field though, but that's a shot in the dark. Best, Erick On Mon, Mar 30, 2015 at 2:17 AM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Mostafa, Yes, I've defined all the fields in schema.xml. It is able to work on the version without SolrCloud, but it is not working for the one with SolrCloud. Both of them are using the same schema.xml. Regards, Edwin On 30 March 2015 at 14:34, Mostafa Gomaa mostafa.goma...@gmail.com wrote: Hi Zheng, It's possible that there's a problem with your schema.xml. Are all fields defined and have appropriate options enabled? Regards, Mostafa. On Mon, Mar 30, 2015 at 7:49 AM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I've tried that, and removed the data directory from both the shards. But the same problem still occurs, so we probably can rule out the memory issue. Regards, Edwin On 30 March 2015 at 12:39, Erick Erickson erickerick...@gmail.com wrote: I meant shut down Solr and physically remove the entire data directory. Not saying this is the cure, but it can't hurt to rule out the index having memory... 
Best, Erick On Sun, Mar 29, 2015 at 6:35 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi Erick, I used the following query to delete all the index. http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete> http://localhost:8983/solr/update?stream.body=<commit/> Or is it better to physically delete the entire data directory? Regards, Edwin On 28 March 2015 at 02:27, Erick Erickson erickerick...@gmail.com wrote: You say you re-indexed, did you _completely_ remove the data directory first, i.e. the parent of
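Steve's point about percent-encoding GET requests can be illustrated with Python's standard library. A small sketch (the field name "Item No" comes from this thread; as Erick notes, even a correctly escaped field name with a space may still misbehave inside Solr):

```python
from urllib.parse import quote, quote_plus

# A field name with a space must be percent-encoded before it goes
# into a URI path or query string: "Item No" becomes "Item%20No".
field = "Item No"
print(quote(field))                # Item%20No

# A whole query-parameter value needs encoding too, e.g. q=Item No:12345
# (quote_plus uses '+' for spaces, as form encoding does).
print(quote_plus("Item No:12345"))  # Item+No%3A12345
```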
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
Data Import Handler is a process in Solr that reaches out, grabs something external and indexes it. Something external can be a database, files on the server, etc. Along the way, you can do many transformations of the data. The point is that the source can be anything. The update handler is an end-point in Solr that expects certain specific formats and puts them in the index. For instance, if you index XML, it _must_ be in a very specific form to throw at the update handler, something like <add> <doc> <field>...</field> <field>...</field> </doc> <doc> <field>...</field> <field>...</field> </doc> </add> The CSV update handler is just an update handler that expects CSV files. The headers are usually the field names, although you can map them from the column header in your CSV file to your Solr schema. Importing CSV files should be very fast; I suspect your regex is costly. As Alexandre says, though, it would be a good idea to go through the CSV import tutorial. The Solr reference guide has the details: https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-CSVFormattedIndexUpdates Best, Erick On Wed, Apr 1, 2015 at 8:04 AM, avinash09 avinash.i...@gmail.com wrote: Sir, a silly question, I am confused here: what is the difference between data import handler and update csv -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196940.html
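To make Erick's distinction concrete, a CSV upload is just an HTTP request to the /update/csv end-point with a few query parameters. A Python sketch that only builds the URL (the fieldnames, header, and commit parameter names are from the CSV update handler; the host, collection, and column names are assumptions):

```python
from urllib.parse import urlencode

def csv_update_url(base, **params):
    """Build the /update/csv endpoint URL with its query parameters."""
    return f"{base}/update/csv?{urlencode(params)}"

# Map CSV columns to schema fields and commit when done; the CSV file
# body itself would be POSTed to this URL (e.g. curl --data-binary).
url = csv_update_url("http://localhost:8983/solr/collection1",
                     fieldnames="id,name,price", header="true",
                     commit="true")
print(url)
```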
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
Well, I believe the tutorial has an example. Always a good thing, going through the tutorial. And the reference guide has the details: https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-CSVFormattedIndexUpdates Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 2 April 2015 at 01:37, avinash09 avinash.i...@gmail.com wrote: No, could you please share an example? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196928.html
Re: solr 4.10.3 and index.xxxxxxxxxxx directory
On 4/1/2015 6:35 AM, Dominique Bejean wrote: Is it normal with Solr 4.10.3 that the data directory of replicas still contains directories like index.3636365667474747 index.999080980976 and files index.properties replica.properties If yes, why and in which circumstances? The index.* directories are created during master/slave index replication. If you're running SolrCloud, then replication is only used for index recovery. Index recovery is only required in situations where the replicas are so far behind that the transaction log cannot be used to synchronize them, and sometimes happens when a Solr node is restarted. If SolrCloud index recovery is actually required when you are NOT restarting Solr instances, your index might be having problems. Regardless of whether you're running SolrCloud or not, normally when one of those directories with a numeric suffix is created, it will be changed to index with no suffix after the replication is complete, but if Solr is unable to change the directories for some reason, it will simply keep and use the new directory with the suffix. Do you see any ERROR or WARN entries in your Solr logfile that would indicate why Solr cannot change the directory name? Are you on Windows? Problems like this are more common on Windows, because Windows prevents a lot of file operations when files/directories are open. The long-term existence of directories with this naming convention indicates that *something* went wrong, but you would need to consult your logs to find out what happened. There have been several bugs over Solr's history that cause this problem. Thanks, Shawn
How to recover a Shard
Hello, I have a SolrCloud (4.10.1) where, for one of the shards, both replicas are in a Recovery Failed state per the Solr Admin Cloud page. The logs contain the following type of entries for the two Solr nodes involved, including statements that it will retry. Is there a way to recover from this state? Maybe bring down one replica, and then somehow declare that the remaining replica is to be the leader? I understand this would not be ideal, as the new leader may be missing documents that were sent its way to be indexed while it was down, but it would be better than having to rebuild the whole cloud. Any tips or suggestions would be appreciated. Thanks, Matt Solr node .65 Error while trying to recover. core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms, collection: kla_collection slice: shard6 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568) at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551) at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332) at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235) Solr node .64 Error while trying to recover. core=kla_collection_shard6_replica2:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms, collection: kla_collection slice: shard6 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568) at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551) at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332) at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
RE: How to recover a Shard
Maybe I have been working too many long hours, as I missed the obvious solution of bringing down/up one of the Solr nodes backing one of the replicas, and then the same for the second node. This did the trick. Since I brought this topic up, I will narrow the question a bit: Would there be a way to recover without restarting the Solr node? Basically, to delete one replica and then somehow declare the other replica the leader and break it out of its recovery process? Thanks, Matt From: Matt Kuiper Sent: Wednesday, April 01, 2015 8:43 PM To: solr-user@lucene.apache.org Subject: How to recover a Shard
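For the narrowed question, one avenue that may avoid a node restart is the Collections API's DELETEREPLICA and ADDREPLICA actions (available in 4.10): drop the stuck replica, then add a fresh one that syncs from the surviving copy. A sketch of the URLs involved; the collection and shard names are from this thread, but the replica (core-node) name and target node below are hypothetical placeholders you would read from clusterstate.json:

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/admin/collections"

def api(action, **params):
    """Assemble a Collections API URL (nothing is sent here)."""
    return f"{BASE}?{urlencode({'action': action, **params})}"

# Drop the stuck replica ("replica" is its core-node name in clusterstate):
print(api("DELETEREPLICA", collection="kla_collection",
          shard="shard6", replica="core_node5"))

# Then add a fresh replica on a chosen node; it recovers from the leader:
print(api("ADDREPLICA", collection="kla_collection",
          shard="shard6", node="host64:8983_solr"))
```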
Re: Solr went on recovery multiple time.
I would give it 32GB of RAM. And try to use SSD. On Tue, Mar 31, 2015 at 12:50 AM, sthita sthit...@gmail.com wrote: Hi Bill, My index size is around 48GB and contains around 8 million documents. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-went-on-recovery-multiple-time-tp4196249p4196504.html -- Bill Bell billnb...@gmail.com cell 720-256-8076
Re: SolrCloud 5.0 cluster RAM requirements
On 4/1/2015 3:22 PM, Ryan Steele wrote: Does a SolrCloud 5.0 cluster need enough RAM across the cluster to load all the collections into RAM at all times? "Need" is too strong a word. If you want the best possible performance, then you would have enough RAM across the cluster to cache the entire index. That's not required for a *functional* system, ignoring performance. For an index on that scale, caching the entire index is usually an unrealistically expensive goal. Are you the person who mentioned a terabyte-scale SolrCloud index on the #solr IRC channel that's hosted on Amazon? Here's a general wiki page on performance problems with Solr that has a large amount of focus on RAM: http://wiki.apache.org/solr/SolrPerformanceProblems The unfortunate fact about this is that the only way you'll figure out what you actually need is to prototype, and prototyping on the scale of your index is difficult and expensive. https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ Thanks, Shawn
Re: Unable to perform search query after changing uniqueKey
Hi Steve, Thanks for the link and the information. Regards, Edwin On 1 April 2015 at 23:17, Erick Erickson erickerick...@gmail.com wrote: Steve: Totally agree. Even if you _do_ correctly escape the URL though, there's no guarantee that Solr will do the right thing with field names with spaces.
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
Thanks Erick and Alexandre Rafalovitch. One more doubt: how to pass a Ctrl-A (^A) separator while doing the CSV upload? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196998.html
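Assuming the CSV handler's separator parameter is used, a Ctrl-A delimiter just needs to be URL-encoded as %01 in the query string. A small Python sketch (host and collection name are assumptions; no request is sent):

```python
from urllib.parse import urlencode

# urlencode percent-encodes the control character: \x01 becomes %01.
params = urlencode({"separator": "\x01", "commit": "true"})
url = f"http://localhost:8983/solr/collection1/update/csv?{params}"
print(url)  # ...?separator=%01&commit=true
```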
Re: Solr 3.6, Highlight and multi words?
Dear Charles, Thanks for your answer; please find my answers below. OK, it works if I use aben as the field in my query, as you say in answer 1. It doesn't work if I use ab, maybe because the ab field is a copyField for abfr, aben, abit, abpt. Concerning 2., yes you are right, it's not and but AND. I have this result: <lst name="DE102009043935B3"> <arr name="tien"> <str><em>Bicycle</em> frame comprises holder, particularly for water bottle, where holder is connected</str> </arr> <arr name="aben"> <str>#CMT# #/CMT# The <em>bicycle</em> frame (7) comprises a holder (1), particularly for a water bottle</str> <str>. The holder is connected with the <em>bicycle</em> frame by a screw (5), where a mounting element has a compensation</str> <str> section which is made of an elastic material, particularly a <em>plastic</em> material. The compensation section</str> </arr> </lst> So my last question is: why do I get <em>/</em> tags instead of colored text? How can I tell Solr to use colors? Thanks a lot, Bruno On 01/04/2015 17:15, Reitzel, Charles wrote: Haven't used Solr 3.x in a long time. But with 4.10.x, I haven't had any trouble with multiple terms. I'd look at a few things. 1. Do you have a typo in your query? Shouldn't it be q=aben:(plastic and bicycle)? 2. Try removing the word and from the query. There may be some interaction with a stop word filter. If you want a phrase query, wrap it in quotes. 3. Also, be sure that the query and indexing analyzers for the aben field are compatible with each other. -Original Message- From: Bruno Mannina [mailto:bmann...@free.fr] Sent: Wednesday, April 01, 2015 7:05 AM To: solr-user@lucene.apache.org Subject: Re: Solr 3.6, Highlight and multi words? Sorry to disturb you with this reminder, but does nobody use, or have problems with, multiple terms and highlighting? Regards, On 29/03/2015 21:15, Bruno Mannina wrote: Dear Solr Users, I am trying to work with highlighting; it works well, but only if I have a single keyword in my query?!
If my request is plastic AND bicycle, then only plastic is highlighted. My request is: ./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5 Could you help me please to understand? I read the docs, googled, without success... so I post here... My result is: <lst name="DE202010012045U1"> <arr name="aben"> <str>(EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal body (10) made from <em>plastic</em> material</str> <str>, particularly for touring bike. #CMT#ADVANTAGE : #/CMT# The bicycle pedal has a pedal body made from <em>plastic</em></str> </arr> </lst> <lst name="JP2014091382A"> <arr name="aben"> <str> between <em>plastic</em> tapes 3 and 3 having two heat fusion layers, and the two <em>plastic</em> tapes 3 and 3 are stuck</str> </arr> </lst> <lst name="DE10201740A1"> <arr name="aben"> <str> elements. A connecting element is formed as a hinge, a flexible foil or a flexible <em>plastic</em> part. #CMT#USE</str> </arr> </lst> <lst name="US2008276751A1"> <arr name="aben"> <str>A bicycle handlebar grip includes an inner fiber layer and an outer <em>plastic</em> layer. Thus, the fiber</str> <str> handlebar grip, while the <em>plastic</em> layer is soft and has an adjustable thickness to provide a comfortable</str> <str> sensation to a user. In addition, the <em>plastic</em> layer includes a holding portion coated on the outer surface</str> <str> layer to enhance the combination strength between the fiber layer and the <em>plastic</em> layer and to enhance</str> </arr> </lst>
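On the colored-highlight question: Solr wraps highlighted terms with the hl.simple.pre and hl.simple.post parameters, which default to the em tags seen above; setting them to a styled span yields colored output. A sketch that only builds such a request URL (host is an assumption; the query and field names are from this thread):

```python
from urllib.parse import urlencode

# hl.simple.pre / hl.simple.post control the markers wrapped around
# each highlighted term; here we swap <em>/</em> for a colored span.
params = urlencode({
    "q": "ab:(plastic AND bicycle)",
    "hl": "true",
    "hl.fl": "tien,aben",
    "hl.snippets": "5",
    "hl.simple.pre": '<span style="background:yellow">',
    "hl.simple.post": "</span>",
})
url = "http://localhost:8983/solr/select?" + params
print(url)
```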
Re: Customzing Solr Dedupe
But you can potentially still use Solr dedupe if you do the upfront work (in RDBMS or NoSQL pre-index processing) to assign some sort of group ID. See OCLC's FRBR Work-Set Algorithm, http://www.oclc.org/content/dam/research/activities/frbralgorithm/2009-08.pdf?urlm=161376 , for some details on one such algorithm. If the job is too big for an RDBMS, and/or you don't want to use/have a suitable NoSQL store, you can have two Solr indexes (collection/core/whatever): one for classification with only id, field1, field2, field3, and another for the production query. Then, you put stuff into the classification index, use queries and your own algorithm to do the classification, assigning a groupId, and then put the document with the groupId assigned into the production index. A key question is whether you want to preserve the groupId. In some cases you do, and in some cases it is just an internal signature. In both cases a non-deterministic up-front algorithm can work, but if the groupId needs to be preserved, you need to work harder to make sure it all hangs together. Hope this helps, -Dan On Wed, Apr 1, 2015 at 7:05 AM, Jack Krupansky jack.krupan...@gmail.com wrote: Solr dedupe is based on the concept of a signature: some fields and rules that reduce a document into a discrete signature, and then checking if that signature exists as a document key that can be looked up quickly in the index. That's the conceptual basis. It is not based on any kind of field-by-field comparison to all existing documents. -- Jack Krupansky On Wed, Apr 1, 2015 at 6:35 AM, thakkar.aayush thakkar.aay...@gmail.com wrote: I'm facing a challenge using de-duplication of Solr documents. De-duplication is done using TextProfileSignature with the following parameters: <str name="fields">field1, field2, field3</str> <str name="quantRate">0.5</str> <str name="minTokenLen">3</str> Here field3 is normal text with a few lines of data. Field1 and field2 can contain up to 5 or 6 words of data. I want to de-duplicate when the data in field1 and field2 are exactly the same and 90% of the lines in field3 match those in another document. Is there any way to achieve this? -- View this message in context: http://lucene.472066.n3.nabble.com/Customzing-Solr-Dedupe-tp4196879.html
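Dan's "assign a group ID up front" idea can be sketched in pre-index code. The field names come from this thread, but the 90%-of-lines overlap metric and threshold below are assumptions about the poster's requirement, not anything Solr's TextProfileSignature does:

```python
from collections import defaultdict

def line_overlap(a, b):
    """Fraction of lines in a that also appear in b (order-insensitive)."""
    la, lb = a.splitlines(), set(b.splitlines())
    if not la:
        return 1.0
    return sum(1 for ln in la if ln in lb) / len(la)

def assign_group_ids(docs, threshold=0.9):
    """Exact match on field1+field2, plus >=90% shared lines in field3."""
    groups = defaultdict(list)   # (field1, field2) -> [(group_id, field3)]
    next_id = 0
    out = []
    for doc in docs:
        key = (doc["field1"], doc["field2"])
        gid = None
        for existing_gid, text in groups[key]:
            if line_overlap(doc["field3"], text) >= threshold:
                gid = existing_gid
                break
        if gid is None:
            gid = next_id
            next_id += 1
            groups[key].append((gid, doc["field3"]))
        out.append({**doc, "groupId": gid})
    return out
```

Documents sharing a groupId can then be collapsed (or rejected) before indexing into the production collection.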
Re: solr 4.10.3 and index.xxxxxxxxxxx directory
Hi Shawn, Thank you for your response. This is a SolrCloud installation on CentOS. There are 5 servers with 128 GB RAM each. The collection contains 650 million small documents. There are 3 shards with replicationFactor = 2 (so 9 cores). The JVM Xmx parameter was set to 96 GB. We changed it yesterday to 32 GB in order to be under the CompressedOops limit and free direct memory for MMapDirectory. I will have access to both the full Solr and Tomcat logs tomorrow. What I know is that there are some ZooKeeper timeouts in the Solr logs, and the replications occur on some nodes after some commits (after DIH import) and when nodes restart. So, I will have more precise log messages tomorrow. Dominique 2015-04-01 18:29 GMT+02:00 Shawn Heisey apa...@elyograg.org:
Re: solr 4.10.3 and index.xxxxxxxxxxx directory
I _really_ suspect that with the huge JVM heaps you had, you were hitting long GC pauses that exceeded the ZooKeeper timeout, causing ZK to believe the node had gone away and thus throwing it into recovery mode. You can enable GC logging to see whether you hit such long pauses, but with 96 GB it's almost certain that you did. Reducing the JVM allocation should help, but if you continue to see nodes go into recovery for no apparent reason, enabling GC logging is a good idea so you have a record. See "Getting a view into garbage collection" here: https://lucidworks.com/blog/garbage-collection-bootcamp-1-0/ Best, Erick On Wed, Apr 1, 2015 at 10:35 AM, Dominique Bejean dominique.bej...@eolya.fr wrote: Hi Shawn, Thank you for your response. ...
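To enable GC logging as Erick suggests, JVM flags along these lines can be added to the Tomcat/Solr start script. This is a sketch for the Java 7/8 HotSpot flags of the Solr 4.x era; the 32g heap and the log path are illustrative assumptions, not values from the thread.

```shell
# Illustrative GC-logging flags for a Solr 4.x / Tomcat start script.
# Heap size and log path are assumptions; adjust for your installation.
JAVA_OPTS="$JAVA_OPTS -Xmx32g -Xms32g \
  -verbose:gc \
  -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps \
  -XX:+PrintGCApplicationStoppedTime \
  -Xloggc:/var/log/solr/gc.log"
```

The resulting gc.log will show each pause's duration, which can then be compared against the ZooKeeper session timeout.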
RE: Solr 3.6, Highlight and multi words?
If you want to query on the field ab, you'll probably need to add it to the qf parameter. To control the highlighting markup with the standard highlighter, use hl.simple.pre and hl.simple.post. https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter -Original Message- From: Bruno Mannina [mailto:bmann...@free.fr] Sent: Wednesday, April 01, 2015 2:24 PM To: solr-user@lucene.apache.org Subject: Re: Solr 3.6, Highlight and multi words? Dear Charles, Thanks for your answer, please find my answers below. OK, it works if I use aben as the field in my query, as you say in answer 1. It doesn't work if I use ab, maybe because ab is a copyField for abfr, aben, abit, abpt. Concerning point 2, yes you are right, it's not "and" but "AND". I have this result:

<lst name="DE102009043935B3">
  <arr name="tien">
    <str><em>Bicycle</em> frame comprises holder, particularly for water bottle, where holder is connected</str>
  </arr>
  <arr name="aben">
    <str>#CMT# #/CMT# The <em>bicycle</em> frame (7) comprises a holder (1), particularly for a water bottle</str>
    <str>. The holder is connected with the <em>bicycle</em> frame by a screw (5), where a mounting element has a compensation</str>
    <str> section which is made of an elastic material, particularly a <em>plastic</em> material. The compensation section</str>
  </arr>
</lst>

So my last question is: why do I get <em>/</em> instead of the colored tags? How can I tell Solr to use the colored ones? Thanks a lot, Bruno On 01/04/2015 17:15, Reitzel, Charles wrote: Haven't used Solr 3.x in a long time. But with 4.10.x, I haven't had any trouble with multiple terms. I'd look at a few things. 1. Do you have a typo in your query? Shouldn't it be q=aben:(plastic AND bicycle)? 2. Try removing the word "and" from the query. There may be some interaction with a stop word filter. If you want a phrase query, wrap it in quotes. 3. Also, be sure that the query and indexing analyzers for the aben field are compatible with each other. -Original Message- From: Bruno Mannina [mailto:bmann...@free.fr] Sent: Wednesday, April 01, 2015 7:05 AM To: solr-user@lucene.apache.org Subject: Re: Solr 3.6, Highlight and multi words? Sorry to disturb you again, but does nobody use, or have a problem with, multi-term highlighting? Regards, On 29/03/2015 21:15, Bruno Mannina wrote: Dear Solr users, I am trying to use highlighting. It works well, but only if I have a single keyword in my query. If my request is plastic AND bicycle, then only plastic is highlighted. My request is: ./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5 Could you help me understand? I've read the docs and googled without success, so I'm posting here. My result is:

<lst name="DE202010012045U1">
  <arr name="aben">
    <str>(EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal body (10) made from <em>plastic</em> material</str>
    <str>, particularly for touring bike. #CMT#ADVANTAGE : #/CMT# The bicycle pedal has a pedal body made from <em>plastic</em></str>
  </arr>
</lst>
<lst name="JP2014091382A">
  <arr name="aben">
    <str> between <em>plastic</em> tapes 3 and 3 having two heat fusion layers, and the two <em>plastic</em> tapes 3 and 3 are stuck</str>
  </arr>
</lst>
<lst name="DE10201740A1">
  <arr name="aben">
    <str> elements. A connecting element is formed as a hinge, a flexible foil or a flexible <em>plastic</em> part. #CMT#USE</str>
  </arr>
</lst>
<lst name="US2008276751A1">
  <arr name="aben">
    <str>A bicycle handlebar grip includes an inner fiber layer and an outer <em>plastic</em> layer. Thus, the fiber</str>
    <str> handlebar grip, while the <em>plastic</em> layer is soft and has an adjustable thickness to provide a comfortable</str>
    <str> sensation to a user. In addition, the <em>plastic</em> layer includes a holding portion coated on the outer surface</str>
    <str> layer to enhance the combination strength between the fiber layer and the <em>plastic</em> layer and to enhance</str>
  </arr>
</lst>

*** This e-mail may contain confidential or privileged information. If you are not the intended recipient, please notify the sender immediately and then delete it. TIAA-CREF ***
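Bruno's request string is easier to read when built programmatically. The sketch below assembles the same parameters with explicit separators; the field names (ab/aben, tien, pn) come from the thread, while the host and handler path are assumptions for illustration.

```python
from urllib.parse import urlencode

# Parameters of Bruno's highlight query, assembled programmatically.
# 'aben' is used instead of 'ab' per Charles's suggestion; field names
# come from the thread, host and path are assumed.
params = {
    "q": "aben:(plastic AND bicycle)",
    "start": 0,
    "rows": 10,
    "indent": "on",
    "hl": "true",
    "hl.fl": "tien,aben",
    "fl": "pn",
    "f.aben.hl.snippets": 5,   # per-field override of hl.snippets
}
query = urlencode(params)      # percent-encodes values, joins with '&'
url = "http://localhost:8983/solr/select?" + query
print(url)
```

Note that urlencode takes care of escaping the parentheses and spaces in the q parameter, which is easy to get wrong by hand.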
Re: Solr 3.6, Highlight and multi words?
OK for qf (I can't test now), but concerning hl.simple.pre / hl.simple.post, I can only define one color, no? In the sample solrconfig.xml there are several colors:

<!-- multi-colored tag FragmentsBuilder -->
<fragmentsBuilder name="colored" class="solr.highlight.ScoreOrderFragmentsBuilder">
  <lst name="defaults">
    <str name="hl.tag.pre"><![CDATA[
      <b style="background:yellow">,<b style="background:lawgreen">,
      <b style="background:aquamarine">,<b style="background:magenta">,
      <b style="background:palegreen">,<b style="background:coral">,
      <b style="background:wheat">,<b style="background:khaki">,
      <b style="background:lime">,<b style="background:deepskyblue">]]></str>
    <str name="hl.tag.post"><![CDATA[</b>]]></str>
  </lst>
</fragmentsBuilder>

How can I tell Solr to use these colors instead of hl.simple.pre/post? On 01/04/2015 20:58, Reitzel, Charles wrote: If you want to query on the field ab, you'll probably need to add it to the qf parameter. To control the highlighting markup with the standard highlighter, use hl.simple.pre and hl.simple.post. ...
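For what it's worth, the colored fragments builder from the sample solrconfig.xml belongs to the FastVectorHighlighter rather than the standard highlighter, and a named fragmentsBuilder is normally selected per request with the hl.fragmentsBuilder parameter. This is an assumption worth verifying against the docs for your Solr version; the FastVectorHighlighter also requires the highlighted field to be indexed with termVectors, termPositions, and termOffsets. A sketch of such a request:

```python
from urllib.parse import urlencode

# Hypothetical request selecting the 'colored' fragmentsBuilder defined in
# solrconfig.xml. hl.useFastVectorHighlighter requires the highlighted field
# to have termVectors="true" termPositions="true" termOffsets="true".
params = {
    "q": "aben:(plastic AND bicycle)",
    "hl": "true",
    "hl.fl": "aben",
    "hl.useFastVectorHighlighter": "true",
    "hl.fragmentsBuilder": "colored",
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

If the field lacks term vectors, the FastVectorHighlighter cannot be used and the multi-color tags will not apply.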
Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr
That's an interesting question. The reference shows you how to set a separator, but ^A is a special case. You may need to pass it in as a URL escape character or similar. But I would first get a sample working with a more conventional separator and then worry about ^A, just so you are not confusing several problems. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 2 April 2015 at 05:05, avinash09 avinash.i...@gmail.com wrote: Thanks Erick and Alexandre Rafalovitch. One more doubt: how to pass the Ctrl-A (^A) separator for a CSV upload? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196998.html Sent from the Solr - User mailing list archive at Nabble.com.
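Concretely, Ctrl-A is the control character 0x01, and its URL escape is %01, so it can be passed in the separator parameter of the CSV update URL. A sketch (host and collection path are assumptions; the separator parameter itself is the one Alex's reference describes):

```python
from urllib.parse import quote, urlencode

# Ctrl-A (^A) is the control character 0x01; percent-encoded it is '%01'.
sep = quote("\x01")
print(sep)  # -> %01

# Passing it as the CSV separator to the CSV update handler; urlencode
# escapes the control character for us. Host and path are assumed.
params = {"separator": "\x01", "commit": "true"}
qs = urlencode(params)
print("http://localhost:8983/solr/update/csv?" + qs)
```

Testing first with a conventional separator, as suggested above, keeps the escaping question separate from any schema or parsing problems.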
Re: Solr 3.6, Highlight and multi words?
Of course, no problem Charles, you've already helped me! On 01/04/2015 21:54, Reitzel, Charles wrote: Sorry, I've never tried highlighting in multiple colors... -Original Message- From: Bruno Mannina [mailto:bmann...@free.fr] Sent: Wednesday, April 01, 2015 3:43 PM To: solr-user@lucene.apache.org Subject: Re: Solr 3.6, Highlight and multi words? OK for qf (I can't test now), but concerning hl.simple.pre / hl.simple.post, I can only define one color, no? ...
SolrCloud 5.0 cluster RAM requirements
Does a SolrCloud 5.0 cluster need enough RAM across the cluster to load all the collections into RAM at all times? I'm building a SolrCloud cluster that may have approximately 1 TB of data spread across the collections. Thanks, Ryan
RE: Solr 3.6, Highlight and multi words?
Sorry, I've never tried highlighting in multiple colors... -Original Message- From: Bruno Mannina [mailto:bmann...@free.fr] Sent: Wednesday, April 01, 2015 3:43 PM To: solr-user@lucene.apache.org Subject: Re: Solr 3.6, Highlight and multi words? OK for qf (I can't test now), but concerning hl.simple.pre / hl.simple.post, I can only define one color, no? ...