Re: Solr irregularly having QTime 50000ms, stracing solr cures the problem
Hi IJ, yes indeed, there are multiple nodes. But I have a 50-second delay, not 5 seconds. Anyway, I will keep this in mind and will experiment with the hosts file if it starts to get annoying again.

Cheers,
Harald.

On 16.07.2014 19:44, IJ wrote:
> I know you mentioned you have a single machine at play - but do you have multiple nodes on the machine that talk to one another? Does your problem recur when the load on the system is low? I also faced a similar problem wherein the 5-second delay (described in detail in my other post) kept happening after a 1.5-minute inactivity interval. This was explained as Solr keeping the HTTP connection for inter-node communication alive for around 1.5 minutes before disconnecting - and if a new request happens after 1.5 minutes, a new connection is created, which probably suffers latency due to a DNS name lookup delay.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-irregularly-having-QTime-5ms-stracing-solr-cures-the-problem-tp4146047p4147512.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
Harald Kirsch
Raytion GmbH
Kaiser-Friedrich-Ring 74
40547 Duesseldorf
Fon +49 211 53883-216
Fax +49-211-550266-19
http://www.raytion.com
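For the hosts-file experiment Harald mentions, the idea is to pin the inter-node hostnames locally so connection setup never waits on a DNS lookup. A sketch (the hostnames and addresses below are hypothetical placeholders):

```
# /etc/hosts -- resolve inter-node names locally instead of via DNS
10.0.0.11   solr-node1
10.0.0.12   solr-node2
```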
Re: Plugin init failure for custom analysis filter
Hi, I am not able to find anything in the log, or rather nothing that specific. This error is thrown when I add a string argument to my filter in the schema. If I remove it, I do not get any error. I tried changing the datatype, but I still get the same error. A little more detail regarding the filter arguments:

<fieldType name="textNumeric" class="solr.CustomTextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CustomFilterFactory" pattern="(.*)/([0-9]+)/([0-9]+)/([0-9]+)/(.*)?"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<fieldType name="textCustom" class="solr.CustomTextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CustomFilterFactory" pattern="(.*)/([0-9]+)/([0-9]+)/([0-9]+)/(.*)?" ValueToStore="N"/>
    <filter class="solr.LengthFilterFactory" min="1" max="100" enablePositionIncrements="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

I get the error only for textCustom, not textNumeric, while initializing the core.

--
View this message in context: http://lucene.472066.n3.nabble.com/Plugin-init-failure-for-custom-analysis-filter-tp4147851p4148259.html
Sent from the Solr - User mailing list archive at Nabble.com.
stats.facet with multi-valued field in Solr 4.9
Hi! I am storing aggregated article click statistics for a website in a Lucene database. Website articles (i.e., pages in this case) can have multiple associated financial instruments, which - for statistics reasons - I also copy to Lucene. So basically this data is stored (and regularly updated) by articleId and date as

{
  "articleId": 1234,
  "date": "2014-07-21",
  "clicks": 5,
  "instrumentIds": [1, 2, 3, 4]
}

Now I need to generate statistics, like aggregated article click count by instrumentId:

/solr/article_stats/select?q=*:*&stats=true&stats.field=clicks&stats.facet=instrumentIds&rows=0

This way Solr returned a (large) list of instrumentIds in stats.stats_fields.clicks.facets.instrumentIds with the clicks per instrument, which was exactly what I wanted. After the upgrade to Solr 4.9 (from 3.6) this no longer seems to be possible:

Stats can only facet on single-valued fields, not: instrumentIds

Is there a way to replicate the old behaviour?

Thanks,
Nico
Re: stats.facet with multi-valued field in Solr 4.9
On Mon, Jul 21, 2014 at 7:09 AM, Nico Kaiser n...@kaiser.me wrote:
> After the upgrade to Solr 4.9 (from 3.6) this seems not to be possible anymore:
> Stats can only facet on single-valued fields, not: instrumentIds

https://issues.apache.org/jira/browse/SOLR-3642

It looks like perhaps it never did work correctly.

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
Re: stats.facet with multi-valued field in Solr 4.9
Yonik, thanks for your reply! I also found https://issues.apache.org/jira/browse/SOLR-1782, which also seems to deal with this, but I did not find out whether there is a workaround. For our use case the previous behaviour was OK and seemed (!) to be consistent. However, I understand that this feature had to be disabled if it was broken. Do you have an idea how to achieve the behaviour I mentioned before?

Nico

On 21 Jul 2014, at 13:26, Yonik Seeley yo...@heliosearch.com wrote:
> https://issues.apache.org/jira/browse/SOLR-3642
> It looks like perhaps it never did work correctly.
> -Yonik
Re: stats.facet with multi-valued field in Solr 4.9
On Mon, Jul 21, 2014 at 7:32 AM, Nico Kaiser n...@kaiser.me wrote:
> Yonik, thanks for your reply! I also found https://issues.apache.org/jira/browse/SOLR-1782, which also seems to deal with this, but I did not find out whether there is a workaround. For our use case the previous behaviour was OK and seemed (!) to be consistent. However, I understand that this feature had to be disabled if it was broken. Do you have an idea how to achieve the behaviour I mentioned before?

I don't think there's anything currently committed/released. There has been work on an Analytics component that could do it. This hasn't been committed to Solr yet, but has been committed in Heliosearch. Also, Heliosearch has facet functions: http://heliosearch.org/solr-facet-functions/

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
AUTO: Nicholas M. Wertzberger is out of the office (returning 07/23/2014)
I am out of the office until 07/23/2014. I'm out of town for the next few days. I am reachable by Blackberry if needed. Please contact Jason Brown for anything JAS Team related.

Note: This is an automated response to your message "Re: questions on Solr WordBreakSolrSpellChecker and WordDelimiterFilterFactory" sent on 7/17/2014 7:42:42 AM. This is the only notification you will receive while this person is away.
faceting within facets
Hi,

Is it possible to create a facet within another facet in a single query? Currently I'm having to filter the query with facet.query=type:foo and run the query multiple times to return the number and type of objects created on a given date. Is it even possible to return this in a single query?

Cheers,
David
Re: faceting within facets
On Mon, Jul 21, 2014 at 8:08 AM, David Flower dflo...@amplience.com wrote:
> Is it possible to create a facet within another facet in a single query

For simple field facets, there's pivot faceting. For more complex nested facets, there are sub-facets in Heliosearch (a Solr fork): http://heliosearch.org/solr-subfacets/

-Yonik
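The pivot faceting Yonik mentions can be sketched as a single request. The collection name and the field names ("created", "type") below are hypothetical stand-ins for David's schema; facet.pivot is the real Solr parameter:

```python
from urllib.parse import urlencode

# One query returning, for each created date, the count per type
# (hypothetical collection and field names -- adjust to your schema).
params = {
    "q": "*:*",
    "rows": "0",
    "facet": "true",
    "facet.pivot": "created,type",  # nested counts: type under each date
}
print("/solr/collection1/select?" + urlencode(params))
```

This replaces the N repeated facet.query=type:foo calls with one request whose response nests type counts under each date bucket.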
Solr Cassandra MySQL Best Practice Indexing
So my full-text data lives in Cassandra along with an ID. I also have a lot of structured data linked to that ID which lives in an RDBMS (read: MySQL). I need this structured data as it would help me with my faceting and other needs. What is the best practice for indexing in this scenario? My thoughts (maybe weird):

1. Read the data from Cassandra; for each ID read, read the corresponding row from MySQL for that ID, form an XML document on the fly (for each ID), and send it to Solr for indexing without storing anything.
2. I do not have much idea about Solandra. However, even if I use it I will still have to go to MySQL to fetch the structured data.
3. Duplicate the data and move all of Cassandra into MySQL or vice versa, but then data duplication would happen.

I will think about incremental indexing for the new records later. A bit confused. Any help would be appreciated.
Re: Solr Cassandra MySQL Best Practice Indexing
Solandra is not a supported product. DataStax Enterprise (DSE) supersedes it. With DSE, just load your data into a Solr-enabled Cassandra data center and it will be indexed automatically in the embedded Solr within DSE, as per a Solr schema that you provide. Then use any of the nodes in that Solr-enabled Cassandra data center just the same as with normal Solr.

-- Jack Krupansky

-----Original Message----- From: Yavar Husain Sent: Monday, July 21, 2014 8:37 AM To: solr-user@lucene.apache.org Subject: Solr Cassandra MySQL Best Practice Indexing
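Option 1 from Yavar's post (join each Cassandra row with its MySQL row by ID and send one document to Solr) can be sketched as below. Both fetch functions and all field names are hypothetical stand-ins for real driver calls (a Cassandra client and a MySQL connector); the point is only the per-ID merge:

```python
# Sketch of option 1: for each ID, join the Cassandra full-text row with
# the MySQL structured row and build one Solr document to post.

def fetch_fulltext(doc_id):
    # stand-in for a Cassandra read by ID
    return {"id": doc_id, "text": "full text for " + doc_id}

def fetch_structured(doc_id):
    # stand-in for a MySQL SELECT ... WHERE id = ?
    return {"id": doc_id, "author": "jdoe", "category": "news"}

def build_solr_doc(doc_id):
    doc = dict(fetch_structured(doc_id))
    doc.update(fetch_fulltext(doc_id))  # full text wins on key collisions
    return doc

print(build_solr_doc("42"))
```

The merged dict would then be serialized (XML or JSON) and posted to Solr's update handler; nothing is stored outside the index.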
RE: SolrCloud performance issues regarding hardware configuration
search engn dev [sachinyadav0...@gmail.com] wrote:
> Yes, you are right, my facet queries are for text analytics purposes.

Does this mean that facet calls are rare (at most one at a time)?

> Users will send boolean and spatial queries. Current performance for spatial queries is 100 qps with 150 concurrent users, and avg response time is 500 ms.

What is the limiting factor here? CPU or I/O? If it is the latter, then adding more memory to the existing setup seems like the cheapest and easiest choice.

- Toke Eskildsen
Query about Solr
Hi,

How can I stop the content of a file from being indexed? Will removing the content field from schema.xml do the job?

Thanks,
Ameya
Edit Example Post.jar to read ALL file types
I am working with Solr 4.8.1 to set up an enterprise search system. The file system I am working with has numerous files with unique extension types (e.g., .20039, .20040, .20041, etc.). I am using the post.jar file included in the binary download (src: SimplePostTool.java, http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/util/SimplePostTool.java) to post these files to the Solr server, and I would like to edit this jar to recognize *any* file extension it comes across. Is there a way to do this with the SimplePostTool.java source? Right now I am working to better understand the Filetype and DEFAULT_FILE_TYPE variables as well as the mimeMap; it is these that currently allow me to manually add file extensions. I would, however, like the tool to be able to read in files no matter what their extension is and default their MIME type to text/plain.

--
View this message in context: http://lucene.472066.n3.nabble.com/Edit-Example-Post-jar-to-read-ALL-file-types-tp4148312.html
Sent from the Solr - User mailing list archive at Nabble.com.
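The fallback behavior being asked for can be sketched independently of the Java source: look the extension up in a mime map and default anything unknown to text/plain. The map contents below are illustrative, not the tool's actual table; a patched SimplePostTool.java could do the equivalent lookup against its mimeMap:

```python
# Sketch: default unknown extensions (.20039, .20040, ...) to text/plain.
mime_map = {"pdf": "application/pdf", "xml": "application/xml", "txt": "text/plain"}

def guess_mime(filename, default="text/plain"):
    # take the text after the last dot as the extension, if any
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return mime_map.get(ext, default)

print(guess_mime("report.20039"))  # unknown extension -> falls back to text/plain
print(guess_mime("manual.pdf"))    # known extension -> application/pdf
```

The key design point is that the lookup never fails: anything missing from the map gets the default instead of being skipped.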
Re: Query about Solr
Nothing gets indexed automatically, so you must be doing something (e.g. Nutch). Tell us what that something is first so we know your baseline setup.

Regards,
Alex

On 21/07/2014 9:43 pm, Ameya Aware ameya.aw...@gmail.com wrote:
> Hi, how can I stop the content of a file from being indexed? Will removing the content field from schema.xml do the job? Thanks, Ameya
Re: Query about Solr
Hi,

The data coming into Solr is various metadata such as author, created time, last modified time, etc., along with the content of the file. Indexing the content is giving me various errors, so I simply want to skip the content-indexing part.

Thanks,
Ameya

On Mon, Jul 21, 2014 at 11:07 AM, Alexandre Rafalovitch arafa...@gmail.com wrote:
> Nothing gets indexed automatically. So you must be doing something (e.g. Nutch). Tell us what that something is first so we know your baseline setup.
Re: Query about Solr
Set the field type for such a field to "ignored". Or set it to "string", and then you can still examine or query the data even if it is not properly formatted.

-- Jack Krupansky

-----Original Message----- From: Ameya Aware Sent: Monday, July 21, 2014 11:12 AM To: solr-user@lucene.apache.org Subject: Re: Query about Solr

> Hi, the data coming into Solr is various metadata such as author, created time, last modified time, etc., along with the content of the file. Indexing the content is giving me various errors, so I simply want to skip the content-indexing part.
Solr schema.xml query analyser
I am a complete beginner to Solr and need some help. My task is to provide a match when the search term contains the indexed field. For example:

If query="foo bar" and textExactMatch="foo", I should not get a MATCH
If query="foo bar" and textExactMatch="foo bar", I should get a MATCH
If query="foo bar" and textExactMatch="xyz foo bar"/"foo bar xyz", I should get a MATCH

I am indexing my field as follows:

<fieldType name="textExactMatch" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

So I'm indexing the text for the field as-is, without breaking it down further. Could someone help me out with how I should tokenize and filter the field at query time?

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-schema-xml-query-analyser-tp4148317.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr schema.xml query analyser
If you don't specify a query analyzer, Solr will use the index analyzer at query time. But... at query time there is something called a query parser, which typically breaks the query into separate terms, delimited by whitespace, and then calls the analyzer for each term separately. You can put the entire query in quotes or escape the space with a backslash. Or, just use the edismax query parser with the pf or pf2 parameters, and then Solr will boost exact phrase matches even if not quoted or escaped.

-- Jack Krupansky

-----Original Message----- From: prashantc88 Sent: Monday, July 21, 2014 11:29 AM To: solr-user@lucene.apache.org Subject: Solr schema.xml query analyser
Re: Solr schema.xml query analyser
Thanks Jack for the reply. I did not mention the query-time analyzer in my post because I wasn't sure what should go there.

With regard to your reply: if I put the query term in quotes, would I get a match for the following?

Indexed field value: "foo bar"
Query term: "foo bar xyz"/"xyz foo bar"

I believe it should not, as it will be looking for the exact term in both places. However, I want it to behave in the following way:

If query="foo bar" and textExactMatch="foo", I SHOULD NOT get a MATCH
If query="foo bar" and textExactMatch="foo bar", I SHOULD get a MATCH
If query="foo bar" and textExactMatch="xyz foo bar"/"foo bar xyz", I SHOULD get a MATCH

Thanks in advance.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-schema-xml-query-analyser-tp4148317p4148327.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr schema.xml query analyser
Based on your stated requirements, there is no obvious need to use the keyword tokenizer. So fix that, and then quoted phrases or escaped spaces should work.

-- Jack Krupansky

-----Original Message----- From: prashantc88 Sent: Monday, July 21, 2014 11:51 AM To: solr-user@lucene.apache.org Subject: Re: Solr schema.xml query analyser
Re: Solr schema.xml query analyser
My apologies, Jack, but there was a mistake in my question. I actually switched query and textExactMatch. It would be really helpful if you could have a look at the scenario once again.

My task is to provide a match when the search term contains the indexed field. For example:

If textExactMatch="foo bar" and query="foo", I should not get a MATCH
If textExactMatch="foo bar" and query="foo bar", I should get a MATCH
If textExactMatch="foo bar" and query="xyz foo bar"/"foo bar xyz", I should get a MATCH

I am indexing my field as follows:

<fieldType name="textExactMatch" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

So I'm indexing the text for the field as-is, without breaking it down further. How should I tokenize and filter the field at query time?

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-schema-xml-query-analyser-tp4148317p4148352.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: text search problem
Thanks for the reply, Erick. I will try as you suggested. I have another question related to this: when I have "-" in my description or name, the search results are different. For e.g. ABC-123: it looks for ABC or 123, but I want to treat this search as an exact match, i.e., if my document has ABC-123 then I should get the results. When I check with hl=on, it has <em>ABC</em> and gets the results. How can I avoid this situation?

Thanks
Ravi

-----Original Message----- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Saturday, July 19, 2014 4:40 PM To: solr-user@lucene.apache.org Subject: Re: text search problem

Try adding debug=all to the query and see what the parsed form of the query is. Likely you're 1> using phrase queries, so "broadway hotel" requires both words in the text, or 2> if you're not using phrases, you're searching for the AND of the two terms. But debug=all will show you. Plus, take a look at the admin/analysis page; your tokenization may not be what you expect.

Best,
Erick

On Fri, Jul 18, 2014 at 2:00 PM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) external.ravi.tamin...@us.bosch.com wrote:
> Hi, below is the text_general field type. When I search Text:Broadway it is not returning all the records, only a few. But when I search for Text:*Broadway*, I get more records. When I get into a multi-word search like "Broadway Hotel", it may not get "Broadway", "Hotel", "Broadway Hotel". Do you have any thoughts on how to handle this type of keyword search?
>
> Text:Broadway,Vehicle Detailing,Water Systems,Vehicle Detailing,Car Wash Water Recovery
>
> My field type looks like this:
>
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.KStemFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" splitOnCaseChange="0" splitOnNumerics="0" stemEnglishPossessive="0" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="0"/>
>     <!-- in this example, we will only use synonyms at query time
>     <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>     -->
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.KStemFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" splitOnCaseChange="0" splitOnNumerics="0" stemEnglishPossessive="0" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="0"/>
>   </analyzer>
> </fieldType>
>
> Do you have any thoughts on this behavior or how to get this?
>
> Thanks
> Ravi
Re: Solr schema.xml query analyser
That sounds more like a reverse query - trying to match documents against the query rather than matching the query against the documents. Solr doesn't have that feature currently. Although I'm not absolutely sure what your textExactMatch is; I'm guessing that it is a document field in your index.

-- Jack Krupansky

-----Original Message----- From: newBie88 Sent: Monday, July 21, 2014 1:13 PM To: solr-user@lucene.apache.org Subject: Re: Solr schema.xml query analyser
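The "reverse" match Jack describes can be sketched as client-side logic: treat it as a hit when the query contains the indexed value as a contiguous token sequence. This is application code, not a Solr feature, and the whitespace split and lowercasing only loosely mirror the poster's analyzer:

```python
# Does the query contain the indexed field value as a contiguous phrase?
def query_contains_field(query, field_value):
    q = query.lower().split()
    f = field_value.lower().split()
    # slide a window of len(f) over the query tokens
    return any(q[i:i + len(f)] == f for i in range(len(q) - len(f) + 1))

print(query_contains_field("foo", "foo bar"))          # False: no match
print(query_contains_field("foo bar", "foo bar"))      # True: match
print(query_contains_field("xyz foo bar", "foo bar"))  # True: match
```

This reproduces the three example cases from the question; doing it inside Solr at scale would need a different approach (e.g. indexing the query side), which the thread does not cover.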
RE: Multiterm analysis in complexphrase query
That would be really useful. Can you upload the jar and its requirements? It also makes it pluggable with different versions of Solr.

On Jul 1, 2014 9:01 PM, "Allison, Timothy B." talli...@mitre.org wrote:
> If there's enough interest, I might get back into the code and throw a standalone src (and jar) of the SpanQueryParser and the Solr wrapper onto GitHub. That would make it more widely available until there's a chance to integrate it into Lucene/Solr. If you'd be interested in this, let me know (and/or vote on the issue pages in Jira).
>
> Best,
> Tim

-----Original Message----- From: Michael Ryan [mailto:mr...@moreover.com] Sent: Tuesday, July 01, 2014 9:24 AM To: solr-user@lucene.apache.org Subject: RE: Multiterm analysis in complexphrase query

Thanks. This looks interesting...

-Michael

-----Original Message----- From: Allison, Timothy B. [mailto:talli...@mitre.org] Sent: Monday, June 30, 2014 8:15 AM To: solr-user@lucene.apache.org Subject: RE: Multiterm analysis in complexphrase query

Ahmet, please correct me if I'm wrong, but the ComplexPhraseQueryParser does not perform analysis (as you, Michael, point out). The SpanQueryParser in LUCENE-5205 does perform analysis and might meet your needs. Work on it has gone on pause, though, so you'll have to build from the patch or the LUCENE-5205 branch. Let me know if you have any questions. LUCENE-5470 and LUCENE-5504 would move multiterm analysis farther down and make it available to all parsers that use QueryParserBase, including the ComplexPhraseQueryParser.

Best,
Tim

-----Original Message----- From: Michael Ryan [mailto:mr...@moreover.com] Sent: Sunday, June 29, 2014 11:09 AM To: solr-user@lucene.apache.org Subject: Multiterm analysis in complexphrase query

I've been using a modified version of the complex phrase query parser patch from https://issues.apache.org/jira/browse/SOLR-1604 in Solr 3.6, and I'm currently upgrading to 4.9, which has this built-in.

I'm having trouble with using accents in wildcard queries, support for which was added in https://issues.apache.org/jira/browse/SOLR-2438. In 3.6, I was using a modified version of SolrQueryParser, which simply used ComplexPhraseQueryParser in place of QueryParser. In the version of ComplexPhraseQParserPlugin in 4.9, it just directly uses ComplexPhraseQueryParser and doesn't go through SolrQueryParser at all. SolrQueryParserBase.analyzeIfMultitermTermText() is where the multiterm analysis magic happens. So, my problem is that ComplexPhraseQParserPlugin/ComplexPhraseQueryParser doesn't use SolrQueryParserBase, which breaks doing fun things like this:

{!complexPhrase}barac* óba*a

and expecting it to match "Barack Obama". Anyone run into this before, or have a way to get this working?

-Michael
How do I disable distributed search feature when I have only one shard
Hi there,

We have a Solr Cloud setup with only one shard. There is one leader and 15 followers, so the data is replicated on 15 nodes. When we run a Solr query, only one node should handle the request; we do not need any distributed search feature, as all the nodes are exact copies of each other.

Under certain load scenarios, we are seeing the SolrJ API adding isShard=true&distrib=false&shard.url=A,B,C etc. to all the queries. Is the Solr query waiting for responses from A, B, and C before returning to the client? If so, it is unnecessary and causing problems for us under heavy load. Somehow, these parameters are automagically added at query time. How do we disable this? The SolrJ query that we build programmatically does not add these three parameters. Is there some configuration we can turn on to tell SolrJ not to add these parameters to the Solr request?

Thanks,
Pramod

--
View this message in context: http://lucene.472066.n3.nabble.com/How-do-I-disable-distributed-search-feature-when-I-have-only-one-shard-tp4148449.html
Sent from the Solr - User mailing list archive at Nabble.com.
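The thread has no reply here, so as a hedged sketch only: Solr accepts an explicit distrib=false request parameter, which keeps a query on the node that receives it instead of fanning it out. Building such a request might look like this (collection name and query field are hypothetical):

```python
from urllib.parse import urlencode

# Sketch: force a non-distributed query by sending distrib=false on the
# request itself (collection and field names are placeholders).
params = {"q": "id:123", "distrib": "false"}
print("/solr/collection1/select?" + urlencode(params))
```

With SolrJ, the equivalent would be setting the same parameter on the query object; whether that suppresses the shard fan-out described above in every load scenario is untested here.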
SolrCloud replica dies under high throughput
Hi,

I'm doing some benchmarking with Solr Cloud 4.9.0. I am trying to work out exactly how much throughput my cluster can handle. Consistently in my tests I see a replica go into a recovering state forever, caused by what looks like a timeout during replication. I can understand the timeout and failure (I am hitting it fairly hard), but what seems odd to me is that when I stop the heavy load it still does not recover the next time it tries; it seems broken forever until I manually go in, clear the index, and let it do a full resync. Is this normal? Am I misunderstanding something?

My cluster has 4 nodes (2 shards, 2 replicas) (AWS m3.2xlarge). I am indexing with ~800 concurrent connections and a 10 sec soft commit. I consistently get this problem at a throughput of around 1.5 million documents per hour.

Thanks all,
Darren

Stack Traces / Messages:

[qtp779330563-627] ERROR org.apache.solr.servlet.SolrDispatchFilter - null:org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
  at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
  at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
  at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:422)
  at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
  at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:724)

Error while trying to recover. core=assets_shard2_replica1:java.util.concurrent.ExecutionException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://xxx.xxx.15.171:8080/solr
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:188)
  at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:615)
  at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:371)
  at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
Caused by: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://xxx.xxx.15.171:8080/solr
  at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:566)
  at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:245)
  at org.apache.solr.client.solrj.impl.HttpSolrServer$1.call(HttpSolrServer.java:241)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.SocketException: Socket closed
  at java.net.SocketInputStream.socketRead0(Native Method)
  at java.net.SocketInputStream.read(SocketInputStream.java:152)
  at java.net.SocketInputStream.read(SocketInputStream.java:122)
  at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
  at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
  at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
  at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
  at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
  at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
  at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
  at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
  at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) at
Re: SolrCloud replica dies under high throughput
Looks like you probably have to raise the HTTP client connection pool limits to handle that kind of load currently. They are specified as top-level config in solr.xml: maxUpdateConnections and maxUpdateConnectionsPerHost. -- Mark Miller about.me/markrmiller On July 21, 2014 at 7:14:59 PM, Darren Lee (d...@amplience.com) wrote: Hi, I'm doing some benchmarking with Solr Cloud 4.9.0. I am trying to work out exactly how much throughput my cluster can handle. Consistently in my test I see a replica go into recovering state forever, caused by what looks like a timeout during replication. I can understand the timeout and failure (I am hitting it fairly hard), but what seems odd to me is that when I stop the heavy load it still does not recover the next time it tries; it seems broken forever until I manually go in, clear the index and let it do a full resync. Is this normal? Am I misunderstanding something? My cluster has 4 nodes (2 shards, 2 replicas) (AWS m3.2xlarge). I am indexing with ~800 concurrent connections and a 10 sec soft commit. I consistently get this problem with a throughput of around 1.5 million documents per hour. Thanks all, Darren
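Mark's suggestion above, sketched as a solr.xml fragment. The values are illustrative only, and the exact placement of these settings may differ between Solr versions (the 4.x defaults are 10000 total and 100 per host), so verify against your version's solr.xml documentation:

```xml
<solr>
  <!-- Illustrative values: raise the pool used for inter-node update traffic -->
  <int name="maxUpdateConnections">100000</int>
  <int name="maxUpdateConnectionsPerHost">1000</int>
  <!-- ... rest of solr.xml unchanged ... -->
</solr>
```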
SolrCloud extended warmup support
I’d like to ensure an extended warmup is done on each SolrCloud node prior to that node serving traffic. I can do certain things prior to starting Solr, such as pump the index dir through /dev/null to pre-warm the filesystem cache, and post-start I can use the ping handler with a health check file to prevent the node from entering the client's load balancer until I’m ready. What I seem to be missing is control over when a node starts participating in queries sent to the other nodes. I can, of course, add solrconfig.xml firstSearcher queries, which I assume (and fervently hope!) run before a node registers itself in ZK clusterstate.json as ready for work, but that doesn’t scale so well if I want that initial warmup to run thousands of queries, or run them with some parallelism. I’m storing solrconfig.xml in ZK, so I’m sensitive to the size. Any ideas, or corrections to my assumptions? Thanks.
Re: SolrCloud extended warmup support
On 7/21/2014 5:37 PM, Jeff Wartes wrote: I’d like to ensure an extended warmup is done on each SolrCloud node prior to that node serving traffic. [...] Any ideas, or corrections to my assumptions?

I think that firstSearcher/newSearcher (and making sure useColdSearcher is set to false) is going to be the only way you can do this in a way that's compatible with SolrCloud. If you were doing manual distributed search without SolrCloud, you'd have more options available. If useColdSearcher is set to false, that should keep *everything* from using the searcher until the warmup has finished. I cannot be certain that this is the case, but I have some reasonable confidence that this is how it works. If you find that it doesn't behave this way, I'd call it a bug. Thanks, Shawn
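For reference, the solrconfig.xml pieces discussed here look roughly like the following; the warming query itself is a placeholder:

```xml
<query>
  <!-- false = requests block until the first searcher has finished warming -->
  <useColdSearcher>false</useColdSearcher>

  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">static warming query here</str></lst>
    </arr>
  </listener>
</query>
```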
Re: SolrCloud extended warmup support
On 7/21/14, 4:50 PM, Shawn Heisey s...@elyograg.org wrote: [...] If useColdSearcher is set to false, that should keep *everything* from using the searcher until the warmup has finished. I cannot be certain that this is the case, but I have some reasonable confidence that this is how it works. If you find that it doesn't behave this way, I'd call it a bug. Thanks, Shawn

Thanks for the quick reply. Since distributed search latency is the max of the shard sub-requests, I'm trying my best to minimize any spikes in cluster latency due to node restarts. I double-checked useColdSearcher was false, but the doc says this means requests "block until the first searcher is done warming", which translates pretty clearly to "latency spike".
The more I think about it, the more worried I am that a node might indeed register itself in live_nodes and get distributed requests before it's got a searcher to work with. *Especially* if I have lots of serial firstSearcher queries. I'll look through the code myself tomorrow, but if anyone can help confirm/deny the order of operations here, I'd appreciate it.
Re: Edit Example Post.jar to read ALL file types
So how do you expect these to be indexed? I mean, what happens if you run across a Word document? How about an mp3? Just blasting all files up seems chancy. And doesn't just 'java -jar post.jar *' do what you ask? This seems like an XY problem: _why_ do you want to do this? Because unless the files being sent to Solr are properly formatted, they won't be ingested. There's some special logic that handles XML files and expects the very precise Solr format; Solr would have no idea what to do with the extensions in your example. Perhaps a better approach would be to control the indexing from a SolrJ client. Here's a blog if you want to follow that approach. Best, Erick On Mon, Jul 21, 2014 at 7:51 AM, jrusnak jrus...@live.unc.edu wrote: I am working with Solr 4.8.1 to set up an enterprise search system. The file system I am working with has numerous files with unique extension types (ex .20039 .20040 .20041 etc.) I am using the post.jar file included in the binary download (src: SimplePostTool.java http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/util/SimplePostTool.java ) to post these files to the solr server and would like to edit this jar file to recognize /any/ file extension it comes across. Is there a way to do this with the SimplePostTool.java source? I am right now working to better understand the Filetype and DEFAULT_FILE_TYPE variables as well as the mimeMap. It is these that currently allow me to manually add file extensions. I would, however, like the tool to be able to read in files no matter what their extension was and default their mime type to text/plain. -- View this message in context: http://lucene.472066.n3.nabble.com/Edit-Example-Post-jar-to-read-ALL-file-types-tp4148312.html Sent from the Solr - User mailing list archive at Nabble.com.
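As an aside, if the goal is only to push every file through as plain text, the system properties already supported by the 4.x SimplePostTool may be enough without editing the jar. A sketch, with the URL and filenames as placeholders (verify the flags against your post.jar version):

```
java -Dtype=text/plain -Durl=http://localhost:8983/solr/update/extract -jar post.jar file.20039 file.20040
```

Note the target here is the extracting handler, since the plain /update handler would reject arbitrary plain-text bodies.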
Re: text search problem
Try escaping the hyphen as \-. Or enclosing it all in quotes. But you _really_ have to spend some time with the debug option and the admin/analysis page or you will find endless surprises. Best, Erick On Mon, Jul 21, 2014 at 11:12 AM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) external.ravi.tamin...@us.bosch.com wrote: Thanks for the reply Erick, I will try as you suggested. I have another question related to this. When I have - in my description or name, the search results are different. For e.g. ABC-123: it looks for ABC or 123, but I want to treat this search as an exact match, i.e. if my document has ABC-123 then I should get the results. When I check with hl=on, it has <em>ABC</em> and gets the results. How can I avoid this situation? Thanks Ravi -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Saturday, July 19, 2014 4:40 PM To: solr-user@lucene.apache.org Subject: Re: text search problem Try adding debug=all to the query and see what the parsed form of the query is. Likely you're 1> using phrase queries, so "broadway hotel" requires both words in the text, or 2> if you're not using phrases, you're searching for the AND of the two terms. But debug=all will show you. Plus, take a look at the admin/analysis page, your tokenization may not be what you expect. Best, Erick On Fri, Jul 18, 2014 at 2:00 PM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) external.ravi.tamin...@us.bosch.com wrote: Hi, Below is the text_general field type. When I search Text:Broadway it is not returning all the records, it returns only a few records. But when I search for Text:*Broadway*, it is getting more records. When I get into a multiple-word search like Broadway Hotel, it may match Broadway or Hotel but not Broadway Hotel. Do you have any thoughts on how to handle this type of keyword search? Text:Broadway,Vehicle Detailing,Water Systems,Vehicle Detailing,Car Wash Water Recovery My field type looks like this.
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.KStemFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" splitOnCaseChange="0" splitOnNumerics="0" stemEnglishPossessive="0" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="0"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.KStemFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" splitOnCaseChange="0" splitOnNumerics="0" stemEnglishPossessive="0" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="0"/>
  </analyzer>
</fieldType>

Do you have any thoughts on this behavior or how to handle it? Thanks Ravi
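Erick's two suggestions above, written out against the Text field used in this thread (illustrative only; whether either helps depends on the analysis chain shown in the message):

```
q=Text:ABC\-123       escape the hyphen so the query parser keeps it inside one term
q=Text:"ABC-123"      quote the value so it is analyzed as a single phrase
```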
Re: How do I disable distributed search feature when I have only one shard
Are you using CloudSolrServer in your SolrJ program? No matter what, the distrib=false should be keeping the query from going to more than one shard, so I'd check the logs and see if the suspect query appears on more than one node. FWIW, Erick On Mon, Jul 21, 2014 at 4:13 PM, pramodEbay prmaha...@ebay.com wrote: Hi there, We have a solr cloud set up with only one shard. There is one leader and 15 followers, so the data is replicated on 15 nodes. When we run a solr query, only one node should handle the request, and we do not need any distributed search feature as all the nodes are exact copies of each other. Under certain load scenarios, we are seeing the SolrJ API is adding isShard=true&distrib=false&shard.url=A,B,C etc. to all the queries. Is the solr query waiting for responses from A, B and C before returning back to the client? If that is true, it is unnecessary and causing problems for us under heavy load. The thing is, somehow, these parameters are automagically added during query time. How do we disable this? The solrj query that we build programmatically does not add these three parameters. Is there some configuration we can turn on, to tell solrj not to add these parameters to the solr request? Thanks, Pramod -- View this message in context: http://lucene.472066.n3.nabble.com/How-do-I-disable-distributed-search-feature-when-I-have-only-one-shard-tp4148449.html Sent from the Solr - User mailing list archive at Nabble.com.
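For reference, forcing a non-distributed query can be done on the request itself; the host and collection names below are placeholders:

```
http://host:8983/solr/collection1/select?q=*:*&distrib=false
```

In SolrJ the equivalent is setting the same parameter on the query object, e.g. query.set("distrib", "false").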
Re: SolrCloud extended warmup support
I've never seen it necessary to run thousands of queries to warm Solr. Usually less than a dozen will work fine. My challenge would be for you to measure performance differences on queries after running, say, 12 well-chosen queries as opposed to hundreds/thousands. I bet that if:

1> you search across all the relevant fields, you'll fill up the low-level caches for those fields.
2> you facet on all the fields you intend to facet on.
3> you sort on all the fields you intend to sort on.
4> you specify some filter queries. This is fuzzy since it really depends on you being able to predict what those will be for firstSearcher. Things like "in the last day/week/month" can be pre-configured, but others you won't get. BTW, here's a blog about why "in the last day" fq clauses can be tricky: http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/

then you'll pretty much nail warmup and be fine. Note that you can do all the faceting in a single query. Specifying the primary, secondary etc. sorts will fill those caches. Best, Erick

On Mon, Jul 21, 2014 at 5:07 PM, Jeff Wartes jwar...@whitepages.com wrote: [...]
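Erick's four points can be folded into one or two firstSearcher entries in solrconfig.xml; a sketch with placeholder field names:

```xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- one query that searches, facets, and sorts on the fields you actually use -->
    <lst>
      <str name="q">title:foo body:bar</str>
      <str name="facet">true</str>
      <str name="facet.field">category</str>
      <str name="sort">price asc, created desc</str>
    </lst>
    <!-- plus the filter queries you can predict in advance -->
    <lst>
      <str name="q">*:*</str>
      <str name="fq">inStock:true</str>
    </lst>
  </arr>
</listener>
```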
DocValues without re-index?
Is it possible to use DocValues on an existing index without first re-indexing? -Michael