Re: Json Faceting Performance Issues on solr v8.7.0

2021-02-05 Thread Michael Gibney
Ah! that's significant. The latency is likely due to building the OrdinalMap (which maps segment ords to global ords) ... "dvhash" (assuming the relevant fields are not multivalued) will very likely work; "dvhash" doesn't map to global ords, so doesn't need to build the OrdinalMap (which gets built
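For reference, a minimal sketch of what a JSON Facet request asking for the "dvhash" method might look like. The field names `processId` and `resultId` come from the original question in this thread; the facet label `processIds`, the helper name, and the query value are placeholders chosen for illustration. Whether "dvhash" actually helps depends on the field being single-valued and the result set being small.

```python
import json


def build_dvhash_facet_request(query, facet_field, limit=10):
    """Build a JSON Request API body with a terms facet that requests method=dvhash."""
    return {
        "query": query,
        "limit": 0,  # same effect as rows=0: only facet counts are wanted
        "facet": {
            "processIds": {  # arbitrary label chosen for this sketch
                "type": "terms",
                "field": facet_field,
                "limit": limit,
                "method": "dvhash",  # hash over matching docs, no global ordinals
            }
        },
    }


body = build_dvhash_facet_request("resultId:x", "processId")
print(json.dumps(body, indent=2))
```

The resulting body would be POSTed to the core's `/query` endpoint; timing the same request with and without the `method` key is a simple way to check whether global-ordinal building is the bottleneck.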

Re: Json Faceting Performance Issues on solr v8.7.0

2021-02-05 Thread mmb1234
> Does this happen on a warm searcher (are subsequent requests with no intervening updates _ever_ fast?)? Subsequent response times are very fast if the searcher remains open. As a control test, I faceted on the same field that I used in the q param. 1. Start solr 2. Execute q=resultId:x&rows=0 =>

Re: Json Faceting Performance Issues on solr v8.7.0

2021-02-05 Thread Michael Gibney
Apologies, I missed deducing from the request url that you're already talking strictly about single-shard requests (so everything I was suggesting about shards.preference etc. is not applicable). "dvhash" is still worth a try though, esp. with `numFound` being 943 (out of 185 million!). Does this h

Re: Json Faceting Performance Issues on solr v8.7.0

2021-02-05 Thread mmb1234
Ok. I'll try that. Meanwhile, a query on resultId returns in under a second, but the immediately following faceting query takes 40+ seconds. The core has 185 million docs and a 63GB index. curl 'http://localhost:8983/solr/TestCollection_shard1_replica_t3/query?q=resultId:x&rows=0' { "responseHea

Re: Json Faceting Performance Issues on solr v8.7.0

2021-02-05 Thread Michael Gibney
`resultId` sounds like it might be a relatively high-cardinality field (lots of unique values)? What's your number of shards, and replicas per shard? SOLR-15008 (note: not a bug) describes a situation that may be fundamentally similar to yours (though to be sure it's impossible to say for sure with

Json Faceting Performance Issues on solr v8.7.0

2021-02-05 Thread mmb1234
Hello, I am seeing very slow response from json faceting against a single core (though core is shard leader in a collection). Fields processId and resultId are non-multivalued, indexed and docvalues string (not text). Soft Commit = 5sec (opensearcher=true) and Hard Commit = 10sec because new do

Re: Performance issues with CursorMark

2020-10-26 Thread Erick Erickson
pta >> Sent: Monday 26th October 2020 17:00 >> To: solr-user@lucene.apache.org >> Subject: Re: Performance issues with CursorMark >> >> Hey Markus, >> >> What are you sorting on? Do you have docValues enabled on the sort field ? >> >> On Mon, Oct

RE: Performance issues with CursorMark

2020-10-26 Thread Markus Jelsma
> Sent: Monday 26th October 2020 17:00 > To: solr-user@lucene.apache.org > Subject: Re: Performance issues with CursorMark > > Hey Markus, > > What are you sorting on? Do you have docValues enabled on the sort field ? > > On Mon, Oct 26, 2020 at 5:36 AM Markus Jelsma >

Re: Performance issues with CursorMark

2020-10-26 Thread Anshum Gupta
Hey Markus, What are you sorting on? Do you have docValues enabled on the sort field ? On Mon, Oct 26, 2020 at 5:36 AM Markus Jelsma wrote: > Hello, > > We have been using a simple Python tool for a long time that eases > movement of data between Solr collections, it uses CursorMark to fetch >

Performance issues with CursorMark

2020-10-26 Thread Markus Jelsma
Hello, We have been using a simple Python tool for a long time that eases movement of data between Solr collections, it uses CursorMark to fetch small or large pieces of data. Recently it stopped working when moving data from a production collection to my local machine for testing, the Solr nod
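For readers unfamiliar with the technique, here is a minimal sketch of a cursorMark paging loop of the kind such a tool might use. It is not the tool described in this message. `fetch_page` is a hypothetical caller-supplied function that sends the parameters to Solr's `/select` handler and returns the parsed JSON; `fake_solr` is an in-memory stand-in so the loop can be run without a server. Note that cursorMark requires the sort to include the uniqueKey field (assumed here to be `id`).

```python
def fetch_all(fetch_page, query, sort, rows=500):
    """Yield every document matching `query` using cursorMark paging."""
    cursor = "*"  # documented starting value for a new cursor
    while True:
        params = {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        data = fetch_page(params)
        for doc in data["response"]["docs"]:
            yield doc
        next_cursor = data["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            break
        cursor = next_cursor


def fake_solr(docs):
    """In-memory stand-in for Solr, for demonstration only."""
    def fetch_page(params):
        mark = params["cursorMark"]
        start = 0 if mark == "*" else int(mark)
        page = docs[start:start + params["rows"]]
        # Real cursor marks are opaque strings; an offset is used here for simplicity.
        next_mark = str(start + len(page)) if page else mark
        return {"response": {"docs": page}, "nextCursorMark": next_mark}
    return fetch_page


sample = [{"id": str(i)} for i in range(5)]
fetched = list(fetch_all(fake_solr(sample), "*:*", "id asc", rows=2))
print(len(fetched))
```

Because each page is fetched with a plain sorted query, the cost per page depends on how cheaply Solr can sort on the chosen field, which is why the replies ask about docValues on the sort field.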

Solr LTR Performance Issues

2020-09-21 Thread krishan goyal
I am observing a severe degradation in performance when adding more features to my Solr LTR model, even if the model complexity (number of trees, depth of tree) remains the same. I am using the MultipleAdditiveTreesModel model. Moreover, if model complexity increases keeping the number of features constant, performa

Severe performance issues of Solr 6.6.0 with debug logging

2020-01-30 Thread Davis
I have recently observed severe performance issues with a 1-collection, 2-shard, 4-server SolrCloud (Solr 6.6.0 running on Windows with the AdoptOpenJDK 1.8 JRE; NSSM was used to run Solr as a Windows service). During recovery of a replica the network utilization of the server hosting the replica (that is

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-28 Thread Toke Eskildsen
Wittenberg, Lucas wrote: > As suggested I switched to using DocValues and SortedDocValues. > Now QTime is down to an average of 1100, which is much, much better > but still far from the 30 I had with SOLR 4. > I suppose it is due to the block-oriented compression you mentioned. I apologize for be

RE: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-28 Thread Wittenberg, Lucas
+ and performance issues with DelegatingCollector and PostFilter Thanks for the suggestion. But the "customid" field is already set as docValues="true" actually. Well, I guess so as it is a type="string" which by default has docValues="true". -Message

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Erick Erickson
> I don't know the precedence rules for stored vs. docValues in Solr DocValues are used if (and only if) all the fields being returned have docValues=“true” _and_ are single-valued, or if you’ve explicitly set useDocValuesAsStored. Single-valued docValues are the only situation where the respon

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Toke Eskildsen
On Tue, 2019-08-27 at 09:05 +, Wittenberg, Lucas wrote: > But the "customid" field is already set as docValues="true" actually. > Well, I guess so as it is a type="string" which by default has > docValues="true". > > required="true" multiValued="false" /> > docValues="true" /> Yeah, it's a

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Toke Eskildsen
On Mon, 2019-08-26 at 16:01 +, Wittenberg, Lucas wrote: > @Override > public void collect(int docNumber) throws IOException { > if (null != this.reader && > isValid(this.reader.document(docNumber).get("customid"))) > { > super.collec

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Erick Erickson
multiValued="false" /> > docValues="true" /> > > > -Message d'origine- > De : Wittenberg, Lucas > Envoyé : lundi 26 août 2019 18:01 > À : solr-user@lucene.apache.org > Objet : SOLR 7+ / Lucene 7+ and performance issues with DelegatingC

RE: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-27 Thread Wittenberg, Lucas
voyé : lundi 26 août 2019 18:01 À : solr-user@lucene.apache.org Objet : SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter Hello all, Here is the situation I am facing. I am migrating from SOLR 4 to SOLR 7. SOLR 4 is running on Tomcat 8, SOLR 7 runs with built in Jetty 9. The la

Re: SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-26 Thread Erick Erickson
Is “customid” a docValues=true field? I suspect not, in which case I think this is the problem (but do be warned, I don’t spend much time in Lucene code). this.reader.document(docNumber).get("customid") document(docNumber) goes out to do a disk read I think. If it were docValues=true, it could b

SOLR 7+ / Lucene 7+ and performance issues with DelegatingCollector and PostFilter

2019-08-26 Thread Wittenberg, Lucas
Hello all, Here is the situation I am facing. I am migrating from SOLR 4 to SOLR 7. SOLR 4 is running on Tomcat 8, SOLR 7 runs with built in Jetty 9. The largest core contains about 1,800,000 documents (about 3 GB). The migration went through smoothly. But something's bothering me. I have a Pos

Re: Solr LTR model Performance Issues

2019-04-18 Thread Kamal Kishore Aggarwal
Hi, I changed the model by making LTRScoringModel immutable and caching the hashCode calculation. The response time improved a lot after the change. http://lucene.472066.n3.nabble.com/jira-Updated-SOLR-12688-LTR-Multiple-performance-fixes-pure-DocValues-support-for-FieldValueFeature-td440

Re: Solr LTR model Performance Issues

2019-04-05 Thread Jörn Franke
It is a little bit difficult to say, because it could be also the business logic in the query execution. What is your performance baseline, ie if you just execute one query for each of the models? How fast should it be? Do you have really 10 or more concurrent users, or users that fire up querie

Re: Solr LTR model Performance Issues

2019-04-05 Thread Kamal Kishore Aggarwal
Hi, Any update on this? Is this model running in multi-threaded mode, or is there any scope to do this? Please let me know. Regards Kamal On Sat, Mar 23, 2019 at 10:35 AM Kamal Kishore Aggarwal < kkroyal@gmail.com> wrote: > HI Jörn Franke, > > Thanks for the quick reply. > > I have perfor

Re: Solr LTR model Performance Issues

2019-03-22 Thread Kamal Kishore Aggarwal
Hi Jörn Franke, Thanks for the quick reply. I have performed JMeter load testing on one of the servers for the Linear vs. Multiple Additive Trees model. We are using Lucidworks Fusion. There is some business logic in the query pipeline followed by the main Solr LTR query. This is the total time taken by

Re: Solr LTR model Performance Issues

2019-03-22 Thread Jörn Franke
Can you share the time needed of the two models? How many documents? What is your loading pipeline? Have you observed cpu/memory? > Am 22.03.2019 um 12:01 schrieb Kamal Kishore Aggarwal : > > Hi, > > I am trying to use LTR with solr 6.6.2.There are different types of model > like Linear Model,

Solr LTR model Performance Issues

2019-03-22 Thread Kamal Kishore Aggarwal
Hi, I am trying to use LTR with Solr 6.6.2. There are different types of models, such as the Linear Model, Multiple Additive Trees Model and Neural Network Model. I have tried the Linear and Multiple Additive Trees models and compared their performance. There is a major difference in response time between th

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-28 Thread RAUNAK AGRAWAL
Thank you Joel. Looking forward to the latest version of solr. Thanks On Fri, Sep 28, 2018 at 12:22 PM Joel Bernstein wrote: > The facet expression is currently not as expressive as the JSON facet API. > So for very demanding use cases you can create more highly tuned JSON facet > API call. > >

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-28 Thread Joel Bernstein
The facet expression is currently not as expressive as the JSON facet API. So for very demanding use cases you can create a more highly tuned JSON facet API call. The good news is we are working on this, and also working on other expressions that can be wrapped around the facet expression to implement

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-28 Thread RAUNAK AGRAWAL
Thanks a lot Toke. I will get back to you soon regarding the patch update after discussing it with the team. Thanks & Regards On Fri, Sep 28, 2018 at 11:30 AM Toke Eskildsen wrote: > RAUNAK AGRAWAL wrote: > > > curl http://localhost:8983/solr/collection_name/stream -d > > 'expr=facet(collecti

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-28 Thread Toke Eskildsen
RAUNAK AGRAWAL wrote: > curl http://localhost:8983/solr/collection_name/stream -d > 'expr=facet(collection_name,q="id:953",bucketSorts="week > desc",buckets="week",bucketSizeLimit=200,sum(sales), > sum(amount),sum(days))' Stats on numeric fields then. > Also in my collection, I have almost 10

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-28 Thread RAUNAK AGRAWAL
Thanks a lot Erick for the documentation. I will go through it and get back to you in case of any queries. Regards, Raunak On Fri, Sep 28, 2018 at 11:09 AM Erick Erickson wrote: > It Depends (tm). The behavior changed with Solr 7.5. Here are all the > gory details: > > > https://lucidworks.com/

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-28 Thread Erick Erickson
It Depends (tm). The behavior changed with Solr 7.5. Here are all the gory details: https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/ and for 7.5+ https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/ Best, Erick On Fri, Sep 28, 2018 at 10:

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-28 Thread RAUNAK AGRAWAL
Hey Guys, This is the sample query I am making: curl http://localhost:8983/solr/collection_name/stream -d 'expr=facet(collection_name,q="id:953",bucketSorts="week desc",buckets="week",bucketSizeLimit=200,sum(sales),sum(amount),sum(days))' Also in my collection, I have almost 10 Billion documen
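For comparison, a rough sketch of how the streaming `facet(...)` expression above might be expressed as a JSON Facet API request. The field names (`id`, `week`, `sales`, `amount`, `days`), the query and the bucket limit are taken from the expression; the facet label `weeks` and the stat labels are made up for this sketch. It is not claimed to be equivalent in performance, only in shape.

```python
import json


def streaming_facet_as_json_facet():
    """Return a JSON Request API body mirroring the facet() expression above."""
    return {
        "query": "id:953",
        "limit": 0,  # only buckets are wanted, no documents
        "facet": {
            "weeks": {  # label chosen for this sketch
                "type": "terms",
                "field": "week",
                "limit": 200,          # bucketSizeLimit=200
                "sort": "index desc",  # bucketSorts="week desc": order by bucket value
                "facet": {
                    "sum_sales": "sum(sales)",
                    "sum_amount": "sum(amount)",
                    "sum_days": "sum(days)",
                },
            }
        },
    }


request_body = streaming_facet_as_json_facet()
print(json.dumps(request_body, indent=2))
```

Having both forms side by side makes it easier to time the two approaches against the same collection and query.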

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-28 Thread Toke Eskildsen
On Thu, 2018-09-27 at 15:52 -0700, RAUNAK AGRAWAL wrote: > But for last few days, we are observing now that streaming facet > response is slower that json facets. Also we have increased the > number of documents in collection (30%). Export performance goes down when segment size goes way up, so I

Re: Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-27 Thread Joel Bernstein
Please post the Streaming Expression that you are using. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Sep 27, 2018 at 6:52 PM RAUNAK AGRAWAL wrote: > Hi Guys, > > Just to give you context, we were using JSON Facets for doing analytical > queries in solr but they were slower. Hence we

Solr Streaming Queries Performance Issues [v7.2.1]

2018-09-27 Thread RAUNAK AGRAWAL
Hi Guys, Just to give you context, we were using JSON Facets for analytical queries in Solr, but they were slower. Hence we migrated our application to Solr streaming facet queries. But for the last few days we have observed that the streaming facet response is slower than JSON facets. Also

Re: SolrCloud Large Cluster Performance Issues

2018-06-25 Thread Shawn Heisey
On 6/24/2018 7:38 PM, 苗海泉 wrote: Hello, everyone, we encountered two solr problems and hoped to get help. Our data volume is very large, 24.5TB a day, and the number of records is 110 billion. We originally used 49 solr nodes. Because of insufficient storage, we expanded to 100. For a solr cluste

Re: SolrCloud Large Cluster Performance Issues

2018-06-25 Thread Emir Arnautović
Hi, With such a big cluster a lot of things can go wrong and it is hard to give any answer without looking into it more and understanding your model. I assume that you are monitoring your system (both Solr/ZK and components that index/query) so it should be the first thing to look at and see if

SolrCloud Large Cluster Performance Issues

2018-06-24 Thread 苗海泉
Hello, everyone, we encountered two solr problems and hoped to get help. Our data volume is very large, 24.5TB a day, and the number of records is 110 billion. We originally used 49 solr nodes. Because of insufficient storage, we expanded to 100. For a solr cluster composed of multiple machines, we

Re: Re:LTR performance issues

2018-05-09 Thread ilayaraja
Thanks, Diego. I shall follow up on the Jira. --Ilay -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re:LTR performance issues

2018-05-08 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
[1] https://issues.apache.org/jira/browse/SOLR-11831?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=16316605#comment-16316605 From: solr-user@lucene.apache.org At: 05/08/18 07:07:01 To: solr-user@lucene.apache.org Subject: LTR performance issues LTR with grouping

LTR performance issues

2018-05-07 Thread ilayaraja
LTR with grouping results in very high latency (3x) even while re-ranking 24 top groups. How is re-ranking implemented in Solr? Is it expected that it would result in 3x more query time? Need clarifications on: 1. How many top groups are actually re-ranked, is it exactly what we pass in reRankDoc

Re: CDCR performance issues

2018-03-23 Thread Tom Peters
etty sure CDCR uses HTTP/HTTPS rather than just TCP, so also >> check whether some proxy/load balancer between data centers is causing it >> to be a single connection per operation. That will *kill* performance. >> Some proxies default to HTTP/1.0 (open, send request, server send >> respo

Re: CDCR performance issues

2018-03-23 Thread Susheel Kumar
AM, Davis, Daniel (NIH/NLM) [C] < > > > > daniel.da...@nih.gov> wrote: > > > > >>> > > > > >>> These are general guidelines, I've done loads of networking, but > > may > > > > be less familiar with SolrCloud and CDC

Re: CDCR performance issues

2018-03-23 Thread Amrit Sarkar
centers using ping or > TCP > > > ping. Throughput tests may be high, but if Solr has to wait for a > > > response to a request before sending the next action, then just like > any > > > network protocol that does that, it will get slow. > > > >>&g

Re: CDCR performance issues

2018-03-23 Thread Susheel Kumar
st like any > > network protocol that does that, it will get slow. > > >>> > > >>> I'm pretty sure CDCR uses HTTP/HTTPS rather than just TCP, so also > > check whether some proxy/load balancer between data centers is causing it > > to be a single

Re: CDCR performance issues

2018-03-23 Thread Amrit Sarkar
hat will *kill* performance. > Some proxies default to HTTP/1.0 (open, send request, server send > response, close), and that will hurt. > >>> > >>> Why you should listen to me even without SolrCloud knowledge - > checkout paper "Latency performance of SOAP I

Re: CDCR performance issues

2018-03-12 Thread Tom Peters
roxies default to HTTP/1.0 (open, send request, server send response, >>> close), and that will hurt. >>> >>> Why you should listen to me even without SolrCloud knowledge - checkout >>> paper "Latency performance of SOAP Implementations". Same distr

Re: CDCR performance issues

2018-03-12 Thread Tom Peters
aper "Latency performance of SOAP Implementations". Same distribution of >> skills - I knew TCP well, but Apache Axis 1.1 not so well. I still >> improved response time of Apache Axis 1.1 by 250ms per call with 1-line of >> code. >> >> -Or

Re: CDCR performance issues

2018-03-09 Thread Erick Erickson
ten to me even without SolrCloud knowledge - checkout >>> paper "Latency performance of SOAP Implementations". Same distribution of >>> skills - I knew TCP well, but Apache Axis 1.1 not so well. I still >>> improved response time of Apache Axis 1.1 by 250ms

Re: CDCR performance issues

2018-03-09 Thread john spooner
s [mailto:tpet...@synacor.com] Sent: Wednesday, March 7, 2018 6:19 PM To: solr-user@lucene.apache.org Subject: CDCR performance issues I'm having issues with the target collection staying up-to-date with indexing from the source collection using CDCR. This is what I'm getting back in

Re: CDCR performance issues

2018-03-09 Thread Tom Peters
- > From: Tom Peters [mailto:tpet...@synacor.com] > Sent: Wednesday, March 7, 2018 6:19 PM > To: solr-user@lucene.apache.org > Subject: CDCR performance issues > > I'm having issues with the target collection staying up-to-date with indexing > from the source collection

RE: CDCR performance issues

2018-03-09 Thread Davis, Daniel (NIH/NLM) [C]
ednesday, March 7, 2018 6:19 PM To: solr-user@lucene.apache.org Subject: CDCR performance issues I'm having issues with the target collection staying up-to-date with indexing from the source collection using CDCR. This is what I'm getting back in terms of OPS: curl -s 'solr2-

Re: CDCR performance issues

2018-03-08 Thread Tom Peters
So I'm continuing to look into this and not making much headway, but I have additional questions now as well. I restarted the nodes in the source data center to see if it would have any impact. It appeared to initiate another bootstrap with the target. The lag and queueSize were brought back do

CDCR performance issues

2018-03-07 Thread Tom Peters
I'm having issues with the target collection staying up-to-date with indexing from the source collection using CDCR. This is what I'm getting back in terms of OPS: curl -s 'solr2-a:8080/solr/mycollection/cdcr?action=OPS' | jq . { "responseHeader": { "status": 0, "Q

Performance issues with 'unique' function in json facets over a high cardinality field

2017-12-12 Thread alexpusch
Hi, I have a surprising performance issue with the 'unique' function in a JSON facet. My setup holds a large number of docs (~1B), but I only facet on the small result set of a query, just a few docs. The query itself returns as fast as expected, but when I try to do a unique cou

RE: LTR feature extraction performance issues

2017-10-31 Thread Brian Yee
/ LONDON) [mailto:cpoersc...@bloomberg.net] Sent: Tuesday, October 31, 2017 8:48 AM To: solr-user@lucene.apache.org Subject: RE: LTR feature extraction performance issues Hi Brian, I just tried to explore the scenario you describe with the techproducts example and am able to see what you see: # step 1

RE: LTR feature extraction performance issues

2017-10-31 Thread Christine Poerschke (BLOOMBERG/ LONDON)
-user@lucene.apache.org Subject: RE: LTR feature extraction performance issues Hi Alessandro, Unfortunately some of my most important features are query dependent. I think I found an issue though. I don't think my features are being inserted into the cache. Notice "cumulative_inserts:0&q

RE: LTR feature extraction performance issues

2017-10-30 Thread Brian Yee
ature extraction performance issues Hi Alessandro, Unfortunately some of my most important features are query dependent. I think I found an issue though. I don't think my features are being inserted into the cache. Notice "cumulative_inserts:0". There are a lot of lookups, but si

RE: LTR feature extraction performance issues

2017-10-24 Thread Brian Yee
ubject: Re: LTR feature extraction performance issues It strictly depends on the kind of features you are using. At the moment there is just one cache for all the features. This means that even if you have 1 query dependent feature and 100 document dependent feature, a different value for the quer

Re: LTR feature extraction performance issues

2017-10-23 Thread alessandro.benedetti
It strictly depends on the kind of features you are using. At the moment there is just one cache for all the features. This means that even if you have 1 query-dependent feature and 100 document-dependent features, a different value for the query-dependent one will invalidate the cache entry for the
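To make the point about a single shared cache concrete, here is a toy model in Python. It is not Solr's implementation: the class, the key layout and the feature names are invented for illustration. It only shows that when one cache entry covers all features for a document, any change in a query-dependent input produces a different key and therefore a miss, even though the document-dependent values are unchanged.

```python
class ToyFeatureCache:
    """Toy model: one cache entry per (document, all external feature inputs)."""

    def __init__(self):
        self.entries = {}
        self.hits = 0
        self.inserts = 0

    def get_features(self, doc_id, efi, compute):
        # The key includes every query-dependent input, so changing any of
        # them means the whole feature vector is recomputed.
        key = (doc_id, tuple(sorted(efi.items())))
        if key in self.entries:
            self.hits += 1
            return self.entries[key]
        self.inserts += 1
        self.entries[key] = compute(doc_id, efi)
        return self.entries[key]


def compute_features(doc_id, efi):
    """Hypothetical features: one query-dependent, one document-only."""
    return {"query_len": len(efi["user_query"]), "doc_static": doc_id * 2}


cache = ToyFeatureCache()
cache.get_features(1, {"user_query": "red sofa"}, compute_features)
cache.get_features(1, {"user_query": "red sofa"}, compute_features)   # hit
cache.get_features(1, {"user_query": "blue sofa"}, compute_features)  # miss: same doc, new query input
print(cache.hits, cache.inserts)  # prints "1 2"
```

In this toy model the hit rate is governed by how often the exact same query inputs recur for the same document, which is consistent with the advice that caching helps most when features are document-dependent.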

RE: LTR feature cache performance issues

2017-10-23 Thread Brian Yee
Has anyone had experience tuning feature caches? Do any of the values below look unreasonable? --Brian -Original Message- From: Brian Yee [mailto:b...@wayfair.com] Sent: Friday, October 20, 2017 1:41 PM To: solr-user@lucene.apache.org Subject: LTR feature extraction performance

LTR feature extraction performance issues

2017-10-20 Thread Brian Yee
I enabled LTR feature extraction and response times spiked. I suppose that was to be expected, but are there any tips regarding performance? I have the feature values cache set up as described in the docs: Do I simply have to wait for the cache to fill up and hope that response times go down?

Re: performance issues with geofilt

2015-02-25 Thread david.w.smi...@gmail.com
Okay. Just to re-emphasize something I said but which may not have been clear, it isn’t an either-or for filter & sort. Filter with the spatial field type that makes sense for filtering, sort (or boost) with the spatial field type that makes sense for sorting. RPT sucks for distance sorting, Lat
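A small sketch of what "filter with one field, sort with another" might look like as request parameters. The field names `geo_rpt` (an RPT field used for filtering) and `geo_latlon` (a LatLonType field used for sorting), the helper name, and the coordinates are hypothetical; `geofilt` and `geodist` are the standard Solr spatial query parser and function.

```python
from urllib.parse import urlencode


def build_geo_params(lat, lon, radius_km, filter_field="geo_rpt", sort_field="geo_latlon"):
    """Filter on one spatial field and distance-sort on another."""
    point = f"{lat},{lon}"
    return {
        "q": "*:*",
        # Filter: documents within radius_km of the point, using the RPT field.
        "fq": f"{{!geofilt sfield={filter_field} pt={point} d={radius_km}}}",
        # Sort: nearest first, using the single-valued LatLonType field.
        "sort": f"geodist({sort_field},{lat},{lon}) asc",
    }


geo_params = build_geo_params(52.52, 13.405, 10)
print(urlencode(geo_params))
```

Both fields would need to be populated with the same point at index time for this to give consistent results.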

AW: performance issues with geofilt

2015-02-25 Thread dirk.thalheim
Hello David, thanks for your answer. In the meantime I found the memory hint too in http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4#Sorting_and_Relevancy So maybe we switch to LatLonType for this kind of search. But RPT is also needed as we want to support search by arbitrary pol

Re: Solrcloud performance issues

2015-02-24 Thread longsan
Why do you use 15 replicas? More replicas make it slower. -- View this message in context: http://lucene.472066.n3.nabble.com/Solrcloud-performance-issues-tp4186035p4188738.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: performance issues with geofilt

2015-02-24 Thread david.w.smi...@gmail.com
Hi Dirk, The RPT field type can be used for distance sorting/boosting but it’s a memory pig when used as-such so don’t do it unless you have to. You only have to if you have a multi-valued point field. If you have single-valued, use LatLonType specifically for distance sorting. Your sample quer

performance issues with geofilt

2015-02-24 Thread dirk.thalheim
Hello, we are using solr 4.10.1. There are two cores for different use cases with around 20 million documents (location descriptions) per core. Each document has a geometry field which stores a point and a bbox field which stores a bounding box. Both fields are defined with: I'm currently try

Re: Solrcloud performance issues

2015-02-12 Thread Otis Gospodnetic
Hi, Did you say you have 150 servers in this cluster? And 10 shards for just 90M docs? If so, that 150 hosts sounds like too much for all other numbers I see here. I'd love to see some metrics here. e.g. what happens with disk IO around those commits? How about GC time/size info? Are JVM mem

Re: Solrcloud performance issues

2015-02-12 Thread Timothy Potter
Hi Vijay, We're working on SOLR-6816 ... would love for you to be a test site for any improvements we make ;-) Curious if you've experimented with changing the mergeFactor to a higher value, such as 25 and what happens if you set soft-auto-commits to something lower like 15 seconds? Also, make s

Solrcloud performance issues

2015-02-12 Thread Vijay Sekhri
Hi Erick, We have following configuration of our solr cloud 1. 10 Shards 2. 15 replicas per shard 3. 9 GB of index size per shard 4. a total of around 90 mil documents 5. 2 collection viz search1 serving live traffic and search 2 for indexing. We swap collection when indexing fin

RE: Solr performance issues

2014-12-29 Thread Toke Eskildsen
Mahmoud Almokadem [prog.mahm...@gmail.com] wrote: > I've the same index with a bit different schema and 200M documents, > installed on 3 r3.xlarge (30GB RAM, and 600 General Purpose SSD). The size > of index is about 1.5TB, have many updates every 5 minutes, complex queries > and faceting with resp

Re: Solr performance issues

2014-12-29 Thread Shawn Heisey
On 12/29/2014 12:07 PM, Mahmoud Almokadem wrote: > What do you mean with "important parts of index"? and how to calculate their > size? I have no formal education in what's important when it comes to doing a query, but I can make some educated guesses. Starting with this as a reference: http://

Re: Solr performance issues

2014-12-29 Thread Mahmoud Almokadem
Thanks Shawn. What do you mean with "important parts of index"? and how to calculate their size? Thanks, Mahmoud Sent from my iPhone > On Dec 29, 2014, at 8:19 PM, Shawn Heisey wrote: > >> On 12/29/2014 2:36 AM, Mahmoud Almokadem wrote: >> I've the same index with a bit different schema and

Re: Solr performance issues

2014-12-29 Thread Shawn Heisey
On 12/29/2014 2:36 AM, Mahmoud Almokadem wrote: > I've the same index with a bit different schema and 200M documents, > installed on 3 r3.xlarge (30GB RAM, and 600 General Purpose SSD). The size > of index is about 1.5TB, have many updates every 5 minutes, complex queries > and faceting with respon

Re: Solr performance issues

2014-12-29 Thread Mahmoud Almokadem
Thanks all. I've the same index with a bit different schema and 200M documents, installed on 3 r3.xlarge (30GB RAM, and 600 General Purpose SSD). The size of index is about 1.5TB, have many updates every 5 minutes, complex queries and faceting with response time of 100ms that is acceptable for us.

RE: Solr performance issues

2014-12-28 Thread Toke Eskildsen
Mahmoud Almokadem [prog.mahm...@gmail.com] wrote: > We've installed a cluster of one collection of 350M documents on 3 > r3.2xlarge (60GB RAM) Amazon servers. The size of index on each shard is > about 1.1TB and maximum storage on Amazon is 1 TB so we add 2 SSD EBS > General purpose (1x1TB + 1x500G

Re: Solr performance issues

2014-12-28 Thread Shawn Heisey
On 12/26/2014 7:17 AM, Mahmoud Almokadem wrote: > We've installed a cluster of one collection of 350M documents on 3 > r3.2xlarge (60GB RAM) Amazon servers. The size of index on each shard is > about 1.1TB and maximum storage on Amazon is 1 TB so we add 2 SSD EBS > General purpose (1x1TB + 1x500GB)

Re: Solr performance issues

2014-12-26 Thread Otis Gospodnetic
Likely lots of disk + network IO, yes. Put SPM for Solr on your nodes to double check. Otis > On Dec 26, 2014, at 09:17, Mahmoud Almokadem wrote: > > Dears, > > We've installed a cluster of one collection of 350M documents on 3 > r3.2xlarge (60GB RAM) Amazon servers. The size of index on eac

Solr performance issues

2014-12-26 Thread Mahmoud Almokadem
Dears, We've installed a cluster of one collection of 350M documents on 3 r3.2xlarge (60GB RAM) Amazon servers. The size of index on each shard is about 1.1TB and maximum storage on Amazon is 1 TB so we add 2 SSD EBS General purpose (1x1TB + 1x500GB) on each instance. Then we create logical volume

RE: SolrCloud performance issues regarding hardware configuration

2014-07-21 Thread Toke Eskildsen
search engn dev [sachinyadav0...@gmail.com] wrote: > Yes, You are right my facet queries are for text analytic purpose. Does this mean that facet calls are rare (at most one at a time)? > Users will send boolean and spatial queries. current performance for spatial > queries is 100qps with 150 con

Re: SolrCloud performance issues regarding hardware configuration

2014-07-20 Thread Himanshu Mehrotra
; > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/SolrCloud-performance-issues-regarding-hardware-configuration-tp4147843p4148222.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Himanshu Mehrotra Download Our App[imag

Re: SolrCloud performance issues regarding hardware configuration

2014-07-20 Thread search engn dev
l number shards -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-performance-issues-regarding-hardware-configuration-tp4147843p4148222.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Performance issues with facets and filter query exclusions

2014-07-18 Thread Hayden Muhl
That query is representative of some of the queries in my test, but I didn't notice any correlation between using the match all docs query and poor query performance. Here's another example of a query that took longer than expected. qt=en&q=dress green leather&fq=userId:(383)&fq={!tag=productR

RE: SolrCloud performance issues regarding hardware configuration

2014-07-18 Thread Toke Eskildsen
search engn dev [sachinyadav0...@gmail.com] wrote: > out of 700 million documents 95-97% values are unique approx. That's quite a lot. If you are not already using DocValues for that, you should do so. So, each shard handles ~175M documents. Even with DocValues, there is an overhead of just hav

Re: Performance issues with facets and filter query exclusions

2014-07-18 Thread Yonik Seeley
On Fri, Jul 18, 2014 at 2:10 PM, Hayden Muhl wrote: > I was doing some performance testing on facet queries and I noticed > something odd. Most queries tended to be under 500 ms, but every so often > the query time jumped to something like 5000 ms. > > q=*:*&fq={!tag=productBrandId}productBrandId:

Performance issues with facets and filter query exclusions

2014-07-18 Thread Hayden Muhl
I was doing some performance testing on facet queries and I noticed something odd. Most queries tended to be under 500 ms, but every so often the query time jumped to something like 5000 ms. q=*:*&fq={!tag=productBrandId}productBrandId:(156 1227)&facet.field={!ex=productBrandId}productBrandId&face

Re: SolrCloud performance issues regarding hardware configuration

2014-07-18 Thread Erick Erickson
> Above query throws OOM exception as soon as fire it to solr. > -- > View this message in context: > http://lucene.472066.n3.nabble.com/SolrCloud-performance-issues-regarding-hardware-configuration-tp4147843p4147871.html > Sent from the Solr - User mailing list archive at Nabble.com. >

RE: SolrCloud performance issues regarding hardware configuration

2014-07-18 Thread search engn dev
: http://lucene.472066.n3.nabble.com/SolrCloud-performance-issues-regarding-hardware-configuration-tp4147843p4147871.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: SolrCloud performance issues regarding hardware configuration

2014-07-18 Thread Toke Eskildsen
From: search engn dev [sachinyadav0...@gmail.com]: > 1 collection : 4 shards : each shard has one master and one replica > total documents : 700 million Are you using DocValues for your facet fields? What is the approximate number of unique values in your facets and what is their type (string, nu

SolrCloud performance issues regarding hardware configuration

2014-07-17 Thread search engn dev
with 32 gb ram each? -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-performance-issues-regarding-hardware-configuration-tp4147843.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Investigating performance issues in solr cloud

2014-04-08 Thread Shawn Heisey
On 4/8/2014 6:48 PM, Utkarsh Sengar wrote: > 1. I am using Oracle JVM > user@host:~$ java -version > java version "1.6.0_45" > Java(TM) SE Runtime Environment (build 1.6.0_45-b06) > Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode) That version should be very good, until you need to

Re: Investigating performance issues in solr cloud

2014-04-08 Thread Utkarsh Sengar
1. I am using Oracle JVM user@host:~$ java -version java version "1.6.0_45" Java(TM) SE Runtime Environment (build 1.6.0_45-b06) Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode) 2. I will try out jHiccup and your GC settings. 3. Yes, I am running ZK instances in an ensemble. I didn

Re: Investigating performance issues in solr cloud

2014-04-08 Thread Shawn Heisey
On 4/8/2014 6:00 PM, Utkarsh Sengar wrote: > Lots of questions indeed :) > > 1. Total virtual machines: 3 > 2. Replication factor: 0 (don't have any replicas yet) > 3. Each machine has 1 shard which has 20GB of data. So data for a > collection is spread across 3 machines totalling to 60GB > 4. Sta

Re: Investigating performance issues in solr cloud

2014-04-08 Thread Utkarsh Sengar
Lots of questions indeed :) 1. Total virtual machines: 3 2. Replication factor: 0 (don't have any replicas yet) 3. Each machine has 1 shard which has 20GB of data. So data for a collection is spread across 3 machines totalling to 60GB 4. Start solr: java -Xmx1m -javaagent:newrelic/newre

Re: Investigating performance issues in solr cloud

2014-04-08 Thread Shawn Heisey
On 4/8/2014 5:30 PM, Utkarsh Sengar wrote: > I see a sudden drop in throughput once every 3-4 days. The "downtime" is for > about 2-6 minutes and things stabilize after that. > > But I am not sure what is causing the problem. > > I have 3 shards with 20GB of data on each shard. > Solr dashboard:

Investigating performance issues in solr cloud

2014-04-08 Thread Utkarsh Sengar
I see a sudden drop in throughput once every 3-4 days. The "downtime" is for about 2-6 minutes and things stabilize after that. But I am not sure what is causing the problem. I have 3 shards with 20GB of data on each shard. Solr dashboard: http://i.imgur.com/6RWT2Dj.png Newrelic graphs when durin

Re: Solr 1.4 - Performance Issues

2013-11-05 Thread Erick Erickson
1.4 is ancient, but you know that already :) Anyway, what are your autocommit settings? That vintage of Solr blocks indexing when committing which may include rewriting the entire index. So part of your regular slowdown is likely segment merging happening with the commit. The 14 hour cycle is
