Re: degrades qtime in a 20million doc collection

2016-01-15 Thread Anria B.
Thanks Toke for this. It gave us a ton to think about, and it really supports the notion of several smaller indexes over one very large one, where we can distribute a few smaller JVM processes rather than run one massive one that is, according to this, less efficient. Toke
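
For anyone weighing the same trade-off, a rough sketch of how the split could look (hostnames and core names below are made up, not our real setup): each smaller core lives in its own JVM, and a single distributed request fans out over all of them with the shards parameter, e.g.

    http://host1:8983/solr/docs_a/select?q=somefield:someval
        &shards=host1:8983/solr/docs_a,host2:8983/solr/docs_b,host3:8983/solr/docs_c
        &rows=10

Each shard then only has to search its own slice of the 20 million documents, and each JVM heap stays small enough that GC behaves better than in one huge process.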

Re: degrades qtime in a 20million doc collection

2016-01-15 Thread Anria B.
hi Yonik We definitely didn't overlook that q=* is a wildcard scan; we just had so many systemic problems to focus on that I neglected to thank Shawn for that particular piece of useful information. I must admit, I seriously never knew this. Ever since q=* was allowed I was so happy that it never
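
For anyone else who missed the distinction: q=* is expanded as a wildcard query over the terms of the default field(s), while q=*:* is the special match-all-documents query, so the two can differ hugely in cost on a large index. A quick way to compare them (collection name here is just a placeholder):

    # wildcard scan over the default field's terms -- expensive on a big index
    curl 'http://localhost:8983/solr/mycollection/select?q=*&rows=0'

    # match-all query -- cheap, no term expansion
    curl 'http://localhost:8983/solr/mycollection/select?q=*:*&rows=0'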

Re: degrades qtime in a 20million doc collection

2016-01-14 Thread Anria B.
hi all, We did try q=queryA AND queryB vs q=queryA&fq=queryB. For all tests we commented out caching and reloaded the core between queries, to be ultra sure that we are getting good comparisons on time. We have so many unique fq values and such frequent commits that caches are always invalidated, so our
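
Concretely, the two request forms we compared look like this (queryA and queryB stand in for our real clauses):

    # everything in q, scored as one Lucene BooleanQuery
    .../select?q=queryA AND queryB

    # main query plus a filter, normally answered from the filterCache
    .../select?q=queryA&fq=queryB

Since our filterCache is effectively useless with so many unique filters, one variation worth noting is marking the filter as non-caching with a local param, which Solr supports:

    .../select?q=queryA&fq={!cache=false}queryB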

Re: Can we create multiple cluster in single Zookeeper instance

2016-01-14 Thread Anria B.
hi Mugeesh It's best to use ZooKeeper as it was intended: install or run 3 of them independent of any Solr, then point Solr at the ZooKeeper ensemble. You can have just 1, but then if anything happens to that single ZooKeeper node, all of your Solr will be dead until you can properly revive
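
A minimal sketch of that setup (hostnames are placeholders): each of the three ZooKeeper nodes carries the same ensemble definition in its zoo.cfg,

    server.1=zk1.example.com:2888:3888
    server.2=zk2.example.com:2888:3888
    server.3=zk3.example.com:2888:3888

and every Solr node is started in cloud mode pointing at the whole ensemble rather than at a single ZooKeeper:

    bin/solr start -c -z zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181

With three nodes the ensemble keeps a quorum (2 of 3) even if one ZooKeeper dies, so Solr stays up.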

Re: degrades qtime in a 20million doc collection

2016-01-14 Thread Anria B.
Here are some actual examples, if it helps: wt=json&q=*:*&indent=on&fq=SolrDocumentType:"invalidValue"&sort=timestamp&start=0&rows=0&debug=timing { "responseHeader": { "status": 0, "QTime": 590, "params": { "q": "*:*", "debug": "timing", "indent": "on",

Re: degrades qtime in a 20million doc collection

2016-01-14 Thread Anria B.
hi Shawn Thanks for your comprehensive answers. I really appreciate it. Just for clarity, the numbers I posted here were from tests where we isolated only a single fq and a q. These do have good times, even though it's almost 600ms. Once we are in application mode, and other fq's and facets
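
To give a feel for the difference, the isolated test was one q plus one fq, while an application-style request stacks several filters and facets into a single call, roughly like this (the extra field names here are invented for illustration):

    .../select?q=*:*
        &fq=SolrDocumentType:"invalidValue"
        &fq=owner:someUser
        &fq=modified:[NOW-30DAYS TO NOW]
        &facet=true&facet.field=owner&facet.field=SolrDocumentType
        &rows=20

Each additional fq and facet adds work per request, and with commits arriving constantly, none of it is ever served from a warm cache.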

Re: degrades qtime in a 20million doc collection

2016-01-14 Thread Anria B.
Here is a stack trace from when we put a warming query in the autowarming, or in the "newSearcher" listener, to warm up the collection after a commit. 2016-01-12 19:00:13,216 [http-nio-19082-exec-25 vaultThreadId:http-STAGE-30518-14 vaultSessionId:1E53A095AD22704 vaultNodeId:nodeId:node-2 vaultInstanceId:2228
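
For reference, the warming entry in question is a newSearcher listener in solrconfig.xml along these lines (the warming query shown is illustrative, not our exact production one):

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">*:*</str>
          <str name="fq">SolrDocumentType:"invalidValue"</str>
          <str name="rows">0</str>
        </lst>
      </arr>
    </listener>

The stack trace above is produced when that warming query runs after a commit.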

Re: degrades qtime in a 20million doc collection

2016-01-13 Thread Anria B.
hi Shawn Thanks for the quick answer. As for the q=*, we also saw similar results in our testing when doing things like q=somefield:qval&fq=otherfield:fqval, which makes a pure Lucene query. I simplified things somewhat, since our results were always that as numFound got large, the query time

degrades qtime in a 20million doc collection

2016-01-13 Thread Anria B.
hi all, I have a really fun question to ask. I'm sitting here looking at what is by far the beefiest box I've ever seen in my life: 256GB of RAM, terabytes of disk space, the works, a properly partitioned Linux server. Yet what we are seeing goes against all the intuition I've built up