After restarting Solr and doing a couple of queries to warm the caches, are
queries already slow/failing, or does it take some time and a number of
queries before failures start occurring?
One possibility is that you simply need a lot more memory for caches for this
amount of data, in which case the failures may be caused by heavy garbage
collections. After restarting Solr, check how much Java heap is available,
run some warming queries, then check the available Java heap again.
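To make that before/after heap check concrete, here is a minimal sketch in Python. The numbers below are made-up samples; in practice they would come from Solr's admin system-info handler (e.g. /solr/admin/info/system?wt=json), whose JVM memory section may vary in field names by Solr version.

```python
# Sketch: compare JVM heap usage before and after cache-warming queries.
# The dicts below are hypothetical samples; real values would come from
# Solr's system-info handler (field names may differ by version).

def used_heap_mb(info):
    """Extract used heap in MB from a (sample) system-info response."""
    mem = info["jvm"]["memory"]["raw"]
    return mem["used"] / (1024 * 1024)

before = {"jvm": {"memory": {"raw": {"used": 1610612736,     # ~1.5 GB
                                     "max": 6442450944}}}}   # 6 GB heap
after = {"jvm": {"memory": {"raw": {"used": 5905580032,      # ~5.5 GB
                                    "max": 6442450944}}}}

growth = used_heap_mb(after) - used_heap_mb(before)
max_mb = after["jvm"]["memory"]["raw"]["max"] / (1024 * 1024)
headroom = max_mb - used_heap_mb(after)
print("heap grew by %.0f MB during warming; %.0f MB headroom left"
      % (growth, headroom))
```

If warming alone consumes most of the heap, the collector has little room left and long GC pauses (and client-side timeouts) become likely.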
Add the debugQuery=true parameter to your queries and look at the timings to
see what phases of query processing are taking the most time. Also check
whether the reported QTime seems to match actual wall clock time; sometimes
formatting of the results and network transfer time can dwarf actual query
time.
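As an illustration of reading those timings, here is a small Python sketch. The timing dict is a hypothetical sample shaped like the debug "timing" section Solr returns, where each search component reports its milliseconds under "prepare" and "process"; real values come from your own debugQuery=true response.

```python
# Sketch: rank query-processing phases from a debugQuery=true response.
# The "timing" dict is a made-up sample in the shape Solr reports under
# debug/timing; substitute the section from a real query response.

timing = {
    "time": 940.0,
    "prepare": {"time": 15.0,
                "query": {"time": 10.0}, "facet": {"time": 5.0}},
    "process": {"time": 925.0,
                "query": {"time": 120.0}, "facet": {"time": 700.0},
                "debug": {"time": 105.0}},
}

def slowest_components(timing):
    """Return (phase, component, ms) tuples sorted slowest-first."""
    rows = []
    for phase in ("prepare", "process"):
        for name, info in timing[phase].items():
            if isinstance(info, dict):   # skip the phase's own "time" entry
                rows.append((phase, name, info["time"]))
    return sorted(rows, key=lambda r: r[2], reverse=True)

for phase, name, ms in slowest_components(timing):
    print("%-8s %-8s %6.1f ms" % (phase, name, ms))
```

In this sample, faceting in the process phase dominates, which would point tuning effort at the facet fields rather than the base query.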
How many fields are you returning on a typical query?
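If many stored fields are being returned, limiting the field list with Solr's fl parameter is worth trying. A quick sketch of building such a request URL (the host, collection name, and field names here are hypothetical examples):

```python
# Sketch: build a query URL that returns only a few fields via fl.
# Host, collection, and field names are hypothetical examples.
from urllib.parse import urlencode

params = {
    "q": "category:books",
    "fl": "id,title,price",  # return only these fields, not all ~1000
    "rows": 10,
}
url = "http://localhost:8983/solr/collection1/select?" + urlencode(params)
print(url)
```

Returning fewer fields reduces both response-formatting work and network transfer, which, as noted above, can dwarf the query time itself.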
-- Jack Krupansky
-----Original Message-----
From: Suryansh Purwar
Sent: Monday, July 22, 2013 11:06 PM
To: solr-user@lucene.apache.org ; j...@basetechnology.com
Subject: how number of indexed fields effect performance
It was running fine initially, when we had only around 100 fields
indexed. In this case as well it runs fine at first, but after some time a
broken-pipe exception starts occurring, which results in the shard going down.
Regards,
Suryansh
On Tuesday, July 23, 2013, Jack Krupansky wrote:
Was all of this running fine previously and only started running slow
recently, or is this your first measurement?
Are very simple queries (single keyword, no filters or facets or sorting
or anything else, and returning only a few fields) working reasonably
well?
-- Jack Krupansky
-----Original Message-----
From: Suryansh Purwar
Sent: Monday, July 22, 2013 4:07 PM
To: solr-user@lucene.apache.org
Subject: how number of indexed fields effect performance
Hi,
We have a two-shard SolrCloud cluster, with each shard allocated 3 separate
machines. We do complex queries involving a number of filter queries
coupled with group queries and faceting. All of our machines are 64-bit
with 32 GB of RAM. Our index size is around 10 GB, with around 800,000
documents. We have around 1000 indexed fields per document. 6 GB of memory
is allocated to Tomcat, under which Solr is running, on each of the six
machines. We have a ZooKeeper ensemble consisting of 3 ZooKeeper instances
running on 3 of the six machines, with 4 GB of memory allocated to each
ZooKeeper instance. First, Solr starts taking too much time, with "Broken
pipe exception because of timeout from client side" coming again and again;
then after some time a whole shard goes down, one machine at a time,
followed by the other machines. Is having 1000 fields indexed with each
document resulting in this problem? If so, what would be the ideal number of
indexed fields in such an environment?
Regards,
Suryansh