I think 70 GB is too large for a single shard.
How much memory does the system have? If Solr does not have sufficient
memory to keep the index in memory, it will only be able to use the amount
of memory defined for your Solr caches.
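For reference, those cache sizes live in solrconfig.xml. A minimal sketch
(the cache classes and size values below are just placeholders to tune for
your heap):

    <query>
      <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
      <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
      <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
    </query>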

Even though you are on HDFS, Solr performance will be really bad if it has
to do disk IO at query time.
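Also note that Solr on HDFS relies on its own off-heap block cache rather
than the OS page cache, so it is worth checking that the block cache is
enabled and sized for your index. A rough sketch of the startup properties
involved (the slab count and direct memory size here are only placeholder
numbers to adjust for your hardware):

    -Dsolr.hdfs.blockcache.enabled=true
    -Dsolr.hdfs.blockcache.direct.memory.allocation=true
    -Dsolr.hdfs.blockcache.slab.count=40
    -Dsolr.hdfs.blockcache.blocksperbank=16384
    -XX:MaxDirectMemorySize=6g

With the default 8 KB block size that works out to roughly 5 GB of off-heap
cache per node, so adjust the numbers to what your machines can spare.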

The best option for you is to shard it across at least 8-10 nodes and
create an appropriate number of replicas according to your read traffic.
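If you keep the same collection, each existing shard can be split in place
with the Collections API, or you can create a new collection with more
shards and reindex. A minimal sketch (host, collection and shard names are
placeholders):

    curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1'

    curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection_v2&numShards=10&replicationFactor=2&maxShardsPerNode=1'

Keep in mind that SPLITSHARD needs extra disk space for the shard being
split while it runs, so make sure there is headroom on HDFS.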

Regards,
Piyush

On Fri, Dec 16, 2016 at 12:15 PM, Reth RM <reth.ik...@gmail.com> wrote:

> I think the shard index size is huge and should be split.
>
> On Wed, Dec 14, 2016 at 10:58 AM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
>
> > Hi everyone,
> >
> > I am running Solr 5.5.0 on HDFS. It is a SolrCloud cluster of 50 nodes and
> > I have the following config:
> > maxShardsPerNode: 1
> > replicationFactor: 1
> >
> > I have been ingesting data into Solr for the last 3 months. As the data
> > grows, I am observing an increase in query time. Currently the size of
> > my indices is 70 GB per shard (i.e. per node).
> >
> > I am using the cursor approach (/export handler) with the SolrJ client to
> > get results back from Solr. All the fields I am querying on and all the
> > fields that I get back from Solr are indexed and have docValues enabled
> > as well. What could be the reason behind the increase in query time?
> >
> > Has this got something to do with the OS disk cache that is used for
> > loading the Solr indices? When a query is fired, will Solr wait until all
> > 70 GB of the index are available in the disk cache before it can load the
> > index files?
> >
> > Thanks!
> >
>
