Hi Jonathan,
Merry Christmas.
Thanks for the suggestion. To manage IOPS, can we do something on the
rate-limiting side instead?
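
Incidentally, the zonal SSD limits quoted below scale linearly with disk size
on GCP (roughly 30 IOPS/GB and 0.48 MB/s per GB, read and write each, per the
pd-ssd documentation); a quick back-of-the-envelope sketch, ignoring the
additional per-machine-type ceilings:

```python
# Back-of-the-envelope limits for a GCP zonal SSD persistent disk (pd-ssd).
# Per-GB rates are GCP's published pd-ssd figures; actual ceilings also
# depend on the VM's machine type and vCPU count, which this sketch ignores.

READ_IOPS_PER_GB = 30
WRITE_IOPS_PER_GB = 30
THROUGHPUT_MBPS_PER_GB = 0.48  # MB/s per GB, read and write each

def pd_ssd_limits(size_gb):
    """Return (read IOPS, write IOPS, MB/s each way) for a pd-ssd of size_gb."""
    return (size_gb * READ_IOPS_PER_GB,
            size_gb * WRITE_IOPS_PER_GB,
            size_gb * THROUGHPUT_MBPS_PER_GB)

# The 200 GB disk from the thread below:
read_iops, write_iops, mbps = pd_ssd_limits(200)
print(read_iops, write_iops, mbps)  # 6000 6000 96.0
```

So sizing the disk up is one lever for more IOPS even when the extra capacity
itself isn't needed.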

Regards,
Abhishek


On Thu, Dec 17, 2020 at 5:07 AM Jonathan Tan <jty....@gmail.com> wrote:

> Hi Abhishek,
>
> We're running Solr Cloud 8.6 on GKE.
> 3 node cluster, running 4 cpus (configured) and 8gb of min & max JVM
> configured, all with anti-affinity so they never exist on the same node.
> It's got 2 collections of ~13 documents each, 6 shards, 3 replicas each,
> disk usage on each node is ~54gb (we've got all the shards replicated to
> all nodes)
>
> We're also using a 200gb zonal SSD, which *has* been necessary just so that
> we've got the right IOPS & bandwidth. (That's approximately 6000 IOPS for
> read & write each, and 96MB/s for read & write each)
>
> Various lessons learnt...
> You definitely don't want them ever on the same kubernetes node. From a
> resilience perspective, yes, but also when one SOLR node gets busy, they
> tend to all get busy, so now you'll have resource contention. Recovery can
> also get very busy and resource intensive, and again, sitting on the same
> node is problematic. We also saw the need to move to SSDs because of how
> IOPS bound we were.
>
> Did I mention use SSDs? ;)
>
> Good luck!
>
> On Mon, Dec 14, 2020 at 5:34 PM Abhishek Mishra <solrmis...@gmail.com>
> wrote:
>
> > Hi Houston,
> > Sorry for the late reply. Each shard is around 9GB in size.
> > Yeah, we are providing enough resources to the pods. We are currently
> > using c5.4xlarge.
> > Xms and Xmx are both 16GB. The machine has 32GB of memory and 16 cores.
> > No, I haven't run it outside Kubernetes. But I do have colleagues who did
> > the same on 7.2 and didn't face any issue regarding it.
> > Storage volume is gp2 50GB.
> > It's not the search queries where we are facing inconsistencies or
> > timeouts; some internal admin APIs sometimes have issues, so adding a new
> > replica to the cluster sometimes results in inconsistencies, e.g. recovery
> > taking more than an hour.
> >
> > Regards,
> > Abhishek
> >
> > On Thu, Dec 10, 2020 at 10:23 AM Houston Putman <houstonput...@gmail.com>
> > wrote:
> >
> > > Hello Abhishek,
> > >
> > > It's really hard to provide any advice without knowing any information
> > > about your setup/usage.
> > >
> > > Are you giving your Solr pods enough resources on EKS?
> > > Have you run Solr in the same configuration outside of kubernetes in
> > > the past without timeouts?
> > > What type of storage volumes are you using to store your data?
> > > Are you using headless services to connect your Solr Nodes, or
> > > ingresses?
> > >
> > > If this is the first time that you are using this data + Solr
> > > configuration, maybe it's just that your data within Solr isn't
> > > optimized for the type of queries that you are doing.
> > > If you have run it successfully in the past outside of Kubernetes,
> > > then I would look at the resources that you are giving your pods and
> > > the storage volumes that you are using.
> > > If you are using Ingresses, that might be causing slow connections
> > > between nodes, or between your client and Solr.
> > >
> > > - Houston
> > >
> > > On Wed, Dec 9, 2020 at 3:24 PM Abhishek Mishra <solrmis...@gmail.com>
> > > wrote:
> > >
> > > > Hello guys,
> > > > We are facing some issues (like timeouts, etc.) which are very
> > > > inconsistent. By any chance, could this be related to EKS? We are
> > > > using Solr 7.7 and ZooKeeper 3.4.13. Should we move to ECS?
> > > >
> > > > Regards,
> > > > Abhishek
> > > >
> > >
> >
>
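
A minimal sketch of the "never on the same kubernetes node" lesson above, as a
Kubernetes pod-template fragment (the `app: solr` label is an assumption;
adjust it to match your own pod labels):

```yaml
# Hypothetical pod template fragment: require that no two Solr pods
# are scheduled onto the same Kubernetes node.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: solr  # assumed label; use whatever your Solr pods carry
        topologyKey: kubernetes.io/hostname
```

With `required...` (rather than `preferred...`) scheduling fails outright if
there are fewer nodes than Solr pods, which is usually what you want here.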
