Hi Jonathan,
Merry Christmas, and thanks for the suggestion. To manage IOPS, is there
anything we can do on the rate-limiting side?
Regards,
Abhishek

On Thu, Dec 17, 2020 at 5:07 AM Jonathan Tan <jty....@gmail.com> wrote:
> Hi Abhishek,
>
> We're running Solr Cloud 8.6 on GKE.
> 3-node cluster, running 4 CPUs (configured) and 8GB of min & max JVM heap
> configured, all with anti-affinity so they never exist on the same node.
> It's got 2 collections of ~13documents each, 6 shards, 3 replicas each;
> disk usage on each node is ~54GB (we've got all the shards replicated to
> all nodes).
>
> We're also using a 200GB zonal SSD, which *has* been necessary just so
> that we've got the right IOPS & bandwidth. (That's approximately 6000
> IOPS for read & write each, and 96MB/s for read & write each.)
>
> Various lessons learnt...
> You definitely don't want them ever on the same Kubernetes node. From a
> resilience perspective, yes, but also when one Solr node gets busy, they
> all tend to get busy, so now you'll have resource contention. Recovery
> can also get very busy and resource-intensive, and again, sitting on the
> same node is problematic. We also saw the need to move to SSDs because of
> how IOPS-bound we were.
>
> Did I mention use SSDs? ;)
>
> Good luck!
>
> On Mon, Dec 14, 2020 at 5:34 PM Abhishek Mishra <solrmis...@gmail.com>
> wrote:
>
> > Hi Houston,
> > Sorry for the late reply. Each shard is around 9GB.
> > Yeah, we are providing enough resources to the pods. We are currently
> > using c5.4xlarge.
> > Xms and Xmx are 16GB. The machine has 32GB and 16 cores.
> > No, I haven't run it outside Kubernetes, but I do have colleagues who
> > ran the same setup on 7.2 and didn't face any issues with it.
> > Storage volume is gp2, 50GB.
> > It's not the search queries where we are facing inconsistencies or
> > timeouts; some internal admin APIs seem to have issues at times. So
> > adding a new replica to the cluster sometimes results in
> > inconsistencies, like recovery taking more than an hour.
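The anti-affinity Jonathan describes can be sketched as a pod-template
fragment like the one below. This is an illustrative sketch, not his actual
manifest: the `app: solr` label and the surrounding StatefulSet are
assumptions.

```yaml
# Hypothetical fragment of a Solr StatefulSet pod template.
# "required" anti-affinity tells the scheduler to never place two
# Solr pods on the same Kubernetes node (matched by hostname).
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: solr            # assumed pod label
        topologyKey: kubernetes.io/hostname
```

With `required...` scheduling, a pod stays Pending if no eligible node is
free; `preferredDuringSchedulingIgnoredDuringExecution` is the softer
alternative if the cluster can't always guarantee a spare node.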
> >
> > Regards,
> > Abhishek
> >
> > On Thu, Dec 10, 2020 at 10:23 AM Houston Putman
> > <houstonput...@gmail.com> wrote:
> >
> > > Hello Abhishek,
> > >
> > > It's really hard to provide any advice without knowing any
> > > information about your setup/usage.
> > >
> > > Are you giving your Solr pods enough resources on EKS?
> > > Have you run Solr in the same configuration outside of Kubernetes in
> > > the past without timeouts?
> > > What type of storage volumes are you using to store your data?
> > > Are you using headless services to connect your Solr nodes, or
> > > ingresses?
> > >
> > > If this is the first time that you are using this data + Solr
> > > configuration, maybe it's just that your data within Solr isn't
> > > optimized for the type of queries that you are doing.
> > > If you have run it successfully in the past outside of Kubernetes,
> > > then I would look at the resources that you are giving your pods and
> > > the storage volumes that you are using.
> > > If you are using ingresses, that might be causing slow connections
> > > between nodes, or between your client and Solr.
> > >
> > > - Houston
> > >
> > > On Wed, Dec 9, 2020 at 3:24 PM Abhishek Mishra
> > > <solrmis...@gmail.com> wrote:
> > >
> > > > Hello guys,
> > > > We are facing some issues (like timeouts, etc.) which are very
> > > > inconsistent. Could they by any chance be related to EKS? We are
> > > > using Solr 7.7 and ZooKeeper 3.4.13. Should we move to ECS?
> > > >
> > > > Regards,
> > > > Abhishek
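On Houston's headless-service question: a headless Service (one with
`clusterIP: None`) makes DNS resolve to the individual pod IPs, so Solr
nodes talk to each other directly instead of going through a load balancer
or ingress hop. A minimal sketch, assuming pods labeled `app: solr` and
Solr's default port 8983; the service name is hypothetical:

```yaml
# Hypothetical headless Service for SolrCloud inter-node traffic.
# clusterIP: None disables the virtual IP; DNS returns the pod IPs,
# giving each pod a stable per-pod DNS name for direct connections.
apiVersion: v1
kind: Service
metadata:
  name: solr-headless      # assumed name
spec:
  clusterIP: None
  selector:
    app: solr              # assumed pod label
  ports:
    - name: solr
      port: 8983
```

Ingresses are meant for external client traffic; routing node-to-node
requests through one adds latency and another point of failure, which is
why Houston flags it as a possible cause of slow connections.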