Hi,

Do we need to raise a feature request for this if currently it does not exist?

Regards,
Aishwarya

On Sat, Jul 15, 2023 at 4:59 PM Aishwarya Soni <[email protected]> wrote:

> We use Apache Storm and Solrcloud along with a few clients that store data
> in the zookeeper. Due to one open bug in solrcloud
> https://issues.apache.org/jira/browse/SOLR-16415, we see async_ids not
> being deleted automatically. This is increasing the size of overseer znode
> in zookeeper. Also, storm stores topologies jars (25+ topologies) inside
> zookeeper which are also heavy in size. These two are the top 2 heavy
> clients (other than a few other springboot microservices) that increase
> zookeeper znode size.
>
> I want to see how we can get the active size of a specific znode so
> that we can monitor it and also set the jute.maxbuffer value accordingly.
>
> I know zookeeper does not behave well with huge data being stored
> inside it, but ignoring that fact, how can we get the znode size info?
>
> Regards,
> Aishwarya Soni
>
> On Sat, Jul 15, 2023 at 2:24 AM Steph van Schalkwyk <[email protected]>
> wrote:
>
>> Take a look in the code repo. should be a simple pull.
>> S
>>
>> On Fri, Jul 14, 2023 at 3:23 PM Ruel, Ryan <[email protected]> wrote:
>>
>> > We have an application where the size of individual ZNodes is small
>> > (a few KB typically), however our data is distributed in the tree
>> > such that we can have many sub nodes (10s of thousands, in some
>> > cases).
>> >
>> > When running the ZK CLI tool to view our data, I was surprised to see
>> > that we started to get IOExceptions for exceeding the 1MB
>> > jute.maxbuffer.
>> >
>> > We've gotten around this by increasing the max buffer size to 10MB,
>> > but it wasn't clear to me whether the ZNode allowed data size is
>> > impacted by the number of sub nodes, or if this buffer size is just
>> > reused in various places in the client code.
>> >
>> > ZK seems to operate just fine with these large numbers of sub nodes,
>> > it's just the client tool that was complaining when trying to list
>> > sub nodes.
>> >
>> > /Ryan
>> >
>> > On 7/14/23, 3:01 PM, "Steph van Schalkwyk" <[email protected]> wrote:
>> >
>> > To your last point - ZK was designed to distribute small packets,
>> > hence the 1M buffer.
>> > I've had a client who had a Solr connector that kept on creating new
>> > fields from different sources, and the Solr schema quickly grew to
>> > 4M. That's about the biggest I've seen ZK operate reliably.
>> >
>> > On Fri, Jul 14, 2023 at 1:09 PM Aishwarya Soni <[email protected]>
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > I want to find what is the current size/memory of a znode, i.e. how
>> > > much its utilizing including all its child znodes. I know
>> > > *zk_approximate_data_size* is the approximate memory consumption
>> > > for ALL znodes stored in the ZooKeeper ensemble. But I need to find
>> > > the active size of a specific znode out of multiple znodes.
>> > >
>> > > How can we get it?
>> > >
>> > > Also, what is the safe max value we can assign to jute.maxbuffer? I
>> > > am seeing packet length of 1 GB coming from a couple of clients and
>> > > it is getting errored out with IOException due to jute.maxbuffer
>> > > set to the default value of 1MB.
>> > >
>> > > Regards,
>> > > Aishwarya
