Hi,

If this does not currently exist, do we need to raise a feature request
for it?
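
Until something built in is confirmed, a client-side walk can approximate
the size of one subtree. Below is a minimal sketch, assuming a Python client
whose get(path) returns a (data, stat) pair and whose get_children(path)
returns child names (kazoo's KazooClient has this shape). The names
subtree_data_size and InMemoryClient are hypothetical and only for
illustration. It counts data payload bytes only, not path names or per-znode
metadata, so the result will not match zk_approximate_data_size exactly.

```python
def subtree_data_size(client, path):
    """Return (total_bytes, znode_count) for `path` and all its descendants.

    Assumes `client.get(path)` returns (data, stat) and
    `client.get_children(path)` returns a list of child names.
    """
    total_bytes = 0
    znode_count = 0
    stack = [path]  # iterative walk, so deep trees cannot hit the recursion limit
    while stack:
        current = stack.pop()
        data, _stat = client.get(current)
        total_bytes += len(data or b"")  # a znode may carry no data at all
        znode_count += 1
        prefix = current.rstrip("/")  # "" for the root, so children become "/name"
        for child in client.get_children(current):
            stack.append(prefix + "/" + child)
    return total_bytes, znode_count


class InMemoryClient:
    """Hypothetical stand-in for a real client, for trying the walk offline."""

    def __init__(self, nodes):
        self.nodes = nodes  # maps absolute path -> bytes (or None)

    def get(self, path):
        return self.nodes[path], None

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        return [
            p[len(prefix):]
            for p in self.nodes
            if p != path and p.startswith(prefix) and "/" not in p[len(prefix):]
        ]


if __name__ == "__main__":
    demo = InMemoryClient({
        "/": None,
        "/overseer": b"ab",
        "/overseer/async_ids": b"",
        "/overseer/async_ids/1": b"12345",
        "/storm": b"xyz",
    })
    print(subtree_data_size(demo, "/overseer"))
```

Against a live ensemble the same function could be passed a started
KazooClient instead of the stand-in. The walk issues two requests per znode
and is not atomic, so znodes created or deleted during the walk can skew the
numbers or raise a no-node error that the caller would need to handle.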

Regards,
Aishwarya

On Sat, Jul 15, 2023 at 4:59 PM Aishwarya Soni <[email protected]>
wrote:

> We use Apache Storm and SolrCloud, along with a few other clients that
> store data in ZooKeeper. Because of an open SolrCloud bug,
> https://issues.apache.org/jira/browse/SOLR-16415, async_ids are not
> deleted automatically, which keeps increasing the size of the overseer
> znode in ZooKeeper. Storm also stores its topology jars (25+ topologies)
> in ZooKeeper, and these are large as well. These two are the heaviest
> clients (other than a few Spring Boot microservices) contributing to
> ZooKeeper znode size.
>
> I want to see how we can get the active size of a specific znode so
> that we can monitor it and set the jute.maxbuffer value accordingly.
>
> I know ZooKeeper does not behave well when large amounts of data are
> stored in it, but setting that aside, how can we get the znode size info?
>
> Regards,
> Aishwarya Soni
>
> On Sat, Jul 15, 2023 at 2:24 AM Steph van Schalkwyk <
> [email protected]> wrote:
>
>> Take a look in the code repo. It should be a simple pull.
>> S
>>
>> On Fri, Jul 14, 2023 at 3:23 PM Ruel, Ryan <[email protected]>
>> wrote:
>>
>> > We have an application where the size of individual ZNodes is small
>> > (a few KB typically), however our data is distributed in the tree such
>> > that we can have many sub nodes (10s of thousands, in some cases).
>> >
>> > When running the ZK CLI tool to view our data, I was surprised to see
>> > that we started to get IOExceptions for exceeding the 1MB
>> > jute.maxbuffer.
>> >
>> > We've gotten around this by increasing the max buffer size to 10MB,
>> > but it wasn't clear to me whether the ZNode allowed data size is
>> > impacted by the number of sub nodes, or if this buffer size is just
>> > reused in various places in the client code.
>> >
>> > ZK seems to operate just fine with these large numbers of sub nodes,
>> > it's just the client tool that was complaining when trying to list sub
>> > nodes.
>> >
>> > /Ryan
>> >
>> > On 7/14/23, 3:01 PM, "Steph van Schalkwyk" <[email protected]> wrote:
>> >
>> >
>> > To your last point - ZK was designed to distribute small packets,
>> > hence the 1M buffer.
>> > I've had a client who had a Solr connector that kept on creating new
>> > fields from different sources, and the Solr schema quickly grew to 4M.
>> > That's about the biggest I've seen ZK operate reliably.
>> >
>> >
>> > On Fri, Jul 14, 2023 at 1:09 PM Aishwarya Soni <[email protected]>
>> > wrote:
>> >
>> >
>> > > Hi,
>> > >
>> > > I want to find the current size/memory of a znode, i.e. how much
>> > > it is using including all its child znodes. I know
>> > > *zk_approximate_data_size* is the approximate memory consumption for
>> > > ALL znodes stored in the ZooKeeper ensemble. But I need to find the
>> > > active size of one specific znode out of multiple znodes.
>> > >
>> > > How can we get it?
>> > >
>> > > Also, what is the safe max value we can assign to jute.maxbuffer? I am
>> > > seeing a packet length of 1 GB coming from a couple of clients, and
>> > > those requests fail with an IOException because jute.maxbuffer is set
>> > > to the default value of 1MB.
>> > >
>> > > Regards,
>> > > Aishwarya
>> > >
>> >
>> >
>> >
>> >
>>
>
