Hello,

I would like to follow up and call for input on this. Damien, as the
author of the PR, do you have any thoughts?

Please let me know if there is anything I can do to help move this forward.

Cheers,

Li

On Wed, Feb 8, 2023 at 1:08 PM Li Wang <li4w...@gmail.com> wrote:

> Thanks for the input, Enrico.
>
> On Wed, Feb 8, 2023 at 12:26 AM Enrico Olivelli <eolive...@gmail.com>
> wrote:
>
>> Li,
>>
>> On Wed, Feb 8, 2023 at 03:49, Li Wang <li4w...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> >
>> > We had a production outage due to the issue reported in
>> > https://issues.apache.org/jira/browse/ZOOKEEPER-4306 and some other
>> > users also ran into the same issue. I wonder if we can use this thread
>> > to discuss and come to a consensus on how to fix it. :-)
>> >
>> >
>> >
>> > Thanks Damien Diederen
>> > <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=ztzg> for
>> > the contribution and patch. Limiting the number of ephemeral nodes that
>> > can be created in a session looks like a simple and reasonable solution
>> > to me. Having a way to enforce it will protect the system from potential
>> > OOM issues.
>>
>> How does the client recover from having created too many ephemeral nodes?
>> This does not seem trivial. Let me share some ideas:
>>
>
> A new KeeperException/error code
> (i.e., TooManyEphemeralsException/TOOMANYEPHEMERALS) is introduced in the
> patch. Do you mean how old clients would handle the new error code?
>
>>
>> Solution one: fail the creation of the node
>> If we fail the creation of the node then the application will probably
>> enter a loop and continue to create it.
>> There is no way to say that some znode is "more important" than other
>> znodes, so the application will keep failing in the creation
>> of random znodes.
>>
>
> How about having a property that controls whether
> TooManyEphemeralsException is thrown in this case? Admins can enable the
> property after all client applications have upgraded to the new version
> and handle the new error code.
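
To make the property idea concrete, here is a minimal sketch of a per-session ephemeral counter whose rejection behaviour is gated by a flag. It is not ZooKeeper code: the class, the `Result` enum, and the property name `example.ephemeralLimit.rejectEnabled` are all hypothetical and only illustrate the rollout described above (ship the limit disabled, enable it once clients understand the new error).

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; none of these names exist in ZooKeeper. */
final class EphemeralLimitSketch {
    /** Stand-in for the server reply; TOO_MANY_EPHEMERALS mirrors the proposed new error code. */
    enum Result { OK, TOO_MANY_EPHEMERALS }

    private final int maxEphemeralsPerSession;
    private final boolean rejectEnabled;
    private final Map<Long, Integer> countsBySession = new HashMap<>();

    EphemeralLimitSketch(int maxEphemeralsPerSession, boolean rejectEnabled) {
        this.maxEphemeralsPerSession = maxEphemeralsPerSession;
        this.rejectEnabled = rejectEnabled;
    }

    /** Reads the hypothetical property; it defaults to false so old clients are unaffected. */
    static EphemeralLimitSketch fromSystemProperties(int maxEphemeralsPerSession) {
        boolean enabled = Boolean.getBoolean("example.ephemeralLimit.rejectEnabled");
        return new EphemeralLimitSketch(maxEphemeralsPerSession, enabled);
    }

    /** Counts one more ephemeral for the session, or rejects it if the limit is enforced. */
    Result createEphemeral(long sessionId) {
        int current = countsBySession.getOrDefault(sessionId, 0);
        if (rejectEnabled && current >= maxEphemeralsPerSession) {
            return Result.TOO_MANY_EPHEMERALS;
        }
        countsBySession.put(sessionId, current + 1);
        return Result.OK;
    }

    int count(long sessionId) {
        return countsBySession.getOrDefault(sessionId, 0);
    }

    public static void main(String[] args) {
        EphemeralLimitSketch enforced = new EphemeralLimitSketch(2, true);
        for (int i = 0; i < 3; i++) {
            System.out.println(enforced.createEphemeral(1L));
        }
    }
}
```

With the flag off the counter still tracks usage but never rejects, so the server stays unprotected until the admin flips it.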
>
>
>> Solution two: force-expire the session (and reset ephemeral nodes)
>> In this case some applications would probably recover in a better way
>> (ZK client applications are supposed to deal with session expiration
>> somehow), and some applications will auto-restart (because session
>> expiration is a symptom of network partition and suicide is the best
>> thing to do).
>> In any case the application will try to create the znodes, work for
>> some time, and then die again (or recreate the session).
>>
>
> Great idea! Forcing session expiration seems promising, as it addresses
> both of the following:
>
> 1.  It protects the server from the txn size overflowing.
> 2.  There is no backward compatibility issue to worry about, as we use an
> existing error code and client applications are supposed to handle the
> session expiration error.
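
A rough sketch of the forced-expiration approach, under the assumption that the server tracks ephemeral paths per session and expires the session as soon as a create would exceed the limit. This is a toy model, not ZooKeeper code: the class, the `Result` enum, and the method names are hypothetical. It shows why the close-session work stays bounded (a session never owns more than the limit) and why the client only ever sees an existing error.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; none of these names exist in ZooKeeper. */
final class ExpireOnOverflowSketch {
    /** Stand-in for the server reply; SESSION_EXPIRED models the existing error clients already handle. */
    enum Result { OK, SESSION_EXPIRED }

    private final int maxEphemeralsPerSession;
    private final Map<Long, Set<String>> ephemeralsBySession = new HashMap<>();
    private final Set<Long> expiredSessions = new HashSet<>();

    ExpireOnOverflowSketch(int maxEphemeralsPerSession) {
        this.maxEphemeralsPerSession = maxEphemeralsPerSession;
    }

    Result createEphemeral(long sessionId, String path) {
        if (expiredSessions.contains(sessionId)) {
            return Result.SESSION_EXPIRED;
        }
        Set<String> paths = ephemeralsBySession.computeIfAbsent(sessionId, id -> new HashSet<>());
        if (!paths.contains(path) && paths.size() >= maxEphemeralsPerSession) {
            // Over the limit: drop every ephemeral owned by the session and expire it.
            ephemeralsBySession.remove(sessionId);
            expiredSessions.add(sessionId);
            return Result.SESSION_EXPIRED;
        }
        paths.add(path);
        return Result.OK;
    }

    int ephemeralCount(long sessionId) {
        Set<String> paths = ephemeralsBySession.get(sessionId);
        return paths == null ? 0 : paths.size();
    }

    public static void main(String[] args) {
        ExpireOnOverflowSketch server = new ExpireOnOverflowSketch(2);
        System.out.println(server.createEphemeral(1L, "/a"));
        System.out.println(server.createEphemeral(1L, "/b"));
        System.out.println(server.createEphemeral(1L, "/c"));
    }
}
```

As Enrico notes, a client that simply reconnects and recreates the same znodes will hit the limit again, so this protects the server without fixing the application.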
>
>
>> I agree that a short-term solution is server-side protection, but we
>> should also think about a better plan.
>
>
> Totally agree. We need to think it through and have a plan for how the
> client apps handle the changes.
> Solution two seems better, as it is less intrusive and doesn't require
> any client-side change. WDYT?
>
> Does anyone else have any input?
>
>>
>> >
>> >
>> > I've also looked into the possibility of splitting CloseSessionTxn into
>> > smaller ones. Unfortunately, it didn't work, as currently in ZooKeeper
>> > one request can only have one txn. Even though we can split the paths to
>> > be deleted into multiple batches and define a sub-txn for each batch, we
>> > have to wrap all sub-txns into a single wrapper txn and associate it
>> > with the request. In the end, when loading the zk database, we still
>> > have to deserialize the large wrapper txn, which can fail the length
>> > check (jute.maxBuffer + zookeeper.jute.maxbuffer.extrasize).
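
The size argument can be illustrated with a small self-contained sketch. It does not use Jute or any ZooKeeper class: the length-prefixed encoding, the helper names, and the limits are assumptions chosen only to mimic the shape of the problem. Each batch serializes to something well under the limit, but the wrapper that must contain all batches is roughly as large as the original unsplit record, so it fails the same check.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; the encoding merely resembles a length-prefixed format. */
final class WrapperTxnSizeSketch {
    /** Encodes items as: int count, then (int length, bytes) per item. */
    static byte[] lengthPrefixed(List<byte[]> items) {
        int size = Integer.BYTES;
        for (byte[] item : items) {
            size += Integer.BYTES + item.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        buffer.putInt(items.size());
        for (byte[] item : items) {
            buffer.putInt(item.length);
            buffer.put(item);
        }
        return buffer.array();
    }

    /** Hypothetical sub-txn: the list of paths to delete in one batch. */
    static byte[] serializePaths(List<String> paths) {
        List<byte[]> encoded = new ArrayList<>();
        for (String path : paths) {
            encoded.add(path.getBytes(StandardCharsets.UTF_8));
        }
        return lengthPrefixed(encoded);
    }

    static List<List<String>> split(List<String> paths, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < paths.size(); start += batchSize) {
            int end = Math.min(start + batchSize, paths.size());
            batches.add(new ArrayList<>(paths.subList(start, end)));
        }
        return batches;
    }

    /** Approximates the check described above: length must not exceed maxBuffer + extraSize. */
    static boolean passesLengthCheck(int length, int maxBuffer, int extraSize) {
        return length >= 0 && length <= maxBuffer + extraSize;
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            paths.add(String.format("/ephemeral/node-%04d", i));
        }
        List<byte[]> subTxns = new ArrayList<>();
        for (List<String> batch : split(paths, 100)) {
            subTxns.add(serializePaths(batch));
        }
        byte[] wrapper = lengthPrefixed(subTxns);
        System.out.println("sub-txn ok: " + passesLengthCheck(subTxns.get(0).length, 4096, 1024));
        System.out.println("wrapper ok: " + passesLengthCheck(wrapper.length, 4096, 1024));
    }
}
```

The limits 4096 and 1024 are arbitrary illustration values, not ZooKeeper defaults.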
>>
>> Unfortunately there are a few users who say that ZooKeeper doesn't
>> scale, and we are probably hitting one such case here.
>> Most of these cases are due to the write protocol (JUTE), which
>> puts unneeded constraints on ZooKeeper.
>>
>
> Yes, in this case, we hit the constraint that JUTE doesn't serialize the
> individual sub-txns separately.
>
> Best,
>
> Li
>
>> Enrico
>>
>> >
>> >
>> > Changing ZK to allow multiple txns for a single request looks quite
>> > involved and it may have other implications.
>> >
>> >
>> > I wonder if anyone has any input or any better ideas?
>> >
>> >
>> >
>> > Thanks,
>> >
>> >
>> > Li
>>
>
