Also worth consideration for the Netty fixes backport bundle: ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak
Our experience with Netty has been positive so far though I don't think we have enough info to declare it is stable. It makes a lot of sense to me for 3.5 to use Netty 4, the biggest question is the timing. On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote: > I got another one: > > ZOOKEEPER-3163 Use session map to improve the performance when closing > session in Netty > > > So, that doesn't seem too many. We can talk about backporting them, but > I don't think they're super critical for the first stable release. > > Regarding 3.4, I need to validate the embedded scenario, but at least > the Java 11 build is green. :) > > > Andor > > > > On 1/4/19 18:08, Enrico Olivelli wrote: > > Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar > > <nkal...@cloudera.com.invalid> ha scritto: > >> +1 for Netty 4 in 3.5 > >> > >> Pretty much all the pros and cons has been said before me. > >> I would only add that this is not a new functionality that we wan't to > >> backport. It's a criticall(ish?) bugfix, which requires quite a bit of > >> change unfortunately. > >> > >> Regards, > >> Norbert > >> > >> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> wrote: > >> > >>> What are those patches exactly? > >>> > >>> Comparing the ported version of 3.5 with master I’ve only found 2 > patches > >>> which are missing: > >>> > >>> ZOOKEEPER-3146 Limit the maximum client connections per IP in > >>> NettyServerCnxnFactory > >>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep > >>> the same behavior and make the code easier to maintain > >>> > >>> None of them are critical I would say. > >>> Is there anything else I’m missing? > > I have compared the histories and I think Andor you are right. > > > > master: > > https://github.com/apache/zookeeper/commits/master > > > > branch-3.5: > > https://github.com/apache/zookeeper/commits/branch-3.5 > > > > Sorry I was thinking to the amount of patches related to TLS stuff , > > but they are not related to Netty 4. > > > > What about branch 3.4 ? > > We don't have reconfig but the case "Start embedded ZK, stop it and > > try to restart on the same port" should apply as well. > > > > Enrico > > > >>> Andor > >>> > >>> > >>> > >>> > >>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eolive...@gmail.com> > wrote: > >>>> > >>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar > >>>> <an...@apache.org <mailto:an...@apache.org>> ha scritto: > >>>>> Hi team / Enrico, > >>>>> > >>>>> I’d like to get feedback from the community on the following patch > >>> (moving the discussion from GitHub to here): > >>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 < > >>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204> > >>>>> https://github.com/apache/zookeeper/pull/753 < > >>> https://github.com/apache/zookeeper/pull/753> > >>>>> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it > >>> doesn’t properly close the underlying socket (probably not closing the > >>> registered NIO selectors) and reconfig tests are unable to re-bind the > >>> ports. This problem is similar that we already fixed in NIO with the > >>> following patch: > >>> > https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 > >>> < > >>> > https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 > >>>>> The problem doesn’t show up on trunk which has been recently upgraded > >>> to Netty 4. > >>>>> Repro: > >>>>> - Start embedded ZK, stop it and try to restart on the same port, or > >>>>> - Start normal ensemble and reconfig to use different (client) port. > >>> Then reconfig back to the original port which should fail. (that’s the > >>> scenario which is covered in ReconfigTest) > >>>>> I created the above patch (#753) to backport Netty 4 upgrade to 3.5 > and > >>> it fixes the problem with Java 11 (it doesn’t cause regression in the > >>> pre-commit build either), but Enrico is having concerns about making > such > >>> big change before the release. > >>>>> I tend to agree, but let’s see what are the options. > >>>>> > >>>>> Thoughts: > >>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is > >>> critical. > >>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3, > >>> what can we do? > >>>>> a) We cannot workaround from ZooKeeper itself and have to > submit > >>> a pull request for Netty. I think it’s quite unlikely that they will > accept > >>> the change given it’s not a security bug, but even if they did, only > the > >>> upgraded version of Netty 3 would work properly with ZooKeeper. Err. > >>>>> b) We can workaround it from ZooKeeper: that could be option > #1, > >>> but I have a strong feeling about it’s not going to be the case. > >>>>> - Shall we upgrade to Netty 4? - this is option #2 > >>>>> > >>>>> Please share your thoughts, maybe you know about an option #3. > >>>> Thank you Andor > >>>> > >>>> I have thought more about this problem, and I have checked that Netty > >>>> 3 is really dead/unmantained (last release in 2016). > >>>> If I understand correctly there is no easy workaround (nothing without > >>>> hacking Netty 3 internals) > >>>> > >>>> As soon as we will declare 3.5.5 "stable" the world will hopefully > >>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support). > >>>> The network stack is very important so it is better to have Netty 4 as > >>>> foundation, I am thinking about security issues, we won't make an > >>>> "hotfix" release with the switch to Netty 4 because there is a bad bug > >>>> in Netty 3. > >>>> So better to switch now. > >>>> > >>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes > >>>> around Netty on master branch, we must be sure that what we are > >>>> delivering in 3.5.5 is stable. > >>>> > >>>> We will also have to state clearly in the "release notes" that Netty > >>>> version is changed, as this may have a non trivial impact to memory > >>>> usage (i.e. Netty 4 uses more Direct memory by default) > >>>> > >>>> So to recap my final opinion: +1 to switch to Netty 4 if we take care > >>>> of port all of the fixes around Netty 4 from master branch and we > >>>> state the switch clearly in the release notes > >>>> > >>>> Enrico > >>>> > >>>>> Regards, > >>>>> Andor > >>> >