Sure, good point. Let's put it on the list.

Andor

On Tue, Dec 18, 2018 at 12:17 AM Patrick Hunt <ph...@apache.org> wrote:

> Are folks OK to wait on that OWASP issue I documented over the weekend?
> afaict we are not affected but it would be good to get another pair of eyes
> on it.
>
> Patrick
>
> On Mon, Dec 17, 2018 at 2:55 PM Andor Molnár <an...@apache.org> wrote:
>
> > Hi team,
> >
> > I'm proud to announce that, thanks to the joint effort from the
> > community, the 3.5 blockers list has become empty:
> >
> > "project = ZooKeeper AND resolution = Unresolved AND fixVersion = 3.5.5
> > AND priority in (blocker, critical) ORDER BY priority DESC, key ASC"
> >
> > Well... almost. All the blocker issues have gone, but we still have the
> > Maven migration to complete before the stable release. If you have some
> > free cycles, please join us testing the Maven build on this PR:
> >
> > https://github.com/apache/zookeeper/pull/708
> >
> > I hope we can merge it pretty soon.
> >
> > In terms of the builds, the weather at 3.5 branch is quite sunny
> > nowadays:
> >
> > https://builds.apache.org/view/S-Z/view/ZooKeeper/
> >
> > The Java 11 build is still having some difficulties, which hopefully I
> > can address before the holidays:
> >
> > https://issues.apache.org/jira/browse/ZOOKEEPER-3204
> >
> > If you happen to know about something which is important from 3.5's
> > perspective and missing from the above, please don't hesitate to share.
> >
> > Happy ZooKeeping!
> >
> > Andor
> >
> > On 11/2/18 21:12, Fangmin Lv wrote:
> > > Andor,
> > >
> > > Here is the PR to port ZK-3104 from master to 3.4:
> > > https://github.com/apache/zookeeper/pull/685.
> > >
> > > Fangmin
> > >
> > > On Fri, Nov 2, 2018 at 11:46 AM Fangmin Lv <lvfang...@gmail.com> wrote:
> > >
> > >> Hi Andor,
> > >>
> > >> Is anyone working on ZK-2778? I can pick it up if there is no one
> > >> working on it yet.
> > >>
> > >> I'll open a 3.5 PR for ZK-3104 today.
> > >>
> > >> Fangmin
> > >>
> > >> On Fri, Oct 26, 2018 at 3:33 AM Andor Molnar <an...@apache.org> wrote:
> > >>
> > >>> Hi folks,
> > >>>
> > >>> You’ve probably noticed lots of update emails coming from Jira.
> > >>> Please be aware that we’ve updated a bunch of open blocker/critical
> > >>> 3.5 tickets to reflect what we discussed in this email.
> > >>>
> > >>> If you open up the following jira filter:
> > >>>
> > >>> project = ZooKeeper and resolution = Unresolved and fixVersion = 3.5.5
> > >>> AND priority in (blocker, critical) ORDER BY priority DESC, key ASC
> > >>>
> > >>> You’ll see the most up-to-date list of tickets which need to be
> > >>> addressed before the stable 3.5 release.
> > >>>
> > >>> Thank you for your efforts to get this done.
> > >>>
> > >>> Fangmin, ZK-3104 is waiting for backport, but the ticket has already
> > >>> been resolved. Have you created a separate ticket for the backport or
> > >>> shall I just reopen it with the right fix versions?
> > >>>
> > >>> Thanks,
> > >>> Andor
> > >>>
> > >>>> On 2018. Oct 8., at 12:34, Andor Molnar <an...@apache.org> wrote:
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>> Let me summarize and give a quick update on the outstanding issues
> > >>>> for 3.5 GA:
> > >>>>
> > >>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > >>>> - ZOOKEEPER-2778 (Potential server deadlock between follower sync
> > >>>>   with leader and follower receiving external connection requests.)
> > >>>> - ZOOKEEPER-3021 Migrate project structure to Maven (ongoing)
> > >>>> - ZOOKEEPER-925 Docs generation to Maven
> > >>>> - ZOOKEEPER-3104 (waiting for backport)
> > >>>> - ZOOKEEPER-3125 (waiting for backport PR #647)
> > >>>>
> > >>>> The 2 Maven related tickets are no-brainers as well as the backports.
> > >>>> ZK-2778 has been picked up by Maoling (thanks!) as far as I can see,
> > >>>> ZK-1818 is the only one waiting for a volunteer.
> > >>>>
> > >>>> Please correct me if I’ve missed something.
> > >>>>
> > >>>> Regards,
> > >>>> Andor
> > >>>>
> > >>>>> On 2018. Sep 28., at 18:32, Tamas Penzes <tam...@cloudera.com.INVALID>
> > >>>>> wrote:
> > >>>>>
> > >>>>> Hi All,
> > >>>>>
> > >>>>> I would add ZOOKEEPER-3021
> > >>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-3021> Migrate
> > >>>>> project structure to Maven build as a blocker too. Since the
> > >>>>> migration has started it would be good to finish before releasing
> > >>>>> ZK 3.5.x GA.
> > >>>>>
> > >>>>> ZOOKEEPER-925 <https://issues.apache.org/jira/browse/ZOOKEEPER-925>
> > >>>>> (replace our forrest site and documentation generation) might also
> > >>>>> be a good idea, since then we could deliver the new MarkDown based
> > >>>>> documentation.
> > >>>>>
> > >>>>> Regards, Tamaas
> > >>>>>
> > >>>>> On Fri, Sep 14, 2018 at 10:09 AM Fangmin Lv <lvfang...@gmail.com>
> > >>>>> wrote:
> > >>>>>> Oh, sorry for the confusion, I should provide more context.
> > >>>>>>
> > >>>>>> The leader will use on disk txn sync with followers if the peer
> > >>>>>> zxid is not in its in memory commit logs, the code is here: Leader
> > >>>>>> on disk txn sync
> > >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java#L774>.
> > >>>>>> There is a bug that potentially there will be a gap in the txn
> > >>>>>> files, like after snap sync, etc, so it's possible the peer will
> > >>>>>> miss txns due to this.
> > >>>>>>
> > >>>>>> The option to disable it is snapshotSizeFactor
> > >>>>>> <https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZKDatabase.java#L81>,
> > >>>>>> setting it to -1 will disable this feature. On 3.5, it's better to
> > >>>>>> have a PR to set this to -1 by default.
> > >>>>>> It might have more SNAP sync, but from our prod it doesn't seem to
> > >>>>>> be a big problem to me.
> > >>>>>>
> > >>>>>> I can send out the diff to disable it by default on 3.5 if you guys
> > >>>>>> think this is the right way to go.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Fangmin
> > >>>>>>
> > >>>>>> On Thu, Sep 13, 2018 at 1:58 AM Andor Molnar <an...@apache.org>
> > >>>>>> wrote:
> > >>>>>>> What’s needed to turn it off?
> > >>>>>>> Do we need a PR or it’s just a config option?
> > >>>>>>> Shall we implement a feature switch for that and turn it off by
> > >>>>>>> default?
> > >>>>>>>
> > >>>>>>> Sorry I don’t have too much insight on disk txn sync.
> > >>>>>>>
> > >>>>>>> Andor
> > >>>>>>>
> > >>>>>>>> On 2018. Sep 13., at 9:16, Fangmin Lv <lvfang...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>> And to be clear, ZOOKEEPER-2418 is actually just one case of
> > >>>>>>>> inconsistency which could be caused by on disk txn sync. As I
> > >>>>>>>> mentioned in a newer JIRA, ZOOKEEPER-2846
> > >>>>>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2846>, the snap
> > >>>>>>>> sync or txn sync could also leave txn gaps in the txn file, which
> > >>>>>>>> is a more common case that could trigger this issue.
> > >>>>>>>>
> > >>>>>>>> I would suggest turning off the on disk txn sync by default for
> > >>>>>>>> now to avoid this issue; after we finish ZOOKEEPER-3114, we can
> > >>>>>>>> use that to validate the on disk txns during syncing.
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Fangmin
> > >>>>>>>>
> > >>>>>>>> On Wed, Sep 12, 2018 at 9:55 AM Fangmin Lv <lvfang...@gmail.com>
> > >>>>>>>> wrote:
> > >>>>>>>>> Andor,
> > >>>>>>>>>
> > >>>>>>>>> ZOOKEEPER-3114 is about adding real time digest checking to help
> > >>>>>>>>> detect inconsistency; it's a new feature with a significant
> > >>>>>>>>> amount of code change.
> > >>>>>>>>> I'll start upstreaming it part by part, but I don't expect it
> > >>>>>>>>> to be merged in the next few weeks. So yes, it's a nice to have,
> > >>>>>>>>> but definitely not a blocker for 3.5.
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Fangmin
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Sep 12, 2018 at 2:55 AM Andor Molnar <an...@apache.org>
> > >>>>>>>>> wrote:
> > >>>>>>>>>> Fangmin,
> > >>>>>>>>>>
> > >>>>>>>>>> Sorry, I just noticed that you want to include the consistency
> > >>>>>>>>>> fixes in the stable version which is fine. Let’s finish the
> > >>>>>>>>>> backports and we’ll be done with them.
> > >>>>>>>>>>
> > >>>>>>>>>> ZOOKEEPER-3114 is essentially a new feature, I wouldn’t block
> > >>>>>>>>>> 3.5 with that. What do you think?
> > >>>>>>>>>>
> > >>>>>>>>>> Andor
> > >>>>>>>>>>
> > >>>>>>>>>>> On 2018. Sep 12., at 11:52, Andor Molnar <an...@apache.org>
> > >>>>>>>>>>> wrote:
> > >>>>>>>>>>> Cool, thanks for the clarification.
> > >>>>>>>>>>>
> > >>>>>>>>>>> The updated list is as follows:
> > >>>>>>>>>>>
> > >>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast protocol)
> > >>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > >>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> > >>>>>>>>>>>   sync with leader and follower receiving external connection
> > >>>>>>>>>>>   requests.)
> > >>>>>>>>>>>
> > >>>>>>>>>>> The following are not critical and no blockers for the stable
> > >>>>>>>>>>> release:
> > >>>>>>>>>>>
> > >>>>>>>>>>> Waiting to be ported to 3.5:
> > >>>>>>>>>>> - ZOOKEEPER-3104
> > >>>>>>>>>>> - ZOOKEEPER-3125
> > >>>>>>>>>>> - ZOOKEEPER-3127
> > >>>>>>>>>>>
> > >>>>>>>>>>> New feature:
> > >>>>>>>>>>> - ZOOKEEPER-3114 (fixes ZOOKEEPER-2184 too)
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Andor
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On 2018. Sep 12., at 0:42, Fangmin Lv <lvfang...@gmail.com>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>> Hi Andor,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> That's the on disk txn feature, which was disabled internally
> > >>>>>>>>>>>> after we found the potential inconsistency issue. The only
> > >>>>>>>>>>>> solution we have for now is waiting for the new digest
> > >>>>>>>>>>>> checking feature I mentioned in ZOOKEEPER-3114.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think there are some other critical consistency issues we
> > >>>>>>>>>>>> just fixed on master recently: ZOOKEEPER-3104, ZOOKEEPER-3125,
> > >>>>>>>>>>>> ZOOKEEPER-3127. I think we should include those in the
> > >>>>>>>>>>>> official 3.5 release as well.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>> Fangmin
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Tue, Sep 11, 2018 at 11:58 AM Andor Molnár <an...@apache.org>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>> Hi Jeelani,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks for letting me know. I'm happy to remove it from the
> > >>>>>>>>>>>>> list to get closer to a stable release. :)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> What's the feature which can be disabled to avoid data
> > >>>>>>>>>>>>> inconsistency?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On 09/10/2018 11:33 PM, Mohamed Jeelani wrote:
> > >>>>>>>>>>>>>> Thanks Andor for compiling this. Should we be ignoring
> > >>>>>>>>>>>>>> ZOOKEEPER-2418 as well? This exists in 3.4 as well and the
> > >>>>>>>>>>>>>> feature can be disabled. We are working on a longer term
> > >>>>>>>>>>>>>> fix for it in 3.6.
> > >>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Jeelani
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On 9/10/18, 5:19 AM, "Andor Molnar" <an...@cloudera.com.INVALID>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>> Fine.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I'm happy to ignore 1549, 2846 and 2930. Still we have the
> > >>>>>>>>>>>>>> list of:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> - ZOOKEEPER-236 (SSL/TLS support for Atomic Broadcast
> > >>>>>>>>>>>>>>   protocol)
> > >>>>>>>>>>>>>> - ZOOKEEPER-1818 (Fix don't care for trunk)
> > >>>>>>>>>>>>>> - ZOOKEEPER-2418 (txnlog diff sync can skip sending some
> > >>>>>>>>>>>>>>   transactions to followers)
> > >>>>>>>>>>>>>> - ZOOKEEPER-2778 (Potential server deadlock between follower
> > >>>>>>>>>>>>>>   sync with leader and follower receiving external
> > >>>>>>>>>>>>>>   connection requests.)
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> SSL (ZK-236) is a feature which is essential for the 3.5
> > >>>>>>>>>>>>>> release, hence I wouldn't leave it out or postpone it for
> > >>>>>>>>>>>>>> the next stable release. The PR has been out for a long
> > >>>>>>>>>>>>>> time, get on reviewing please.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> The rest are also long outstanding issues which have been
> > >>>>>>>>>>>>>> found in the 3.5 branch.
> > >>>>>>>>>>>>>> ZK-1818 is something which was found in 3.4 and fixed in
> > >>>>>>>>>>>>>> 3.4, but has never been fixed in 3.5. Quite a serious issue
> > >>>>>>>>>>>>>> if still present.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I think we should at least run some manual testing and see
> > >>>>>>>>>>>>>> if we could repro any of these issues before going ahead
> > >>>>>>>>>>>>>> with a stable release.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>> Andor
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Fri, Sep 7, 2018 at 3:24 AM, Michael Han <h...@apache.org>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>> I haven't gone through the entire list, but it looks like
> > >>>>>>>>>>>>>>> lots of the JIRA issues listed in this thread, such as
> > >>>>>>>>>>>>>>> ZOOKEEPER-1549, 2846, also affect 3.4 releases. Should we
> > >>>>>>>>>>>>>>> scope these issues out?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I think historically the single outstanding blocking issue
> > >>>>>>>>>>>>>>> for a stable 3.5 release is the reconfig feature and
> > >>>>>>>>>>>>>>> security concerns around it (somehow addressed in
> > >>>>>>>>>>>>>>> ZOOKEEPER-2014), and the alpha and beta releases were
> > >>>>>>>>>>>>>>> created to stabilize that feature.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__zookeeper-2Duser.578899.n2.nabble.com_Zookeeper-2Dwith-2D&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=_tGtL3nMWtuPrXKXDx27AIWOzyyT7W-CjIVLDFZwT0E&e=
> > >>>>>>>>>>>>>>> SSL-release-date-tt7581744.html
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> So it looks like we are in good shape to release.
> > >>>>>>>>>>>>>>> Something that might be worth doing to claim the quality of
> > >>>>>>>>>>>>>>> 3.5 is on par with 3.4:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> * Run Jepsen on 3.5 - 3.4 passed the test for the record
> > >>>>>>>>>>>>>>>   https://urldefense.proofpoint.com/v2/url?u=https-3A__aphyr.com_posts_291-2Djepsen-2Dzookeeper&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Vl4oKanLQehvaulUvoKg8A&m=wqlhnot9c-pQLdkGkccSGNpELUNUnB-wy_h0iA3PRqI&s=VjORkX5s7hrJyl8mW9Q4cfeSWF4qfTdyRjcuAiBt0y4&e=
> > >>>>>>>>>>>>>>> * Fix all flaky tests on 3.5 - 3.4 has few or no flaky
> > >>>>>>>>>>>>>>>   tests at all.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Tue, Sep 4, 2018 at 1:48 AM, Andor Molnar
> > >>>>>>>>>>>>>>> <an...@cloudera.com.invalid> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Thanks Maoling! That would be a huge help, I appreciate it.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Andor