Thanks Owen!!. On Mon, Aug 19, 2019 at 3:44 PM Owen Nichols <onich...@pivotal.io> wrote:
> There appears to be consensus that this is a critical fix. > > The following commit has been brought into release/1.10.0 < > https://github.com/apache/geode/tree/release/1.10.0> as the critical fix > for GEODE-7079 <https://issues.apache.org/jira/browse/GEODE-7079>: > > git cherry-pick -x 6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b < > https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b > > > > GEODE-7079 <https://issues.apache.org/jira/browse/GEODE-7079> has been > changed to 'resolved in' 1.10.0. > > -Owen > > > On Aug 19, 2019, at 2:39 AM, Juan José Ramos <jra...@pivotal.io> wrote: > > > > Hello team, > > > > Just following up on my proposal to include GEODE-7079 in release > 1.10.0, I > > think there's enough consensus at this point (+1: 7, +0: 1), can we > cherry > > pick commit *6f4bbbd [1]* into the release branch?. > > Best regards. > > > > > > [1]: > > > https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b > > > > On Thu, Aug 15, 2019 at 9:17 PM Udo Kohlmeyer <u...@apache.com> wrote: > > > >> I'm changing my vote to +1 on this issue. > >> > >> The ONLY reason I'm changing my vote is to add to the cleanliness of the > >> code of the release. I do 100% disagree with the continual scope creep > >> that we have been incurring on this release branch. > >> > >> --Udo > >> > >> On 8/15/19 12:34 PM, Dan Smith wrote: > >>> +1 to merging Juan's fix for GEODE-7079. I've seen systems taken down > by > >>> rapidly filling up the logs in the past, this does seem to be a > critical > >>> fix from the perspective of system stability. > >>> > >>> Also, this is the change, which doesn't seem particularly risky to me. > >>> > >>> - ConflationKey key = new > >>> ConflationKey(gsEvent.getRegion().getFullPath(), > >>> + ConflationKey key = new > ConflationKey(gsEvent.getRegionPath(), > >>> > >>> -Dan > >>> > >>> On Thu, Aug 15, 2019 at 12:23 PM Udo Kohlmeyer <u...@apache.com> wrote: > >>> > >>>> Whilst I agree with "*finish* when we believe the quality of the > release > >>>> branch is sufficient", I disagree that we should have cut a branch and > >>>> continue to patch that branch with non-critical fixes. i.e this issue > >>>> has been around for a while and has no averse side effects. Issues > like > >>>> GEODE-7081, which is new due to a new commit, AND it has critical > >>>> stability implications on the server, that I can agree we should > include > >>>> in a potential release branch. > >>>> > >>>> Otherwise we can ALWAYS argue that said release branch is not of > >>>> "sufficient" quality, especially if there are numerous existing JIRA's > >>>> pertaining to bugs already in the system. > >>>> > >>>> To quote Juan's original email: > >>>> > >>>> /"Note: *no events are lost (even without the fix)* but, if the region > >>>> takes// > >>>> //a while to recover, the logs for the member can grow pretty quickly > >>>> due to// > >>>> //the continuously thrown *NPEs.*"/ > >>>> > >>>> In addition to this, if there is a commit in a cut release branch, > which > >>>> is requiring us to continuously patching the release branch, in order > to > >>>> stabilize that feature/fix, maybe we should consider reverting that > fix > >>>> and release it at a later stage, when it is believed that this fix is > >>>> more stable and have better, more comprehensive test coverage. > >>>> > >>>> So far, GEODE-7081, does not have me convinced that it is critical. OR > >>>> maybe it is the latter of my options, where it is a stabilization > commit > >>>> to a new feature, which begs the question, should we have accepted the > >>>> original feature commit if there are all manner of side effects which > we > >>>> are only discovering. > >>>> > >>>> --Udo > >>>> > >>>> On 8/15/19 11:08 AM, Anthony Baker wrote: > >>>>> While we can’t fix *all known bugs*, I think where we do have a fix > for > >>>> an important issue we should think hard about the cost of not > including > >>>> that in a release. > >>>>> IMO, the fixed time approach to releases means that we *start* the > >>>> release effort (including stabilization and bug fixing if needed) on a > >>>> known date and we *finish* when new believe the quality of the release > >>>> branch is sufficient. Given the number of important fixes being > >> requested, > >>>> I’m not sure we are there yet. > >>>>> I think the release branch concept has merit because it allows us to > >>>> isolate ongoing work from the changes needed for a release. > >>>>> +1 for including GEODE-7079. > >>>>> > >>>>> Anthony > >>>>> > >>>>> > >>>>>> On Aug 15, 2019, at 10:51 AM, Udo Kohlmeyer <ukohlme...@gmail.com> > >>>> wrote: > >>>>>> Seems everyone is in favor or including a /*non-critical*/ fix to an > >>>> already cut branch of the a potential release... > >>>>>> Am I missing something? > >>>>>> > >>>>>> Why cut a release at all... just have a perpetual cycle of fixes > added > >>>> to develop and users can chose what nightly snapshot build they would > >> want > >>>> to use.. > >>>>>> I'm voting -1 on a non-critical issue, which is existing and worst > >>>> effect is to fill logs will NPE logs... (yes, not something we want). > >>>>>> I believed that we (as a Geode community) agreed that once a release > >>>> has been cut, only critical issue fixes will be included. If we > continue > >>>> just continually adding to the ALREADY CUT 1.10 release, where do we > >> stop > >>>> and when do we release... > >>>>>> --Udo > >>>>>> > >>>>>> On 8/15/19 10:19 AM, Nabarun Nag wrote: > >>>>>>> +1 > >>>>>>> > >>>>>>> On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann < > >>>> amurm...@apache.org> > >>>>>>> wrote: > >>>>>>> > >>>>>>>> +1 > >>>>>>>> > >>>>>>>> Agreed to fixing this. It's impossible for a user to discover they > >>>> hit an > >>>>>>>> edge case that we fail to support till they are in prod and > restart. > >>>>>>>> > >>>>>>>> On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos < > jra...@pivotal.io > >>> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Hello Udo, > >>>>>>>>> > >>>>>>>>> Even if it is an existing issue I'd still consider it critical > for > >>>> those > >>>>>>>>> cases on which there are unprocessed events on the persistent > queue > >>>>>>>> after a > >>>>>>>>> restart and the region takes long to recover... you can actually > >> see > >>>>>>>>> millions of *NPEs* flooding the member's logs. > >>>>>>>>> My two cents anyway, it's up to the community to make the final > >>>> decision. > >>>>>>>>> Cheers. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On Thu, Aug 15, 2019 at 5:58 PM Udo Kohlmeyer <u...@apache.com> > >>>> wrote: > >>>>>>>>>> Juan, > >>>>>>>>>> > >>>>>>>>>> From your explanation, it seems this issue is existing and not > >>>>>>>>>> critical. Could we possibly hold this for 1.11? > >>>>>>>>>> > >>>>>>>>>> --Udo > >>>>>>>>>> > >>>>>>>>>> On 8/15/19 5:29 AM, Ju@N wrote: > >>>>>>>>>>> Hello team, > >>>>>>>>>>> > >>>>>>>>>>> I'd like to propose including the *fix [1]* for *GEODE-7079 > [2]* > >> in > >>>>>>>>>> release > >>>>>>>>>>> 1.10.0. > >>>>>>>>>>> Long story short: a *NullPointerException* can be continuously > >>>> thrown > >>>>>>>>>>> and flood the member's logs if a serial event processor (either > >>>>>>>>>>> *async-event-queue* or *gateway-sender*) starts processing > events > >>>>>>>> from > >>>>>>>>> a > >>>>>>>>>>> recovered persistent queue before the actual region to which it > >> was > >>>>>>>>>>> attached is fully operational. > >>>>>>>>>>> Note: *no events are lost (even without the fix)* but, if the > >>>> region > >>>>>>>>>> takes > >>>>>>>>>>> a while to recover, the logs for the member can grow pretty > >>>> quickly > >>>>>>>>> due > >>>>>>>>>> to > >>>>>>>>>>> the continuously thrown *NPEs.* > >>>>>>>>>>> Best regards. > >>>>>>>>>>> > >>>>>>>>>>> [1]: > >>>>>>>>>>> > >>>> > >> > https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b > >>>>>>>>>>> [2]: https://issues.apache.org/jira/browse/GEODE-7079 > >>>>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> Juan José Ramos Cassella > >>>>>>>>> Senior Software Engineer > >>>>>>>>> Email: jra...@pivotal.io > >>>>>>>>> > >> > > > > > > -- > > Juan José Ramos Cassella > > Senior Software Engineer > > Email: jra...@pivotal.io > > -- Juan José Ramos Cassella Senior Software Engineer Email: jra...@pivotal.io