While we can’t fix *all known bugs*, I think where we do have a fix for an important issue we should think hard about the cost of not including that in a release.
IMO, the fixed time approach to releases means that we *start* the release effort (including stabilization and bug fixing if needed) on a known date and we *finish* when new believe the quality of the release branch is sufficient. Given the number of important fixes being requested, I’m not sure we are there yet. I think the release branch concept has merit because it allows us to isolate ongoing work from the changes needed for a release. +1 for including GEODE-7079. Anthony > On Aug 15, 2019, at 10:51 AM, Udo Kohlmeyer <ukohlme...@gmail.com> wrote: > > Seems everyone is in favor or including a /*non-critical*/ fix to an already > cut branch of the a potential release... > > Am I missing something? > > Why cut a release at all... just have a perpetual cycle of fixes added to > develop and users can chose what nightly snapshot build they would want to > use.. > > I'm voting -1 on a non-critical issue, which is existing and worst effect is > to fill logs will NPE logs... (yes, not something we want). > > I believed that we (as a Geode community) agreed that once a release has been > cut, only critical issue fixes will be included. If we continue just > continually adding to the ALREADY CUT 1.10 release, where do we stop and when > do we release... > > --Udo > > On 8/15/19 10:19 AM, Nabarun Nag wrote: >> +1 >> >> On Thu, Aug 15, 2019 at 10:15 AM Alexander Murmann <amurm...@apache.org> >> wrote: >> >>> +1 >>> >>> Agreed to fixing this. It's impossible for a user to discover they hit an >>> edge case that we fail to support till they are in prod and restart. >>> >>> On Thu, Aug 15, 2019 at 10:09 AM Juan José Ramos <jra...@pivotal.io> >>> wrote: >>> >>>> Hello Udo, >>>> >>>> Even if it is an existing issue I'd still consider it critical for those >>>> cases on which there are unprocessed events on the persistent queue >>> after a >>>> restart and the region takes long to recover... you can actually see >>>> millions of *NPEs* flooding the member's logs. >>>> My two cents anyway, it's up to the community to make the final decision. >>>> Cheers. >>>> >>>> >>>> On Thu, Aug 15, 2019 at 5:58 PM Udo Kohlmeyer <u...@apache.com> wrote: >>>> >>>>> Juan, >>>>> >>>>> From your explanation, it seems this issue is existing and not >>>>> critical. Could we possibly hold this for 1.11? >>>>> >>>>> --Udo >>>>> >>>>> On 8/15/19 5:29 AM, Ju@N wrote: >>>>>> Hello team, >>>>>> >>>>>> I'd like to propose including the *fix [1]* for *GEODE-7079 [2]* in >>>>> release >>>>>> 1.10.0. >>>>>> Long story short: a *NullPointerException* can be continuously thrown >>>>>> and flood the member's logs if a serial event processor (either >>>>>> *async-event-queue* or *gateway-sender*) starts processing events >>> from >>>> a >>>>>> recovered persistent queue before the actual region to which it was >>>>>> attached is fully operational. >>>>>> Note: *no events are lost (even without the fix)* but, if the region >>>>> takes >>>>>> a while to recover, the logs for the member can grow pretty quickly >>>> due >>>>> to >>>>>> the continuously thrown *NPEs.* >>>>>> Best regards. >>>>>> >>>>>> [1]: >>>>>> >>> https://github.com/apache/geode/commit/6f4bbbd96bcecdb82cf7753ce1dae9fa6baebf9b >>>>>> [2]: https://issues.apache.org/jira/browse/GEODE-7079 >>>>>> >>>> >>>> -- >>>> Juan José Ramos Cassella >>>> Senior Software Engineer >>>> Email: jra...@pivotal.io >>>>