+1 to revert in 1.7 and leaving the fix on develop.

On Wed, Sep 5, 2018 at 9:47 AM Jacob Barrett <jbarr...@pivotal.io> wrote:
> I’m not ok with reverting in develop. Revert in 1.7 and modify in develop.
> We shouldn’t go backwards in develop. The current fix is better than the
> bug it fixes.
>
> > On Sep 5, 2018, at 9:40 AM, Nabarun Nag <n...@apache.org> wrote:
> >
> > If everyone is okay with it, I will revert that change in develop and
> > then cherry pick it to release/1.7.0 branch.
> > Please do comment.
> >
> > Regards
> > Nabarun Nag
> >
> >> On Wed, Sep 5, 2018 at 9:30 AM Dan Smith <dsm...@pivotal.io> wrote:
> >>
> >> +1 to yank it and rework the fix.
> >>
> >> Gester's change helps, but it just means that you will sometimes
> >> randomly have a 2 minute delay starting up a gateway receiver. I don't
> >> think that is a great user experience either.
> >>
> >> -Dan
> >>
> >> On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt
> >> <bschucha...@pivotal.io> wrote:
> >>
> >>> Let's yank it
> >>>
> >>>> On 9/4/18 5:04 PM, Sean Goller wrote:
> >>>>
> >>>> If it's to get the release out, I'm fine with reverting. I don't like
> >>>> it, but I'm not willing to die on that hill. :)
> >>>>
> >>>> -S.
> >>>>
> >>>> On Tue, Sep 4, 2018 at 4:38 PM Dan Smith <dsm...@pivotal.io> wrote:
> >>>>
> >>>>> Splitting this into a separate thread.
> >>>>>
> >>>>> I see the issue. The two minute timeout is the constructor for
> >>>>> AcceptorImpl, where it retries to bind for 2 minutes.
> >>>>>
> >>>>> That behavior makes sense for CacheServer.start.
> >>>>>
> >>>>> But it doesn't make sense for the new logic in
> >>>>> GatewayReceiver.start() from GEODE-5591. That code is trying to use
> >>>>> CacheServer.start to scan for an available port, trying each port in
> >>>>> a range. That free port finding logic really doesn't want to have two
> >>>>> minutes of retries for each port. It seems like we need to rework the
> >>>>> fix for GEODE-5591.
> >>>>>
> >>>>> Does it make sense to hold up the release to rework this fix, or
> >>>>> should we just revert it? Have we switched concourse over to using
> >>>>> alpine linux, which I think was the original motivation for this fix?
> >>>>>
> >>>>> -Dan
> >>>>>
> >>>>> On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith <dsm...@pivotal.io> wrote:
> >>>>>
> >>>>>> Why is it waiting at all in this case? Where is this 2 minute
> >>>>>> timeout coming from?
> >>>>>>
> >>>>>> -Dan
> >>>>>>
> >>>>>> On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda
> >>>>>> <sai.boorlaga...@gmail.com> wrote:
> >>>>>>
> >>>>>>> So the issue is that it takes longer to start than previous
> >>>>>>> releases? Also, is this wait time only when using Gfsh to create
> >>>>>>> gateway-receiver?
> >>>>>>>
> >>>>>>> On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag <n...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> Currently we have a minor issue in the release branch as pointed
> >>>>>>>> out by Barry O.
> >>>>>>>> We will wait till a resolution is figured out for this issue.
> >>>>>>>>
> >>>>>>>> Steps:
> >>>>>>>> 1. create locator
> >>>>>>>> 2. start server --name=server1 --server-port=40404
> >>>>>>>> 3. start server --name=server2 --server-port=40405
> >>>>>>>> 4. create gateway-receiver --member=server1
> >>>>>>>> 5. create gateway-receiver --member=server2 `This gets stuck for 2 minutes`
> >>>>>>>>
> >>>>>>>> Is the 2 minute wait time acceptable? Should we document it? When
> >>>>>>>> we revert GEODE-5591, this issue does not happen.
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>> Nabarun Nag
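
An illustrative aside on the port scan Dan describes in the quoted thread: a minimal Java sketch, assuming plain java.net.ServerSocket rather than Geode's AcceptorImpl or CacheServer.start, of trying each port in a range exactly once, so that an occupied port costs one failed bind instead of a retry window. The class name PortScanSketch, the method bindFirstFreePort, and the port range are hypothetical and are not taken from the GEODE-5591 fix.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class PortScanSketch {

    /**
     * Tries to bind each port in [startPort, endPort] exactly once, in order.
     * Returns the bound socket for the first port that succeeds, so the caller
     * keeps the port instead of racing to re-bind it later.
     * Hypothetical helper for illustration; this is not Geode code.
     */
    public static ServerSocket bindFirstFreePort(int startPort, int endPort)
            throws IOException {
        for (int port = startPort; port <= endPort; port++) {
            ServerSocket socket = new ServerSocket(); // created unbound
            try {
                socket.bind(new InetSocketAddress(port));
                return socket; // success: stop scanning
            } catch (IOException bindFailed) {
                // Port unavailable: release the socket and move on at once,
                // with no per-port retry loop.
                socket.close();
            }
        }
        throw new IOException(
                "No free port in range " + startPort + "-" + endPort);
    }

    public static void main(String[] args) throws IOException {
        // Two scans over the same (assumed) range: the second has to skip
        // whichever port the first one is still holding.
        try (ServerSocket first = bindFirstFreePort(5000, 5500);
             ServerSocket second = bindFirstFreePort(5000, 5500)) {
            System.out.println("first bound to port " + first.getLocalPort());
            System.out.println("second bound to port " + second.getLocalPort());
        }
    }
}
```

The sketch returns the bound socket rather than a port number so that nothing else can claim the port between the scan and its use. A caller that does want retries, as CacheServer.start does for a single fixed port, would add them around the whole call instead of inside the loop.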