Hi, Aravind. Can you help me understand why this might be a useful feature for Geode? I see that your needs require it, but why would users in general want to allow longer timeouts for some members? This is a significant change with backward-compatibility implications, so it would be good for the community to understand the potential benefit.
Thanks!
Brian

On Mon, Aug 28, 2017 at 12:20 AM, Aravind Musigumpula <aravind.musigump...@amdocs.com> wrote:

> Hi Team,
>
> We have a requirement to configure a different member-timeout for different
> members, because we need some members to survive in the view longer than
> the other members before being kicked out of the view when they aren't
> responding.
>
> 1. With the current monitoring system, it is not possible to determine
>    when a member will be kicked out of the view if we configure different
>    member-timeouts for some members.
>
> 2. This is because, if a member is not responding to heartbeat requests,
>    the member that is monitoring the non-responding member will initiate
>    a check-member request.
>
> 3. In this check-member request, the monitoring member pings the
>    non-responding member and waits for the monitoring member's
>    member-timeout for a response.
>
> 4. If there is still no response, it will initiate a final suspect
>    request to the coordinator, where the coordinator does the final
>    check, waiting for the coordinator's member-timeout.
>
> 5. If the coordinator does not get a response, it removes the
>    non-responding member from the view and publishes the view.
>
> 6. So the time it takes to remove a member depends on its monitoring
>    member's timeout and the coordinator's timeout. But the monitoring
>    member depends on the view, and may change from time to time.
>
> So, if a monitoring member doing the check on a member waits for the
> non-responding member's timeout instead of the monitoring member's
> member-timeout, then the time at which the non-responding member is
> removed from the view depends on its own member-timeout and the
> coordinator's member-timeout.
> Hence we can configure a different member-timeout for the required members.
>
> I created a pull request based on the above scenario:
> https://github.com/apache/geode/pull/717
>
> Is the above approach correct? Do we have any concerns around this area?
> Please give your insights on this issue.
>
> Thanks,
> Aravind Musigumpula
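
For readers following the thread, the timing argument in the quoted message can be sketched in a few lines of Java. This is only an illustrative model, not Geode code: the class name, method names, and the sample timeout values are all hypothetical, and the model assumes the removal delay is simply the sum of the two waits described in steps 3 and 4, ignoring the time it takes to notice the missed heartbeats.

```java
// Hypothetical model of the two-phase removal described in the quoted
// message. None of these names come from Geode itself.
final class MemberTimeoutSketch {

  // Current behavior (steps 3 and 4): the monitoring member waits for its
  // OWN member-timeout, then the coordinator waits for the coordinator's
  // member-timeout.
  static long currentRemovalDelayMs(long monitorTimeoutMs, long coordinatorTimeoutMs) {
    return monitorTimeoutMs + coordinatorTimeoutMs;
  }

  // Proposed behavior: the monitoring member waits for the SUSPECT's
  // member-timeout instead; the coordinator's wait is unchanged.
  static long proposedRemovalDelayMs(long suspectTimeoutMs, long coordinatorTimeoutMs) {
    return suspectTimeoutMs + coordinatorTimeoutMs;
  }

  public static void main(String[] args) {
    long coordinatorTimeoutMs = 5_000; // assumed sample value
    long suspectTimeoutMs = 15_000;    // the member we want to survive longer
    long[] possibleMonitorTimeoutsMs = {5_000, 10_000}; // monitor changes with the view

    for (long monitorTimeoutMs : possibleMonitorTimeoutsMs) {
      System.out.println("monitor timeout " + monitorTimeoutMs + " ms: current="
          + currentRemovalDelayMs(monitorTimeoutMs, coordinatorTimeoutMs)
          + " ms, proposed="
          + proposedRemovalDelayMs(suspectTimeoutMs, coordinatorTimeoutMs) + " ms");
    }
  }
}
```

Under these assumed values, the current scheme gives a different removal delay depending on which member happens to be the monitor, while the proposed scheme gives the same delay for every monitor, which is the predictability the quoted message is asking for.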