Hi Mark,

Thanks for your reply. I'll try to clear a few things up.

First, LUS-1 and LUS-2 are both tagged with the same public group, with no other attributes, and are reached via multicast by all clients.

What we observe in our logs is that a particular client continued to try to access a service which (as far as we can tell) it had located via LUS-1, and calls to that service from that client failed for the entire time LUS-1 was unreachable. We also see DiscoveryEvent notifications referring to LUS-1 arriving at that client during that 45-minute period (we don't have all the details of the events in the logs, unfortunately), and that LUS-1 was unreachable from that client during that time.

What confuses us is that other clients complained (so to speak) at most once when LUS-1 was unavailable, then continued to operate, we assume using LUS-2. So we were wondering if there was something we didn't understand about how an SDM and its LookupCache react to an LUS no longer being reachable. What we expect is that the SDM will, perhaps on a lease expiration, remove all entries in the cache related to that registrar. However, this didn't appear to happen, and we haven't found any documentation that indicates what it does or should do.

> Are you suggesting that
> some clients didn't find a particular service in the LookupCache that was
> registered with LUS-2 while LUS-1 was not reachable.

We have (at least) one client which appeared to continue to try to work with LUS-1 for over 45 minutes after it was last available, and it also appeared to attempt, continuously and unsuccessfully, to retrieve a service proxy from LUS-1 during that period.

> In case the SDM in your client was able to see LUS-2 it shouldn't have any
> problem seeing your service even in case LUS-1 became unreachable, assuming
> no other problems than LUS-1 not being reachable occurred.

This is what appeared to occur on other Jini clients in the network, and what we want.

> is used for finding your lookup service. In that case are you sure that
> LUS-2 was found by the SDM of your client? A good way to find out is to
> configure logging for the logger documented in
> http://java.sun.com/products/jini/2.1/doc/api/net/jini/discovery/LookupDiscovery.html,
> set the level to FINEST.

I think we have a discovery listener and logger of our own configured, but I am not sure whether we log all events by default.

> The spec of ServiceDiscoveryListener
> (http://java.sun.com/products/jini/2.1/doc/api/net/jini/lookup/ServiceDiscoveryListener.html)
> talks a lot about these cases.

This spec seems related to service events, not registrar events.

> I've used multiple lookup services for redundancy problems and failure of
> one shouldn't result in a registered service becoming 'invisible' if the
> others were still reachable.

This is what we expect, and generally I think it has worked for us as well. The particular firewall/iptables mess was a new situation we hadn't faced in this server configuration before.

Thanks!
Patrick
