Re: Problem with SDM/LookupCache when LUS unavailable

Patrick Wright Mon, 17 Aug 2009 13:55:25 -0700

Hi Mark

Thanks for your reply. I'll try to clear a few things up.


First, LUS-1 and LUS-2 are both tagged with the same public group, no
other attributes, and are reached via multicast by all clients.

What we observe in our logs is that a particular client kept trying
to access a service which (as far as we can tell) it had located via
LUS-1, and calls to that service from that client failed for the
entire time LUS-1 was unreachable. We also see DiscoveryEvent
notifications referring to LUS-1 arriving at that client during that
45-minute period (unfortunately, we don't have all the details of the
events in the logs), and we see that LUS-1 was unreachable from that
client during that time.

What confuses us is that other clients complained (so to speak) at
most once when LUS-1 was unavailable, then continued to operate, we
assume, using LUS-2. So we were wondering if there was something we
didn't understand about how an SDM and its LookupCache react to an LUS
no longer being reachable. What we expect is that the SDM will,
perhaps on a lease expiration, remove all entries in the cache related
to that registrar. However, this didn't appear to happen, and we
haven't found any documentation that indicates what it does or should
do.
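
To make that expectation concrete, here is a minimal sketch of the
behaviour we had in mind. It is a toy model in plain Java, not the
SDM's actual implementation: the class RegistrarAwareCache and its
methods are hypothetical names, and services and registrars are
reduced to string IDs. A service is dropped only once no remaining
registrar is known to hold it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of the expected cache behaviour; not Jini code.
public class RegistrarAwareCache {
    // For each service ID, the registrars (LUSs) it was discovered through.
    private final Map<String, Set<String>> registrarsByService = new HashMap<>();

    // Record that a service was seen in a given registrar.
    public void serviceSeen(String serviceId, String registrarId) {
        registrarsByService
            .computeIfAbsent(serviceId, k -> new HashSet<>())
            .add(registrarId);
    }

    // Forget a registrar; drop services that no other registrar still holds.
    public void registrarDiscarded(String registrarId) {
        Iterator<Map.Entry<String, Set<String>>> it =
            registrarsByService.entrySet().iterator();
        while (it.hasNext()) {
            Set<String> registrars = it.next().getValue();
            registrars.remove(registrarId);
            if (registrars.isEmpty()) {
                it.remove();
            }
        }
    }

    // True if at least one known registrar still holds the service.
    public boolean contains(String serviceId) {
        return registrarsByService.containsKey(serviceId);
    }
}
```

In this model, a service registered with both LUS-1 and LUS-2 stays
in the cache after LUS-1 is discarded, while one registered only with
LUS-1 is removed.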


> Are you suggesting that
> some clients didn't find a particular service in the LookupCache that was
> registered with LUS-2 while LUS-1 was not reachable.

We have (at least) one client which appeared to keep trying to work
with LUS-1 for over 45 minutes after it was last available. During
that period it also appeared to attempt, continuously and again
unsuccessfully, to retrieve a service proxy from LUS-1.

>
> In case the SDM in your client was able to see LUS-2 it shouldn't have any
> problem seeing your service even in case LUS-1 became unreachable, assuming
> no other problems than LUS-1 not being reachable occurred.

This is what appeared to occur on other Jini clients in the network,
and what we want.


> is used for finding your lookup service. In that case are you sure that
> LUS-2 was found by the SDM of your client? A good way to find out is to
> configure logging for the logger documented in
> http://java.sun.com/products/jini/2.1/doc/api/net/jini/discovery/LookupDiscovery.html,
> set the level to FINEST.

I think we have a discovery listener and logger of our own configured,
but I'm not sure whether we log all events by default.
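
For reference, a minimal sketch of turning that logging on
programmatically with java.util.logging. It assumes the logger name
matches the class name given in the javadoc linked above
(net.jini.discovery.LookupDiscovery); the class DiscoveryLogging and
the method enableFinest are hypothetical names, and a
logging.properties file would do the same job.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; only the java.util.logging calls are standard API.
public class DiscoveryLogging {
    // Assumed logger name, taken from the LookupDiscovery class name.
    static final String DISCOVERY_LOGGER = "net.jini.discovery.LookupDiscovery";

    // Keep a strong reference so the configured logger is not garbage collected.
    private static Logger discoveryLogger;

    public static Logger enableFinest() {
        discoveryLogger = Logger.getLogger(DISCOVERY_LOGGER);
        discoveryLogger.setLevel(Level.FINEST);

        // Handlers filter too: the default console handler stops at INFO,
        // so attach one that lets FINEST records through.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINEST);
        discoveryLogger.addHandler(handler);
        return discoveryLogger;
    }
}
```

Calling DiscoveryLogging.enableFinest() before the SDM is created
should be enough for discovery and discard activity to show up on the
console.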


> The spec of ServiceDiscoveryListener
> (http://java.sun.com/products/jini/2.1/doc/api/net/jini/lookup/ServiceDiscoveryListener.html)
> talks a lot about these cases.

This spec seems related to service events, not registrar events.


> I've used multiple lookup services for redundancy problems and failure of
> one shouldn't result in a registered service becoming 'invisible' if the
> others were still reachable.

This is what we expect, and generally I think it has worked for us as
well. The particular firewall/iptables mess was a new situation we
hadn't faced in this server configuration before.


Thanks!
Patrick
