Hi Mark

> From the above it is not completely clear whether LUS-1 and LUS-2 were
> running on the same server, bound to the same IP number or that only LUS-1
> was affected by the firewall update and that LUS-2 is on a different server
> or bound to another IP number and not affected by the firewall update.

They were running on two different servers.

> Also what went exactly wrong with the firewall update, was nothing reachable
> or was it that just certain services were blocked.

From my understanding of the situation, an admin loaded changes to the
iptables config, and the box where LUS-1 was running was thereafter
completely unreachable until a hard reboot.

> What was exactly misconfigured with the firewall update. What if the event
> registration fails because certain ports being blocked, while multicast and
> unicast discovery is allowed through the firewall.

I don't know the details, but I know enough to say that the box was
unreachable over the network until it was rebooted with the prior iptables
config.

> At INFO level for net.jini.lookup.ServiceDiscoveryManager a failure of lease
> creation or renewal should be visible in the logs.

OK, I will make sure we have this enabled in the future.

> In case the SDM (by means of an implementation of DiscoveryManagement)
> encounters a definite failure of a lookup service it will discard that
> lookup service, but that lookup service will be eligible for (re)discovery,
> meaning that when the SDM receives another multicast message that indicates
> the lookup service is available on the network it will try to register with
> that lookup service. That will fail in your case and it will be discarded.

OK, thanks for the clarification. I just found the section of
http://java.sun.com/products/jini/2.1/doc/specs/html/servicediscutil-spec.html
(under "The DiscoveryManagement Interface") which describes this. However,
it's not clear to me how a lookup helper class (our clients are configured
to use LookupDiscoveryManager) determines whether a lookup service is no
longer available.

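To make the question concrete: if the helper does not detect the failure on
its own, I assume we would have to do something like the sketch below
ourselves. This is only an illustration, not our actual client code. The
Registrar and Discovery interfaces are simplified stand-ins I made up so the
example compiles without the Jini jars; in real code they would be
net.jini.core.lookup.ServiceRegistrar and the discard(ServiceRegistrar)
method of net.jini.discovery.DiscoveryManagement. The lookupOrDiscard helper
is likewise hypothetical.

```java
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;

public class DiscardSketch {
    /** Simplified stand-in for net.jini.core.lookup.ServiceRegistrar (assumption). */
    interface Registrar {
        Object lookup(String serviceName) throws RemoteException;
    }

    /** Stand-in exposing only the discard method of DiscoveryManagement (assumption). */
    interface Discovery {
        void discard(Registrar registrar);
    }

    /**
     * Hypothetical helper: try to use the registrar and, on a communication
     * failure, discard it so it leaves the managed set and becomes eligible
     * for rediscovery. Returns null when the registrar was discarded.
     */
    static Object lookupOrDiscard(Registrar registrar, Discovery discovery, String name) {
        try {
            return registrar.lookup(name);
        } catch (RemoteException e) {
            // We determined ourselves that the registrar is unreachable,
            // so discarding it is our responsibility.
            discovery.discard(registrar);
            return null;
        }
    }

    public static void main(String[] args) {
        List<Registrar> discarded = new ArrayList<>();
        // Simulates a lookup service on a host that is no longer reachable.
        Registrar unreachable = name -> {
            throw new RemoteException("host unreachable");
        };
        Object result = lookupOrDiscard(unreachable, discarded::add, "example");
        System.out.println("result=" + result + ", discarded=" + discarded.size());
    }
}
```

Whether this is needed in addition to what LookupDiscoveryManager already
does is exactly what I am unsure about.
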
In the Discovery Utilities Spec
(http://java.sun.com/products/jini/2.1/doc/specs/html/discoveryutil-spec.html),
I find:

"Currently, there exist utilities such as the LookupDiscovery and
LookupDiscoveryManager helper utilities that will, on behalf of a
discovering entity, automatically discard a lookup service upon determining
that the lookup service has become unreachable or uninteresting. Although
most entities will typically employ such a utility to help with both its
discovery as well as its discard duties, it is important to note that if
the entity itself determines that the lookup service is unavailable, it is
the responsibility of the entity to invoke the discard method. This
scenario usually happens when the entity attempts to interact with a lookup
service, but encounters an exceptional condition (for example, a
communication failure). When the entity actively discards a lookup service,
the discarded lookup service becomes eligible to be re-discovered. Allowing
unreachable lookup services to remain in the managed set can result in
repeated and unnecessary attempts to interact with lookup services with
which the entity can no longer communicate. Thus, the mechanism provided by
this method is intended to provide a way to remove such "stale" lookup
service references from the managed set."

However, I don't find any more detail on the topic. Thus it is unclear
whether we need to call discard(registrar) when we believe the registrar is
no longer available. At least in this one case, it appears that the
registrar may not have been discarded.

Thanks a lot for helping out with this, Mark. I'm going to rework the
logging and then see if I can reproduce this, or at least have better
logging enabled if it reappears. There may be some confusion on our end.

Regards
Patrick
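
P.S. For the logging rework, this is roughly what I have in mind: a minimal
sketch that sets the net.jini.lookup.ServiceDiscoveryManager logger to INFO
through java.util.logging. The SdmLogging class and enableInfo method are
names I made up for illustration, and I am assuming the default console
handler (which publishes INFO and above) is still in place.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class SdmLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a level set on an otherwise unreferenced logger can be lost
    // after garbage collection.
    private static final Logger SDM_LOGGER =
            Logger.getLogger("net.jini.lookup.ServiceDiscoveryManager");

    /** Sets the SDM logger to INFO and returns it. */
    public static Logger enableInfo() {
        SDM_LOGGER.setLevel(Level.INFO);
        return SDM_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = enableInfo();
        System.out.println(logger.getName() + " level is " + logger.getLevel());
    }
}
```

The same thing should be achievable without code by adding the line
net.jini.lookup.ServiceDiscoveryManager.level = INFO to the logging
properties file the client JVM is started with.
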
