The java.net.SocketPermission class uses forward and reverse DNS lookups
to ensure that we're allowed to talk to particular remote machines.
These lookups are used to canonicalize a remote host's name to ensure
that variations in that name don't lead to false negatives.
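
For anyone who wants to reproduce the round trip outside of a djinn,
here is a minimal sketch using java.net.InetAddress. It is not
SocketPermission's actual code, only the same kind of forward-then-reverse
lookup; the class name CanonicalName and the fallback to the input name
are my own assumptions for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class CanonicalName {
    /**
     * Forward lookup (name -> address) followed by a reverse lookup
     * (address -> name). If the forward lookup fails, the name is
     * returned unchanged (an assumption made for this sketch). If the
     * reverse lookup fails, getCanonicalHostName() returns the textual
     * IP address.
     */
    public static String canonicalize(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host); // forward lookup
            return addr.getCanonicalHostName();             // reverse lookup
        } catch (UnknownHostException e) {
            return host;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + canonicalize(host));
    }
}
```

Running it against a host with a broken PTR record is a quick way to see
whether the reverse lookup is the slow step.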

However, many people have found
(http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882) that if
there are configuration errors in a DNS system, the reverse DNS failures
cause very significant latency (e.g. I've seen 10-12 seconds). This
latency has widely varying effects on a djinn. In many cases, it just
causes LookupCache slowdowns, which can be mitigated by the delayed
deserialization techniques discussed previously on the dev@ mailing
list. But in some cases, I've seen it cause Reggie to hang for a
while. (I still don't understand where in Reggie the problem occurs;
maybe EventListeners?)

Obviously, the real solution is to properly configure DNS. But I would
like to know how other people have addressed this issue in their
deployments.

 * Do you ensure the RMI codebase URLs all use canonical hostnames, or
IP addresses?
 * Do you ensure that the TcpServerEndpoint has a consistent (perhaps
hard-coded) name?
 * Do you have monitoring or logging code to proactively detect DNS
configuration errors?
 * Do you fiddle with the Java security property
"networkaddress.cache.negative.ttl"?
 * Do you use hosts files?
 * Do you use a non-Sun JVM?
 * Do you use wildcards or IP addresses in your security policy file?
 * Do you completely disable the socket check in your security policy
file? (yikes!)
 * Have you simply never seen this problem?  (lucky you!)
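
To make the monitoring and negative-cache questions concrete, here is a
rough sketch of the kind of probe I have in mind. It times the forward and
reverse lookups for each host given on the command line and flags slow
ones. The class name DnsProbe, the 2-second threshold and the 60-second
negative TTL are all made-up values for illustration, and as far as I know
the property needs to be set before the JVM performs its first lookup to
take effect.

```java
import java.net.InetAddress;
import java.security.Security;
import java.util.concurrent.Callable;

public class DnsProbe {
    /** Hypothetical threshold; anything slower is reported as suspect. */
    public static final long SLOW_MILLIS = 2000;

    /** Runs one lookup and returns how long it took, in milliseconds. */
    public static long timeMillis(Callable<?> lookup) {
        long start = System.nanoTime();
        try {
            lookup.call();
        } catch (Exception e) {
            // A failed lookup is still timed: failures are often the slow case.
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** True if a lookup took longer than the threshold. */
    public static boolean isSuspect(long elapsedMillis) {
        return elapsedMillis > SLOW_MILLIS;
    }

    public static void main(String[] args) {
        // Cache failed lookups for 60 seconds (assumed value, tune to taste).
        Security.setProperty("networkaddress.cache.negative.ttl", "60");

        for (String host : args) {
            InetAddress[] resolved = new InetAddress[1];
            long forward = timeMillis(() -> resolved[0] = InetAddress.getByName(host));
            report(host, "forward", forward);

            if (resolved[0] != null) {
                long reverse = timeMillis(() -> resolved[0].getCanonicalHostName());
                report(host, "reverse", reverse);
            }
        }
    }

    private static void report(String host, String kind, long millis) {
        String flag = isSuspect(millis) ? "  <-- SLOW, check DNS" : "";
        System.out.println(host + " " + kind + " lookup: " + millis + " ms" + flag);
    }
}
```

Something like "java DnsProbe host1 host2" run periodically from each
machine in the djinn would at least show which hosts have slow or failing
reverse lookups before Reggie or a LookupCache trips over them.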

Thanks,
Chris
