In addition, CodeSource.implies() also causes DNS checks. I'm not 100% sure about the JVM code, but Harmony uses SocketPermission.implies() to check whether one CodeSource implies another. I believe the JVM policy implementation does the same, because Harmony's implementation was built from Sun's Java spec.

So in the existing policy implementations, parsing the policy files may incur additional start-up delays, because CodeSource.implies() makes network DNS calls.

In my ConcurrentPolicyFile implementation (a replacement for the standard Java PolicyFile implementation), I've created a URIGrant. I've taken code from Harmony to implement implies(ProtectionDomain pd), which performs wildcard matching compliant with CodeSource.implies(); the only difference is that no attempt is made to resolve URIs.
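
A minimal sketch of DNS-free matching along these lines, assuming grants and code-source locations are both available as java.net.URI. UriMatcher and its methods are hypothetical names, not the URIGrant code itself, and host wildcards are left out for brevity; only the scheme, host, port and the "/*" and "/-" path wildcards are compared, all as plain text:

```java
import java.net.URI;

/** Hypothetical helper for illustration; not the actual URIGrant code. */
final class UriMatcher {
    private UriMatcher() {}

    /** True if the grant URI implies the location URI, using text comparison only. */
    static boolean implies(URI grant, URI location) {
        if (grant.equals(location)) {
            return true; // URI.equals never touches the network
        }
        if (!sameIgnoreCase(grant.getScheme(), location.getScheme())) {
            return false;
        }
        // Simplification: hosts must match textually (case-insensitive).
        if (!sameIgnoreCase(grant.getHost(), location.getHost())) {
            return false;
        }
        // A grant without a port (-1) places no constraint on the port.
        if (grant.getPort() != -1 && grant.getPort() != location.getPort()) {
            return false;
        }
        return pathImplies(grant.getPath(), location.getPath());
    }

    private static boolean pathImplies(String grant, String actual) {
        if (grant == null || actual == null) {
            return false; // opaque URIs are not handled in this sketch
        }
        if (grant.endsWith("/-")) {
            // Recursive wildcard: everything beneath the directory.
            return actual.startsWith(grant.substring(0, grant.length() - 1));
        }
        if (grant.endsWith("/*")) {
            // Single-level wildcard: entries directly inside the directory.
            String dir = grant.substring(0, grant.length() - 1);
            return actual.startsWith(dir) && actual.indexOf('/', dir.length()) == -1;
        }
        return grant.equals(actual);
    }

    private static boolean sameIgnoreCase(String a, String b) {
        return a == null ? b == null : a.equalsIgnoreCase(b);
    }
}
```

Because only strings are compared, a grant written with a host name will not imply a location written with the matching IP address; that narrower scope is the price of avoiding resolution.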

Most policy files specify file-based URLs for CodeSource; however, in a network application where many CodeSources may be network URLs, DNS lookups add delays.

I've also created a CodeSourceGrant, which uses CodeSource.implies() for backward compatibility with existing Java policy files; however, I expect most people will simply want to revise their policy files.

The standard interface PermissionGrant is implemented by the following inheritance hierarchy of immutable classes:

                       PrincipalGrant
            _________________|_________________
           |                                   |
  ProtectionDomainGrant                 CertificateGrant
           |                         __________|__________
           |                        |                     |
    ClassLoaderGrant             URIGrant          CodeSourceGrant


Only PrincipalGrant is publicly visible; a builder returns the correct implementation.

ProtectionDomainGrant and ClassLoaderGrant are dynamically granted, by the completely new DynamicPolicyProvider (which has long since passed all tests).

CertificateGrant, URIGrant and CodeSourceGrant are used by the file-based policies and by RemotePolicy, which is intended to be a service that nodes in a djinn can use to let an administrator update the policy (e.g. to include new certificates or principals), with all the protection of subject authentication and secure connections. RemotePolicy is idempotent: the policy is updated in one operation, so the current policy state is always known to the administrator (who is a client).

Since a file-based policy is mostly read and only written when refreshed, PermissionGrants are held in an array published through a volatile reference; any code that reads the array copies only the reference. The reference is replaced when the policy is updated, and the array is never mutated after publishing.
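
The publication pattern described here can be sketched roughly as follows. GrantTable is a hypothetical holder used only for illustration, with a generic element type standing in for PermissionGrant:

```java
import java.util.function.Predicate;

/** Hypothetical holder illustrating read-mostly publication via a volatile array. */
final class GrantTable<G> {
    // The array is never mutated after it has been assigned to this field.
    private volatile G[] grants;

    GrantTable(G[] initial) {
        this.grants = initial.clone();
    }

    /** Readers perform one volatile read, then iterate over a stable snapshot. */
    int countMatching(Predicate<? super G> test) {
        G[] snapshot = grants; // copies only the reference
        int matches = 0;
        for (G grant : snapshot) {
            if (test.test(grant)) {
                matches++;
            }
        }
        return matches;
    }

    /** A refresh builds a complete replacement and publishes it in a single write. */
    void refresh(G[] replacement) {
        grants = replacement.clone();
    }
}
```

A reader that took its snapshot before a refresh simply finishes against the old array, so no lock is needed on the read path.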

A ConcurrentMap<ProtectionDomain, PermissionCollection> (with weak keys) acts as a cache. I've also got ConcurrentPermissions, an implementation that replaces the heterogeneous java.security.Permissions class; it also resolves any unresolved permissions.

However, I'm starting to wonder if it would be wiser to throw away the cache altogether and simply build java.security.Permissions on demand, then discard it immediately after use so that it is collected in the young generation (it's likely to fit in the level 2 cache and never even be copied to RAM). This would eliminate contention between existing PermissionCollections that block, like SocketPermissionCollection.
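
As a rough sketch of the build-on-demand idea, assuming the granted permissions for a domain are already available as an array (OnDemandCheck is a hypothetical name, and the real policy would collect the array from its grants):

```java
import java.security.Permission;
import java.security.Permissions;

/** Hypothetical helper: builds a throwaway Permissions instance for each check. */
final class OnDemandCheck {
    private OnDemandCheck() {}

    static boolean implies(Permission[] granted, Permission requested) {
        // The collection is confined to the calling thread, so two threads
        // never wait on the same PermissionCollection instance.
        Permissions perms = new Permissions();
        for (Permission p : granted) {
            perms.add(p);
        }
        return perms.implies(requested);
        // perms becomes garbage here and is reclaimed as short-lived young data.
    }
}
```

The trade-off is repeated construction work on every check in exchange for having no shared, lockable state between threads.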

So if you have, for instance, 100 different AccessControlContexts being checked by different threads, all containing the same ProtectionDomains for a SocketPermission, then all will be executed in parallel. Currently, due to blocking, each SocketPermission that performs a DNS check must either resolve or time out before its SocketPermissionCollection can release its synchronization lock (and there may be multiple SocketPermissions in a SocketPermissionCollection), and only then can another thread check its context, and so on, which explains everything coming to a standstill.

If all permission checks execute in parallel independently, without blocking, then the timeout won't be magnified.

I am considering going one step further: replacing SocketPermission and SocketPermissionCollection, and implementing DNS checks in the SocketPermissionCollection rather than in SocketPermission. That way a matching record will be found in most cases without requiring a reverse DNS lookup. If I keep this as an internal policy implementation detail, then if Oracle fixes SocketPermission we can return to using the standard Java implementation; in fact, I could make it a configuration property.

It's an unfortunate fact that not all permission checks are performed in the policy, so replacing SocketPermission also requires the cooperation of the SecurityManager. To make matters worse, static ProtectionDomains created before my policy implementation is constructed will never consult it, so they will still contain SocketPermission. The SecurityManager would therefore need to check each ProtectionDomain for both implementations, which means reimplementing SocketPermission doesn't eliminate its use entirely.

It's worth noting that SocketPermission is implemented rather poorly and the same functionality can be provided with far fewer DNS lookups being performed, since the majority are performed completely unnecessarily. Perhaps it's worth me donating some time to OpenJDK to fix it, I'd have to check with Apache legal first I suppose.

The problems with DNS lookup also affect the CodeSource and URL equals and hashCode methods, so these classes shouldn't be used in collections.
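
One way to keep locations in a collection without triggering lookups is to key on java.net.URI, whose equals and hashCode are purely textual (host and scheme are compared case-insensitively). A small sketch, with CodeLocations as a hypothetical name:

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical registry keyed on URI rather than URL or CodeSource. */
final class CodeLocations {
    private final Set<URI> known = new HashSet<>();

    /** Returns true if the location was not registered before. */
    boolean register(String location) {
        return known.add(URI.create(location).normalize());
    }

    /** Lookup uses URI.hashCode and URI.equals, neither of which resolves hosts. */
    boolean isKnown(String location) {
        return known.contains(URI.create(location).normalize());
    }
}
```

Not thread-safe as written; a concurrent set would be needed if it were shared between threads.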

Cheers,

Peter.

Christopher Dolan wrote:
To simulate the problem, go to InetAddress.getHostFromNameService() in your IDE, set a 
breakpoint on the "nameService.getHostByAddr" line with a condition of 
something like this:

     new java.util.concurrent.CountDownLatch(1).await(15, java.util.concurrent.TimeUnit.SECONDS)

then launch your River application from within the IDE. This will cause all 
reverse DNS lookups to stall for 15 seconds before succeeding. This will affect 
Reggie the worst because it has to verify so many hostnames. In a large group 
(a few thousand services) this will drive Reggie's thread count skyward, 
perhaps triggering OutOfMemory errors if it's in a 32-bit JVM.

This problem happens in the real world in facilities that allow client connections to the 
production LAN, but do not allow the production LAN to resolve hosts in the client LAN. 
This may occur due to separate IT teams or strict security rules or simple configuration 
errors. Because most client-server systems, like web servers, do not require the server 
to contact the client this problem does not become immediately visible to IT. Instead, 
the question is inevitably "Why is Jini/River so sensitive to reverse DNS? All of my 
other services work fine."

Chris

-----Original Message-----
From: Tom Hobbs [mailto:tvho...@googlemail.com]
Sent: Monday, December 12, 2011 1:43 PM
To: dev@river.apache.org
Subject: Re: RE: Implications for Security Checks - SocketPermission, URL and 
DNS lookups

My biggest concern with such fundamental changes is controlling the impact
it will have.  I'm a pretty good example of this, I haven't experienced the
troubles these changes are intended to overcome.  I also haven't made
any attempt to dive into these areas of the code, for any reason.

Is it possible to put together a test case which exposes these problems and
also proves the solution?

Obviously, a test case involving misconfigured networks is daft; in that
instance a handy "is your network misconfigured" diagnostic tool or
documentation would be a good idea.

Please don't interpret this concern as a criticism of your work, Peter.
Far from it.  It's just a comment born out of not really having any contact
with the area you're working in!


Grammar and spelling have been sacrificed on the altar of messaging via
mobile device.

On 12 Dec 2011 18:01, "Christopher Dolan" <christopher.do...@avid.com>
wrote:

Specifically for SocketPermission, I experienced severe timeout problems
with reverse DNS misconfigurations. For some LAN-based deployments, I
relaxed this criterion via 'new SocketPermission("*",
"accept,listen,connect,resolve")'. This was difficult to apply to a general
Sun/Oracle JVM, however, because the default security policy *prepends* a
("localhost:1024-","listen") permission that triggers the reverse DNS
lookup. To avoid this inconvenient setting, I install a new
java.security.Policy subclass that delegates to the default Policy except
when the incoming permission is a SocketPermission. That way I don't need
to modify the policy file in the JVM. The Policy.implies() override method
is trivial because it just needs to do " if (permission instanceof
SocketPermission) { ... }". The PermissionCollection methods were trickier
to override (skip over any SocketPermission elements in the default
Policy's PermissionCollection), but still only about 50 LOC.
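
A rough sketch of what such a delegating Policy might look like. The class name is hypothetical, and the body of the SocketPermission branch is an assumption (it simply grants every SocketPermission, mirroring the "*" relaxation described above); the PermissionCollection overrides are not shown:

```java
import java.net.SocketPermission;
import java.security.Permission;
import java.security.Policy;
import java.security.ProtectionDomain;

/** Hypothetical wrapper; Policy is deprecated in recent JDKs but still compiles. */
@SuppressWarnings("removal")
final class SocketRelaxingPolicy extends Policy {
    private final Policy delegate;

    SocketRelaxingPolicy(Policy delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean implies(ProtectionDomain domain, Permission permission) {
        if (permission instanceof SocketPermission) {
            // Assumption: treat every SocketPermission as granted so that
            // SocketPermission.implies() and its DNS lookups are never reached.
            return true;
        }
        // Everything else is decided by the wrapped (default) policy.
        return delegate.implies(domain, permission);
    }
}
```

On a JVM that still supports a security manager, this would be installed with Policy.setPolicy(new SocketRelaxingPolicy(Policy.getPolicy())). It is only suitable for deployments where relaxing socket checks is acceptable.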

Chris

-----Original Message-----
From: Peter Firmstone [mailto:j...@zeus.net.au]
Sent: Friday, December 09, 2011 9:28 PM
To: dev@river.apache.org
Subject: Implications for Security Checks - SocketPermission, URL and DNS
lookups

DNS lookups and reverse lookups caused by the equals, hashCode and
implies methods of URL and SocketPermission create some serious
performance problems for distributed programs.

The concurrent policy implementation I've been working on reduces lock
contention between threads performing security checks.

When the SecurityManager is used to check a guard, it calls the
AccessController, which retrieves the AccessControlContext from the call
stack. This contains all the ProtectionDomains on the call stack (I
won't go into privileged calls here). If a ProtectionDomain is dynamic,
it will consult the Policy prior to checking the static permissions it
contains.

The problem with the old policy implementation is lock contention caused
by multiple threads all using multiple ProtectionDomains, when the time
taken to perform a check is considerable, especially where identical
security checks might be performed by multiple threads executing the
same code.

Although concurrent policy reduces contention between ProtectionDomains'
calls to Policy.implies, there remain some fundamental problems with the
implementations of SocketPermission and URL that cause unnecessary DNS
lookups during the equals(), hashCode() and implies() methods.

The following bugs concern SocketPermission (please read before
continuing) :

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6592285
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4975882 - contains a
lot of valuable comments.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4671007 - fixed,
perhaps incorrectly.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6501746

Anyway to cut a long story short, DNS lookups and DNS reverse lookups
are performed for the equals and hashCode implementations in
SocketPermission and URL, with disastrous performance implications for
policy implementations using collections and caching security permission
check results.

For example, once a SocketPermission guard has been checked for a
specific AccessControlContext, the result is cached by my SecurityManager
to avoid repeat security checks. However, if that cache contains
SocketPermission, DNS lookups will be required, and the cache will
perform slower than some other directly performed security checks!  The
cache is intended to return quickly to avoid reconsulting every
ProtectionDomain on the stack.

To make matters worse, when checking a SocketPermission guard, the DNS
may be consulted for every non-wildcard SocketPermission contained
within a SocketPermissionCollection, up until it is implied.  DNS checks
are being made unnecessarily, since the wildcard that matches may not
require a DNS lookup at all, but because the non-matching
SocketPermissions are being checked first, the DNS lookups and reverse
lookups are still performed.  This could be fixed completely by moving
the responsibility for DNS lookups from SocketPermission to
SocketPermissionCollection.
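
To illustrate the idea of letting the collection own name resolution, here is a simplified sketch. HostGrants is a hypothetical stand-in (host names only, no ports or actions), and the resolver is passed in as a function so the number of lookups can be observed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

/** Hypothetical collection that tries text matches before any name resolution. */
final class HostGrants {
    private final List<String> patterns = new ArrayList<>();
    private final Function<String, String> resolver; // name -> address, or null
    int lookups; // resolver calls made so far (illustration only, not thread-safe)

    HostGrants(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    void add(String hostPattern) {
        patterns.add(hostPattern.toLowerCase(Locale.ROOT));
    }

    boolean implies(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        // Pass 1: exact names and wildcards, compared as text with no DNS.
        for (String p : patterns) {
            if (p.equals("*") || p.equals(h)) {
                return true;
            }
            if (p.startsWith("*.") && h.endsWith(p.substring(1))) {
                return true;
            }
        }
        // Pass 2: only now resolve, and only the non-wildcard entries.
        String target = resolve(h);
        if (target == null) {
            return false;
        }
        for (String p : patterns) {
            if (!p.startsWith("*") && target.equals(resolve(p))) {
                return true;
            }
        }
        return false;
    }

    private String resolve(String name) {
        lookups++;
        return resolver.apply(name);
    }
}
```

With this ordering, a request that is covered by a wildcard entry returns without any resolver call, regardless of how many non-matching entries precede the wildcard.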

Two SocketPermissions are equal if they resolve to the same IP address,
but their hashCodes are different! See bug 6592623.

A SocketPermission with an IP address and one with a DNS name that
resolves to the identical IP address should not (in my opinion) be
equal, but they are!  One SocketPermission should only imply the other
while DNS resolves to the same IP address; otherwise the equality of the
two SocketPermissions will change if the IP address is assigned to a
different domain!  Object equality / identity shouldn't depend on the
result of a possibly unreliable network source.

SocketPermission and SocketPermissionCollection are broken; the only
solution I can think of is to re-implement these classes (from Harmony)
in the policy and SecurityManager, substituting the existing JVM
classes.  This would not be visible to client developers.

SocketPermissions may also exist in a ProtectionDomain's static
Permissions; these would have to be converted by the policy when merging
the permissions from the ProtectionDomain with those from the policy.
Since ProtectionDomain attempts to check its own internal permissions
after the policy permission check fails, DNS checks are currently
performed by duplicate SocketPermissions residing in the
ProtectionDomain. This will no longer occur, since the permission being
checked will be converted to, say, for argument's sake,
org.apache.river.security.SocketPermission.  However, because some
ProtectionDomains are static, they never consult the policy, so the
Permissions contained in each ProtectionDomain will require conversion
too. Doing so will require extending and implementing a
ProtectionDomain that encapsulates the existing ProtectionDomains in the
AccessControlContext, by utilising a DomainCombiner.

For CodeSource grants, the policy-file-based grants are defined by
URLs; however, a URL's identity depends upon DNS record results, similar
to the SocketPermission equals and hashCode implementations, which we
have no control over.

I'm thinking about implementing URI-based grants instead, to avoid DNS
lookups, then allowing a policy compatibility mode to be enabled (with
logging) that falls back to CodeSource grants when a URL cannot be
converted to a URI. This is a much simpler fix than the SocketPermission
problem.
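
A minimal sketch of the conversion step with a logged fallback. GrantLocations and toUri are hypothetical names, and an empty result is assumed to be the signal to use a CodeSource-based grant instead:

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Optional;
import java.util.logging.Logger;

/** Hypothetical helper for choosing between URI-based and CodeSource-based grants. */
final class GrantLocations {
    private static final Logger LOG = Logger.getLogger(GrantLocations.class.getName());

    private GrantLocations() {}

    /** Returns the location as a URI, or empty if the caller should fall back. */
    static Optional<URI> toUri(URL location) {
        if (location == null) {
            return Optional.empty();
        }
        try {
            // URL.toURI() only re-parses the string form; it does not resolve the host.
            return Optional.of(location.toURI());
        } catch (URISyntaxException e) {
            LOG.warning("Falling back to CodeSource grant for " + location
                    + ": " + e.getMessage());
            return Optional.empty();
        }
    }
}
```

A URL containing characters that are illegal in a URI, such as an unescaped space in the path, is one case that takes the fallback branch.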

For Dynamic Policy Grants, because ProtectionDomain doesn't override
equals (that's a good thing), the contained CodeSource must also be
checked, again potentially slowing down permission checks with DNS
lookups, simply because CodeSource uses URLs.  Changing the Dynamic
Grants to use URI-based comparison would be relatively simple, since
the URI is obtained dynamically when the dynamic grant is created.

URI-based grants don't use DNS resolution and would have a narrower
scope of implied CodeSources: an IP-based grant won't imply a CodeSource
whose URL uses a DNS name, and vice versa.  Rather than rely on DNS
resolution, grants could be made specifically for IPv4, IPv6 and DNS
names in policy files.  URL.toURI() can be utilised to check if URI
grants imply a CodeSource without resorting to DNS.

Any thoughts, comments or ideas?

N.B. It's sad that security is implemented the way it is, it would be
far better if it was Executor based, since every protection domain could
be checked in parallel, rather than in sequence.
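
Purely as an illustration of that idea, and not something the existing security architecture supports, a parallel check might be sketched as below. Each protection domain is modelled as a hypothetical Predicate over the requested permission, and the executor is supplied by the caller:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Predicate;

/** Hypothetical sketch: every domain on the stack is consulted concurrently. */
final class ParallelDomainCheck {
    private ParallelDomainCheck() {}

    /** Returns true only if every domain implies the permission. */
    static <P> boolean allImply(ExecutorService pool,
                                List<Predicate<P>> domains,
                                P permission)
            throws InterruptedException, ExecutionException {
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (Predicate<P> domain : domains) {
            tasks.add(() -> domain.test(permission));
        }
        // invokeAll waits for all tasks, so one slow check does not
        // serialise the others behind it.
        for (Future<Boolean> result : pool.invokeAll(tasks)) {
            if (!result.get()) {
                return false;
            }
        }
        return true;
    }
}
```

The total wait is then bounded by the slowest single check rather than by the sum of all of them.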

Regards,

Peter.




