Here is a good presentation on search security from the Infonortics Search Conference that was held a few weeks ago.

http://www.infonortics.com/searchengines/sh09/slides/kehoe.pdf

The approach you are using is called early binding. As Jay mentioned, one of its downsides is that you have to update the documents each time an ACL changes. The alternative is late binding, which checks each result after the query runs but before it is displayed to the user. I don't recommend that approach: it strains your security infrastructure, because you need to check whether the user can access every single result.
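
To make the late-binding cost concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `late_binding_filter`, `CountingAcl`, and the sample ids are hypothetical names, and the in-memory check stands in for whatever security system you actually have to call.

```python
from typing import Callable, List


def late_binding_filter(
    hits: List[str],
    user_id: str,
    can_access: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the hits the user may see, asking the access check once per hit."""
    return [doc_id for doc_id in hits if can_access(user_id, doc_id)]


class CountingAcl:
    """Hypothetical stand-in for a real security system; counts how often it is asked."""

    def __init__(self, allowed):
        self.allowed = set(allowed)  # (user_id, doc_id) pairs that are permitted
        self.calls = 0

    def check(self, user_id: str, doc_id: str) -> bool:
        self.calls += 1
        return (user_id, doc_id) in self.allowed


acl = CountingAcl({("alice", "doc1"), ("alice", "doc3")})
hits = ["doc1", "doc2", "doc3", "doc4"]  # what the search engine returned
visible = late_binding_filter(hits, "alice", acl.check)
```

One access check is made per hit, so the load on the security system grows with the size of every result set, which is the strain I was referring to.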

Good luck.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com




On May 12, 2009, at 1:21 PM, Jay Hill wrote:

The only downside would be that you would have to update a document anytime a user was granted or denied access. You would have to query before the update to get the current values for grantedUID and deniedUID, remove/add values, and update the index. If you don't have a lot of changes in the system that wouldn't be a big deal, but if a lot of changes are happening throughout the day you might have to queue requests and batch them.
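
A rough sketch of that queue-and-batch idea in Python, under stated assumptions: a plain dict stands in for the Solr index, `apply_acl_changes` is a hypothetical helper, and the rule that a grant clears an earlier deny (and vice versa) is my guess at the business rules, not something settled in this thread.

```python
from collections import defaultdict


def apply_acl_changes(index, changes):
    """Apply queued ACL changes with one read and one write per document.

    index:   dict of doc_id -> {"grantedUID": set, "deniedUID": set}
             (an in-memory stand-in for the real index)
    changes: list of (doc_id, uid, action), action being "grant" or "deny"
    Returns the number of documents rewritten.
    """
    # Group the queued changes so each document is touched only once.
    by_doc = defaultdict(list)
    for doc_id, uid, action in changes:
        by_doc[doc_id].append((uid, action))

    writes = 0
    for doc_id, doc_changes in by_doc.items():
        doc = index[doc_id]  # "query before the update"
        granted = set(doc["grantedUID"])
        denied = set(doc["deniedUID"])
        for uid, action in doc_changes:
            if action == "grant":
                granted.add(uid)
                denied.discard(uid)  # assumption: a grant clears a deny
            elif action == "deny":
                denied.add(uid)
                granted.discard(uid)  # assumption: a deny clears a grant
            else:
                raise ValueError(f"unknown action: {action!r}")
        # "update the index": one write per document, however many changes it had
        index[doc_id] = {**doc, "grantedUID": granted, "deniedUID": denied}
        writes += 1
    return writes


index = {
    "doc1": {"grantedUID": {"u1"}, "deniedUID": set()},
    "doc2": {"grantedUID": set(), "deniedUID": {"u2"}},
}
queue = [("doc1", "u2", "grant"), ("doc1", "u1", "deny"), ("doc2", "u2", "grant")]
writes = apply_acl_changes(index, queue)
```

Three queued changes turn into two document updates here; the saving grows when many changes hit the same documents between batches.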

-Jay

On Tue, May 12, 2009 at 1:05 PM, Matt Weber <m...@mattweber.org> wrote:

I also work with the FAST Enterprise Search engine and this is exactly how their Security Access Module works. They actually use a modified base-32 encoded value for indexing, but that is because they don't have the luxury of untokenized/un-processed String fields like Solr.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com





On May 12, 2009, at 12:26 PM, Terence Gannon wrote:

Paul -- thanks for the reply, I appreciate it. That's a very practical approach, and is worth taking a closer look at. Actually, taking your idea one step further, perhaps three fields: 1) ownerUid (uid of the document's owner), 2) grantedUid (uids of users who have been granted access), and 3) deniedUid (uids of users specifically denied access to the document). These fields, coupled with some business rules around how they were populated, should cover off all possibilities I think.
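
As a rough illustration of how the three fields might be used at query time, here is a small Python sketch that builds a Lucene-style filter clause for one user. The helper names `quote` and `build_acl_filter` are hypothetical, and letting deniedUid win over ownerUid and grantedUid is an assumed business rule, not something I've settled on.

```python
def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and embedded quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_acl_filter(uid: str) -> str:
    """Build a filter clause restricting results to documents this uid may see.

    Assumed rule: the user must be the owner or explicitly granted access,
    and must not appear in deniedUid (deny takes precedence).
    """
    q = quote(uid)
    return f"+(ownerUid:{q} OR grantedUid:{q}) -deniedUid:{q}"


acl_filter = build_acl_filter("42")
# acl_filter is '+(ownerUid:"42" OR grantedUid:"42") -deniedUid:"42"'
```

The resulting string could be sent alongside the user's query, for example as a filter query (`fq`), by whatever trusted application sits in front of Solr.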

Access to the Solr instance would have to be tightly controlled, but that's something that should be done anyway. You sure wouldn't want end users preparing their own XML and throwing it at Solr -- it would be pretty easy to figure out how to get around the access/denied fields and get at stuff the owner didn't intend.

This approach mimics to some degree what is being done in the operating system, but it's still elegant and provides the level of control required. Anybody else have any thoughts in this regard? Has anybody implemented anything similar, and if so, how did it work? Thanks, and best regards...

Terence



