Markus,

> I am working on a Document Management System where every 
> document has an Access Control List attached to it. Obviously 
> a search result should only consist of documents that may be 
> viewed by the currently logged in user.
> 
> I can think of three strategies to accomplish this goal:
> 
> 1) using Filter and FilteredQuery
> 2) filtering the search result
> 3) somehow storing the ACL elements as Lucene fields
> 
> But each approach has serious drawbacks.
> 
> The first one degrades rapidly as the number of documents increases.
> Think of determining the viewability of 10,000 documents 
> where you need several SQL queries per document.
> 
> The second approach also degrades badly when a user has 
> access to a very small subset of all documents. There could 
> be thousands of false hits before the first viewable document 
> is reached.
> 
> The third approach looks most promising to me but would 
> require to update Lucene documents whenever an ACL changes. 
> Unfortunately it is not possible to update Lucene documents 
> without losing fields that are indexed but not stored, right?
> 
> So my question is: is there another approach or a "standard solution"
> I did not think of? Or how did others solve this problem?

We use a combination of the first and second approaches in our applications.
We first filter by the content areas the user is allowed to view, and then
filter the search results that come back. It's actually very fast for us
because we don't have to load a document to check its permissions; we just
query an API that caches all the permissions. SQL is only required for loading
the documents that are visible on a given result page (assuming the document
isn't already in the cache).
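
To make the two-stage idea concrete, here is a rough sketch in plain Java.
It is illustrative only: the Hit and PermissionCache classes and the
visiblePage method are hypothetical names, not Lucene API. In a real setup
the content-area restriction would more likely be applied at query time (for
example with a Lucene Filter) rather than inside the loop, and the cache would
sit in front of the ACL store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AclSearchSketch {

    /** Hypothetical search hit: a document id and the content area it lives in. */
    static final class Hit {
        final String docId;
        final String contentArea;

        Hit(String docId, String contentArea) {
            this.docId = docId;
            this.contentArea = contentArea;
        }
    }

    /** Hypothetical in-memory permission cache: user -> ids of viewable documents. */
    static final class PermissionCache {
        private final Map<String, Set<String>> viewableByUser;

        PermissionCache(Map<String, Set<String>> viewableByUser) {
            this.viewableByUser = viewableByUser;
        }

        boolean canView(String user, String docId) {
            Set<String> ids = viewableByUser.get(user);
            return ids != null && ids.contains(docId);
        }
    }

    /**
     * Walks the ranked hits in order. The coarse check drops hits outside the
     * user's content areas, the fine check asks the cache about each remaining
     * document. Stops as soon as the requested page is full, so no document
     * has to be loaded just to test its permissions.
     */
    static List<String> visiblePage(List<Hit> rankedHits, String user,
                                    Set<String> allowedAreas, PermissionCache cache,
                                    int offset, int pageSize) {
        List<String> page = new ArrayList<>();
        if (pageSize <= 0) {
            return page;
        }
        int visibleSeen = 0;
        for (Hit hit : rankedHits) {
            if (!allowedAreas.contains(hit.contentArea)) continue; // coarse filter
            if (!cache.canView(user, hit.docId)) continue;         // per-document check
            if (visibleSeen++ < offset) continue;                  // belongs to an earlier page
            page.add(hit.docId);
            if (page.size() == pageSize) break;                    // page is full
        }
        return page;
    }
}
```

Only the ids returned for the current page would then need to be loaded from
the database, and only if they are not already cached.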

The third approach was deemed unusable for the exact reason you outlined.


Regards,

Bruce Ritchie
