Thanks for the replies,

We'll try the filters then, possibly with caching if performance requires it.

@Karsten: We did think about simplifying permissions to just top-level folders, which is probably suitable for 80% of our clients. If the filter is too slow, we may have to. In that case it gets a lot simpler: we can add an extra field for what we call a "zone" and use just a term query, with no need for a prefix or wildcard query anymore, and thus no more max clause count errors.
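
To make the "zone" idea concrete, here is a minimal sketch in plain Java of how the zone value could be derived from a path at index time. The class and method names (ZoneField, zoneOf) are made up for illustration, and it assumes a zone is simply the top-level folder of an absolute path; the real definition of a zone may well differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: derives a "zone" from a path value.
// Assumption: the zone is the top-level folder of an absolute path.
final class ZoneField {
    private ZoneField() {}

    /** Returns the top-level folder of an absolute path, e.g. "/folder B". */
    static String zoneOf(String path) {
        if (path == null || !path.startsWith("/")) {
            throw new IllegalArgumentException("expected an absolute path: " + path);
        }
        // Look for the second slash; everything before it is the top-level folder.
        int next = path.indexOf('/', 1);
        // A path with no second slash (e.g. a file directly under the root) is returned as-is.
        return next < 0 ? path : path.substring(0, next);
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "/folder A/subfolder/document 1.pdf",
                "/folder B/image 1.jpg",
                "/folder B2/image 3.jpg");
        for (String path : paths) {
            System.out.println(zoneOf(path) + " <- " + path);
        }
    }
}
```

Because the zone is stored as one exact value per document, a term query on it matches only that folder: "/folder B" and "/folder B2" are different terms, and no rewrite into many clauses is involved.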

Kind regards,
Nico Krijnen

On 5 aug 2008, at 15:31, Erick Erickson wrote:

This situation is pretty much the kind of thing PrefixFilters
were written for, so I'd certainly try those first, with or
without caching. I was surprised at how fast filters
get constructed, so I'd just try it and take a few measurements.

Best
Erick


On 5 aug 2008, at 11:11, Karsten F. wrote:


Hi Nico Krijnen,

I think it is OK to store a filter for each user session in memory.
And I think that a cached filter is the correct approach for permissions.
(extra memory usage = one bit per document for each user)
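
As a rough back-of-the-envelope check of that estimate, the sketch below computes the size of one bit per document for a number of cached filters. The document and session counts are hypothetical numbers picked for illustration, and the calculation ignores any per-object overhead.

```java
// Rough estimate only; the counts used in main are hypothetical.
final class FilterMemoryEstimate {
    private FilterMemoryEstimate() {}

    /** Bytes needed for one bit per document, rounded up to whole bytes. */
    static long bytesPerFilter(long documents) {
        return (documents + 7) / 8;
    }

    /** Bytes needed when one such filter is cached per user session. */
    static long totalBytes(long documents, long cachedFilters) {
        return bytesPerFilter(documents) * cachedFilters;
    }

    public static void main(String[] args) {
        long documents = 1_000_000L; // assumed index size
        long sessions = 200L;        // assumed number of concurrent user sessions
        System.out.println("bytes per filter: " + bytesPerFilter(documents));
        System.out.println("bytes in total:   " + totalBytes(documents, sessions));
    }
}
```

With these assumed numbers, that is 125,000 bytes per filter and 25,000,000 bytes for 200 sessions.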

Hopefully someone with more experience will also answer your question.

But I want to ask the obvious question:

Is your permission-policy really on each file, or only on the top-most
folders?
Can't you store only the relevant path in an extra Lucene field and raise the
maximum number of query clauses to, e.g., 2048?

Best regards
 Karsten

On Tue, Aug 5, 2008 at 3:40 AM, Nico Krijnen <[EMAIL PROTECTED]> wrote:

Hello,

Need some help with prefix filtering...
We ran into the max clause count problem with our usage of the wildcard
query. Essentially what we are trying to do is:

One of the fields in our index contains a 'path' representing a file system
location. For example:

/folder A/subfolder/document 1.pdf
/folder B/image 1.jpg
/folder B/image 2.jpg
/folder B2/image 3.jpg
/folder C/image 4.jpg

We have a security layer in our application that filters results based on the user's permissions. These permissions (VIEW, EDIT, ...) can be set on 'folder paths'. To filter the results we build a boolean query with a wildcard (or prefix) query for each folder for which the user has VIEW permission,
for example:

/folder A/subfolder/*
/folder B/*
/folder B2/*

This does exactly what we want, but because a wildcard query is
rewritten to term queries, it fails when there are more than 1024 documents below a folder (max clause count of the rewritten boolean query). After all, each document has a different (untokenized) term value for the 'path' field.

After searching the web we found some alternative methods, for example using a PrefixFilter wrapped in a CachingWrapperFilter instead of a query. Before we start implementing, I'd like to check whether anyone here has more experience with queries like this or a better suggestion on
how to approach this.
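
To illustrate why a filter sidesteps the clause limit, here is a small plain-Java model of the idea. It is not Lucene code and not Lucene's actual implementation: the "index" is just a sorted map from path term to document number, and the names (PrefixBits, bitsForPrefix, allowedDocs, sampleIndex) are made up for this sketch. The point is that matching documents are recorded as bits, so the number of matching terms never turns into a number of query clauses.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of prefix filtering; all names here are hypothetical.
final class PrefixBits {
    private PrefixBits() {}

    /** Sets one bit for every document whose path term starts with the prefix. */
    static BitSet bitsForPrefix(TreeMap<String, Integer> pathToDoc, String prefix, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        // Terms are sorted, so all terms sharing the prefix are contiguous,
        // starting at the first term that is >= the prefix.
        for (Map.Entry<String, Integer> entry : pathToDoc.tailMap(prefix, true).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break; // past the last term with this prefix
            }
            bits.set(entry.getValue());
        }
        return bits;
    }

    /** ORs together the bits of every folder prefix the user may view. */
    static BitSet allowedDocs(TreeMap<String, Integer> pathToDoc, List<String> folders, int maxDoc) {
        BitSet allowed = new BitSet(maxDoc);
        for (String folder : folders) {
            allowed.or(bitsForPrefix(pathToDoc, folder, maxDoc));
        }
        return allowed;
    }

    /** The example paths from this message, numbered 0..4 in the order listed. */
    static TreeMap<String, Integer> sampleIndex() {
        TreeMap<String, Integer> index = new TreeMap<String, Integer>();
        index.put("/folder A/subfolder/document 1.pdf", 0);
        index.put("/folder B/image 1.jpg", 1);
        index.put("/folder B/image 2.jpg", 2);
        index.put("/folder B2/image 3.jpg", 3);
        index.put("/folder C/image 4.jpg", 4);
        return index;
    }

    public static void main(String[] args) {
        List<String> viewable = Arrays.asList("/folder A/subfolder/", "/folder B/", "/folder B2/");
        BitSet allowed = allowedDocs(sampleIndex(), viewable, 5);
        System.out.println(allowed); // prints {0, 1, 2, 3}
    }
}
```

The resulting bit set depends only on the user's permissions and the index contents, not on the search itself, which is what makes caching it per user attractive.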

Kind regards,
Nico Krijnen



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




