Re: LCF security with Solr

Erik Hatcher Tue, 06 Apr 2010 07:11:37 -0700

Karl -

I appreciate you starting this thread on this important topic. To kick-start some discussion, some thoughts are inline below...


On Apr 6, 2010, at 9:24 AM, Karl Wright wrote:

As many may be aware, the LCF model relies on "access tokens" (e.g. Active Directory SIDs). There are "allow" tokens and "deny" tokens. They are currently dropped on the floor when Solr is involved, but they can readily (and most naturally) be handed to Solr as metadata when a document is ingested.

These tokens are arbitrary strings, right? In other words, the strings from one data source aren't going to be in the same format as those from another data source, as I understand it.

Can you provide some examples of the grant and deny strings one may get from a few different data sources?

Read more about the LCF security model here:

http://cwiki.apache.org/confluence/display/CONNECTORS/Lucene+Connectors+Framework+concepts

My proposal is therefore to do the following:
(1) Choose specific metadata names that LCF will use for "allow" tokens and "deny" tokens;
(2) Write a Solr request handler, which would peel out the special headers that LCF's mod_authz_annotate module puts into the request, and put those into a Solr request object;

Rather than a request handler, which would be too constraining on the Solr configuration of various request handlers, this is probably best as a servlet filter that fronts Solr's dispatch filter and simply adds parameters to the request passed on to Solr.
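To make the servlet filter idea a bit more concrete, here is a rough sketch of just the parameter-rewriting step, written against plain maps so it stands alone. The parameter name "lcf.token" and the class and method names are made up for illustration; nothing in LCF or Solr defines them, and the way tokens are read out of the mod_authz_annotate headers is left out because I don't know that format yet.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: all names here are hypothetical, not part of LCF or Solr.
final class TokenParamSketch {
    // Assumed parameter name that a search component would later read.
    static final String TOKEN_PARAM = "lcf.token";

    // Returns a copy of the request parameters with the tokens taken from
    // the trusted front end added, and any client-supplied value for the
    // same parameter discarded so a caller cannot forge its own tokens.
    static Map<String, List<String>> withTokens(
            Map<String, List<String>> params, List<String> headerTokens) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : params.entrySet()) {
            if (!e.getKey().equals(TOKEN_PARAM)) {
                out.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        out.put(TOKEN_PARAM, new ArrayList<>(headerTokens));
        return out;
    }
}
```

In a real filter the same logic would presumably sit behind a request wrapper that overrides the parameter accessors. The point worth keeping is that the filter overwrites the token parameter rather than appending to it.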

mod_authz_annotate - I need to understand this, but it will be a required front to Solr to take advantage of the grant/deny strings? Is this where the user credentials get processed?

Allowing the search component to pick up the parameters and add the filtering...

(3) Write a Solr search component, which pulls out the access tokens from the Solr request object, and effectively wraps all incoming queries with the appropriate clauses that limit the results returned according to the appropriate "allow" and "deny" metadata matches.
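As a rough illustration of the clauses such a component might add, the sketch below builds a filter string in Lucene query syntax from a user's tokens: a document must carry at least one of the user's tokens as an "allow" value and none of them as a "deny" value. The field names allow_token and deny_token are placeholders (step (1) has not picked names yet), the class is hypothetical, and it assumes the token fields are indexed as untokenized strings.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: field names and class are assumptions, not LCF or Solr API.
final class AccessFilterSketch {
    static final String ALLOW_FIELD = "allow_token"; // placeholder name
    static final String DENY_FIELD = "deny_token";   // placeholder name

    // Backslash-escape characters that are special in Lucene query syntax,
    // so tokens containing punctuation (e.g. the hyphens in a SID) stay literal.
    static String escape(String token) {
        StringBuilder sb = new StringBuilder();
        for (char c : token.toCharArray()) {
            if ("\\+-!():^[]\"{}~*?|&/".indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // field:(t1 OR t2 ...) over the escaped tokens.
    static String anyOf(String field, List<String> tokens) {
        List<String> escaped = new ArrayList<>();
        for (String t : tokens) {
            escaped.add(escape(t));
        }
        return field + ":(" + String.join(" OR ", escaped) + ")";
    }

    // Require an allow match and exclude any deny match.
    static String buildFilter(List<String> userTokens) {
        if (userTokens.isEmpty()) {
            // What a token-less user should see is a policy decision;
            // this sketch refuses to guess.
            throw new IllegalArgumentException("no access tokens supplied");
        }
        return "+" + anyOf(ALLOW_FIELD, userTokens)
             + " -" + anyOf(DENY_FIELD, userTokens);
    }
}
```

Applying the result as a filter rather than folding it into the user's query would keep it out of relevance scoring, though that is a design choice for the component rather than something settled in this thread.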

(a) Is this the right approach (bearing in mind that the LCF security model is pretty deeply ingrained in LCF at this time, and is thus not subject to significant changes);

Seems like a good approach with a servlet filter and search component, although I'm unclear how this will work with more than one data source indexed with different grant/deny formats.

(b) Where should all of this live? Should it be a component of Solr, or a component of LCF?

Good questions! I don't have any strong opinion on this just yet. It's always a toss-up when it comes to placing code that straddles two projects. But I think I lean towards having this in the new lucene/solr trunk as a module. While I'm pretty Solr-centric these days, I can imagine that LCF could have an output connector that writes to Lucene's API directly, and some may find it handy to have common filtering code shared between Lucene and Solr.

(c) The access tokens used by LCF are arbitrary strings, which are usually alphanumeric, but do contain certain punctuation. Would this cause a problem?

Punctuation won't cause a problem, but mapping a search request from a user onto the various grant/deny formats is what I'm not quite understanding just yet. Would there be issues with multiple data sources integrated into one Solr index?


        Erik
