On Wed, Nov 19, 2014 at 5:45 AM, Yonik Seeley <yo...@heliosearch.com> wrote:
> On Tue, Nov 18, 2014 at 3:47 PM, Philip Durbin
> <philip_dur...@harvard.edu> wrote:
>> Solr JOINs are a way to enforce simple document security, as explained
>> by Yonik Seeley at
>> http://lucene.472066.n3.nabble.com/document-level-security-filter-solution-for-Solr-tp4126992p4126994.html
>>
>> I'm trying to tweak this pattern so that I don't have to keep the
>> security information in each of my primary Solr documents.
>>
>> I just posted the gist at
>> https://gist.github.com/pdurbin/4d27fea7b431ef3bf4f9 as an example of
>> my working Solr JOIN based on data in `before.json`. Permissions per
>> user are embedded in the primary documents like this:
>>
>> {
>>   "id": "dataset_3",
>>   "perms_ss": [
>>     "alice",
>>     "bob"
>>   ]
>> },
>> {
>>   "id": "dataset_4",
>>   "perms_ss": [
>>     "alice",
>>     "bob",
>>     "public"
>>   ]
>> },
>>
>> User documents have been created to do the JOIN on:
>>
>> {
>>   "id": "alice",
>>   "groups_s": "alice"
>> },
>>
>> The JOIN looks like this:
>>
>> {!join+from=groups_s+to=perms_ss}id:public+OR+{!join+from=groups_s+to=perms_ss}id:alice
>
> It would probably be faster written as a single join:
> fq={!join+from=groups_s+to=perms_ss}id:(public alice)
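
For reference, here is a minimal sketch of how the two filter forms quoted above could be assembled and URL-encoded before being sent to Solr. The field names `groups_s` and `perms_ss` come from the gist; the helper names (`single_join_fq`, `or_of_joins_fq`, `as_query_string`) and the `q=*:*` default are assumptions made for illustration, not part of any Solr client. The `+` characters in the quoted queries are just URL-encoded spaces, so the sketch builds the strings with plain spaces and lets `urlencode` handle the escaping. It only constructs strings; whether each form matches the expected documents still has to be checked against a running Solr instance.

```python
from urllib.parse import urlencode, parse_qs

# Field names taken from the gist; everything else here is hypothetical.
FROM_FIELD = "groups_s"
TO_FIELD = "perms_ss"


def single_join_fq(ids):
    """Build one join whose inner query lists every id, e.g. id:(public alice)."""
    if not ids:
        raise ValueError("at least one id is required")
    return "{!join from=%s to=%s}id:(%s)" % (FROM_FIELD, TO_FIELD, " ".join(ids))


def or_of_joins_fq(ids):
    """Build one join per id and combine them with OR."""
    if not ids:
        raise ValueError("at least one id is required")
    return " OR ".join(
        "{!join from=%s to=%s}id:%s" % (FROM_FIELD, TO_FIELD, i) for i in ids
    )


def as_query_string(fq, q="*:*"):
    """URL-encode the main query and the filter query (spaces become '+')."""
    return urlencode({"q": q, "fq": fq})


if __name__ == "__main__":
    fq = single_join_fq(["public", "alice"])
    encoded = as_query_string(fq)
    print(encoded)
    # Decoding the query string should give back the original filter.
    assert parse_qs(encoded)["fq"] == [fq]
```

Building the filter from a list of ids keeps the per-user part (here `alice`) separate from the fixed `public` entry, which makes it easy to try both forms against the same data.
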

Hmm, I can't get the single JOIN to work on the "before" example
(perms embedded in each primary doc) in the gist I posted so I guess
I'll live with the slower version with "OR".

> Or, if you're using Heliosearch you could cache the filters separately
> for better hit rates on commonly used perms via the "filter" keyword:
> fq=filter({!join+from=groups_s+to=perms_ss}id:public) OR
> filter({!join+from=groups_s+to=perms_ss}id:alice)

Getting back to my original question about keeping permission
information out of my primary documents, I noticed that
http://heliosearch.org describes the Pseudo-Join feature as "selects a
set of documents based on their relationship to a **second** set of
documents" (emphasis mine) so I assume I can't take the perms out of
my primary Solr documents and put them in a **third** set of
"permission assignments" documents with definition points and role
assignees:
https://gist.github.com/pdurbin/4d27fea7b431ef3bf4f9#file-after-json

That is, the three sets of documents would be:

1. primary (datasets, with no permission info)
2. users
3. permission assignments

So, I guess I'll continue to embed permissions into the primary
documents, since it's working. :)

Thanks, Yonik. I appreciate you taking a look at this.

Phil

--
Philip Durbin
Software Developer for http://dataverse.org
http://www.iq.harvard.edu/people/philip-durbin