Hi,

I've come across an interesting problem with regard to distributed searching,
and thought I'd share it here to see if anyone else has come across it
and/or can comment on the proposed solution:

*Requirement:*
A requirement of my particular Solr environment is that queries are subject
to HTTP authentication (I currently use Jetty basic realm auth, but any HTTP
auth is affected),
i.e. if you don't have a username/password, you can't look at anything.
For most use cases, I'm guessing that queries aren't generally subject to
authentication, hence this post...

*Problem:*
Querying a single server is easy, because my client app creates/manages its
own HttpClient object.
When it comes to querying across shards, the default SearchHandler uses a
'plain-vanilla' HTTP client for the CommonsHttpSolrServer instance that
makes the request to each shard (in HttpCommComponent.submit()).
There is no provision to pass it any credentials.
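
For context, basic auth comes down to an Authorization header on every request, and the shard sub-requests presumably go out without one. A minimal sketch in plain Java of what each shard request would need to carry (the class name BasicAuthHeader is made up for illustration; this is not Solr or HttpClient code):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustration only: builds the value of an HTTP "Authorization" header for basic auth. */
final class BasicAuthHeader {
    static String value(String username, String password) {
        // Basic auth sends "username:password", base64-encoded, after the word "Basic"
        String pair = username + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }
}
```

In practice an HttpClient configured with credentials would add this header itself; the sketch only shows what is missing from the shard requests.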

Perhaps document-level security might be a better way to handle access
control for searching in general, but that's a different can of worms... :-)

*Proposed Solution:*
To allow searching across HTTP-authenticated shards, I propose the
following:

1. Define the syntax of the parameter(s) that carry shard credentials.

2. Modify (or subclass) SearchHandler, in particular the
HttpCommComponent.submit() method, to optionally look for shard-specific
credentials in its ModifiableSolrParams params.
If it finds credentials, it creates/reuses an HttpClient object with these
and passes this to the SolrServer instance for the search request.
Because the credentials parameter would be totally optional, it should be
fine to patch SearchHandler 'in-line' without subclassing, so that
patches/updates will work without having to modify solrconfig.xml.
(Feel free to disagree with me on this!)

3. This also requires a modification to SearchHandler.handleRequestBody() to
extract the credentials parameter(s) and pass these on to the submit()
request (similar to what it does now for SHARDS_QT).

4. Clients would populate their sharded query request with the defined
parameter(s) for each shard (I'm using SolrJ so there's app logic to do
this, but it should be OK for other client types too).
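
To make steps 1 and 2 a bit more concrete, here is one possible shape for the credentials parameter, sketched in plain Java. Everything in it is an assumption for the sake of discussion: the parameter name "shards.credentials", the positional comma-separated syntax (the nth entry belongs to the nth entry of "shards", and an empty entry means no auth), and the ShardCredentials class itself. None of this exists in Solr today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: pairs each shard with optional credentials from a hypothetical parallel parameter. */
final class ShardCredentials {
    final String username;
    final String password;

    ShardCredentials(String username, String password) {
        this.username = username;
        this.password = password;
    }

    /**
     * @param shards      value of the "shards" parameter, e.g. "host1:8983/solr,host2:8983/solr"
     * @param credentials value of the hypothetical "shards.credentials" parameter,
     *                    e.g. "alice:s3cret,bob:pw"; may be null
     * @return shard address mapped to its credentials, or to null if none were given
     */
    static Map<String, ShardCredentials> parse(String shards, String credentials) {
        String[] shardList = shards.split(",");
        // A limit of -1 keeps empty entries, so positions stay aligned with the shard list
        String[] credList = credentials == null ? new String[0] : credentials.split(",", -1);
        Map<String, ShardCredentials> result = new LinkedHashMap<>();
        for (int i = 0; i < shardList.length; i++) {
            String entry = i < credList.length ? credList[i].trim() : "";
            ShardCredentials creds = null;
            // Split on the first colon only, so the password itself may contain colons
            int colon = entry.indexOf(':');
            if (colon > 0) {
                creds = new ShardCredentials(entry.substring(0, colon), entry.substring(colon + 1));
            }
            result.put(shardList[i].trim(), creds);
        }
        return result;
    }
}
```

With something like this, submit() could look up the credentials for the shard it is about to query and only build an authenticated HttpClient when the lookup returns non-null. One known limitation of this particular syntax: a password containing a comma would need some form of escaping.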

I admit I'm not an expert on SearchHandler inner workings, so if there are
other code paths that would be affected by this, or any other potential
issues, any advice/insight is greatly appreciated!
If anyone thinks this is a barmy idea, or has come up with a better
solution, please say!

Many thanks,
Peter
