Ryan Ernst created SOLR-4643:
--------------------------------
Summary: Refactor shard handler (and factory) to make pieces more
pluggable
Key: SOLR-4643
URL: https://issues.apache.org/jira/browse/SOLR-4643
Project: Solr
Issue Type: Improvement
Reporter: Ryan Ernst
Over the past few weeks I've been trying to write my own shard handler/factory,
and it is a bit of a pain. The pieces that I don't want to reimplement are
tied very closely with those that I do.
I believe the current design is as follows:
ShardHandlerFactory - created once and shared across cores (except in some
legacy case where it is per core?). It holds the "heavyweight" objects, such as
the thread pool for parallelizing requests and the HttpClient. It also keeps a
SolrJ load balancer object.
ShardHandler - created per request. It has the logic for determining whether a
request is distributed and for sending the requests in parallel (using an
executor from the parent factory object). The knowledge of how to send requests
and parse responses is embedded within the parallelization piece (through SolrJ
code).
I've made two attempts so far to make this easier to plug into:
https://issues.apache.org/jira/browse/SOLR-4544
This was an attempt to reuse the code for parallelizing the requests while
still plugging in code for making them. It sort of works, but it was just a
stopgap measure. You still cannot format the request or parse the response
without reimplementing ShardHandler.
https://issues.apache.org/jira/browse/SOLR-4613
Here I was trying to require creating a shard handler only when the request is
distributed, instead of on every request just to find out whether it is
distributed.
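As a rough illustration of that idea (not the actual SOLR-4613 change), the
check can be separated from handler creation. Everything here is hypothetical:
the Handler class, the handlerFor method, and the assumption that a non-empty
"shards" parameter is what marks a request as distributed.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch; these names are illustrative and are not Solr's API.
class LazyShardHandlerSketch {

    /** Stand-in for a per-request shard handler; counts how many were built. */
    static final class Handler {
        static int created = 0;
        Handler() { created++; }
    }

    /** Assumption: a request is distributed when it has a non-empty "shards" parameter. */
    static boolean isDistributed(Map<String, String> params) {
        String shards = params.get("shards");
        return shards != null && !shards.isEmpty();
    }

    /** Invokes the factory only for distributed requests. */
    static Optional<Handler> handlerFor(Map<String, String> params, Supplier<Handler> factory) {
        return isDistributed(params) ? Optional.of(factory.get()) : Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> local = Collections.emptyMap();
        Map<String, String> distributed = Collections.singletonMap("shards", "host1/s1,host2/s2");
        System.out.println(handlerFor(local, Handler::new).isPresent());
        System.out.println(handlerFor(distributed, Handler::new).isPresent());
    }
}
```

With this shape, a non-distributed request never pays for handler construction.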
At this point I thought I would create a JIRA issue to write down a proposal
for how to do this refactoring, instead of continuing with piecemeal,
out-of-context issues.
I view this shard handler business as needing the following:
1. Something to parallelize the requests. Most people (if anyone) should never
have to replace this. It contains the thread pool and executor service and is
global (like the shard handler factory now).
2. Something that knows how to talk to the shards. This includes formatting
the request and parsing the response. This could probably be per core or even
per request handler?
3. Something to do load balancing. This could probably be part of 2, although
I could see it being separate so that load balancing can be plugged in without
having to handle the request/response format, or vice versa. It would contain
the HTTP client for talking to hosts, and so would probably still be global.
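To make the three pieces concrete, here is a minimal sketch of how they might
fit together. This is only one possible shape, not a settled design: the names
ParallelDispatcher, ShardTransport, and ReplicaChooser are made up for
illustration, and the transport is a fake that does no real HTTP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these types exist in Solr.
class PluggableShardSketch {

    /** Piece 3: load balancing, picks one replica URL for a shard. */
    interface ReplicaChooser {
        String choose(List<String> replicaUrls);
    }

    /** Piece 2: knows how to format the request and parse the response. */
    interface ShardTransport<Q, R> {
        R send(String replicaUrl, Q query) throws Exception;
    }

    /** Piece 1: global, owns the thread pool and fans requests out to shards. */
    static final class ParallelDispatcher implements AutoCloseable {
        private final ExecutorService pool;

        ParallelDispatcher(int threads) {
            this.pool = Executors.newFixedThreadPool(threads);
        }

        /** Sends the query to one replica of every shard; results come back in shard order. */
        <Q, R> List<R> dispatch(List<List<String>> shards, Q query,
                                ReplicaChooser chooser, ShardTransport<Q, R> transport) {
            List<Future<R>> futures = new ArrayList<>();
            for (List<String> replicas : shards) {
                Callable<R> task = () -> transport.send(chooser.choose(replicas), query);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            try {
                for (Future<R> f : futures) {
                    results.add(f.get());
                }
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException("shard request failed", e);
            }
            return results;
        }

        @Override
        public void close() {
            pool.shutdown();
        }
    }

    /** Wires the three pieces together with a round-robin chooser and a fake transport. */
    static List<String> demo() {
        AtomicInteger next = new AtomicInteger();
        ReplicaChooser roundRobin =
            replicas -> replicas.get(Math.floorMod(next.getAndIncrement(), replicas.size()));
        ShardTransport<String, String> fake = (url, q) -> url + "?q=" + q;
        List<List<String>> shards = Arrays.asList(
            Arrays.asList("http://host1/shard1"),
            Arrays.asList("http://host2/shard2"));
        try (ParallelDispatcher dispatcher = new ParallelDispatcher(2)) {
            return dispatcher.dispatch(shards, "solr", roundRobin, fake);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the split is that the dispatcher never touches the request format
or replica selection, so either of those could be replaced without
reimplementing the parallelization.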
I would love to get consensus on the design of this before going off and doing
it, and suggestions for how to break this into smaller pieces.