Along similar lines I have been thinking of how to splice a Lucene indexing app I wrote into SOLR. It occurred to me that it would almost be simpler to use the plugin-friendly QueryRequest mechanism rather than the UpdateRequest mechanism; coupled with what you wrote below, Hoss, it makes me think that a little refactoring of request handling might go a long way:
SolrRequestHandler now defines

    public void handleRequest(SolrQueryRequest req, SolrQueryResponse rsp)

The interface SolrQueryRequest and its abstract implementation SolrQueryRequestBase are mainly concerned with parsing request parameters; the only method signatures that are query-specific are getSearcher() and the @deprecated getQueryString() and getQueryType(). SolrQueryResponse is mainly concerned with building a generic response message, including execution time, though it also supports a default set of returned field names.

So SolrRequestHandler.handleRequest could be changed to

    public void handleRequest(SolrRequest req, SolrResponse rsp)

with the SolrRequest and SolrResponse interfaces having the generic functionality described above. Then SolrQueryRequest and SolrQueryResponse could be crafted as sub-interfaces and/or abstract implementations segregating the little query-specific functionality there is. One would also create SolrUpdateRequest and SolrUpdateResponse interfaces and/or base implementations in much the same way.

Then in SolrCore, the RequestHandler registry and execute() method would handle both query and update requests without modification; the code in SolrCore.update and SolrCore.readDoc should be moved into an implementation of SolrRequestHandler, e.g. DefaultUpdateRequestHandler, which would be registered under the request name "update" and could then be subclassed by users. It could then use SolrResponse to formulate the response, and would get the request timing information put in by SolrCore.execute() for free, as well as the pluggable response format mechanism.

Note that the UpdateRequestHandler, which formulates update requests, would be separate from the UpdateHandler, which controls the update logic (index acrobatics).

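To make the shape of this concrete, here is a rough, self-contained sketch. Every type in it is a hypothetical stand-in written for illustration, not the real Solr class: the nested Core class plays the part of SolrCore, strings stand in for raw update documents, and SimpleResponse and SimpleUpdateRequest are made-up helpers. It only tries to show one registry and one execute() serving an update handler through the generic interfaces.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Every type here is a hypothetical stand-in, not the real Solr class.
final class RequestHandlerSketch {

    /** Generic request: parameter access only, nothing query-specific. */
    interface SolrRequest {
        String getParam(String name);
    }

    /** Generic response: named values plus timing filled in by the core. */
    interface SolrResponse {
        void add(String name, Object value);
        Object get(String name);
        void setElapsedMillis(long millis);
        long getElapsedMillis();
    }

    /** Update-specific sub-interface; a query one would add getSearcher(). */
    interface SolrUpdateRequest extends SolrRequest {
        Iterable<String> getRawUpdates();
    }

    interface SolrRequestHandler {
        void handleRequest(SolrRequest req, SolrResponse rsp);
    }

    static class SimpleResponse implements SolrResponse {
        private final Map<String, Object> values = new HashMap<>();
        private long elapsedMillis;
        public void add(String name, Object value) { values.put(name, value); }
        public Object get(String name) { return values.get(name); }
        public void setElapsedMillis(long millis) { elapsedMillis = millis; }
        public long getElapsedMillis() { return elapsedMillis; }
    }

    static class SimpleUpdateRequest implements SolrUpdateRequest {
        private final Map<String, String> params;
        private final List<String> updates;
        SimpleUpdateRequest(Map<String, String> params, List<String> updates) {
            this.params = params;
            this.updates = updates;
        }
        public String getParam(String name) { return params.get(name); }
        public Iterable<String> getRawUpdates() { return updates; }
    }

    /** Stand-in for the logic now in SolrCore.update; users could subclass it. */
    static class DefaultUpdateRequestHandler implements SolrRequestHandler {
        public void handleRequest(SolrRequest req, SolrResponse rsp) {
            int count = 0;
            if (req instanceof SolrUpdateRequest) {
                for (String update : ((SolrUpdateRequest) req).getRawUpdates()) {
                    count++; // a real handler would parse and index the update here
                }
            }
            rsp.add("updatesReceived", count);
        }
    }

    /** Stand-in for SolrCore: one registry and one execute() for every request type. */
    static class Core {
        private final Map<String, SolrRequestHandler> handlers = new HashMap<>();

        void register(String name, SolrRequestHandler handler) {
            handlers.put(name, handler);
        }

        void execute(String name, SolrRequest req, SolrResponse rsp) {
            SolrRequestHandler handler = handlers.get(name);
            if (handler == null) {
                throw new IllegalArgumentException("Unknown request type: " + name);
            }
            long start = System.nanoTime();
            handler.handleRequest(req, rsp);
            // Timing is recorded here, so every handler gets it for free.
            rsp.setElapsedMillis((System.nanoTime() - start) / 1_000_000);
        }
    }

    public static void main(String[] args) {
        Core core = new Core();
        core.register("update", new DefaultUpdateRequestHandler());
        SimpleResponse rsp = new SimpleResponse();
        core.execute("update",
                new SimpleUpdateRequest(Map.of(), List.of("<add/>", "<commit/>")), rsp);
        System.out.println("updatesReceived=" + rsp.get("updatesReceived"));
    }
}
```

A query handler would be registered in the same map under "standard" and would downcast to a query sub-interface in the same way; whether a downcast or a generic type parameter is the better choice is left open here.
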
Finally, the SolrUpdateServlet could be cast as a trivial subclass of SolrServlet; perhaps all it needs to do is set the default value for the request type to "update" rather than "standard", for backward compatibility, and perhaps let a parameter other than 'qt' be used to specify the request type for updates.

I am pretty sure something along these lines would accomplish all the benefits you suggest below and more, with a minimal amount of coding and fairly good backward compatibility. It of course still leaves the hard work of writing the actual update handler plugins. But it's a lot simpler to subclass an UpdateRequestHandler than SolrCore!

What do you folks think?

- J.J.

PS: If I weren't up to my ears in other deadline-driven deliverables, I'd just jump in and try it.

At 4:21 PM -0800 1/7/07, Chris Hostetter wrote:
>It seems like [Handling disparate data sources in Solr] could be addressed by
>modifying the SolrUpdateServlet to support two low level query params similar
>to the way the SolrServlet looks at "qt" and "wt". The first param would be
>used to pick an UpdateSource plugin that would have an API like...
>  public interface UpdateSource {
>    SolrUpdateRequest makeRequest(HttpServletRequest req);
>  }
>
>with the SolrUpdateRequest interface looking something like...
>  public interface SolrUpdateRequest {
>    SolrParams getParams();
>    Iterable<java.io.Reader> getRawUpdates();
>  }
>
>different out of the box versions of UpdateSource would support building
>SolrUpdateRequest objects from HttpServletRequests using...
>  1) URL query args and the raw POST body
>  2) query args from multipart form input and Readers from file uploads
>  3) query args and local filenames specified in query args
>  4) query args and remote URLs specified in query args
>
>The SolrUpdateServlet would then use SolrUpdateRequest.getParams() to
>look up its second core param for picking an UpdateParser plugin, which
>would be responsible for parsing all of those Readers in sequence,
>converting them to UpdateCommands, and calling the appropriate methods on
>the UpdateHandler.
>
>Out of the box versions of UpdateParser could do the XML parsing currently
>done, or JSON parsing, or CSV parsing. Custom plugins written by users
>could do more exotic schema specific parsing: ie, reading raw PDFs and
>extracting specific field values.
>
>
>what do you guys think?
>
>
>-Hoss
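
For reference, the two-stage pipeline in the quoted message (an UpdateSource builds the request, then an UpdateParser consumes its Readers) could be sketched roughly as below. This is only an illustration under stated assumptions: RawRequest is a made-up stand-in for HttpServletRequest, a plain Map stands in for SolrParams, the parser returns lists of field values rather than UpdateCommands, and the parameter names "update.source" and "update.parser" are invented here since the message does not name them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// All names below are illustrative assumptions, not existing Solr APIs.
final class UpdatePipelineSketch {

    /** Hypothetical stand-in for HttpServletRequest: query args plus POST body. */
    static final class RawRequest {
        final Map<String, String> queryArgs;
        final String body;
        RawRequest(Map<String, String> queryArgs, String body) {
            this.queryArgs = queryArgs;
            this.body = body;
        }
    }

    interface SolrUpdateRequest {
        Map<String, String> getParams();   // stands in for SolrParams
        Iterable<Reader> getRawUpdates();
    }

    interface UpdateSource {
        SolrUpdateRequest makeRequest(RawRequest req);
    }

    /** Simplified: yields one list of field values per document, not UpdateCommands. */
    interface UpdateParser {
        List<List<String>> parse(Iterable<Reader> updates);
    }

    /** Option 1 in the quoted list: URL query args and the raw POST body. */
    static final class PostBodySource implements UpdateSource {
        public SolrUpdateRequest makeRequest(RawRequest req) {
            List<Reader> readers = List.of(new StringReader(req.body));
            return new SolrUpdateRequest() {
                public Map<String, String> getParams() { return req.queryArgs; }
                public Iterable<Reader> getRawUpdates() { return readers; }
            };
        }
    }

    /** Naive CSV parser: one document per line, split on commas, no quoting. */
    static final class CsvParser implements UpdateParser {
        public List<List<String>> parse(Iterable<Reader> updates) {
            List<List<String>> docs = new ArrayList<>();
            for (Reader reader : updates) {
                try (BufferedReader in = new BufferedReader(reader)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        if (!line.isEmpty()) {
                            docs.add(List.of(line.split(",")));
                        }
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            return docs;
        }
    }

    /** What the servlet would do: pick a source, build the request, pick a parser. */
    static List<List<String>> handle(RawRequest raw,
                                     Map<String, UpdateSource> sources,
                                     Map<String, UpdateParser> parsers) {
        UpdateSource source = sources.get(raw.queryArgs.get("update.source"));
        if (source == null) {
            throw new IllegalArgumentException("No UpdateSource selected");
        }
        SolrUpdateRequest request = source.makeRequest(raw);
        UpdateParser parser = parsers.get(request.getParams().get("update.parser"));
        if (parser == null) {
            throw new IllegalArgumentException("No UpdateParser selected");
        }
        return parser.parse(request.getRawUpdates());
    }

    public static void main(String[] args) {
        RawRequest raw = new RawRequest(
                Map.of("update.source", "post", "update.parser", "csv"),
                "1,foo\n2,bar\n");
        List<List<String>> docs = handle(raw,
                Map.of("post", new PostBodySource()),
                Map.of("csv", new CsvParser()));
        System.out.println(docs);
    }
}
```

In a real version the parser would hand each parsed document to the UpdateHandler instead of returning a list.
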