Hello devs, I'd like to have an API design discussion here on the dev list concerning adding asynchronous request semantics to SolrClient. I think the SolrClient APIs require great peer-review consideration and so I'm having this discussion here instead of a JIRA issue.
Background: Sometimes, a SolrJ user (including Solr itself) may wish to issue a request but not block/wait for the response because it can do other useful work before eventually needing the results. For example, maybe the user has multiple requests to send to Solr that can be done in parallel and _then_ process the results. This is in fact what Solr needs internally for during distributed-search (see HttpShardHandler) and during distributed-indexing (see ConcurrentUpdateHttp2SolrClient). Why?: Greater efficiency. This can be accomplished easily in a generic fashion with threads (e.g. CompletableFuture.supplyAsync(Supplier,Executor)) with use of the SolrClient concurrently. However, this can result in lots of threads and context switching. I really like Cao Dat's explanation in the description here: https://issues.apache.org/jira/browse/SOLR-14354 Thanks to HTTP/2, there is a new approach in which the underlying HTTP client can handle the asynchronous nature more efficiently with minimal threads. Complete plumbing of this implies that SolrClient (and/or its derivatives) need a similar async mechanism. Disclaimer: I haven't measured any of this but am piggybacking off of Cao's's insights on SOLR-14354 Where?: An async API _could_ go only on some SolrClient classes that natively support it, avoiding changing SolrClient itself. Maybe this is what should occur first before "graduating" the method to SolrClient where we devise a default approach, although I would prefer just putting it on SolrClient. The default approach might be configured to throw UnsupportedOperationException, or perhaps might simply use an Executor to get it done in an obvious way (assuming we can get ahold of an Executor somewhere?). If you're writing a delegating SolrClient (which I've done in the past), you might want to take-care to delegate this method. Another aspect of "Where" is whether SolrRequest should additionally have an API alternative to "process" which is currently synchronous and calls out to SolrClient.request(this). What?: What should this API look like? This is my primary interest right now and the meat of the discussion. SolrClient's primary signature that actually does the work (synchronously) is this: public abstract NamedList<Object> request(@SuppressWarnings({"rawtypes"})final SolrRequest request, String collection) throws SolrServerException, IOException; I don't like the "collection" there as it's redundant with either the configured SolrClient default or a setting in SolrRequest, but I digress from the important matter at hand. In SOLR-14354, recently committed by Cao Dat destined for Solr 8.7, there is a new (undocumented) method on Http2SolrClient (*not* SolrClient): public Cancellable asyncRequest(@SuppressWarnings({"rawtypes"}) SolrRequest solrRequest, String collection, AsyncListener<NamedList<Object>> asyncListener) { So firstly, it's only on Http2SolrClient, which means if you're using CloudHttp2SolrClient (which does not subclass it, counterintuitively to users) then too bad -- SOLR-14675 filed by my colleague. Secondly, I really dislike this API signature (sorry Cao). It introduces two custom and obscure Solr interfaces, Cancellable and AsyncListener. I claim that we can and should use one abstraction provided by the JDK -- CompletableFuture<NamedList<Object>>. The user can cancel() it if they like, they can use it like a Future which is maybe what they prefer, or they can attach a listener / handler via whenComplete(lambda) that even has access to an exception if there was one. Based on some toying around with CompletableFuture lately, I suspect what would be most straight-forward is to have an asyncRequest method that *returns* a CompletableFuture, and which merely takes the SolrRequest parameter and nothing else. Alternatively the client could supply a CompletableFuture parameter that Solr will call complete() on etc. but that seems a bit less natural to the notion of a method that returns it's results, albeit with a wrapper. Proposal: public CompletableFuture<NamedList<Object>> requestAsync(SolrRequest<?> request); BTW I'm trying to avoid implementation details here. My objective is to devise a user-friendly API, and my hope is that the eventual implementation is reasonable. I hope I haven't bored you all to tears and that you find this public discussion useful. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley
