Hi!

Today I saw a weird issue in a production workload with gzip compression 
enabled. After a few minutes, the client app ran out of connections and 
stopped responding.

The cluster setup is pretty simple:
Solr version: 7.7.2
Solr cloud enabled
Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas. 1 
HTTP LB using Round Robin over all nodes
All cluster nodes have gzip enabled for all paths, all HTTP verbs and all MIME 
types.
Solr client: HttpSolrClient targeting the HTTP LB

Problem description: when the Solr node that receives the request has to 
forward the request to a Solr Node that actually can perform the query, the 
response headers are added incorrectly to the client response, causing the 
SolrJ client to fail and to never release the connection back to the pool.

To simplify the case, let's try to start from the following repro scenario:

  *   Start one node with cloud mode and port 8983
  *   Create one single collection (1 shard, 1 replica)
  *   Start another node on port 8984, pointing at the previously started zk (-z 
localhost:9983)
  *   Start a java application and query the cluster using the node on port 
8984 (the one that doesn't host the collection)

The sequence of events is then:

  *   The application queries node:8984 with compression enabled 
("Accept-Encoding: gzip") and wt=javabin
  *   Node:8984 can't perform the query and creates an HTTP request behind the 
scenes to node:8983
  *   Node:8983 returns a gzipped response with "Content-Encoding: gzip" and 
"Content-Type: application/octet-stream"
  *   Node:8984 adds "gzip" to its own response as the character encoding (it 
should be forwarded as a "Content-Encoding" header, not as the character 
encoding)
  *   HttpSolrClient receives a "Content-Type: 
application/octet-stream;charset=gzip", causing an exception.
  *   HttpSolrClient tries to quietly close the connection, but since the 
stream is broken, the Utils.consumeFully fails to actually consume the entity 
(it throws another exception in GzipDecompressingEntity#getContent() with "not 
in GZIP format")
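
The "not in GZIP format" failure in the last step can be reproduced with the 
JDK alone. The sketch below is not SolrJ or HttpClient code; the class and 
method names are made up for illustration. It only shows that 
java.util.zip.GZIPInputStream rejects a body that does not start with the gzip 
magic bytes (the JDK raises a ZipException with the message "Not in GZIP 
format"), and that a cheap check of the first two bytes can detect the mismatch 
up front:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of Solr or HttpClient.
public class GzipMismatchDemo {
    /** True if the body starts with the gzip magic bytes 0x1f 0x8b. */
    static boolean looksLikeGzip(byte[] body) {
        return body.length >= 2
                && (body[0] & 0xff) == 0x1f
                && (body[1] & 0xff) == 0x8b;
    }

    /** Compresses raw bytes with gzip (helper for the demo only). */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** True if the whole body can be read through a GZIPInputStream. */
    static boolean canGunzip(byte[] body) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // discard: we only care whether decoding succeeds
            }
            return true;
        } catch (IOException e) {
            // includes the ZipException raised for a non-gzip body
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] plain = "not compressed".getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(plain);
        System.out.println("plain:  magic=" + looksLikeGzip(plain)
                + " decodes=" + canGunzip(plain));
        System.out.println("packed: magic=" + looksLikeGzip(packed)
                + " decodes=" + canGunzip(packed));
    }
}
```

The relevant point for the leak is that any cleanup path that reads through the 
decompressing wrapper will hit the same error again, so it cannot be relied on 
to drain the entity.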

The exception thrown by HttpSolrClient is:
java.nio.charset.UnsupportedCharsetException: gzip
    at java.nio.charset.Charset.forName(Charset.java:531)
    at org.apache.http.entity.ContentType.create(ContentType.java:271)
    at org.apache.http.entity.ContentType.create(ContentType.java:261)
    at org.apache.http.entity.ContentType.parse(ContentType.java:319)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:591)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
    at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1015)
    at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1031)
    at org.apache.solr.client.solrj.SolrClient$$FastClassBySpringCGLIB$$7fcf36a0.invoke(<generated>)
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
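
The top frame of that trace is easy to reproduce without Solr. The sketch below 
is a simplified stand-in for what a Content-Type parser does, not the actual 
HttpClient code, and its class and method names are made up: it pulls the 
charset parameter out of the header value and hands it to Charset.forName, 
which does not know any charset called "gzip":

```java
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical demo class; the parsing is deliberately simplified.
public class CharsetGzipDemo {
    /** Extracts the charset parameter from a Content-Type value, or null if absent. */
    static String charsetParam(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // case-insensitive match on the "charset=" prefix
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                return p.substring(8).trim();
            }
        }
        return null;
    }

    /** True if the JVM can resolve the given charset name. */
    static boolean isResolvable(String name) {
        try {
            Charset.forName(name);
            return true;
        } catch (UnsupportedCharsetException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String header = "application/octet-stream;charset=gzip";
        String cs = charsetParam(header);
        System.out.println(cs + " resolvable: " + isResolvable(cs));
    }
}
```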

Here I can see three different problems:

  *   HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to 
set the Content-Encoding header. This is obviously a typo.
  *   HttpSolrClient, specifically HttpClientUtil, should be modified so that 
when the Content-Encoding header lies about the actual content, an exception 
is still thrown but the connection is not leaked forever.
  *   HttpSolrClient should allow clients to customize HttpClient's 
connectionRequestTimeout. Without it, the application stops responding to any 
other incoming request, because every request can block forever waiting for a 
pooled connection that will never be released.
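
To illustrate why the third point matters, here is a JDK-only analogy. It is 
not HttpClient's pool; TinyPool and its methods are hypothetical names, and a 
Semaphore stands in for the pooled connections. Once every slot has been leased 
and never released, a lease with a timeout fails fast, whereas a lease without 
one would wait indefinitely:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; a Semaphore stands in for a connection pool.
public class LeakedPoolDemo {
    static final class TinyPool {
        private final Semaphore slots;

        TinyPool(int size) {
            slots = new Semaphore(size);
        }

        /** Leases a slot, waiting at most timeoutMillis; returns false on timeout. */
        boolean lease(long timeoutMillis) {
            try {
                return slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        /** Returns a slot to the pool. */
        void release() {
            slots.release();
        }
    }

    public static void main(String[] args) {
        TinyPool pool = new TinyPool(2);
        pool.lease(100); // leased and never released (the "leak")
        pool.lease(100); // leased and never released (the "leak")
        // With a timeout the caller gets an answer instead of hanging.
        System.out.println("third lease succeeded: " + pool.lease(100));
    }
}
```

A timeout does not fix the leak itself, but it turns a silent hang into an 
error the application can log and recover from.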

I think the first two points are bugs and the third one is a feature 
improvement. Unless I missed something, I'll file the two bugs and provide a 
patch for them. The same goes for the feature improvement.


