[ 
https://issues.apache.org/jira/browse/SOLR-17394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871990#comment-17871990
 ] 

ASF subversion and git services commented on SOLR-17394:
--------------------------------------------------------

Commit 734f53304786872e483189e566eef03b5687733c in solr's branch 
refs/heads/branch_9_7 from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=734f5330478 ]

SOLR-17394: Check response status in IndexFetcher (#2621)

Prior to this commit, IndexFetcher would mis-read 404 responses from
other nodes.  It eventually catches this mistake and retries, but not
before it wastefully allocates a massive (> 1GB) byte array.

These allocations were relatively infrequent, as they require out of
date leader information to occur.  But they are big enough to impact
performance in production, and were observed to cause OOM's when running
tests.

This commit avoids this by checking the HTTP status code of all
IndexFetcher requests/responses, even those that utilize
InputStreamResponseParser.

> IndexFetcher should inspect HTTP status codes on its requests
> -------------------------------------------------------------
>
>                 Key: SOLR-17394
>                 URL: https://issues.apache.org/jira/browse/SOLR-17394
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: replication (java)
>    Affects Versions: main (10.0), 9.6.1
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Typically, SolrJ will look at the HTTP status code of a response and it will 
> throw exceptions as appropriate (see 
> [here|https://github.com/apache/solr/blob/988e9e3c2666784638475f4cd4827bc2fe77c707/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpSolrClientBase.java#L187-L217]).
>   But it skips this logic if users have elected to parse the response 
> themselves by use of an "InputStreamResponseParser".
> Solr's IndexFetcher uses this "InputStreamResponseParser" so that it can 
> access the binary index data in the HTTP response.  But it doesn't check the 
> status code of responses as it should.
> IndexFetcher will typically notice that the response is unexpected and can 
> retry and ultimately succeed, but this happens relatively late in the 
> process.  And that delay can be very expensive in many cases.  For instance:
> When IndexFetcher gets a "filecontent" response, it expects the first few 
> bytes to indicate the size of the binary response.  So it [reads these bytes 
> and instantiates a byte-array of the indicated 
> size|https://github.com/apache/solr/blob/988e9e3c2666784638475f4cd4827bc2fe77c707/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L1836C1-L1847].
>   But if IndexFetcher happens to be reading a 404 response, the first few 
> bytes of the response will be the '*<*', '*h*', '*e*', and '*a*' characters 
> from the "*<head*>" tag that Solr uses to begin all its HTML errors.  This 
> leads to IndexFetcher allocating a massive > 1GB byte-array!  This can cause 
> GC churn in production and (for me at least) was causing test runs to 
> frequently OOM on certain machines.
> We should have IndexFetcher (and other places that use 
> InputStreamResponseParser) check the response status code as soon as its 
> available and handle errors accordingly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to