[ 
https://issues.apache.org/jira/browse/SOLR-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032723#comment-17032723
 ] 

Kevin Risden commented on SOLR-14249:
-------------------------------------

So I haven't personally looked at Krb5HttpClientBuilder recently, other than 
for the completely unrelated SOLR-13726. Part of the reason a lot of clients 
buffer is how Kerberos SPNEGO authentication works.

There are typically 2 parts:
* a request without authentication, to which the server responds with a 401 
and a "WWW-Authenticate: Negotiate" header
* a second request carrying the authentication token in response to the 
negotiate challenge, which the server can verify

If you don't put any optimizations in place, every request becomes two. A lot 
of times a cookie is used here to limit the number of HTTP requests.
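
To make the request counting concrete, here is a minimal in-memory sketch of 
the exchange. It involves no real Kerberos or HTTP: the class name, the 
"session=ok" cookie and the "<token>" placeholder are all made up for 
illustration, and a real server would validate an actual SPNEGO token.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a SPNEGO-protected server (hypothetical, for illustration only). */
final class SpnegoHandshakeSketch {
    private int requestsSeen = 0;

    /** Returns an HTTP-like status code and fills in the response headers. */
    int handle(Map<String, String> requestHeaders, Map<String, String> responseHeaders) {
        requestsSeen++;
        // A previously issued cookie lets the client skip the negotiate round trip.
        if ("session=ok".equals(requestHeaders.get("Cookie"))) {
            return 200;
        }
        String auth = requestHeaders.get("Authorization");
        if (auth != null && auth.startsWith("Negotiate ")) {
            // Assumption: any token is accepted here; a real server verifies it.
            responseHeaders.put("Set-Cookie", "session=ok");
            return 200;
        }
        // No credentials yet: challenge the client.
        responseHeaders.put("WWW-Authenticate", "Negotiate");
        return 401;
    }

    /** Sends 'count' logical requests and returns how many round trips the server saw. */
    static int roundTrips(int count, boolean keepCookie) {
        SpnegoHandshakeSketch server = new SpnegoHandshakeSketch();
        String cookie = null;
        for (int i = 0; i < count; i++) {
            Map<String, String> request = new HashMap<>();
            if (cookie != null) {
                request.put("Cookie", cookie);
            }
            Map<String, String> response = new HashMap<>();
            int status = server.handle(request, response);
            if (status == 401 && "Negotiate".equals(response.get("WWW-Authenticate"))) {
                // Second attempt: the same request again, now with a token attached.
                request.put("Authorization", "Negotiate <token>");
                response = new HashMap<>();
                status = server.handle(request, response);
            }
            if (status != 200) {
                throw new IllegalStateException("unexpected status " + status);
            }
            if (keepCookie && response.containsKey("Set-Cookie")) {
                cookie = response.get("Set-Cookie");
            }
        }
        return server.requestsSeen;
    }
}
```

With this sketch, roundTrips(5, false) counts 10 round trips and 
roundTrips(5, true) counts 6: only the first request pays for the 401.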

The 401 and second request become an issue when the request is a 
non-repeatable one, like a POST with a streamed body: once the body has gone 
out with the first attempt, it can't be sent again.

So a lot of times the super simple workaround is to buffer the request, do the 
401 check dance, and then proceed. This is a way to make a non-repeatable 
request semi-repeatable.

This buffering has issues though, as you found: the buffer should be limited 
in size, which in turn limits the usefulness of this technique.
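
As a rough sketch of what a size-limited buffer could look like, using only 
the JDK: BoundedReplayBuffer and its maxBytes parameter are hypothetical 
names, not anything from SolrJ or Apache HttpClient. It shows both halves of 
the trade-off: a small body becomes replayable, a large one is rejected 
instead of being held in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper: makes a one-shot stream replayable, up to a fixed size. */
final class BoundedReplayBuffer {
    private final byte[] body;

    private BoundedReplayBuffer(byte[] body) {
        this.body = body;
    }

    /** Reads the stream fully into memory, refusing bodies larger than maxBytes. */
    static BoundedReplayBuffer buffer(InputStream in, int maxBytes) {
        if (maxBytes < 0 || maxBytes == Integer.MAX_VALUE) {
            throw new IllegalArgumentException("maxBytes out of range: " + maxBytes);
        }
        try {
            // Read one byte past the limit so an oversized body can be detected.
            byte[] data = in.readNBytes(maxBytes + 1);
            if (data.length > maxBytes) {
                throw new IllegalStateException("body exceeds " + maxBytes + " byte buffer limit");
            }
            return new BoundedReplayBuffer(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Each call returns a fresh stream over the same bytes, so the body can be re-sent. */
    InputStream replay() {
        return new ByteArrayInputStream(body);
    }

    int size() {
        return body.length;
    }
}
```

Anything that doesn't fit under the limit still needs one of the other 
approaches, which is why the limit caps how useful buffering can be.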

There are a few alternatives to buffering:
* Authenticate upfront with, say, an OPTIONS request, which gets the cookie. 
The next request, say a POST, then carries the cookie and skips the 401 dance.
* Use the "Expect: 100-continue" header, which sends the request headers first 
and asks the server whether it will accept the request; the body is sent only 
if the server agrees. Where possible, this holds the data back from being sent 
in the first place.
** Curl automatically activates "Expect: 100-continue" under a few conditions - 
https://gms.tf/when-curl-sends-100-continue.html
** Apache HttpClient does NOT send "Expect: 100-continue" by default; it has 
to be enabled explicitly
** not sure if Jetty HttpClient does anything with "Expect: 100-continue"
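
For illustration only, here is roughly what the two request shapes look like 
with the JDK's own java.net.http client (Java 11+), swapped in because it 
needs no extra dependencies. This is not what SolrJ does, it performs no 
SPNEGO negotiation by itself, and the class and method names are made up.

```java
import java.io.InputStream;
import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.util.function.Supplier;

/** Hypothetical request builders for the two alternatives to buffering. */
final class AvoidBufferingSketch {

    /** A client that remembers cookies, so a cookie from the warm-up is reused later. */
    static HttpClient clientWithCookies() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .build();
    }

    /** Alternative 1: a body-less OPTIONS request that can safely be repeated after a 401. */
    static HttpRequest warmUp(URI uri) {
        return HttpRequest.newBuilder(uri)
                .method("OPTIONS", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** Alternative 2: a streamed POST that asks for "100 Continue" before sending the body. */
    static HttpRequest streamingPost(URI uri, Supplier<InputStream> body) {
        return HttpRequest.newBuilder(uri)
                .expectContinue(true)
                .POST(HttpRequest.BodyPublishers.ofInputStream(body))
                .build();
    }
}
```

Neither request is sent here; the sketch only shows that the body can stay a 
stream in both cases instead of being copied into a byte[] first.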

So long story short - yes buffering is a problem.

> Krb5HttpClientBuilder should not buffer requests 
> -------------------------------------------------
>
>                 Key: SOLR-14249
>                 URL: https://issues.apache.org/jira/browse/SOLR-14249
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Authentication, SolrJ
>    Affects Versions: 7.4, master (9.0), 8.4.1
>            Reporter: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-14249-reproduction.patch
>
>
> When SolrJ clients enable Kerberos authentication, a request interceptor is 
> set up which wraps the actual HttpEntity in a BufferedHttpEntity.  This 
> BufferedHttpEntity, well, buffers the request body in a {{byte[]}} so it can 
> be repeated if needed.  This works fine for small requests, but when requests 
> get large, storing the entire request in memory causes contention or 
> OutOfMemoryErrors.
> The easiest way for this to manifest is to use ConcurrentUpdateSolrClient, 
> which opens a connection to Solr and streams documents out in an 
> ever-increasing request entity until the doc queue held by the client is 
> emptied.
> I ran into this while troubleshooting a DIH run that would reproducibly load 
> a few hundred thousand documents before progress stalled out.  Solr never 
> crashed and the DIH thread was still alive, but the 
> ConcurrentUpdateSolrClient used by DIH had its "Runner" thread disappear 
> around the time of the stall and an OOM like the one below could be seen in 
> solr-8983-console.log:
> {code}
> WARNING: Uncaught exception in thread: Thread[concurrentUpdateScheduler-28-thread-1,5,TGRP-TestKerberosClientBuffering]
> java.lang.OutOfMemoryError: Java heap space
>   at __randomizedtesting.SeedInfo.seed([371A00FBA76D31DF]:0)
>   at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
>   at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)
>   at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
>   at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
>   at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:213)
>   at org.apache.solr.common.util.FastOutputStream.write(FastOutputStream.java:94)
>   at org.apache.solr.common.util.ByteUtils.writeUTF16toUTF8(ByteUtils.java:145)
>   at org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:848)
>   at org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:932)
>   at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:328)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:616)
>   at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:355)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at org.apache.solr.common.util.JavaBinCodec.writeMapEntry(JavaBinCodec.java:764)
>   at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:383)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:705)
>   at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:367)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:223)
>   at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:330)
>   at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
>   at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:155)
>   at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.marshal(JavaBinUpdateRequestCodec.java:91)
>   at org.apache.solr.client.solrj.impl.BinaryRequestWriter.write(BinaryRequestWriter.java:83)
>   at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner$1.writeTo(ConcurrentUpdateSolrClient.java:264)
>   at org.apache.http.entity.EntityTemplate.writeTo(EntityTemplate.java:73)
>   at org.apache.http.entity.BufferedHttpEntity.<init>(BufferedHttpEntity.java:62)
>   at org.apache.solr.client.solrj.impl.Krb5HttpClientBuilder.lambda$new$3(Krb5HttpClientBuilder.java:155)
>   at org.apache.solr.client.solrj.impl.Krb5HttpClientBuilder$$Lambda$459/0x0000000800623840.process(Unknown Source)
>   at org.apache.solr.client.solrj.impl.HttpClientUtil$DynamicInterceptor$1.accept(HttpClientUtil.java:177)
> {code}
> We took heap dumps and were able to confirm that the entire 8GB heap was 
> taken up with a single massive CUSC request body that was being buffered!
> (As an aside, I had no idea that OutOfMemoryErrors could happen without 
> killing the entire JVM.  But apparently they can.  CUSC.Runner propagates the 
> OOM as it should and the OOM kills the Runner thread.  Since that thread is 
> the gc-root for the massive BufferedHttpEntity though, a garbage collection 
> frees up most of the heap space and the JVM survives its memory trouble.  
> Solr's oom script never triggers.)
> I've attached a JUnit test which reproduces the OOM issue by using a "fake" 
> Kerberos config.
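
The aside in the quoted description (an OOM killing only the Runner thread) 
can be illustrated with a small JDK-only sketch. The error here is simulated 
by throwing it directly rather than by exhausting the heap, and the class and 
thread names are made up; it only shows that an uncaught Error ends the 
thread it escapes from, not the JVM.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo: an uncaught Error terminates only the thread it escapes from. */
final class WorkerOomSketch {

    /** Runs a worker that dies with a simulated OOM and returns what killed it. */
    static Throwable runDoomedWorker() {
        AtomicReference<Throwable> cause = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            // Simulated: a real OOM would come from an allocation such as a growing byte[].
            throw new OutOfMemoryError("simulated: Java heap space");
        }, "runner-sketch");
        // Record the error instead of letting the default handler print it.
        worker.setUncaughtExceptionHandler((thread, error) -> cause.set(error));
        worker.start();
        try {
            // The calling thread keeps running after the worker has died.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cause.get();
    }
}
```

Once the worker is gone, whatever only it referenced becomes collectable, 
which matches the heap recovering after the Runner thread died.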



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

