Erick, I would try to limit the number of threads to alleviate the problem. Honestly, going much above the number of CPU cores rarely makes things better. Just let excess update commands wait in a queue. We impose hard limits by intercepting the update chain, and we configure the container thread pool to have a smaller number of long-lived threads (longer TTL to reduce churn). Is there a simple way to do this on Jetty?
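A minimal sketch of the kind of hard limit described above, using only the JDK. The class and method names (`UpdateThrottle`, `perCore`, `run`) are hypothetical and are not part of Solr or Jetty; how this would be wired into an update chain is left open. The idea is simply that at most N update commands run at once and the rest block until a permit frees up:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

/**
 * Hypothetical gate (not a Solr or Jetty class): at most maxConcurrent
 * update commands run at the same time; additional callers wait.
 */
class UpdateThrottle {
    private final Semaphore permits;

    UpdateThrottle(int maxConcurrent) {
        // fair = true: waiting updates are admitted in arrival order
        this.permits = new Semaphore(maxConcurrent, true);
    }

    /** Assumed default sizing: one permit per CPU core. */
    static UpdateThrottle perCore() {
        return new UpdateThrottle(Runtime.getRuntime().availableProcessors());
    }

    /** Runs the update once a permit is free; blocks while the limit is reached. */
    <T> T run(Callable<T> update) throws Exception {
        permits.acquire();
        try {
            return update.call();
        } finally {
            permits.release(); // always hand the permit back, even on failure
        }
    }

    /** Number of update slots currently free. */
    int available() {
        return permits.availablePermits();
    }
}
```

A caller would wrap each update in `throttle.run(...)`. Blocking callers instead of rejecting them keeps the excess commands queued, at the cost of tying up the container threads that are waiting, which is one more reason to keep the container pool itself small.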

On May 9, 2013, at 10:28 PM, Erick Erickson <[email protected]> wrote:

> Adrien:
>
> Yeah, this is getting warmer. Some of the
> CompressingStoredFieldsReader objects are 240M. The documents aren't
> nearly that large I don't think (but I'll verify).
>
> But still, over 700 of these objects live at once? I _think_ I'm
> seeing the number go up significantly when the number of indexing
> threads increases, but that's a bit of indirect evidence. My other
> question would be whether you'd expect the number of these objects to
> go up as the number of segments goes up, i.e. I assume they're
> per-segment....
>
> So the pattern here is atomic updates on documents where some of the
> fields get quite large. So the underlying reader has to decompress
> these a lot. Do you have any suggestions how to mitigate this? Other
> than "don't do that<G>"....
>
> Thanks,
> Erick
>
> On Thu, May 9, 2013 at 6:45 AM, Adrien Grand <[email protected]> wrote:
>> Hi Erick,
>>
>> The stored fields reader reuses the buffer used for decompression across
>> calls in the same thread, so I'm thinking that this kind of behaviour could
>> happen if some documents are very large. Is it the case?
>>
>> Adrien
>>
>> On May 8, 2013, at 17:22, "Erick Erickson" <[email protected]> wrote:
>>>
>>> I'm seeing a case where (reported to me, not on my personal machine)
>>> where Solr's heap is being exhausted apparently by compressed field
>>> reader. Here's a sample line from a summary of a heap dump:
>>>
>>> CompressingStoredFieldReader 735 instances
>>>
>>> where the CompressingStoredFieldReader instances are taking up 7.5G of
>>> heap space (out of 8 allocated). The GC cycles are a pattern I've seen
>>> before
>>> gc for a long time, recover a few meg
>>> run for a _very_ short time
>>> gc again, recover a few meg
>>>
>>> This app happens to be doing a lot of atomic updates, running with CMS
>>> collector, java 1.7. There is very little querying going on, and it is
>>> happening on several different machines. Solr 4.2.
>>>
>>> Now, there is a bit of custom update code involved, we're testing whether
>>> 1> if not compressing the stored fields changes things
>>> 2> if disabling the custom code changes things
>>> 3> if Solr 4.3 changes things.
>>> 4> whether memory usage grows over time linearly or spikes (pretty
>>> sure the former).
>>>
>>> This is an index-heavy/query-light application, and we're hosing a
>>> _bunch_ of documents at the indexing process. But still, assuming that
>>> the memory reporting above is accurate, each
>>> CompressingStoredFieldReader is apparently taking up 10M each and
>>> there are over 700 of them. The documents are being batched (don't
>>> quite know how many per batch, will find out).
>>>
>>> Mainly, I'm asking if this rings any bells. I don't see anything on a
>>> quick look at the JIRAs.
>>>
>>> Thanks,
>>> Erick
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
