Re: Solr Heap, MMaps and Garbage Collection
If you just want to see which classes are occupying the most memory in a live JVM, you can do:

jmap -permstat

I don't think you can dump the contents of PERM space.

Hope this helps,
Tri

On Mar 03, 2014, at 11:41 AM, KNitin wrote:
> Is there a way to dump the contents of permgen and look at which classes are occupying the most memory in that?
>
> - Nitin
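For anyone trying this, a minimal command-line sketch (the <pid> placeholder is ours, and these are JDK 7-era options; -permstat went away with PermGen in JDK 8):

    # per-classloader PermGen statistics (JDK 7 and earlier)
    jmap -permstat <pid>

    # live-object histogram per class; heavy interning shows up as large
    # java.lang.String / char[] counts near the top
    jmap -histo:live <pid> | head -20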
Re: Solr Heap, MMaps and Garbage Collection
Is there a way to dump the contents of permgen and look at which classes are occupying the most memory in that?

- Nitin

On Mon, Mar 3, 2014 at 11:19 AM, KNitin wrote:
> Regarding PermGen: Yes, we have a bunch of custom jars loaded in SolrCloud (containing custom parsing and analyzers), but I haven't specifically enabled any string interning. Does Solr intern all strings in a collection by default?
>
> I agree about the doc and filter query caches. Query result cache hits are practically 0 for the large collection, since our queries are long-tail by nature.
>
> Thanks
> Nitin
>
> On Mon, Mar 3, 2014 at 5:01 AM, Michael Sokolov <msoko...@safaribooksonline.com> wrote:
>> On 3/3/2014 1:54 AM, KNitin wrote:
>>> 3. 2.8 GB - Perm Gen (I am guessing this is because of interned strings)
>>
>> As others have pointed out, this is really unusual for Solr. We often see high permgen in our app servers due to dynamic class loading that the framework performs; maybe you are somehow loading lots of new Solr plugins, or otherwise creating lots of classes? Of course, if you have a plugin or something that does a lot of string interning, that could also be an explanation.
>>
>> -Mike
Re: Solr Heap, MMaps and Garbage Collection
Regarding PermGen: Yes, we have a bunch of custom jars loaded in SolrCloud (containing custom parsing and analyzers), but I haven't specifically enabled any string interning. Does Solr intern all strings in a collection by default?

I agree about the doc and filter query caches. Query result cache hits are practically 0 for the large collection, since our queries are long-tail by nature.

Thanks
Nitin

On Mon, Mar 3, 2014 at 5:01 AM, Michael Sokolov <msoko...@safaribooksonline.com> wrote:
> On 3/3/2014 1:54 AM, KNitin wrote:
>> 3. 2.8 GB - Perm Gen (I am guessing this is because of interned strings)
>
> As others have pointed out, this is really unusual for Solr. We often see high permgen in our app servers due to dynamic class loading that the framework performs; maybe you are somehow loading lots of new Solr plugins, or otherwise creating lots of classes? Of course, if you have a plugin or something that does a lot of string interning, that could also be an explanation.
>
> -Mike
Re: Solr Heap, MMaps and Garbage Collection
On 3/3/2014 1:54 AM, KNitin wrote:
> 3. 2.8 GB - Perm Gen (I am guessing this is because of interned strings)

As others have pointed out, this is really unusual for Solr. We often see high permgen in our app servers due to dynamic class loading that the framework performs; maybe you are somehow loading lots of new Solr plugins, or otherwise creating lots of classes? Of course, if you have a plugin or something that does a lot of string interning, that could also be an explanation.

-Mike
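If the interning theory needs testing, one hedged option is HotSpot's string-table report (the flag prints its statistics when the JVM exits; start.jar is the stock Solr 4.x Jetty launcher and is an assumption about the setup):

    # print interned-string table statistics at JVM shutdown (HotSpot)
    java -XX:+PrintStringTableStatistics -jar start.jar

An entry count that keeps growing across runs of the same workload would point at interning rather than class loading.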
Re: Solr Heap, MMaps and Garbage Collection
New gen should be big enough to handle all allocations that have a lifetime of a single request, considering that you'll have multiple concurrent requests. If new gen routinely overflows, short-lived objects get promoted into the old gen. Yes, you need to go to CMS.

I have usually found the hit rates on query result and doc caches to be fairly similar, with doc cache somewhat higher. Cache hit rates depend on the number of queries between updates. If you update once per day and get a million queries or so, your hit rates can get pretty good. 70-80% seems typical for doc cache on an infrequently updated index. We stay around 75% on our busiest 4M doc index.

The query result cache is the most important, because it saves the most work. Ours stays around 20%, but I should spend some time improving that.

The perm gen size is very big. I think we run with 128 MB.

wunder

On Mar 2, 2014, at 10:54 PM, KNitin wrote:
> Thanks, Walter
>
> Hit rate on the document caches is close to 70-80%, and the filter caches are a 100% hit (since most of our queries filter on the same fields but have a different q parameter). Query result cache is not of great importance to me, since the hit rate there is almost negligible.
>
> Does it mean I need to increase the size of my filter and document caches for large indices?
>
> My 25 GB heap usage splits up as follows:
>
> 1. 19 GB - Old Gen (100% pool utilization)
> 2. 3 GB - New Gen (50% pool utilization)
> 3. 2.8 GB - Perm Gen (I am guessing this is because of interned strings)
> 4. Survivor space is on the order of 300-400 MB and is almost always 100% full. (Is this a major issue?)
>
> We are also currently using the Parallel GC collector but planning to move to CMS for shorter stop-the-world GC times. If I increase the filter cache and document cache entry sizes, they would also go to the old gen, right?
>
> A very naive question: How is increasing young gen going to help if we know that Solr is already pushing major caches and other objects to old gen because of their nature? My young gen pool utilization is still well under 50%.
>
> Thanks
> Nitin

--
Walter Underwood
wun...@wunderwood.org
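For reference, a rough sketch of the JVM options this advice maps to (HotSpot JDK 7-era flags; the 8g/2g/128m sizes echo the numbers in this thread and are illustrative, not recommendations; start.jar assumes the stock Solr 4.x Jetty launcher):

    # fixed heap, explicit newgen, CMS for the old gen, plus GC logging
    java -Xms8g -Xmx8g -Xmn2g -XX:MaxPermSize=128m \
         -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log \
         -jar start.jar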
Re: Solr Heap, MMaps and Garbage Collection
Actually, I haven't ever seen a PermGen of 2.8 GB, so you must have a very special use case with Solr. For my little index with 60 million docs and 170 GB index size, I gave PermGen 82 MB, and it is only using 50.6 MB for a single VM.

Permanent Generation (PermGen) is completely separate from the heap:

"Permanent Generation (non-heap): The pool containing all the reflective data of the virtual machine itself, such as class and method objects. With Java VMs that use class data sharing, this generation is divided into read-only and read-write areas."

Regards
Bernd

On 03.03.2014 07:54, KNitin wrote:
> Thanks, Walter
>
> Hit rate on the document caches is close to 70-80%, and the filter caches are a 100% hit (since most of our queries filter on the same fields but have a different q parameter). Query result cache is not of great importance to me, since the hit rate there is almost negligible.
>
> Does it mean I need to increase the size of my filter and document caches for large indices?
>
> My 25 GB heap usage splits up as follows:
>
> 1. 19 GB - Old Gen (100% pool utilization)
> 2. 3 GB - New Gen (50% pool utilization)
> 3. 2.8 GB - Perm Gen (I am guessing this is because of interned strings)
> 4. Survivor space is on the order of 300-400 MB and is almost always 100% full. (Is this a major issue?)
>
> We are also currently using the Parallel GC collector but planning to move to CMS for shorter stop-the-world GC times. If I increase the filter cache and document cache entry sizes, they would also go to the old gen, right?
>
> A very naive question: How is increasing young gen going to help if we know that Solr is already pushing major caches and other objects to old gen because of their nature? My young gen pool utilization is still well under 50%.
>
> Thanks
> Nitin

--
Bernd Fehling, Dipl.-Inform. (FH)
Bielefeld University Library
LibTec - Library Technology and Knowledge Management
Universitätsstr. 25, 33615 Bielefeld
Tel. +49 521 106-4060
bernd.fehling(at)uni-bielefeld.de
BASE - Bielefeld Academic Search Engine - www.base-search.net
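A minimal sketch of pinning PermGen the way Bernd describes (JDK 7-era flags, since PermGen was removed in JDK 8; the 82m/128m values come from this thread and are examples, not recommendations):

    # size the permanent generation explicitly
    java -XX:PermSize=82m -XX:MaxPermSize=128m -jar start.jar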
Re: Solr Heap, MMaps and Garbage Collection
Thanks, Walter

Hit rate on the document caches is close to 70-80%, and the filter caches are a 100% hit (since most of our queries filter on the same fields but have a different q parameter). Query result cache is not of great importance to me, since the hit rate there is almost negligible.

Does it mean I need to increase the size of my filter and document caches for large indices?

My 25 GB heap usage splits up as follows:

1. 19 GB - Old Gen (100% pool utilization)
2. 3 GB - New Gen (50% pool utilization)
3. 2.8 GB - Perm Gen (I am guessing this is because of interned strings)
4. Survivor space is on the order of 300-400 MB and is almost always 100% full. (Is this a major issue?)

We are also currently using the Parallel GC collector but planning to move to CMS for shorter stop-the-world GC times. If I increase the filter cache and document cache entry sizes, they would also go to the old gen, right?

A very naive question: How is increasing young gen going to help if we know that Solr is already pushing major caches and other objects to old gen because of their nature? My young gen pool utilization is still well under 50%.

Thanks
Nitin

On Sun, Mar 2, 2014 at 9:31 PM, Walter Underwood wrote:
> An LRU cache will always fill up the old generation. Old objects are ejected, and those are usually in the old generation.
>
> Increasing the heap size will not eliminate this. It will make major, stop-the-world collections longer.
>
> Increase the new generation size until the rate of old gen increase slows down. Then choose a total heap size to control the frequency (and duration) of major collections.
>
> We run with the new generation at about 25% of the heap, so 8 GB total and a 2 GB newgen.
>
> A 512-entry cache is very small for query results or docs. We run with 10K or more entries for those. The filter cache size depends on your usage. We have only a handful of different filter queries, so a tiny cache is fine.
>
> What is your hit rate on the caches?
>
> wunder
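For watching those pool numbers live, a small sketch (assumes a JDK 7 jstat against the Solr JVM's <pid>; the S0/S1/E/O/P columns report survivor, eden, old, and perm utilization as percentages):

    # sample GC pool utilization every 5 seconds
    jstat -gcutil <pid> 5000

Persistently full survivor spaces in that output usually mean objects are being promoted to the old gen before they can die young.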
Re: Solr Heap, MMaps and Garbage Collection
An LRU cache will always fill up the old generation. Old objects are ejected, and those are usually in the old generation.

Increasing the heap size will not eliminate this. It will make major, stop-the-world collections longer.

Increase the new generation size until the rate of old gen increase slows down. Then choose a total heap size to control the frequency (and duration) of major collections.

We run with the new generation at about 25% of the heap, so 8 GB total and a 2 GB newgen.

A 512-entry cache is very small for query results or docs. We run with 10K or more entries for those. The filter cache size depends on your usage. We have only a handful of different filter queries, so a tiny cache is fine.

What is your hit rate on the caches?

wunder

On Mar 2, 2014, at 7:42 PM, KNitin wrote:
> Hi
>
> I have a very large index for a few collections, and when they are being queried, I see the old gen space close to 100% usage all the time. The system becomes extremely slow due to GC activity right after that, and it gets into this cycle very often.
>
> I have given Solr close to 30 GB of heap on a 65 GB RAM machine, and the rest is given to RAM. I have a lot of hits in the filter, query result, and document caches, and the size of all the caches is around 512 entries per collection. Are all the caches used by Solr on or off heap?
>
> Given this scenario where GC is the primary bottleneck, what are good recommended memory settings for Solr? Should I increase the heap memory (which will only postpone the problem until the heap becomes full again after a while)? Will memory maps help at all in this scenario?
>
> Kindly advise on the best practices.
>
> Thanks
> Nitin
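To make those sizes concrete, a hedged solrconfig.xml sketch (the element and class names are standard Solr 4.x; the values echo the 512 vs. 10K numbers in this thread and are illustrative only):

    <!-- filter cache kept small, per the handful-of-filters usage above -->
    <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>

    <!-- query result and document caches sized in the 10K range -->
    <queryResultCache class="solr.LRUCache" size="10240" initialSize="1024" autowarmCount="512"/>
    <documentCache class="solr.LRUCache" size="10240" initialSize="1024" autowarmCount="0"/>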