Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Marc Des Garets
Big heap because of a very large number of requests across more than 60
indexes and hundreds of millions of documents (all indexes together). My
problem is with solr 4.1. All is perfect with 3.5: I have 0.05 sec GCs
every 1 or 2 minutes and 20GB of the heap is used.

With the 4.1 indexes it uses 30-33GB, the survivor space is all weird
(its capacity dropped to 6MB at some point) and I have 2 sec GCs every
minute.

Something must have changed in 4.1 compared to 3.5 to cause this
behavior. It's the same requests, same schemas (except 4 fields changed
from sint to tint) and same config.
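
(Side note: the survivor/heap numbers above are easy to watch live with
the standard JDK jstat tool, e.g.

jstat -gcutil <tomcat_pid> 5000

which prints eden/survivor/old occupancy percentages and GC counts/times
every 5 seconds. The PID placeholder is just whatever Tomcat runs as.)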

On 04/10/2013 07:38 PM, Shawn Heisey wrote:
 On 4/10/2013 9:48 AM, Marc Des Garets wrote:
 The JVM behavior is now radically different and doesn't seem to make
 sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

 The perm gen went from 410Mb to 600Mb.

 The eden space usage is a lot bigger and the survivor space usage is
 100% all the time.

 I don't really understand what is happening. GC behavior really doesn't
 seem right.

 My jvm settings:
 -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
 -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m
 As Otis has already asked, why do you have a 40GB heap?  The only way I 
 can imagine that you would actually NEED a heap that big is if your 
 index size is measured in hundreds of gigabytes.  If you really do need 
 a heap that big, you will probably need to go with a JVM like Zing.  I 
 don't know how much Zing costs, but they claim to be able to make any 
 heap size perform well under any load.  It is Linux-only.

 I was running into extreme problems with GC pauses with my own setup, 
 and that was only with an 8GB heap.  I was using the CMS collector and 
 NewRatio=1.  Switching to G1 didn't help at all - it might have even 
 made the problem worse.  I never did try the Zing JVM.

 After a lot of experimentation (which I will admit was not done very 
 methodically) I found JVM options that have reduced the GC pause problem 
 greatly.  Below is what I am using now on Solr 4.2.1 with a total 
 per-server index size of about 45GB.  This works properly on CentOS 6 
 with Oracle Java 7u17; UseLargePages may require special kernel tuning 
 on other operating systems:

 -Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 
 -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled 
 -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts

 These options could probably use further tuning, but I haven't had time 
 for the kind of testing that will be required.

 If you decide to pay someone to make the problem go away instead:

 http://www.azulsystems.com/products/zing/whatisit

 Thanks,
 Shawn






Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Furkan KAMACI
Hi Marc;

Could I ask what your index size is, and what your performance is in
queries per second?

2013/4/11 Marc Des Garets marc.desgar...@192.com

 Big heap because of a very large number of requests across more than 60
 indexes and hundreds of millions of documents (all indexes together). My
 problem is with solr 4.1. All is perfect with 3.5: I have 0.05 sec GCs
 every 1 or 2 minutes and 20GB of the heap is used.

 With the 4.1 indexes it uses 30-33GB, the survivor space is all weird
 (its capacity dropped to 6MB at some point) and I have 2 sec GCs every
 minute.

 Something must have changed in 4.1 compared to 3.5 to cause this
 behavior. It's the same requests, same schemas (except 4 fields changed
 from sint to tint) and same config.

 On 04/10/2013 07:38 PM, Shawn Heisey wrote:
  On 4/10/2013 9:48 AM, Marc Des Garets wrote:
  The JVM behavior is now radically different and doesn't seem to make
  sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.
 
  The perm gen went from 410Mb to 600Mb.
 
  The eden space usage is a lot bigger and the survivor space usage is
  100% all the time.
 
  I don't really understand what is happening. GC behavior really doesn't
  seem right.
 
  My jvm settings:
  -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
  -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m
  As Otis has already asked, why do you have a 40GB heap?  The only way I
  can imagine that you would actually NEED a heap that big is if your
  index size is measured in hundreds of gigabytes.  If you really do need
  a heap that big, you will probably need to go with a JVM like Zing.  I
  don't know how much Zing costs, but they claim to be able to make any
  heap size perform well under any load.  It is Linux-only.
 
  I was running into extreme problems with GC pauses with my own setup,
  and that was only with an 8GB heap.  I was using the CMS collector and
  NewRatio=1.  Switching to G1 didn't help at all - it might have even
  made the problem worse.  I never did try the Zing JVM.
 
  After a lot of experimentation (which I will admit was not done very
  methodically) I found JVM options that have reduced the GC pause problem
  greatly.  Below is what I am using now on Solr 4.2.1 with a total
  per-server index size of about 45GB.  This works properly on CentOS 6
  with Oracle Java 7u17; UseLargePages may require special kernel tuning
  on other operating systems:
 
  -Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
  -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled
  -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts
 
  These options could probably use further tuning, but I haven't had time
  for the kind of testing that will be required.
 
  If you decide to pay someone to make the problem go away instead:
 
  http://www.azulsystems.com/products/zing/whatisit
 
  Thanks,
  Shawn
 
 
 





Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Marc Des Garets
I have 45 solr 4.1 indexes. Sizes range from 600MB to 20GB:

- 1 is 20GB (80 million docs)
- 1 is 5.1GB (24 million docs)
- 1 is 5.6GB (26 million docs)
- 1 is 6.5GB (28 million docs)
- 11 others are about 2.2GB (6-7 million docs)
- 20 others are about 600MB (2.5 million docs)

That reminds me of something. The 4.1 indexes are about half the size of
the 3.5 indexes. For example, the one which is 20GB with solr 4.1 is 43GB
with solr 3.5. Maybe there is something there?
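
(Those are on-disk sizes of the index directories, i.e. roughly what
something like

du -sh /path/to/core/data/index

reports, with the path depending on your layout.)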

There are roughly 200 queries per second.


On 04/11/2013 11:07 AM, Furkan KAMACI wrote:
 Hi Marc;

 Could I ask what your index size is, and what your performance is in
 queries per second?

 2013/4/11 Marc Des Garets marc.desgar...@192.com

 Big heap because of a very large number of requests across more than 60
 indexes and hundreds of millions of documents (all indexes together). My
 problem is with solr 4.1. All is perfect with 3.5: I have 0.05 sec GCs
 every 1 or 2 minutes and 20GB of the heap is used.

 With the 4.1 indexes it uses 30-33GB, the survivor space is all weird
 (its capacity dropped to 6MB at some point) and I have 2 sec GCs every
 minute.

 Something must have changed in 4.1 compared to 3.5 to cause this
 behavior. It's the same requests, same schemas (except 4 fields changed
 from sint to tint) and same config.

 On 04/10/2013 07:38 PM, Shawn Heisey wrote:
 On 4/10/2013 9:48 AM, Marc Des Garets wrote:
 The JVM behavior is now radically different and doesn't seem to make
 sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

 The perm gen went from 410Mb to 600Mb.

 The eden space usage is a lot bigger and the survivor space usage is
 100% all the time.

 I don't really understand what is happening. GC behavior really doesn't
 seem right.

 My jvm settings:
 -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
 -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m
 As Otis has already asked, why do you have a 40GB heap?  The only way I
 can imagine that you would actually NEED a heap that big is if your
 index size is measured in hundreds of gigabytes.  If you really do need
 a heap that big, you will probably need to go with a JVM like Zing.  I
 don't know how much Zing costs, but they claim to be able to make any
 heap size perform well under any load.  It is Linux-only.

 I was running into extreme problems with GC pauses with my own setup,
 and that was only with an 8GB heap.  I was using the CMS collector and
 NewRatio=1.  Switching to G1 didn't help at all - it might have even
 made the problem worse.  I never did try the Zing JVM.

 After a lot of experimentation (which I will admit was not done very
 methodically) I found JVM options that have reduced the GC pause problem
 greatly.  Below is what I am using now on Solr 4.2.1 with a total
 per-server index size of about 45GB.  This works properly on CentOS 6
 with Oracle Java 7u17; UseLargePages may require special kernel tuning
 on other operating systems:

 -Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
 -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled
 -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts

 These options could probably use further tuning, but I haven't had time
 for the kind of testing that will be required.

 If you decide to pay someone to make the problem go away instead:

 http://www.azulsystems.com/products/zing/whatisit

 Thanks,
 Shawn








Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Jack Krupansky
Same config? Do a compare with the new example config and see what settings 
are different/changed. There may have been some defaults that changed. Read 
the comments in the new config.


If you have taken or merged the new config, then I would suggest making 
sure that the update log is not enabled (or, if it is, make sure you do 
hard commits relatively frequently rather than only soft commits).
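
For reference, the relevant pieces of solrconfig.xml look roughly like
this (values only illustrative, adjust to taste):

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- remove or comment this out if you don't need the transaction log -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
  <!-- if the update log stays enabled, hard commit regularly so the log
       gets rolled over instead of growing without bound -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>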


-- Jack Krupansky

-Original Message- 
From: Marc Des Garets

Sent: Thursday, April 11, 2013 3:07 AM
To: solr-user@lucene.apache.org
Subject: Re: migration solr 3.5 to 4.1 - JVM GC problems

Big heap because of a very large number of requests across more than 60
indexes and hundreds of millions of documents (all indexes together). My
problem is with solr 4.1. All is perfect with 3.5: I have 0.05 sec GCs
every 1 or 2 minutes and 20GB of the heap is used.

With the 4.1 indexes it uses 30-33GB, the survivor space is all weird
(its capacity dropped to 6MB at some point) and I have 2 sec GCs every
minute.

Something must have changed in 4.1 compared to 3.5 to cause this
behavior. It's the same requests, same schemas (except 4 fields changed
from sint to tint) and same config.

On 04/10/2013 07:38 PM, Shawn Heisey wrote:

On 4/10/2013 9:48 AM, Marc Des Garets wrote:

The JVM behavior is now radically different and doesn't seem to make
sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

The perm gen went from 410Mb to 600Mb.

The eden space usage is a lot bigger and the survivor space usage is
100% all the time.

I don't really understand what is happening. GC behavior really doesn't
seem right.

My jvm settings:
-d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
-XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m

As Otis has already asked, why do you have a 40GB heap?  The only way I
can imagine that you would actually NEED a heap that big is if your
index size is measured in hundreds of gigabytes.  If you really do need
a heap that big, you will probably need to go with a JVM like Zing.  I
don't know how much Zing costs, but they claim to be able to make any
heap size perform well under any load.  It is Linux-only.

I was running into extreme problems with GC pauses with my own setup,
and that was only with an 8GB heap.  I was using the CMS collector and
NewRatio=1.  Switching to G1 didn't help at all - it might have even
made the problem worse.  I never did try the Zing JVM.

After a lot of experimentation (which I will admit was not done very
methodically) I found JVM options that have reduced the GC pause problem
greatly.  Below is what I am using now on Solr 4.2.1 with a total
per-server index size of about 45GB.  This works properly on CentOS 6
with Oracle Java 7u17; UseLargePages may require special kernel tuning
on other operating systems:

-Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
-XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled
-XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts

These options could probably use further tuning, but I haven't had time
for the kind of testing that will be required.

If you decide to pay someone to make the problem go away instead:

http://www.azulsystems.com/products/zing/whatisit

Thanks,
Shawn









Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Marc Des Garets
Same config. I compared both; some defaults changed, like ramBufferSizeMB,
which I've set back to the 3.5 value (same with a few other things).

It gets even stranger. I have now changed the JVM settings
to this:
-d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=6
-XX:SurvivorRatio=2 -XX:G1ReservePercent=10 -XX:MaxGCPauseMillis=100
-XX:InitiatingHeapOccupancyPercent=30 -XX:PermSize=728m -XX:MaxPermSize=728m

So the Eden space is just 6GB, the survivor space is still weird (80MB)
and 100% full all the time, and the old gen is 34GB.

I now get GCs of just 0.07 sec every 30 seconds to 1 minute. Very regular,
like this:
[GC pause (young) 16214M->10447M(40960M), 0.0738720 secs]

Just 30% of the total heap is used.

After a while it does:
[GC pause (young) (initial-mark) 11603M->11391M(40960M), 0.100 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0172380]
[GC concurrent-mark-start]
[GC concurrent-mark-end, 0.4824210 sec]
[GC remark, 0.0248680 secs]
[GC cleanup 11476M->11476M(40960M), 0.0116420 secs]

Which looks pretty good. If I am not mistaken, concurrent-mark isn't
stop-the-world; remark is stop-the-world but takes just 0.02 sec, and GC
cleanup is also stop-the-world but takes just 0.01 sec.

By the look of it I could have a 20GB heap rather than 40... Now I am
waiting to see what happens when it clears the old gen, but that will
take a while because it is growing slowly.

Still mysterious to me, but it looks like it's all going to work out.
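
(For anyone reproducing this: the lines above are just basic GC logging
output, i.e. something along the lines of

-verbose:gc -Xloggc:/path/to/gc.log

optionally with -XX:+PrintGCTimeStamps and -XX:+PrintGCDetails for
timestamps and a per-phase breakdown. The log path is only an example.)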

On 04/11/2013 03:06 PM, Jack Krupansky wrote:
 Same config? Do a compare with the new example config and see what settings 
 are different/changed. There may have been some defaults that changed. Read 
 the comments in the new config.

 If you had just taken or merged the new config, then I would suggest making 
 sure that the update log is not enabled (or make sure you do hard commits 
 relatively frequently rather than only soft commits.)

 -- Jack Krupansky

 -Original Message- 
 From: Marc Des Garets
 Sent: Thursday, April 11, 2013 3:07 AM
 To: solr-user@lucene.apache.org
 Subject: Re: migration solr 3.5 to 4.1 - JVM GC problems

 Big heap because of a very large number of requests across more than 60
 indexes and hundreds of millions of documents (all indexes together). My
 problem is with solr 4.1. All is perfect with 3.5: I have 0.05 sec GCs
 every 1 or 2 minutes and 20GB of the heap is used.

 With the 4.1 indexes it uses 30-33GB, the survivor space is all weird
 (its capacity dropped to 6MB at some point) and I have 2 sec GCs every
 minute.

 Something must have changed in 4.1 compared to 3.5 to cause this
 behavior. It's the same requests, same schemas (except 4 fields changed
 from sint to tint) and same config.

 On 04/10/2013 07:38 PM, Shawn Heisey wrote:
 On 4/10/2013 9:48 AM, Marc Des Garets wrote:
 The JVM behavior is now radically different and doesn't seem to make
 sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

 The perm gen went from 410Mb to 600Mb.

 The eden space usage is a lot bigger and the survivor space usage is
 100% all the time.

 I don't really understand what is happening. GC behavior really doesn't
 seem right.

 My jvm settings:
 -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
 -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m
 As Otis has already asked, why do you have a 40GB heap?  The only way I
 can imagine that you would actually NEED a heap that big is if your
 index size is measured in hundreds of gigabytes.  If you really do need
 a heap that big, you will probably need to go with a JVM like Zing.  I
 don't know how much Zing costs, but they claim to be able to make any
 heap size perform well under any load.  It is Linux-only.

 I was running into extreme problems with GC pauses with my own setup,
 and that was only with an 8GB heap.  I was using the CMS collector and
 NewRatio=1.  Switching to G1 didn't help at all - it might have even
 made the problem worse.  I never did try the Zing JVM.

 After a lot of experimentation (which I will admit was not done very
 methodically) I found JVM options that have reduced the GC pause problem
 greatly.  Below is what I am using now on Solr 4.2.1 with a total
 per-server index size of about 45GB.  This works properly on CentOS 6
  with Oracle Java 7u17; UseLargePages may require special kernel tuning
 on other operating systems:

 -Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
 -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled
 -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts

 These options could probably use further tuning, but I haven't had time
 for the kind of testing that will be required.

  If you decide to pay someone to make the problem go away instead:

 http://www.azulsystems.com/products/zing/whatisit

 Thanks,
 Shawn





Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Otis Gospodnetic
Marc,

Re smaller index sizes - it's the stored field compression that didn't
exist in 3.x.
See https://issues.apache.org/jira/browse/SOLR-4375

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Thu, Apr 11, 2013 at 10:53 AM, Marc Des Garets
marc.desgar...@192.com wrote:
 Same config. I compared both, some defaults changed like ramBufferSize
 which I've set like in 3.5 (same with other things).

 It becomes even more strange to me. Now I have changed the jvm settings
 to this:
 -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=6
 -XX:SurvivorRatio=2 -XX:G1ReservePercent=10 -XX:MaxGCPauseMillis=100
 -XX:InitiatingHeapOccupancyPercent=30 -XX:PermSize=728m -XX:MaxPermSize=728m

 So the Eden space is just 6Gb, survivor space is still weird (80Mb) and
 full 100% of time, and old gen is 34Gb.

 I now get GCs of just 0.07 sec every 30 seconds to 1 minute. Very regular,
 like this:
 [GC pause (young) 16214M->10447M(40960M), 0.0738720 secs]

 Just 30% of the total heap is used.

 After a while it does:
 [GC pause (young) (initial-mark) 11603M->11391M(40960M), 0.100 secs]
 [GC concurrent-root-region-scan-start]
 [GC concurrent-root-region-scan-end, 0.0172380]
 [GC concurrent-mark-start]
 [GC concurrent-mark-end, 0.4824210 sec]
 [GC remark, 0.0248680 secs]
 [GC cleanup 11476M->11476M(40960M), 0.0116420 secs]

 Which looks pretty good. If I am not mistaken, concurrent-mark isn't
 stop the world. remark is stop the world but is just 0.02 sec and GC
 cleanup is also stop the world but is just 0.01 sec.

 By the look of it I could have a 20g heap rather than 40... Now I am
 waiting to see what happens when it will clear the old gen but that will
 take a while before it happens because it is growing slowly.

 Still mysterious to me but it looks like it's going to all work out.

 On 04/11/2013 03:06 PM, Jack Krupansky wrote:
 Same config? Do a compare with the new example config and see what settings
 are different/changed. There may have been some defaults that changed. Read
 the comments in the new config.

 If you had just taken or merged the new config, then I would suggest making
 sure that the update log is not enabled (or make sure you do hard commits
 relatively frequently rather than only soft commits.)

 -- Jack Krupansky

 -Original Message-
 From: Marc Des Garets
 Sent: Thursday, April 11, 2013 3:07 AM
 To: solr-user@lucene.apache.org
 Subject: Re: migration solr 3.5 to 4.1 - JVM GC problems

 Big heap because of a very large number of requests across more than 60
 indexes and hundreds of millions of documents (all indexes together). My
 problem is with solr 4.1. All is perfect with 3.5: I have 0.05 sec GCs
 every 1 or 2 minutes and 20GB of the heap is used.

 With the 4.1 indexes it uses 30-33GB, the survivor space is all weird
 (its capacity dropped to 6MB at some point) and I have 2 sec GCs every
 minute.

 Something must have changed in 4.1 compared to 3.5 to cause this
 behavior. It's the same requests, same schemas (except 4 fields changed
 from sint to tint) and same config.

 On 04/10/2013 07:38 PM, Shawn Heisey wrote:
 On 4/10/2013 9:48 AM, Marc Des Garets wrote:
 The JVM behavior is now radically different and doesn't seem to make
 sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

 The perm gen went from 410Mb to 600Mb.

 The eden space usage is a lot bigger and the survivor space usage is
 100% all the time.

 I don't really understand what is happening. GC behavior really doesn't
 seem right.

 My jvm settings:
 -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
 -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m
 As Otis has already asked, why do you have a 40GB heap?  The only way I
 can imagine that you would actually NEED a heap that big is if your
 index size is measured in hundreds of gigabytes.  If you really do need
 a heap that big, you will probably need to go with a JVM like Zing.  I
 don't know how much Zing costs, but they claim to be able to make any
 heap size perform well under any load.  It is Linux-only.

 I was running into extreme problems with GC pauses with my own setup,
 and that was only with an 8GB heap.  I was using the CMS collector and
 NewRatio=1.  Switching to G1 didn't help at all - it might have even
 made the problem worse.  I never did try the Zing JVM.

 After a lot of experimentation (which I will admit was not done very
 methodically) I found JVM options that have reduced the GC pause problem
 greatly.  Below is what I am using now on Solr 4.2.1 with a total
 per-server index size of about 45GB.  This works properly on CentOS 6
  with Oracle Java 7u17; UseLargePages may require special kernel tuning
 on other operating systems:

 -Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
 -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled
 -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts

  These options could probably use further tuning, but I haven't had time
  for the kind of testing that will be required.

migration solr 3.5 to 4.1 - JVM GC problems

2013-04-10 Thread Marc Des Garets
Hi,

I run multiple solr indexes in a single tomcat (1 webapp per index). All
the indexes were solr 3.5 and I have upgraded a few of them to solr 4.1
(about half of them).

The JVM behavior is now radically different and doesn't seem to make
sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

The perm gen went from 410Mb to 600Mb.

The eden space usage is a lot bigger and the survivor space usage is
100% all the time.

I don't really understand what is happening. GC behavior really doesn't
seem right.

My jvm settings:
-d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
-XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m

I have tried NewRatio=1 and SurvivorRatio=3, hoping to get the Survivor
space to not be 100% full all the time, but without success.

Here is what jmap is giving me:
Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize  = 42949672960 (40960.0MB)
   NewSize  = 1363144 (1.254223632812MB)
   MaxNewSize   = 17592186044415 MB
   OldSize  = 5452592 (5.169482421875MB)
   NewRatio = 1
   SurvivorRatio= 3
   PermSize = 754974720 (720.0MB)
   MaxPermSize  = 763363328 (728.0MB)
   G1HeapRegionSize = 16777216 (16.0MB)

Heap Usage:
G1 Heap:
   regions  = 2560
   capacity = 42949672960 (40960.0MB)
   used = 23786449912 (22684.526359558105MB)
   free = 19163223048 (18275.473640441895MB)
   55.382144432514906% used
G1 Young Generation:
Eden Space:
   regions  = 674
   capacity = 20619198464 (19664.0MB)
   used = 11307843584 (10784.0MB)
   free = 9311354880 (8880.0MB)
   54.841334418226204% used
Survivor Space:
   regions  = 115
   capacity = 1929379840 (1840.0MB)
   used = 1929379840 (1840.0MB)
   free = 0 (0.0MB)
   100.0% used
G1 Old Generation:
   regions  = 732
   capacity = 20401094656 (19456.0MB)
   used = 10549226488 (10060.526359558105MB)
   free = 9851868168 (9395.473640441895MB)
   51.70911985792612% used
Perm Generation:
   capacity = 754974720 (720.0MB)
   used = 514956504 (491.10079193115234MB)
   free = 240018216 (228.89920806884766MB)
   68.20844332377116% used
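
(That is the standard JDK jmap tool, i.e. roughly

jmap -heap <tomcat_pid>

run against the Tomcat process, in case anyone wants to compare.)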

The Survivor space even went up to 3.6GB but was still 100% used.

I have disabled all caches.
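
(By disabled I mean the usual cache entries in solrconfig.xml are not
configured, i.e. something like the following is commented out or removed,
sizes only illustrative:

<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="512" initialSize="512"/>
)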

Obviously I am getting very bad GC performance.

Any idea as to what could be wrong and why this could be happening?


Thanks,

Marc



Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-10 Thread Otis Gospodnetic
Hi Marc,

Why such a big heap?  Do you really need it?  You disabled all caches,
so the JVM really shouldn't need much memory.  Have you tried with
-Xmx20g or even -Xmx8g?  Aha, survivor is getting to 100% so you kept
increasing -Xmx?

Have you tried just not using any of these:
-XX:+UseG1GC -XX:NewRatio=1 -XX:SurvivorRatio=3 -XX:PermSize=728m
-XX:MaxPermSize=728m ?

My hunch is that there is a leak somewhere, because without caches you
shouldn't need a 40GB heap.
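
In other words, I'd start from something minimal, e.g. just

-d64 -server -Xms8g -Xmx8g -XX:MaxPermSize=728m

(sizes only as a starting point), let the JVM pick its own collector and
generation sizing, and look at the GC logs from there before adding flags
back one at a time.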

Otis
--
SOLR Performance Monitoring - http://sematext.com/spm/index.html
Solr & ElasticSearch Support
http://sematext.com/





On Wed, Apr 10, 2013 at 11:48 AM, Marc Des Garets
marc.desgar...@192.com wrote:
 Hi,

 I run multiple solr indexes in a single tomcat (1 webapp per index). All
 the indexes were solr 3.5 and I have upgraded a few of them to solr 4.1
 (about half of them).

 The JVM behavior is now radically different and doesn't seem to make
 sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

 The perm gen went from 410Mb to 600Mb.

 The eden space usage is a lot bigger and the survivor space usage is
 100% all the time.

 I don't really understand what is happening. GC behavior really doesn't
 seem right.

 My jvm settings:
 -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
 -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m

 I have tried NewRatio=1 and SurvivorRatio=3 hoping to get the Survivor
 space to not be 100% full all the time without success.

 Here is what jmap is giving me:
 Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize  = 42949672960 (40960.0MB)
NewSize  = 1363144 (1.254223632812MB)
MaxNewSize   = 17592186044415 MB
OldSize  = 5452592 (5.169482421875MB)
NewRatio = 1
SurvivorRatio= 3
PermSize = 754974720 (720.0MB)
MaxPermSize  = 763363328 (728.0MB)
G1HeapRegionSize = 16777216 (16.0MB)

 Heap Usage:
 G1 Heap:
regions  = 2560
capacity = 42949672960 (40960.0MB)
used = 23786449912 (22684.526359558105MB)
free = 19163223048 (18275.473640441895MB)
55.382144432514906% used
 G1 Young Generation:
 Eden Space:
regions  = 674
capacity = 20619198464 (19664.0MB)
used = 11307843584 (10784.0MB)
free = 9311354880 (8880.0MB)
54.841334418226204% used
 Survivor Space:
regions  = 115
capacity = 1929379840 (1840.0MB)
used = 1929379840 (1840.0MB)
free = 0 (0.0MB)
100.0% used
 G1 Old Generation:
regions  = 732
capacity = 20401094656 (19456.0MB)
used = 10549226488 (10060.526359558105MB)
free = 9851868168 (9395.473640441895MB)
51.70911985792612% used
 Perm Generation:
capacity = 754974720 (720.0MB)
used = 514956504 (491.10079193115234MB)
free = 240018216 (228.89920806884766MB)
68.20844332377116% used

 The Survivor space even went up to 3.6Gb but was still 100% used.

 I have disabled all caches.

 Obviously I am getting very bad GC performance.

 Any idea as to what could be wrong and why this could be happening?


 Thanks,

 Marc




Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-10 Thread Shawn Heisey

On 4/10/2013 9:48 AM, Marc Des Garets wrote:

The JVM behavior is now radically different and doesn't seem to make
sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

The perm gen went from 410Mb to 600Mb.

The eden space usage is a lot bigger and the survivor space usage is
100% all the time.

I don't really understand what is happening. GC behavior really doesn't
seem right.

My jvm settings:
-d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
-XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m


As Otis has already asked, why do you have a 40GB heap?  The only way I 
can imagine that you would actually NEED a heap that big is if your 
index size is measured in hundreds of gigabytes.  If you really do need 
a heap that big, you will probably need to go with a JVM like Zing.  I 
don't know how much Zing costs, but they claim to be able to make any 
heap size perform well under any load.  It is Linux-only.


I was running into extreme problems with GC pauses with my own setup, 
and that was only with an 8GB heap.  I was using the CMS collector and 
NewRatio=1.  Switching to G1 didn't help at all - it might have even 
made the problem worse.  I never did try the Zing JVM.


After a lot of experimentation (which I will admit was not done very 
methodically) I found JVM options that have reduced the GC pause problem 
greatly.  Below is what I am using now on Solr 4.2.1 with a total 
per-server index size of about 45GB.  This works properly on CentOS 6 
with Oracle Java 7u17; UseLargePages may require special kernel tuning 
on other operating systems:


-Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 
-XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled 
-XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts


These options could probably use further tuning, but I haven't had time 
for the kind of testing that will be required.
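
On the UseLargePages note above: on Linux the kernel tuning usually just
means reserving huge pages and giving the user running Solr a memlock
limit, along the lines of

vm.nr_hugepages = 3200    (in /etc/sysctl.conf; 3200 x 2MB pages covers a ~6GB heap)

plus a matching memlock entry in /etc/security/limits.conf. Those numbers
are only an example sized to my heap; if the pages can't be reserved the
JVM falls back to regular pages and prints a warning.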


If you decide to pay someone to make the problem go away instead:

http://www.azulsystems.com/products/zing/whatisit

Thanks,
Shawn