solr 4.4 config trouble

2013-09-30 Thread Marc des Garets

Hi,

I'm running solr in tomcat. I am trying to upgrade to solr 4.4 but I
can't get it to work. Could someone point out what I'm doing wrong?


tomcat context:
<Context crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/opt/solr4.4/solr_address" override="true" />
</Context>




core.properties:
name=address
collection=address
coreNodeName=address
dataDir=/opt/indexes4.1/address
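
For reference, a sketch of the layout that 4.4-style core discovery expects under solr home (the core directory name is illustrative, matching the core.properties above):

/opt/solr4.4/solr_address/        <- solr home (the JNDI solr/home value)
    solr.xml
    address/                      <- core instance dir, found via its core.properties
        core.properties
        conf/
            solrconfig.xml
            schema.xml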


solr.xml:



<solr>
  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">8080</int>
    <str name="hostContext">solr_address</str>
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
    <bool name="genericCoreNodeNames">false</bool>
  </solrcloud>

  <shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:0}</int>
    <int name="connTimeout">${connTimeout:0}</int>
  </shardHandlerFactory>
</solr>




In solrconfig.xml I have:
<luceneMatchVersion>4.1</luceneMatchVersion>

<dataDir>/opt/indexes4.1/address</dataDir>


And the log4j logs in catalina.out:
...
INFO: Deploying configuration descriptor solr_address.xml
0 [main] INFO org.apache.solr.servlet.SolrDispatchFilter – 
SolrDispatchFilter.init()
24 [main] INFO org.apache.solr.core.SolrResourceLoader – Using JNDI 
solr.home: /opt/solr4.4/solr_address
26 [main] INFO org.apache.solr.core.SolrResourceLoader – new 
SolrResourceLoader for directory: '/opt/solr4.4/solr_address/'
176 [main] INFO org.apache.solr.core.ConfigSolr – Loading container 
configuration from /opt/solr4.4/solr_address/solr.xml
272 [main] INFO org.apache.solr.core.SolrCoreDiscoverer – Looking for 
cores in /opt/solr4.4/solr_address
276 [main] INFO org.apache.solr.core.SolrCoreDiscoverer – Looking for 
cores in /opt/solr4.4/solr_address/conf
276 [main] INFO org.apache.solr.core.SolrCoreDiscoverer – Looking for 
cores in /opt/solr4.4/solr_address/conf/xslt
277 [main] INFO org.apache.solr.core.SolrCoreDiscoverer – Looking for 
cores in /opt/solr4.4/solr_address/conf/lang
278 [main] INFO org.apache.solr.core.SolrCoreDiscoverer – Looking for 
cores in /opt/solr4.4/solr_address/conf/velocity
283 [main] INFO org.apache.solr.core.CoreContainer – New CoreContainer 
991552899
284 [main] INFO org.apache.solr.core.CoreContainer – Loading cores into 
CoreContainer [instanceDir=/opt/solr4.4/solr_address/]
301 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
socketTimeout to: 0
301 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
urlScheme to: http://
301 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
connTimeout to: 0
302 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
maxConnectionsPerHost to: 20
302 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
corePoolSize to: 0
303 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
maximumPoolSize to: 2147483647
303 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
maxThreadIdleTime to: 5
303 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
sizeOfQueue to: -1
303 [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory – Setting 
fairnessPolicy to: false
320 [main] INFO org.apache.solr.client.solrj.impl.HttpClientUtil – 
Creating new http client, 
config:maxConnectionsPerHost=20&maxConnections=1&socketTimeout=0&connTimeout=0&retry=false
420 [main] INFO org.apache.solr.logging.LogWatcher – Registering Log 
Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)]
422 [main] INFO org.apache.solr.core.ZkContainer – Zookeeper 
client=192.168.10.206:2181
429 [main] INFO org.apache.solr.client.solrj.impl.HttpClientUtil – 
Creating new http client, 
config:maxConnections=500&maxConnectionsPerHost=16&socketTimeout=0&connTimeout=0
487 [main] INFO org.apache.solr.common.cloud.ConnectionManager – Waiting 
for client to connect to ZooKeeper
540 [main-EventThread] INFO 
org.apache.solr.common.cloud.ConnectionManager – Watcher 
org.apache.solr.common.cloud.ConnectionManager@7dc21ece 
name:ZooKeeperConnection Watcher:192.168.10.206:2181 got event 
WatchedEvent state:SyncConnected type:None path:null path:null type:None
541 [main] INFO org.apache.solr.common.cloud.ConnectionManager – Client 
is connected to ZooKeeper
562 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath: 
/overseer/queue
578 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath: 
/overseer/collection-queue-work
591 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath: 
/live_nodes
595 [main] INFO org.apache.solr.cloud.ZkController – Register node as 
live in ZooKeeper:/live_nodes/192.168.10.206:8080_solr_address
600 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath: 
/live_nodes/192.168.10.206:8080_solr_address
606 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath: 
/collections
613 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath: 
/overseer_elect/election
649 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath: 
/overseer_elect/leader
654 [main] INFO org.apache.solr.cloud.Overseer – Overseer 
(id=90474615489036288-192.168.10.206:8080_solr_address-n_00) 
starting
675 [main] INFO org.apache.solr.common.cloud.SolrZkClient – makePath: 
/overseer/queue-work
690 
[Overseer-90474615489036288-192.168.10.206:8080_solr_address-n_00] 
INFO org

SolrCloud - shard containing an invalid host:port

2013-09-03 Thread Marc des Garets

Hi,

I have set up SolrCloud with tomcat. I use solr 4.1.

I have zookeeper running on 192.168.1.10.
A tomcat running solr_myidx on 192.168.1.10 on port 8080.
A tomcat running solr_myidx on 192.168.1.11 on port 8080.

My solr.xml is like this:


<solr persistent="true">  <!-- attributes before hostPort are illustrative -->
  <cores adminPath="/admin/cores"
         hostPort="8080" hostContext="solr_myidx" zkClientTimeout="2">
    <core name="collection1" instanceDir="." />
  </cores>
</solr>


I have tomcat starting with: -Dbootstrap_conf=true 
-DzkHost=192.168.1.10:2181


Both Tomcats start up fine, but when I go to the Cloud tab in the solr
admin, I see the following:


collection1 --> shard1 --> 192.168.1.10:8983/solr
  192.168.1.11:8080/solr_ugc
  192.168.1.10:8080/solr_ugc

I don't know what 192.168.1.10:8983/solr is doing there. Do you know how
I can remove it?


It's causing the following error when I try to query the index:
SEVERE: Error while trying to recover. 
core=collection1:org.apache.solr.client.solrj.SolrServerException: 
Server refused connection at: http://192.168.10.206:8983/solr
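
For reference, a hedged sketch of how the registered cloud state can be inspected (standard Solr 4.x ZooKeeper paths; zkCli.sh ships with ZooKeeper):

zkCli.sh -server 192.168.1.10:2181
ls /live_nodes          # nodes currently registered as live
get /clusterstate.json  # shard/replica state; the stale 8983 entry would show up here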


Thanks,
Marc


Solr Admin: what does "Current" mean?

2013-05-10 Thread Marc Des Garets
Hi,

In the solr admin web interface, when looking at the statistics of a
collection (this page: http://{ip}:8080/{index}/#/collection1), there is
"Current" under Optimized.

What does it mean?


Thanks.


Re: SolrException parsing error

2013-04-16 Thread Marc Des Garets
Problem solved for me as well. The client is running in tomcat and the
connector had compression="true". I removed it and now it seems to work
fine.
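
For reference, a minimal sketch of the server.xml change (connector attributes illustrative):

<!-- before: response compression enabled on the connector -->
<Connector port="8080" protocol="HTTP/1.1" compression="true" />

<!-- after: attribute removed, which is equivalent to compression="off" -->
<Connector port="8080" protocol="HTTP/1.1" />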

On 04/16/2013 02:28 PM, Luis Lebolo wrote:
> Turns out I spoke too soon. I was *not* sending the query via POST.
> Changing the method to POST solved the issue for me (maybe I was hitting a
> GET limit somewhere?).
>
> -Luis
>
>
> On Tue, Apr 16, 2013 at 7:38 AM, Marc des Garets  wrote:
>
>> Did you find anything? I have the same problem but it's on update requests
>> only.
>>
>> The error comes from the solrj client indeed. It is solrj logging this
>> error. There is nothing in solr itself and it does the update correctly.
>> It's fairly small simple documents being updated.
>>
>>
>> On 04/15/2013 07:49 PM, Shawn Heisey wrote:
>>
>>> On 4/15/2013 9:47 AM, Luis Lebolo wrote:
>>>
>>>> Hi All,
>>>>
>>>> I'm using Solr 4.1 and am receiving an org.apache.solr.common.SolrException
>>>> "parsing error" with root cause java.io.EOFException (see below for stack
>>>> trace). The query I'm performing is long/complex and I wonder if its size
>>>> is causing the issue?
>>>>
>>>> I am querying via POST through SolrJ. The query (fq) itself is ~20,000
>>>> characters long in the form of:
>>>>
>>>> fq=(mutation_prot_mt_1_1:2374 + OR + mutation_prot_mt_2_1:2374 + OR +
>>>> mutation_prot_mt_3_1:2374 + ...) + OR + (mutation_prot_mt_1_2:2374 + OR +
>>>> mutation_prot_mt_2_2:2374 + OR + mutation_prot_mt_3_2:2374+...) + OR +
>>>> ...
>>>>
>>>> In short, I am querying for an ID throughout multiple dynamically created
>>>> fields (mutation_prot_mt_#_#).
>>>>
>>>> Any thoughts on how to further debug?
>>>>
>>>> Thanks in advance,
>>>> Luis
>>>>
>>>> --
>>>>
>>>> SEVERE: Servlet.service() for servlet [X] in context with path [/x] threw
>>>> exception [Request processing failed; nested exception is
>>>> org.apache.solr.common.SolrException: parsing error] with root cause
>>>> java.io.EOFException
>>>>     at org.apache.solr.common.util.FastInputStream.readByte(FastInputStream.java:193)
>>>>     at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:107)
>>>>     at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:387)
>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>>>>     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
>>>>     at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
>>>>
>>> I am guessing that this log is coming from your SolrJ client, but that is
>>> not completely clear, so is it SolrJ or Solr that is logging this error?
>>>  If it's SolrJ, do you see anything in the Solr log, and vice versa?
>>>
>>> This looks to me like a network problem, where something is dropping the
>>> connection before transfer is complete.  It could be an unusual server-side
>>> config, OS problems, timeout settings in the SolrJ code, NIC
>>> drivers/firmware, bad cables, bad network hardware, etc.
>>>
>>> Thanks,
>>> Shawn
>>>
>>>



Re: SolrException parsing error

2013-04-16 Thread Marc des Garets
Did you find anything? I have the same problem but it's on update 
requests only.


The error comes from the solrj client indeed. It is solrj logging this 
error. There is nothing in solr itself and it does the update correctly. 
It's fairly small simple documents being updated.


On 04/15/2013 07:49 PM, Shawn Heisey wrote:

On 4/15/2013 9:47 AM, Luis Lebolo wrote:

Hi All,

I'm using Solr 4.1 and am receiving an org.apache.solr.common.SolrException
"parsing error" with root cause java.io.EOFException (see below for stack
trace). The query I'm performing is long/complex and I wonder if its size
is causing the issue?

I am querying via POST through SolrJ. The query (fq) itself is ~20,000
characters long in the form of:

fq=(mutation_prot_mt_1_1:2374 + OR + mutation_prot_mt_2_1:2374 + OR +
mutation_prot_mt_3_1:2374 + ...) + OR + (mutation_prot_mt_1_2:2374 + OR +
mutation_prot_mt_2_2:2374 + OR + mutation_prot_mt_3_2:2374+...) + OR + ...

In short, I am querying for an ID throughout multiple dynamically created
fields (mutation_prot_mt_#_#).
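
A minimal SolrJ sketch of the request pattern described above (server URL illustrative; METHOD.POST keeps the long fq out of the request URL):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr");
SolrQuery q = new SolrQuery("*:*");
// in practice this filter is ~20,000 characters of OR'ed clauses
q.addFilterQuery("(mutation_prot_mt_1_1:2374 OR mutation_prot_mt_2_1:2374)");
QueryResponse rsp = server.query(q, SolrRequest.METHOD.POST);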

Any thoughts on how to further debug?

Thanks in advance,
Luis

--

SEVERE: Servlet.service() for servlet [X] in context with path [/x] threw
exception [Request processing failed; nested exception is
org.apache.solr.common.SolrException: parsing error] with root cause
java.io.EOFException
    at org.apache.solr.common.util.FastInputStream.readByte(FastInputStream.java:193)
    at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:107)
    at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:387)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)


I am guessing that this log is coming from your SolrJ client, but that is
not completely clear, so is it SolrJ or Solr that is logging this error?
If it's SolrJ, do you see anything in the Solr log, and vice versa?


This looks to me like a network problem, where something is dropping 
the connection before transfer is complete.  It could be an unusual 
server-side config, OS problems, timeout settings in the SolrJ code, 
NIC drivers/firmware, bad cables, bad network hardware, etc.


Thanks,
Shawn





Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Marc Des Garets
Same config. I compared both; some defaults changed, like ramBufferSizeMB,
which I've set as in 3.5 (same with other settings).

It gets even stranger to me. Now I have changed the jvm settings
to this:
-d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=6
-XX:SurvivorRatio=2 -XX:G1ReservePercent=10 -XX:MaxGCPauseMillis=100
-XX:InitiatingHeapOccupancyPercent=30 -XX:PermSize=728m -XX:MaxPermSize=728m

So the Eden space is just 6Gb, the survivor space is still weird (80Mb)
and 100% full all the time, and the old gen is 34Gb.

I now get GCs of just 0.07 sec every 30 sec to 1 min. Very regular, like this:
[GC pause (young) 16214M->10447M(40960M), 0.0738720 secs]

Just 30% of the total heap is used.

After a while it does:
[GC pause (young) (initial-mark) 11603M->11391M(40960M), 0.100 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0172380]
[GC concurrent-mark-start]
[GC concurrent-mark-end, 0.4824210 sec]
[GC remark, 0.0248680 secs]
[GC cleanup 11476M->11476M(40960M), 0.0116420 secs]

Which looks pretty good. If I am not mistaken, concurrent-mark isn't
stop-the-world; remark is stop-the-world but is just 0.02 sec, and GC
cleanup is also stop-the-world but is just 0.01 sec.

By the look of it I could have a 20g heap rather than 40... Now I am
waiting to see what happens when it clears the old gen, but that will
take a while because it is growing slowly.

Still mysterious to me, but it looks like it's all going to work out.
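
For reference, GC lines like the ones above come from HotSpot's GC logging; a hedged sketch of the flags involved (log path illustrative):

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/tomcat/gc.log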

On 04/11/2013 03:06 PM, Jack Krupansky wrote:
> Same config? Do a compare with the new example config and see what settings 
> are different/changed. There may have been some defaults that changed. Read 
> the comments in the new config.
>
> If you had just taken or merged the new config, then I would suggest making 
> sure that the update log is not enabled (or make sure you do hard commits 
> relatively frequently rather than only soft commits.)
>
> -- Jack Krupansky
>
> -Original Message- 
> From: Marc Des Garets
> Sent: Thursday, April 11, 2013 3:07 AM
> To: solr-user@lucene.apache.org
> Subject: Re: migration solr 3.5 to 4.1 - JVM GC problems
>
> Big heap because of a very large number of requests across more than 60 indexes
> and hundreds of millions of documents (all indexes together). My problem
> is with solr 4.1. All is perfect with 3.5. I have 0.05 sec GCs every 1
> or 2 min and 20Gb of the heap is used.
>
> With the 4.1 indexes it uses 30Gb-33Gb, the survivor space is all weird
> (it changed the size capacity to 6Mb at some point) and I have 2 sec GCs
> every minute.
>
> There must be something that has changed in 4.1 compared to 3.5 to cause
> this behavior. It's the same requests, same schemas (except 4 fields
> changed from sint to tint) and same config.
>
> On 04/10/2013 07:38 PM, Shawn Heisey wrote:
>> On 4/10/2013 9:48 AM, Marc Des Garets wrote:
>>> The JVM behavior is now radically different and doesn't seem to make
>>> sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.
>>>
>>> The perm gen went from 410Mb to 600Mb.
>>>
>>> The eden space usage is a lot bigger and the survivor space usage is
>>> 100% all the time.
>>>
>>> I don't really understand what is happening. GC behavior really doesn't
>>> seem right.
>>>
>>> My jvm settings:
>>> -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
>>> -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m
>> As Otis has already asked, why do you have a 40GB heap?  The only way I
>> can imagine that you would actually NEED a heap that big is if your
>> index size is measured in hundreds of gigabytes.  If you really do need
>> a heap that big, you will probably need to go with a JVM like Zing.  I
>> don't know how much Zing costs, but they claim to be able to make any
>> heap size perform well under any load.  It is Linux-only.
>>
>> I was running into extreme problems with GC pauses with my own setup,
>> and that was only with an 8GB heap.  I was using the CMS collector and
>> NewRatio=1.  Switching to G1 didn't help at all - it might have even
>> made the problem worse.  I never did try the Zing JVM.
>>
>> After a lot of experimentation (which I will admit was not done very
>> methodically) I found JVM options that have reduced the GC pause problem
>> greatly.  Below is what I am using now on Solr 4.2.1 with a total
>> per-server index size of about 45GB.  This works properly on CentOS 6
>> with Oracle Java 7u17, UseLargePages may require special kernel tuning
>> on other operating systems:
>>
>> -Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancy

Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Marc Des Garets
I have 45 solr 4.1 indexes. Sizes vary between 600Mb and 20Gb.

- 1 is 20Gb (80 million docs)
- 1 is 5.1Gb (24 million docs)
- 1 is 5.6Gb (26 million docs)
- 1 is 6.5Gb (28 million docs)
- 11 others are about 2.2Gb (6-7 million docs).
- 20 others are about 600Mb (2.5 million docs)

That reminds me of something: the 4.1 indexes are half the size of the
3.5 indexes. For example, the one which is 20Gb with solr 4.1 is 43Gb
with solr 3.5. Maybe there is something there?

There are roughly 200 queries per second.


On 04/11/2013 11:07 AM, Furkan KAMACI wrote:
> Hi Marc;
>
> Could you tell me your index size, and what is your performance in queries
> per second?
>
> 2013/4/11 Marc Des Garets 
>
>> Big heap because of a very large number of requests across more than 60 indexes
>> and hundreds of millions of documents (all indexes together). My problem
>> is with solr 4.1. All is perfect with 3.5. I have 0.05 sec GCs every 1
>> or 2 min and 20Gb of the heap is used.
>>
>> With the 4.1 indexes it uses 30Gb-33Gb, the survivor space is all weird
>> (it changed the size capacity to 6Mb at some point) and I have 2 sec GCs
>> every minute.
>>
>> There must be something that has changed in 4.1 compared to 3.5 to cause
>> this behavior. It's the same requests, same schemas (except 4 fields
>> changed from sint to tint) and same config.
>>
>> On 04/10/2013 07:38 PM, Shawn Heisey wrote:
>>> On 4/10/2013 9:48 AM, Marc Des Garets wrote:
>>>> The JVM behavior is now radically different and doesn't seem to make
>>>> sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.
>>>>
>>>> The perm gen went from 410Mb to 600Mb.
>>>>
>>>> The eden space usage is a lot bigger and the survivor space usage is
>>>> 100% all the time.
>>>>
>>>> I don't really understand what is happening. GC behavior really doesn't
>>>> seem right.
>>>>
>>>> My jvm settings:
>>>> -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
>>>> -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m
>>> As Otis has already asked, why do you have a 40GB heap?  The only way I
>>> can imagine that you would actually NEED a heap that big is if your
>>> index size is measured in hundreds of gigabytes.  If you really do need
>>> a heap that big, you will probably need to go with a JVM like Zing.  I
>>> don't know how much Zing costs, but they claim to be able to make any
>>> heap size perform well under any load.  It is Linux-only.
>>>
>>> I was running into extreme problems with GC pauses with my own setup,
>>> and that was only with an 8GB heap.  I was using the CMS collector and
>>> NewRatio=1.  Switching to G1 didn't help at all - it might have even
>>> made the problem worse.  I never did try the Zing JVM.
>>>
>>> After a lot of experimentation (which I will admit was not done very
>>> methodically) I found JVM options that have reduced the GC pause problem
>>> greatly.  Below is what I am using now on Solr 4.2.1 with a total
>>> per-server index size of about 45GB.  This works properly on CentOS 6
>>> with Oracle Java 7u17, UseLargePages may require special kernel tuning
>>> on other operating systems:
>>>
>>> -Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
>>> -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled
>>> -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts
>>>
>>> These options could probably use further tuning, but I haven't had time
>>> for the kind of testing that will be required.
>>>
>>> If you decide to pay someone to make the problem go away instead:
>>>
>>> http://www.azulsystems.com/products/zing/whatisit
>>>
>>> Thanks,
>>> Shawn
>>>
>>>
>>>
>>

Re: migration solr 3.5 to 4.1 - JVM GC problems

2013-04-11 Thread Marc Des Garets
Big heap because of a very large number of requests across more than 60 indexes
and hundreds of millions of documents (all indexes together). My problem
is with solr 4.1. All is perfect with 3.5. I have 0.05 sec GCs every 1
or 2 min and 20Gb of the heap is used.

With the 4.1 indexes it uses 30Gb-33Gb, the survivor space is all weird
(it changed the size capacity to 6Mb at some point) and I have 2 sec GCs
every minute.

There must be something that has changed in 4.1 compared to 3.5 to cause
this behavior. It's the same requests, same schemas (except 4 fields
changed from sint to tint) and same config.

On 04/10/2013 07:38 PM, Shawn Heisey wrote:
> On 4/10/2013 9:48 AM, Marc Des Garets wrote:
>> The JVM behavior is now radically different and doesn't seem to make
>> sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.
>>
>> The perm gen went from 410Mb to 600Mb.
>>
>> The eden space usage is a lot bigger and the survivor space usage is
>> 100% all the time.
>>
>> I don't really understand what is happening. GC behavior really doesn't
>> seem right.
>>
>> My jvm settings:
>> -d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
>> -XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m
> As Otis has already asked, why do you have a 40GB heap?  The only way I 
> can imagine that you would actually NEED a heap that big is if your 
> index size is measured in hundreds of gigabytes.  If you really do need 
> a heap that big, you will probably need to go with a JVM like Zing.  I 
> don't know how much Zing costs, but they claim to be able to make any 
> heap size perform well under any load.  It is Linux-only.
>
> I was running into extreme problems with GC pauses with my own setup, 
> and that was only with an 8GB heap.  I was using the CMS collector and 
> NewRatio=1.  Switching to G1 didn't help at all - it might have even 
> made the problem worse.  I never did try the Zing JVM.
>
> After a lot of experimentation (which I will admit was not done very 
> methodically) I found JVM options that have reduced the GC pause problem 
> greatly.  Below is what I am using now on Solr 4.2.1 with a total 
> per-server index size of about 45GB.  This works properly on CentOS 6 
> with Oracle Java 7u17, UseLargePages may require special kernel tuning 
> on other operating systems:
>
> -Xmx6144M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 
> -XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled 
> -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts
>
> These options could probably use further tuning, but I haven't had time 
> for the kind of testing that will be required.
>
> If you decide to pay someone to make the problem go away instead:
>
> http://www.azulsystems.com/products/zing/whatisit
>
> Thanks,
> Shawn
>
>
>



migration solr 3.5 to 4.1 - JVM GC problems

2013-04-10 Thread Marc Des Garets
Hi,

I run multiple solr indexes in a single tomcat (1 webapp per index). All
the indexes are solr 3.5 and I have upgraded a few of them to solr 4.1
(about half of them).

The JVM behavior is now radically different and doesn't seem to make
sense. I was using ConcMarkSweepGC. I am now trying the G1 collector.

The perm gen went from 410Mb to 600Mb.

The eden space usage is a lot bigger and the survivor space usage is
100% all the time.

I don't really understand what is happening. GC behavior really doesn't
seem right.

My jvm settings:
-d64 -server -Xms40g -Xmx40g -XX:+UseG1GC -XX:NewRatio=1
-XX:SurvivorRatio=3 -XX:PermSize=728m -XX:MaxPermSize=728m

I have tried NewRatio=1 and SurvivorRatio=3, hoping to get the Survivor
space to not be 100% full all the time, without success.

Here is what jmap -heap is giving me:
Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize  = 42949672960 (40960.0MB)
   NewSize  = 1363144 (1.254223632812MB)
   MaxNewSize   = 17592186044415 MB
   OldSize  = 5452592 (5.169482421875MB)
   NewRatio = 1
   SurvivorRatio= 3
   PermSize = 754974720 (720.0MB)
   MaxPermSize  = 763363328 (728.0MB)
   G1HeapRegionSize = 16777216 (16.0MB)

Heap Usage:
G1 Heap:
   regions  = 2560
   capacity = 42949672960 (40960.0MB)
   used = 23786449912 (22684.526359558105MB)
   free = 19163223048 (18275.473640441895MB)
   55.382144432514906% used
G1 Young Generation:
Eden Space:
   regions  = 674
   capacity = 20619198464 (19664.0MB)
   used = 11307843584 (10784.0MB)
   free = 9311354880 (8880.0MB)
   54.841334418226204% used
Survivor Space:
   regions  = 115
   capacity = 1929379840 (1840.0MB)
   used = 1929379840 (1840.0MB)
   free = 0 (0.0MB)
   100.0% used
G1 Old Generation:
   regions  = 732
   capacity = 20401094656 (19456.0MB)
   used = 10549226488 (10060.526359558105MB)
   free = 9851868168 (9395.473640441895MB)
   51.70911985792612% used
Perm Generation:
   capacity = 754974720 (720.0MB)
   used = 514956504 (491.10079193115234MB)
   free = 240018216 (228.89920806884766MB)
   68.20844332377116% used

The Survivor space even went up to 3.6Gb but was still 100% used.

I have disabled all caches.

Obviously I am getting very bad GC performance.

Any idea as to what could be wrong and why this could be happening?


Thanks,

Marc



RE: question about StandardAnalyzer, differences between solr 1.4 and solr 3.3

2011-09-09 Thread Marc Des Garets
Ok thanks. I don't know why the behaviour is different from my 1.4 index then,
but hopefully doing what you suggest will make it behave the same.

Thanks again,

Marc

-Original Message-
From: Steven A Rowe [mailto:sar...@syr.edu] 
Sent: 09 September 2011 14:40
To: solr-user@lucene.apache.org
Subject: RE: question about StandardAnalyzer, differences between solr 1.4 and 
solr 3.3

Hi Marc,

StandardAnalyzer includes StopFilter.  See the Javadocs for Lucene 3.3 here: 
<http://lucene.apache.org/java/3_3_0/api/all/org/apache/lucene/analysis/standard/StandardAnalyzer.html>

This is not new behavior - StandardAnalyzer in Lucene 2.9.1 (the version of 
Lucene bundled with Solr 1.4) also includes a StopFilter: 
<http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/analysis/standard/StandardAnalyzer.html>
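
In code terms, a hedged sketch of the difference (Lucene 3.x API; passing an empty stop set disables stop filtering):

import java.util.Collections;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;

// default: English stop words, including "a", are removed at index and query time
Analyzer withStops = new StandardAnalyzer(Version.LUCENE_33);
// same tokenizer/filters, but with no stop words at all
Analyzer noStops = new StandardAnalyzer(Version.LUCENE_33, Collections.<Object>emptySet());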

If you don't want a StopFilter configured, you can specify the individual
components directly, e.g. to get the equivalent of StandardAnalyzer, but
without the StopFilter:

<fieldType name="text_std" class="solr.TextField">  <!-- type name illustrative -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Steve

> -Original Message-
> From: Marc Des Garets [mailto:marc.desgar...@192.com]
> Sent: Friday, September 09, 2011 6:21 AM
> To: solr-user@lucene.apache.org
> Subject: question about StandardAnalyzer, differences between solr 1.4
> and solr 3.3
> 
> Hi,
> 
> I have a simple field defined like this:
>
> <fieldType name="text_std" class="solr.TextField">  <!-- type name illustrative -->
>   <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
> </fieldType>
>
> Which I use here:
> <field name="middlename" type="text_std" required="false" />
>
> In solr 1.4, I could do:
> ?q=(middlename:a*)
>
> And I was getting all documents where middlename = A or where middlename
> starts with the letter A.
>
> In solr 3.3, I get only results where middlename starts with the letter A
> but not where middlename is equal to A.
>
> The thing is, this happens only with the letter A; with other letters it
> is fine: I get the ones starting with the letter and the ones equal to the
> letter. My guess is that it treats "a" as the English article, but I do
> not specify any filter with stopwords, so how come the behaviour with the
> letter A is different from the other letters? Is there a bug? How can I
> change my field to work with the letter A the same way it does with
> other letters?
> 
> 
> Thanks,
> Marc


question about StandardAnalyzer, differences between solr 1.4 and solr 3.3

2011-09-09 Thread Marc Des Garets
Hi,

I have a simple field defined like this:

<fieldType name="text_std" class="solr.TextField">  <!-- type name illustrative -->
  <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
</fieldType>

Which I use here:
<field name="middlename" type="text_std" required="false" />

In solr 1.4, I could do:
?q=(middlename:a*)

And I was getting all documents where middlename = A or where middlename starts
with the letter A.

In solr 3.3, I get only results where middlename starts with the letter A but not
where middlename is equal to A.

The thing is, this happens only with the letter A; with other letters it is
fine: I get the ones starting with the letter and the ones equal to the letter.
My guess is that it treats "a" as the English article, but I do not specify any
filter with stopwords, so how come the behaviour with the letter A is different
from the other letters? Is there a bug? How can I change my field to work with
the letter A the same way it does with other letters?


Thanks,
Marc


HTTP Status 500 - null java.lang.IllegalArgumentException at java.nio.Buffer.limit(Buffer.java:249)

2010-03-18 Thread Marc Des Garets
Hi,

 

I am doing a really simple query on my index (it's running in tomcat):

http://host:8080/solr_er_07_09/select/?q=hash_id:123456

 

I am getting the following exception:

HTTP Status 500 - null java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:249)
    at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:123)
    at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
    at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
    at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
    at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:214)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:948)
    at org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:506)
    at org.apache.lucene.index.IndexReader.document(IndexReader.java:947)
    at org.apache.solr.search.SolrIndexReader.document(SolrIndexReader.java:444)
    at org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:427)
    at org.apache.solr.util.SolrPluginUtils.optimizePreFetchDocs(SolrPluginUtils.java:267)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:269)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
    at java.lang.Thread.run(Thread.java:595)

 

Does anyone have any idea why this would happen?

 

I built the index on a different machine than the one I am doing the
query on, though the configuration is exactly the same. I can do the same
query using solrj (I have an app doing that) and it works fine.
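
One way to verify the copied index files themselves is Lucene's CheckIndex tool (a hedged sketch; the jar version and index path are illustrative):

java -cp lucene-core-2.9.1.jar org.apache.lucene.index.CheckIndex /path/to/solr_er_07_09/data/index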

 

 

Thanks.

RE: question about mergeFactor

2010-03-08 Thread Marc Des Garets
Perfect. Thank you for your help.

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: 08 March 2010 12:57
To: solr-user@lucene.apache.org
Subject: Re: question about mergeFactor

On Mon, Mar 8, 2010 at 5:31 PM, Marc Des Garets wrote:

>
> If I have a mergeFactor of 50 when I build the index and then I optimize
> the index, I end up with 1 index file, so I have a small number of index
> files, and having used a mergeFactor of 50 won't slow searching? Or is my
> supposition wrong, and the mergeFactor used when building the index
> has an impact on search speed anyway?
>
>
If you optimize then mergeFactor does not matter and your searching speed
will not be slowed down. On the other hand, the optimize may take the bulk
of the indexing time, so you won't get any benefit from using a mergeFactor
of 50.

-- 
Regards,
Shalin Shekhar Mangar.

question about mergeFactor

2010-03-08 Thread Marc Des Garets
Hello,

 

On the solr wiki, here:
http://wiki.apache.org/solr/SolrPerformanceFactors

 

It is written:

mergeFactor Tradeoffs

High value merge factor (e.g., 25):
- Pro: Generally improves indexing speed
- Con: Less frequent merges, resulting in a collection with more index
  files which may slow searching

Low value merge factor (e.g., 2):
- Pro: Smaller number of index files, which speeds up searching.
- Con: More segment merges slow down indexing.

 

If I have a mergeFactor of 50 when I build the index and then I optimize
the index, I end up with 1 index file, so I have a small number of index
files, and having used a mergeFactor of 50 won't slow searching? Or is my
supposition wrong, and the mergeFactor used when building the index
has an impact on search speed anyway?
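
For concreteness, a hedged sketch of the pieces involved (Solr 1.4-era solrconfig.xml; values illustrative):

<mainIndex>
  <mergeFactor>50</mergeFactor>
  <ramBufferSizeMB>512</ramBufferSizeMB>
</mainIndex>

and the optimize afterwards, which merges everything down to a single segment:

curl 'http://localhost:8080/solr/update?optimize=true'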

 

 

Thanks.

RE: Problem comitting on 40GB index

2010-01-13 Thread Marc Des Garets
Just curious, have you checked whether the hanging you are experiencing is
garbage collection related?
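
A quick hedged way to check, with standard JDK tooling (pid illustrative):

jstat -gcutil <tomcat-pid> 5000    # FGC/FGCT climbing while it hangs => GC-bound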

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 13 January 2010 13:33
To: solr-user@lucene.apache.org
Subject: Re: Problem comitting on 40GB index

That's my understanding.. But fortunately disk space is cheap 


On Wed, Jan 13, 2010 at 5:01 AM, Frederico Azeiteiro <
frederico.azeite...@cision.com> wrote:

> Sorry, my bad... I replied to a current mailing list message, only changing
> the subject... Didn't know about this "hijacking" problem. Will not happen
> again.
>
> Just to close this issue, if I understand correctly, for an index of 40G
> I will need, for running an optimize:
> - 40G if all activity on the index is stopped
> - 80G if the index is being searched
> - 120G if the index is being searched and a commit is performed.
> Is this correct?
>
> Thanks.
> Frederico
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: terça-feira, 12 de Janeiro de 2010 19:18
> To: solr-user@lucene.apache.org
> Subject: Re: Problem comitting on 40GB index
>
> Huh?
>
> On Tue, Jan 12, 2010 at 2:00 PM, Chris Hostetter
> wrote:
>
> >
> > : Subject: Problem comitting on 40GB index
> > : In-Reply-To: <
> > 7a9c48b51001120345h5a57dbd4o8a8a39fc4a98a...@mail.gmail.com>
> >
> > http://people.apache.org/~hossman/#threadhijack
> > Thread Hijacking on Mailing Lists
> >
> > When starting a new discussion on a mailing list, please do not reply to
> > an existing message, instead start a fresh email.  Even if you change the
> > subject line of your email, other mail headers still track which thread
> > you replied to and your question is "hidden" in that thread and gets less
> > attention.   It makes following discussions in the mailing list archives
> > particularly difficult.
> > See Also:  http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking
> >
> >
> >
> > -Hoss
> >
> >
>

RE: update solr index

2010-01-12 Thread Marc Des Garets
I have 2 ways to update the index: either I use solrj with an
EmbeddedSolrServer, or I do it with an http query. If I do it with an
http query I indeed don't stop tomcat, but I have to do some operations
(mainly taking the instance out of the cluster) and I can't automate that
process, whereas I can automate the update itself, which is my goal.
That's why I'm trying to do the update without garbage collection problems.

I have disabled the caches as I don't need them. Is it running a number of old
queries to re-generate the cache anyway, or is it a different cache you
are talking about? But I believe it still has to register a new
searcher. I don't know what the impact of this is, though.
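
For reference, a sketch of how the caches are declared with warming disabled (Solr 1.4 solrconfig.xml; class choices and sizes illustrative):

<filterCache class="solr.FastLRUCache" size="0" initialSize="0" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="0" initialSize="0" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="0" initialSize="0"/>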

I guess I will go for running more tomcat instances, each running fewer
indexes with a lower JVM heap.

Thank you for the link and thanks for the reply.


Marc

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: 12 January 2010 07:49
To: solr-user@lucene.apache.org
Subject: Re: update solr index

On Mon, Jan 11, 2010 at 7:42 PM, Marc Des Garets wrote:

>
> I am running solr in tomcat and I have about 35 indexes (between 2 and
> 80 million documents each). Currently if I try to update a few documents
> from an index (let's say the one which contains 80 million documents)
> while tomcat is running and therefore receiving requests, I am getting
> a few very long garbage collections (about 60 sec). I am running tomcat with
> -Xms10g -Xmx10g -Xmn2g -XX:PermSize=256m -XX:MaxPermSize=256m. I'm using
> ConcMarkSweepGC.
>
> I have 2 questions:
> 1. Is solr doing something specific while an index is being updated, like
> updating something in memory, which would cause the garbage collection?
>

Solr's caches are thrown away and a fixed number of old queries are
re-executed to re-generate the cache on the new index (known as
auto-warming). This happens on a commit.


>
> 2. Any idea how I could solve this problem? Currently I stop tomcat,
> update index, start tomcat. I would like to be able to update my index
> while tomcat is running. I was thinking about running more tomcat
> instances with less memory for each, each running a few of my indexes.
> Do you think that would be the best way to go?
>
>
If you stop tomcat, how do you update the index? Are you running a
multi-core setup? Perhaps it is better to split up the indexes among
multiple boxes. Also, you should probably lower the JVM heap so that the
full GC pause doesn't make your index unavailable for such a long time.

Also see
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr

-- 
Regards,
Shalin Shekhar Mangar.

update solr index

2010-01-11 Thread Marc Des Garets
Hi,

I am running solr in tomcat and I have about 35 indexes (between 2 and
80 million documents each). Currently if I try to update a few documents
from an index (let's say the one which contains 80 million documents)
while tomcat is running and therefore receiving requests, I am getting
a few very long garbage collections (about 60 sec). I am running tomcat with
-Xms10g -Xmx10g -Xmn2g -XX:PermSize=256m -XX:MaxPermSize=256m. I'm using
ConcMarkSweepGC.

I have 2 questions:
1. Is solr doing something specific while an index is being updated, like
updating something in memory, which would cause the garbage collection?

2. Any idea how I could solve this problem? Currently I stop tomcat,
update the index, start tomcat. I would like to be able to update my index
while tomcat is running. I was thinking about running more tomcat
instances with less memory for each, each running a few of my indexes.
Do you think that would be the best way to go?


Thanks,
Marc

RE: very slow add/commit time

2009-11-03 Thread Marc Des Garets
If you mean ramBufferSizeMB, I have it set to 512. The maxBufferedDocs
is commented out. If you mean queryResultMaxDocsCached, it is set to 200,
but is it used when indexing?
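
In solrconfig.xml terms, a sketch of what I mean (values as described above; the commented element stays commented):

<mainIndex>
  <ramBufferSizeMB>512</ramBufferSizeMB>
  <!-- <maxBufferedDocs>...</maxBufferedDocs> -->
</mainIndex>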

-Original Message-
From: Bruno [mailto:brun...@gmail.com] 
Sent: 03 November 2009 14:27
To: solr-user@lucene.apache.org
Subject: Re: very slow add/commit time

How many MB of cache have you set in your solrconfig.xml?

On Tue, Nov 3, 2009 at 12:24 PM, Marc Des Garets wrote:

> Hi,
>
>
>
> I am experiencing a problem with an index of about 80 million documents
> (41Gb). I am trying to update documents in this index using Solrj.
>
>
>
> When I do:
>
> solrServer.add(docs);  // docs is a List<SolrInputDocument> that contains
> 1000 documents (takes 36 sec)
>
> solrServer.commit(false,false); // either never ends with an OutOfMemory
> error or takes forever
>
>
>
> I have -Xms4g -Xmx4g
>
>
>
> Any idea what could be the problem?
>
>
>
> Thanks for your help.
>
>




-- 
Bruno Morelli Vargas
Mail: brun...@gmail.com
Msn: brun...@hotmail.com
Icq: 165055101
Skype: morellibmv

very slow add/commit time

2009-11-03 Thread Marc Des Garets
Hi,

 

I am experiencing a problem with an index of about 80 million documents
(41Gb). I am trying to update documents in this index using Solrj.

 

When I do:

solrServer.add(docs);  // docs is a List<SolrInputDocument> that contains
1000 documents (takes 36 sec)

solrServer.commit(false,false); // either never ends with an OutOfMemory
error or takes forever
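
A hedged sketch of the batching pattern involved (SolrJ 1.4 API; docs and solrServer as above, batch size illustrative):

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.common.SolrInputDocument;

// send smaller batches, then a single commit at the end
List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
for (SolrInputDocument doc : docs) {
    batch.add(doc);
    if (batch.size() == 100) {
        solrServer.add(batch);   // adds without forcing a commit
        batch.clear();
    }
}
if (!batch.isEmpty()) solrServer.add(batch);
solrServer.commit(false, false); // waitFlush=false, waitSearcher=false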

 

I have -Xms4g -Xmx4g

 

Any idea what could be the problem?

 

Thanks for your help.

 