Re: SolrCloud setup - any advice?

2013-09-19 Thread Shreejay Nair
Hi Neil,

Although you haven't mentioned it, just wanted to confirm - do you have
soft commits enabled?

Also, what version of Solr are you using for the SolrCloud setup?
4.0.0 had lots of memory and ZooKeeper-related issues. What's the warmup time
for your caches? Have you tried disabling the caches?

Is this a static index, or are documents added continuously?

The answers to these questions might help us pinpoint the issue...

On Thursday, September 19, 2013, Neil Prosser wrote:

 Apologies for the giant email. Hopefully it makes sense.

 We've been trying out SolrCloud to solve some scalability issues with our
 current setup and have run into problems. I'd like to describe our current
 setup, our queries and the sort of load we see and am hoping someone might
 be able to spot the massive flaw in the way I've been trying to set things
 up.

 We currently run Solr 4.0.0 in the old style Master/Slave replication. We
 have five slaves, each running Centos with 96GB of RAM, 24 cores and with
 48GB assigned to the JVM heap. Disks aren't crazy fast (i.e. not SSDs) but
 aren't slow either. Our GC parameters aren't particularly exciting, just
 -XX:+UseConcMarkSweepGC. Java version is 1.7.0_11.

 Our index size ranges between 144GB and 200GB (when we optimise it back
 down, since we've had bad experiences with large cores). We've got just
 over 37M documents; some are smallish but most range between 1000-6000
 bytes. We regularly update documents, so large portions of the index will be
 touched, leading to a maxDocs value of around 43M.

 Query load ranges between 400req/s to 800req/s across the five slaves
 throughout the day, increasing and decreasing gradually over a period of
 hours, rather than bursting.

 Most of our documents have upwards of twenty fields. We use different
 fields to store territory variant (we have around 30 territories) values
 and also boost based on the values in some of these fields (integer ones).

 So an average query can do a range filter by two of the territory variant
 fields, filter by a non-territory variant field. Facet by a field or two
 (may be territory variant). Bring back the values of 60 fields. Boost query
 on field values of a non-territory variant field. Boost by values of two
 territory-variant fields. Dismax query on up to 20 fields (with boosts) and
 phrase boost on those fields too. They're pretty big queries. We don't do
 any index-time boosting. We try to keep things dynamic so we can alter our
 boosts on-the-fly.

 Another common query is to list documents with a given set of IDs and
 select documents with a common reference and order them by one of their
 fields.

 Auto-commit every 30 minutes. Replication polls every 30 minutes.

 Document cache:
   * initialSize - 32768
   * size - 32768

 Filter cache:
   * autowarmCount - 128
   * initialSize - 8192
   * size - 8192

 Query result cache:
   * autowarmCount - 128
   * initialSize - 8192
   * size - 8192

 After a replicated core has finished downloading (probably while it's
 warming) we see requests which usually take around 100ms taking over 5s. GC
 logs show concurrent mode failure.

 I was wondering whether anyone can help with sizing the boxes required to
 split this index down into shards for use with SolrCloud and roughly how
 much memory we should be assigning to the JVM. Everything I've read
 suggests that running with a 48GB heap is way too high but every attempt
 I've made to reduce the cache sizes seems to wind up causing out-of-memory
 problems. Even dropping all cache sizes by 50% and reducing the heap by 50%
 caused problems.

 I've already tried SolrCloud with 10 shards (around 3.7M documents per
 shard, each with one replica) and kept the cache sizes low:

 Document cache:
   * initialSize - 1024
   * size - 1024

 Filter cache:
   * autowarmCount - 128
   * initialSize - 512
   * size - 512

 Query result cache:
   * autowarmCount - 32
   * initialSize - 128
   * size - 128

 Even when running on six machines in AWS with SSDs, 24GB heap (out of 60GB
 memory) and four shards on two boxes and three on the rest I still see
 concurrent mode failure. This looks like it's causing ZooKeeper to mark the
 node as down and things begin to struggle.

 Is concurrent mode failure just something that will inevitably happen or is
 it avoidable by dropping the CMSInitiatingOccupancyFraction?

 If anyone has anything that might shove me in the right direction I'd be
 very grateful. I'm wondering whether our set-up will just never work and
 maybe we're expecting too much.

 Many thanks,

 Neil



Re: What do you use for solr's logging analysis?

2013-08-11 Thread Shreejay Nair
There are a lot of tools out there with varying degrees of functionality (and
ease of setup). We also have multiple Solr servers in production (both cloud
and single nodes), and we have decided to use http://loggly.com/. We will
probably be setting it up for all our servers in the next few weeks.

There are plenty of other such log analysis tools. It all depends on your
particular use case.

--Shreejay



On Sunday, August 11, 2013, adfel70 wrote:

 Hi
 I'm looking for a tool that could help me perform Solr logging analysis.
 I use SolrCloud on multiple servers, so the tool should be able to collect
 logs from multiple servers.

 Is there any tool you use and can advise on?

 Thanks



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/What-do-you-use-for-solr-s-logging-analysis-tp4083809.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Shreejay Nair
Sent from my mobile device. Please excuse brevity and typos.


Re: commit vs soft-commit

2013-08-11 Thread Shreejay Nair
Yes, a new searcher is opened with every soft commit. It's still considered
faster because it does not fsync index files to disk, which is a slow I/O
operation and can take a lot more time.
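
For reference, here is a sketch of how the two commit types are typically
configured in solrconfig.xml; the interval values are purely illustrative,
not recommendations:

```xml
<!-- Illustrative solrconfig.xml snippet (values are examples, not advice). -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes and fsyncs index files to disk. -->
  <autoCommit>
    <maxTime>600000</maxTime>            <!-- every 10 minutes -->
    <openSearcher>false</openSearcher>   <!-- don't open a searcher on hard commit -->
  </autoCommit>
  <!-- Soft commit: makes changes visible by opening a new searcher, no fsync. -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>              <!-- every 5 seconds -->
  </autoSoftCommit>
</updateHandler>
```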

On Sunday, August 11, 2013, tamanjit.bin...@yahoo.co.in wrote:

 Hi,
 Some confusion in my head.
 http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
 says that
 "A soft commit is much faster since it only makes index changes visible and
 does not fsync index files or write a new index descriptor."

 So this means that even with every soft commit a new searcher opens, right?
 If it does, isn't it still very heavy?




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/commit-vs-soft-commit-tp4083817.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Shreejay Nair
Sent from my mobile device. Please excuse brevity and typos.


Re: Need help on Solr

2013-06-20 Thread Shreejay
org.apache.solr.common.SolrException: [schema.xml] Duplicate field
definition for 'id'

You might have defined an additional id field in the schema file. The
out-of-the-box schema file already contains an id field.
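
In other words, the schema probably contains two definitions of the same
field, along these lines (a hypothetical sketch; the attributes on your
field may differ):

```xml
<!-- Hypothetical schema.xml fragment: the same field declared twice. -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<!-- ... further down the file ... -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<!-- Remove one of the two; only a single definition of 'id' may exist. -->
```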

-- 
Shreejay


On Thursday, June 20, 2013 at 9:16, Abhishek Bansal wrote:

 Hello,
 
 I am trying to index a PDF file on Solr. I am currently running Solr on
 Apache Tomcat 6.
 
 When I try to index it I get the error below. Please help. I was not able to
 rectify this error with the help of the internet.
 
 
 
 
 ERROR - 2013-06-20 20:43:41.549; org.apache.solr.core.CoreContainer; Unable
 to create core: collection1
 org.apache.solr.common.SolrException: [schema.xml] Duplicate field
 definition for 'id'
 [[[id{type=string,properties=indexed,stored,omitNorms,omitTermFreqAndPositions,sortMissingLast,required,
 required=true}]]] and
 [[[id{type=string,properties=indexed,stored,omitNorms,omitTermFreqAndPositions,sortMissingLast,required,
 required=true}]]]
 at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:502)
 at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:176)
 at
 org.apache.solr.schema.ClassicIndexSchemaFactory.create(ClassicIndexSchemaFactory.java:62)
 at
 org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:36)
 at
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:946)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
 at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
 at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
 at java.lang.Thread.run(Thread.java:662)
 ERROR - 2013-06-20 20:43:41.551; org.apache.solr.common.SolrException;
 null:org.apache.solr.common.SolrException: Unable to create core:
 collection1
 at
 org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
 at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
 at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: [schema.xml] Duplicate
 field definition for 'id'
 [[[id{type=string,properties=indexed,stored,omitNorms,omitTermFreqAndPositions,sortMissingLast,required,
 required=true}]]] and
 [[[id{type=string,properties=indexed,stored,omitNorms,omitTermFreqAndPositions,sortMissingLast,required,
 required=true}]]]
 at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:502)
 at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:176)
 at
 org.apache.solr.schema.ClassicIndexSchemaFactory.create(ClassicIndexSchemaFactory.java:62)
 at
 org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:36)
 at
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:946)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
 ... 10 more
 
 INFO - 2013-06-20 20:43:41.553;
 org.apache.solr.servlet.SolrDispatchFilter; user.dir=C:\Program
 Files\Apache Software Foundation\Tomcat 6.0
 INFO - 2013-06-20 20:43:41.553;
 org.apache.solr.servlet.SolrDispatchFilter; SolrDispatchFilter.init() done
 ERROR - 2013-06-20 20:43:41.820; org.apache.solr.common.SolrException;
 null:org.apache.solr.common.SolrException: SolrCore 'collection1' is not
 available due to init failure: [schema.xml] Duplicate field definition for
 'id'
 [[[id{type=string,properties=indexed,stored,omitNorms,omitTermFreqAndPositions,sortMissingLast,required,
 required=true}]]] and
 [[[id{type=string,properties=indexed,stored,omitNorms,omitTermFreqAndPositions,sortMissingLast,required,
 required=true}]]]
 at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1212)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155

Re: ngroups does not show correct number of groups when used in SolrCloud

2013-06-14 Thread Shreejay
Hi Markus,  

For ngroups to work in a cloud environment you have to make sure that all docs
belonging to a group reside on the same shard. Custom hashing has been
introduced in recent versions of SolrCloud; you might want to look into that:
https://issues.apache.org/jira/browse/SOLR-2592

All queries on SolrCloud are run individually on each shard, and then the
results are merged. When you run a group query, SolrCloud runs the query on each
shard, and when the results are merged the ngroups from each shard are added up.
This is why ngroups is incorrect when using SolrCloud.
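
As a sketch of the routing idea (assuming Solr 4.1+ with the default
compositeId router; the field names here are made up for illustration),
prefixing the unique key with the group value sends all members of a group
to the same shard:

```xml
<!-- Documents whose ids share the prefix before '!' land on the same shard,
     so grouping on group_field can yield a correct ngroups. -->
<add>
  <doc>
    <field name="id">groupA!doc1</field>
    <field name="group_field">groupA</field>
  </doc>
  <doc>
    <field name="id">groupA!doc2</field>
    <field name="group_field">groupA</field>
  </doc>
</add>
```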

-- 
Shreejay


On Friday, June 14, 2013 at 5:11, Markus.Mirsberger wrote:

 Hi,
 
 I just noticed (after a long time of testing and finally looking into the
 docs :p) that the ngroups parameter does not show the correct number of
 groups when used in anything other than a single-shard environment (in my
 case SolrCloud).
 
 Is there another way to get the number of all groups without iterating
 through a lot of result sets?
 I don't need the values of the grouping. I just need the complete number
 of groups.
 
 Or can this be done with facets maybe?
 I don't need to use grouping, but as far as I know I can't get the
 complete number of facets without iterating through the result sets.
 So this seemed to me the only way to achieve something equal to a
 distinct count in SQL.
 
 any ideas how this can be done with solr?
 
 
 Thanks,
 Markus
 
 




Re: How spell checker used if indexed document is containing misspelled words

2013-06-14 Thread Shreejay
Hi,  

Have you tried this? 
http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.onlyMorePopular

Of course this is assuming that your corpus has correct words occurring more 
frequently than incorrect ones!  
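
A request using that parameter might look roughly like this (the handler
name, query field, and host are assumptions based on a typical spellcheck
setup):

```
http://localhost:8983/solr/select?q=helo&spellcheck=true&spellcheck.onlyMorePopular=true&spellcheck.count=5
```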

-- 
Shreejay


On Friday, June 14, 2013 at 2:49, venkatesham.gu...@igate.com wrote:

 My data is picked from social media sites, and misspelled words are very
 frequent in social text because of the informal mode of communication. The
 spellchecker does not work here because the misspelled words are present in
 the text corpus and not in the search query. Finding documents with all the
 different misspelled forms of a given word is not possible using the
 spellchecker. How should I go ahead with this?
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/How-spell-checker-used-if-indexed-document-is-containing-misspelled-words-tp4070463.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 




Re: out of memory during indexing do to large incoming queue

2013-06-02 Thread Shreejay
A couple of things: 

1) Can you give some more details about your setup? Like whether it's cloud or
a single instance, how many nodes if it's cloud, the hardware (memory per
machine), JVM options, etc.

2) Any specific reason for using 4.0 beta? The latest version is 4.3. I used
4.0 for a few weeks and there were a lot of bugs related to memory and
communication between nodes (ZooKeeper).

3) If you haven't seen it already, please go through this wiki page. It's an
excellent starting point for troubleshooting memory and indexing issues,
especially sections 3 to 7:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations


-- 
Shreejay


On Sunday, June 2, 2013 at 7:16, Yoni Amir wrote:

 Hello,
 I am receiving OutOfMemoryError during indexing, and after investigating the 
 heap dump, I am still missing some information, and I thought this might be a 
 good place for help.
 
 I am using Solr 4.0 beta, and I have 5 threads that send update requests to 
 Solr. Each request is a bulk of 100 SolrInputDocuments (using solrj), and my 
 goal is to index around 2.5 million documents.
 Solr is configured to do a hard commit every 10 seconds, so initially I 
 thought that it could only accumulate 10 seconds' worth of updates in 
 memory, but that's not the case. I can see in a profiler how it accumulates 
 memory over time, even with 4 to 6 GB of memory. It is also configured to 
 optimize with mergeFactor=10.
 
 At first I thought that optimization is a blocking, synchronous operation. It 
 is, in the sense that the index can't be updated during optimization. 
 However, it is not synchronous, in the sense that the update request coming 
 from my code is not blocked - Solr just returns an OK response, even while 
 the index is optimizing.
 This indicates that Solr has an internal queue of inbound requests, and that 
 the OK response just means that it is in the queue. I get confirmation for 
 this from a friend who is a Solr expert (or so I hope).
 
 My main question is: how can I put a bound on this internal queue, and make 
 update requests synchronous in case the queue is full? Put it another way, I 
 need to know if Solr is really ready to receive more requests, so I don't 
 overload it and cause OOME.
 
 I performed several tests, with slow and fast disks, and on the really fast 
 disks the problem didn't occur. However, I can't demand such fast disks from 
 all the clients, and also even with a fast disk the problem will occur 
 eventually when I try to index 10 million documents.
 I also tried to perform indexing with optimization disabled, but it didn't 
 help.
 
 Thanks,
 Yoni
 
 Confidentiality: This communication and any attachments are intended for the 
 above-named persons only and may be confidential and/or legally privileged. 
 Any opinions expressed in this communication are not necessarily those of 
 NICE Actimize. If this communication has come to you in error you must take 
 no action based on it, nor must you copy or show it to anyone; please 
 delete/destroy and inform the sender by e-mail immediately. 
 Monitoring: NICE Actimize may monitor incoming and outgoing e-mails.
 Viruses: Although we have taken steps toward ensuring that this e-mail and 
 attachments are free from any virus, we advise that in keeping with good 
 computing practice the recipient should ensure they are actually virus free.
 
 




Re: Highlighting fields

2013-05-31 Thread Shreejay

Are the fields you are trying to highlight stored? 

If yes then can you show the exact query you are using? Which version of solr? 
And which highlighter? ( you can paste the relevant highlight section from solr 
config file) 

-- 
Shreejay


On Thursday, May 30, 2013 at 22:56, Sagar Chaturvedi wrote:

 Sorry for wrong subject. Corrected it.
 
 -Original Message-
 From: Sagar Chaturvedi [mailto:sagar.chaturv...@nectechnologies.in] 
 Sent: Friday, May 31, 2013 11:25 AM
 To: solr-user@lucene.apache.org
 Subject: RE: Support for Mongolian language
 
 Hi,
 
 On the Solr admin UI, in a query I am trying to highlight some fields. I have 
 set hl=true and given the names of comma-separated fields in hl.fl, but the 
 fields are not getting highlighted. Any insights?
 
 Regards,
 Sagar
 
 
 
 
 
 
 
 DISCLAIMER:
 ---
 The contents of this e-mail and any attachment(s) are confidential and 
 intended for the named recipient(s) only. 
 It shall not attach any liability on the originator or NEC or its affiliates. 
 Any views or opinions presented in this email are solely those of the author 
 and may not necessarily reflect the opinions of NEC or its affiliates. 
 Any form of reproduction, dissemination, copying, disclosure, modification, 
 distribution and / or publication of this message without the prior written 
 consent of the author of this e-mail is strictly prohibited. If you have 
 received this email in error please delete it and notify the sender 
 immediately. .
 ---
 
 
 
 
 




Re: Highlighting fields

2013-05-31 Thread Shreejay
Didn't notice the original message. Sorry about that.  

-- 
Shreejay


On Friday, May 31, 2013 at 5:55, Jack Krupansky wrote:

 Please do not respond to hijacked message threads, other than to encourage 
 the sender to start a new message thread.
 
 -- Jack Krupansky
 
 -Original Message- 
 From: Shreejay
 Sent: Friday, May 31, 2013 5:10 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Highlighting fields
 
 
 Are the fields you are trying to highlight stored?
 
 If yes then can you show the exact query you are using? Which version of 
 solr? And which highlighter? ( you can paste the relevant highlight section 
 from solr config file)
 
 -- 
 Shreejay
 
 
 On Thursday, May 30, 2013 at 22:56, Sagar Chaturvedi wrote:
 
  Sorry for wrong subject. Corrected it.
  
  -Original Message-
  From: Sagar Chaturvedi [mailto:sagar.chaturv...@nectechnologies.in]
  Sent: Friday, May 31, 2013 11:25 AM
  To: solr-user@lucene.apache.org
  Subject: RE: Support for Mongolian language
  
  Hi,
  
  On solr admin UI, in a query I am trying to highlight some fields. I have 
  set hl = true, given name of comma separated fields in hl.fl but fields 
  are not getting highlighted. Any insights?
  
  Regards,
  Sagar
  
  
  
  
  
  
  
  DISCLAIMER:
  ---
  The contents of this e-mail and any attachment(s) are confidential and 
  intended for the named recipient(s) only.
  It shall not attach any liability on the originator or NEC or its 
  affiliates. Any views or opinions presented in this email are solely those 
  of the author and may not necessarily reflect the opinions of NEC or its 
  affiliates.
  Any form of reproduction, dissemination, copying, disclosure, 
  modification, distribution and / or publication of this message without 
  the prior written consent of the author of this e-mail is strictly 
  prohibited. If you have received this email in error please delete it and 
  notify the sender immediately. .
  ---
  
  
  
  
 
 
 




Re: Tool to read Solr4.2 index

2013-05-22 Thread Shreejay
This might help 
http://wiki.apache.org/solr/LukeRequestHandler

-- 
Shreejay Nair
Sent from my mobile device. Please excuse brevity and typos.


On Wednesday, May 22, 2013 at 13:47, gpssolr2020 wrote:

 Hi All,
 
 We can use lukeall 4.0 for reading a Solr 3.x index. Similarly, do we have
 anything to read a Solr 4.x index? Please help.
 
 Thanks.
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Tool-to-read-Solr4-2-index-tp4065448.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 




Re: seeing lots of autowarming messages in log during DIH indexing

2013-05-20 Thread Shreejay
Every time a commit is done, a new searcher is opened. In solrconfig.xml, 
caches are defined with a parameter called autowarmCount. Autowarming basically 
tries to copy the cache values from the previous searcher into the new one. If 
you are doing a bulk update and do not care about searching until your indexing 
is over, you can specify openSearcher=false while doing a commit. That should 
speed up your indexing a lot.
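
For example, a commit that skips opening a new searcher could be issued like
this (host, port, and core name are placeholders):

```shell
# Hard commit without opening a new searcher -- no autowarming is triggered.
curl 'http://localhost:8983/solr/collection1/update?commit=true&openSearcher=false'
```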

-- 
Shreejay Nair
Sent from my mobile device. Please excuse brevity and typos.


On Monday, May 20, 2013 at 7:16, geeky2 wrote:

 hello,
 
 we are tracking down some performance issues with our DIH process.
 
 not sure if this is related - but i am seeing tons of the messages below in
 the logs during re-indexing of the core.
 
 what do these messages mean?
 
 
 2013-05-18 19:37:30,623 INFO [org.apache.solr.update.UpdateHandler]
 (pool-11-thread-1) end_commit_flush
 2013-05-18 19:37:30,623 INFO [org.apache.solr.search.SolrIndexSearcher]
 (pool-10-thread-1) autowarming Searcher@5b8d745 main from Searcher@1fb355af
 main
 fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 2013-05-18 19:37:30,624 INFO [org.apache.solr.search.SolrIndexSearcher]
 (pool-10-thread-1) autowarming result for Searcher@5b8d745 main
 fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 2013-05-18 19:37:30,624 INFO [org.apache.solr.search.SolrIndexSearcher]
 (pool-10-thread-1) autowarming Searcher@5b8d745 main from Searcher@1fb355af
 main
 filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 2013-05-18 19:37:30,625 INFO [org.apache.solr.search.SolrIndexSearcher]
 (pool-10-thread-1) autowarming result for Searcher@5b8d745 main
 filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 2013-05-18 19:37:30,625 INFO [org.apache.solr.search.SolrIndexSearcher]
 (pool-10-thread-1) autowarming Searcher@5b8d745 main from Searcher@1fb355af
 main
 queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=1,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 2013-05-18 19:37:30,628 INFO [org.apache.solr.search.SolrIndexSearcher]
 (pool-10-thread-1) autowarming result for Searcher@5b8d745 main
 queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=3,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 2013-05-18 19:37:30,628 INFO [org.apache.solr.search.SolrIndexSearcher]
 (pool-10-thread-1) autowarming Searcher@5b8d745 main from Searcher@1fb355af
 main
 documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 2013-05-18 19:37:30,628 INFO [org.apache.solr.search.SolrIndexSearcher]
 (pool-10-thread-1) autowarming result for Searcher@5b8d745 main
 documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
 
 thx
 mark
 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/seeing-lots-of-autowarming-messages-in-log-during-DIH-indexing-tp4064649.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 




Re: seeing lots of autowarming messages in log during DIH indexing

2013-05-20 Thread shreejay
geeky2 wrote
 you mean I would add this switch to my script that kicks off the
 dataimport?
 
 example:
 
 
 OUTPUT=$(curl -v
 http://${SERVER}.intra.searshc.com:${PORT}/solrpartscat/${CORE}/dataimport
 -F command=full-import -F clean=${CLEAN} -F commit=${COMMIT} -F
 optimize=${OPTIMIZE} -F openSearcher=false)

Yes, that's correct.



geeky2 wrote
 what needs to be done _AFTER_ the DIH finishes (if anything)?
 
 eg, does this need to be turned back on after the DIH has finished?

Yes. You need to open a searcher to be able to search. Just run another
commit with openSearcher=true once your indexing process finishes.
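
For example (host, port, and core name are placeholders):

```shell
# Once indexing is done, open a new searcher so the fresh index becomes visible.
curl 'http://localhost:8983/solr/collection1/update?commit=true&openSearcher=true'
```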





--
View this message in context: 
http://lucene.472066.n3.nabble.com/seeing-lots-of-autowarming-messages-in-log-during-DIH-indexing-tp4064649p4064768.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: replication without automated polling, just manual trigger?

2013-05-15 Thread Shreejay Nair
You can disable polling so that the slave never polls the master (in Solr
4.3 you can disable it from the Admin interface). And you can trigger a
replication using the HTTP API
(http://wiki.apache.org/solr/SolrReplication#HTTP_API) or, again, use the
Admin interface to trigger a manual replication.
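
A sketch of that manual workflow against the replication handler (host and
core name are placeholders):

```shell
# One-time: stop the slave from polling the master.
curl 'http://slave:8983/solr/core/replication?command=disablepoll'

# When the master index is ready and verified, trigger a single replication pass.
curl 'http://slave:8983/solr/core/replication?command=fetchindex'
```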



On Wed, May 15, 2013 at 12:47 PM, Jonathan Rochkind rochk...@jhu.eduwrote:

 I want to set up Solr replication between a master and slave, where no
 automatic polling every X minutes happens, instead the slave only
 replicates on command. [1]

 So the basic question is: What's the best way to do that? But I'll provide
 what I've been doing etc., for anyone interested.

 Until recently, my application was running on Solr 1.4.  I had a setup that
 was working to accomplish this in Solr 1.4, but as I work on moving it to
 Solr 4.3, it's unclear to me if it can/will work the same way.

 In Solr 1.4, on the slave, I supplied a masterUrl, but did NOT supply any
 pollInterval at all on the slave. I did NOT supply an enable setting of
 false on the slave, because I think that would have prevented even manual
 replication.

 This seemed to result in the slave never polling, although I'm not sure if
 that was just an accident of Solr implementation or not.  Can anyone say if
 the same thing would happen in Solr 4.3?  If I look at the admin screen for
 my slave set up this way in Solr 4.3, it does say polling enabled, but I
 realize that doesn't neccesarily mean any polling will take place, since
 I've set no pollInterval.

 In Solr 1.4 under this setup, I could go to the slave's admin/replication,
 and there was a replicate now button that I could use for manually
 triggered replication.  This button seems to no longer be there in 4.3
 replication admin screen, although I suppose I could still, somewhat less
 conveniently, issue a `replication?command=fetchindex` to the slave, to
 manually trigger a replication?



 Thanks for any advice or ideas.



 [1]: Why, you ask?  The master is actually my 'indexing' server. Due to
 business needs, indexing only happens in bulk/mass indexing, and only
 happens periodically -- sometimes nightly, sometimes less. So I index on
 master, at a periodic schedule, and then when indexing is complete and
 verified, tell slave to replicate.  I don't want slave accidentally
 replicating in the middle of the bulk indexing process either, when the
 index might be in an unfinished state.



Request to be added to ContributorsGroup

2013-05-13 Thread Shreejay Nair
Hello Wiki Admins,

Request you to please add me to the ContributorsGroup.

I have been using Solr for a few years now and I would like to contribute
back by adding more information to the wiki Pages.

Wiki User Name : Shreejay

--Shreejay


Re: Frequent OOM - (Unknown source in logs).

2013-05-09 Thread shreejay
We ended up using Solr 4.0 (now 4.2) without the cloud option, and it seems
to be holding up well.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Frequent-OOM-Unknown-source-in-logs-tp4029361p4061945.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Frequent OOM - (Unknown source in logs).

2012-12-28 Thread shreejay
Hi Otis, 

Following is the setup:

6 individual Solr servers (VMs) running on Jetty.
3 shards, each shard with a leader and a replica.
*Solr Version*: Solr 4.0 (with a patch from SOLR-2592).
*OS*: CentOS release 5.8 (Final)
*Java*:
java version "1.6.0_32"
Java(TM) SE Runtime Environment (build 1.6.0_32-b05)
Java HotSpot(TM) 64-Bit Server VM (build 20.7-b02, mixed mode)

*Memory*: 4 servers have 32 GB, 2 have 30 GB.
*Disk space*: 500 GB on each server.

*Queries*:
Usual select queries with up to 6 filters.
Facets on around 8 fields (returning only the top 20).

*Java options while starting the server:*
JAVA_OPTIONS=-Xms15360m -Xmx15360m -DSTOP.PORT=1234 -DSTOP.KEY=
-XX:NewRatio=1 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled -XX:+UseCompressedOops
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/ABC/LOGFOLDER
-XX:-TraceClassUnloading
-Dbootstrap_confdir=./solr/collection123/conf
-Dcollection.configName=123conf
-DzkHost=ZooKeeper001:,ZooKeeper002:,SGAZZooKeeper003:
-DnumShards=3 -jar start.jar
LOG_FILE=/ABC/LOGFOLDER/solrlogfile.log

I run a *commit* using a curl command every 30 mins via a cron job:
curl --silent
http://11.111.111.111:1234/solr/collection123/update/?commit=true&openSearcher=false

In my SolrConfig file I have these *commit settings*:

<updateHandler class="solr.DirectUpdateHandler2">

  <autoCommit>
    <maxDocs>0</maxDocs>
    <maxTime>0</maxTime>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>0</maxTime>
  </autoSoftCommit>

  <openSearcher>false</openSearcher>
  <waitSearcher>false</waitSearcher>

  <updateLog>
    <str name="dir">${solr.data.dir:}</str>
  </updateLog>

</updateHandler>


Please let me know if you would like more information. I am not indexing any
documents right now and I again got an OOM around an hour back on one of the
nodes. Let's call it Node1. The node is in recovery right now and keeps
erroring with this message:

SEVERE: Error while trying to recover:org.apache.solr.common.SolrException:
Server at http://NODE2:8983/solr/collection1 returned non ok status:500,
message:Server Error

Although it's still showing as recovering, it is serving queries according
to the log file.
The other instance in this shard became the leader and is up and running
properly (serving queries).




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Frequent-OOM-Unknown-source-in-logs-tp4029361p4029459.html
Sent from the Solr - User mailing list archive at Nabble.com.


Frequent OOM - (Unknown source in logs).

2012-12-27 Thread shreejay
Hello, 

I am seeing frequent OOMs for the past 2 days on a SolrCloud Cluster
(Solr4.0 with a patch from Solr-2592) setup (3 shards, each shard with 2
instances. Each instance is running CentOS with 30GB memory, 500GB disk
space), with a separate Zoo Keeper ensemble of 3. 

Here is the stacktrace: http://pastebin.com/cV5DxD4N

I also saw there is a Jira issue which looks similar, the difference being
that in the stacktrace I get, I can not see which process is trying to do an
expandCapacity:

java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)

Whereas the stacktrace mentioned in that issue
(https://issues.apache.org/jira/browse/SOLR-3881) is:

at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)

Has anyone seen this issue before? Any fixes for this? 





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Frequent-OOM-Unknown-source-in-logs-tp4029361.html
Sent from the Solr - User mailing list archive at Nabble.com.


Commit and OpenSearcher not working as expected.

2012-12-16 Thread shreejay
Hello. 

I am running a commit on a SolrCloud collection using a cron job. The
command is as follows:

aa.aa.aa.aa:8983/solr/ABCCollection/update?commit=true&opensearcher=false

But when I look at the logs I see that the commit has been called with
openSearcher=true.

The DirectUpdateHandler2 section in my solrconfig file looks like this:

<updateHandler class="solr.DirectUpdateHandler2">

  <autoCommit>
    <maxDocs>0</maxDocs>
    <maxTime>0</maxTime>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>0</maxTime>
  </autoSoftCommit>

  <openSearcher>false</openSearcher>
  <waitSearcher>false</waitSearcher>

  <updateLog>
    <str name="dir">${solr.data.dir:}</str>
  </updateLog>

</updateHandler>
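One thing that may be worth checking (my assumption, not something the thread confirms for 4.0): in the stock example solrconfig.xml, <openSearcher> appears as a child of <autoCommit>, not as a direct child of <updateHandler>, so a top-level <openSearcher> may simply be ignored. A sketch of that placement:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>0</maxDocs>
    <maxTime>0</maxTime>
    <!-- openSearcher nested inside autoCommit, per the stock example config -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <updateLog>
    <str name="dir">${solr.data.dir:}</str>
  </updateLog>
</updateHandler>
```

Note this config only governs autoCommit behavior; an explicit commit request still takes its openSearcher value from the request parameters.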



And these are the logs :
http://pastebin.com/bGh2GRvx


I am not sure why openSearcher is being called. I am indexing a ton of
documents right now, and am not using search at all. I also read in the Wiki
that keeping openSearcher=false is recommended for SolrCloud:
http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section


Is there some place else where openSearcher has to be set while calling a
commit? 


--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Commit-and-OpenSearcher-not-working-as-expected-tp4027419.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Commit and OpenSearcher not working as expected.

2012-12-16 Thread shreejay
Hi Mark, 

That was a typo in my post. I am using openSearcher only, but I still see the
same log files.

/update/?commit=true&openSearcher=false




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Commit-and-OpenSearcher-not-working-as-expected-tp4027419p4027451.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Commit and OpenSearcher not working as expected.

2012-12-16 Thread shreejay
Ok this is very surprising. 

I just ran the curl command

curl --silent
http://xx.xx.xx.xx:8985/solr/collectionABC/update/?commit=true&openSearcher=false
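One possibility worth ruling out (my assumption; the thread doesn't show exactly how the command was invoked): if the URL is passed to the shell unquoted, `&` acts as the background operator, so everything after it is cut off and openSearcher=false never reaches Solr. A demonstration with a placeholder URL:

```shell
# Placeholder URL -- host, port, and collection are illustrative only.
url='http://localhost:8983/solr/collectionABC/update/?commit=true&openSearcher=false'

# Quoted, curl would receive the full query string:
echo "$url"

# Unquoted, the shell splits the command line at '&', so the request
# that actually goes out ends after commit=true:
echo "${url%%&*}"
```

Quoting the URL argument (`curl --silent "$url"`) avoids this, both interactively and in crontab entries.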

And on the solr log file I can see these messages:

Dec 16, 2012 10:44:14 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start
commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
Dec 16, 2012 10:44:25 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solrfolder/example/solr/collection1/data/index.2012121519458
lockFactor

*After the searcher is opened I see these:*

Dec 16, 2012 10:44:28 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Dec 16, 2012 10:44:28 PM org.apache.solr.update.processor.LogUpdateProcessor
finish
INFO: [resumesnew] webapp=/solr path=/update
params={waitSearcher=true&commit=true&commit_end_point=true&expungeDeletes=false&wt=javabin&softCommit=false&version=2}
{commit=} 0 14155


I can see that openSearcher is still being called, because a new searcher is
created according to the log files. I am not sure why it is being called. Does
SolrCloud ignore openSearcher=false?

--Shreejay
shreejay wrote
 Hi Mark, 
 
 That was a typo in my post. I am using openSearcher only. But still see
 the same log files. 
 
 /update/?commit=true&openSearcher=false





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Commit-and-OpenSearcher-not-working-as-expected-tp4027419p4027462.html
Sent from the Solr - User mailing list archive at Nabble.com.


A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene41' does not exist.

2012-12-11 Thread shreejay
I am getting the following error :
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1326)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1438)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:700)
... 45 more
Caused by: java.lang.IllegalArgumentException: A SPI class of type
org.apache.lucene.codecs.Codec with name 'Lucene41' does not exist. You need
to add the corresponding JAR file supporting this SPI to your classpath.The
current classpath supports the following names: [Lucene40, Lucene3x]
at
org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:104)
at org.apache.lucene.codecs.Codec.forName(Codec.java:95)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:299)
at
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:56)
at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
at
org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
at
org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:87)
at
org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
at
org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:119)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:130


I was using a patched version of branch_4x for indexing, but due to some
issues I am now reverting back to 4.0 (with one of the patches for SOLR-2592).
I am using the data directories from my previous instance.

Would just adding the codecs folder lucene41 under the folder
lucene/core/src/java/org/apache/lucene/codecs and compiling it suffice?

--
Shreejay



--
View this message in context: 
http://lucene.472066.n3.nabble.com/A-SPI-class-of-type-org-apache-lucene-codecs-Codec-with-name-Lucene41-does-not-exist-tp4026118.html
Sent from the Solr - User mailing list archive at Nabble.com.


SolrCloud OOM heap space

2012-12-10 Thread shreejay
Hi All, 

I am getting constant OOM errors on a SolrCloud instance (3 shards, 2 Solr
instances in each shard, each server with 22 GB of memory, Xmx = 12 GB for
Java).

Here is an error log:
http://pastie.org/private/dcga3kfatvvamslmtvrp0g


As of now I am not indexing any more documents. The total size of the index
on each server is around 36-40 GB. These are the Java options:
-Xmx12288m -DSTOP.PORT=8079 -DSTOP.KEY=ABC
-XX:NewRatio=1 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log
-Djetty.port=8983 -DzkHost=ZooKeeperServer001:2181 -jar start.jar

If anyone has faced similar issues please let me know.

--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-OOM-heap-space-tp4025821.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: SolrCloud OOM heap space

2012-12-10 Thread shreejay
Thanks Markus. Is this issue only on the 4.x and 5.x branches? I am currently
running a very recent build of the 4.x branch with an applied patch.

I just want to make sure that this is not an issue with 4.0, in which case I
can apply my patch to 4.0 instead of 4x or 5x.



--Shreejay



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-OOM-heap-space-tp4025821p4025839.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: SolrCloud OOM heap space

2012-12-10 Thread shreejay
Thanks Markus. I will apply the patch to the 4x branch I have and report
back.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-OOM-heap-space-tp4025821p4025858.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Nightly build server down ?

2012-12-05 Thread shreejay
Hi All, 

Is the server hosting nightly builds of Solr down?
https://builds.apache.org/job/Solr-Artifacts-4.x/lastSuccessfulBuild/artifact/solr/package/
 

If anyone knows an alternate link to download the nightly build please let
me know. 


--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Nightly-build-server-down-tp4024493.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr 4.0 ngroups issue workaround

2012-12-05 Thread shreejay
Hi All,

I have a SolrCloud instance with 6 million documents. We are using the
ngroups feature in a few places and I am aware that this is still an open
JIRA issue with work in progress (and some patches).

Apart from using the patch here (https://issues.apache.org/jira/browse/SOLR-2592)
and re-indexing data so that all documents with the same group field end up
on the same shard, has anyone else tried or used any alternate methods?

I wanted to see if there would be any other options before re-indexing.

Thanks. 

--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-0-ngroups-issue-workaround-tp4024513.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr4.0 / SolrCloud queries

2012-11-19 Thread shreejay
Hi all , 

I have managed to successfully index around 6 million documents, but while
indexing (and even now after the indexing has stopped), I am running into a
bunch of errors. 

The most common error I see is:

null:org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: Server refused connection
at: http://ABC:8983/solr/xyzabc/

I have made sure that the servers are able to communicate with each other
using the same names.

Another error I keep getting is that the leader stops recovering and goes
red / "recovery failed":

Error while trying to recover.
core=ABC123:org.apache.solr.common.SolrException: We are not the leader


The servers intermittently go offline taking down one of the shards and in
turn stopping all search queries. 

The configuration I have:

Shard1:
Server1 - Memory: 22 GB, JVM: 8 GB
Server2 - Memory: 22 GB, JVM: 10 GB (this one is in "recovery failed"
status, but still acting as leader)

Shard2:
Server1 - Memory: 22 GB, JVM: 8 GB (this one is in "recovery failed"
status, but still acting as leader)
Server2 - Memory: 22 GB, JVM: 8 GB

Shard3:
Server1 - Memory: 22 GB, JVM: 10 GB
Server2 - Memory: 22 GB, JVM: 8 GB

While typing this post I did a Reload from the Core Admin page, and both
servers (Shard1-Server2 and Shard2-Server1) came back up again.

Has anyone else encountered these issues? Any steps to prevent these? 

Thanks. 


--Shreejay






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr4-0-SolrCloud-queries-tp4016825p4021154.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr4.0 / SolrCloud queries

2012-11-13 Thread shreejay
Thanks Mark. I meant ConcurrentMergeScheduler and ramBufferSizeMB (not
maxBuffer). These are my settings for merging:

<ramBufferSizeMB>960</ramBufferSizeMB>

<mergeFactor>40</mergeFactor>
<mergeScheduler
  class="org.apache.lucene.index.ConcurrentMergeScheduler"/>



--Shreejay


Mark Miller-3 wrote
 On Nov 9, 2012, at 1:20 PM, shreejay wrote:
 
  Instead of doing an optimize, I have now changed the Merge settings by
  keeping a maxBuffer = 960, a merge Factor = 40 and ConcurrentMergePolicy. 
 
 Don't you mean ConcurrentMergeScheduler?
 
 Keep in mind that if you use the default TieredMergePolicy, mergeFactor
 will have no effect. You need to use maxMergeAtOnce and segmentsPerTier
 as sub-args to the merge policy config (see the commented-out example in
 solrconfig.xml). 
 
 Also, it's probably best to avoid using maxBufferedDocs at all.
 
 - Mark


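Mark's suggestion about TieredMergePolicy could be sketched as a solrconfig fragment like the following (the numeric values are illustrative placeholders, not recommendations for this index):

```xml
<indexConfig>
  <ramBufferSizeMB>960</ramBufferSizeMB>
  <!-- With TieredMergePolicy (the 4.x default), mergeFactor is ignored;
       use maxMergeAtOnce and segmentsPerTier as sub-args instead. -->
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">10</int>
    <int name="segmentsPerTier">10</int>
  </mergePolicy>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
</indexConfig>
```

Raising segmentsPerTier tolerates more segments per tier (fewer merges during bulk indexing); lowering it merges more aggressively.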





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr4-0-SolrCloud-queries-tp4016825p4020200.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr4.0 / SolrCloud queries

2012-11-09 Thread shreejay
Thanks Erick. I will try optimizing after indexing everything. I was doing it
after every batch, since it was taking way too long to optimize (which was
expected), but it was not finishing merging down into a smaller number of
segments (1 segment).

Instead of doing an optimize, I have now changed the Merge settings by
keeping a maxBuffer = 960, a merge Factor = 40 and ConcurrentMergePolicy. 

I am also going to check the infoStream option so I can see how the
indexing is progressing.


Thanks for your inputs. 

--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr4-0-SolrCloud-queries-tp4016825p4019373.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr4.0 / SolrCloud queries

2012-11-04 Thread shreejay
Thanks Everyone. 

As Shawn mentioned, it was a memory issue. I reduced the amount allocated to
Java to 6 GB, and it's been working pretty well.

I am re-indexing one of the SolrClouds. I was having trouble optimizing the
data when I indexed last time.

I am hoping optimizing will not be an issue this time due to the memory
changes. I will post more info once I am done.

Thanks once again. 

--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr4-0-SolrCloud-queries-tp4016825p4018176.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr4.0 / SolrCloud queries

2012-10-29 Thread shreejay
Hi All, 

I am trying to run two SolrClouds with 3 and 2 shards respectively (let's say
Cloud3shards and Cloud2shards). All servers are identical with 18GB RAM
(16GB assigned to Java).

I am facing a few issues on both clouds and would be grateful if any one
else has seen / solved these.

1) Every now and then, Solr takes one of the servers offline (it either
shows as recovering (orange) or is taken offline completely). The Logging
tab on the Admin page shows these errors for Cloud3shards:

Error while trying to
recover:org.apache.solr.client.solrj.SolrServerException: Timeout occured
while waiting response from server at: http://xxx:8983/solr/xxx

Error while trying to recover.
core=xxx:org.apache.solr.common.SolrException: I was asked to wait on state
recovering for xxx:8983_solr but I still do not see the request state. I see
state: recovering live:false

On the Cloud2shards also I see similar messages

I have noticed it does happen more while indexing documents, but I have also
seen this happening while only querying Solr. 

Both SolrClouds are managed by the same Zookeeper ensemble (set of 3 ZK
servers). 

2) I am able to commit, but optimize never seems to work. Right now I have an
average of 30 segments on every Solr server. Has anyone else faced this
issue? I have tried optimize from the admin page and as an HTTP POST request.
Both of them fail. It's not because of hard disk space, since my index
size is less than 50GB and I have 500GB of space on each server.

3) If I try to query Solr with rows = 5000 or more for Cloud2 (for Cloud1
it's around 20,000 documents), I get:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers
available to handle this request:[http://ABC1:8983/solr/aaa,
http://ABC2:8983/solr/aaa].

4) I have also noticed that ZK switches leaders every now and then. I am
attributing it to point 1 above: as soon as the leader is down, another
server takes its place. My concern is the frequency with which this switch
happens. I guess this is completely related to point 1, and if that is
solved, I will not have this issue either.



--Shreejay





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr4-0-SolrCloud-queries-tp4016825.html
Sent from the Solr - User mailing list archive at Nabble.com.


NewSearcher old cache

2012-10-11 Thread shreejay
Hello Everyone, 

I was configuring a Solr installation and had a few questions about
newSearcher. As I understand it, a newSearcher event is triggered when there
is an already existing registered searcher.

Q1)
As soon as a new searcher is opened, its caches begin populating from the
older caches. What happens if the newSearcher event has queries defined in
it? Do these queries ignore the old cache altogether and load only the
results of the queries defined in the listener event? Or do they get added
after the new caches have been warmed from the old caches?

Q2)
I am running edismax queries on the Solr server. Can I specify these queries
in newSearcher and firstSearcher as well? Or are the queries supposed to be
simple queries?
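For reference, the kind of listener being asked about looks like this in solrconfig.xml (the query strings are placeholders; passing defType=edismax as a listener parameter is my assumption that listener queries accept ordinary request parameters):

```xml
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- placeholder warming query; request params such as defType can be set -->
    <lst>
      <str name="q">some warming query</str>
      <str name="defType">edismax</str>
    </lst>
  </arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">static warming query</str>
      <str name="sort">price asc</str>
    </lst>
  </arr>
</listener>
```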

Thanks. 

--Shreejay



--
View this message in context: 
http://lucene.472066.n3.nabble.com/NewSearcher-old-cache-tp4013225.html
Sent from the Solr - User mailing list archive at Nabble.com.