Re: newbie question on how to batch commit documents

2010-06-01 Thread findbestopensource
Add a commit after the loop. I would also advise doing commits from a separate
thread. I keep a separate timer thread that commits every minute, and at the
end of every day I optimize the index.
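
A rough sketch of that timer-thread idea (the class name, the period parameter, and the Runnable standing in for server.commit() are all illustrative, not from any Solr API):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Dedicated commit timer: indexing threads only ever call server.add(...),
// and this single scheduled thread issues the periodic commit, so at most
// one new searcher is opened/warmed per interval.
public class PeriodicCommitter {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable commitAction; // e.g. a Runnable calling server.commit()
    private final long periodMillis;

    public PeriodicCommitter(Runnable commitAction, long periodMillis) {
        this.commitAction = commitAction;
        this.periodMillis = periodMillis;
    }

    // Run the commit action repeatedly, once per period, from one thread.
    public void start() {
        scheduler.scheduleAtFixedRate(
                commitAction, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    public void stop() {
        scheduler.shutdown();
    }
}
```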

Regards
Aditya
www.findbestopensource.com


On Tue, Jun 1, 2010 at 2:57 AM, Steve Kuo kuosen...@gmail.com wrote:

 I have a newbie question on what is the best way to batch add/commit a large
 collection of document data via solrj. My first attempt was to write a
 multi-threaded application that did the following.

 Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
 for (Widget w : widges) {
     SolrInputDocument doc = new SolrInputDocument();
     doc.addField("id", w.getId());
     doc.addField("name", w.getName());
     doc.addField("price", w.getPrice());
     doc.addField("category", w.getCat());
     doc.addField("srcType", w.getSrcType());
     docs.add(doc);

     // commit docs to solr server
     server.add(docs);
     server.commit();
 }

 And I got this exception.

 org.apache.solr.common.SolrException:
 Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
	at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)

 The solrj wiki/docs seemed to indicate that multiple threads calling
 SolrServer.commit(), which in turn calls CommonsHttpSolrServer.request(),
 resulted in multiple warming searchers. My first thought was to change the
 configs for autowarming, but after looking at the autowarm params, I am not
 sure what can be changed, or perhaps a different approach is recommended.

<filterCache
  class="solr.FastLRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

<queryResultCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

<documentCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

 Your help is much appreciated.



Re: newbie question on how to batch commit documents

2010-06-01 Thread olivier sallou
I would additionally suggest using EmbeddedSolrServer for large uploads if
possible; performance is better.

2010/5/31 Steve Kuo kuosen...@gmail.com




Re: newbie question on how to batch commit documents

2010-06-01 Thread Chris Hostetter

: CommonsHttpSolrServer.request() resulting in multiple searchers.  My first
: thought was to change the configs for autowarming.  But after looking at the
: autowarm params, I am not sure what can be changed or perhaps a different
: approach is recommended.

even with 0 autowarming (which is what you have) it can still take time to 
close/open a searcher on every commit -- which is why a commit per doc is 
not usually a good idea (and is *definitely* not a good idea when doing 
batch indexing).

most people can get away with just doing one commit after all their docs 
have been added (ie: at the end of the batch), but if you've got a lot of 
distinct clients doing a lot of parallel indexing and you don't want to 
coordinate who is responsible for sending the commit, you can configure 
autocommit to happen on the solr server...

http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section
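
For reference, the update-handler autocommit section looks roughly like this (the thresholds below are illustrative, not recommendations):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs> <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime> <!-- or after this many milliseconds, whichever comes first -->
  </autoCommit>
</updateHandler>
```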

...but in general you should make sure that your clients sending docs can 
deal with the occasional long delays (or possibly even needing to retry) 
when an occasional commit might block add/delete operations because of an 
expensive segment merge.

-Hoss



newbie question on how to batch commit documents

2010-05-31 Thread Steve Kuo
I have a newbie question on what is the best way to batch add/commit a large
collection of document data via solrj. My first attempt was to write a
multi-threaded application that did the following.

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
for (Widget w : widges) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", w.getId());
    doc.addField("name", w.getName());
    doc.addField("price", w.getPrice());
    doc.addField("category", w.getCat());
    doc.addField("srcType", w.getSrcType());
    docs.add(doc);

    // commit docs to solr server
    server.add(docs);
    server.commit();
}

And I got this exception.

org.apache.solr.common.SolrException:
Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
	at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)

The solrj wiki/docs seemed to indicate that multiple threads calling
SolrServer.commit(), which in turn calls CommonsHttpSolrServer.request(),
resulted in multiple warming searchers. My first thought was to change the
configs for autowarming, but after looking at the autowarm params, I am not
sure what can be changed, or perhaps a different approach is recommended.

<filterCache
  class="solr.FastLRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

<queryResultCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

<documentCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

Your help is much appreciated.


Re: newbie question on how to batch commit documents

2010-05-31 Thread Erik Hatcher
Move the commit outside your loop and you'll be in better shape.
Better yet, enable autocommit in solrconfig.xml and don't commit from
your multithreaded client; otherwise you still run the risk of too
many commits happening concurrently.
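
The shape of the corrected loop, with the commit hoisted out, can be sketched like this (the SolrJ calls are stood in by a tiny hypothetical interface so the batching and single-commit structure is visible; all names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the SolrJ server: add(...) sends a batch, commit() commits.
interface IndexSink {
    void add(List<String> batch);
    void commit();
}

public class BatchIndexer {
    // Buffer documents, send them in fixed-size batches, and issue
    // exactly one commit after the loop. Returns the number of batches sent.
    public static int index(List<String> docs, int batchSize, IndexSink sink) {
        List<String> batch = new ArrayList<String>();
        int batches = 0;
        for (String doc : docs) {
            batch.add(doc);
            if (batch.size() >= batchSize) {
                sink.add(batch);                // send a full batch
                batch = new ArrayList<String>();
                batches++;
            }
        }
        if (!batch.isEmpty()) {                 // flush the remainder
            sink.add(batch);
            batches++;
        }
        sink.commit();                          // single commit, outside the loop
        return batches;
    }
}
```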


Erik

On May 31, 2010, at 5:27 PM, Steve Kuo wrote:
