Re: newbie question on how to batch commit documents

2010-06-01 Thread Chris Hostetter

: CommonsHttpSolrServer.request() resulting in multiple searchers.  My first
: thought was to change the configs for autowarming.  But after looking at the
: autowarm params, I am not sure what can be changed or perhaps a different
: approach is recommended.

even with 0 autowarming (which is what you have) it can still take time to
close/open a searcher on every commit -- which is why a commit per doc is
not usually a good idea (and is *definitely* not a good idea when doing
batch indexing).

most people can get away with just doing one commit after all their docs
have been added (ie: at the end of the batch) but if you've got a lot of
distinct clients, doing a lot of parallel indexing, and you don't want to
coordinate who is responsible for sending the commit, you can configure
"autocommit" to happen on the solr server...

http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section
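
for reference, an autoCommit block in solrconfig.xml looks roughly like this
(the thresholds below are example values only -- tune them for your own
indexing load):

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- commit at most every 10,000 added docs or every 60 seconds,
         whichever comes first -->
    <autoCommit>
      <maxDocs>10000</maxDocs>
      <maxTime>60000</maxTime>
    </autoCommit>
  </updateHandler>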

...but in general you should make sure that your clients sending docs can
deal with occasional long delays (or possibly even need to retry) when a
commit blocks add/delete operations because of an expensive segment merge.
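
a minimal sketch of that kind of client-side tolerance (a hypothetical
helper, not part of solrj itself -- it just wraps SolrServer.add() in a
crude retry/backoff loop):

  import java.io.IOException;
  import java.util.Collection;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.SolrServerException;
  import org.apache.solr.common.SolrInputDocument;

  // retry a batched add a few times if the server is busy
  // (e.g. a commit is blocked behind a big segment merge)
  void addWithRetry(SolrServer server, Collection<SolrInputDocument> docs)
          throws SolrServerException, IOException, InterruptedException {
      int attempts = 0;
      while (true) {
          try {
              server.add(docs);
              return;
          } catch (SolrServerException e) {
              if (++attempts >= 5) throw e;      // give up after a few tries
              Thread.sleep(2000L * attempts);    // crude linear backoff
          }
      }
  }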

-Hoss



Re: newbie question on how to batch commit documents

2010-06-01 Thread olivier sallou
I would additionally suggest using EmbeddedSolrServer for large uploads if
possible; performance is better.
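
A rough sketch of the setup, assuming the 1.4-era SolrJ/CoreContainer API
(the solr home path and the empty core name are placeholders -- adjust both
to your install):

  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
  import org.apache.solr.core.CoreContainer;

  public class BulkLoad {
      public static void main(String[] args) throws Exception {
          // run Solr in-process, skipping the HTTP layer entirely
          System.setProperty("solr.solr.home", "/path/to/solr/home");
          CoreContainer.Initializer initializer = new CoreContainer.Initializer();
          CoreContainer coreContainer = initializer.initialize();
          SolrServer server = new EmbeddedSolrServer(coreContainer, "");

          // ... build SolrInputDocuments, server.add(...) them in batches,
          // and commit once at the end ...

          coreContainer.shutdown();
      }
  }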

2010/5/31 Steve Kuo 

> I have a newbie question on what is the best way to batch add/commit a large
> collection of document data via solrj.  My first attempt was to write a
> multi-threaded application that did the following.
>
> Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
> for (Widget w : widges) {
>     SolrInputDocument doc = new SolrInputDocument();
>     doc.addField("id", w.getId());
>     doc.addField("name", w.getName());
>     doc.addField("price", w.getPrice());
>     doc.addField("category", w.getCat());
>     doc.addField("srcType", w.getSrcType());
>     docs.add(doc);
>
>     // commit docs to solr server
>     server.add(docs);
>     server.commit();
> }
>
> And I got this exception.
>
> org.apache.solr.common.SolrException:
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>     at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)
>
> The solrj wiki/documents seemed to indicate that because multiple threads
> were calling SolrServer.commit(), which in turn called
> CommonsHttpSolrServer.request(), resulting in multiple searchers.  My first
> thought was to change the configs for autowarming.  But after looking at the
> autowarm params, I am not sure what can be changed or perhaps a different
> approach is recommended.
>
> <filterCache class="solr.FastLRUCache"
>              size="512"
>              initialSize="512"
>              autowarmCount="0"/>
>
> <queryResultCache class="solr.LRUCache"
>                   size="512"
>                   initialSize="512"
>                   autowarmCount="0"/>
>
> <documentCache class="solr.LRUCache"
>                size="512"
>                initialSize="512"
>                autowarmCount="0"/>
>
> Your help is much appreciated.
>


Re: newbie question on how to batch commit documents

2010-05-31 Thread findbestopensource
Add the commit after the loop. I would also advise doing commits from a separate
thread. I keep a separate timer thread where every minute I do a commit,
and at the end of every day I optimize the index.
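
A minimal sketch of that timer-thread approach (a hypothetical illustration,
assuming one shared SolrServer instance used by all indexing threads; the
URL is a placeholder):

  import java.util.concurrent.Executors;
  import java.util.concurrent.ScheduledExecutorService;
  import java.util.concurrent.TimeUnit;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

  final SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
  ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

  // commit once a minute, never from the indexing threads themselves
  scheduler.scheduleAtFixedRate(new Runnable() {
      public void run() {
          try { server.commit(); } catch (Exception e) { /* log and keep going */ }
      }
  }, 1, 1, TimeUnit.MINUTES);

  // optimize once a day
  scheduler.scheduleAtFixedRate(new Runnable() {
      public void run() {
          try { server.optimize(); } catch (Exception e) { /* log and keep going */ }
      }
  }, 24, 24, TimeUnit.HOURS);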

Regards
Aditya
www.findbestopensource.com


On Tue, Jun 1, 2010 at 2:57 AM, Steve Kuo  wrote:

> I have a newbie question on what is the best way to batch add/commit a large
> collection of document data via solrj.  My first attempt was to write a
> multi-threaded application that did the following.
>
> Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
> for (Widget w : widges) {
>     SolrInputDocument doc = new SolrInputDocument();
>     doc.addField("id", w.getId());
>     doc.addField("name", w.getName());
>     doc.addField("price", w.getPrice());
>     doc.addField("category", w.getCat());
>     doc.addField("srcType", w.getSrcType());
>     docs.add(doc);
>
>     // commit docs to solr server
>     server.add(docs);
>     server.commit();
> }
>
> And I got this exception.
>
> org.apache.solr.common.SolrException:
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>     at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)
>
> The solrj wiki/documents seemed to indicate that because multiple threads
> were calling SolrServer.commit(), which in turn called
> CommonsHttpSolrServer.request(), resulting in multiple searchers.  My first
> thought was to change the configs for autowarming.  But after looking at the
> autowarm params, I am not sure what can be changed or perhaps a different
> approach is recommended.
>
> <filterCache class="solr.FastLRUCache"
>              size="512"
>              initialSize="512"
>              autowarmCount="0"/>
>
> <queryResultCache class="solr.LRUCache"
>                   size="512"
>                   initialSize="512"
>                   autowarmCount="0"/>
>
> <documentCache class="solr.LRUCache"
>                size="512"
>                initialSize="512"
>                autowarmCount="0"/>
>
> Your help is much appreciated.
>


Re: newbie question on how to batch commit documents

2010-05-31 Thread Erik Hatcher
Move the commit outside your loop and you'll be in better shape.
Better yet, enable autocommit in solrconfig.xml and don't commit from
your multithreaded client, otherwise you still run the risk of too
many commits happening concurrently.
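
Roughly, using the code from your post (a sketch, with the batch built up
first and a single add/commit at the end):

  Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
  for (Widget w : widges) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", w.getId());
      doc.addField("name", w.getName());
      doc.addField("price", w.getPrice());
      doc.addField("category", w.getCat());
      doc.addField("srcType", w.getSrcType());
      docs.add(doc);
  }

  // one add and one commit for the whole batch, outside the loop
  server.add(docs);
  server.commit();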


Erik

On May 31, 2010, at 5:27 PM, Steve Kuo wrote:

I have a newbie question on what is the best way to batch add/commit a large
collection of document data via solrj.  My first attempt was to write a
multi-threaded application that did the following.

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
for (Widget w : widges) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", w.getId());
    doc.addField("name", w.getName());
    doc.addField("price", w.getPrice());
    doc.addField("category", w.getCat());
    doc.addField("srcType", w.getSrcType());
    docs.add(doc);

    // commit docs to solr server
    server.add(docs);
    server.commit();
}

And I got this exception.

org.apache.solr.common.SolrException:
Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)

The solrj wiki/documents seemed to indicate that because multiple threads
were calling SolrServer.commit(), which in turn called
CommonsHttpSolrServer.request(), resulting in multiple searchers.  My first
thought was to change the configs for autowarming.  But after looking at the
autowarm params, I am not sure what can be changed or perhaps a different
approach is recommended.

<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>

<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="0"/>

<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>

Your help is much appreciated.




newbie question on how to batch commit documents

2010-05-31 Thread Steve Kuo
I have a newbie question on what is the best way to batch add/commit a large
collection of document data via solrj.  My first attempt was to write a
multi-threaded application that did the following.

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
for (Widget w : widges) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", w.getId());
    doc.addField("name", w.getName());
    doc.addField("price", w.getPrice());
    doc.addField("category", w.getCat());
    doc.addField("srcType", w.getSrcType());
    docs.add(doc);

    // commit docs to solr server
    server.add(docs);
    server.commit();
}

And I got this exception.

org.apache.solr.common.SolrException:
Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later

    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:86)

The solrj wiki/documents seemed to indicate that because multiple threads
were calling SolrServer.commit(), which in turn called
CommonsHttpSolrServer.request(), resulting in multiple searchers.  My first
thought was to change the configs for autowarming.  But after looking at the
autowarm params, I am not sure what can be changed or perhaps a different
approach is recommended.

<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>

<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="0"/>

<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>

Your help is much appreciated.