> 2. As far as I know, the better SolrJ interface for indexing with SolrCloud is CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many instances of CloudSolrServer and you balance them correctly with round robin or something similar, you'll get better performance in SolrCloud scenarios. At least that is what I've read in the documentation, and I also asked Mark Miller some months ago when I started dealing with Solr 4.0-BETA.
I was told otherwise during Solr Boot Camp.

Michael Della Bitta

------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Wed, Mar 20, 2013 at 5:14 AM, Luis Cappa Banda <luisca...@gmail.com> wrote:

> Thank you for answering. Some notes:
>
> 1. The Java engine I've developed, which wraps SolrJ 4.1 with some
> business logic, only executes search queries, not index/update operations,
> so the problem is not related to concurrent updates or anything similar.
>
> 2. As far as I know, the better SolrJ interface for indexing with SolrCloud is
> CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many instances
> of CloudSolrServer and you balance them correctly with round robin or
> something similar, you'll get better performance in SolrCloud scenarios.
> At least that is what I've read in the documentation, and I also asked Mark
> Miller some months ago when I started dealing with Solr 4.0-BETA.
>
> 3. I'm almost convinced that the problem is related to:
>
> - Zookeeper ensemble configuration.
> - Zookeeper version (3.4.5) not being compatible with the one Solr 4.1 expects.
> - SolrJ Zookeeper driver.
>
> In short, all my architecture works perfectly for search operations. I've
> also got another NRT Indexer module that uses CloudSolrServer and
> works perfectly. But after two or three days, something happens to the
> Zookeeper - CloudSolrServer connection, and it tries to update the cluster
> status forever with no success. The problem is solved only after restarting
> Zookeeper and the SolrCloud leader and replica shards.
>
>
> 2013/3/19 Michael Della Bitta <michael.della.bi...@appinions.com>
>
>> Don't use CloudSolrServer for writes. Instead, use
>> ConcurrentUpdateSolrServer, something like:
>>
>> SolrServer solrServer = new ConcurrentUpdateSolrServer(solrUrl, 100, 4);
>>
>> The 100 corresponds to how many docs to send in a batch.
>> The higher this is, the better the performance (up to a point; don't set
>> it to 50k or anything).
>>
>> The 4 corresponds to the number of threads that will be sending batches.
>>
>> Note that this class doesn't report errors, so if you want to see
>> exceptions when bad things happen, you'll have to override the
>> handleError(Throwable ex) method.
>>
>> Here's the javadoc for the class:
>>
>> http://lucene.apache.org/solr/4_2_0/solr-solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.html
>>
>> It'd be best if you can use a load balancer in front of your Solr
>> Cloud and use that as the solrUrl parameter.
>>
>> ***Either way, though, Mark is right in that you need to diagnose why
>> you're only able to do a few documents per second first.*** Adding
>> more threads at this point is probably not going to help.
>>
>> Michael Della Bitta
>>
>>
>> On Tue, Mar 19, 2013 at 3:57 PM, Luis Cappa Banda <luisca...@gmail.com> wrote:
>> > Can anyone help me? Each response may save a little kitten from a
>> > horrible and dramatic death somewhere in the world :-P
>> > On 15/03/2013 21:06, "Jack Park" <jackp...@topicquests.org> wrote:
>> >
>> >> Is there a document that tells how to create multiple threads? Search
>> >> returns many hits which orbit this idea, but I haven't spotted one
>> >> which tells how.
>> >>
>> >> Thanks
>> >> Jack
>> >>
>> >> On Fri, Mar 15, 2013 at 1:01 PM, Mark Miller <markrmil...@gmail.com> wrote:
>> >> > You def have to use multiple threads with it for it to be fast, but 3 or
>> >> > 4 docs a second still sounds absurdly slow.
>> >> >
>> >> > - Mark
>> >> >
>> >> > On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda <luisca...@gmail.com> wrote:
>> >> >
>> >> >> And up!
>> >> >> :-)
>> >> >>
>> >> >> I've been wondering whether using CloudSolrServer has something to do
>> >> >> with this. Does it perform badly when a CloudSolrServer singleton
>> >> >> receives multiple queries? Is it recommended to keep a list of
>> >> >> CloudSolrServer instances and select one of them with a round-robin
>> >> >> criterion?
>> >> >>
>> >> >>
>> >> >> 2013/3/14 Luis Cappa Banda <luisca...@gmail.com>
>> >> >>
>> >> >>> Hello!
>> >> >>>
>> >> >>> Thanks a lot, Erick! I've attached some stack traces taken while the
>> >> >>> 'engine' was running normally.
>> >> >>>
>> >> >>> Cheers,
>> >> >>>
>> >> >>> - Luis Cappa
>> >> >>>
>> >> >>>
>> >> >>> 2013/3/13 Erick Erickson <erickerick...@gmail.com>
>> >> >>>
>> >> >>>> Stack traces..
>> >> >>>>
>> >> >>>> First,
>> >> >>>> jps -l
>> >> >>>>
>> >> >>>> that will give you the process IDs of your running Java processes. Then:
>> >> >>>>
>> >> >>>> jstack <pid from above>
>> >> >>>>
>> >> >>>> Usually I pipe the output from jstack into a text file...
>> >> >>>>
>> >> >>>> Best
>> >> >>>> Erick
>> >> >>>>
>> >> >>>>
>> >> >>>> On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda <luisca...@gmail.com> wrote:
>> >> >>>>
>> >> >>>>> Uhm, how can I do that... 'cleanly'? I know that with JConsole it's
>> >> >>>>> possible to output these traces, but with a .war application built on
>> >> >>>>> top of Spring I don't know how to do that. In any case, here is my
>> >> >>>>> CloudSolrServer wrapper that is used by other classes.
>> >> >>>>> There is no synchronized method or block of code in it:
>> >> >>>>>
>> >> >>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >> >>>>>
>> >> >>>>> public class BinaryLBHttpSolrServer extends LBHttpSolrServer {
>> >> >>>>>
>> >> >>>>>     private static final long serialVersionUID = 3905956120804659445L;
>> >> >>>>>
>> >> >>>>>     public BinaryLBHttpSolrServer(String[] endpoints) throws MalformedURLException {
>> >> >>>>>         super(endpoints);
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     protected HttpSolrServer makeServer(String server) throws MalformedURLException {
>> >> >>>>>         HttpSolrServer solrServer = super.makeServer(server);
>> >> >>>>>         solrServer.setRequestWriter(new BinaryRequestWriter());
>> >> >>>>>         return solrServer;
>> >> >>>>>     }
>> >> >>>>> }
>> >> >>>>>
>> >> >>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >> >>>>>
>> >> >>>>> public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {
>> >> >>>>>
>> >> >>>>>     private CloudSolrServer cloudSolrServer;
>> >> >>>>>
>> >> >>>>>     private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
>> >> >>>>>
>> >> >>>>>     public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[] endpoints,
>> >> >>>>>             int clientTimeout, int connectTimeout, String cloudCollection) {
>> >> >>>>>         try {
>> >> >>>>>             BinaryLBHttpSolrServer lbSolrServer = new BinaryLBHttpSolrServer(endpoints);
>> >> >>>>>             this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints, lbSolrServer);
>> >> >>>>>             this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
>> >> >>>>>             this.cloudSolrServer.setZkClientTimeout(clientTimeout);
>> >> >>>>>             this.cloudSolrServer.setDefaultCollection(cloudCollection);
>> >> >>>>>         } catch (MalformedURLException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public QueryResponse search(SolrQuery query) throws SolrServerException {
>> >> >>>>>         return cloudSolrServer.query(query, METHOD.POST);
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public boolean index(DocumentBean user) {
>> >> >>>>>         boolean indexed = false;
>> >> >>>>>         int retries = 0;
>> >> >>>>>         do {
>> >> >>>>>             indexed = addBean(user);
>> >> >>>>>             retries++;
>> >> >>>>>         } while (!indexed && retries < 4);
>> >> >>>>>         return indexed;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public boolean update(SolrInputDocument updateDoc) {
>> >> >>>>>         boolean update = false;
>> >> >>>>>         int retries = 0;
>> >> >>>>>         do {
>> >> >>>>>             update = addSolrInputDocument(updateDoc);
>> >> >>>>>             retries++;
>> >> >>>>>         } while (!update && retries < 4);
>> >> >>>>>         return update;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public void commit() {
>> >> >>>>>         try {
>> >> >>>>>             cloudSolrServer.commit();
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public boolean delete(String ...
>> >> >>>>>             ids) {
>> >> >>>>>         boolean deleted = false;
>> >> >>>>>         List<String> idList = Arrays.asList(ids);
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.deleteById(idList);
>> >> >>>>>             this.cloudSolrServer.commit(true, true);
>> >> >>>>>             deleted = true;
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return deleted;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public void optimize() {
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.optimize();
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     /*
>> >> >>>>>      * ********************
>> >> >>>>>      * Getters & setters *
>> >> >>>>>      * ********************
>> >> >>>>>      */
>> >> >>>>>     public CloudSolrServer getSolrServer() {
>> >> >>>>>         return cloudSolrServer;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     public void setSolrServer(CloudSolrServer solrServer) {
>> >> >>>>>         this.cloudSolrServer = solrServer;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     private boolean addBean(DocumentBean user) {
>> >> >>>>>         boolean added = false;
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.addBean(user, 100);
>> >> >>>>>             this.commit();
>> >> >>>>>             added = true; // report success so that index() stops retrying
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return added;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     private boolean addSolrInputDocument(SolrInputDocument updateDoc) {
>> >> >>>>>         boolean added = false;
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.add(updateDoc, 100);
>> >> >>>>>             this.commit();
>> >> >>>>>             added = true;
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return added;
>> >> >>>>>     }
>> >> >>>>> }
>> >> >>>>>
>> >> >>>>> Thank you very much, Mark.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> - Luis Cappa
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> 2013/3/13 Mark Miller <markrmil...@gmail.com>
>> >> >>>>>
>> >> >>>>>> Could you capture some thread stack traces in the 'engine' and see if
>> >> >>>>>> there are any blocking methods?
>> >> >>>>>>
>> >> >>>>>> - Mark
>> >> >>>>>>
>> >> >>>>>> On Mar 13, 2013, at 1:34 PM, Luis Cappa Banda <luisca...@gmail.com> wrote:
>> >> >>>>>>
>> >> >>>>>>> Just one correction:
>> >> >>>>>>>
>> >> >>>>>>> When I said:
>> >> >>>>>>>
>> >> >>>>>>> - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>> everything is green, and I cant execute queries directly into Solr.
>> >> >>>>>>>
>> >> >>>>>>> I meant:
>> >> >>>>>>>
>> >> >>>>>>> - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>> everything is green, and *I can* execute queries directly into Solr.
>> >> >>>>>>>
>> >> >>>>>>> Thanks!
>> >> >>>>>>>
>> >> >>>>>>> - Luis Cappa
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> 2013/3/13 Luis Cappa Banda <luisca...@gmail.com>
>> >> >>>>>>>
>> >> >>>>>>>> Hello, guys!
>> >> >>>>>>>>
>> >> >>>>>>>> I've been experiencing some annoying behavior in my current production
>> >> >>>>>>>> scenario. Here is the snapshot:
>> >> >>>>>>>>
>> >> >>>>>>>> - SolrCloud: 2 shards
>> >> >>>>>>>> - Zookeeper ensemble: 3 nodes on *different machines* (most of the
>> >> >>>>>>>> tutorials install 3 Zookeeper nodes on the same machine).
>> >> >>>>>>>> - This is the zoo.cfg on every node:
>> >> >>>>>>>>
>> >> >>>>>>>> tickTime=2000 // I've also tried with 60000
>> >> >>>>>>>> initLimit=10
>> >> >>>>>>>> syncLimit=5
>> >> >>>>>>>> dataDir=/var/lib/zookeeper
>> >> >>>>>>>> clientPort=9000
>> >> >>>>>>>> server.1=zoohost1:2888:3888
>> >> >>>>>>>> server.2=zoohost1:2888:3888
>> >> >>>>>>>> server.3=zoohost1:2888:3888
>> >> >>>>>>>>
>> >> >>>>>>>> - I've developed a Java application with a REST API (let's call it
>> >> >>>>>>>> *engine*) that dispatches queries to SolrCloud. It's a wrapper around
>> >> >>>>>>>> CloudSolrServer, so it's mandatory to specify some Zookeeper
>> >> >>>>>>>> configuration params too. They are loaded dynamically when the
>> >> >>>>>>>> application is deployed in a Tomcat server, but the current values
>> >> >>>>>>>> that I'm using are as follows:
>> >> >>>>>>>>
>> >> >>>>>>>> cloudSolrServer.setZkConnectTimeout(60000)
>> >> >>>>>>>> cloudSolrServer.setZkClientTimeout(60000)
>> >> >>>>>>>>
>> >> >>>>>>>> *THE PROBLEM*
>> >> >>>>>>>>
>> >> >>>>>>>> Everything goes OK, but after two days, more or less (yes, I've checked
>> >> >>>>>>>> that this behavior occurs periodically, more or less), the *engine
>> >> >>>>>>>> blocks* and cannot dispatch any query to SolrCloud.
>> >> >>>>>>>>
>> >> >>>>>>>> - The *engine* log only outputs "updating Zookeeper..." one last time,
>> >> >>>>>>>> but never updates.
>> >> >>>>>>>> - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>>> everything is green, and I cant execute queries directly into Solr.
>> >> >>>>>>>> - So Solr appears to be OK, and the next step is to restart the
>> >> >>>>>>>> *engine*, but "updating Zookeeper..." appears again. Unfortunately,
>> >> >>>>>>>> switching it off and on doesn't work here :-(
>> >> >>>>>>>> - I've also checked the Zookeeper logs and some connection log outs
>> >> >>>>>>>> appear, but the ensemble appears to be OK too.
>> >> >>>>>>>> - *The end:* If I restart the Zookeeper nodes one by one, restart
>> >> >>>>>>>> SolrCloud, and restart the engine, the problem is solved. I'm using
>> >> >>>>>>>> Amazon AWS for hosting, so I rule out connection problems between
>> >> >>>>>>>> instances.
>> >> >>>>>>>>
>> >> >>>>>>>> Has anyone experienced something similar? Can anybody shed some light
>> >> >>>>>>>> on this problem?
>> >> >>>>>>>>
>> >> >>>>>>>> Thank you very much.
>> >> >>>>>>>>
>> >> >>>>>>>> Regards,
>> >> >>>>>>>>
>> >> >>>>>>>> - Luis Cappa
>
> --
> Luis Cappa Banda
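
On the earlier question in this thread about capturing stack traces 'cleanly' from inside a .war: besides running jps and jstack against the Tomcat process, the JVM can produce a similar dump in-process through the standard java.lang.management API. What follows is only a minimal sketch. ThreadDumper is a hypothetical helper name (it is not part of Solr, SolrJ, or Spring), and how it gets triggered (a REST endpoint, a timer, a log hook) is left to the application.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, for illustration only: builds a jstack-like text dump
// of every live thread in the current JVM.
class ThreadDumper {

    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and synchronizer details,
        // which not every JVM supports.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" id=").append(info.getThreadId())
               .append(" state=").append(info.getThreadState())
               .append('\n');
            // A blocked or waiting thread reports the lock it is waiting for.
            if (info.getLockName() != null) {
                out.append("    waiting on ").append(info.getLockName()).append('\n');
            }
            // Print every frame; ThreadInfo.toString() would truncate the stack.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Taking two or three dumps a few seconds apart while the engine is stuck on "updating Zookeeper..." and comparing them should show whether the same threads stay BLOCKED or WAITING in the same frames.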