> 2. As far as I know, the better SolrJ interface for indexing with SolrCloud is
> CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many instances
> of CloudSolrServer and you balance them correctly with round robin or
> something similar, you'll get better performance in SolrCloud scenarios.
> At least that is what I've read in the documentation, and I also asked Mark
> Miller some months ago when I started dealing with Solr 4.0-BETA.

I was told otherwise during Solr Boot Camp.
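
If you still want to experiment with a pool of clients, the rotation itself is easy to sketch. The class below is a hypothetical helper, not part of SolrJ; it rotates over any list of objects, so the items could be client instances you create yourself. Whether a pool actually helps with CloudSolrServer is a separate question that this sketch does not answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: hands out items from a fixed list in round-robin order. */
final class RoundRobin<T> {
    private final List<T> items;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobin(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("need at least one item");
        }
        // Defensive copy so later changes to the caller's list do not affect rotation.
        this.items = new ArrayList<T>(items);
    }

    /** Safe to call from several threads; each call advances the shared counter. */
    T pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), items.size());
        return items.get(index);
    }
}
```

Each call to pick() returns the next item and wraps around at the end of the list.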

Michael Della Bitta

------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Wed, Mar 20, 2013 at 5:14 AM, Luis Cappa Banda <luisca...@gmail.com> wrote:
> Thank you for answering. Some notes:
>
> 1. The Java engine I've developed, which wraps SolrJ 4.1 with some
> business logic, only executes search queries, not index/update operations,
> so the problem is not related to concurrent updates or anything similar.
>
> 2. As far as I know, the better SolrJ interface for indexing with SolrCloud is
> CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many instances
> of CloudSolrServer and you balance them correctly with round robin or
> something similar, you'll get better performance in SolrCloud scenarios.
> At least that is what I've read in the documentation, and I also asked Mark
> Miller some months ago when I started dealing with Solr 4.0-BETA.
>
> 3. I'm almost convinced that the problem is related to one of the following:
>
> - The Zookeeper ensemble configuration.
> - The Zookeeper version (3.4.5), which may not be compatible with the one
>   Solr 4.1 expects.
> - The SolrJ Zookeeper driver.
>
> In short, my whole architecture works perfectly for search operations. I've
> also got another NRT indexer module that uses CloudSolrServer and
> works perfectly. But after two or three days, something happens to the
> Zookeeper - CloudSolrServer connection, and the client tries to update the
> cluster status forever with no success. The problem is solved only after
> restarting Zookeeper and the SolrCloud leader and replica shards.
>
>
> 2013/3/19 Michael Della Bitta <michael.della.bi...@appinions.com>
>
>> Don't use CloudSolrServer for writes. Instead, use
>> ConcurrentUpdateSolrServer, something like:
>>
>> SolrServer solrServer = new ConcurrentUpdateSolrServer(solrUrl, 100, 4);
>>
>> The 100 corresponds to how many docs to send in a batch. The higher
>> this is, the better performance is (to a point, don't set that to 50k
>> or anything).
>>
>> The 4 corresponds to the number of threads that will be sending batches.
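
To make the batch idea concrete, here is a minimal sketch of splitting a list of documents into batches of a given size. Batches is a hypothetical helper for illustration only, and it works on any list; ConcurrentUpdateSolrServer does its own queuing, so you would not need this when using it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a list into consecutive batches of at most batchSize items. */
final class Batches {
    static <T> List<List<T>> of(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int from = 0; from < docs.size(); from += batchSize) {
            int to = Math.min(from + batchSize, docs.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<T>(docs.subList(from, to)));
        }
        return batches;
    }
}
```

With 250 documents and a batch size of 100, this yields batches of 100, 100, and 50.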
>>
>> Note that this class doesn't report errors, so if you want to see
>> exceptions when bad things happen, you'll have to override
>> handleError(Throwable ex) method.
>>
>> Here's the javadoc for the class:
>>
>> http://lucene.apache.org/solr/4_2_0/solr-solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.html
>>
>> It'd be best if you can use a load balancer in front of your Solr
>> Cloud and use that as the solrUrl parameter.
>>
>> ***Either way, though, Mark is right in that you need to diagnose why
>> you're only able to do a few documents per second first.*** Adding
>> more threads at this point is probably not going to help.
>>
>> Michael Della Bitta
>>
>>
>>
>> On Tue, Mar 19, 2013 at 3:57 PM, Luis Cappa Banda <luisca...@gmail.com>
>> wrote:
>> > Can anyone help me? Each response may save a little kitten from a horrible
>> > and dramatic death somewhere in the world :-P
>> > On 15/03/2013 21:06, "Jack Park" <jackp...@topicquests.org> wrote:
>> >
>> >> Is there a document that tells how to create multiple threads? Search
>> >> returns many hits which orbit this idea, but I haven't spotted one
>> >> which tells how.
>> >>
>> >> Thanks
>> >> Jack
>> >>
>> >> On Fri, Mar 15, 2013 at 1:01 PM, Mark Miller <markrmil...@gmail.com>
>> >> wrote:
>> >> > You def have to use multiple threads with it for it to be fast, but 3 or
>> >> > 4 docs a second still sounds absurdly slow.
>> >> >
>> >> > - Mark
>> >> >
>> >> > On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda <luisca...@gmail.com>
>> >> wrote:
>> >> >
>> >> >> And up! :-)
>> >> >>
>> >> >> I've been wondering if using CloudSolrServer has something to do with this.
>> >> >> Does it perform badly when a CloudSolrServer singleton receives
>> >> >> multiple queries? Is it recommended to keep a list of CloudSolrServer
>> >> >> instances and select one of them with a round-robin criterion?
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2013/3/14 Luis Cappa Banda <luisca...@gmail.com>
>> >> >>
>> >> >>> Hello!
>> >> >>>
>> >> >>> Thanks a lot, Erick! I've attached some stack traces taken during a normal
>> >> >>> 'engine' run.
>> >> >>>
>> >> >>> Cheers,
>> >> >>>
>> >> >>> - Luis Cappa
>> >> >>>
>> >> >>>
>> >> >>> 2013/3/13 Erick Erickson <erickerick...@gmail.com>
>> >> >>>
>> >> >>>> Stack traces..
>> >> >>>>
>> >> >>>> First,
>> >> >>>> jps -l
>> >> >>>>
>> >> >>>> that will give you the process IDs of your running Java processes.
>> >> >>>> Then:
>> >> >>>>
>> >> >>>> jstack <pid from above>
>> >> >>>>
>> >> >>>> Usually I pipe the output from jstack into a text file...
>> >> >>>>
>> >> >>>> Best
>> >> >>>> Erick
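
If jstack is awkward to reach from a deployed .war, a similar dump can be produced from inside the JVM with the standard java.lang.management API. This is a minimal sketch, assuming a hypothetical ThreadDumper helper that you call from wherever is convenient (a debug endpoint, a scheduled logger); the output format is only loosely modeled on jstack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: builds a text dump of all live threads in this JVM. */
final class ThreadDumper {
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and locked-synchronizer details.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

Threads that show up as BLOCKED or WAITING in the same frames across several dumps are the ones worth a closer look.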
>> >> >>>>
>> >> >>>>
>> >> >>>> On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda <luisca...@gmail.com>
>> >> >>>> wrote:
>> >> >>>>
>> >> >>>>> Uhm, how can I do that... 'cleanly'? I know that with JConsole it's
>> >> >>>>> possible to output these traces, but with a .war application built on top of
>> >> >>>>> Spring I don't know how to do that. In any case, here is my CloudSolrServer
>> >> >>>>> wrapper that is used by other classes. There is no synchronized method or
>> >> >>>>> block in it:
>> >> >>>>>
>> >> >>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >> >>>>>
>> >> >>>>> public class BinaryLBHttpSolrServer extends LBHttpSolrServer {
>> >> >>>>>
>> >> >>>>>     private static final long serialVersionUID = 3905956120804659445L;
>> >> >>>>>
>> >> >>>>>     public BinaryLBHttpSolrServer(String[] endpoints) throws MalformedURLException {
>> >> >>>>>         super(endpoints);
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     protected HttpSolrServer makeServer(String server) throws MalformedURLException {
>> >> >>>>>         HttpSolrServer solrServer = super.makeServer(server);
>> >> >>>>>         solrServer.setRequestWriter(new BinaryRequestWriter());
>> >> >>>>>         return solrServer;
>> >> >>>>>     }
>> >> >>>>> }
>> >> >>>>>
>> >> >>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >> >>>>>
>> >> >>>>> public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {
>> >> >>>>>
>> >> >>>>>     private CloudSolrServer cloudSolrServer;
>> >> >>>>>
>> >> >>>>>     private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
>> >> >>>>>
>> >> >>>>>     public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[] endpoints,
>> >> >>>>>             int clientTimeout, int connectTimeout, String cloudCollection) {
>> >> >>>>>         try {
>> >> >>>>>             BinaryLBHttpSolrServer lbSolrServer = new BinaryLBHttpSolrServer(endpoints);
>> >> >>>>>             this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints, lbSolrServer);
>> >> >>>>>             this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
>> >> >>>>>             this.cloudSolrServer.setZkClientTimeout(clientTimeout);
>> >> >>>>>             this.cloudSolrServer.setDefaultCollection(cloudCollection);
>> >> >>>>>         } catch (MalformedURLException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public QueryResponse search(SolrQuery query) throws SolrServerException {
>> >> >>>>>         return cloudSolrServer.query(query, METHOD.POST);
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public boolean index(DocumentBean user) {
>> >> >>>>>         boolean indexed = false;
>> >> >>>>>         int retries = 0;
>> >> >>>>>         do {
>> >> >>>>>             indexed = addBean(user);
>> >> >>>>>             retries++;
>> >> >>>>>         } while (!indexed && retries < 4);
>> >> >>>>>         return indexed;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public boolean update(SolrInputDocument updateDoc) {
>> >> >>>>>         boolean update = false;
>> >> >>>>>         int retries = 0;
>> >> >>>>>         do {
>> >> >>>>>             update = addSolrInputDocument(updateDoc);
>> >> >>>>>             retries++;
>> >> >>>>>         } while (!update && retries < 4);
>> >> >>>>>         return update;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public void commit() {
>> >> >>>>>         try {
>> >> >>>>>             cloudSolrServer.commit();
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public boolean delete(String... ids) {
>> >> >>>>>         boolean deleted = false;
>> >> >>>>>         List<String> idList = Arrays.asList(ids);
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.deleteById(idList);
>> >> >>>>>             this.cloudSolrServer.commit(true, true);
>> >> >>>>>             deleted = true;
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return deleted;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     @Override
>> >> >>>>>     public void optimize() {
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.optimize();
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     /*
>> >> >>>>>      * Getters & setters
>> >> >>>>>      */
>> >> >>>>>     public CloudSolrServer getSolrServer() {
>> >> >>>>>         return cloudSolrServer;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     public void setSolrServer(CloudSolrServer solrServer) {
>> >> >>>>>         this.cloudSolrServer = solrServer;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     private boolean addBean(DocumentBean user) {
>> >> >>>>>         boolean added = false;
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.addBean(user, 100);
>> >> >>>>>             this.commit();
>> >> >>>>>             // Report success so that index() stops retrying.
>> >> >>>>>             added = true;
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return added;
>> >> >>>>>     }
>> >> >>>>>
>> >> >>>>>     private boolean addSolrInputDocument(SolrInputDocument updateDoc) {
>> >> >>>>>         boolean added = false;
>> >> >>>>>         try {
>> >> >>>>>             this.cloudSolrServer.add(updateDoc, 100);
>> >> >>>>>             this.commit();
>> >> >>>>>             added = true;
>> >> >>>>>         } catch (IOException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrServerException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         } catch (SolrException e) {
>> >> >>>>>             log.error(e);
>> >> >>>>>         }
>> >> >>>>>         return added;
>> >> >>>>>     }
>> >> >>>>> }
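
The do/while loops in index() and update() above retry up to four times and depend entirely on the boolean their helper returns, so each helper has to return true on success. The same pattern can be factored into one place. This is a minimal sketch, assuming Java 8 or later and a hypothetical Retry helper that is not part of the posted code or of SolrJ.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: retries an operation that reports success as a boolean. */
final class Retry {
    /** Runs attempt up to maxAttempts times and returns true as soon as one attempt succeeds. */
    static boolean upTo(int maxAttempts, BooleanSupplier attempt) {
        for (int i = 0; i < maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
        }
        // Every attempt failed (or maxAttempts was not positive).
        return false;
    }
}
```

With this in place, index(user) could be written as Retry.upTo(4, () -> addBean(user)).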
>> >> >>>>>
>> >> >>>>> Thank you very much, Mark.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> -  Luis Cappa
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> 2013/3/13 Mark Miller <markrmil...@gmail.com>
>> >> >>>>>
>> >> >>>>>>
>> >> >>>>>> Could you capture some thread stack traces in the 'engine' and see if
>> >> >>>>>> there are any blocking methods?
>> >> >>>>>>
>> >> >>>>>> - Mark
>> >> >>>>>>
>> >> >>>>>> On Mar 13, 2013, at 1:34 PM, Luis Cappa Banda <luisca...@gmail.com>
>> >> >>>>>> wrote:
>> >> >>>>>>
>> >> >>>>>>> Just one correction:
>> >> >>>>>>>
>> >> >>>>>>> When I said:
>> >> >>>>>>>
>> >> >>>>>>>  - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>>  everything is green, and I cant execute queries directly into Solr.
>> >> >>>>>>>
>> >> >>>>>>> I mean:
>> >> >>>>>>>
>> >> >>>>>>>  - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>>  everything is green, and *I can* execute queries directly into Solr.
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> Thanks!
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> - Luis Cappa
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> 2013/3/13 Luis Cappa Banda <luisca...@gmail.com>
>> >> >>>>>>>
>> >> >>>>>>>> Hello, guys!
>> >> >>>>>>>>
>> >> >>>>>>>> I've been experiencing some annoying behavior in my current production
>> >> >>>>>>>> scenario. Here is the snapshot:
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>  - SolrCloud: 2 shards
>> >> >>>>>>>>  - Zookeeper ensemble: 3 nodes on *different machines* (most of the
>> >> >>>>>>>>  tutorials install 3 Zookeeper nodes on the same machine).
>> >> >>>>>>>>  - This is the zoo.cfg from every node:
>> >> >>>>>>>>
>> >> >>>>>>>> tickTime=2000  // I´ve also tried with 60000
>> >> >>>>>>>>
>> >> >>>>>>>> initLimit=10
>> >> >>>>>>>>
>> >> >>>>>>>> syncLimit=5
>> >> >>>>>>>>
>> >> >>>>>>>> dataDir=/var/lib/zookeeper
>> >> >>>>>>>>
>> >> >>>>>>>> clientPort=9000
>> >> >>>>>>>>
>> >> >>>>>>>> server.1=zoohost1:2888:3888
>> >> >>>>>>>>
>> >> >>>>>>>> server.2=zoohost1:2888:3888
>> >> >>>>>>>>
>> >> >>>>>>>> server.3=zoohost1:2888:3888
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>>  - I've developed a Java application with a REST API (let's call it the
>> >> >>>>>>>>  *engine*) that dispatches queries to SolrCloud. It's a wrapper around
>> >> >>>>>>>>  CloudSolrServer, so it's mandatory to specify some Zookeeper configuration
>> >> >>>>>>>>  params too. They are loaded dynamically when the application is deployed in
>> >> >>>>>>>>  a Tomcat server, but the current values that I'm using are as follows:
>> >> >>>>>>>>
>> >> >>>>>>>> cloudSolrServer.setZkConnectTimeout(60000)
>> >> >>>>>>>>
>> >> >>>>>>>> cloudSolrServer.setZkClientTimeout(60000)
>> >> >>>>>>>>
>> >> >>>>>>>> *THE PROBLEM*
>> >> >>>>>>>>
>> >> >>>>>>>> Everything goes OK, but after two days or so (yes, I've checked
>> >> >>>>>>>> that this behavior occurs periodically) the *engine blocks*
>> >> >>>>>>>> and cannot dispatch any query to SolrCloud.
>> >> >>>>>>>>
>> >> >>>>>>>>  - The *engine* log only outputs "updating Zookeeper..." one last time,
>> >> >>>>>>>>  but never updates.
>> >> >>>>>>>>  - I've checked SolrCloud via Solr Admin interface and it's OK:
>> >> >>>>>>>>  everything is green, and I cant execute queries directly into Solr.
>> >> >>>>>>>>  - So Solr appears to be OK, and the next step is to restart the *engine*,
>> >> >>>>>>>>  but "updating Zookeeper..." appears again. Unfortunately, switching it
>> >> >>>>>>>>  off and on again doesn't work here :-(
>> >> >>>>>>>>  - I've also checked the Zookeeper logs and some connection log-outs appear,
>> >> >>>>>>>>  but the ensemble appears to be OK too.
>> >> >>>>>>>>  - *The end:* If I restart the Zookeeper nodes one by one, restart
>> >> >>>>>>>>  SolrCloud, and restart the engine, the problem is solved. I'm using
>> >> >>>>>>>>  Amazon AWS for hosting, so I rule out connection problems between
>> >> >>>>>>>>  instances.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Has anyone experienced something similar? Can anybody shed some light on
>> >> >>>>>>>> this problem?
>> >> >>>>>>>>
>> >> >>>>>>>> Thank you very much.
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> Regards,
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> - Luis Cappa
>> >> >>>>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>
>> >> >>>>
>> >> >>>
>> >> >>>
>> >> >
>> >>
>>
>
>
>
> --
> Luis Cappa Banda
>
> *Phone*: (0034) 686 200 375
> *Skype*: luiscappabanda
