Sorry, yes, I had been using the BETA version. I have deleted all of that, replaced the jars with the released versions (and reduced my core count), and now I have consistent results. I guess I missed that JIRA ticket; sorry for the false alarm.

Dave

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, November 23, 2012 4:25 AM
To: solr-user@lucene.apache.org
Subject: Re: inconsistent number of results returned in solr cloud

Dave:

I should have asked this first. What version of Solr are you using? I'm not sure whether it was fixed in BETA or not (it certainly is in the 4.0 GA release). There was a problem with adding a doclist via solrj; here's one related JIRA, although it wasn't the main fix: https://issues.apache.org/jira/browse/SOLR-3001. I suspect that's the "known problem" Mark mentioned, because what you're seeing _sure_ sounds similar....

Best
Erick

On Mon, Nov 19, 2012 at 12:49 PM, Buttler, David <buttl...@llnl.gov> wrote:

> Answers inline below
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Saturday, November 17, 2012 6:40 AM
> To: solr-user@lucene.apache.org
> Subject: Re: inconsistent number of results returned in solr cloud
>
> Hmmm, first an aside. If by "commit after every batch of documents" you mean after every call to server.add(doclist), there's no real need to do that unless you're striving for really low latency. The usual recommendation is to use commitWithin when adding and commit only at the very end of the run. This shouldn't actually be germane to your issue, just an FYI.
>
> DB> Good point. The code for committing docs to solr is fairly old. I will update it since I don't have a latency requirement.
>
> So you're saying that the inconsistency is permanent? By that I mean it keeps coming back inconsistently for minutes/hours/days?
>
> DB> Yes, it is permanent. I have collections that have been up for weeks, and are still returning inconsistent results, and I haven't been adding any additional documents.
> DB> Related to this, I seem to have a discrepancy between the number of documents I think I am sending to solr, and the number of documents it is reporting. I have tried reducing the number of shards for one of my small collections, so I deleted all references to this collection, and reloaded it. I think I have 260 documents submitted (counted from a hadoop job). Solr returns a count of ~430 (it varies), and the first returned document is not consistent.
>
> I guess if I were trying to test this I'd need to know how you added subsequent collections. In particular what you did re: zookeeper as you added each collection.
>
> DB> These are my steps
> DB> 1. Create the collection via the HTTP API: http://<host>:<port>/solr/admin/collections?action=CREATE&name=<collection>&numShards=6&%20collection.configName=<collection>
> DB> 2. Relaunch one of my JVM processes, bootstrapping the collection:
> DB> java -Xmx16g -Dcollection.configName=<collection> -Djetty.port=<port> -DzkHost=<zkhost> -Dsolr.solr.home=<solr home> -DnumShards=6 -Dbootstrap_confdir=conf -jar start.jar
> DB> 3. Load data
>
> DB> Let me know if something is unclear. I can run through the process again and document it more carefully.
> DB>
> DB> Thanks for looking at it,
> DB> Dave
>
> Best
> Erick
>
>
> On Fri, Nov 16, 2012 at 2:55 PM, Buttler, David <buttl...@llnl.gov> wrote:
>
> > My typical way of adding documents is through SolrJ, where I commit after every batch of documents (where the batch size is configurable)
> >
> > I have now tried committing several times, from the command line (curl) with and without openSearcher=true. It does not affect anything.
> >
> > Dave
> >
> > -----Original Message-----
> > From: Mark Miller [mailto:markrmil...@gmail.com]
> > Sent: Friday, November 16, 2012 11:04 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: inconsistent number of results returned in solr cloud
> >
> > How did you do the final commit? Can you try a lone commit (with openSearcher=true) and see if that affects things?
> >
> > Trying to determine if this is a known issue or not.
> >
> > - Mark
> >
> > On Nov 16, 2012, at 1:34 PM, "Buttler, David" <buttl...@llnl.gov> wrote:
> >
> > > Hi all,
> > > I buried an issue in my last post, so let me pop it up.
> > >
> > > I have a cluster with 10 collections on it. The first collection I loaded works perfectly. But every subsequent collection returns an inconsistent number of results for each query. The queries can be simply *:*, or more complex facet queries. If I go to individual cores and issue the query, with distrib=false, I get a consistent number of results. I am wondering if there is some delay in returning results from my shards, and the queried node just times out and displays the number of results that it has received so far. If there is such a timeout, it must be very small, as my QTime is around 11 ms.
> > >
> > > Dave
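
For anyone landing on this thread with the same symptom: the check described above (query each core with distrib=false, then compare against the count a normal distributed query returns) can be scripted. What follows is only a rough sketch and not code from the thread. The function names, the base URL and the core names are all hypothetical, and it assumes you query exactly one core per shard, since querying several replicas of the same shard would double count. It also assumes the standard JSON response shape (wt=json, with the total under response.numFound).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def select_url(base_url, core, distrib):
    """Build a count-only select URL for one core (rows=0, JSON output).

    base_url and core are placeholders, e.g. "http://localhost:8983/solr"
    and whatever core names your cluster actually uses.
    """
    params = {"q": "*:*", "rows": 0, "wt": "json"}
    if not distrib:
        # Ask this core only; do not fan out to the other shards.
        params["distrib"] = "false"
    return f"{base_url}/{core}/select?{urlencode(params)}"


def num_found(response_text):
    """Extract numFound from a Solr JSON select response."""
    return json.loads(response_text)["response"]["numFound"]


def fetch_count(url, timeout=10):
    """Fetch a select URL and return its numFound (needs a live Solr)."""
    with urlopen(url, timeout=timeout) as resp:
        return num_found(resp.read().decode("utf-8"))


def compare_counts(core_counts, distributed_counts):
    """Compare summed per-core counts with repeated distributed counts.

    core_counts: {core_name: numFound with distrib=false}, one core per shard.
    distributed_counts: numFound values from repeating the distributed query.
    """
    expected = sum(core_counts.values())
    return {
        "expected": expected,
        "observed": sorted(set(distributed_counts)),
        "consistent": all(c == expected for c in distributed_counts),
    }
```

To use it, call fetch_count(select_url(base, core, distrib=False)) once per shard core, call fetch_count(select_url(base, any_core, distrib=True)) a handful of times, and pass both sets of numbers to compare_counts. If "observed" holds more than one value, or differs from "expected", you are seeing the behaviour reported in this thread.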