P.S. If you have too much trouble with ZooKeeper session timeouts, you may need to raise the client timeout from 15 seconds to something higher.
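For reference, a rough sketch of where that timeout is set, assuming your solr.xml follows the layout of the stock Solr 4.0 example config (yours may differ). The `zkClientTimeout` attribute is in milliseconds; the 30000 here is only an illustrative value, and the core name is the example default rather than your actual core.

```xml
<!-- solr.xml: assumes the layout shipped with the Solr 4.0 example -->
<solr persistent="true">
  <!-- zkClientTimeout is in milliseconds; the example config defaults to 15000.
       30000 is an illustrative value, not a recommendation. -->
  <cores adminPath="/admin/cores"
         defaultCoreName="collection1"
         zkClientTimeout="${zkClientTimeout:30000}">
    <!-- "collection1" is the example core name; substitute your own -->
    <core name="collection1" instanceDir="collection1" />
  </cores>
</solr>
```

If the attribute uses the `${zkClientTimeout:...}` substitution form as above, the value can also be overridden at startup with `-DzkClientTimeout=30000`. Note that the ZooKeeper server caps the negotiated session timeout (by default at 20 times its `tickTime`), so a very large client value may not take full effect.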
- Mark

On Tue, Dec 11, 2012 at 11:38 AM, Mark Miller <markrmil...@gmail.com> wrote:
> Hey Roland!
>
> When you look at the Admin UI in the Cloud tab, do you see both
> instances as active?
>
> Also, how are you querying the nodes?
>
> One tricky thing at the moment is that if you are not using a 'smart'
> client like the solrj CloudSolrServer, and you directly query the node
> that never recovered - I *think* it will happily respond to queries.
>
> We should probably look into that and file a JIRA issue if it's a
> problem. Perhaps we need some defensive checks so that a node that did
> not recover properly won't serve queries if that is not already
> enforced.
>
> Part of the issue is that we continue serving queries even if we are
> not connected to zookeeper. Perhaps we need to be looking at our last
> publish state as the defensive check and only serve queries if that
> was active.
>
> Now it's another issue if the node said it was active and it hadn't
> fully recovered. That we would want to investigate.
>
> - Mark
>
> On Tue, Dec 11, 2012 at 6:23 AM, Roland Villemoes <r...@alpha-solutions.dk>
> wrote:
>> Hi There,
>>
>> We have a 2-instance/1-shard setup running Solr 4
>> (4.0.0.2012.10.06.03.04.33).
>>
>> Each instance runs on its own server, with a separate ZooKeeper running
>> on one of these machines.
>>
>> We had a bad experience that originated from a network error:
>>
>> From the log:
>>
>> Caused by: java.io.IOException: Connection reset by peer
>>   at sun.nio.ch.FileDispatcher.read0(Native Method)
>>   at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>   at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
>>   at sun.nio.ch.IOUtil.read(IOUtil.java:218)
>>   at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
>>   at org.mortbay.io.nio.ChannelEndPoint.fill(ChannelEndPoint.java:131)
>>
>> Solr tries to do a commit, and then we see this in the log, as something
>> is wrong and it tries to recover:
>>
>> Dec 8, 2012 2:36:33 AM org.apache.solr.common.cloud.ZkStateReader$2 process
>> INFO: A cluster state change has occurred - updating...
>> Dec 8, 2012 2:36:34 AM org.apache.solr.cloud.RecoveryStrategy run
>> INFO: Starting recovery process. core=default1_English
>> recoveringAfterStartup=true
>>
>> It seems to have problems getting in contact with ZooKeeper due to the
>> network problems.
>>
>> INFO: Unable to reconnect to ZooKeeper service, session 0x13b6666d0ed00d4
>> has expired, closing socket connection
>>
>> The problem is: the Solr instance established itself with around 30% of
>> the documents that were in the other index. I would have liked it to
>> withdraw from the cluster and leave all handling of queries to the other
>> server.
>>
>> When the network worked again, the Solr instances still stayed like this,
>> with the full index on one server and 30% on the other. This resulted in
>> "funny" results from queries: sometimes correct, sometimes not.
>>
>> Best regards,
>>
>> Roland Villemoes
>> Tel: (+45) 22 69 59 62
>> E-Mail: mailto:r...@alpha-solutions.dk
>>
>> Alpha Solutions A/S
>> Sølvgade 10, 1.sal, 1307 København K
>> Tel: (+45) 70 20 65 38
>> Web: http://www.alpha-solutions.dk