P.S.

If you have too much trouble with ZooKeeper session timeouts, you
may need to raise the client timeout from 15 seconds to something
higher.
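
For reference, the stock Solr 4 example solr.xml sets this on the
<cores> element as zkClientTimeout="${zkClientTimeout:15000}". Assuming
your solr.xml still carries that placeholder, starting the node with
-DzkClientTimeout=30000 should be enough. The snippet below is only a
sketch of how a property-with-default like that resolves. The class and
method names are made up for illustration and are not Solr code.

```java
public class ZkTimeoutSketch {
    // Default from the mail above: 15 seconds, expressed in milliseconds.
    static final int DEFAULT_TIMEOUT_MS = 15000;

    // Hypothetical helper (not a Solr API): reads the "zkClientTimeout" system
    // property and falls back to the default when the property is unset,
    // not a number, or not positive.
    public static int resolveZkClientTimeoutMs() {
        int value = Integer.getInteger("zkClientTimeout", DEFAULT_TIMEOUT_MS);
        return value > 0 ? value : DEFAULT_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        System.out.println("zkClientTimeout = " + resolveZkClientTimeoutMs() + " ms");
    }
}
```

Run with, for example, java -DzkClientTimeout=30000 ZkTimeoutSketch to see
the override take effect.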

- Mark

On Tue, Dec 11, 2012 at 11:38 AM, Mark Miller <markrmil...@gmail.com> wrote:
> Hey Roland!
>
> When you look at the Admin UI in the Cloud tab, do you see both
> instances as active?
>
> Also, how are you querying the nodes?
>
> One tricky thing at the moment is that if you are not using a 'smart'
> client like the solrj CloudSolrServer, and you directly query the node
> that never recovered - I *think* it will happily respond to queries.
>
> We should probably look into that and file a JIRA issue if it's a
> problem. Perhaps we need some defensive checks so that a node that did
> not recover properly won't serve queries if that is not already
> enforced.
>
> Part of the issue is that we continue serving queries even if we are
> not connected to zookeeper. Perhaps we need to be looking at our last
> publish state as the defensive check and only serve queries if that
> was active.
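
A rough sketch of the defensive check described above, assuming the node
keeps track of the last state it published. Everything here (class name,
enum values, method names) is hypothetical and only illustrates the
idea; it is not how Solr implements this.

```java
import java.util.concurrent.atomic.AtomicReference;

public class QueryGateSketch {
    // Hypothetical states, loosely modelled on what a node publishes to the cluster.
    public enum PublishedState { ACTIVE, RECOVERING, RECOVERY_FAILED, DOWN }

    // Last state this node published. It starts as DOWN until the node says otherwise.
    private final AtomicReference<PublishedState> lastPublished =
            new AtomicReference<>(PublishedState.DOWN);

    // Record the state whenever the node publishes a new one.
    public void publish(PublishedState state) {
        lastPublished.set(state);
    }

    // The defensive check: serve queries only if the last published state was ACTIVE.
    // It relies on locally remembered state, so it still works while the
    // ZooKeeper connection is down.
    public boolean mayServeQueries() {
        return lastPublished.get() == PublishedState.ACTIVE;
    }

    public static void main(String[] args) {
        QueryGateSketch gate = new QueryGateSketch();
        gate.publish(PublishedState.RECOVERING);
        System.out.println("recovering, may serve: " + gate.mayServeQueries());
        gate.publish(PublishedState.ACTIVE);
        System.out.println("active, may serve: " + gate.mayServeQueries());
    }
}
```

With a gate like this, a node whose recovery never finished would refuse
direct queries instead of answering from a partial index.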
>
> Now it's another issue if the node said it was active and it hadn't
> fully recovered. That we would want to investigate.
>
> - Mark
>
> On Tue, Dec 11, 2012 at 6:23 AM, Roland Villemoes <r...@alpha-solutions.dk> 
> wrote:
>> Hi There,
>>
>>
>>
>> We have a 2-instance/1-shard setup running Solr 4
>> (4.0.0.2012.10.06.03.04.33).
>>
>> Each instance runs on its own server, with a separate ZooKeeper running on one
>> of these machines.
>>
>>
>>
>> We had a bad experience that originated from a network error:
>>
>>
>>
>> From the log:
>>
>> Caused by: java.io.IOException: Connection reset by peer
>>     at sun.nio.ch.FileDispatcher.read0(Native Method)
>>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
>>     at sun.nio.ch.IOUtil.read(IOUtil.java:218)
>>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
>>     at org.mortbay.io.nio.ChannelEndPoint.fill(ChannelEndPoint.java:131)
>>
>>
>>
>>
>>
>> Solr then tries to do a commit, and the log shows that something is
>> wrong and that it tries to recover:
>>
>>
>>
>> Dec 8, 2012 2:36:33 AM org.apache.solr.common.cloud.ZkStateReader$2 process
>> INFO: A cluster state change has occurred - updating...
>> Dec 8, 2012 2:36:34 AM org.apache.solr.cloud.RecoveryStrategy run
>> INFO: Starting recovery process.  core=default1_English recoveringAfterStartup=true
>>
>>
>>
>> It seems to have problems contacting ZooKeeper because of the
>> network problems.
>>
>>
>>
>> INFO: Unable to reconnect to ZooKeeper service, session 0x13b6666d0ed00d4 has expired, closing socket connection
>>
>>
>>
>> The problem is that this Solr instance established itself with only around
>> 30% of the documents that were in the other index. I would have liked it to
>> withdraw from the cluster and leave all query handling to the other server.
>>
>> When the network worked again, the Solr instances stayed like this, with
>> the full index on one server and 30% on the other. This resulted in “funny”
>> query results: sometimes correct, sometimes not.
>>
>>
>>
>>
>>
>> Best regards
>>
>>
>>
>> Roland Villemoes
>>
>> Tel: (+45) 22 69 59 62
>>
>> E-Mail: mailto:r...@alpha-solutions.dk
>>
>>
>>
>> Alpha Solutions A/S
>>
>> Sølvgade 10, 1.sal, 1307 København K
>>
>> Tel: (+45) 70 20 65 38
>>
>> Web: http://www.alpha-solutions.dk
>>
>>
>>
