RE: external zookeeper with SolrCloud

2013-08-16 Thread Joshi, Shital
Is there a way to find out whether we have a zookeeper quorum? We can ping each 
individual zookeeper and see if it is running, but it would be nice to 
ping/query one URL and check whether we have a quorum. 




Re: external zookeeper with SolrCloud

2013-08-16 Thread Shawn Heisey

On 8/16/2013 11:58 AM, Joshi, Shital wrote:

Is there a way to find if We have a zookeeper quorum? We can ping individual 
zookeeper and see if it is running, but it would be nice to ping/query one URL 
and check if we have a quorum.


This is a really good question, to which I do not have an answer.  If 
your client code is Java, you could probably get this information out of 
CloudSolrServer, with something like this:


server.getZkStateReader().getZkClient().getSolrZooKeeper().getState();

If the state is CONNECTED everything's probably fine.

If anyone who's dealt with Zookeeper happens to know whether this would 
work, I'd appreciate knowing.  For Solr, it is probably a good idea to 
expose something via an admin handler with the current zookeeper quorum 
state.


Thanks,
Shawn



Re: external zookeeper with SolrCloud

2013-08-16 Thread Walter Underwood
You might be able to get info from the Zookeeper four letter words.

http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_zkCommands

Here is a command to get the status for one of our Zookeeper hosts:

$ echo stat | nc zk-web02.test3.cloud.cheggnet.com 2181

wunder




RE: external zookeeper with SolrCloud

2013-08-16 Thread Boogie Shafer
Good stuff.

Here is a more recent version of the same page; a few new commands have been 
added in recent releases of zookeeper:

http://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html#sc_zkCommands






Re: external zookeeper with SolrCloud

2013-08-16 Thread Shawn Heisey

On 8/16/2013 11:58 AM, Joshi, Shital wrote:

Is there a way to find if We have a zookeeper quorum? We can ping individual 
zookeeper and see if it is running, but it would be nice to ping/query one URL 
and check if we have a quorum.


I filed an issue on this:

https://issues.apache.org/jira/browse/SOLR-5169

Thanks,
Shawn



RE: external zookeeper with SolrCloud

2013-08-16 Thread Boogie Shafer
The mntr command can give that info if you hit the leader of the zk quorum.

E.g. in the example for that command on the link, you can see that it's a 
5-member zk ensemble (zk_followers 4) and that all followers are synced 
(zk_synced_followers 4).

You would need to find the zk leader before you could get that data. The srvr 
command can tell you the status of a given zk (leader or follower).


$ echo mntr | nc localhost 2185

zk_version                      3.4.0
zk_avg_latency                  0
zk_max_latency                  0
zk_min_latency                  0
zk_packets_received             70
zk_packets_sent                 69
zk_outstanding_requests         0
zk_server_state                 leader
zk_znode_count                  4
zk_watch_count                  0
zk_ephemerals_count             0
zk_approximate_data_size        27
zk_followers                    4     - only exposed by the Leader
zk_synced_followers             4     - only exposed by the Leader
zk_pending_syncs                0     - only exposed by the Leader
zk_open_file_descriptor_count   23    - only available on Unix platforms
zk_max_file_descriptor_count    1024  - only available on Unix platforms
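
The leader-side check described above could be scripted roughly as follows. 
This is only a sketch: the function names mntr_value and 
leader_has_synced_majority are made up for illustration, the ensemble size is 
passed in by hand, and it assumes mntr prints one whitespace-separated 
key/value pair per line, as in the output shown here.

```shell
# mntr_value KEY: read `mntr` output on stdin and print the first value
# field of the line whose key is KEY (prints nothing if the key is absent).
mntr_value() {
  awk -v key="$1" '$1 == key { print $2 }'
}

# leader_has_synced_majority TOTAL: read the leader's `mntr` output on stdin.
# Succeeds when the leader plus its synced followers form a majority of the
# TOTAL ensemble members. Fails when zk_synced_followers is missing, which
# is the case when the output came from a follower.
leader_has_synced_majority() {
  zk_total=$1
  zk_synced=$(mntr_value zk_synced_followers)
  [ -n "$zk_synced" ] && [ $(( zk_synced + 1 )) -ge $(( zk_total / 2 + 1 )) ]
}
```

Against the example above this might be run as 
`echo mntr | nc localhost 2185 | leader_has_synced_majority 5`; the exit 
status is the answer, so it can be used directly in an `if` or with `&&`.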


--- some examples of using srvr command

echo srvr | nc fookeeper_follower 2185
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
Latency min/avg/max: 0/0/45
Received: 1132673
Sent: 1132724
Connections: 4
Outstanding: 0
Zxid: 0x600172e5a
Mode: follower
Node count: 218

echo srvr | nc fookeeper_leader 2181
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
Latency min/avg/max: 0/0/880
Received: 21976696
Sent: 21988742
Connections: 17
Outstanding: 0
Zxid: 0x600172e66
Mode: leader
Node count: 218
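
To pull just the role out of srvr output like the examples above, something 
along these lines should work. It is a sketch, not a tested tool: zk_mode and 
zk_in_quorum are hypothetical names, and it assumes that a server which is 
not part of a quorum does not report "Mode: leader" or "Mode: follower".

```shell
# zk_mode: read `srvr` output on stdin and print the value of its "Mode:" line.
zk_mode() {
  awk -F': ' '$1 == "Mode" { print $2 }'
}

# zk_in_quorum: read `srvr` output on stdin and succeed only when the server
# reports itself as leader or follower. Any other reply (no Mode line, an
# error message, an empty response) counts as "not in a quorum".
zk_in_quorum() {
  case "$(zk_mode)" in
    leader|follower) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `echo srvr | nc fookeeper_follower 2185 | zk_mode` would print 
the role for the follower shown above.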






RE: external zookeeper with SolrCloud

2013-08-16 Thread Boogie Shafer
Sorry, it looks like you can get the follower/leader status for each node using 
just the mntr command; note the zk_server_state values.



echo mntr | nc fookeeper_follower 2181

zk_version                      3.4.5-1392090, built on 09/30/2012 17:52 GMT
zk_avg_latency                  0
zk_max_latency                  45
zk_min_latency                  0
zk_packets_received             1132824
zk_packets_sent                 1132875
zk_num_alive_connections        4
zk_outstanding_requests         0
zk_server_state                 follower
zk_znode_count                  218
zk_watch_count                  12
zk_ephemerals_count             85
zk_approximate_data_size        546670
zk_open_file_descriptor_count   35
zk_max_file_descriptor_count    4096







RE: external zookeeper with SolrCloud

2013-08-09 Thread Joshi, Shital
Thanks so much for your reply. Appreciate your help with this. 

We have 10 Solr4 nodes (5 shards with replication factor 2) and three zookeeper 
instances. When we bring up the 10 Solr4 nodes while all zookeeper instances 
are down, we see this exception in the Solr4 logs (which makes sense):

java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
862352 [main-SendThread(d136274-003.dc.gs.com:2181)] WARN  
org.apache.zookeeper.ClientCnxn  ? Session 0x0 for server null, unexpected 
error, closing socket connection and attempting reconnect

When we then bring up all zookeeper instances, the exception above stops 
appearing. We see the following messages in the log, and nothing more is 
logged after that:

INFO  - 2013-08-09 15:48:41.447; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@203727c5 
name:ZooKeeperConnection 
Watcher:zk1.test.com:2181,zk2.test.com:2181,zk3.test.com:2181 got event 
WatchedEvent state:SyncConnected type:None path:null path:null type:None
998962 [main-EventThread] INFO  org.apache.solr.common.cloud.ConnectionManager  
? Watcher org.apache.solr.common.cloud.ConnectionManager@203727c5 
name:ZooKeeperConnection 
Watcher:zk1.test.com:2181,zk2.test.com:2181,qa-zk3.test.com:2181 got event 
WatchedEvent state:SyncConnected type:None path:null path:null type:None
INFO  - 2013-08-09 15:48:41.528; 
org.apache.solr.common.cloud.ConnectionManager; Client-ZooKeeper status change 
trigger but we are already closed
999043 [main-EventThread] INFO  org.apache.solr.common.cloud.ConnectionManager  
? Client-ZooKeeper status change trigger but we are already closed

At this point, we cannot see the admin page of any solr node or query it unless 
we restart the entire cloud; after that everything is great. So we must put 
checks in place to make sure that N/2 + 1 zookeeper instances are up before we 
bring up any solr nodes.
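
As a rough illustration of the N/2 + 1 rule (integer division), the 
arithmetic can be written as two small shell functions. The names quorum_size 
and tolerated_failures are invented for this sketch and are not part of Solr 
or ZooKeeper.

```shell
# quorum_size N: smallest majority of an N-member ensemble (N/2 + 1, using
# integer division).
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many members can be down while a majority of
# the N members is still up.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Print the numbers for the ensemble sizes mentioned in this thread.
for n in 3 5 6; do
  echo "ensemble=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

For 3, 5 and 6 instances this gives quorums of 2, 3 and 4. A 6-member 
ensemble tolerates the same two failures as a 5-member one, which is the 
usual argument for running an odd number of instances.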







Re: external zookeeper with SolrCloud

2013-08-09 Thread Shawn Heisey
On 8/9/2013 9:02 AM, Joshi, Shital wrote:
 At this point, we cannot see admin page or query of any solr nodes unless we 
 restart entire cloud and after that everything is great. So we must put 
 checks to make sure that N/2 + 1 zookeeper instances are up before we can 
 bring up any solr nodes.  

I am not really surprised to learn that SolrCloud doesn't start
correctly if you don't have zookeeper running when starting Solr.

I think it's definitely a bug that Solr won't start working correctly
when you start zookeeper.  I have filed an issue:

https://issues.apache.org/jira/browse/SOLR-5129

If you repeat your test while you have one zookeeper node up (but not
N/2 + 1 for quorum), does the same thing happen, or will it work?

Thanks,
Shawn



RE: external zookeeper with SolrCloud

2013-08-09 Thread Joshi, Shital
The same thing happens. It only works with N/2 + 1 zookeeper instances up.  




Re: external zookeeper with SolrCloud

2013-08-09 Thread Shawn Heisey

On 8/9/2013 11:15 AM, Joshi, Shital wrote:

Same thing happen. It only works with N/2 + 1 zookeeper instances up.


Got it.

An update came in on the issue that I filed.  This behavior that you're 
seeing is currently by design.


Because this is expected behavior, I've changed the issue to improvement 
instead of a bug.  I don't know if it is something that will happen, but 
the request is in.


The workaround is fairly simple -- don't start or restart Solr nodes if 
you don't have zookeeper quorum.


Thank you for your diligent testing!

Shawn
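
One way to automate that workaround is a small gate in the Solr start script. 
The sketch below is only an assumption about how such a gate might look: 
count_voting_members and safe_to_start_solr are hypothetical names, and the 
input is expected to be one ZooKeeper mode per line (for instance collected 
from the "Mode:" line of each server's srvr output).

```shell
# count_voting_members: read one ZooKeeper mode per line on stdin and print
# how many lines are exactly "leader" or "follower". Blank lines and error
# text from unreachable servers are not counted.
count_voting_members() {
  awk '$0 == "leader" || $0 == "follower" { n++ } END { print n + 0 }'
}

# safe_to_start_solr TOTAL: read modes on stdin and succeed only when at
# least TOTAL/2 + 1 of the TOTAL ensemble members reported a voting role.
safe_to_start_solr() {
  zk_up=$(count_voting_members)
  [ "$zk_up" -ge $(( $1 / 2 + 1 )) ]
}
```

A start script could feed it the modes of every configured zookeeper host 
and start Solr only when safe_to_start_solr succeeds, retrying after a short 
sleep otherwise.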



RE: external zookeeper with SolrCloud

2013-08-08 Thread Joshi, Shital
We did quite a bit of testing and we think bug 
https://issues.apache.org/jira/browse/SOLR-4899 is not resolved in Solr 4.4 

-Original Message-
From: Joshi, Shital [Tech] 
Sent: Wednesday, August 07, 2013 2:48 PM
To: 'solr-user@lucene.apache.org'
Subject: RE: external zookeeper with SolrCloud

I started looking into what I might have missed while upgrading to Solr 4.4, 
and I noticed that solr.xml in Solr 4.4 has this:

<solr>

  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
  </solrcloud>

  <shardHandlerFactory name="shardHandlerFactory"
    class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:0}</int>
    <int name="connTimeout">${connTimeout:0}</int>
  </shardHandlerFactory>

</solr>


While our solr.xml has this:
<solr persistent="true">

  <cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}"
         hostPort="${jetty.port:8983}" hostContext="${hostContext:solr}"
         zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" shard="${shard:}"
          dataDir="${solr.data.dir}" />
  </cores>

</solr>

Do you think not having shardHandlerFactory is causing this bug to appear on 
our end? 




Re: external zookeeper with SolrCloud

2013-08-08 Thread Shawn Heisey

On 8/8/2013 3:03 PM, Joshi, Shital wrote:

We did quite a bit of testing and we think bug 
https://issues.apache.org/jira/browse/SOLR-4899 is not resolved in Solr 4.4


The commit for SOLR-4899 was made to branch_4x on June 10th. 
lucene_solr_4_4 code branch was created from branch_4x on July 8th.


The change is definitely present in 4.4.  It's an extremely simple 
one-line change - instead of waiting for DEFAULT_CLIENT_CONNECT_TIMEOUT, 
a zookeeper reconnect will wait for Long.MAX_VALUE milliseconds.


http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/solr/solrj/src/java/org/apache/solr/common/cloud/ConnectionManager.java?r1=1491451&r2=1491450&pathrev=1491451

Either you are having a problem that's unrelated to the change committed 
by SOLR-4899 or there's something strange going on.


Can you describe exactly what you are trying, what you are seeing, and 
what you expect to see?


Thanks,
Shawn



Re: external zookeeper with SolrCloud

2013-08-07 Thread Erick Erickson
Hmmm, shouldn't be happening. How sure are you that the upgrade to 4.4
was carried out on all machines?

Erick


On Tue, Aug 6, 2013 at 5:23 PM, Joshi, Shital shital.jo...@gs.com wrote:

 Machines are definitely up. Solr4 node and zookeeper instance share the
 machine. We're using -DzkHost=zk1,zk2,zk3,zk4,zk5 to let solr nodes know
 about the zk instances.


 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Tuesday, August 06, 2013 5:03 PM
 To: solr-user@lucene.apache.org
 Subject: Re: external zookeeper with SolrCloud

 First off, even 6 ZK instances are overkill, vast overkill. 3 should be
 more than enough.

 That aside, however, how are you letting your Solr nodes know about the zk
 machines? Is it possible you've pointed some of your Solr nodes at specific
 ZK machines that aren't up when you have this problem? I.e.
 -zkHost=zk1,zk2,zk3

 Best
 Erick


 On Tue, Aug 6, 2013 at 4:56 PM, Joshi, Shital shital.jo...@gs.com wrote:

  Hi,
 
  We have SolrCloud (4.4.0) cluster (5 shards and 2 replicas) on 10 boxes.
  We have 6 zookeeper instances. We are planning to change to odd number of
  zookeeper instances.
 
  With Solr 4.3.0, if all zookeeper instances are not up, the solr4 node
  never connects to zookeeper (can't see the admin page) until all zookeeper
  instances are up and we restart all solr nodes. It was suggested that it
  could be due this bug https://issues.apache.org/jira/browse/SOLR-4899 and
  this bug is solved in Solr 4.4
 
  We upgraded to Solr 4.4 but still see this issue. We brought up 4 out of 6
  zookeeper instances and then brought up all ten Solr4 nodes. We kept seeing
  this exception in Solr logs:
 
  751395 [main-SendThread] WARN  org.apache.zookeeper.ClientCnxn  ? Session
  0x0 for server null, unexpected error, closing socket connection and
  attempting reconnect java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
  at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
 
  And after a while saw this exception.
 
  INFO  - 2013-08-05 22:24:07.582;
  org.apache.solr.common.cloud.ConnectionManager; Watcher
  org.apache.solr.common.cloud.ConnectionManager@5140709
  name:ZooKeeperConnection Watcher:
  qa-zk1.services.gs.com,qa-zk2.services.gs.com,qa-zk3.services.gs.com,
  qa-zk4.services.gs.com,qa-zk5.services.gs.com,qa-zk6.services.gs.com got
  event WatchedEvent state:SyncConnected type:None path:null path:null
  type:None
  INFO  - 2013-08-05 22:24:07.662;
  org.apache.solr.common.cloud.ConnectionManager; Client-ZooKeeper status
  change trigger but we are already closed
  754311 [main-EventThread] INFO
  org.apache.solr.common.cloud.ConnectionManager  ? Client-ZooKeeper status
  change trigger but we are already closed
 
  We brought up all zookeeper instances but the cloud never came up until
  all solr nodes were restarted. Do we need to change any settings? After
  weekend reboot, all zookeeper instances come up one by one. While zookeeper
  instances are coming up solr nodes are also getting started. With this
  issue, we have to put checks to make sure all zookeeper instances are up
  before we bring up any solr node.
 
  Thanks!!
 
  -Original Message-
  From: Mark Miller [mailto:markrmil...@gmail.com]
  Sent: Tuesday, June 11, 2013 10:42 AM
  To: solr-user@lucene.apache.org
  Subject: Re: external zookeeper with SolrCloud
 
  On Jun 11, 2013, at 10:15 AM, Joshi, Shital shital.jo...@gs.com wrote:
 
   Thanks Mark.
  
   Looks like this bug is fixed in Solr 4.4. Do you have any date for
   official release of 4.4?
 
  Looks like it might come out in a couple of weeks.
 
   Is there any instruction available on how to build Solr 4.4 from SVN
   repository?
 
  It's java, so it's pretty easy - you might find some help here:
  http://wiki.apache.org/solr/HowToContribute
 
  - Mark
 
   -Original Message-
   From: Mark Miller [mailto:markrmil...@gmail.com]
   Sent: Monday, June 10, 2013 8:05 PM
   To: solr-user@lucene.apache.org
   Subject: Re: external zookeeper with SolrCloud
  
   This might be https://issues.apache.org/jira/browse/SOLR-4899
  
   - Mark
  
   On Jun 10, 2013, at 5:59 PM, Joshi, Shital shital.jo...@gs.com wrote:
  
   Hi,
  
   We're setting up 5 shard SolrCloud with external zoo keeper. When we
   bring up Solr nodes while the zookeeper instance is not up and running,
   we see this error in Solr logs.
  
   java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java

Re: external zookeeper with SolrCloud

2013-08-07 Thread Raymond Wiker
You said earlier that you had 6 zookeeper instances, but the zkHost param
only shows 5 instances... is that correct?



RE: external zookeeper with SolrCloud

2013-08-07 Thread Joshi, Shital
I went through Admin page - Dashboard of all 10 nodes and verified that each 
one is using solr-spec 4.4.0. 

solr-spec  4.4.0
solr-impl 4.4.0 1504776 - sarowe - 2013-07-19 02:58:35
lucene-spec 4.4.0
lucene-impl 4.4.0 1504776 - sarowe - 2013-07-19 02:53:42

Is there anything else I can check to verify that we upgraded to solr 4.4.0?


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, August 07, 2013 8:10 AM
To: solr-user@lucene.apache.org
Subject: Re: external zookeeper with SolrCloud

Hmmm, shouldn't be happening. How sure are you that the upgrade to 4.4
was carried out on all machines?

Erick



RE: external zookeeper with SolrCloud

2013-08-07 Thread Joshi, Shital
We have all 6 instances in the zkHost parameter.

-Original Message-
From: Raymond Wiker [mailto:rwi...@gmail.com] 
Sent: Wednesday, August 07, 2013 8:29 AM
To: solr-user@lucene.apache.org
Subject: Re: external zookeeper with SolrCloud

You said earlier that you had 6 zookeeper instances, but the zkHost param
only shows 5 instances... is that correct?



RE: external zookeeper with SolrCloud

2013-08-07 Thread Joshi, Shital
I started looking into what I might have missed while upgrading to Solr 4.4,
and I noticed that the solr.xml in Solr 4.4 has this:

<solr>

  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
  </solrcloud>

  <shardHandlerFactory name="shardHandlerFactory"
                       class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:0}</int>
    <int name="connTimeout">${connTimeout:0}</int>
  </shardHandlerFactory>

</solr>


While our solr.xml has this:
<solr persistent="true">

  <cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}"
         hostPort="${jetty.port:8983}" hostContext="${hostContext:solr}"
         zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" shard="${shard:}"
          dataDir="${solr.data.dir}" />
  </cores>

</solr>

Do you think not having shardHandlerFactory is causing this bug to appear on 
our end? 



-Original Message-
From: Raymond Wiker [mailto:rwi...@gmail.com] 
Sent: Wednesday, August 07, 2013 8:29 AM
To: solr-user@lucene.apache.org
Subject: Re: external zookeeper with SolrCloud

You said earlier that you had 6 zookeeper instances, but the zkHost param
only shows 5 instances... is that correct?



RE: external zookeeper with SolrCloud

2013-08-06 Thread Joshi, Shital
Hi,

We have SolrCloud (4.4.0) cluster (5 shards and 2 replicas) on 10 boxes. We 
have 6 zookeeper instances. We are planning to change to odd number of 
zookeeper instances. 

With Solr 4.3.0, unless all zookeeper instances are up, the solr4 node never
connects to zookeeper (can't see the admin page) until all zookeeper instances
are up and we restart all solr nodes. It was suggested that it could be due to
this bug https://issues.apache.org/jira/browse/SOLR-4899 and that this bug is
fixed in Solr 4.4.

We upgraded to Solr 4.4 but still see this issue. We brought up 4 out of 6 
zookeeper instances and then brought up all ten Solr4 nodes. We kept seeing 
this exception in Solr logs:

751395 [main-SendThread] WARN  org.apache.zookeeper.ClientCnxn  ? Session 0x0 
for server null, unexpected error, closing socket connection and attempting 
reconnect java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

And after a while saw this exception. 

INFO  - 2013-08-05 22:24:07.582; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@5140709 name:ZooKeeperConnection 
Watcher:qa-zk1.services.gs.com,qa-zk2.services.gs.com,qa-zk3.services.gs.com,qa-zk4.services.gs.com,qa-zk5.services.gs.com,qa-zk6.services.gs.com
 got event WatchedEvent state:SyncConnected type:None path:null path:null 
type:None
INFO  - 2013-08-05 22:24:07.662; 
org.apache.solr.common.cloud.ConnectionManager; Client-ZooKeeper status change 
trigger but we are already closed
754311 [main-EventThread] INFO  org.apache.solr.common.cloud.ConnectionManager  
? Client-ZooKeeper status change trigger but we are already closed

We brought up all zookeeper instances but the cloud never came up until all 
solr nodes were restarted. Do we need to change any settings? After weekend 
reboot, all zookeeper instances come up one by one. While zookeeper instances 
are coming up solr nodes are also getting started. With this issue, we have to 
put checks to make sure all zookeeper instances are up before we bring up any 
solr node. 

Thanks!!
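
For anyone who wants to script the "check zookeeper before starting solr"
step described above, here is a rough sketch. It is not from this thread: the
class and method names (ZkQuorumCheck, fourLetterWord, isServing, hasQuorum)
are made up for illustration. It assumes the ZooKeeper four-letter command
"srvr" is enabled on every server (newer ZooKeeper releases can restrict these
commands), that each entry is given as host:port, and it deliberately does not
count observers. It only asks each server whether it is serving and reports
whether a strict majority say yes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class ZkQuorumCheck {

    /** Sends a four-letter command (for example "srvr") and returns the raw reply. */
    public static String fourLetterWord(String host, int port, String cmd, int timeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // The server closes the connection after replying, so read until EOF.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    /** True if a "srvr" reply contains a Mode line naming a serving, voting role. */
    public static boolean isServing(String srvrReply) {
        for (String line : srvrReply.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Mode:")) {
                String mode = trimmed.substring("Mode:".length()).trim();
                return mode.equals("leader") || mode.equals("follower")
                        || mode.equals("standalone");
            }
        }
        return false; // no Mode line at all: treat the server as not serving
    }

    /** True if a strict majority of the listed host:port entries report that they are serving. */
    public static boolean hasQuorum(List<String> hostPorts, int timeoutMs) {
        int serving = 0;
        for (String hostPort : hostPorts) {
            String[] parts = hostPort.split(":");
            try {
                String reply = fourLetterWord(parts[0], Integer.parseInt(parts[1]), "srvr", timeoutMs);
                if (isServing(reply)) {
                    serving++;
                }
            } catch (IOException e) {
                // Connection refused or timed out: count this server as down.
            }
        }
        return serving >= hostPorts.size() / 2 + 1;
    }
}
```

A startup script could loop on hasQuorum(...) with a short sleep and only
launch the Solr nodes once it returns true.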




Re: external zookeeper with SolrCloud

2013-08-06 Thread Erick Erickson
First off, even 6 ZK instances are overkill, vast overkill. 3 should be
more than enough.

That aside, however, how are you letting your Solr nodes know about the zk
machines?
Is it possible you've pointed some of your Solr nodes at specific ZK
machines
that aren't up when you have this problem? I.e. -zkHost=zk1,zk2,zk3

Best
Erick
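
To put numbers on the ensemble-size point, here is a small illustrative
sketch (the class name QuorumMath is made up). It assumes the usual
ZooKeeper rule that a strict majority of the configured servers must be up,
and shows why an even-sized ensemble buys no extra fault tolerance over the
next smaller odd size.

```java
final class QuorumMath {

    /** Smallest number of servers that forms a strict majority of n. */
    public static int majority(int n) {
        return n / 2 + 1;
    }

    /** How many servers may be down while a majority is still up. */
    public static int tolerated(int n) {
        return n - majority(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5, 6, 7}) {
            System.out.println(n + " servers: majority " + majority(n)
                    + ", tolerates " + tolerated(n) + " down");
        }
    }
}
```

With 6 servers the majority is 4 and 2 may be down, which is the same
tolerance as with 5 servers.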



RE: external zookeeper with SolrCloud

2013-08-06 Thread Joshi, Shital
Machines are definitely up. Solr4 node and zookeeper instance share the 
machine. We're using -DzkHost=zk1,zk2,zk3,zk4,zk5 to let solr nodes know about 
the zk instances. 
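
As a sanity check on what a zkHost value actually contains, a sketch like the
following can split it into individual servers so the count can be compared
with the number of zookeeper instances that are really running. The class name
ZkHostList is made up for illustration. It assumes the usual ZooKeeper connect
string form (comma-separated host or host:port entries, an optional chroot
path at the end, client port 2181 when none is given) and does not handle IPv6
literals.

```java
import java.util.ArrayList;
import java.util.List;

final class ZkHostList {

    static final int DEFAULT_CLIENT_PORT = 2181;

    /** Splits a zkHost value into host:port entries, dropping any trailing chroot. */
    public static List<String> parse(String zkHost) {
        // A chroot such as "/solr" may follow the last host; ignore it here.
        int slash = zkHost.indexOf('/');
        String hosts = slash >= 0 ? zkHost.substring(0, slash) : zkHost;

        List<String> result = new ArrayList<String>();
        for (String entry : hosts.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            result.add(trimmed.contains(":") ? trimmed : trimmed + ":" + DEFAULT_CLIENT_PORT);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> servers = parse("zk1,zk2,zk3,zk4,zk5");
        System.out.println(servers.size() + " servers listed: " + servers);
    }
}
```

For the value quoted above this lists 5 servers, which makes a mismatch with
a 6-instance ensemble easy to spot.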


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Tuesday, August 06, 2013 5:03 PM
To: solr-user@lucene.apache.org
Subject: Re: external zookeeper with SolrCloud

First off, even 6 ZK instances are overkill, vast overkill. 3 should be
more than enough.

That aside, however, how are you letting your Solr nodes know about the zk
machines?
Is it possible you've pointed some of your Solr nodes at specific ZK
machines
that aren't up when you have this problem? I.e. -zkHost=zk1,zk2,zk3

Best
Erick



RE: external zookeeper with SolrCloud

2013-06-11 Thread Joshi, Shital
Thanks Mark.

Looks like this bug is fixed in Solr 4.4. Do you have any date for official 
release of 4.4? Is there any instruction available on how to build Solr 4.4 
from SVN repository?

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Monday, June 10, 2013 8:05 PM
To: solr-user@lucene.apache.org
Subject: Re: external zookeeper with SolrCloud

This might be https://issues.apache.org/jira/browse/SOLR-4899

- Mark




Re: external zookeeper with SolrCloud

2013-06-11 Thread Mark Miller

On Jun 11, 2013, at 10:15 AM, Joshi, Shital shital.jo...@gs.com wrote:

 Thanks Mark.
 
 Looks like this bug is fixed in Solr 4.4. Do you have any date for official 
 release of 4.4?

Looks like it might come out in a couple of weeks.

 Is there any instruction available on how to build Solr 4.4 from SVN 
 repository?

It's java, so it's pretty easy - you might find some help here: 
http://wiki.apache.org/solr/HowToContribute

- Mark

 
 -Original Message-
 From: Mark Miller [mailto:markrmil...@gmail.com] 
 Sent: Monday, June 10, 2013 8:05 PM
 To: solr-user@lucene.apache.org
 Subject: Re: external zookeeper with SolrCloud
 
 This might be https://issues.apache.org/jira/browse/SOLR-4899
 
 - Mark
 
 On Jun 10, 2013, at 5:59 PM, Joshi, Shital shital.jo...@gs.com wrote:
 
 Hi,
 
 
 
 We're setting up a 5-shard SolrCloud with an external ZooKeeper. When we
 bring up Solr nodes while the ZooKeeper instance is not up and running, we
 see this error in the Solr logs.
 
 
 
 java.net.ConnectException: Connection refused
     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
 
 
 
 INFO  - 2013-06-10 15:03:35.422; 
 org.apache.solr.common.cloud.ConnectionManager; Watcher 592147 
 [main-EventThread] INFO  org.apache.solr.common.cloud.ConnectionManager  ? 
 Watcher org.apache.solr.common.cloud.ConnectionManager@530d0eae 
 name:ZooKeeperConnection Watcher: . got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 
 
 
 INFO  - 2013-06-10 15:03:35.423; 
 org.apache.solr.common.cloud.ConnectionManager; Client-ZooKeeper status 
 change trigger but we are already closed
 
 592148 [main-EventThread] INFO  
 org.apache.solr.common.cloud.ConnectionManager  ? Client-ZooKeeper status 
 change trigger but we are already closed
 
 
 
 After we bring up the ZooKeeper instance, the node never connects to
 ZooKeeper, and we can't see the Solr admin page until we restart the node.
 
 
 
 Does the ZooKeeper instance have to be up when we bring up a Solr node?
 That's not what the documentation says, though.
 
 
 
 Thanks.
 



external zookeeper with SolrCloud

2013-06-10 Thread Joshi, Shital
Hi,



We're setting up a 5-shard SolrCloud with an external ZooKeeper. When we bring
up Solr nodes while the ZooKeeper instance is not up and running, we see this
error in the Solr logs.



java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)



INFO  - 2013-06-10 15:03:35.422; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 592147 
[main-EventThread] INFO  org.apache.solr.common.cloud.ConnectionManager  ? 
Watcher org.apache.solr.common.cloud.ConnectionManager@530d0eae 
name:ZooKeeperConnection Watcher: . got event WatchedEvent 
state:SyncConnected type:None path:null path:null type:None



INFO  - 2013-06-10 15:03:35.423; 
org.apache.solr.common.cloud.ConnectionManager; Client-ZooKeeper status change 
trigger but we are already closed

592148 [main-EventThread] INFO  org.apache.solr.common.cloud.ConnectionManager  
? Client-ZooKeeper status change trigger but we are already closed



After we bring up the ZooKeeper instance, the node never connects to ZooKeeper,
and we can't see the Solr admin page until we restart the node.



Does the ZooKeeper instance have to be up when we bring up a Solr node? That's
not what the documentation says, though.



Thanks.


Re: external zookeeper with SolrCloud

2013-06-10 Thread Mark Miller
This might be https://issues.apache.org/jira/browse/SOLR-4899

- Mark

On Jun 10, 2013, at 5:59 PM, Joshi, Shital shital.jo...@gs.com wrote:

 Hi,
 
 
 
 We're setting up a 5-shard SolrCloud with an external ZooKeeper. When we
 bring up Solr nodes while the ZooKeeper instance is not up and running, we
 see this error in the Solr logs.
 
 
 
 java.net.ConnectException: Connection refused
     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
 
 
 
 INFO  - 2013-06-10 15:03:35.422; 
 org.apache.solr.common.cloud.ConnectionManager; Watcher 592147 
 [main-EventThread] INFO  org.apache.solr.common.cloud.ConnectionManager  ? 
 Watcher org.apache.solr.common.cloud.ConnectionManager@530d0eae 
 name:ZooKeeperConnection Watcher: . got event WatchedEvent 
 state:SyncConnected type:None path:null path:null type:None
 
 
 
 INFO  - 2013-06-10 15:03:35.423; 
 org.apache.solr.common.cloud.ConnectionManager; Client-ZooKeeper status 
 change trigger but we are already closed
 
 592148 [main-EventThread] INFO  
 org.apache.solr.common.cloud.ConnectionManager  ? Client-ZooKeeper status 
 change trigger but we are already closed
 
 
 
 After we bring up the ZooKeeper instance, the node never connects to
 ZooKeeper, and we can't see the Solr admin page until we restart the node.
 
 
 
 Does the ZooKeeper instance have to be up when we bring up a Solr node?
 That's not what the documentation says, though.
 
 
 
 Thanks.
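
For anyone hitting this before a fix is available: one way to avoid starting
a Solr node against a ZooKeeper that is not up yet is to have the startup
script wait until ZooKeeper answers. Below is a minimal sketch in Java; it is
not taken from SOLR-4899 or from Solr itself. It sends ZooKeeper's "ruok"
four-letter command over a plain socket and treats an "imok" reply as "the
server is running". The class name ZkProbe, the method isZkServing, the
default host and port, and the retry and timeout values are all made up for
illustration. Note that "ruok" only says the server process is alive; it says
nothing about quorum, and newer ZooKeeper releases restrict four-letter
commands unless they are whitelisted in the server configuration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Solr or ZooKeeper.
final class ZkProbe {

    /** Sends "ruok" and returns true only if the server replies "imok". */
    static boolean isZkServing(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);

            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read up to four bytes of the reply; stop early if the stream ends.
            InputStream in = socket.getInputStream();
            byte[] reply = new byte[4];
            int read = 0;
            while (read < reply.length) {
                int n = in.read(reply, read, reply.length - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
            return "imok".equals(new String(reply, 0, read, StandardCharsets.US_ASCII));
        } catch (IOException e) {
            // Connection refused, timeout, and similar: treat as "not up".
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults; pass the real host and client port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;

        // Arbitrary illustration values: 30 attempts, 2 seconds apart.
        for (int attempt = 1; attempt <= 30; attempt++) {
            if (isZkServing(host, port, 2000)) {
                System.out.println("ZooKeeper answered imok on " + host + ":" + port);
                System.exit(0);
            }
            Thread.sleep(2000);
        }
        System.err.println("Gave up waiting for ZooKeeper on " + host + ":" + port);
        System.exit(1);
    }
}
```

Compile it with javac ZkProbe.java and run "java ZkProbe zkhost 2181" from
the script that starts Solr. It exits with status 0 once ZooKeeper answers
and 1 if it gives up, so the script can decide whether to start the node.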