Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-14 Thread Hui
Sorry All.

I've verified that the slowness is related to this message in the master log:
reason failed to ping, tried [3] times, each with maximum [30s] timeout

Thanks.
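
The numbers in that log line look like the zen fault-detection defaults (discovery.zen.fd.ping_retries of 3 and discovery.zen.fd.ping_timeout of 30s). As a rough sketch, assuming the failed pings are retried back to back and ignoring the ping interval, the time before an unreachable node is dropped can be estimated as below. The class and method names are made up for illustration and are not part of Elasticsearch.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Elasticsearch: estimates how long zen fault
// detection may take before it drops an unreachable node.
final class FaultDetectionWindow {

    // Upper bound in milliseconds: every retry waits for the full ping timeout.
    // Assumption: retries run back to back and the ping interval is ignored.
    static long worstCaseMillis(int pingRetries, long pingTimeoutSeconds) {
        if (pingRetries < 1 || pingTimeoutSeconds < 0) {
            throw new IllegalArgumentException("retries must be >= 1 and timeout >= 0");
        }
        return pingRetries * TimeUnit.SECONDS.toMillis(pingTimeoutSeconds);
    }

    public static void main(String[] args) {
        // Values from the log line: tried [3] times, each with maximum [30s] timeout.
        System.out.println(worstCaseMillis(3, 30) + " ms");
    }
}
```

With the values from the log this gives 90000 ms. The slow requests reported in this thread (41732ms to 85984ms) fall under that bound, which would be consistent with requests waiting on the unreachable node, although the thread does not confirm the exact mechanism.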



-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5f5a7739-c2b5-43cf-b040-346002d7281f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread Hui
Hi All,

After testing in another cluster, I found that the client can connect to the 
cluster, but it is very slow.

At this moment, every request that normally takes ~50ms takes 41732ms to 
85984ms, while the cluster is in Yellow health and there are no unassigned shards.

It goes back to ~50ms after the problem node rejoins.

There is no exception in the master node's log.

Thanks



Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread Hui
Hi Echin,

Since the problem node's IP is configured in the client's ES connection through 
the Java API, I guess the client will keep trying to connect to this node, which 
explains the warnings.

The client should be able to keep working with the cluster. However, in 
my case, the Java client is not reachable and times out (through the HTTP 
protocol).

I will create a test cluster with the same settings to check whether the 
client works in this condition.

Thanks.


Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread echin1999
One more thing - I notice that, functionally, the client is still able to 
communicate with the remaining active node, so I guess this warning is just 
a "warning". There must be some background thread that periodically looks for 
the missing node, while the main Client instance can still communicate with 
the active node. Would you be able to verify whether it's merely a warning for 
you? If so, I might just not worry about this for now.


Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread echin1999
Hi.
That would be my assumption as well. By the way, I am getting the same 
warning that you are getting, in a very similar scenario (2 nodes in a cluster; 
everything works fine when both are running, and the warning appears on the 
client if one of the nodes is taken down).

I am using v. 0.90 - not sure if that matters.



Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread Hui
Hi Dome,

Do you mean the service on 10.1.4.196 is not open? Yes, the service would have 
been stopped while the machine was rebooting.

But the master node 10.1.4.197 removed the problem node 10.1.4.196 when 
it could not ping that machine.

The cluster should be fine after this operation. Do I misunderstand something?

Thanks


Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-13 Thread Dome.C.Wei
That must be because the service is not open.



Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-12 Thread Hui
Hi Mark,

Thanks for replying.

The master (10.1.4.197) and the other nodes can be reached, while the problem 
node (10.1.4.196) is not reachable.
So we can still see the cluster status at that moment:

  "status" : "yellow",
  "timed_out" : false,
  "unassigned_shards" : 0,
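
To watch that status while the node is down, the health response can be polled over HTTP. Below is a minimal sketch using only the JDK. It assumes the HTTP port is 9202 (suggested by the node names in the logs, not confirmed in this thread) and pulls the status out with a regular expression instead of a JSON parser. The class and method names are illustrative only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads the "status" field of a cluster health response.
final class ClusterHealthCheck {

    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*\"(\\w+)\"");

    // Returns the status value (for example "yellow"), or null if it is absent.
    static String parseStatus(String healthJson) {
        Matcher m = STATUS.matcher(healthJson);
        return m.find() ? m.group(1) : null;
    }

    // Fetches the raw body of /_cluster/health with explicit timeouts, so an
    // unreachable host fails fast instead of hanging.
    static String fetchHealth(String host, int httpPort, int timeoutMillis) throws IOException {
        URL url = new URL("http://" + host + ":" + httpPort + "/_cluster/health");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis);
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name()).useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults: the master's address from this thread and port 9202.
        String host = args.length > 0 ? args[0] : "10.1.4.197";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9202;
        System.out.println(parseStatus(fetchHealth(host, port, 5000)));
    }
}
```

Pointing it at a node that is still up (such as the master) avoids depending on the rebooted machine.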




Re: (ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-12 Thread Mark Walkom
It looks like a networking issue, at least based on "No route to host" in
the error.
Can you ping the master when this is happening? What about doing a telnet
test?
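
If telnet is not available on the client machine, roughly the same check can be done from Java. This is a minimal sketch using only the JDK; the class name is made up, and the port to probe is an assumption: the client code in this thread uses 9300, while the logs show the nodes publishing 9302.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical stand-in for a telnet test: tries to open a TCP connection.
final class PortProbe {

    // Returns true if a TCP connection to host:port opens within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and "No route to host".
            return false;
        }
    }

    public static void main(String[] args) {
        // Addresses taken from the client code in this thread; port 9300 is assumed.
        String[] hosts = {"10.1.4.195", "10.1.4.196", "10.1.4.197"};
        for (String host : hosts) {
            System.out.println(host + ":9300 reachable = " + canConnect(host, 9300, 3000));
        }
    }
}
```

Running it from the client machine while the node is down shows whether only 10.1.4.196 is unreachable or whether the other nodes are affected as well.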

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com




(ES 0.90.1) Cannot connect to elasticsearch cluster after a node is removed

2014-03-12 Thread Hui


Hi All,

This is the log for the case.

The node 10.1.4.196 was removed at 14:08 due to a machine reboot. The client keeps 
trying to connect to the elasticsearch cluster but fails.

Master Node:
[2014-03-08 14:08:26,531][INFO ][cluster.service  ] [10.1.4.197:9202] removed {[10.1.4.196:9202][_sJrum34QWGqEkv8CvAtow][inet[/10.1.4.196:9302]],}, reason: zen-disco-node_failed([10.1.4.196:9202][_sJrum34QWGqEkv8CvAtow][inet[/10.1.4.196:9302]]), reason failed to ping, tried [3] times, each with maximum [30s] timeout

Client:
2014-03-08 14:15:36,184 WARN  org.elasticsearch.transport.netty - [Bulldozer] exception caught on transport layer [[id: 0x50dc218f]], closing connection
java.net.NoRouteToHostException: No route to host

(The cluster health at this moment is Yellow and there are no unassigned shards.)


The node came back at 14:25, and the client could successfully connect to the 
cluster again.

Client:
2014-03-08 14:25:20,597 WARN  org.elasticsearch.transport.netty - [Bulldozer] exception caught on transport layer [[id: 0xf24d85d7]], closing connection
java.net.NoRouteToHostException: No route to host

Master Node:
[2014-03-08 14:25:57,984][INFO ][cluster.service  ] [10.1.4.197:9202] added {[10.1.4.196:9202][rFZ7k7XSSY231EgPoDfmFw][inet[/10.1.4.196:9302]],}, reason: zen-disco-receive(join from node[[10.1.4.196:9202][rFZ7k7XSSY231EgPoDfmFw][inet[/10.1.4.196:9302]]])

(The cluster health at this moment is Green.)

In the above case, the client should be able to connect to the cluster even if 
a node is removed from the cluster.

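For the client, the connection is created as follows:

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// Sniffing lets the client discover the other nodes of the cluster.
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "clustername")
        .put("client.transport.sniff", true)
        .build();
TransportClient client = new TransportClient(settings);

// Seed addresses; 10.1.4.196 is the node that gets removed.
client.addTransportAddress(new InetSocketTransportAddress(
        "10.1.4.195" /* hostname */, 9300 /* port */));
client.addTransportAddress(new InetSocketTransportAddress(
        "10.1.4.196" /* hostname */, 9300 /* port */));
client.addTransportAddress(new InetSocketTransportAddress(
        "10.1.4.197" /* hostname */, 9300 /* port */));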

The master node is 10.1.4.197, while the node being removed is 10.1.4.196.

For the cluster settings, everything uses the defaults except 
discovery.zen.minimum_master_nodes, which is set to 3.

Is there any problem with the above settings that could cause this issue?

Thanks.
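
On the minimum_master_nodes question: the commonly recommended value is a strict majority of the master-eligible nodes, (N / 2) + 1. The sketch below works through that arithmetic. The thread does not say how many master-eligible nodes this cluster has, so the node counts in it are hypothetical, and the class and method names are made up for illustration.

```java
// Hypothetical helper: arithmetic around discovery.zen.minimum_master_nodes.
final class MasterQuorum {

    // Strict majority of the master-eligible nodes: (N / 2) + 1.
    static int recommendedMinimumMasterNodes(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    // True if enough master-eligible nodes remain to satisfy the setting.
    static boolean canElectMaster(int masterEligibleNodes, int nodesDown, int minimumMasterNodes) {
        return masterEligibleNodes - nodesDown >= minimumMasterNodes;
    }

    public static void main(String[] args) {
        int configured = 3; // value reported in this thread
        for (int nodes = 3; nodes <= 5; nodes++) { // hypothetical cluster sizes
            System.out.println(nodes + " master-eligible nodes: recommended="
                    + recommendedMinimumMasterNodes(nodes)
                    + ", survives one node down with setting " + configured + ": "
                    + canElectMaster(nodes, 1, configured));
        }
    }
}
```

If the cluster had only three master-eligible nodes, a setting of 3 would leave no room for losing one. Since the master here kept running and reported Yellow health, the cluster presumably has more than three, but that is worth confirming.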
