Hi Cliff, great, it helps, thank you!

It's still strange to me, though. As I mentioned, "I suspected connectivity 
problem, but tcpdump shows constant traffic on port 7001 between nodes.", and 
even in the unresponsive state there was packet exchange. Also, I don't see 
Cassandra code enabling SO_KEEPALIVE on the storage port, only on the CQL 
port.

Nevertheless it works now, thanks again!
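
For reference, this is roughly what enabling SO_KEEPALIVE on a single TCP socket looks like. It is only a minimal Python sketch, not Cassandra's code: the helper name enable_keepalive and the values 60/10/3 are assumptions chosen for illustration, and the TCP_KEEP* options are platform-specific, so the sketch skips any that the platform does not expose.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=3):
    """Turn on TCP keepalive for one socket (hypothetical helper).

    idle_s, interval_s and probes are illustrative values, not
    settings taken from Cassandra or from Azure documentation.
    """
    # Ask the kernel to send keepalive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timing overrides; only set the ones this platform has.
    per_socket_options = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the drop
    )
    for name, value in per_socket_options:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print("SO_KEEPALIVE on:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

Without the per-socket overrides, a socket with SO_KEEPALIVE set falls back to the system-wide kernel settings, which is why changing the OS settings alone can be enough.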


Here is a link to the MSDN post about this timeout:
https://blogs.msdn.microsoft.com/cie/2014/02/13/windows-azure-load-balancer-timeout-for-cloud-service-roles-paas-webworker/
Regards, Vlad
 
   

 On Thursday, October 27, 2016 8:50 PM, Cliff Gilmore <cgilm...@datastax.com> 
wrote:
 

 Azure has aggressively low keepalive settings for its networks. Ignore the 
Mongo parts of this link and have a look at the OS settings they change.
https://docs.mongodb.com/ecosystem/platforms/windows-azure/
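
To see what a Linux node is currently using, the kernel keepalive settings can be read from /proc. The following is a small Python sketch under stated assumptions: the helper names are hypothetical, the /proc/sys/net/ipv4 location applies to Linux only, and the 240-second idle timeout in the example is an assumed value that should be replaced with whatever the load balancer is actually configured for.

```python
from pathlib import Path

KEEPALIVE_KEYS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive_settings(base="/proc/sys/net/ipv4"):
    """Return the kernel keepalive settings as ints (None if unreadable)."""
    settings = {}
    for key in KEEPALIVE_KEYS:
        try:
            settings[key] = int((Path(base) / key).read_text().strip())
        except (OSError, ValueError):
            # Not Linux, no permission, or unexpected file content.
            settings[key] = None
    return settings


def first_probe_too_late(settings, idle_timeout_s):
    """True if the first keepalive probe would be sent only after the
    network's idle timeout has already expired."""
    keepalive_time = settings.get("tcp_keepalive_time")
    return keepalive_time is not None and keepalive_time >= idle_timeout_s


if __name__ == "__main__":
    current = read_keepalive_settings()
    print(current)
    # 240 s is an assumed idle timeout, used here only as an example.
    print("first probe too late:", first_probe_too_late(current, 240))
```

If the check reports that the first probe comes too late, lowering net.ipv4.tcp_keepalive_time below the idle timeout is the kind of OS change the linked page describes.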

---------------------------------------------------
Cliff Gilmore
Vanguard Solutions Architect
M: 314-825-4413
DataStax, Inc. | www.DataStax.com


On Thu, Oct 27, 2016 at 5:48 AM, Vlad <qa23d-...@yahoo.com> wrote:

Hello,
I set up a two-node cluster on Azure. Each node is in its own DC (ping about 
10 ms), and the inter-node connection (SSL port 7001) goes through external 
IPs, i.e.

 listen_interface: eth0
 broadcast_address: 1.1.1.1
The cluster starts, cqlsh can connect, the stress tool survives a night of 
writes with replication factor two, and all seems to be fine. But when the 
cluster is left without load, it becomes nonfunctional after several minutes 
of idling. An attempt to connect fails with the error
Connection error: ('Unable to connect to any servers', {'1.1.1.1': 
OperationTimedOut('errors= Timed out creating connection (10 seconds), 
last_host=None',)})

There is a message
WARN  10:06:32 RequestExecutionException READ_TIMEOUT: Operation timed out - received only 1 responses.

on one node six minutes after start (no load or connections during this time).

nodetool status shows both nodes as UN (Up and Normal, I guess) 

I suspected connectivity problem, but tcpdump shows constant traffic on port 
7001 between nodes. Restarting the OTHER node (not the one I'm connecting to) 
solves the problem for another several minutes. I increased the TCP idle 
timeout in the Azure IP address settings to 30 minutes, but it had no effect.

Thanks, Vlad






   
