Hi Cliff,

Great, it helps, thank you! It's still strange to me why, though: as I mentioned, "I suspected connectivity problem, but tcpdump shows constant traffic on port 7001 between nodes", and even in the unresponsive state there was packet exchange. Also, I don't see Cassandra's code enabling SO_KEEPALIVE on the storage port, only on the CQL port.

Nevertheless, it works now, thanks again!
Here is a link to MSDN about this timeout:
https://blogs.msdn.microsoft.com/cie/2014/02/13/windows-azure-load-balancer-timeout-for-cloud-service-roles-paas-webworker/

Regards, Vlad

On Thursday, October 27, 2016 8:50 PM, Cliff Gilmore <cgilm...@datastax.com> wrote:

Azure has aggressively low keepalive settings for its networks. Ignore the Mongo parts of this link and have a look at the OS settings they change.

https://docs.mongodb.com/ecosystem/platforms/windows-azure/

---------------------------------------------------
Cliff Gilmore
Vanguard Solutions Architect
M: 314-825-4413
DataStax, Inc. | www.DataStax.com

On Thu, Oct 27, 2016 at 5:48 AM, Vlad <qa23d-...@yahoo.com> wrote:

Hello,

I put a two-node cluster on Azure. Each node is in its own DC (ping about 10 ms), and the inter-node connection (SSL port 7001) goes through external IPs, i.e.

    listen_interface: eth0
    broadcast_address: 1.1.1.1

The cluster starts, cqlsh can connect, and the stress tool survives a night of writes with replication factor two, so all seems to be fine. But when the cluster is left without load, it becomes nonfunctional after several minutes of idle. An attempt to connect fails with the error

    Connection error: ('Unable to connect to any servers', {'1.1.1.1': OperationTimedOut('errors= Timed out creating connection (10 seconds), last_host=None',)})

There is the message

    WARN 10:06:32 RequestExecutionException READ_TIMEOUT: Operation timed out - received only 1 responses.

on one node six minutes after start (no load or connections in this time). nodetool status shows both nodes as UN (Up and Normal, I guess).

I suspected a connectivity problem, but tcpdump shows constant traffic on port 7001 between the nodes. Restarting the node OTHER than the one I'm connecting to solves the problem for another several minutes.

I increased the TCP idle time in the Azure IP address settings to 30 minutes, but it had no effect.

Thanks, Vlad
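
For anyone finding this thread later: the "OS settings" Cliff refers to are the TCP keepalive timers. As a rough illustration only, here is a minimal Python sketch of enabling keepalive with a short idle time on a single socket. It is not taken from Cassandra's code, and the helper name `enable_keepalive` and the values 120/30/3 are assumptions chosen for the example, not values confirmed in this thread. The per-socket timer options are Linux-specific, so they are guarded with `hasattr`.

```python
import socket


def enable_keepalive(sock, idle_s=120, interval_s=30, probes=3):
    """Turn on TCP keepalive for one socket (hypothetical helper).

    idle_s, interval_s and probes are illustrative values; pick an idle
    time shorter than the idle timeout of whatever sits between the hosts.
    """
    # Ask the kernel to send keepalive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timers are not available on every platform, so only
    # set the ones this Python build exposes.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idle time before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is considered dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("SO_KEEPALIVE enabled:",
          s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

On Linux, the system-wide defaults for sockets that enable SO_KEEPALIVE without setting their own timers come from the `net.ipv4.tcp_keepalive_time`, `net.ipv4.tcp_keepalive_intvl` and `net.ipv4.tcp_keepalive_probes` sysctls.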