Re: How does the "batch" commit log sync work

2016-10-30 Thread Hiroyuki Yamada
Hello Benedict and Edward,

Thank you very much for the comments.
I think the batch parameter is useful when doing some transactional
processing on C* where we need atomicity and higher durability.

Anyway, I think it is not working as expected, at least in the latest
versions of 2.1 and 2.2.
So, I created a ticket in JIRA.
https://issues.apache.org/jira/browse/CASSANDRA-12864

I hope it will be fixed soon.

Thanks,
Hiro

On Fri, Oct 28, 2016 at 6:00 PM, Benedict Elliott Smith
 wrote:
> That is the maximum length of time that queries may be batched together for,
> not the minimum. If there is a break in the flow of queries for the commit
> log, it will commit those outstanding immediately. In any case, it commits in
> chunks of the commit log segment size (default 32MB).
>
> I know the documentation used to disagree with itself in a few places, and
> with actual behaviour, but I thought that had been fixed.  I suggest you
> file a ticket if you find a mention that does not match this description.
>
> Really the batch period is a near useless parameter.  If it were to be
> honoured as a minimum, performance would decline due to the threading model
> in Cassandra (and it will be years before this and memory management improve
> enough to support that behaviour).
>
> Conversely honouring it as a maximum is only possible for very small values,
> just by nature of queueing theory.
>
> I believe I proposed removing the parameter entirely some time ago, though
> it is lost in the mists of time.
>
> Anyway, many people do indeed use this commitlog mode successfully, although
> it is by far less common than periodic mode.  This behaviour does not mean
> your data is in any way unsafe.
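The group-commit behaviour described above can be sketched as follows: writes accumulate, and a single fsync makes the whole outstanding group durable at once. This is an illustrative toy model only, not Cassandra's actual commit log code (names and structure are invented for the sketch):

```java
import java.util.ArrayList;
import java.util.List;

/** Toy, single-threaded model of group commit ("batch" sync): appends
 *  accumulate, and one "fsync" covers the whole outstanding batch.
 *  Illustrative only; not Cassandra's actual implementation. */
public class GroupCommitSketch {
    private final List<String> pending = new ArrayList<>();
    private int fsyncCalls = 0;
    private int durableEntries = 0;

    void append(String mutation) {
        pending.add(mutation);           // a real writer would block here until synced
    }

    /** Runs when the queue drains or the batch window elapses. */
    void sync() {
        if (pending.isEmpty()) return;   // no outstanding writes: no extra fsync
        fsyncCalls++;                    // one fsync covers the whole group
        durableEntries += pending.size();
        pending.clear();                 // all blocked writers now return
    }

    public static void main(String[] args) {
        GroupCommitSketch log = new GroupCommitSketch();
        for (int i = 0; i < 100; i++) log.append("mutation-" + i);
        log.sync();   // 100 queued writes acknowledged by a single fsync
        log.sync();   // idle flow: nothing outstanding, nothing to do
        System.out.println(log.fsyncCalls + " fsyncs, " + log.durableEntries + " durable");
    }
}
```

The point is that the batch window bounds how long entries may accumulate; it is not a minimum delay, so a lone write with no queue behind it is synced, and returns, almost immediately.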
>
>
> On Friday, 28 October 2016, Edward Capriolo  wrote:
>>
>> I mentioned during my Cassandra.yaml presentation at the summit that I
>> never saw anyone use these settings. Things that are off by default are
>> typically not covered well by tests. It sounds like it is not working. A quick
>> suggestion: go back in time, maybe to a version like 1.2.X or 0.7, and see if
>> it behaves like the yaml suggests it should.
>>
>> On Thu, Oct 27, 2016 at 11:48 PM, Hiroyuki Yamada 
>> wrote:
>>>
>>> Hello Satoshi and the community,
>>>
>>> I am also using commitlog_sync for durability, but I have never
>>> modified commitlog_sync_batch_window_in_ms parameter yet,
>>> so I wondered if it is working or not.
>>>
>>> As Satoshi said, I also changed commitlog_sync_batch_window_in_ms (to
>>> 1) and restarted C* and
>>> issued some INSERT command.
>>> But it actually returned immediately after issuing.
>>>
>>> So, it seems like the parameter is not working correctly.
>>> Are we missing something?
>>>
>>> Thanks,
>>> Hiro
>>>
>>> On Thu, Oct 27, 2016 at 5:58 PM, Satoshi Hikida 
>>> wrote:
>>> > Hi, all.
>>> >
>>> > I have a question about "batch" commit log sync behavior with C*
>>> > version
>>> > 2.2.8.
>>> >
>>> > Here's what I have done:
>>> >
>>> > * set commitlog_sync to the "batch" mode as follows:
>>> >
>>> >> commitlog_sync: batch
>>> >> commitlog_sync_batch_window_in_ms: 1
>>> >
>>> > * ran a script which inserts the data to a table
>>> > * prepared a disk dedicated to store the commit logs
>>> >
>>> > According to the DataStax document, I expected that fsync is done once
>>> > in a
>>> > batch window (one fsync per 10sec in this case) and writes issued
>>> > within
>>> > this batch window are blocked until fsync is completed.
>>> >
>>> > In my experiment, however, it seems that the write requests returned
>>> > almost
>>> > immediately (within 300~400 ms).
>>> >
>>> > Am I misunderstanding something? If so, can someone give me any advice as
>>> > to why C* behaves like this?
>>> >
>>> >
>>> > I referred to this document:
>>> >
>>> > https://docs.datastax.com/en/cassandra/2.2/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__PerformanceTuningProps
>>> >
>>> > Regards,
>>> > Satoshi
>>> >
>>
>>
>


Re: Securing a Cassandra 2.2.6 Cluster

2016-10-30 Thread Jonathan Haddad
I'm not sure why you aren't able to connect with cqlsh; there may be
something in your log files to help figure that out.

As for your config, even if you do figure out why you can't connect, you're
still going to have to change your settings since you won't be able to
connect to your cluster from a machine outside localhost.

Comment out listen_address.  Using that is, IMO, generally wrong, since it
means you need to have different configs on each machine.  I'm not wild
about the "do the right thing" part in there. Instead, set your
listen_interface to en0 (or whatever network device you're using) and
Cassandra will attach to the external IP and things *should just work*.  If
they don't, check your logs for errors.
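For concreteness, the suggestion above looks roughly like this in cassandra.yaml (en0 is an assumed interface name; use whatever `ip addr` shows on your machines):

```yaml
# cassandra.yaml -- set the *_interface options and leave the *_address
# options unset; the same file then works unchanged on every node.
# listen_address:
listen_interface: en0
# rpc_address:
rpc_interface: en0
```

Only one of each address/interface pair may be set; Cassandra refuses to start if you set both.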

Jon

On Sun, Oct 30, 2016 at 12:38 PM Raimund Klein  wrote:

> Hi guys,
>
> Thank you for your responses. Let me try to address them:
>
>
>- I just tried cqlsh directly with the IP, no change in behaviour. (I
>previously tried the hostnames, didn't work either.)
>- As for the "empty" ..._address: I meant that I leave these blank.
>Please let me quote from the default cassandra.yaml:
># Leaving it blank leaves it up to InetAddress.getLocalHost(). This
># will always do the Right Thing _if_ the node is properly configured
># (hostname, name resolution, etc), and the Right Thing is to use the
># address associated with the hostname (it might not be).
>So what should I put instead?
>- Requested outputs:
>
>nodetool status
>Datacenter: datacenter1
>===
>Status=Up/Down
>|/ State=Normal/Leaving/Joining/Moving
>--  Address   Load   Tokens   Owns (effective)  Host ID
>Rack
>UN 344.56 KB  256  100.0%
>6271c749-e41d-443c-89e4-46c0fbac49af  rack1
>UN266.91 KB  256  100.0%
>e50a1076-7149-45f3-9001-26bb479f2a50  rack1
>
># netstat -lptn | grep java
>tcp0  0 :70000.0.0.0:*   LISTEN
>17040/*java*
>tcp0  0 127.0.0.1:36415 0.0.0.0:*
>LISTEN  17040/*java*
>tcp0  0 127.0.0.1:7199  0.0.0.0:*
>LISTEN  17040/*java*
>tcp6   0  0 :9042:::*LISTEN
>17040/
>
> *java *
># netstat -lptn | grep java
>tcp0  0 127.0.0.1:43569 0.0.0.0:*
>LISTEN  49349/*java*
>tcp0  0 :7000   0.0.0.0:*   LISTEN
>49349/*java*
>tcp0  0 127.0.0.1:7199  0.0.0.0:*
>LISTEN  49349/*java*
>tcp6   0  0 :::8009 :::*
>LISTEN  42088/*java*
>tcp6   0  0 :::8080 :::*
>LISTEN  42088/*java*
>tcp6   0  0 :9042   :::*LISTEN
>49349/*java*
>tcp6   0  0 127.0.0.1:8005  :::*
>LISTEN  42088/*java*
>
> Jonathan, thank you for reassuring me that I didn't misunderstand seeds
> completely. ;-)
>
> Any ideas?
>
> Regards
> Raimund
>
> 2016-10-30 18:48 GMT+00:00 Jonathan Haddad :
>
> I always prefer to set the listen interface instead of the listen address
>
> Both nodes can be seeds. In fact, there should be more than one seed.
> Having your first 2 nodes as seeds is usually the correct thing to do.
> On Sun, Oct 30, 2016 at 8:28 AM Vladimir Yudovin 
> wrote:
>
> >Empty listen_address and rpc_address.
> What do you mean by "Empty"? You should set either ***_address or
> ***_interface. Otherwise
> Cassandra will not listen on port 9042.
>
> >Open ports 9042, 7000 and 7001 for external communication.
> Only port 9042 should be open to the world; port 7000 is for internode
> communication, and 7001 for internode SSL communication (only one of the
> two is used).
>
> >What is the best order of steps
> Order doesn't really matter.
>
> >Define both machines as seeds.
> It's wrong. Only one (the one started first) should be a seed.
>
>
> >nodetool sees both of them
> cqlsh refuses to connect
> Can you please give output of
> *nodetool status*
> and
> *netstat -lptn | grep java*
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone - Hosted Cloud Cassandra. Launch your cluster in minutes.*
>
>
>  On Sun, 30 Oct 2016 14:11:55 -0400, *Raimund Klein* wrote 
>
> Hi everyone,
>
> We've managed to set up a Cassandra 2.2.6 cluster of two physical nodes
> (nodetool sees both of them, so I'm quite certain the cluster is indeed
> active). My steps to create the cluster were (this applies to both
> machines):
>
>  - Empty listen_address and rpc_address.
>  - Define a cluster_name.
>  - Define both machines as seeds.
>  - Open ports 9042, 7000 and 7001 for external communication.
>
>
>
> Now I want to secure access to the cluster in all forms:
>
>  - define a different database user with a new password
>  - encrypt communication between clients and the cluster including client
> verification
>  - encrypt communication between the nodes including verification

Re: Securing a Cassandra 2.2.6 Cluster

2016-10-30 Thread Raimund Klein
Hi guys,

Thank you for your responses. Let me try to address them:


   - I just tried cqlsh directly with the IP, no change in behaviour. (I
   previously tried the hostnames, didn't work either.)
   - As for the "empty" ..._address: I meant that I leave these blank.
   Please let me quote from the default cassandra.yaml:
   # Leaving it blank leaves it up to InetAddress.getLocalHost(). This
   # will always do the Right Thing _if_ the node is properly configured
   # (hostname, name resolution, etc), and the Right Thing is to use the
   # address associated with the hostname (it might not be).
   So what should I put instead?
   - Requested outputs:

   nodetool status
   Datacenter: datacenter1
   ===
   Status=Up/Down
   |/ State=Normal/Leaving/Joining/Moving
   --  Address   Load   Tokens   Owns (effective)  Host ID
 Rack
   UN 344.56 KB  256  100.0%
   6271c749-e41d-443c-89e4-46c0fbac49af  rack1
   UN266.91 KB  256  100.0%
   e50a1076-7149-45f3-9001-26bb479f2a50  rack1

   # netstat -lptn | grep java
   tcp0  0 :70000.0.0.0:*   LISTEN
 17040/*java*
   tcp0  0 127.0.0.1:36415 0.0.0.0:*
   LISTEN  17040/*java*
   tcp0  0 127.0.0.1:7199  0.0.0.0:*
   LISTEN  17040/*java*
   tcp6   0  0 :9042:::*LISTEN
 17040/

*java *
   # netstat -lptn | grep java
   tcp0  0 127.0.0.1:43569 0.0.0.0:*
   LISTEN  49349/*java*
   tcp0  0 :7000   0.0.0.0:*   LISTEN
 49349/*java*
   tcp0  0 127.0.0.1:7199  0.0.0.0:*
   LISTEN  49349/*java*
   tcp6   0  0 :::8009 :::*
   LISTEN  42088/*java*
   tcp6   0  0 :::8080 :::*
   LISTEN  42088/*java*
   tcp6   0  0 :9042   :::*LISTEN
 49349/*java*
   tcp6   0  0 127.0.0.1:8005  :::*
   LISTEN  42088/*java*

Jonathan, thank you for reassuring me that I didn't misunderstand seeds
completely. ;-)

Any ideas?

Regards
Raimund

2016-10-30 18:48 GMT+00:00 Jonathan Haddad :

> I always prefer to set the listen interface instead of the listen address
>
> Both nodes can be seeds. In fact, there should be more than one seed.
> Having your first 2 nodes as seeds is usually the correct thing to do.
> On Sun, Oct 30, 2016 at 8:28 AM Vladimir Yudovin 
> wrote:
>
>> >Empty listen_address and rpc_address.
>> What do you mean by "Empty"? You should set either ***_address or
>> ***_interface. Otherwise
>> Cassandra will not listen on port 9042.
>>
>> >Open ports 9042, 7000 and 7001 for external communication.
>> Only port 9042 should be open to the world; port 7000 is for internode
>> communication, and 7001 for internode SSL communication (only one of the
>> two is used).
>>
>> >What is the best order of steps
>> Order doesn't really matter.
>>
>> >Define both machines as seeds.
>> It's wrong. Only one (the one started first) should be a seed.
>>
>>
>> >nodetool sees both of them
>> cqlsh refuses to connect
>> Can you please give output of
>> *nodetool status*
>> and
>> *netstat -lptn | grep java*
>>
>> Best regards, Vladimir Yudovin,
>>
>> *Winguzone - Hosted Cloud Cassandra. Launch your cluster in minutes.*
>>
>>
>>  On Sun, 30 Oct 2016 14:11:55 -0400, *Raimund Klein* wrote 
>>
>> Hi everyone,
>>
>> We've managed to set up a Cassandra 2.2.6 cluster of two physical nodes
>> (nodetool sees both of them, so I'm quite certain the cluster is indeed
>> active). My steps to create the cluster were (this applies to both
>> machines):
>>
>>  - Empty listen_address and rpc_address.
>>  - Define a cluster_name.
>>  - Define both machines as seeds.
>>  - Open ports 9042, 7000 and 7001 for external communication.
>>
>>
>>
>> Now I want to secure access to the cluster in all forms:
>>
>>  - define a different database user with a new password
>>  - encrypt communication between clients and the cluster including
>> client verification
>>  - encrypt communication between the nodes including verification
>>
>> What is the best order of steps and correct way to achieve this? I wanted
>> to start with defining a different user, but cqlsh refuses to connect after
>> enforcing user/password authentication:
>>
>> cqlsh -u cassandra -p cassandra
>> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
>> error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
>> Connection refused")})
>>
>>
>>
>> This happens when I run the command on either of the two machines. Any
>> help would be greatly appreciated.
>>
>>


Re: Securing a Cassandra 2.2.6 Cluster

2016-10-30 Thread Jonathan Haddad
I always prefer to set the listen interface instead of the listen address

Both nodes can be seeds. In fact, there should be more than one seed.
Having your first 2 nodes as seeds is usually the correct thing to do.
On Sun, Oct 30, 2016 at 8:28 AM Vladimir Yudovin 
wrote:

> >Empty listen_address and rpc_address.
> What do you mean by "Empty"? You should set either ***_address or
> ***_interface. Otherwise
> Cassandra will not listen on port 9042.
>
> >Open ports 9042, 7000 and 7001 for external communication.
> Only port 9042 should be open to the world; port 7000 is for internode
> communication, and 7001 for internode SSL communication (only one of the
> two is used).
>
> >What is the best order of steps
> Order doesn't really matter.
>
> >Define both machines as seeds.
> It's wrong. Only one (the one started first) should be a seed.
>
>
> >nodetool sees both of them
> cqlsh refuses to connect
> Can you please give output of
> *nodetool status*
> and
> *netstat -lptn | grep java*
>
> Best regards, Vladimir Yudovin,
>
> *Winguzone - Hosted Cloud Cassandra. Launch your cluster in minutes.*
>
>
>  On Sun, 30 Oct 2016 14:11:55 -0400, *Raimund Klein* wrote 
>
> Hi everyone,
>
> We've managed to set up a Cassandra 2.2.6 cluster of two physical nodes
> (nodetool sees both of them, so I'm quite certain the cluster is indeed
> active). My steps to create the cluster were (this applies to both
> machines):
>
>  - Empty listen_address and rpc_address.
>  - Define a cluster_name.
>  - Define both machines as seeds.
>  - Open ports 9042, 7000 and 7001 for external communication.
>
>
>
> Now I want to secure access to the cluster in all forms:
>
>  - define a different database user with a new password
>  - encrypt communication between clients and the cluster including client
> verification
>  - encrypt communication between the nodes including verification
>
> What is the best order of steps and correct way to achieve this? I wanted
> to start with defining a different user, but cqlsh refuses to connect after
> enforcing user/password authentication:
>
> cqlsh -u cassandra -p cassandra
> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
> error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
> Connection refused")})
>
>
>
> This happens when I run the command on either of the two machines. Any
> help would be greatly appreciated.
>
>


Re: Securing a Cassandra 2.2.6 Cluster

2016-10-30 Thread Vladimir Yudovin
>Empty listen_address and rpc_address.

What do you mean by "Empty"? You should set either ***_address or
***_interface. Otherwise Cassandra will not listen on port 9042.



>Open ports 9042, 7000 and 7001 for external communication.

Only port 9042 should be open to the world; port 7000 is for internode
communication, and 7001 for internode SSL communication (only one of the two
is used).



>What is the best order of steps

Order doesn't really matter.



>Define both machines as seeds.

It's wrong. Only one (the one started first) should be a seed.





>nodetool sees both of them

cqlsh refuses to connect

Can you please give output of

nodetool status

and

netstat -lptn | grep java



Best regards, Vladimir Yudovin, 

Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes.





 On Sun, 30 Oct 2016 14:11:55 -0400, Raimund Klein wrote 




Hi everyone,

 

We've managed to set up a Cassandra 2.2.6 cluster of two physical nodes 
(nodetool sees both of them, so I'm quite certain the cluster is indeed 
active). My steps to create the cluster were (this applies to both machines):



 - Empty listen_address and rpc_address.

 - Define a cluster_name.

 - Define both machines as seeds.

 - Open ports 9042, 7000 and 7001 for external communication.



 



Now I want to secure access to the cluster in all forms:



 - define a different database user with a new password

 - encrypt communication between clients and the cluster including client 
verification

 - encrypt communication between the nodes including verification



What is the best order of steps and correct way to achieve this? I wanted to 
start with defining a different user, but cqlsh refuses to connect after 
enforcing user/password authentication:



cqlsh -u cassandra -p cassandra

Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, 
"Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})



 



This happens when I run the command on either of the two machines. Any help 
would be greatly appreciated.











Re: Securing a Cassandra 2.2.6 Cluster

2016-10-30 Thread Jonathan Haddad
Did you try the external IP with cqlsh?
On Sun, Oct 30, 2016 at 8:12 AM Raimund Klein  wrote:

> Hi everyone,
>
> We've managed to set up a Cassandra 2.2.6 cluster of two physical nodes
> (nodetool sees both of them, so I'm quite certain the cluster is indeed
> active). My steps to create the cluster were (this applies to both
> machines):
>
>  - Empty listen_address and rpc_address.
>  - Define a cluster_name.
>  - Define both machines as seeds.
>  - Open ports 9042, 7000 and 7001 for external communication.
>
>
>
> Now I want to secure access to the cluster in all forms:
>
>  - define a different database user with a new password
>  - encrypt communication between clients and the cluster including client
> verification
>  - encrypt communication between the nodes including verification
>
> What is the best order of steps and correct way to achieve this? I wanted
> to start with defining a different user, but cqlsh refuses to connect after
> enforcing user/password authentication:
>
> cqlsh -u cassandra -p cassandra
> Connection error: ('Unable to connect to any servers', {'127.0.0.1':
> error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
> Connection refused")})
>
>
>
> This happens when I run the command on either of the two machines. Any
> help would be greatly appreciated.
>
>


Securing a Cassandra 2.2.6 Cluster

2016-10-30 Thread Raimund Klein
Hi everyone,

We've managed to set up a Cassandra 2.2.6 cluster of two physical nodes
(nodetool sees both of them, so I'm quite certain the cluster is indeed
active). My steps to create the cluster were (this applies to both
machines):

 - Empty listen_address and rpc_address.
 - Define a cluster_name.
 - Define both machines as seeds.
 - Open ports 9042, 7000 and 7001 for external communication.



Now I want to secure access to the cluster in all forms:

 - define a different database user with a new password
 - encrypt communication between clients and the cluster including client
verification
 - encrypt communication between the nodes including verification

What is the best order of steps and correct way to achieve this? I wanted
to start with defining a different user, but cqlsh refuses to connect after
enforcing user/password authentication:

cqlsh -u cassandra -p cassandra
Connection error: ('Unable to connect to any servers', {'127.0.0.1':
error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
Connection refused")})



This happens when I run the command on either of the two machines. Any help
would be greatly appreciated.
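For the first step (a different database user), the usual sequence on 2.2 is: switch the authenticator in cassandra.yaml on every node, restart, then connect as the default superuser and create your own role. A rough sketch; the role name and passwords below are placeholders:

```
# cassandra.yaml on every node, then restart Cassandra:
authenticator: PasswordAuthenticator

-- afterwards, via: cqlsh <node-ip> -u cassandra -p cassandra
CREATE ROLE dbadmin WITH PASSWORD = 'change-me' AND LOGIN = true AND SUPERUSER = true;
ALTER ROLE cassandra WITH PASSWORD = 'pick-something-long-and-random';
```

Note that cqlsh connects to 127.0.0.1 by default, so when rpc_address is bound to an external interface you must pass the node's IP explicitly.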


Re: Cassandra failure during read query at consistency QUORUM (2 responses were required but only 0 replica responded, 2 failed)

2016-10-30 Thread Denis Mikhaylov
Why does it prevent quorum? How to fix it?

> On 28 Oct 2016, at 16:02, Edward Capriolo  wrote:
> 
> This looks like another case of an assert bubbling up through a try/catch that
> doesn't catch AssertionError.
> 
> On Fri, Oct 28, 2016 at 6:30 AM, Denis Mikhaylov  wrote:
> Hi!
> 
> We’re running Cassandra 3.9
> 
> On the application side I see failed reads with this exception 
> com.datastax.driver.core.exceptions.ReadFailureException: Cassandra failure 
> during read query at consistency QUORUM (2 responses were required but only 0 
> replica responded, 2 failed)
> 
> On the server side we see:
> 
> WARN  [SharedPool-Worker-3] 2016-10-28 13:28:22,965 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-3,5,
> main]: {}
> java.lang.AssertionError: null
> at org.apache.cassandra.db.rows.BTreeRow.getCell(BTreeRow.java:212) 
> ~[apache-cassandra-3.7.jar:3.7]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.canRemoveRow(SinglePartitionReadCommand.java:899)
>  ~[apache-cassandra-3.7.jar:3.7]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.reduceFilter(SinglePartitionReadCommand.java:863)
>  ~[apache-cassandra-3.7.jar:3.7]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndSSTablesInTimestampOrder(SinglePartitionReadCommand.java:748)
>  ~[apache-cassan
> dra-3.7.jar:3.7]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:519)
>  ~[apache-cassandra-3.7.jar:
> 3.7]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:496)
>  ~[apache-cassandra-3.7.jar:3.7]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:358)
>  ~[apache-cassandra-3.7.jar:3.7]
> at 
> org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:366) 
> ~[apache-cassandra-3.7.jar:3.7]
> at 
> org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:48)
>  ~[apache-cassandra-3.7.jar:3.7]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
> ~[apache-cassandra-3.7.jar:3.7]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_102]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-
> 3.7.jar:3.7]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache
> -cassandra-3.7.jar:3.7]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.7.jar:3.7]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
> 
> It only affects a single table, sadly on both the test (3.9) and production
> (3.7) deployments of Cassandra.
> 
> What could be the problem? Please help.
> 



Re: Two DCs Cassandra on Azure connection problem.

2016-10-30 Thread Vlad
Hi Cliff, great, it helps, thank you!

So it's still strange to me why: as I mentioned, "I suspected a connectivity
problem, but tcpdump shows constant traffic on port 7001 between nodes", and
even in the unresponsive state there was packet exchange. Also, I don't see the
Cassandra code enabling SO_KEEPALIVE on the storage port, only on the CQL
port. Nevertheless, it works now, thanks again!


Here is a link to MSDN about this timeout:
https://blogs.msdn.microsoft.com/cie/2014/02/13/windows-azure-load-balancer-timeout-for-cloud-service-roles-paas-webworker/
Regards, Vlad
 
   

 On Thursday, October 27, 2016 8:50 PM, Cliff Gilmore  
wrote:
 

Azure has aggressively low keepalive settings for its networks. Ignore the 
Mongo parts of this link and have a look at the OS settings they change.
https://docs.mongodb.com/ecosystem/platforms/windows-azure/
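Ignoring the Mongo-specific parts, the OS change that page makes is to the kernel's TCP keepalive sysctls, roughly as below (the values are illustrative; confirm the recommended numbers against the linked page):

```
# /etc/sysctl.conf -- send keepalives well inside Azure's idle-connection
# timeout so the load balancer doesn't silently drop the TCP session
net.ipv4.tcp_keepalive_time = 120
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 8
# apply with: sysctl -p
```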

---
Cliff Gilmore
Vanguard Solutions Architect | M: 314-825-4413
DataStax, Inc. | www.DataStax.com


On Thu, Oct 27, 2016 at 5:48 AM, Vlad  wrote:

Hello,
I put a two-node cluster on Azure. Each node is in its own DC (ping about 10
ms), and the inter-node connection (SSL port 7001) goes through external IPs,
i.e.

 listen_interface: eth0
 broadcast_address: 1.1.1.1
The cluster starts, cqlsh can connect, and stress-tool survives a night of
writes with replication factor two; all seems to be fine. But when the cluster
is left without load, it becomes nonfunctional after several minutes of idle.
An attempt to connect fails with the error
Connection error: ('Unable to connect to any servers', {'1.1.1.1': 
OperationTimedOut('errors= Timed out creating connection (10 seconds), 
last_host=None',)})

There is a message

WARN  10:06:32 RequestExecutionException READ_TIMEOUT: Operation timed out - received only 1 responses.

on one node six minutes after start (no load or connections in this time).

nodetool status shows both nodes as UN (Up and Normal, I guess) 

I suspected a connectivity problem, but tcpdump shows constant traffic on port
7001 between the nodes. Restarting the OTHER node than the one I'm connecting
to solves the problem for another several minutes. I increased the TCP idle
timeout in the Azure IP address settings to 30 minutes, but it had no effect.

Thanks, Vlad