Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
hi @Erick,

Actually this timestamp 1614575293790 is equivalent to
GMT: Monday, 1 March 2021 05:08:13.790,
which is
GMT+1: Monday, 1 March 2021 06:08:13.790 in my local
timezone.
This is consistent with the timestamps in the other logs in the cluster.
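For reference, the epoch milliseconds in the snapshot directory name convert
directly with GNU date:

date -u -d @$((1614575293790 / 1000))   # prints Monday 1 March 2021 05:08:13 UTC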

Thank you for pointing me in the right direction; I'll certainly investigate
this.



Il giorno mar 2 mar 2021 alle ore 00:56 Erick Ramirez <
erick.rami...@datastax.com> ha scritto:

> The timestamp (1614575293790) in the snapshot directory name is equivalent
> to 1 March 16:08 GMT:
>
> actually I found a lot of .db files in the following directory:
>>
>> /var/lib/cassandra/data/mykespace/mytable-2795c0204a2d11e9aba361828766468f/snapshots/dropped-1614575293790-
>> mytable
>>
>
> which lines up nicely with this log entry:
>
>
>>  2021-03-01 06:08:08,864 INFO  [Native-Transport-Requests-1]
>> MigrationManager.java:542 announceKeyspaceDrop Drop Keyspace 'mykeyspace'
>>
>
> In any case, those 2 pieces of information are evidence that the keyspace
> didn't get randomly dropped -- some operator/developer/daemon/orchestration
> tool/whatever initiated it either intentionally or by accident.
>
> I've seen this happen a number of times where a developer thought they
> were connecting to dev/staging/test environment and issued a DROP or
> TRUNCATE not realising they were connected to production. Not saying this
> is what happened in your case but I'm just giving you ideas for your
> investigation. Cheers!
>


Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
 I haven't made any schema modifications for a year or more.
This problem came up during a "normal day of work" for Cassandra.


Il giorno lun 1 mar 2021 alle ore 16:25 Bowen Song 
ha scritto:

> Your missing keyspace problem has nothing to do with that bug.
>
> In that case, the same table was created twice in a very short period of
> time, and I suspect that was done concurrently on two different nodes. The
> evidence lies in the two CF IDs - bd7200a0156711e88974855d74ee356f and
> bd750de0156711e8bdc54f7bcdcb851f, which were created at
> 2018-02-19T11:26:33.898 and 2018-02-19T11:26:33.918 respectively, with a
> mere 20 millisecond gap between them.
>
> TBH, it doesn't sound like a bug to me. Cassandra is eventually consistent
> by design, and two conflicting schema changes on two different nodes at
> nearly the same time will likely result in schema disagreement; Cassandra
> will eventually reach agreement again, possibly discarding
> one of the conflicting schema changes, together with all data written to the
> discarded table/columns. To make sure this doesn't happen to your data, you
> should avoid doing multiple schema changes to the same keyspace (for
> create/alter/... keyspace) or same table (for create/alter/... table) on
> two or more Cassandra coordinator nodes in a very short period of time.
> Instead, send all your schema change queries to the same coordinator node,
> or if that's not possible, wait for at least 30 seconds between two schema
> changes and make sure you aren't restarting any node at the same time.
>
> On 01/03/2021 14:04, Marco Gasparini wrote:
>
> actually I found a lot of .db files in the following directory:
>
> /var/lib/cassandra/data/mykespace/mytable-2795c0204a2d11e9aba361828766468f/snapshots/dropped-1614575293790-mytable
>
> I also found this:
>  2021-03-01 06:08:08,864 INFO  [Native-Transport-Requests-1]
> MigrationManager.java:542 announceKeyspaceDrop Drop Keyspace 'mykeyspace'
>
> so I think that you, @erick and @bowen, are right. Something dropped the
> keyspace.
>
> I will try to follow your procedure @bowen, thank you very much!
>
> Do you know what could cause this issue?
> It seems like a big issue. I found this bug
> https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel,
> maybe they are correlated...
>
> Thank you @Bowen and @Erick
>
>
>
>
>
> Il giorno lun 1 mar 2021 alle ore 13:39 Bowen Song 
>  ha scritto:
>
>> The warning message indicates the node y.y.y.y went down (or is
>> unreachable via network) before 2021-02-28 05:17:33. Is there any chance
>> you can find the log file on that node at around or before that time? It
>> may show why that node went down. The reason might be unrelated
>> to the missing keyspace, but it is still worth a look in order to prevent
>> the same thing from happening again.
>>
>> As Erick said, the table's CF ID isn't new, so it's unlikely to be a
>> schema synchronization issue. Therefore I also suspect the keyspace was
>> accidentally dropped. Cassandra only logs "Drop Keyspace 'keyspace_name'"
>> on the node that received the "DROP KEYSPACE ..." query, so you may have to
>> search this in log files from all nodes to find it.
>>
>> Assuming the keyspace was dropped but you still have the SSTable files,
>> you can recover the data by re-creating the keyspace and tables with
>> identical replication strategy and schema, then copy the SSTable files to
>> the corresponding new table directories (with different CF ID suffixes) on
>> the same node, and finally run "nodetool refresh ..." or restart the node.
>> Since you don't yet have a full backup, I strongly recommend you to make a
>> backup, and ideally test restoring it to a different cluster, before
>> attempting to do this.
>>
>>
>> On 01/03/2021 11:48, Marco Gasparini wrote:
>>
>> here the previous error:
>>
>> 2021-02-28 05:17:33,262 WARN NodeConnectionsService.java:165
>> validateAndConnectIfNeeded failed to connect to node
>> {y.y.y.y}{9ba2d3ee-bc82-4e76-ae24-9e20eb334c24}{9ba2d3ee-bc82-4e76-ae24-9e20eb334c24}{y.y.y.y
>> }{ y.y.y.y :9300}{ALIVE}{rack=r1, dc=DC1} (tried [1] times)
>> org.elasticsearch.transport.ConnectTransportException: [ y.y.y.y ][
>> y.y.y.y :9300] connect_timeout[30s]
>> at
>> org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:163)
>> at
>> org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:616)
>> at
>> org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java

Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
actually I found a lot of .db files in the following directory:

/var/lib/cassandra/data/mykespace/mytable-2795c0204a2d11e9aba361828766468f/snapshots/dropped-1614575293790-
mytable

I also found this:
 2021-03-01 06:08:08,864 INFO  [Native-Transport-Requests-1]
MigrationManager.java:542 announceKeyspaceDrop Drop Keyspace 'mykeyspace'
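As Bowen suggested below, this line can be searched for on every node with
something like the following (the log path assumes a default package install):

grep "Drop Keyspace" /var/log/cassandra/system.log*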

so I think that you, @erick and @bowen, are right. Something dropped the
keyspace.

I will try to follow your procedure @bowen, thank you very much!
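For the record, a rough sketch of those steps (CF IDs and paths below are
only placeholders/examples, to be adjusted per node and per table):

# 1. re-create the keyspace and table with the identical replication and schema
cqlsh -e "CREATE KEYSPACE mykeyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3'};"
cqlsh -f mytable_schema.cql        # the original CREATE TABLE statement

# 2. copy the SSTables from the 'dropped-...' snapshot into the new table
#    directory (which has a different CF ID suffix), on the same node
cp /var/lib/cassandra/data/mykeyspace/mytable-<old_cfid>/snapshots/dropped-1614575293790-mytable/* \
   /var/lib/cassandra/data/mykeyspace/mytable-<new_cfid>/

# 3. make Cassandra pick them up (or restart the node)
nodetool refresh mykeyspace mytable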

Do you know what could cause this issue?
It seems like a big issue. I found this bug
https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel,
maybe they are correlated...

Thank you @Bowen and @Erick





Il giorno lun 1 mar 2021 alle ore 13:39 Bowen Song 
ha scritto:

> The warning message indicates the node y.y.y.y went down (or is
> unreachable via network) before 2021-02-28 05:17:33. Is there any chance
> you can find the log file on that node at around or before that time? It
> may show why that node went down. The reason might be unrelated
> to the missing keyspace, but it is still worth a look in order to prevent
> the same thing from happening again.
>
> As Erick said, the table's CF ID isn't new, so it's unlikely to be a
> schema synchronization issue. Therefore I also suspect the keyspace was
> accidentally dropped. Cassandra only logs "Drop Keyspace 'keyspace_name'"
> on the node that received the "DROP KEYSPACE ..." query, so you may have to
> search this in log files from all nodes to find it.
>
> Assuming the keyspace was dropped but you still have the SSTable files,
> you can recover the data by re-creating the keyspace and tables with
> identical replication strategy and schema, then copy the SSTable files to
> the corresponding new table directories (with different CF ID suffixes) on
> the same node, and finally run "nodetool refresh ..." or restart the node.
> Since you don't yet have a full backup, I strongly recommend you to make a
> backup, and ideally test restoring it to a different cluster, before
> attempting to do this.
>
>
> On 01/03/2021 11:48, Marco Gasparini wrote:
>
> here the previous error:
>
> 2021-02-28 05:17:33,262 WARN NodeConnectionsService.java:165
> validateAndConnectIfNeeded failed to connect to node
> {y.y.y.y}{9ba2d3ee-bc82-4e76-ae24-9e20eb334c24}{9ba2d3ee-bc82-4e76-ae24-9e20eb334c24}{y.y.y.y
> }{ y.y.y.y :9300}{ALIVE}{rack=r1, dc=DC1} (tried [1] times)
> org.elasticsearch.transport.ConnectTransportException: [ y.y.y.y ][
> y.y.y.y :9300] connect_timeout[30s]
> at
> org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:163)
> at
> org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:616)
> at
> org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:513)
> at
> org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:336)
> at
> org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:323)
> at
> org.elasticsearch.cluster.NodeConnectionsService.validateAndConnectIfNeeded(NodeConnectionsService.java:156)
> at
> org.elasticsearch.cluster.NodeConnectionsService$ConnectionChecker.doRun(NodeConnectionsService.java:185)
> at
> org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672)
> at
> org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> Yes this node (y.y.y.y) stopped because it went out of disk space.
>
>
> I said "deleted" because I'm not a native english speaker :)
> I usually "remove" snapshots via 'nodetool clearsnapshot' or
> cassandra-reaper user interface.
>
>
>
>
> Il giorno lun 1 mar 2021 alle ore 12:39 Bowen Song 
>  ha scritto:
>
>> What was the warning? Is it related to the disk failure policy? Could you
>> please share the relevant log? You can edit it and redact the sensitive
>> information before sharing it.
>>
>> Also, I can't help to notice that you used the word "delete" (instead of
>> "clear") to describe the process of removing snapshots. May I ask how did
>> you delete the snapshots? Was it "nodetool clearsnapshot ...", "rm -rf ..."
>> or something else?
>>
>>
>> On 01/03/2021 11:27, Marco Gasparini wrote:
>>
>> thanks Bowen for answering
>>
>> Actually, I checked the server lo

Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
here the previous error:

2021-02-28 05:17:33,262 WARN NodeConnectionsService.java:165
validateAndConnectIfNeeded failed to connect to node
{y.y.y.y}{9ba2d3ee-bc82-4e76-ae24-9e20eb334c24}{9ba2d3ee-bc82-4e76-ae24-9e20eb334c24}{
y.y.y.y }{ y.y.y.y :9300}{ALIVE}{rack=r1, dc=DC1} (tried [1] times)
org.elasticsearch.transport.ConnectTransportException: [ y.y.y.y ][ y.y.y.y
:9300] connect_timeout[30s]
at
org.elasticsearch.transport.TcpChannel.awaitConnected(TcpChannel.java:163)
at
org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:616)
at
org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:513)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:336)
at
org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:323)
at
org.elasticsearch.cluster.NodeConnectionsService.validateAndConnectIfNeeded(NodeConnectionsService.java:156)
at
org.elasticsearch.cluster.NodeConnectionsService$ConnectionChecker.doRun(NodeConnectionsService.java:185)
at
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672)
at
org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Yes, this node (y.y.y.y) stopped because it ran out of disk space.


I said "deleted" because I'm not a native english speaker :)
I usually "remove" snapshots via 'nodetool clearsnapshot' or
cassandra-reaper user interface.




Il giorno lun 1 mar 2021 alle ore 12:39 Bowen Song 
ha scritto:

> What was the warning? Is it related to the disk failure policy? Could you
> please share the relevant log? You can edit it and redact the sensitive
> information before sharing it.
>
> Also, I can't help noticing that you used the word "delete" (instead of
> "clear") to describe the process of removing snapshots. May I ask how did
> you delete the snapshots? Was it "nodetool clearsnapshot ...", "rm -rf ..."
> or something else?
>
>
> On 01/03/2021 11:27, Marco Gasparini wrote:
>
> thanks Bowen for answering
>
> Actually, I checked the server log and the only warning was that a node
> went offline.
> No, I have no backups or snapshots.
>
> In the meantime I found that probably Cassandra moved all files from a
> directory to the snapshot directory. I am pretty sure of that because I
> have recently deleted all the snapshots I made because it was going out of
> disk space and I found this very directory full of files where the
> modification timestamp was the same as the first error I got in the log.
>
>
>
> Il giorno lun 1 mar 2021 alle ore 12:13 Bowen Song 
>  ha scritto:
>
>> The first thing I'd check is the server log. The log may contain vital
>> information about the cause of it, and that there may be different ways to
>> recover from it depending on the cause.
>>
>> Also, please allow me to ask a seemingly obvious question, do you have a
>> backup?
>>
>>
>> On 01/03/2021 09:34, Marco Gasparini wrote:
>>
>> hello everybody,
>>
>> This morning, Monday!!!, I was checking on Cassandra cluster and I
>> noticed that all data was missing. I noticed the following error on each
>> node (9 nodes in the cluster):
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *2021-03-01 09:05:52,984 WARN  [MessagingService-Incoming-/x.x.x.x]
>> IncomingTcpConnection.java:103 run UnknownColumnFamilyException reading
>> from socket; closing org.apache.cassandra.db.UnknownColumnFamilyException:
>> Couldn't find table for cfId cba90a70-5c46-11e9-9e36-f54fe3235e69. If a
>> table was just created, this is likely due to the schema not being fully
>> propagated.  Please wait for schema agreement on table creation. at
>> org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1533)
>> at
>> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:758)
>> at
>> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:697)
>> at
>> org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50)
>> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123)
>> at
>> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
>> at
>> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:183)
>> at

Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
thanks Bowen for answering

Actually, I checked the server log and the only warning was that a node
went offline.
No, I have no backups or snapshots.

In the meantime I found that Cassandra probably moved all the files from a
table directory into the snapshot directory. I am fairly sure of that because I
had recently deleted all the snapshots I had made (the node was running out of
disk space), and I then found this very directory full of files whose
modification timestamp matched the first error I got in the log.



Il giorno lun 1 mar 2021 alle ore 12:13 Bowen Song 
ha scritto:

> The first thing I'd check is the server log. The log may contain vital
> information about the cause, and there may be different ways to
> recover from it depending on the cause.
>
> Also, please allow me to ask a seemingly obvious question: do you have a
> backup?
>
>
> On 01/03/2021 09:34, Marco Gasparini wrote:
>
> hello everybody,
>
> This morning, Monday!!!, I was checking on Cassandra cluster and I noticed
> that all data was missing. I noticed the following error on each node (9
> nodes in the cluster):
>
>
>
>
>
>
>
>
>
>
> *2021-03-01 09:05:52,984 WARN  [MessagingService-Incoming-/x.x.x.x]
> IncomingTcpConnection.java:103 run UnknownColumnFamilyException reading
> from socket; closing org.apache.cassandra.db.UnknownColumnFamilyException:
> Couldn't find table for cfId cba90a70-5c46-11e9-9e36-f54fe3235e69. If a
> table was just created, this is likely due to the schema not being fully
> propagated.  Please wait for schema agreement on table creation. at
> org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1533)
> at
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:758)
> at
> org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:697)
> at
> org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50)
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123)
> at
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
> at
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:183)
> at
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)*
>
> I tried to query the keyspace and got this:
>
> node1# cqlsh
> Connected to Cassandra Cluster at x.x.x.x:9042.
> [cqlsh 5.0.1 | Cassandra 3.11.5.1 | CQL spec 3.4.4 | Native protocol v4]
> Use HELP for help.
> cqlsh> select * from mykeyspace.mytable  where id = 123935;
> *InvalidRequest: Error from server: code=2200 [Invalid query]
> message="Keyspace * *mykeyspace  does not exist"*
>
> Investigating on each node I found that all the *SStables exist*, so I
> think data is still there but the keyspace vanished, "magically".
>
> Other facts I can tell you are:
>
>- I have been getting Anticompaction errors from 2 nodes due to the
>fact the disk was almost full.
>- the cluster was online friday
>- this morning, Monday, the whole cluster was offline and I noticed
>the problem of "missing keyspace"
>- During the weekend the cluster has been subject to inserts and
>deletes
>- I have a 9 node (HDD) Cassandra 3.11 cluster.
>
> I really need help on this, how can I restore the cluster?
>
> Thank you very much
> Marco
>
>
>
>
>
>
>
>
>


MISSING keyspace

2021-03-01 Thread Marco Gasparini
hello everybody,

This morning (Monday!) I was checking on the Cassandra cluster and noticed
that all the data was missing. I saw the following error on each node (9
nodes in the cluster):










*2021-03-01 09:05:52,984 WARN  [MessagingService-Incoming-/x.x.x.x]
IncomingTcpConnection.java:103 run UnknownColumnFamilyException reading
from socket; closingorg.apache.cassandra.db.UnknownColumnFamilyException:
Couldn't find table for cfId cba90a70-5c46-11e9-9e36-f54fe3235e69. If a
table was just created, this is likely due to the schema not being fully
propagated.  Please wait for schema agreement on table creation.at
org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1533)
  at
org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:758)
  at
org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:697)
  at
org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50)
  at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123)
at
org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
  at
org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:183)
  at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)*

I tried to query the keyspace and got this:

node1# cqlsh
Connected to Cassandra Cluster at x.x.x.x:9042.
[cqlsh 5.0.1 | Cassandra 3.11.5.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> select * from mykeyspace.mytable  where id = 123935;
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Keyspace mykeyspace does not exist"

Investigating on each node I found that all the SSTables exist, so I
think the data is still there but the keyspace vanished, "magically".

Other facts I can tell you are:

   - I have been getting anticompaction errors from 2 nodes because the
   disk was almost full (see the note after this list).
   - The cluster was online on Friday.
   - This morning, Monday, the whole cluster was offline and I noticed the
   "missing keyspace" problem.
   - Over the weekend the cluster was subject to inserts and deletes.
   - I have a 9-node (HDD) Cassandra 3.11 cluster.
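A quick way to see how much of the disk is held by snapshots (the data path
assumes the default location):

nodetool listsnapshots
du -sh /var/lib/cassandra/data/*/*/snapshots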

I really need help with this: how can I restore the cluster?

Thank you very much
Marco


Re: cluster rolling restart

2019-10-16 Thread Marco Gasparini
Great!
Thank you very much Alain!

Il giorno mer 16 ott 2019 alle ore 10:56 Alain RODRIGUEZ 
ha scritto:

> Hello Marco,
>
> No this should not be a 'normal' / 'routine' thing in a Cassandra cluster.
> I can imagine it being helpful in some cases or versions of Cassandra if
> there are memory issues/leaks or something like that going wrong, but
> 'normally', you should not have to do that. What's more, when doing so,
> you'll have to 'warm up' C* again to get back to full speed (loading the
> page/key caches again, restarting compactions that were interrupted, and
> so on), which might even slow down the cluster somewhat.
>
> Let's just say it should not harm (much), it might even be useful in some
> corner cases, but restarting the cluster regularly, for no reasons, is
> definitely not part of 'best practices' around Cassandra imho.
>
> C*heers,
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> Le mer. 16 oct. 2019 à 09:56, Marco Gasparini <
> marco.gaspar...@competitoor.com> a écrit :
>
>> hi all,
>>
>> I was wondering if it is recommended to perform a rolling restart of the
>> cluster once in a while.
>> Is it a good practice or necessary? how often?
>>
>> Thanks
>> Marco
>>
>


cluster rolling restart

2019-10-16 Thread Marco Gasparini
hi all,

I was wondering if it is recommended to perform a rolling restart of the
cluster once in a while.
Is it a good practice, or even necessary? How often?
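For context, by "rolling restart" I mean, one node at a time, roughly:

nodetool drain                    # flush memtables, stop accepting connections
sudo systemctl restart cassandra  # or however the service is managed
nodetool status                   # wait for the node to be back UN before moving on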

Thanks
Marco


Re: different query result after a rerun of the same query

2019-04-30 Thread Marco Gasparini
> My guess is the initial query was causing a read repair so, on subsequent
queries, there were replicas of the data on every node and it still got
returned at consistency one
got it

>There are a number of ways the data could have become inconsistent in the
first place - eg  badly overloaded or down nodes, changes in topology
without following proper procedure, etc
I actually perform a repair every day (because I have a lot of deletes).
The topology has not been changed for months.
I usually don't have down nodes, but I do have a high workload every night
that lasts for about 2-3 hours. I'm monitoring Cassandra's performance via
prometheus+grafana and I noticed that reads are too slow, with about 10-15
second latency; writes are much faster, about 600-700 us. I'm using
non-SSD drives on the nodes.
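For reference, such a daily schedule is typically a primary-range repair run
on each node in turn, something like:

nodetool repair -pr mkp_history    # keyspace name taken from the query in this thread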









Il giorno lun 29 apr 2019 alle ore 22:36 Ben Slater <
ben.sla...@instaclustr.com> ha scritto:

> My guess is the initial query was causing a read repair so, on subsequent
> queries, there were replicas of the data on every node and it still got
> returned at consistency one.
>
> There are a number of ways the data could have become inconsistent in the
> first place - eg  badly overloaded or down nodes, changes in topology
> without following proper procedure, etc.
>
> Cheers
> Ben
>
> ---
>
>
> *Ben Slater**Chief Product Officer*
>
> <https://www.instaclustr.com/platform/>
>
> <https://www.facebook.com/instaclustr>   <https://twitter.com/instaclustr>
><https://www.linkedin.com/company/instaclustr>
>
> Read our latest technical blog posts here
> <https://www.instaclustr.com/blog/>.
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>
>
> On Mon, 29 Apr 2019 at 19:50, Marco Gasparini <
> marco.gaspar...@competitoor.com> wrote:
>
>> thank you Ben for the reply.
>>
>> > You haven’t said what consistency level you are using. CQLSH by default
>> uses consistency level one which may be part of the issue - try using a
>> higher level (eg CONSISTENCY QUOROM)
>> yes, actually I used CQLSH so the consistency level was set to ONE. After
>> I changed it I get the right results.
>>
>> >After results are returned correctly are they then returned correctly
>> for all future runs?
>> yes it seems that after they returned I can get access to them at each
>> run of the same query on each node i run it.
>>
>> > When was the data inserted (relative to your attempt to query it)?
>> about a day before the query
>>
>>
>> Thanks
>>
>>
>> Il giorno lun 29 apr 2019 alle ore 10:29 Ben Slater <
>> ben.sla...@instaclustr.com> ha scritto:
>>
>>> You haven’t said what consistency level you are using. CQLSH by default
>>> uses consistency level one which may be part of the issue - try using a
>>> higher level (eg CONSISTENCY QUOROM).
>>>
>>> After results are returned correctly are they then returned correctly
>>> for all future runs? When was the data inserted (relative to your attempt
>>> to query it)?
>>>
>>> Cheers
>>> Ben
>>>
>>> ---
>>>
>>>
>>> *Ben Slater**Chief Product Officer*
>>>
>>> <https://www.instaclustr.com/platform/>
>>>
>>> <https://www.facebook.com/instaclustr>
>>> <https://twitter.com/instaclustr>
>>> <https://www.linkedin.com/company/instaclustr>
>>>
>>> Read our latest technical blog posts here
>>> <https://www.instaclustr.com/blog/>.
>>>
>>> This email has been sent on behalf of Instaclustr Pty. Limited
>>> (Australia) and Instaclustr Inc (USA).
>>>
>>> This email and any attachments may contain confidential and legally
>>> privileged information.  If you are not the intended recipient, do not copy
>>> or disclose its content, but please reply to this email immediately and
>>> highlight the error to the sender and then immediately delete the message.
>>>
>>>
>>> On Mon, 29 Apr 2019 at 17:57, Marco Gasparini <
>>> marco.gaspar...@competitoor.com> wrote:
>>>
>>>

Re: different query result after a rerun of the same query

2019-04-29 Thread Marco Gasparini
thank you Ben for the reply.

> You haven’t said what consistency level you are using. CQLSH by default
uses consistency level one which may be part of the issue - try using a
higher level (eg CONSISTENCY QUORUM)
yes, actually I used CQLSH so the consistency level was set to ONE. After I
changed it I got the right results.
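In cqlsh that is simply:

cqlsh> CONSISTENCY QUORUM;
cqlsh> select event_datetime, id_url, uuid, num_pages from mkp_history.mkp_lookup where id_url= 1455425 and url_type='mytype';

(CONSISTENCY is a cqlsh-only setting, so in the application the equivalent
consistency option has to be set on the driver side.)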

>After results are returned correctly are they then returned correctly for
all future runs?
yes, it seems that once they are returned correctly I get them on every run
of the same query, on each node I run it from.

> When was the data inserted (relative to your attempt to query it)?
about a day before the query


Thanks


Il giorno lun 29 apr 2019 alle ore 10:29 Ben Slater <
ben.sla...@instaclustr.com> ha scritto:

> You haven’t said what consistency level you are using. CQLSH by default
> uses consistency level one which may be part of the issue - try using a
> higher level (eg CONSISTENCY QUORUM).
>
> After results are returned correctly are they then returned correctly for
> all future runs? When was the data inserted (relative to your attempt to
> query it)?
>
> Cheers
> Ben
>
> ---
>
>
> *Ben Slater**Chief Product Officer*
>
> <https://www.instaclustr.com/platform/>
>
> <https://www.facebook.com/instaclustr>   <https://twitter.com/instaclustr>
><https://www.linkedin.com/company/instaclustr>
>
> Read our latest technical blog posts here
> <https://www.instaclustr.com/blog/>.
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>
>
> On Mon, 29 Apr 2019 at 17:57, Marco Gasparini <
> marco.gaspar...@competitoor.com> wrote:
>
>> Hi all,
>>
>> I'm using Cassandra 3.11.3.5.
>>
>> I have just noticed that when I perform a query I get 0 result but if I
>> launch that same query after few seconds I get the right result.
>>
>> I have traced the query:
>>
>> cqlsh> select event_datetime, id_url, uuid, num_pages from
>> mkp_history.mkp_lookup where id_url= 1455425 and url_type='mytype' ;
>>
>>  event_datetime | id_url | uuid | num_pages
>> ++--+---
>>
>> (0 rows)
>>
>> Tracing session: dda9d1a0-6a51-11e9-9e36-f54fe3235e69
>>
>>  activity
>>
>>  | timestamp  | source| source_elapsed
>> | client
>>
>> --++---++---
>>
>>
>>  Execute CQL3 query | 2019-04-29 09:39:05.53 | 10.8.0.10 |
>> 0 | 10.8.0.10
>>  Parsing select event_datetime, id_url, uuid, num_pages from
>> mkp_history.mkp_lookup where id_url= 1455425 and url_type=' mytype'\n;
>> [Native-Transport-Requests-2] | 2019-04-29 09:39:05.53 | 10.8.0.10 |
>> 238 | 10.8.0.10
>>
>>   Preparing statement
>> [Native-Transport-Requests-2] | 2019-04-29 09:39:05.53 | 10.8.0.10 |
>> 361 | 10.8.0.10
>>
>>  reading data from /10.8.0.38
>> [Native-Transport-Requests-2] | 2019-04-29 09:39:05.531000 | 10.8.0.10 |
>> 527 | 10.8.0.10
>>
>> Sending READ message to /10.8.0.38
>> [MessagingService-Outgoing-/10.8.0.38-Small] | 2019-04-29 09:39:05.531000 |
>> 10.8.0.10 |620 | 10.8.0.10
>>
>>READ message received from /10.8.0.10
>> [MessagingService-Incoming-/10.8.0.10] | 2019-04-29 09:39:05.535000 |
>> 10.8.0.8 | 44 | 10.8.0.10
>>
>>   speculating read retry on /10.8.0.8
>> [Native-Transport-Requests-2] | 2019-04-29 09:39:05.535000 | 10.8.0.10 |
>>4913 | 10.8.0.10
>>
>>Executing single-partition query on
>> mkp_lookup [ReadStage-2] | 2019-04-29 09:39:05.535000 |  10.8.0.8 |
>> 304 | 10.8.0.10
>>
>>   Sending READ message to /10.8.0.8
>> [MessagingService-Outgoing-/10.8.0.8-Small] | 2019-04-29 09:39:05.535000 |
>> 10.8.0.10 |   4970 | 10.8.0.10
>>
>>  Acquiring sstable
>> references [ReadStage-2] | 

different query result after a rerun of the same query

2019-04-29 Thread Marco Gasparini
Hi all,

I'm using Cassandra 3.11.3.5.

I have just noticed that when I run a query I get 0 results, but if I
run that same query again after a few seconds I get the right result.

I have traced the query:

cqlsh> select event_datetime, id_url, uuid, num_pages from
mkp_history.mkp_lookup where id_url= 1455425 and url_type='mytype' ;

 event_datetime | id_url | uuid | num_pages
++--+---

(0 rows)

Tracing session: dda9d1a0-6a51-11e9-9e36-f54fe3235e69

 activity

   | timestamp  | source| source_elapsed |
client
--++---++---


 Execute CQL3 query | 2019-04-29 09:39:05.53 | 10.8.0.10 |
0 | 10.8.0.10
 Parsing select event_datetime, id_url, uuid, num_pages from
mkp_history.mkp_lookup where id_url= 1455425 and url_type=' mytype'\n;
[Native-Transport-Requests-2] | 2019-04-29 09:39:05.53 | 10.8.0.10 |
238 | 10.8.0.10

Preparing statement
[Native-Transport-Requests-2] | 2019-04-29 09:39:05.53 | 10.8.0.10 |
361 | 10.8.0.10

   reading data from /10.8.0.38
[Native-Transport-Requests-2] | 2019-04-29 09:39:05.531000 | 10.8.0.10 |
527 | 10.8.0.10

  Sending READ message to /10.8.0.38
[MessagingService-Outgoing-/10.8.0.38-Small] | 2019-04-29 09:39:05.531000 |
10.8.0.10 |620 | 10.8.0.10

 READ message received from /10.8.0.10
[MessagingService-Incoming-/10.8.0.10] | 2019-04-29 09:39:05.535000 |
10.8.0.8 | 44 | 10.8.0.10

speculating read retry on /10.8.0.8
[Native-Transport-Requests-2] | 2019-04-29 09:39:05.535000 | 10.8.0.10 |
   4913 | 10.8.0.10

 Executing single-partition query on mkp_lookup
[ReadStage-2] | 2019-04-29 09:39:05.535000 |  10.8.0.8 |304 |
10.8.0.10

Sending READ message to /10.8.0.8
[MessagingService-Outgoing-/10.8.0.8-Small] | 2019-04-29 09:39:05.535000 |
10.8.0.10 |   4970 | 10.8.0.10

   Acquiring sstable references
[ReadStage-2] | 2019-04-29 09:39:05.536000 |  10.8.0.8 |391 |
10.8.0.10

 Bloom filter allows skipping sstable 1
[ReadStage-2] | 2019-04-29 09:39:05.536000 |  10.8.0.8 |490 |
10.8.0.10

  Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones
[ReadStage-2] | 2019-04-29 09:39:05.536000 |  10.8.0.8 |549 |
10.8.0.10

  Merged data from memtables and 0 sstables
[ReadStage-2] | 2019-04-29 09:39:05.536000 |  10.8.0.8 |697 |
10.8.0.10

 Read 0 live rows and 0 tombstone cells
[ReadStage-2] | 2019-04-29 09:39:05.536000 |  10.8.0.8 |808 |
10.8.0.10

   Enqueuing response to /10.8.0.10
[ReadStage-2] | 2019-04-29 09:39:05.536000 |  10.8.0.8 |896 |
10.8.0.10

Sending REQUEST_RESPONSE message to /10.8.0.10
[MessagingService-Outgoing-/10.8.0.10-Small] | 2019-04-29 09:39:05.536000
|  10.8.0.8 |   1141 | 10.8.0.10

   REQUEST_RESPONSE message received from /10.8.0.8
[MessagingService-Incoming-/10.8.0.8] | 2019-04-29 09:39:05.539000 |
10.8.0.10 |   8627 | 10.8.0.10

  Processing response from /10.8.0.8
[RequestResponseStage-3] | 2019-04-29 09:39:05.539000 | 10.8.0.10 |
   8739 | 10.8.0.10


 Request complete | 2019-04-29 09:39:05.538823 | 10.8.0.10 |   8823
| 10.8.0.10



And here I rerun the query just after few seconds:


cqlsh> select event_datetime, id_url, uuid, num_pages from
mkp_history.mkp_lookup where id_url= 1455425 and url_type='mytype';

 event_datetime  | id_url  | uuid
   | num_pages
-+-+--+---
 2019-04-15 21:32:27.031000+ | 1455425 |
91114c7d-3dd3-4913-ac9c-0dfa12b4198b | 1
 2019-04-14 21:34:23.63+ | 1455425 |
e97b160d-3901-4550-9ce6-36893a6dcd90 | 1
 2019-04-11 21:57:23.025000+ | 1455425 |
1566cc7c-7893-43f0-bffe-caab47dec851 | 1

(3 rows)

Tracing session: f4b7eb20-6a51-11e9-9e36-f54fe3235e69

 activity

 | timestamp  | source| source_elapsed |
client
++---++---


 Execute CQL3 query | 2019-04-29 09:39:44.21 | 10.8.0.10 |
0 | 10.8.0.10
 Parsing select event_datetime, id_url, uuid, num_pages from
mkp_history.mkp_lookup where id_url= 1455425 and 

Re: too many logDroppedMessages and StatusLogger

2019-03-12 Thread Marco Gasparini
thanks for the answer Nate,

my queries are more like the following:
select f1,f2,f3, bigtxt from  mytable  where f1= ? and f2= ? limit 10;
insert into mytable (f1,f2,f3,bigtxt) values (?,?,?,?)

Actually, I have a text field (bigtxt) that can be > 1MB.
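For what it's worth, the same per-stage pending/dropped counters that
StatusLogger prints can be pulled on demand with:

nodetool tpstats    # thread pool stats plus a per-verb dropped message summary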

Marco

Il giorno lun 11 mar 2019 alle ore 22:21 Nate McCall 
ha scritto:

> Are you using queries with a large number of arguments to an IN clause
> on a partition key? If so, the coordinator has to:
> - hold open the client request
> - unwind the IN clause into individual statements
> - scatter/gathering those statements around the cluster (each at the
> requested consistency level!)
> - pull it all back together and send it out
>
> In extreme cases, this can flood internode messaging and make things
> look slow even when the system is near idle.
>
> On Fri, Mar 8, 2019 at 9:27 PM Marco Gasparini
>  wrote:
> >
> > Hi all,
> >
> > I cannot understand why I get the following logs, they appear every day
> at not fixed period of time. I saw them every 2 minutes or every 10
> seconds, I cannot find any pattern.
> > I took this very example here during an heavy workload of writes and
> reads but I get them also during a very little workload and without any
> active compaction/repair/streaming process and no high cpu/memory/iowait
> usage.
> >
> >> 2019-03-08 01:49:47,868 INFO  [ScheduledTasks:1]
> MessagingService.java:1246 logDroppedMessages READ messages were dropped in
> last 5000 ms: 0 internal and 1 cross node. Mean internal dropped latency:
> 6357 ms and Mean cross-node dropped latency: 6556 ms
> >> 2019-03-08 01:49:47,868 INFO  [ScheduledTasks:1] StatusLogger.java:47
> log Pool NameActive   Pending  Completed   Blocked
> All Time Blocked
> >> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log MutationStage 0 0   17641121 0
>0
> >> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log ViewMutationStage 0 0  0 0
>0
> >> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log ReadStage 0 06851090 0
>0
> >> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log RequestResponseStage  0 0   13646587 0
>0
> >> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log ReadRepairStage   0 0 352884 0
>0
> >> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log CounterMutationStage  0 0  0 0
>0
> >> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log MiscStage 0 0  0 0
>0
> >> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log CompactionExecutor0 0 882478 0
>0
> >> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log MemtableReclaimMemory 0 0   4101 0
>0
> >> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log PendingRangeCalculator0 0  7 0
>0
> >> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log GossipStage   0 04399705 0
>0
> >> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log SecondaryIndexManagement  0 0  0 0
>0
> >> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log HintsDispatcher   0 0   2165 0
>0
> >> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log MigrationStage0 0 50 0
>0
> >> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log MemtablePostFlush 0 0   4393 0
>0
> >> 2019-03-08 01:49:47,872 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log PerDiskMemtableFlushWriter_0 0 0   4097
>  0 0
> >> 2019-03-08 01:49:47,872 INFO  [ScheduledTasks:1] StatusLogger.java:51
> log ValidationExecutor0 0

too many logDroppedMessages and StatusLogger

2019-03-08 Thread Marco Gasparini
Hi all,

I cannot understand why I get the following logs; they appear every day at
no fixed interval. I have seen them every 2 minutes or every 10 seconds, and I
cannot find any pattern.
I took this particular example during a heavy workload of writes and reads,
but I also get them during a very light workload, without any active
compaction/repair/streaming process and with no high cpu/memory/iowait usage.

2019-03-08 01:49:47,868 INFO  [ScheduledTasks:1] MessagingService.java:1246
> logDroppedMessages READ messages were dropped in last 5000 ms: 0 internal
> and 1 cross node. Mean internal dropped latency: 6357 ms and Mean
> cross-node dropped latency: 6556 ms
> 2019-03-08 01:49:47,868 INFO  [ScheduledTasks:1] StatusLogger.java:47 log
> Pool NameActive   Pending  Completed   Blocked  All
> Time Blocked
> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> MutationStage 0 0   17641121 0
>0
> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> ViewMutationStage 0 0  0 0
>0
> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> ReadStage 0 06851090 0
>0
> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> RequestResponseStage  0 0   13646587 0
>0
> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> ReadRepairStage   0 0 352884 0
>0
> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> CounterMutationStage  0 0  0 0
>0
> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> MiscStage 0 0  0 0
>0
> 2019-03-08 01:49:47,870 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> CompactionExecutor0 0 882478 0
>0
> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> MemtableReclaimMemory 0 0   4101 0
>0
> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> PendingRangeCalculator0 0  7 0
>0
> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> GossipStage   0 04399705 0
>0
> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> SecondaryIndexManagement  0 0  0 0
>0
> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> HintsDispatcher   0 0   2165 0
>0
> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> MigrationStage0 0 50 0
>0
> 2019-03-08 01:49:47,871 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> MemtablePostFlush 0 0   4393 0
>0
> 2019-03-08 01:49:47,872 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> PerDiskMemtableFlushWriter_0 0 0   4097 0
>0
> 2019-03-08 01:49:47,872 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> ValidationExecutor0 0   1565 0
>0
> 2019-03-08 01:49:47,872 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> Sampler   0 0  0 0
>0
> 2019-03-08 01:49:47,872 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> MemtableFlushWriter   0 0   4101 0
>0
> 2019-03-08 01:49:47,872 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> InternalResponseStage 0 0 121813 0
>0
> 2019-03-08 01:49:47,872 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> AntiEntropyStage  0 0   6997 0
>0
> 2019-03-08 01:49:47,873 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> CacheCleanupExecutor  0 0 33 0
>0
> 2019-03-08 01:49:47,873 INFO  [ScheduledTasks:1] StatusLogger.java:51 log
> Native-Transport-Requests 4 0  364526951 0
>0
> 2019-03-08 01:49:47,873 INFO  [ScheduledTasks:1] StatusLogger.java:61 log
> CompactionManager 0 0
> 2019-03-08 01:49:47,873 INFO  [ScheduledTasks:1] StatusLogger.java:73 log
> MessagingServicen/a   2/0
> 2019-03-08 01:49:47,873 INFO  [ScheduledTasks:1] StatusLogger.java:83 log
> Cache Type Size 

MonitoringTask logSlowOperations show wront query message

2019-01-14 Thread Marco Gasparini
hi everyone,

I have activated DEBUG mode via nodetool setlogginglevel, and now system.log
shows me slow queries (slower than 500ms), but the log is showing me the
wrong query. My query looks like this:

SELECT pkey, f1, f2, f3 FROM mykeyspace.mytable WHERE pkey='xxx' LIMIT 3000;

but MonitoringTask shows me selecting all (*) of the table's fields,
like this:

, time 775
msec - slow timeout 500 msec/cross-node

I'm querying Cassandra from my pc using NodeJS cassandra-driver and I have
checked the outbound query via Wireshark and it is correct.

is this a real issue or just a matter of wrong text in the log?

Thanks
Marco


Re: [EXTERNAL] fine tuning for wide rows and mixed workload system

2019-01-11 Thread Marco Gasparini
Hi Sean,

> I will start – knowing that others will have additional help/questions
I hope so, I really need help with this :)

> What heap size are you using? Sounds like you are using the CMS garbage
collector.

Yes, I'm using the CMS garbage collector. I have not used G1 because I read it
isn't recommended, but if you are saying it is going to help with my
use case I have no objection to using it. I will try.
I have 3 nodes: node1 has 32GB of RAM, node2 and node3 have 16GB. I'm currently
using 50% of the RAM on each node.
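If I go the G1 route, my understanding is that it is just a matter of editing
conf/jvm.options (3.11 ships a commented-out G1 section), along the lines of:

-Xms16G
-Xmx16G
-XX:+UseG1GC

with the CMS flags commented out; the 16G here is only an example (half the
RAM of the 32GB node).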


> Spinning disks are a problem, too. Can you tell if the IO is getting
overwhelmed? SSDs are much preferred.

I'm not sure about it; 'dstat' and 'iostat' tell me that rMB/s is
constantly above 100MB/s and %util is close to 100%, and in these
conditions the node is frozen.
The HDD specs say that the maximum transfer rate is 175MB/s for node1 and
155MB/s for node2 and node3.
Unfortunately switching from spinning disks to SSDs is not an option.



> Read before write is usually an anti-pattern for Cassandra. From your
queries, it seems you have a partition key and clustering key.
Can you give us the table schema? I’m also concerned about the IF EXISTS in
your delete.
I think that invokes a light weight transaction – costly for performance.
Is it really required for your use case?

I don't need the 'IF EXISTS' clause. It is actually pretty much a leftover
from an old query, and I can try to remove it.
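That is, the delete would become simply:

delete from my_keyspace.my_table where pkey = ? and event_datetime = ?;

which, as I understand it, avoids the Paxos round that the lightweight
transaction triggers.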

Here is the schema:

CREATE KEYSPACE my_keyspace WITH replication = {'class':
'NetworkTopologyStrategy', 'DC1': '3'}  AND durable_writes = false;
CREATE TABLE my_keyspace.my_table (
pkey text,
event_datetime timestamp,
f1 text,
f2 text,
f3 text,
f4 text,
f5 int,
f6 bigint,
f7 bigint,
f8 text,
f9 text,
PRIMARY KEY (pkey, event_datetime)
) WITH CLUSTERING ORDER BY (event_datetime DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 9
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';


Thank you very much
Marco

Il giorno ven 11 gen 2019 alle ore 16:14 Durity, Sean R <
sean_r_dur...@homedepot.com> ha scritto:

> I will start – knowing that others will have additional help/questions.
>
>
>
> What heap size are you using? Sounds like you are using the CMS garbage
> collector. That takes some arcane knowledge and lots of testing to tune. I
> would start with G1 and using ½ the available RAM as the heap size. I would
> want 32 GB RAM as a minimum on the hosts.
>
>
>
> Spinning disks are a problem, too. Can you tell if the IO is getting
> overwhelmed? SSDs are much preferred.
>
>
>
> Read before write is usually an anti-pattern for Cassandra. From your
> queries, it seems you have a partition key and clustering key. Can you give
> us the table schema? I’m also concerned about the IF EXISTS in your delete.
> I think that invokes a light weight transaction – costly for performance.
> Is it really required for your use case?
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Marco Gasparini 
> *Sent:* Friday, January 11, 2019 8:20 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] fine tuning for wide rows and mixed workload system
>
>
>
> Hello everyone,
>
>
>
> I need some advise in order to solve my use case problem. I have already
> tried some solutions but it didn't work out.
>
> Can you help me with the following configuration please? any help is very
> appreciate
>
>
>
> I'm using:
>
> - Cassandra 3.11.3
>
> - java version "1.8.0_191"
>
>
>
> My use case is composed by the following constraints:
>
> - about 1M reads per day (it is going to rise up)
>
> - about 2M writes per day (it is going to rise up)
>
> - there is a high peek of requests in less than 2 hours in which the
> system receives half of all day traffic (500K reads, 1M writes)
>
> - each request is composed by 1 read and 2 writes (1 delete + 1 write)
>
>
>
> * the read query selects max 3 records based on the primary
> key (select * from my_keyspace.my_table where pkey = ? limit 3)
>
> * then is performed a deletion of one record (delete from
> my_keyspace.my_table where pkey = ? and event_datetime = ? IF EXISTS)
>
> * finally t

fine tuning for wide rows and mixed workload system

2019-01-11 Thread Marco Gasparini
Hello everyone,

I need some advice in order to solve my use case problem. I have already
tried some solutions but they didn't work out.
Can you help me with the following configuration please? Any help is very
much appreciated.

I'm using:
- Cassandra 3.11.3
- java version "1.8.0_191"

My use case has the following constraints:
- about 1M reads per day (and rising)
- about 2M writes per day (and rising)
- there is a high peak of requests in less than 2 hours, in which the system
receives half of the whole day's traffic (500K reads, 1M writes)
- each request consists of 1 read and 2 writes (1 delete + 1 write)
* the read query selects at most 3 records based on the primary key (select *
from my_keyspace.my_table where pkey = ? limit 3)
* then one record is deleted (delete from
my_keyspace.my_table where pkey = ? and event_datetime = ? IF EXISTS)
* finally the new data is stored (insert into my_keyspace.my_table
(event_datetime, pkey, agent, some_id, ft, ftt..) values (?,?,?,?,?,?...))

- each row is pretty wide. I don't really know the exact size because there
are 2 dynamic text columns that store between 1MB and 50MB of data
each.
  So, reads are going to be huge because I read 3 records of that size
every time. Writes are heavy as well because each row is that wide.

Currently, I own 3 nodes with the following properties:
- node1:
* Intel Core i7-3770
* 2x HDD SATA 3,0 TB
* 4x RAM 8192 MB DDR3
* nominative bit rate 175MB/s
# blockdev --report /dev/sd[ab]
RORA   SSZ   BSZ   StartSecSize   Device
rw   256   512  4096  0   3000592982016   /dev/sda
rw   256   512  4096  0   3000592982016   /dev/sdb
- node2,3:
* Intel Core i7-2600
* 2x HDD SATA 3,0 TB
* 4x RAM 4096 MB DDR3
* nominative bit rate 155MB/s
# blockdev --report /dev/sd[ab]
RORA   SSZ   BSZ   StartSecSize   Device
rw   256   512  4096  0   3000592982016   /dev/sda
rw   256   512  4096  0   3000592982016   /dev/sdb
Each node has 2 disks, but I have disabled the RAID option and created a
single virtual disk in order to get as much free space as possible.
Can this configuration create issues?
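For reference, Cassandra can also be pointed at both disks directly
(JBOD-style) instead of a single virtual disk, via cassandra.yaml; the paths
here are just placeholders:

data_file_directories:
    - /mnt/disk1/cassandra/data
    - /mnt/disk2/cassandra/data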

I have already tried some configurations in order to make it work, like:
1) straightforward attempt
- default Cassandra configuration (cassandra.yaml)
- RF=1
- SizeTieredCompactionStrategy  (write strategy)
- no row cache (because of the wide row size it is better to have no row
cache)
- gc_grace_seconds = 1 day (unfortunately, I did no repair schedule at all)
results:
too many timeouts, losing data

2)
- added repair schedules
- RF=3 (in order to increase read speed)
results:
- too many timeouts, losing data
- high I/O consumption on each node (iostat shows 100% in %util on each
node, dstat shows hundreds of MB read on each iteration)
- node2 frozen until I stopped data writes.
- node3 almost frozen
- many pending MutationStage events in TPSTATS in node2
- many full GC
- many HintsDispatchExecutor events in system.log
3) current attempt
- added repair schedules
- RF=3
- set durable_writes = false in order to speed up writes
- increased young heap
- decreased SurvivorRatio in order to have more young-generation space available
because of the wide row data
- increased MaxTenuringThreshold from 1 to 3 in order to decrease read
latency
- increased Cassandra's memtable onheap and offheap sizes because of
the wide row data
- changed memtable_allocation_type to offheap_objects because of the wide row
data
results:
- better GC performance on nodes1 and node3
- still high I/O consumption on each node (iostat shows 100% in %util on
each node, dstat shows hundreds of MB read on each iteration)
- still node2 completely frozen
- many pending MutationStage events in TPSTATS in node2
- many HintsDispatchExecutor events in system.log on each node

I cannot go to AWS; I can only get dedicated servers.
Do you have any suggestions to fine-tune the system for this use case?

Thank you
Marco


Re: [EXTERNAL] Writes and Reads with high latency

2018-12-28 Thread Marco Gasparini
 to all your questions
Thank you very much!

Regards
Marco


Il giorno gio 27 dic 2018 alle ore 21:09 Durity, Sean R <
sean_r_dur...@homedepot.com> ha scritto:

> Your RF is only 1, so the data only exists on one node. This is not
> typically how Cassandra is used. If you need the high availability and low
> latency, you typically set RF to 3 per DC.
>
>
>
> How many event_datetime records can you have per pkey? How many pkeys
> (roughly) do you have? In general, you only want to have at most 100 MB of
> data per partition (pkey). If it is larger than that, I would expect some
> timeouts. And because only one node has the data, a single timeout means
> you won’t get any data. Server timeouts default to just 10 seconds. The
> secret to Cassandra is to always select your data by at least the primary
> key (which you are doing). So, I suspect you either have very wide rows or
> lots of tombstones.
>
>
>
> Since you mention lots of deletes, I am thinking it could be tombstones.
> Are you getting any tombstone warnings or errors in your system.log? When
> you delete, are you deleting a full partition? If you are deleting just
> part of a partition over and over, I think you will be creating too many
> tombstones. I try to design my data partitions so that deletes are for a
> full partition. Then I won’t be reading through 1000s (or more) tombstones
> trying to find the live data.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Marco Gasparini 
> *Sent:* Thursday, December 27, 2018 3:01 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: [EXTERNAL] Writes and Reads with high latency
>
>
>
> Hello Sean,
>
>
>
> here my schema and RF:
>
>
>
> -
>
> CREATE KEYSPACE my_keyspace WITH replication = {'class':
> 'NetworkTopologyStrategy', 'DC1': '1'}  AND durable_writes = true;
>
>
>
> CREATE TABLE my_keyspace.my_table (
>
> pkey text,
>
> event_datetime timestamp,
>
> agent text,
>
> ft text,
>
> ftt text,
>
> some_id bigint,
>
> PRIMARY KEY (pkey, event_datetime)
>
> ) WITH CLUSTERING ORDER BY (event_datetime DESC)
>
> AND bloom_filter_fp_chance = 0.01
>
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>
> AND comment = ''
>
> AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32', 'min_threshold': '4'}
>
> AND compression = {'chunk_length_in_kb': '64', 'class':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>
> AND crc_check_chance = 1.0
>
> AND dclocal_read_repair_chance = 0.1
>
> AND default_time_to_live = 0
>
> AND gc_grace_seconds = 9
>
> AND max_index_interval = 2048
>
> AND memtable_flush_period_in_ms = 0
>
> AND min_index_interval = 128
>
> AND read_repair_chance = 0.0
>
> AND speculative_retry = '99PERCENTILE';
>
>
>
> -
>
>
>
> Queries I make are very simple:
>
>
>
> select pkey, event_datetime, ft, some_id, ftt from my_keyspace.my_table
> where pkey = ? limit ?;
>
> and
>
> insert into my_keyspace.my_table (event_datetime, pkey, agent, some_id,
> ft, ftt) values (?,?,?,?,?,?);
>
>
>
> About Retry policy, the answer is yes, actually when a write fails I store
> it somewhere else and, after a period, a try to write it to Cassandra
> again. This way I can store almost all my data, but when the problem is the
> read I don't apply any Retry policy (but this is my problem)
>
>
>
>
>
> Thanks
>
> Marco
>
>
>
>
>
> Il giorno ven 21 dic 2018 alle ore 17:18 Durity, Sean R <
> sean_r_dur...@homedepot.com> ha scritto:
>
> Can you provide the schema and the queries? What is the RF of the keyspace
> for the data? Are you using any Retry policy on your Cluster object?
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Marco Gasparini 
> *Sent:* Friday, December 21, 2018 10:45 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Writes and Reads with high latency
>
>
>
> hello all,
>
>
>
> I have 1 DC of 3 nodes in which is running Cassandra 3.11.3 with
> consistency level ONE and Java 1.8.0_191.
>
>
>
> Every day, there are many nodejs programs that send data to the
> cassandra's cluster via NodeJs cassandra-driver.
>
> Every day I got like 600k requests. Each request makes the server to:
>
> 1_ READ some data in Cassandra (by an id, usually I get 3 records),
>
> 2_ DELETE one of those 

Re: [EXTERNAL] Writes and Reads with high latency

2018-12-27 Thread Marco Gasparini
Hello Sean,

here my schema and RF:

-
CREATE KEYSPACE my_keyspace WITH replication = {'class':
'NetworkTopologyStrategy', 'DC1': '1'}  AND durable_writes = true;

CREATE TABLE my_keyspace.my_table (
pkey text,
event_datetime timestamp,
agent text,
ft text,
ftt text,
some_id bigint,
PRIMARY KEY (pkey, event_datetime)
) WITH CLUSTERING ORDER BY (event_datetime DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 9
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

-

Queries I make are very simple:

select pkey, event_datetime, ft, some_id, ftt from my_keyspace.my_table
where pkey = ? limit ?;
and
insert into my_keyspace.my_table (event_datetime, pkey, agent, some_id, ft,
ftt) values (?,?,?,?,?,?);

About the retry policy, the answer is yes: when a write fails I store
it somewhere else and, after a while, I try to write it to Cassandra
again. This way I can store almost all my data, but when the problem is a
read I don't apply any retry policy (and that is my problem).


Thanks
Marco


Il giorno ven 21 dic 2018 alle ore 17:18 Durity, Sean R <
sean_r_dur...@homedepot.com> ha scritto:

> Can you provide the schema and the queries? What is the RF of the keyspace
> for the data? Are you using any Retry policy on your Cluster object?
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Marco Gasparini 
> *Sent:* Friday, December 21, 2018 10:45 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Writes and Reads with high latency
>
>
>
> hello all,
>
>
>
> I have 1 DC of 3 nodes in which is running Cassandra 3.11.3 with
> consistency level ONE and Java 1.8.0_191.
>
>
>
> Every day, there are many nodejs programs that send data to the
> cassandra's cluster via NodeJs cassandra-driver.
>
> Every day I got like 600k requests. Each request makes the server to:
>
> 1_ READ some data in Cassandra (by an id, usually I get 3 records),
>
> 2_ DELETE one of those records
>
> 3_ WRITE the data into Cassandra.
>
>
>
> So every day I make many deletes.
>
>
>
> Every day I find errors like:
>
> "All host(s) tried for query failed. First host tried, 10.8.0.10:9042
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__10.8.0.10-3A9042_=DwMFaQ=MtgQEAMQGqekjTjiAhkudQ=aC_gxC6z_4f9GLlbWiKzHm1vucZTtVYWDDvyLkh8IaQ=Y2zNzOyvqOiHqZ5yvB1rO_X6C-HivNjXYN0bLLL-yZQ=2v42cyvuxcXJ0oMfUrRcY-kRno1SkM4CTEMi4n1k0Wo=>:
> Host considered as DOWN. See innerErrors"
>
> "Server timeout during write query at consistency LOCAL_ONE (0 peer(s)
> acknowledged the write over 1 required)"
>
> "Server timeout during write query at consistency SERIAL (0 peer(s)
> acknowledged the write over 1 required)"
>
> "Server timeout during read query at consistency LOCAL_ONE (0 peer(s)
> acknowledged the read over 1 required)"
>
>
>
> nodetool tablehistograms tells me this:
>
>
>
> Percentile  SSTables Write Latency  Read LatencyPartition
> SizeCell Count
>
>   (micros)  (micros)   (bytes)
>
> 50% 8.00379.02   1955.67
> 379022 8
>
> 75%10.00785.94 155469.30
> 65494917
>
> 95%12.00  17436.92 268650.95
>  162972235
>
> 98%12.00  25109.16 322381.14
>  234679942
>
> 99%12.00  30130.99 386857.37
>  337939150
>
> Min 0.00  6.87 88.15
>  104 0
>
> Max12.00  43388.63 386857.37
> 20924300   179
>
>
>
> in the 99% I noted that write and read latency is pretty high, but I don't
> know how to improve that.
>
> I can provide more statistics if needed.
>
>
>
> Is there any improvement I can make to the Cassandra's configuration in
> order to not to lose any data?
>
>
>

Writes and Reads with high latency

2018-12-21 Thread Marco Gasparini
hello all,

I have 1 DC of 3 nodes running Cassandra 3.11.3 with
consistency level ONE and Java 1.8.0_191.

Every day, there are many Node.js programs that send data to the Cassandra
cluster via the Node.js cassandra-driver.
Every day I get around 600k requests. Each request makes the server:
1_ READ some data in Cassandra (by an id, usually I get 3 records),
2_ DELETE one of those records
3_ WRITE the data into Cassandra.

So every day I make many deletes.

Every day I find errors like:
"All host(s) tried for query failed. First host tried, 10.8.0.10:9042: Host
considered as DOWN. See innerErrors"
"Server timeout during write query at consistency LOCAL_ONE (0 peer(s)
acknowledged the write over 1 required)"
"Server timeout during write query at consistency SERIAL (0 peer(s)
acknowledged the write over 1 required)"
"Server timeout during read query at consistency LOCAL_ONE (0 peer(s)
acknowledged the read over 1 required)"

nodetool tablehistograms tells me this:

Percentile  SSTables   Write Latency   Read Latency   Partition Size   Cell Count
                             (micros)       (micros)          (bytes)
50%             8.00          379.02        1955.67           379022            8
75%            10.00          785.94      155469.30           654949           17
95%            12.00        17436.92      268650.95          1629722           35
98%            12.00        25109.16      322381.14          2346799           42
99%            12.00        30130.99      386857.37          3379391           50
Min             0.00            6.87          88.15              104            0
Max            12.00        43388.63      386857.37         20924300          179

At the 99th percentile I notice that write and read latency is pretty high, but
I don't know how to improve that.
I can provide more statistics if needed.

Is there any improvement I can make to Cassandra's configuration in
order not to lose any data?

Thanks

Regards
Marco