[ 
https://issues.apache.org/jira/browse/CASSANDRA-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Constance Eustace updated CASSANDRA-9886:
-----------------------------------------
    Description: 
I was doing performance testing with the goal of moving our persistence engine
off batches and onto "async spray" with timestamps.

First of all, it seems fairly insane that the USING TIMESTAMP clause is in a
different location for INSERT (after VALUES), UPDATE (before SET), and DELETE
(before WHERE) statements, so for UPDATE and DELETE it sits in the middle of the
statement for no apparent good reason, although maybe it is there for
PostgreSQL compatibility.

This is a problem when code produces a large list of statements without USING
TIMESTAMP already in them, because how the list will be executed, in batches
(if we are grouping by partition key) or as single statements, may only be
determined later.

In that case, each single-statement update has to be rewritten to place the
USING TIMESTAMP clause correctly. It would be MUCH EASIER to simply append
"USING TIMESTAMP xxx" at the end of the CQL statement.

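To illustrate the placement problem, here is a minimal Python sketch of the
rewriting a client has to do today. The helper name with_timestamp is
hypothetical, not part of any driver. It assumes the statement has no USING
clause yet and that the words SET and WHERE do not appear earlier in the
statement as identifiers or inside string literals, so it is not a real CQL
parser.

```python
import re


def with_timestamp(stmt: str, ts: int) -> str:
    """Return stmt with USING TIMESTAMP placed where CQL expects it.

    Hypothetical helper, naive keyword matching only.
    """
    body = stmt.strip().rstrip(";").rstrip()
    if not body:
        raise ValueError("empty statement")
    clause = f"USING TIMESTAMP {ts}"
    verb = body.split(None, 1)[0].upper()

    if verb == "INSERT":
        # INSERT takes the clause at the very end, after VALUES (...).
        return f"{body} {clause};"
    if verb == "UPDATE":
        keyword = "SET"      # UPDATE t USING TIMESTAMP x SET ...
    elif verb == "DELETE":
        keyword = "WHERE"    # DELETE FROM t USING TIMESTAMP x WHERE ...
    else:
        raise ValueError(f"not a mutation statement: {verb}")

    # Splice the clause in front of the first occurrence of the keyword.
    match = re.search(rf"\b{keyword}\b", body, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"{verb} statement has no {keyword} clause")
    head = body[:match.start()].rstrip()
    tail = body[match.start():]
    return f"{head} {clause} {tail};"
```

If the clause were accepted at the end of every mutation statement, all three
branches would collapse into the simple append used for INSERT.
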
BATCH is easier, you just wrap the statements. Pretty basic.
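
For comparison, a minimal Python sketch of the batch wrapping, where the
timestamp goes once on the BEGIN BATCH line and the inner statements are left
untouched. The helper name wrap_in_batch is hypothetical, and it assumes each
input string is one complete mutation statement.

```python
def wrap_in_batch(statements, ts):
    """Wrap CQL mutation statements in one BATCH with a shared timestamp.

    Hypothetical helper: statements are passed through unchanged apart
    from normalizing the trailing semicolon.
    """
    lines = [f"BEGIN BATCH USING TIMESTAMP {ts}"]
    for stmt in statements:
        # Strip any trailing semicolon, then add exactly one back.
        lines.append("  " + stmt.strip().rstrip(";").rstrip() + ";")
    lines.append("APPLY BATCH;")
    return "\n".join(lines)
```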

I have done performance testing with single-statement BATCH USING TIMESTAMP,
and the performance is awful, worse than "NEVER EVER DO THIS" sync batches with
cross-partition updates.

Can we either allow USING TIMESTAMP at the end of every mutation statement, so
it is in the same place for all of them, or have the BATCH statement processing
check whether it's a single statement and reduce it to non-batch execution?

  was:
I was doing performance testing to get off of using batches for our persistence 
engine, and instead use "async spray" with timestamps. 

First of all, it seems fairly insane that the USING TIMESTAMP clause is in a 
different location for INSERT (before WHERE) and the UPDATE (before SET)  and 
the DELETE (before WHERE) statements... thus is in the middle of the statement 
for no real apparently good reason, although maybe there is some PostGresql 
compatibility. 

This means that if some code produces a large list of statements without the 
USING TIMESTAMP already in it, because the actual method of execution of a list 
of statements, which may use batches (if we were grouping by partition key) or 
not (single statement) may be determined later...

Then for single statement updates, the statement needs  to properly place the 
USING TIMESTAMP clause. It would be MUCH EASIER to all a simple append of 
"USING TIMESTAMP xxx" at the end of the CQL statement.

BATCH is easier, you just wrap the statements. Pretty basic.

I have done performance testing with single-statement BATCH USING TIMESTAMP and 
their performance is awful, worse that "NEVER EVER DO THIS" sync batches with 
cross-partition updates.

Can we either allow a USING TIMESTAMP to be at the end of all the mutation 
statements in the same place, or have a check in the BATCH statement processing 
to check if its a single statement and reduce it to non-batch execution?


> TIMESTAMP - allow USING TIMESTAMP at end of mutation CQL 
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9886
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Constance Eustace
>             Fix For: 2.1.x



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
