[ 
https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643663#comment-15643663
 ] 

Benjamin Lerer commented on CASSANDRA-12649:
--------------------------------------------

[~appodictic] If you have questions or concerns about any of the review 
feedback, do not hesitate to write them down on the ticket.

I realize now that my feedback may not have been very helpful. Sorry for that.

My main concern is that the size of the mutation will be measured on every 
mutation, and that doing so involves walking through the whole mutation to sum 
up its size. On nodes with heavy writes this might have a non-negligible CPU 
cost. It is true that we already compute the data size when we serialize the 
data, but there we have no other choice.
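To make the cost concern concrete, here is a minimal sketch of what summing a mutation's size involves. The classes {{CellSketch}}, {{PartitionUpdateSketch}} and {{MutationSketch}} are hypothetical stand-ins, not Cassandra's real classes; the only point is that the work grows with the number of cells visited, and that it would be paid on every write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT Cassandra's real classes.
final class CellSketch {
    final byte[] value;

    CellSketch(byte[] value) {
        this.value = value;
    }
}

final class PartitionUpdateSketch {
    final List<CellSketch> cells = new ArrayList<>();

    // Visits every cell of this partition update.
    long dataSize() {
        long size = 0;
        for (CellSketch cell : cells) {
            size += cell.value.length;
        }
        return size;
    }
}

final class MutationSketch {
    final List<PartitionUpdateSketch> updates = new ArrayList<>();

    // Visits every partition update, and through it every cell:
    // the cost is linear in the total number of cells in the mutation.
    long dataSize() {
        long size = 0;
        for (PartitionUpdateSketch update : updates) {
            size += update.dataSize();
        }
        return size;
    }
}
```

Recording a size metric on the write path would mean one such traversal per mutation, in addition to the one already done at serialization time.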

Setting up a benchmark for that takes some time, and unfortunately I do not 
have the time to do it myself.
If I had to do it, I would try to set up a JMH benchmark to check the 
throughput of the {{dataSize}} method on random data. It is too difficult to 
assess that type of thing on a running Cassandra node because of the JIT and 
the overall fluctuation in response time of the whole database (the impact of 
the change might end up being lost in the noise).
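As a rough illustration of the shape such a measurement could take, the sketch below generates random data with a fixed seed, warms up, and then times repeated size computations in isolation. It is a hand-rolled loop in plain Java, not JMH; a real JMH benchmark would replace the manual warm-up and timing with the harness's own handling of JIT effects. The class {{DataSizeThroughputSketch}}, its {{dataSize}} stand-in, and all sizes and iteration counts are assumptions for illustration, not the method from the patch.

```java
import java.util.Random;

// Hypothetical sketch: a mutation is modelled as an array of cell values.
final class DataSizeThroughputSketch {
    // Written after each run so the JIT cannot discard the measured loop.
    static volatile long sink;

    // Builds one random "mutation" with the given number of cells,
    // each cell holding between 1 and maxCellBytes bytes.
    static byte[][] randomMutation(Random rnd, int cells, int maxCellBytes) {
        byte[][] mutation = new byte[cells][];
        for (int i = 0; i < cells; i++) {
            mutation[i] = new byte[1 + rnd.nextInt(maxCellBytes)];
            rnd.nextBytes(mutation[i]);
        }
        return mutation;
    }

    // The operation under measurement: walk every cell and sum the sizes.
    static long dataSize(byte[][] mutation) {
        long size = 0;
        for (byte[] cell : mutation) {
            size += cell.length;
        }
        return size;
    }

    // Returns the number of dataSize calls per second over the given iterations.
    static double opsPerSecond(byte[][][] mutations, int iterations) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += dataSize(mutations[i % mutations.length]);
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        sink = total;
        return iterations * 1e9 / elapsedNanos;
    }

    public static void main(String[] args) {
        Random rnd = new Random(12649); // fixed seed for repeatable input
        byte[][][] mutations = new byte[256][][];
        for (int i = 0; i < mutations.length; i++) {
            mutations[i] = randomMutation(rnd, 1 + rnd.nextInt(64), 128);
        }
        opsPerSecond(mutations, 1_000_000); // warm-up run, result ignored
        double ops = opsPerSecond(mutations, 5_000_000);
        System.out.printf("dataSize throughput: %.0f ops/s%n", ops);
    }
}
```

Numbers from a loop like this are only indicative; they are useful mainly for comparing a variant with and without the extra traversal on the same machine.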

Regarding,
{quote}
I am also not fully convinced by the usefulness of Logged / Unlogged 
Partitions per batch distribution. Could you explain in more detail how it 
will be useful for you?
{quote}
I just wanted to understand why such a metric would be useful.

What I had not seen was the end of [~KurtG]'s comment:
bq. We have seen a lot of users mistakenly batch against multiple partitions.

Then my question is: is there not a better way of doing it? Should we not have 
a setting for rejecting batches against multiple partitions, or a warning?
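One possible shape for such a setting, as a sketch only: count the distinct partitions a batch touches and compare the count against a warn threshold and a reject threshold. The class {{BatchPartitionGuardSketch}}, both threshold names, and the "table:key" string encoding of a partition are hypothetical and chosen for illustration; none of this is taken from the patch or from Cassandra's code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical guard; names and thresholds are assumptions for illustration.
final class BatchPartitionGuardSketch {
    enum Action { ALLOW, WARN, REJECT }

    private final int warnThreshold;   // warn when more distinct partitions than this
    private final int rejectThreshold; // reject when more distinct partitions than this

    BatchPartitionGuardSketch(int warnThreshold, int rejectThreshold) {
        if (warnThreshold > rejectThreshold) {
            throw new IllegalArgumentException("warnThreshold must not exceed rejectThreshold");
        }
        this.warnThreshold = warnThreshold;
        this.rejectThreshold = rejectThreshold;
    }

    // Each entry identifies the partition one statement targets, encoded
    // here as "table:key" so the same key in two tables counts twice.
    Action check(List<String> targetedPartitions) {
        Set<String> distinct = new HashSet<>(targetedPartitions);
        if (distinct.size() > rejectThreshold) {
            return Action.REJECT;
        }
        if (distinct.size() > warnThreshold) {
            return Action.WARN;
        }
        return Action.ALLOW;
    }
}
```

With thresholds of 1 and 3, for example, a batch whose statements all hit one partition is allowed, a batch touching two partitions produces a warning, and a batch touching four is rejected.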

[~appodictic] My only concern is the quality of what goes into Cassandra. I am 
not trying to prevent anybody from contributing. If you do not understand my 
comments or do not agree with them, feel free to write it down on the ticket. 
My patches do not receive better treatment (have a look at CASSANDRA-10707, 
which took me 8 months); they rarely go in on the first attempt.

> Add BATCH metrics
> -----------------
>
>                 Key: CASSANDRA-12649
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12649
>             Project: Cassandra
>          Issue Type: Wish
>            Reporter: Alwyn Davis
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: 12649-3.x.patch, stress-batch-metrics.tar.gz, 
> stress-trunk.tar.gz, trunk-12649.txt
>
>
> To identify causes of load on a cluster, it would be useful to have some 
> additional metrics:
> * *Mutation size distribution:* I believe this would be relevant when 
> tracking the performance of unlogged batches.
> * *Logged / Unlogged Partitions per batch distribution:* This would also give 
> a count of batch types processed. Multiple distinct tables in batch would 
> just be considered as separate partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
