[ https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643663#comment-15643663 ]
Benjamin Lerer commented on CASSANDRA-12649:
--------------------------------------------

[~appodictic] If you have questions or concerns about review feedback, do not hesitate to write them down on the ticket. I realize now that my feedback was maybe not very helpful. Sorry for that.

My main concern is that the mutation size will be measured on every mutation, which involves going through the mutation to sum up its size. On nodes with heavy writes this might have a non-negligible CPU cost. It is true that we compute the data size when we serialize the data but, there, we do not have another choice.

Setting up a benchmark for that takes some time and unfortunately I do not have the time to do it myself. If I had to do it, I would try to set up a JMH benchmark to check the throughput of the {{dataSize}} method on random data. It is too difficult to assess that type of thing on a running Cassandra due to the JIT and the overall fluctuation in response time of the whole database (the impact of the change might end up being lost in the noise).

Regarding,
{quote}
I am also not fully convinced by the usefulness of Logged / Unlogged Partitions per batch distribution. Could you explain in more detail how it will be useful for you?
{quote}
I just wanted to understand why such a metric would be useful. What I did not see was the end of [~KurtG]'s comment:
bq. We have seen a lot of users mistakenly batch against multiple partitions.

Then my question is: is there not a better way of doing it? Should we not have a setting for rejecting multi-partition batches, or a warning?

[~appodictic] My only concern is the quality of what goes into Cassandra. I am not trying to prevent anybody from contributing. If you do not understand my comments or do not agree with them, feel free to write it down on the ticket.
My patches do not receive better treatment (have a look at CASSANDRA-10707, which took me 8 months); they rarely go in on the first attempt.

> Add BATCH metrics
> -----------------
>
>                 Key: CASSANDRA-12649
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12649
>             Project: Cassandra
>          Issue Type: Wish
>            Reporter: Alwyn Davis
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: 12649-3.x.patch, stress-batch-metrics.tar.gz, stress-trunk.tar.gz, trunk-12649.txt
>
>
> To identify causes of load on a cluster, it would be useful to have some additional metrics:
> * *Mutation size distribution:* I believe this would be relevant when tracking the performance of unlogged batches.
> * *Logged / Unlogged Partitions per batch distribution:* This would also give a count of batch types processed. Multiple distinct tables in batch would just be considered as separate partitions.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)