[ 
https://issues.apache.org/jira/browse/PHOENIX-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16019998#comment-16019998
 ] 

James Taylor commented on PHOENIX-3788:
---------------------------------------

On the client we track, for a given query, how big the batch size is (among 
many other metrics). These metrics drive the Splunk dashboards used to monitor 
Phoenix. I think this corner case is important to fix because otherwise alarms 
may go off when a user runs a DELETE with auto commit on. In that case the 
client would report a huge batch size when in actuality the work is broken up 
on the server side into batches. Or it might be the case that we're not even 
reporting these metrics, which is a hole, as we'd be severely undercounting the 
bytes/rows committed.

Here's what I think the simplest fix would be:
- In both DeleteCompiler and UpsertCompiler in the runOnServer branches (in the 
MutationPlan.execute method), pass through the client-side batch size on a new 
scan attribute.
- In the UngroupedAggregateRegionObserver.doPostScannerOpen() method, look 
first for the new scan attribute for the batch size and only use the 
server-side batch size config if it is not found.
- Create a new MutationState constructor that passes through the batch size and 
internally calls GLOBAL_MUTATION_BYTES.update() in a loop (basically simulating 
the batches that the server would have done).
- Use this new constructor in DeleteCompiler and UpsertCompiler for the 
runOnServer branches.
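
A rough sketch of the idea in the steps above, in plain Java rather than actual 
Phoenix code. Everything named here is hypothetical: the attribute name 
"_ClientBatchSize", the class, and both helper methods are placeholders, and a 
plain Map stands in for the scan's attributes. It only illustrates the 
"attribute first, server config as fallback" lookup and the loop that 
simulates the batches the server would have done.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BatchSizeSketch {
    // Hypothetical attribute name; the real one would be chosen in the patch.
    static final String CLIENT_BATCH_SIZE_ATTRIB = "_ClientBatchSize";

    // Server side: prefer the client-supplied batch size carried on the scan,
    // and fall back to the server-side config only if it is absent.
    static int resolveBatchSize(Map<String, byte[]> scanAttributes,
                                int serverConfigBatchSize) {
        byte[] value = scanAttributes.get(CLIENT_BATCH_SIZE_ATTRIB);
        return value == null ? serverConfigBatchSize
                             : ByteBuffer.wrap(value).getInt();
    }

    // Client side: split the total row count into the per-batch sizes the
    // server would have used, so a metric can be updated once per batch.
    static List<Long> simulateBatches(long totalRows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Long> batches = new ArrayList<>();
        for (long remaining = totalRows; remaining > 0; remaining -= batchSize) {
            batches.add(Math.min(remaining, (long) batchSize));
        }
        return batches;
    }

    public static void main(String[] args) {
        Map<String, byte[]> attrs = new HashMap<>();
        attrs.put(CLIENT_BATCH_SIZE_ATTRIB,
                  ByteBuffer.allocate(4).putInt(1000).array());
        int batchSize = resolveBatchSize(attrs, 100);
        // Each element would correspond to one metric update on the client.
        System.out.println(simulateBatches(2500, batchSize));
    }
}
```

With a client batch size of 1000 and 2500 rows deleted on the server, the 
client would record three batches (1000, 1000 and 500 rows) instead of a 
single batch of 2500.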

FYI, [~samarthjain]. Thoughts? Is the batch size only tracked as a global 
metric?

> GLOBAL_MUTATION_BATCH_SIZE should reflect size of chunked batches
> -----------------------------------------------------------------
>
>                 Key: PHOENIX-3788
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3788
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.10.0
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>             Fix For: 4.11.0
>
>         Attachments: PHOENIX-3788.patch
>
>
> As part of PHOENIX-541, we started chunking large MutationStates into 
> multiple smaller batches transparently. However, the relevant metric, 
> GLOBAL_MUTATION_BATCH_SIZE, still is updated with the total batch size, not 
> the size of each chunk. This means you can't see the actual batch sizes which 
> are being submitted to HBase. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
