[ https://issues.apache.org/jira/browse/PHOENIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752811#comment-15752811 ]
Geoffrey Jacoby commented on PHOENIX-541:
-----------------------------------------

Attached second version of patch for comment. A couple of points:

1. I wasn't able to increase DEFAULT_MUTATE_BATCH_SIZE to Integer.MAX_VALUE because quite a few places in Phoenix use that value to initialize ArrayList capacities, so increasing the value led to tons of OOM exceptions. (These cases will need to be changed when DEFAULT_MUTATE_BATCH_SIZE is removed in a future JIRA.)

2. So far I haven't created MAX_MUTATION_SIZE_BYTES_ATTRIB, because I'm not sure it's necessary. Right now the only place I can see that's using the row-based equivalent is MutationState.throwIfTooBig(), and I'm not sure that needs to continue to exist, since it's only called when a MutationState is joined to another one, and we now handle the "overly-large MutationState" case by partitioning our batched Mutations to HBase. For example, PhoenixIndexImportDirectMapper already Math.min()'s it with the max batch row size, and DeleteCompiler only grabs the row-based MAX_MUTATION_SIZE_ATTRIB to pass it into MutationState's constructor so it can be used in throwIfTooBig().

[~jamestaylor] [~samarthjain]

> Make mutable batch size bytes-based instead of row-based
> --------------------------------------------------------
>
>                 Key: PHOENIX-541
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-541
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 3.0-Release
>            Reporter: mujtaba
>            Assignee: Geoffrey Jacoby
>              Labels: newbie
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-541-v2.patch, PHOENIX-541.patch
>
>
> With the current row-count-based mutable batch size, the ideal value
> for batch size is around 800 rather than the current 15k when creating indexes,
> based on memory consumption, CPU and GC (data size: key: ~60 bytes, 14
> integer columns in separate CFs)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
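
For readers following the discussion, here is a minimal sketch of the idea behind partitioning batched mutations by both a row limit and a byte limit. It is not the code in the attached patch: the class name BatchPartitioner, the partition() method, and the sizeOf callback are hypothetical names chosen for illustration, and plain byte arrays stand in for HBase Mutations. Each list is created with the default capacity rather than being pre-sized from the limit, since pre-sizing an ArrayList from a very large limit such as Integer.MAX_VALUE is what triggers the OOMs mentioned in point 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical helper, for illustration only; not part of Phoenix.
public class BatchPartitioner {

    /**
     * Splits items into consecutive batches. A batch is closed before adding
     * an item that would take it past maxRows items or maxBytes bytes.
     * A single item larger than maxBytes is placed in a batch of its own.
     */
    public static <T> List<List<T>> partition(List<T> items,
                                              ToLongFunction<T> sizeOf,
                                              int maxRows,
                                              long maxBytes) {
        if (maxRows <= 0 || maxBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        // Default capacity on purpose: never size the list from maxRows.
        List<T> current = new ArrayList<>();
        long currentBytes = 0;

        for (T item : items) {
            long size = sizeOf.applyAsLong(item);
            boolean wouldOverflow =
                current.size() >= maxRows || currentBytes + size > maxBytes;
            if (!current.isEmpty() && wouldOverflow) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(item);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With a scheme like this, the row limit and the byte limit act as independent caps, so whichever is reached first closes the batch. How the real size of a mutation should be estimated is left open here; the sizeOf callback is an assumption standing in for that choice.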