[ https://issues.apache.org/jira/browse/PHOENIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716203#comment-15716203 ]
James Taylor commented on PHOENIX-541:
--------------------------------------

This sounds like a good approach, [~gjacoby]. Here's some feedback:
- Search for all occurrences of QueryServices.MUTATE_BATCH_SIZE_ATTRIB and make sure we're using the new MUTATE_BATCH_SIZE_BYTES_ATTRIB to track when to write/commit. There are times when we don't go through MutationState (in particular in UngroupedAggregateRegionObserver, which is the code path when auto commit is on).
- Deprecate JDBCUtil.getMutateBatchSize(), QueryServices.MUTATE_BATCH_SIZE_ATTRIB, and any related methods.
- Change the default we have for QueryServicesOptions.DEFAULT_MUTATE_BATCH_SIZE to Integer.MAX_VALUE and, for backward compatibility, track both bytes and row count and send/write the mutations if either limit is met.
- Create a good default value for MUTATE_BATCH_SIZE_BYTES_ATTRIB instead of using Long.MAX_VALUE.
- Make similar changes to QueryServices.MAX_MUTATION_SIZE_ATTRIB, making it byte-based instead of row-count-based. Usage of this config parameter would be isolated to MutationState, I believe. We should be able to come up with an accurate size based on the underlying Mutation and/or Delete info we store in PRowImpl.
- Have a reasonable (smaller) default for the new QueryServices.MAX_MUTATION_SIZE_BYTES_ATTRIB.

> Make mutable batch size bytes-based instead of row-based
> --------------------------------------------------------
>
>                 Key: PHOENIX-541
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-541
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 3.0-Release
>            Reporter: mujtaba
>            Assignee: Geoffrey Jacoby
>              Labels: newbie
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-541.patch
>
>
> With the current configuration of a row-count-based mutable batch size, the ideal value
> for batch size is around 800 rather than the current 15k when creating indexes,
> based on memory consumption, CPU and GC (data size: key: ~60 bytes, 14
> integer columns in separate CFs)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
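
The dual-threshold rule from the feedback above (send the mutations once either the row-count limit or the byte limit is met) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Phoenix code: the class `DualThresholdBatcher` and all of its methods are hypothetical, and a mutation's size is approximated by the length of a byte array rather than by anything derived from PRowImpl.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of Phoenix. Flushes a pending batch when
// EITHER the row-count limit OR the byte-size limit is reached.
final class DualThresholdBatcher {
    private final int maxRows;     // row-count limit (legacy-style setting)
    private final long maxBytes;   // byte-size limit (new-style setting)
    private final List<byte[]> pending = new ArrayList<>();
    private final List<Integer> flushedBatchSizes = new ArrayList<>();
    private long pendingBytes = 0;

    DualThresholdBatcher(int maxRows, long maxBytes) {
        if (maxRows <= 0 || maxBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxRows = maxRows;
        this.maxBytes = maxBytes;
    }

    // Assumption: the byte array length stands in for the mutation's real size.
    void add(byte[] mutation) {
        pending.add(mutation);
        pendingBytes += mutation.length;
        if (pending.size() >= maxRows || pendingBytes >= maxBytes) {
            flush();
        }
    }

    // A real implementation would send the batch to the server here;
    // this sketch only records how many rows each batch contained.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flushedBatchSizes.add(pending.size());
        pending.clear();
        pendingBytes = 0;
    }

    int pendingRows() {
        return pending.size();
    }

    List<Integer> flushedBatchSizes() {
        return new ArrayList<>(flushedBatchSizes);
    }
}
```

In this sketch, constructing the batcher with `Integer.MAX_VALUE` as the row limit leaves the byte limit as the only threshold that fires in practice, which mirrors the proposed change to the row-count default.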