[ https://issues.apache.org/jira/browse/PHOENIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752811#comment-15752811 ]
Geoffrey Jacoby commented on PHOENIX-541:
-----------------------------------------
Attached second version of patch for comment. A couple of points:
1. I wasn't able to increase DEFAULT_MUTATE_BATCH_SIZE to Integer.MAX_VALUE
because quite a few places in Phoenix use that value to initialize ArrayList
capacities, so increasing it led to tons of OutOfMemoryErrors. (These cases
will need to be changed when DEFAULT_MUTATE_BATCH_SIZE is removed in a future
JIRA.)
2. So far I haven't created MAX_MUTATION_SIZE_BYTES_ATTRIB, because I'm not
sure it's necessary. Right now the only place I can see that's using the
row-based equivalent is in MutationState.throwIfTooBig(), and I'm not sure if
that needs to continue to exist, since it's only called when a MutationState is
joined to another one, and we now handle the "overly-large MutationState" case
by partitioning our batched Mutations to HBase.
For example, PhoenixIndexImportDirectMapper already Math.min()'s the row-based
limit with the max batch row size, and DeleteCompiler only grabs the row-based
attribute to pass it into MutationState's constructor so it can be used in
throwIfTooBig().
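To make the partitioning idea in point 2 concrete, here is a minimal sketch, not
the code in the attached patch. The class name ByteBatcher, the method
partitionByBytes, and the use of a plain byte[] as a stand-in for an HBase
Mutation are all hypothetical; the real code would have to measure mutation
size its own way. Note that the lists start at the default capacity instead of
being pre-sized from a batch-size setting, which sidesteps the problem from
point 1.

```java
import java.util.ArrayList;
import java.util.List;

public class ByteBatcher {

    /**
     * Splits mutations into batches whose total size stays at or under
     * maxBatchBytes. Each byte[] is a hypothetical stand-in for one mutation.
     * A single mutation larger than the limit still gets a batch of its own.
     */
    public static List<List<byte[]>> partitionByBytes(List<byte[]> mutations,
                                                      long maxBatchBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        // Default capacity on purpose: never pre-size from a configured limit.
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;

        for (byte[] mutation : mutations) {
            // Close the current batch if adding this mutation would exceed the limit.
            if (!current.isEmpty() && currentBytes + mutation.length > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(mutation);
            currentBytes += mutation.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Ten rows of 60 bytes each, loosely echoing the ~60 byte keys in the issue.
        List<byte[]> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(new byte[60]);
        }
        List<List<byte[]>> batches = partitionByBytes(rows, 200);
        System.out.println(batches.size() + " batches");
    }
}
```

With a scheme like this, an overly large set of mutations is simply sent as more
batches instead of being rejected, which is why a separate bytes-based
throwIfTooBig() check may not be needed.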
[~jamestaylor] [~samarthjain]
> Make mutable batch size bytes-based instead of row-based
> --------------------------------------------------------
>
> Key: PHOENIX-541
> URL: https://issues.apache.org/jira/browse/PHOENIX-541
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 3.0-Release
> Reporter: mujtaba
> Assignee: Geoffrey Jacoby
> Labels: newbie
> Fix For: 4.10.0
>
> Attachments: PHOENIX-541-v2.patch, PHOENIX-541.patch
>
>
> With the current row-count-based mutable batch size, the ideal value
> for batch size is around 800 rather than the current 15k when creating indexes,
> based on memory consumption, CPU and GC (data size: key: ~60 bytes, 14
> integer columns in separate CFs)