[
https://issues.apache.org/jira/browse/PHOENIX-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551793#comment-17551793
]
Kadir Ozdemir commented on PHOENIX-6677:
----------------------------------------
[~larsh] Your suggestion works and covers my use cases too. We can actually
change the batch size using connection properties, for example:
{code:java}
Properties props = new Properties();
props.setProperty(UPSERT_BATCH_SIZE_BYTES_ATTRIB, "2000000");
props.setProperty(UPSERT_BATCH_SIZE_ATTRIB, "2000");
try (Connection conn = DriverManager.getConnection(getUrl(), props)) {
    // issue UPSERTs and commit on conn here
}
{code}
There is no need to do more here.
> Parallelism within a batch of mutations
> ----------------------------------------
>
> Key: PHOENIX-6677
> URL: https://issues.apache.org/jira/browse/PHOENIX-6677
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Kadir OZDEMIR
> Priority: Major
> Fix For: 4.17.0, 5.2.0
>
>
> Currently, the Phoenix client simply passes the batches of row mutations from
> the application to the HBase client without any parallelism or intelligent
> grouping (except grouping mutations for the same row).
> Assume that the application creates batches of 10000 row mutations for a given
> table. The Phoenix client divides these rows, in arrival order, into HBase
> batches of n (e.g., 100) rows based on the configured batch size, i.e., the
> number of rows and bytes. Then, Phoenix calls the HBase batch API, one batch
> at a time (i.e., serially). The HBase client further divides a given batch of
> rows into smaller batches based on their regions. This means that a large
> batch created by the application is divided into many tiny batches and
> executed mostly serially. For salted tables, this will result in even smaller
> batches.
> We can improve the current implementation greatly if we group the rows of the
> batch prepared by the application into sub-batches based on table region
> boundaries and then execute these sub-batches in parallel.
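
For illustration only, the grouping proposed in the description could be sketched roughly as below. This is not Phoenix or HBase code: the class and method names (RegionBatchSketch, groupByRegion, executeInParallel) are hypothetical, row keys and region start keys are plain strings instead of byte arrays, and the "write" is a stand-in that only counts rows. It assumes the region start keys include the empty string for the first region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RegionBatchSketch {

    // Groups row keys by owning region. Assumes regionStartKeys contains ""
    // (the start key of the first region), so every row key has an owner.
    static Map<String, List<String>> groupByRegion(List<String> rowKeys,
                                                   List<String> regionStartKeys) {
        TreeMap<String, List<String>> groups = new TreeMap<>();
        for (String start : regionStartKeys) {
            groups.put(start, new ArrayList<>());
        }
        for (String rowKey : rowKeys) {
            // floorKey returns the greatest start key <= rowKey.
            groups.get(groups.floorKey(rowKey)).add(rowKey);
        }
        return groups;
    }

    // Submits one task per non-empty sub-batch, waits for all of them,
    // and returns the total number of rows "written".
    static int executeInParallel(Map<String, List<String>> groups, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<String> subBatch : groups.values()) {
                if (subBatch.isEmpty()) {
                    continue;
                }
                // Hypothetical stand-in for one batch call against one region.
                Callable<Integer> task = () -> subBatch.size();
                pending.add(pool.submit(task));
            }
            int written = 0;
            for (Future<Integer> result : pending) {
                written += result.get();
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> regionStartKeys = List.of("", "g", "n");
        List<String> rowKeys = List.of("a", "h", "z", "b");
        Map<String, List<String>> groups = groupByRegion(rowKeys, regionStartKeys);
        System.out.println(groups);
        System.out.println(executeInParallel(groups, 3));
    }
}
```

In this sketch each region receives one sub-batch regardless of the arrival order of the rows, so the number of calls is bounded by the number of regions touched rather than by the configured batch size.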