[ https://issues.apache.org/jira/browse/PHOENIX-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir resolved PHOENIX-6677.
------------------------------------
    Resolution: Not A Problem

> Parallelism within a batch of mutations
> ----------------------------------------
>
>                 Key: PHOENIX-6677
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6677
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Kadir OZDEMIR
>            Priority: Major
>             Fix For: 4.17.0, 5.2.0
>
>
> Currently, the Phoenix client simply passes the batches of row mutations from
> the application to the HBase client without any parallelism or intelligent
> grouping (except grouping mutations for the same row).
> Assume that the application creates batches of 10000 row mutations for a given
> table. The Phoenix client divides these rows, in their arrival order, into
> HBase batches of n (e.g., 100) rows according to the configured batch size,
> i.e., the number of rows and bytes. Then, Phoenix calls the HBase batch API,
> one batch at a time (i.e., serially). The HBase client further divides a given
> batch of rows into smaller batches based on their regions. This means that a
> large batch created by the application is divided into many tiny batches that
> are executed mostly serially. For salted tables, this will result in even
> smaller batches.
> We can improve the current implementation greatly if we group the rows of the
> batch prepared by the application into sub-batches based on table region
> boundaries and then execute these sub-batches in parallel.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
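
The grouping proposed in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not Phoenix or HBase code: `RegionBatcher`, `groupByRegion`, `submitInParallel`, and `sendBatch` are hypothetical names, row keys are modeled as `String` rather than `byte[]`, and `sendBatch` merely stands in for a call to the HBase batch API. The first region is assumed to have the empty start key, so every row falls into some region.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: illustrates the idea only, not an existing Phoenix class.
final class RegionBatcher {

    // Groups row keys into sub-batches, one per region. A row belongs to the
    // region with the greatest start key that is <= the row key.
    // Assumption: regionStartKeys contains "" (the first region's start key).
    static Map<String, List<String>> groupByRegion(List<String> regionStartKeys,
                                                   List<String> rowKeys) {
        TreeMap<String, List<String>> groups = new TreeMap<>();
        for (String start : regionStartKeys) {
            groups.put(start, new ArrayList<>());
        }
        for (String row : rowKeys) {
            groups.floorEntry(row).getValue().add(row);
        }
        // Drop regions that received no rows from this application batch.
        groups.values().removeIf(List::isEmpty);
        return groups;
    }

    // Stand-in for sending one sub-batch to HBase; returns the row count.
    static int sendBatch(List<String> batch) {
        return batch.size();
    }

    // Submits every per-region sub-batch to a thread pool and waits for all
    // of them, returning the total number of rows sent.
    static int submitInParallel(Map<String, List<String>> groups, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<String> batch : groups.values()) {
                futures.add(pool.submit(() -> sendBatch(batch)));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = groupByRegion(
                Arrays.asList("", "g", "p"),
                Arrays.asList("apple", "zebra", "grape", "banana", "pear"));
        System.out.println(groups);
        System.out.println(submitInParallel(groups, 3));
    }
}
```

In this sketch the rows are regrouped by region before any call is made, so each sub-batch targets a single region and the sub-batches can be sent concurrently instead of one arrival-order batch at a time.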