[jira] [Commented] (PHOENIX-541) Make mutable batch size bytes-based instead of row-based

Geoffrey Jacoby (JIRA) Fri, 02 Dec 2016 10:34:21 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715893#comment-15715893
 ]


Geoffrey Jacoby commented on PHOENIX-541:
-----------------------------------------

I spoke offline with [~samarthjain], and he suggested a better approach: rather 
than throwing an exception when a MutationState has too many bytes (as it does 
now with too many rows), transparently partition the list of mutations to be 
batched to HBase into sub-lists that each stay under the byte size boundary. 
This is the approach I adopted; the patch is attached. 

By default the max byte size is Long.MAX_VALUE, for backwards compatibility. 
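
To illustrate the partitioning idea described in this comment, here is a minimal 
sketch. It is not the attached patch: the class name ByteSizeBatcher, the method 
partitionByBytes, and the sizeOf callback are hypothetical names chosen for the 
example. It also assumes that a single mutation larger than the limit is sent in 
a batch of its own rather than rejected, which the comment does not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical helper for illustration only; not part of Phoenix.
class ByteSizeBatcher {

    /**
     * Splits items into consecutive sub-lists whose summed size stays within
     * maxBytes. An item that is larger than maxBytes on its own is placed in
     * a single-element batch (an assumption made for this sketch).
     */
    static <T> List<List<T>> partitionByBytes(
            List<T> items, ToLongFunction<T> sizeOf, long maxBytes) {
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>();
        long currentBytes = 0;

        for (T item : items) {
            long size = sizeOf.applyAsLong(item);
            // Written as "size > maxBytes - currentBytes" instead of
            // "currentBytes + size > maxBytes" so that the check cannot
            // overflow when maxBytes is Long.MAX_VALUE.
            if (!current.isEmpty() && size > maxBytes - currentBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(item);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With a limit of Long.MAX_VALUE the whole list comes back as one batch, which 
matches the backwards-compatible default mentioned above. A caller would pass 
whatever size estimate fits its mutation type as sizeOf.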

> Make mutable batch size bytes-based instead of row-based
> --------------------------------------------------------
>
>                 Key: PHOENIX-541
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-541
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 3.0-Release
>            Reporter: mujtaba
>            Assignee: Geoffrey Jacoby
>              Labels: newbie
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-541.patch
>
>
> With the current row-count-based mutable batch size, the ideal batch size 
> when creating indexes is around 800 rather than the current 15k, based on 
> memory consumption, CPU, and GC (data size: key: ~60 bytes, 14 integer 
> columns in separate CFs).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
