[ 
https://issues.apache.org/jira/browse/HBASE-18573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138622#comment-16138622
 ] 

Xiang Li edited comment on HBASE-18573 at 8/23/17 4:45 PM:
-----------------------------------------------------------

Hi [~Jan Hentschel], thanks for the reply! I see your point, but I have a 
different opinion.
You mentioned 
bq. As you see, there's only a single call to add(), which makes it safe to 
assume that the size of the list will be 1
That is not always the case. add() can be called several times on one Append 
object to add several cells. Similarly, addColumn() can be called several 
times on one Append object to add several families, qualifiers and values.
{code}
Append a1 = new Append(row1);
a1.add(c1);
a1.add(c2);
a1.add(c3);
...
table1.append(a1);
{code}

So an initial capacity of 1 is a good fit when only one cell, or one 
family/qualifier/value, is added to the Append object before HTable processes 
it. But when multiple cells are added, an initial capacity of 1 makes the 
backing array grow more often than a larger initial capacity would. 
The best value depends on the use case. Does that make sense to you?
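
To make the trade-off concrete, here is a rough sketch that counts how often a backing array would have to grow for a given initial capacity and number of add() calls. It assumes the roughly 1.5x growth policy used by OpenJDK's ArrayList, which is an implementation detail rather than something the List contract guarantees. The class and method names are made up for illustration and are not part of HBase.

```java
// Hypothetical helper, not HBase code: simulates an ArrayList-style backing array.
class CapacityGrowthSketch {

    // Returns how many times the backing array must be reallocated when
    // `adds` elements are appended to a list created with `initialCapacity`.
    // Assumption: growth is old + old/2, but never less than the needed size
    // (this mirrors OpenJDK's ArrayList and may differ on other runtimes).
    static int countResizes(int initialCapacity, int adds) {
        int capacity = initialCapacity;
        int resizes = 0;
        for (int size = 0; size < adds; size++) {
            if (size == capacity) {
                int grown = capacity + (capacity >> 1);
                capacity = Math.max(grown, size + 1);
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        // One cell: capacity 1 is exactly enough, no reallocation.
        System.out.println(countResizes(1, 1));
        // Three cells, as in the a1.add(c1..c3) example: the array grows twice.
        System.out.println(countResizes(1, 3));
        // Three cells with a larger initial capacity: no reallocation.
        System.out.println(countResizes(4, 3));
    }
}
```

Under that assumption, the three-cell Append above costs two reallocations with capacity 1 and none with capacity 4, while the single-cell case costs none either way.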



> Update Append and Delete to use Mutation#getCellList(family)
> ------------------------------------------------------------
>
>                 Key: HBASE-18573
>                 URL: https://issues.apache.org/jira/browse/HBASE-18573
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Minor
>             Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-3
>
>         Attachments: HBASE-18573.master.000.patch
>
>
> In addxxx() of Put and Increment, Mutation#getCellList(family) is called to 
> get the cell list from familyMap. But the other two subclasses of Mutation, 
> Append and Delete, duplicate the logic of Mutation#getCellList(family) inline, like
> {code}
>     List<Cell> list = familyMap.get(family);
>     if(list == null) {
>       list = new ArrayList<>(1);
>     }
> {code}
> in
> {code}
> public Delete addColumn(byte [] family, byte [] qualifier, long timestamp)
> {code}
> of Delete.
> We could make them call Mutation#getCellList(family) instead, to get better 
> encapsulation.
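
The proposed refactoring could look roughly like the sketch below. It uses simplified stand-ins, not the real HBase classes: MutationSketch and DeleteSketch are hypothetical names, and families and cells are plain Strings instead of byte[] and Cell. The point is only that the null check lives in one helper, so the choice of initial capacity is made in a single place.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Mutation; not the actual HBase class.
abstract class MutationSketch {
    protected final Map<String, List<String>> familyMap = new HashMap<>();

    // The one place that decides how the list for a missing family is created.
    List<String> getCellList(String family) {
        List<String> list = familyMap.get(family);
        if (list == null) {
            list = new ArrayList<>();
        }
        return list;
    }
}

// Hypothetical stand-in for Delete; cells are encoded as "family:qualifier".
class DeleteSketch extends MutationSketch {
    DeleteSketch addColumn(String family, String qualifier) {
        // Calls the shared helper instead of repeating the inline null check.
        List<String> list = getCellList(family);
        list.add(family + ":" + qualifier);
        familyMap.put(family, list);
        return this;
    }
}
```

With this shape, a subclass never constructs the list itself, so changing the initial capacity later would touch only getCellList().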



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
