[ https://issues.apache.org/jira/browse/HBASE-18573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138622#comment-16138622 ]
Xiang Li edited comment on HBASE-18573 at 8/23/17 4:46 PM:
-----------------------------------------------------------

Hi [~Jan Hentschel], thanks for the reply!
I get your idea, but I have a different opinion. You mentioned
bq. As you see, there's only a single call to add(), which makes it safe to assume that the size of the list will be 1
That may not be the case. add() can be called multiple times on a single Append object to add several cells. Similarly, addColumn() can be called multiple times on an Append object to add several families, qualifiers and values.
{code}
Append a1 = new Append(row1);
a1.add(c1);
a1.add(c2);
a1.add(c3);
...
table1.append(a1);
{code}
So an initial capacity of 1 is a good fit when only one cell (or one family/qualifier/value) is added to the Append object before HTable processes it. When multiple cells are added, an initial capacity of 1 forces the backing array to grow more often than a larger initial capacity would. The best value depends on the use scenario.
Does it make sense to you?


was (Author: water):
Hi [~Jan Hentschel], thanks for the reply!
Got your idea, but I might have a different opinion. You mentioned
bq. As you see, there's only a single call to add(), which makes it safe to assume that the size of the list will be 1
It might be not like that. add() could be called several times to an Append object, to add several cells. similarly, addColumn() could also be called several times to an Append object to add several families, qualifiers and values.
{code}
Append a1 = new Append(row1);
a1.add(c1);
a1.add(c2);
a1.add(c3);
...
table1.append(a1);
{code}
So setting initial capacity to 1 is good when only adding one cell or one family/qualifier/value to the Append object and make HTable process it, but when adding multiple cells, initial capacity = 1 will inflate the backing array more frequently than setting the initial capacity to a larger number. It varies in different use scenarios.
Does it make sense to you?

> Update Append and Delete to use Mutation#getCellList(family)
> ------------------------------------------------------------
>
>                 Key: HBASE-18573
>                 URL: https://issues.apache.org/jira/browse/HBASE-18573
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Minor
>             Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-3
>
>         Attachments: HBASE-18573.master.000.patch
>
>
> In addxxx() of Put and Increment, Mutation#getCellList(family) is called to
> get the cell list from familyMap. But in the other two subclasses of Mutation,
> Append and Delete, logic equivalent to Mutation#getCellList(family) is
> written inline, for example
> {code}
> List<Cell> list = familyMap.get(family);
> if (list == null) {
>   list = new ArrayList<>(1);
> }
> {code}
> in
> {code}
> public Delete addColumn(byte [] family, byte [] qualifier, long timestamp)
> {code}
> of Delete.
> We could make them call Mutation#getCellList(family) to get better
> encapsulation.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
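
To make the capacity argument in the comment above concrete, here is a rough sketch that counts how often a list's backing array would be reallocated for a given initial capacity and number of add() calls. It is only a model: the growth rule (old capacity plus half of it, but at least enough for one more element) mirrors what OpenJDK's ArrayList does today, which is an implementation detail and not something the Java API guarantees. The class and method names (CapacityGrowthSketch, countGrowths) are hypothetical and are not part of HBase.

```java
// Hypothetical model of ArrayList growth; not HBase code.
final class CapacityGrowthSketch {

    // Returns how many times the backing array would be reallocated when
    // `adds` elements are appended to a list created with `initialCapacity`.
    // Assumption: growth follows the OpenJDK rule
    // newCapacity = max(old + old/2, size + 1).
    static int countGrowths(int initialCapacity, int adds) {
        int capacity = initialCapacity;
        int growths = 0;
        for (int size = 0; size < adds; size++) {
            if (size == capacity) {
                // Array is full: grow by roughly 50%, or by one if that is not enough.
                int grown = capacity + (capacity >> 1);
                capacity = Math.max(grown, size + 1);
                growths++;
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        // One cell per Append: initial capacity 1 never reallocates.
        System.out.println("capacity 1, 1 add:  " + countGrowths(1, 1));
        // Three cells per Append, as in the a1.add(c1..c3) example.
        System.out.println("capacity 1, 3 adds: " + countGrowths(1, 3));
        System.out.println("capacity 8, 3 adds: " + countGrowths(8, 3));
    }
}
```

Under this model, three add() calls with an initial capacity of 1 cause two reallocations, while an initial capacity of 8 causes none. The price of the larger capacity is unused array slots whenever only a single cell is added, which is the trade-off the comment describes.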