[ 
https://issues.apache.org/jira/browse/HBASE-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249558#comment-15249558
 ] 

Ashish Singhi commented on HBASE-15669:
---------------------------------------

[~tedyu],
{quote} What if there is no size for this file ?
I see LOG.warn() below. Is that enough ?{quote}
That's not possible. Even if we get an exception we set the size to 0, so that 
should be enough.

[~anoop.hbase],
bq. Do we need a check like hasStoreFileSize()? getStoreFileSize(): 0?
By default it's 0.

bq. totalEdits? 
totalCells :)

{quote}
In the loop condition part you can have i < cells.size()? Other places also in a 
similar way. Will it add more burden on the size calc of other normal edits? Like 
we have a qualifier check on each and every cell.
Can there be one WALEdit with a mix of bulk load cells + normal cells? I don't 
think so. So we can early out when the 1st cell in the WALEdit is not a bulk load 
cell? Maybe this optimization can come in some other places also?
{quote}
The same question came up while we were working on the main JIRA (HBASE-13153). 
We are not sure whether a future edit could contain a mix of mutation cells and 
bulk load marker cells. If that happened, an early out on the first cell would 
break replication, so to avoid that we check every cell.
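
To illustrate the trade-off, here is a minimal sketch, not the code from the patch. {{EditSizeSketch}}, {{SketchCell}} and both method names are hypothetical stand-ins, and adding the marker cell's own size to the hfile sizes is an assumption made for the example.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not HBase classes.
final class EditSizeSketch {

    /** A cell that is either a normal mutation or a bulk load marker listing HFiles. */
    static final class SketchCell {
        final boolean bulkLoadMarker;
        final long heapSize;         // in-memory size of the cell itself
        final long[] storeFileSizes; // HFile sizes named by a marker; 0 if a size was unavailable

        private SketchCell(boolean bulkLoadMarker, long heapSize, long[] storeFileSizes) {
            this.bulkLoadMarker = bulkLoadMarker;
            this.heapSize = heapSize;
            this.storeFileSizes = storeFileSizes;
        }

        static SketchCell normal(long heapSize) {
            return new SketchCell(false, heapSize, new long[0]);
        }

        static SketchCell bulkLoad(long heapSize, long... storeFileSizes) {
            return new SketchCell(true, heapSize, storeFileSizes);
        }
    }

    /** Inspects every cell, so an edit mixing both kinds is still sized correctly. */
    static long sizeCheckingEveryCell(List<SketchCell> cells) {
        long total = 0;
        for (SketchCell cell : cells) {
            total += cell.heapSize;
            if (cell.bulkLoadMarker) {
                for (long fileSize : cell.storeFileSizes) {
                    total += fileSize;
                }
            }
        }
        return total;
    }

    /** Early-out variant: trusts the first cell to describe the whole edit. */
    static long sizeWithEarlyOut(List<SketchCell> cells) {
        if (!cells.isEmpty() && cells.get(0).bulkLoadMarker) {
            return sizeCheckingEveryCell(cells);
        }
        long total = 0;
        for (SketchCell cell : cells) {
            total += cell.heapSize; // any later marker's HFiles are never counted
        }
        return total;
    }
}
```

For a mixed edit such as {{normal(100)}} followed by {{bulkLoad(40, 5000, 0)}}, the per-cell version returns 5140 while the early-out version returns 140. That undercount is what we want to rule out.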

Thanks for the reviews.

> HFile size is not considered correctly in a replication request
> ---------------------------------------------------------------
>
>                 Key: HBASE-15669
>                 URL: https://issues.apache.org/jira/browse/HBASE-15669
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 1.3.0
>            Reporter: Ashish Singhi
>            Assignee: Ashish Singhi
>             Fix For: 2.0.0, 1.3.0, 1.4.0
>
>         Attachments: HBASE-15669.patch
>
>
> In a single replication request from the source cluster, a RS can send up to 
> {{replication.source.size.capacity}} worth of data or 
> {{replication.source.nb.capacity}} entries, whichever limit is reached first. 
> The size is calculated from the size of the cells in each entry, which gives 
> the wrong result for bulk loaded data replication. In that case we need to 
> consider the size of the hfiles, not the cells.
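
The effect of the two limits described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the replication source code: {{BatchLimitSketch}}, {{SketchEntry}} and {{nextBatch}} are hypothetical names, and the exact point at which the limits are checked is assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not HBase classes.
final class BatchLimitSketch {

    /** An entry whose replication size includes any HFiles it refers to. */
    static final class SketchEntry {
        final long cellBytes;  // size of the cells in the entry
        final long hfileBytes; // total size of bulk loaded HFiles, 0 for normal edits

        SketchEntry(long cellBytes, long hfileBytes) {
            this.cellBytes = cellBytes;
            this.hfileBytes = hfileBytes;
        }

        long replicationSize() {
            return cellBytes + hfileBytes;
        }
    }

    /**
     * Collects entries until either limit is reached. sizeCapacity plays the role of
     * replication.source.size.capacity and nbCapacity of replication.source.nb.capacity.
     */
    static List<SketchEntry> nextBatch(List<SketchEntry> pending, long sizeCapacity, int nbCapacity) {
        List<SketchEntry> batch = new ArrayList<>();
        long batchSize = 0;
        for (SketchEntry entry : pending) {
            if (batch.size() >= nbCapacity || batchSize >= sizeCapacity) {
                break; // one of the two limits has been reached
            }
            batch.add(entry);
            batchSize += entry.replicationSize();
        }
        return batch;
    }
}
```

With entries of (100, 0), (40, 5000) and (100, 0) and a size capacity of 1000, the batch stops after the second entry. If {{hfileBytes}} were ignored, all three entries would be sent in one request even though the request really carries more than 5000 bytes.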



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
