[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

Swapna (JIRA) Thu, 31 May 2018 19:54:35 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497492#comment-16497492
 ]


Swapna commented on HBASE-20618:
--------------------------------

Thanks [~eclark]

I looked into that option, but we use a server-side filter with hasFilterRow set 
to true: it drops a row's results when certain cells are missing from that row. 
This is incompatible with partial results, because row boundaries are not known 
when a row is returned in pieces.
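
To illustrate the incompatibility, here is a minimal self-contained Java sketch. It is a toy model and does not use the HBase API: RowFilterSketch, keepRow, and the column names are hypothetical. A whole-row filter that drops rows missing a required column only decides correctly when it sees all of a row's cells at once; if the same row arrives as partial chunks, each chunk can fail the check even though the complete row passes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model, not HBase code: a "row" is just the list of column names it holds.
final class RowFilterSketch {
    // Hypothetical whole-row predicate: keep a row only if every required column is present.
    static boolean keepRow(List<String> columns, Set<String> required) {
        return columns.containsAll(required);
    }

    public static void main(String[] args) {
        Set<String> required = new HashSet<>(Arrays.asList("a", "d"));
        List<String> fullRow = Arrays.asList("a", "b", "c", "d");
        // The same row split into two chunks, as a mid-row size limit might do.
        List<String> part1 = fullRow.subList(0, 2);
        List<String> part2 = fullRow.subList(2, 4);

        System.out.println("full row kept:     " + keepRow(fullRow, required));
        System.out.println("first chunk kept:  " + keepRow(part1, required));
        System.out.println("second chunk kept: " + keepRow(part2, required));
    }
}
```

In this model the full row satisfies the predicate while neither chunk does, which is why a filter like ours needs complete rows.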

> Skip large rows instead of throwing an exception to client
> ----------------------------------------------------------
>
>                 Key: HBASE-20618
>                 URL: https://issues.apache.org/jira/browse/HBASE-20618
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Swapna
>            Priority: Minor
>             Fix For: 3.0.0, 2.0.1, 1.4.5
>
>         Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase throws RowTooBigException when the data in one column family 
> of a row exceeds the configured maximum:
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> Some of our bad rows grow very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> The HBase client handles the exception and restarts the scanner past the bad 
> row by capturing the row key where it failed. This could be done by adding the 
> row key to the exception stack trace, which seems brittle. The client would 
> ignore the setting if it is upgraded before the server.
> Option 2:
> Skip big rows on the server. Use a server-level config similar to 
> "hbase.table.max.rowsize", or make it per request by changing the scan request 
> API. If it is allowed per request, based on the scan request config, the client 
> will have to ignore the setting if it is upgraded before the server.
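> {code}
> try {
>   populateResult(results, this.storeHeap, scannerContext, current);
> } catch (RowTooBigException e) {
>   // Log only the row-key bytes; getRowArray() returns the whole backing array.
>   LOG.info("Row exceeded the limit in storeheap. Skipping row with key: "
>       + Bytes.toString(current.getRowArray(), current.getRowOffset(),
>           current.getRowLength()));
>   // Seek past the last cell of the oversized row and discard partial state.
>   this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>   results.clear();
>   scannerContext.clearProgress();
>   continue;
> }
> {code}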
> I prefer Option 2 with a server-level config. Please share your input.
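
The skip behavior proposed in Option 2 can be sketched without HBase as follows. This is a toy model under stated assumptions, not the patch itself: LargeRowSkipSketch, scanSkippingLargeRows, and maxRowSize are hypothetical names, rows are modeled as an in-memory map from row key to cell values, and row size is the summed UTF-8 byte length of those values. It mirrors the shape of the snippet above: detect an oversized row, discard what was collected for it, and continue with the next row.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model, not HBase code: skips rows whose total value size exceeds a limit.
final class LargeRowSkipSketch {
    // Returns the keys of rows whose summed value size is at most maxRowSize bytes.
    static List<String> scanSkippingLargeRows(Map<String, List<String>> rows, long maxRowSize) {
        List<String> returned = new ArrayList<>();
        for (Map.Entry<String, List<String>> row : rows.entrySet()) {
            long size = 0;
            boolean tooBig = false;
            for (String cell : row.getValue()) {
                size += cell.getBytes(StandardCharsets.UTF_8).length;
                if (size > maxRowSize) {
                    tooBig = true;
                    break; // stop reading this row, loosely like reseeking past its last cell
                }
            }
            if (tooBig) {
                System.out.println("Skipping row with key: " + row.getKey());
                continue; // nothing from the oversized row is returned
            }
            returned.add(row.getKey());
        }
        return returned;
    }

    public static void main(String[] args) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        rows.put("row1", Arrays.asList("aa", "bb"));        // 4 bytes
        rows.put("row2", Arrays.asList("aaaaaaaaaa", "b")); // 11 bytes, over the limit
        rows.put("row3", Arrays.asList("c"));               // 1 byte
        System.out.println(scanSkippingLargeRows(rows, 8));
    }
}
```

With a limit of 8 bytes, the model returns row1 and row3 and logs that row2 was skipped, so a job keeps running instead of failing on the one bad row.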



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
