[ 
https://issues.apache.org/jira/browse/HBASE-16425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423675#comment-15423675
 ] 

Jean-Marc Spaggiari commented on HBASE-16425:
---------------------------------------------

I like this thread!

Another thing related to bulk load: if someone bulkloads a cell which is WAY 
too big (say, a 2GB cell), the region server might not be able to load it and 
will fail. It might be nice to detect that and alert the user / log the issue / 
skip the cell...
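
For what it's worth, a minimal sketch of the kind of guard I mean, assuming a 
custom job that prepares the cells/HFiles handed to the bulk loader (the 10MB 
threshold and the class name are illustrative only, not an existing HBase 
setting):

{code:java}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.util.Bytes;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class OversizedCellGuard {
  private static final Logger LOG = LoggerFactory.getLogger(OversizedCellGuard.class);

  // Illustrative threshold only; not an existing HBase configuration.
  private static final long MAX_VALUE_BYTES = 10L * 1024 * 1024;

  private OversizedCellGuard() {
  }

  /** Returns true if the cell is safe to write; logs and rejects it otherwise. */
  public static boolean accept(Cell cell) {
    if (cell.getValueLength() > MAX_VALUE_BYTES) {
      LOG.warn("Skipping oversized cell: row={}, value={} bytes",
          Bytes.toStringBinary(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength()),
          cell.getValueLength());
      return false;
    }
    return true;
  }
}
{code}

A real version would probably make the threshold configurable and bump a job 
counter as well, so the user sees how many cells were dropped.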

> [Operability] Autohandling 'bad data'
> -------------------------------------
>
>                 Key: HBASE-16425
>                 URL: https://issues.apache.org/jira/browse/HBASE-16425
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Operability
>            Reporter: stack
>
> This is a brainstorming issue. It came up chatting w/ a couple of operators 
> talking about 'bad data'; i.e. no matter how you control your clients, 
> someone by mistake or under a misconception will load an out-of-spec Cell or 
> Row. In this particular case, two types of 'bad data' were talked about:
> - The Big Cell: An upload of a 'big cell' came in via bulkload, and it so 
> happened that their frontend requests all arrived at the malignant Cell at 
> the same time, so hundreds of threads were requesting the big cell. The RS 
> OOME'd. Then when the region opened on the new RS, it OOME'd, etc. Could we 
> switch to chunking when a Server sees that it has a large Cell on its hands? 
> I suppose bulk load could defeat any Put chunking we had in place, but it 
> would be good to have this too. Chatting w/ Matteo, we probably want to just 
> move to the streaming Interface that we've talked of in the past at various 
> times; the Get would chunk out the big Cell for assembly on the Client, or 
> just give back the Cell in pieces -- an OutputStream for the Application to 
> suck on. New API and/or old API could use it when Cells are big.
> - The user had a row with 29M Columns in it because the default entity had 
> id=-1.... In this case chunking the Scan (v1.1+) helps, but the operator was 
> having trouble finding the problem row. How could we surface anomalies like 
> this for operators? On flush, add even more metadata to the HFile (Yahoo! 
> Data Sketches as [~jleach] has been suggesting) and then an offline tool to 
> read the metadata and run it through a few simple rules. Data Sketches are 
> mergeable, so we could build up a region-view or store-view....
> This is sketchy and I'm pretty sure it repeats stuff in old issues, but I'm 
> parking this note here while the encounter is still fresh.
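
On the streaming Interface idea above, a purely hypothetical shape for it 
could be something like the following; none of these types exist in HBase 
today, the point is just that the client reads the big value in pieces instead 
of one huge buffer:

{code:java}
import java.io.IOException;
import java.io.InputStream;

// Hypothetical client-side interface: instead of materializing a 2GB value in
// one buffer, the caller gets a stream and the server sends the value in
// chunks. Names and signature are illustrative only.
public interface StreamingTable {
  InputStream getValueStream(byte[] row, byte[] family, byte[] qualifier)
      throws IOException;
}
{code}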
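
And on the Data Sketches idea, a minimal sketch of the kind of 'simple rule' 
an offline tool could run, assuming the DataSketches frequent-items sketch 
(package name depends on the library version) gets fed one update per cell at 
flush time; wiring the serialized sketch into HFile metadata is the part that 
doesn't exist yet:

{code:java}
import org.apache.datasketches.frequencies.ErrorType;
import org.apache.datasketches.frequencies.ItemsSketch;

public class RowAnomalySketch {
  // maxMapSize bounds the sketch's memory and must be a power of two.
  private final ItemsSketch<String> rowKeyCounts = new ItemsSketch<>(1 << 10);

  /** Call once per cell written during flush (row key passed as a String here). */
  public void update(String rowKey) {
    rowKeyCounts.update(rowKey);
  }

  /** Offline rule: report rows whose estimated cell count exceeds a threshold. */
  public void reportRowsOver(long threshold) {
    for (ItemsSketch.Row<String> row
        : rowKeyCounts.getFrequentItems(ErrorType.NO_FALSE_NEGATIVES)) {
      if (row.getEstimate() > threshold) {
        System.out.println("suspect row " + row.getItem()
            + " ~" + row.getEstimate() + " cells");
      }
    }
  }
}
{code}

Since the sketches are mergeable, the same rule could run over a merged 
region-view or store-view.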



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
