[ https://issues.apache.org/jira/browse/HBASE-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182988#comment-15182988 ]
Ted Yu commented on HBASE-15398:
--------------------------------

{code}
if(!filter.isFamilyEssential(family)){
  return true;
{code}
Is the above condition inverted?

> Cells loss or disorder when using family essential filter and partial scanning protocol
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-15398
>                 URL: https://issues.apache.org/jira/browse/HBASE-15398
>             Project: HBase
>          Issue Type: Bug
>          Components: dataloss, Scanners
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>            Priority: Critical
>         Attachments: 15398-test.txt, HBASE-15398.v1.txt
>
>
> In RegionScannerImpl, we have two heaps, storeHeap and joinedHeap. If we have
> a filter and it doesn't apply to all column families, the stores whose
> families needn't be filtered will be in joinedHeap. We scan storeHeap first,
> then joinedHeap, then merge the results, sort them, and return them to the
> client. We need to sort because the order of Cells is rowkey/cf/cq/ts and a
> smaller cf may be in the joinedHeap.
> However, after HBASE-11544 we may transfer partial results when we get
> SIZE_LIMIT_REACHED_MID_ROW or other similar states. We may return a larger cf
> first because it is in storeHeap and then a smaller cf because it is in
> joinedHeap. The server won't hold all cells in a row and the client doesn't
> have sorting logic, so the order of cf in the Result seen by the user is
> wrong.
> A more critical bug: if we get a LIMIT_REACHED_MID_ROW on the last cell of a
> row in storeHeap, we will break scanning in RegionScannerImpl, and in
> populateResult we will change the state to SIZE_LIMIT_REACHED because the
> next peeked cell belongs to the next row. But this is only the last cell of
> one heap and we have two... And SIZE_LIMIT_REACHED means this Result is not
> partial (by ScannerContext.partialResultFormed), so the client will see it,
> merge the results, and return them to the user, losing the data in
> joinedHeap. On the next scan we will read the next row of storeHeap, and
> joinedHeap is forgotten and never read...

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
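
For readers following the thread, the ordering problem in the issue description can be illustrated with a minimal Java sketch. This is not HBase code: the class `HeapOrderSketch`, its two methods, and the modelling of a cell as a plain `"row/cf/cq"` string are all assumptions made for illustration only. It contrasts the whole-row path (both heaps drained, then sorted) with a partial-result path in which the storeHeap cells are sent before the joinedHeap cells and nobody re-sorts them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names exist in HBase.
// A cell is modelled as the string "row/cf/cq", compared lexicographically.
class HeapOrderSketch {

    // Whole-row path: drain storeHeap, then joinedHeap, then sort the row
    // before it is returned, so a smaller cf from joinedHeap ends up first.
    static List<String> wholeRow(List<String> storeHeap, List<String> joinedHeap) {
        List<String> row = new ArrayList<>(storeHeap);
        row.addAll(joinedHeap);
        Collections.sort(row);
        return row;
    }

    // Partial-result path: the storeHeap cells are sent as one partial
    // result, the joinedHeap cells as a later one. The server no longer
    // holds the whole row and the client only concatenates what it gets.
    static List<String> partialResults(List<String> storeHeap, List<String> joinedHeap) {
        List<String> seenByClient = new ArrayList<>();
        seenByClient.addAll(storeHeap);   // first partial result
        seenByClient.addAll(joinedHeap);  // second partial result
        return seenByClient;
    }

    public static void main(String[] args) {
        // cf2 is the family the filter needs (storeHeap); cf1 is not (joinedHeap).
        List<String> storeHeap = Arrays.asList("row1/cf2/q1");
        List<String> joinedHeap = Arrays.asList("row1/cf1/q1");

        System.out.println(wholeRow(storeHeap, joinedHeap));       // [row1/cf1/q1, row1/cf2/q1]
        System.out.println(partialResults(storeHeap, joinedHeap)); // [row1/cf2/q1, row1/cf1/q1]
    }
}
```

In the sketch, the second line of output shows cf2 ahead of cf1, which is the disorder the description reports. The data-loss case corresponds to the second `addAll` never happening because the first partial result is wrongly treated as a complete row.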