[ https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104311#comment-14104311 ]
Jeffrey Zhong commented on HBASE-11591:
---------------------------------------

[~jerryhe]
{quote}If it is bulkloaded file, could we just set the cells regardless of its old seqId in the cell?{quote}
Yes, we could. The condition is there to prevent a Cell from being reset repeatedly.

[~ram_krish]
{quote}This is a fake key that we are creating right?{quote}
Yes, you're right that we don't have to use setCurrentCell in these two cases. The patch uses one consistent way to set the instance variable cur, so that the code is easier to maintain and reason about in the future, or if we later do more in the setCurrentCell call. I guess there is not much difference either way.

> Scanner fails to retrieve KV from bulk loaded file with highest sequence id
> than the cell's mvcc in a non-bulk loaded file
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11591
>                 URL: https://issues.apache.org/jira/browse/HBASE-11591
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 0.99.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.99.0
>
>         Attachments: HBASE-11591.patch, HBASE-11591_1.patch,
> HBASE-11591_2.patch, HBASE-11591_3.patch, TestBulkload.java,
> hbase-11591-03-jeff.patch
>
>
> See discussion in HBASE-11339.
> Consider a case where the same KVs exist in two files, one produced by
> flush/compaction and the other through bulk load.
> Both files have some identical KVs that match even in timestamp.
> Steps:
> Add some rows with a specific timestamp and flush them.
> Bulk load a file with the same data. Ensure that the "assign seqnum" property is
> set.
> The bulk load should use HFileOutputFormat2 (or ensure that we write the
> bulk_time_output key).
> This would ensure that the bulk loaded file has the highest seq num.
> Assume the cell in the flushed/compacted store file is
> row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is
> row1,cf,cq,ts1,value2
> (There are no parallel scans).
> Issue a scan on the table in 0.96. The retrieved value is
> row1,cf,cq,ts1,value2
> But the same scan in 0.98 will retrieve row1,cf,cq,ts1,value1.
> This is a behaviour change, and it comes from this code:
> {code}
>   public int compare(KeyValueScanner left, KeyValueScanner right) {
>     int comparison = compare(left.peek(), right.peek());
>     if (comparison != 0) {
>       return comparison;
>     } else {
>       // Since both the keys are exactly the same, we break the tie in favor
>       // of the key which came latest.
>       long leftSequenceID = left.getSequenceID();
>       long rightSequenceID = right.getSequenceID();
>       if (leftSequenceID > rightSequenceID) {
>         return -1;
>       } else if (leftSequenceID < rightSequenceID) {
>         return 1;
>       } else {
>         return 0;
>       }
>     }
>   }
> {code}
> In the 0.96 case the mvcc of the cell in both files is 0, so the
> comparison falls through to the else branch. There the seq id of the
> bulk loaded file is greater, so it sorts first, ensuring that the scan
> reads from the bulk loaded file.
> In 0.98+, because we retain the mvcc+seqid, the mvcc is not reset to 0
> (it remains a non-zero positive value). Hence compare() sorts the cell in
> the flushed/compacted file first. This means that although we know the
> latest file is the bulk loaded file, we don't scan its data.
> Seems to be a behaviour change. Will check on other corner cases also, but we
> are trying to understand the behaviour of bulk load because we are evaluating
> whether it can be used for the MOB design.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
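
For readers following the quoted description, here is a minimal, self-contained sketch of the ordering it describes. It is not HBase code: `FakeScanner`, `compareCells`, and `firstServed` are hypothetical names, the key is reduced to a string, and the cell comparison is assumed to fall back to the mvcc with the higher value sorting first. Under those assumptions it shows why the sequence-id tie-break only decides the order when both cells carry the same mvcc.

```java
import java.util.Comparator;

class ScannerTieBreakSketch {

    /** Hypothetical stand-in for a store-file scanner positioned on one cell. */
    static final class FakeScanner {
        final String label;     // which file this scanner reads (for printing only)
        final String key;       // simplified "row/cf:cq/ts" key of the current cell
        final long mvcc;        // mvcc carried by the current cell
        final long sequenceId;  // sequence id of the whole file

        FakeScanner(String label, String key, long mvcc, long sequenceId) {
            this.label = label;
            this.key = key;
            this.mvcc = mvcc;
            this.sequenceId = sequenceId;
        }
    }

    /** Assumed cell ordering: by key, then higher mvcc first. */
    static int compareCells(FakeScanner left, FakeScanner right) {
        int byKey = left.key.compareTo(right.key);
        if (byKey != 0) {
            return byKey;
        }
        return Long.compare(right.mvcc, left.mvcc);
    }

    /** Same shape as the quoted compare(): cells first, then file sequence id. */
    static final Comparator<FakeScanner> SCANNER_ORDER = (left, right) -> {
        int comparison = compareCells(left, right);
        if (comparison != 0) {
            return comparison; // the tie-break below is never reached
        }
        // Identical cells: the file with the higher sequence id sorts first.
        return Long.compare(right.sequenceId, left.sequenceId);
    };

    /** Label of the scanner that would be read first. */
    static String firstServed(FakeScanner a, FakeScanner b) {
        return SCANNER_ORDER.compare(a, b) <= 0 ? a.label : b.label;
    }

    public static void main(String[] args) {
        String key = "row1/cf:cq/ts1";

        // 0.96-style: both cells have mvcc 0, so the sequence id decides.
        FakeScanner flushed096 = new FakeScanner("flushed", key, 0L, 10L);
        FakeScanner bulk096 = new FakeScanner("bulk-loaded", key, 0L, 20L);
        System.out.println(firstServed(flushed096, bulk096)); // prints "bulk-loaded"

        // 0.98-style: the flushed cell keeps a non-zero mvcc, so it sorts first.
        FakeScanner flushed098 = new FakeScanner("flushed", key, 5L, 10L);
        FakeScanner bulk098 = new FakeScanner("bulk-loaded", key, 0L, 20L);
        System.out.println(firstServed(flushed098, bulk098)); // prints "flushed"
    }
}
```

The sequence ids 10 and 20 and the mvcc value 5 are arbitrary illustration values; only their relative order matters.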