[
https://issues.apache.org/jira/browse/HADOOP-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559728#action_12559728
]
Hadoop QA commented on HADOOP-2513:
-----------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12373187/2512-v2.patch
against trunk revision r612561.
@author +1. The patch does not contain any @author tags.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new compiler warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results:
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1615/testReport/
Findbugs warnings:
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1615/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results:
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1615/artifact/trunk/build/test/checkstyle-errors.html
Console output:
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1615/console
This message is automatically generated.
> [hbase] HStore#get and HStore#getFull may not return expected values by
> timestamp when there is more than one MapFile
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-2513
> URL: https://issues.apache.org/jira/browse/HADOOP-2513
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/hbase
> Reporter: Bryan Duxbury
> Assignee: stack
> Fix For: 0.16.0
>
> Attachments: 2512-v2.patch, 2513.patch
>
>
> Ok, this one is a little tricky. Let's say that you write a value to a row
> without a timestamp, which means it is stamped "now". Then, the memcache gets
> flushed out to a MapFile. Then, you write another value to the same row, this
> time with a timestamp that is in the past, i.e., before the "now" timestamp of
> the first put.
> Some time later, but before there is a compaction, if you do a get for this
> row, and only ask for a single version, you will logically be expecting the
> latest version of the cell, which you would assume would be the one written
> at "now" time. Instead, you will get the value written into the "past" cell,
> because even though it is tagged as having happened in the past, it actually
> *was written* after the "now" cell, and thus when #get searches for
> satisfying values, it runs into the one most recently written first.
> The result of this problem is inconsistent reads. Note that this
> problem only ever exists when there's an uncompacted HStore, because during
> compaction, these cells will all get sorted into the correct order by
> timestamp. In a way, this actually makes the problem worse, because
> then you could easily get inconsistent results from HBase about the same
> (unchanged) row depending on whether there's been a flush/compaction.
> The only solution I can think of for this problem at the moment is to scan
> all the MapFiles and Memcache for possible results, sort them, and then
> select the desired number of versions off the top. This is unfortunate
> because it means you never get the snazzy short-circuit logic except within a
> single MapFile or Memcache.
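
The behaviour described above, and the proposed scan-sort-select fix, can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code in the attached patch: the Cell class, the method names shortCircuitGet and mergedGet, and the modelling of the Memcache and each MapFile as a plain list are all hypothetical stand-ins, not HBase APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class VersionMergeSketch {
    /** Hypothetical stand-in for one stored version of a cell; not an HBase class. */
    static final class Cell {
        final long timestamp;
        final String value;

        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Short-circuit read: walk the stores from most recently written to oldest
     * and stop as soon as enough versions have been collected.
     */
    static List<String> shortCircuitGet(List<List<Cell>> stores, int versions) {
        List<String> out = new ArrayList<>();
        for (List<Cell> store : stores) {
            for (Cell c : store) {
                if (out.size() == versions) {
                    return out;
                }
                out.add(c.value);
            }
        }
        return out;
    }

    /**
     * Merged read: gather candidates from every store, sort them by timestamp
     * (newest first), then take the requested number of versions off the top.
     */
    static List<String> mergedGet(List<List<Cell>> stores, int versions) {
        List<Cell> all = new ArrayList<>();
        for (List<Cell> store : stores) {
            all.addAll(store);
        }
        all.sort(Comparator.comparingLong((Cell c) -> c.timestamp).reversed());
        List<String> out = new ArrayList<>();
        for (Cell c : all) {
            if (out.size() == versions) {
                break;
            }
            out.add(c.value);
        }
        return out;
    }

    public static void main(String[] args) {
        // Written second, but tagged with a timestamp in the past; still in the memcache.
        List<Cell> memcache = Arrays.asList(new Cell(100L, "past"));
        // Written first at "now", then flushed to a MapFile.
        List<Cell> mapFile = Arrays.asList(new Cell(200L, "now"));
        List<List<Cell>> stores = Arrays.asList(memcache, mapFile);

        System.out.println(shortCircuitGet(stores, 1)); // prints [past]
        System.out.println(mergedGet(stores, 1));       // prints [now]
    }
}
```

The merged read touches every store on every get, which is the cost the description calls out: the early exit is only safe within a single sorted store.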