I think I have isolated this problem to distributed log splitting: with hbase.master.distributed.log.splitting set to false, I can't make the test fail. Still digging.

J-D

On Mon, Nov 21, 2011 at 11:23 AM, Jean-Daniel Cryans <[email protected]> wrote:
> There's something *very* odd with that test. I've been running it a
> few times and the 2 failures I saw seemed to be of different nature
> but I need more debug to understand them (working on that).
>
> J-D
>
> On Sun, Nov 20, 2011 at 8:52 PM, Ted Yu <[email protected]> wrote:
>> TestReplication.queueFailover <https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/151/testReport/junit/org.apache.hadoop.hbase.replication/TestReplication/queueFailover/> was
>> failing as recently as build 151.
>>
>> We might have had some luck for the more recent builds.
>>
>> w.r.t. HBASE-2856, if we have it in 0.92, I think the difference between
>> 0.92 and 0.94 would be blurry.
>>
>> On Sun, Nov 20, 2011 at 8:41 PM, stack (Commented) (JIRA)
>> <[email protected]> wrote:
>>
>>>
>>> [
>>> https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153983#comment-13153983]
>>>
>>> stack commented on HBASE-2856:
>>> ------------------------------
>>>
>>> You fellas want this in 0.92? I want to cut a 0.92 RC. I have 0.92
>>> tests passing up on jenkins a few times in a row now and all criticals and
>>> blockers are in. Should we wait? Or should we cut the RC and get this
>>> into the second RC (I'm sure there'll be one).
>>>
>>> > TestAcidGuarantee broken on trunk
>>> > ----------------------------------
>>> >
>>> > Key: HBASE-2856
>>> > URL: https://issues.apache.org/jira/browse/HBASE-2856
>>> > Project: HBase
>>> > Issue Type: Bug
>>> > Affects Versions: 0.89.20100621
>>> > Reporter: ryan rawson
>>> > Assignee: Amitanand Aiyer
>>> > Priority: Blocker
>>> > Fix For: 0.94.0
>>> >
>>> > Attachments: 2856-0.92.txt, 2856-v2.txt, 2856-v3.txt,
>>> 2856-v4.txt, 2856-v5.txt, 2856-v6.txt, 2856-v7.txt, 2856-v8.txt,
>>> 2856-v9-all-inclusive.txt, acid.txt
>>> >
>>> >
>>> > TestAcidGuarantee has a test whereby it attempts to read a number of
>>> columns from a row, and every so often the first column of N is different,
>>> when it should be the same. This is a bug deep inside the scanner whereby
>>> the first peek() of a row is done at time T then the rest of the read is
>>> done at T+1 after a flush, thus the memstoreTS data is lost, and previously
>>> 'uncommitted' data becomes committed and flushed to disk.
>>> > One possible solution is to introduce the memstoreTS (or similarly
>>> equivalent value) to the HFile thus allowing us to preserve read
>>> consistency past flushes. Another solution involves fixing the scanners so
>>> that peek() is not destructive (and thus might return different things at
>>> different times alas).
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> If you think it was sent incorrectly, please contact your JIRA
>>> administrators:
>>> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>>
>>>
>>>
>>
>
