Hi, I'm seeing behavior on 0.20.2 and 0.20.3 that doesn't seem quite right and would like to know if this is by design, a bug, or something I'm doing wrong.
Background: When I do a put that includes a timestamp like this (conceptually; I know this is not the actual API), it works just fine.

    put "table", "family", "column", "bbb", 12345

Then, if I do another put in the same client code using the same timestamp like this...

    put "table", "family", "column", "aaa", 12345

...and I create a scanner, grab a Result, and iterate over all values using list(), I get this...

    "table", "family", "column", "aaa", 12345

So far, so good. Now, if I truncate the table from the shell and run a new program that does a flush() on the table between the two put's, but does it in the same client program back-to-back, I also get the same results from list().

-----

Problem: Here's where the trouble starts. I truncate the table and run a new program that puts "bbb", flushes the table, and quits. Here's what I get from list():

    "table", "family", "column", "bbb", 12345

Then I run another program that puts "aaa", flushes, and quits. Here's what I get from list():

    "table", "family", "column", "aaa", 12345
    "table", "family", "column", "bbb", 12345

And if I then run a third program that puts "ccc", flushes, and quits, I get this from list():

    "table", "family", "column", "ccc", 12345
    "table", "family", "column", "bbb", 12345
    "table", "family", "column", "aaa", 12345

I'm getting three different values for identical table/family/qualifier/timestamp tuples. Does this seem right? There also doesn't seem to be a defined sort order, probably because the timestamps are identical.

Also, if instead of using list(), I use getMap(), then I always only get a single result. The single result is always the last item in the lists above (i.e., "bbb" then "bbb" then "aaa"). I get identical results from using getNoVersionMap(). I suspect that this same behavior could occur when HBase decides to flush on its own, but I could be wrong.
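A minimal sketch of the getMap() observation above, in plain Java with no HBase classes. It assumes (this is a guess at the mechanism, not confirmed HBase behavior) that a timestamp-keyed map is filled by walking the list() entries in order, so a later entry with the same timestamp replaces an earlier one. Cell, cell() and collapseByTimestamp() are hypothetical names made up for this illustration; they are not part of the HBase API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Toy model only: none of these names come from HBase.
final class SameTimestampCollapse {

    /** One returned value with its coordinates, as shown in the lists above. */
    static final class Cell {
        final String family;
        final String qualifier;
        final long timestamp;
        final String value;

        Cell(String family, String qualifier, long timestamp, String value) {
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Builds a cell with the same family/qualifier/timestamp used in the post. */
    static Cell cell(String value) {
        return new Cell("family", "column", 12345L, value);
    }

    /**
     * Assumption: the map is keyed by timestamp and filled in list() order,
     * so for equal timestamps the last entry in the list overwrites the rest.
     */
    static TreeMap<Long, String> collapseByTimestamp(List<Cell> cells) {
        TreeMap<Long, String> versions = new TreeMap<Long, String>();
        for (Cell c : cells) {
            versions.put(c.timestamp, c.value); // same key: previous value is replaced
        }
        return versions;
    }

    public static void main(String[] args) {
        // The list() output reported after the third program ran.
        List<Cell> afterThirdRun = Arrays.asList(cell("ccc"), cell("bbb"), cell("aaa"));
        System.out.println(afterThirdRun.size() + " values from list()");
        System.out.println("collapsed: " + collapseByTimestamp(afterThirdRun));
    }
}
```

Fed the three lists above in the order shown, this sketch keeps "bbb", "bbb" and "aaa" respectively, which matches the single values reported from getMap(). It says nothing about why list() returns several values in the first place.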
As you can imagine, this can cause problems because clients can't know from the results of calling list() which value is "right" or "newest". They also can't rely on getMap() or getNoVersionMap() because the single result that gets returned is not necessarily "right" or "newest".

I've reproduced everything above in a stand-alone installation and also with a 7 regionserver cluster with the final 0.20.3. I started down this debugging path originally because I ran into this problem on the 7 regionserver cluster with one table of 100+ regions. I was flushing programmatically at the end of some large imports because I'm doing setWriteToWAL(false) for load performance.

Am I doing something wrong? Did I miss an HBase assumption about flushing and/or identical timestamps? Any help would be much appreciated.

Thanks,
Rod

--
Rod Cope
CTO & Founder
OpenLogic, Inc.
