[ 
https://issues.apache.org/jira/browse/HBASE-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228998#comment-13228998
 ] 

Lars Hofhansl edited comment on HBASE-5569 at 3/14/12 6:29 AM:
---------------------------------------------------------------

Well... The whole point of the new API was to have atomic operations.
The Put and the Delete are executed atomically together and become visible at the 
same time.
Note that the code alternates between putting row and deleting row2, and then 
putting row2 and deleting row. The scan then ensures that exactly one column is 
visible.

In this case the scan *itself* is inconsistent. And worse, as Nicolas (N) found 
out, even testRowMutationMultiThreads fails sometimes; that test touches just 
a single row, so this should never happen.

So I am not entirely convinced the test is at fault.

For example, take the scenario described above. If
{code}
Put p = new Put(row2, ts);
p.add(fam1, qual1, value1);
mrm.add(p);
Delete d = new Delete(row);
d.deleteColumns(fam1, qual1, ts);
mrm.add(d);
{code}
happened between 
{code}
region.mutateRowsWithLocks(mrm, rowsToLock);
{code}

and
{code}

Scan s = new Scan(row);
RegionScanner rs = region.getScanner(s);
List<KeyValue> r = new ArrayList<KeyValue>();
while (rs.next(r));
{code}

Both the Put and the Delete would happen atomically, with the same WALEdit and 
the same MVCC writepoint. So the scan will now see the other row (it sees 
either row or row2, because row -RowA- sorts before row2 -RowB-).
This has nothing to do with race conditions between threads, but only occurs 
with flushes in the test. I'll remove the forced flushes and then run the test 
again.
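
To make the visibility argument concrete, here is a toy model in plain Java. It is not HBase code: the class `MvccSketch` and its methods `putAndDelete`, `openScanner` and `scan` are hypothetical names made up for this sketch, and it assumes only that a scanner sees cells whose write number is at or below the read point it captured when it was opened. Under that assumption, a Put and a Delete that share one write number can only become visible together, so a consistent scanner should always find exactly one of the two rows.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of MVCC visibility (illustration only, not HBase code). */
class MvccSketch {
    /** One cell version: its row, whether it is a delete marker, and its write number. */
    static final class Cell {
        final String row;
        final boolean isDelete;
        final long writeNumber;

        Cell(String row, boolean isDelete, long writeNumber) {
            this.row = row;
            this.isDelete = isDelete;
            this.writeNumber = writeNumber;
        }
    }

    private final List<Cell> cells = new ArrayList<>(); // appended in write-number order
    private long nextWriteNumber = 1;
    private long readPoint = 0; // highest write number visible to newly opened scanners

    /** Applies a Put on putRow and a Delete on deleteRow under one shared write number. */
    synchronized void putAndDelete(String putRow, String deleteRow) {
        long w = nextWriteNumber++;
        cells.add(new Cell(putRow, false, w));
        cells.add(new Cell(deleteRow, true, w));
        readPoint = w; // both cells become visible in a single step
    }

    /** A scanner captures the current read point when it is opened. */
    synchronized long openScanner() {
        return readPoint;
    }

    /** Returns the rows whose newest cell at or below the given read point is a Put. */
    synchronized List<String> scan(long scannerReadPoint, String... rows) {
        List<String> visible = new ArrayList<>();
        for (String row : rows) {
            Cell newest = null;
            for (Cell c : cells) {
                if (c.row.equals(row) && c.writeNumber <= scannerReadPoint) {
                    newest = c; // later list entries carry higher write numbers
                }
            }
            if (newest != null && !newest.isDelete) {
                visible.add(row);
            }
        }
        return visible;
    }
}
```

In this model, a scanner opened between two alternating mutations keeps seeing the row from the first one, and a scanner opened afterwards sees only the other row. Neither ever sees zero or two rows, which is the invariant the test asserts.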

                
> TestAtomicOperation.testMultiRowMutationMultiThreads fails occasionally
> -----------------------------------------------------------------------
>
>                 Key: HBASE-5569
>                 URL: https://issues.apache.org/jira/browse/HBASE-5569
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Priority: Minor
>         Attachments: TestAtomicOperation-output.trunk_120313.rar
>
>
> What I pieced together so far is that it is the *scanning* side that has 
> problems sometimes.
> Every time I see an assertion failure in the log I see this just before it:
> {quote}
> 2012-03-12 21:48:49,523 DEBUG [Thread-211] regionserver.StoreScanner(499): 
> Storescanner.peek() is changed where before = 
> rowB/colfamily11:qual1/75366/Put/vlen=6,and after = 
> rowB/colfamily11:qual1/75203/DeleteColumn/vlen=0
> {quote}
> The order of the Put and the Delete is sometimes reversed.
> The test threads should always see exactly one KV. If the "before" was the 
> Put, the threads see 0 KVs; if the "before" was the Delete, the threads see 2 
> KVs.
> This debug message comes from StoreScanner.checkReseek. It seems we still 
> have some consistency issue with scanning sometimes :(
