[ https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168785#comment-15168785 ]

Anoop Sam John commented on HBASE-15340:
----------------------------------------

Yep. This is a known issue then. The solution of having a client-aware readPnt would solve even that (?). That work has to consider compatibility as well: old client -> new RS, and the reverse.

> Partial row result of scan may return data violates the row-level transaction
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-15340
>                 URL: https://issues.apache.org/jira/browse/HBASE-15340
>             Project: HBase
>          Issue Type: Bug
>          Components: Scanners, Transactions/MVCC
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>
> There are cases where the region server will return a partial row result, such as
> when the client sets a batch for the scan or a configured size limit is reached. In these
> situations, the client may return data that violates the row-level
> transaction to the application. The following steps show the problem:
> {code}
> // assume there is a test table 'test_table' with one family 'F' and one region 'region'.
> // meanwhile there are two region servers 'rsA' and 'rsB'.
> 1. Let 'region' first be located on 'rsA' and put one row with two columns 'c1' and 'c2' as:
>     > put 'test_table', 'row', 'F:c1', 'value1', 'F:c2', 'value1'
> 2. Start a client to scan 'test_table', with scan.setBatch(1) and scan.setCaching(1).
>    The client will get one column as: {column='F:c1' and value='value1'} in the first
>    rpc call after the scanner is created, and the result will be returned to the application.
> 3. Before the client issues the next request, the 'region' is moved to 'rsB'
>    and accepts another mutation for the two columns 'c1' and 'c2' as:
>     > put 'test_table', 'row', 'F:c1', 'value2', 'F:c2', 'value2'
> 4. Then, the client will receive a RegionMovedException when issuing the next
>    request and will retry to open a scanner on 'rsB'. The newly opened scanner
>    will have a higher mvcc than the old data, so it can read out the column as:
>    {column='F:c2' with value='value2'} and return the result to the application.
>    Therefore, the application will get data as:
>      'row' column='F:c1' value='value1'
>      'row' column='F:c2', value='value2'
>    The returned data is combined from two different mutations and violates
>    the row-level transaction.
> {code}
> The reason is that the scanner newly opened after the region moved will get a
> different mvcc. I am not sure whether this result is by design for scan if
> partial row results are allowed. However, such a row result combined from
> different transactions may leave the application in an unexpected state.
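
For anyone who wants to see the interleaving from the quoted steps without a cluster, here is a minimal sketch in plain Java. It is a toy model and not HBase code: `ToyRegion`, `Version`, `scanRowInTwoBatches` and the `pinReadPoint` flag are all hypothetical names made up for illustration, and the model ignores how mvcc numbers are actually assigned after a region move. The `pinReadPoint = true` branch is only my reading of what a client-aware read point might achieve, not a description of any actual patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (not HBase code): cells remember the write number of the mutation that produced them. */
public class PartialRowMvccSketch {

    /** One version of a cell: its value plus the hypothetical MVCC write number. */
    static final class Version {
        final long writeNumber;
        final String value;

        Version(long writeNumber, String value) {
            this.writeNumber = writeNumber;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for a region that holds a single row. */
    static final class ToyRegion {
        private final Map<String, List<Version>> columns = new TreeMap<>();
        private long latestWriteNumber = 0;

        /** Applies every cell of one mutation under a single write number (row-level atomicity). */
        void put(Map<String, String> cells) {
            long writeNumber = ++latestWriteNumber;
            for (Map.Entry<String, String> cell : cells.entrySet()) {
                columns.computeIfAbsent(cell.getKey(), k -> new ArrayList<>())
                       .add(new Version(writeNumber, cell.getValue()));
            }
        }

        /** A newly opened scanner reads at the latest completed write. */
        long currentReadPoint() {
            return latestWriteNumber;
        }

        /** Returns the newest value of the column visible at readPoint, or null if none. */
        String read(String column, long readPoint) {
            String visible = null;
            // Versions are appended in increasing write-number order, so the last match wins.
            for (Version v : columns.getOrDefault(column, List.of())) {
                if (v.writeNumber <= readPoint) {
                    visible = v.value;
                }
            }
            return visible;
        }
    }

    /**
     * Replays steps 1-4 of the description: one column per batch, with a second
     * mutation landing between the two batches. If pinReadPoint is true, the
     * reopened scanner reuses the original read point instead of taking a new one.
     */
    public static List<String> scanRowInTwoBatches(boolean pinReadPoint) {
        ToyRegion region = new ToyRegion();
        region.put(Map.of("F:c1", "value1", "F:c2", "value1"));   // step 1

        long firstReadPoint = region.currentReadPoint();           // scanner opened
        List<String> seen = new ArrayList<>();
        seen.add(region.read("F:c1", firstReadPoint));             // step 2: first batch

        region.put(Map.of("F:c1", "value2", "F:c2", "value2"));   // step 3: mutation after the move

        // Step 4: the scanner is reopened; by default it picks up a newer read point.
        long secondReadPoint = pinReadPoint ? firstReadPoint : region.currentReadPoint();
        seen.add(region.read("F:c2", secondReadPoint));            // second batch
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(scanRowInTwoBatches(false)); // prints [value1, value2]
        System.out.println(scanRowInTwoBatches(true));  // prints [value1, value1]
    }
}
```

With a fresh read point on reopen, the application sees c1 from the first mutation and c2 from the second, which is the mixed row described above. Reusing the original read point keeps both columns on the same mutation in this toy model.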