[jira] [Updated] (HBASE-10679) Both clients get wrong scan results if the first scanner expires and the second scanner is created with the same scannerId on the same region

2014-03-12 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10679:
--

       Resolution: Fixed
    Fix Version/s: 0.99.0
                   0.98.1
                   0.96.2
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Committed to 0.96-trunk (Thought you'd like this bug fix 
[~andrew.purt...@gmail.com])

> Both clients get wrong scan results if the first scanner expires and the 
> second scanner is created with the same scannerId on the same region
> -
>
> Key: HBASE-10679
> URL: https://issues.apache.org/jira/browse/HBASE-10679
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Feng Honghua
> Assignee: Feng Honghua
> Priority: Critical
> Fix For: 0.96.2, 0.98.1, 0.99.0
>
> Attachments: HBASE-10679-trunk_v1.patch, HBASE-10679-trunk_v2.patch, 
> HBASE-10679-trunk_v2.patch, HBASE-10679-trunk_v2.patch
>
>
> The scenario is as follows (both Client A and Client B scan Region R):
> # A opens a scanner SA on R with scannerId N and successfully gets its 
> first row "a".
> # SA's lease expires and SA is removed from scanners.
> # B opens a scanner SB on R and is also assigned scannerId N. It successfully 
> gets its first row "m".
> # A issues its second scan request with scannerId N. The regionserver finds 
> that N is a valid scannerId and that the region matches too (the region has 
> been online on this regionserver the whole time and both scanners are 
> against it), so it executes the scan request on SB and returns "n" to A -- 
> wrong! (A gets data from another client's scanner; it expects a row such as 
> "b" that follows "a".)
> # B issues its second scan request with scannerId N. The regionserver also 
> considers it valid, executes the scan on SB, and returns "o" to B -- wrong! 
> (It should return "n", but "n" has just been consumed by A.)
> The consequence is that both clients get wrong scan results:
> # A gets data from a scanner created by another client, because its own 
> scanner has expired and been removed.
> # B misses data that it should have received but that was wrongly consumed 
> by A.
> The root cause is that a scannerId generated by the regionserver is not 
> guaranteed to be unique over the regionserver's whole lifecycle. *The only 
> guarantee is that the scannerIds of scanners that are currently valid (not 
> expired) are unique*, so the same scannerId can appear in scanners again 
> after an earlier scanner with that scannerId has expired and been removed 
> from scanners. If the second scanner is against the same region, the bug 
> arises.
> In theory this scenario should be very rare (two consecutive scans on the 
> same region from two different clients get the same scannerId, and the first 
> expires before the second is created), but it can happen, and when it does, 
> the consequence is severe (all clients involved get wrong data) and 
> extremely hard to diagnose or debug.
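
The sequence in the description can be replayed with a small, self-contained sketch. This is not HBase code and not necessarily what the committed patch does: Scanner, Registry, and replay are hypothetical names, and the id source is passed in so that a colliding id (N = 7 here, chosen arbitrarily) can be forced. With an id source that repeats, A's second request lands on SB; with a monotonically increasing counter, used here only as one possible way to avoid reuse, A's stale id is rejected and B sees "n" as expected.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class ScannerIdReuseSketch {

    /** Hypothetical stand-in for a region scanner: walks a fixed list of rows. */
    static final class Scanner {
        private final List<String> rows;
        private int pos = 0;
        Scanner(List<String> rows) { this.rows = rows; }
        String next() { return pos < rows.size() ? rows.get(pos++) : null; }
    }

    /** Hypothetical registry keyed by scannerId, like "scanners" in the report. */
    static final class Registry {
        private final Map<Long, Scanner> scanners = new HashMap<>();
        private final LongSupplier idSource;
        Registry(LongSupplier idSource) { this.idSource = idSource; }

        long open(List<String> rows) {
            long id = idSource.getAsLong();
            scanners.put(id, new Scanner(rows));
            return id;
        }

        /** Models lease expiry: the scanner is dropped from the map. */
        void expire(long id) { scanners.remove(id); }

        String next(long id) {
            Scanner s = scanners.get(id);
            if (s == null) {
                throw new IllegalStateException("unknown scanner " + id);
            }
            return s.next();
        }
    }

    /** Replays steps 1-5 and returns {A's second result, B's second result}. */
    static String[] replay(LongSupplier idSource) {
        Registry rs = new Registry(idSource);
        long a = rs.open(Arrays.asList("a", "b", "c"));
        rs.next(a);                       // step 1: A reads "a"
        rs.expire(a);                     // step 2: SA's lease expires
        long b = rs.open(Arrays.asList("m", "n", "o"));
        rs.next(b);                       // step 3: B reads "m"
        String aSecond;
        try {
            aSecond = rs.next(a);         // step 4: A reuses its old id
        } catch (IllegalStateException e) {
            aSecond = "ERROR";            // stale id rejected
        }
        String bSecond = rs.next(b);      // step 5: B's second request
        return new String[] { aSecond, bSecond };
    }

    public static void main(String[] args) {
        // Id source that hands out the same id twice, as in the report.
        String[] reused = replay(() -> 7L);
        // Id source that never repeats within the process lifetime.
        AtomicLong counter = new AtomicLong();
        String[] unique = replay(counter::incrementAndGet);
        System.out.println("reused ids: A=" + reused[0] + " B=" + reused[1]);
        System.out.println("unique ids: A=" + unique[0] + " B=" + unique[1]);
    }
}
```

In the first replay both clients silently receive wrong rows, which matches the two consequences listed above; in the second, the failure is explicit and only affects the client whose lease actually expired.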



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HBASE-10679) Both clients get wrong scan results if the first scanner expires and the second scanner is created with the same scannerId on the same region

2014-03-09 Thread Feng Honghua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua updated HBASE-10679:
-

Attachment: HBASE-10679-trunk_v2.patch

> Both clients get wrong scan results if the first scanner expires and the 
> second scanner is created with the same scannerId on the same region
> -
>
> Key: HBASE-10679
> URL: https://issues.apache.org/jira/browse/HBASE-10679
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Feng Honghua
> Assignee: Feng Honghua
> Priority: Critical
> Attachments: HBASE-10679-trunk_v1.patch, HBASE-10679-trunk_v2.patch, 
> HBASE-10679-trunk_v2.patch, HBASE-10679-trunk_v2.patch
>
>





[jira] [Updated] (HBASE-10679) Both clients get wrong scan results if the first scanner expires and the second scanner is created with the same scannerId on the same region

2014-03-07 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10679:
--

Attachment: HBASE-10679-trunk_v2.patch

Retry

> Both clients get wrong scan results if the first scanner expires and the 
> second scanner is created with the same scannerId on the same region
> -
>
> Key: HBASE-10679
> URL: https://issues.apache.org/jira/browse/HBASE-10679
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Feng Honghua
> Assignee: Feng Honghua
> Priority: Critical
> Attachments: HBASE-10679-trunk_v1.patch, HBASE-10679-trunk_v2.patch, 
> HBASE-10679-trunk_v2.patch
>
>





[jira] [Updated] (HBASE-10679) Both clients get wrong scan results if the first scanner expires and the second scanner is created with the same scannerId on the same region

2014-03-06 Thread Feng Honghua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua updated HBASE-10679:
-

Summary: Both clients get wrong scan results if the first scanner expires 
and the second scanner is created with the same scannerId on the same region  
(was: Both clients operating on a same region will get wrong scan results if 
the first scanner expires and the second scanner is created with the same 
scannerId)

> Both clients get wrong scan results if the first scanner expires and the 
> second scanner is created with the same scannerId on the same region
> -
>
> Key: HBASE-10679
> URL: https://issues.apache.org/jira/browse/HBASE-10679
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Feng Honghua
> Assignee: Feng Honghua
> Priority: Critical
> Attachments: HBASE-10679-trunk_v1.patch, HBASE-10679-trunk_v2.patch
>
>


