[ 
https://issues.apache.org/jira/browse/HBASE-26465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-26465:
-----------------------------
    Description: 
HBASE-26144 moved {{MemStore.clearSnapshot}} out of write lock of 
{{HStore#lock}}, but because {{MemStore.clearSnapshot}}  closed 
{{DefaultMemStore#snapshot}} which may be used by 
{{DefaultMemStore#getScanners}}, when {{DefaultMemStore#getScanners}} and 
{{MemStore}} flushing execute conurrently, {{MemStoreLAB}} used by 
{{DefaultMemStore#snapshot}}  may be released early when its {{SegmentScanner}} 
is scanning.

Considering follow thread sequences: 
1.The flush thread starts {{DefaultMemStore}} flushing after some cells have be 
added to {{DefaultMemStore}}.

2.The flush thread stopping before {{DefaultMemStore#clearSnapshot}} in 
following line 1239 in
   {{HStore#updateStorefiles}} after completed flushing memStore to hfile.
{code:java}
1221    private boolean updateStorefiles(List<HStoreFile> sfs, long snapshotId) 
throws IOException {
1223    this.lock.writeLock().lock();
1224   try {
1225      this.storeEngine.getStoreFileManager().insertNewFiles(sfs);
1226    } finally {
1227      // We need the lock, as long as we are updating the storeFiles
1228      // or changing the memstore. Let us release it before calling
1229      // notifyChangeReadersObservers. See HBASE-4485 for a possible
1230      // deadlock scenario that could have happened if continue to hold
1231      // the lock.
1232     this.lock.writeLock().unlock();
1233   }
1234    // We do not need to call clearSnapshot method inside the write lock.
1235    // The clearSnapshot itself is thread safe, which can be called at the 
same time with other
1236    // memstore operations expect snapshot and clearSnapshot. And for these 
two methods, in HRegion
1237    // we can guarantee that there is only one onging flush, so they will 
be no race.
1238    if (snapshotId > 0) {
1239      this.memstore.clearSnapshot(snapshotId);
1240    }
{code}

3.The scan thread starts and stopping after 
{{DefaultMemStore.snapshot.getAllSegments}} in
   {{DefaultMemStore#getScanners}},here the scan thread gets the 
{{DefaultMemStore#snapshot}} which is created by the flush thread.

4.The flush thread continues {{DefaultMemStore#clearSnapshot}} and close 
{{DefaultMemStore#snapshot}},because the reference count of the corresponding  
{{MemStoreLABImpl}} is 0, the Chunks in corresponding {{MemStoreLABImpl}} are 
recycled.

5.The scan thread continues {{DefaultMemStore#getScanners}}, and create a 
{{SegmentScanner}} for this {{DefaultMemStore#snapshot}}, and increase the 
reference count of the corresponding {{MemStoreLABImpl}}, but {{Chunks}} in 
corresponding {{MemStoreLABImpl}} are recycled by step 4, and these {{Chunks}} 
may be overwritten by other write threads, which may cause serious problem.


I simulate above case in my PR, and my Fix in the PR is as following:
1.Moving  {{MemStore.clearSnapshot}} back under write lock of {{HStore#lock}}, 
because {{DefaultMemStore#getScanners}} is protected under the read lock of 
{{HStore#lock}}, so {{DefaultMemStore#getScanners}} and
   {{DefaultMemStore#clearSnapshot}} could not execute concurrently.
2. Introducing {{RefCnt}} into {{MemStoreLABImpl}}  to replace old reference 
count, so checking and increasing or decreasing is done in atomicity and if 
there is illegal state in  reference count, it would throw exception rather 
than corrupt data.

  was:
HBASE-26144 moved {{MemStore.clearSnapshot}} out of write lock of 
{{HStore#lock}}, but because {{MemStore.clearSnapshot}}  closed 
{{DefaultMemStore#snapshot}} which may be used by 
{{DefaultMemStore#getScanners}}, when {{DefaultMemStore#getScanners}} and 
{{MemStore}} flushing execute conurrently, {{MemStoreLAB}} used by 
{{DefaultMemStore#snapshot}}  may be released early when its {{SegmentScanner}} 
is scanning.

Considering follow thread sequences: 
1.The flush thread starts {{DefaultMemStore}} flushing after some cells have be 
added to {{DefaultMemStore}}.

2.The flush thread stopping before {{DefaultMemStore#clearSnapshot}} in
   {{HStore#updateStorefiles}} after completed flushing memStore to hfile.

3.The scan thread starts and stopping after 
{{DefaultMemStore.snapshot.getAllSegments}} in
   {{DefaultMemStore#getScanners}},here the scan thread gets the 
{{DefaultMemStore#snapshot}} which is created by the flush thread.

4.The flush thread continues {{DefaultMemStore#clearSnapshot}} and close 
{{DefaultMemStore#snapshot}},because the reference count of the corresponding  
{{MemStoreLABImpl}} is 0, the Chunks in corresponding {{MemStoreLABImpl}} are 
recycled.

5.The scan thread continues {{DefaultMemStore#getScanners}}, and create a 
{{SegmentScanner}} for this {{DefaultMemStore#snapshot}}, and increase the 
reference count of the corresponding {{MemStoreLABImpl}}, but {{Chunks}} in 
corresponding {{MemStoreLABImpl}} are recycled by step 4, and these {{Chunks}} 
may be overwritten by other write threads, which may cause serious problem.


I simulate above case in my PR, and my Fix in the PR is as following:
1.Moving  {{MemStore.clearSnapshot}} back under write lock of {{HStore#lock}}, 
because {{DefaultMemStore#getScanners}} is protected under the read lock of 
{{HStore#lock}}, so {{DefaultMemStore#getScanners}} and
   {{DefaultMemStore#clearSnapshot}} could not execute concurrently.
2. Introducing {{RefCnt}} into {{MemStoreLABImpl}}  to replace old reference 
count, so checking and increasing or decreasing is done in atomicity and if 
there is illegal state in  reference count, it would throw exception rather 
than corrupt data.


> MemStoreLAB may be released early when its SegmentScanner is scanning
> ---------------------------------------------------------------------
>
>                 Key: HBASE-26465
>                 URL: https://issues.apache.org/jira/browse/HBASE-26465
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha-1, 2.4.8
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Critical
>
> HBASE-26144 moved {{MemStore.clearSnapshot}} out of write lock of 
> {{HStore#lock}}, but because {{MemStore.clearSnapshot}}  closed 
> {{DefaultMemStore#snapshot}} which may be used by 
> {{DefaultMemStore#getScanners}}, when {{DefaultMemStore#getScanners}} and 
> {{MemStore}} flushing execute conurrently, {{MemStoreLAB}} used by 
> {{DefaultMemStore#snapshot}}  may be released early when its 
> {{SegmentScanner}} is scanning.
> Considering follow thread sequences: 
> 1.The flush thread starts {{DefaultMemStore}} flushing after some cells have 
> be added to {{DefaultMemStore}}.
> 2.The flush thread stopping before {{DefaultMemStore#clearSnapshot}} in 
> following line 1239 in
>    {{HStore#updateStorefiles}} after completed flushing memStore to hfile.
> {code:java}
> 1221    private boolean updateStorefiles(List<HStoreFile> sfs, long 
> snapshotId) throws IOException {
> 1223    this.lock.writeLock().lock();
> 1224   try {
> 1225      this.storeEngine.getStoreFileManager().insertNewFiles(sfs);
> 1226    } finally {
> 1227      // We need the lock, as long as we are updating the storeFiles
> 1228      // or changing the memstore. Let us release it before calling
> 1229      // notifyChangeReadersObservers. See HBASE-4485 for a possible
> 1230      // deadlock scenario that could have happened if continue to hold
> 1231      // the lock.
> 1232     this.lock.writeLock().unlock();
> 1233   }
> 1234    // We do not need to call clearSnapshot method inside the write lock.
> 1235    // The clearSnapshot itself is thread safe, which can be called at 
> the same time with other
> 1236    // memstore operations expect snapshot and clearSnapshot. And for 
> these two methods, in HRegion
> 1237    // we can guarantee that there is only one onging flush, so they will 
> be no race.
> 1238    if (snapshotId > 0) {
> 1239      this.memstore.clearSnapshot(snapshotId);
> 1240    }
> {code}
> 3.The scan thread starts and stopping after 
> {{DefaultMemStore.snapshot.getAllSegments}} in
>    {{DefaultMemStore#getScanners}},here the scan thread gets the 
> {{DefaultMemStore#snapshot}} which is created by the flush thread.
> 4.The flush thread continues {{DefaultMemStore#clearSnapshot}} and close 
> {{DefaultMemStore#snapshot}},because the reference count of the corresponding 
>  {{MemStoreLABImpl}} is 0, the Chunks in corresponding {{MemStoreLABImpl}} 
> are recycled.
> 5.The scan thread continues {{DefaultMemStore#getScanners}}, and create a 
> {{SegmentScanner}} for this {{DefaultMemStore#snapshot}}, and increase the 
> reference count of the corresponding {{MemStoreLABImpl}}, but {{Chunks}} in 
> corresponding {{MemStoreLABImpl}} are recycled by step 4, and these 
> {{Chunks}} may be overwritten by other write threads, which may cause serious 
> problem.
> I simulate above case in my PR, and my Fix in the PR is as following:
> 1.Moving  {{MemStore.clearSnapshot}} back under write lock of 
> {{HStore#lock}}, because {{DefaultMemStore#getScanners}} is protected under 
> the read lock of {{HStore#lock}}, so {{DefaultMemStore#getScanners}} and
>    {{DefaultMemStore#clearSnapshot}} could not execute concurrently.
> 2. Introducing {{RefCnt}} into {{MemStoreLABImpl}}  to replace old reference 
> count, so checking and increasing or decreasing is done in atomicity and if 
> there is illegal state in  reference count, it would throw exception rather 
> than corrupt data.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to