[jira] [Commented] (HBASE-28522) UNASSIGN proc indefinitely stuck on dead rs

2024-07-01 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17861140#comment-17861140
 ] 

chenglei commented on HBASE-28522:
--

Can we acquire the table's shared lock before checking the DISABLED or DISABLING 
state of the table in {{ServerCrashProcedure.assignRegions}}, to avoid the race?
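One way to picture the proposed fix, as a minimal sketch (the class below is an invented model, not the actual HBase code): the crash-handling path would take the table's shared lock around the DISABLED/DISABLING check, so a concurrent {{DisableTableProcedure}}, which needs the exclusive lock to change the state, cannot flip it in the middle of the check.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model, not actual HBase code: a table-state holder where the
// crash-handling path takes the table's shared (read) lock before checking
// DISABLING/DISABLED, so a concurrent disable (which takes the exclusive
// write lock) cannot change the state mid-check.
class TableStateModel {
    enum State { ENABLED, DISABLING, DISABLED }

    private final ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();
    private State state = State.ENABLED;

    // Stand-in for DisableTableProcedure: exclusive lock while changing state.
    public void disable() {
        tableLock.writeLock().lock();
        try {
            state = State.DISABLING;
            state = State.DISABLED;
        } finally {
            tableLock.writeLock().unlock();
        }
    }

    // Stand-in for ServerCrashProcedure.assignRegions: shared lock around the
    // state check, so the decision to assign is made against a stable state.
    public boolean shouldAssignRegions() {
        tableLock.readLock().lock();
        try {
            return state != State.DISABLING && state != State.DISABLED;
        } finally {
            tableLock.readLock().unlock();
        }
    }
}
```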

> UNASSIGN proc indefinitely stuck on dead rs
> ---
>
> Key: HBASE-28522
> URL: https://issues.apache.org/jira/browse/HBASE-28522
> Project: HBase
>  Issue Type: Improvement
>  Components: proc-v2, Region Assignment
>Reporter: Prathyusha
>Assignee: Prathyusha
>Priority: Critical
>  Labels: pull-request-available
> Attachments: timeline.jpg
>
>
> One scenario we noticed in production -
> we had DisableTableProc and SCP triggered at almost the same time:
> 2024-03-16 17:59:23,014 INFO [PEWorker-11] procedure.DisableTableProcedure - 
> Set  to state=DISABLING
> 2024-03-16 17:59:15,243 INFO [PEWorker-26] procedure.ServerCrashProcedure - 
> Start pid=21592440, state=RUNNABLE:SERVER_CRASH_START, locked=true; 
> ServerCrashProcedure 
> , splitWal=true, meta=false
> DisableTableProc creates UNASSIGN procs, and at this time the ASSIGNs of the 
> SCP are not completed:
> {{2024-03-16 17:59:23,003 DEBUG [PEWorker-40] procedure2.ProcedureExecutor - 
> LOCK_EVENT_WAIT pid=21594220, ppid=21592440, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=, region=, ASSIGN}}
> The UNASSIGN created by DisableTableProc is stuck on the dead regionserver, 
> and we had to manually bypass the UNASSIGN of DisableTableProc and then do 
> the ASSIGN.
> If we can break the loop so that the UNASSIGN procedure does not retry when 
> there is an SCP for that server, we would not need manual intervention; at 
> the least, the DisableTableProc could go to a rollback state.
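The retry-breaking idea above could look roughly like this (a hedged sketch with invented names, not the real HBase procedure API): before rescheduling the UNASSIGN, check whether an SCP is already queued for the target server, and stop retrying if so.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch (names invented, not the real HBase procedure API):
// before retrying an UNASSIGN against a server, check whether a
// ServerCrashProcedure is already queued for that server; if so, give up the
// retry loop instead of spinning against a dead regionserver.
class UnassignRetryPolicy {
    private final Set<String> serversWithCrashProcedure = new HashSet<>();

    public void onServerCrashProcedureStarted(String serverName) {
        serversWithCrashProcedure.add(serverName);
    }

    // Returns true only if it still makes sense to retry the UNASSIGN RPC.
    public boolean shouldRetryUnassign(String targetServer) {
        return !serversWithCrashProcedure.contains(targetServer);
    }
}
```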



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-20 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839202#comment-17839202
 ] 

chenglei commented on HBASE-28509:
--

Pushed to 2.6+, thanks [~zhangduo] for reviewing!

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.1
>
>
> When we invoke {{AsyncTableResultScanner.close}}, 
> {{AsyncTableResultScanner.resultQueue}} is cleared and 
> {{AsyncTableResultScanner.closed}} is set to true, so we do not need any 
> more scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} 
> would be invoked to perform another unnecessary scan on the {{RegionServer}} 
> and call {{AsyncTableResultScanner.onNext}} again when the {{ScanResponse}} 
> is received. {{AsyncTableResultScanner.onNext}} would do nothing but discard 
> the scan results because {{AsyncTableResultScanner.closed}} is true. We could 
> skip this unnecessary scan on the {{RegionServer}} and close the scanner 
> directly.
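The proposed short-circuit can be modeled with a minimal sketch (invented stand-in classes, not the real asyncclient code): once {{close}} clears the queue and sets the closed flag, {{resume}} returns immediately instead of issuing another scan whose results {{onNext}} would only discard.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical model (not the real asyncclient classes): close() clears the
// queue and sets closed; resume() then skips the extra scan entirely.
class ScannerModel {
    private final Queue<String> resultQueue = new ArrayDeque<>();
    private boolean closed = false;
    private int scansIssued = 0;

    public synchronized void close() {
        resultQueue.clear();
        closed = true;
    }

    // Stand-in for ScanResumer.resume: short-circuit when already closed.
    public synchronized boolean resume() {
        if (closed) {
            return false;  // nothing to do, the scanner is gone
        }
        scansIssued++;     // a real client would send the next scan RPC here
        return true;
    }

    public synchronized int getScansIssued() {
        return scansIssued;
    }
}
```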





[jira] [Updated] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-20 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-28509:
-
Fix Version/s: 2.6.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.1
>
>
> When we invoke {{AsyncTableResultScanner.close}}, 
> {{AsyncTableResultScanner.resultQueue}} is cleared and 
> {{AsyncTableResultScanner.closed}} is set to true, so we do not need any 
> more scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} 
> would be invoked to perform another unnecessary scan on the {{RegionServer}} 
> and call {{AsyncTableResultScanner.onNext}} again when the {{ScanResponse}} 
> is received. {{AsyncTableResultScanner.onNext}} would do nothing but discard 
> the scan results because {{AsyncTableResultScanner.closed}} is true. We could 
> skip this unnecessary scan on the {{RegionServer}} and close the scanner 
> directly.





[jira] [Updated] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-28509:
-
Status: Patch Available  (was: Open)

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>  Labels: pull-request-available
>
> When we invoke {{AsyncTableResultScanner.close}}, 
> {{AsyncTableResultScanner.resultQueue}} is cleared and 
> {{AsyncTableResultScanner.closed}} is set to true, so we do not need any 
> more scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} 
> would be invoked to perform another unnecessary scan on the {{RegionServer}} 
> and call {{AsyncTableResultScanner.onNext}} again when the {{ScanResponse}} 
> is received. {{AsyncTableResultScanner.onNext}} would do nothing but discard 
> the scan results because {{AsyncTableResultScanner.closed}} is true. We could 
> skip this unnecessary scan on the {{RegionServer}} and close the scanner 
> directly.





[jira] [Assigned] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei reassigned HBASE-28509:


Assignee: chenglei

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>  Labels: pull-request-available
>
> When we invoke {{AsyncTableResultScanner.close}}, 
> {{AsyncTableResultScanner.resultQueue}} is cleared and 
> {{AsyncTableResultScanner.closed}} is set to true, so we do not need any 
> more scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} 
> would be invoked to perform another unnecessary scan on the {{RegionServer}} 
> and call {{AsyncTableResultScanner.onNext}} again when the {{ScanResponse}} 
> is received. {{AsyncTableResultScanner.onNext}} would do nothing but discard 
> the scan results because {{AsyncTableResultScanner.closed}} is true. We could 
> skip this unnecessary scan on the {{RegionServer}} and close the scanner 
> directly.





[jira] [Updated] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-28509:
-
Description: When we invoke {{AsyncTableResultScanner.close}},  
{{AsyncTableResultScanner.resultQueue}} is cleared and 
{{AsyncTableResultScanner.closed}} is set to true, and we do not need any more 
scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} would 
be invoked to perform another unnecessary scan on {{RegionServer}} and call 
{{AsyncTableResultScanner.onNext}} again when {{ScanResponse}} is received.  
{{AsyncTableResultScanner.onNext}}  would do nothing else but just discard scan 
results because {{AsyncTableResultScanner.closed}} is true.  We could save this 
unnecessary scan on {{RegionServer}} and close scanner directly .  (was: When 
we invoke {{AsyncTableResultScanner.close}},  
{{AsyncTableResultScanner.resultQueue}} is cleared and 
{{AsyncTableResultScanner.closed}} is set to true, and we do not need any more 
scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} is 
invoked to perform another unnecessary scan on {{RegionServer}} and call 
{{AsyncTableResultScanner.onNext}} again when it receives {{ScanResponse}}.  
{{AsyncTableResultScanner.onNext}}  would do nothing else but just discard scan 
results because {{AsyncTableResultScanner.closed}} is true.  We could save this 
unnecessary scan on {{RegionServer}} and close scanner directly .)

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Priority: Major
>
> When we invoke {{AsyncTableResultScanner.close}}, 
> {{AsyncTableResultScanner.resultQueue}} is cleared and 
> {{AsyncTableResultScanner.closed}} is set to true, so we do not need any 
> more scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} 
> would be invoked to perform another unnecessary scan on the {{RegionServer}} 
> and call {{AsyncTableResultScanner.onNext}} again when the {{ScanResponse}} 
> is received. {{AsyncTableResultScanner.onNext}} would do nothing but discard 
> the scan results because {{AsyncTableResultScanner.closed}} is true. We could 
> skip this unnecessary scan on the {{RegionServer}} and close the scanner 
> directly.





[jira] [Updated] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-28509:
-
Description: When we invoke {{AsyncTableResultScanner.close}},  
{{AsyncTableResultScanner.resultQueue}} is cleared and 
{{AsyncTableResultScanner.closed}} is set to true, and we do not need any more 
scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} is 
invoked to perform another unnecessary scan on {{RegionServer}} and call 
{{AsyncTableResultScanner.onNext}} again when it receives {{ScanResponse}}.  
{{AsyncTableResultScanner.onNext}}  would do nothing else but just discard scan 
results because {{AsyncTableResultScanner.closed}} is true.  We could save this 
unnecessary scan on {{RegionServer}} and close scanner directly .  (was: When 
we invoke {{AsyncTableResultScanner.close}},  
{{AsyncTableResultScanner.resultQueue}} is cleared and 
{{AsyncTableResultScanner.closed}} is set to true, and we do not need any more 
scan results, but if there is a {{ScanResumer}}, {{ScanResumer.resume}} would 
perform another unnecessary scan on {{RegionServer}} and call 
{{AsyncTableResultScanner.onNext}} again when receives {{ScanResponse}}.  
{{AsyncTableResultScanner.onNext}}  would do nothing else but just discard scan 
results because )

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Priority: Major
>
> When we invoke {{AsyncTableResultScanner.close}}, 
> {{AsyncTableResultScanner.resultQueue}} is cleared and 
> {{AsyncTableResultScanner.closed}} is set to true, so we do not need any 
> more scan results. But if there is a {{ScanResumer}}, {{ScanResumer.resume}} 
> is invoked to perform another unnecessary scan on the {{RegionServer}} and 
> call {{AsyncTableResultScanner.onNext}} again when it receives the 
> {{ScanResponse}}. {{AsyncTableResultScanner.onNext}} would do nothing but 
> discard the scan results because {{AsyncTableResultScanner.closed}} is true. 
> We could skip this unnecessary scan on the {{RegionServer}} and close the 
> scanner directly.





[jira] [Updated] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-28509:
-
Description: When we invoke {{AsyncTableResultScanner.close}},  
{{AsyncTableResultScanner.resultQueue}} is cleared and 
{{AsyncTableResultScanner.closed}} is set to true, and we do not need any more 
scan results, but if there is a {{ScanResumer}}, {{ScanResumer.resume}} would 
perform another unnecessary scan on {{RegionServer}} and call 
{{AsyncTableResultScanner.onNext}} again when receives {{ScanResponse}}.  
{{AsyncTableResultScanner.onNext}}  would do nothing else but just discard scan 
results because   (was: For {{}})

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Priority: Major
>
> When we invoke {{AsyncTableResultScanner.close}},  
> {{AsyncTableResultScanner.resultQueue}} is cleared and 
> {{AsyncTableResultScanner.closed}} is set to true, and we do not need any 
> more scan results, but if there is a {{ScanResumer}}, {{ScanResumer.resume}} 
> would perform another unnecessary scan on {{RegionServer}} and call 
> {{AsyncTableResultScanner.onNext}} again when receives {{ScanResponse}}.  
> {{AsyncTableResultScanner.onNext}}  would do nothing else but just discard 
> scan results because 





[jira] [Updated] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-28509:
-
Description: For {{}}

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Priority: Major
>
> For {{}}





[jira] [Updated] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner

2024-04-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-28509:
-
Summary: ScanResumer.resume would perform unnecessary scan when close 
AsyncTableResultScanner  (was: ScanResumer.resume would perform unnecessary 
scan when close AsyncTableResultScanner  and )

> ScanResumer.resume would perform unnecessary scan when close 
> AsyncTableResultScanner
> 
>
> Key: HBASE-28509
> URL: https://issues.apache.org/jira/browse/HBASE-28509
> Project: HBase
>  Issue Type: Improvement
>  Components: asyncclient
>Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
>Reporter: chenglei
>Priority: Major
>






[jira] [Created] (HBASE-28509) ScanResumer.resume would perform unnecessary scan when close AsyncTableResultScanner and

2024-04-09 Thread chenglei (Jira)
chenglei created HBASE-28509:


 Summary: ScanResumer.resume would perform unnecessary scan when 
close AsyncTableResultScanner  and 
 Key: HBASE-28509
 URL: https://issues.apache.org/jira/browse/HBASE-28509
 Project: HBase
  Issue Type: Improvement
  Components: asyncclient
Affects Versions: 2.6.0, 4.0.0-alpha-1, 3.0.0-beta-2
Reporter: chenglei








[jira] [Resolved] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-07-03 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei resolved HBASE-27954.
--
Resolution: Fixed

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> The code to get the {{midkey}} by {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}. The code 
> is somewhat complicated, so we could use 
> {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.
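The refactoring pattern applied here can be illustrated with a small sketch (invented types, not the actual {{HFileBlockIndex}} code): {{midkey}} stops re-implementing the non-root index lookup and delegates to a single shared {{getNonRootIndexedKey}} helper.

```java
// Hypothetical illustration of the deduplication pattern (invented types, not
// the real HFileBlockIndex code): midkey() previously duplicated the
// "read the i-th key out of a non-root index block" logic; after the
// refactoring it just calls the shared getNonRootIndexedKey() helper.
class IndexBlockReader {
    private final String[][] blocks; // blocks[blockIdx][entryIdx] = key

    IndexBlockReader(String[][] blocks) {
        this.blocks = blocks;
    }

    // The single copy of the lookup logic.
    public String getNonRootIndexedKey(int blockIdx, int entryIdx) {
        String[] block = blocks[blockIdx];
        if (entryIdx < 0 || entryIdx >= block.length) {
            return null; // out of range, mirrors a failed lookup
        }
        return block[entryIdx];
    }

    // midkey() now delegates instead of re-implementing the block scan.
    public String midkey(int midLeafBlockIdx, int midKeyEntry) {
        return getNonRootIndexedKey(midLeafBlockIdx, midKeyEntry);
    }
}
```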





[jira] [Updated] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-07-03 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27954:
-
Fix Version/s: 2.6.0
   3.0.0-beta-1

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> The code to get the {{midkey}} by {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}. The code 
> is somewhat complicated, so we could use 
> {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Comment Edited] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-07-03 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739631#comment-17739631
 ] 

chenglei edited comment on HBASE-27954 at 7/3/23 2:02 PM:
--

Pushed to 2.6+, thanks [~zhangduo] for reviewing!


was (Author: comnetwork):
PUshed to 2.6+, thanks [~zhangduo] for reviewing!

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> The code to get the {{midkey}} by {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}. The code 
> is somewhat complicated, so we could use 
> {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Commented] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-07-03 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739631#comment-17739631
 ] 

chenglei commented on HBASE-27954:
--

PUshed to 2.6+, thanks [~zhangduo] for reviewing!

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> The code to get the {{midkey}} by {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}. The code 
> is somewhat complicated, so we could use 
> {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Updated] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-28 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27954:
-
Description: The code to get the {{midkey}} by {{midLeafBlock}} in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
same as  {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, the code is 
somewhat complicated, we could use the 
{{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.  (was: The code to get 
the {{midkey}} in {{midLeafBlock}} in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
same as  {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, the code is 
somewhat complicated, we could use the 
{{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.)

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> The code to get the {{midkey}} by {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}. The code 
> is somewhat complicated, so we could use 
> {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Assigned] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-28 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei reassigned HBASE-27954:


Assignee: chenglei

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> The code to get the {{midkey}} in {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}. The code 
> is somewhat complicated, so we could use 
> {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Updated] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-27 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27954:
-
Issue Type: Improvement  (was: Bug)

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Improvement
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> The code to get the {{midkey}} in {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}. The code 
> is somewhat complicated, so we could use 
> {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Updated] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-27 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27954:
-
Description: The code to get the {{midkey}} in {{midLeafBlock}} in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
same as  {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, the code is 
somewhat complicated, we could use the 
{{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.  (was: The code to get 
the {{midkey}} in {{midLeafBlock}} in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
same as  {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, we could 
use the {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.)

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> The code to get the {{midkey}} in {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}. The code 
> is somewhat complicated, so we could use 
> {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Updated] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-27 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27954:
-
Description: The code to get the {{midkey}} in {{midLeafBlock}} in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
same as  {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, we could 
use the {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.  (was: In 
{{HFileBlockIndex}}, the code to get the {{midkey}} in {{midLeafBlock}} in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} is almost the same as  
{{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, we could use the 
{{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly in  
{{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.)

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> The code to get the {{midkey}} in {{midLeafBlock}} in the 
> {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} method is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, so we 
> could use {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly 
> in {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Updated] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-27 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27954:
-
Description: In {{HFileBlockIndex}}, the code to get the {{midkey}} in 
{{midLeafBlock}} in  {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} is 
almost the same as  {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, 
we could use the {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} 
directly in  {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.  (was: In 
{{HFileBlockIndex}}, the code for {{BlockIndexReader.BlockIndexReader}} and 
{{}})

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> In {{HFileBlockIndex}}, the code to get the {{midkey}} in {{midLeafBlock}} 
> in {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}} is almost the 
> same as {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}}, so we 
> could use {{HFileBlockIndex.BlockIndexReader.getNonRootIndexedKey}} directly 
> in {{HFileBlockIndex.CellBasedKeyBlockIndexReader.midkey}}.





[jira] [Updated] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-27 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27954:
-
Description: In HFileBlockIndex, BlockIndexReader.

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> In HFileBlockIndex, BlockIndexReader.





[jira] [Updated] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-27 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27954:
-
Description: In {{HFileBlockIndex}}, the code for 
{{BlockIndexReader.BlockIndexReader}} and {{}}  (was: In HFileBlockIndex, 
BlockIndexReader.)

> Eliminate duplicate code for  getNonRootIndexedKey in HFileBlockIndex
> -
>
> Key: HBASE-27954
> URL: https://issues.apache.org/jira/browse/HBASE-27954
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> In {{HFileBlockIndex}}, the code for {{BlockIndexReader.BlockIndexReader}} 
> and {{}}





[jira] [Created] (HBASE-27954) Eliminate duplicate code for getNonRootIndexedKey in HFileBlockIndex

2023-06-27 Thread chenglei (Jira)
chenglei created HBASE-27954:


 Summary: Eliminate duplicate code for  getNonRootIndexedKey in 
HFileBlockIndex
 Key: HBASE-27954
 URL: https://issues.apache.org/jira/browse/HBASE-27954
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 3.0.0-alpha-4, 4.0.0-alpha-1
Reporter: chenglei








[jira] [Updated] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-17 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27940:
-
Fix Version/s: 2.6.0
   2.4.18
   2.5.6
   3.0.0-beta-1

> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 2.5.6, 3.0.0-beta-1
>
>
> After HBASE-27053, the checksum is removed from the {{HFileBlock}} {{ByteBuff}} 
> in {{FSReaderImpl.readBlockDataInternal}} once the checksum is verified, so 
> {{HFileBlock.buf}} does not include the checksum. However, 
> {{BlockIndexReader.readMultiLevelIndexRoot}} still subtracts the checksum 
> bytes after reading the root index entries to check whether the midkey 
> metadata exists, so the midkey metadata would always be ignored: 
> {code:java}
> public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
> throws IOException {
>   DataInputStream in = readRootIndex(blk, numEntries);
>   // after reading the root index the checksum bytes have to
>   // be subtracted to know if the mid key exists.
>   int checkSumBytes = blk.totalChecksumBytes();
>   if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
> // No mid-key metadata available.
> return;
>   }
>   midLeafBlockOffset = in.readLong();
>   midLeafBlockOnDiskSize = in.readInt();
>   midKeyEntry = in.readInt();
>  }
> {code}
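
The effect of the stale subtraction can be shown with a minimal, self-contained sketch. The class and method names here are hypothetical, and {{MID_KEY_METADATA_SIZE}} is a stand-in for the real constant (a long offset plus two ints); the idea is that when the remaining bytes are exactly the midkey metadata, subtracting checksum bytes that are no longer in the buffer makes the check wrongly fail:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class MidKeyCheckDemo {
    // Stand-in for HFileBlockIndex.MID_KEY_METADATA_SIZE:
    // long midLeafBlockOffset + int midLeafBlockOnDiskSize + int midKeyEntry.
    static final int MID_KEY_METADATA_SIZE = 8 + 4 + 4;

    // Buggy check from readMultiLevelIndexRoot: still subtracts checksum bytes.
    static boolean buggySkipsMidKey(int available, int staleChecksumBytes) {
        return (available - staleChecksumBytes) < MID_KEY_METADATA_SIZE;
    }

    // Fixed check: after HBASE-27053 the buffer holds no checksum,
    // so compare available() directly.
    static boolean fixedSkipsMidKey(int available) {
        return available < MID_KEY_METADATA_SIZE;
    }

    public static void main(String[] args) throws IOException {
        // A root index stream whose remaining bytes are exactly the metadata.
        byte[] remaining = new byte[MID_KEY_METADATA_SIZE];
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(remaining));
        int staleChecksumBytes = 4; // what totalChecksumBytes() might still report

        System.out.println(buggySkipsMidKey(in.available(), staleChecksumBytes)); // true
        System.out.println(fixedSkipsMidKey(in.available()));                     // false
    }
}
```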





[jira] [Resolved] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-17 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei resolved HBASE-27940.
--
Resolution: Fixed

> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> After HBASE-27053, the checksum is removed from the {{HFileBlock}} {{ByteBuff}} 
> in {{FSReaderImpl.readBlockDataInternal}} once the checksum is verified, so 
> {{HFileBlock.buf}} does not include the checksum. However, 
> {{BlockIndexReader.readMultiLevelIndexRoot}} still subtracts the checksum 
> bytes after reading the root index entries to check whether the midkey 
> metadata exists, so the midkey metadata would always be ignored: 
> {code:java}
> public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
> throws IOException {
>   DataInputStream in = readRootIndex(blk, numEntries);
>   // after reading the root index the checksum bytes have to
>   // be subtracted to know if the mid key exists.
>   int checkSumBytes = blk.totalChecksumBytes();
>   if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
> // No mid-key metadata available.
> return;
>   }
>   midLeafBlockOffset = in.readLong();
>   midLeafBlockOnDiskSize = in.readInt();
>   midKeyEntry = in.readInt();
>  }
> {code}





[jira] [Commented] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-17 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733762#comment-17733762
 ] 

chenglei commented on HBASE-27940:
--

Pushed to 2.4+, thanks [~zhangduo] for reviewing!

> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> After HBASE-27053, the checksum is removed from the {{HFileBlock}} {{ByteBuff}} 
> in {{FSReaderImpl.readBlockDataInternal}} once the checksum is verified, so 
> {{HFileBlock.buf}} does not include the checksum. However, 
> {{BlockIndexReader.readMultiLevelIndexRoot}} still subtracts the checksum 
> bytes after reading the root index entries to check whether the midkey 
> metadata exists, so the midkey metadata would always be ignored: 
> {code:java}
> public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
> throws IOException {
>   DataInputStream in = readRootIndex(blk, numEntries);
>   // after reading the root index the checksum bytes have to
>   // be subtracted to know if the mid key exists.
>   int checkSumBytes = blk.totalChecksumBytes();
>   if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
> // No mid-key metadata available.
> return;
>   }
>   midLeafBlockOffset = in.readLong();
>   midLeafBlockOnDiskSize = in.readInt();
>   midKeyEntry = in.readInt();
>  }
> {code}





[jira] [Assigned] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-16 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei reassigned HBASE-27940:


Assignee: chenglei

> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> After HBASE-27053, the checksum is removed from the {{HFileBlock}} {{ByteBuff}} 
> in {{FSReaderImpl.readBlockDataInternal}} once the checksum is verified, so 
> {{HFileBlock.buf}} does not include the checksum. However, 
> {{BlockIndexReader.readMultiLevelIndexRoot}} still subtracts the checksum 
> bytes after reading the root index entries to check whether the midkey 
> metadata exists, so the midkey metadata would always be ignored: 
> {code:java}
> public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
> throws IOException {
>   DataInputStream in = readRootIndex(blk, numEntries);
>   // after reading the root index the checksum bytes have to
>   // be subtracted to know if the mid key exists.
>   int checkSumBytes = blk.totalChecksumBytes();
>   if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
> // No mid-key metadata available.
> return;
>   }
>   midLeafBlockOffset = in.readLong();
>   midLeafBlockOnDiskSize = in.readInt();
>   midKeyEntry = in.readInt();
>  }
> {code}





[jira] [Updated] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-16 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27940:
-
Description: 
After HBASE-27053, the checksum is removed from the {{HFileBlock}} {{ByteBuff}} 
in {{FSReaderImpl.readBlockDataInternal}} once the checksum is verified, so 
{{HFileBlock.buf}} does not include the checksum. However, 
{{BlockIndexReader.readMultiLevelIndexRoot}} still subtracts the checksum 
bytes after reading the root index entries to check whether the midkey 
metadata exists, so the midkey metadata would always be ignored: 

{code:java}
public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
throws IOException {
  DataInputStream in = readRootIndex(blk, numEntries);
  // after reading the root index the checksum bytes have to
  // be subtracted to know if the mid key exists.
  int checkSumBytes = blk.totalChecksumBytes();
  if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
// No mid-key metadata available.
return;
  }
  midLeafBlockOffset = in.readLong();
  midLeafBlockOnDiskSize = in.readInt();
  midKeyEntry = in.readInt();
 }
{code}


  was:
After HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  in 
{{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
{{HFileBlock.buf}} does not include checksum, but for 
{{BlockIndexReader.readMultiLevelIndexRoot}}, after read root index entries , 
it still subtract the checksum to check if the midkey metadat exists,  the : 

{code:java}
public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
throws IOException {
  DataInputStream in = readRootIndex(blk, numEntries);
  // after reading the root index the checksum bytes have to
  // be subtracted to know if the mid key exists.
  int checkSumBytes = blk.totalChecksumBytes();
  if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
// No mid-key metadata available.
return;
  }
  midLeafBlockOffset = in.readLong();
  midLeafBlockOnDiskSize = in.readInt();
  midKeyEntry = in.readInt();
 }
{code}



> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> After HBASE-27053, the checksum is removed from the {{HFileBlock}} {{ByteBuff}} 
> in {{FSReaderImpl.readBlockDataInternal}} once the checksum is verified, so 
> {{HFileBlock.buf}} does not include the checksum. However, 
> {{BlockIndexReader.readMultiLevelIndexRoot}} still subtracts the checksum 
> bytes after reading the root index entries to check whether the midkey 
> metadata exists, so the midkey metadata would always be ignored: 
> {code:java}
> public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
> throws IOException {
>   DataInputStream in = readRootIndex(blk, numEntries);
>   // after reading the root index the checksum bytes have to
>   // be subtracted to know if the mid key exists.
>   int checkSumBytes = blk.totalChecksumBytes();
>   if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
> // No mid-key metadata available.
> return;
>   }
>   midLeafBlockOffset = in.readLong();
>   midLeafBlockOnDiskSize = in.readInt();
>   midKeyEntry = in.readInt();
>  }
> {code}





[jira] [Updated] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-16 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27940:
-
Description: 
After HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  in 
{{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
{{HFileBlock.buf}} does not include checksum, but for 
{{BlockIndexReader.readMultiLevelIndexRoot}}, after read root index entries , 
it still subtract the checksum to check if the midkey metadat exists,  the : 

{code:java}
public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
throws IOException {
  DataInputStream in = readRootIndex(blk, numEntries);
  // after reading the root index the checksum bytes have to
  // be subtracted to know if the mid key exists.
  int checkSumBytes = blk.totalChecksumBytes();
  if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
// No mid-key metadata available.
return;
  }
  midLeafBlockOffset = in.readLong();
  midLeafBlockOnDiskSize = in.readInt();
  midKeyEntry = in.readInt();
 }
{code}


  was:
After HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  in 
{{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
{{HFileBlock.buf}} does not include checksum, but for 
{{BlockIndexReader.readMultiLevelIndexRoot}}, after read root index entries , 
it still subtract the checksum to check if the midkey exists: 

{code:java}
public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
throws IOException {
  DataInputStream in = readRootIndex(blk, numEntries);
  // after reading the root index the checksum bytes have to
  // be subtracted to know if the mid key exists.
  int checkSumBytes = blk.totalChecksumBytes();
  if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
// No mid-key metadata available.
return;
  }
  midLeafBlockOffset = in.readLong();
  midLeafBlockOnDiskSize = in.readInt();
  midKeyEntry = in.readInt();
 }
{code}



> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> After HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  
> in {{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
> {{HFileBlock.buf}} does not include checksum, but for 
> {{BlockIndexReader.readMultiLevelIndexRoot}}, after read root index entries , 
> it still subtract the checksum to check if the midkey metadat exists,  the : 
> {code:java}
> public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
> throws IOException {
>   DataInputStream in = readRootIndex(blk, numEntries);
>   // after reading the root index the checksum bytes have to
>   // be subtracted to know if the mid key exists.
>   int checkSumBytes = blk.totalChecksumBytes();
>   if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
> // No mid-key metadata available.
> return;
>   }
>   midLeafBlockOffset = in.readLong();
>   midLeafBlockOnDiskSize = in.readInt();
>   midKeyEntry = in.readInt();
>  }
> {code}





[jira] [Updated] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-16 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27940:
-
Description: 
After HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  in 
{{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
{{HFileBlock.buf}} does not include checksum, but for 
{{BlockIndexReader.readMultiLevelIndexRoot}}, after read root index entries , 
it still subtract the checksum to check if the midkey exists: 

{code:java}
public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
throws IOException {
  DataInputStream in = readRootIndex(blk, numEntries);
  // after reading the root index the checksum bytes have to
  // be subtracted to know if the mid key exists.
  int checkSumBytes = blk.totalChecksumBytes();
  if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
// No mid-key metadata available.
return;
  }
  midLeafBlockOffset = in.readLong();
  midLeafBlockOnDiskSize = in.readInt();
  midKeyEntry = in.readInt();
 }
{code}


  was:After HBASE-27053, checksum is removed from the {{HFileBlock}} 
{{ByteBuff}}  in {{FSReaderImpl.readBlockDataInternal}}  once the checksum is 
verified, so {{HFileBlock.buf}} does not include checksum, but for 
{{BlockIndexReader.readMultiLevelIndexRoot}}, it checks 


> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> After HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  
> in {{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
> {{HFileBlock.buf}} does not include checksum, but for 
> {{BlockIndexReader.readMultiLevelIndexRoot}}, after read root index entries , 
> it still subtract the checksum to check if the midkey exists: 
> {code:java}
> public void readMultiLevelIndexRoot(HFileBlock blk, final int numEntries) 
> throws IOException {
>   DataInputStream in = readRootIndex(blk, numEntries);
>   // after reading the root index the checksum bytes have to
>   // be subtracted to know if the mid key exists.
>   int checkSumBytes = blk.totalChecksumBytes();
>   if ((in.available() - checkSumBytes) < MID_KEY_METADATA_SIZE) {
> // No mid-key metadata available.
> return;
>   }
>   midLeafBlockOffset = in.readLong();
>   midLeafBlockOnDiskSize = in.readInt();
>   midKeyEntry = in.readInt();
>  }
> {code}





[jira] [Updated] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-16 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27940:
-
Description: After HBASE-27053, checksum is removed from the {{HFileBlock}} 
{{ByteBuff}}  in {{FSReaderImpl.readBlockDataInternal}}  once the checksum is 
verified, so {{HFileBlock.buf}} does not include checksum, but for 
{{BlockIndexReader.readMultiLevelIndexRoot}}, it checks   (was: After 
HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  in 
{{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
{{HFileBlock.buf}} does not include checksum, but for {{}})

> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> After HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  
> in {{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
> {{HFileBlock.buf}} does not include checksum, but for 
> {{BlockIndexReader.readMultiLevelIndexRoot}}, it checks 





[jira] [Updated] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-16 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27940:
-
Description: After HBASE-27053, checksum is removed from the {{HFileBlock}} 
{{ByteBuff}}  in {{FSReaderImpl.readBlockDataInternal}}  once the checksum is 
verified, so {{HFileBlock.buf}} does not include checksum, but for {{}}

> Midkey metadata in root index block would always be ignored by 
> BlockIndexReader.readMultiLevelIndexRoot
> ---
>
> Key: HBASE-27940
> URL: https://issues.apache.org/jira/browse/HBASE-27940
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> After HBASE-27053, checksum is removed from the {{HFileBlock}} {{ByteBuff}}  
> in {{FSReaderImpl.readBlockDataInternal}}  once the checksum is verified, so 
> {{HFileBlock.buf}} does not include checksum, but for {{}}





[jira] [Created] (HBASE-27940) Midkey metadata in root index block would always be ignored by BlockIndexReader.readMultiLevelIndexRoot

2023-06-16 Thread chenglei (Jira)
chenglei created HBASE-27940:


 Summary: Midkey metadata in root index block would always be 
ignored by BlockIndexReader.readMultiLevelIndexRoot
 Key: HBASE-27940
 URL: https://issues.apache.org/jira/browse/HBASE-27940
 Project: HBase
  Issue Type: Bug
  Components: HFile
Affects Versions: 2.5.5, 2.4.17, 3.0.0-alpha-4, 4.0.0-alpha-1
Reporter: chenglei








[jira] [Commented] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-16 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733687#comment-17733687
 ] 

chenglei commented on HBASE-27924:
--

Thanks [~zhangduo] for reviewing and merging!

> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> {{NettyHBaseSaslRpcServerHandler.doResponse}} and 
> {{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could 
> replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with 
> {{ServerRpcConnection.doRawSaslReply}}: 
> {code:java}
> private void doResponse(ChannelHandlerContext ctx, SaslStatus status, 
> Writable rv,
> String errorClass, String error) throws IOException {
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> ByteBuf resp = ctx.alloc().buffer(256);
> try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
> }
> NettyFutureUtils.safeWriteAndFlush(ctx, resp);
>   }
> {code}
> {code:java}
> protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
> errorClass,
> String error) throws IOException {
> BufferChain bc;
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> try (ByteBufferOutputStream saslResponse = new 
> ByteBufferOutputStream(256);
>   DataOutputStream out = new DataOutputStream(saslResponse)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
>   bc = new BufferChain(saslResponse.getByteBuffer());
> }
> doRespond(() -> bc);
>   }
> {code}
> At the same time, {{NettyHBaseSaslRpcServerHandler.doResponse}} sends the 
> ByteBuf directly, not the unified {{RpcResponse}}, so it would not be handled 
> by the logic in {{NettyRpcServerResponseEncoder.write}}, which updates 
> {{MetricsHBaseServer.sentBytes}}. Using {{ServerRpcConnection.doRawSaslReply}} 
> uniformly would make {{MetricsHBaseServer.sentBytes}} more accurate.
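
To make the duplication concrete, here is a simplified, self-contained sketch of the wire format both methods emit: an int status followed by either the payload or an error class/message pair. This is an approximation, not the real encoding; {{writeUTF}} stands in for {{WritableUtils.writeString}} and the {{Writable}} payload, and the class name is hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SaslReplyFormatDemo {
    // Mirrors the shape both doResponse and doRawSaslReply produce:
    // an int status, then either the payload or errorClass + error strings.
    static byte[] encode(int status, boolean success, String payload,
                         String errorClass, String error) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream(256);
        try (DataOutputStream out = new DataOutputStream(baos)) {
            out.writeInt(status);            // write status
            if (success) {
                out.writeUTF(payload);       // stands in for rv.write(out)
            } else {
                out.writeUTF(errorClass);    // stands in for WritableUtils.writeString
                out.writeUTF(error);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] ok = encode(0, true, "token", null, null);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(ok));
        System.out.println(in.readInt()); // 0
        System.out.println(in.readUTF()); // token
    }
}
```

Because only the output stream differs between the two real methods (a Netty ByteBuf versus a ByteBuffer-backed BufferChain), routing both through one implementation keeps the encoding in one place and lets the response pass through the encoder that updates the sentBytes metric.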





[jira] [Updated] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-16 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27924:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> {{NettyHBaseSaslRpcServerHandler.doResponse}} and 
> {{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could 
> replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with 
> {{ServerRpcConnection.doRawSaslReply}}: 
> {code:java}
> private void doResponse(ChannelHandlerContext ctx, SaslStatus status, 
> Writable rv,
> String errorClass, String error) throws IOException {
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> ByteBuf resp = ctx.alloc().buffer(256);
> try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
> }
> NettyFutureUtils.safeWriteAndFlush(ctx, resp);
>   }
> {code}
> {code:java}
> protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
> errorClass,
> String error) throws IOException {
> BufferChain bc;
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> try (ByteBufferOutputStream saslResponse = new 
> ByteBufferOutputStream(256);
>   DataOutputStream out = new DataOutputStream(saslResponse)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
>   bc = new BufferChain(saslResponse.getByteBuffer());
> }
> doRespond(() -> bc);
>   }
> {code}
> At the same time, {{NettyHBaseSaslRpcServerHandler.doResponse}} sends the 
> ByteBuf directly, not the unified {{RpcResponse}}, so it would not be handled 
> by the logic in {{NettyRpcServerResponseEncoder.write}}, which updates 
> {{MetricsHBaseServer.sentBytes}}. Using {{ServerRpcConnection.doRawSaslReply}} 
> uniformly would make {{MetricsHBaseServer.sentBytes}} more accurate.





[jira] [Updated] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-16 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27924:
-
Fix Version/s: 2.6.0
   3.0.0-beta-1

> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> {{NettyHBaseSaslRpcServerHandler.doResponse}} and 
> {{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could 
> replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with 
> {{ServerRpcConnection.doRawSaslReply}}: 
> {code:java}
> private void doResponse(ChannelHandlerContext ctx, SaslStatus status, 
> Writable rv,
> String errorClass, String error) throws IOException {
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> ByteBuf resp = ctx.alloc().buffer(256);
> try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
> }
> NettyFutureUtils.safeWriteAndFlush(ctx, resp);
>   }
> {code}
> {code:java}
> protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
> errorClass,
> String error) throws IOException {
> BufferChain bc;
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> try (ByteBufferOutputStream saslResponse = new 
> ByteBufferOutputStream(256);
>   DataOutputStream out = new DataOutputStream(saslResponse)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
>   bc = new BufferChain(saslResponse.getByteBuffer());
> }
> doRespond(() -> bc);
>   }
> {code}
> At the same time, {{NettyHBaseSaslRpcServerHandler.doResponse}} sends the 
> ByteBuf directly, not the unified {{RpcResponse}}, so it would not be handled 
> by the logic in {{NettyRpcServerResponseEncoder.write}}, which updates 
> {{MetricsHBaseServer.sentBytes}}. Using {{ServerRpcConnection.doRawSaslReply}} 
> uniformly would make {{MetricsHBaseServer.sentBytes}} more accurate.





[jira] [Updated] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27924:
-
Status: Patch Available  (was: Open)

> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 3.0.0-alpha-4, 2.6.0, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{NettyHBaseSaslRpcServerHandler.doResponse}} and 
> {{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could 
> replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with 
> {{ServerRpcConnection.doRawSaslReply}}: 
> {code:java}
> private void doResponse(ChannelHandlerContext ctx, SaslStatus status, 
> Writable rv,
> String errorClass, String error) throws IOException {
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> ByteBuf resp = ctx.alloc().buffer(256);
> try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
> }
> NettyFutureUtils.safeWriteAndFlush(ctx, resp);
>   }
> {code}
> {code:java}
> protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
> errorClass,
> String error) throws IOException {
> BufferChain bc;
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> try (ByteBufferOutputStream saslResponse = new 
> ByteBufferOutputStream(256);
>   DataOutputStream out = new DataOutputStream(saslResponse)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
>   bc = new BufferChain(saslResponse.getByteBuffer());
> }
> doRespond(() -> bc);
>   }
> {code}
> At the same time, {{NettyHBaseSaslRpcServerHandler.doResponse}} sends a
> ByteBuf directly, not the unified {{RpcResponse}}, so it would not be handled
> by the logic in {{NettyRpcServerResponseEncoder.write}}, which updates
> {{MetricsHBaseServer.sentBytes}}. Using {{ServerRpcConnection.doRawSaslReply}}
> uniformly would make {{MetricsHBaseServer.sentBytes}} more accurate.
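A toy model of why the direct ByteBuf write is invisible to the metric: an outbound encoder only acts on the message type it recognizes. This is a plain-Java sketch with hypothetical names; the real `NettyRpcServerResponseEncoder` is a Netty `ChannelOutboundHandler`.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SentBytesSketch {
    static final AtomicLong sentBytes = new AtomicLong();

    static final class RpcResponse {
        final byte[] payload;
        RpcResponse(byte[] payload) { this.payload = payload; }
    }

    // Stands in for NettyRpcServerResponseEncoder.write: it only recognizes
    // RpcResponse messages, so only those update the sentBytes metric.
    static void writeThroughPipeline(Object msg) {
        if (msg instanceof RpcResponse) {
            sentBytes.addAndGet(((RpcResponse) msg).payload.length);
        }
        // a raw byte[] (standing in for a ByteBuf written directly by the
        // SASL handler) passes through uncounted
    }

    public static void main(String[] args) {
        writeThroughPipeline(new RpcResponse(new byte[100])); // counted
        writeThroughPipeline(new byte[40]);                   // not counted
        System.out.println(sentBytes.get()); // 100, not 140
    }
}
```

Routing the SASL reply through the same typed-response path would make the 40 bypassed bytes count as well, which is the accuracy gain the issue describes.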





[jira] [Updated] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27924:
-
Description: 
{{NettyHBaseSaslRpcServerHandler.doResponse}} and
{{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could
replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with
{{ServerRpcConnection.doRawSaslReply}}:

{code:java}
private void doResponse(ChannelHandlerContext ctx, SaslStatus status, Writable 
rv,
String errorClass, String error) throws IOException {
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
ByteBuf resp = ctx.alloc().buffer(256);
try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
}
NettyFutureUtils.safeWriteAndFlush(ctx, resp);
  }
{code}


{code:java}
protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
errorClass,
String error) throws IOException {
BufferChain bc;
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
try (ByteBufferOutputStream saslResponse = new ByteBufferOutputStream(256);
  DataOutputStream out = new DataOutputStream(saslResponse)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
  bc = new BufferChain(saslResponse.getByteBuffer());
}
doRespond(() -> bc);
  }
{code}

At the same time, {{NettyHBaseSaslRpcServerHandler.doResponse}} sends a ByteBuf
directly, not the unified {{RpcResponse}}, so it would not be handled by the
logic in {{NettyRpcServerResponseEncoder.write}}, which updates
{{MetricsHBaseServer.sentBytes}}. Using {{ServerRpcConnection.doRawSaslReply}}
uniformly would make {{MetricsHBaseServer.sentBytes}} more accurate.

  was:
{{NettyHBaseSaslRpcServerHandler.doResponse}}  and  
{{ServerRpcConnection.doRawSaslReply}} are very similar, I think we could 
replace {{NettyHBaseSaslRpcServerHandler.doResponse}}  with 
{{ServerRpcConnection.doRawSaslReply}}: 

{code:java}
private void doResponse(ChannelHandlerContext ctx, SaslStatus status, Writable 
rv,
String errorClass, String error) throws IOException {
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
ByteBuf resp = ctx.alloc().buffer(256);
try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
}
NettyFutureUtils.safeWriteAndFlush(ctx, resp);
  }
{code}


{code:java}
protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
errorClass,
String error) throws IOException {
BufferChain bc;
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
try (ByteBufferOutputStream saslResponse = new ByteBufferOutputStream(256);
  DataOutputStream out = new DataOutputStream(saslResponse)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
  bc = new BufferChain(saslResponse.getByteBuffer());
}
doRespond(() -> bc);
  }
{code}

At the same time, {{NettyHBaseSaslRpcServerHandler.doResponse}}  sends ByteBuf 
directly , so  it would not handled by the logic in  
{{NettyRpcServerResponseEncoder.write}}, which would update the 
{{MetricsHBaseServer.sentBytes}}.  Using   
{{ServerRpcConnection.doRawSaslReply}} uniformly would make the 
{{MetricsHBaseServer.sentBytes}} more accurate.


> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> 

[jira] [Assigned] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei reassigned HBASE-27924:


Assignee: chenglei

> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{NettyHBaseSaslRpcServerHandler.doResponse}} and
> {{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could
> replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with
> {{ServerRpcConnection.doRawSaslReply}}:
> {code:java}
> private void doResponse(ChannelHandlerContext ctx, SaslStatus status, 
> Writable rv,
> String errorClass, String error) throws IOException {
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> ByteBuf resp = ctx.alloc().buffer(256);
> try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
> }
> NettyFutureUtils.safeWriteAndFlush(ctx, resp);
>   }
> {code}
> {code:java}
> protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
> errorClass,
> String error) throws IOException {
> BufferChain bc;
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> try (ByteBufferOutputStream saslResponse = new 
> ByteBufferOutputStream(256);
>   DataOutputStream out = new DataOutputStream(saslResponse)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
>   bc = new BufferChain(saslResponse.getByteBuffer());
> }
> doRespond(() -> bc);
>   }
> {code}
> At the same time, {{NettyHBaseSaslRpcServerHandler.doResponse}} sends a
> ByteBuf directly, so it would not be handled by the logic in
> {{NettyRpcServerResponseEncoder.write}}, which updates
> {{MetricsHBaseServer.sentBytes}}. Using {{ServerRpcConnection.doRawSaslReply}}
> uniformly would make {{MetricsHBaseServer.sentBytes}} more accurate.





[jira] [Updated] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27924:
-
Description: 
{{NettyHBaseSaslRpcServerHandler.doResponse}} and
{{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could
replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with
{{ServerRpcConnection.doRawSaslReply}}:

{code:java}
private void doResponse(ChannelHandlerContext ctx, SaslStatus status, Writable 
rv,
String errorClass, String error) throws IOException {
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
ByteBuf resp = ctx.alloc().buffer(256);
try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
}
NettyFutureUtils.safeWriteAndFlush(ctx, resp);
  }
{code}


{code:java}
protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
errorClass,
String error) throws IOException {
BufferChain bc;
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
try (ByteBufferOutputStream saslResponse = new ByteBufferOutputStream(256);
  DataOutputStream out = new DataOutputStream(saslResponse)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
  bc = new BufferChain(saslResponse.getByteBuffer());
}
doRespond(() -> bc);
  }
{code}

At the same time, {{NettyHBaseSaslRpcServerHandler.doResponse}} sends a ByteBuf
directly, so it would not be handled by the logic in
{{NettyRpcServerResponseEncoder.write}}, which updates
{{MetricsHBaseServer.sentBytes}}. Using {{ServerRpcConnection.doRawSaslReply}}
uniformly would make {{MetricsHBaseServer.sentBytes}} more accurate.

  was:
{{NettyHBaseSaslRpcServerHandler.doResponse}}  and  
{{ServerRpcConnection.doRawSaslReply}} are very similar, I think we could 
replace {{NettyHBaseSaslRpcServerHandler.doResponse}}  with 
{{ServerRpcConnection.doRawSaslReply}}: 

{code:java}
private void doResponse(ChannelHandlerContext ctx, SaslStatus status, Writable 
rv,
String errorClass, String error) throws IOException {
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
ByteBuf resp = ctx.alloc().buffer(256);
try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
}
NettyFutureUtils.safeWriteAndFlush(ctx, resp);
  }
{code}


{code:java}
protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
errorClass,
String error) throws IOException {
BufferChain bc;
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
try (ByteBufferOutputStream saslResponse = new ByteBufferOutputStream(256);
  DataOutputStream out = new DataOutputStream(saslResponse)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
  bc = new BufferChain(saslResponse.getByteBuffer());
}
doRespond(() -> bc);
  }
{code}



> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> {{NettyHBaseSaslRpcServerHandler.doResponse}} and
> {{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could
> replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with
> {{ServerRpcConnection.doRawSaslReply}}:
> {code:java}
> private void doResponse(ChannelHandlerContext ctx, SaslStatus status, 
> Writable rv,
> String errorClass, String error) throws IOException {
> // In my testing, 

[jira] [Updated] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27924:
-
Description: 
{{NettyHBaseSaslRpcServerHandler.doResponse}} and
{{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could
replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with
{{ServerRpcConnection.doRawSaslReply}}:

{code:java}
private void doResponse(ChannelHandlerContext ctx, SaslStatus status, Writable 
rv,
String errorClass, String error) throws IOException {
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
ByteBuf resp = ctx.alloc().buffer(256);
try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
}
NettyFutureUtils.safeWriteAndFlush(ctx, resp);
  }
{code}


{code:java}
protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
errorClass,
String error) throws IOException {
BufferChain bc;
// In my testing, have noticed that sasl messages are usually
// in the ballpark of 100-200. That's why the initial capacity is 256.
try (ByteBufferOutputStream saslResponse = new ByteBufferOutputStream(256);
  DataOutputStream out = new DataOutputStream(saslResponse)) {
  out.writeInt(status.state); // write status
  if (status == SaslStatus.SUCCESS) {
rv.write(out);
  } else {
WritableUtils.writeString(out, errorClass);
WritableUtils.writeString(out, error);
  }
  bc = new BufferChain(saslResponse.getByteBuffer());
}
doRespond(() -> bc);
  }
{code}


  was:There are two very similar code for 


> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> {{NettyHBaseSaslRpcServerHandler.doResponse}} and
> {{ServerRpcConnection.doRawSaslReply}} are very similar; I think we could
> replace {{NettyHBaseSaslRpcServerHandler.doResponse}} with
> {{ServerRpcConnection.doRawSaslReply}}:
> {code:java}
> private void doResponse(ChannelHandlerContext ctx, SaslStatus status, 
> Writable rv,
> String errorClass, String error) throws IOException {
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> ByteBuf resp = ctx.alloc().buffer(256);
> try (ByteBufOutputStream out = new ByteBufOutputStream(resp)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
> }
> NettyFutureUtils.safeWriteAndFlush(ctx, resp);
>   }
> {code}
> {code:java}
> protected final void doRawSaslReply(SaslStatus status, Writable rv, String 
> errorClass,
> String error) throws IOException {
> BufferChain bc;
> // In my testing, have noticed that sasl messages are usually
> // in the ballpark of 100-200. That's why the initial capacity is 256.
> try (ByteBufferOutputStream saslResponse = new 
> ByteBufferOutputStream(256);
>   DataOutputStream out = new DataOutputStream(saslResponse)) {
>   out.writeInt(status.state); // write status
>   if (status == SaslStatus.SUCCESS) {
> rv.write(out);
>   } else {
> WritableUtils.writeString(out, errorClass);
> WritableUtils.writeString(out, error);
>   }
>   bc = new BufferChain(saslResponse.getByteBuffer());
> }
> doRespond(() -> bc);
>   }
> {code}





[jira] [Updated] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27924:
-
Description: There are two very similar code for 

> Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the 
> sentByte metrics more accurate
> 
>
> Key: HBASE-27924
> URL: https://issues.apache.org/jira/browse/HBASE-27924
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> There are two very similar code for 





[jira] [Created] (HBASE-27924) Remove duplicate code for NettyHBaseSaslRpcServerHandler and make the sentByte metrics more accurate

2023-06-12 Thread chenglei (Jira)
chenglei created HBASE-27924:


 Summary: Remove duplicate code for NettyHBaseSaslRpcServerHandler 
and make the sentByte metrics more accurate
 Key: HBASE-27924
 URL: https://issues.apache.org/jira/browse/HBASE-27924
 Project: HBase
  Issue Type: Bug
  Components: netty, rpc, security
Affects Versions: 3.0.0-alpha-4, 2.6.0, 4.0.0-alpha-1
Reporter: chenglei








[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Fix Version/s: 3.0.0-beta-1

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 3.0.0-beta-1
>
>
> {{NettyRpcServer}} may hang if it should skip the initial sasl handshake,
> i.e. when the server does not enable security but the client does. I think
> this problem has two causes:
> * For the server:
>   The type of the response is {{RpcResponse}}, but when
> {{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
> {{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
> {{RpcResponse}} message cannot be sent.
> * For the client:
>   When {{NettyHBaseSaslRpcClientHandler}} receives
> {{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove
> {{SaslChallengeDecoder}} and {{NettyHBaseSaslRpcClientHandler}}, so
> subsequent responses are considered incorrect.
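To illustrate the client-side point: a decoder left in the pipeline keeps intercepting bytes that are no longer SASL challenges. The following is a toy model in plain Java with hypothetical handler names and a hypothetical sentinel value, not the actual Netty API (the real constant lives in `SaslUtil.SWITCH_TO_SIMPLE_AUTH`).

```java
import java.util.ArrayList;
import java.util.List;

public class SwitchToSimpleAuth {
    // Hypothetical sentinel; stands in for SaslUtil.SWITCH_TO_SIMPLE_AUTH.
    static final int SWITCH_TO_SIMPLE_AUTH = -1;

    // Toy channel pipeline: handlers process every inbound message in order.
    static final List<String> pipeline = new ArrayList<>(List.of(
        "SaslChallengeDecoder", "NettyHBaseSaslRpcClientHandler", "RpcResponseHandler"));

    // On SWITCH_TO_SIMPLE_AUTH the client must strip both SASL handlers;
    // otherwise later plain RPC responses are still fed to the SASL decoder
    // and rejected as malformed challenges, hanging the connection.
    static void onServerReply(int code) {
        if (code == SWITCH_TO_SIMPLE_AUTH) {
            pipeline.remove("SaslChallengeDecoder");
            pipeline.remove("NettyHBaseSaslRpcClientHandler");
        }
    }

    public static void main(String[] args) {
        onServerReply(SWITCH_TO_SIMPLE_AUTH);
        System.out.println(pipeline); // [RpcResponseHandler]
    }
}
```

In real Netty code the equivalent step would be `ChannelPipeline.remove(...)` calls on the handshake handlers once simple auth is negotiated.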





[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Fix Version/s: 2.6.0

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> {{NettyRpcServer}} may hang if it should skip the initial sasl handshake,
> i.e. when the server does not enable security but the client does. I think
> this problem has two causes:
> * For the server:
>   The type of the response is {{RpcResponse}}, but when
> {{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
> {{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
> {{RpcResponse}} message cannot be sent.
> * For the client:
>   When {{NettyHBaseSaslRpcClientHandler}} receives
> {{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove
> {{SaslChallengeDecoder}} and {{NettyHBaseSaslRpcClientHandler}}, so
> subsequent responses are considered incorrect.





[jira] [Resolved] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-12 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei resolved HBASE-27923.
--
Resolution: Fixed

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{NettyRpcServer}} may hang if it should skip the initial sasl handshake,
> i.e. when the server does not enable security but the client does. I think
> this problem has two causes:
> * For the server:
>   The type of the response is {{RpcResponse}}, but when
> {{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
> {{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
> {{RpcResponse}} message cannot be sent.
> * For the client:
>   When {{NettyHBaseSaslRpcClientHandler}} receives
> {{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove
> {{SaslChallengeDecoder}} and {{NettyHBaseSaslRpcClientHandler}}, so
> subsequent responses are considered incorrect.





[jira] [Commented] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-12 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17731468#comment-17731468
 ] 

chenglei commented on HBASE-27923:
--

Pushed to 2.6+ and backported the UT to 2.4 and 2.5. Thanks [~zhangduo] and
[~wchevreuil] for reviewing!

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{NettyRpcServer}} may hang if it should skip the initial sasl handshake,
> i.e. when the server does not enable security but the client does. I think
> this problem has two causes:
> * For the server:
>   The type of the response is {{RpcResponse}}, but when
> {{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
> {{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
> {{RpcResponse}} message cannot be sent.
> * For the client:
>   When {{NettyHBaseSaslRpcClientHandler}} receives
> {{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove
> {{SaslChallengeDecoder}} and {{NettyHBaseSaslRpcClientHandler}}, so
> subsequent responses are considered incorrect.





[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Description: 
{{NettyRpcServer}} may hang if it should skip the initial sasl handshake, i.e.
when the server does not enable security but the client does. I think this
problem has two causes:
* For the server:
  The type of the response is {{RpcResponse}}, but when
{{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
{{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
{{RpcResponse}} message cannot be sent.

* For the client:
  When {{NettyHBaseSaslRpcClientHandler}} receives
{{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove {{SaslChallengeDecoder}}
and {{NettyHBaseSaslRpcClientHandler}}, so subsequent responses are considered
incorrect.


  was:
{{NettyRpcServer}} may hange if  it should skip initial sasl handshake when 
server does not enable security  and client enables security,  I think this 
problem is caused by two reasons:
* For Server:
  The type of the response is  {{RpcResponse}}, but for 
{{NettyRpcServerPreambleHandler}},when it  send {{RpcResponse}} 
,{{NettyRpcServerResponseEncoder }} does not exist, so {{RpcResponse}} messages 
cannot be sent.

* For Client
  When {{NettyHBaseSaslRpcClientHandler}} receives 
{{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove {{SaslChallengeDecoder}} 
and {{NettyHBaseSaslRpcClientHandler}}, so the latter responses are considered 
to be incorrect.



> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{NettyRpcServer}} may hang if it should skip the initial sasl handshake,
> i.e. when the server does not enable security but the client does. I think
> this problem has two causes:
> * For the server:
>   The type of the response is {{RpcResponse}}, but when
> {{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
> {{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
> {{RpcResponse}} message cannot be sent.
> * For the client:
>   When {{NettyHBaseSaslRpcClientHandler}} receives
> {{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove
> {{SaslChallengeDecoder}} and {{NettyHBaseSaslRpcClientHandler}}, so
> subsequent responses are considered incorrect.





[jira] [Assigned] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei reassigned HBASE-27923:


Assignee: chenglei

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{NettyRpcServer}} may hang if it should skip the initial sasl handshake,
> i.e. when the server does not enable security but the client does. I think
> this problem has two causes:
> * For the server:
>   The type of the response is {{RpcResponse}}, but when
> {{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
> {{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
> {{RpcResponse}} message cannot be sent.
> * For the client:
>   When {{NettyHBaseSaslRpcClientHandler}} receives
> {{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove
> {{SaslChallengeDecoder}} and {{NettyHBaseSaslRpcClientHandler}}, so
> subsequent responses are considered incorrect.





[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-09 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Description: 
{{NettyRpcServer}} may hang if it should skip the initial sasl handshake, i.e.
when the server does not enable security but the client does. I think this
problem has two causes:
* For the server:
  The type of the response is {{RpcResponse}}, but when
{{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
{{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
{{RpcResponse}} message cannot be sent.

* For the client:
  When {{NettyHBaseSaslRpcClientHandler}} receives
{{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove {{SaslChallengeDecoder}}
and {{NettyHBaseSaslRpcClientHandler}}, so subsequent responses are considered
incorrect.


  was:
{{NettyRpcServer}} may hange if  it should skip initial sasl handshake when 
server does not enable security  and client enables security,I think this 
problem is caused by two reasons:
* For Server:
  The type of the response is  {{RpcResponse}}, but for 
{{NettyRpcServerPreambleHandler}},when it  send {{RpcResponse}} 
,{{NettyRpcServerResponseEncoder }} does not exist, so {{RpcResponse}} messages 
cannot be sent.

* For Client
  When {{NettyHBaseSaslRpcClientHandler}} receives 
{{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove {{SaslChallengeDecoder}} 
and {{NettyHBaseSaslRpcClientHandler}}, so the latter responses are considered 
to be incorrect.



> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> {{NettyRpcServer}} may hang if it should skip the initial sasl handshake,
> i.e. when the server does not enable security but the client does. I think
> this problem has two causes:
> * For the server:
>   The type of the response is {{RpcResponse}}, but when
> {{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}},
> {{NettyRpcServerResponseEncoder}} is not yet in the pipeline, so the
> {{RpcResponse}} message cannot be sent.
> * For the client:
>   When {{NettyHBaseSaslRpcClientHandler}} receives
> {{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove
> {{SaslChallengeDecoder}} and {{NettyHBaseSaslRpcClientHandler}}, so
> subsequent responses are considered incorrect.





[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-08 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Description: 
{{NettyRpcServer}} may hang when it should skip the initial SASL handshake, 
i.e. when the server does not enable security but the client does. I think this 
problem is caused by two reasons:
* For Server:
  The type of the response is {{RpcResponse}}, but when 
{{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}}, 
{{NettyRpcServerResponseEncoder}} does not exist in the pipeline, so 
{{RpcResponse}} messages cannot be sent.

* For Client
  When {{NettyHBaseSaslRpcClientHandler}} receives 
{{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove {{SaslChallengeDecoder}} 
and {{NettyHBaseSaslRpcClientHandler}} from the pipeline, so subsequent 
responses are decoded incorrectly.


  was:
{{NettyRpcServer}} may hang when it should skip the initial SASL handshake, 
i.e. when the server does not enable security but the client does. I think this 
problem is caused by two reasons:
For Server:


> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> {{NettyRpcServer}} may hang when it should skip the initial SASL handshake, 
> i.e. when the server does not enable security but the client does. I think 
> this problem is caused by two reasons:
> * For Server:
>   The type of the response is {{RpcResponse}}, but when 
> {{NettyRpcServerPreambleHandler}} sends the {{RpcResponse}}, 
> {{NettyRpcServerResponseEncoder}} does not exist in the pipeline, so 
> {{RpcResponse}} messages cannot be sent.
> * For Client
>   When {{NettyHBaseSaslRpcClientHandler}} receives 
> {{SaslUtil.SWITCH_TO_SIMPLE_AUTH}}, it does not remove 
> {{SaslChallengeDecoder}} and {{NettyHBaseSaslRpcClientHandler}} from the 
> pipeline, so subsequent responses are decoded incorrectly.





[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-08 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Description: 
{{NettyRpcServer}} may hang when it should skip the initial SASL handshake, 
i.e. when the server does not enable security but the client does. I think this 
problem is caused by two reasons:
For Server:

  was:{{NettyRpcServer}} may hang when it should skip the initial SASL 
handshake, i.e. when the server does not enable security but the client does. I 
think this problem is caused by two reasons:


> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> {{NettyRpcServer}} may hang when it should skip the initial SASL handshake, 
> i.e. when the server does not enable security but the client does. I think 
> this problem is caused by two reasons:
> For Server:





[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-08 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Description: {{NettyRpcServer}} may hang when it should skip the initial SASL 
handshake, i.e. when the server does not enable security but the client does. I 
think this problem is caused by two reasons:  (was: NettyRpcServer may hang if 
)

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> {{NettyRpcServer}} may hang when it should skip the initial SASL handshake, 
> i.e. when the server does not enable security but the client does. I think 
> this problem is caused by two reasons:





[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-08 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Description: NettyRpcServer may hang if 

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>
> NettyRpcServer may hang if 





[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-08 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Component/s: rpc

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, rpc, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>






[jira] [Updated] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-08 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27923:
-
Affects Version/s: 2.6.0
   (was: 2.5.5)

> NettyRpcServer may hang if it should skip initial sasl handshake
> -
>
> Key: HBASE-27923
> URL: https://issues.apache.org/jira/browse/HBASE-27923
> Project: HBase
>  Issue Type: Bug
>  Components: netty, security
>Affects Versions: 2.6.0, 3.0.0-alpha-4, 4.0.0-alpha-1
>Reporter: chenglei
>Priority: Major
>






[jira] [Created] (HBASE-27923) NettyRpcServer may hang if it should skip initial sasl handshake

2023-06-08 Thread chenglei (Jira)
chenglei created HBASE-27923:


 Summary: NettyRpcServer may hang if it should skip initial sasl 
handshake
 Key: HBASE-27923
 URL: https://issues.apache.org/jira/browse/HBASE-27923
 Project: HBase
  Issue Type: Bug
  Components: netty, security
Affects Versions: 2.5.5, 3.0.0-alpha-4, 4.0.0-alpha-1
Reporter: chenglei








[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-21 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4
>
>
> {{ReplicationSourceManager.totalBufferUsed}} is a counter scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should encapsulate and 
> centralize it in {{ReplicationSourceManager}}.
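The centralization the issue argues for can be sketched as a single quota object owned by the manager, with sources and WAL readers only calling acquire/release instead of mutating a shared counter directly. This is an illustrative design sketch with invented names, not the HBase API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: one place owns the buffer accounting, behind paired
// acquire/release calls. Names are illustrative, not the HBase API.
public class BufferQuotaSketch {
    private final AtomicLong totalBufferUsed = new AtomicLong();
    private final long limit;

    public BufferQuotaSketch(long limit) { this.limit = limit; }

    // Try to reserve `size` bytes; back off (return false) when over quota.
    public boolean acquire(long size) {
        long after = totalBufferUsed.addAndGet(size);
        if (after > limit) {
            totalBufferUsed.addAndGet(-size); // roll back the reservation
            return false;
        }
        return true;
    }

    // Must be called exactly once per successful acquire.
    public void release(long size) {
        totalBufferUsed.addAndGet(-size);
    }

    public long used() { return totalBufferUsed.get(); }
}
```

With this shape, the counter can only drift if some caller fails to pair an acquire with a release, and that pairing is auditable in one class rather than across four.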





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-21 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Fix Version/s: 2.6.0
   3.0.0-alpha-4

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4
>
>
> {{ReplicationSourceManager.totalBufferUsed}} is a counter scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should encapsulate and 
> centralize it in {{ReplicationSourceManager}}.





[jira] [Commented] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-21 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715018#comment-17715018
 ] 

chenglei commented on HBASE-27785:
--

Pushed to 2.6+, thanks [~zhangduo] for reviewing!

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{ReplicationSourceManager.totalBufferUsed}} is a counter scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should encapsulate and 
> centralize it in {{ReplicationSourceManager}}.





[jira] [Updated] (HBASE-26869) RSRpcServices.scan should deep clone cells when RpcCallContext is null

2023-04-13 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-26869:
-
Description: 
When inspecting HBASE-26812, I found that if {{RpcCallContext}} is null, 
{{RSRpcServices.scan}} does not set {{ServerCall.rpcCallback}} and directly 
closes {{RegionScannerImpl}}, but it does not deep clone the result cells, so 
these cells may be returned to the {{ByteBuffAllocator}} and overwritten before 
the caller reads them, similar to HBASE-26036. In addition, if 
{{RpcCallContext}} is null and {{RSRpcServices.scan}} returns partial results, 
it does not invoke {{RegionScannerImpl.shipped}} to release the used resources 
such as {{HFileScanner}}, pooled {{ByteBuffer}}s, etc.

No matter whether {{ShortCircuitingClusterConnection}} should be removed or 
not, I think this {{RSRpcServices.scan}} problem should be fixed for future 
maintainability.

  was:
When inspect HBASAE-26812, I find that if {{RpcCallContext}} is null,  
{{RSRpcServices.scan}}  does not set {{ServerCall.rpcCallback}} and directly 
closes {{RegionScannerImpl}},  but it does not deep clone the result cells , so 
these cells may be returned to the {{ByteBuffAllocator}} and may be overwritten 
before the caller reads them, similar as HBASE-26036, and at the same time, if 
{{RpcCallContext}} is null, when {{RSRpcServices.scan}} return partial results, 
it does not invoke {{RegionScannerImpl.shipped}} to release the used resources 
such as {{HFileScanner}}, pooled {{ByteBuffer}} etc.

No matter {{ShortCircuitingClusterConnection}} should be removed or not, I 
think  this {{RSRpcServices.scan}} problem should fix for future 
maintainability.


> RSRpcServices.scan should deep clone cells when RpcCallContext is null
> --
>
> Key: HBASE-26869
> URL: https://issues.apache.org/jira/browse/HBASE-26869
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 3.0.0-alpha-2, 2.4.11
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-3, 2.4.12
>
>
> When inspecting HBASE-26812, I found that if {{RpcCallContext}} is null, 
> {{RSRpcServices.scan}} does not set {{ServerCall.rpcCallback}} and directly 
> closes {{RegionScannerImpl}}, but it does not deep clone the result cells, 
> so these cells may be returned to the {{ByteBuffAllocator}} and overwritten 
> before the caller reads them, similar to HBASE-26036. In addition, if 
> {{RpcCallContext}} is null and {{RSRpcServices.scan}} returns partial 
> results, it does not invoke {{RegionScannerImpl.shipped}} to release the 
> used resources such as {{HFileScanner}}, pooled {{ByteBuffer}}s, etc.
> No matter whether {{ShortCircuitingClusterConnection}} should be removed or 
> not, I think this {{RSRpcServices.scan}} problem should be fixed for future 
> maintainability.
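The "deep clone" the issue calls for boils down to copying cell bytes out of the pooled buffer before that buffer can be recycled and overwritten. A small sketch of the idea (hypothetical names; the real fix copies cells in {{RSRpcServices.scan}} when there is no {{RpcCallContext}} to delay the buffer release):

```java
import java.nio.ByteBuffer;

// Sketch of "deep clone before the pooled buffer is recycled".
public class DeepCloneSketch {
    // Copy the cell bytes into fresh heap memory so that recycling (and
    // overwriting) the pooled buffer cannot corrupt the returned result.
    static byte[] deepClone(ByteBuffer pooled, int offset, int length) {
        byte[] copy = new byte[length];
        ByteBuffer dup = pooled.duplicate(); // don't disturb the pool's indices
        dup.position(offset);
        dup.get(copy, 0, length);
        return copy;
    }

    public static void main(String[] args) {
        ByteBuffer pooled = ByteBuffer.allocate(8);
        pooled.put(new byte[] {1, 2, 3, 4, 5, 6, 7, 8}).flip();
        byte[] cell = deepClone(pooled, 2, 3);
        pooled.put(2, (byte) 99); // simulate the allocator reusing the buffer
        System.out.println(cell[0]); // the clone is unaffected by the reuse
    }
}
```

Without the copy, a result referencing the pooled buffer would silently observe the overwritten bytes, which is exactly the corruption mode the description warns about.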





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Description:  {{ReplicationSourceManager.totalBufferUsed}} is a counter scoped 
to {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, which scatters the logic around 
{{ReplicationSourceManager.totalBufferUsed}} across 
{{ReplicationSourceManager}}, {{ReplicationSource}}, 
{{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
duplicated code and makes tracing the buffer usage difficult when there is a 
problem with {{totalBufferUsed}}. I think we should encapsulate and centralize 
it in {{ReplicationSourceManager}}.  (was:  
{{ReplicationSourceManager.totalBufferUsed}} is a counter, and is scoped to 
{{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, which makes the logic about 
{{ReplicationSourceManager.totalBufferUsed}} is scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
tracing the buffer usage somewhat difficult when there is problem about 
{{totalBufferUsed}}. I think we should encapsulate and centralize it in 
{{ReplicationSourceManager}}.)

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{ReplicationSourceManager.totalBufferUsed}} is a counter scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should encapsulate and 
> centralize it in {{ReplicationSourceManager}}.





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Assignee: chenglei
  Status: Patch Available  (was: Open)

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{ReplicationSourceManager.totalBufferUsed}} is a counter scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should encapsulate and 
> centralize it in {{ReplicationSourceManager}}.





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Description:  {{ReplicationSourceManager.totalBufferUsed}} is a counter, 
and is scoped to {{ReplicationSourceManager}}, but it is copied to 
{{ReplicationSource}} and {{ReplicationSourceWALReader}}, which makes the logic 
about {{ReplicationSourceManager.totalBufferUsed}} is scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
tracing the buffer usage somewhat difficult when there is problem about 
{{totalBufferUsed}}. I think we should encapsulate and centralize it in 
{{ReplicationSourceManager}}.  (was:  
{{ReplicationSourceManager.totalBufferUsed}} is a counter, and is scoped to 
{{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, and make the logic about 
{{ReplicationSourceManager.totalBufferUsed}} is  scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
trace the buffer usage somewhat difficult when there is problem about 
{{totalBufferUsed}}. I think we should encapsulate and centralize it in 
{{ReplicationSourceManager}}.)

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> {{ReplicationSourceManager.totalBufferUsed}} is a counter scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should encapsulate and 
> centralize it in {{ReplicationSourceManager}}.





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Description:  {{ReplicationSourceManager.totalBufferUsed}} is a counter, 
and is scoped to {{ReplicationSourceManager}}, but it is copied to 
{{ReplicationSource}} and {{ReplicationSourceWALReader}}, and make the logic 
about {{ReplicationSourceManager.totalBufferUsed}} is  scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
trace the buffer usage somewhat difficult when there is problem about 
{{totalBufferUsed}}. I think we should encapsulate and centralize it in 
{{ReplicationSourceManager}}.  (was:  
{{ReplicationSourceManager.totalBufferUsed}} is scoped to 
{{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, and make the logic about 
{{ReplicationSourceManager.totalBufferUsed}} is  scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
trace the buffer usage somewhat difficult when there is problem about 
{{totalBufferUsed}}. I think we should encapsulate and centralize it in 
{{ReplicationSourceManager}}.)

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> {{ReplicationSourceManager.totalBufferUsed}} is a counter scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should encapsulate and 
> centralize it in {{ReplicationSourceManager}}.





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Description:  {{ReplicationSourceManager.totalBufferUsed}} is scoped to 
{{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, and make the logic about 
{{ReplicationSourceManager.totalBufferUsed}} is  scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
trace the buffer usage somewhat difficult when there is problem about 
{{totalBufferUsed}}. I think we should encapsulate and centralize it in 
{{ReplicationSourceManager}}.  (was:  
{{ReplicationSourceManager.totalBufferUsed}} is scoped to 
{{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, and make the logic about 
{{ReplicationSourceManager.totalBufferUsed}} is  scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
trace the buffer usage somewhat difficult when there is problem about 
{{totalBufferUsed}}. I think we should )

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> {{ReplicationSourceManager.totalBufferUsed}} is scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should encapsulate and 
> centralize it in {{ReplicationSourceManager}}.





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Description:  {{ReplicationSourceManager.totalBufferUsed}} is scoped to 
{{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, and make the logic about 
{{ReplicationSourceManager.totalBufferUsed}} is  scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
trace the buffer usage somewhat difficult when there is problem about 
{{totalBufferUsed}}. I think we should   (was:  
{{ReplicationSourceManager.totalBufferUsed}} is scoped to 
{{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, and make the logic about 
{{ReplicationSourceManager.totalBufferUsed}} is  scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
trace the buffer usage somewhat difficult when there is problem about )

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> {{ReplicationSourceManager.totalBufferUsed}} is scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with {{totalBufferUsed}}. I think we should 





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Description:  {{ReplicationSourceManager.totalBufferUsed}} is scoped to 
{{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
{{ReplicationSourceWALReader}}, and make the logic about 
{{ReplicationSourceManager.totalBufferUsed}} is  scattered throughout 
{{ReplicationSourceManager}},{{ReplicationSource}},{{ReplicationSourceWALReader}}
 and {{ReplicationSourceShipper}}, which cause duplicated code and would make 
trace the buffer usage somewhat difficult when there is problem about   (was:  
ReplicationSourceManager)

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> {{ReplicationSourceManager.totalBufferUsed}} is scoped to 
> {{ReplicationSourceManager}}, but it is copied to {{ReplicationSource}} and 
> {{ReplicationSourceWALReader}}, which scatters the logic around 
> {{ReplicationSourceManager.totalBufferUsed}} across 
> {{ReplicationSourceManager}}, {{ReplicationSource}}, 
> {{ReplicationSourceWALReader}} and {{ReplicationSourceShipper}}. This causes 
> duplicated code and makes tracing the buffer usage difficult when there is a 
> problem with 





[jira] [Updated] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27785:
-
Description:  ReplicationSourceManager

> Encapsulate and centralize totalBufferUsed in ReplicationSourceManager
> --
>
> Key: HBASE-27785
> URL: https://issues.apache.org/jira/browse/HBASE-27785
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
>  ReplicationSourceManager





[jira] [Created] (HBASE-27785) Encapsulate and centralize totalBufferUsed in ReplicationSourceManager

2023-04-10 Thread chenglei (Jira)
chenglei created HBASE-27785:


 Summary: Encapsulate and centralize totalBufferUsed in 
ReplicationSourceManager
 Key: HBASE-27785
 URL: https://issues.apache.org/jira/browse/HBASE-27785
 Project: HBase
  Issue Type: Improvement
  Components: Replication
Affects Versions: 3.0.0-alpha-3
Reporter: chenglei








[jira] [Comment Edited] (HBASE-27778) Incorrect ReplicationSourceWALReader.totalBufferUsed may cause replication hang up

2023-04-07 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709387#comment-17709387
 ] 

chenglei edited comment on HBASE-27778 at 4/7/23 8:44 AM:
--

Pushed to 2.4+, thanks [~zhangduo] and [~Xiaolin Ha] for reviewing.


was (Author: comnetwork):
Pushed to 2.6+, thanks [~zhangduo] and [~Xiaolin Ha] for reviewing.

> Incorrect ReplicationSourceWALReader.totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.4.17, 2.5.4
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.5, 2.4.18
>
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.
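
The leak described above can be illustrated with a minimal, self-contained sketch. The names mirror the HBase ones, but none of this is the real HBase code: quota is acquired per entry, an exception abandons the batch without releasing what was already acquired, and the fixed variant releases the reservation on abandonment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the accounting bug; not the actual HBase code.
public class BufferQuotaSketch {
    // Shared, manager-scoped counter: a leak here throttles every peer.
    static final AtomicLong totalBufferUsed = new AtomicLong();

    // Buggy reader loop: quota is acquired per entry, but if a later entry's
    // filter throws, the batch is dropped without releasing the quota.
    static void readBatchBuggy(List<Long> entrySizes, boolean failMidway) {
        List<Long> batch = new ArrayList<>();
        for (int i = 0; i < entrySizes.size(); i++) {
            long size = entrySizes.get(i);
            totalBufferUsed.addAndGet(size); // acquire quota for this entry
            if (failMidway && i == 1) {
                // batch is never enqueued, yet the quota stays acquired
                throw new RuntimeException("filter failed");
            }
            batch.add(size);
        }
        // on success the consumer would later release the quota; omitted here
    }

    // Fixed variant: track what was acquired and undo it if the batch is
    // abandoned before it is handed off to the queue.
    static void readBatchFixed(List<Long> entrySizes, boolean failMidway) {
        long acquired = 0;
        try {
            for (int i = 0; i < entrySizes.size(); i++) {
                long size = entrySizes.get(i);
                totalBufferUsed.addAndGet(size);
                acquired += size;
                if (failMidway && i == 1) {
                    throw new RuntimeException("filter failed");
                }
            }
            acquired = 0; // batch handed off; the consumer now owns the quota
        } finally {
            totalBufferUsed.addAndGet(-acquired); // undo on failure only
        }
    }
}
```

Because the counter is global across sources, the buggy path slowly consumes the whole replication buffer budget, which matches the "after a long run, replication to all peers may hang up" symptom.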





[jira] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-07 Thread chenglei (Jira)


[ https://issues.apache.org/jira/browse/HBASE-27778 ]


chenglei deleted comment on HBASE-27778:
--

was (Author: comnetwork):
The problem also exists in 2.5 and 2.4, I would open new PRs for 2.5 and 2.4.

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.4.17, 2.5.4
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.5, 2.4.18
>
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-07 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Affects Version/s: 2.4.17
   (was: 2.4.16)

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.4.17, 2.5.4
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.5, 2.4.18
>
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-07 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Fix Version/s: 2.6.0
   3.0.0-alpha-4
   2.5.5
   2.4.18

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.4.16, 2.5.4
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.5, 2.4.18
>
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-06 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Affects Version/s: 2.5.4
   2.4.16

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.4.16, 2.5.4
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Commented] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-06 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709394#comment-17709394
 ] 

chenglei commented on HBASE-27778:
--

The problem also exists in 2.5 and 2.4, I would open new PRs for 2.5 and 2.4.

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Comment Edited] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-06 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709387#comment-17709387
 ] 

chenglei edited comment on HBASE-27778 at 4/6/23 1:53 PM:
--

Pushed to 2.6+, thanks [~zhangduo] and [~Xiaolin Ha] for reviewing.


was (Author: comnetwork):
Pushed to 2.6+, thanks [~zhangduo] and [~Xiaolin Ha] for review.

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Comment Edited] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-06 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709387#comment-17709387
 ] 

chenglei edited comment on HBASE-27778 at 4/6/23 1:52 PM:
--

Pushed to 2.6+, thanks [~zhangduo] and [~Xiaolin Ha] for review.


was (Author: comnetwork):
Pushed to 2.6+, thanks [~zhangduo] and [~Xiaolin Ha]

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Commented] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-06 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709387#comment-17709387
 ] 

chenglei commented on HBASE-27778:
--

Pushed to 2.6+, thanks [~zhangduo] and [~Xiaolin Ha]

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-06 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Status: Patch Available  (was: Open)

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0-alpha-3, 2.6.0
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Assigned] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei reassigned HBASE-27778:


Assignee: chenglei

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Description: When we read a new WAL Entry in 
{{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader.totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), and the {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased 
in this case. Because the  {{ReplicationSourceWALReader.totalBufferUsed}}  is 
actually scoped to {{ReplicationSourceManager}}, after a long run, replication 
to all peers may hang up.  (was: When we read a new WAL Entry in 
{{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader.totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), and the {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased 
in this case. Because the  {{ReplicationSourceWALReader. totalBufferUsed}}  is 
actually scoped to {{ReplicationSourceManager}}, after a long run, replication 
to all peers may hang up.)

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Description: When we read a new WAL Entry in 
{{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader.totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), and the {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased 
in this case. Because the  {{ReplicationSourceWALReader. totalBufferUsed}}  is 
actually scoped to {{ReplicationSourceManager}}, after a long run, replication 
to all peers may hang up.  (was: When we read a new WAL Entry in 
{{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader.totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), and the {{ReplicationSourceWALReader. totalBufferUsed}} is not 
decreased in this case. Because the  {{ReplicationSourceWALReader. 
totalBufferUsed}}  is actually scoped to {{ReplicationSourceManager}}, after a 
long run, replication to all peers may hang up.)

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Description: When we read a new WAL Entry in 
{{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader.totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), and the {{ReplicationSourceWALReader. totalBufferUsed}} is not 
decreased in this case. Because the  {{ReplicationSourceWALReader. 
totalBufferUsed}}  is actually scoped to {{ReplicationSourceManager}}, after a 
long run, replication to all peers may hang up.  (was: When we read a new WAL 
Entry in {{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader. totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), but the {{ReplicationSourceWALReader. totalBufferUsed}} is not 
decreased and because the  {{ReplicationSourceWALReader. totalBufferUsed}}  is 
scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
peers may hang up.)

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased in this 
> case. Because {{ReplicationSourceWALReader.totalBufferUsed}} is actually 
> scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
> peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Description: When we read a new WAL Entry in 
{{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader. totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), but the {{ReplicationSourceWALReader. totalBufferUsed}} is not 
decreased and because the  {{ReplicationSourceWALReader. totalBufferUsed}}  is 
scoped to {{ReplicationSourceManager}}, after a long run, replication to all 
peers may hang up.  (was: When we read a new WAL Entry in 
{{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader. totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), but the {{ReplicationSourceWALReader. totalBufferUsed}} is not 
decreased and because the  {{ReplicationSourceWALReader. totalBufferUsed}}  is 
scoped to {{ReplicationSourceManager}}, after a long run, all peers may be go 
slow and eventually block completely.)

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased. Because 
> it is scoped to {{ReplicationSourceManager}}, after a long run, replication 
> to all peers may hang up.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Description: When we read a new WAL Entry in 
{{ReplicationSourceWALReader.readWALEntries}}, we add 
{{ReplicationSourceWALReader. totalBufferUsed}} by the size of new entry in   
{{ReplicationSourceWALReader.addEntryToBatch}}, but the whole {{WALEntryBatch}} 
may not be put to the {{ReplicationSourceWALReader.entryBatchQueue}} because of 
exception(eg. exception thrown by {{WALEntryFilter.filter}} for following WAL 
Entry), but the {{ReplicationSourceWALReader. totalBufferUsed}} is not 
decreased and because the  {{ReplicationSourceWALReader. totalBufferUsed}}  is 
scoped to {{ReplicationSourceManager}}, after a long run, all peers may be go 
slow and eventually block completely.  (was: 
{{ReplicationSourceWALReader.addEntryToBatch}})

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> When we read a new WAL Entry in 
> {{ReplicationSourceWALReader.readWALEntries}}, we increase 
> {{ReplicationSourceWALReader.totalBufferUsed}} by the size of the new entry 
> in {{ReplicationSourceWALReader.addEntryToBatch}}, but the whole 
> {{WALEntryBatch}} may never be put onto the 
> {{ReplicationSourceWALReader.entryBatchQueue}} because of an exception (e.g. 
> an exception thrown by {{WALEntryFilter.filter}} for a following WAL Entry), 
> and {{ReplicationSourceWALReader.totalBufferUsed}} is not decreased. Because 
> it is scoped to {{ReplicationSourceManager}}, after a long run, all peers 
> may go slow and eventually block completely.





[jira] [Updated] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27778:
-
Description: {{ReplicationSourceWALReader.addEntryToBatch}}

> Incorrect  ReplicationSourceWALReader. totalBufferUsed may cause replication 
> hang up
> 
>
> Key: HBASE-27778
> URL: https://issues.apache.org/jira/browse/HBASE-27778
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.6.0, 3.0.0-alpha-3
>Reporter: chenglei
>Priority: Major
>
> {{ReplicationSourceWALReader.addEntryToBatch}}





[jira] [Created] (HBASE-27778) Incorrect ReplicationSourceWALReader. totalBufferUsed may cause replication hang up

2023-04-04 Thread chenglei (Jira)
chenglei created HBASE-27778:


 Summary: Incorrect  ReplicationSourceWALReader. totalBufferUsed 
may cause replication hang up
 Key: HBASE-27778
 URL: https://issues.apache.org/jira/browse/HBASE-27778
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 3.0.0-alpha-3, 2.6.0
Reporter: chenglei








[jira] [Commented] (HBASE-27654) IndexBlockEncoding is missing in HFileContextBuilder copy constructor

2023-02-20 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691125#comment-17691125
 ] 

chenglei commented on HBASE-27654:
--

Pushed to 2.5+, thanks [~bbeaudreault] for review.

> IndexBlockEncoding is missing in HFileContextBuilder copy constructor
> -
>
> Key: HBASE-27654
> URL: https://issues.apache.org/jira/browse/HBASE-27654
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.5.3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{IndexBlockEncoding}} is missing in the {{HFileContextBuilder}} copy
> constructor, so if we construct a new {{HFileContext}} from an existing
> {{HFileContext}}, we lose the {{IndexBlockEncoding}} setting.
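The bug class is easy to reproduce with a toy builder. The classes and fields below are made up for illustration (only the one-line nature of the fix is taken from the issue): a copy constructor that skips a field silently resets that field to its default on the rebuilt object.

```java
// Toy builder illustrating the bug class fixed by HBASE-27654; the class and
// field names are illustrative, not the real HFileContext/HFileContextBuilder.
public class CopyCtorSketch {
  static final class Context {
    final int blockSize;
    final String indexBlockEncoding;
    Context(int blockSize, String indexBlockEncoding) {
      this.blockSize = blockSize;
      this.indexBlockEncoding = indexBlockEncoding;
    }
  }

  static final class Builder {
    int blockSize = 64 * 1024;          // default
    String indexBlockEncoding = "NONE"; // default
    Builder() {}
    // Copy constructor: every field must be carried over. Before the fix the
    // encoding line was missing here, so a rebuilt context silently fell
    // back to the default.
    Builder(Context other) {
      this.blockSize = other.blockSize;
      this.indexBlockEncoding = other.indexBlockEncoding;
    }
    Builder withIndexBlockEncoding(String encoding) {
      this.indexBlockEncoding = encoding;
      return this;
    }
    Context build() {
      return new Context(blockSize, indexBlockEncoding);
    }
  }

  public static void main(String[] args) {
    Context original = new Builder().withIndexBlockEncoding("PREFIX").build();
    Context copy = new Builder(original).build();
    System.out.println(copy.indexBlockEncoding); // preserved by the copy ctor
  }
}
```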





[jira] [Updated] (HBASE-27654) IndexBlockEncoding is missing in HFileContextBuilder copy constructor

2023-02-20 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27654:
-
Fix Version/s: 2.6.0
   3.0.0-alpha-4
   2.5.4
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> IndexBlockEncoding is missing in HFileContextBuilder copy constructor
> -
>
> Key: HBASE-27654
> URL: https://issues.apache.org/jira/browse/HBASE-27654
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.5.3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.4
>
>
> {{IndexBlockEncoding}} is missing in the {{HFileContextBuilder}} copy
> constructor, so if we construct a new {{HFileContext}} from an existing
> {{HFileContext}}, we lose the {{IndexBlockEncoding}} setting.





[jira] [Updated] (HBASE-27654) IndexBlockEncoding is missing in HFileContextBuilder copy constructor

2023-02-18 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27654:
-
Affects Version/s: 2.5.3
   3.0.0-alpha-3
   2.6.0

> IndexBlockEncoding is missing in HFileContextBuilder copy constructor
> -
>
> Key: HBASE-27654
> URL: https://issues.apache.org/jira/browse/HBASE-27654
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.0.0-alpha-3, 2.5.3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{IndexBlockEncoding}} is missing in the {{HFileContextBuilder}} copy
> constructor, so if we construct a new {{HFileContext}} from an existing
> {{HFileContext}}, we lose the {{IndexBlockEncoding}} setting.





[jira] [Updated] (HBASE-27654) IndexBlockEncoding is missing in HFileContextBuilder copy constructor

2023-02-18 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27654:
-
Description: {{IndexBlockEncoding}} is missing in the {{HFileContextBuilder}}
copy constructor, so if we construct a new {{HFileContext}} from an existing
{{HFileContext}}, we lose the {{IndexBlockEncoding}} setting.
 (was: IndexBlockEncoding is )

> IndexBlockEncoding is missing in HFileContextBuilder copy constructor
> -
>
> Key: HBASE-27654
> URL: https://issues.apache.org/jira/browse/HBASE-27654
> Project: HBase
>  Issue Type: Bug
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> {{IndexBlockEncoding}} is missing in the {{HFileContextBuilder}} copy
> constructor, so if we construct a new {{HFileContext}} from an existing
> {{HFileContext}}, we lose the {{IndexBlockEncoding}} setting.





[jira] [Updated] (HBASE-27654) IndexBlockEncoding is missing in HFileContextBuilder copy constructor

2023-02-18 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27654:
-
Description: IndexBlockEncoding is 

> IndexBlockEncoding is missing in HFileContextBuilder copy constructor
> -
>
> Key: HBASE-27654
> URL: https://issues.apache.org/jira/browse/HBASE-27654
> Project: HBase
>  Issue Type: Bug
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> IndexBlockEncoding is 





[jira] [Updated] (HBASE-27654) IndexBlockEncoding is missing in HFileContextBuilder copy constructor

2023-02-18 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27654:
-
Status: Patch Available  (was: Open)

> IndexBlockEncoding is missing in HFileContextBuilder copy constructor
> -
>
> Key: HBASE-27654
> URL: https://issues.apache.org/jira/browse/HBASE-27654
> Project: HBase
>  Issue Type: Bug
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>






[jira] [Assigned] (HBASE-27654) IndexBlockEncoding is missing in HFileContextBuilder copy constructor

2023-02-18 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei reassigned HBASE-27654:


Assignee: chenglei

> IndexBlockEncoding is missing in HFileContextBuilder copy constructor
> -
>
> Key: HBASE-27654
> URL: https://issues.apache.org/jira/browse/HBASE-27654
> Project: HBase
>  Issue Type: Bug
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>






[jira] [Created] (HBASE-27654) IndexBlockEncoding is missing in HFileContextBuilder copy constructor

2023-02-18 Thread chenglei (Jira)
chenglei created HBASE-27654:


 Summary: IndexBlockEncoding is missing in HFileContextBuilder copy 
constructor
 Key: HBASE-27654
 URL: https://issues.apache.org/jira/browse/HBASE-27654
 Project: HBase
  Issue Type: Bug
Reporter: chenglei








[jira] [Updated] (HBASE-27539) Encapsulate and centralise access to ref count through StoreFileInfo

2022-12-24 Thread chenglei (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated HBASE-27539:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Encapsulate and centralise access to ref count through StoreFileInfo
> 
>
> Key: HBASE-27539
> URL: https://issues.apache.org/jira/browse/HBASE-27539
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.3
>
>
> Both {{StoreFileReader}} and {{StoreFileInfo}} have a {{refCount}}, and the
> {{refCount}} is currently used in three main ways:
> * When a new {{StoreFileScanner}} is created or closed, it increases or
> decreases {{StoreFileReader.refCount}}.
> * When {{CompactedHFilesDischarger}} checks whether a {{HStoreFile}} could
> be deleted, it checks {{StoreFileInfo.refCount}}.
> * When {{HStore.getScanners}} gets a {{HStoreFile}} from
> {{StoreFileManager}}, it increases or decreases {{StoreFileInfo.refCount}}.
> The problem is that {{StoreFileReader.refCount}} is copied from
> {{StoreFileInfo.refCount}}; this inconsistent usage makes the code hard to
> understand and makes tracing resource race problems such as HBASE-27484
> and HBASE-27519 difficult. I suggest we unify these two {{refCount}}s and
> just use {{StoreFileInfo.refCount}}.
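The unification proposed here can be sketched as delegation. This is a simplified model under my own naming, not the actual HBase classes: the reader stops keeping its own copy of the counter and forwards to the single counter owned by the info object, so the scanner open/close path and the discharger observe the same value.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified model of the proposed refactor: one ref count, owned by the
// info object; the reader delegates instead of holding a copied counter.
// Names mirror the HBase classes but the code is illustrative.
public class RefCountSketch {
  static final class StoreFileInfo {
    final AtomicInteger refCount = new AtomicInteger(); // single source of truth
  }

  static final class StoreFileReader {
    private final StoreFileInfo info;
    StoreFileReader(StoreFileInfo info) { this.info = info; }
    // Delegation: there is no second counter to drift out of sync.
    int incrementRefCount() { return info.refCount.incrementAndGet(); }
    int decrementRefCount() { return info.refCount.decrementAndGet(); }
  }

  // What the compacted-files discharger conceptually checks before deleting.
  static boolean canBeDeleted(StoreFileInfo info) {
    return info.refCount.get() == 0;
  }

  public static void main(String[] args) {
    StoreFileInfo info = new StoreFileInfo();
    StoreFileReader reader = new StoreFileReader(info);
    reader.incrementRefCount();             // a scanner is opened
    System.out.println(canBeDeleted(info)); // false: still referenced
    reader.decrementRefCount();             // the scanner is closed
    System.out.println(canBeDeleted(info)); // true: safe to delete
  }
}
```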





[jira] [Commented] (HBASE-27539) Encapsulate and centralise access to ref count through StoreFileInfo

2022-12-24 Thread chenglei (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651782#comment-17651782
 ] 

chenglei commented on HBASE-27539:
--

Pushed to 2.5+, thanks [~wchevreuil] for reviewing!

> Encapsulate and centralise access to ref count through StoreFileInfo
> 
>
> Key: HBASE-27539
> URL: https://issues.apache.org/jira/browse/HBASE-27539
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-3
>Reporter: chenglei
>Assignee: chenglei
>Priority: Major
>
> Both {{StoreFileReader}} and {{StoreFileInfo}} have a {{refCount}}, and the
> {{refCount}} is currently used in three main ways:
> * When a new {{StoreFileScanner}} is created or closed, it increases or
> decreases {{StoreFileReader.refCount}}.
> * When {{CompactedHFilesDischarger}} checks whether a {{HStoreFile}} could
> be deleted, it checks {{StoreFileInfo.refCount}}.
> * When {{HStore.getScanners}} gets a {{HStoreFile}} from
> {{StoreFileManager}}, it increases or decreases {{StoreFileInfo.refCount}}.
> The problem is that {{StoreFileReader.refCount}} is copied from
> {{StoreFileInfo.refCount}}; this inconsistent usage makes the code hard to
> understand and makes tracing resource race problems such as HBASE-27484
> and HBASE-27519 difficult. I suggest we unify these two {{refCount}}s and
> just use {{StoreFileInfo.refCount}}.




