[jira] [Commented] (HBASE-20897) Port HBASE-20866 "HBase 1.x scan performance degradation compared to 0.98 version" to branch-2 and up

2020-01-24 Thread Vikas Vishwakarma (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022924#comment-17022924
 ] 

Vikas Vishwakarma commented on HBASE-20897:
---

[~ndimiduk] I am a bit out of touch on this and would be grateful if it could 
be re-assigned or closed. I had done the changes for 1.x; from what I remember, 
those changes were not compatible with 2.x and up and might need considerable 
refactoring, or may not be applicable at all.

> Port HBASE-20866 "HBase 1.x scan performance degradation compared to 0.98 
> version" to branch-2 and up
> -
>
> Key: HBASE-20897
> URL: https://issues.apache.org/jira/browse/HBASE-20897
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Andrew Kyle Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 3.0.0, 2.3.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-03 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16637746#comment-16637746
 ] 

Vikas Vishwakarma commented on HBASE-21265:
---

Looks good, +1.

There are a few minor checkstyle and whitespace warnings in the generated test 
report [~apurtell]

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20890) PE filterScan seems to be stuck forever

2018-08-27 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16593271#comment-16593271
 ] 

Vikas Vishwakarma commented on HBASE-20890:
---

Committed the patch to 1.3, 1.4, branch-1, branch-2, and main, updated the Fix 
versions, and marked the issue as resolved [~apurtell] [~abhishek94goyal]

> PE filterScan seems to be stuck forever
> ---
>
> Key: HBASE-20890
> URL: https://issues.apache.org/jira/browse/HBASE-20890
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.7
>Reporter: Vikas Vishwakarma
>Assignee: Abhishek Goyal
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.7
>
> Attachments: HBASE-20890.001.patch
>
>
> Command Used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
> write 2>&1
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
> filterScan 2>&1
> {code}
>  
> Output
> This kept running for several hours just printing the below messages in logs
>  
> {code:java}
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
> 2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> .
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail
> 2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20890) PE filterScan seems to be stuck forever

2018-08-27 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20890:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> PE filterScan seems to be stuck forever
> ---
>
> Key: HBASE-20890
> URL: https://issues.apache.org/jira/browse/HBASE-20890
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.7
>Reporter: Vikas Vishwakarma
>Assignee: Abhishek Goyal
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.7
>
> Attachments: HBASE-20890.001.patch
>
>
> Command Used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
> write 2>&1
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
> filterScan 2>&1
> {code}
>  
> Output
> This kept running for several hours just printing the below messages in logs
>  
> {code:java}
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
> 2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> .
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail
> 2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20890) PE filterScan seems to be stuck forever

2018-08-27 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20890:
--
Affects Version/s: 1.4.7
   2.2.0
   1.5.0
   3.0.0
Fix Version/s: 1.4.7
   2.2.0
   1.3.3
   1.5.0
   3.0.0

> PE filterScan seems to be stuck forever
> ---
>
> Key: HBASE-20890
> URL: https://issues.apache.org/jira/browse/HBASE-20890
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.7
>Reporter: Vikas Vishwakarma
>Assignee: Abhishek Goyal
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 1.3.3, 2.2.0, 1.4.7
>
> Attachments: HBASE-20890.001.patch
>
>
> Command Used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
> write 2>&1
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
> filterScan 2>&1
> {code}
>  
> Output
> This kept running for several hours just printing the below messages in logs
>  
> {code:java}
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
> 2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> .
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail
> 2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20890) PE filterScan seems to be stuck forever

2018-08-23 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591153#comment-16591153
 ] 

Vikas Vishwakarma commented on HBASE-20890:
---

LGTM +1 [~abhishek94goyal] 

I can commit if no objections [~apurtell] 

> PE filterScan seems to be stuck forever
> ---
>
> Key: HBASE-20890
> URL: https://issues.apache.org/jira/browse/HBASE-20890
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Abhishek Goyal
>Priority: Minor
> Attachments: HBASE-20890.001.patch
>
>
> Command Used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
> write 2>&1
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
> filterScan 2>&1
> {code}
>  
> Output
> This kept running for several hours just printing the below messages in logs
>  
> {code:java}
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
> 2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> .
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail
> 2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-20890) PE filterScan seems to be stuck forever

2018-08-22 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma resolved HBASE-20890.
---
Resolution: Not A Problem

> PE filterScan seems to be stuck forever
> ---
>
> Key: HBASE-20890
> URL: https://issues.apache.org/jira/browse/HBASE-20890
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Abhishek Goyal
>Priority: Minor
>
> Command Used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
> write 2>&1
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
> filterScan 2>&1
> {code}
>  
> Output
> This kept running for several hours just printing the below messages in logs
>  
> {code:java}
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
> 2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> .
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail
> 2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20890) PE filterScan seems to be stuck forever

2018-08-22 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589603#comment-16589603
 ] 

Vikas Vishwakarma commented on HBASE-20890:
---

[~abhishek94goyal] yes, that makes sense. By overriding the rows value on the 
scan command we can limit the number of iterations for filterScan to 20, or to 
any number of iterations we would like to run. I think we can close this 
request. Thanks for looking into it.
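
For example, a bounded run can be kicked off like this (illustrative only; 
--rows caps the per-client iteration count, here at 20):
{code:java}
~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred --rows=20 filterScan 1 > filterScan 2>&1
{code}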

> PE filterScan seems to be stuck forever
> ---
>
> Key: HBASE-20890
> URL: https://issues.apache.org/jira/browse/HBASE-20890
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Abhishek Goyal
>Priority: Minor
>
> Command Used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
> write 2>&1
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
> filterScan 2>&1
> {code}
>  
> Output
> This kept running for several hours just printing the below messages in logs
>  
> {code:java}
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
> 2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> .
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail
> 2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20890) PE filterScan seems to be stuck forever

2018-08-20 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586870#comment-16586870
 ] 

Vikas Vishwakarma commented on HBASE-20890:
---

[~abhishek94goyal] that would be great! Please pick it up.

> PE filterScan seems to be stuck forever
> ---
>
> Key: HBASE-20890
> URL: https://issues.apache.org/jira/browse/HBASE-20890
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Priority: Minor
>
> Command Used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
> write 2>&1
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
> filterScan 2>&1
> {code}
>  
> Output
> This kept running for several hours just printing the below messages in logs
>  
> {code:java}
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
> 2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> .
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail
> 2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-13 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579186#comment-16579186
 ] 

Vikas Vishwakarma commented on HBASE-21028:
---

[~dbwong] committed it to 1.3.3. Please assign this to yourself once you get 
added to the contributor list. 

[~apurtell] [~taklwu]

> Backport HBASE-18633 to branch-1.3
> --
>
> Key: HBASE-21028
> URL: https://issues.apache.org/jira/browse/HBASE-21028
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Daniel Wong
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-21028-branch-1.3.patch, 
> HBASE-21028-branch-1.3.patch
>
>
> The logging improvements in HBASE-18633 would give greater visibility on 
> systems in 1.3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-13 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma reassigned HBASE-21028:
-

Assignee: (was: Vikas Vishwakarma)

> Backport HBASE-18633 to branch-1.3
> --
>
> Key: HBASE-21028
> URL: https://issues.apache.org/jira/browse/HBASE-21028
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Daniel Wong
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-21028-branch-1.3.patch, 
> HBASE-21028-branch-1.3.patch
>
>
> The logging improvements in HBASE-18633 would give greater visibility on 
> systems in 1.3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-13 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-21028:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Backport HBASE-18633 to branch-1.3
> --
>
> Key: HBASE-21028
> URL: https://issues.apache.org/jira/browse/HBASE-21028
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Daniel Wong
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-21028-branch-1.3.patch, 
> HBASE-21028-branch-1.3.patch
>
>
> The logging improvements in HBASE-18633 would give greater visibility on 
> systems in 1.3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-13 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma reassigned HBASE-21028:
-

Assignee: Vikas Vishwakarma

> Backport HBASE-18633 to branch-1.3
> --
>
> Key: HBASE-21028
> URL: https://issues.apache.org/jira/browse/HBASE-21028
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Daniel Wong
>Assignee: Vikas Vishwakarma
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-21028-branch-1.3.patch, 
> HBASE-21028-branch-1.3.patch
>
>
> The logging improvements in HBASE-18633 would give greater visibility on 
> systems in 1.3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21028) Backport HBASE-18633 to branch-1.3

2018-08-12 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577829#comment-16577829
 ] 

Vikas Vishwakarma commented on HBASE-21028:
---

+1. Per the original request this change is already in the newer branches, so 
the 1.3.3 target should be OK here. The original request did not show these 
checkstyle warnings in its Hadoop QA report; I can try to fix those in the 
other branches as well by logging a subtask along with this commit, if there 
are no objections [~apurtell]

 

> Backport HBASE-18633 to branch-1.3
> --
>
> Key: HBASE-21028
> URL: https://issues.apache.org/jira/browse/HBASE-21028
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.3.2
>Reporter: Daniel Wong
>Priority: Minor
> Fix For: 1.3.3
>
> Attachments: HBASE-21028-branch-1.3.patch, 
> HBASE-21028-branch-1.3.patch
>
>
> The logging improvements in HBASE-18633 would give greater visibility on 
> systems in 1.3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-08-02 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.006.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client, scan
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
>  Labels: perfomance
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch, HBASE-20896.branch-1.4.005.patch, 
> HBASE-20896.branch-1.4.006.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-08-02 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567790#comment-16567790
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

Thanks [~reidchan].

I have run mvn test -P runDevTests locally on 1.4.7 to include the server 
tests, and all tests passed.

I was looking at the branch-2 and main branch changes; however, some more 
changes are needed for those branches due to the introduction of 
AsyncClientScanner and a slight difference in the way the cache is initialized 
in ClientScanner. I will work on those to port these changes.

I am updating the patch with 2 small changes: 
 # removed the unnecessary last return statement in 
CompleteScanResultCache:loadResultsToCache(..)
 # made the LinkedList cache private instead of protected in 
TestBatchScanResultCache

 

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client, scan
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
>  Labels: perfomance
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch, HBASE-20896.branch-1.4.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-08-01 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564825#comment-16564825
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

Thanks [~reidchan], fixed the indent issue and updated the patch. I will add a 
patch for master and branch-2 as well in HBASE-20897 and ping you again for 
help with review :) (hopefully it will not require many changes now, since this 
part of the code was backported from the master branch).

[~apurtell] kindly take a look at the final patch. 

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch, HBASE-20896.branch-1.4.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-08-01 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.005.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch, HBASE-20896.branch-1.4.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.004.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: (was: HBASE-20896.branch-1.4.004.patch)

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564732#comment-16564732
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

[~reidchan] I have addressed all the review comments:
 * Changed LinkedList to List in ScanResultCache and its subclasses.
 * Removed the code repetition by adding a constructor and calling super(cache) 
in the subclasses, and likewise for clear() (a rough sketch of this pattern is 
shown below).
 * Changed the access modifiers to default for the newly added variables in 
ScanResultCache.
 * Replaced count = 0 and resultSize = 0 with calls to the corresponding reset 
functions.

Thanks for the detailed review; kindly take a look at the latest patch updated 
in the Jira and on ReviewBoard (the whitespace warnings from Hadoop QA are 
fixed in the latest patch).
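
As a rough illustration of the super(cache)/clear() change described above (a 
hypothetical, simplified sketch with made-up class names, not the actual patch 
code):
{code:java}
import java.util.List;
import org.apache.hadoop.hbase.client.Result;

// Sketch only: the base class keeps the shared client-side cache passed in by
// the scanner, so subclasses just call super(cache) instead of duplicating the
// field assignment; clear() lives in the base class for the same reason.
abstract class BaseResultCacheSketch {
  final List<Result> cache;          // default (package-private) access

  BaseResultCacheSketch(List<Result> cache) {
    this.cache = cache;
  }

  void clear() {
    cache.clear();
  }
}

class CompleteResultCacheSketch extends BaseResultCacheSketch {
  CompleteResultCacheSketch(List<Result> cache) {
    super(cache);
  }
}
{code}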

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.004.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: (was: HBASE-20896.branch-1.4.004.patch)

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.004.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: (was: HBASE-20896.branch-1.4.005.patch)

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.005.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: (was: HBASE-20896.branch-1.4.004.patch)

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.004.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch, 
> HBASE-20896.branch-1.4.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563546#comment-16563546
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

[~reidchan] I am done with most of the changes; I only needed more feedback on 
the last comment:

{quote}
protected List<Result> cache; // Not LinkedList, unless you have to use 
specific methods in it
{quote}

In ClientScanner, cache.poll() is being used, which is implemented in LinkedList:
{code:java}
public Result next() throws IOException {
  // If the scanner is closed and there's nothing left in the cache, next is a no-op.
  if (cache.size() == 0 && this.closed) {
    return null;
  }
  if (cache.size() == 0) {
    loadCache();
  }

  if (cache.size() > 0) {
    return cache.poll();
  }

  // if we exhausted this scanner before calling close, write out the scan metrics
  writeScanMetrics();
  return null;
}
{code}
If we use List instead of LinkedList we can replace this with cache.remove(0). 
It looks like this is the only such change; let me know your thoughts and I 
can replace it in the patch. 
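
For reference, a minimal sketch of what the List-based variant of next() might 
look like (assuming the same surrounding ClientScanner fields and helper 
methods as in the snippet above; this is illustrative, not the final patch):
{code:java}
// Sketch only: cache declared as List<Result>; cache.poll() is replaced by
// cache.remove(0), which also removes and returns the head element.
public Result next() throws IOException {
  // If the scanner is closed and there's nothing left in the cache, next is a no-op.
  if (cache.size() == 0 && this.closed) {
    return null;
  }
  if (cache.size() == 0) {
    loadCache();
  }

  if (cache.size() > 0) {
    return cache.remove(0);
  }

  // if we exhausted this scanner before calling close, write out the scan metrics
  writeScanMetrics();
  return null;
}
{code}
One thing to keep in mind: remove(0) is O(1) on a LinkedList but O(n) on an 
ArrayList, so the choice of backing implementation still matters even if the 
declared type becomes List.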

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563314#comment-16563314
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

[~mdrob] the changes do not add any extra LinkedList operations. In the 
original code in ClientScanner.java we already have the cache LinkedList.

In ClientScanner.java we call ScanResultCache.addAndGet(..), which takes the 
Result array values as input and does all the partial handling on it, during 
which it gets converted to a LinkedList or subarrays, or goes through separate 
iterations to update numRowCounts, etc., depending on which handling gets 
invoked (AllowPartialScanResultCache, BatchScanResultCache, 
CompleteScanResultCache), and returns a processed Result array.

The returned Result array from ScanResultCache.addAndGet(..) is iterated over 
again and added to the cache LinkedList back in ClientScanner:
{code:java}
public abstract class ClientScanner extends AbstractClientScanner {
  ..
  protected final LinkedList<Result> cache = new LinkedList<Result>();
  ..

  protected void loadCache() throws IOException {
    ...
    Result[] resultsToAddToCache =
        scanResultCache.addAndGet(values, callable.isHeartbeatMessage());
    ..
    for (Result rs : resultsToAddToCache) {
      cache.add(rs);
      for (Cell cell : rs.rawCells()) {
        remainingResultSize -= CellUtil.estimatedHeapSizeOf(cell);
      }
      countdown--;
      this.lastResult = rs;
    }
    ..
  }
}
{code}
 

In my changes I have only eliminated all the intermediate array iterations and 
conversions to a list or subarray in AllowPartialScanResultCache, 
BatchScanResultCache, and CompleteScanResultCache. In this case the Result 
array values is added directly to the cache, subject to the various checks and 
conditions, and numRowCounts is updated along with it. So all the work is done 
in a single iteration over the values input passed to loadResultsToCache(..). 
High-level structure:
{code:java}
public abstract class ClientScanner extends AbstractClientScanner {
  ..
  protected final LinkedList<Result> cache = new LinkedList<Result>();
  ..

  protected void loadCache() throws IOException {
    ..
    scanResultCache.loadResultsToCache(values, callable.isHeartbeatMessage());
    ..
  }

  // add values to the cache in loadResultsToCache itself, in
  // AllowPartialScanResultCache, BatchScanResultCache, CompleteScanResultCache,
  // after all the checks and partialResult handling
}
{code}
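
As a rough, simplified sketch of the single-pass idea for the allow-partial 
case (hypothetical code that omits the partial-row checks, not the actual 
patch):
{code:java}
// Sketch only: each incoming Result goes straight into the shared cache and the
// complete-row counter is updated in the same loop, so no intermediate arrays
// or lists are created.
void loadResultsToCache(Result[] values, boolean isHeartbeatMessage) {
  for (Result result : values) {
    cache.add(result);
    if (!result.mayHaveMoreCellsInRow()) {
      numberOfCompleteRows++;
    }
  }
}
{code}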
 

 

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-31 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563193#comment-16563193
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

[~reidchan] yes, CompleteScanResultCache. I will complete the earlier review 
comments today; I was out for a few days. 

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558153#comment-16558153
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

Thanks [~reidchan], I will make the suggested changes.

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558146#comment-16558146
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

The Apache build Hadoop QA tests have passed and ReviewBoard has been updated 
with the latest patch.

Locally I have run mvn test -P runDevTests to include the server tests, which 
also passed.

Perf test results are given below and show roughly a 2-6% improvement (PE 
results tend to vary between iterations):

|| test || 1.4.6 (ms) || HBASE-20896 (ms) || %diff || 
| scan all rows | 27669 | 26257 | 5 | 
| scan more CQ | 81308 | 76493 | 6 | 
| PE scan | 8917 | 8488 | 5 | 
| PE scan10 | 215134 | 209203 | 3 | 
| PE scan100 | 663645 | 650348 | 2 | 
| PE scanseq | 99696 | 94888 | 5 |

So this request is now ready for review. After that I can start working on the 
main branch changes. 

[~apurtell] [~reidchan] [~md...@cloudera.com] [~yuzhih...@gmail.com] 
[~vrodionov]

 

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558097#comment-16558097
 ] 

Vikas Vishwakarma edited comment on HBASE-20896 at 7/26/18 10:28 AM:
-

Summarizing the changes

We have avoided the following array conversions and iterations by directly 
adding the results to cache and updating the relevant counts while loading the 
cache instead of doing these separately.

In CompleteScanResultCache.java separate array creation for prependCombined and 
separate array iteration for updating numberOfCompleteRows has been avoided
{code:java}
private Result[] prependCombined(Result[] results, int length) throws 
IOException {

 Result[] prependResults = new Result[length + 1];
 prependResults[0] = combine();
 System.arraycopy(results, start, prependResults, 1, length);
 return prependResults;
 }

private Result[] updateNumberOfCompleteResultsAndReturn(Result... results) {
 numberOfCompleteRows += results.length;
 return results;
 }

{code}
In BatchScanResultCache.java regroupedResults arrayList generation and 
arrayList to array conversion has been avoided
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 List<Result> regroupedResults = new ArrayList<>();
 for (Result result : results) {
..
 regroupedResults.add(...);
..
 return regroupedResults.toArray(new Result[0]);
 }

{code}
In AllowPartialScanResultCache.java we avoid copying the array into a subarray and a 
separate iteration over the array to update numberOfCompleteRows
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 if (i > 0) {
 results = Arrays.copyOfRange(results, i, results.length);
 }
 for (Result result : results) {
 if (!result.mayHaveMoreCellsInRow()) {
 numberOfCompleteRows++;
 }
 }
 return results;
 }
{code}
At a high level, the above has been replaced by the results being added to the 
cache, which is passed in from ClientScanner#loadCache() directly.
In ConnectionUtils.java we pass the cache:
{code:java}
  public static ScanResultCache createScanResultCache(Scan scan, 
LinkedList<Result> cache) {
if (scan.getAllowPartialResults()) {
  return new AllowPartialScanResultCache(cache);
} else if (scan.getBatch() > 0) {
  return new BatchScanResultCache(cache, scan.getBatch());
} else {
  return new CompleteScanResultCache(cache);
}
  }
{code}
And in the respective scanner caches we just do all the partial handling on 
the input results array that we get from the server and add it directly to the 
cache in a single iteration using addResultToCache, which is the same as what 
was done earlier with the result array returned from these classes after 
processing the results array:
{code:java}
  protected void checkUpdateNumberOfCompleteRowsAndCache(Result rs) {
numberOfCompleteRows++;
addResultToCache(rs);
  }

  protected void addResultToCache(Result rs) {
cache.add(rs);
for (Cell cell : rs.rawCells()) {
  resultSize += CellUtil.estimatedHeapSizeOf(cell);
}
count++;
lastResult = rs;
  }

{code}


was (Author: vik.karma):
Summarizing the changes

We have avoided the following array conversions and iterations by directly 
adding the results to cache and updating the relevant counts while loading the 
cache instead of doing these separately.

In CompleteScanResultCache.java separate array creation for prependCombined and 
separate array iteration for updating numberOfCompleteRows has been avoided
{code:java}
private Result[] prependCombined(Result[] results, int length) throws 
IOException {

 Result[] prependResults = new Result[length + 1];
 prependResults[0] = combine();
 System.arraycopy(results, start, prependResults, 1, length);
 return prependResults;
 }

private Result[] updateNumberOfCompleteResultsAndReturn(Result... results) {
 numberOfCompleteRows += results.length;
 return results;
 }

{code}
In BatchScanResultCache.java regroupedResults arrayList generation and 
arrayList to array conversion has been avoided
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 List<Result> regroupedResults = new ArrayList<>();
 for (Result result : results) {
..
 regroupedResults.add(...);
..
 return regroupedResults.toArray(new Result[0]);
 }

{code}
In AllowPartialScanResultCache.java we avoid copying the array into a subarray and a 
separate iteration over the array to update numberOfCompleteRows
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 if (i > 0) {
 results = Arrays.copyOfRange(results, i, results.length);
 }
 for (Result result : results) {
 if (!result.mayHaveMoreCellsInRow()) {
 numberOfCompleteRows++;
 }
 }
 return results;
 }
{code}

At a high level the above has been replaced by the results being added to cache 
which is 

[jira] [Comment Edited] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558097#comment-16558097
 ] 

Vikas Vishwakarma edited comment on HBASE-20896 at 7/26/18 10:03 AM:
-

Summarizing the changes

We have avoided the following array conversions and iterations by directly 
adding the results to cache and updating the relevant counts while loading the 
cache instead of doing these separately.

In CompleteScanResultCache.java separate array creation for prependCombined and 
separate array iteration for updating numberOfCompleteRows has been avoided
{code:java}
private Result[] prependCombined(Result[] results, int length) throws 
IOException {

 Result[] prependResults = new Result[length + 1];
 prependResults[0] = combine();
 System.arraycopy(results, start, prependResults, 1, length);
 return prependResults;
 }

private Result[] updateNumberOfCompleteResultsAndReturn(Result... results) {
 numberOfCompleteRows += results.length;
 return results;
 }

{code}
In BatchScanResultCache.java regroupedResults arrayList generation and 
arrayList to array conversion has been avoided
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 List<Result> regroupedResults = new ArrayList<>();
 for (Result result : results) {
..
 regroupedResults.add(...);
..
 return regroupedResults.toArray(new Result[0]);
 }

{code}
In AllowPartialScanResultCache.java we avoid copying the array into a subarray and a 
separate iteration over the array to update numberOfCompleteRows
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 if (i > 0) {
 results = Arrays.copyOfRange(results, i, results.length);
 }
 for (Result result : results) {
 if (!result.mayHaveMoreCellsInRow()) {
 numberOfCompleteRows++;
 }
 }
 return results;
 }
{code}

At a high level, the above has been replaced by the results being added to the 
cache, which is passed in from ClientScanner.java directly, with 
numberOfCompleteRows updated while adding each element to the cache.
ConnectionUtils.java
{code}
  public static ScanResultCache createScanResultCache(Scan scan, 
LinkedList<Result> cache) {
if (scan.getAllowPartialResults()) {
  return new AllowPartialScanResultCache(cache);
} else if (scan.getBatch() > 0) {
  return new BatchScanResultCache(cache, scan.getBatch());
} else {
  return new CompleteScanResultCache(cache);
}
  }
{code}

And in the respective scanner caches we just do all the partial handling on 
the input results array itself that we get from the server and add it directly 
to the cache in a single iteration over the results using addResultToCache, 
which is the same as what was done earlier with the result array returned from 
these classes:
{code}
  protected void checkUpdateNumberOfCompleteRowsAndCache(Result rs) {
numberOfCompleteRows++;
addResultToCache(rs);
  }

  protected void addResultToCache(Result rs) {
cache.add(rs);
for (Cell cell : rs.rawCells()) {
  resultSize += CellUtil.estimatedHeapSizeOf(cell);
}
count++;
lastResult = rs;
  }

{code}








was (Author: vik.karma):
Summarizing the changes

We have avoided the following array conversions and iterations by directly 
adding the results to cache and updating the relevant counts while loading the 
cache instead of doing these separately.

In CompleteScanResultCache.java separate array creation for prependCombined and 
separate array iteration for updating numberOfCompleteRows has been avoided
{code:java}
private Result[] prependCombined(Result[] results, int length) throws 
IOException {

 Result[] prependResults = new Result[length + 1];
 prependResults[0] = combine();
 System.arraycopy(results, start, prependResults, 1, length);
 return prependResults;
 }

private Result[] updateNumberOfCompleteResultsAndReturn(Result... results) {
 numberOfCompleteRows += results.length;
 return results;
 }

{code}
In BatchScanResultCache.java regroupedResults arrayList generation and 
arrayList to array conversion has been avoided
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 List<Result> regroupedResults = new ArrayList<>();
 for (Result result : results) {
..
 regroupedResults.add(...);
..
 return regroupedResults.toArray(new Result[0]);
 }

{code}
In AllowPartialScanResultCache.java we avoid copying the array into a subarray and a 
separate iteration over the array to update numberOfCompleteRows
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 if (i > 0) {
 results = Arrays.copyOfRange(results, i, results.length);
 }
 for (Result result : results) {
 if (!result.mayHaveMoreCellsInRow()) {
 numberOfCompleteRows++;
 }
 }
 return results;
 }
{code}

At a high level the above has been replaced by the 

[jira] [Comment Edited] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558097#comment-16558097
 ] 

Vikas Vishwakarma edited comment on HBASE-20896 at 7/26/18 9:47 AM:


Summarizing the changes

We have avoided the following array conversions and iterations by directly 
adding the results to cache and updating the relevant counts while loading the 
cache instead of doing these separately.

In CompleteScanResultCache.java separate array creation for prependCombined and 
separate array iteration for updating numberOfCompleteRows has been avoided
{code:java}
private Result[] prependCombined(Result[] results, int length) throws 
IOException {

 Result[] prependResults = new Result[length + 1];
 prependResults[0] = combine();
 System.arraycopy(results, start, prependResults, 1, length);
 return prependResults;
 }

private Result[] updateNumberOfCompleteResultsAndReturn(Result... results) {
 numberOfCompleteRows += results.length;
 return results;
 }

{code}
In BatchScanResultCache.java regroupedResults arrayList generation and 
arrayList to array conversion has been avoided
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 List<Result> regroupedResults = new ArrayList<>();
 for (Result result : results) {
..
 regroupedResults.add(...);
..
 return regroupedResults.toArray(new Result[0]);
 }

{code}
In AllowPartialScanResultCache.java we avoid copying the array into a subarray and a 
separate iteration over the array to update numberOfCompleteRows
{code:java}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 if (i > 0) {
 results = Arrays.copyOfRange(results, i, results.length);
 }
 for (Result result : results) {
 if (!result.mayHaveMoreCellsInRow()) {
 numberOfCompleteRows++;
 }
 }
 return results;
 }
{code}

At a high level, the above has been replaced by the results being added to the 
cache, which is passed in from ClientScanner.java directly, with 
numberOfCompleteRows updated while adding each element to the cache.
ConnectionUtils.java
{code}
  public static ScanResultCache createScanResultCache(Scan scan, 
LinkedList<Result> cache) {
if (scan.getAllowPartialResults()) {
  return new AllowPartialScanResultCache(cache);
} else if (scan.getBatch() > 0) {
  return new BatchScanResultCache(cache, scan.getBatch());
} else {
  return new CompleteScanResultCache(cache);
}
  }
{code}

And for the respective scanner caches, addResultToCache does the same thing as 
what was happening in ScanResultCache.java earlier:
{code}
  protected void checkUpdateNumberOfCompleteRowsAndCache(Result rs) {
numberOfCompleteRows++;
addResultToCache(rs);
  }

  protected void addResultToCache(Result rs) {
cache.add(rs);
for (Cell cell : rs.rawCells()) {
  resultSize += CellUtil.estimatedHeapSizeOf(cell);
}
count++;
lastResult = rs;
  }

{code}








was (Author: vik.karma):
Summarizing the changes

We have avoided the following array conversions and iterations by directly 
adding the results to cache and updating the relevant counts while loading the 
cache instead of doing these separately.

In CompleteScanResultCache.java separate array creation for prependCombined and 
separate array iteration for updating numberOfCompleteRows has been avoided
{code}

private Result[] prependCombined(Result[] results, int length) throws 
IOException {

 Result[] prependResults = new Result[length + 1];
 prependResults[0] = combine();
 System.arraycopy(results, start, prependResults, 1, length);
 return prependResults;
 }

private Result[] updateNumberOfCompleteResultsAndReturn(Result... results) {
 numberOfCompleteRows += results.length;
 return results;
 }

{code}


In BatchScanResultCache.java regroupedResults arrayList generation and 
arrayList to array conversion has been avoided 
{code}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 List<Result> regroupedResults = new ArrayList<>();
 for (Result result : results) {
..
 regroupedResults.add(...);
..
 return regroupedResults.toArray(new Result[0]);
 }

{code}


In AllowPartialScanResultCache.java we avoid copying the array into a subarray and a 
separate iteration over the array to update numberOfCompleteRows
{code}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 if (i > 0) {
 results = Arrays.copyOfRange(results, i, results.length);
 }
 for (Result result : results) {
 if (!result.mayHaveMoreCellsInRow()) {
 numberOfCompleteRows++;
 }
 }
 return results;
 }
{code}

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>

[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558097#comment-16558097
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

Summarizing the changes

We have avoided the following array conversions and iterations by directly 
adding the results to cache and updating the relevant counts while loading the 
cache instead of doing these separately.

In CompleteScanResultCache.java separate array creation for prependCombined and 
separate array iteration for updating numberOfCompleteRows has been avoided
{code}

private Result[] prependCombined(Result[] results, int length) throws 
IOException {

 Result[] prependResults = new Result[length + 1];
 prependResults[0] = combine();
 System.arraycopy(results, start, prependResults, 1, length);
 return prependResults;
 }

private Result[] updateNumberOfCompleteResultsAndReturn(Result... results) {
 numberOfCompleteRows += results.length;
 return results;
 }

{code}


In BatchScanResultCache.java regroupedResults arrayList generation and 
arrayList to array conversion has been avoided 
{code}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 List<Result> regroupedResults = new ArrayList<>();
 for (Result result : results) {
..
 regroupedResults.add(...);
..
 return regroupedResults.toArray(new Result[0]);
 }

{code}


In AllowPartialScanResultCache.java we avoid copying of array into subarray and 
separate iteration over the array to update numberOfCompleteRows
{code}
 public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws 
IOException {
...
 if (i > 0) {
 results = Arrays.copyOfRange(results, i, results.length);
 }
 for (Result result : results) {
 if (!result.mayHaveMoreCellsInRow()) {
 numberOfCompleteRows++;
 }
 }
 return results;
 }
{code}

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.003.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: (was: HBASE-20896.branch-1.4.003.patch)

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-26 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.003.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-25 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.002.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-25 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: (was: HBASE-20896.branch-1.4.002.patch)

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-24 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.002.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-23 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552627#comment-16552627
 ] 

Vikas Vishwakarma commented on HBASE-20896:
---

I have added the 1.4 branch patch to get a QA build run and a review board link 
to look at the changes. I need to do some perf runs with these changes. 

I have a bit more refactoring that can be done, like moving the addResultToCache 
function to the common ScanResultCache.

Also possibly refactor the prependCombined function in CompleteScanResultCache. 
These are mostly related to code cleanup. I just wanted to get a sanity test and 
some perf benchmarks before making more changes. 
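
For illustration, a minimal sketch of what that cleanup could look like (a 
hypothetical shape only, assuming the common ScanResultCache keeps the 
cache/size/count bookkeeping shown in the patch; the actual refactor may differ):
{code:java}
// Hypothetical sketch: hoist the shared cache-loading logic into the common
// ScanResultCache type so each implementation stops duplicating addResultToCache.
public abstract class ScanResultCache {
  protected final LinkedList<Result> cache;
  protected long resultSize;
  protected int count;
  protected Result lastResult;

  protected ScanResultCache(LinkedList<Result> cache) {
    this.cache = cache;
  }

  protected void addResultToCache(Result rs) {
    cache.add(rs);
    for (Cell cell : rs.rawCells()) {
      resultSize += CellUtil.estimatedHeapSizeOf(cell);
    }
    count++;
    lastResult = rs;
  }
}
{code}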

[~apurtell] [~yuzhih...@gmail.com] [~reidchan] [~md...@cloudera.com] 
[~vrodionov]

 

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-23 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Release Note:   (was: get a QA run)

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-23 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Release Note: get a QA run
  Status: Patch Available  (was: Open)

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4

2018-07-23 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20896:
--
Attachment: HBASE-20896.branch-1.4.001.patch

> Port HBASE-20866 to branch-1 and branch-1.4 
> 
>
> Key: HBASE-20896
> URL: https://issues.apache.org/jira/browse/HBASE-20896
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.5.0, 1.4.7
>
> Attachments: HBASE-20896.branch-1.4.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-16 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544979#comment-16544979
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

The code in branch-1.4 onwards is similar to the master branch and will require 
considerable changes to implement the above change in those branches. But 
once done, it should be easy to apply the same change from branch-1.4 through to the 
master branch. Will work on this and update. [~apurtell] so for now I was able to 
commit the patch only to the 1.3 branch. 

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally while testing 1.3 as part of migration from 0.98 to 1.3 we 
> observed perf degradation in scan performance for phoenix queries varying 
> from few 10's to upto 200% depending on the query being executed. We tried 
> simple native HBase scan and there also we saw upto 40% degradation in 
> performance when the number of column qualifiers are high (40-50+)
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. Example for ten’s of millions of rows scanned, this 
> can be called in the order of millions of times.
> In almost all the cases 99% of the time (except for handling partial results, 
> etc). We are just taking the resultsFromServer converting it into a ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case and instead directly take the values arraay returned 
> by callable and add it to the cache without converting it into ArrayList.
> I have taken both these flags allowPartials and isBatchSet out in loadcahe() 
> and I am directly adding values to scanner cache if the above condition is 
> pass instead of coverting it into arrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } 

[jira] [Commented] (HBASE-20889) PE scan is failing with NullPointer

2018-07-15 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544798#comment-16544798
 ] 

Vikas Vishwakarma commented on HBASE-20889:
---

+1 v2 looks good to me [~yuzhih...@gmail.com] 

 

> PE scan is failing with NullPointer
> ---
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > 
> scan1{code}
> PE scan 1 is failing with NullPointer
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20890) PE filterScan seems to be stuck forever

2018-07-15 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544565#comment-16544565
 ] 

Vikas Vishwakarma commented on HBASE-20890:
---

Oh! Thanks for debugging [~yuzhih...@gmail.com]. We should probably add 
another parameter here to limit the number of iterations for filterScan. My 
weekend test runs got completely taken over by the filterScan test :)

> PE filterScan seems to be stuck forever
> ---
>
> Key: HBASE-20890
> URL: https://issues.apache.org/jira/browse/HBASE-20890
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Priority: Minor
>
> Command Used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
> write 2>&1
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
> filterScan 2>&1
> {code}
>  
> Output
> This kept running for several hours just printing the below messages in logs
>  
> {code:java}
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
> 2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> .
> -bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail
> 2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> 2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
> internal scanner to startKey at '52359'
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20889) PE scan is failing with NullPointer

2018-07-15 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544559#comment-16544559
 ] 

Vikas Vishwakarma edited comment on HBASE-20889 at 7/15/18 3:00 PM:


Thanks [~yuzhih...@gmail.com] for the quick fix. I am still wondering a bit 
though why we have different handling for RandomScanWithRangeTest and ScanTest. 
In ScanTest also, shouldn't we be calling 
updateScanMetrics(scan.getScanMetrics()) after every testRow() and not just in 
testTakedown()? We could open a new Jira for handling the ScanMetrics for 
ScanTest, or we could just move the updateScanMetrics() call from takedown() to 
testRow() in a finally block as part of this Jira, which should also solve the 
NullPointer as well as the metric update issue.
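
A rough sketch of that second option, purely for illustration (the field and helper 
names here are simplified stand-ins and not the actual PerformanceEvaluation.ScanTest 
code):
{code:java}
// Hypothetical sketch: update scan metrics after every row in a finally block
// instead of only in testTakedown(), so metrics are captured even when the scan
// fails early and testTakedown() no longer dereferences a null scan.
@Override
void testRow(final int i) throws IOException {
  if (this.testScanner == null) {
    this.currentScan = constructScan(format(this.startRow)); // illustrative names
    this.testScanner = table.getScanner(this.currentScan);
  }
  try {
    testScanner.next();
  } finally {
    updateScanMetrics(this.currentScan.getScanMetrics());
  }
}
{code}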


was (Author: vik.karma):
Thanks [~yuzhih...@gmail.com] for the quick fix. I am still wondering a bit 
though why we have different handling for RandomScanWithRangeTest and ScanTest. 
In ScanTest also shouldn't we be calling 
updateScanMetrics(scan.getScanMetrics()); after every testRow() and not just in 
testTakedown(). Probably open a new Jira for handling the ScanMetrics for 
ScanTest or we could just move the updateScanMetrics() from takedown() to 
testRow() in a final block as part of this Jira ?

> PE scan is failing with NullPointer
> ---
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20889.branch-1.3.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > 
> scan1{code}
> PE scan 1 is failing with NullPointer
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20889) PE scan is failing with NullPointer

2018-07-15 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544559#comment-16544559
 ] 

Vikas Vishwakarma edited comment on HBASE-20889 at 7/15/18 2:59 PM:


Thanks [~yuzhih...@gmail.com] for the quick fix. I am still wondering a bit 
though why we have different handling for RandomScanWithRangeTest and ScanTest. 
In ScanTest also shouldn't we be calling 
updateScanMetrics(scan.getScanMetrics()); after every testRow() and not just in 
testTakedown(). Probably open a new Jira for handling the ScanMetrics for 
ScanTest or we could just move the updateScanMetrics() from takedown() to 
testRow() in a final block as part of this Jira ?


was (Author: vik.karma):
Thanks [~yuzhih...@gmail.com] for the quick fix. I am still wondering a bit 
though why we have different handling for RandomScanWithRangeTest and ScanTest. 
In ScanTest also shouldn't we be calling 
updateScanMetrics(scan.getScanMetrics()); after every testRow() and not just in 
testTakedown(). Probably open a new Jira for handling the ScanMetrics for 
ScanTest?

> PE scan is failing with NullPointer
> ---
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20889.branch-1.3.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > 
> scan1{code}
> PE scan 1 is failing with NullPointer
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20889) PE scan is failing with NullPointer

2018-07-15 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544559#comment-16544559
 ] 

Vikas Vishwakarma commented on HBASE-20889:
---

Thanks [~yuzhih...@gmail.com] for the quick fix. I am still wondering a bit 
though why we have different handling for RandomScanWithRangeTest and ScanTest. 
In ScanTest also shouldn't we be calling 
updateScanMetrics(scan.getScanMetrics()); after every testRow() and not just in 
testTakedown(). Probably open a new Jira for handling the ScanMetrics for 
ScanTest?

> PE scan is failing with NullPointer
> ---
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20889.branch-1.3.txt
>
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > 
> scan1{code}
> PE scan 1 is failing with NullPointer
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20891) Avoid intermediate array to arraylist inter-conversions while loading scan cache

2018-07-15 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20891:
--
Summary: Avoid intermediate array to arraylist inter-conversions while 
loading scan cache  (was: Avoid intermediate array to arraylist conversions 
while loading scan cache)

> Avoid intermediate array to arraylist inter-conversions while loading scan 
> cache
> 
>
> Key: HBASE-20891
> URL: https://issues.apache.org/jira/browse/HBASE-20891
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 3.0.0, 2.1.0
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Major
>
> As discussed in HBASE-20866, we would like to avoid array to arraylist 
> conversions while loading scan cache which is currently happening as part of 
> partial result handling. In HBASE-20866 we are handling the changes for 
> branch-1.x. In this request we will handle it for branch-2 and master branch, 
> since the code has been refactored and will require more changes compared to 
> branch-1
> Also preliminary look at the master branch shows that result handling has 
> been separated out into AllowPartialScanResultCache, BatchScanResultCache and 
> CompleteScanResultCache.
> In case of BatchScanResultCache we are actually converting Result[] to 
> List<Result> for result grooming and then List<Result> back to toArray 
> before returning to loadCache() where it is added to cache.
> So in case of BatchScanResultCache if we are able to directly load the 
> results to cache then we would be avoiding two intermediate conversions
>  * result Array to ArrayList in BatchScanResultCache for result grooming
>  * ArrayList to array conversion while returning to loadCache()
> Which will probably give higher performance improvement compared to branch-1 
> case handled in HBASE-20866 where we avoided just one result array to 
> arraylist conversion and saw upto 10% improvement in scan performance
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20891) Avoid intermediate array to arraylist conversions while loading scan cache

2018-07-15 Thread Vikas Vishwakarma (JIRA)
Vikas Vishwakarma created HBASE-20891:
-

 Summary: Avoid intermediate array to arraylist conversions while 
loading scan cache
 Key: HBASE-20891
 URL: https://issues.apache.org/jira/browse/HBASE-20891
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 3.0.0, 2.1.0
Reporter: Vikas Vishwakarma
Assignee: Vikas Vishwakarma


As discussed in HBASE-20866, we would like to avoid array to arraylist 
conversions while loading scan cache which is currently happening as part of 
partial result handling. In HBASE-20866 we are handling the changes for 
branch-1.x. In this request we will handle it for branch-2 and master branch, 
since the code has been refactored and will require more changes compared to 
branch-1

Also preliminary look at the master branch shows that result handling has been 
separated out into AllowPartialScanResultCache, BatchScanResultCache and 
CompleteScanResultCache.

In case of BatchScanResultCache we are actually converting Result[] to 
List<Result> for result grooming and then List<Result> back to toArray before 
returning to loadCache() where it is added to cache.

So in case of BatchScanResultCache if we are able to directly load the results 
to cache then we would be avoiding two intermediate conversions
 * result Array to ArrayList in BatchScanResultCache for result grooming
 * ArrayList to array conversion while returning to loadCache()

Which will probably give higher performance improvement compared to branch-1 
case handled in HBASE-20866 where we avoided just one result array to arraylist 
conversion and saw upto 10% improvement in scan performance
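
To make the two conversions concrete, the current flow being referred to is roughly 
the following (simplified from the actual code, for illustration only):
{code:java}
// 1) Result[] -> ArrayList inside BatchScanResultCache for result grooming
List<Result> regroupedResults = new ArrayList<>();
for (Result result : results) {
  regroupedResults.add(result);          // grooming of partial/batched rows elided
}
// 2) ArrayList -> Result[] when returning to the caller
Result[] groomed = regroupedResults.toArray(new Result[0]);
// loadCache() then iterates the returned array into the client-side cache
for (Result rs : groomed) {
  cache.add(rs);
}
{code}
Loading the groomed results straight into the client-side cache would skip both 
intermediate conversions.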

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20890) PE filterScan seems to be stuck forever

2018-07-15 Thread Vikas Vishwakarma (JIRA)
Vikas Vishwakarma created HBASE-20890:
-

 Summary: PE filterScan seems to be stuck forever
 Key: HBASE-20890
 URL: https://issues.apache.org/jira/browse/HBASE-20890
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.3
Reporter: Vikas Vishwakarma


Command Used
{code:java}

~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred randomWrite 1 > 
write 2>&1
~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred filterScan 1 > 
filterScan 2>&1
{code}
 

Output

This kept running for several hours just printing the below messages in logs

 
{code:java}

-bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | head
2018-07-13 10:44:45,188 DEBUG [TestClient-0] client.ClientScanner - Advancing 
internal scanner to startKey at '52359'
2018-07-13 10:44:45,976 DEBUG [TestClient-0] client.ClientScanner - Advancing 
internal scanner to startKey at '52359'
2018-07-13 10:44:46,695 DEBUG [TestClient-0] client.ClientScanner - Advancing 
internal scanner to startKey at '52359'
.

-bash-4.1$ grep "Advancing internal scanner to startKey" filterScan.1 | tail

2018-07-15 06:20:22,353 DEBUG [TestClient-0] client.ClientScanner - Advancing 
internal scanner to startKey at '52359'
2018-07-15 06:20:23,044 DEBUG [TestClient-0] client.ClientScanner - Advancing 
internal scanner to startKey at '52359'
2018-07-15 06:20:23,768 DEBUG [TestClient-0] client.ClientScanner - Advancing 
internal scanner to startKey at '52359'
{code}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20889) PE scan is failing with NullPointer

2018-07-15 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20889:
--
Summary: PE scan is failing with NullPointer  (was: PE scan 1 is failing 
with NullPointer)

> PE scan is failing with NullPointer
> ---
>
> Key: HBASE-20889
> URL: https://issues.apache.org/jira/browse/HBASE-20889
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.3
>Reporter: Vikas Vishwakarma
>Priority: Minor
>
> Command used
> {code:java}
> ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > 
> scan1{code}
> PE scan 1 is failing with NullPointer
> {code:java}
> java.io.IOException: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
> Caused by: java.lang.NullPointerException
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
>     at 
> org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20889) PE scan 1 is failing with NullPointer

2018-07-15 Thread Vikas Vishwakarma (JIRA)
Vikas Vishwakarma created HBASE-20889:
-

 Summary: PE scan 1 is failing with NullPointer
 Key: HBASE-20889
 URL: https://issues.apache.org/jira/browse/HBASE-20889
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.3
Reporter: Vikas Vishwakarma


Command used
{code:java}
~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > scan1{code}
PE scan 1 is failing with NullPointer
{code:java}
java.io.IOException: java.lang.NullPointerException
    at 
org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447)
    at 
org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920)
    at 
org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at 
org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326)
Caused by: java.lang.NullPointerException
    at 
org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530)
    at 
org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165)
    at 
org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896)
    at 
org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429)
    at 
org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-15 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544414#comment-16544414
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

Thanks [~apurtell] [~yuzhih...@gmail.com], I was able to commit to 1.3 but it 
looks like the other branches will need changes. Will add and commit the same. 
Separately, I am seeing some issues with PE. The filterScan test never 
completes and was running for over 16 hours, and I am also seeing NullPointers in 
some cases; will log separate Jiras for those.

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally while testing 1.3 as part of migration from 0.98 to 1.3 we 
> observed perf degradation in scan performance for phoenix queries varying 
> from few 10's to upto 200% depending on the query being executed. We tried 
> simple native HBase scan and there also we saw upto 40% degradation in 
> performance when the number of column qualifiers are high (40-50+)
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. Example for ten’s of millions of rows scanned, this 
> can be called in the order of millions of times.
> In almost all the cases 99% of the time (except for handling partial results, 
> etc). We are just taking the resultsFromServer converting it into a ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case and instead directly take the values arraay returned 
> by callable and add it to the cache without converting it into ArrayList.
> I have taken both these flags allowPartials and isBatchSet out in loadcahe() 
> and I am directly adding values to scanner cache if the above condition is 
> pass instead of coverting it into arrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..

[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-13 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542866#comment-16542866
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

[~yuzhih...@gmail.com] I am not seeing much difference in RandomReadTest and 
SequentialReadTest, probably because these are mostly gets. 

RandomSeekScanTest took 2013537ms without the patch and 1908920ms with the patch, 
which is a 5-6% improvement.

filterScan and scanRange1 were taking a long time to complete. I will leave 
a test iteration running over the weekend and report the results once completed. 

The above test failures in the server module again don't look related to my change, 
probably some issue with the build. Locally mvn test -P runDevTests passed for 
me. I will leave mvn test -P runAllTests running over the weekend. 

 

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally while testing 1.3 as part of migration from 0.98 to 1.3 we 
> observed perf degradation in scan performance for phoenix queries varying 
> from few 10's to upto 200% depending on the query being executed. We tried 
> simple native HBase scan and there also we saw upto 40% degradation in 
> performance when the number of column qualifiers are high (40-50+)
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. Example for ten’s of millions of rows scanned, this 
> can be called in the order of millions of times.
> In almost all the cases 99% of the time (except for handling partial results, 
> etc). We are just taking the resultsFromServer converting it into a ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case and instead directly take the values arraay returned 
> by callable and add it to the cache without converting it into ArrayList.
> I have taken both these flags allowPartials and isBatchSet out in loadcahe() 
> and I am directly adding values to scanner cache if the above condition is 
> pass instead of coverting it into arrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws 

[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542475#comment-16542475
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

[~reidchan] [~mdrob] I was looking at the master branch. Here I can see that the 
result handling has been separated out into AllowPartialScanResultCache, 
BatchScanResultCache and CompleteScanResultCache. In the case of 
BatchScanResultCache we are actually converting Result[] to List<Result> for 
result grooming and then converting the List<Result> back to an array before 
returning to loadCache(), where it is added to the cache. There seems to be a good 
amount of refactoring in this area of the code between master and branch-1 as you 
mentioned. Will try and check whether the changes submitted for the 1.3 branch can 
fit into the master branch or whether they require more refactoring. 

 

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally while testing 1.3 as part of migration from 0.98 to 1.3 we 
> observed perf degradation in scan performance for phoenix queries varying 
> from few 10's to upto 200% depending on the query being executed. We tried 
> simple native HBase scan and there also we saw upto 40% degradation in 
> performance when the number of column qualifiers are high (40-50+)
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. Example for ten’s of millions of rows scanned, this 
> can be called in the order of millions of times.
> In almost all the cases 99% of the time (except for handling partial results, 
> etc). We are just taking the resultsFromServer converting it into a ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List<Result> outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case and instead directly take the values arraay returned 
> by callable and add it to the cache without converting it into ArrayList.
> I have taken both these flags allowPartials and isBatchSet out in loadcahe() 
> and I am directly adding values to scanner cache if the above condition is 
> pass instead of coverting it into arrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: HBASE-20866.branch-1.3.003.patch

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally while testing 1.3 as part of migration from 0.98 to 1.3 we 
> observed perf degradation in scan performance for phoenix queries varying 
> from few 10's to upto 200% depending on the query being executed. We tried 
> simple native HBase scan and there also we saw upto 40% degradation in 
> performance when the number of column qualifiers are high (40-50+)
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. Example for ten’s of millions of rows scanned, this 
> can be called in the order of millions of times.
> In almost all the cases 99% of the time (except for handling partial results, 
> etc). We are just taking the resultsFromServer converting it into a ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List<Result> resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: (was: HBASE-20866.branch-1.3.003.patch)

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> getResultsToAddToCache(values, 

[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542421#comment-16542421
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

added a reviewboard link here [https://reviews.apache.org/r/67902/]

[~apurtell] [~reidchan] [~mdrob]

 

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> 

[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-12 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541632#comment-16541632
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

these test failures and checkstyle warnings in server unit tests don't look 
related to my change. I will check, run the server tests locally and then 
resubmit the patch and reviewboard link. 

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: HBASE-20866.branch-1.3.004.patch

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: (was: HBASE-20866.branch-1.3.004.patch)

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache 

[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541158#comment-16541158
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

thanks for the review and suggestions.

I have just added a trivial change in HRegion to the patch to get a full 
QA run, as suggested by [~yuzhih...@gmail.com].

I will also put up the patch on RB and take a look at the master branch

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: HBASE-20866.branch-1.3.003.patch

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> 

[jira] [Updated] (HBASE-17877) Improve HBase's byte[] comparator

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-17877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-17877:
--
Attachment: HBASE-17877.branch-1.3.004.patch

> Improve HBase's byte[] comparator
> -
>
> Key: HBASE-17877
> URL: https://issues.apache.org/jira/browse/HBASE-17877
> Project: HBase
>  Issue Type: Improvement
>  Components: util
>Reporter: Lars Hofhansl
>Assignee: Vikas Vishwakarma
>Priority: Major
> Fix For: 1.4.0, 1.3.2, 2.0.0
>
> Attachments: 17877-1.2.patch, 17877-v2-1.3.patch, 17877-v3-1.3.patch, 
> 17877-v4-1.3.patch, ByteComparatorJiraHBASE-17877.pdf, 
> HBASE-17877.branch-1.3.001.patch, HBASE-17877.branch-1.3.002.patch, 
> HBASE-17877.branch-1.3.003.patch, HBASE-17877.branch-1.3.004.patch, 
> HBASE-17877.master.001.patch, HBASE-17877.master.002.patch, 
> HBASE-17877.master.003.patch
>
>
> [~vik.karma] did some extensive tests and found that Hadoop's version is 
> faster - dramatically faster in some cases.
> Patch forthcoming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540027#comment-16540027
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

>> test4test: The patch doesn't appear to 
>> include any new or modified tests.

Since this is a refactoring of existing code and no new functionality has been 
added, the existing tests should suffice; these include the scan and partial 
result handling tests. I have deployed the changes locally and am running scan 
benchmark tests.
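
A scan benchmark of this sort boils down to a timed client-side scan loop, roughly 
along these lines (a minimal sketch only; the table name, caching value, and output 
format are placeholders, not the exact setup used for the numbers reported here):

{code:java}
// Minimal scan micro-benchmark sketch (placeholders only): time a full scan
// of a table whose rows carry many column qualifiers.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class ScanBenchmarkSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("TestTable"))) {
      Scan scan = new Scan();
      scan.setCaching(1000);               // rows fetched per RPC
      long rows = 0;
      long cells = 0;
      long start = System.nanoTime();
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result r : scanner) {         // exercises the client-side loadCache() path
          rows++;
          cells += r.rawCells().length;
        }
      }
      long ms = (System.nanoTime() - start) / 1_000_000;
      System.out.println(rows + " rows, " + cells + " cells in " + ms + " ms");
    }
  }
}
{code}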

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: HBASE-20866.branch-1.3.002.patch

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> getResultsToAddToCache(values, 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: (was: HBASE-20866.branch-1.3.002.patch)

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> getResultsToAddToCache(values, callable.isHeartbeatMessage());
>  

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: HBASE-20866.branch-1.3.002.patch

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch, 
> HBASE-20866.branch-1.3.002.patch
>
>
> Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we 
> observed degradation in scan performance for Phoenix queries varying 
> from a few tens of percent up to 200% depending on the query being executed. We tried 
> a simple native HBase scan and there also saw up to 40% degradation in 
> performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases, 99% of the time (except for handling partial results, 
> etc.), we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..) and then iterating over the list 
> again and adding it to cache in loadCache(..) as given in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent case, and instead directly take the values array returned 
> by callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags allowPartials and isBatchSet out into loadCache() 
> and I am directly adding values to the scanner cache when the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> 
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> getResultsToAddToCache(values, 

[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539743#comment-16539743
 ] 

Vikas Vishwakarma commented on HBASE-20866:
---

[~reidchan] [~yuzhih...@gmail.com] [~apurtell] [~vrodionov] that is right. 
There will still be a good amount of difference between 0.98 and 1.3. For us, 
primarily using HBase through Phoenix and more SQL-like use cases with a large 
number of columns, we tend to see an amplified impact of such differences. For 
direct HBase use cases (non-Phoenix) and relatively smaller schemas (fewer CQs) 
we don't see a major regression between 0.98 and 1.3. In the attached patch I 
have attempted to completely get rid of the array-to-ArrayList conversion, both 
for normal and for partial result handling, to avoid having to deal with arrays 
and ArrayLists separately in loadCache(). So I have replaced 
getResultsToAddToCache(..) with loadResultsToCache(..), which processes the 
result array and directly adds it to the cache within the same function, 
without using/returning an intermediate ArrayList, both for normal results and 
for partial results.

Please review, I will fix the checkstyle warnings and get some benchmark test 
runs with this change since this also includes the changes for partial results 
handling which was not done earlier. 
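To make the intent concrete, here is a minimal sketch of the idea (not the 
attached patch itself): the Result[] returned by the scanner callable is added 
straight to the client-side cache, and Result.isPartial() is used only to mark 
where the partial-result merge would hook in.
{code:java}
import java.util.List;
import org.apache.hadoop.hbase.client.Result;

final class ScannerCacheSketch {
  // Hedged sketch of loadResultsToCache(..): no intermediate ArrayList, the
  // server results go directly into the scanner cache.
  static void loadResultsToCache(Result[] resultsFromServer, List<Result> cache) {
    if (resultsFromServer == null) {
      return;
    }
    for (Result rs : resultsFromServer) {
      if (!rs.isPartial()) {
        cache.add(rs);   // the common, non-partial case: cache directly
      }
      // else: accumulate and merge partial rows here before adding to the cache
    }
  }
}
{code}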

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch
>
>
> Internally while testing 1.3 as part of migration from 0.98 to 1.3 we 
> observed perf degradation in scan performance for phoenix queries varying 
> from few 10's to upto 200% depending on the query being executed. We tried 
> simple native HBase scan and there also we saw upto 40% degradation in 
> performance when the number of column qualifiers are high (40-50+)
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases (99% of the time, except for handling partial results, 
> etc.) we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..), and then iterating over the list 
> again and adding it to the cache in loadCache(..), as shown in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to arraylist conversion 
> (resultsFromServer --> resultsToAddToCache ) for the first case which is also 
> the most frequent 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Status: Patch Available  (was: Open)

attempting a QA run 

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch
>
>
> Internally while testing 1.3 as part of migration from 0.98 to 1.3 we 
> observed perf degradation in scan performance for phoenix queries varying 
> from few 10's to upto 200% depending on the query being executed. We tried 
> simple native HBase scan and there also we saw upto 40% degradation in 
> performance when the number of column qualifiers are high (40-50+)
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases (99% of the time, except for handling partial results, 
> etc.) we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..), and then iterating over the list 
> again and adding it to the cache in loadCache(..), as shown in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is also 
> the most frequent case, and instead directly take the values array returned 
> by the callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags, allowPartials and isBatchSet, out in loadCache() 
> and I am directly adding values to the scanner cache if the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> getResultsToAddToCache(values, callable.isHeartbeatMessage());
> 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-11 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Attachment: HBASE-20866.branch-1.3.001.patch

> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
>Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Attachments: HBASE-20866.branch-1.3.001.patch
>
>
> Internally while testing 1.3 as part of migration from 0.98 to 1.3 we 
> observed perf degradation in scan performance for phoenix queries varying 
> from few 10's to upto 200% depending on the query being executed. We tried 
> simple native HBase scan and there also we saw upto 40% degradation in 
> performance when the number of column qualifiers are high (40-50+)
> To identify the root cause of performance diff between 0.98 and 1.3 we 
> carried out lot of experiments with profiling and git bisect iterations, 
> however we were not able to identify any particular source of scan 
> performance degradation and it looked like this is an accumulated degradation 
> of 5-10% over various enhancements and refactoring.
> We identified few major enhancements like partialResult handling, 
> ScannerContext with heartbeat processing, time/size limiting, RPC 
> refactoring, etc that could have contributed to small degradation in 
> performance which put together could be leading to large overall degradation.
> One of the changes is 
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which 
> implements partialResult handling. In ClientScanner.java the results received 
> from server are cached on the client side by converting the result array into 
> an ArrayList. This function gets called in a loop depending on the number of 
> rows in the scan result. For example, for tens of millions of rows scanned, this 
> can be called on the order of millions of times.
> In almost all cases (99% of the time, except for handling partial results, 
> etc.) we are just taking the resultsFromServer, converting it into an ArrayList 
> resultsToAddToCache in addResultsToList(..), and then iterating over the list 
> again and adding it to the cache in loadCache(..), as shown in the code path below
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
> addResultsToList(..) →
> {code:java}
> loadCache() {
> ...
>  List resultsToAddToCache =
>  getResultsToAddToCache(values, callable.isHeartbeatMessage());
> ...
> …
>    for (Result rs : resultsToAddToCache) {
>  rs = filterLoadedCell(rs);
>  cache.add(rs);
> ...
>    }
> }
> getResultsToAddToCache(..) {
> ..
>    final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>    final boolean allowPartials = scan != null && 
> scan.getAllowPartialResults();
> ..
>    if (allowPartials || isBatchSet) {
>  addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>    (null == resultsFromServer ? 0 : resultsFromServer.length));
>  return resultsToAddToCache;
>    }
> ...
> }
> private void addResultsToList(List outputList, Result[] inputArray, 
> int start, int end) {
>    if (inputArray == null || start < 0 || end > inputArray.length) return;
>    for (int i = start; i < end; i++) {
>  outputList.add(inputArray[i]);
>    }
>  }{code}
>  
> It looks like we can avoid the result array to ArrayList conversion 
> (resultsFromServer --> resultsToAddToCache) for the first case, which is also 
> the most frequent case, and instead directly take the values array returned 
> by the callable and add it to the cache without converting it into an ArrayList.
> I have taken both these flags, allowPartials and isBatchSet, out in loadCache() 
> and I am directly adding values to the scanner cache if the above condition 
> passes, instead of converting them into an ArrayList by calling 
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
> Result[] values = null;
> ..
> final boolean isBatchSet = scan != null && scan.getBatch() > 0;
> final boolean allowPartials = scan != null && scan.getAllowPartialResults();
> ..
> for (;;) {
> try {
> values = call(callable, caller, scannerTimeout);
> ..
> } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
> ..
> }
> if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
> if (values != null) {
> for (int v = 0; v < values.length; v++) {
> Result rs = values[v];
> cache.add(rs);
> ...
> } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
> List resultsToAddToCache =
> getResultsToAddToCache(values, callable.isHeartbeatMessage());
>  for (Result 

[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-10 Thread Vikas Vishwakarma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-20866:
--
Description: 
Internally while testing 1.3 as part of migration from 0.98 to 1.3 we observed 
perf degradation in scan performance for phoenix queries varying from few 10's 
to upto 200% depending on the query being executed. We tried simple native 
HBase scan and there also we saw upto 40% degradation in performance when the 
number of column qualifiers are high (40-50+)

To identify the root cause of performance diff between 0.98 and 1.3 we carried 
out lot of experiments with profiling and git bisect iterations, however we 
were not able to identify any particular source of scan performance degradation 
and it looked like this is an accumulated degradation of 5-10% over various 
enhancements and refactoring.

We identified few major enhancements like partialResult handling, 
ScannerContext with heartbeat processing, time/size limiting, RPC refactoring, 
etc that could have contributed to small degradation in performance which put 
together could be leading to large overall degradation.

One of the changes is 
[HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which implements 
partialResult handling. In ClientScanner.java the results received from server 
are cached on the client side by converting the result array into an ArrayList. 
This function gets called in a loop depending on the number of rows in the scan 
result. For example, for tens of millions of rows scanned, this can be called on 
the order of millions of times.

In almost all cases (99% of the time, except for handling partial results, 
etc.) we are just taking the resultsFromServer, converting it into an ArrayList 
resultsToAddToCache in addResultsToList(..), and then iterating over the list 
again and adding it to the cache in loadCache(..), as shown in the code path below

In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
addResultsToList(..) →
{code:java}
loadCache() {
...
 List resultsToAddToCache =
 getResultsToAddToCache(values, callable.isHeartbeatMessage());
...
…
   for (Result rs : resultsToAddToCache) {
 rs = filterLoadedCell(rs);
 cache.add(rs);
...
   }
}

getResultsToAddToCache(..) {
..
   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
..
   if (allowPartials || isBatchSet) {
 addResultsToList(resultsToAddToCache, resultsFromServer, 0,
   (null == resultsFromServer ? 0 : resultsFromServer.length));
 return resultsToAddToCache;
   }
...
}

private void addResultsToList(List outputList, Result[] inputArray, int 
start, int end) {
   if (inputArray == null || start < 0 || end > inputArray.length) return;
   for (int i = start; i < end; i++) {
 outputList.add(inputArray[i]);
   }
 }{code}
 

It looks like we can avoid the result array to ArrayList conversion 
(resultsFromServer --> resultsToAddToCache) for the first case, which is also 
the most frequent case, and instead directly take the values array returned by 
the callable and add it to the cache without converting it into an ArrayList.

I have taken both these flags, allowPartials and isBatchSet, out in loadCache() 
and I am directly adding values to the scanner cache if the above condition 
passes, instead of converting them into an ArrayList by calling 
getResultsToAddToCache(). For example:
{code:java}
protected void loadCache() throws IOException {
Result[] values = null;
..
final boolean isBatchSet = scan != null && scan.getBatch() > 0;
final boolean allowPartials = scan != null && scan.getAllowPartialResults();
..
for (;;) {
try {
values = call(callable, caller, scannerTimeout);
..
} catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
..
}

if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
if (values != null) {
for (int v = 0; v < values.length; v++) {
Result rs = values[v];
cache.add(rs);
...
}
} else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
List resultsToAddToCache =
getResultsToAddToCache(values, callable.isHeartbeatMessage());
 for (Result rs : resultsToAddToCache) {

cache.add(rs);
...

}
}


{code}
 

I am seeing up to 10% improvement in scan time with these changes; sample PE 
execution results are given below. 
||PE (1M , 1 thread)||with addResultsToList||without addResultsToList||%improvement||
|ScanTest|9228|8448|9|
|RandomScanWithRange10Test|393413|378222|4|
|RandomScanWithRange100Test|1041860|980147|6|

Similarly we are observing up to 10% improvement in a simple native HBase scan 
test used internally that just scans through a large region filtering all the 
rows. I still have to do the Phoenix query tests with this change. Posting the 
initial observations for feedback/comments and suggestions. 

  was:
Internally while testing 1.3 as part of migration from 0.98 to 1.3 we observed 
perf degradation in scan performance for phoenix queries varying from few 10's 
to upto 200% depending on 

[jira] [Created] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version

2018-07-10 Thread Vikas Vishwakarma (JIRA)
Vikas Vishwakarma created HBASE-20866:
-

 Summary: HBase 1.x scan performance degradation compared to 0.98 
version
 Key: HBASE-20866
 URL: https://issues.apache.org/jira/browse/HBASE-20866
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.3.2
Reporter: Vikas Vishwakarma


Internally while testing 1.3 as part of migration from 0.98 to 1.3 we observed 
perf degradation in scan performance for phoenix queries varying from few 10's 
to upto 200% depending on the query being executed. We tried simple native 
HBase scan and there also we saw upto 40% degradation in performance when the 
number of column qualifiers are high (40-50+)

To identify the root cause of performance diff between 0.98 and 1.3 we carried 
out lot of experiments with profiling and git bisect iterations, however we 
were not able to identify any particular source of scan performance degradation 
and it looked like this is an accumulated degradation of 5-10% over various 
enhancements and refactoring. 

We identified few major enhancements like partialResult handling, 
ScannerContext with heartbeat processing, time/size limiting, RPC refactoring, 
etc that could have contributed to small degradation in performance which put 
together could be leading to large overall degradation.

One of the changes is 
[HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which implements 
partialResult handling. In ClientScanner.java the results received from server 
are cached on the client side by converting the result array into an ArrayList. 
This function gets called in a loop depending on the number of rows in the scan 
result. For example, for tens of millions of rows scanned, this can be called on 
the order of millions of times. 

In almost all cases (99% of the time, except for handling partial results, 
etc.) we are just taking the resultsFromServer, converting it into an ArrayList 
resultsToAddToCache in addResultsToList(..), and then iterating over the list 
again and adding it to the cache in loadCache(..), as shown in the code path below

In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → 
addResultsToList(..) → 
{code:java}

loadCache() {

...

 List resultsToAddToCache =

 getResultsToAddToCache(values, callable.isHeartbeatMessage());

...

…

   for (Result rs : resultsToAddToCache) {

 rs = filterLoadedCell(rs);

 cache.add(rs);

...

   }

}


getResultsToAddToCache(..) {

..

   final boolean isBatchSet = scan != null && scan.getBatch() > 0;

   final boolean allowPartials = scan != null && scan.getAllowPartialResults();

..

   if (allowPartials || isBatchSet) {

 addResultsToList(resultsToAddToCache, resultsFromServer, 0,

   (null == resultsFromServer ? 0 : resultsFromServer.length));

 return resultsToAddToCache;

   }

...



}

 private void addResultsToList(List outputList, Result[] inputArray, 
int start, int end) {

   if (inputArray == null || start < 0 || end > inputArray.length) return;


   for (int i = start; i < end; i++) {

 outputList.add(inputArray[i]);

   }

 }{code}
 

It looks like we can avoid the result array to ArrayList conversion 
(resultsFromServer --> resultsToAddToCache) for the first case, which is also 
the most frequent case, and instead directly take the resultsFromServer and add 
it to the cache without converting it into an ArrayList.
{code:java}
if (allowPartials || isBatchSet) {
addResultsToList(resultsToAddToCache, resultsFromServer, 0,
(null == resultsFromServer ? 0 : resultsFromServer.length));
return resultsToAddToCache;
}{code}
I have taken both these flags, allowPartials and isBatchSet, out in loadCache() 
and I am directly adding values to the scanner cache if the above condition 
passes, instead of converting them into an ArrayList by calling 
getResultsToAddToCache(). For example:
{code:java}
protected void loadCache() throws IOException {
Result[] values = null;
..
final boolean isBatchSet = scan != null && scan.getBatch() > 0;
final boolean allowPartials = scan != null && scan.getAllowPartialResults();
..
for (;;) {
try {
values = call(callable, caller, scannerTimeout);
..
} catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
..
}

if (allowPartials || isBatchSet) {  // DIRECTLY COPY values TO CACHE
if (values != null) {
for (int v = 0; v < values.length; v++) {
Result rs = values[v];
cache.add(rs);
...
}
} else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
List resultsToAddToCache =
getResultsToAddToCache(values, callable.isHeartbeatMessage());
 for (Result rs : resultsToAddToCache) {

cache.add(rs);
...

}
}


{code}
 

I am seeing up to 10% improvement in scan time with these changes; sample PE 
execution results are given below. 

|| PE (1M , 1 thread) || with addResultsToList || without addResultsToList || %improvement || 
| ScanTest | 9228 | 8448 | 9 | 
| RandomScanWithRange10Test | 393413 | 378222 | 4 | 
|RandomScanWithRange100Test | 1041860 | 980147 | 6 |

Similarly we are observing upto 10% improvement in simple 

[jira] [Commented] (HBASE-17958) Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP

2018-04-22 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447538#comment-16447538
 ] 

Vikas Vishwakarma commented on HBASE-17958:
---

Actually [~apurtell] [~lhofhansl] [~Apache9] I think the issue is the one 
already reported in HBASE-10993 and fixed by [~stack] in HBASE-15971. Profiling 
and comparing 0.98 vs 1.3 also shows this as the major diff in the hot 
threads. The hbase.ipc.server.callqueue.type default has since been set to 
fifo, so it should be fixed, and I have verified we are not setting it to 
deadline in our hbase-site.xml. But we still see very similar characteristics. 
Let me look into this further. 

I will log a separate ticket to update all the profiling snapshots and result 
details, since it is not related to this request and this is turning into a 
long discussion. 

> Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP
> 
>
> Key: HBASE-17958
> URL: https://issues.apache.org/jira/browse/HBASE-17958
> Project: HBase
>  Issue Type: Bug
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 1.4.0, 2.0.0
>
> Attachments: 0001-add-one-ut-testWithColumnCountGetFilter.patch, 
> 17958-add.txt, HBASE-17958-branch-1.patch, HBASE-17958-branch-1.patch, 
> HBASE-17958-branch-1.patch, HBASE-17958-branch-1.patch, HBASE-17958-v1.patch, 
> HBASE-17958-v2.patch, HBASE-17958-v3.patch, HBASE-17958-v4.patch, 
> HBASE-17958-v5.patch, HBASE-17958-v6.patch, HBASE-17958-v7.patch, 
> HBASE-17958-v7.patch
>
>
> {code}
> ScanQueryMatcher.MatchCode qcode = matcher.match(cell);
> qcode = optimize(qcode, cell);
> {code}
> The optimize method may change the MatchCode from SEEK_NEXT_COL/SEEK_NEXT_ROW 
> to SKIP, but it still passes the next cell to ScanQueryMatcher. This gives a 
> wrong result with some filters, e.g. ColumnCountGetFilter, which just counts 
> the number of columns. If the same column is passed to this filter again, the 
> count result will be wrong. So we should avoid passing the cell to 
> ScanQueryMatcher when we optimize SEEK to SKIP.
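A rough illustration of the over-counting (plain Java, not the actual filter or 
matcher code; the cell lists are made-up examples):
{code:java}
public class ColumnCountSketch {
  public static void main(String[] args) {
    // If optimize() downgrades SEEK_NEXT_COL to SKIP, another cell of the same
    // column can reach the filter, so a naive per-row column counter over-counts.
    String[] cellsWithSkip = {"cf:a", "cf:a", "cf:b"}; // duplicate column reaches the filter
    String[] cellsWithSeek = {"cf:a", "cf:b"};         // SEEK_NEXT_COL moves past it

    System.out.println("count with SKIP = " + count(cellsWithSkip)); // 3 (wrong)
    System.out.println("count with SEEK = " + count(cellsWithSeek)); // 2 (expected)
  }

  // Counts every cell it is shown, the way a simple column-count limit would.
  static int count(String[] cells) {
    int n = 0;
    for (String ignored : cells) {
      n++;
    }
    return n;
  }
}
{code}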



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17958) Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP

2018-04-22 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447456#comment-16447456
 ] 

Vikas Vishwakarma commented on HBASE-17958:
---

[~lhofhansl] [~Apache9] we are OK on this request for now. We have observed the 
issue against the 1.3 build, and this patch seems to be present only in 1.4. I 
verified that it is not cherry-picked into our internal 1.3 light fork either. 
Once we are done debugging the 1.3 issue, we plan to do a 0.98 vs 1.3 vs 1.4 
benchmark anyway. Will get back on this request when we do the 1.4 
benchmark. [~apurtell]

> Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP
> 
>
> Key: HBASE-17958
> URL: https://issues.apache.org/jira/browse/HBASE-17958
> Project: HBase
>  Issue Type: Bug
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 1.4.0, 2.0.0
>
> Attachments: 0001-add-one-ut-testWithColumnCountGetFilter.patch, 
> 17958-add.txt, HBASE-17958-branch-1.patch, HBASE-17958-branch-1.patch, 
> HBASE-17958-branch-1.patch, HBASE-17958-branch-1.patch, HBASE-17958-v1.patch, 
> HBASE-17958-v2.patch, HBASE-17958-v3.patch, HBASE-17958-v4.patch, 
> HBASE-17958-v5.patch, HBASE-17958-v6.patch, HBASE-17958-v7.patch, 
> HBASE-17958-v7.patch
>
>
> {code}
> ScanQueryMatcher.MatchCode qcode = matcher.match(cell);
> qcode = optimize(qcode, cell);
> {code}
> The optimize method may change the MatchCode from SEEK_NEXT_COL/SEEK_NEXT_ROW 
> to SKIP, but it still passes the next cell to ScanQueryMatcher. This gives a 
> wrong result with some filters, e.g. ColumnCountGetFilter, which just counts 
> the number of columns. If the same column is passed to this filter again, the 
> count result will be wrong. So we should avoid passing the cell to 
> ScanQueryMatcher when we optimize SEEK to SKIP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17958) Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP

2018-04-22 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447109#comment-16447109
 ] 

Vikas Vishwakarma commented on HBASE-17958:
---

Sure [~lhofhansl] [~Apache9], I will look into it. Let's see if this helps 
narrow down the issue; then we can try the suggested solutions.

> Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP
> 
>
> Key: HBASE-17958
> URL: https://issues.apache.org/jira/browse/HBASE-17958
> Project: HBase
>  Issue Type: Bug
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 1.4.0, 2.0.0
>
> Attachments: 0001-add-one-ut-testWithColumnCountGetFilter.patch, 
> 17958-add.txt, HBASE-17958-branch-1.patch, HBASE-17958-branch-1.patch, 
> HBASE-17958-branch-1.patch, HBASE-17958-branch-1.patch, HBASE-17958-v1.patch, 
> HBASE-17958-v2.patch, HBASE-17958-v3.patch, HBASE-17958-v4.patch, 
> HBASE-17958-v5.patch, HBASE-17958-v6.patch, HBASE-17958-v7.patch, 
> HBASE-17958-v7.patch
>
>
> {code}
> ScanQueryMatcher.MatchCode qcode = matcher.match(cell);
> qcode = optimize(qcode, cell);
> {code}
> The optimize method may change the MatchCode from SEEK_NEXT_COL/SEEK_NEXT_ROW 
> to SKIP, but it still passes the next cell to ScanQueryMatcher. This gives a 
> wrong result with some filters, e.g. ColumnCountGetFilter, which just counts 
> the number of columns. If the same column is passed to this filter again, the 
> count result will be wrong. So we should avoid passing the cell to 
> ScanQueryMatcher when we optimize SEEK to SKIP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-16499) slow replication for small HBase clusters

2018-03-29 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419436#comment-16419436
 ] 

Vikas Vishwakarma commented on HBASE-16499:
---

Hi [~ashish singhi], I am re-assigning this to you. I saw little or no 
significant change in replication in some cases when increasing the ratio, so I 
was not sure about the change at that time; but that was some time back and I 
don't remember the exact ratios I used for testing.

> slow replication for small HBase clusters
> -
>
> Key: HBASE-16499
> URL: https://issues.apache.org/jira/browse/HBASE-16499
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Vikas Vishwakarma
>Assignee: Ashish Singhi
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.0
>
> Attachments: HBASE-16499.patch
>
>
> For small clusters 10-20 nodes we recently observed that replication is 
> progressing very slowly when we do bulk writes and there is lot of lag 
> accumulation on AgeOfLastShipped / SizeOfLogQueue. From the logs we observed 
> that the number of threads used for shipping wal edits in parallel comes from 
> the following equation in HBaseInterClusterReplicationEndpoint
> int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
>   replicationSinkMgr.getSinks().size());
> ... 
>   for (int i=0; i<n; i++) {
> entryLists.add(new ArrayList<Entry>(entries.size()/n+1)); <-- batch size
>   }
> ...
> for (int i=0; i<n; i++) {
>   ...
> // RuntimeExceptions encountered here bubble up and are handled 
> in ReplicationSource
> pool.submit(createReplicator(entryLists.get(i), i));  <-- 
> concurrency 
> futures++;
>   }
> }
> maxThreads is fixed & configurable and since we are taking min of the three 
> values n gets decided based replicationSinkMgr.getSinks().size() when we have 
> enough edits to replicate
> replicationSinkMgr.getSinks().size() is decided based on 
> int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
> where ratio is this.ratio = conf.getFloat("replication.source.ratio", 
> DEFAULT_REPLICATION_SOURCE_RATIO);
> Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10% so for small 
> clusters of size 10-20 RegionServers  the value we get for numSinks and hence 
> n is very small like 1 or 2. This substantially reduces the pool concurrency 
> used for shipping wal edits in parallel effectively slowing down replication 
> for small clusters and causing lot of lag accumulation in AgeOfLastShipped. 
> Sometimes it takes tens of hours to clear off the entire replication queue 
> even after the client has finished writing on the source side. 
> We are running tests by varying replication.source.ratio and have seen 
> multi-fold improvement in total replication time (will update the results 
> here). I wanted to propose here that we should increase the default value for 
> replication.source.ratio also so that we have sufficient concurrency even for 
> small clusters. We figured it out after lot of iterations and debugging so 
> probably slightly higher default will save the trouble. 
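To illustrate the arithmetic with example numbers (standalone Java, not HBase 
code; the concrete values are assumptions):
{code:java}
public class ReplicationConcurrencySketch {
  public static void main(String[] args) {
    int peerClusterSize = 15; // example: a small 15-node peer cluster
    double ratio = 0.1;       // replication.source.ratio default discussed above
    int maxThreads = 10;      // replication.source.maxthreads (example value)
    int entries = 5000;       // WAL entries queued for shipping (example value)

    // numSinks = ceil(slaveAddresses.size() * ratio)
    int numSinks = (int) Math.ceil(peerClusterSize * ratio);             // = 2
    // n = min(min(maxThreads, entries/100 + 1), numSinks)
    int n = Math.min(Math.min(maxThreads, entries / 100 + 1), numSinks); // = 2
    System.out.println("sinks=" + numSinks + ", shipping threads=" + n);
  }
}
{code}
So regardless of maxThreads or the batch size, a 10-20 node peer with the 0.1 
ratio caps the shipping concurrency at 1-2 threads.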



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-16499) slow replication for small HBase clusters

2018-03-29 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma reassigned HBASE-16499:
-

Assignee: Ashish Singhi  (was: Vikas Vishwakarma)

> slow replication for small HBase clusters
> -
>
> Key: HBASE-16499
> URL: https://issues.apache.org/jira/browse/HBASE-16499
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Vikas Vishwakarma
>Assignee: Ashish Singhi
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.0
>
> Attachments: HBASE-16499.patch
>
>
> For small clusters 10-20 nodes we recently observed that replication is 
> progressing very slowly when we do bulk writes and there is lot of lag 
> accumulation on AgeOfLastShipped / SizeOfLogQueue. From the logs we observed 
> that the number of threads used for shipping wal edits in parallel comes from 
> the following equation in HBaseInterClusterReplicationEndpoint
> int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
>   replicationSinkMgr.getSinks().size());
> ... 
>   for (int i=0; i<n; i++) {
> entryLists.add(new ArrayList<Entry>(entries.size()/n+1)); <-- batch size
>   }
> ...
> for (int i=0; i<n; i++) {
>   ...
> // RuntimeExceptions encountered here bubble up and are handled 
> in ReplicationSource
> pool.submit(createReplicator(entryLists.get(i), i));  <-- 
> concurrency 
> futures++;
>   }
> }
> maxThreads is fixed & configurable and since we are taking min of the three 
> values n gets decided based replicationSinkMgr.getSinks().size() when we have 
> enough edits to replicate
> replicationSinkMgr.getSinks().size() is decided based on 
> int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
> where ratio is this.ratio = conf.getFloat("replication.source.ratio", 
> DEFAULT_REPLICATION_SOURCE_RATIO);
> Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10% so for small 
> clusters of size 10-20 RegionServers  the value we get for numSinks and hence 
> n is very small like 1 or 2. This substantially reduces the pool concurrency 
> used for shipping wal edits in parallel effectively slowing down replication 
> for small clusters and causing lot of lag accumulation in AgeOfLastShipped. 
> Sometimes it takes tens of hours to clear off the entire replication queue 
> even after the client has finished writing on the source side. 
> We are running tests by varying replication.source.ratio and have seen 
> multi-fold improvement in total replication time (will update the results 
> here). I wanted to propose here that we should increase the default value for 
> replication.source.ratio also so that we have sufficient concurrency even for 
> small clusters. We figured it out after lot of iterations and debugging so 
> probably slightly higher default will save the trouble. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-19236) Tune client backoff trigger logic and backoff time in ExponentialClientBackoffPolicy

2017-11-09 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma reassigned HBASE-19236:
-

Assignee: Harshal Jain

> Tune client backoff trigger logic and backoff time in 
> ExponentialClientBackoffPolicy
> 
>
> Key: HBASE-19236
> URL: https://issues.apache.org/jira/browse/HBASE-19236
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vikas Vishwakarma
>Assignee: Harshal Jain
>
> We were evaluating the ExponentialClientBackoffPolicy (HBASE-12986) for 
> implementing basic service protection, usage quota allocation for few heavy 
> loading clients, especially M/R job based HBase clients. However it was 
> observed that ExponentialClientBackoffPolicy slows down the client 
> dramatically even when there is not much load on the HBase cluster. 
> A simple multithreaded write throughput client without 
> ExponentialClientBackoffPolicy enabled was able to complete in less than 5 mins 
> running on a 40 node cluster (~100G data). 
> The same client took ~10 hours to complete with 
> ExponentialClientBackoffPolicy enabled with the default setting of 
> DEFAULT_MAX_BACKOFF of 5 mins.
> Even after reducing DEFAULT_MAX_BACKOFF to 1 min, the client took ~2 
> hours to complete.
> Current ExponentialClientBackoffPolicy decides the backoff time based on 3 
> factors 
> // Factor in memstore load
> double percent = regionStats.getMemstoreLoadPercent() / 100.0;
> // Factor in heap occupancy
> float heapOccupancy = regionStats.getHeapOccupancyPercent() / 100.0f;
> // Factor in compaction pressure, 1.0 means heavy compaction pressure
> float compactionPressure = regionStats.getCompactionPressure() / 100.0f;
> However according to our test observations it looks like the client backoff 
> is getting triggered even when there is hardly any load on the cluster. We 
> need to evaluate the existing logic or probably implement a different policy 
> more customized and suitable to our needs. 
> One of the ideas is to base it directly on compactionQueueLength instead of 
> heap occupancy etc. Consider a case where there is high throughput write load 
> and the compaction is still able to keep up with the rate of memstore flushes, 
> compacting all the files being flushed at the same rate. In this case the 
> memstore can be full and heap occupancy can be high, but that is not 
> necessarily an indicator that the service is falling behind on processing the 
> client load and that there is a need to back off the client; we are just 
> utilizing the full write throughput of the system, which is good. However if 
> the compactionQueue starts building up and is continuously above a threshold 
> and increasing then that is a reliable indicator that the system is not able 
> to keep up with the input load and  is slowly falling behind. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19236) Tune client backoff trigger logic and backoff time in ExponentialClientBackoffPolicy

2017-11-09 Thread Vikas Vishwakarma (JIRA)
Vikas Vishwakarma created HBASE-19236:
-

 Summary: Tune client backoff trigger logic and backoff time in 
ExponentialClientBackoffPolicy
 Key: HBASE-19236
 URL: https://issues.apache.org/jira/browse/HBASE-19236
 Project: HBase
  Issue Type: Improvement
Reporter: Vikas Vishwakarma


We were evaluating the ExponentialClientBackoffPolicy (HBASE-12986) for 
implementing basic service protection, usage quota allocation for few heavy 
loading clients, especially M/R job based HBase clients. However it was 
observed that ExponentialClientBackoffPolicy slows down the client dramatically 
even when there is not much load on the HBase cluster. 
A simple multithreaded write throughput client without 
ExponentialClientBackoffPolicy enabled was able to complete in less than 5 mins 
running on a 40 node cluster (~100G data). 
The same client took ~10 hours to complete with ExponentialClientBackoffPolicy 
enabled with the default setting of DEFAULT_MAX_BACKOFF of 5 mins.
Even after reducing DEFAULT_MAX_BACKOFF to 1 min, the client took ~2 hours 
to complete.
Current ExponentialClientBackoffPolicy decides the backoff time based on 3 
factors 
// Factor in memstore load
double percent = regionStats.getMemstoreLoadPercent() / 100.0;

// Factor in heap occupancy
float heapOccupancy = regionStats.getHeapOccupancyPercent() / 100.0f;

// Factor in compaction pressure, 1.0 means heavy compaction pressure
float compactionPressure = regionStats.getCompactionPressure() / 100.0f;

However according to our test observations it looks like the client backoff is 
getting triggered even when there is hardly any load on the cluster. We need to 
evaluate the existing logic or probably implement a different policy more 
customized and suitable to our needs. 

One of the ideas is to base it directly on compactionQueueLength instead of 
heap occupancy etc. Consider a case where there is high throughput write load 
and the compaction is still able to keep up with the rate of memstore flushes, 
compacting all the files being flushed at the same rate. In this case the memstore 
can be full and heap occupancy can be high, but that is not necessarily an indicator 
that the service is falling behind on processing the client load and that there is a 
need to back off the client; we are just utilizing the full write throughput 
of the system, which is good. However if the compactionQueue starts building up 
and is continuously above a threshold and increasing then that is a reliable 
indicator that the system is not able to keep up with the input load and  is 
slowly falling behind. 
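A hypothetical sketch of that idea (class name, threshold, and scaling are 
illustrative assumptions, not an existing HBase policy):
{code:java}
public class CompactionQueueBackoffSketch {
  private static final long MAX_BACKOFF_MS = 60_000L; // assumed cap, e.g. 1 minute
  private static final int QUEUE_THRESHOLD = 50;      // assumed "falling behind" threshold

  // Back off only when the compaction queue is above the threshold, scaling the
  // wait with how far above it the queue is, up to the cap.
  static long backoffMillis(int compactionQueueLength) {
    if (compactionQueueLength <= QUEUE_THRESHOLD) {
      return 0L; // compactions are keeping up: no client backoff
    }
    double excess = (compactionQueueLength - QUEUE_THRESHOLD) / (double) QUEUE_THRESHOLD;
    double scaled = (Math.exp(Math.min(excess, 1.0)) - 1.0) / (Math.E - 1.0);
    return (long) (scaled * MAX_BACKOFF_MS);
  }

  public static void main(String[] args) {
    System.out.println(backoffMillis(10));  // 0: queue below threshold
    System.out.println(backoffMillis(75));  // partial backoff
    System.out.println(backoffMillis(200)); // reaches the MAX_BACKOFF_MS cap
  }
}
{code}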



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19094) NPE in RSGroupStartupWorker.waitForGroupTableOnline during master startup

2017-10-27 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222012#comment-16222012
 ] 

Vikas Vishwakarma commented on HBASE-19094:
---

lgtm +1

> NPE in RSGroupStartupWorker.waitForGroupTableOnline during master startup
> -
>
> Key: HBASE-19094
> URL: https://issues.apache.org/jira/browse/HBASE-19094
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Abhishek Singh Chouhan
>Assignee: Abhishek Singh Chouhan
>Priority: Minor
> Attachments: HBASE-19094.branch-1.001.patch, 
> HBASE-19094.master.001.patch, HBASE-19094.master.001.patch
>
>
> {noformat}
> rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker - Caught exception while 
> verifying group region
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getClient(ConnectionManager.java:1638)
> at 
> org.apache.hadoop.hbase.client.ConnectionUtils$2.getClient(ConnectionUtils.java:167)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker$1.visit(RSGroupInfoManagerImpl.java:646)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:638)
> at 
> org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:159)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker.waitForGroupTableOnline(RSGroupInfoManagerImpl.java:661)
> at 
> org.apache.hadoop.hbase.rsgroup.RSGroupInfoManagerImpl$RSGroupStartupWorker.run(RSGroupInfoManagerImpl.java:582)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19024) provide a configurable option to hsync WAL edits to the disk for better durability

2017-10-18 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209147#comment-16209147
 ] 

Vikas Vishwakarma commented on HBASE-19024:
---

[~anoop.hbase] we carried out a few combinations of tests since this was part of 
a larger story to enable WAL on SSD. So we looked at 
* hflush HDD vs SSD
* hsync HDD vs SSD
* HDD hflush vs SSD hsync
* HDD hflush vs hsync 

Each test was carried out for both small batches of a few 100 bytes and large 
batches of 1 MB and 10 MB.

We used a multithreaded native HBase write loader for the tests that does batch 
puts of 100 bytes, 1 MB, 10 MB using random data. Latency is calculated for 
each batch put as well as the total time taken for the loader to complete a few 
million rows. As per our observations 
* between hflush and hsync there is a 10-15% degradation when using hsync instead 
of hflush on HDD

SSD results are slightly controversial and not in line with conventional belief, 
and we had a long discussion and experimentation phase on them. They will also 
depend on the type and grade of SSD being used (value or low grade SSD vs 
enterprise SSD) and other factors, so I am not posting those results here as 
this jira is anyway independent of SSD :)





> provide a configurable option to hsync WAL edits to the disk for better 
> durability
> --
>
> Key: HBASE-19024
> URL: https://issues.apache.org/jira/browse/HBASE-19024
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
> Environment: 
>Reporter: Vikas Vishwakarma
>
> At present we do not have an option to hsync WAL edits to the disk for better 
> durability. In our local tests we see a 10-15% latency impact of using hsync 
> instead of hflush, which is not very high.  
> We should have a configurable option to hsync WAL edits instead of just 
> sync/hflush, which will call the corresponding API on the Hadoop side. 
> Currently HBase handles both SYNC_WAL and FSYNC_WAL the same, calling 
> FSDataOutputStream sync/hflush on the Hadoop side. This can be modified to 
> let FSYNC_WAL call hsync on the Hadoop side instead of sync/hflush. We can 
> keep the default as sync, matching the current behavior, and hsync can be 
> enabled based on explicit configuration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19024) provide a configurable option to hsync WAL edits to the disk for better durability

2017-10-17 Thread Vikas Vishwakarma (JIRA)
Vikas Vishwakarma created HBASE-19024:
-

 Summary: provide a configurable option to hsync WAL edits to the 
disk for better durability
 Key: HBASE-19024
 URL: https://issues.apache.org/jira/browse/HBASE-19024
 Project: HBase
  Issue Type: Improvement
 Environment: 

Reporter: Vikas Vishwakarma


At present we do not have an option to hsync WAL edits to the disk for better 
durability. In our local tests we see a 10-15% latency impact of using hsync 
instead of hflush, which is not very high.  
We should have a configurable option to hsync WAL edits instead of just 
sync/hflush, which will call the corresponding API on the Hadoop side. Currently 
HBase handles both SYNC_WAL and FSYNC_WAL the same, calling 
FSDataOutputStream sync/hflush on the Hadoop side. This can be modified to let 
FSYNC_WAL call hsync on the Hadoop side instead of sync/hflush. We can keep the 
default as sync, matching the current behavior, and hsync can be enabled based on 
explicit configuration.
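A minimal sketch of the proposed mapping (a local enum stands in for the real 
Durability setting; only FSDataOutputStream.hflush()/hsync() are actual APIs here):
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;

final class WalSyncSketch {
  enum WalDurability { SYNC_WAL, FSYNC_WAL } // illustrative, not the HBase Durability enum

  // SYNC_WAL keeps today's hflush() behaviour (flush to the datanodes);
  // FSYNC_WAL calls hsync() so the edits are forced to disk for stronger durability.
  static void sync(FSDataOutputStream out, WalDurability durability) throws IOException {
    if (durability == WalDurability.FSYNC_WAL) {
      out.hsync();
    } else {
      out.hflush();
    }
  }
}
{code}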



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18786) FileNotFoundException should not be silently handled for primary region replicas

2017-09-19 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172770#comment-16172770
 ] 

Vikas Vishwakarma commented on HBASE-18786:
---

So we have removed the property hbase.hregion.unassign.for.fnfe and the 
corresponding handleFileNotFound related code. Looks good. +1 [~apurtell]

> FileNotFoundException should not be silently handled for primary region 
> replicas
> 
>
> Key: HBASE-18786
> URL: https://issues.apache.org/jira/browse/HBASE-18786
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Ashu Pachauri
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2, 1.5.0
>
> Attachments: HBASE-18786-branch-1.patch, HBASE-18786.patch
>
>
> This is a follow up for HBASE-18186.
> FileNotFoundException while scanning from a primary region replica can be 
> indicative of a more severe problem. Handling them silently can cause many 
> underlying issues go undetected. We should either
> 1. Hard fail the regionserver if there is a FNFE on a primary region replica, 
> OR
> 2. Report these exceptions as some region / server level metric so that these 
> can be proactively investigated.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-29 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146527#comment-16146527
 ] 

Vikas Vishwakarma commented on HBASE-18633:
---

[~apurtell] pushed it just now

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.4.0, 1.3.1, 2.0.0-alpha-3
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch, HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent Large batch operation 
> detected warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes on this test env, but we 
> start seeing these warnings after some runtime. Maybe it is caused by 
> some error/retry scenario. It could also be related to Phoenix index updates, 
> based on surrounding activity in the logs. We need to add more info, like the 
> table/region name and anything else that will enable debugging the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx
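A sketch of the kind of extra context being proposed (parameter names and the 
exact wording are illustrative, not the committed log format):
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class LargeBatchWarningSketch {
  private static final Logger LOG = LoggerFactory.getLogger(LargeBatchWarningSketch.class);

  // Same threshold check as HBASE-18023, but the warning also carries the table
  // and region so the offending workload can be traced from the server log alone.
  static void warnIfLargeBatch(int rowCount, int warnThreshold, String client,
      String table, String region) {
    if (rowCount > warnThreshold) {
      LOG.warn("Large batch operation detected (greater than {}) (HBASE-18023). "
          + "Requested Number of Rows: {} Client: {} Table: {} Region: {}",
          warnThreshold, rowCount, client, table, region);
    }
  }
}
{code}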



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-29 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-18633:
--
Fix Version/s: 1.4.0

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.4.0, 1.3.1, 2.0.0-alpha-3
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch, HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent Large batch operation 
> detected warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes on this test env, but we 
> start seeing these warnings after some runtime. Maybe it is caused due to 
> some error / retry scenario. Could also be related to Phoenix index updates 
> based on surrounding activity in the logs. Need to add more info like 
> table/region name and anything else that will enable debugging the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-29 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-18633:
--
Affects Version/s: 1.4.0

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.4.0, 1.3.1, 2.0.0-alpha-3
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch, HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent Large batch operation 
> detected warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes on this test env, but we 
> start seeing these warnings after some runtime. Maybe it is caused due to 
> some error / retry scenario. Could also be related to Phoenix index updates 
> based on surrounding activity in the logs. Need to add more info like 
> table/region name and anything else that will enable debugging the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-28 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-18633:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to main, branch-1, branch-2

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.3.1, 2.0.0-alpha-3
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.5.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch, HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent "Large batch operation 
> detected" warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes in this test env, but we 
> start seeing these warnings after some runtime. They may be caused by some 
> error/retry scenario; based on surrounding activity in the logs, they could 
> also be related to Phoenix index updates. We need to add more info, such as 
> the table/region name and anything else that will help debug the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-28 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-18633:
--
Affects Version/s: 2.0.0-alpha-3

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.3.1, 2.0.0-alpha-3
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.5.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch, HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent "Large batch operation 
> detected" warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes in this test env, but we 
> start seeing these warnings after some runtime. They may be caused by some 
> error/retry scenario; based on surrounding activity in the logs, they could 
> also be related to Phoenix index updates. We need to add more info, such as 
> the table/region name and anything else that will help debug the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-28 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-18633:
--
Fix Version/s: 2.0.0-alpha-3

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.3.1, 2.0.0-alpha-3
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.5.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch, HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent "Large batch operation 
> detected" warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes in this test env, but we 
> start seeing these warnings after some runtime. They may be caused by some 
> error/retry scenario; based on surrounding activity in the logs, they could 
> also be related to Phoenix index updates. We need to add more info, such as 
> the table/region name and anything else that will help debug the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-28 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143551#comment-16143551
 ] 

Vikas Vishwakarma commented on HBASE-18633:
---

The latest QA run looks OK; will commit the patch.

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.3.1
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.5.0
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch, HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent "Large batch operation 
> detected" warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes in this test env, but we 
> start seeing these warnings after some runtime. They may be caused by some 
> error/retry scenario; based on surrounding activity in the logs, they could 
> also be related to Phoenix index updates. We need to add more info, such as 
> the table/region name and anything else that will help debug the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-23 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-18633:
--
Attachment: HBASE-18633.master.001.patch

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.3.1
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.5.0
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch, HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent "Large batch operation 
> detected" warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes in this test env, but we 
> start seeing these warnings after some runtime. They may be caused by some 
> error/retry scenario; based on surrounding activity in the logs, they could 
> also be related to Phoenix index updates. We need to add more info, such as 
> the table/region name and anything else that will help debug the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-23 Thread Vikas Vishwakarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139470#comment-16139470
 ] 

Vikas Vishwakarma commented on HBASE-18633:
---

Thanks for the review, [~tedyu] [~abhishek.chouhan]!

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.3.1
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.5.0
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent "Large batch operation 
> detected" warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes in this test env, but we 
> start seeing these warnings after some runtime. They may be caused by some 
> error/retry scenario; based on surrounding activity in the logs, they could 
> also be related to Phoenix index updates. We need to add more info, such as 
> the table/region name and anything else that will help debug the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18633) Add more info to understand the source/scenario of large batch requests exceeding threshold

2017-08-23 Thread Vikas Vishwakarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Vishwakarma updated HBASE-18633:
--
Attachment: HBASE-18633.branch-1.001.patch

> Add more info to understand the source/scenario of large batch requests 
> exceeding threshold
> ---
>
> Key: HBASE-18633
> URL: https://issues.apache.org/jira/browse/HBASE-18633
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1.3.1
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 3.0.0, 1.5.0
>
> Attachments: HBASE-18633.branch-1.001.patch, 
> HBASE-18633.master.001.patch
>
>
> In our controlled test env, we are seeing frequent "Large batch operation 
> detected" warnings (as implemented in HBASE-18023). 
> We are not running any client with large batch sizes in this test env, but we 
> start seeing these warnings after some runtime. They may be caused by some 
> error/retry scenario; based on surrounding activity in the logs, they could 
> also be related to Phoenix index updates. We need to add more info, such as 
> the table/region name and anything else that will help debug the source or 
> the scenario in which these warnings occur. 
> 2017-08-12 03:40:33,919 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7108 Client: xxx
> 2017-08-12 03:40:34,476 WARN  [7,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7096 Client: xxx
> 2017-08-12 03:40:34,483 WARN  [4,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7091 Client: xxx
> 2017-08-12 03:40:35,728 WARN  [3,queue=0,port=16020] 
> regionserver.RSRpcServices - Large batch operation detected (greater than 
> 5000) (HBASE-18023). Requested Number of Rows: 7102 Client: xxx



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

