[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-04-19 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248384#comment-15248384
 ] 

Andrew Purtell commented on HBASE-15480:


bq. (Ted) Can you post improvement in performance with the patch ?
bq. (Walter) illustration of poor performance before change

I think we need before-and-after measurements to assess the impact of the 
change. I also found the screenshot grabbed from a profile fairly 
context-free. If you can use a respected microbenchmark harness like JMH to 
get before-and-after numbers, and those numbers are favorable, I will commit 
this.

> Bloom Filter check needs to be more efficient for array
> ---
>
> Key: HBASE-15480
> URL: https://issues.apache.org/jira/browse/HBASE-15480
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>Affects Versions: 1.0.3
>Reporter: Walter Koetke
>Assignee: Walter Koetke
> Fix For: 1.0.4
>
> Attachments: BloomFilterCheckOneByOne.tiff, 
> HBASE-15480-branch-1.0.patch, HBASE-15480.patch
>
>
> It is currently inefficient to do lots of bloom filter checks. Each check has 
> overhead like going to the block cache to retrieve the block and recording 
> metrics. It would be good to have one bloom filter check api that does a 
> bunch of checks without so much block retrieval and metrics updates.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-04-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227446#comment-15227446
 ] 

Hadoop QA commented on HBASE-15480:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 6m 23s | master passed |
| +1 | compile | 1m 42s | master passed with JDK v1.8.0 |
| +1 | compile | 3m 21s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 5m 41s | master passed |
| +1 | mvneclipse | 0m 25s | master passed |
| +1 | findbugs | 2m 54s | master passed |
| +1 | javadoc | 1m 6s | master passed with JDK v1.8.0 |
| +1 | javadoc | 1m 4s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 1m 13s | the patch passed |
| +1 | compile | 1m 34s | the patch passed with JDK v1.8.0 |
| +1 | javac | 1m 34s | the patch passed |
| +1 | compile | 0m 49s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 49s | the patch passed |
| -1 | checkstyle | 5m 46s | hbase-server: patch generated 4 new + 2 unchanged - 0 fixed = 6 total (was 2) |
| +1 | mvneclipse | 0m 25s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 39m 43s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 3m 58s | the patch passed |
| +1 | javadoc | 1m 18s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 1m 16s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 194m 41s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 23s | Patch does not generate ASF License warnings. |
| | | 274m 26s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
| | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 |
| Timed out junit tests | org.apache.hadoop.hbase.security.access.TestNamespaceCommands |
| | org.apache.hadoop.hbase.snapshot.TestMobFlushSnapshotFromClient |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12797151/HBASE-15480.patch |
| JIRA Issue | HBASE-15480 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux penates.apache.org 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 

[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-04-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227139#comment-15227139
 ] 

Ted Yu commented on HBASE-15480:


{code}
   * Check if the specified keys are contained in the bloom filter.
...
+  public BitSet contains(byte[][] key, int[] keyOffset, int[] keyLength, ByteBuff bloom) {
{code}
Since the method checks multiple keys, shouldn't the parameter names be 
plural?

By the way, I can't find a caller for the new method.
Did I miss something?
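
For illustration, here is a minimal sketch of a batched check with pluralized parameter names. It is not the code from the patch: the class BatchBloomSketch, its hash function, and the choice to hold the bits in a field (rather than take a ByteBuff bloom argument as the quoted signature does) are all assumptions made to keep the example self-contained.

```java
import java.util.BitSet;

/** Toy Bloom filter with a batched check; illustrative only, not HBase code. */
final class BatchBloomSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BatchBloomSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** FNV-1a style hash over key[offset, offset + length), perturbed by a seed. */
    private static int hash(byte[] key, int offset, int length, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (int i = offset; i < offset + length; i++) {
            h ^= key[i] & 0xFF;
            h *= 0x01000193;
        }
        return h;
    }

    /** Index of the i-th probe bit, derived by double hashing. */
    private int bitIndex(byte[] key, int offset, int length, int i) {
        int h1 = hash(key, offset, length, 0);
        int h2 = hash(key, offset, length, 0x9E3779B9);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(byte[] key, int offset, int length) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(bitIndex(key, offset, length, i));
        }
    }

    /**
     * Checks every key in one call. Bit k of the result is set when keys[k]
     * might be in the filter and clear when it is definitely absent.
     */
    BitSet contains(byte[][] keys, int[] keyOffsets, int[] keyLengths) {
        BitSet result = new BitSet(keys.length);
        for (int k = 0; k < keys.length; k++) {
            boolean maybePresent = true;
            for (int i = 0; i < numHashes && maybePresent; i++) {
                maybePresent = bits.get(bitIndex(keys[k], keyOffsets[k], keyLengths[k], i));
            }
            result.set(k, maybePresent);
        }
        return result;
    }
}
```

In this sketch the per-call overhead (in HBase, block retrieval and metrics updates, per the issue description) would be paid once per batch rather than once per key; the toy class has no such overhead to show, so it only demonstrates the shape of the API.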



[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-04-01 Thread John Leach (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222168#comment-15222168
 ] 

John Leach commented on HBASE-15480:


Yes. Two initial use cases would be the following:

Snapshot Isolation: Use the bloom filter to check a batch of keys for an 
existing record (conflict detection).

Bloom Join: Apply the bloom filters to restrict elements from the shuffle.
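
As a rough sketch of the Bloom join idea, the example below builds a filter from one side's keys and uses it to drop keys from the other side before they would be shuffled. Everything here is hypothetical: BloomJoinSketch, its String keys, and its hash scheme are assumptions for illustration and do not come from HBase or from the patch.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy Bloom join pre-filter; illustrative only. */
final class BloomJoinSketch {
    private static final int NUM_BITS = 1 << 12;
    private static final int NUM_HASHES = 3;

    /** Index of the i-th probe bit for a key, derived by double hashing. */
    private static int bitIndex(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1; // forced odd
        return Math.floorMod(h1 + i * h2, NUM_BITS);
    }

    /** Builds a filter over the keys of the build (usually smaller) side. */
    static BitSet buildFilter(List<String> buildSideKeys) {
        BitSet filter = new BitSet(NUM_BITS);
        for (String key : buildSideKeys) {
            for (int i = 0; i < NUM_HASHES; i++) {
                filter.set(bitIndex(key, i));
            }
        }
        return filter;
    }

    /** False means the key is definitely not on the build side. */
    static boolean mightContain(BitSet filter, String key) {
        for (int i = 0; i < NUM_HASHES; i++) {
            if (!filter.get(bitIndex(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** Keeps only probe-side keys that could match; the rest skip the shuffle. */
    static List<String> keepPossibleMatches(BitSet filter, List<String> probeSideKeys) {
        List<String> kept = new ArrayList<>();
        for (String key : probeSideKeys) {
            if (mightContain(filter, key)) {
                kept.add(key);
            }
        }
        return kept;
    }
}
```

The kept list may still contain false positives, so the join itself must still compare keys; the filter only reduces how many elements reach that step.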









[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-03-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221152#comment-15221152
 ] 

stack commented on HBASE-15480:
---

Group check sounds good. Would the new API be used by upper layers? Would the 
caller look at the returned bitset and skip out on the keys that are positive?
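
One way an upper layer might consume the returned bitset is sketched below. It assumes a set bit means "possibly present", which the thread does not spell out, and the names BloomResultConsumer, confirmPresent, and expensiveLookup are hypothetical. Because a Bloom filter has no false negatives, keys whose bit is clear can skip the expensive lookup entirely.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.function.Predicate;

/** Caller-side handling of a batched Bloom filter result; illustrative only. */
final class BloomResultConsumer {
    /**
     * Returns the keys confirmed present by expensiveLookup. The lookup runs
     * only for keys whose bit is set in maybePresent; clear bits are skipped.
     */
    static <K> List<K> confirmPresent(List<K> keys, BitSet maybePresent,
                                      Predicate<K> expensiveLookup) {
        List<K> confirmed = new ArrayList<>();
        for (int i = maybePresent.nextSetBit(0);
             i >= 0 && i < keys.size();
             i = maybePresent.nextSetBit(i + 1)) {
            K key = keys.get(i);
            if (expensiveLookup.test(key)) {
                confirmed.add(key); // true positive
            }
            // Otherwise it was a Bloom filter false positive; drop it.
        }
        return confirmed;
    }
}
```

For a conflict-detection caller, the confirmed list would be the keys that already have a record; a caller with the opposite need could instead iterate the clear bits with BitSet.nextClearBit.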



[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-03-31 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220144#comment-15220144
 ] 

Ted Yu commented on HBASE-15480:


Can you clarify how much time is spent in passesGeneralBloomFilter with the 
change?

Please also upload a patch for the master branch.

Thanks



[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-03-31 Thread Walter Koetke (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15219987#comment-15219987
 ] 

Walter Koetke commented on HBASE-15480:
---

To answer Ted's question about the performance improvement, see the attached 
BloomFilterCheckOneByOne screenshot, which shows how much time was being spent 
here. All of that time went away for us with the change in place.



[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-03-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216368#comment-15216368
 ] 

Hadoop QA commented on HBASE-15480:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 5m 45s | branch-1.0 passed |
| +1 | compile | 0m 58s | branch-1.0 passed with JDK v1.8.0 |
| +1 | compile | 0m 42s | branch-1.0 passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 30s | branch-1.0 passed |
| +1 | mvneclipse | 0m 23s | branch-1.0 passed |
| -1 | findbugs | 2m 22s | hbase-server in branch-1.0 has 59 extant Findbugs warnings. |
| -1 | javadoc | 1m 4s | hbase-server in branch-1.0 failed with JDK v1.8.0. |
| +1 | javadoc | 0m 50s | branch-1.0 passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 1m 1s | the patch passed with JDK v1.8.0 |
| +1 | javac | 1m 1s | the patch passed |
| +1 | compile | 0m 48s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 48s | the patch passed |
| +1 | checkstyle | 0m 22s | the patch passed |
| +1 | mvneclipse | 0m 19s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 17m 43s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 56s | the patch passed |
| -1 | javadoc | 0m 48s | hbase-server in the patch failed with JDK v1.8.0. |
| +1 | javadoc | 0m 45s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 108m 38s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 32s | Patch does not generate ASF License warnings. |
| | | 148m 27s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.mapreduce.TestImportExport |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795852/HBASE-15480-branch-1.0.patch |
| JIRA Issue | HBASE-15480 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf910.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git 

[jira] [Commented] (HBASE-15480) Bloom Filter check needs to be more efficient for array

2016-03-29 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216155#comment-15216155
 ] 

Ted Yu commented on HBASE-15480:


Can you post the performance improvement with the patch?
