[jira] [Assigned] (HBASE-18033) Update supplemental models for new deps in Hadoop trunk

2017-05-11 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reassigned HBASE-18033:
---

Assignee: (was: Andrew Wang)

> Update supplemental models for new deps in Hadoop trunk
> ---
>
> Key: HBASE-18033
> URL: https://issues.apache.org/jira/browse/HBASE-18033
> Project: HBase
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Andrew Wang
>
> Did a test compile of HBase against the latest Hadoop trunk; there are some 
> new dependencies that need to be added to the supplemental-models.xml file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-18033) Update supplemental models for new deps in Hadoop trunk

2017-05-11 Thread Andrew Wang (JIRA)
Andrew Wang created HBASE-18033:
---

 Summary: Update supplemental models for new deps in Hadoop trunk
 Key: HBASE-18033
 URL: https://issues.apache.org/jira/browse/HBASE-18033
 Project: HBase
  Issue Type: Bug
  Components: dependencies
Reporter: Andrew Wang
Assignee: Andrew Wang


Did a test compile of HBase against the latest Hadoop trunk; there are some new 
dependencies that need to be added to the supplemental-models.xml file.





[jira] [Commented] (HBASE-17847) update documentation to include positions on recent Hadoop releases

2017-03-29 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947709#comment-15947709
 ] 

Andrew Wang commented on HBASE-17847:
-

Yea, looks fine to me. There are some known issues with the 3.0.0 alpha1 and 
alpha2, e.g. not working with secure or HA clusters. Feel free to upgrade to X 
if that's more appropriate. Hoping that alpha3 / beta1 are a lot closer to 
ready.

> update documentation to include positions on recent Hadoop releases
> ---
>
> Key: HBASE-17847
> URL: https://issues.apache.org/jira/browse/HBASE-17847
> Project: HBase
>  Issue Type: Task
>  Components: community, documentation
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-17847.0.patch
>
>
> [per dev@hbase discussion on how to handle the most recent round of Hadoop 
> releases|https://lists.apache.org/thread.html/bb5591e3260f4e70d40c1f9cb30aa98d018fc145ab3aef8b3f6a4f5d@%3Cdev.hbase.apache.org%3E],
>  get our docs updated.





[jira] [Updated] (HBASE-14588) Stop accessing test resources from within src folder

2015-10-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14588:

Attachment: hbase-14588.branch-1.1.001.patch

Here's my patch for branch-1.0 and branch-1.1 if that's useful. I noticed stack 
committed the original version all the way down, but branch-1.{0,1} has some 
extra stuff still floating in the src/test/data.

I tried running "mvn apache-rat:check" before and after to verify the exclude 
changes, but it had my CPU at 100% for 5 mins with no additional output. Dunno 
what's up with that.

> Stop accessing test resources from within src folder
> 
>
> Key: HBASE-14588
> URL: https://issues.apache.org/jira/browse/HBASE-14588
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: hbase-14588.001.patch, hbase-14588.001.patch, 
> hbase-14588.002.patch, hbase-14588.branch-1.1.001.patch
>
>
> A few tests in hbase-server reach into the src/test/data folder to get test 
> resources, which is naughty since tests are supposed to only operate within 
> the target/ folder. It's better to put these into src/test/resources and let 
> them be automatically copied into target/ via the resources plugin, like 
> other test resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14586) Use a maven profile to run Jacoco analysis

2015-10-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14586:

Summary: Use a maven profile to run Jacoco analysis  (was: Remove 
extraneous ${argLine} references in pom)

> Use a maven profile to run Jacoco analysis
> --
>
> Key: HBASE-14586
> URL: https://issues.apache.org/jira/browse/HBASE-14586
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hbase-14586.001.patch, hbase-14586.001.patch, 
> hbase-14586.002.patch
>
>
> The pom.xml has a line like this for the Surefire argLine, which has an extra 
> ${argLine} reference. Recommend changes like this:
> {noformat}
> -${hbase-surefire.argLine} ${argLine}
> +${hbase-surefire.argLine}
> {noformat}





[jira] [Updated] (HBASE-14588) Stop accessing test resources from within src folder

2015-10-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14588:

Status: Patch Available  (was: Open)

> Stop accessing test resources from within src folder
> 
>
> Key: HBASE-14588
> URL: https://issues.apache.org/jira/browse/HBASE-14588
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hbase-14588.001.patch, hbase-14588.001.patch, 
> hbase-14588.002.patch
>
>
> A few tests in hbase-server reach into the src/test/data folder to get test 
> resources, which is naughty since tests are supposed to only operate within 
> the target/ folder. It's better to put these into src/test/resources and let 
> them be automatically copied into target/ via the resources plugin, like 
> other test resources.





[jira] [Commented] (HBASE-14586) Remove extraneous ${argLine} references in pom

2015-10-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953827#comment-14953827
 ] 

Andrew Wang commented on HBASE-14586:
-

I'll also update the JIRA summary to reflect 002 if we like the maven profile 
approach.

> Remove extraneous ${argLine} references in pom
> --
>
> Key: HBASE-14586
> URL: https://issues.apache.org/jira/browse/HBASE-14586
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hbase-14586.001.patch, hbase-14586.001.patch, 
> hbase-14586.002.patch
>
>
> The pom.xml has a line like this for the Surefire argLine, which has an extra 
> ${argLine} reference. Recommend changes like this:
> {noformat}
> -${hbase-surefire.argLine} ${argLine}
> +${hbase-surefire.argLine}
> {noformat}





[jira] [Updated] (HBASE-14586) Remove extraneous ${argLine} references in pom

2015-10-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14586:

Attachment: hbase-14586.002.patch

Here's a patch which moves jacoco execution under a new "jacoco" profile which 
is off by default. I also removed the hbase.skip-jacoco property since I think 
the same functionality is handled by the profile.

I tested with Andy's mvn line, I think it's working:

{noformat}
-> % mvn clean package -Dtest=TestCheckTestClasses -Pjacoco
...
[INFO] --- jacoco-maven-plugin:0.7.5.201505241946:report (report) @ hbase-server ---
[INFO] Analyzed bundle 'Apache HBase - Server' with 1595 classes
{noformat}

Ran the same thing with "-Pjacoco,os.windows" and saw that same output, so I 
think the cygwin argLine is also being handled correctly.

I also saw this warning with os.windows, so removed the duplicate 
preferIPv4Stack definition in os.windows (it's already in the argLine):

{noformat}
[WARNING] The system property java.net.preferIPv4Stack is configured twice! The property appears in <argLine/> and any of <systemPropertyVariables/>, <systemProperties/> or user property.
{noformat}

If we're okay with this approach, I assume an HBase committer can take care of 
updating any code coverage jobs / precommit as appropriate, as well as related 
documentation. If I can get some pointers, I'm happy to provide patches to help 
speed it along.

> Remove extraneous ${argLine} references in pom
> --
>
> Key: HBASE-14586
> URL: https://issues.apache.org/jira/browse/HBASE-14586
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hbase-14586.001.patch, hbase-14586.001.patch, 
> hbase-14586.002.patch
>
>
> The pom.xml has a line like this for the Surefire argLine, which has an extra 
> ${argLine} reference. Recommend changes like this:
> {noformat}
> -${hbase-surefire.argLine} ${argLine}
> +${hbase-surefire.argLine}
> {noformat}





[jira] [Updated] (HBASE-14586) Remove extraneous ${argLine} references in pom

2015-10-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14586:

Status: Patch Available  (was: Open)

> Remove extraneous ${argLine} references in pom
> --
>
> Key: HBASE-14586
> URL: https://issues.apache.org/jira/browse/HBASE-14586
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hbase-14586.001.patch, hbase-14586.001.patch, 
> hbase-14586.002.patch
>
>
> The pom.xml has a line like this for the Surefire argLine, which has an extra 
> ${argLine} reference. Recommend changes like this:
> {noformat}
> -${hbase-surefire.argLine} ${argLine}
> +${hbase-surefire.argLine}
> {noformat}





[jira] [Commented] (HBASE-14587) Attach a test-sources.jar for hbase-server

2015-10-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953762#comment-14953762
 ] 

Andrew Wang commented on HBASE-14587:
-

Thanks stack!

> Attach a test-sources.jar for hbase-server
> --
>
> Key: HBASE-14587
> URL: https://issues.apache.org/jira/browse/HBASE-14587
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: hbase-14587.001.patch
>
>
> It'd be nice to attach a test-sources jar alongside the others as part of the 
> build, to provide test resources.





[jira] [Commented] (HBASE-14587) Attach a test-sources.jar for hbase-server

2015-10-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953712#comment-14953712
 ] 

Andrew Wang commented on HBASE-14587:
-

Does HBase precommit show the javac warnings? I don't know how a pom change 
could cause this. The unit test failures also seem unrelated for the same 
reason; I ran TestShell successfully locally with the patch applied. Jenkins 
had this, which looks like the test was killed for some reason:

{noformat}
Running org.apache.hadoop.hbase.client.TestShell
Killed

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 1

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache HBase .. SUCCESS [3.039s]
[INFO] Apache HBase - Checkstyle . SUCCESS [0.573s]
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/test-framework/dev-support/test-patch.sh: line 838: 27744 Killed  $MVN clean test -Dsurefire.rerunFailingTestsCount=2 -P runAllTests -D${PROJECT_NAME}PatchProcess
We're ok: there is no zombie test
{noformat}

> Attach a test-sources.jar for hbase-server
> --
>
> Key: HBASE-14587
> URL: https://issues.apache.org/jira/browse/HBASE-14587
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hbase-14587.001.patch
>
>
> It'd be nice to attach a test-sources jar alongside the others as part of the 
> build, to provide test resources.





[jira] [Commented] (HBASE-14586) Remove extraneous ${argLine} references in pom

2015-10-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953675#comment-14953675
 ] 

Andrew Wang commented on HBASE-14586:
-

Thanks for the reviews, all. To provide a little more color about the issue I'm 
trying to fix: invoking the surefire:test goal fails with this error:

{noformat}
-> % mvn surefire:test -f hbase-server/pom.xml -Dtest=TestRecoveredEdits
...
---
 T E S T S
---
Error: Could not find or load main class ${argLine}

Results :

Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
{noformat}

Maybe we could have a maven profile that enables jacoco? Or perhaps there is a 
better maven-y way of doing this. I'll experiment and update.
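
One plausible reading of the error above is that a ${...} placeholder with no defined property is passed through literally, so the forked JVM sees the text ${argLine} as an argument. The sketch below is not Maven's implementation; the class name PlaceholderDemo, the interpolate helper, and the sample value -Xmx2g are hypothetical and only illustrate that pass-through behavior.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of placeholder substitution; not Maven code. */
final class PlaceholderDemo {
  private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

  private PlaceholderDemo() {}

  /** Replaces ${name} with its value, leaving unknown placeholders untouched. */
  static String interpolate(String template, Map<String, String> props) {
    Matcher m = PLACEHOLDER.matcher(template);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String value = props.get(m.group(1));
      // An undefined property keeps its literal ${...} form.
      String replacement = (value != null) ? value : m.group(0);
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

With only hbase-surefire.argLine defined, the template from the issue description keeps a literal ${argLine} at the end, which matches the "Could not find or load main class ${argLine}" symptom. When jacoco runs, it would normally define argLine itself, which is presumably why the problem only shows up when invoking surefire:test directly.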

> Remove extraneous ${argLine} references in pom
> --
>
> Key: HBASE-14586
> URL: https://issues.apache.org/jira/browse/HBASE-14586
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hbase-14586.001.patch, hbase-14586.001.patch
>
>
> The pom.xml has a line like this for the Surefire argLine, which has an extra 
> ${argLine} reference. Recommend changes like this:
> {noformat}
> -${hbase-surefire.argLine} ${argLine}
> +${hbase-surefire.argLine}
> {noformat}





[jira] [Updated] (HBASE-14588) Stop accessing test resources from within src folder

2015-10-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14588:

Attachment: hbase-14588.002.patch

I think I generated the 001 patch incorrectly, forgot to do a binary diff. 
Here's a new patch, ran TestRecoveredEdits okay after applying it. Let's see if 
Jenkins takes it.

> Stop accessing test resources from within src folder
> 
>
> Key: HBASE-14588
> URL: https://issues.apache.org/jira/browse/HBASE-14588
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hbase-14588.001.patch, hbase-14588.001.patch, 
> hbase-14588.002.patch
>
>
> A few tests in hbase-server reach into the src/test/data folder to get test 
> resources, which is naughty since tests are supposed to only operate within 
> the target/ folder. It's better to put these into src/test/resources and let 
> them be automatically copied into target/ via the resources plugin, like 
> other test resources.





[jira] [Updated] (HBASE-14588) Stop accessing test resources from within src folder

2015-10-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14588:

Attachment: hbase-14588.001.patch

Patch attached. I also have one for branch-1.1 if that's interesting, there are 
some additional test resources there that are not present in master.

> Stop accessing test resources from within src folder
> 
>
> Key: HBASE-14588
> URL: https://issues.apache.org/jira/browse/HBASE-14588
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hbase-14588.001.patch
>
>
> A few tests in hbase-server reach into the src/test/data folder to get test 
> resources, which is naughty since tests are supposed to only operate within 
> the target/ folder. It's better to put these into src/test/resources and let 
> them be automatically copied into target/ via the resources plugin, like 
> other test resources.





[jira] [Created] (HBASE-14588) Stop accessing test resources from within src folder

2015-10-09 Thread Andrew Wang (JIRA)
Andrew Wang created HBASE-14588:
---

 Summary: Stop accessing test resources from within src folder
 Key: HBASE-14588
 URL: https://issues.apache.org/jira/browse/HBASE-14588
 Project: HBase
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Andrew Wang
Assignee: Andrew Wang


A few tests in hbase-server reach into the src/test/data folder to get test 
resources, which is naughty since tests are supposed to only operate within the 
target/ folder. It's better to put these into src/test/resources and let them 
be automatically copied into target/ via the resources plugin, like other test 
resources.
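
A minimal sketch of the classpath-based lookup described above, assuming the standard Maven layout where files under src/test/resources are copied to target/test-classes. The class name TestResourceLoader and its open helper are hypothetical and are not taken from the attached patch.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper; illustrates reading test data from the classpath. */
final class TestResourceLoader {
  private TestResourceLoader() {}

  /**
   * Opens a resource by classpath name instead of by a path under src/.
   * The caller is responsible for closing the returned stream.
   */
  static InputStream open(String name) throws IOException {
    InputStream in = TestResourceLoader.class.getClassLoader().getResourceAsStream(name);
    if (in == null) {
      // Fail loudly so a missing resource is not mistaken for empty input.
      throw new IOException("Test resource not found on classpath: " + name);
    }
    return in;
  }
}
```

A test would then call something like TestResourceLoader.open("some-test-file") in a try-with-resources block, so it never touches the source tree directly.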





[jira] [Updated] (HBASE-14587) Attach a test-sources.jar for hbase-server

2015-10-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14587:

Status: Patch Available  (was: Open)

> Attach a test-sources.jar for hbase-server
> --
>
> Key: HBASE-14587
> URL: https://issues.apache.org/jira/browse/HBASE-14587
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hbase-14587.001.patch
>
>
> It'd be nice to attach a test-sources jar alongside the others as part of the 
> build, to provide test resources.





[jira] [Updated] (HBASE-14587) Attach a test-sources.jar for hbase-server

2015-10-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14587:

Attachment: hbase-14587.001.patch

pom-only change attached.

> Attach a test-sources.jar for hbase-server
> --
>
> Key: HBASE-14587
> URL: https://issues.apache.org/jira/browse/HBASE-14587
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hbase-14587.001.patch
>
>
> It'd be nice to attach a test-sources jar alongside the others as part of the 
> build, to provide test resources.





[jira] [Created] (HBASE-14587) Attach a test-sources.jar for hbase-server

2015-10-09 Thread Andrew Wang (JIRA)
Andrew Wang created HBASE-14587:
---

 Summary: Attach a test-sources.jar for hbase-server
 Key: HBASE-14587
 URL: https://issues.apache.org/jira/browse/HBASE-14587
 Project: HBase
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Andrew Wang
Assignee: Andrew Wang


It'd be nice to attach a test-sources jar alongside the others as part of the 
build, to provide test resources.





[jira] [Updated] (HBASE-14586) Remove extraneous ${argLine} references in pom

2015-10-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14586:

Status: Patch Available  (was: Open)

> Remove extraneous ${argLine} references in pom
> --
>
> Key: HBASE-14586
> URL: https://issues.apache.org/jira/browse/HBASE-14586
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hbase-14586.001.patch
>
>
> The pom.xml has a line like this for the Surefire argLine, which has an extra 
> ${argLine} reference. Recommend changes like this:
> {noformat}
> -${hbase-surefire.argLine} ${argLine}
> +${hbase-surefire.argLine}
> {noformat}





[jira] [Updated] (HBASE-14586) Remove extraneous ${argLine} references in pom

2015-10-09 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-14586:

Attachment: hbase-14586.001.patch

Patch attached, I also removed what looks like a stale comment.

> Remove extraneous ${argLine} references in pom
> --
>
> Key: HBASE-14586
> URL: https://issues.apache.org/jira/browse/HBASE-14586
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hbase-14586.001.patch
>
>
> The pom.xml has a line like this for the Surefire argLine, which has an extra 
> ${argLine} reference. Recommend changes like this:
> {noformat}
> -${hbase-surefire.argLine} ${argLine}
> +${hbase-surefire.argLine}
> {noformat}





[jira] [Created] (HBASE-14586) Remove extraneous ${argLine} references in pom

2015-10-09 Thread Andrew Wang (JIRA)
Andrew Wang created HBASE-14586:
---

 Summary: Remove extraneous ${argLine} references in pom
 Key: HBASE-14586
 URL: https://issues.apache.org/jira/browse/HBASE-14586
 Project: HBase
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Minor


The pom.xml has a line like this for the Surefire argLine, which has an extra 
${argLine} reference. Recommend changes like this:

{noformat}
-${hbase-surefire.argLine} ${argLine}
+${hbase-surefire.argLine}
{noformat}





[jira] [Updated] (HBASE-10123) Change default ports; move them out of linux ephemeral port range

2014-01-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-10123:


Status: Patch Available  (was: Open)

> Change default ports; move them out of linux ephemeral port range
> -
>
> Key: HBASE-10123
> URL: https://issues.apache.org/jira/browse/HBASE-10123
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.1.1
>Reporter: stack
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.98.0
>
> Attachments: hbase-10123.patch
>
>
> Our defaults clash w/ the range linux assigns itself for creating come-and-go 
> ephemeral ports; likely in our history we've clashed w/ a random, short-lived 
> process.  While easy to change the defaults, we should just ship w/ defaults 
> that make sense.  We could host ourselves up into the 7 or 8k range.
> See http://www.ncftp.com/ncftpd/doc/misc/ephemeral_ports.html
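
To make the clash concrete, here is a small sketch that checks whether a port falls inside the Linux ephemeral range. The bounds 32768 to 61000 are an assumption based on a commonly cited default for net.ipv4.ip_local_port_range; the actual range depends on the kernel and local configuration. The class name EphemeralPortCheck is hypothetical.

```java
/** Hypothetical helper; the range bounds are an assumed Linux default. */
final class EphemeralPortCheck {
  // Assumed default; check net.ipv4.ip_local_port_range on the actual host.
  static final int EPHEMERAL_LOW = 32768;
  static final int EPHEMERAL_HIGH = 61000;

  private EphemeralPortCheck() {}

  /** Returns true if the port could be handed out as an ephemeral port. */
  static boolean inEphemeralRange(int port) {
    return port >= EPHEMERAL_LOW && port <= EPHEMERAL_HIGH;
  }
}
```

Under that assumption, a port such as 60000 is inside the range, while ports in the 7k or 8k range suggested above are outside it.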



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HBASE-10123) Change default ports; move them out of linux ephemeral port range

2014-01-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reassigned HBASE-10123:
---

Assignee: Jonathan Hsieh

> Change default ports; move them out of linux ephemeral port range
> -
>
> Key: HBASE-10123
> URL: https://issues.apache.org/jira/browse/HBASE-10123
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.1.1
>Reporter: stack
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.98.0
>
> Attachments: hbase-10123.patch
>
>
> Our defaults clash w/ the range linux assigns itself for creating come-and-go 
> ephemeral ports; likely in our history we've clashed w/ a random, short-lived 
> process.  While easy to change the defaults, we should just ship w/ defaults 
> that make sense.  We could host ourselves up into the 7 or 8k range.
> See http://www.ncftp.com/ncftpd/doc/misc/ephemeral_ports.html





[jira] [Commented] (HBASE-7715) FSUtils#waitOnSafeMode can incorrectly loop on standby NN

2013-01-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566792#comment-13566792
 ] 

Andrew Wang commented on HBASE-7715:


Looks good to me. Thanks for the prompt patch, Ted.

> FSUtils#waitOnSafeMode can incorrectly loop on standby NN
> -
>
> Key: HBASE-7715
> URL: https://issues.apache.org/jira/browse/HBASE-7715
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.4
>Reporter: Andrew Wang
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7715-trunk-v2.txt, 7715-trunk-v3.txt
>
>
> We encountered an issue where HMaster failed to start with an active NN not 
> in safe mode and a standby NN in safemode. The relevant lines in 
> {{FSUtils.java}} show the issue:
> {noformat}
> while (dfs.setSafeMode(org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET)) {
> {noformat}
> This call skips the normal client failover from the standby to active NN, so 
> it will loop polling the standby NN if it unfortunately talks to the standby 
> first.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7715) FSUtils#waitOnSafeMode can incorrectly loop on standby NN

2013-01-29 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566167#comment-13566167
 ] 

Andrew Wang commented on HBASE-7715:


Cool patch, Ted!

There's also one more usage of setSafeMode in FSUtils#checkDfsSafeMode that 
should probably be fixed. Uma's comment about using #isInSafeMode() is also 
probably a bit cleaner.

I don't think that the StandbyException is actually propagated up this high, it 
gets caught and used in the failover logic in DFSClient. You can check me on 
that one though.

I'd also prefer to see an explicit catch of the {{NoSuchMethod}} exception (and 
a comment denoting why we're doing all this business), rather than a 
generic {{Exception}} catch. Then you can avoid rethrowing in the catch.
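
A sketch of the narrower catch I have in mind, using stand-in classes rather than the real HDFS client. NewerFs, OlderFs, and legacySafeModeGet are hypothetical names made up for illustration; only the shape of the reflection and exception handling is the point.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Hypothetical sketch of a reflective probe with a narrow catch. */
final class SafeModeProbe {
  /** Stand-in for a newer client that exposes isInSafeMode(). */
  public static final class NewerFs {
    public boolean isInSafeMode() { return false; }
  }

  /** Stand-in for an older client that lacks isInSafeMode(). */
  public static final class OlderFs {
    public boolean legacySafeModeGet() { return true; }
  }

  private SafeModeProbe() {}

  static boolean inSafeMode(Object fs) {
    try {
      return callBoolean(fs, "isInSafeMode");
    } catch (NoSuchMethodException e) {
      // Only this case means "older API": fall back to the legacy call.
      // Other failures are not swallowed here, so nothing needs rethrowing.
      try {
        return callBoolean(fs, "legacySafeModeGet");
      } catch (NoSuchMethodException e2) {
        throw new UnsupportedOperationException("No safe mode query available", e2);
      }
    }
  }

  private static boolean callBoolean(Object target, String name) throws NoSuchMethodException {
    Method m = target.getClass().getMethod(name);
    try {
      return (Boolean) m.invoke(target);
    } catch (IllegalAccessException | InvocationTargetException e) {
      throw new IllegalStateException("Failed to invoke " + name, e);
    }
  }
}
```

Catching only NoSuchMethodException keeps the fallback limited to the one case it is meant for, and the comment at the catch site documents why the reflection is there at all.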

> FSUtils#waitOnSafeMode can incorrectly loop on standby NN
> -
>
> Key: HBASE-7715
> URL: https://issues.apache.org/jira/browse/HBASE-7715
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.4
>Reporter: Andrew Wang
>Assignee: Ted Yu
> Fix For: 0.96.0
>
> Attachments: 7715-trunk-v2.txt
>
>
> We encountered an issue where HMaster failed to start with an active NN not 
> in safe mode and a standby NN in safemode. The relevant lines in 
> {{FSUtils.java}} show the issue:
> {noformat}
> while (dfs.setSafeMode(org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET)) {
> {noformat}
> This call skips the normal client failover from the standby to active NN, so 
> it will loop polling the standby NN if it unfortunately talks to the standby 
> first.



[jira] [Assigned] (HBASE-7715) FSUtils#waitOnSafeMode can incorrectly loop on standby NN

2013-01-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reassigned HBASE-7715:
--

Assignee: (was: Andrew Wang)

> FSUtils#waitOnSafeMode can incorrectly loop on standby NN
> -
>
> Key: HBASE-7715
> URL: https://issues.apache.org/jira/browse/HBASE-7715
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.4
>Reporter: Andrew Wang
>
> We encountered an issue where HMaster failed to start with an active NN not 
> in safe mode and a standby NN in safemode. The relevant lines in 
> {{FSUtils.java}} show the issue:
> {noformat}
> while (dfs.setSafeMode(org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET)) {
> {noformat}
> This call skips the normal client failover from the standby to active NN, so 
> it will loop polling the standby NN if it unfortunately talks to the standby 
> first.



[jira] [Created] (HBASE-7715) FSUtils#waitOnSafeMode can incorrectly loop on standby NN

2013-01-29 Thread Andrew Wang (JIRA)
Andrew Wang created HBASE-7715:
--

 Summary: FSUtils#waitOnSafeMode can incorrectly loop on standby NN
 Key: HBASE-7715
 URL: https://issues.apache.org/jira/browse/HBASE-7715
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.4
Reporter: Andrew Wang
Assignee: Andrew Wang


We encountered an issue where HMaster failed to start with an active NN not in 
safe mode and a standby NN in safemode. The relevant lines in {{FSUtils.java}} 
show the issue:

{noformat}
while (dfs.setSafeMode(org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET)) {
{noformat}

This call skips the normal client failover from the standby to active NN, so it 
will loop polling the standby NN if it unfortunately talks to the standby first.



[jira] [Resolved] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-12-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HBASE-6261.


Resolution: Fixed

Elliott took care of this, porting the histogram from HADOOP-8541 to HBase in 
HBASE-6409. Thanks!

> Better approximate high-percentile percentile latency metrics
> -
>
> Key: HBASE-6261
> URL: https://issues.apache.org/jira/browse/HBASE-6261
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>  Labels: metrics
> Attachments: Latencyestimation.pdf, MetricsHistogram.data, parse.py, 
> SampleQuantiles.data
>
>
> The existing reservoir-sampling based latency metrics in HBase are not 
> well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
> 95th, or 99th) latency. This is a well-studied problem in the literature (see 
> [1] and [2]), the question is determining which methods best suit our needs 
> and then implementing it.
> Ideally, we should be able to estimate these high percentiles with minimal 
> memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
> on 99th). It's also desirable to provide this over different time-based 
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency 
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf
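
As a baseline for what such an estimator approximates, the sketch below computes an exact nearest-rank percentile over a fully retained set of samples. This is not the algorithm from [1] or [2] and not the HBase implementation; it keeps every sample, which is exactly the memory cost the streaming approaches aim to avoid. The class name PercentileSketch and the integer-percent parameter are assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical baseline: exact nearest-rank percentile over all samples. */
final class PercentileSketch {
  private PercentileSketch() {}

  /**
   * Returns the smallest sample such that at least percent% of the samples
   * are less than or equal to it. The percent argument must be in 1..100.
   */
  static long nearestRank(long[] samples, int percent) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    if (percent < 1 || percent > 100) {
      throw new IllegalArgumentException("percent must be in 1..100");
    }
    long[] sorted = samples.clone();
    Arrays.sort(sorted);
    // Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    int rank = (int) (((long) percent * sorted.length + 99) / 100);
    return sorted[rank - 1];
  }
}
```

Sorting the full window on every query is O(n log n) time and O(n) memory, so it serves only as a reference point when judging the error of a bounded-memory estimator.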



[jira] [Commented] (HBASE-6409) Create histogram class for metrics 2

2012-09-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455341#comment-13455341
 ] 

Andrew Wang commented on HBASE-6409:


Hi Elliott, sorry about the delay in getting back to you.

I don't think there's an issue with sharing an executor. IIRC, findbugs 
complained since it didn't correctly pick up on synchronizing on the parent 
MutableQuantiles, but it's fine.

The datanode and namenode still seem to shut down correctly even though the 
threads aren't daemonized. Making them daemonized probably won't hurt though, 
so also LGTM.

> Create histogram class for metrics 2
> 
>
> Key: HBASE-6409
> URL: https://issues.apache.org/jira/browse/HBASE-6409
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Blocker
> Attachments: HBASE-6409-0.patch, HBASE-6409-1.patch, 
> HBASE-6409-2.patch, HBASE-6409-3.patch, HBASE-6409-4.patch
>
>
> Create the replacement for MetricsHistogram and PersistantTimeVaryingRate 
> classes.



[jira] [Commented] (HBASE-6409) Create histogram class for metrics 2

2012-09-06 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450135#comment-13450135
 ] 

Andrew Wang commented on HBASE-6409:


Hey Elliott,

Peeked at your changes to MutableQuantiles. You don't actually need to have 
{{setChanged()}} in {{add()}}, since the metric only changes when it's rolled 
over by the rollover thread. Besides that, I see you swapped out the 
ScheduledExecutorService for MetricsService, which I assume does the same thing.

Is that what you wanted a check for?

> Create histogram class for metrics 2
> 
>
> Key: HBASE-6409
> URL: https://issues.apache.org/jira/browse/HBASE-6409
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Blocker
> Attachments: HBASE-6409-0.patch, HBASE-6409-1.patch, 
> HBASE-6409-2.patch, HBASE-6409-3.patch
>
>
> Create the replacement for MetricsHistogram and PersistantTimeVaryingRate 
> classes.



[jira] [Commented] (HBASE-6538) Remove copy_table.rb script

2012-08-15 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435786#comment-13435786
 ] 

Andrew Wang commented on HBASE-6538:


Pretty sure the findbugs and javac warnings are unrelated, since this is a trivial patch.

> Remove copy_table.rb script
> ---
>
> Key: HBASE-6538
> URL: https://issues.apache.org/jira/browse/HBASE-6538
> Project: HBase
>  Issue Type: Task
>  Components: scripts
>Affects Versions: 0.96.0
>Reporter: David S. Wang
>Assignee: Andrew Wang
>Priority: Minor
>  Labels: noob
> Attachments: hbase-6583-1.patch
>
>
> Remove copy_table.rb script as per mailing list discussion.  It hasn't been 
> maintained in a while and does not run against any recent HBase release.  
> There is also an MR job to do the same thing that does work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6538) Remove copy_table.rb script

2012-08-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-6538:
---

Assignee: Andrew Wang  (was: David S. Wang)
  Status: Patch Available  (was: Open)

> Remove copy_table.rb script
> ---
>
> Key: HBASE-6538
> URL: https://issues.apache.org/jira/browse/HBASE-6538
> Project: HBase
>  Issue Type: Task
>  Components: scripts
>Affects Versions: 0.96.0
>Reporter: David S. Wang
>Assignee: Andrew Wang
>Priority: Minor
>  Labels: noob
> Attachments: hbase-6583-1.patch
>
>
> Remove copy_table.rb script as per mailing list discussion.  It hasn't been 
> maintained in a while and does not run against any recent HBase release.  
> There is also an MR job to do the same thing that does work.





[jira] [Updated] (HBASE-6538) Remove copy_table.rb script

2012-08-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-6538:
---

Attachment: hbase-6583-1.patch

I went and resolved all the dups of this issue, and added Dave back to the 
watchlist.

Also attached is the trivial patch, which is just a "git rm" of copy_table.rb.

> Remove copy_table.rb script
> ---
>
> Key: HBASE-6538
> URL: https://issues.apache.org/jira/browse/HBASE-6538
> Project: HBase
>  Issue Type: Task
>  Components: scripts
>Affects Versions: 0.96.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
>  Labels: noob
> Attachments: hbase-6583-1.patch
>
>
> Remove copy_table.rb script as per mailing list discussion.  It hasn't been 
> maintained in a while and does not run against any recent HBase release.  
> There is also an MR job to do the same thing that does work.





[jira] [Resolved] (HBASE-6543) Remove copy_table.rb

2012-08-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HBASE-6543.


Resolution: Duplicate

> Remove copy_table.rb
> 
>
> Key: HBASE-6543
> URL: https://issues.apache.org/jira/browse/HBASE-6543
> Project: HBase
>  Issue Type: Task
>  Components: scripts
>Affects Versions: 0.96.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
>  Labels: noob
>
> Remove copy_table.rb script as per mailing list discussion.  It hasn't been 
> maintained in a while and does not run against any recent HBase release.  
> There is also an MR job to do the same thing that does work.





[jira] [Resolved] (HBASE-6540) Remove copy_table.rb script

2012-08-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HBASE-6540.


Resolution: Duplicate

> Remove copy_table.rb script
> ---
>
> Key: HBASE-6540
> URL: https://issues.apache.org/jira/browse/HBASE-6540
> Project: HBase
>  Issue Type: Task
>  Components: scripts
>Affects Versions: 0.96.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
>  Labels: noob
>
> Remove copy_table.rb script as per mailing list discussion.  It hasn't been 
> maintained in a while and does not run against any recent HBase release.  
> There is also an MR job to do the same thing that does work.





[jira] [Resolved] (HBASE-6541) Remove copy_table.rb

2012-08-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HBASE-6541.


Resolution: Duplicate

> Remove copy_table.rb
> 
>
> Key: HBASE-6541
> URL: https://issues.apache.org/jira/browse/HBASE-6541
> Project: HBase
>  Issue Type: Task
>  Components: scripts
>Affects Versions: 0.96.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
>  Labels: noob
>
> Remove copy_table.rb script as per mailing list discussion.  It hasn't been 
> maintained in a while and does not run against any recent HBase release.  
> There is also an MR job to do the same thing that does work.





[jira] [Resolved] (HBASE-6542) Remove copy_table.rb

2012-08-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HBASE-6542.


Resolution: Duplicate

> Remove copy_table.rb
> 
>
> Key: HBASE-6542
> URL: https://issues.apache.org/jira/browse/HBASE-6542
> Project: HBase
>  Issue Type: Task
>  Components: scripts
>Affects Versions: 0.96.0
>Reporter: David S. Wang
>Assignee: David S. Wang
>Priority: Minor
>  Labels: noob
>
> Remove copy_table.rb script as per mailing list discussion.  It hasn't been 
> maintained in a while and does not run against any recent HBase release.  
> There is also an MR job to do the same thing that does work.





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-07-23 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421031#comment-13421031
 ] 

Andrew Wang commented on HBASE-6261:


Based on feedback from Elliott and Jon, I've done some analysis of both 
SampleQuantiles and MetricsHistogram.

For both, I tried item counts of 1k, 10k, 100k, 1M, 2.5M, 5M, and 10M. For each 
count, I randomly shuffled longs from {{[0, count)}}, pushed them through the 
estimator, and measured the runtime, # of samples, and error for various 
quantiles. This was repeated ten times, giving stddev error bars for each point.
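A rough sketch of what one trial of this procedure might look like is below. It is not the test code used for these measurements: the `Reservoir` class is a generic Algorithm R reservoir sampler standing in for MetricsHistogram, and `run_trial` with its parameters is a hypothetical name chosen for illustration.

```python
import random
import time


class Reservoir:
    """Fixed-size uniform sample of a stream (Algorithm R).

    A stand-in for a reservoir-sampling histogram, not the HBase class.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.count = 0
        self.samples = []
        self._rng = random.Random(seed)

    def insert(self, value):
        self.count += 1
        if len(self.samples) < self.capacity:
            self.samples.append(value)
            return
        # Keep the new item with probability capacity / count.
        slot = self._rng.randrange(self.count)
        if slot < self.capacity:
            self.samples[slot] = value

    def query(self, quantile):
        ordered = sorted(self.samples)
        index = min(int(quantile * len(ordered)), len(ordered) - 1)
        return ordered[index]


def run_trial(count, quantiles, capacity=1028, seed=0):
    """Push a shuffled [0, count) stream through the estimator and time it."""
    values = list(range(count))
    random.Random(seed).shuffle(values)

    estimator = Reservoir(capacity, seed=seed)
    start = time.perf_counter()
    for value in values:
        estimator.insert(value)
    elapsed = time.perf_counter() - start

    estimates = {q: estimator.query(q) for q in quantiles}
    return elapsed, len(estimator.samples), estimates
```

Repeating `run_trial` with ten different seeds per item count would give the mean and standard deviation for each point.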

MetricsHistogram was left using default settings (1028-item reservoir). 
SampleQuantiles was also left with default settings, tracking the same 
quantiles as MetricsHistogram, but with bounded error. I threw away the 0.90 
quantile from SampleQuantiles since MetricsHistogram didn't have a function to 
compute it (though adding one would be trivial).

This was all run single-threaded on my couple-years-old T410s laptop.

You can view the imgur album of just the plots here: [http://imgur.com/a/gTDYr]

h2. Runtime

Note that the y-axis is log-scale in this plot. SampleQuantiles is roughly an 
order of magnitude slower at 10 million items (26.8s vs. 3.3s), but the scaling 
pattern overall looks good. Runtime is comparable at low item counts (<=10k).

!http://i.imgur.com/c6SIl.png!

h2. Memory usage

Note that the y-axis is again log-scale in this plot.

MetricsHistogram uses a flat 1028 items of storage, so it has constant memory 
usage. At 10 million items, SampleQuantiles uses roughly an order of magnitude 
more memory (19.4k items vs. 1k). Since SampleQuantiles samples are about 40B 
each and MetricsHistogram samples are 8B each, this is approximately 776KB vs. 
8KB.
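The footprint estimate is simple arithmetic on the figures quoted above. A small worked version, where the helper name `footprint_kb` is only illustrative:

```python
# Item counts and per-sample sizes are the approximate figures quoted above.
SAMPLE_QUANTILES_ITEMS = 19_400   # items held after 10 million inserts
SAMPLE_QUANTILES_ITEM_BYTES = 40
METRICS_HISTOGRAM_ITEMS = 1028    # fixed reservoir size
METRICS_HISTOGRAM_ITEM_BYTES = 8


def footprint_kb(items, bytes_per_item):
    """Approximate sample storage in kilobytes (1 KB = 1000 bytes)."""
    return items * bytes_per_item / 1000


sq_kb = footprint_kb(SAMPLE_QUANTILES_ITEMS, SAMPLE_QUANTILES_ITEM_BYTES)
mh_kb = footprint_kb(METRICS_HISTOGRAM_ITEMS, METRICS_HISTOGRAM_ITEM_BYTES)
print(f"SampleQuantiles: {sq_kb:.0f} KB, MetricsHistogram: {mh_kb:.1f} KB")
```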

This matters less for small numbers of items. The crossover point on the graph 
falls between 10k and 100k items. The scaling pattern looks similar to the 
runtime, overall good.

!http://i.imgur.com/3E3RQ.png!

h2. Error bounds

Note that for this series of plots, the y-axis is linear and the x-axis is log. 
This makes the actual error values easier to interpret. Error was calculated by 
taking the difference in the actual and the estimated rank of the percentile, 
and dividing by the total count.
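As a minimal sketch of that error calculation (the function name and the use of `bisect_left` to find the estimated rank are assumptions, not the original analysis code):

```python
from bisect import bisect_left


def rank_error(sorted_values, estimate, quantile):
    """Rank error of `estimate` for `quantile`, as a fraction of the count.

    sorted_values must hold every item of the stream in ascending order.
    """
    total = len(sorted_values)
    actual_rank = quantile * total
    estimated_rank = bisect_left(sorted_values, estimate)
    return abs(estimated_rank - actual_rank) / total
```

For example, on the values 0..999, an estimator that reports 955 for the 95th percentile is off by 5 ranks out of 1000, which is a 0.5% error.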

SampleQuantiles is by default configured to track 50th with 5% error, 75th with 
2.5%, 95th with 0.5%, and 99th with 0.1%. We see less error at higher 
percentiles, and with larger sized streams. For 95th and 99th, we reach 
essentially 0% error at around 1 million items (0.009% for 95th, 0.004% for 
99th).

MetricsHistogram doesn't really provide great error bounds, and high percentiles 
seem to get worse as the number of items increases. There's also a large standard 
deviation in error, which is unfortunate if these values are going to be used 
for thresholding. For 95th, it looks like 0.4% to 0.6% error. For 99th, we're 
looking at 0.2% to 0.3%.

An error of half a percent doesn't sound huge, but remember that this is error 
in rank, or effectively on a uniform latency distribution. To translate this, I 
fitted against the get latency distribution I got from running a mixed get/scan 
YCSB workload against CDH3u1 HBase. At the 95th percentile, an error of 0.5% 
translated to 137ms -3.4% and +4%. At the 99th, an error of 0.5% translated to 
310ms -21.7% and +43.3%. These are just indicative numbers; the important point 
is that half a percent on the tail of a Zipf distribution is pretty meaningful.
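To illustrate why a fixed rank error matters more further out on the tail, here is a sketch using a synthetic Pareto distribution. The distribution, its `scale_ms` and `shape` parameters, and the function names are all assumptions for illustration; this is not the fitted YCSB latency distribution and will not reproduce the numbers above.

```python
def pareto_latency_ms(quantile, scale_ms=10.0, shape=1.5):
    """Inverse CDF of a Pareto distribution: latency at the given quantile."""
    return scale_ms / (1.0 - quantile) ** (1.0 / shape)


def latency_swing(quantile, rank_error):
    """Relative latency change when the rank is off by -/+ rank_error."""
    true_latency = pareto_latency_ms(quantile)
    low = pareto_latency_ms(quantile - rank_error) / true_latency - 1.0
    high = pareto_latency_ms(quantile + rank_error) / true_latency - 1.0
    return low, high


for q in (0.95, 0.99):
    low, high = latency_swing(q, 0.005)
    print(f"{q:.2f}: {low:+.1%} / {high:+.1%}")
```

Under these assumptions the same 0.5% rank error produces a much wider, and more asymmetric, latency swing at the 99th percentile than at the 95th.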

!http://i.imgur.com/m0ERq.png!
!http://i.imgur.com/qvfpR.png!
!http://i.imgur.com/k5y5o.png!
!http://i.imgur.com/uyqAK.png!

h2. Conclusion

For low-rate events (order 0.1s on up) like compactions or flushes, I think it 
can go either way. SampleQuantiles has similar CPU/memory usage up until ~10k 
items, but MetricsHistogram is perfectly accurate up until 1028 items, has 
bounded memory, and can be used to compute other statistics. The 1028 mark 
seems important here; just keep all the data for low-rate events.

For high-rate events (order ms) like RPCs, it depends if you care at all about 
accuracy. The memory/CPU overhead of SampleQuantiles is high in relative terms 
(order of magnitude), but you need to use it if you're measuring for SLAs since 
MetricsHistogram basically isn't accurate. It also seems unlikely that you'll 
have that many 1M+ item streams you want to track, and it's just a couple 
hundred KB more memory. Use MetricsHistogram if accuracy isn't important, but I 
feel like SampleQuantiles is a pretty reasonable choice.

Hopefully that was enlightening. I posted the raw data and plotting script if 
anyone else wants to play with it, and I can post the test code snippets used 
to make the data if anyone's interested in that too.

> Better approximate high-percentile percentile latency metrics
> -
>
> Key: HBASE-6261
>

[jira] [Updated] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-07-23 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-6261:
---

Attachment: SampleQuantiles.data
parse.py
MetricsHistogram.data

> Better approximate high-percentile percentile latency metrics
> -
>
> Key: HBASE-6261
> URL: https://issues.apache.org/jira/browse/HBASE-6261
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>  Labels: metrics
> Attachments: Latencyestimation.pdf, MetricsHistogram.data, 
> SampleQuantiles.data, parse.py
>
>
> The existing reservoir-sampling based latency metrics in HBase are not 
> well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
> 95th, or 99th) latency. This is a well-studied problem in the literature (see 
> [1] and [2]), the question is determining which methods best suit our needs 
> and then implementing it.
> Ideally, we should be able to estimate these high percentiles with minimal 
> memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
> on 99th). It's also desirable to provide this over different time-based 
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency 
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-07-18 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417302#comment-13417302
 ] 

Andrew Wang commented on HBASE-6261:


It'd be these files from hadoop-common:

* src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java
* src/main/java/org/apache/hadoop/metrics2/lib/Quantiles.java
* src/main/java/org/apache/hadoop/metrics2/util/SampleQuantiles.java

{{wc -l}} reports 534 lines across those three files, heavily commented of 
course. {{MutableQuantiles}} is a hadoop2 metrics2 interface for 
SampleQuantiles, and might need to be modified for use in HBase. I haven't 
looked at what Elliott's done for HBASE-4050 yet.

> Better approximate high-percentile percentile latency metrics
> -
>
> Key: HBASE-6261
> URL: https://issues.apache.org/jira/browse/HBASE-6261
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>  Labels: metrics
> Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not 
> well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
> 95th, or 99th) latency. This is a well-studied problem in the literature (see 
> [1] and [2]), the question is determining which methods best suit our needs 
> and then implementing it.
> Ideally, we should be able to estimate these high percentiles with minimal 
> memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
> on 99th). It's also desirable to provide this over different time-based 
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency 
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-07-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416680#comment-13416680
 ] 

Andrew Wang commented on HBASE-6261:


Sorry I haven't had time to push on this more. I talked with Jon Hsieh last 
week about doing a more convincing analysis of the performance of the new 
MutableQuantiles class from HADOOP-8541 vs the existing reservoir-sampling 
histogram method. I'll try to get that done within a week.

I'm also not sure about the right course of action for getting it used in HBase. 
Stack indicated way back on the mailing list that he was okay waiting for a 
hadoop-common version bump, which is kind of a long timescale. If people really 
urgently want this, we could just copy the code over and then refactor it away 
when it's released in hadoop-common.

> Better approximate high-percentile percentile latency metrics
> -
>
> Key: HBASE-6261
> URL: https://issues.apache.org/jira/browse/HBASE-6261
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>  Labels: metrics
> Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not 
> well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
> 95th, or 99th) latency. This is a well-studied problem in the literature (see 
> [1] and [2]), the question is determining which methods best suit our needs 
> and then implementing it.
> Ideally, we should be able to estimate these high percentiles with minimal 
> memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
> on 99th). It's also desirable to provide this over different time-based 
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency 
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf





[jira] [Commented] (HBASE-6409) Create histogram class for metrics 2

2012-07-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416672#comment-13416672
 ] 

Andrew Wang commented on HBASE-6409:


This seems related to HBASE-6261 and HADOOP-8541, which introduced a new 
MutableQuantiles class into hadoop-common metrics2. You could consider using 
that in lieu of rolling your own histogram.

> Create histogram class for metrics 2
> 
>
> Key: HBASE-6409
> URL: https://issues.apache.org/jira/browse/HBASE-6409
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>
> Create the replacement for MetricsHistogram and PersistantTimeVaryingRate 
> classes.





[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413296#comment-13413296
 ] 

Andrew Wang commented on HBASE-6377:


Please correct my understanding on this if I'm wrong, but estimating via the 
average (total latency / num ops) is pretty undesirable for both normal 
MultiActions and the batched put MultiAction case. Latency from a slow op 
bleeds over to a fast one, which messes up per-op metrics. This would also 
affect doing per-region or per-column family metrics in the future, for the 
same reasons.
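A tiny worked example of the bleed-over, with made-up latencies and a hypothetical helper name:

```python
def averaged_latencies(total_ms, num_ops):
    """Attribute a batch's total latency evenly to each op in it."""
    return [total_ms / num_ops] * num_ops


# Hypothetical batch: two fast ops and one slow op, in milliseconds.
actual = [1.0, 1.0, 98.0]
attributed = averaged_latencies(sum(actual), len(actual))

for real, est in zip(actual, attributed):
    print(f"actual {real:5.1f} ms -> attributed {est:5.1f} ms")
```

Each op is charged about 33.3 ms, so the two fast ops look over 30 times slower than they were and the slow op looks about three times faster.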

I haven't looked at the code, but is it possible to do more accurate accounting 
of the latency of each op in a MultiAction? If so, I think it'd be worthwhile.

> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.94.2
>
> Attachments: 6377-0.94.patch, 6377-trunk-simple.patch, 6377.patch
>
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say the latency for those operations is not measured either. The value of 
> HBASE-5533 metrics is suspect given the client will batch put and delete ops 
> like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.





[jira] [Assigned] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-28 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reassigned HBASE-6261:
--

Assignee: Andrew Wang

> Better approximate high-percentile percentile latency metrics
> -
>
> Key: HBASE-6261
> URL: https://issues.apache.org/jira/browse/HBASE-6261
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>  Labels: metrics
> Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not 
> well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
> 95th, or 99th) latency. This is a well-studied problem in the literature (see 
> [1] and [2]), the question is determining which methods best suit our needs 
> and then implementing it.
> Ideally, we should be able to estimate these high percentiles with minimal 
> memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
> on 99th). It's also desirable to provide this over different time-based 
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency 
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403586#comment-13403586
 ] 

Andrew Wang commented on HBASE-6261:


I think it'll be usable from common; it's going to be like the existing 
MutableCounter or MutableStat in that you instantiate it once then call 
updateMethod() a bunch. Unless HBase does it differently than the datanode, I 
don't think reflection is used on the hot path of tracking the stream of 
values, just occasionally to publish it via JMX.

> Better approximate high-percentile percentile latency metrics
> -
>
> Key: HBASE-6261
> URL: https://issues.apache.org/jira/browse/HBASE-6261
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Wang
>  Labels: metrics
> Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not 
> well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
> 95th, or 99th) latency. This is a well-studied problem in the literature (see 
> [1] and [2]), the question is determining which methods best suit our needs 
> and then implementing it.
> Ideally, we should be able to estimate these high percentiles with minimal 
> memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
> on 99th). It's also desirable to provide this over different time-based 
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency 
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403571#comment-13403571
 ] 

Andrew Wang commented on HBASE-6261:


I filed HADOOP-8541, since this is going to be landing in hadoop-common's 
metrics2. When HBASE-5040 clears, we can look into actually hooking it up in 
HBase.

> Better approximate high-percentile percentile latency metrics
> -
>
> Key: HBASE-6261
> URL: https://issues.apache.org/jira/browse/HBASE-6261
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Wang
>  Labels: metrics
> Attachments: Latencyestimation.pdf
>
>
> The existing reservoir-sampling based latency metrics in HBase are not 
> well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
> 95th, or 99th) latency. This is a well-studied problem in the literature (see 
> [1] and [2]), the question is determining which methods best suit our needs 
> and then implementing it.
> Ideally, we should be able to estimate these high percentiles with minimal 
> memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
> on 99th). It's also desirable to provide this over different time-based 
> sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
> I'll note that this would also be useful in HDFS, or really anywhere latency 
> metrics are kept.
> [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
> [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403304#comment-13403304
 ] 

Andrew Wang commented on HBASE-6261:


Yea, everything ultimately goes into the {{sample}} LinkedList. The fixed size 
{{buffer}} is just used to do more efficient batch inserts into {{sample}}.





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-28 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403280#comment-13403280
 ] 

Andrew Wang commented on HBASE-6261:


I don't think performance is very sensitive to the buffer size; it's just a way 
of batching inserts for efficiency. It definitely doesn't affect accuracy, 
because I have it call insertBatch() on every query().
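
The batching described above could look roughly like the following. This is a minimal sketch, not the code from the patch: the class name BufferedSample and its fields are hypothetical, and the compress step is omitted, so query() here returns an exact quantile rather than an approximate one.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.ListIterator;

// Hypothetical sketch: a fixed-size buffer that batches inserts into a sorted list.
class BufferedSample {
    private final long[] buffer;
    private int bufferCount = 0;
    private final LinkedList<Long> sample = new LinkedList<>();

    BufferedSample(int bufferSize) {
        buffer = new long[bufferSize];
    }

    // Cheap path: just append to the buffer, flushing only when it is full.
    void insert(long value) {
        buffer[bufferCount++] = value;
        if (bufferCount == buffer.length) {
            insertBatch();
        }
    }

    // Sort the buffered values, then merge them into the sorted list in one pass.
    void insertBatch() {
        if (bufferCount == 0) {
            return;
        }
        Arrays.sort(buffer, 0, bufferCount);
        ListIterator<Long> it = sample.listIterator();
        for (int i = 0; i < bufferCount; i++) {
            long v = buffer[i];
            // Advance the cursor until the next list element is >= v.
            while (it.hasNext()) {
                if (it.next() >= v) {
                    it.previous();
                    break;
                }
            }
            it.add(v);
        }
        bufferCount = 0;
    }

    // Flush pending values first, so the buffer size never affects the answer.
    long query(double quantile) {
        insertBatch();
        if (sample.isEmpty()) {
            throw new IllegalStateException("no samples");
        }
        int rank = (int) Math.ceil(quantile * sample.size());
        int index = Math.max(0, Math.min(sample.size() - 1, rank - 1));
        return sample.get(index);
    }
}
```

Because the buffer is sorted before merging, the list cursor only ever moves forward, so a batch costs one sort plus one pass over the list.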

We can maintain the compress count and track the number of items removed, but 
I don't know if it's really worth exposing to the user (metrics for our 
metrics?). I think it's nice for testing though, so I'll try to expose it 
internally.

I've never seen compress() fail to remove any items, but I guess this could 
happen with some adversarial pattern. I don't think you can do much about it 
though, since the algo needs those items to maintain the error bounds.





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-27 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402733#comment-13402733
 ] 

Andrew Wang commented on HBASE-6261:


I've got my Java implementation of the non-sliding biased quantiles algorithm 
(QuantileEstimationCKMS.java) up on github:

https://github.com/umbrant/QuantileEstimation

Benchmarking on my laptop, I pushed 1 million shuffled items [0, 10**9) through 
it in 1.2 seconds while asking it to track the 50th, 90th, 95th, and 99th 
percentiles with low error. It kept ~5500 samples to do this, which, at ~36B per 
sample, is about 193KiB. Empirical error was basically 0. I also ran it for 10 
million random longs, which took 19s and about 685KiB.
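
The figures above can be checked with a few lines of arithmetic. This is only a sketch: the class and method names are made up for illustration, and the inputs are the approximate numbers quoted in the comment.

```java
// Hypothetical helper that reproduces the back-of-envelope numbers above.
class FootprintEstimate {
    // Memory footprint in KiB for a given sample count and per-sample size.
    static double kib(long samples, long bytesPerSample) {
        return samples * bytesPerSample / 1024.0;
    }

    // Throughput in items per second.
    static double itemsPerSecond(long items, double seconds) {
        return items / seconds;
    }

    public static void main(String[] args) {
        // ~5500 samples at ~36 B each, as reported for the 1 million item run.
        System.out.printf("%.1f KiB%n", kib(5500, 36));
        // 1 million items in 1.2 s, and 10 million items in 19 s.
        System.out.printf("%.0f items/s%n", itemsPerSecond(1_000_000, 1.2));
        System.out.printf("%.0f items/s%n", itemsPerSecond(10_000_000, 19.0));
    }
}
```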

I think this is pretty lightweight. If this sounds reasonable, I'll start 
working on a patch.





[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-27 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402429#comment-13402429
 ] 

Andrew Wang commented on HBASE-6261:


@Elliot: Moving averages can be cheaply computed on the existing reservoir 
sample; this issue is more about percentiles. I'm not sure how OpenTSDB factors into 
this, since you'd have to feed the latency stream to OpenTSDB to figure out 
percentiles, which seems expensive. Depending on how tight your speed and 
memory constraints are, I think we could do this in HBase at acceptably minimal 
cost, or make this configurable somehow.

@Ted: The additional cost to do sliding windows is somewhat significant (I 
think 10s of MB more memory). Both the sliding and non-sliding methods allow 
for arbitrary percentiles. Anyway, I think reporting the 50th, 90th, 95th, and 
99th should satisfy anyone. Mixing and matching algorithms is possible and 
probably even advised since it's only worth doing this for high-rate streams 
where accuracy is important. Implementations of the cheaper and less accurate 
algos are already available.





[jira] [Updated] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-26 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-6261:
---

Attachment: Latencyestimation.pdf

I've written up a comparison of what I think are all the available options. It 
really just comes down to a couple questions:

- Do we care about bounded error?
- Do we want sliding windows (more mem), or are we okay just snapshotting and 
starting anew every interval?
- Do we care about strictly bounded memory usage, or is O(few MBs) good enough?

I'm hoping that we want bounded error, are okay snapshotting, and are okay with 
O(few MBs). I've implemented the algo for this case and am testing it out to 
make sure it meets the performance requirements.
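
The "snapshot and start anew every interval" option could be sketched as below. This is illustrative only: SnapshottingWindow is a hypothetical name, and it stores every value exactly as a stand-in for the bounded-error estimator, which would keep far fewer items.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of snapshotting: no sliding window, just reset per interval.
class SnapshottingWindow {
    private List<Long> current = new ArrayList<>();

    synchronized void record(long latency) {
        current.add(latency);
    }

    // Called once per reporting interval: hand back the sorted values seen so
    // far and start collecting into a fresh, empty list.
    synchronized List<Long> snapshotAndReset() {
        List<Long> snapshot = current;
        current = new ArrayList<>();
        Collections.sort(snapshot);
        return snapshot;
    }

    // Nearest-rank percentile over a sorted snapshot; percent is 0..100.
    static long percentile(List<Long> sorted, int percent) {
        if (sorted.isEmpty()) {
            throw new IllegalArgumentException("empty snapshot");
        }
        int rank = (percent * sorted.size() + 99) / 100; // integer ceiling
        int index = Math.max(0, Math.min(sorted.size() - 1, rank - 1));
        return sorted.get(index);
    }
}
```

The trade-off is that each reported percentile covers only the interval that just ended, in exchange for not keeping any per-window history in memory.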





[jira] [Commented] (HBASE-5786) Implement histogram metrics for flush and compaction latencies and sizes.

2012-06-22 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399576#comment-13399576
 ] 

Andrew Wang commented on HBASE-5786:


Opened HBASE-6261 for high-percentile latency estimation, let's take it there.

> Implement histogram metrics for flush and compaction latencies and sizes.
> -
>
> Key: HBASE-5786
> URL: https://issues.apache.org/jira/browse/HBASE-5786
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics, regionserver
>Affects Versions: 0.92.2, 0.94.0, 0.96.0
>Reporter: Jonathan Hsieh
>
> Average time for region operations doesn't really tell a useful story that 
> helps diagnose anomalous conditions.
> It would be extremely useful to add histogramming metrics similar to 
> HBASE-5533 for region operations like flush, compaction and splitting.  They 
> probably should be forward biased at a much coarser granularity however 
> (maybe decay every day?) 





[jira] [Created] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-06-22 Thread Andrew Wang (JIRA)
Andrew Wang created HBASE-6261:
--

 Summary: Better approximate high-percentile percentile latency 
metrics
 Key: HBASE-6261
 URL: https://issues.apache.org/jira/browse/HBASE-6261
 Project: HBase
  Issue Type: New Feature
Reporter: Andrew Wang


The existing reservoir-sampling based latency metrics in HBase are not 
well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
95th, or 99th) latency. This is a well-studied problem in the literature (see 
[1] and [2]); the question is determining which methods best suit our needs and 
then implementing them.

Ideally, we should be able to estimate these high percentiles with minimal 
memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% on 
99th). It's also desirable to provide this over different time-based sliding 
windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.

I'll note that this would also be useful in HDFS, or really anywhere latency 
metrics are kept.

[1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
[2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf





[jira] [Commented] (HBASE-5786) Implement histogram metrics for flush and compaction latencies and sizes.

2012-06-22 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399486#comment-13399486
 ] 

Andrew Wang commented on HBASE-5786:


I don't think you can assume a normal distribution for latency. I think it 
looks more Zipfian in practice, or maybe bi-modal because of cache misses. 
Also, a 5% error on a 95th percentile is kind of huge; IIUC, that means it's 
actually reporting between the 90th and 100th percentile. [1] by the same 
authors as your link discusses sampling for high percentiles.
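
To make the "5% error on a 95th percentile" point concrete, here is a small sketch. It assumes the error is a uniform rank error (epsilon expressed as a fraction of the stream), and the class name is hypothetical.

```java
// Hypothetical helper: the range of true quantiles an estimator may report
// when it promises a uniform rank error of epsilon.
class RankErrorBounds {
    // Returns {lowest, highest} quantile that may come back for the target.
    static double[] bounds(double target, double epsilon) {
        double lo = Math.max(0.0, target - epsilon);
        double hi = Math.min(1.0, target + epsilon);
        return new double[] {lo, hi};
    }

    public static void main(String[] args) {
        // Asking for p95 with 5% error may return anything from p90 to p100.
        double[] b = bounds(0.95, 0.05);
        System.out.printf("reported value lies between p%.0f and p%.0f%n",
                b[0] * 100, b[1] * 100);
    }
}
```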

I found [2], which I think is well-suited for our use case, since it can do 
approximate quantiles on a sliding time window. Space and time bounds seem to 
be O(reasonable log factors). Somehow mashing up [2] to use [1] would be 
ideal, but doing just [2] is probably okay too.

[1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
[2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf





[jira] [Commented] (HBASE-5786) Implement histogram metrics for flush and compaction latencies and sizes.

2012-06-21 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399053#comment-13399053
 ] 

Andrew Wang commented on HBASE-5786:


A real stats expert can weigh in, but I don't think the current sampling 
methods are well-suited for computing high-percentile latencies. Reservoir 
sampling is fine for computing gross statistics like the mean and stddev, but 
you really want to be biasing your sampling toward the top end for accurate 
95th and 99th percentile estimates.
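
For reference, uniform reservoir sampling can be sketched as follows. This is a generic textbook version (Algorithm R), not the implementation HBase uses, and the class name and the 1000-slot capacity in the note below are assumptions for illustration.

```java
import java.util.Random;

// Hypothetical sketch of uniform reservoir sampling (Algorithm R).
class ReservoirSketch {
    private final long[] reservoir;
    private final Random random;
    private long seen = 0;

    ReservoirSketch(int capacity, long seed) {
        reservoir = new long[capacity];
        random = new Random(seed);
    }

    void update(long value) {
        seen++;
        if (seen <= reservoir.length) {
            // Fill phase: keep everything until the reservoir is full.
            reservoir[(int) (seen - 1)] = value;
        } else {
            // Keep the new value with probability capacity / seen.
            long slot = (long) (random.nextDouble() * seen); // uniform in [0, seen)
            if (slot < reservoir.length) {
                reservoir[(int) slot] = value;
            }
        }
    }

    int size() {
        return (int) Math.min(seen, reservoir.length);
    }

    // A uniform sample keeps, in expectation, only this many values above the
    // given quantile, which is why the tail estimates are noisy.
    static double expectedSamplesAbove(int capacity, double quantile) {
        return capacity * (1.0 - quantile);
    }
}
```

With a 1000-slot reservoir, expectedSamplesAbove(1000, 0.99) is about 10, so a 99th percentile estimate rests on roughly ten retained values no matter how long the stream is.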

I unfortunately don't have any solutions yet, but I'm looking into it.





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-06-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: (was: hbase-5892-4.patch)

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Jonathan Hsieh
>Assignee: Andrew Wang
>  Labels: noob
> Fix For: 0.90.7, 0.96.0, 0.94.1, 0.92.3
>
> Attachments: hbase-5892-1.patch, hbase-5892-2-0.90.patch, 
> hbase-5892-2.patch, hbase-5892-3.patch, hbase-5892-4-0.90.patch, 
> hbase-5892-4.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).
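
A Futures-based version of this kind of parallel work could look roughly like the sketch below. It is not the hbck code: FuturesSketch and loadDir are hypothetical stand-ins for a work item such as loading one HDFS directory, and only standard java.util.concurrent calls are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of replacing hand-rolled notify/wait work items with Futures.
class FuturesSketch {
    // Stand-in for one unit of work, e.g. loading a single directory.
    static Callable<String> loadDir(String dir) {
        return () -> "loaded " + dir;
    }

    static List<String> runAll(List<String> dirs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String dir : dirs) {
                tasks.add(loadDir(dir));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task is done and returns the
            // futures in the same order as the submitted tasks.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get()); // a task failure surfaces here
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for work items", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("work item failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The main gain over low-level notifies is that completion and failure of each work item are both reported through Future.get(), so the caller does not need its own counters or wait loops.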





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-06-04 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892-4.patch





[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-06-01 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287748#comment-13287748
 ] 

Andrew Wang commented on HBASE-5892:


I don't know why Findbugs is erroring. Maybe the modularization change?

{code}
[ERROR] Could not find resource 
'${parent.basedir}/dev-support/findbugs-exclude.xml'. -> [Help 1]
{code}

No tests because there's no functionality change; it's a refactor.

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jonathan Hsieh
>Assignee: Andrew Wang
>  Labels: noob
> Attachments: hbase-5892-1.patch, hbase-5892-2-0.90.patch, 
> hbase-5892-2.patch, hbase-5892-3.patch, hbase-5892-4-0.90.patch, 
> hbase-5892-4.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285927#comment-13285927
 ] 

Andrew Wang commented on HBASE-5892:


Okay, respun both versions to fix the code style comments.





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-30 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892-4-0.90.patch





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-30 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892-4.patch





[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-30 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285854#comment-13285854
 ] 

Andrew Wang commented on HBASE-5892:


Re-diffed to get the right prefix. Thanks Zhihong.

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jonathan Hsieh
>Assignee: Andrew Wang
>  Labels: noob
> Attachments: hbase-5892-1.patch, hbase-5892-2-0.90.patch, 
> hbase-5892-2.patch, hbase-5892-3.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-30 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892-3.patch





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: (was: hbase-5892-2.patch)

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jonathan Hsieh
>Assignee: Andrew Wang
>  Labels: noob
> Attachments: hbase-5892-1.patch, hbase-5892-2-0.90.patch, 
> hbase-5892-2.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-29 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892-2.patch





[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-29 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285301#comment-13285301
 ] 

Andrew Wang commented on HBASE-5892:


Patch failed to apply because Jenkins tried to apply the version I made for 
0.90 to trunk. I don't know how to kick the build bot to make it do the right 
thing for different versions...

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
> Issue Type: Improvement
> Reporter: Jonathan Hsieh
> Assignee: Andrew Wang
> Labels: noob
> Attachments: hbase-5892-1.patch, hbase-5892-2-0.90.patch, 
> hbase-5892-2.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-25 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892-2-0.90.patch
            hbase-5892-2.patch

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
> Issue Type: Improvement
> Reporter: Jonathan Hsieh
> Assignee: Andrew Wang
> Labels: noob
> Attachments: hbase-5892-1.patch, hbase-5892-2-0.90.patch, 
> hbase-5892-2.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-25 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283759#comment-13283759
 ] 

Andrew Wang commented on HBASE-5892:


Another rev, tried to fix the one relevant FindBugs error I saw. The patch for 
trunk applied cleanly to 0.94 and 0.92; provided is the slightly modified patch 
for 0.90.

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
> Issue Type: Improvement
> Reporter: Jonathan Hsieh
> Assignee: Andrew Wang
> Labels: noob
> Attachments: hbase-5892-1.patch, hbase-5892-2-0.90.patch, 
> hbase-5892-2.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-25 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283714#comment-13283714
 ] 

Andrew Wang commented on HBASE-5892:


Ran TestHBaseFsck and had to fix a null pointer error, hence the new version 
of the patch. I'll port it to prior versions too.

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
> Issue Type: Improvement
> Reporter: Jonathan Hsieh
> Assignee: Andrew Wang
> Labels: noob
> Attachments: hbase-5892-1.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-25 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892-1.patch

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
> Issue Type: Improvement
> Reporter: Jonathan Hsieh
> Assignee: Andrew Wang
> Labels: noob
> Attachments: hbase-5892-1.patch, hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Commented] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282790#comment-13282790
 ] 

Andrew Wang commented on HBASE-5892:


I tried to do this refactor, essentially switching out Runnable for Callable 
and adding some more logging in the process. Let me know if it's not what you 
were thinking of.

I didn't do any testing beyond running hbck on my local machine, which seemed 
to work.
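
For readers unfamiliar with the pattern, here is a minimal sketch of what 
switching a Runnable work item to a Callable can look like. It is not the code 
from the attached patch: the names FuturesSketch, WorkItemDir and loadAll are 
hypothetical, and the "work" is a placeholder string operation rather than a 
real HDFS or regionserver call. Only standard java.util.concurrent calls are 
used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FuturesSketch {

    // Hypothetical work item. As a Callable it returns a result and may
    // throw, so no hand-written wait/notify bookkeeping is needed.
    static class WorkItemDir implements Callable<String> {
        private final String dir;

        WorkItemDir(String dir) {
            this.dir = dir;
        }

        @Override
        public String call() throws Exception {
            if (dir.isEmpty()) {
                throw new IllegalArgumentException("empty directory name");
            }
            // Placeholder for the real work (for example, loading a directory).
            return "loaded:" + dir;
        }
    }

    // Runs one work item per directory and collects results in input order.
    static List<String> loadAll(List<String> dirs, int threads)
            throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> items = new ArrayList<>();
            for (String dir : dirs) {
                items.add(new WorkItemDir(dir));
            }
            // invokeAll blocks until every task has completed and returns
            // the futures in the same order as the submitted tasks.
            List<Future<String>> futures = executor.invokeAll(items);
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    // The exception thrown inside call() is the cause.
                    results.add("failed:" + e.getCause().getMessage());
                }
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }
}
```

In this sketch a failure in one work item surfaces as an ExecutionException 
on that item's Future, so the caller can log it and keep going with the rest.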

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
> Issue Type: Improvement
> Reporter: Jonathan Hsieh
> Labels: noob
> Attachments: hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Status: Patch Available  (was: Open)

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
> Issue Type: Improvement
> Reporter: Jonathan Hsieh
> Labels: noob
> Attachments: hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).





[jira] [Updated] (HBASE-5892) [hbck] Refactor parallel WorkItem* to Futures.

2012-05-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HBASE-5892:
---

Attachment: hbase-5892.patch

> [hbck] Refactor parallel WorkItem* to Futures.
> --
>
> Key: HBASE-5892
> URL: https://issues.apache.org/jira/browse/HBASE-5892
> Project: HBase
> Issue Type: Improvement
> Reporter: Jonathan Hsieh
> Labels: noob
> Attachments: hbase-5892.patch
>
>
> This would convert WorkItem* logic (with low level notifies, and rough 
> exception handling)  into a more canonical Futures pattern.
> Currently there are two instances of this pattern (for loading hdfs dirs, for 
> contacting regionservers for assignments, and soon -- for loading hdfs 
> .regioninfo files).
