[
https://issues.apache.org/jira/browse/HBASE-29365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang resolved HBASE-29365.
-------------------------------
Fix Version/s: 2.7.0
3.0.0-beta-2
2.6.3
2.5.12
Hadoop Flags: Reviewed
Resolution: Fixed
Pushed to all active branches.
Thanks [~junegunn]!
> Improve option parser of PerformanceEvaluation
> ----------------------------------------------
>
> Key: HBASE-29365
> URL: https://issues.apache.org/jira/browse/HBASE-29365
> Project: HBase
> Issue Type: Bug
> Components: PE
> Reporter: Junegunn Choi
> Assignee: Junegunn Choi
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.3, 2.5.12
>
>
> h2. Summary
> {{PerformanceEvaluation.parseOpts}} spans over 300 lines with significant
> code duplication, and has several usability issues. It needs to be refactored
> and improved.
> *
> [https://github.com/apache/hbase/blob/787f5459883211b6017cee6a9bcb109f3cb66ddb/hbase-diagnostics/src/main/java/org/apache/hadoop/hbase/PerformanceEvaluation.java#L2845-L3158]
> h2. Problems
> h3. Code duplication
> The same code pattern is repeated for each option.
> {code:java}
> final String rows = "--rows=";
> if (cmd.startsWith(rows)) {
> opts.perClientRunRows = Long.parseLong(cmd.substring(rows.length()));
> continue;
> }
> {code}
> * Define the option prefix as a constant
> * Check if the argument starts with the prefix
> * Use {{substring}} to extract the value
> * Move on to the next argument
> We can easily refactor this pattern and reduce the code duplication.
> h3. Usability issues
> Malformed arguments are not handled gracefully, often resulting in confusing
> error messages. Here are some examples.
> h4. Confusing error messages
> The {{--rows}} option expects a value, but if omitted, it prints a
> {{ClassNotFoundException}} exception, followed by an error message that
> doesn't clearly indicate the problem.
> {noformat}
> $ bin/hbase pe --rows
> INFO [main {}] hbase.PerformanceEvaluation: No class found for command:
> --rows
> java.lang.ClassNotFoundException: --rows
> at
> jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
> ~[?:?]
> at
> jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
> ~[?:?]
> at java.lang.ClassLoader.loadClass(ClassLoader.java:525) ~[?:?]
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.isCustomTestClass(PerformanceEvaluation.java:3258)
> ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.isCommandClass(PerformanceEvaluation.java:3251)
> ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:3139)
> ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3217)
> ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> ~[hadoop-common-3.4.1.jar:?]
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)
> ~[hadoop-common-3.4.1.jar:?]
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3273)
> ~[hbase-diagnostics-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
> ERROR: Unrecognized option/command: --rows
> Usage: hbase pe <OPTIONS> [-D<property=value>]* <command|class> <nclients>
> ...
> {noformat}
> The same holds true for other options that expect a value.
> —
> The error message for invalid values can also be improved.
> {noformat}
> $ bin/hbase pe --rows= --table= --compress=
> java.lang.NumberFormatException: For input string: ""
> at
> java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
> at java.base/java.lang.Long.parseLong(Long.java:721)
> at java.base/java.lang.Long.parseLong(Long.java:836)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2864)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3217)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3273)
> {noformat}
> It doesn't show which option caused the error, so you would have to guess it
> or follow the stack trace to find out.
> h4. Inconsistent handling of boolean options
> PE supports many boolean options, but their handling is inconsistent. Some
> options require {{\-\-flag=true/false}} syntax, while others just accept
> {{\-\-flag}}.
> * In this case, this fails because {{autoFlush}} requires a value, while
> {{valueRandom}} doesn't.
> {code:java}
> $ bin/hbase pe --valueRandom --autoFlush sequentialWrite 1
> ...
> ERROR: Unrecognized option/command: --autoFlush
> {code}
> * And in this case, {{valueRandom}} is set to true, because {{=false}} after
> {{--valueRandom}} is simply ignored.
> {code:java}
> $ bin/hbase pe --valueRandom=false --autoFlush=true sequentialWrite 1
> SequentialWriteTest test run options={ ... ,"autoFlush":true, ...
> "valueRandom":true, ... }
> {code}
> * Another problem is that they silently ignore illegal values and just treat
> them as {{false}}. This can lead to confusion, and we should be more strict
> about it.
> {code:java}
> $ bin/hbase pe --autoFlush=yes sequentialWrite 1
> SequentialWriteTest test run options={ ..., "autoFlush":false, ... }
> {code}
> h2. Suggestion
> Refactor the code, make the behavior more consistent, and improve the error
> messages.
> h2. Result
> * Shaved off over 180 lines of code
> * Fixed hidden bugs
> * Better error messages
> h3. Examples
> {code:java}
> $ bin/hbase pe --rows
> java.lang.IllegalArgumentException: Invalid option: --rows
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2943)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3033)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3089)
> Caused by: java.lang.IllegalArgumentException: --rows requires a value
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2937)
> ... 4 more
> {code}
> {code:java}
> $ bin/hbase pe --rows= --table= --compress=
> java.lang.IllegalArgumentException: Invalid option: --rows=
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2943)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3033)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3089)
> Caused by: java.lang.NumberFormatException: For input string: ""
> at
> java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
> at java.base/java.lang.Long.parseLong(Long.java:721)
> at java.base/java.lang.Long.parseLong(Long.java:836)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.lambda$parseOpts$14(PerformanceEvaluation.java:2866)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2939)
> ... 4 more
> {code}
> {code:java}
> $ bin/hbase pe sequentialWrite 1
> SequentialWriteTest test run options={ ..., "autoFlush":false, ...
> "valueRandom":false, ... }
> {code}
> {code:java}
> $ bin/hbase pe --autoFlush --valueRandom sequentialWrite 1
> SequentialWriteTest test run options={ ..., "autoFlush":true, ...
> "valueRandom":true, ...}
> ...
> {code}
> {code:java}
> $ bin/hbase pe --autoFlush=true --valueRandom=false sequentialWrite 1
> SequentialWriteTest test run options={ ..., "autoFlush":true, ...
> "valueRandom":false, ... }
> ...
> {code}
> {code:java}
> $ bin/hbase pe --autoFlush=yes sequentialWrite 1
> java.lang.IllegalArgumentException: Invalid option: --autoFlush=yes
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2943)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:3033)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:97)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:3089)
> Caused by: java.lang.IllegalArgumentException: Invalid boolean value: yes
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseBoolean(PerformanceEvaluation.java:2973)
> at
> org.apache.hadoop.hbase.PerformanceEvaluation.parseOpts(PerformanceEvaluation.java:2929)
> ... 4 more
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)