Re: RFR: 8251989: Hex formatting and parsing utility [v10]

Chris Hegarty Fri, 27 Nov 2020 02:49:49 -0800

On Wed, 25 Nov 2020 22:51:44 GMT, Roger Riggs <[email protected]> wrote:


>> java.util.HexFormat utility:
>> 
>>  - Format and parse hexadecimal strings, with parameters for delimiter, 
>> prefix, suffix and upper/lowercase
>>  - Static factories and builder methods to create HexFormat copies with 
>> modified parameters.
>>  - Consistent naming of methods for conversion of byte arrays to formatted 
>> strings and back: formatHex and parseHex
>>  - Consistent naming of methods for conversion of primitive types: 
>> toHexDigits... and fromHexDigits...
>>  - Prefix and suffixes now apply to each formatted value, not the string as 
>> a whole
>>  - Using java.util.Appendable as a target for buffered conversions so output 
>> to Writers and PrintStreams
>>    like System.out are supported in addition to StringBuilder. (IOExceptions 
>> are converted to unchecked exceptions)
>>  - Immutable and thread safe, a "value-based" class
>> 
>> See the [HexFormat 
>> javadoc](http://cr.openjdk.java.net/~rriggs/8251989-hex-formatter/java.base/java/util/HexFormat.html)
>>  for details.
>> 
>> Review comments and suggestions welcome.
>
> Roger Riggs has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains 19 additional 
> commits since the last revision:
> 
>  - Clarified that suffix() and prefix() methods do not return null, instead 
> the empty string is returned.
>  - Merge branch 'master' into 8251989-hex-formatter
>  - Merge branch 'master' into 8251989-hex-formatter
>  - Merge branch 'master' into 8251989-hex-formatter
>  - The HexFormat API indexing model for array and string ranges is changed
>    to describe the range using 'fromIndex (inclusive)' and 'toIndex 
> (exclusive)'.
>    
>    Initially, it was specified as 'index' and 'length'. However, both byte 
> arrays
>    and strings used in the HexFormat API typically use fromIndex and toIndex
>    to describe ranges.  Using the same indexing model can prevent mistakes.
>    
>    The change affects the methods and corresponding tests:
>    
>        formatHex(byte[] bytes, int fromIndex, int toIndex)
>        formatHex(A out, byte[] bytes, int fromIndex, int toIndex)
>        parseHex(char[] chars, int fromIndex, int toIndex)
>        parseHex(CharSequence string, int fromIndex, int toIndex)
>        fromHexDigits(CharSequence string, int fromIndex, int toIndex)
>        fromHexDigitsToLong(CharSequence string, int fromIndex, int toIndex)
>  - - Added @see and @link references to Integer.toHexString and 
> Long.toHexString
>    - Clarified parsing is case insensistive in various parse and fromXXX 
> methods
>    - Source level cleanup based on review comments
>    - Expanded some javadoc tag text to make it more descriptive
>    - Consistent use of 'hexadecimal' vs 'hex'
>  - Review comment updates to class javadoc
>  - Review comment updates, in the example code, and to describe the 
> characters used to convert to hexadecimal
>  - Correct length of StringBuilder in formatHex;
>    Correct bug in formatHex(char[], 2, 3) and add test for subranges of char[]
>  - Merge branch 'master' into 8251989-hex-formatter
>  - ... and 9 more: 
> https://git.openjdk.java.net/jdk/compare/55be0a99...b19d2827

Changes requested by chegar (Reviewer).

src/java.base/share/classes/java/util/HexFormat.java line 71:

> 69:  * <p>
> 70:  * For formatted hexadecimal string to byte array conversions the
> 71:  * {@code parseHex} methods include {@link #parseHex(CharSequence) 
> parseHex(string)} and

parseHex(string) -> parseHex(CharSequence)

src/java.base/share/classes/java/util/HexFormat.java line 467:

> 465: 
> 466:     /**
> 467:      * Returns a byte array containing hexadecimal values parsed from a 
> range of the string.

string -> charSequence. And flow through at param tag and param name.

src/java.base/share/classes/java/util/HexFormat.java line 448:

> 446: 
> 447:     /**
> 448:      * Returns a byte array containing hexadecimal values parsed from 
> the string.

string -> charSequence. And flow through at param tag and param name.

src/java.base/share/classes/java/util/HexFormat.java line 528:

> 526:      * a range of the character array.
> 527:      *
> 528:      * Each byte value is parsed as the prefix, two case insensitive 
> hexadecimal characters,

Each *char* value ...

src/java.base/share/classes/java/util/HexFormat.java line 538:

> 536:      * @param toIndex the final index of the range, exclusive.
> 537:      * @return a byte array with the values parsed from the character 
> array range
> 538:      * @throws IllegalArgumentException if the prefix or suffix is not 
> present for each byte value,

... for each *char* value ...  OR ... for each pair of char values?

src/java.base/share/classes/java/util/HexFormat.java line 617:

> 615:      * Returns the two hexadecimal characters for the {@code byte} value.
> 616:      * Each nibble (4 bits) from most significant to least significant 
> of the value
> 617:      * is formatted as if by {@link #toLowHexDigit(int) 
> toLowHexDigit(nibble)}.

It might be more straightforward to frame this in terms of toLowHexDigit and 
toHighHexDigit (rather than only toLowHexDigit).  `nibble`- should the param 
name for to(Low|High)HexDigit be named `nibble` ?

src/java.base/share/classes/java/util/HexFormat.java line 860:

> 858: 
> 859:     /**
> 860:      * Returns a value parsed from two hexadecimal characters in a 
> string.

sting -> charsequence   ( same comment more generally, anywhere a charsequence 
is referred to as a string )

src/java.base/share/classes/java/util/HexFormat.java line 47:

> 45:  * or choice of {@link #withUpperCase()} or {@link #withLowerCase()} 
> parameters.
> 46:  * <p>
> 47:  * For primitive to hexadecimal string conversions the {@code toHexDigits}

It took me a little while to grok the naming convention being used here. It 
might be helpful to ensure that all the terms are clearly defined and used 
consistently throughout. 

"hexadecimal string"  - consists of just 0-9, a-f, A-F  (i.e. no delimiter, 
prefix, suffix)
"formatted hexadecimal string" - consists of hexadecimal string, and prefix, 
delimiter, suffix.

The `parseXXX` and `formatXXX` are logically related - and operate on 
"formatted hexadecimal strings"

The `toXXX` and `fromXXX` are logically related - one is kind of the inverse of 
the other. And these operate (even when accepting/producing chars) on 
"hexadecimal strings" - ignoring almost all properties of the HexFormat, except 
the toXXX cares about upper/lower case. Does `fromXXX` care about upper/lower ?

-------------

PR: https://git.openjdk.java.net/jdk/pull/482

Re: RFR: 8251989: Hex formatting and parsing utility [v10]

Reply via email to