[
https://issues.apache.org/jira/browse/CODEC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089635#comment-18089635
]
Ruiqi Dong commented on CODEC-341:
----------------------------------
Base32 has a similar issue. I created a new ticket for it:
[https://issues.apache.org/jira/browse/CODEC-342].
> Base16.Builder#setEncodeTable(...) can create an instance that cannot decode
> its own output
> -------------------------------------------------------------------------------------------
>
> Key: CODEC-341
> URL: https://issues.apache.org/jira/browse/CODEC-341
> Project: Commons Codec
> Issue Type: Bug
> Reporter: Ruiqi Dong
> Priority: Major
>
> *Summary*
> Base16.Builder exposes setEncodeTable(...), which suggests callers can
> provide a custom Base16 alphabet. Encoding does honor the custom table, but
> the builder only switches the decode table between the built-in upper-case
> and lower-case variants. As a result, a Base16 instance created with an
> arbitrary custom alphabet can emit encoded data that the same instance
> decodes incorrectly. *This issue also affects Base32.* BTW, is it fine for me
> to report Base32 in this ticket, or do I need to create a new ticket?
>
> *Affected code*
> File: src/main/java/org/apache/commons/codec/binary/Base16.java
> File: src/main/java/org/apache/commons/codec/binary/Base32.java
> {code:java}
> // Base16
> @Override
> public Builder setEncodeTable(final byte... encodeTable) {
>     super.setDecodeTableRaw(Arrays.equals(encodeTable, LOWER_CASE_ENCODE_TABLE)
>             ? LOWER_CASE_DECODE_TABLE : UPPER_CASE_DECODE_TABLE);
>     return super.setEncodeTable(encodeTable);
> }{code}
> {code:java}
> // Base32
> @Override
> public Builder setEncodeTable(final byte... encodeTable) {
>     super.setDecodeTableRaw(Arrays.equals(encodeTable, HEX_ENCODE_TABLE)
>             ? HEX_DECODE_TABLE : DECODE_TABLE);
>     return super.setEncodeTable(encodeTable);
> }{code}
>
> *Reproducer*
> Add the following test to
> src/test/java/org/apache/commons/codec/binary/Base16Test.java:
> {code:java}
> @Test
> void testBuilderCustomEncodeTableAffectsDecodeTable() {
>     final byte[] encodeTable = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);
>     // Swap the first two symbols so the alphabet differs from the built-in tables.
>     final byte tmp = encodeTable[0];
>     encodeTable[0] = encodeTable[1];
>     encodeTable[1] = tmp;
>     final Base16 base16 = Base16.builder().setEncodeTable(encodeTable).get();
>     final byte[] encoded = base16.encode(new byte[] { 1 });
>     assertEquals("10", new String(encoded, StandardCharsets.US_ASCII),
>             "A custom Base16 alphabet should affect encoding");
>     assertArrayEquals(new byte[] { 1 }, base16.decode(encoded),
>             "A custom Base16 alphabet should decode its own encoded output");
> }{code}
> Run:
> {code:java}
> mvn -q -Dtest=org.apache.commons.codec.binary.Base16Test#testBuilderCustomEncodeTableAffectsDecodeTable test {code}
> Observed behavior:
> The encoding assertion passes, showing that the custom alphabet is used. The
> encoded output is:
> {code:java}
> 10{code}
> But the decode assertion fails because "10" is interpreted with the default
> decode table, which reads it as 0x10 (16):
> {code:java}
> array contents differ at index [0], expected: <1> but was: <16> {code}
> Expected behavior:
> If setEncodeTable(...) is part of the public builder API, the resulting
> Base16 instance should use a matching decode table so that it can decode its
> own output consistently. If arbitrary custom alphabets are not supported, the
> builder should reject them instead of silently pairing them with an
> incompatible decode table.
> Add the following test to
> src/test/java/org/apache/commons/codec/binary/Base32Test.java:
> {code:java}
> @Test
> void testBuilderCustomEncodeTableAffectsDecodeTable() {
>     final byte[] encodeTable = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".getBytes(StandardCharsets.US_ASCII);
>     // Swap the first two symbols so the alphabet differs from the built-in tables.
>     final byte tmp = encodeTable[0];
>     encodeTable[0] = encodeTable[1];
>     encodeTable[1] = tmp;
>     final Base32 base32 = Base32.builder().setEncodeTable(encodeTable).setLineLength(0).get();
>     final byte[] encoded = base32.encode(new byte[] { 0 });
>     assertEquals("BB======", new String(encoded, StandardCharsets.US_ASCII),
>             "A custom Base32 alphabet should affect encoding");
>     assertArrayEquals(new byte[] { 0 }, base32.decode(encoded),
>             "A custom Base32 alphabet should decode its own encoded output");
> }{code}
> Run:
> {code:java}
> mvn -q -Dtest=org.apache.commons.codec.binary.Base32Test#testBuilderCustomEncodeTableAffectsDecodeTable test {code}
> Observed behavior:
> The encoding assertion passes, showing that the custom alphabet is used. The
> encoded output is:
> {code:java}
> BB====== {code}
> But the decode assertion fails because "BB======" is interpreted with the
> default decode table:
> {code:java}
> array contents differ at index [0], expected: <0> but was: <8> {code}
> Expected behavior:
> The resulting Base32 instance should use a matching decode table so that it
> can decode its own output consistently. If arbitrary custom alphabets are not
> supported, the builder should reject them instead of silently pairing them
> with an incompatible decode table.
>
>
> This is a configuration/state inconsistency in a public API. The builder
> accepts a custom alphabet and encoding follows that configuration, but
> decoding silently continues to interpret characters under a different
> alphabet. That makes the configured Base16 and Base32 instances internally
> inconsistent.
>
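
For illustration only, here is a minimal standalone sketch of the two
behaviors the report asks for: deriving a decode table that inverts a custom
encode table, and rejecting alphabets that cannot be inverted. It is not the
Commons Codec implementation and not a proposed patch. The class name
DecodeTableSketch, the methods deriveDecodeTable and decodePair, and the
256-entry table layout are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Commons Codec.
class DecodeTableSketch {

    /**
     * Builds a decode table, indexed by unsigned byte value, that inverts the
     * given encode table. Bytes outside the alphabet map to -1.
     * Assumption: a 256-entry table; the library's real tables may differ.
     */
    public static byte[] deriveDecodeTable(final byte[] encodeTable, final int expectedLength) {
        if (encodeTable == null || encodeTable.length != expectedLength) {
            throw new IllegalArgumentException(
                    "Alphabet must contain exactly " + expectedLength + " symbols");
        }
        final byte[] decodeTable = new byte[256];
        Arrays.fill(decodeTable, (byte) -1);
        for (int value = 0; value < encodeTable.length; value++) {
            final int symbol = encodeTable[value] & 0xff;
            if (decodeTable[symbol] != -1) {
                // A repeated symbol would make decoding ambiguous, so reject it.
                throw new IllegalArgumentException("Duplicate symbol in alphabet: " + (char) symbol);
            }
            decodeTable[symbol] = (byte) value;
        }
        return decodeTable;
    }

    /** Decodes one pair of Base16 symbols into a value in the range 0..255. */
    public static int decodePair(final byte[] decodeTable, final byte high, final byte low) {
        final int hi = decodeTable[high & 0xff];
        final int lo = decodeTable[low & 0xff];
        if (hi < 0 || lo < 0) {
            throw new IllegalArgumentException("Symbol not in alphabet");
        }
        return hi << 4 | lo;
    }

    public static void main(final String[] args) {
        final byte[] standard = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);
        // Same alphabet as in the reproducer: the first two symbols are swapped.
        final byte[] swapped = standard.clone();
        swapped[0] = standard[1];
        swapped[1] = standard[0];
        final byte[] encoded = "10".getBytes(StandardCharsets.US_ASCII);
        // Matching decode table: prints 1, the original byte value.
        System.out.println(decodePair(deriveDecodeTable(swapped, 16), encoded[0], encoded[1]));
        // Default decode table: prints 16, the value seen in the failing assertion.
        System.out.println(decodePair(deriveDecodeTable(standard, 16), encoded[0], encoded[1]));
    }
}
```

The same inversion would apply to a 32-symbol Base32 alphabet by passing 32 as
the expected length. Whether the builder should derive the table this way or
reject non-standard alphabets outright is left open here.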
--
This message was sent by Atlassian Jira
(v8.20.10#820010)