Ruiqi Dong created CODEC-342:
--------------------------------
Summary: Base32.Builder#setEncodeTable(...) can create an instance
that cannot decode its own output
Key: CODEC-342
URL: https://issues.apache.org/jira/browse/CODEC-342
Project: Commons Codec
Issue Type: Bug
Reporter: Ruiqi Dong
*Summary*
`Base32.Builder` exposes `setEncodeTable(...)`, which suggests callers can
provide a custom Base32 alphabet. Encoding does honor the custom table, but the
builder only switches the decode table between the built-in standard and hex
variants. As a result, a `Base32` instance created with an arbitrary custom
alphabet can emit encoded data that the same instance decodes incorrectly.
*Affected code*
File: `src/main/java/org/apache/commons/codec/binary/Base32.java`
{code:java}
@Override
public Builder setEncodeTable(final byte... encodeTable) {
    super.setDecodeTableRaw(Arrays.equals(encodeTable, HEX_ENCODE_TABLE) ? HEX_DECODE_TABLE : DECODE_TABLE);
    return super.setEncodeTable(encodeTable);
}
{code}
So any table other than the exact built-in hex alphabet gets paired with the
default decode table. Encoding uses the configured `encodeTable`, but decoding
uses the mismatched `decodeTable`, so encode/decode no longer agree on the
alphabet.
*Reproducer*
Add the following test to
`src/test/java/org/apache/commons/codec/binary/Base32Test.java`:
{code:java}
@Test
void testBuilderCustomEncodeTableAffectsDecodeTable() {
    final byte[] encodeTable = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".getBytes(StandardCharsets.US_ASCII);
    // Swap 'A' and 'B' so the alphabet matches neither built-in table.
    final byte tmp = encodeTable[0];
    encodeTable[0] = encodeTable[1];
    encodeTable[1] = tmp;
    final Base32 base32 = Base32.builder().setEncodeTable(encodeTable).setLineLength(0).get();
    final byte[] encoded = base32.encode(new byte[] { 0 });
    assertEquals("BB======", new String(encoded, StandardCharsets.US_ASCII),
        "A custom Base32 alphabet should affect encoding");
    assertArrayEquals(new byte[] { 0 }, base32.decode(encoded),
        "A custom Base32 alphabet should decode its own encoded output");
}
{code}
Run:
{code:bash}
mvn -q -Dtest=org.apache.commons.codec.binary.Base32Test#testBuilderCustomEncodeTableAffectsDecodeTable test
{code}
*Observed behavior*
The encoding assertion passes, showing that the custom alphabet is used. The
encoded output is:
{noformat}
BB======
{noformat}
But the decode assertion fails because `"BB======"` is interpreted with the
default decode table:
{noformat}
array contents differ at index [0], expected: <0> but was: <8>
{noformat}
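The value 8 can be checked by hand. The first decoded byte is made of the 5 bits of the first symbol followed by the top 3 bits of the second symbol. The following is a minimal standalone sketch of that arithmetic; `MismatchSketch` and `firstDecodedByte` are hypothetical names used only for illustration and are not part of Commons Codec.

```java
final class MismatchSketch {

    /**
     * Hypothetical helper: computes the first decoded byte from the first two
     * Base32 symbols, using the position of each symbol in the given alphabet
     * as its 5-bit value.
     */
    static int firstDecodedByte(final String alphabet, final char first, final char second) {
        final int q0 = alphabet.indexOf(first);   // 5-bit value of the first symbol
        final int q1 = alphabet.indexOf(second);  // 5-bit value of the second symbol
        if (q0 < 0 || q1 < 0) {
            throw new IllegalArgumentException("symbol not in alphabet");
        }
        // All 5 bits of q0, then the top 3 bits of q1, form the first byte.
        return ((q0 << 3) | (q1 >> 2)) & 0xFF;
    }
}
```

With the standard alphabet, `B` has value 1, so `"BB"` yields `(1 << 3) | (1 >> 2) = 8`. With the swapped alphabet from the reproducer, `B` has value 0 and the result is 0, which is what the instance should have returned.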
*Expected behavior*
If `setEncodeTable(...)` is part of the public builder API, the resulting
`Base32` instance should use a matching decode table so that it can decode its
own output consistently. If arbitrary custom alphabets are not supported, the
builder should reject them instead of silently pairing them with an
incompatible decode table.
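Both options could be covered by deriving the decode table from the supplied encode table and rejecting tables that cannot be inverted. The following is only a sketch under stated assumptions, not a proposed patch: `DecodeTableSketch` and `buildDecodeTable` are hypothetical names, the 256-entry layout with -1 for invalid symbols is an assumption and may not match the layout `Base32` uses internally, and it does not reproduce any case-insensitive handling the built-in tables may have.

```java
import java.util.Arrays;

final class DecodeTableSketch {

    /**
     * Hypothetical helper: builds a reverse lookup table for a 32-symbol
     * alphabet. Entry [symbol] holds the 5-bit value, or -1 if the symbol
     * is not part of the alphabet.
     */
    static byte[] buildDecodeTable(final byte[] encodeTable, final byte pad) {
        if (encodeTable == null || encodeTable.length != 32) {
            throw new IllegalArgumentException("encodeTable must contain exactly 32 symbols");
        }
        final byte[] decodeTable = new byte[256];
        Arrays.fill(decodeTable, (byte) -1);
        for (int i = 0; i < encodeTable.length; i++) {
            final int symbol = encodeTable[i] & 0xFF;
            if (symbol == (pad & 0xFF)) {
                throw new IllegalArgumentException("encodeTable must not contain the padding byte");
            }
            if (decodeTable[symbol] != -1) {
                // A duplicate symbol would make decoding ambiguous.
                throw new IllegalArgumentException("duplicate symbol in encodeTable at index " + i);
            }
            decodeTable[symbol] = (byte) i;
        }
        return decodeTable;
    }
}
```

For the alphabet used in the reproducer (standard alphabet with `A` and `B` swapped), this maps `B` to 0 and `A` to 1, so `"BB======"` would decode back to a single zero byte.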
This is a configuration/state inconsistency in a public API. The builder
accepts a custom alphabet and encoding follows that configuration, but decoding
silently continues to interpret characters under a different alphabet. That
makes the configured `Base32` instance internally inconsistent.