Ruiqi Dong created CODEC-342:
--------------------------------

             Summary: Base32.Builder#setEncodeTable(...) can create an instance that cannot decode its own output
                 Key: CODEC-342
                 URL: https://issues.apache.org/jira/browse/CODEC-342
             Project: Commons Codec
          Issue Type: Bug
            Reporter: Ruiqi Dong


*Summary*
`Base32.Builder` exposes `setEncodeTable(...)`, which suggests callers can 
provide a custom Base32 alphabet. Encoding does honor the custom table, but the 
builder only switches the decode table between the built-in standard and hex 
variants. As a result, a `Base32` instance created with an arbitrary custom 
alphabet can emit encoded data that the same instance decodes incorrectly.
 
*Affected code*
File: `src/main/java/org/apache/commons/codec/binary/Base32.java`
{code:java}
@Override
public Builder setEncodeTable(final byte... encodeTable) {
    super.setDecodeTableRaw(Arrays.equals(encodeTable, HEX_ENCODE_TABLE) ? HEX_DECODE_TABLE : DECODE_TABLE);
    return super.setEncodeTable(encodeTable);
}
{code}
So any table other than the exact built-in hex alphabet gets paired with the 
default decode table. Encoding uses the configured `encodeTable`, but decoding 
uses the mismatched `decodeTable`, so encode/decode no longer agree on the 
alphabet.
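One way to keep the two tables in agreement would be to derive the decode table from whatever encode table the caller supplies. The sketch below is not the Commons Codec implementation and does not use its internals: `DecodeTableSketch` and `buildDecodeTable` are hypothetical names, and the sketch assumes a 32-entry ASCII alphabet with -1 marking bytes outside it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class DecodeTableSketch {

    /**
     * Hypothetical helper (not part of Commons Codec): inverts a 32-entry
     * encode table into a lookup table indexed by ASCII byte value.
     */
    static byte[] buildDecodeTable(final byte[] encodeTable) {
        if (encodeTable.length != 32) {
            throw new IllegalArgumentException("A Base32 alphabet must have 32 entries");
        }
        final byte[] decodeTable = new byte[128];
        Arrays.fill(decodeTable, (byte) -1); // -1 marks bytes outside the alphabet
        for (int i = 0; i < encodeTable.length; i++) {
            final byte b = encodeTable[i];
            if (b < 0) {
                throw new IllegalArgumentException("Non-ASCII byte in alphabet at index " + i);
            }
            if (decodeTable[b] != -1) {
                throw new IllegalArgumentException("Duplicate byte in alphabet at index " + i);
            }
            decodeTable[b] = (byte) i; // the character at position i decodes to the 5-bit value i
        }
        return decodeTable;
    }

    public static void main(final String[] args) {
        // Standard alphabet with 'A' and 'B' swapped, as in the reproducer.
        final byte[] custom = "BACDEFGHIJKLMNOPQRSTUVWXYZ234567".getBytes(StandardCharsets.US_ASCII);
        final byte[] decode = buildDecodeTable(custom);
        System.out.println(decode['B']); // prints 0
        System.out.println(decode['A']); // prints 1
    }
}
```

With a table built this way, `B` decodes to 0 for the swapped alphabet, which matches what the encoder emits.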
 
*Reproducer*
Add the following test to 
`src/test/java/org/apache/commons/codec/binary/Base32Test.java`:
{code:java}
@Test
void testBuilderCustomEncodeTableAffectsDecodeTable() {
    final byte[] encodeTable = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".getBytes(StandardCharsets.US_ASCII);
    // Swap the first two symbols so the alphabet differs from both built-in tables.
    final byte tmp = encodeTable[0];
    encodeTable[0] = encodeTable[1];
    encodeTable[1] = tmp;

    final Base32 base32 = Base32.builder().setEncodeTable(encodeTable).setLineLength(0).get();
    final byte[] encoded = base32.encode(new byte[] { 0 });
    assertEquals("BB======", new String(encoded, StandardCharsets.US_ASCII),
            "A custom Base32 alphabet should affect encoding");
    assertArrayEquals(new byte[] { 0 }, base32.decode(encoded),
            "A custom Base32 alphabet should decode its own encoded output");
}
{code}
Run:
{code:bash}
mvn -q -Dtest=org.apache.commons.codec.binary.Base32Test#testBuilderCustomEncodeTableAffectsDecodeTable test
{code}
*Observed behavior*
The encoding assertion passes, showing that the custom alphabet is used. The 
encoded output is:
{code:java}
BB======
{code}
But the decode assertion fails because `"BB======"` is interpreted with the 
default decode table, in which `B` maps to 1 rather than 0:
{code:java}
array contents differ at index [0], expected: <0> but was: <8>
{code}
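The value 8 follows from the bit arithmetic of Base32: two 5-bit symbols give 10 bits, and the top 8 form the first output byte. The following is a minimal, self-contained sketch of that arithmetic only; `MismatchArithmetic` and `firstByte` are hypothetical names used for illustration and are not part of Commons Codec.

```java
final class MismatchArithmetic {

    static final String STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    // The reproducer's alphabet: standard with 'A' and 'B' swapped.
    static final String SWAPPED = "BACDEFGHIJKLMNOPQRSTUVWXYZ234567";

    /** Decodes the first output byte from the first two Base32 characters under the given alphabet. */
    static int firstByte(final String encoded, final String alphabet) {
        final int hi = alphabet.indexOf(encoded.charAt(0)); // first 5-bit group
        final int lo = alphabet.indexOf(encoded.charAt(1)); // second 5-bit group
        // 10 bits in total; dropping the low 2 bits leaves the first 8-bit byte.
        return ((hi << 5) | lo) >> 2;
    }

    public static void main(final String[] args) {
        System.out.println(firstByte("BB======", STANDARD)); // prints 8: 'B' is read as 1
        System.out.println(firstByte("BB======", SWAPPED));  // prints 0: 'B' is read as 0
    }
}
```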
*Expected behavior*
If `setEncodeTable(...)` is part of the public builder API, the resulting 
`Base32` instance should use a matching decode table so that it can decode its 
own output consistently. If arbitrary custom alphabets are not supported, the 
builder should reject them instead of silently pairing them with an 
incompatible decode table.
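If rejection were the chosen behavior, the check could look roughly like the sketch below. It is an illustration under stated assumptions, not a proposed patch: `AlphabetGuardSketch` and `requireSupportedAlphabet` are hypothetical names, the two alphabets are written out locally rather than taken from `Base32`, and `IllegalArgumentException` is an assumed choice of exception type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AlphabetGuardSketch {

    // Local copies of the two Base32 alphabets (standard and extended hex).
    private static final byte[] STANDARD =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] HEX =
            "0123456789ABCDEFGHIJKLMNOPQRSTUV".getBytes(StandardCharsets.US_ASCII);

    /** Hypothetical guard: returns the table if it is a built-in alphabet, otherwise throws. */
    static byte[] requireSupportedAlphabet(final byte[] encodeTable) {
        if (Arrays.equals(encodeTable, STANDARD) || Arrays.equals(encodeTable, HEX)) {
            return encodeTable;
        }
        throw new IllegalArgumentException(
                "Unsupported Base32 alphabet: no matching decode table is available");
    }
}
```

Failing fast at configuration time would surface the mismatch before any data is encoded.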
 
This is a configuration/state inconsistency in a public API. The builder 
accepts a custom alphabet and encoding follows that configuration, but decoding 
silently continues to interpret characters under a different alphabet. That 
makes the configured `Base32` instance internally inconsistent.
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
