Gustavo de Morais created FLINK-39125:
-----------------------------------------

             Summary: Support injective casts from CHAR/VARCHAR to 
BINARY/VARBINARY for upsert key preservation
                 Key: FLINK-39125
                 URL: https://issues.apache.org/jira/browse/FLINK-39125
             Project: Flink
          Issue Type: Improvement
          Components: Table SQL / Planner
    Affects Versions: 2.2.0
            Reporter: Gustavo de Morais
            Assignee: Gustavo de Morais
             Fix For: 2.3.0


When users cast a VARCHAR key column to VARBINARY, the upsert key uniqueness is 
lost because the cast is not recognized as injective. UTF-8 encoding is itself 
injective (distinct strings always produce distinct byte sequences), so we can 
safely mark these casts as injective when the binary target has sufficient 
capacity. A target that is too small could truncate the encoded bytes and map 
distinct strings to the same value. The cast is injective under the following 
conditions:
 * VARCHAR(MAX) → VARBINARY(MAX): both sides are unbounded

 * VARCHAR(x) → VARBINARY(y) where y >= x * 4: target can hold the worst-case 
UTF-8 encoding (4 bytes per character)

 * Bounded source to unbounded (MAX) target: always fits

This applies to all four cross-family combinations: CHAR/VARCHAR to 
BINARY/VARBINARY.
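
The conditions above can be sketched as a small length check. This is an 
illustrative sketch only, not the planner's actual implementation: the class 
name, method name, and the use of Integer.MAX_VALUE as a stand-in for an 
unbounded (MAX) length are all assumptions made for the example.

```java
// Hypothetical sketch of the length rule described in this issue.
// None of these names are taken from the Flink code base.
final class InjectiveCastSketch {

    // Assumption: an unbounded (MAX) length is represented by Integer.MAX_VALUE.
    static final int MAX_LENGTH = Integer.MAX_VALUE;

    // Worst-case UTF-8 encoding size per character, as stated in the issue.
    static final int MAX_UTF8_BYTES_PER_CHAR = 4;

    /**
     * Returns true if a cast from a character string of declared length
     * sourceLength to a binary string of declared length targetLength
     * can never truncate the UTF-8 encoding of the source value.
     */
    static boolean isInjectiveStringToBinaryCast(int sourceLength, int targetLength) {
        if (targetLength == MAX_LENGTH) {
            // Covers MAX -> MAX and bounded -> MAX: the target always fits.
            return true;
        }
        if (sourceLength == MAX_LENGTH) {
            // Unbounded source into a bounded target may be truncated.
            return false;
        }
        // Use long arithmetic so that sourceLength * 4 cannot overflow.
        return (long) targetLength >= (long) sourceLength * MAX_UTF8_BYTES_PER_CHAR;
    }

    public static void main(String[] args) {
        // VARCHAR(10) -> VARBINARY(40): 40 >= 10 * 4
        System.out.println(isInjectiveStringToBinaryCast(10, 40));         // true
        // VARCHAR(10) -> VARBINARY(39): one byte short of the worst case
        System.out.println(isInjectiveStringToBinaryCast(10, 39));         // false
        // VARCHAR(10) -> VARBINARY(MAX)
        System.out.println(isInjectiveStringToBinaryCast(10, MAX_LENGTH)); // true
    }
}
```

The same check would apply unchanged to CHAR sources and BINARY targets, since 
only the declared lengths enter the comparison.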



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
