Liu Chong created FLINK-33197:
---------------------------------

             Summary: PyFlink support for ByteArraySchema
                 Key: FLINK-33197
                 URL: https://issues.apache.org/jira/browse/FLINK-33197
             Project: Flink
          Issue Type: New Feature
          Components: API / Python
    Affects Versions: 1.17.0
            Reporter: Liu Chong


Currently, in the Python Flink API, only SimpleStringSchema is available when
reading messages from a Kafka source.
If the data is in an arbitrary binary format (e.g. a marshalled Protocol Buffer
message), it may not be decodable with the default 'utf-8' encoding.
The current workaround is to manually set the encoding to 'ISO-8859-1', which
maps every possible byte value to a character.
However, this is not an elegant solution.
We should support ByteArraySchema, which outputs a raw byte array for subsequent
unmarshalling.
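
To illustrate the encoding issue, here is a minimal sketch in plain Python. It
does not use PyFlink or go through the JVM code path; the payload bytes are made
up for the example and only stand in for a serialized binary message.

```python
# Hypothetical binary payload; the trailing bytes are not valid UTF-8.
payload = bytes([0x08, 0x96, 0x01, 0xFF, 0xFE])

# Strict UTF-8 decoding rejects byte sequences that are not valid UTF-8.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ISO-8859-1 maps each byte 0x00-0xFF to exactly one code point,
# so decoding and re-encoding returns the original bytes.
as_text = payload.decode("iso-8859-1")
recovered = as_text.encode("iso-8859-1")

print(utf8_ok)
print(recovered == payload)
```

The round trip works, but every consumer has to remember to re-encode the
string with the same charset before unmarshalling, which is the inelegance a
byte-array schema would avoid.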



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
