Heejong Lee created BEAM-7008:
---------------------------------

             Summary: standardize UTF-8 string coder encodings
                 Key: BEAM-7008
                 URL: https://issues.apache.org/jira/browse/BEAM-7008
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-core, sdk-py-core
            Reporter: Heejong Lee
            Assignee: Heejong Lee


It looks like the UTF-8 string coders in the Java and Python SDKs use different 
encoding schemes. StringUtf8Coder in the Java SDK writes the varint-encoded 
length of the input string before the actual data bytes, whereas StrUtf8Coder 
in the Python SDK encodes the input string directly to bytes with no length 
prefix. We should unify the encoding scheme for UTF-8 strings across the SDKs 
and make it a standard coder.
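
To make the mismatch concrete, here is a minimal sketch of the two schemes as 
described above. It is not the actual SDK code: the helper names are 
hypothetical, and it assumes the varint is the usual base-128 format (7 bits 
per byte, low-order group first) and that the length counts encoded bytes 
rather than characters.

```python
def encode_varint(n: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last.

    Assumption: this is the varint format meant in the description above.
    """
    if n < 0:
        raise ValueError("varint sketch only handles non-negative integers")
    out = bytearray()
    while True:
        group = n & 0x7F
        n >>= 7
        if n:
            out.append(group | 0x80)  # more bytes follow
        else:
            out.append(group)  # final byte
            return bytes(out)


def encode_length_prefixed(s: str) -> bytes:
    """Hypothetical stand-in for the Java-style scheme: varint length, then bytes."""
    data = s.encode("utf-8")
    return encode_varint(len(data)) + data


def encode_raw(s: str) -> bytes:
    """Hypothetical stand-in for the Python-style scheme: UTF-8 bytes only."""
    return s.encode("utf-8")


if __name__ == "__main__":
    text = "h\u00e9llo"
    print(encode_length_prefixed(text))
    print(encode_raw(text))
```

For the same input the two outputs differ by the leading length byte(s), so 
bytes produced by one scheme cannot be decoded by a reader expecting the other.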



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
