[ 
https://issues.apache.org/jira/browse/THRIFT-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476450#comment-17476450
 ] 

Richard H. Gumpertz commented on THRIFT-5497:
---------------------------------------------

An improvement: in the XXX_types.h generated file, just after the line 
containing "namespace XXX_server {" we could add the following typedef
     typedef std::basic_string<unsigned_char> _binary;
when the "binary_as_unsigned_string" option is enabled.  When it is disabled, 
add the following typedef instead:
    typedef std::string _binary;

Then, when generating the declarations for Thrift binary fields, always use the 
type "_binary" for binary fields.

As for the overhead of multiple copying, note that once the data is put into 
the string by the reader, it need never be copied again.  Every implementation 
of strings that I have encountered only does copy on write.  The data itself is 
not copied when a string is copied.  As for the end user not having to make a 
copy, that is what basic_string<...>::data() and basic_string<...>::length() 
are for.

If I could only come up with a better name for the thrift compiler option, I 
would submit a pull request for the changes immediately.  The name 
"binary_as_unsigned_string" just seems to long.  What do people think of 
"binary_unsigned" as the option name?

> Change the C++ representation of binary be std::basic_string<unsigned_char>
> ---------------------------------------------------------------------------
>
>                 Key: THRIFT-5497
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5497
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Richard H. Gumpertz
>            Priority: Minor
>
> I would like to change the C++ representation of binary from "std::string" to 
> "std::basic_string<unsigned_char>".  This would allow C++ templates to 
> distinguish the Thrift type "string" from "binary".
> If nothing else, ::apached::thrift::to_string could print binary differently 
> than string.  Currently, it can get pretty ugly when one calls printTo on a 
> struct containing a binary field that has data that is not readable!
> There is an issue as to how they should be displayed by default; using 
> \xDE\xAD\xBE\xEF with a \ before every byte seems ugly.  Alternativlely, 
> "DEADBEEF"_x because it could also be used as input if the C++ operator""_x 
> is defined by the user?  Alternatively just <binary> or <27-byte binary> 
> (mirroring <null> for unset fields), hiding the contents?  Perhaps leave 
> to_string(std::basic_string<unsigned char> &value) undefined and force the 
> user to define it if needed?  Thoughts & suggestions???
> Given that such a change could break existing code, it should probably be 
> enabled under a new thrift compiler option to --gen cpp.  I haven't come up 
> with a good name for such an option yet: --gen cpp:binary_as_unsigned_string 
> is explicit but wordy.  Any suggestions???
> I would be willing to do the work necessary to add the relevant code if there 
> are no volunteers who have a better understanding of precisely what code 
> would have to be updated (or at least help me find such code)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to