[ 
https://issues.apache.org/jira/browse/THRIFT-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17617175#comment-17617175
 ] 

Yuxuan Wang commented on THRIFT-5587:
-------------------------------------

[~jensg] in the example this is a valid UUID string (in the 8-4-4-4-12 
canonical form), but what if it is invalid?

looking at the current compiler code (via `git grep -i uuid` under 
`compiler/cpp/src`) I don't see any actual parsing of the uuid string at the 
compiler level (please let me know if I missed it), which means that from the 
compiler's point of view it just has a string literal that's supposed to be 
a uuid, and it's the language library's responsibility to convert that 
string to a uuid. there's also no way to gracefully handle errors/exceptions in 
thrift-generated language code at const definitions/default value definitions, 
which means that if the string cannot be converted to a uuid, the result must 
be a runtime exception/panic/etc.

compare these 2 examples:

{code}
const string FOO = 123
{code}

vs.

{code}
const uuid FOO = "123"
{code}

the first thrift file will cause a compiler error, while the second will cause 
a runtime error instead.
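
to make the second case concrete, here is a minimal Python sketch of what happens if generated code simply hands the literal to the language library's parser. `load_generated_constants` is a hypothetical stand-in for the module-level const initialization, not actual thrift generator output:

```python
import uuid


def load_generated_constants(literal):
    """Hypothetical stand-in for generated const initialization.

    Real generated code would do this while the module is being
    loaded, where the caller cannot handle the failure gracefully.
    """
    return {"FOO": uuid.UUID(literal)}


try:
    load_generated_constants("123")
except ValueError as exc:
    # The compiler accepted the thrift file; the failure only shows up here.
    print("runtime failure:", exc)
```

the point is only that the error surfaces when the generated code is loaded, not when the thrift file is compiled.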

so here are two options/approaches I can think of for now:

1. the compiler should actually parse the uuid string (using Boost.Uuid or 
something similar), reject any invalid uuid literals, and feed the bytes to 
generated code (vs. the string via language libraries' parse function)
2. we accept that for invalid uuid literals we'll have runtime exceptions/panics
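
option 1 could look roughly like the following. this is a Python sketch of the logic only (the real compiler is C++), and `parse_uuid_literal` is a hypothetical helper, not existing thrift code. it assumes that only the canonical 8-4-4-4-12 form is accepted and that hex digits may be upper or lower case:

```python
import re

# Canonical 8-4-4-4-12 form only: no braces, no urn:uuid: prefix,
# no bare 32-hex string.
_CANONICAL_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def parse_uuid_literal(literal):
    """Return the 16 raw bytes for a canonical uuid literal.

    Raises ValueError for anything else; a compiler would report
    that as a compile-time error instead of emitting code.
    """
    if _CANONICAL_UUID.fullmatch(literal) is None:
        raise ValueError(f"invalid uuid literal: {literal!r}")
    return bytes.fromhex(literal.replace("-", ""))
```

with this approach the generated code would receive the 16 bytes directly, so no language library has to parse a string at load time.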

but even if we accept runtime exceptions, there's still an issue with 
"leniency disparity" between the language libraries. for example, all language 
libraries should support the canonical 8-4-4-4-12 form, as that's what we 
defined as the form to be used by TJSONProtocol. but some language libraries 
can be more lenient than others, e.g. some might also accept the {8-4-4-4-12} 
form, some might accept the urn:uuid:8-4-4-4-12 form, some might accept the 
32-hex form. so when someone puts a literal in one of the non-8-4-4-4-12 forms 
in a thrift file, some generated language code will have runtime exceptions 
and some won't. this can lead to bugs (e.g. someone creates a thrift file, 
tests it in one language and it works, but it breaks in another language when 
someone else tries to use the same thrift file).
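
as one concrete data point, Python's standard `uuid.UUID` constructor treats braces, hyphens and the URN prefix as optional, so it sits at the lenient end. the sketch below compares it with a strict canonical-only check; `lenient_ok`, `strict_ok` and the sample values are made up for illustration:

```python
import re
import uuid

# Strict check: canonical 8-4-4-4-12 form only.
CANONICAL_FORM = re.compile(
    r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
)


def lenient_ok(literal):
    """True if Python's standard library parser accepts the literal."""
    try:
        uuid.UUID(literal)
        return True
    except ValueError:
        return False


def strict_ok(literal):
    """True only for the canonical 8-4-4-4-12 form."""
    return CANONICAL_FORM.fullmatch(literal) is not None


SAMPLES = [
    "12345678-1234-5678-1234-567812345678",
    "{12345678-1234-5678-1234-567812345678}",
    "urn:uuid:12345678-1234-5678-1234-567812345678",
    "12345678123456781234567812345678",
]

for sample in SAMPLES:
    print(f"{sample!r}: lenient={lenient_ok(sample)} strict={strict_ok(sample)}")
```

a library that behaves like `strict_ok` would fail on the last three samples while Python-generated code would load them without complaint, which is exactly the cross-language mismatch described above.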

what do you think?

> Introduce uuid as additional builtin type
> -----------------------------------------
>
>                 Key: THRIFT-5587
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5587
>             Project: Thrift
>          Issue Type: New Feature
>          Components: Compiler (General)
>            Reporter: Jens Geyer
>            Assignee: Jens Geyer
>            Priority: Major
>             Fix For: 0.18.0
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> I hereby propose to add *{{uuid}}* as an additional built-in basic type.
> *Rationale:*
>  * Uuids (or Guids) are a state of the art and well-understood way to 
> identify data entities.
>  * There is support in next to all languages' runtimes to generate, convert 
> and store such data.
>  * Although conversion to and from string or binary is certainly possible, it 
> is also rather cumbersome.
>  * A distinct datatype enables us to send it in the most suitable way across 
> the wire: JSON=text, Compact=binary
> *Remarks*
>  * -I am open to discuss the term "guid" vs "uuid". Coming from NET platforms 
> I would tend to guid, but I am open to uuid (or even both) as well. That's 
> just a detail.-
>  * If no significant concern is raised, I would move on and implement C#, 
> Delphi and Haxe bindings (as sub-tasks) myself and leave the rest to the 
> community.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
