[ 
https://issues.apache.org/jira/browse/THRIFT-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17617549#comment-17617549
 ] 

Jens Geyer commented on THRIFT-5652:
------------------------------------

I'm trying to validate the uuid literals via the IDL parser first. If that 
does not work out, we might add some code to do it. A regex just for this 
seems overkill to me (personal opinion).

I understand that C++11 supports regex out of the box, but my guess would be 
that there might be some platforms that rely on older C++, no? 
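
As a rough illustration of the "add some code" fallback, here is a minimal 
sketch of a hand-written check for the canonical 8-4-4-4-12 form that needs 
neither regex nor C++11. The function name is_canonical_uuid is hypothetical 
and is not taken from the Thrift compiler sources; where such a check would 
be wired into the compiler is left open.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of the Thrift compiler: returns true only
// for the canonical 8-4-4-4-12 form, i.e. 36 characters with hex digits
// everywhere except dashes at offsets 8, 13, 18 and 23.
bool is_canonical_uuid(const std::string& s) {
  if (s.size() != 36) {
    return false;
  }
  for (std::string::size_type i = 0; i < s.size(); ++i) {
    const bool dash_pos = (i == 8 || i == 13 || i == 18 || i == 23);
    if (dash_pos) {
      if (s[i] != '-') {
        return false;
      }
    } else if (!std::isxdigit(static_cast<unsigned char>(s[i]))) {
      return false;
    }
  }
  return true;
}
```

With a check like this, a literal such as "123", the braced form, the 
urn:uuid: form, and the 32-hex form would all be rejected at compile time, 
so only the canonical form would ever reach the generated code.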


> IDL uuid literals can be improved 
> ----------------------------------
>
>                 Key: THRIFT-5652
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5652
>             Project: Thrift
>          Issue Type: Sub-task
>          Components: Compiler (General)
>            Reporter: Jens Geyer
>            Assignee: Jens Geyer
>            Priority: Major
>
> [~fishywang] wrote:
> in the example this is a valid UUID string (in the 8-4-4-4-12 canonical 
> form), but what if it is invalid?
> looking at the current compiler code (via `git grep -i uuid` under 
> `compiler/cpp/src`) I don't see any actual parsing of the uuid string on the 
> compiler level (please let me know if I missed it), which means from the 
> compiler's point of view, it just has this string literal that's supposed to 
> be a uuid, and it's the language library's responsibility to convert that 
> from string to uuid. there's also no way to gracefully handle 
> errors/exceptions for thrift generated language code at const 
> definitions/default value definitions, which means if the string cannot be 
> converted to uuid, it must be a runtime exception/panic/etc.
> compare these 2 examples:
> const string FOO = 123
> vs.
> const uuid FOO = "123"
> the first thrift file will cause a compiler error, while the second will 
> cause a runtime error instead.
> so here are two options/approaches I can think of for now:
> 1. the compiler should actually parse the uuid string (using Boost.Uuid or 
> something), reject any invalid uuid literals, and feed the bytes to generated 
> code (vs. the string via language libraries' parse function)
> 2. we accept that for invalid uuid literals we'll have runtime 
> exceptions/panics
> but even if we accept runtime exceptions, there's still an issue with 
> "lenient imparity" between the language libraries. for example, all language 
> libraries should support the canonical 8-4-4-4-12 form, as that's what we 
> defined as the form to be used by TJSONProtocol. but some language libraries 
> can be more lenient than others, e.g. some might also accept the 
> {8-4-4-4-12} form, some might accept the urn:uuid:8-4-4-4-12 form, some 
> might accept the 32-hex form. so when someone puts a literal in one of the 
> non-8-4-4-4-12 forms in a thrift file, some generated language code will 
> have runtime exceptions and some 
> won't. this can lead to bugs (e.g. someone created a thrift file and tested 
> it in one language and it works, but it breaks for another language when 
> someone else tries to use this same thrift file).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
