We are looking to implement upper() / lower() for non-ASCII characters. The 
current Gandiva implementation handles upper() / lower() only for standard 
ASCII characters.
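
To illustrate the limitation, here is a minimal sketch of what ASCII-only case
conversion does to a UTF-8 string. This is not the actual Gandiva code; the
function name ascii_upper is hypothetical and is only meant to show the
behaviour I am describing.

```cpp
#include <string>

// Hypothetical helper, NOT Gandiva's actual implementation: upper-cases only
// the ASCII letters a-z and copies every other byte through unchanged.
std::string ascii_upper(const std::string& in) {
  std::string out = in;
  for (char& c : out) {
    if (c >= 'a' && c <= 'z') {
      c = static_cast<char>(c - 'a' + 'A');
    }
  }
  return out;
}
```

With this approach, the UTF-8 bytes of a character such as "é" (0xC3 0xA9) fall
outside the a-z range, so they pass through untouched and the character is
never upper-cased.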

For the Gandiva implementation, I went through a few articles and answers on 
StackOverflow. The top answer to this question 
<https://stackoverflow.com/questions/36897781/how-to-uppercase-lowercase-utf-8-characters-in-c>
suggests that there is no standard way to do Unicode case conversion in C/C++ 
and that an external library such as ICU 
<https://unicode-org.github.io/icu-docs/#/icu4c/> is needed for reliable 
Unicode case conversion.

So I wanted to ask: when adding an external library to Gandiva, what issues do 
we need to take care of so that we neither break existing code nor sacrifice 
performance? Is there an existing library that we can use to solve this 
problem? Any suggestions would be welcome.

Regards,
Sagnik
