edponce commented on pull request #10869:
URL: https://github.com/apache/arrow/pull/10869#issuecomment-907122705


   The capitalize and title kernels are the first vector string kernels that
perform code point transforms. Code point transforms (case changes) can grow
in bytes and thus require `MaxCodeUnits` with the 3/2 growth factor. These
kernels also depend on `EnsureLookupTablesFilled` being called in the
`PreExec` method. Until now, these two requirements lived in different classes
(`CaseMappingTransform` and `StringTransformCodepoint`), and both of those
classes apply `TransformCodepoint` (scalar method) instead of `Transform`
(custom). For these reasons, I created the `StringTransformCodepointBase`
class, which holds the growth factor and lookup table methods. The available
classes can now be chosen according to the kind of kernel and whether it
applies code point transforms.
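
To illustrate the split described above, here is a minimal, self-contained sketch. It is not the code in this PR: `CodepointTransformBaseSketch`, `CapitalizeSketch`, `CaseTables`, and the ASCII-only tables are hypothetical stand-ins, and the method signatures are simplified assumptions. It only shows the idea of a shared base that owns the growth bound and the lookup table setup, while the derived kernel supplies a custom `Transform` over the whole string.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

namespace sketch {

// Hypothetical, ASCII-only stand-in for the case-mapping lookup tables.
struct CaseTables {
  std::array<unsigned char, 256> upper;
  std::array<unsigned char, 256> lower;
};

// Builds the tables exactly once (function-local static initialization).
inline const CaseTables& LookupTables() {
  static const CaseTables tables = [] {
    CaseTables t{};
    for (int c = 0; c < 256; ++c) {
      t.upper[c] = static_cast<unsigned char>((c >= 'a' && c <= 'z') ? c - 32 : c);
      t.lower[c] = static_cast<unsigned char>((c >= 'A' && c <= 'Z') ? c + 32 : c);
    }
    return t;
  }();
  return tables;
}

inline void EnsureLookupTablesFilled() { (void)LookupTables(); }

// Hypothetical base class: owns the two concerns shared by every kernel
// that performs code point transforms.
struct CodepointTransformBaseSketch {
  // Called once before execution so the tables are ready.
  void PreExec() const { EnsureLookupTablesFilled(); }

  // Upper bound on output size in bytes, using the 3/2 growth factor.
  int64_t MaxCodeunits(int64_t input_ncodeunits) const {
    return input_ncodeunits * 3 / 2;
  }
};

// A custom whole-string transform: uppercase the first character, lowercase
// the rest. A per-code-point scalar method could not express this, because
// the mapping depends on the position in the string.
struct CapitalizeSketch : CodepointTransformBaseSketch {
  std::string Transform(const std::string& input) const {
    const CaseTables& t = LookupTables();
    const std::size_t capacity =
        static_cast<std::size_t>(MaxCodeunits(static_cast<int64_t>(input.size())));
    std::string out;
    out.reserve(capacity);
    bool first = true;
    for (unsigned char c : input) {
      out.push_back(static_cast<char>(first ? t.upper[c] : t.lower[c]));
      first = false;
    }
    assert(out.size() <= capacity);  // output stays within the growth bound
    return out;
  }
};

}  // namespace sketch
```

In this sketch a caller would construct `sketch::CapitalizeSketch`, call `PreExec()` once, and then call `Transform` per input string. ASCII never grows under case mapping, so the 3/2 bound is only exercised as a capacity hint here.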
   
   cc @pitrou


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

