GitHub user sachouche commented on the issue:
https://github.com/apache/drill/pull/1001
Paul, again thanks for the detailed review:
- I was able to address most of the feedback, except for one item
- I agree that expressions that can operate directly on the encoded UTF-8
string should ideally perform checks on bytes and not characters
- Having said that, such a change is more involved and should be done
properly
o The SqlPatternContainsMatcher currently gets a CharSequence as input
o We should enhance the expression framework so that matchers can a)
express their capabilities and b) receive the expected data type (Character or
Byte sequences)
o Note also that this would impact the test suite, since StringBuffer
instances are used to test the matcher functionality directly
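
To make the proposal above concrete, here is a rough sketch of how a matcher could declare its capabilities and receive either character or byte input. Everything in it is hypothetical: `InputKind`, `PatternMatcher`, `BytesContainsMatcher` and `evaluate` are illustrative names, not existing Drill classes, and the naive scan stands in for whatever search the real matcher would use.

```java
import java.nio.charset.StandardCharsets;

public class MatcherCapabilitySketch {

    /** Hypothetical: the input representation a matcher prefers to consume. */
    public enum InputKind { CHAR_SEQUENCE, UTF8_BYTES }

    /** Hypothetical matcher contract; not Drill's actual interface. */
    public interface PatternMatcher {
        /** (a) The matcher expresses its capability. */
        InputKind inputKind();

        /** Character-based entry point, e.g. for tests that pass a StringBuffer. */
        boolean matchChars(CharSequence input);

        /** Byte-based entry point over the encoded UTF-8 range [start, end). */
        boolean matchBytes(byte[] buf, int start, int end);
    }

    /** A "contains" matcher that compares encoded bytes instead of decoded chars. */
    public static final class BytesContainsMatcher implements PatternMatcher {
        private final byte[] pattern;

        public BytesContainsMatcher(String pattern) {
            this.pattern = pattern.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public InputKind inputKind() {
            return InputKind.UTF8_BYTES;
        }

        @Override
        public boolean matchChars(CharSequence input) {
            // Fallback path: encode once, then reuse the byte-level search.
            byte[] bytes = input.toString().getBytes(StandardCharsets.UTF_8);
            return matchBytes(bytes, 0, bytes.length);
        }

        @Override
        public boolean matchBytes(byte[] buf, int start, int end) {
            // Naive scan. Byte-wise comparison is safe for valid UTF-8 because
            // an encoded pattern can only match on character boundaries.
            int last = end - pattern.length;
            for (int i = start; i <= last; i++) {
                int j = 0;
                while (j < pattern.length && buf[i + j] == pattern[j]) {
                    j++;
                }
                if (j == pattern.length) {
                    return true;
                }
            }
            return false;
        }
    }

    /** (b) The caller hands the matcher the data type it asked for. */
    public static boolean evaluate(PatternMatcher matcher, byte[] utf8Value) {
        if (matcher.inputKind() == InputKind.UTF8_BYTES) {
            return matcher.matchBytes(utf8Value, 0, utf8Value.length);
        }
        // Decode only for matchers that need characters.
        return matcher.matchChars(new String(utf8Value, StandardCharsets.UTF_8));
    }
}
```

Keeping a `matchChars` entry point in this sketch is one way the existing StringBuffer-based tests could keep working while the byte path is introduced.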
---