Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8900 )
Change subject: IMPALA-3282: Adds regexp_escape built-in function ...................................................................... Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/8900/4/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8900/4/be/src/exprs/string-functions-ir.cc@624 PS4, Line 624: ostringstream ss; We should just preallocate a StringVal of 2 * str.len for the output, instead of constructing a std::string and copying it. E.g. see how StringFunctions::Repeat() initialises 'result'. http://gerrit.cloudera.org:8080/#/c/8900/4/be/src/exprs/string-functions-ir.cc@625 PS4, Line 625: for (char const *c = start_ptr; c < end_ptr; c++) { > Thanks for the point. Let me check performance and the feasibility. Calling non-IR functions from other modules from IR generally works - when we compile the codegen module, LLVM falls back to looking up dynamic symbols in the current process if there's no IR function. One downside of BackslashEscape is that it requires allocating a temporary std::string, whereas our output is a StringVal. We could just use strings::CharSet in this function instead. -- To view, visit http://gerrit.cloudera.org:8080/8900 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I84c3e0ded26f6eb20794c38b75be9b25cd111e4b Gerrit-Change-Number: 8900 Gerrit-PatchSet: 4 Gerrit-Owner: Kim Jin Chul <jinc...@gmail.com> Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org> Gerrit-Reviewer: Kim Jin Chul <jinc...@gmail.com> Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com> Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Wed, 10 Jan 2018 16:32:54 +0000 Gerrit-HasComments: Yes