Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12797 )
Change subject: IMPALA-4356,IMPALA-7331: codegen all ScalarExprs ...................................................................... Patch Set 7: Code-Review+1 (3 comments) I am ok with the patch in general, but I do not have enough codegen experience to give +2. If no other people wants to jump on it, I can dig deeper but it may take some time. http://gerrit.cloudera.org:8080/#/c/12797/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12797/6//COMMIT_MSG@41 PS6, Line 41: IMPALA-7331 (CHAR codegen support functions) is also fixed becaus > I added a toy manual benchmark below that shows a significant improvement. I am not sure - I do not know how many people use CHAR(N). My intuition says that CHAR(N) makes sense for columns with short strings but large NDV, but if its usage is discouraged in Impala, then I guess that we shouldn't put effort in benchmarking. I wouldn't add it to targeted-perf in this patch, but creating a Jira for it never hurts. :) http://gerrit.cloudera.org:8080/#/c/12797/6/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: http://gerrit.cloudera.org:8080/#/c/12797/6/be/src/exprs/scalar-fn-call.cc@483 PS6, Line 483: #pragma push_macro("GET > Done I would prefer to use a macro in the header too - in such a way that gets the list of "*Val"s from a macro, to avoid the possibility of forgetting to override it for a new type. I am ok with not doing it in this change. The simplest way would be probably to use a macro like GENERATE_GET_VAL_INTERPRETED_OVERRIDES_FOR_ALL_SCALAR_TYPES, and use it in the few classes that implement it for all scalar types. A few classes override it for CollectionVal too, so it could be added separately, or a macro could be created that includes CollectionVal too. http://gerrit.cloudera.org:8080/#/c/12797/7/be/src/udf/udf-internal.h File be/src/udf/udf-internal.h: http://gerrit.cloudera.org:8080/#/c/12797/7/be/src/udf/udf-internal.h@299 PS7, Line 299: // Matches memory layout of StringVal to allow sharing of support in CodegenAnyval. : // Also is denser this can fit into 16 bytes along with the initial is_null member. This could be enforced with static_assert()s: static_assert(sizeof(CollectionVal) == sizeof(StringVal), "Wrong size."); static_assert(offsetof(CollectionVal, num_tuples) == offsetof(StringVal, len), "Wrong offset."); static_assert(offsetof(CollectionVal, ptr) == offsetof(StringVal, ptr), "Wrong offset."); offsetof works in GCC, I didn't try it in Clang. -- To view, visit http://gerrit.cloudera.org:8080/12797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I839d7a3a2f5e1309c33a1f66013ef11628c5dc11 Gerrit-Change-Number: 12797 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Mon, 08 Apr 2019 15:47:04 +0000 Gerrit-HasComments: Yes