Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11160 )
Change subject: IMPALA-7412: width_bucket() function overflows too easily ...................................................................... Patch Set 2: (2 comments) I did not check the python part. http://gerrit.cloudera.org:8080/#/c/11160/2/be/src/exprs/math-functions-ir.cc File be/src/exprs/math-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/11160/2/be/src/exprs/math-functions-ir.cc@516 PS2, Line 516: if (expr_val > max_range_val) return (int64_t)num_buckets.val + 1; This was >= in the previous version - is this intentional? http://gerrit.cloudera.org:8080/#/c/11160/2/be/src/exprs/math-functions-ir.cc@518 PS2, Line 518: using BiggerType = typename Bigger<decltype(expr_val)>::type; I assume that in most use cases min_range, max_range and num_buckets will be constants and they will be const eliminated by llvm. If this is true, then checking that max_range_val-min_range_val does not overflow is done during code generation. Whether dist_from_min*num_buckets overflows can be also checked using only constants by calculating min_range_val*num_buckets instead, because dist_from_min <= range_size. If non of these overflow, then the whole calculation could be done using decltype(expr_val) instead of BiggerType. If BiggerType is <=int64, then this is probably not useful, but division of integers above the machine's supported type should be quite slow. -- To view, visit http://gerrit.cloudera.org:8080/11160 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I728cc05d9aef8d081e6f2da66146f6d7b75dbb57 Gerrit-Change-Number: 11160 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Comment-Date: Thu, 09 Aug 2018 12:26:17 +0000 Gerrit-HasComments: Yes
