anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )
Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................

Patch Set 7:

(4 comments)

Yes, I have been discussing these approaches with Taras and Alex.
I benchmarked them on a large table with 1073741824 rows. The DoubleVal patch (patch set 3) outperforms the others.

int256 only                 444.58s   453.40s
int128                      159.28s   155.25s
Binary search approach      109.21s   109.20s
  (measured with a float array; still needs to be changed to use DecimalVal)
DoubleVal (patch set 3)     104.20s   104.20s

Current status: I will send out a patch very soon that uses int128_t for storing the intermediate results (int256_t in case of overflow). I will continue exploring the binary search approach afterwards and will send a follow-up patch if we see performance improvements.

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@429
PS7, Line 429: bucket_width
> This should be called bucket_number to make it more clear
Done

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@431
PS7, Line 431: width_size
> width_size is a confusing name. This should be called something like "dista
Done

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@479
PS7, Line 479: result.val = num_buckets.val;
> I think it's clearer and simpler to write:
Done

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516: int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
> This idea may give a nice performance boost (if it works) because all the h
This patch stores intermediate results in int256_t only when needed and uses int128_t otherwise. I will do more benchmarking and testing of the binary search approach and will post a follow-up patch.
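
For readers following the thread, the "int128_t first, int256_t only on overflow" idea can be sketched roughly as below. This is an illustration and not the code in the patch: `CheckedMul` and `WidthBucketFast` are hypothetical names, the inputs are assumed to be unscaled decimal values that already share one scale, and the GCC/Clang `unsigned __int128` extension stands in for Impala's 128-bit type. The int256_t slow path is not shown; the sketch only reports that it would be needed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// GCC/Clang extension, used here as a stand-in for a 128-bit integer type.
using uint128 = unsigned __int128;

// Hypothetical helper: multiplies two unsigned 128-bit values and returns
// std::nullopt when the product would not fit in 128 bits.
inline std::optional<uint128> CheckedMul(uint128 a, uint128 b) {
  const uint128 kMax = ~static_cast<uint128>(0);
  if (a != 0 && b > kMax / a) return std::nullopt;
  return a * b;
}

// Hypothetical fast path for WIDTH_BUCKET(expr, min, max, num_buckets).
// Returns the bucket number in [0, num_buckets + 1], or std::nullopt when the
// 128-bit intermediate overflows and the caller should retry with a wider
// type (int256_t in the patch under review).
inline std::optional<int64_t> WidthBucketFast(
    __int128 expr, __int128 min, __int128 max, int64_t num_buckets) {
  assert(num_buckets > 0 && num_buckets < INT64_MAX && min < max);
  if (expr < min) return int64_t{0};       // below the range
  if (expr >= max) return num_buckets + 1; // at or above the range

  // Here min <= expr < max, so both differences are non-negative and fit in
  // an unsigned 128-bit value even if the signed subtraction would overflow.
  const uint128 dist_from_min =
      static_cast<uint128>(expr) - static_cast<uint128>(min);
  const uint128 range = static_cast<uint128>(max) - static_cast<uint128>(min);

  // This product is the intermediate result that may need more than 128 bits.
  const std::optional<uint128> scaled =
      CheckedMul(dist_from_min, static_cast<uint128>(num_buckets));
  if (!scaled) return std::nullopt;

  // dist_from_min < range, so the quotient is below num_buckets.
  return static_cast<int64_t>(*scaled / range) + 1;
}
```

A caller would try `WidthBucketFast` first and take the wider-integer path only when it returns `std::nullopt`, which keeps the common case on the cheaper 128-bit arithmetic.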
--
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <apha...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: anujphadke <apha...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Nov 2017 06:12:55 +0000
Gerrit-HasComments: Yes