anujphadke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/6023 )

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 7:

(4 comments)

Yes, I have been discussing these approaches with Taras and Alex.
I have benchmarked them on a large table with 1073741824 rows. The
DoubleVal patch (patch set 3) is the fastest:

Using just int256_t
444.58s
453.40s

Using int128
159.28s
155.25s

Binary search approach (this was done with a float array and still needs to
be changed to use DecimalVal)
109.21s
109.20s

DoubleVal (patch set 3)
104.20s
104.20s

Current status:
I will send out a patch very soon that uses int128_t (int256_t in case of
overflow) for storing the intermediate results. I will continue exploring the
binary search approach later and will send out a follow-up patch if we see
performance improvements.
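
For readers following the thread, here is a rough sketch of what the binary
search idea could look like. It is not the code from any patch set: it uses
plain double instead of DecimalVal, the names MakeBoundaries and
WidthBucketBinarySearch are made up for illustration, and it assumes the usual
SQL WIDTH_BUCKET convention (0 below the range, num_buckets + 1 at or above
the upper bound).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: precompute the num_buckets - 1 interior boundaries of
// the range [min_val, max_val). Done once per (min, max, num_buckets).
std::vector<double> MakeBoundaries(double min_val, double max_val, int num_buckets) {
  assert(num_buckets > 0 && min_val < max_val);
  std::vector<double> bounds;
  bounds.reserve(num_buckets - 1);
  for (int i = 1; i < num_buckets; ++i) {
    bounds.push_back(min_val + (max_val - min_val) * i / num_buckets);
  }
  return bounds;
}

// Hypothetical per-row lookup: no multiplication or division, only
// comparisons against the precomputed boundaries.
int WidthBucketBinarySearch(double v, double min_val, double max_val,
                            const std::vector<double>& bounds) {
  const int num_buckets = static_cast<int>(bounds.size()) + 1;
  if (v < min_val) return 0;                 // below the range
  if (v >= max_val) return num_buckets + 1;  // at or above the upper bound
  // First boundary strictly greater than v; its index is the number of
  // boundaries <= v, which is the zero-based bucket.
  auto it = std::upper_bound(bounds.begin(), bounds.end(), v);
  return static_cast<int>(it - bounds.begin()) + 1;
}
```

The per-row cost is O(log num_buckets) comparisons, so any gain would depend
on how expensive the wide multiply and divide are compared with that search.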

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@429
PS7, Line 429: bucket_width
> This should be called bucket_number to make it more clear
Done


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@431
PS7, Line 431: width_size
> width_size is a confusing name. This should be called something like "dista
Done


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@479
PS7, Line 479:     result.val = num_buckets.val;
> I think it's clearer and simpler to write:
Done


http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/math-functions-ir.cc@516
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
> This idea may give a nice performance boost (if it works) because all the h
This patch stores intermediate results in int256_t only when needed and uses
int128_t otherwise.
I will do more benchmarking and testing of the binary search approach and will
post a follow-up patch.
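
To illustrate the "widen only when needed" pattern, here is a minimal sketch
scaled down by one step: int64_t on the fast path and the GCC/Clang __int128
extension on overflow, standing in for int128_t and int256_t. The function
name BucketFromDistance and its parameters are hypothetical, and
__builtin_mul_overflow is a compiler builtin, not something taken from the
patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes floor(dist * num_buckets / range) + 1 for
// 0 <= dist < range. The narrow type is tried first and the wider type is
// used only if the multiplication overflows.
int64_t BucketFromDistance(int64_t dist, int64_t range, int64_t num_buckets) {
  assert(num_buckets > 0 && range > 0 && dist >= 0 && dist < range);
  int64_t product;
  if (!__builtin_mul_overflow(dist, num_buckets, &product)) {
    // Fast path: the product fits in 64 bits.
    return product / range + 1;
  }
  // Slow path: redo the multiplication in the wider type. The quotient is
  // below num_buckets because dist < range, so it fits in int64_t again.
  __int128 wide = static_cast<__int128>(dist) * num_buckets;
  return static_cast<int64_t>(wide / range) + 1;
}
```

Most inputs should take the fast path, so the cost of the wide arithmetic is
paid only by the rows that actually overflow.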



--
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <apha...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: anujphadke <apha...@cloudera.com>
Gerrit-Comment-Date: Thu, 16 Nov 2017 06:12:55 +0000
Gerrit-HasComments: Yes
