Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13807 )

Change subject: IMPALA-8741: Speed up bit unpacking by vectorisation
......................................................................


Patch Set 4:

(8 comments)

I started reviewing this morning but ran out of time to finish today. I got 
through the C++ code but haven't reviewed the guts of the code generator. I 
am fairly confident it's correct though, since the unit tests should provide 
good coverage.

http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/benchmarks/bit-packing-benchmark.cc
File be/src/benchmarks/bit-packing-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/benchmarks/bit-packing-benchmark.cc@29
PS4, Line 29: // The second one compares the original scalar implementation 
with the vectorised one
Include results for this one as a comment?


http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/bit-packing-vectorized.h
File be/src/util/bit-packing-vectorized.h:

PS4:
We generally don't check in generated files. There are arguments for and 
against doing so, but generally that's the direction we've gone. The most 
compelling reason for me is that re-generating the code as part of the build 
means that vectorised_bit_unpacking_generator.py is tested. Otherwise it could 
easily bit rot.

I think this is useful for the purposes of review, but I'd be inclined to 
remove it before merging and rely on generating via a CMake rule. We can 
discuss the pros and cons though; maybe there are some considerations I'm 
missing.


http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/bit-packing.h
File be/src/util/bit-packing.h:

http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/bit-packing.h@64
PS4, Line 64: simultaniously
simultaneously


http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/bit-packing.h@67
PS4, Line 67:   template <typename OutType, bool VECTORIZE = true>
Is there a significant performance benefit to making VECTORIZE a compile-time 
constant? We already have to do a runtime check for the instruction anyway, so 
it can't result in more specialisation.
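
To illustrate the question, here is a rough sketch of the two shapes I have in 
mind. All names below (UnpackPath, ChoosePathTemplated, ChoosePathRuntime, 
cpu_has_simd) are made up for this comment and are not taken from the patch; 
cpu_has_simd stands in for whatever runtime CPU feature check the real code 
does. Either way the CPU check remains a runtime branch, which is why I'd 
expect the template parameter to buy little.

```cpp
// Which implementation ends up being used for a call.
enum class UnpackPath { SCALAR, VECTORIZED };

// Shape A (assumed to resemble PS4): VECTORIZE is a template parameter, but the
// CPU capability still has to be checked at runtime.
template <bool VECTORIZE = true>
UnpackPath ChoosePathTemplated(bool cpu_has_simd) {
  return (VECTORIZE && cpu_has_simd) ? UnpackPath::VECTORIZED : UnpackPath::SCALAR;
}

// Shape B: the same decision with 'vectorize' as an ordinary runtime argument.
inline UnpackPath ChoosePathRuntime(bool vectorize, bool cpu_has_simd) {
  return (vectorize && cpu_has_simd) ? UnpackPath::VECTORIZED : UnpackPath::SCALAR;
}
```

Both shapes select the same path for the same inputs; the only difference is 
whether the opt-out is resolved per instantiation or per call.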


http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/bit-packing.inline.h
File be/src/util/bit-packing.inline.h:

http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/bit-packing.inline.h@84
PS4, Line 84:   if (LIKELY((std::is_same<OutType, uint8_t>::value
Does it even make sense to unpack values into a different type outside of these 
4? Could we make this a static_assert instead?

That would avoid someone accidentally instantiating a non-vectorized version.
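
Something along these lines is what I had in mind. This is only a sketch: the 
helper IsSupportedUnpackType and the function UnpackValuesSketch are 
hypothetical names, and I'm assuming the four types are uint8_t, uint16_t, 
uint32_t and uint64_t (only uint8_t is visible on the quoted line).

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper: true only for the output types that are assumed to have
// a vectorised implementation.
template <typename OutType>
constexpr bool IsSupportedUnpackType() {
  return std::is_same<OutType, uint8_t>::value
      || std::is_same<OutType, uint16_t>::value
      || std::is_same<OutType, uint32_t>::value
      || std::is_same<OutType, uint64_t>::value;
}

// Stand-in for the real unpacking entry point. Instantiating it with any other
// type fails at compile time instead of silently taking a scalar path.
template <typename OutType>
int64_t UnpackValuesSketch(int64_t num_values) {
  static_assert(IsSupportedUnpackType<OutType>(),
      "OutType must be uint8_t, uint16_t, uint32_t or uint64_t");
  // Placeholder body: number of output bytes needed for 'num_values' values.
  return num_values * static_cast<int64_t>(sizeof(OutType));
}
```

With this, UnpackValuesSketch<int32_t>(...) would be rejected by the compiler, 
so the runtime type check on line 84 would no longer be needed.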


http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/bit-packing.inline.h@262
PS4, Line 262:     return in;
indentation


http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/vectorised_bit_unpacking_generator.py
File be/src/util/vectorised_bit_unpacking_generator.py:

http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/vectorised_bit_unpacking_generator.py@142
PS4, Line 142: sinlge
single


http://gerrit.cloudera.org:8080/#/c/13807/4/be/src/util/vectorised_bit_unpacking_generator.py@1080
PS4, Line 1080: metod
method



--
To view, visit http://gerrit.cloudera.org:8080/13807
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9e452a547973778bbd8d768c608e1a32e948f947
Gerrit-Change-Number: 13807
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Jul 2019 23:02:01 +0000
Gerrit-HasComments: Yes
