Dan Hecht has posted comments on this change.

Change subject: Add a build flag for the undefined behavior sanitizer, aka 
"ubsan".
......................................................................


Patch Set 4:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/exec/hdfs-avro-scanner-ir.cc
File be/src/exec/hdfs-avro-scanner-ir.cc:

Line 275:         if (bytes_to_fill < sizeof(*decimal)) {
this doesn't seem necessary: we know that:
slot_byte_size == 4
sizeof(*decimal) == 4
0 <= len.val <= slot_byte_size
0 <= bytes_to_fill <= slot_byte_size
i.e. 0 <= bytes_to_fill <= 4

Is the undefined case when bytes_to_fill == 4, i.e. right shift by 32?  is that 
actually undefined to shift int32 by 32? If so, we'd need the check for the 
other two cases below as well, but can just do a common

if (len.val == 0) 

check outside the switch statement.


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/exec/zigzag-test.cc
File be/src/exec/zigzag-test.cc:

PS4, Line 101: 
what's undefined about that?


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

PS4, Line 161: dst->ptr != nullptr
dst->ptr equals copy, which we've already checked against NULL at line 154, so 
this check isn't necessary. And this code is somewhat perf sensitive so we 
shouldn't do it.


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/exprs/expr-value.h
File be/src/exprs/expr-value.h:

PS4, Line 72: size()
size() != 0
or use empty()


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

Line 379:       place = Overflow::CheckedProduct(place, 
static_cast<uint64_t>(src_base), &overflow);
why don't we need to check overflow here? and then why do we need to check it 
at line 367 instead?


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

PS4, Line 819: 
why remove that? do we sometimes pass null?


Line 629:   if (strs[0].len) memcpy(ptr, strs[0].ptr, strs[0].len);
!= 0.
but why is that necessary. C standard about string.h says:
Where an argument declared as size_t n specifies the length of the array for a 
function, n can have the value zero on a call to that function.


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 388: (
these are extraneous


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

Line 143:       // case but what makes the normal path faster?
this comment seemed to imply that this change will make this code slower. is 
that the case?


PS4, Line 149: x + y
why do we need UnsignedSum() on line 144 case but not here?


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/runtime/raw-value.cc
File be/src/runtime/raw-value.cc:

PS4, Line 161: dest->len > 0
why is this len > 0 rather than dest->ptr != nullptr? is memcpy with 0 length 
undefined? as mentioned earlier, C standard seems to claim otherwise.


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

PS4, Line 975: string_val->ptr != nullptr
is this actually possible? it's not checked upstream?


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/runtime/string-value.inline.h
File be/src/runtime/string-value.inline.h:

PS4, Line 60: (len > 0)
is it really undefined to call strncmp() with len==0?


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/runtime/tuple.h
File be/src/runtime/tuple.h:

PS4, Line 75: tuple_desc.num_null_bytes() > 0
same question here and elsewhere about 0 count.


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/udf/udf.cc
File be/src/udf/udf.cc:

PS4, Line 493: len > 0
len will almost always be > 0 so this condition should go second (if we need it 
at all).


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/util/decompress-test.cc
File be/src/util/decompress-test.cc:

PS4, Line 118: input_len
!= 0 here and elsewhere. but why needed?


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/util/overflow.h
File be/src/util/overflow.h:

Line 45: 
this could use a quick comment


PS4, Line 68: [[gnu::always_inline]]
ALWAYS_INLINE ?


Line 134:     constexpr Signedness SIGN = (std::is_signed<T>::value || 
std::is_signed<U>::value) ?
do the tests cover these combinations?


http://gerrit.cloudera.org:8080/#/c/5082/4/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

PS4, Line 999: num > 0
same


-- 
To view, visit http://gerrit.cloudera.org:8080/5082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88c7234bd7c5eb7404490a0913d90470c10835e7
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to