Daniel Becker has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20187 )

Change subject: IMPALA-12239: BitWidthZeroRepeated seems to be flaky
......................................................................

IMPALA-12239: BitWidthZeroRepeated seems to be flaky

RleTest.BitWidthZeroRepeated seems to be flaky in release builds. A
possible error message is this:

  Value of: 0 Expected: val Which is: '\x9F' (159)
  Stacktrace

  .../Impala/be/src/util/rle-test.cc:410
  Value of: 0
  Expected: val
  Which is: '\x9F' (159)

The problem seems to be around this 'memcpy()' call:
https://github.com/apache/impala/blob/3e9408480c5285ca925576b7486b35593407a32a/be/src/util/bit-stream-utils.inline.h#L237.

We're almost certainly running into undefined behaviour because
 - the error only occurs in release mode, not in debug mode (not even in
   ASAN or UBSAN mode)
 - if we add something around the 'memcpy()', for example printing
   'buffer_pos_' or 'v', the error doesn't occur
 - the value with which the test fails, i.e. the value it reads instead
   of the expected 0, is non-deterministic.

The failure has occurred since https://gerrit.cloudera.org/#/c/20073/
(IMPALA-11961/IMPALA-12207: Add Redhat 9 / Ubuntu 22 support), which
upgraded gperftools.

We couldn't find the root cause of the bug, but the most likely
explanations are the following:
 - there's a bug in Impala that leads to undefined behaviour and that
   manifests in the test failure with the new gperftools version but not
   with the old one
 - there's a bug in the new gperftools version (2.10).

Downgrading gperftools is not an option because the new version is
required for Redhat 9 and Ubuntu 22 support. There is a workaround that
causes the test to pass, namely putting the 'memcpy()' in a conditional
block:

  if (LIKELY(buffer_pos_ != buffer_end_)) {
    memcpy(v, buffer_pos_, num_bytes);
    buffer_pos_ += num_bytes;
  } else {
    ...
  }

However, this most probably does not address the root cause; for more
details, see the Jira ticket.
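
For illustration only, the guard pattern can be sketched in a
self-contained form. 'ByteReader' below is a hypothetical stand-in, not
Impala's actual reader class; the bounds check, the zero-fill of '*v'
and the handling of the exhausted-buffer case are assumptions made for
the sketch, and a plain condition replaces the 'LIKELY' macro. It only
shows the shape of the workaround and says nothing about the root cause.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical reader used only to illustrate the guarded 'memcpy()'.
class ByteReader {
 public:
  ByteReader(const uint8_t* buffer, int64_t len)
    : buffer_pos_(buffer), buffer_end_(buffer + len) {}

  // Reads 'num_bytes' bytes into the low-address bytes of '*v' and zeroes
  // the rest. Returns false if fewer than 'num_bytes' bytes are left.
  template <typename T>
  bool GetBytes(int num_bytes, T* v) {
    assert(num_bytes >= 0 && num_bytes <= static_cast<int>(sizeof(T)));
    // Assumed bounds check: refuse reads that would run past the end.
    if (buffer_end_ - buffer_pos_ < num_bytes) return false;
    *v = 0;
    // The guard from the workaround: only call 'memcpy()' while there is
    // still data in the buffer.
    if (buffer_pos_ != buffer_end_) {
      std::memcpy(v, buffer_pos_, num_bytes);
      buffer_pos_ += num_bytes;
    }
    // Assumption of this sketch: if the buffer is exhausted, the bounds
    // check above only lets 'num_bytes' == 0 through, so there is nothing
    // to copy and '*v' stays 0.
    return true;
  }

 private:
  const uint8_t* buffer_pos_;
  const uint8_t* buffer_end_;
};
```

In this sketch a zero-byte read at the very end of the buffer (the
situation a zero bit width can produce) succeeds and yields 0 without
ever reaching 'memcpy()'.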

Testing:
  - manually verified that the test RleTest.BitWidthZeroRepeated passes
    in a release build

Change-Id: I84645271101b4bd594dd25929924a3549c223244
Reviewed-on: http://gerrit.cloudera.org:8080/20187
Reviewed-by: Joe McDonnell <joemcdonn...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/util/bit-stream-utils.inline.h
1 file changed, 19 insertions(+), 2 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/20187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I84645271101b4bd594dd25929924a3549c223244
Gerrit-Change-Number: 20187
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
