[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. Patch Set 1: > (5 comments) > > >When I run Impala-TSAN locally, it uses the TSAN code under: > '/mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc'. > I would guess it should be using the LLVM from the toolchain? > Although that class doesn't seem to exist anywhere in the > toolchain/ dir. > > I think that might be where the source was mounted during the > toolchain build? The same thing happens with ASAN (/mnt/source/llvm/llvm-5.0.1.src-p1/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:67). So I guess its okay. > > >I'm a bit confused about running TSAN builds with/without the > -build_shared_libs option and how TSAN interacts with static vs > dynamic linking (trying to mimic Kudu's TSAN usage > https://github.com/apache/kudu/blob/master/CMakeLists.txt#L385) > I don't think I know any more > > >According to the TSAN docs, everything (including third party > libs) should be build with -fsanitize=thread > (https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual#non-instrumented-code). > Not sure if we are doing that in Impala? > We're not, the toolchain is just a regular release build. Might consider doing this in a follow up JIRA, seems like a considerable amount of effort. It seems we can surface a lot of race conditions using TSAN without this. -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 30 Jan 2020 01:54:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. Patch Set 1: > > Does ignore_noninstrumented_modules work on Linux? Looking at the > > LLVM source, https://reviews.llvm.org/D61708 doesn't look like > it's > > been merged. > > > > FWIW, it didn't exist (and certainly didn't work on Linux) when > > Kudu's TSAN support was added, which is why we went the other > > direction and recompile all of our dependencies with > > -fsanitize=thread if doing a TSAN build. > > Maybe https://reviews.llvm.org/D28263 fixed it? > > It definitely does something on my dev machine, which is running > Ubuntu 16.04. mm ignore that review link. I'll try to see if I can find the patch that fixed it, but confirmed it works on my dev machine. -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 28 Jan 2020 22:32:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. Patch Set 1: > Does ignore_noninstrumented_modules work on Linux? Looking at the > LLVM source, https://reviews.llvm.org/D61708 doesn't look like it's > been merged. > > FWIW, it didn't exist (and certainly didn't work on Linux) when > Kudu's TSAN support was added, which is why we went the other > direction and recompile all of our dependencies with > -fsanitize=thread if doing a TSAN build. Maybe https://reviews.llvm.org/D28263 fixed it? It definitely does something on my dev machine, which is running Ubuntu 16.04. -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 28 Jan 2020 22:30:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. Patch Set 1: Does ignore_noninstrumented_modules work on Linux? Looking at the LLVM source, https://reviews.llvm.org/D61708 doesn't look like it's been merged. FWIW, it didn't exist (and certainly didn't work on Linux) when Kudu's TSAN support was added, which is why we went the other direction and recompile all of our dependencies with -fsanitize=thread if doing a TSAN build. -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 28 Jan 2020 22:01:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. Patch Set 1: (5 comments) >When I run Impala-TSAN locally, it uses the TSAN code under: > '/mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc'. > I would guess it should be using the LLVM from the toolchain? Although that > class doesn't seem to exist anywhere in the toolchain/ dir. I think that might be where the source was mounted during the toolchain build? >I'm a bit confused about running TSAN builds with/without the > -build_shared_libs option and how TSAN interacts with static vs dynamic > linking (trying to mimic Kudu's TSAN usage > https://github.com/apache/kudu/blob/master/CMakeLists.txt#L385) I don't think I know any more >According to the TSAN docs, everything (including third party libs) should > be build with -fsanitize=thread > (https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual#non-instrumented-code). > Not sure if we are doing that in Impala? We're not, the toolchain is just a regular release build. http://gerrit.cloudera.org:8080/#/c/15116/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15116/1//COMMIT_MSG@10 PS1, Line 10: existing -tsan build flag. -tsan_full is equivalent to the current -tsan Is -tsan_full useful? Wonder if we should just get rid of it. http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/reservation-tracker.h File be/src/runtime/bufferpool/reservation-tracker.h: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/reservation-tracker.h@316 PS1, Line 316: AtomicInt64 reservation_; I think these members need some explanation that they can be read without holding the lock (otherwise it's confusing whether code should be holding the lock or not). http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/system-allocator.cc File be/src/runtime/bufferpool/system-allocator.cc: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/bufferpool/system-allocator.cc@124 PS1, Line 124: #if defined(ADDRESS_SANITIZER) || defined(THREAD_SANITIZER) Did you run into this bug? I think it's meant to be fixed: https://bugs.llvm.org/show_bug.cgi?id=32968 http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/coordinator-backend-state.cc@179 PS1, Line 179: unique_lock lock(lock_); Could this cause a deadlock if we exit from one of the points below where 'lock_' is held?. I think maybe not since the destructor for lock_guard below runs first. But is there a way we can make the correctness more obvious? http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/runtime/coordinator.h@292 PS1, Line 292: AtomicBool has_called_wait_; // if true, Wait() was called I think the writer should still hold wait_lock_? -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 28 Jan 2020 21:37:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5536/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 28 Jan 2020 19:07:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. Patch Set 1: A few things I could use input on: * When I run Impala-TSAN locally, it uses the TSAN code under: '/mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc'. I would guess it should be using the LLVM from the toolchain? Although that class doesn't seem to exist anywhere in the toolchain/ dir. * I'm a bit confused about running TSAN builds with/without the -build_shared_libs option and how TSAN interacts with static vs dynamic linking (trying to mimic Kudu's TSAN usage https://github.com/apache/kudu/blob/master/CMakeLists.txt#L385) * According to the TSAN docs, everything (including third party libs) should be build with -fsanitize=thread (https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual#non-instrumented-code). Not sure if we are doing that in Impala? -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 28 Jan 2020 18:37:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15116 ) Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/common/init.cc File be/src/common/init.cc: http://gerrit.cloudera.org:8080/#/c/15116/1/be/src/common/init.cc@423 PS1, Line 423: // https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual#non-instrumented-code), line too long (95 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 28 Jan 2020 18:23:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5904: Add tsan full option and fix several TSAN bugs
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15116 Change subject: IMPALA-5904: Add tsan_full option and fix several TSAN bugs .. IMPALA-5904: Add tsan_full option and fix several TSAN bugs This patch adds an additional build flag -tsan_full in addition to the existing -tsan build flag. -tsan_full is equivalent to the current -tsan behavior, and -tsan is changed to set the ignore_noninstrumented_modules flag to true. ignore_noninstrumented_modules causes TSAN to ignore any modules that are not TSAN-instrumented. This is necessary to get TSAN to play nicely with Java, since Java is not TSAN-instrumented (see https://wiki.openjdk.java.net/display/tsan/Main and JDK-8208520). While this might decrease the number of issues surfaced by TSAN, it drastically decreases the amount of noise produced by TSAN because the JVM is not running TSAN-instrumented code. Without this flag set to true, almost every single backend test fails with the error: WARNING: ThreadSanitizer: data race (pid=12939) Write of size 1 at 0x7fcbe379c4c6 by thread T31: #0 strncpy /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:650 (unifiedbetests+0x1b2a4ad) #1 (libjvm.so+0x90e706) This patch fixes various TSAN bugs (e.g. data races) reported while running backend tests and E2E against a TSAN build (it does not make Impala completely TSAN-clean). This patch makes the following changes: * Fixes several bugs involving issues with updating shared variables between threads * Fixes a few race conditions in test classes * Where possible, existing locks are used to fix any data races; in cases where the locking logic is non-trivial, atomics are used * There are a few places where variables are marked as 'volatile' presumably for synchronization purposes; TSAN flags these 'volatile' variables as unsafe, and according to https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rconc-volatile using 'volatile' for synchronization is dangerous; in these cases, the 'volatile' variables are changed to 'atomic' variables * This patch adds a suppression file (bin/tsan-suppresions.txt) similar to the UBSAN suppresion file (bin/ubsan-suppresions.txt) Testing: * Ran exhaustive tests * Manually re-ran backend tests against a TSAN build and made sure the reported errors are gone Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 --- M CMakeLists.txt M be/CMakeLists.txt M be/src/common/init.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/rpc/thrift-thread.cc M be/src/rpc/thrift-thread.h M be/src/runtime/bufferpool/reservation-tracker.cc M be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/bufferpool/system-allocator.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-test.cc M be/src/runtime/io/data-cache.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-stress.h M be/src/runtime/io/scan-range.cc M be/src/service/session-expiry-test.cc M be/src/statestore/statestore-test.cc M be/src/statestore/statestore.cc M be/src/util/runtime-profile-test.cc M be/src/util/stopwatch.h M bin/run-backend-tests.sh A bin/tsan-suppressions.txt M buildall.sh M tests/common/environ.py M tests/webserver/test_web_pages.py 30 files changed, 199 insertions(+), 94 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/15116/1 -- To view, visit http://gerrit.cloudera.org:8080/15116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3d7ef5c228afd5882e145e6f53885b355d6c25a0 Gerrit-Change-Number: 15116 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar