[
https://issues.apache.org/jira/browse/KUDU-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin updated KUDU-3680:
--------------------------------
Component/s: log
> LogRollingITest.TestLogCleanupOnStartup: TSAN reports race due to concurrency
> issue in glog GetLoggingDirectories()
> -------------------------------------------------------------------------------------------------------------------
>
> Key: KUDU-3680
> URL: https://issues.apache.org/jira/browse/KUDU-3680
> Project: Kudu
> Issue Type: Bug
> Components: log, test
> Reporter: Alexey Serbin
> Priority: Major
> Attachments: log-rolling-itest.txt.xz
>
>
> Apparently, the following code in glog's src/logging.cc isn't thread-safe and
> that becomes apparent in TSAN builds of the
> LogRollingITest.TestLogCleanupOnStartup test scenario when scheduler
> anomalies are present:
> {code}
> static vector<string>* logging_directories_list;
>
>
>
> const vector<string>& GetLoggingDirectories() {
>
> // Not strictly thread-safe but we're called early in InitGoogle().
>
> if (logging_directories_list == NULL) {
>
> logging_directories_list = new vector<string>;
>
>
>
> if ( !FLAGS_log_dir.empty() ) {
>
> // A dir was specified, we should use it
>
> logging_directories_list->push_back(FLAGS_log_dir);
>
> } else {
>
> GetTempDirectories(logging_directories_list);
>
> #ifdef GLOG_OS_WINDOWS
>
> char tmp[MAX_PATH];
>
> if (GetWindowsDirectoryA(tmp, MAX_PATH))
>
> logging_directories_list->push_back(tmp);
>
> logging_directories_list->push_back(".\\");
>
> #else
>
> logging_directories_list->push_back("./");
>
> #endif
>
> }
>
> }
>
> return *logging_directories_list;
>
> }
> {code}
> The actual race happens in the code above, and it affects two concurrently
> running threads iterating over the std::vector<string> instance which is
> accessed via const reference in the LogFileObject::Write() method for two
> different LogFileObject objects:
> {code}
> const vector<string> & log_dirs = GetLoggingDirectories();
>
>
>
> // Go through the list of dirs, and try to create the log file in each
>
> // until we succeed or run out of options
>
> bool success = false;
>
> for (vector<string>::const_iterator dir = log_dirs.begin();
>
> dir != log_dirs.end();
>
> ++dir) {
>
> ...
> }
> {code}
> The proper solution for GetLoggingDirectories() would be using atomic
> semantics, where the static pointer is swapped atomically using
> std::atomic::compare_exchange_strong() or similar construct only when the
> underlying std::vector<string> object has been populated with all the
> necessary entries.
> The full log of the LogRollingITest.TestLogCleanupOnStartup scenario with the
> race detected by TSAN is attached.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)