[
https://issues.apache.org/jira/browse/KUDU-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin updated KUDU-3680:
--------------------------------
Description:
Apparently, the following code in glog's src/logging.cc isn't thread-safe and
that becomes apparent in TSAN builds of the
LogRollingITest.TestLogCleanupOnStartup test scenario when scheduler anomalies
are present:
{code}
static vector<string>* logging_directories_list;
const vector<string>& GetLoggingDirectories() {
// Not strictly thread-safe but we're called early in InitGoogle().
if (logging_directories_list == NULL) {
logging_directories_list = new vector<string>;
if ( !FLAGS_log_dir.empty() ) {
// A dir was specified, we should use it
logging_directories_list->push_back(FLAGS_log_dir);
} else {
GetTempDirectories(logging_directories_list);
#ifdef GLOG_OS_WINDOWS
char tmp[MAX_PATH];
if (GetWindowsDirectoryA(tmp, MAX_PATH))
logging_directories_list->push_back(tmp);
logging_directories_list->push_back(".\\");
#else
logging_directories_list->push_back("./");
#endif
}
}
return *logging_directories_list;
}
{code}
The actual race happens in the code above, and it affects two concurrently
running threads iterating over the std::vector<string> instance which is
accessed via const reference in the LogFileObject::Write() method for two
different LogFileObject objects:
{code}
const vector<string> & log_dirs = GetLoggingDirectories();
// Go through the list of dirs, and try to create the log file in each
// until we succeed or run out of options
bool success = false;
for (vector<string>::const_iterator dir = log_dirs.begin();
dir != log_dirs.end();
++dir) {
...
}
{code}
The proper solution for GetLoggingDirectories() would be using atomic
semantics, where the static pointer is swapped atomically using
std::atomic::compare_exchange_strong() or similar construct only when the
underlying std::vector<string> object has been populated with all the necessary
entries.
The full log with the race detected by TSAN is attached.
was:
Apparently, the following code in glog's src/logging.cc isn't thread-safe and
that becomes apparent in TSAN builds of the
LogRollingITest.TestLogCleanupOnStartup test scenario when scheduler anomalies
are present:
{code}
static vector<string>* logging_directories_list;
const vector<string>& GetLoggingDirectories() {
// Not strictly thread-safe but we're called early in InitGoogle().
if (logging_directories_list == NULL) {
logging_directories_list = new vector<string>;
if ( !FLAGS_log_dir.empty() ) {
// A dir was specified, we should use it
logging_directories_list->push_back(FLAGS_log_dir);
} else {
GetTempDirectories(logging_directories_list);
#ifdef GLOG_OS_WINDOWS
char tmp[MAX_PATH];
if (GetWindowsDirectoryA(tmp, MAX_PATH))
logging_directories_list->push_back(tmp);
logging_directories_list->push_back(".\\");
#else
logging_directories_list->push_back("./");
#endif
}
}
return *logging_directories_list;
}
{code}
The actual race happens in the code above, and it affects two threads iterating
over the std::vector<string> instance which is accessed via const reference and
being iterated over in the LogFileObject::Write() method for two corresponding
LogFileObject objects:
{code}
const vector<string> & log_dirs = GetLoggingDirectories();
// Go through the list of dirs, and try to create the log file in each
// until we succeed or run out of options
bool success = false;
for (vector<string>::const_iterator dir = log_dirs.begin();
dir != log_dirs.end();
++dir) {
...
}
{code}
The proper solution for GetLoggingDirectories() would be using atomic
semantics, where the static pointer is swapped atomically using
std::atomic::compare_exchange_strong() or similar construct only when the
underlying std::vector<string> object has been populated with all the necessary
entries.
The full log with the race detected by TSAN is attached.
> LogRollingITest.TestLogCleanupOnStartup: TSAN reports race due to concurrency
> issue in glog GetLoggingDirectories()
> -------------------------------------------------------------------------------------------------------------------
>
> Key: KUDU-3680
> URL: https://issues.apache.org/jira/browse/KUDU-3680
> Project: Kudu
> Issue Type: Bug
> Components: test
> Reporter: Alexey Serbin
> Priority: Major
> Attachments: log-rolling-itest.txt.xz
>
>
> Apparently, the following code in glog's src/logging.cc isn't thread-safe and
> that becomes apparent in TSAN builds of the
> LogRollingITest.TestLogCleanupOnStartup test scenario when scheduler
> anomalies are present:
> {code}
> static vector<string>* logging_directories_list;
>
>
>
> const vector<string>& GetLoggingDirectories() {
>
> // Not strictly thread-safe but we're called early in InitGoogle().
>
> if (logging_directories_list == NULL) {
>
> logging_directories_list = new vector<string>;
>
>
>
> if ( !FLAGS_log_dir.empty() ) {
>
> // A dir was specified, we should use it
>
> logging_directories_list->push_back(FLAGS_log_dir);
>
> } else {
>
> GetTempDirectories(logging_directories_list);
>
> #ifdef GLOG_OS_WINDOWS
>
> char tmp[MAX_PATH];
>
> if (GetWindowsDirectoryA(tmp, MAX_PATH))
>
> logging_directories_list->push_back(tmp);
>
> logging_directories_list->push_back(".\\");
>
> #else
>
> logging_directories_list->push_back("./");
>
> #endif
>
> }
>
> }
>
> return *logging_directories_list;
>
> }
> {code}
> The actual race happens in the code above, and it affects two concurrently
> running threads iterating over the std::vector<string> instance which is
> accessed via const reference in the LogFileObject::Write() method for two
> different LogFileObject objects:
> {code}
> const vector<string> & log_dirs = GetLoggingDirectories();
>
>
>
> // Go through the list of dirs, and try to create the log file in each
>
> // until we succeed or run out of options
>
> bool success = false;
>
> for (vector<string>::const_iterator dir = log_dirs.begin();
>
> dir != log_dirs.end();
>
> ++dir) {
>
> ...
> }
> {code}
> The proper solution for GetLoggingDirectories() would be using atomic
> semantics, where the static pointer is swapped atomically using
> std::atomic::compare_exchange_strong() or similar construct only when the
> underlying std::vector<string> object has been populated with all the
> necessary entries.
> The full log with the race detected by TSAN is attached.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)