Pavel Sanda wrote:

> Georg Baum wrote:
>> 1) I could not find any write access to string instances created as copy
>> of Language::babel_. They are all const, so my first example is actually
>> not used here (but might be used elsewhere, one area to test would be our
>> VCS and graphics converter code, both use multiple threads as well).
> 
> IIRC think VCS does not use threading.

All VCS operations that call external tools use it indirectly: Via 
SystemCall.

>> I am now 100% sure and if needed I can try to give better explanations.
> 
> Amazing that we don't crash all the time if the analysis in the previous
> mails is correct.

I'd see it the other way round: It is amazing that it crashes reliably on my 
machine, since the time window for it to happen is very, very small. 
However, it is not the first multi-threading bug I saw that heavily depends 
on the "right conditions" to happen.

Anyway, I id some experiments with different solutions:

forcecopy.diff shows how a solution with a small helper function looks like. 
This is quite simply, but it requires to manually use it in all places where 
strings could be copied in multiple threads. Not nice, but I think it is 
doable.

stringclass.diff shows an incomplete solution with an own string class. I 
used docstring (this would be needed for std::string as well), since a 
replacement of std::string would produce a huge diff. The advantage is that 
you do not need to think about using a deep copy or not, but the 
disadvantage is that you need to duplicate a lot of base class 
functionality, and it produces warnings about ambigous operator+ (but 
without these ambiguities, other parts of the code don't compile).

I did not try Jean-Marcs suggestion about de-parallelizing the export and 
only calling the external tools in parallel. I am pretty sure that it works, 
however if we go this way then we basically say that we don't want parallel 
code in LyX itself. Despite I like the idea very much I am not so sure 
whether we should go this way, since the processor trend goes towards more 
cores and more energy efficency (which often means lower clock frequency), 
so we might need to use parallelism even more in the future.

Currently I prefer the deep_copy() method: The own string class which I 
initially liked more has too many side effects, then we can as well use a 
small helper which is better to understand. What do others think? How should 
we proceed?



Georg
diff --git a/config/lyxinclude.m4 b/config/lyxinclude.m4
index 95cb1f0..01d57bc 100644
--- a/config/lyxinclude.m4
+++ b/config/lyxinclude.m4
@@ -297,6 +297,7 @@ if test x$GXX = xyes; then
 	      ;;
       esac
   fi
+  AC_DEFINE(STD_STRING_USES_COW, 1, [std::string uses copy-on-write])
 fi
 test "$lyx_pch_comp" = yes && lyx_flags="$lyx_flags pch"
 AM_CONDITIONAL(LYX_BUILD_PCH, test "$lyx_pch_comp" = yes)
diff --git a/src/output_latex.cpp b/src/output_latex.cpp
index 3595a08..b8a8aef 100644
--- a/src/output_latex.cpp
+++ b/src/output_latex.cpp
@@ -656,13 +656,13 @@ void TeXOnePar(Buffer const & buf,
 
 	bool const use_polyglossia = runparams.use_polyglossia;
 	string const par_lang = use_polyglossia ?
-		getPolyglossiaEnvName(par_language): par_language->babel();
+		getPolyglossiaEnvName(par_language): deep_copy(par_language->babel());
 	string const prev_lang = use_polyglossia ?
-		getPolyglossiaEnvName(prev_language) : prev_language->babel();
+		getPolyglossiaEnvName(prev_language) : deep_copy(prev_language->babel());
 	string const doc_lang = use_polyglossia ?
-		getPolyglossiaEnvName(doc_language) : doc_language->babel();
+		getPolyglossiaEnvName(doc_language) : deep_copy(doc_language->babel());
 	string const outer_lang = use_polyglossia ?
-		getPolyglossiaEnvName(outer_language) : outer_language->babel();
+		getPolyglossiaEnvName(outer_language) : deep_copy(outer_language->babel());
 	string lang_begin_command = use_polyglossia ?
 		"\\begin{$$lang}" : lyxrc.language_command_begin;
 	string lang_end_command = use_polyglossia ?
@@ -953,7 +953,7 @@ void TeXOnePar(Buffer const & buf,
 						: outer_language;
 				string const current_lang = use_polyglossia
 					? getPolyglossiaEnvName(current_language)
-					: current_language->babel();
+					: deep_copy(current_language->babel());
 				if (!current_lang.empty()) {
 					os << from_ascii(subst(
 						lang_begin_command,
@@ -1131,7 +1131,7 @@ void latexParagraphs(Buffer const & buf,
 	// language on at start
 	string const mainlang = runparams.use_polyglossia
 		? getPolyglossiaEnvName(bparams.language)
-		: bparams.language->babel();
+		: deep_copy(bparams.language->babel());
 	string const lang_begin_command = runparams.use_polyglossia ?
 		"\\begin{$$lang}" : lyxrc.language_command_begin;
 
diff --git a/src/support/lstrings.h b/src/support/lstrings.h
index 0d21e95..7639095 100644
--- a/src/support/lstrings.h
+++ b/src/support/lstrings.h
@@ -24,6 +24,23 @@
 namespace lyx {
 namespace support {
 
+/// Helper to enforce creating a deep copy of lyx::docstring or std::string
+/// even if std::basic_string uses copy-on-write as in GNU libstdc++.
+/// \sa https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334
+template<typename Char, typename Traits, typename Alloc>
+typename std::basic_string<Char, Traits, Alloc>
+deep_copy(typename std::basic_string<Char, Traits, Alloc> const & src)
+{
+#ifdef STD_STRING_USES_COW
+	typedef typename std::basic_string<Char, Traits, Alloc> String;
+	// Use constructor with two arguments to support strings with embedded
+	// \0 characters
+	return String(src.c_str(), src.length());
+#else
+	return src;
+#endif
+}
+
 /// Compare \p s and \p s2, ignoring the case.
 /// Does not depend on the locale.
 int compare_no_case(docstring const & s, docstring const & s2);

diff --git a/config/lyxinclude.m4 b/config/lyxinclude.m4
index 95cb1f0..01d57bc 100644
--- a/config/lyxinclude.m4
+++ b/config/lyxinclude.m4
@@ -297,6 +297,7 @@ if test x$GXX = xyes; then
 	      ;;
       esac
   fi
+  AC_DEFINE(STD_STRING_USES_COW, 1, [std::string uses copy-on-write])
 fi
 test "$lyx_pch_comp" = yes && lyx_flags="$lyx_flags pch"
 AM_CONDITIONAL(LYX_BUILD_PCH, test "$lyx_pch_comp" = yes)
diff --git a/src/support/TempFile.cpp b/src/support/TempFile.cpp
index 21559e1..15d2527 100644
--- a/src/support/TempFile.cpp
+++ b/src/support/TempFile.cpp
@@ -13,6 +13,7 @@
 #include "support/TempFile.h"
 
 #include "support/debug.h"
+#include "support/docstring.h"
 #include "support/FileName.h"
 #include "support/Package.h"
 #include "support/qstring_helpers.h"
diff --git a/src/support/docstream.cpp b/src/support/docstream.cpp
index 38cabc4..5318548 100644
--- a/src/support/docstream.cpp
+++ b/src/support/docstream.cpp
@@ -506,6 +506,15 @@ otexstream & operator<<(otexstream & ots, docstring const & s)
 }
 
 
+#ifdef STD_STRING_USES_COW
+otexstream & operator<<(otexstream & ots, std::basic_string<char_type> const & s)
+{
+	docstring const ds(s);
+	return ots << ds;
+}
+#endif
+
+
 otexstream & operator<<(otexstream & ots, string const & s)
 {
 	ots << from_utf8(s);
diff --git a/src/support/docstream.h b/src/support/docstream.h
index e2f56f9..4417d32 100644
--- a/src/support/docstream.h
+++ b/src/support/docstream.h
@@ -160,6 +160,9 @@ otexstream & operator<<(otexstream &, SafeBreakLine);
 otexstream & operator<<(otexstream &, odocstream_manip);
 ///
 otexstream & operator<<(otexstream &, docstring const &);
+#ifdef STD_STRING_USES_COW
+otexstream & operator<<(otexstream &, std::basic_string<char_type> const &);
+#endif
 ///
 otexstream & operator<<(otexstream &, std::string const &);
 ///
diff --git a/src/support/docstring.cpp b/src/support/docstring.cpp
index f44259d..4ed9d23 100644
--- a/src/support/docstring.cpp
+++ b/src/support/docstring.cpp
@@ -32,6 +32,40 @@ using lyx::support::isHexChar;
 
 namespace lyx {
 
+#ifdef STD_STRING_USES_COW
+docstring operator+(docstring const & lhs, docstring const & rhs)
+{
+	docstring s(lhs);
+	s.append(rhs);
+	return s;
+}
+docstring operator+(docstring const & lhs, char_type const * rhs)
+{
+	docstring s(lhs);
+	s.append(rhs);
+	return s;
+}
+docstring operator+(char_type const * lhs, docstring const & rhs)
+{
+	docstring s(lhs);
+	s.append(rhs);
+	return s;
+}
+docstring operator+(docstring const & lhs, char_type rhs)
+{
+	docstring s(lhs);
+	s.append(1, rhs);
+	return s;
+}
+docstring operator+(char_type lhs, docstring const & rhs)
+{
+	docstring s(1, lhs);
+	s.append(rhs);
+	return s;
+}
+#endif
+
+
 docstring const from_ascii(char const * ascii)
 {
 	docstring s;
diff --git a/src/support/docstring.h b/src/support/docstring.h
index 5eea011..9aaa747 100644
--- a/src/support/docstring.h
+++ b/src/support/docstring.h
@@ -24,7 +24,41 @@ namespace lyx {
  * Use std::string only in cases 7-bit ASCII is to be manipulated
  * within the variable.
  */
+#ifdef STD_STRING_USES_COW
+class docstring : public std::basic_string<char_type>
+{
+typedef std::basic_string<char_type> base;
+public:
+	docstring() {}
+	docstring(size_type n, char_type c) : base(n, c) {}
+	docstring(char_type const * s) : base(s) {}
+	docstring(char_type const * s, size_type n) : base(s, n) {}
+	docstring(std::basic_string<char_type> const & s, size_type pos, size_type len = npos) : base(s, pos, len) {}
+	template <class InputIterator> docstring(InputIterator first, InputIterator last) : base(first, last) {}
+	docstring(std::basic_string<char_type> const & that) : base(that.c_str(), that.length()) {}
+	docstring & operator=(std::basic_string<char_type> const & that)
+	{
+		if (&that == this)
+			return *this;
+		base::operator=(that.c_str());
+		return *this;
+	}
+	docstring & operator=(char_type that)
+	{
+		base::operator=(that);
+		return *this;
+	}
+	docstring& operator+=(docstring const & s) { append(s); return *this; }
+	docstring& operator+=(char_type const * s) { append(s); return *this; }
+	docstring& operator+=(char_type c) { append(1, c); return *this; }
+	docstring substr(size_type pos = 0, size_type n = npos) const
+	{
+		return docstring(*this, pos, n);
+	}
+};
+#else
 typedef std::basic_string<char_type> docstring;
+#endif
 
 /// Creates a docstring from a C string of ASCII characters
 docstring const from_ascii(char const *);
@@ -98,6 +132,14 @@ docstring & operator+=(docstring &, char const *);
 /// Append a single ASCII character to a docstring
 docstring & operator+=(docstring & l, char r);
 
+#ifdef STD_STRING_USES_COW
+docstring operator+(docstring const & lhs, docstring const & rhs);
+docstring operator+(docstring const & lhs, char_type const * rhs);
+docstring operator+(char_type const * lhs, docstring const & rhs);
+docstring operator+(docstring const & lhs, char_type rhs);
+docstring operator+(char_type lhs, docstring const & rhs);
+#endif
+
 } // namespace lyx
 
 
diff --git a/src/support/strfwd.h b/src/support/strfwd.h
index e40ec1c..afdabb1 100644
--- a/src/support/strfwd.h
+++ b/src/support/strfwd.h
@@ -69,8 +69,12 @@ typedef basic_ostringstream<char, char_traits<char>, allocator<char> > ostringst
 namespace lyx {
 
 /// String type for storing the main text in UCS4 encoding
+#ifdef STD_STRING_USES_COW
+class docstring;
+#else
 typedef std::basic_string<char_type, std::char_traits<char_type>,
 	std::allocator<char_type> > docstring;
+#endif
 
 /// Base class for UCS4 input streams
 typedef std::basic_istream<char_type, std::char_traits<char_type> > idocstream;

Reply via email to