https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67903

--- Comment #5 from Yucheng Low <ylow at graphlab dot com> ---
After some deep investigation on a related issue, I think I might finally have
a root cause.

Introduction
------------
 - We compile as a shared library that is imported into Python as part of a
 Python module.
 - We want to use C++11 features, yet we want to be able to run on relatively
 old Linux distributions, hence we must package our own libstdc++.
 - However, there are other Python modules which are source distributions and
 compile on the system itself, hence depend on the system's existing libstdc++.
 - Since the system will only load one instance of libstdc++.so at a time, we
 can't merely distribute libstdc++.so. Doing so would mean that whether we load
 correctly depends on import ordering.
 - Therefore we must statically link libstdc++ into our shared library.
 - As a result there are actually two instances of the libstdc++ symbols loaded
 into Python. (This ought to work, since Python uses RTLD_LOCAL.)

Issue
-----
When any other libstdc++ is loaded, our iostreams have issues. Specifically,
of the two snippets below, the first fails with std::bad_cast and the second
leaves a nonsense value in d.


```
boost::lexical_cast<double>("0.01")
```

```
double d;
std::stringstream strm("0.01");
strm >> d;
```



Investigation
-------------
std::locale is associated with a collection of facets
(std::codecvt, std::ctype, std::num_get, std::num_put, etc.).
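To make the facet lookup concrete, here is a minimal sketch of roughly what
`strm >> d` does under the hood: it fetches std::num_get<char> from the
stream's locale and asks it to parse. Only standard library calls are used;
the helper name parse_with_num_get is mine, purely for illustration.

```cpp
#include <cassert>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parse a double by calling the num_get facet directly.
double parse_with_num_get(const std::string& text) {
    std::istringstream strm(text);
    const std::locale loc = strm.getloc();

    // std::use_facet throws std::bad_cast when the locale does not hold a
    // std::num_get<char> in the slot it looks at.
    assert(std::has_facet<std::num_get<char>>(loc));
    const std::num_get<char>& facet = std::use_facet<std::num_get<char>>(loc);

    double value = 0.0;
    std::ios_base::iostate err = std::ios_base::goodbit;
    facet.get(std::istreambuf_iterator<char>(strm),
              std::istreambuf_iterator<char>(), strm, err, value);
    return value;
}
```

In a healthy process, parse_with_num_get("0.01") returns 0.01. In the broken
setup described here, it is the std::use_facet step that I would expect to go
wrong.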

Every facet has a static member, id, which internally holds an index value
that is initially set to 0.

[facet]::id._M_id(), when first called, assigns the internal index from an
atomically incremented integer. Subsequent calls return the index that was
already assigned. This index represents a position in the locale's facet array.
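The lazy assignment can be modeled in a few lines. This is a simplified sketch
of the mechanism as I understand it, not the libstdc++ source: the class and
member names (FacetId, index, next_index_) are made up, and the unsynchronized
check on index_ means it is only meant for single-threaded illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Simplified model of a facet id; names are illustrative, not libstdc++'s.
class FacetId {
public:
    // Returns the slot index, assigning one from the shared counter on the
    // first call and reusing it on every later call.
    std::size_t index() {
        if (index_ == 0) {
            // fetch_add returns the previous counter value, so add 1 to get
            // the freshly incremented one.
            index_ = next_index_.fetch_add(1) + 1;
        }
        return index_ - 1;
    }

private:
    std::size_t index_ = 0;  // 0 means "not assigned yet"
    static std::atomic<std::size_t> next_index_;
};

std::atomic<std::size_t> FacetId::next_index_{0};
```

Two FacetId objects therefore receive consecutive indexes in the order they
are first queried, and each keeps its index afterwards.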

Key Issue
---------
Now, the key issue is that all [facet]::id are UNIQUE symbols, which cross
RTLD_LOCAL boundaries, EXCEPT for ctype<char>::id, codecvt<char>::id,
ctype<wchar_t>::id, and codecvt<wchar_t>::id, which are WEAK.

This means that all other IDs are shared across multiple libstdc++ libraries
(which is not a problem if the number of facets does not change).

But the second loaded libstdc++ will reassign the indexes 0-3 to the four IDs
above (since it gets its own copy of those four symbols). This causes
codecvt<wchar_t> to be assigned the same index as std::num_get<char>,
overwriting its position in the locale, and causing the above code to fail the
dynamic cast from std::locale::facet to std::num_get.
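The collision can be reproduced with a toy model. Everything in this sketch is
an assumption made for illustration: the type names, the idea of modeling
UNIQUE symbols as one shared map and WEAK symbols as a per-copy map, and in
particular the abridged installation order in kOrder, which I chose so that
the numbers line up with the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One id object: 0 means "unassigned", otherwise slot index + 1.
struct Id { std::size_t stored = 0; };

// One loaded copy of the library: its own counter and its own private ids.
struct LibraryInstance {
    std::size_t counter = 0;
    std::map<std::string, Id> private_ids;  // models the WEAK, per-copy ids
};

// Ids that every loaded copy resolves to the same object (models UNIQUE).
using SharedIds = std::map<std::string, Id>;

bool is_private(const std::string& name) {
    return name == "ctype<char>" || name == "codecvt<char>" ||
           name == "ctype<wchar_t>" || name == "codecvt<wchar_t>";
}

// Install facets in order and report the slot each one ends up in.
std::map<std::string, std::size_t> init_locale(
        LibraryInstance& lib, SharedIds& shared,
        const std::vector<std::string>& order) {
    std::map<std::string, std::size_t> slots;
    for (const std::string& name : order) {
        Id& id = is_private(name) ? lib.private_ids[name] : shared[name];
        if (id.stored == 0) id.stored = ++lib.counter;  // first use only
        slots[name] = id.stored - 1;
    }
    return slots;
}

// Assumed, abridged installation order (illustration only).
const std::vector<std::string> kOrder = {
    "ctype<char>", "codecvt<char>", "numpunct<char>", "num_get<char>",
    "ctype<wchar_t>", "codecvt<wchar_t>"};
```

Calling init_locale for a first LibraryInstance and then for a second one
against the same SharedIds gives distinct slots in the first copy, while in
the second copy codecvt<wchar_t> lands in the slot that num_get<char> already
owns.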

Solution
--------
It might be easy to solve this problem for libstdc++ going forward: make all
the [facet]::id variables WEAK.

However, that is not sufficient to solve compatibility with older versions of
libstdc++. A better solution would be to rename all the [facet]::id symbols.
However, [facet]::id is part of the C++ spec, which makes renaming it somewhat
problematic. (It might be possible to use a version script to add a version to
the symbol?)

(Internal Hack)
---------------
To solve this in our project we change all usage of facet::id to facet::id_2.
So when installing new facets, we look for the existence of facet::id_2
first. This requires introducing id_2 to all the facets, and modifying
_M_install_facet and _M_replace_facet to use SFINAE to select facet::id_2 if
it exists, and fall back to facet::id otherwise.
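For anyone curious what that selection looks like, here is a stripped-down
sketch of the SFINAE idea. It is not our actual patch: Id stands in for
std::locale::id, and PatchedFacet, LegacyFacet, select_id and facet_id are
hypothetical names invented for this example.

```cpp
#include <cassert>

struct Id { int tag; };  // stand-in for std::locale::id

// Hypothetical facets: one carries the renamed id_2, the other only has id.
struct PatchedFacet { static Id id; static Id id_2; };
struct LegacyFacet  { static Id id; };
Id PatchedFacet::id{1};
Id PatchedFacet::id_2{2};
Id LegacyFacet::id{3};

// Preferred overload: only viable when Facet::id_2 exists. The literal 0
// passed by facet_id is an int, so this overload wins when it is available.
template <typename Facet>
auto select_id(int) -> decltype((Facet::id_2)) { return Facet::id_2; }

// Fallback overload: picked when the one above is discarded by SFINAE.
template <typename Facet>
Id& select_id(long) { return Facet::id; }

// Entry point: returns id_2 if the facet has one, otherwise id.
template <typename Facet>
Id& facet_id() { return select_id<Facet>(0); }
```

With these definitions, facet_id<PatchedFacet>() refers to
PatchedFacet::id_2, while facet_id<LegacyFacet>() falls back to
LegacyFacet::id.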

This is a rather unhappy hack and I would like to avoid this if possible.
If anyone has a better suggestion I am all ears.
