kosak opened a new pull request, #13775:
URL: https://github.com/apache/arrow/pull/13775

   When a user's C++ program links to both Arrow and an installation of the 
Flatbuffers library, the program can crash
   or send corrupt Arrow messages.
   
   The reason for this is version incompatibility between the vendored (and 
trimmed-down) version of Flatbuffers that lives inside Arrow, and whatever 
version the user is using.
   
   The community seems to be aware of this issue, at least as it impacts Java: 
ARROW-5579 
   
   In C++, the problem is especially pernicious because it is not diagnosed 
at build time (e.g. by a duplicate-symbol linker error). The methods involved 
are templates, so the compiler emits their definitions as weak symbols. When 
a weak symbol is defined in two different translation units, the linker 
assumes the definitions are identical and keeps just one. Here, the result is 
that either Arrow or the user program runs different Flatbuffers code than it 
was compiled against, and the program crashes or sends corrupt messages.
   
   Arrow doesn't advertise which version of Flatbuffers it vendored, so the 
user cannot work around the problem by matching versions. In any case, it 
would be a little unfriendly to force the user onto that exact version of 
Flatbuffers even if it could be identified.
   
   The good news is that there is an easy workaround. Arrow C++ doesn't export 
Flatbuffers as part of its public interface. Instead, it just uses it 
internally, as an implementation detail. Therefore it is easy to just move the 
vendored Flatbuffers from the namespace "flatbuffers" to some other private 
namespace. In my PR, I change the namespace to `arrow_thirdparty_flatbuffers`. 
Then I create a namespace alias which makes `flatbuffers` an alias for 
`arrow_thirdparty_flatbuffers`. The net result is that (thanks to the new 
namespace) the symbols the compiler emits, and the linker resolves, live in 
the "private" namespace `arrow_thirdparty_flatbuffers`, and therefore don't 
conflict with any other copy of Flatbuffers, but (thanks to the alias) the 
calling code in the rest of the Arrow library doesn't have to change at all.
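   
   To make the idea concrete, here is a minimal single-file sketch of the 
renaming trick. It is not the code from this PR: `LayoutVersion`, 
`arrow_internal`, `VendoredVersion` and `UserVersion` are hypothetical names 
invented for the illustration, and both "copies" of Flatbuffers are tiny 
stand-ins. Because one file cannot hold both a real `flatbuffers` namespace 
and a global alias of the same name, the sketch declares the alias inside a 
hypothetical `arrow_internal` namespace; in a real build the alias would only 
be visible to Arrow's own translation units.

```cpp
#include <cassert>

// Stand-in for the user's own Flatbuffers installation (hypothetical contents).
namespace flatbuffers {
template <typename T>
int LayoutVersion() { return 2; }  // pretend this is a newer release
}  // namespace flatbuffers

// Stand-in for the copy vendored by the library, moved to a private namespace.
// The template has the same name and signature as the user's copy, but its
// mangled symbol name now contains a different namespace, so the two
// instantiations are distinct symbols and the linker cannot merge them.
namespace arrow_thirdparty_flatbuffers {
template <typename T>
int LayoutVersion() { return 1; }  // pretend this is the older vendored release
}  // namespace arrow_thirdparty_flatbuffers

// Stand-in for the library's internal code (hypothetical namespace).
namespace arrow_internal {
// The alias lets existing code keep spelling the namespace `flatbuffers`.
namespace flatbuffers = ::arrow_thirdparty_flatbuffers;

// Unchanged calling code: `flatbuffers::` resolves to the vendored copy here,
// because the alias in this scope hides the global namespace of the same name.
inline int VendoredVersion() { return flatbuffers::LayoutVersion<int>(); }
}  // namespace arrow_internal

// User code, outside the library, still reaches its own installation.
inline int UserVersion() { return flatbuffers::LayoutVersion<int>(); }
```

   In this sketch, `arrow_internal::VendoredVersion()` calls the vendored 
stand-in and `UserVersion()` calls the user's stand-in, even though both are 
written as `flatbuffers::LayoutVersion<int>()`.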
   
   You might prefer a nested namespace instead, such as 
`arrow::thirdparty::flatbuffers`, or some other choice.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
