Thanks everyone. I think there are two issues being discussed here and I'd
like to keep them separate:

1. the ABI compatibility of Arrow's pip binary release.
It's true that there is no ABI standard and the topic is messy, but
as Antoine pointed out:

> If you'd like to benefit from the PyArrow binary
> packages, including the C++ API, then you need to use the same toolchain
> (or an ABI-compatible toolchain, but I'm afraid there's no clear
> specification of ABI compatibility in g++ / libstdc++ land).

we should be safe. I think this is backed up by manylinux (which says
everyone should use GCC/libstdc++ and should not use a GNU ABI version
newer than X) and by the GNU ABI Policy and Guidelines [1] (which says
binaries with equivalent DT_SONAMEs are forward-compatible; IIUC the
SONAME has been libstdc++.so.6 for quite a while, since GCC 3.4).
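
As a rough aid for comparing toolchains, an extension could record which
compiler and libstdc++ headers it was built with. The sketch below is only
an illustration: ToolchainFingerprint is a hypothetical helper name, and it
relies on the GCC / libstdc++ macros __GNUC__, __GLIBCXX__ and
_GLIBCXX_USE_CXX11_ABI, which other toolchains may not define.

```cpp
#include <string>

// Hypothetical helper: describes the compiler and libstdc++ headers this
// translation unit was compiled against. The macros below are specific to
// GCC / libstdc++; on other toolchains the matching parts are skipped.
inline std::string ToolchainFingerprint() {
  std::string out;
#if defined(__GNUC__) && !defined(__clang__)
  out += "gcc " + std::to_string(__GNUC__) + "." +
         std::to_string(__GNUC_MINOR__);
#else
  out += "non-gcc compiler";
#endif
#if defined(__GLIBCXX__)
  // Release date stamp of the libstdc++ headers, e.g. a value like YYYYMMDD.
  out += ", libstdc++ " + std::to_string(__GLIBCXX__);
#endif
#if defined(_GLIBCXX_USE_CXX11_ABI)
  // 0 or 1: which std::string / std::list ABI the headers selected.
  out += ", cxx11 abi " + std::to_string(_GLIBCXX_USE_CXX11_ABI);
#endif
  return out;
}
```

Comparing this string between two builds only tells you whether the
toolchains match; it does not prove that differing toolchains are
incompatible.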

2. the ODR (one definition rule) violation caused by template classes,
specifically STL classes.

Strictly speaking, this is not about ABI compatibility, and sticking to
manylinux does not prevent this problem. The problem arises because the
STL headers shipped with GCC change between versions: there is no
guarantee that STL classes will keep the same layout forever, and the
layout did change without notice (see the example in my original post).
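
To make the failure mode concrete, here is a minimal sketch. The two Node
structs are hypothetical stand-ins for one library template as seen through
two header versions; they are not the actual libstdc++ change. PublicClass
plays the role of a public class that embeds such a type by value.

```cpp
#include <cstddef>
#include <cstdint>

// The "same" library type as declared by two versions of a header.
namespace headers_v1 {
struct Node {
  void* next;
  std::size_t size;
};
}  // namespace headers_v1

namespace headers_v2 {
struct Node {
  void* next;
  std::size_t extra;  // member added in the newer header (hypothetical)
  std::size_t size;
};
}  // namespace headers_v2

// A public class that holds the library type by value.
template <typename NodeT>
struct PublicClass {
  NodeT node;
  std::int32_t user_field;
};

// What the shared library was compiled with vs. what the client sees.
using AsBuiltByLibrary = PublicClass<headers_v1::Node>;
using AsSeenByClient = PublicClass<headers_v2::Node>;

// True only if both sides agree on the size and on where user_field lives.
constexpr bool LayoutsAgree() {
  return sizeof(AsBuiltByLibrary) == sizeof(AsSeenByClient) &&
         offsetof(AsBuiltByLibrary, user_field) ==
             offsetof(AsSeenByClient, user_field);
}

static_assert(!LayoutsAgree(), "the two header versions disagree on layout");
```

In the real situation both sides use the same class name, so nothing fails
at compile or link time; the client simply reads and writes user_field at
the wrong offset.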

Again, note that manylinux does not specify which toolchain everyone should
use. It merely specifies the maximum version of those fundamental
libraries. And with manylinux2010, people might have more choices in
compiler versions. For example, devtoolset-6 and devtoolset-7 both qualify.

I guess I was asking for a policy or guideline regarding how to
correctly build things that depend on Arrow's pip release. Even if the
guideline says "you need to build your library in this docker image", it's
still an improvement over the current situation, although it might greatly
limit developers' choices if they also want to depend on some other
library or want to use a newer / older GCC version.

Or maybe we could disallow STL classes in Arrow's public headers. This
might not be feasible, because std::shared_ptr and std::vector are used
everywhere.
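
For data members at least, one known way to keep STL types out of a public
class declaration is the pimpl idiom. The sketch below assumes a
hypothetical NameList class (not an Arrow type) whose public part exposes
only built-in types. It does not help with STL types that appear in
function signatures.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// ---- what would live in the public header ----
// Only built-in types appear here, so the class layout seen by clients
// does not depend on the STL headers they compile against.
class NameList {
 public:
  NameList();
  ~NameList();
  NameList(const NameList&) = delete;
  NameList& operator=(const NameList&) = delete;

  void Add(const char* name);
  std::size_t size() const;
  const char* at(std::size_t i) const;  // throws std::out_of_range if i >= size()

 private:
  struct Impl;  // defined only inside the library
  Impl* impl_;  // opaque pointer to the STL-based state
};

// ---- what would live in the library's own source file ----
// STL containers are used freely here; only the library's compiler ever
// instantiates them.
struct NameList::Impl {
  std::vector<std::string> names;
};

NameList::NameList() : impl_(new Impl) {}
NameList::~NameList() { delete impl_; }
void NameList::Add(const char* name) { impl_->names.emplace_back(name); }
std::size_t NameList::size() const { return impl_->names.size(); }
const char* NameList::at(std::size_t i) const {
  return impl_->names.at(i).c_str();
}
```

The cost is an extra allocation and indirection per object, plus a wrapper
method for every operation that clients need.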

Or maybe we only allow some "safe" STL classes in the public headers. But
there is no guarantee for them to be safe. It's purely empirical.

On Thu, Jun 20, 2019 at 3:47 PM Zhuo Peng <bril...@gmail.com> wrote:

> Dear Arrow maintainers,
>
> I work on several TFX (TensorFlow eXtended) [1] projects (e.g. TensorFlow
> Data Validation [2]) and am trying to use Arrow in them. These projects are
> mostly written in Python but have C++ code as Python extension modules,
> therefore we use both Arrow’s C++ and Python APIs. Our projects are
> distributed through PyPI as binary packages.
>
> The python extension modules are compiled with the headers shipped within
> pyarrow PyPI binary package and are linked with libarrow.so and
> libarrow_python.so in the same package. So far we’ve seen two major
> problems:
>
> * There are STL container definitions in public headers.
>
> It causes problems because the binary for template classes is generated at
> compilation time. And the definition of those template classes might differ
> from compiler to compiler. This might happen even if we use a different GCC
> version than the one that compiled pyarrow (for example, the layout of
> std::unordered_map<> has changed in GCC 5.2 [3], and arrow::Schema used to
> contain an std::unordered_map<> member [4].)
>
> One might argue that everyone releasing manylinux1 packages should use
> exactly the same compiler, as provided by the pypa docker image, however
> the standard only specifies the maximum versions of corresponding
> fundamental libraries [5]. Newer GCC versions could be backported to work
> with older libraries [6].
>
> A recent change in Arrow [7] has removed most (but not all [8]) of the STL
> members in publicly accessible class declarations and will resolve our
> immediate problem, but I wonder if there is, or there should be an explicit
> policy on the ABI compatibility, especially regarding the usage of template
> functions / classes in public interfaces?
>
> * Our wheel cannot pass “auditwheel repair”
>
> I don’t think it’s correct to pull libarrow.so and libarrow_python.so into
> our wheel and have user’s Python load both our libarrow.so and pyarrow’s,
> but that’s what “auditwheel repair” attempts to do. But if we don’t allow
> auditwheel to do so, it refuses to stamp on our wheel because it has
> “external” dependencies.
>
> This does not seem to be an Arrow problem, but I wonder if others in the community
> have had to deal with similar issues and what the resolution is. Our
> current workaround is to manually stamp the wheel.
>
>
> Thanks,
> Zhuo
>
>
> References:
>
> [1] https://github.com/tensorflow/tfx
> [2] https://github.com/tensorflow/data-validation
> [3]
> https://github.com/gcc-mirror/gcc/commit/54b755d349d17bb197511529746cd7cf8ea761c1#diff-f82d3b9fa19961eed132b10c9a73903e
> [4]
> https://github.com/apache/arrow/blob/b22848952f09d6f9487feaff80ee358ca41b1562/cpp/src/arrow/type.h#L532
> [5] https://www.python.org/dev/peps/pep-0513/#id40
> [6] https://github.com/pypa/auditwheel/issues/125#issuecomment-438513357
> [7]
> https://github.com/apache/arrow/commit/7a5562174cffb21b16f990f64d114c1a94a30556
> [8]
> https://github.com/apache/arrow/blob/a0e1fbb9ef51d05a3f28e221cf8c5d4031a50c93/cpp/src/arrow/ipc/dictionary.h#L91
>
