On 07/12/2021 at 23:46, Niranda Perera wrote:
On Tue, Dec 7, 2021 at 5:43 PM Antoine Pitrou <anto...@python.org> wrote:
On 07/12/2021 at 23:35, Niranda Perera wrote:
Hi all,
I'd like to discuss a packaging change for Arrow.
AFAIU, there are two broad categories of frameworks that use Arrow.
1. Projects that only use Arrow core (e.g. cudf, ray): they follow
the Arrow format but internally use their own Arrow implementation, so
they mostly need the Arrow core public APIs to read and write data when
converting to/from their internal implementation.
2. Projects that use Arrow intimately (e.g. cylon): they rely heavily on
Arrow sub-components (e.g. compute, flight). These may also depend on or
support category 1 projects (e.g. GCylon with cudf).
Now, as a member of the latter category, a major challenge we face is
managing dependencies. We currently depend on Arrow v5 and cudf 21.10 but
cannot upgrade to v6 because cudf has yet to upgrade its Arrow
dependency. But when we look at the version upgrade PR [1], there are
hardly any API changes.
Why doesn't cudf simply relax the version requirements if they know their
code runs with both Arrow 5.0 and 6.0?
It's discussed here.
https://github.com/rapidsai/cudf/pull/9686#issuecomment-969079069
I see. I wonder if that can be solved by choosing different linker
options, but a linker expert would have to answer.
Here is another suggestion: since it seems cudf uses only a basic set of
Arrow C++ APIs, should it simply link against a private static build of
Arrow C++? I'm afraid I'm not sure how the Arrow-cudf conversion APIs
are supposed to be exposed.
Regards
Antoine.