Would anyone have some bandwidth in the next couple of months to help with this?
On Thu, Apr 30, 2020 at 9:10 AM Wes McKinney <[email protected]> wrote:
>
> The proposal is for any BUNDLED dependency to be merged into
> libarrow.a (or another one of the static libraries if the dependency
> is only used in e.g. one subcomponent), so this also applies to the
> AWS SDK.
>
> On Thu, Apr 30, 2020 at 3:02 AM Rémi Dettai <[email protected]> wrote:
> >
> > Hi!
> >
> > Does your point 1 also apply to the AWS SDK dependency? Currently it seems
> > that it cannot be built in BUNDLED mode. As stated in
> > https://issues.apache.org/jira/browse/ARROW-8565 I struggled a lot to make
> > a static build with the S3 dependency activated! I would really like to
> > help on this because it is very important for my use case that we can
> > assemble compact builds of Arrow, but I'm still very uncomfortable with
> > CMake :-(
> >
> > Thanks for your amazing work!
> >
> > Remi
> >
> > On Tue, Apr 28, 2020 at 4:22 PM, Wes McKinney <[email protected]> wrote:
> >
> > > hi folks,
> > >
> > > I would like to highlight some outstanding problems with our packages
> > >
> > > 1. Our Arrow C++ static libraries are generally unusable.
> > >
> > > Whenever -DARROW_JEMALLOC=ON or any dependency is built in BUNDLED
> > > mode, libarrow.a (or other static libraries) cannot be used for
> > > linking. That's because the static library has a dependency on the
> > > bundled static libraries which are _not_ packaged with the Arrow static
> > > libraries.
> > >
> > > The preferred solution seems to be ARROW-7605. I demonstrated how this
> > > works in
> > >
> > > https://github.com/apache/arrow/pull/6220
> > >
> > > but I need someone to help with the PR to deal with other BUNDLED
> > > dependencies. I likely won't be able to complete the PR myself in time
> > > for the next release.
> > >
> > > 2. Our Python packages are unacceptably large
> > >
> > > On Linux, wheels are now 64MB and after installation take up 218MB.
> > > There is an immediate serious problem that has gone unresolved that is
> > > easier to fix and a separate structural problem that is more difficult
> > > to fix. See the directory listing
> > >
> > > https://gist.github.com/wesm/57bd99798a2fa23ef3cb5e4b18b5a248
> > >
> > > We're duplicating all of the shared libraries inside the wheel and on
> > > disk. It's unfortunate that we've allowed this problem for a whole
> > > year or more
> > >
> > > https://issues.apache.org/jira/browse/ARROW-5082
> > >
> > > I also recently opened
> > >
> > > https://issues.apache.org/jira/browse/ARROW-8518
> > >
> > > which describes a proposal to create some tools to assist with
> > > building "parent" and "child" Python packages. This would enable us to
> > > ship components like Flight and Gandiva as separate wheels. This is a
> > > large project but one that will ultimately be necessary for the
> > > long-term scalability and sustainability of the project.
> > >
> > > I am not able to personally work on either of these projects in the
> > > current release cycle, but I hope that some progress can be made on
> > > these since they have lingered on for a long time, and it would be
> > > good for us to "put our best foot forward" with the 1.0.0 release.
> > >
> > > Thanks,
> > > Wes
> > >
