Hi Ralf,

Thank you very much for the response. I apologize for my delay. I misread the 
subscription defaults for this list and was waiting for something in my inbox.

You mentioned measuring platform usage. How does the project do that? Analysis 
of PyPI download stats?
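
For what it's worth, here is a minimal sketch of the kind of analysis I'm
imagining. It assumes per-platform download counts have already been exported
from some download statistics source. The counts, the platform labels, and the
helper name `platform_shares` are all made up for illustration; the 0.5% cutoff
is the rule of thumb from your message, and I don't know whether NumPy computes
it this way.

```python
# Hypothetical download counts per platform tag; these are NOT real PyPI data.
downloads = {
    "manylinux_x86_64": 9_200_000,
    "macosx_arm64": 450_000,
    "win_amd64": 310_000,
    "musllinux_x86_64": 36_000,
    "win32": 4_000,
}

THRESHOLD = 0.005  # the 0.5% rule of thumb, expressed as a fraction


def platform_shares(counts):
    """Return each platform's fraction of total downloads (hypothetical helper)."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


shares = platform_shares(downloads)

# Platforms whose share of downloads falls below the cutoff.
below_threshold = sorted(
    name for name, share in shares.items() if share < THRESHOLD
)

# Print platforms from most to least used, flagging the ones under the cutoff.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    flag = "  <- below 0.5%" if share < THRESHOLD else ""
    print(f"{name:20s} {share:7.3%}{flag}")
```

With these made-up numbers the two smallest platforms fall under the cutoff,
though as you point out for musllinux, usage share is only one input next to
the cost of doing the work.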

I feel like rasterio and fiona are in a weird position: making a host of C 
libraries installable via pip is one of their top features (which is why 
projects like rioxarray and geopandas use them), so packaging is as large a 
concern and cost as maintaining the code itself. I'm forever getting requests 
for new platform support from people who don't otherwise engage with the 
projects. They clearly want wheels to use with their specialized CI systems at 
work and that's all. You've given me a lot to think about, and I appreciate it. 

Good luck with the 2.0.0 release! I'm working to be sure that my projects will 
be compatible.

Yours,
Sean

Ralf Gommers wrote:
> On Thu, Feb 22, 2024 at 7:03 PM  wrote:
> > Hi folks,
> > My name is Sean and I'm the author of several GIS packages using Numpy:
> > Fiona, Rasterio, and Shapely.
>
> Hi Sean, thanks for this very good question, and for all your work on GIS
> packages.
>
> > I've followed Numpy's trail when it comes to wheel building for many years
> > and now I'm seeking advice on how to prioritize platforms to support and
> > how to pay for the labor and computing that it takes to build wheels and
> > maintain the infrastructure over time.
>
> I'm probably best placed to answer your questions, because I've both been
> involved in NumPy build & packaging for a long time and am responsible for
> overseeing a significant fraction of the funded work on NumPy as well as
> coordinating unrestricted funding coming in (mostly via Tidelift, as you
> can see at https://opencollective.com/numpy). I'll do my best to accurately
> represent the situation for NumPy. Your questions are challenging though,
> so if you want a higher-bandwidth conversation I'd be happy to chat. Or we
> can use part of a community meeting for this, since I imagine other folks
> may be interested in this topic as well.
> > Fiona and Rasterio have an order of magnitude more C library dependencies
> > than Numpy, via GDAL (https://gdal.org/), which is almost more of an OS
> > than a library.
>
> Dealing with NumPy's BLAS dependency is already a large amount of work, so
> I don't envy your task. PyPI really isn't well-suited to that many C
> libraries (as I'm sure you know); for a long time the geospatial stack was
> only usable from conda-forge, where packaging is a much easier task. I'm
> not sure that was a terrible situation - there are a couple of domains like
> that where things just get too challenging. So if you want to do something
> much more restricted than NumPy for platforms to support with wheels, that
> seems perfectly okay.
> > I found a thread in the archive about adding musllinux wheels, but it
> > wasn't clear to me how the work gets done, who does, and how it gets paid
> > for.
>
> Of all work on NumPy, the funded part has increased steadily. Until ~2016
> that fraction was zero, and now a lot of the heavy lifting is funded work -
> 9 of the top 10 committers over the past 1.5 years get paid for at least a
> part of their time spent on NumPy. This is supported in several
> ways (partially documented at https://numpy.org/about, but that's a bit out
> of date):
> 
> 1. a number of grants received over the years, from: Moore and Sloan
> Foundations (>$1M), the Chan Zuckerberg Initiative (>$1M), and NASA (~$400k)
> 2. maintainers employed by companies who allow those maintainers to spend
> part of their day job time on NumPy:
>     - Quansight (Matti, Nathan, Rohit, Mateusz, Melissa, me)
>     - NVIDIA (Sebastian - long-time maintainer, now ~2 years at NVIDIA)
>     - Intel (Raghuveer, contributor for several years, just gained commit
> rights)
>     - Arm (Chris, contributor for ~1 year, just gained commit rights)
>     - I'm not sure if I should list Berkeley here too; folks at Berkeley
> contributed a lot in the past, not sure if that was all grant-funded or if
> there was unrestricted BIDS money to support NumPy.
> 3. unrestricted project funds, obtained from individual and corporate
> (Tidelift (>$100k), Bloomberg ($10k)) donations, which support Sayed's
> Developer in Residence position:
> https://blog.scientific-python.org/numpy/fellowship-program/.
> 4. contracts for work on NumPy from clients of Quansight (and maybe other
> companies, that is hard to know) that aligned with the NumPy project
> roadmap. Noteworthy mentions here for the Sovereign Tech Fund, which
> supported packaging-related work (
> https://www.sovereigntechfund.de/tech/openblas), and the D. E. Shaw group,
> which supported recent work on string ufuncs.
> 
> That said, *funding for packaging work is still quite challenging*. While
> the above is an impressive list of funding, the vast majority of funders do
> care about what they fund, and "keep the package installable" or "do
> general maintenance work" typically doesn't do well in grant applications.
> Funders have improved in this regard, and the ones mentioned in (1) above
> do allow a general maintenance bucket which is some percentage of an
> overall grant.
> 
> The people who are doing most of the work on packaging and wheel build CI
> jobs for NumPy are Matti Picus, Andrew Nelson and myself. Andrew's time is
> unfunded, for Matti and me I'd say a significant fraction of the time
> working on this topic is also unfunded.
> 
> > Does NumFOCUS support pay a maintainer to do it?
>
> No, NumFOCUS does the admin for our project funds, but doesn't supply
> funding to NumPy. It also doesn't structurally support any other open
> source projects with direct funding, with the exception of its Small
> Development Grant program - which is meant for smaller one-off projects
> (amounts in the $2k - $10k range) rather than part-time or full-time
> employment.
> > Are Numpy maintainers adding new platform builds as part of their day
> > jobs?
>
> In general, no. This has always been volunteer work. There is only one
> exception I can think of: the recent CZI grant for Scientific Python (
> https://blog.scientific-python.org/scientific-python/2022-czi-grant/) has
> as a goal to make html docs of projects interactive. To achieve that, we
> need improved support for Pyodide - and that's going to include wheels that
> can be built and deployed as part of doc builds in CI.
> > Are they donating their own time to the effort?
>
> Mostly yes, as described above.
>
> > Does the Numpy project aspire to provide wheels for all of the top N
> > platforms? Is it more than an aspiration?
>
> Not quite. We have to judge carefully, because adding platforms is costly
> in terms of maintainer time. Adding CI jobs is easier, because there is no
> long-term commitment. Once we decide to upload wheels for a platform to
> PyPI and people start to rely on them, we cannot remove them anymore
> without breaking the vast majority of users on that platform, since they
> won't be set up to build from source (the `pip` defaults are pretty harmful
> here unfortunately).
> 
> We've always had issues with deciding on niche platforms. We settled on a
> rough rule of thumb once that if usage fell below 0.5%, we could remove it
> (IIRC this was for making SSE2 the baseline on Windows, and hence dropping
> support for ancient CPUs).
> 
> You mentioned musllinux - that's an example where the number of users is
> fairly low, but also the cost of doing that work is fairly low: hardware
> and CI is easily available, builds are fast, and bugs need fixing anyway.
> Also, Musl is interesting to support - from a "technical progress"
> perspective it's nicer to work on that than on a legacy platform.
> 
> Another example of a toss-up platform to support is 32-bit Windows. That's
> a pain and not interesting to work on, the only reason we restored it is
> because when we ship it without OpenBLAS it's not too difficult. Those
> builds then have poor performance of course (np.dot can be 100x slower),
> but the only genuine use case is folks who interact with other apps like
> Excel - not where one cares about performance.
> 
> Yet another toss-up that we do support is PyPy. We've had wheels for the
> most current stable PyPy version for a long time. The main reason is that
> Matti (who is also a PyPy maintainer) has been willing to do the work to
> support these builds.
> 
> Platforms that we don't plan to support include:
> - 32-bit x86 Linux: we dropped those wheels, because demand is quite low
> and on that platform users should be able to build from source
> - ppc64le: no freely available CI system (TravisCI offered one for a while,
> but it was very unreliable)
> - s390x: same as ppc64le
> - ppc64: never considered this one, seems much more niche than ppc64le
> - macOS universal2: this is pointless, we have thin wheels for arm64 now,
> and universal2 was simply a bad idea, it should fade away
> 
> The last valid one (I think, see PEP 600) is `armv7l`. We have never
> seriously considered it, but in case cross-compiling becomes more smooth I
> can see us adding support at some point in the future.
> 
> I was planning to write a NEP covering platform support and how we decide
> on this. Still haven't gotten around to that. This email seems like a
> reasonable start for that:)
> 
> > Is support for one or more platforms part of any sponsorship agreement?
>
> No. We have no commitments for any sponsorships, nor do we have any fixed
> commitments as a project for anything.
> > Maybe users and sponsorship cover the cost of building wheels and
> > maintenance entirely for Numpy, but it's not so in my case.
> > I've got so many questions in this vein, and I'm grateful for any answers
> > or insights or more discussion. Thanks!
>
> I've also written a bit about this in the pypackaging-native docs. These
> pages may interest you:
> https://pypackaging-native.github.io/meta-topics/user_expectations_wheels/
> https://pypackaging-native.github.io/meta-topics/pypi_social_model/
> https://pypackaging-native.github.io/key-issues/native-dependencies/geospati...
> 
> Cheers,
> Ralf
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/