Previous posts in the series: 
https://groups.google.com/g/sage-devel/c/OeN8o14s6Jc/m/ChnpijP3AgAJ, 
https://groups.google.com/g/sage-devel/c/xBzaINHWwUQ/m/Tq17YRqOAAAJ

As we all know, SageMath makes use of hundreds of *"upstream" projects: 
third-party, separately maintained packages* written either in 
Python/Cython or in other languages (C, C++, Common Lisp, Fortran, and the 
domain-specific languages of systems such as GAP, Singular, Maxima, ...).

SageMath does play a role as a software distribution, but that role is in 
clear contrast to that of general software 
distributions such as Ubuntu or conda-forge: It's probably rare for users 
to say "*I computed this Gröbner basis using Ubuntu Linux*" or "a strong 
generating set for this matrix group was computed using conda-forge". But 
many users say such things all the time about using SageMath.

Of course this is because of the added value of SageMath over the 
collection of its dependencies:
1. Abstraction and unification of the interface to multiple upstream 
dependencies, and integration.
2. The algorithms, structures, and applications implemented in the Sage 
library itself.

I posit that there is an *intrinsic conflict between 
abstraction/unification/integration and attribution for the upstream 
projects*: Regardless of intent and purpose, a real side effect 
of abstraction/unification/integration is that the use of the upstream 
project is obscured to some degree.

It is natural if individuals who contribute to the upstream projects (or 
have contributed to them in the past) are concerned or unhappy about such 
effects. And it is understandable if they perceive that the SageMath 
project is using their work, consuming attention/visibility/attribution, 
but not "giving back" sufficiently. Contributors are entitled to take 
pride in their workpersonship, in the success of the project that they have 
been contributing to, the brand that they have created, etc. Even if some 
of us may be wary of possible toxic gradations such as tribalism, it is 
clear that *attention and attribution are important, positive, and 
legitimate motivating factors for open source contributors* in general, and 
moreover attribution via academic citations may indirectly benefit 
individuals' careers and their success in obtaining funding. 

The 2018 sage-devel thread "Suggestion for the SageMath website" 
(https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/VRIRzj1sBAAJ) 
focused on *getting upstream projects credited on our website*. Although 
the suggestions made there (randomly rotating the names of external 
dependencies listed on the main page so that they all get equal 
exposure, or using scrolling lists) were not implemented, we have come a 
long way since then regarding better attribution for upstream projects.
William's message in that thread, 
https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/4ke5ekyVAgAJ, 
suggested that "*linking to dependencies should be done much more, but in a 
way that provides clear value to users*:
- being able to better know what is in Sage,
- being able to read the original upstreams docs and source code more 
easily,
- knowing which upstreams devs to contact for *support*, to ask 
for features, to contribute work, and to thank,
- being able to properly acknowledge what they are using."

Regarding William's first point, *being able to better know what is in 
Sage:* The main page of http://sagemath.org now links to our reference 
manual with a list of dependencies that is always up to date because it is 
automatically generated from the source code. And in the most current 
version, this long list is broken into categories such as "Mathematics" for 
better navigability: 
https://deploy-livedoc--sagemath.netlify.app/html/en/reference/spkg/
For each dependency, we have a page with various information, including a 
short description, installation instructions and a link to the upstream 
project. 

Regarding the second point, Simon King's suggestion in the thread, to 
"[...] on that list have a link to the doc for each package that provides 
docs" 
(https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/YEp6aO8cAQAJ), has 
not been implemented and would still be a valuable improvement. But we have 
just made progress in a similar way by *facilitating Sphinx hyperlinks to 
specific pages of package documentation*, see 
https://github.com/sagemath/sage/wiki/Sage-10.4-Release-Tour#linking-to-external-package-documentation
Also TB's suggestion to use "the Sphinx extensions viewcode and linkcode 
[...] add links next to functions and classes with the corresponding source 
code" (https://groups.google.com/g/sage-devel/c/H8FcZD90O0Y/m/RisE0AajAgAJ) 
was implemented in this development cycle; see 
https://github.com/sagemath/sage/wiki/Sage-10.4-Release-Tour#links-to-source-code

*Giving attribution to the projects that supported a particular computation 
is a hard problem that cannot be fully automated.* I don't know how widely 
SageMath's profiling-based citation system 
(https://doc.sagemath.org/html/en/reference/misc/sage/misc/citation.html) 
is used by the community; but in any case, it's still a long way from the 
terse output of this citation system to actual citations that people can 
use, and it may be valuable to provide some convenient shortcuts.
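
As one illustration of what such a shortcut could look like, here is a 
minimal sketch in plain Python. It assumes that we already have a list of 
component names of the kind the citation system reports (in a Sage session 
such a list would come from sage.misc.citation.get_systems, which takes a 
command string). The function name format_acknowledgment and the example 
input are made up for this post, and the wording of the sentence is only a 
suggestion, not an established convention.

```python
def format_acknowledgment(systems):
    """Turn a list of component names into one acknowledgment sentence.

    Hypothetical helper, not part of Sage.
    """
    # Remove duplicates and sort case-insensitively for a stable result.
    names = sorted(set(systems), key=str.lower)
    if not names:
        return "This computation used SageMath."
    if len(names) == 1:
        listed = names[0]
    elif len(names) == 2:
        listed = " and ".join(names)
    else:
        listed = ", ".join(names[:-1]) + ", and " + names[-1]
    return f"This computation used SageMath, which relied on {listed}."


# Made-up input standing in for the output of the citation system.
example = ["Maxima", "ginac"]
print(format_acknowledgment(example))
```

Turning such a sentence into actual bibliography entries would still need 
a curated mapping from component names to references, which this sketch 
deliberately leaves out.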


*Next I'll note that the modularization project provides an opportunity to 
refresh our relations to the upstream projects in very significant ways.*


The new pip-installable packages from the modularization project will be a 
new way for our project to give back something of value to the upstream 
projects that Sage depends on, and thus a possible *new expression of 
interest to collaborate with upstream projects*, in particular with those 
projects that do not maintain Python interfaces themselves or that 
might be interested in higher-level interfaces than what they provide. 
Examples of packages corresponding to actively maintained upstream projects:
- *sagemath-gap* 
https://github.com/mkoeppe/sage/tree/t/32432/modularization_of_sagelib__break_out_a_separate_package_sagemath_polyhedra/pkgs/sagemath-gap
- *sagemath-giac* 
https://github.com/mkoeppe/sage/tree/t/32432/modularization_of_sagelib__break_out_a_separate_package_sagemath_polyhedra/pkgs/sagemath-giac
- *sagemath-linbox* 
https://github.com/mkoeppe/sage/tree/t/32432/modularization_of_sagelib__break_out_a_separate_package_sagemath_polyhedra/pkgs/sagemath-linbox
- *sagemath-pari* 
https://github.com/mkoeppe/sage/tree/t/32432/modularization_of_sagelib__break_out_a_separate_package_sagemath_polyhedra/pkgs/sagemath-pari
- *sagemath-singular* 
https://github.com/mkoeppe/sage/tree/t/32432/modularization_of_sagelib__break_out_a_separate_package_sagemath_polyhedra/pkgs/sagemath-singular

Viewing one of these packages as the Python interface to the upstream 
library may be much more plausible than considering the monolithic SageMath 
system as the Python interface to the library. This may facilitate a shared 
investment in its development and may also avoid duplicate developments. 
(Disclosure: I have not contacted any upstream projects about this yet 
because there's little that I can offer before the work that makes the 
modularized distributions available is merged into Sage.)

I'll note that these new distributions differ very significantly from the 
products of earlier, "bottom-up" modularization efforts of the Sage 
library: packages such as *cypari2* (just discussed in the concurrent 
thread on modularization, 
https://groups.google.com/g/sage-devel/c/mqgtkLr2gXY/m/kSiZktwpAAAJ) and 
*pplpy* (mentioned in the same thread in 
https://groups.google.com/g/sage-devel/c/mqgtkLr2gXY/m/65UjwaMaBQAJ, 
along with some other packages). These packages, designed to be reusable in 
the Python ecosystem without dependencies on anything in Sage, are not 
exposed directly to Sage users; they are merely glue between a C/C++ 
library and the higher-level Sage code that uses it. As such, *these 
packages do not provide users with a slice of the part of Sage where most 
of the effort and polish in Sage development is spent*, namely the 
high-level public interface of Sage. (I cannot say whether or to what 
degree it is related to this observation, but I unfortunately have to say 
that these modularization efforts have not been a clear success: With the 
exception of *cypari* (which Marc and Nathan created specifically for 
SnapPy) and the package *cysignals*, there is little evidence that such 
packages have attracted a community of users other than indirectly as 
dependencies of the Sage library; and certainly no viable community of 
*developers* has formed for these packages. Just a few weeks ago I took 
over as the de facto maintainer of *cypari2* and *pplpy*; you may have seen 
the announcements.)


Finally, recall that to make the modularized distributions 
testable separately, we have annotated doctests in the source code with 
tags like "*# needs sage.libs.flint*" (at the file, block, or doctest 
level), supported by a convenient maintenance tool, the powered-up 
"*sage --fixdoctests*" command; see 
https://github.com/sagemath/sage/wiki/Sage-10.1-Release-Tour#new-developer-tools-modularization-deprecations
This was an outcome of the June 2023 sage-devel discussion "Modularized 
doctests" 
(https://groups.google.com/g/sage-devel/c/utA0N1it0Eo/m/ep_G5dFOAAAJ) and 
the June/July 2023 vote 
(https://groups.google.com/g/sage-devel/c/MtS2u3VbJEo/m/wBhhdN3aAAAJ).

Such annotations -- even if as Sage developers we may find them annoying -- 
give an important secondary benefit, namely *specific attribution for the 
libraries that Sage uses for particular types of computations*. 
This alleviates the conflict between abstraction and attribution that I 
mentioned above.
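
To make this concrete, here is a rough sketch in plain Python of how the 
doctest-level tags could be tallied to see which features a piece of 
source code relies on. It is not how "sage --fixdoctests" or the Sage 
doctester works. The function name count_needed_features and the sample 
doctest lines are invented for illustration, and the sketch only looks at 
"# needs" tags at the end of a line, not at file-level or block-level 
annotations.

```python
import re
from collections import Counter

# Matches a "# needs ..." tag and captures the feature names after it,
# stopping at the end of the line or at the next "#" comment marker.
NEEDS_TAG = re.compile(r"#\s*needs\s+(?P<features>[^#\n]+)")


def count_needed_features(source_text):
    """Count how often each feature is named in line-level '# needs' tags."""
    counts = Counter()
    for line in source_text.splitlines():
        match = NEEDS_TAG.search(line)
        if match:
            counts.update(match.group("features").split())
    return counts


# Invented sample lines in the style of annotated Sage doctests.
SAMPLE = """\
    sage: R.<x> = ZZ[]                      # needs sage.libs.flint
    sage: G = SymmetricGroup(4)             # needs sage.groups sage.libs.gap
    sage: G.order()                         # needs sage.groups sage.libs.gap
    sage: 1 + 1
"""

counts = count_needed_features(SAMPLE)
for feature, n in sorted(counts.items()):
    print(f"{feature}: {n}")
```

Run over a real module, such a tally would give a first approximation of 
which libraries stand behind the functionality documented there.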
