Wes asked me to bring this discussion here. I'm a developer of PETSc and, with Arrow getting into the sparse representation space, would like for it to interoperate as well as possible.
1. Please guarantee support for 64-bit offsets and indices. The current spec uses "long", which is 32-bit on some common platforms (notably Windows); specifying int64_t would guarantee 64-bit support on LLP64 architectures as well. Meanwhile, there is a significant performance benefit to using 32-bit indices when sizes allow it: with 4-byte floats, CSR costs 12 bytes per nonzero with int64 indices versus 8 bytes per nonzero with int32 (see the first sketch below). Sparse matrix operations are typically bandwidth-dominated, so this is a substantial performance difference.

2. This relates to sorting of indices for CSR. Many algorithms need to operate on the upper or lower triangular part of a matrix, which is much easier if the CSR column indices are sorted within each row. Typically one finds the diagonal, storing its location in each row if it exists (see the second sketch below). Given that the current spec says COO entries are sorted, it would be simple to specify the same for CSR.

https://github.com/apache/arrow/blob/master/format/SparseTensor.fbs

3. This one is more nebulous and relates to partitioned representations of matrices, such as arise when using distributed memory or when optimizing for locality with threads. Some packages store "global" indices in the CSR representation (in which case column indices can exceed the specified sizes); I discourage this except as an intermediate representation in matrix-matrix computations. Others (including PETSc) store local matrices plus a "local-to-global" translation from local indices to the distributed numbering (see the third sketch below). This may be a no-op for Arrow (no support, but don't prevent it), but I'm curious whether Arrow has a philosophy regarding collective use of distributed memory (e.g., via MPI).

Finally, there are many other matrix formats to at least be aware of. These include blocked CSR (fewer indices to store), sliced Ellpack-style formats (better vectorization for some sparse matrix operations), and compressed row offsets for CSR (good if many rows are empty; see the last sketch below). Conventions regarding diagonals and ways of expressing parallel matrices vary by package, and moving a matrix between packages often requires some conversion (which can sometimes be done in place). PETSc has interfaces to many such packages, so we have many examples of such conversions.

Thanks for reading. I'm happy to discuss further.
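To make some of the above concrete, here are a few rough C sketches. All struct and function names are invented for illustration; nothing here is taken from the Arrow spec or from PETSc's API. First, the per-nonzero storage arithmetic for CSR with 4-byte values (the row offsets amortize away for matrices with more nonzeros than rows):

    #include <stdint.h>
    #include <stdio.h>

    /* Per-nonzero CSR cost: one column index plus one value. */
    int main(void) {
      size_t per_nnz_64 = sizeof(int64_t) + sizeof(float); /* 8 + 4 = 12 bytes */
      size_t per_nnz_32 = sizeof(int32_t) + sizeof(float); /* 4 + 4 = 8 bytes  */
      printf("int64 indices: %zu B/nnz, int32 indices: %zu B/nnz\n",
             per_nnz_64, per_nnz_32);
      return 0;
    }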
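Second, why sorted column indices help: the diagonal entry of a row can be located by binary search instead of a linear scan. A minimal sketch, assuming the usual rowptr/colidx CSR arrays:

    #include <stdint.h>

    /* Return the position of the diagonal entry of row i in colidx/values,
       or -1 if the diagonal is not stored. Requires column indices sorted
       within each row. */
    static int64_t csr_find_diagonal(const int64_t *rowptr,
                                     const int64_t *colidx, int64_t i) {
      int64_t lo = rowptr[i], hi = rowptr[i + 1];
      while (lo < hi) {               /* O(log nnz_in_row) via sortedness */
        int64_t mid = lo + (hi - lo) / 2;
        if (colidx[mid] < i)      lo = mid + 1;
        else if (colidx[mid] > i) hi = mid;
        else                      return mid;
      }
      return -1;
    }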
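Third, a sketch of the local-matrix-plus-translation layout in the style PETSc uses for distributed matrices (the struct is hypothetical, not PETSc's actual representation). Note that local column indices can remain 32-bit even when the global problem needs 64-bit indices, which ties back to point 1:

    #include <stdint.h>

    /* One rank's piece of a distributed CSR matrix. */
    typedef struct {
      int64_t  nrows_local;  /* rows owned by this rank */
      int64_t *rowptr;       /* length nrows_local + 1 */
      int32_t *colidx;       /* local column indices: 0..ncols_local-1 */
      float   *values;
      int64_t *l2g;          /* l2g[j] = global column of local column j */
    } LocalCSR;

    /* Translate the k-th stored entry's column to the global numbering. */
    static inline int64_t global_col(const LocalCSR *A, int64_t k) {
      return A->l2g[A->colidx[k]];
    }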
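Last, a sketch of CSR with compressed row offsets (sometimes called DCSR): offsets are stored only for nonempty rows, together with the row numbers themselves, so empty rows cost nothing. Again, illustrative names only:

    #include <stdint.h>

    /* Doubly compressed CSR: good when most rows are empty. */
    typedef struct {
      int64_t  nrows_nonempty;
      int64_t *rows;    /* length nrows_nonempty: sorted nonempty row ids */
      int64_t *rowptr;  /* length nrows_nonempty + 1: offsets into colidx */
      int64_t *colidx;
      float   *values;
    } DCSR;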