Wes asked me to bring this discussion here.  I'm a developer of PETSc
and, with Arrow getting into the sparse representation space, would
like the formats to interoperate as well as possible.

1. Please guarantee support for 64-bit offsets and indices.  The current
spec uses "long", which is 32-bit on some common platforms (notably
Windows).  Specifying int64_t would guarantee 64-bit support on LLP64
platforms as well.
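
For reference, a minimal C check of the width difference (output depends
on the platform's data model; this is only an illustration):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
    /* On LLP64 platforms (64-bit Windows), long is 4 bytes,
       while int64_t is 8 bytes everywhere. */
    printf("sizeof(long)    = %zu\n", sizeof(long));
    printf("sizeof(int64_t) = %zu\n", sizeof(int64_t));
    return 0;
  }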

Meanwhile, there is a significant performance benefit to using 32-bit
indices when sizes allow it.  If using 4-byte floats, a CSR format costs
12 bytes per nonzero with int64 indices, versus 8 bytes per nonzero with
int32.  Sparse matrix operations are typically dominated by memory
bandwidth, so this is a substantial performance impact.
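
A rough sketch of that accounting (the struct and names here are
hypothetical, only to make the per-nonzero cost concrete):

  #include <stdint.h>
  #include <stdio.h>

  typedef int64_t Idx;          /* column-index type; try int32_t instead */

  typedef struct {
    int64_t  nrows, nnz;
    int64_t *rowptr;            /* nrows+1 row offsets                    */
    Idx     *colidx;            /* nnz column indices                     */
    float   *values;            /* nnz 4-byte values                      */
  } CSR;

  int main(void)
  {
    /* Row offsets are amortized over rows, so the per-nonzero cost is
       one value plus one column index. */
    printf("bytes per nonzero ~ %zu\n", sizeof(float) + sizeof(Idx));
    return 0;                   /* 12 with int64_t, 8 with int32_t        */
  }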

2. This relates to sorting of column indices within rows for CSR.  Many
algorithms need to operate on the upper vs. lower-triangular parts of a
matrix, which is much easier if the CSR column indices are sorted within
each row.  Typically, one finds the diagonal (storing its location in
each row, when it exists).  Given that the current spec says COO entries
are sorted, it would be simple to specify this for CSR as well.

  https://github.com/apache/arrow/blob/master/format/SparseTensor.fbs
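
As an illustration of why sorted column indices help (not tied to the
Arrow or PETSc APIs), locating the diagonal of row i reduces to a binary
search over that row's slice:

  #include <stdint.h>

  /* Return the position of column 'col' within row 'i' of a CSR matrix
     whose column indices are sorted within each row, or -1 if absent.
     Passing col == i locates the diagonal. */
  static int64_t csr_find_in_row(const int64_t *rowptr,
                                 const int64_t *colidx,
                                 int64_t i, int64_t col)
  {
    int64_t lo = rowptr[i], hi = rowptr[i + 1];
    while (lo < hi) {
      int64_t mid = lo + (hi - lo) / 2;
      if      (colidx[mid] < col) lo = mid + 1;
      else if (colidx[mid] > col) hi = mid;
      else                        return mid;  /* found                 */
    }
    return -1;                                 /* e.g. missing diagonal */
  }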


3. This is more nebulous and relates to partitioned representations of
matrices, such as arise with distributed memory or when optimizing for
locality with threads.  Some packages store "global" indices in the CSR
representation (in which case column indices can be larger than the
specified local sizes) -- I discourage this except as an intermediate
representation in matrix-matrix computations.  Others (including PETSc)
store local matrices plus a "local-to-global" translation of local
indices to the distributed setting.  This may be a no-op for Arrow (no
explicit support, but nothing to prevent it), but I'm curious whether
Arrow has a philosophy regarding collective use of distributed memory
(e.g., via MPI).
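
To make the local-plus-mapping scheme concrete, a minimal sketch (the
names are made up; PETSc's actual machinery, ISLocalToGlobalMapping, is
richer than this):

  #include <stdint.h>

  /* Translate local column indices of a process-local CSR block into
     global indices via a local-to-global map.  The local matrix keeps
     small local indices; the translation is applied only when a global
     view is needed (e.g. for interchange). */
  static void local_to_global_cols(const int64_t *l2g,      /* length = local columns */
                                   const int32_t *colidx_l, /* local column indices   */
                                   int64_t       *colidx_g, /* global column indices  */
                                   int64_t        nnz)
  {
    for (int64_t k = 0; k < nnz; k++) colidx_g[k] = l2g[colidx_l[k]];
  }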


Finally, there are many other matrix formats to at least be aware of.
These include blocked CSR (fewer indices to store), sliced Ellpack-style
formats (better vectorization for some sparse matrix operations), and
compressed row offsets for CSR (good if many rows are empty).
Conventions regarding diagonals and how to express parallel matrices
vary by package, and using a different package often requires some
conversion (which can sometimes be done in place).  PETSc has interfaces
to many such packages, so we have many examples of such conversions.
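
For instance, the compressed-row-offsets variant (sometimes called
doubly compressed CSR) keeps offsets only for nonempty rows; a
hypothetical layout:

  #include <stdint.h>

  /* Hypothetical doubly compressed CSR: only nonempty rows appear in
     'rows' and 'rowptr', which pays off when most rows are empty (e.g.
     rectangular off-diagonal blocks of a distributed matrix). */
  typedef struct {
    int64_t  nrows_nonempty;
    int64_t *rows;       /* nrows_nonempty row indices, sorted          */
    int64_t *rowptr;     /* nrows_nonempty+1 offsets into colidx/values */
    int32_t *colidx;     /* nnz column indices                          */
    float   *values;     /* nnz values                                  */
  } DCSR;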

Thanks for reading.  I'm happy to discuss further.
