Hi all,

I've been working quite a lot with sparse vectors and sparse matrices (basically as feature vectors in the context of machine learning). They crop up in a lot of places (e.g. the CVXOPT library, in scikits, ...), and people tend either to reinvent the wheel (i.e. implement a complete sparse matrix library) or to pretend that no separate data structure is needed (i.e. always pass along pairs of coordinate and data arrays).

I think there would be real benefit in having sparse vectors, matrices, or tensors (in parallel to numpy's arrays, which can be vectors or arbitrary-order tensors) with a standardized interface. Different packages (e.g. eigenvalue/SVD computation, least-squares and other QPs, but possibly also things like numpy.bincount, as well as more domain-specific computations) could then interoperate better than they do now.
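To make the idea of a shared interface concrete, here is a minimal sketch in Python. Everything in it is hypothetical and exists only for illustration: the class name CooMatrix, the accessor to_coo(), and the helper matvec() are assumptions, not an existing or proposed API. The only library calls used are standard numpy ones. The point is that a consumer such as matvec() depends on a single accessor rather than on a particular package's storage layout.

```python
import numpy as np


class CooMatrix:
    """Hypothetical coordinate-list (COO) sparse matrix, for illustration only."""

    def __init__(self, rows, cols, vals, shape):
        self.rows = np.asarray(rows, dtype=np.intp)
        self.cols = np.asarray(cols, dtype=np.intp)
        self.vals = np.asarray(vals, dtype=float)
        self.shape = tuple(shape)

    def to_coo(self):
        # Assumed common accessor: any sparse type could expose its
        # nonzeros this way, whatever its internal storage is.
        return self.rows, self.cols, self.vals, self.shape


def matvec(a, x):
    """Compute a @ x using only the (hypothetical) to_coo() accessor."""
    rows, cols, vals, shape = a.to_coo()
    x = np.asarray(x, dtype=float)
    # bincount sums the weighted contributions per row index, so
    # duplicate (row, col) entries are accumulated.
    return np.bincount(rows, weights=vals * x[cols], minlength=shape[0])


# Nonzeros: (0,0)=1, (0,2)=2, (2,1)=3 in a 3x3 matrix.
a = CooMatrix([0, 0, 2], [0, 2, 1], [1.0, 2.0, 3.0], (3, 3))
y = matvec(a, [1.0, 1.0, 1.0])  # y is [3.0, 0.0, 3.0]
```

A CSR or banded implementation could provide the same accessor by converting on demand, which would be interoperable even if not the most efficient route.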
One problem I see is that people solving PDEs usually want banded matrices, whereas other people (including me) do most of their work with coordinate-list or CSR matrices, so the actual implementations normally vary between domains. It's also possible that the most convenient interface for a sparse matrix is not the most convenient one for a dense matrix (and vice versa). Still, I think it would be nice to have some kind of standardized data structure, or maybe just a standardized vtable-based interface (similar to Python's buffer interface), that would allow all sparse matrix packages to interoperate in some meaningful (even if not the most efficient) way.

I'd be willing to adapt the code that I have (C++- and Cython-based) to this kind of interface to provide some kind of 'reference' implementation, but before inventing the (N+1)th solution to the problem, I'd be curious to hear other people's opinions.

Yannick
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion