Github user fommil commented on the pull request:
https://github.com/apache/incubator-spark/pull/575#issuecomment-35218098
@martinjaggi I'm happy to advise on the best sparse format for any
particular problem that you want to solve in Spark. Just let me know which
matrix operations you're performing (noting the sort of structure you expect
for each symbol) and at what points the stored matrices have to be sent over
the wire.
I wouldn't get too caught up in sparse benchmarks. All they show is which
storage format works well for the benchmarked problem. I could give you some
incredibly efficient sparse formats that would epically fail that test,
because they are designed for a different problem. Column vs. row compression
is a classic example: column-compressed formats are great for multiplication
from the right (or transpose multiplication), whereas row-compressed formats
are great for multiplication from the left... but even that depends on the
format of the matrix or vector on the right. And neither may be the most
efficient format from a memory point of view... what if the matrices have a
small bandwidth?
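
To make the trade-off concrete, here is a rough sketch in plain Python (not
Spark or any library API; every function name below is made up for
illustration). It stores the same small tridiagonal matrix in row-compressed
and column-compressed form, computes the same product A*x with each, and
counts how many numbers each layout has to hold compared with simple band
storage. The row-compressed version reads x by index and writes each output
entry once; the column-compressed version reads x in order and scatters into
the output. Which of those access patterns wins depends on the operand, as
noted above.

```python
def to_csr(dense):
    """Row compression: nonzero values and their column indices, row by row."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # where the next row starts
    return values, col_idx, row_ptr


def to_csc(dense):
    """Column compression: the same layout applied to the transpose."""
    transposed = [list(col) for col in zip(*dense)]
    return to_csr(transposed)  # (values, row_idx, col_ptr)


def csr_matvec(csr, x):
    """A*x from row-compressed storage: one dot product per row (gather)."""
    values, col_idx, row_ptr = csr
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return y


def csc_matvec(csc, x, n_rows):
    """A*x from column-compressed storage: add x[j] * column j (scatter)."""
    values, row_idx, col_ptr = csc
    y = [0] * n_rows
    for j in range(len(col_ptr) - 1):
        for k in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[k]] += values[k] * x[j]
    return y


def stored_numbers_compressed(compressed):
    """Values + indices + pointers held by a row/column-compressed matrix."""
    values, idx, ptr = compressed
    return len(values) + len(idx) + len(ptr)


def stored_numbers_band(n, lower, upper):
    """Band storage keeps (lower + upper + 1) diagonals of length n, no indices."""
    return n * (lower + upper + 1)


# Example matrix (an assumption for illustration): 4x4 tridiagonal.
A = [[2, 1, 0, 0],
     [1, 2, 1, 0],
     [0, 1, 2, 1],
     [0, 0, 1, 2]]
x = [1, 2, 3, 4]

csr = to_csr(A)
csc = to_csc(A)
y_csr = csr_matvec(csr, x)
y_csc = csc_matvec(csc, x, n_rows=len(A))
csr_size = stored_numbers_compressed(csr)
band_size = stored_numbers_band(len(A), lower=1, upper=1)
```

For this matrix both products come out the same, while the compressed form
holds 25 numbers and the band form 12. That gap is the point of the bandwidth
question: a benchmark that only times compressed formats would never show it.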