I haven't examined the out-of-core scenarios at all, but in memory it is possible to have labels with no performance cost if you add the constraint that labeled matrices are conformable only if they share the identical label dictionary. That means you can use the internal row and column indexes for all internal operations. This is pretty easy to enforce if the persistent form of the matrix has only labels and not indexes, since you can simply augment a shared dictionary as you read or generate the matrix. For distributed operations, I am considerably more dubious of this approach.
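To make the idea concrete, here is a rough sketch of what I mean. The class names (LabelDictionary, LabeledVector) and their methods are made up for illustration and are not existing Mahout APIs. The labels are translated to ints once, while loading, and the inner loop works only on plain int indexes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical shared dictionary: maps each label to a dense int index.
final class LabelDictionary {
    private final Map<String, Integer> indexOf = new HashMap<>();

    // Returns the existing index for a label, or assigns the next free one.
    // This is the "augment as you read" step.
    int intern(String label) {
        Integer idx = indexOf.get(label);
        if (idx == null) {
            idx = indexOf.size();
            indexOf.put(label, idx);
        }
        return idx;
    }

    int size() {
        return indexOf.size();
    }
}

// Hypothetical labeled vector backed by a plain double[].
final class LabeledVector {
    final LabelDictionary labels;
    private double[] values = new double[0];

    LabeledVector(LabelDictionary labels) {
        this.labels = labels;
    }

    // The hash lookup happens here, once per element, at load time.
    void set(String label, double value) {
        int i = labels.intern(label);
        if (i >= values.length) {
            values = Arrays.copyOf(values, Math.max(i + 1, values.length * 2));
        }
        values[i] = value;
    }

    // Conformable only if both vectors share the identical dictionary object.
    // The loop itself never touches a label or calls hashCode().
    double dot(LabeledVector other) {
        if (labels != other.labels) {
            throw new IllegalArgumentException("vectors do not share a label dictionary");
        }
        int n = Math.min(values.length, other.values.length);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += values[i] * other.values[i];
        }
        return sum;
    }
}
```

The conformability check is a single reference comparison, so the cost of labels is paid only when the data is read or generated. A real implementation would also need sparse storage and a reverse index-to-label mapping for output, which I have left out here.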
On Thu, Mar 4, 2010 at 8:37 AM, Jake Mannix (JIRA) <j...@apache.org> wrote:

> Having keys for row be objects is one thing, but doing this all the time
> for the keys for the Vector indexes will seriously slow down inner loops,
> due to the translation time between object to int (via a multitude of
> hashCode() calls), and we treating the rows and columns on equal footing is
> pretty required.
>

--
Ted Dunning, CTO
DeepDyve