[ 
https://issues.apache.org/jira/browse/MAHOUT-531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925152#action_12925152
 ] 

Alexander Hans commented on MAHOUT-531:
---------------------------------------

I hadn't realized that vectors have both an iterator() and an 
iterateNonZero(). I just checked, their behavior is indeed identical for dense 
as well as sparse vectors. So I think the best solution would be to

- implement iterator() for matrices like the one for vectors: iterate over 
everything, even values that might not be stored in memory (values = 0 for 
sparse representations)
- implement iterateNonZero() for matrices like the one for vectors: do the same 
as iterator(), but skip values = 0; this would speed up the iteration for 
sparse representations and change almost nothing for dense ones
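
To make the two iteration modes concrete, here is a minimal sketch of how they 
could look for a sparse matrix. SparseMatrixSketch, Cell, and iterateAll() are 
hypothetical names made up for illustration (iterateAll() plays the role of 
iterator()); this is not Mahout's Matrix API, just the intended semantics under 
the assumption that only non-zero cells are stored.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical stand-in for a sparse matrix; NOT Mahout's Matrix API.
final class SparseMatrixSketch {
    /** One matrix cell: position plus value. */
    static final class Cell {
        final int row;
        final int col;
        final double value;

        Cell(int row, int col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    private final int rows;
    private final int cols;
    // Only non-zero cells are stored, keyed by row * cols + col (row-major).
    private final Map<Long, Double> stored = new TreeMap<>();

    SparseMatrixSketch(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
    }

    void set(int row, int col, double value) {
        if (row < 0 || row >= rows || col < 0 || col >= cols) {
            throw new IndexOutOfBoundsException("(" + row + ", " + col + ")");
        }
        long key = (long) row * cols + col;
        if (value == 0.0) {
            stored.remove(key); // zeros are never kept in memory
        } else {
            stored.put(key, value);
        }
    }

    double get(int row, int col) {
        return stored.getOrDefault((long) row * cols + col, 0.0);
    }

    /** Visits every cell in row-major order, including zeros that are not stored. */
    Iterator<Cell> iterateAll() {
        return new Iterator<Cell>() {
            private long pos = 0;

            @Override
            public boolean hasNext() {
                return pos < (long) rows * cols;
            }

            @Override
            public Cell next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int r = (int) (pos / cols);
                int c = (int) (pos % cols);
                pos++;
                return new Cell(r, c, get(r, c));
            }
        };
    }

    /** Visits only the stored (non-zero) cells, in row-major order. */
    Iterator<Cell> iterateNonZero() {
        List<Cell> cells = new ArrayList<>();
        for (Map.Entry<Long, Double> e : stored.entrySet()) {
            long key = e.getKey();
            cells.add(new Cell((int) (key / cols), (int) (key % cols), e.getValue()));
        }
        return cells.iterator();
    }
}
```

For a 2 x 3 matrix with two non-zero entries, iterateAll() would visit six 
cells and iterateNonZero() only two, which is where the speed-up for sparse 
representations comes from.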

Now it remains open what to do with getNumNondefaultElements(). For sparse 
vectors, it returns the number of non-zero elements, i.e., it gives the number 
of elements that iterateNonZero() will iterate over. For dense vectors, it just 
returns the size of the vector, no matter what the actual values are. 
Matrix.getNumNondefaultElements() currently returns an int[2], where the first 
value is the number of rows containing non-zero elements, the second value is 
the number of columns with non-zero elements. I don't see any way of deriving 
the actual number of non-zero elements from that; only an upper bound of 
int[0] * int[1] is possible. However, to write matrices similarly to how 
vectors are written, that number is needed. I see two options:

- 1. Change Matrix.getNumNondefaultElements() to behave like 
Vector.getNumNondefaultElements(), i.e., return just an int. This would 
break backward compatibility and we'd lose the size() analogy.
- 2. Introduce Matrix.getNumNonZeroElements(). For consistency, we might also 
want a Vector.getNumNonZeroElements(). For sparse representations those would 
contain code that is identical (vector) or similar (matrix) to 
getNumNondefaultElements(); for the dense versions there would be no way 
around a (costly) iteration over all elements. That could be noted in the 
JavaDoc, though. I wouldn't need those for reading/writing anyway.
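
As a rough illustration of why the int[2] only yields an upper bound, and of 
what a dense getNumNonZeroElements() would cost, here is a small sketch. 
NonZeroCountSketch and its methods are hypothetical helpers operating on a 
plain rectangular double[][]; they are not part of Mahout and only mirror the 
behavior described above.

```java
// Hypothetical helpers on a plain double[][]; NOT part of Mahout.
final class NonZeroCountSketch {
    /** Dense case: every cell has to be inspected, O(rows * cols). */
    static int countNonZero(double[][] dense) {
        int count = 0;
        for (double[] row : dense) {
            for (double v : row) {
                if (v != 0.0) {
                    count++;
                }
            }
        }
        return count;
    }

    /**
     * Mirrors the int[2] described above: {rows containing a non-zero,
     * columns containing a non-zero}. Assumes a rectangular matrix.
     */
    static int[] rowsAndColsWithNonZero(double[][] dense) {
        int cols = dense.length == 0 ? 0 : dense[0].length;
        boolean[] colHasNonZero = new boolean[cols];
        int rowCount = 0;
        for (double[] row : dense) {
            boolean rowHasNonZero = false;
            for (int c = 0; c < cols; c++) {
                if (row[c] != 0.0) {
                    rowHasNonZero = true;
                    colHasNonZero[c] = true;
                }
            }
            if (rowHasNonZero) {
                rowCount++;
            }
        }
        int colCount = 0;
        for (boolean hasNonZero : colHasNonZero) {
            if (hasNonZero) {
                colCount++;
            }
        }
        return new int[] {rowCount, colCount};
    }

    /** The only thing derivable from the int[2]: an upper bound on the count. */
    static long upperBound(int[] rowsAndCols) {
        return (long) rowsAndCols[0] * rowsAndCols[1];
    }
}
```

A 3 x 3 diagonal matrix shows the gap: the int[2] would be {3, 3}, so the 
upper bound is 9, while the actual number of non-zero elements is 3.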

So, what do you think?


> MatrixWritable doesn't actually write/read anything
> ---------------------------------------------------
>
>                 Key: MAHOUT-531
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-531
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>            Reporter: Alexander Hans
>         Attachments: MAHOUT-531.patch, MAHOUT-531.patch
>
>
> The write() and readFields() methods of MatrixWritable write/read only the 
> classname, they don't write/read actual data.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
