ReneEnjilian commented on code in PR #2049:
URL: https://github.com/apache/systemds/pull/2049#discussion_r1676982739
##########
src/main/java/org/apache/sysds/runtime/data/SparseBlockMCSR.java:
##########
@@ -312,8 +312,8 @@ public long size(int rl, int ru, int cl, int cu) {
for(int i=rl; i<ru; i++)
if( !isEmpty(i) ) {
int start = posFIndexGTE(i, cl);
- int end = posFIndexGTE(i, cu);
- nnz += (start!=-1) ? (end-start) : 0;
+ int end = posFIndexLTE(i, cu-1);
+ nnz += (start!=-1) ? (end-start+1) : 0;
Review Comment:
Assuming the previous example (matrix) from above and boundaries defined by
`size(rl=0, ru=4, cl=0, cu=5)` (i.e., the entire matrix except the last
column). Further, we assume that the additional check `end!=-1` is removed for
both MCSR and CSR.
In the case of MCSR, for the last row we would have `start=0`, and
`end=-1`. This is an example of a scenario where `start` is indeed not equal to
-1 but `end` is (not covered anymore by the checks). The `start` variable finds
the value 80 but `end` does not find any value. But still, we see that `(end -
start +1) = (-1 - 0 + 1) = 0`. Now, we could also assume the case where the
first column is left out and the value of the last row is in the that column
instead of the last (i.e., `[80, 0, 0, 0, 0, 0]`) with `size(rl=0,
ru=4, cl=1, cu=6)`. Then, `start=-1` and `end=0`. Since `start=-1`, `nnz` is
incremented by 0. For the case where `[80, 0, 0, 0, 0, 81]` and `size(rl=0,
ru=4, cl=1, cu=5)`, we would have `start=1` and `end=0`. This gives us (0 - 1 +
1) = 0. For MCSR, we can look at these different cases easily because the
column indexes are not stored all together in a single data structure like CSR.
For CSR, let us assume again the matrix from the original example (above)
and `size(rl=0, ru=4, cl=0, cu=5)` (i.e., the entire matrix except the last
column). Here, we have `_indexes=[0, 1, 1, 3, 2, 3, 4, 5]`. For the first 3
rows we would calculate the correct number of non-zeros. The problem occurs in
the last row. There, `start=7` and `end=-1`. Hence, `(-1 -7 +1) = -7`. As a
result, `nnz` would be 0 instead of 8. For these types of cases, we need the
additional condition `end!=-1`.
In MCSR, when `end=-1`, `start` has to be 0 (as far as I understand). Thus,
the additional check is not needed and is already covered by `(end - start +
1)`. I hope this explains my reasoning. In any case, adding `end!=-1` to
MCSR/MCSC does not hurt and may cover for a case I oversaw.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]