janniklinde commented on PR #2376: URL: https://github.com/apache/systemds/pull/2376#issuecomment-3641594972
Thank you for the very good PR @Biranavan-Parameswaran and thanks for addressing the normalization issue for negative shifts in the roll operation. I particularly like that you added new correctness tests to increase test coverage and that you provide benchmarks to verify speedups for larger / dense matrices. I suggest that `RollOperationThreadSafetyTest` is moved to `test.java.org.apache.sysds.performance.matrix` (and renamed to `MatrixRollPerf`). As you correctly mentioned, warmups can distort experimental results. Similarly, the overhead of `JUnit` is also not ideal for performance tests. I would consider doing a performance test in the style of `MatrixAggregate` to get more accurate results. It would be interesting to see the performance of larger dense matrices with few numbers of columns (including column vectors). My guess is that performance drops in that scenario because you then do many array copies on small ranges. If the underlying row-major `DenseBlock` is `DenseBlockFP64`, I think we could benefit from replacing the row-by-row copy with one (or two in the overflow case) large arraycopy calls (similar to the `clen == 1` case in `copyDenseMtx(...)`). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
