[ https://issues.apache.org/jira/browse/SYSTEMML-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Glenn Weidner updated SYSTEMML-1548:
------------------------------------
    Fix Version/s:     (was: SystemML 1.0)
                   SystemML 0.15

> Performance ultra-sparse matrix read
> ------------------------------------
>
>                 Key: SYSTEMML-1548
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1548
>             Project: SystemML
>          Issue Type: Task
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 0.15
>
>
> For certain data sizes and memory configurations, reading ultra-sparse
> matrices shows poor performance due to garbage collection overhead.
> In detail, this task covers two scenarios that will be addressed
> independently:
> 1) Large heap: With a large heap, the problem is the temporarily
> deserialized sparse blocks, which are not reused because resetting them is
> inefficient. This produces a lot of garbage and hence a high cost for full
> garbage collection. This will be addressed by using our CSR sparse blocks
> for ultra-sparse blocks, because CSR has a smaller memory footprint and
> allows for an efficient reset.
> 2) Small heap: With a small heap, the bottleneck is not the temporary
> blocks but the memory overhead of the target sparse matrix. This is due to
> a relatively large memory overhead per sparse row, which is not amortized
> if a row has just one or very few non-zeros. This will be addressed via a
> modification of the MCSR representation for ultra-sparse matrices. Note
> that we cannot use CSR or COO here because we want to support efficient
> multi-threaded incremental construction and subsequent operations.
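The two scenarios above can be illustrated with a minimal sketch. It is not
SystemML's actual sparse block code: the class UltraSparseSketch, its nested
Csr class, and all method names are hypothetical, and the byte constants are
rough assumptions for a 64-bit JVM with compressed references rather than
measured values. The Csr class shows why a CSR block can be reset and reused
without allocating new arrays (scenario 1), and the two estimate methods show
why a per-row representation carries a fixed cost per row that is not
amortized when each row holds a single non-zero (scenario 2).

```java
import java.util.Arrays;

/** Hypothetical illustration only; not SystemML's sparse block classes. */
final class UltraSparseSketch {

    /** Minimal CSR block: three flat arrays shared by all rows. */
    static final class Csr {
        private final int[] rowPtr;  // rowPtr[r + 1] is the end offset of row r
        private int[] colIdx;        // column index of each stored non-zero
        private double[] values;     // value of each stored non-zero
        private int size;            // number of stored non-zeros
        private int lastRow;         // last row that received an append

        Csr(int rows, int capacity) {
            rowPtr = new int[rows + 1];
            colIdx = new int[Math.max(capacity, 1)];
            values = new double[Math.max(capacity, 1)];
        }

        /** Appends a non-zero; rows must arrive in non-decreasing order. */
        void append(int row, int col, double v) {
            if (row < lastRow)
                throw new IllegalArgumentException("rows must be non-decreasing");
            if (size == values.length) {  // grow both payload arrays together
                colIdx = Arrays.copyOf(colIdx, 2 * size);
                values = Arrays.copyOf(values, 2 * size);
            }
            // Rows skipped since the last append are empty and end at 'size'.
            for (int r = lastRow + 1; r < row; r++)
                rowPtr[r + 1] = size;
            colIdx[size] = col;
            values[size] = v;
            size++;
            rowPtr[row + 1] = size;
            lastRow = row;
        }

        /** Clears the block for reuse; no array is reallocated. */
        void reset() {
            Arrays.fill(rowPtr, 0);
            size = 0;
            lastRow = 0;
        }

        int nnz() { return size; }
        int capacity() { return values.length; }

        /** Rows after the last appended row have no valid end offset yet. */
        int nnzInRow(int r) {
            return r > lastRow ? 0 : rowPtr[r + 1] - rowPtr[r];
        }
    }

    // Assumed per-object costs in bytes (not measured).
    static final long OBJ_HEADER = 16, ARRAY_HEADER = 16, REF = 4;

    static long align8(long bytes) { return (bytes + 7) / 8 * 8; }

    /** Estimated size of a CSR block with the given rows and non-zeros. */
    static long estimateCsrBytes(long rows, long nnz) {
        return align8(ARRAY_HEADER + 4 * (rows + 1))  // row pointers
             + align8(ARRAY_HEADER + 4 * nnz)         // column indexes
             + align8(ARRAY_HEADER + 8 * nnz);        // values
    }

    /** Estimated size of a row-wise block where every row is allocated. */
    static long estimateMcsrBytes(long rows, long nnzPerRow) {
        long row = align8(OBJ_HEADER + 4 + 2 * REF)        // row object
                 + align8(ARRAY_HEADER + 4 * nnzPerRow)    // per-row indexes
                 + align8(ARRAY_HEADER + 8 * nnzPerRow);   // per-row values
        return align8(ARRAY_HEADER + REF * rows) + rows * row;
    }
}
```

Under these assumptions, comparing estimateMcsrBytes(rows, 1) with
estimateCsrBytes(rows, rows) shows the per-row object and array headers
dominating the row-wise layout. The sketch also hints at why CSR is not
proposed for the target matrix: its append only works in row order on shared
arrays, which does not fit multi-threaded incremental construction.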