[ 
https://issues.apache.org/jira/browse/SYSTEMML-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1623.
--------------------------------------
       Resolution: Fixed
         Assignee: Matthias Boehm
    Fix Version/s: SystemML 1.0

> Memory efficiency JMLC matrix and frame conversions
> ---------------------------------------------------
>
>                 Key: SYSTEMML-1623
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1623
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 1.0
>
>
> The current JMLC conversion functions take an inefficient, memory-intensive
> code path, which leads to unnecessary OOMs that can easily be avoided. This
> task aims to add and improve these primitives to allow convenient data
> conversions with much better memory efficiency.
> For example, consider a 500k x 90 input model available as a CSV file on the
> classpath, whose string representation requires 1GB. The typical code path
> currently used looks as follows:
> {code}
> ResourceStream(model_file)
> -> prep
> ---> StringBuilder -> String [3GB tmp, 1GB]
> -> convertToDoubleMatrix
> ---> byte[] -> ByteInputStream [2GB]
> ---> MatrixBlock [360MB]
> ---> double[][] [400MB]
> -> setMatrix
> ---> MatrixBlock [360MB]
> {code} 
> which requires at least 4GB of memory due to strong references to all 
> intermediates. The goal of this task is to reduce this to the following, 
> which only requires 360MB of memory:
> {code}
> ResourceStream(model_file)
> -> convertToMatrix
> ---> MatrixBlock [360MB]
> -> setMatrix
> ---> by reference
> {code} 
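
To illustrate the idea behind the reduced code path, here is a minimal sketch of parsing a CSV stream directly into a dense numeric array, one line at a time, without first building the full String or an intermediate double[][]. It is not the SystemML/JMLC implementation: the class StreamingCsvReader and the method readDense are hypothetical names, and the sketch assumes a headerless, comma-separated, purely numeric file with known dimensions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of the SystemML/JMLC API. */
final class StreamingCsvReader {

    /**
     * Parses a headerless, comma-separated numeric CSV stream into a dense
     * row-major array of rows * cols doubles. Only the current line is held
     * as a String, so peak memory is dominated by the output array.
     */
    static double[] readDense(InputStream in, int rows, int cols) throws IOException {
        // multiplyExact fails fast if rows * cols overflows an int
        double[] values = new double[Math.multiplyExact(rows, cols)];
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            for (int r = 0; r < rows; r++) {
                String line = reader.readLine();
                if (line == null) {
                    throw new IOException("Expected " + rows + " rows, found " + r);
                }
                int start = 0;
                for (int c = 0; c < cols; c++) {
                    if (start > line.length()) {
                        throw new IOException("Row " + r + " has fewer than " + cols + " columns");
                    }
                    int end = line.indexOf(',', start);
                    if (end < 0) {
                        end = line.length(); // last field on the line
                    }
                    values[r * cols + c] = Double.parseDouble(line.substring(start, end).trim());
                    start = end + 1;
                }
                if (start <= line.length()) {
                    throw new IOException("Row " + r + " has more than " + cols + " columns");
                }
            }
        }
        return values;
    }
}
```

For the 500k x 90 example above, such an array holds 45 million 8-byte doubles, i.e. 360MB, which matches the MatrixBlock figure. Handing the resulting structure over by reference rather than copying it is what keeps the second step free of additional allocation.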


