[ 
https://issues.apache.org/jira/browse/SINGA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangwei resolved SINGA-47.
--------------------------
    Resolution: Fixed
      Assignee: wangwei

> Fix a bug in data layers that leads to out-of-memory when group size is too 
> large 
> ----------------------------------------------------------------------------------
>
>                 Key: SINGA-47
>                 URL: https://issues.apache.org/jira/browse/SINGA-47
>             Project: Singa
>          Issue Type: Bug
>            Reporter: wangwei
>            Assignee: wangwei
>
> The Setup function of a data layer opens the database (e.g., DataShard or 
> LMDB) and reads a sample record. The sample record is necessary for setting 
> up the upper layers' data shapes. Every data layer's Setup function is called 
> when SINGA creates the NeuralNet object. If the group size is 128 and 
> partitioning is on dimension 0, then 128 data layers will be created. The 
> memory would be used up if the database object has a large cache (prefetch) 
> size.
> Although every process has the full NeuralNet object (i.e., all layers), each 
> process has only a subset of workers, which run over a subset of the (data) 
> layers. Consequently, in one process, only a small number of data layers will 
> call ComputeFeature to read data records.
> To fix the bug, we close the database after reading one sample record in the 
> Setup function and re-open it in the ComputeFeature function. This way, only 
> a small number of database instances are open in each process.
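The open-in-Setup / lazily-re-open-in-ComputeFeature pattern described above can be sketched as follows. This is a minimal illustration of the idea, not SINGA's actual code: the `Database`, `DataLayer`, `Setup`, and `ComputeFeature` names mirror the issue text, but their signatures and the `open_count` counter are hypothetical stand-ins.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a database backend such as DataShard or LMDB.
struct Database {
  explicit Database(const std::string& path) : path_(path) { ++open_count; }
  ~Database() { --open_count; }
  std::string ReadOneRecord() { return "sample"; }
  static int open_count;  // number of currently open database handles
 private:
  std::string path_;
};
int Database::open_count = 0;

class DataLayer {
 public:
  explicit DataLayer(const std::string& path) : path_(path) {}

  // Setup: open the database just long enough to read one sample record
  // (needed to set up the upper layers' data shapes), then close it, so
  // layers that never run ComputeFeature hold no open handle.
  void Setup() {
    Database db(path_);            // open
    sample_ = db.ReadOneRecord();  // read one record for shape setup
  }                                // db destroyed here -> database closed

  // ComputeFeature: re-open the database lazily, so only the data layers
  // actually run by a worker in this process keep an instance open.
  std::string ComputeFeature() {
    if (!db_) db_ = std::make_unique<Database>(path_);  // re-open once
    return db_->ReadOneRecord();
  }

 private:
  std::string path_;
  std::string sample_;
  std::unique_ptr<Database> db_;
};
```

With 128 such layers created during NeuralNet setup, all handles are closed again before training starts; only the few layers whose ComputeFeature actually runs in a process re-open one.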



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
