[
https://issues.apache.org/jira/browse/SINGA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
wangwei resolved SINGA-47.
--------------------------
Resolution: Fixed
Assignee: wangwei
> Fix a bug in data layers that leads to out-of-memory when group size is too
> large
> ----------------------------------------------------------------------------------
>
> Key: SINGA-47
> URL: https://issues.apache.org/jira/browse/SINGA-47
> Project: Singa
> Issue Type: Bug
> Reporter: wangwei
> Assignee: wangwei
>
> The Setup function of a data layer opens the database (e.g., DataShard or
> LMDB) and reads a sample record. The sample record is necessary for setting
> upper layers' data shape. Every data layer's Setup function is called when
> SINGA creates the NeuralNet object. If the group size is 128 and
> partitioning is on dimension 0, then 128 data layers will be created, each
> holding an open database. Memory will be exhausted if the database object
> has a large cache (prefetch) size.
> Although every process has the full NeuralNet object, i.e., all layers, each
> process has only a subset of workers, which run over a subset of (data)
> layers. Consequently, in one process, only a small number of data layers
> will call ComputeFeature to read data records.
> To fix the bug, we close the database after reading one sample record in
> the Setup function, and re-open it in the ComputeFeature function. In this
> way, only a small number of database instances are open in each process.
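
The fix described above can be illustrated with a minimal sketch. This is not
SINGA's actual code (SINGA is written in C++): FakeDatabase, DataLayer, setup
and compute_feature are hypothetical names, and the database is an in-memory
stand-in that only counts how many handles are open. It assumes 128 data
layers, of which 4 are run by workers in the local process.

```python
class FakeDatabase:
    """Hypothetical stand-in for DataShard/LMDB; tracks open handles."""
    open_count = 0  # number of currently open database instances

    def __init__(self, records):
        self._records = records
        self._pos = 0
        self._closed = False
        FakeDatabase.open_count += 1

    def next(self):
        # Return the next record, wrapping around at the end.
        rec = self._records[self._pos % len(self._records)]
        self._pos += 1
        return rec

    def close(self):
        if not self._closed:
            self._closed = True
            FakeDatabase.open_count -= 1


class DataLayer:
    """Hypothetical data layer that opens its database lazily."""

    def __init__(self, records):
        self._records = records
        self._db = None
        self.sample_shape = None

    def setup(self):
        # Open the database only long enough to read one sample record,
        # which determines the data shape for upper layers.
        db = FakeDatabase(self._records)
        try:
            sample = db.next()
            self.sample_shape = (len(sample),)
        finally:
            db.close()

    def compute_feature(self):
        # Re-open on first use; only layers that are actually run by a
        # local worker ever reach this point.
        if self._db is None:
            self._db = FakeDatabase(self._records)
        return self._db.next()


records = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]

# Every process builds the full net: all 128 data layers are set up.
layers = [DataLayer(records) for _ in range(128)]
for layer in layers:
    layer.setup()
open_after_setup = FakeDatabase.open_count

# Only the layers assigned to local workers read data.
local_layers = layers[:4]
batch = [layer.compute_feature() for layer in local_layers]
open_after_compute = FakeDatabase.open_count
```

In this sketch no database stays open after setup, and only the layers that
call compute_feature hold an open handle afterwards.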