Nandish Jayaram created MADLIB-1259:
---------------------------------------
Summary: PostgreSQL out of memory issue with Neural networks training
Key: MADLIB-1259
URL: https://issues.apache.org/jira/browse/MADLIB-1259
Project: Apache MADlib
Issue Type: Bug
Components: Module: Neural Networks
Reporter: Nandish Jayaram
Fix For: v2.0
Neural network training fails with an out-of-memory error in the following
scenario:
* 16 GB RAM
* Dataset: the same one used in
https://issues.apache.org/jira/browse/MADLIB-1257
(23K instances, 300 features, 263 groups)
* PostgreSQL memory setup:
checkpoint_completion_target = '0.9';
default_statistics_target = '500';
effective_cache_size = '12GB';
effective_io_concurrency = '200';
maintenance_work_mem = '2GB';
max_connections = '20';
max_parallel_workers = '4';
max_parallel_workers_per_gather = '2';
max_wal_size = '8GB';
max_worker_processes = '4';
min_wal_size = '4GB';
random_page_cost = '1.1';
shared_buffers = '4GB';
wal_buffers = '16MB';
work_mem = '52428kB';
* Kernel setting: sysctl -w vm.overcommit_memory=2, applied to keep the
postmaster from crashing
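As a rough sanity check on the settings above, the sketch below adds up the main PostgreSQL memory settings and compares the total with the kernel commit limit that applies when vm.overcommit_memory=2. The report does not state the swap size or vm.overcommit_ratio, so the values used here (no swap, the kernel default ratio of 50) are assumptions, as is reading 16 GB as 16 GiB. Note that work_mem is a per-operation limit rather than a per-connection one, and it does not cap memory allocated inside PL/Python functions, so the total is only a ballpark figure.

```python
KB_PER_GB = 1024 * 1024

# Values taken from the settings listed in this report, in kB.
SHARED_BUFFERS_KB = 4 * KB_PER_GB
MAINTENANCE_WORK_MEM_KB = 2 * KB_PER_GB
WORK_MEM_KB = 52428
MAX_CONNECTIONS = 20


def configured_budget_kb():
    """Ballpark total: shared buffers, one work_mem per connection,
    and one maintenance_work_mem allocation."""
    return (SHARED_BUFFERS_KB
            + MAX_CONNECTIONS * WORK_MEM_KB
            + MAINTENANCE_WORK_MEM_KB)


def commit_limit_kb(ram_kb, swap_kb=0, overcommit_ratio=50):
    """Approximate Linux commit limit in strict overcommit mode.

    swap_kb and overcommit_ratio are assumptions (not given in the report);
    huge pages are ignored.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


if __name__ == "__main__":
    ram_kb = 16 * KB_PER_GB  # assumes 16 GB means 16 GiB
    budget = configured_budget_kb()
    limit = commit_limit_kb(ram_kb)
    print("configured budget: %.2f GiB" % (budget / KB_PER_GB))
    print("commit limit:      %.2f GiB" % (limit / KB_PER_GB))
    print("headroom:          %.2f GiB" % ((limit - budget) / KB_PER_GB))
```

If these assumptions hold, only about 1 GiB of committable memory would remain for everything else on the machine, including the PL/Python training function. A larger swap area or a higher vm.overcommit_ratio would raise that limit.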
With the above database settings and dataset size, the following query fails
with an out-of-memory error:
{code:sql}
SELECT madlib.mlp_classification(
'train_data_sub', -- Source table
'mlp_model', -- Destination table
'features', -- Input features
'positive', -- Label
ARRAY[5], -- Number of units per layer
'learning_rate_init=0.003,
n_iterations=500,
tolerance=0', -- Optimizer params
'tanh', -- Activation function
NULL, -- Default weight (1)
FALSE, -- No warm start
TRUE, -- Verbose
'case_icd' -- Grouping
);
ERROR: spiexceptions.OutOfMemory: out of memory
DETAIL: Failed on request of size 32800.
CONTEXT: Traceback (most recent call last):
PL/Python function "mlp_classification", line 36, in <module>
grouping_col
PL/Python function "mlp_classification", line 45, in wrapper
PL/Python function "mlp_classification", line 325, in mlp
PL/Python function "mlp_classification", line 580, in update
PL/Python function "mlp_classification"
{code}
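To put the failed allocation in context, the sketch below estimates how much memory the model coefficients alone would need for the query above. It assumes a conventional fully connected layout with one bias term per unit, two output classes (suggested by the label name 'positive' but not stated in this report), and 8-byte doubles. MADlib's internal state layout may differ, so treat the result as an order-of-magnitude estimate.

```python
def mlp_weight_count(n_features, hidden_units, n_classes):
    """Count weights of a fully connected network, one bias per unit.

    The layout is an assumption, not MADlib's documented internal format.
    """
    sizes = [n_features] + list(hidden_units) + [n_classes]
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


N_FEATURES = 300      # from the dataset description
HIDDEN_UNITS = [5]    # ARRAY[5] in the query
N_CLASSES = 2         # assumption: binary label
N_GROUPS = 263        # distinct values of the grouping column
BYTES_PER_DOUBLE = 8

per_group = mlp_weight_count(N_FEATURES, HIDDEN_UNITS, N_CLASSES)
total_bytes = per_group * N_GROUPS * BYTES_PER_DOUBLE

if __name__ == "__main__":
    print("weights per group:", per_group)
    print("all groups: %.2f MiB" % (total_bytes / (1024 * 1024)))
```

Under these assumptions the coefficients for all 263 groups come to roughly 3 MiB, so the coefficients by themselves are unlikely to account for the failure.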
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)