Nandish Jayaram created MADLIB-1259:
---------------------------------------

             Summary: PostgreSQL out of memory issue with Neural networks training
                 Key: MADLIB-1259
                 URL: https://issues.apache.org/jira/browse/MADLIB-1259
             Project: Apache MADlib
          Issue Type: Bug
          Components: Module: Neural Networks
            Reporter: Nandish Jayaram
             Fix For: v2.0


Neural network training results in an out of memory exception in the following 
scenario:
 * 16 GB RAM
 * Dataset: same as the one used in
   https://issues.apache.org/jira/browse/MADLIB-1257
   (23K instances with 300 features, in 263 groups)
 * PostgreSQL settings:
 checkpoint_completion_target = '0.9';
 default_statistics_target = '500';
 effective_cache_size = '12GB';
 effective_io_concurrency = '200';
 maintenance_work_mem = '2GB';
 max_connections = '20';
 max_parallel_workers = '4';
 max_parallel_workers_per_gather = '2';
 max_wal_size = '8GB';
 max_worker_processes = '4';
 min_wal_size = '4GB';
 random_page_cost = '1.1';
 shared_buffers = '4GB';
 wal_buffers = '16MB';
 work_mem = '52428kB';
 * Kernel setting, applied to avoid a postmaster crash:
 sysctl -w vm.overcommit_memory=2
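
As a rough sanity check on the settings above, the sketch below adds up a simplified memory budget and compares it with the kernel commit limit that applies under vm.overcommit_memory=2. It assumes no swap and the kernel default vm.overcommit_ratio of 50, neither of which is stated in this report. It also counts a single work_mem allocation per connection, although PostgreSQL can use several per query. The function names are illustrative only.

```python
KB_PER_GB = 1024 * 1024


def commit_limit_kb(ram_kb, swap_kb=0, overcommit_ratio=50):
    """Approximate Linux CommitLimit under vm.overcommit_memory=2.

    Huge pages are ignored. swap_kb=0 and overcommit_ratio=50 are
    assumptions; the report does not state either value.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


def simplified_budget_kb(shared_buffers_kb, work_mem_kb,
                         max_connections, maintenance_work_mem_kb):
    """Simplified budget: one work_mem allocation per connection."""
    return (shared_buffers_kb
            + work_mem_kb * max_connections
            + maintenance_work_mem_kb)


# Values taken from the settings listed in the report.
ram_kb = 16 * KB_PER_GB
budget_kb = simplified_budget_kb(
    shared_buffers_kb=4 * KB_PER_GB,
    work_mem_kb=52428,
    max_connections=20,
    maintenance_work_mem_kb=2 * KB_PER_GB,
)
limit_kb = commit_limit_kb(ram_kb)
headroom_kb = limit_kb - budget_kb

print("simplified budget: %.2f GB" % (budget_kb / KB_PER_GB))
print("commit limit:      %.2f GB" % (limit_kb / KB_PER_GB))
print("headroom:          %.2f GB" % (headroom_kb / KB_PER_GB))
```

Under these assumptions the commit limit is about half of physical RAM, which leaves only about 1 GB above the simplified budget. Whether that headroom is what the training run exhausted is not established by this report.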

With the above database settings and dataset size, the following query resulted 
in an error:
{code:sql}
SELECT madlib.mlp_classification(
    'train_data_sub',  -- Source table
    'mlp_model',       -- Destination table
    'features',        -- Input features
    'positive',        -- Label
    ARRAY[5],          -- Number of units per hidden layer
    'learning_rate_init=0.003,
    n_iterations=500,
    tolerance=0',      -- Optimizer params
    'tanh',            -- Activation function
    NULL,              -- Default weight (1)
    FALSE,             -- No warm start
    TRUE,              -- Verbose
    'case_icd'         -- Grouping
);

ERROR:  spiexceptions.OutOfMemory: out of memory
DETAIL:  Failed on request of size 32800.
CONTEXT:  Traceback (most recent call last):
  PL/Python function "mlp_classification", line 36, in <module>
    grouping_col
  PL/Python function "mlp_classification", line 45, in wrapper
  PL/Python function "mlp_classification", line 325, in mlp
  PL/Python function "mlp_classification", line 580, in update
PL/Python function "mlp_classification"
{code}
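
For a sense of scale, the sketch below estimates the size of the coefficient vectors for the network requested in the query above (300 input features, one hidden layer of 5 units, one model per group). It assumes two output classes, one bias term per layer, and 8-byte floats. None of these are stated in this report, and the helper name is hypothetical.

```python
def mlp_coefficient_count(n_features, hidden_units, n_classes):
    """Count weights and biases of a fully connected network.

    Assumes one bias term feeding every non-input layer.
    """
    layer_sizes = [n_features] + list(hidden_units) + [n_classes]
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


N_FEATURES = 300      # from the report
HIDDEN_UNITS = [5]    # ARRAY[5] in the query
N_GROUPS = 263        # from the report
N_CLASSES = 2         # assumption: binary label
BYTES_PER_FLOAT = 8   # assumption: double precision

per_group = mlp_coefficient_count(N_FEATURES, HIDDEN_UNITS, N_CLASSES)
bytes_per_group = per_group * BYTES_PER_FLOAT
total_bytes = bytes_per_group * N_GROUPS

print("coefficients per group:", per_group)
print("bytes per group:", bytes_per_group)
print("total across groups: %.2f MiB" % (total_bytes / (1024 * 1024)))
```

Under these assumptions the coefficients for all 263 groups come to only a few megabytes, so the final model state alone is small relative to the configured memory. The report does not identify what consumed the memory before the 32800-byte request failed.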



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
