Xinyi Zhang created MADLIB-1501:
-----------------------------------

             Summary: Cannot train a model larger than 1GB.
                 Key: MADLIB-1501
                 URL: https://issues.apache.org/jira/browse/MADLIB-1501
             Project: Apache MADlib
          Issue Type: Bug
          Components: Deep Learning
            Reporter: Xinyi Zhang
             Fix For: v1.19.0


When I try to train a model larger than 1GB on Greenplum, I get 
the error below:
CONTEXT: PL/Python function "madlib_keras_fit"
ERROR: spiexceptions.InternalError: invalid memory alloc request size 1100478264 (plpy_elog.c:121)
 
If I use a smaller model, training runs successfully.
It seems that "SELECT {schema_madlib}.fit_step()" cannot execute when the 
model is larger than 1GB.
I set shared_buffers to 32GB, and the instance has 290GB of memory available, 
so the failure looks like a memory allocation problem in MADlib rather than a 
lack of memory.
I did not find any parameter that solves the problem. Since large models 
are quite common, I think there should be a way to train models larger 
than 1GB in MADlib.
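
For reference, the size in the error message can be compared against 
PostgreSQL's per-allocation limit. The sketch below assumes that limit is 
MaxAllocSize (0x3fffffff bytes, i.e. 1GB minus one byte) and that Greenplum 
inherits it unchanged. The helper name is hypothetical and is not part of 
MADlib. If the assumption holds, the request fails because a single value 
exceeds this limit, regardless of shared_buffers or free memory.

```python
# Assumption: PostgreSQL (and Greenplum) reject any single memory
# allocation request larger than MaxAllocSize = 0x3fffffff bytes.
MAX_ALLOC_SIZE = 0x3FFFFFFF  # 1073741823 bytes, i.e. 1GB - 1


def exceeds_alloc_limit(nbytes):
    """Hypothetical helper: True if one allocation of nbytes would be refused."""
    return nbytes > MAX_ALLOC_SIZE


# Size taken from the error message above.
requested = 1100478264

print("limit (bytes):    ", MAX_ALLOC_SIZE)
print("requested (bytes):", requested)
print("over the limit:   ", exceeds_alloc_limit(requested))
print("excess (bytes):   ", requested - MAX_ALLOC_SIZE)
```

If this reading is right, raising memory settings would not help, and the 
model state would need to be split into pieces below the limit.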
 
 


