Xinyi Zhang created MADLIB-1501:
-----------------------------------

             Summary: Can not train model larger than 1GB.
                 Key: MADLIB-1501
                 URL: https://issues.apache.org/jira/browse/MADLIB-1501
             Project: Apache MADlib
          Issue Type: Bug
          Components: Deep Learning
            Reporter: Xinyi Zhang
             Fix For: v1.19.0

When I try to train a model larger than 1GB on Greenplum, I get the error below:

    CONTEXT: PL/Python function "madlib_keras_fit"
    ERROR: spiexceptions.InternalError: invalid memory alloc request size 1100478264 (plpy_elog.c:121)

If I use a smaller model, training runs successfully. It seems that "SELECT {schema_madlib}.fit_step()" cannot execute when the model is larger than 1GB.

I set shared_buffers to 32GB, and the instance has 290GB of memory available, so the failure does not look like a shortage of memory on the host. Something appears to go wrong in how the memory is allocated for the model in MADlib.

I did not find any parameter that solves the problem. Since large models are quite common, I think MADlib should offer a way to train models larger than 1GB.
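
A note on the number in the error message: PostgreSQL (and therefore Greenplum) caps a single ordinary memory allocation at MaxAllocSize, which is 0x3fffffff bytes (1GB minus one byte), independent of shared_buffers or free RAM. The request size of 1100478264 bytes is above that cap, which would be consistent with the failure, although the report does not confirm that this is the root cause. The sketch below only does the arithmetic. The helper names are hypothetical, and the assumption that weights are stored as 4-byte floats with no serialization overhead is mine, not taken from MADlib.

```python
# Sketch: compare the failed allocation size with PostgreSQL's per-allocation cap.
# MaxAllocSize is 0x3fffffff bytes (1 GiB - 1) in PostgreSQL's memory manager.
MAX_ALLOC_SIZE = 0x3FFFFFFF

# Size taken verbatim from the error message in this report.
REQUESTED_BYTES = 1100478264


def exceeds_alloc_limit(num_bytes: int, limit: int = MAX_ALLOC_SIZE) -> bool:
    """Return True if a single allocation of num_bytes is above the cap."""
    return num_bytes > limit


def max_float32_params(limit: int = MAX_ALLOC_SIZE) -> int:
    """Upper bound on parameters that fit in one allocation.

    Assumption (hypothetical): 4 bytes per weight and no header or
    serialization overhead, so the real bound would be somewhat lower.
    """
    return limit // 4


if __name__ == "__main__":
    print("cap (bytes):      ", MAX_ALLOC_SIZE)
    print("requested (bytes):", REQUESTED_BYTES)
    print("over the cap:     ", exceeds_alloc_limit(REQUESTED_BYTES))
    print("excess (bytes):   ", REQUESTED_BYTES - MAX_ALLOC_SIZE)
    print("max float32 params under the assumption:", max_float32_params())
```

If this reading is right, raising memory-related settings would not help, because the cap applies to each individual allocation rather than to total memory.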