reductionista edited a comment on issue #355: Keras fit interface
URL: https://github.com/apache/madlib/pull/355#issuecomment-473492589
 
 
   When I run dev-check, I get this error:
   
   ```
   psql:/tmp/madlib.xNO5vR/convex/madlib_keras.sql_in.tmp:60: ERROR:  
plpy.Error: A plpy error occurred in the step function: ImportError: No module 
named keras (plpython.c:5038)  (seg0 slice1 127.0.0.1:25432 pid=98371) 
(plpython.c:5038)
   ```
   I've investigated this, and I have a theory about what causes this.
   
   My keras is installed under `/usr/local/lib/python2.7/site-packages`, and 
that directory is included in my `PYTHONPATH`.  If I run madlib functions that 
only include one level of calls to `plpython`, then everything is fine.  And 
even in the `madlib_keras_fit()` function, the `import keras` statement at the 
beginning of `madlib_keras.py_in` works the first time it is loaded.  But after 
it successfully imports keras, it executes a SQL command with `plpy.execute` to 
run fit_step().  It's somewhere around this point that fit_step() modifies the 
environment to remove `/usr/local/lib/python2.7/site-packages` from the 
PYTHONPATH.  And while `madlib_keras.py` is being loaded the second time to 
call fit_step(), it fails.
   
   Note:  the directory containing keras is always in my `PYTHONPATH` if I 
`ssh` to `localhost`, or if I run `bash` as a subcommand.  And it always 
imports fine in either case.
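
One way to test the theory above would be to record what the interpreter sees at each level of the call chain. The following is only a minimal sketch: `diagnose_import` is a hypothetical helper, not part of MADlib, and it assumes it is called once from the outer `plpython` function and once from the nested one so that the two reports can be compared.

```python
import os
import sys


def diagnose_import(module_name, expected_dir):
    """Report whether module_name imports and whether expected_dir is visible.

    Hypothetical helper for debugging only; not part of MADlib.
    """
    # PYTHONPATH as this process sees it (may differ from the login shell).
    env_dirs = os.environ.get("PYTHONPATH", "").split(os.pathsep)
    report = {
        "in_sys_path": expected_dir in sys.path,
        "in_pythonpath_env": expected_dir in env_dirs,
    }
    try:
        __import__(module_name)
        report["importable"] = True
    except ImportError:
        report["importable"] = False
    return report


if __name__ == "__main__":
    # Directory taken from the comment above; adjust for your system.
    print(diagnose_import("keras", "/usr/local/lib/python2.7/site-packages"))
```

If the outer call reports the directory as present and the nested call does not, that would support the idea that the environment changes between the two levels.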
   
   This could be due to a general issue with Greenplum, namely that 
`greenplum_path.sh` always removes everything from the `PYTHONPATH` except for 
specific python library directories under `$GPHOME`.  This isn't usually a 
problem, since you can call `greenplum_path.sh` first in `.bashrc` and then add 
any other directories afterwards.  (What I do on my system.)  But possibly, 
this is what's causing things to fail when there is a nested call to a 
`plpython` function inside a sql function inside a `plpython` function inside a 
`sql` function.  Not sure if there are other examples of this in our codebase?
   
   It seems like we may or may not care about fixing this, but we should at 
least add a requirement somewhere that keras has to be installed in a system 
directory that python knows about by default; it can't be in a custom directory 
somewhere, even if you set your `PYTHONPATH` to point to it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
