Edward Espino
Table of Contents
_________________

1. Apache MADlib Version v1.21.0 RC2
2. Observations
.. 1. Reviewing Jira status - QUESTION
.. 2. 1.21.0 docs Review - PASSED
.. 3. MADlib 1.21.0 wiki - old release reference
.. 4. PGP signature verification - PASSED
.. 5. SHA512 checksum verification - PASSED
.. 6. RELEASE_NOTES reviewed - QUESTION
.. 7. Copyright review in NOTICE file - PASSED
.. 8. Apache RAT (mvn apache-rat:check) - PASSED
.. 9. Python dependencies - QUESTION/OBSERVATION
.. 10. Operating System: Rocky Linux 8.7 (Green Obsidian) x86_64 - PASSED
.. 11. Operating System: Ubuntu 22.04.2 LTS - PASSED
.. 12. Test cannot find "python" - FAILURE


1 Apache MADlib Version v1.21.0 RC2
===================================

  I am not a member of the PMC or a committer. Here are my
  observations of the release.


2 Observations
==============

2.1 Reviewing Jira status - QUESTION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Only a single Jira is fixed in this release. Is this expected?
  <https://issues.apache.org/jira/browse/MADLIB-1506>

  v1.21.0 Release Jira info
  <https://issues.apache.org/jira/projects/MADLIB/versions/12352104>

   1 Issues in version
   1 Issues done
   0 Issues in progress
   0 Issues to do

  I am curious why the v1.20.0 Jira release list is not in a finalized
  state.
  <https://issues.apache.org/jira/projects/MADLIB/versions/12352103>

  Is it possible that some of these issues are actually fixed in the
  v1.21.0 release?

  16 Issues in version
   2 Issues done
   0 Issues in progress
  14 Issues to do


2.2 1.21.0 docs Review - PASSED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  <http://madlib.apache.org/docs/rc/index.html>


2.3 MADlib 1.21.0 wiki - old release reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  <https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.21.0>

  This page references the v1.20.0 release:

  Release Notes MADlib v1.20.0:


2.4 PGP signature verification - PASSED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  --> Good signature from "Venkatesh Raghavan (G!)
  <raghava...@vmware.com>" [unknown]


2.5 SHA512 checksum verification - PASSED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


2.6 RELEASE_NOTES reviewed - QUESTION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  The v1.21.0 release notes indicate that three New Features, two
  Improvements, and three Bug Fixes are included. However, only a
  single Jira (MADLIB-1506) is listed for the release.


2.7 Copyright review in NOTICE file - PASSED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Apache MADlib
  Copyright 2016-2023 The Apache Software Foundation.
  ...


2.8 Apache RAT (mvn apache-rat:check) - PASSED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  [INFO] Building madlib 1.21.0
  ... (elided verbose output)
  [INFO] 289 resources included (use -debug for more details)
  [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 approved: 280 licence.


2.9 Python dependencies - QUESTION/OBSERVATION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  I was able to complete the install-check and dev-check activities
  after installing the following Python modules. What exactly are the
  required components for testing and runtime use? At a minimum, these
  should be referenced in the release notes.

  keras
  tensorflow==1.14
  dill
  pandas
  hyperopt==0.2.5
  xgboost
  rtree
  scikit-learn

  When building against the Greenplum 6.23.1 release, I had to set the
  following so that the build could find the appropriate Python 2.7
  shared libraries embedded within the release installation directory.

  export LIBRARY_PATH=/usr/local/greenplum-db-6.23.1/lib:/usr/local/greenplum-db-6.23.1/ext/python/lib


2.10 Operating System: Rocky Linux 8.7 (Green Obsidian) x86_64 - PASSED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  - VMware Greenplum 6.23.1 (GA)
  - PostgreSQL 12.14 (built from source)
  - PostgreSQL 11.19 (built from source)
  - Apache MADlib apache-madlib-1.21.0-src.tar.gz Source tarball
  - install-check passed
  - dev-check passed


2.11 Operating System: Ubuntu 22.04.2 LTS - PASSED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  - PostgreSQL 12.14 (built from source)
  - PostgreSQL 11.19 (built from source)
  - Apache MADlib apache-madlib-1.21.0-src.tar.gz Source tarball
  - install-check passed
  - dev-check passed


2.12 Test cannot find "python" - FAILURE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Testing with Python 2 on these systems revealed a hardcoded legacy
  assumption. On both Rocky 8 and Ubuntu 22, the Python 2 executable
  is not installed as /usr/bin/python, so the deep_learning
  madlib_keras_gpu_info test fails: the function
  get_gpu_info_from_tensorflow expects to find the interpreter via the
  executable name "python", which does not exist on these systems.

  Here is the reference:
  <https://github.com/apache/madlib/blob/master/src/ports/postgres/modules/deep_learning/madlib_keras_gpu_info.py_in#L67>

  I was able to work around this by creating the following symbolic
  link. This was possible because I did not install Python 3, which
  might have relied upon /usr/bin/python. This needs some
  investigation and, at a minimum, a reference in the release notes.

  ln -s /usr/bin/python2 /usr/bin/python
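

  As a possible direction for that investigation, the sketch below
  shows one way a caller could resolve an interpreter from a list of
  candidate names instead of assuming that "python" exists. It is
  only an illustration: find_interpreter and its candidate list are
  hypothetical names of my own, not MADlib code, and I have not tested
  it inside get_gpu_info_from_tensorflow. It scans PATH by hand so
  that it runs on both Python 2.7 and Python 3.

```python
import os


def find_interpreter(candidates=("python", "python2", "python2.7"), path=None):
    """Return the full path of the first candidate found on PATH, else None.

    Hypothetical helper for illustration only; the candidate names are an
    assumption about how Python 2 might be installed on a given system.
    """
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    # Candidates are tried in order of preference; for each one, the
    # directories on PATH are searched left to right.
    for name in candidates:
        for directory in path.split(os.pathsep):
            if not directory:
                continue
            full = os.path.join(directory, name)
            # Accept only regular files that are marked executable.
            if os.path.isfile(full) and os.access(full, os.X_OK):
                return full
    return None
```

  A caller would then pass the returned path to the subprocess call
  in place of the literal "python", and report a clear error when the
  result is None rather than failing on a missing executable.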