+1 approve ....
For reference, here are my notes for the Apache MADlib v1.14-rc1
review. There are two minor observations (** PLEASE REVIEW **)
below. Please let me know if additional information would help.

Regards,
-=e

--------------------------------------------------------------------------------
 __  __    _    ____  _ _ _
|  \/  |  / \  |  _ \| (_) |__
| |\/| | / _ \ | | | | | | '_ \
| |  | |/ ___ \| |_| | | | |_) |
|_|  |_/_/   \_\____/|_|_|_.__/
--------------------------------------------------------------------------------

Observations:

o Reviewing Jira status:

  The outstanding Jira list can be seen with the following:

    https://issues.apache.org/jira/projects/MADLIB/versions/12342305

      0 Warnings
     34 Issues in version
     33 Issues done
      0 Issues in progress
      1 Issues to do

  ** PLEASE REVIEW **
  There is an open Jira remaining with a v1.14 fix version:
    https://issues.apache.org/jira/browse/MADLIB-1048

o PGP signature verified

o SHA512 checksum verified

o Release notes reviewed and look good

o Copyright is good in NOTICE file

o Apache RAT passed (mvn apache-rat:check)

  v1.14 - [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 approved: 118 licence.

o Docs review (generation only, no content review):

  - Generated design.pdf
  - Generated HTML user docs

o Source validation

  - Mac OS X

    ProductName:    Mac OS X
    ProductVersion: 10.13.4
    BuildVersion:   17E202

    PostgreSQL 10.3 & 9.6.8 (built from source)
    MADlib v1.14-rc1 (built from source)

    ** PLEASE REVIEW **
    For the Mac dev environment, I used the latest Homebrew to set up
    the environment. I originally used the Homebrew version of Boost
    (1.67.0) and encountered compilation issues.

[ 86%] Building CXX object src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/elastic_net/elastic_net_binomial_fista.cpp.o
In file included from /Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/modules/crf/viterbi.cpp:12:
/Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/ports/postgres/dbconnector/dbconnector.hpp:88:10: fatal error: 'boost/tr1/array.hpp' file not found
#include <boost/tr1/array.hpp>
         ^~~~~~~~~~~~~~~~~~~~~
In file included from /Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/modules/elastic_net/elastic_net_binomial_fista.cpp:2:
/Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/ports/postgres/dbconnector/dbconnector.hpp:88:10: fatal error: 'boost/tr1/array.hpp' file not found
#include <boost/tr1/array.hpp>
         ^~~~~~~~~~~~~~~~~~~~~
In file included from /Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/modules/convex/lmf_igd.cpp:9:
/Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/ports/postgres/dbconnector/dbconnector.hpp:88:10: fatal error: 'boost/tr1/array.hpp' file not found
#include <boost/tr1/array.hpp>
         ^~~~~~~~~~~~~~~~~~~~~
In file included from /Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/modules/crf/linear_crf.cpp:11:
/Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/ports/postgres/dbconnector/dbconnector.hpp:88:10: fatal error: 'boost/tr1/array.hpp' file not found
#include <boost/tr1/array.hpp>
         ^~~~~~~~~~~~~~~~~~~~~
In file included from /Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/modules/convex/linear_svm_igd.cpp:9:
/Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/ports/postgres/dbconnector/dbconnector.hpp:88:10: fatal error: 'boost/tr1/array.hpp' file not found
#include <boost/tr1/array.hpp>
         ^~~~~~~~~~~~~~~~~~~~~
In file included from /Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/modules/assoc_rules/assoc_rules.cpp:11:
/Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/ports/postgres/dbconnector/dbconnector.hpp:88:10: fatal error: 'boost/tr1/array.hpp' file not found
#include <boost/tr1/array.hpp>
         ^~~~~~~~~~~~~~~~~~~~~
In file included from /Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/modules/convex/utils_regularization.cpp:6:
/Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/ports/postgres/dbconnector/dbconnector.hpp:88:10: fatal error: 'boost/tr1/array.hpp' file not found
#include <boost/tr1/array.hpp>
         ^~~~~~~~~~~~~~~~~~~~~
In file included from /Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/modules/convex/mlp_igd.cpp:26:
/Users/espino/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/src/ports/postgres/dbconnector/dbconnector.hpp:88:10: fatal error: 'boost/tr1/array.hpp' file not found
#include <boost/tr1/array.hpp>
         ^~~~~~~~~~~~~~~~~~~~~
1 error generated.
1 error generated.
1 error generated.
make[2]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/crf/viterbi.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/elastic_net/elastic_net_binomial_fista.cpp.o] Error 1
make[2]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/convex/lmf_igd.cpp.o] Error 1
1 error generated.
make[2]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/assoc_rules/assoc_rules.cpp.o] Error 1
1 error generated.
1 error generated.
make[2]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/convex/linear_svm_igd.cpp.o] Error 1
make[2]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/crf/linear_crf.cpp.o] Error 1
1 error generated.
make[2]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/convex/utils_regularization.cpp.o] Error 1
1 error generated.
make[2]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/__/__/__/modules/convex/mlp_igd.cpp.o] Error 1
make[1]: *** [src/ports/postgres/10/CMakeFiles/madlib_postgresql_10.dir/all] Error 2
make: *** [all] Error 2
✘-2 ~/workspace/madlib/madlib-v1.14-rc1/apache-madlib-1.14-src/build
12:16 $

    I un-installed the Homebrew version of boost and allowed the build
    process to download a compatible Boost.

    install-check passed for the two DB platforms.
      - with Boost downloaded through build process

  - CentOS release 6.9 (Final)

    Run in Google Cloud Platform (GCP)
      image: centos-6-v20180401

    Dev packages installed to build PostgreSQL and MADlib from source:
      bison cmake flex gcc gcc-c++ patch python-devel readline-devel zlib-devel

    PostgreSQL 10.3 (built from source)
    Greenplum Database:
      greenplum-db-4.3.24.0-rhel5-x86_64.zip (pre-built downloaded from PivNet)
      greenplum-db-5.7.0-rhel6-x86_64.zip (pre-built downloaded from PivNet)
    MADlib v1.14-rc1 (built from source)

    install-check passed for the three DB platforms.

On Thu, Apr 26, 2018 at 2:57 PM, Jingyi Mei <j...@pivotal.io> wrote:

> Hello Apache MADlib dev community,
>
> This is the vote for Apache MADlib 1.14 Release (RC1). It provides the
> source release tarball and convenience binaries. This is the third
> Apache MADlib release as an Apache Top Level Project (TLP).
>
> The vote will run for at least 72 working hours and will close on
> Tuesday, May 1st, 2018 @ 6pm PDT. A minimum of 3 binding +1 votes and
> more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features:
>
> - New module - Balanced datasets: A sampling module to balance
>   classification datasets by resampling using various techniques
>   including undersampling, oversampling, uniform sampling or
>   user-defined proportion sampling (MADLIB-1168)
> - Mini-batch: Added a mini-batch optimizer for MLP and a preprocessor
>   function necessary to create batches from the data (MADLIB-1200,
>   MADLIB-1206, MADLIB-1220, MADLIB-1224, MADLIB-1226, MADLIB-1227)
> - k-NN: Added weighted averaging/voting by distance (MADLIB-1181)
> - Summary: Added additional stats: number of positive, negative, zero
>   values and 95% confidence intervals for the mean (MADLIB-1167)
> - Encode categorical: Updated to produce lower-case column names when
>   possible (MADLIB-1202)
> - MLP: Added support for already one-hot encoded categorical dependent
>   variable in a classification task (MADLIB-1222)
> - Pagerank: Added option for personalized vertices that allows higher
>   weightage for a subset of vertices which will have a higher jump
>   probability as compared to other vertices and a random surfer is
>   more likely to jump to these personalization vertices (MADLIB-1084)
>
> Bug fixes:
>
> - Fixed issue with invalid calls of construct_array that led to
>   problems in Postgresql 10 (MADLIB-1185)
> - Added newline between file concatenation during PGXN install
>   (MADLIB-1194)
> - Fixed upgrade issues in knn (MADLIB-1197)
> - Added fix to ensure RF variable importance are always non-negative
> - Fixed inconsistency in LDA output and improved usability
>   (MADLIB-1160, MADLIB-1201)
> - Fixed MLP and RF predict for models trained in earlier versions to
>   ensure missing optional parameters are given appropriate default
>   values (MADLIB-1207)
> - Fixed a scenario in DT where no features exist due categorical
>   columns with single level being dropped led to the database crashing
> - Fixed step size initialization in MLP based on learning rate policy
>   (MADLIB-1212)
> - Fixed PCA issue that leads to failure when grouping column is a TEXT
>   type (MADLIB-1215)
> - Fixed cat levels output in DT when grouping is enabled (MADLIB-1218)
> - Fixed and simplified initialization of model coefficients in MLP
> - Removed source table dependency for predicting regression models in
>   MLP (MADLIB-1223)
> - Print loss of first iteration in MLP (MADLIB-1228)
> - Fixed MLP failure on GPDB 4.3 when verbose=True (MADLIB-1209)
> - Fixed RF issue that showed up when var_importance=True with no
>   continuous features (MADLIB-1219)
> - Fixed DT/RF issue for null_as_category=True and grouping enabled
>   (MADLIB-1217)
>
> Other:
>
> - Reduced install-check runtime for PCA, DT, RF, elastic net
>   (MADLIB-1216)
> - Added CentOS 7 PostgreSQL 9.6/10 docker files
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.14-rc1, located here:
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.14-rc1
>
> Source release tarball can be retrieved from the following locations:
>
> Package:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz
> PGP Signature:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.asc
> SHA512 Hash:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.sha512
>
> Convenience binary packages can be retrieved from the following
> locations:
>
> macOS: 10.* PostgreSQL 9.6 & 10.2
>
> Package:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg
> PGP Signature:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.asc
> SHA512 Hash:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.sha512
>
> CentOS* GPDB 4.3.5+
>
> Package:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm
> PGP Signature:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.asc
> SHA512 Hash:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.sha512
>
> CentOS 6 &* GPDB 5.3.0, PostgreSQL 9.6 & 10.2
>
> Package:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm
> PGP Signature:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.asc
> SHA512 Hash:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.sha512
>
> The PGP KEYS file used to validate the signature of the release
> artifacts is available here:
> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>
> To help in tallying the vote, PMC members please be sure to indicate
> “(binding)” with the vote.
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Regards,
> Jingyi Mei
>
> Pivotal R&D Advanced Analytics
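
For anyone repeating the "SHA512 checksum verified" step from the review notes, here is a minimal sketch of one way to do it in Python. It is not the procedure used for this review. The function names (sha512_of, parse_published_digest, verify) are hypothetical, and the parser assumes the .sha512 file contains the digest as one unbroken run of 128 hex digits, as in the "sha512sum" layout. A file written in another layout (for example grouped hex from "gpg --print-md") would need different parsing.

```python
import hashlib
import re
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-512 digest of the file at `path`."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_published_digest(text):
    """Extract the digest from the text of a .sha512 file.

    Assumption: the digest appears as one contiguous run of 128 hex digits.
    """
    match = re.search(r"\b[0-9a-fA-F]{128}\b", text)
    if match is None:
        raise ValueError("no contiguous SHA-512 digest found")
    return match.group(0).lower()


def verify(artifact_path, checksum_path):
    """Return True when the artifact's digest equals the published digest."""
    published = parse_published_digest(Path(checksum_path).read_text())
    return sha512_of(artifact_path) == published
```

With the tarball and its .sha512 file downloaded side by side, verify("apache-madlib-1.14-src.tar.gz", "apache-madlib-1.14-src.tar.gz.sha512") would return True on a match. This only checks integrity; the PGP signature still has to be checked separately against the KEYS file.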