Hi Jingyi,

Thanks for posting the artifacts and sending out the vote.

My findings:

Installation and install-check (IC) passed on PostgreSQL 9.6.7.
I also tested a couple of the new features (personalized PageRank and the
mini-batch preprocessor), and they worked for me with a small sample data set.

+1 (binding)

On Thu, Apr 26, 2018 at 2:57 PM, Jingyi Mei <[email protected]> wrote:

> Hello Apache MADlib dev community,
>
> This is the vote for Apache MADlib 1.14 Release (RC1). It provides the
> source release tarball and convenience binaries. This is the third
> Apache MADlib release as an Apache Top Level Project (TLP).
>
> The vote will run for at least 72 working hours and will close on
> Tuesday, May 1st, 2018 @ 6pm PDT. A minimum of 3 binding +1 votes and
> more binding +1 than binding -1 are required to pass.
>
> The main goals of this release are:
>
> New features:
>
> - New module - Balanced datasets: A sampling module to balance
>   classification datasets by resampling using various techniques including
>   undersampling, oversampling, uniform sampling or user-defined proportion
>   sampling (MADLIB-1168)
> - Mini-batch: Added a mini-batch optimizer for MLP and a preprocessor
>   function necessary to create batches from the data (MADLIB-1200,
>   MADLIB-1206, MADLIB-1220, MADLIB-1224, MADLIB-1226, MADLIB-1227)
> - k-NN: Added weighted averaging/voting by distance (MADLIB-1181)
> - Summary: Added additional stats: number of positive, negative, zero
>   values and 95% confidence intervals for the mean (MADLIB-1167)
> - Encode categorical: Updated to produce lower-case column names when
>   possible (MADLIB-1202)
> - MLP: Added support for already one-hot encoded categorical dependent
>   variable in a classification task (MADLIB-1222)
> - Pagerank: Added option for personalized vertices that allows higher
>   weightage for a subset of vertices which will have a higher jump
>   probability as compared to other vertices and a random surfer is more
>   likely to jump to these personalization vertices (MADLIB-1084)
>
> Bug fixes:
>
> - Fixed issue with invalid calls of construct_array that led to problems
>   in Postgresql 10 (MADLIB-1185)
> - Added newline between file concatenation during PGXN install
>   (MADLIB-1194)
> - Fixed upgrade issues in knn (MADLIB-1197)
> - Added fix to ensure RF variable importance are always non-negative
> - Fixed inconsistency in LDA output and improved usability
>   (MADLIB-1160, MADLIB-1201)
> - Fixed MLP and RF predict for models trained in earlier versions to
>   ensure missing optional parameters are given appropriate default values
>   (MADLIB-1207)
> - Fixed a scenario in DT where no features exist due categorical
>   columns with single level being dropped led to the database crashing
> - Fixed step size initialization in MLP based on learning rate policy
>   (MADLIB-1212)
> - Fixed PCA issue that leads to failure when grouping column is a TEXT
>   type (MADLIB-1215)
> - Fixed cat levels output in DT when grouping is enabled (MADLIB-1218)
> - Fixed and simplified initialization of model coefficients in MLP
> - Removed source table dependency for predicting regression models in
>   MLP (MADLIB-1223)
> - Print loss of first iteration in MLP (MADLIB-1228)
> - Fixed MLP failure on GPDB 4.3 when verbose=True (MADLIB-1209)
> - Fixed RF issue that showed up when var_importance=True with no
>   continuous features (MADLIB-1219)
> - Fixed DT/RF issue for null_as_category=True and grouping enabled
>   (MADLIB-1217)
>
> Other:
>
> - Reduced install-check runtime for PCA, DT, RF, elastic net
>   (MADLIB-1216)
> - Added CentOS 7 PostgreSQL 9.6/10 docker files
>
> For additional information, please see:
> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.14
>
> Here are the release artifact details:
>
> Source release tag to be voted on: rc/1.14-rc1, located here:
> https://git-wip-us.apache.org/repos/asf?p=madlib.git;a=tag;h=refs/tags/rc/1.14-rc1
>
> Source release tarball can be retrieved from the following locations:
>
> Package:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz
> PGP Signature:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.asc
> SHA512 Hash:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-src.tar.gz.sha512
>
> Convenience binary packages can be retrieved from the following
> locations:
>
> macOS: 10.* PostgreSQL 9.6 & 10.2
>
> Package:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg
> PGP Signature:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.asc
> SHA512 Hash:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Darwin.dmg.sha512
>
> CentOS* GPDB 4.3.5+
>
> Package:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm
> PGP Signature:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.asc
> SHA512 Hash:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux-GPDB43.rpm.sha512
>
> CentOS 6 &* GPDB 5.3.0, PostgreSQL 9.6 & 10.2
>
> Package:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm
> PGP Signature:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.asc
> SHA512 Hash:
> https://dist.apache.org/repos/dist/dev/madlib/1.14-RC1/apache-madlib-1.14-bin-Linux.rpm.sha512
>
> The PGP KEYS file used to validate the signature of the release artifacts
> is available here:
> https://dist.apache.org/repos/dist/dev/madlib/KEYS
>
> To help in tallying the vote, PMC members please be sure to indicate
> “(binding)” with the vote.
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>
> Regards,
> Jingyi Mei
>
> Pivotal R&D Advanced Analytics
