[jira] [Assigned] (MADLIB-1005) Cannot compile for greenplum (arch linux) - AggCheckCallContext issue
[ https://issues.apache.org/jira/browse/MADLIB-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ed Espino reassigned MADLIB-1005:
---------------------------------

    Assignee: Ed Espino

> Cannot compile for greenplum (arch linux) - AggCheckCallContext issue
> ---------------------------------------------------------------------
>
> Key: MADLIB-1005
> URL: https://issues.apache.org/jira/browse/MADLIB-1005
> Project: Apache MADlib
> Issue Type: Bug
> Components: Build System
> Reporter: Aleksandr Melnyk
> Assignee: Ed Espino
> Priority: Minor
> Fix For: v1.12
>
>
> In file included from /home/gpadmin/incubator-madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/dbconnector.hpp:272:0,
>                  from /home/gpadmin/incubator-madlib/src/ports/greenplum/dbconnector/dbconnector.hpp:35,
>                  from /home/gpadmin/incubator-madlib/src/modules/sample/weighted_sample.cpp:9:
> /home/gpadmin/incubator-madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/AnyType_impl.hpp: In member function 'madlib::dbconnector::postgres::AnyType madlib::dbconnector::postgres::AnyType::operator[](uint16_t) const':
> /home/gpadmin/incubator-madlib/src/ports/greenplum/dbconnector/../../postgres/dbconnector/AnyType_impl.hpp:325:57: error: call of overloaded 'AggCheckCallContext(FunctionCallInfoData* const&, NULL)' is ambiguous
>      isMutable = AggCheckCallContext(fcinfo, NULL);
>      ^
> In file included from /home/gpadmin/incubator-madlib/src/ports/greenplum/dbconnector/dbconnector.hpp:33:0,
>                  from /home/gpadmin/incubator-madlib/src/modules/sample/weighted_sample.cpp:9:
> /home/gpadmin/incubator-madlib/src/ports/greenplum/dbconnector/Compatibility.hpp:61:1: note: candidate: int madlib::dbconnector::postgres::{anonymous}::AggCheckCallContext(FunctionCallInfo, MemoryContextData**)
>  AggCheckCallContext(FunctionCallInfo fcinfo, MemoryContext *aggcontext) {
>  ^~~
> In file included from /usr/local/projects/custom_builds/gpdb.master/include/postgresql/server/funcapi.h:19:0,
>                  from /home/gpadmin/incubator-madlib/src/ports/greenplum/dbconnector/dbconnector.hpp:17,
>                  from /home/gpadmin/incubator-madlib/src/modules/sample/weighted_sample.cpp:9:
> /usr/local/projects/custom_builds/gpdb.master/include/postgresql/server/fmgr.h:584:12: note: candidate: int AggCheckCallContext(FunctionCallInfo, MemoryContextData**)
>  extern int AggCheckCallContext(FunctionCallInfo fcinfo,
>  ^~~
> src/ports/greenplum/4.3ORCA/CMakeFiles/madlib_greenplum_4_3ORCA.dir/build.make:62: recipe for target 'src/ports/greenplum/4.3ORCA/CMakeFiles/madlib_greenplum_4_3ORCA.dir/__/__/__/modules/sample/weighted_sample.cpp.o' failed
> make[2]: *** [src/ports/greenplum/4.3ORCA/CMakeFiles/madlib_greenplum_4_3ORCA.dir/__/__/__/modules/sample/weighted_sample.cpp.o] Error 1
> CMakeFiles/Makefile2:728: recipe for target 'src/ports/greenplum/4.3ORCA/CMakeFiles/madlib_greenplum_4_3ORCA.dir/all' failed
> make[1]: *** [src/ports/greenplum/4.3ORCA/CMakeFiles/madlib_greenplum_4_3ORCA.dir/all] Error 2
> Makefile:149: recipe for target 'all' failed
> make: *** [all] Error 2

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (MADLIB-1005) Cannot compile for greenplum (arch linux) - AggCheckCallContext issue
[ https://issues.apache.org/jira/browse/MADLIB-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ed Espino resolved MADLIB-1005.
-------------------------------
    Resolution: Fixed

This was fixed back in January 2017 with the following commit:

{code}
git show -s 3cf3f6771ab51dd26605ce4d70cd70aee5d896dd
commit 3cf3f6771ab51dd26605ce4d70cd70aee5d896dd
Author: Dave Cramer
Date: Wed Jan 11 15:17:01 2017 -0800

    Build: Exclude AggCheckCallContext for GPDB5

    - Adds build files to compile MADlib with GPDB5
    - GPDB5 cherrypicked AggCheckCallContext, we have to exclude it for GPDB5 builds

    Closes #83
{code}

I have verified this with GPDB 5 (7789b1a5fd18338b454396d5281a6127c9a9ee8a - {{configure --disable-orca --with-python}}) and MADlib (4e8616b7a9c0a21326b906ff534d341fab8a5fa4) on CentOS Linux release 7.3.1611 (Core).

{code}
$ /usr/local/madlib/bin/madpack -s madlib -p greenplum install-check
madpack.py : INFO : Detected Greenplum DB version 5.0.0.
TEST CASE RESULT|Module: array_ops|array_ops.sql_in|PASS|Time: 582 milliseconds
TEST CASE RESULT|Module: bayes|gaussian_naive_bayes.sql_in|PASS|Time: 806 milliseconds
TEST CASE RESULT|Module: bayes|bayes.sql_in|PASS|Time: 2032 milliseconds
TEST CASE RESULT|Module: crf|crf_train_small.sql_in|PASS|Time: 980 milliseconds
TEST CASE RESULT|Module: crf|crf_train_large.sql_in|PASS|Time: 1389 milliseconds
TEST CASE RESULT|Module: crf|crf_test_small.sql_in|PASS|Time: 852 milliseconds
TEST CASE RESULT|Module: crf|crf_test_large.sql_in|PASS|Time: 1019 milliseconds
TEST CASE RESULT|Module: elastic_net|elastic_net_install_check.sql_in|PASS|Time: 79703 milliseconds
TEST CASE RESULT|Module: linalg|svd.sql_in|PASS|Time: 7446 milliseconds
TEST CASE RESULT|Module: linalg|matrix_ops.sql_in|PASS|Time: 6264 milliseconds
TEST CASE RESULT|Module: linalg|linalg.sql_in|PASS|Time: 341 milliseconds
TEST CASE RESULT|Module: prob|prob.sql_in|PASS|Time: 1213 milliseconds
TEST CASE RESULT|Module: sketch|support.sql_in|PASS|Time: 49 milliseconds
TEST CASE RESULT|Module: sketch|mfv.sql_in|PASS|Time: 263 milliseconds
TEST CASE RESULT|Module: sketch|fm.sql_in|PASS|Time: 1782 milliseconds
TEST CASE RESULT|Module: sketch|cm.sql_in|PASS|Time: 6164 milliseconds
TEST CASE RESULT|Module: svm|svm.sql_in|PASS|Time: 13794 milliseconds
TEST CASE RESULT|Module: tsa|arima_train.sql_in|PASS|Time: 3856 milliseconds
TEST CASE RESULT|Module: tsa|arima.sql_in|PASS|Time: 3622 milliseconds
TEST CASE RESULT|Module: conjugate_gradient|conj_grad.sql_in|PASS|Time: 347 milliseconds
TEST CASE RESULT|Module: knn|knn.sql_in|PASS|Time: 483 milliseconds
TEST CASE RESULT|Module: lda|lda.sql_in|PASS|Time: 3117 milliseconds
TEST CASE RESULT|Module: stats|wsr_test.sql_in|PASS|Time: 171 milliseconds
TEST CASE RESULT|Module: stats|t_test.sql_in|PASS|Time: 259 milliseconds
TEST CASE RESULT|Module: stats|robust_and_clustered_variance_coxph.sql_in|PASS|Time: 1125 milliseconds
TEST CASE RESULT|Module: stats|pred_metrics.sql_in|PASS|Time: 1015 milliseconds
TEST CASE RESULT|Module: stats|mw_test.sql_in|PASS|Time: 126 milliseconds
TEST CASE RESULT|Module: stats|ks_test.sql_in|PASS|Time: 336 milliseconds
TEST CASE RESULT|Module: stats|f_test.sql_in|PASS|Time: 127 milliseconds
TEST CASE RESULT|Module: stats|cox_prop_hazards.sql_in|PASS|Time: 2430 milliseconds
TEST CASE RESULT|Module: stats|correlation.sql_in|PASS|Time: 1107 milliseconds
TEST CASE RESULT|Module: stats|chi2_test.sql_in|PASS|Time: 378 milliseconds
TEST CASE RESULT|Module: stats|anova_test.sql_in|PASS|Time: 267 milliseconds
TEST CASE RESULT|Module: svec_util|svec_test.sql_in|PASS|Time: 1567 milliseconds
TEST CASE RESULT|Module: svec_util|gp_sfv_sort_order.sql_in|PASS|Time: 126 milliseconds
TEST CASE RESULT|Module: utilities|text_utilities.sql_in|PASS|Time: 288 milliseconds
TEST CASE RESULT|Module: utilities|sessionize.sql_in|PASS|Time: 421 milliseconds
TEST CASE RESULT|Module: utilities|pivot.sql_in|PASS|Time: 1398 milliseconds
TEST CASE RESULT|Module: utilities|path.sql_in|PASS|Time: 439 milliseconds
TEST CASE RESULT|Module: utilities|encode_categorical.sql_in|PASS|Time: 735 milliseconds
TEST CASE RESULT|Module: utilities|drop_madlib_temp.sql_in|PASS|Time: 165 milliseconds
TEST CASE RESULT|Module: assoc_rules|assoc_rules.sql_in|PASS|Time: 1833 milliseconds
TEST CASE RESULT|Module: convex|mlp.sql_in|PASS|Time: 14029 milliseconds
TEST CASE RESULT|Module: convex|lmf.sql_in|PASS|Time: 3226 milliseconds
TEST CASE RESULT|Module: glm|poisson.sql_in|PASS|Time: 1309 milliseconds
TEST CASE RESULT|Module: glm|ordinal.sql_in|PASS|Time: 1002 milliseconds
TEST CASE RESULT|Module: glm|multinom.sql_in|PASS|Time: 1184 milliseconds
TEST CASE RESULT|Module: glm|inverse_gaussian.sql_in|PASS|Time: 1604 milliseconds
TEST CASE RESULT|Module: glm|gaussian.sql_in|PASS|Time: 1349 milliseconds
TEST CASE RESULT|Module: glm|gamma.sql_in|PASS|Time: 6276 milliseconds
TEST CASE RESULT|Module: glm|binomial.sql_in|PASS|Time: 4382 milliseconds
TEST
[jira] [Commented] (MADLIB-1118) Reduce size of elastic net install check table
[ https://issues.apache.org/jira/browse/MADLIB-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120865#comment-16120865 ]

ASF GitHub Bot commented on MADLIB-1118:
----------------------------------------

Github user njayaram2 commented on the issue:

    https://github.com/apache/incubator-madlib/pull/163

    LGTM, since we don't assert on `relative_error` on `log_likelihood` in elastic_net anyway.

> Reduce size of elastic net install check table
> ----------------------------------------------
>
> Key: MADLIB-1118
> URL: https://issues.apache.org/jira/browse/MADLIB-1118
> Project: Apache MADlib
> Issue Type: Task
> Components: Module: Regularized Regression
> Reporter: Frank McQuillan
> Assignee: Ed Espino
> Priority: Minor
> Fix For: v1.12
>
>
> IC is taking too long for elastic net. I would suggest we reduce the size of the input data table.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MADLIB-1134) Neural Networks - MLP - Phase 2
[ https://issues.apache.org/jira/browse/MADLIB-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120655#comment-16120655 ]

Cooper Sloan edited comment on MADLIB-1134 at 8/9/17 9:12 PM:
--------------------------------------------------------------

Very good article on regularization for NN:
https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=8&cad=rja&uact=8&ved=0ahUKEwiaj573h8vVAhVos1QKHQubAa0QFgheMAc&url=https%3A%2F%2Fpiazza.com%2Fclass_profile%2Fget_resource%2Fieytirtomz425i%2Fifyrgs0anxs3d5&usg=AFQjCNHYQNY3YuO4TX3UAzeplaXNgoANOQ

was (Author: coopersloan):
Very good article on regularization for NN:
https://piazza-resources.s3.amazonaws.com/ieytirtomz425i/ifyrgs0anxs3d5/Oct19Lecture.pdf?AWSAccessKeyId=AKIAIEDNRLJ4AZKBW6HA&Expires=1502245062&Signature=6FGb4J8zhez0uQWsu4xtefaKlFU%3D

> Neural Networks - MLP - Phase 2
> -------------------------------
>
> Key: MADLIB-1134
> URL: https://issues.apache.org/jira/browse/MADLIB-1134
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Module: Neural Networks
> Reporter: Frank McQuillan
> Assignee: Cooper Sloan
> Fix For: v1.12
>
>
> Follow on from https://issues.apache.org/jira/browse/MADLIB-413
> Story
> As a MADlib developer, I want to get 2nd phase implementation of NN going with training and prediction functions, so that I can use this to build to an MVP version for GA.
> Features to add:
> * weights for inputs
> * logic for n_tries
> * normalize inputs
> * L2 regularization
> * learning rate policy
> * warm start

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MADLIB-1134) Neural Networks - MLP - Phase 2
[ https://issues.apache.org/jira/browse/MADLIB-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120655#comment-16120655 ]

Cooper Sloan commented on MADLIB-1134:
--------------------------------------

Very good article on regularization for NN:
https://piazza-resources.s3.amazonaws.com/ieytirtomz425i/ifyrgs0anxs3d5/Oct19Lecture.pdf?AWSAccessKeyId=AKIAIEDNRLJ4AZKBW6HA&Expires=1502245062&Signature=6FGb4J8zhez0uQWsu4xtefaKlFU%3D

> Neural Networks - MLP - Phase 2
> -------------------------------
>
> Key: MADLIB-1134
> URL: https://issues.apache.org/jira/browse/MADLIB-1134
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Module: Neural Networks
> Reporter: Frank McQuillan
> Assignee: Cooper Sloan
> Fix For: v1.12
>
>
> Follow on from https://issues.apache.org/jira/browse/MADLIB-413
> Story
> As a MADlib developer, I want to get 2nd phase implementation of NN going with training and prediction functions, so that I can use this to build to an MVP version for GA.
> Features to add:
> * weights for inputs
> * logic for n_tries
> * normalize inputs
> * L2 regularization
> * learning rate policy
> * warm start

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MADLIB-1118) Reduce size of elastic net install check table
[ https://issues.apache.org/jira/browse/MADLIB-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120552#comment-16120552 ]

ASF GitHub Bot commented on MADLIB-1118:
----------------------------------------

Github user edespino commented on the issue:

    https://github.com/apache/incubator-madlib/pull/163

    For future reference, this is how I reviewed the elastic_net install-check execution:

    * Update the following file: `/src/ports/postgres/modules/elastic_net/test/elastic_net_install_check.sql_in`
    * Add `\timing` to the top of the file.
    * Add `SELECT ASSERT (FALSE, 'Deliberately forced failure');` to the bottom of the file to force a failure condition. This allowed me to review the timing information in the log files from the test execution.
    * From the build directory, run `make install` to push the updated install-check file to the installation directory.
    * Run only the elastic_net test suite (using Postgres): `/usr/local/madlib/bin/madpack -s madlib -p postgres install-check -t elastic_net`

    I updated the elastic_net_train tolerance with varying values and reran the scenario, reviewing the recorded `Time:` values each time.

> Reduce size of elastic net install check table
> ----------------------------------------------
>
> Key: MADLIB-1118
> URL: https://issues.apache.org/jira/browse/MADLIB-1118
> Project: Apache MADlib
> Issue Type: Task
> Components: Module: Regularized Regression
> Reporter: Frank McQuillan
> Assignee: Ed Espino
> Priority: Minor
> Fix For: v1.12
>
>
> IC is taking too long for elastic net. I would suggest we reduce the size of the input data table.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
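As a complement to reading `Time:` values out of the log files, the per-test summary lines that madpack prints can be ranked directly. The following is only a minimal Python sketch, assuming the `TEST CASE RESULT|Module: ...|<file>|<status>|Time: <n> milliseconds` format seen in the MADLIB-1005 install-check output; `CheckResult`, `parse_results` and `slowest` are hypothetical helper names and are not part of madpack.

```python
import re
from typing import List, NamedTuple


class CheckResult(NamedTuple):
    """One install-check summary line (hypothetical helper type)."""
    module: str
    test_file: str
    status: str
    millis: int


# Assumed line format, taken from the install-check output quoted in MADLIB-1005:
#   TEST CASE RESULT|Module: svm|svm.sql_in|PASS|Time: 13794 milliseconds
RESULT_RE = re.compile(
    r"TEST CASE RESULT\|Module: (?P<module>[^|]+)\|(?P<file>[^|]+)"
    r"\|(?P<status>[A-Z]+)\|Time: (?P<ms>\d+) milliseconds"
)


def parse_results(log_text: str) -> List[CheckResult]:
    """Extract every complete summary line; truncated lines are skipped."""
    return [
        CheckResult(m["module"], m["file"], m["status"], int(m["ms"]))
        for m in RESULT_RE.finditer(log_text)
    ]


def slowest(results: List[CheckResult], n: int = 3) -> List[CheckResult]:
    """Return the n results with the largest elapsed time, slowest first."""
    return sorted(results, key=lambda r: r.millis, reverse=True)[:n]


if __name__ == "__main__":
    # A few lines copied from the MADLIB-1005 output, plus a truncated tail.
    sample = """\
TEST CASE RESULT|Module: elastic_net|elastic_net_install_check.sql_in|PASS|Time: 79703 milliseconds
TEST CASE RESULT|Module: linalg|svd.sql_in|PASS|Time: 7446 milliseconds
TEST CASE RESULT|Module: svm|svm.sql_in|PASS|Time: 13794 milliseconds
TEST CASE RESULT|Module: sketch|support.sql_in|PASS|Time: 49 milliseconds
TEST"""
    for r in slowest(parse_results(sample)):
        print(f"{r.module:<12} {r.test_file:<34} {r.millis:>6} ms")
```

In the portion of the MADLIB-1005 run that is quoted in this thread, the elastic_net entry (79703 milliseconds) is the largest single value, which is consistent with the motivation for this issue.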
[jira] [Resolved] (MADLIB-1089) Install check errors on HAWQ 2.2 when install MADlib on non-default schema
[ https://issues.apache.org/jira/browse/MADLIB-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan resolved MADLIB-1089.
-------------------------------------
    Resolution: Fixed

> Install check errors on HAWQ 2.2 when install MADlib on non-default schema
> --------------------------------------------------------------------------
>
> Key: MADLIB-1089
> URL: https://issues.apache.org/jira/browse/MADLIB-1089
> Project: Apache MADlib
> Issue Type: Bug
> Components: All Modules
> Reporter: Frank McQuillan
> Priority: Minor
> Fix For: v1.12
>
> Attachments: k-means-IC-fail-on-hawq-2dot2, linalg-IC-fail-on-hawq-2dot2
>
>
> Running install-check on a non-default schema in HAWQ 2.2 results in errors for linalg and kmeans.
> {code}
> MADlib version: 1.10.0, git revision: rel/v1.9.1-58-ga3863b6, cmake configuration time: Wed Mar 8 19:49:45 UTC 2017, build type: Release, build system: Linux-2.6.18-238.27.1.el5.hotfix.bz516490, C compiler: gcc 4.4.0, C++ compiler: g++ 4.4.0
> PostgreSQL 8.2.15 (Greenplum Database 4.2.0 build 1) (HAWQ 2.2.0.0 build 4141) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11) compiled on Mar 30 2017 21:45:26
> {code}
> See attached log files and summaries below:
> linalg.sql_in
> {code}
> psql:/tmp/madlib.sGu72l/linalg/test/linalg.sql_in.tmp:165: ERROR: Function "closest_column(double precision[],double precision[],text)": Invalid distance metric provided: madlib1.squared_dist_norm2. Currently only madlib provided distance functions are supported.
> {code}
> kmeans.sql_in
> {code}
> psql:/tmp/madlib.sGu72l/kmeans/test/kmeans.sql_in.tmp:117: ERROR: plpy.SPIError: Function "closest_column(double precision[],double precision[],text)": Invalid distance metric provided: madlib1.squared_dist_norm2. Currently only madlib provided distance functions are supported. (seg1 ip-10-32-127-188.ore6.vpc.pivotal.io:4 pid=483012) (plpython.c:4663)
> CONTEXT: Traceback (most recent call last):
>   PL/Python function "internal_compute_kmeanspp_seeding", line 22, in
>     return kmeans.compute_kmeanspp_seeding(**globals())
>   PL/Python function "internal_compute_kmeanspp_seeding", line 154, in compute_kmeanspp_seeding
>   PL/Python function "internal_compute_kmeanspp_seeding", line 415, in update
>   PL/Python function "internal_compute_kmeanspp_seeding"
> SQL statement "SELECT ( SELECT madlib1.internal_compute_kmeanspp_seeding( '_madlib_kmeanspp_args', '_madlib_kmeanspp_state', textin(regclassout( $1 )), $2 ) )"
> PL/pgSQL function "kmeanspp_seeding" line 83 at assignment
> SQL statement "SELECT madlib1.kmeans( $1 , $2 , madlib1.kmeanspp_seeding( $1 , $2 , $3 , $4 , NULL, $5 ), $4 , $6 , $7 , $8 )"
> PL/pgSQL function "kmeanspp" line 4 at assignment
> SQL statement "SELECT madlib1.kmeanspp( $1 , $2 , $3 , 'madlib1.squared_dist_norm2'::VARCHAR, 'madlib1.avg'::VARCHAR, 20::INTEGER, 0.001::DOUBLE PRECISION, 1.0::DOUBLE PRECISION)"
> PL/pgSQL function "kmeanspp" line 4 at assignment
> {code}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MADLIB-1103) Remove pyxb GPL workaround
[ https://issues.apache.org/jira/browse/MADLIB-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119507#comment-16119507 ]

Ed Espino commented on MADLIB-1103:
-----------------------------------

The fix will be included in the PyXB 1.2.6 release, but it is not clear when that release will be available. I suggest we push this to the next release.

> Remove pyxb GPL workaround
> --------------------------
>
> Key: MADLIB-1103
> URL: https://issues.apache.org/jira/browse/MADLIB-1103
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Build System
> Reporter: Roman Shaposhnik
> Priority: Minor
> Fix For: v1.12
>
>
> Upstream pyxb has done the right thing and got rid of GPL code:
> https://github.com/pabigot/pyxb/issues/77
> It would be great to remove the workaround from MADlib.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)