[jira] [Commented] (MADLIB-1086) Unnest 2-D array by one level (i.e. into rows of 1-D arrays)
[ https://issues.apache.org/jira/browse/MADLIB-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971914#comment-15971914 ]

ASF GitHub Bot commented on MADLIB-1086:

GitHub user rashmi815 opened a pull request:

    https://github.com/apache/incubator-madlib/pull/116

    Unnest 2d array

    Array Operations: Add function to unnest 2-D arrays into rows of 1-D arrays

    JIRA: MADLIB-1086

    Function to unnest 2-D array by one level (i.e. into rows of 1-D arrays).
    This is needed, for instance, in K-means, so that we can get one centroid
    per row for follow on operations.

    - Added function to array operations
    - Added an example in k-means to demonstrate usage

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rashmi815/incubator-madlib unnest_2d_array

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #116

commit 18e562813702d12d620594598f471161a990fbbd
Author: Rashmi Raghu
Date:   2017-04-15T00:08:17Z

    Unnest function, install-check tests completed. Initial docs included

commit 2a4baffa29c8f976d3260931c1790cfc125e91f4
Author: Rashmi Raghu
Date:   2017-04-15T06:20:01Z

    Refactored names of function output columns

commit a3eae964adc84382fa674e4d95c486f472b14099
Author: Rashmi Raghu
Date:   2017-04-17T23:45:32Z

    Updated docs (array_ops and k-means) and minor update to install-check tests

> Unnest 2-D array by one level (i.e. into rows of 1-D arrays)
>
>                 Key: MADLIB-1086
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1086
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Utilities
>            Reporter: Frank McQuillan
>            Assignee: Rashmi Raghu
>            Priority: Minor
>             Fix For: v1.11
>
> Context
> Currently k-means returns the following
> {code}
> centroids        | {{13.75333,1.905,2.425,16.06667,90.3,2.805,2.98,0.29,2.005,5.406633,1.041667,3.318333,1020.833},
>                     {14.255,1.9325,2.5025,16.05,110.5,3.055,2.9775,0.2975,1.845,6.2125,0.9975,3.365,1378.75}}
> cluster_variance | {122999.110416013,30561.74805}
> objective_fn     | 153560.858466013
> frac_reassigned  | 0
> num_iterations   | 3
> {code}
> Story
> As a data scientist, I want to unnest 2-D array by one level (i.e. into rows
> of 1-D arrays) in K-means, so that I can get one centroid per row for follow
> on operations.
> Acceptance
> 1) Add function to array operations
> http://madlib.incubator.apache.org/docs/latest/group__grp__array.html
> 2) Add an example in k-means
> http://madlib.incubator.apache.org/docs/latest/group__grp__kmeans.html
> to demonstrate usage

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
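To make the requested behaviour concrete, here is a minimal Python sketch of what "unnest by one level" means for the centroids array quoted in the ticket. It is only an illustration of the idea, not the code in the pull request: the function name `unnest_one_level` and the `(row_id, row)` output shape are assumptions made for this example.

```python
from typing import Iterator, List, Sequence, Tuple


def unnest_one_level(matrix: Sequence[Sequence[float]]) -> Iterator[Tuple[int, List[float]]]:
    """Yield one (row_id, 1-D array) pair per inner array of a 2-D array.

    Hypothetical helper: the name and the 1-based row_id are assumptions
    for illustration, not the signature added by the pull request.
    """
    width = None
    for row_id, row in enumerate(matrix, start=1):
        if width is None:
            width = len(row)
        elif len(row) != width:
            # SQL 2-D arrays are rectangular, so reject ragged input.
            raise ValueError("all inner arrays must have the same length")
        yield row_id, list(row)


# The two centroids from the k-means output quoted in the ticket.
centroids = [
    [13.75333, 1.905, 2.425, 16.06667, 90.3, 2.805, 2.98, 0.29, 2.005,
     5.406633, 1.041667, 3.318333, 1020.833],
    [14.255, 1.9325, 2.5025, 16.05, 110.5, 3.055, 2.9775, 0.2975, 1.845,
     6.2125, 0.9975, 3.365, 1378.75],
]

# One centroid per "row", ready for follow-on operations.
for row_id, centroid in unnest_one_level(centroids):
    print(row_id, centroid)
```

Note that a plain element-wise unnest would instead flatten the 2-D array into 26 scalar rows, which is why a one-level variant is requested.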
[jira] [Updated] (MADLIB-1089) Install check errors on HAWQ 2.2 when install MADlib on non-default schema
[ https://issues.apache.org/jira/browse/MADLIB-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan updated MADLIB-1089:

    Fix Version/s:     (was: v1.11)
                   v1.12

> Install check errors on HAWQ 2.2 when install MADlib on non-default schema
> --------------------------------------------------------------------------
>
>                 Key: MADLIB-1089
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1089
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: All Modules
>            Reporter: Frank McQuillan
>            Priority: Minor
>             Fix For: v1.12
>
>         Attachments: k-means-IC-fail-on-hawq-2dot2, linalg-IC-fail-on-hawq-2dot2
>
> Running install-check on a non-default schema in HAWQ 2.2 results in errors
> for linalg and k-means.
> {code}
> MADlib version: 1.10.0, git revision: rel/v1.9.1-58-ga3863b6, cmake configuration time: Wed Mar 8 19:49:45 UTC 2017, build type: Release, build system: Linux-2.6.18-238.27.1.el5.hotfix.bz516490, C compiler: gcc 4.4.0, C++ compiler: g++ 4.4.0
> PostgreSQL 8.2.15 (Greenplum Database 4.2.0 build 1) (HAWQ 2.2.0.0 build 4141) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11) compiled on Mar 30 2017 21:45:26
> {code}
> See attached log files and summaries below:
> linalg.sql_in
> {code}
> psql:/tmp/madlib.sGu72l/linalg/test/linalg.sql_in.tmp:165: ERROR: Function "closest_column(double precision[],double precision[],text)": Invalid distance metric provided: madlib1.squared_dist_norm2. Currently only madlib provided distance functions are supported.
> {code}
> kmeans.sql_in
> {code}
> psql:/tmp/madlib.sGu72l/kmeans/test/kmeans.sql_in.tmp:117: ERROR: plpy.SPIError: Function "closest_column(double precision[],double precision[],text)": Invalid distance metric provided: madlib1.squared_dist_norm2. Currently only madlib provided distance functions are supported. (seg1 ip-10-32-127-188.ore6.vpc.pivotal.io:4 pid=483012) (plpython.c:4663)
> CONTEXT: Traceback (most recent call last):
>   PL/Python function "internal_compute_kmeanspp_seeding", line 22, in
>     return kmeans.compute_kmeanspp_seeding(**globals())
>   PL/Python function "internal_compute_kmeanspp_seeding", line 154, in compute_kmeanspp_seeding
>   PL/Python function "internal_compute_kmeanspp_seeding", line 415, in update
> PL/Python function "internal_compute_kmeanspp_seeding"
> SQL statement "SELECT ( SELECT madlib1.internal_compute_kmeanspp_seeding( '_madlib_kmeanspp_args', '_madlib_kmeanspp_state', textin(regclassout( $1 )), $2 ) )"
> PL/pgSQL function "kmeanspp_seeding" line 83 at assignment
> SQL statement "SELECT madlib1.kmeans( $1 , $2 , madlib1.kmeanspp_seeding( $1 , $2 , $3 , $4 , NULL, $5 ), $4 , $6 , $7 , $8 )"
> PL/pgSQL function "kmeanspp" line 4 at assignment
> SQL statement "SELECT madlib1.kmeanspp( $1 , $2 , $3 , 'madlib1.squared_dist_norm2'::VARCHAR, 'madlib1.avg'::VARCHAR, 20::INTEGER, 0.001::DOUBLE PRECISION, 1.0::DOUBLE PRECISION)"
> PL/pgSQL function "kmeanspp" line 4 at assignment
> {code}
[jira] [Assigned] (MADLIB-1077) Double check binary distribution
[ https://issues.apache.org/jira/browse/MADLIB-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan reassigned MADLIB-1077:
--------------------------------------

    Assignee: Roman Shaposhnik  (was: Frank McQuillan)

> Double check binary distribution
>
>                 Key: MADLIB-1077
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1077
>             Project: Apache MADlib
>          Issue Type: Task
>          Components: All Modules
>            Reporter: Frank McQuillan
>            Assignee: Roman Shaposhnik
>            Priority: Minor
>             Fix For: v1.11
>
> Double check that binary distribution licensing issues are all OK.
> For example, see comments from Ed Espino on 1.10 RC-2 review on thread
> https://mail-archives.apache.org/mod_mbox/incubator-madlib-user/201703.mbox/%3CCAHAuQDzarS7K4u-rOsLLhbwSHCyFn5cKSyjLinE%2BZ%3DjSpU59qw%40mail.gmail.com%3E
> {code}
> I was performing the build from a simple perspective. Download
> source, configure, make and glance at docs (in this order).
> As we have dealt with auto-downloaded files in the HAWQ project, I
> was surprised that the following packages were automatically
> downloaded for me. On the HAWQ project we were instructed to require
> these as pre-requisites and/or make them optional, included via
> command line options (configure). I'm guessing other packages would
> have been automatically downloaded if they were not found on system
> (eg: boost).
> Automatically downloaded packages:
> https://github.com/madlib/eigen/archive/branches/3.2.tar.gz
> http://sourceforge.net/projects/pyxb/files/PyXB-1.2.4.tar.gz
>
> Issue: As "make" was running, the following message was a bit alarming:
>     PyXB: Removing GPL component from code base
>
> This comes from the script src/patch/PyXB.sh run after PyXB source
> is downloaded.
>
> ...
> echo "PyXB: Removing GPL component from code base"
> rm -f doc/extapi.py
> rm -f doc/extapi.pyc
> {code}
[jira] [Commented] (MADLIB-1081) Graph - add grouping to shortest path
[ https://issues.apache.org/jira/browse/MADLIB-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971444#comment-15971444 ]

ASF GitHub Bot commented on MADLIB-1081:

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-madlib/pull/113

> Graph - add grouping to shortest path
> -------------------------------------
>
>                 Key: MADLIB-1081
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1081
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Module: Graph
>            Reporter: Frank McQuillan
>            Assignee: Orhan Kislal
>            Priority: Minor
>             Fix For: v1.11
>
> * Add a GROUP BY column to the edge table
> * Motivation: run SSSP separately on the different server graphs defined for
>   users, i.e., group by userID
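The grouping request in this ticket can be pictured with a small Python sketch: partition the edge list by the grouping column, then run single-source shortest path (SSSP) independently inside each partition. Everything here is an assumption for illustration: the `grouped_sssp` name, the `(group, src, dest, weight)` tuple layout, and the use of Dijkstra's algorithm (chosen for brevity, which requires non-negative weights). It says nothing about how the MADlib graph module implements the feature.

```python
import heapq
import itertools
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

# Hypothetical edge layout: (group, src, dest, weight).
Edge = Tuple[Hashable, Hashable, Hashable, float]


def grouped_sssp(edges: Iterable[Edge], source: Hashable) -> Dict[Hashable, Dict[Hashable, float]]:
    """Return {group: {vertex: distance from source}} with one SSSP run per group."""
    # Partition the edge list by the grouping column (e.g. userID).
    graphs = defaultdict(lambda: defaultdict(list))
    for group, src, dest, weight in edges:
        if weight < 0:
            raise ValueError("this sketch uses Dijkstra and needs non-negative weights")
        graphs[group][src].append((dest, weight))

    result = {}
    for group, adjacency in graphs.items():
        dist = {source: 0.0}
        tie = itertools.count()  # tie-breaker so vertices are never compared
        heap = [(0.0, next(tie), source)]
        while heap:
            d, _, vertex = heapq.heappop(heap)
            if d > dist.get(vertex, float("inf")):
                continue  # stale heap entry, a shorter path was already found
            for neighbor, weight in adjacency.get(vertex, ()):
                candidate = d + weight
                if candidate < dist.get(neighbor, float("inf")):
                    dist[neighbor] = candidate
                    heapq.heappush(heap, (candidate, next(tie), neighbor))
        result[group] = dist
    return result


# Two users, each with their own graph over the same vertex names.
edges = [
    ("user1", "a", "b", 1.0), ("user1", "b", "c", 2.0), ("user1", "a", "c", 5.0),
    ("user2", "a", "c", 1.0), ("user2", "c", "b", 1.0),
]
print(grouped_sssp(edges, "a"))
```

The point of the grouping column is visible in the example: vertex "b" is 1.0 away from "a" in user1's graph but 2.0 away in user2's, so the two graphs must not be merged before the search.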
[jira] [Closed] (MADLIB-1081) Graph - add grouping to shortest path
[ https://issues.apache.org/jira/browse/MADLIB-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan closed MADLIB-1081.
----------------------------------

> Graph - add grouping to shortest path
> -------------------------------------
>
>                 Key: MADLIB-1081
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1081
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Module: Graph
>            Reporter: Frank McQuillan
>            Assignee: Orhan Kislal
>            Priority: Minor
>             Fix For: v1.11
>
> * Add a GROUP BY column to the edge table
> * Motivation: run SSSP separately on the different server graphs defined for
>   users, i.e., group by userID
[jira] [Resolved] (MADLIB-1081) Graph - add grouping to shortest path
[ https://issues.apache.org/jira/browse/MADLIB-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan resolved MADLIB-1081.
------------------------------------
    Resolution: Fixed

> Graph - add grouping to shortest path
> -------------------------------------
>
>                 Key: MADLIB-1081
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1081
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Module: Graph
>            Reporter: Frank McQuillan
>            Assignee: Orhan Kislal
>            Priority: Minor
>             Fix For: v1.11
>
> * Add a GROUP BY column to the edge table
> * Motivation: run SSSP separately on the different server graphs defined for
>   users, i.e., group by userID