[GitHub] madlib issue #296: DT/RF: Ensure cat features are recorded per group

2018-07-19 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/296 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/577/ ---

[GitHub] madlib pull request #295: Recursive Partitioning: Add function to report imp...

2018-07-19 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/295#discussion_r203811750 --- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.py_in --- @@ -2327,6 +2328,110 @@ def _tree_error(schema_madlib, source_table, depe

[GitHub] madlib pull request #295: Recursive Partitioning: Add function to report imp...

2018-07-19 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/295#discussion_r203816091 --- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.py_in --- @@ -2327,6 +2328,110 @@ def _tree_error(schema_madlib, source_table, depe

[GitHub] madlib pull request #295: Recursive Partitioning: Add function to report imp...

2018-07-19 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/295#discussion_r203812338 --- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.py_in --- @@ -2327,6 +2328,110 @@ def _tree_error(schema_madlib, source_table, depe

[GitHub] madlib pull request #295: Recursive Partitioning: Add function to report imp...

2018-07-19 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/295#discussion_r203811384 --- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.py_in --- @@ -2327,6 +2328,110 @@ def _tree_error(schema_madlib, source_table, depe

[GitHub] madlib issue #296: DT/RF: Ensure cat features are recorded per group

2018-07-19 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/296 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/578/ ---

[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-19 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/291 In cols2vec, for this table: ``` CREATE TABLE golf ( id integer NOT NULL, "OUTLOOK" text, temperature double precision, humidity double precision,

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/295 Thank you for the comments, @iyerr3. I will make the necessary changes. ---

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/295 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/579/ ---

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/295 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/580/ ---

[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-19 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/291 In vec2cols, ``` SELECT madlib.vec2cols( 'golf', -- source table 'vec2cols_result',-- output table 'clouds_airquality

[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-19 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/291 Once the two issues I mentioned above are fixed, I will add one more commit to this PR for the user docs. ---

[GitHub] madlib issue #296: DT/RF: Ensure cat features are recorded per group

2018-07-19 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/296 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/581/ ---

[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-19 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/291 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/582/ ---

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/295 Should impurity_var_importance always add up to 100? From the regression example in the user docs: ``` DROP TABLE IF EXISTS mt_imp_output; SELECT madlib.get_var_importance('mt

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread fmcquillan99
Github user fmcquillan99 commented on the issue: https://github.com/apache/madlib/pull/295 On another run I got ``` grp 0 grp1 31.01364943 31.6576 22.85881741

[GitHub] madlib pull request #291: Feature: Vector to Columns

2018-07-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/291#discussion_r203890181 --- Diff: src/ports/postgres/modules/utilities/transform_vec_cols.py_in --- @@ -0,0 +1,492 @@ +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread iyerr3
Github user iyerr3 commented on the issue: https://github.com/apache/madlib/pull/295 Considering the above situation, I suggest the variable importance values not be scaled to sum to 100. We can make the normalization within `get_var_importance` just for the reporting (which is the be

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread fmcquillan
Github user fmcquillan commented on the issue: https://github.com/apache/madlib/pull/295 Would this apply to oob too? Or just impurity? ---

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/295 @fmcquillan only impurity, I don't think we scale oob to 100. ---
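
The suggestion in this exchange (store the raw impurity importance values and rescale them to sum to 100 only when reporting) can be sketched roughly as follows. This is an illustration only, not MADlib's code: the function name `normalize_for_report` and the sample values are hypothetical, and the zero-sum handling is an assumption.

```python
def normalize_for_report(importances, total=100.0):
    """Return a rescaled copy of `importances` that sums to `total`.

    Hypothetical helper: the stored (raw) values are left untouched,
    and scaling happens only at reporting time.
    """
    raw_sum = sum(importances)
    if raw_sum == 0:
        # Assumption: with no importance recorded at all, report zeros
        # rather than dividing by zero.
        return [0.0 for _ in importances]
    return [total * value / raw_sum for value in importances]


# Made-up raw impurity importance values for three features.
raw = [2.0, 6.0, 12.0]
report = normalize_for_report(raw)
print(report)  # [10.0, 30.0, 60.0]
```

Because scaling is applied to a copy, the raw values stay comparable across groups and runs, while the reported values still read as percentages.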

[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-19 Thread asfgit
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/291 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/583/ ---