[GitHub] madlib pull request #276: Feature/dev check

2018-06-13 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request:

https://github.com/apache/madlib/pull/276#discussion_r195278229
  
--- Diff: src/madpack/madpack.py ---
@@ -898,13 +900,19 @@ def run_install_check(args, testcase):
   % (test_user, test_schema, schema)
 
 # Loop through all test SQL files for this module
-sql_files = maddir_mod_sql + '/' + module + '/test/*.sql_in'
+if is_install_check:
+    sql_files = maddir_mod_sql + '/' + module + '/test/*.ic.sql_in'
+else:
+    sql_files = maddir_mod_sql + '/' + module + '/test/*.sql_in'
 for sqlfile in sorted(glob.glob(sql_files), reverse=True):
     algoname = os.path.basename(sqlfile).split('.')[0]
     # run only algo specified
     if (module in modset and modset[module] and
             algoname not in modset[module]):
         continue
+    # Do not run test/*.ic.sql_in files for dev-check.
--- End diff --

I believe if you use `*[!ic].sql_in` in the dev-check glob expression, it 
will exclude the ic files. You won't need these lines if that works. 
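
One caveat worth checking before relying on that pattern: in Python's `glob`/`fnmatch` syntax, `[!ic]` is a character class that rejects a single character (`i` or `c`) immediately before `.sql_in`; it does not negate the string `ic`. The sketch below uses hypothetical file names (only the suffixes mirror the diff) to show that the pattern does drop `*.ic.sql_in` files, but would also drop any test file whose name happens to end in `i` or `c`. An explicit suffix check is shown as one possible alternative, not as what the PR does.

```python
import fnmatch

# Hypothetical test file names; only the suffixes mirror the diff above.
names = [
    "linear.sql_in",
    "linear.ic.sql_in",
    "logistic.sql_in",   # stem ends in 'c'
    "kmeans.sql_in",
]

pattern = "*[!ic].sql_in"

# fnmatchcase applies the same wildcard rules that glob.glob uses,
# without platform-specific case folding.
matched = [n for n in names if fnmatch.fnmatchcase(n, pattern)]

# An explicit suffix check avoids the character-class side effect.
dev_check = [
    n for n in names
    if n.endswith(".sql_in") and not n.endswith(".ic.sql_in")
]

print(matched)
print(dev_check)
```

If no module's test files end in `i` or `c`, the two approaches select the same files; the explicit filter simply does not depend on that.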


---


[GitHub] madlib issue #278: Upgrade: add changelist file and bug fixes and changelist...

2018-06-13 Thread kaknikhil
Github user kaknikhil commented on the issue:

https://github.com/apache/madlib/pull/278
  
@iyerr3 yes, we will have to rename it. 


---


[GitHub] madlib pull request #273: Minibatch Preprocessing: fix dependent var with sp...

2018-06-13 Thread jingyimei
Github user jingyimei closed the pull request at:

https://github.com/apache/madlib/pull/273


---


[GitHub] madlib pull request #277: DT: Add impurity importance metric

2018-06-13 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request:

https://github.com/apache/madlib/pull/277#discussion_r195277278
  
--- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.py_in ---
@@ -1097,28 +1121,21 @@ def _one_step(schema_madlib, training_table_name, cat_features,
  "$3", "$2",
  null_proxy)
 
-# The arguments of the aggregate (in the same order):
-# 1. current tree state, madlib.bytea8
-# 2. categorical features (integer format) in a single array
-# 3. continuous features in a single array
-# 4. weight value
-# 5. categorical sorted levels (integer format) in a combined array
-# 6. continuous splits
-# 7. number of dependent levels
 train_sql = """
 SELECT (result).* from (
 SELECT
-{schema_madlib}._dt_apply($1,
+{schema_madlib}._dt_apply(
+$1,
 {schema_madlib}._compute_leaf_stats(
-$1,
-{cat_features_str},
-{con_features_str},
+$1,  -- current tree state, madlib.bytea8
+{cat_features_str},  -- categorical features in an array
+{con_features_str},  -- continuous features in an array
 {dep_var},
-{weights},
-$2,
-$4,
-{dep_n_levels}::smallint,
-{subsample}::boolean
+{weights},   -- weight value
+$2,  -- categorical sorted levels in a combined array
+$4,  -- continuous splits
+{dep_n_levels}::smallint, -- number of dependent levels
+{subsample}::boolean  -- should we use a subsample of data
--- End diff --

The `$3` is part of the `cat_features_str`. I can put in a comment to that 
effect over here. 
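
To illustrate the point, here is a minimal sketch of how a positional parameter can be absent from the SQL template yet present in the final statement. The helper name `hypothetical_map_levels` and the simplified template are assumptions made for illustration; the diff only shows that `"$3"` is passed as an argument when the categorical-features string is built.

```python
import re

# Hypothetical stand-in for the string built earlier in _one_step; the real
# expression is not shown in the diff, only that it receives "$3".
cat_features_str = "hypothetical_map_levels($3, $2)"

# Simplified template: $3 does not appear literally here.
template = (
    "SELECT {schema_madlib}._compute_leaf_stats("
    "$1, {cat_features_str}, $2, $4)"
)

sql = template.format(schema_madlib="madlib",
                      cat_features_str=cat_features_str)

# Collect the distinct positional parameters before and after substitution.
placeholders_in_template = sorted(set(re.findall(r"\$\d+", template)))
placeholders_in_sql = sorted(set(re.findall(r"\$\d+", sql)))

print(placeholders_in_template)
print(placeholders_in_sql)
```

This is why a reader scanning only the template may think `$3` is unused, and why a comment next to `{cat_features_str}` would help.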


---


[GitHub] madlib issue #273: Minibatch Preprocessing: fix dependent var with special c...

2018-06-13 Thread jingyimei
Github user jingyimei commented on the issue:

https://github.com/apache/madlib/pull/273
  
Will handle those in another PR, closing this one.


---


[GitHub] madlib issue #274: Handling special characters in MLP and Encode Categorical...

2018-06-13 Thread jingyimei
Github user jingyimei commented on the issue:

https://github.com/apache/madlib/pull/274
  
Will handle those in another PR, closing this one.


---


[GitHub] madlib pull request #274: Handling special characters in MLP and Encode Cate...

2018-06-13 Thread jingyimei
Github user jingyimei closed the pull request at:

https://github.com/apache/madlib/pull/274


---


[GitHub] madlib issue #273: Minibatch Preprocessing: fix dependent var with special c...

2018-06-13 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/273
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/514/



---


[GitHub] madlib issue #273: Minibatch Preprocessing: fix dependent var with special c...

2018-06-13 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/273
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/513/



---


[GitHub] madlib issue #277: DT: Add impurity importance metric

2018-06-13 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/277
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/512/



---


[GitHub] madlib issue #277: DT: Add impurity importance metric

2018-06-13 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/277
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/511/



---


[GitHub] madlib issue #272: MLP: Add momentum and nesterov to gradient updates.

2018-06-13 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/272
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/510/



---