[ https://issues.apache.org/jira/browse/MADLIB-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429156#comment-16429156 ]

Nandish Jayaram commented on MADLIB-1225:
-----------------------------------------

The variable importance computation is inherently randomized, so this error is
hard to reproduce consistently. I ran the offending query repeatedly: with the
latest code, it fails about 4.3% of the time (26 failures in 600 runs). To
check whether the failure was introduced by RF/DT changes made after 1.13, I
ran the same experiment against the 1.13 code base. It fails there too, about
4.8% of the time (29 failures in 600 runs).
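
As a rough check on whether those two failure rates actually differ, a pooled
two-proportion z-test can be sketched as follows. This is an illustration
only, not part of the experiment above: the function name is hypothetical, and
the normal approximation to the binomial is an assumption.

```python
import math


def two_proportion_z(f1, n1, f2, n2):
    """Pooled two-proportion z-test (normal approximation, an assumption).

    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = f1 / n1, f2 / n2
    pooled = (f1 + f2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts reported in this comment: latest code vs. the 1.13 code base.
z, p_value = two_proportion_z(26, 600, 29, 600)
print(f"latest: {26 / 600:.1%}, 1.13: {29 / 600:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

With these counts the p-value is well above 0.05, so under this approximation
the two rates are statistically indistinguishable, which is consistent with the
failure predating the post-1.13 changes.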

The failures persist when various hyperparameters are tuned. For instance, I
tried increasing num_permutations and num_trees in the offending query, and
also increasing the number of rows in the training data. The failure still
occurs sporadically in every case.
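
A repeated-run experiment of this kind can be driven by a small counting
harness like the sketch below. It is a hypothetical illustration, not the
script actually used: `count_failures` and `make_stub` are made-up names, and
the stub stands in for whatever would really execute the install-check query
(for example, a wrapper that invokes psql and reports success or failure).

```python
from typing import Callable


def count_failures(run_once: Callable[[], bool], runs: int) -> int:
    """Call run_once `runs` times and count the failed runs.

    run_once should return True on success; a False return value or a
    raised exception is counted as a failure.
    """
    failures = 0
    for _ in range(runs):
        try:
            ok = run_once()
        except Exception:
            ok = False
        if not ok:
            failures += 1
    return failures


def make_stub(fail_every: int) -> Callable[[], bool]:
    """Hypothetical stand-in for the real query: fails every Nth call."""
    calls = {"n": 0}

    def run_once() -> bool:
        calls["n"] += 1
        return calls["n"] % fail_every != 0

    return run_once


runs = 600
failures = count_failures(make_stub(20), runs)
print(f"{failures} failures in {runs} runs ({failures / runs:.1%})")
```

Treating an exception as a failure keeps the loop running across failed
assertions, so one run yields the full failure count.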

> Sporadic install check failures in random forest
> ------------------------------------------------
>
>                 Key: MADLIB-1225
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1225
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Random Forest
>            Reporter: Nandish Jayaram
>            Priority: Major
>             Fix For: v1.14
>
>
> The install check for random forest fails sporadically. The failing test is
> the one that covers variable importance.
> The error in the log when a failure happens is:
> {code}
> SELECT
>  assert(cat_var_importance[1] > con_var_importance[1], 'class should be important!'),
>  assert(cat_var_importance[1] > cat_var_importance[2], 'class should be important!')
> FROM train_output_group;
> psql:/tmp/madlib.WW_EyD/recursive_partitioning/test/random_forest.sql_in.tmp:158: ERROR: Failed assertion: class should be important! (seg0 slice1 93e250c8-8924-4a80-5c68-1464f40b0395:25432 pid=91044)
> {code}
> The last RF install-check query that was run before the error was:
> {code}
> SELECT forest_train(
>  'dt_golf', -- source table
>  'train_output', -- output model table
>  'id', -- id column
>  'class::TEXT', -- response
>  'class, windy, temperature', -- features
>  NULL, -- exclude columns
>  NULL, -- no grouping
>  10, -- num of trees
>  1, -- num of random features
>  TRUE, -- importance
>  3, -- num_permutations
>  10, -- max depth
>  1, -- min split
>  1, -- min bucket
>  8, -- number of bins per continuous variable
>  'max_surrogates=0',
>  FALSE
>  );
> SELECT * from train_output_summary;
> -[ RECORD 1 ]---------+--------------------------------
> method | forest_train
> is_classification | t
> source_table | dt_golf
> model_table | train_output
> id_col_name | id
> dependent_varname | class::TEXT
> independent_varnames | class,windy,temperature
> cat_features | class,windy
> con_features | temperature
> grouping_cols |
> num_trees | 10
> num_random_features | 1
> max_tree_depth | 10
> min_split | 1
> min_bucket | 1
> num_splits | 8
> verbose | f
> importance | t
> num_permutations | 3
> num_all_groups | 1
> num_failed_groups | 0
> total_rows_processed | 16
> total_rows_skipped | 0
> dependent_var_levels | "Don't Play","Play"
> dependent_var_type | text
> independent_var_types | text, boolean, double precision
> null_proxy | None
> SELECT * from train_output_group;
> -[ RECORD 1 ]------+---------------------------------------
> gid | 1
> success | t
> cat_n_levels | {2,2}
> cat_levels_in_text | {"Don't Play",Play,False,True}
> oob_error | 0.20000000000000000000
> cat_var_importance | {0.0244444444444445,0.025487012987013}
> con_var_importance | {0}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
