Frank McQuillan created MADLIB-1446:
---------------------------------------
Summary: DL: Hyperband phase 2 - generate MST table
Key: MADLIB-1446
URL: https://issues.apache.org/jira/browse/MADLIB-1446
Project: Apache MADlib
Issue Type: New Feature
Components: Deep Learning
Reporter: Frank McQuillan
Fix For: v1.18.0
Python code that does some version of this is in
https://github.com/apache/madlib-site/blob/asf-site/community-artifacts/Deep-learning/automl/hyperband-diag-cifar10-v1.ipynb
in the methods `setup_full_schedule()` and `create_mst_superset()`.
Combine it with the random search function from
https://www.pivotaltracker.com/story/show/173692930
**Story**
Generate the MST table and validate the input params (to the extent
possible without implementing the whole method). This story does not
implement the whole hyperband method. The proposed interface:
{code}
madlib_keras_automl(
    source_table,               -- input
    model_output_table,         -- output
    model_selection_table,      -- output
    model_arch_table,           -- input
    model_id_list,
    compile_params_grid,
    fit_params_grid,
    automl_method,              -- new params vvv
    automl_params,
    random_state,               -- optional -- from generate model configs vvv
    object_table,               -- optional
    use_gpus,                   -- optional -- from fit multiple vvv
    validation_table,           -- optional
    metrics_compute_frequency,  -- optional
    name,                       -- optional
    description                 -- optional
)
{code}
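A minimal sketch of the kind of input validation this story asks for, on the two new params only. Everything here is an assumption for illustration: the helper name `validate_automl_args`, passing `automl_params` as a Python dict, and requiring `eta` to be an integer >= 2. `skip_last` is left unchecked because its type is not pinned down in this issue.

```python
# Assumption: 'hyperband' is the only method named in this story.
SUPPORTED_METHODS = {"hyperband"}


def validate_automl_args(automl_method, automl_params):
    """Raise ValueError on obviously bad input; return (method, params copy)."""
    method = automl_method.strip().lower()
    if method not in SUPPORTED_METHODS:
        raise ValueError("unsupported automl_method: %r" % automl_method)

    # Assumption: hyperband needs at least R and eta.
    for key in ("R", "eta"):
        if key not in automl_params:
            raise ValueError("automl_params is missing %r" % key)

    R, eta = automl_params["R"], automl_params["eta"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(R, int) or isinstance(R, bool) or R < 1:
        raise ValueError("R must be a positive integer")
    if not isinstance(eta, int) or isinstance(eta, bool) or eta < 2:
        raise ValueError("eta must be an integer >= 2")
    return method, dict(automl_params)


print(validate_automl_args("Hyperband", {"R": 81, "eta": 3, "skip_last": 1}))
```

The real function would also need to validate the table names and grids the same way the existing model selection functions do; that part is not sketched here.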
Here are the output tables:
(1)
<model_output_table>
Same as model output table in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows
(2)
<model_output_table>_summary
Same as model output table summary in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
It will have 1 row. Add the following columns at the end, i.e., right side of
the table:
{code}
use_gpus             BOOLEAN  e.g., TRUE  -- this is missing from summary table from before
automl_method        TEXT     e.g., 'hyperband'
automl_params_names  TEXT[]   e.g., {'R', 'eta', 'skip_last'}
automl_params_vals   TEXT[]   e.g., {'81', '3', 'TRUE'}  -- note this needs to be text array since mixed types of autoML params
{code}
(3)
<model_output_table>_info
Same as model output table info in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows. Add the following
columns at the end, i.e., right side of the table:
{code}
s  INTEGER  "Bracket number" e.g., 4
i  INTEGER  "Depth in bracket model trained to" e.g., 3
{code}
(4)
<model_selection_table>
Same as model selection table in
https://madlib.apache.org/docs/latest/group__grp__keras__setup__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows
(5)
<model_selection_table>_summary
Same as model selection table summary in
https://madlib.apache.org/docs/latest/group__grp__keras__setup__model__selection.html
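The 81+27+9+6+5 row counts quoted for the output tables can be reproduced with a short sketch. It assumes the bracket-size rule n = floor((s_max + 1) / (s + 1)) * eta^s, which is the variant that yields those counts; the function name `bracket_sizes` is hypothetical and not part of MADlib.

```python
def bracket_sizes(R, eta):
    """Return {s: n}: number of initial configurations in each bracket s.

    Assumption: n = floor((s_max + 1) / (s + 1)) * eta**s, where s_max is
    the largest integer with eta**s_max <= R. Integer arithmetic is used
    throughout to avoid floating-point error in the logarithm.
    """
    if R < 1 or eta < 2:
        raise ValueError("expected R >= 1 and eta >= 2")
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    return {s: ((s_max + 1) // (s + 1)) * eta ** s
            for s in range(s_max, -1, -1)}


sizes = bracket_sizes(81, 3)
print(sizes)                 # {4: 81, 3: 27, 2: 9, 1: 6, 0: 5}
print(sum(sizes.values()))   # 128
```

So for R=81 and eta=3 the MST table would be expected to hold 128 rows in total under this rule.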
**Acceptance**
1) For `R=81, eta=3` check that it creates the correct MST tables
<model_selection_table> and <model_selection_table>_summary
2) Set `skip_last=1` and check that it creates the correct MST tables
3) Try multiple other values to see if it produces the correct schedule
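When checking a schedule by hand, it may help to list the rungs inside one bracket. The sketch below assumes standard successive halving, n_i = floor(n / eta^i) configurations kept at depth i, each given r_i = R * eta^i / eta^s units of resource. The helper name `bracket_rungs` is hypothetical, and how `skip_last` trims this schedule is not modeled.

```python
def bracket_rungs(n, s, R, eta):
    """List (i, n_i, r_i) for one bracket.

    i   -- depth within the bracket (0 .. s)
    n_i -- configurations still being trained at depth i
    r_i -- resource (e.g., iterations) given to each of them at depth i
    Assumption: plain successive halving with integer n, s, R and eta.
    """
    return [(i, n // eta ** i, R * eta ** i / eta ** s)
            for i in range(s + 1)]


for i, n_i, r_i in bracket_rungs(n=81, s=4, R=81, eta=3):
    print(i, n_i, r_i)
```

For bracket s=4 with 81 initial configurations this walks 81, 27, 9, 3, 1 configurations at depths 0 through 4, with the per-configuration resource growing from 1 to 81.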