[
https://issues.apache.org/jira/browse/MADLIB-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Orhan Kislal updated MADLIB-1255:
---------------------------------
Description:
With [~njayaram]: for the Boston housing dataset (duplicated to test multiple
groups), the following query produces NaN for the loss. If NaN is an acceptable
result, we should provide a more user-friendly way of erroring out during
prediction.
{code:sql}
SELECT setseed(0);
DROP TABLE IF EXISTS temp3;
DROP TABLE IF EXISTS temp3_summary;
DROP TABLE IF EXISTS temp3_standardization;
SELECT madlib.mlp_regression(
    'madlibtestdata.boston_grouping'::varchar,  -- source table
    'temp3'::varchar,                           -- output table
    'ARRAY[crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio,
    b, lstat]'::varchar,                        -- independent variables
    'medv'::varchar,                            -- dependent variable
    ARRAY[100]::integer[],                      -- hidden layer sizes
    'learning_rate_init=0.0025, lambda=0.00001,
    learning_rate_policy=step, gamma=0.8, iterations_per_step=250,
    n_iterations=1500, tolerance=0, momentum=0'::varchar,  -- optimizer params
    'tanh'::varchar,                            -- activation
    NULL,                                       -- weights
    False,                                      -- warm_start
    False,                                      -- verbose
    'grp_col'                                   -- grouping column
);
-- The loss for this group comes back as NaN.
SELECT loss FROM temp3 WHERE grp_col=2;
{code}
Dataset: [https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html]
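One way to surface the condition before prediction is to check the model table for NaN losses. The sketch below is only an illustration, not part of MADlib: it assumes the output table temp3 from the query above, with its grp_col and loss columns, and relies on PostgreSQL treating NaN as equal to itself in floating-point comparisons.
{code:sql}
-- List the groups whose final training loss is NaN.
-- Assumes the output table temp3 and grouping column grp_col from the query above.
SELECT grp_col, loss
FROM temp3
WHERE loss = 'NaN'::double precision;
{code}
Any rows returned identify groups whose models should not be passed to prediction as they are.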
was:
w/ [~njayaram] For the boston dataset (duplicated for multiple groups testing)
the following query produces NaN for the loss.
{code:java}
SELECT setseed(0);
DROP TABLE IF EXISTS temp3;
DROP TABLE IF EXISTS temp3_summary;
DROP TABLE IF EXISTS temp3_standardization;
SELECT madlib.mlp_regression(
'madlibtestdata.boston_grouping'::varchar,
'temp3'::varchar,
'ARRAY[crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio,
b, lstat]'::varchar,
'medv'::varchar,
ARRAY[100]::integer[],
'learning_rate_init=0.0025, lambda=0.00001,
learning_rate_policy=step, gamma=0.8, iterations_per_step=250,
n_iterations=1500, tolerance=0, momentum=0'::varchar,
'tanh'::varchar,
NULL,
False,
False,
'grp_col'
);
SELECT loss FROM temp3 WHERE grp_col=2;
{code}
Dataset: [https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html]
> MLP: NaN loss for some hyperparam settings
> ------------------------------------------
>
> Key: MADLIB-1255
> URL: https://issues.apache.org/jira/browse/MADLIB-1255
> Project: Apache MADlib
> Issue Type: Bug
> Components: Module: Neural Networks
> Reporter: Orhan Kislal
> Priority: Major