[
https://issues.apache.org/jira/browse/MADLIB-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Frank McQuillan updated MADLIB-1322:
------------------------------------
Priority: Minor (was: Major)
> MLP with minibatch fails for integer dependent variable
> -------------------------------------------------------
>
> Key: MADLIB-1322
> URL: https://issues.apache.org/jira/browse/MADLIB-1322
> Project: Apache MADlib
> Issue Type: Bug
> Components: Module: Neural Networks
> Reporter: Frank McQuillan
> Priority: Minor
> Fix For: v1.16
>
>
> (1)
> If I have an integer dependent variable and I mini-batch:
> {code}
> select madlib.minibatch_preprocessor(
> 'classification_train', -- input table
> 'mini_batch_packed_train', -- output table
> 'response', -- response INTEGER
> 'feature_vector', -- indep vars
> NULL, -- grouping
> NULL, -- buffer size (or size of the mini-batch)
> TRUE -- Encode scalar int dependent variable (if response is integer instead of boolean or char)
> );
> {code}
> Then the table looks like:
> {code}
> madlib=# \d+ mini_batch_packed_train_summary
> Table "public.mini_batch_packed_train_summary"
> Column | Type | Modifiers | Storage | Stats target | Description
> --------------------------+-----------+-----------+----------+--------------+-------------
> source_table | text | | extended | |
> output_table | text | | extended | |
> dependent_varname | text | | extended | |
> independent_varname | text | | extended | |
> dependent_vartype | text | | extended | |
> buffer_size | integer | | plain | |
> class_values | integer[] | | extended | |
> num_rows_processed | integer | | plain | |
> num_missing_rows_skipped | integer | | plain | |
> grouping_cols | text | | extended | |
> {code}
> Then MLP classification fails with:
> {code}
> InternalError: (psycopg2.InternalError) TypeError: must be string, not int
> CONTEXT: Traceback (most recent call last):
> PL/Python function "mlp_classification", line 33, in <module>
> grouping_col)
> PL/Python function "mlp_classification", line 42, in wrapper
> PL/Python function "mlp_classification", line 147, in mlp
> PL/Python function "mlp_classification", line 74, in quote_literal
> {code}
> (2)
> If I cast to text explicitly:
> {code}
> select madlib.minibatch_preprocessor(
> 'classification_train', -- input table
> 'mini_batch_packed_train', -- output table
> 'response::TEXT', -- response
> 'feature_vector', -- indep vars
> NULL, -- grouping
> NULL, -- buffer size (or size of the mini-batch)
> TRUE -- Encode scalar int dependent variable (if response is integer instead of boolean or char)
> );
> {code}
> The table looks like:
> {code}
> madlib=# \d+ mini_batch_packed_train_summary
> Table "public.mini_batch_packed_train_summary"
> Column | Type | Modifiers | Storage | Stats target | Description
> --------------------------+---------+-----------+----------+--------------+-------------
> source_table | text | | extended | |
> output_table | text | | extended | |
> dependent_varname | text | | extended | |
> independent_varname | text | | extended | |
> dependent_vartype | text | | extended | |
> buffer_size | integer | | plain | |
> class_values | text[] | | extended | |
> num_rows_processed | integer | | plain | |
> num_missing_rows_skipped | integer | | plain | |
> grouping_cols | text | | extended | |
> {code}
> And MLP training works OK.
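
The difference between cases (1) and (2) can be illustrated with a minimal Python sketch. It assumes, based only on the traceback above, that a string-only quoting helper is applied to each entry of class_values. The `quote_literal` defined here is a hypothetical stand-in written for illustration, not MADlib's or PL/Python's actual implementation, and `quote_class_values` is likewise a made-up name showing one possible way to avoid the error.

```python
def quote_literal(value):
    """Hypothetical stand-in for a quoting helper that accepts only strings."""
    if not isinstance(value, str):
        raise TypeError("must be string, not %s" % type(value).__name__)
    # Double any embedded single quotes and wrap the value in single quotes.
    return "'" + value.replace("'", "''") + "'"


def quote_class_values(class_values):
    """Assumed mitigation: convert each class label to text before quoting."""
    return [quote_literal(str(v)) for v in class_values]


int_classes = [0, 1, 2]         # like class_values integer[] in case (1)
text_classes = ["0", "1", "2"]  # like class_values text[] in case (2)

# Case (1): integer labels reach the string-only helper and raise TypeError.
error_message = None
try:
    [quote_literal(v) for v in int_classes]
except TypeError as exc:
    error_message = str(exc)

# Case (2): text labels are quoted without error.
quoted_text = [quote_literal(v) for v in text_classes]

# Converting integer labels to text first gives the same result as case (2).
quoted_cast = quote_class_values(int_classes)
```

Under these assumptions, the explicit `response::TEXT` cast in (2) works because every class value is already a string by the time it is quoted.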