[ 
https://issues.apache.org/jira/browse/MADLIB-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank McQuillan updated MADLIB-1364:
------------------------------------
    Description: 
(1)
Confusing error message when the source table has not been preprocessed

{code}
SELECT madlib.madlib_keras_fit('train_lt5',           -- source table (NOT PREPROCESSED)
                               'mnist_model',         -- model output table
                               'model_arch_library',  -- model arch table
                                1,                    -- model arch id
                                $$ loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy']$$,  -- compile_params
                                $$ batch_size=batch_size, epochs=1 $$,  -- fit_params
                                5,                    -- num_iterations
                                0,                    -- gpus_per_host
                                'test_lt5_packed',           -- validation table
                                1                     -- metrics_compute_frequency
                              );

InternalError: (psycopg2.InternalError) plpy.Error: madlib_keras_fit error: 
Input table 'train_lt5_summary' does not exist (plpython.c:5038)
{code}

A better message would be:
{code}
InternalError: (psycopg2.InternalError) plpy.Error: madlib_keras_fit error: 
Input table 'train_lt5_summary' does not exist.  Please ensure that the source 
table you specify has been preprocessed by the image preprocessor. 
(plpython.c:5038)
{code}
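
One way the suggested wording could be produced is sketched below in Python. This is only an illustration: `summary_table_error` and the `existing_tables` argument are hypothetical names, not MADlib's actual code. The only thing taken from the report is that fit looks for a table named after the source table with a `_summary` suffix.

```python
def summary_table_error(module_name, source_table, existing_tables):
    """Return an error message if the summary table is missing, else None.

    Hypothetical helper: `existing_tables` stands in for a catalog lookup.
    """
    summary_table = source_table + "_summary"
    if summary_table in existing_tables:
        return None
    return (
        "{0} error: Input table '{1}' does not exist. "
        "Please ensure that the source table you specify has been "
        "preprocessed by the image preprocessor."
    ).format(module_name, summary_table)


# Toy catalog (assumption): only the packed table has a summary table.
catalog = {"train_lt5_packed", "train_lt5_packed_summary"}
print(summary_table_error("madlib_keras_fit", "train_lt5", catalog))
```

The hint is appended to the existing text rather than replacing it, so the missing table name is still reported.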


(2)
Confusing error message when the validation table has not been preprocessed

{code}
SELECT madlib.madlib_keras_fit('train_lt5_packed',           -- source table (YES PREPROCESSED)
                               'mnist_model',         -- model output table
                               'model_arch_library',  -- model arch table
                                1,                    -- model arch id
                                $$ loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy']$$,  -- compile_params
                                $$ batch_size=batch_size, epochs=1 $$,  -- fit_params
                                5,                    -- num_iterations
                                0,                    -- gpus_per_host
                                'test_lt5',           -- validation table  (NOT PREPROCESSED)
                                1                     -- metrics_compute_frequency
                              );

InternalError: (psycopg2.InternalError) plpy.Error: madlib_keras_fit: invalid 
independent_varname ('independent_var') for table (test_lt5). (plpython.c:5038)
CONTEXT:  Traceback (most recent call last):
  PL/Python function "madlib_keras_fit", line 21, in <module>
    madlib_keras.fit(**globals())
  PL/Python function "madlib_keras_fit", line 42, in wrapper
  PL/Python function "madlib_keras_fit", line 71, in fit
  PL/Python function "madlib_keras_fit", line 233, in __init__
  PL/Python function "madlib_keras_fit", line 274, in _validate_input_args
  PL/Python function "madlib_keras_fit", line 288, in _validate_validation_table
  PL/Python function "madlib_keras_fit", line 242, in _validate_input_table
  PL/Python function "madlib_keras_fit", line 96, in _assert
PL/Python function "madlib_keras_fit"
 [SQL: "SELECT madlib.madlib_keras_fit('train_lt5_packed',           -- source 
table\n                               'mnist_model',         -- model output 
table\n                               'model_arch_library',  -- model arch 
table\n                                1,                    -- model arch id\n 
                               $$ loss='categorical_crossentropy', 
optimizer='adadelta', metrics=['accuracy']$$,  -- compile_params\n              
                  $$ batch_size=batch_size, epochs=1 $$,  -- fit_params\n       
                         5,                    -- num_iterations\n              
                  0,                    -- gpus_per_host\n                      
          'test_lt5',           -- validation table\n                           
     1                     -- metrics_compute_frequency\n                       
       );"]
{code}

A better message would be:
{code}
InternalError: (psycopg2.InternalError) plpy.Error: madlib_keras_fit: invalid 
independent_varname ('independent_var') for table (test_lt5). Please ensure 
that this table has been preprocessed by the image preprocessor.  
(plpython.c:5038)
{code}
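
A rough Python sketch of a column check that carries the extra hint follows. The function name `validation_table_error` and the column list passed in are assumptions made for illustration; the real check lives in `_validate_input_table` per the traceback, and its code is not shown in this report.

```python
def validation_table_error(module_name, table, columns,
                           independent_varname="independent_var"):
    """Return an error message if the expected column is absent, else None.

    Hypothetical helper: `columns` stands in for the table's column names.
    """
    if independent_varname in columns:
        return None
    return (
        "{0}: invalid independent_varname ('{1}') for table ({2}). "
        "Please ensure that this table has been preprocessed by the "
        "image preprocessor."
    ).format(module_name, independent_varname, table)


# Assumed columns of the raw (not preprocessed) validation table.
raw_columns = ["id", "x", "y"]
print(validation_table_error("madlib_keras_fit", "test_lt5", raw_columns))
```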



  was:
(1)
Input shape checking

We added input shape checking, which is a good idea in principle, but it seems 
to be too restrictive. For example, for the MNIST data set, the Keras input shape is:
{code}
x_train_lt5.shape
(30596, 28, 28)
{code}

In MADlib, before preprocessing, a row looks like this:
{code}
id | 2238
x  | 
{{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,196,195,12,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,79,159,44,0,0,0,0,39,253,218,10,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,221,253,179,0,0,0,0,149,253,169,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,221,253,53,0,0,0,12,222,253,123,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,8,226,253,16,0,0,0,25,253,253,56,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,50,253,253,16,0,0,0,41,253,218,7,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,139,253,217,8,0,0,0,126,253,193,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,213,253,114,0,0,0,10,226,253,130,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,39,250,253,223,10,0,0,17,253,253,54,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,173,253,253,253,169,137,83,120,253,221,2,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,52,238,254,254,254,254,254,255,254,254,192,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,115,253,228,84,73,97,154,238,253,253,138,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,40,146,45,0,0,0,0,9,253,250,73,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,253,228,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,75,253,228,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,132,253,186,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,243,253,102,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,196,254,238,7,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,245,254,186,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,166,251,79,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}}
y  | 4
{code}

A validation error gets thrown when we run fit():

{code}
InternalError: (psycopg2.InternalError) plpy.Error: model_keras error: Input 
shape [28, 28, 1] in the model architecture does not match the input shape [28, 
28, None] of column independent_var in table train_lt5_packed. (plpython.c:5038)
CONTEXT:  Traceback (most recent call last):
  PL/Python function "madlib_keras_fit", line 21, in <module>
    madlib_keras.fit(**globals())
  PL/Python function "madlib_keras_fit", line 42, in wrapper
  PL/Python function "madlib_keras_fit", line 102, in fit
  PL/Python function "madlib_keras_fit", line 300, in validate_input_shapes
  PL/Python function "madlib_keras_fit", line 86, in _validate_input_shapes
PL/Python function "madlib_keras_fit"
 [SQL: "SELECT madlib.madlib_keras_fit('train_lt5_packed',           -- source 
table\n                               'mnist_model',         -- model output 
table\n                               'model_arch_library',  -- model arch 
table\n                                1,                    -- model arch id\n 
                               $$ loss='categorical_crossentropy', 
optimizer='adadelta', metrics=['accuracy']$$,  -- compile_params\n              
                  $$ batch_size=batch_size, epochs=1 $$,  -- fit_params\n       
                         5,                    -- num_iterations\n              
                  0,                    -- gpus_per_host\n                      
          'test_lt5_packed',           -- validation table\n                    
            1                     -- metrics_compute_frequency\n                
              );"]
{code}

This check is too restrictive.  I suggest we turn MADlib input shape validation off 
for the time being and let the back end fail or not according to its own rules.  
This applies to fit, evaluate and predict.
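
The mismatch in the error above can be reproduced in miniature with plain Python. This is a guess at the mechanism, not MADlib's code: the report does not show how `[28, 28, None]` is derived, so the helpers `nested_shape` and `pad_shape` are hypothetical, and padding a missing channel dimension with `None` is an assumption.

```python
def nested_shape(value):
    """Return the dimensions of a rectangular nested list, outermost first."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return shape


def pad_shape(shape, ndim):
    """Pad a shape with None up to ndim dimensions (assumed behavior)."""
    return shape + [None] * (ndim - len(shape))


# One 28x28 grayscale image with no explicit channel dimension.
image = [[0] * 28 for _ in range(28)]
model_input_shape = [28, 28, 1]

table_shape = pad_shape(nested_shape(image), len(model_input_shape))
print(table_shape)                       # [28, 28, None]
print(table_shape == model_input_shape)  # False: a strict comparison rejects it
```

A strict equality test fails here even though the data differs from the model input only by a trailing channel dimension of size 1.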


(2)
Confusing error message when the source table has not been preprocessed

{code}
SELECT madlib.madlib_keras_fit('train_lt5',           -- source table (NOT PREPROCESSED)
                               'mnist_model',         -- model output table
                               'model_arch_library',  -- model arch table
                                1,                    -- model arch id
                                $$ loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy']$$,  -- compile_params
                                $$ batch_size=batch_size, epochs=1 $$,  -- fit_params
                                5,                    -- num_iterations
                                0,                    -- gpus_per_host
                                'test_lt5_packed',           -- validation table
                                1                     -- metrics_compute_frequency
                              );

InternalError: (psycopg2.InternalError) plpy.Error: madlib_keras_fit error: 
Input table 'train_lt5_summary' does not exist (plpython.c:5038)
{code}

A better message would be:
{code}
InternalError: (psycopg2.InternalError) plpy.Error: madlib_keras_fit error: 
Input table 'train_lt5_summary' does not exist.  Please ensure that the source 
table you specify has been preprocessed by the image preprocessor. 
(plpython.c:5038)
{code}


(3)
Confusing error message when the validation table has not been preprocessed

{code}
SELECT madlib.madlib_keras_fit('train_lt5_packed',           -- source table (YES PREPROCESSED)
                               'mnist_model',         -- model output table
                               'model_arch_library',  -- model arch table
                                1,                    -- model arch id
                                $$ loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy']$$,  -- compile_params
                                $$ batch_size=batch_size, epochs=1 $$,  -- fit_params
                                5,                    -- num_iterations
                                0,                    -- gpus_per_host
                                'test_lt5',           -- validation table  (NOT PREPROCESSED)
                                1                     -- metrics_compute_frequency
                              );

InternalError: (psycopg2.InternalError) plpy.Error: madlib_keras_fit: invalid 
independent_varname ('independent_var') for table (test_lt5). (plpython.c:5038)
CONTEXT:  Traceback (most recent call last):
  PL/Python function "madlib_keras_fit", line 21, in <module>
    madlib_keras.fit(**globals())
  PL/Python function "madlib_keras_fit", line 42, in wrapper
  PL/Python function "madlib_keras_fit", line 71, in fit
  PL/Python function "madlib_keras_fit", line 233, in __init__
  PL/Python function "madlib_keras_fit", line 274, in _validate_input_args
  PL/Python function "madlib_keras_fit", line 288, in _validate_validation_table
  PL/Python function "madlib_keras_fit", line 242, in _validate_input_table
  PL/Python function "madlib_keras_fit", line 96, in _assert
PL/Python function "madlib_keras_fit"
 [SQL: "SELECT madlib.madlib_keras_fit('train_lt5_packed',           -- source 
table\n                               'mnist_model',         -- model output 
table\n                               'model_arch_library',  -- model arch 
table\n                                1,                    -- model arch id\n 
                               $$ loss='categorical_crossentropy', 
optimizer='adadelta', metrics=['accuracy']$$,  -- compile_params\n              
                  $$ batch_size=batch_size, epochs=1 $$,  -- fit_params\n       
                         5,                    -- num_iterations\n              
                  0,                    -- gpus_per_host\n                      
          'test_lt5',           -- validation table\n                           
     1                     -- metrics_compute_frequency\n                       
       );"]
{code}

A better message would be:
{code}
InternalError: (psycopg2.InternalError) plpy.Error: madlib_keras_fit: invalid 
independent_varname ('independent_var') for table (test_lt5). Please ensure 
that this table has been preprocessed by the image preprocessor.  
(plpython.c:5038)
{code}




> Misc message and other items for 1.16 release
> ---------------------------------------------
>
>                 Key: MADLIB-1364
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1364
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Deep Learning
>            Reporter: Frank McQuillan
>            Assignee: Nikhil
>            Priority: Minor
>             Fix For: v1.16
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
