Frank McQuillan created MADLIB-1237:
---------------------------------------

             Summary: Mini-batch preprocessor fails for dt_golf dataset 
                 Key: MADLIB-1237
                 URL: https://issues.apache.org/jira/browse/MADLIB-1237
             Project: Apache MADlib
          Issue Type: Bug
          Components: Module: Utilities
            Reporter: Frank McQuillan
             Fix For: v1.15


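For the dt_golf data set from 
http://madlib.apache.org/docs/latest/group__grp__decision__tree.html#examples
the mini-batch preprocessor fails with a syntax error. The class value
'Don't Play' contains a single quote that is not escaped in the generated query: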

{code}
madlib=# SELECT madlib.minibatch_preprocessor('dt_golf',
'dt_golf_packed_2', 
'class', 
'"Temp_Humidity"', NULL, 1, True);

ERROR: spiexceptions.SyntaxError: syntax error at or near "t"
LINE 8: ...T madlib.array_contains_null(ARRAY[(class) = 'Don't Play', (...
 ^
QUERY:
 SELECT SUM(source_table_row_count_by_group) AS source_table_row_count,
 SUM(num_rows_processed_by_group) AS total_num_rows_processed,
 AVG(num_rows_processed_by_group) AS avg_num_rows_processed
 FROM (
 SELECT COUNT(*) AS source_table_row_count_by_group,
 SUM(CASE
 WHEN NOT madlib.array_contains_null(ARRAY[(class) = 'Don't Play', (class) = 'Play']::INTEGER[]) AND
 NOT madlib.array_contains_null(("Temp_Humidity")::DOUBLE PRECISION[])
 THEN 1
 ELSE 0
 END) AS num_rows_processed_by_group
 FROM dt_golf

) AS s

CONTEXT: Traceback (most recent call last):
 PL/Python function "minibatch_preprocessor", line 24, in <module>
 minibatch_preprocessor_obj.minibatch_preprocessor()
 PL/Python function "minibatch_preprocessor", line 45, in wrapper
 PL/Python function "minibatch_preprocessor", line 104, in minibatch_preprocessor
 PL/Python function "minibatch_preprocessor", line 236, in _get_skipped_rows_processed_count
PL/Python function "minibatch_preprocessor"
{code}
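
The failing query embeds the class value Don't Play directly inside a single-quoted SQL literal. A minimal sketch of one possible remedy follows, assuming the query text is assembled in Python as the traceback suggests. The helper names quote_literal and build_one_hot_expr are hypothetical and are not taken from the MADlib source; the sketch also assumes standard_conforming_strings is on, so only single quotes need escaping. This is not necessarily how the bug was fixed in MADlib.

```python
def quote_literal(value):
    """Return value as a single-quoted SQL string literal.

    Embedded single quotes are doubled, which is the standard SQL escape.
    Hypothetical helper, for illustration only.
    """
    return "'" + str(value).replace("'", "''") + "'"


def build_one_hot_expr(dep_col, class_values):
    """Build an ARRAY[...] comparison expression shaped like the one in the
    failing query, with every class value safely quoted.

    Hypothetical helper, for illustration only.
    """
    comparisons = ["({0}) = {1}".format(dep_col, quote_literal(v))
                   for v in class_values]
    return "ARRAY[{0}]::INTEGER[]".format(", ".join(comparisons))


expr = build_one_hot_expr("class", ["Don't Play", "Play"])
print(expr)
# ARRAY[(class) = 'Don''t Play', (class) = 'Play']::INTEGER[]
```

With the quote doubled, the literal 'Don''t Play' parses as a single string, so the syntax error near "t" should no longer occur for this value.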



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
