[ https://issues.apache.org/jira/browse/MADLIB-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461501#comment-16461501 ]

Himanshu Pandey commented on MADLIB-1172:
-----------------------------------------

Hi [~fmcquillan] ,

I tested this on Greenplum 4.3.25; the results are below.

Both the singular and the separated data sets work fine and return output, but
the regular data set (load-data.sql) returns an empty model table, which
differs from the behavior described in the initial issue.

 
{code:java}
[gpadmin@gpdb ~]$ psql -f load-data.sql 
psql:load-data.sql:1: NOTICE: table "dummy_data" does not exist, skipping
DROP TABLE
psql:load-data.sql:2: NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
[gpadmin@gpdb ~]$ psql
psql (8.2.15)
Type "help" for help.

gpadmin=# \dt
List of relations
Schema | Name | Type | Owner | Storage 
--------+------------+-------+---------+---------
public | dummy_data | table | gpadmin | heap
(1 row)

gpadmin=# select madlib.logregr_train('dummy_data','dummy_logit_gp','y','ARRAY[1,x1,x2,x3,x4,x5]',NULL,20,'irls');
logregr_train 
---------------
 
(1 row)

gpadmin=# select * from dummy_logit_gp;
coef | log_likelihood | std_err | z_stats | p_values | odds_ratios | condition_no | num_rows_processed | num_missing_rows_skipped | num_iterations | variance_covariance 
------+----------------+---------+---------+----------+-------------+--------------+--------------------+--------------------------+----------------+---------------------
| | | | | | | | | 4 | 
(1 row)

gpadmin=#

{code}
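
The empty row above can also be checked without reading the wide table by eye. The query below is a minimal sketch, not part of the original test run. It assumes the output table name dummy_logit_gp from the session above and that the blank fields shown by psql are NULLs.

{code:sql}
-- Sketch only: count model rows versus rows that actually carry coefficients.
-- count(coef) ignores NULLs, so it stays below count(*) when the model is empty.
SELECT count(*)    AS model_rows,
       count(coef) AS rows_with_coef
FROM dummy_logit_gp;
{code}

If the model row is empty as in the output above, rows_with_coef would be expected to be lower than model_rows.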

> Logistic regression produces empty output table but no error message on 
> Greenplum
> ---------------------------------------------------------------------------------
>
>                 Key: MADLIB-1172
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1172
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Logistic Regression
>            Reporter: Frank McQuillan
>            Assignee: Himanshu Pandey
>            Priority: Minor
>             Fix For: v1.15
>
>         Attachments: Logistic-regression-empty-output.ipynb, 
> load-data-sep.sql, load-data-singular.sql, load-data.sql
>
>
> Separated and singular data sets may produce an empty model table on 
> Greenplum 4.3.x.  On Postgres 9.6 the same example works OK. 
> See the attached Jupyter notebook and data sets for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
