[ https://issues.apache.org/jira/browse/MADLIB-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471207#comment-16471207 ]

Himanshu Pandey commented on MADLIB-1172:
-----------------------------------------

Hi [~fmcquillan],

Here is what I have discovered so far. The data for the last record, id = 300,
is causing this issue; more specifically, the value of column x5.

The value in the data set is 108. When I changed it to 107 or 110, training
produced a populated model table.

For example:

{code}

gpadmin=# update dummy_data set x5 = 107 where id = 300;
UPDATE 1

gpadmin=# select madlib.logregr_train('dummy_data','dummy_logit_gp','y','ARRAY[1,x1,x2,x3,x4,x5]',NULL,20,'irls');
 logregr_train 
---------------
 
(1 row)

gpadmin=# select * from dummy_logit_gp;
 {-60.6963399562406,83.2481369307379,-41.5757740167708,41.6723539072261,-220.233572947378,59.1513318881784}
 | -0.000188250058233221
 | {6379.02591630875,33122.6359459149,16561.3303753767,16561.3157515072,1895.85601499328,25482.6866481577}
 | {-0.00951498563457205,0.00251333067412484,-0.00251041269477877,0.00251624656715053,-0.116165769555109,0.00232123608883504}
 | {0.99240825441914,0.997994654370162,0.997996982573486,0.997992327831493,0.907521164983078,0.998147923225947}
 | {4.36429888587852e-27,1.4262856007256e+36,8.78760978274153e-19,1.25335284111666e+18,2.2582631046692e-96,4.88761553216779e+25}
 |     Infinity |                300 |                        0 |             20
 | {{40691971.6409387,-211081401.140304,105540486.26844,-105540866.77075,-7621488.81098509,162096329.331243},{-211081401.140304,1097109012.00561,-548554623.142637,548554138.7612,41701724.8827007,-843656994.586568},{105540486.26844,-548554623.142637,274277663.802374,-274276834.289312,-20852051.0896696,421829046.690131},{-105540866.77075,548554138.7612,-274276834.289312,274277179.421121,20849664.1968923,-421827755.523287},{-7621488.81098509,41701724.8827007,-20852051.0896696,20849664.1968923,3594270.02958621,-33175131.4666084},{162096329.331243,-843656994.586568,421829046.690132,-421827755.523287,-33175131.4666084,649367318.808194}}
(1 row)

 

gpadmin=# update dummy_data set x5 = 110 where id = 300;
UPDATE 1

gpadmin=# select madlib.logregr_train('dummy_data','dummy_logit_gp','y','ARRAY[1,x1,x2,x3,x4,x5]',NULL,20,'irls');
 logregr_train 
---------------
 
(1 row)

gpadmin=# select * from dummy_logit_gp;
 {-82.8680682455691,198.593133300487,-99.2473096206599,99.3458092682156,-215.747249093631,-29.6039160744594}
 | -0.000188751550486611
 | {3095.02103385595,15902.436542112,7951.16609447363,7951.30599917857,1493.94792706878,12191.3941605613}
 | {-0.026774638149172,0.0124882204544305,-0.0124821074596392,0.0124942756923804,-0.144414169452974,-0.00242826338682633}
 | {0.978639481789339,0.990036100695913,0.990040977779944,0.990031269691949,0.885173428489227,0.998062528038127}
 | {1.02531009876533e-36,1.76970931272333e+86,7.89661724612934e-44,1.39745156827503e+43,2.0052117132129e-94,1.39053718214918e-13}
 |     Infinity |                300 |                        0 |             20
 | {{9579155.20001074,-49013448.3206615,24506250.6929641,-24507193.0554383,-1117659.32579782,37279704.1887547},{-49013448.3206615,252887487.9759,-126442619.757955,126444844.588794,7820179.2181533,-193478250.287115},{24506250.6929641,-126442619.757955,63221042.2619071,-63221565.6815607,-3911332.71616586,96738775.4366319},{-24507193.0554383,126444844.588794,-63221565.6815607,63223267.0925732,3908845.73413929,-96739456.7519012},{-1117659.32579782,7820179.2181533,-3911332.71616586,3908845.73413929,2231880.40879312,-7079677.90789804},{37279704.1887547,-193478250.287115,96738775.4366319,-96739456.7519012,-7079677.90789804,148630091.578168}}
(1 row)

{code}

I will continue investigating and will post an update with my findings.
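
For anyone who wants to re-check the failing case, here is a minimal sketch of one way to do it. It reuses the table names and the logregr_train call from the session above. It assumes that the output table and its companion dummy_logit_gp_summary table have to be dropped before retraining, and that an empty result would show up as a row count of 0. The per-class range query at the end is only a suggested diagnostic for separation on x5 and is not something that was run for this comment.

{code}
-- Restore the original value from the data set for the last record.
update dummy_data set x5 = 108 where id = 300;

-- Assumption: drop the previous output tables so training can recreate them.
drop table if exists dummy_logit_gp, dummy_logit_gp_summary;

-- Same call as in the session above.
select madlib.logregr_train('dummy_data','dummy_logit_gp','y','ARRAY[1,x1,x2,x3,x4,x5]',NULL,20,'irls');

-- A count of 0 here would correspond to the empty model table in the report.
select count(*) from dummy_logit_gp;

-- Suggested diagnostic: compare the x5 range per class of y.
-- Non-overlapping ranges would hint at separation on x5.
select y, min(x5) as min_x5, max(x5) as max_x5, count(*) as n
from dummy_data
group by y
order by y;
{code}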

 

> Logistic regression produces empty output table but no error message on 
> Greenplum
> ---------------------------------------------------------------------------------
>
>                 Key: MADLIB-1172
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1172
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Logistic Regression
>            Reporter: Frank McQuillan
>            Assignee: Himanshu Pandey
>            Priority: Minor
>             Fix For: v1.15
>
>         Attachments: Logistic-regression-empty-output.ipynb, 
> load-data-sep.sql, load-data-singular.sql, load-data.sql
>
>
> Separated and singular data sets may produce an empty model table on 
> Greenplum 4.3.x.  On Postgres 9.6 the same example works OK. 
> See the attached Jupyter notebook and data sets for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
