[
https://issues.apache.org/jira/browse/MADLIB-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471207#comment-16471207
]
Himanshu Pandey commented on MADLIB-1172:
-----------------------------------------
Hi [~fmcquillan],
Here is what I have found so far. The last record (id = 300) is causing this
issue; more specifically, its value for column x5.
The value in the data set is 108. When I changed it to 107 or 110, training worked.
For example:
{code}
gpadmin=# update dummy_data set x5 = 107 where id = 300;
UPDATE 1
gpadmin=# select
madlib.logregr_train('dummy_data','dummy_logit_gp','y','ARRAY[1,x1,x2,x3,x4,x5]',NULL,20,'irls');
logregr_train
---------------
(1 row)
gpadmin=# select * from dummy_logit_gp;
{-60.6963399562406,83.2481369307379,-41.5757740167708,41.6723539072261,-220.233572947378,59.1513318881784}
| -0.000188250058233221
| {6379.02591630875,33122.6359459149,16561.3303753767,16561.3157515072,1895.85601499328,25482.6866481577}
| {-0.00951498563457205,0.00251333067412484,-0.00251041269477877,0.00251624656715053,-0.116165769555109,0.00232123608883504}
| {0.99240825441914,0.997994654370162,0.997996982573486,0.997992327831493,0.907521164983078,0.998147923225947}
| {4.36429888587852e-27,1.4262856007256e+36,8.78760978274153e-19,1.25335284111666e+18,2.2582631046692e-96,4.88761553216779e+25}
| Infinity
| 300
| 0
| 20
| {{40691971.6409387,-211081401.140304,105540486.26844,-105540866.77075,-7621488.81098509,162096329.331243},
   {-211081401.140304,1097109012.00561,-548554623.142637,548554138.7612,41701724.8827007,-843656994.586568},
   {105540486.26844,-548554623.142637,274277663.802374,-274276834.289312,-20852051.0896696,421829046.690131},
   {-105540866.77075,548554138.7612,-274276834.289312,274277179.421121,20849664.1968923,-421827755.523287},
   {-7621488.81098509,41701724.8827007,-20852051.0896696,20849664.1968923,3594270.02958621,-33175131.4666084},
   {162096329.331243,-843656994.586568,421829046.690132,-421827755.523287,-33175131.4666084,649367318.808194}}
(1 row)
gpadmin=# update dummy_data set x5 = 110 where id = 300;
UPDATE 1
gpadmin=# select
madlib.logregr_train('dummy_data','dummy_logit_gp','y','ARRAY[1,x1,x2,x3,x4,x5]',NULL,20,'irls');
logregr_train
---------------
(1 row)
gpadmin=# select * from dummy_logit_gp;
{-82.8680682455691,198.593133300487,-99.2473096206599,99.3458092682156,-215.747249093631,-29.6039160744594}
| -0.000188751550486611
| {3095.02103385595,15902.436542112,7951.16609447363,7951.30599917857,1493.94792706878,12191.3941605613}
| {-0.026774638149172,0.0124882204544305,-0.0124821074596392,0.0124942756923804,-0.144414169452974,-0.00242826338682633}
| {0.978639481789339,0.990036100695913,0.990040977779944,0.990031269691949,0.885173428489227,0.998062528038127}
| {1.02531009876533e-36,1.76970931272333e+86,7.89661724612934e-44,1.39745156827503e+43,2.0052117132129e-94,1.39053718214918e-13}
| Infinity
| 300
| 0
| 20
| {{9579155.20001074,-49013448.3206615,24506250.6929641,-24507193.0554383,-1117659.32579782,37279704.1887547},
   {-49013448.3206615,252887487.9759,-126442619.757955,126444844.588794,7820179.2181533,-193478250.287115},
   {24506250.6929641,-126442619.757955,63221042.2619071,-63221565.6815607,-3911332.71616586,96738775.4366319},
   {-24507193.0554383,126444844.588794,-63221565.6815607,63223267.0925732,3908845.73413929,-96739456.7519012},
   {-1117659.32579782,7820179.2181533,-3911332.71616586,3908845.73413929,2231880.40879312,-7079677.90789804},
   {37279704.1887547,-193478250.287115,96738775.4366319,-96739456.7519012,-7079677.90789804,148630091.578168}}
(1 row)
{code}
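To get back to the failing case, restoring the original x5 value and re-running the same training call should reproduce the behaviour reported in this issue. This is only a sketch: it assumes the output table has to be dropped before re-training, and that the companion summary table is named dummy_logit_gp_summary.

```sql
-- Restore the original value for the last record (108, per the data set).
update dummy_data set x5 = 108 where id = 300;

-- Assumption: the previous output tables must be removed before re-training.
drop table if exists dummy_logit_gp, dummy_logit_gp_summary;

-- Same call as in the two runs above.
select madlib.logregr_train('dummy_data','dummy_logit_gp','y','ARRAY[1,x1,x2,x3,x4,x5]',NULL,20,'irls');

-- Check how many rows the model table contains after training.
select count(*) from dummy_logit_gp;
```

Comparing this row count with the 107 and 110 runs isolates the effect of that single value.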
I will continue investigating and will update with my findings.
> Logistic regression produces empty output table but no error message on
> Greenplum
> ---------------------------------------------------------------------------------
>
> Key: MADLIB-1172
> URL: https://issues.apache.org/jira/browse/MADLIB-1172
> Project: Apache MADlib
> Issue Type: Bug
> Components: Module: Logistic Regression
> Reporter: Frank McQuillan
> Assignee: Himanshu Pandey
> Priority: Minor
> Fix For: v1.15
>
> Attachments: Logistic-regression-empty-output.ipynb,
> load-data-sep.sql, load-data-singular.sql, load-data.sql
>
>
> Separated and singular data sets may produce an empty model table on
> Greenplum 4.3.x. On Postgres 9.6 the same example works OK.
> See the attached Jupyter notebook and data sets for details.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)