[ https://issues.apache.org/jira/browse/MADLIB-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pratik updated MADLIB-1317:
---------------------------
    Description: 
Hi team,

I am using the MADlib multinomial method on my dataset with a categorical independent variable (one-hot encoded), as below.

{code:sql}
SELECT
    CASE WHEN multinom IS NOT NULL THEN TRUE ELSE FALSE END
FROM
    madlib.multinom(
        'TEMP_TEST_1',
        'TEMP_TEST_1_OP',
        'dep_var_col',
        'ARRAY[ 1,hot_encoded_GENDER_col_val1, hot_encoded_GENDER_col_val2]',
        '1', -- REF CATEGORY
        'logit',
        NULL,
        'max_iter=100,optimizer=irls,tolerance=0.0001',
        TRUE
    );
{code}

Gender being a categorical column, I am one-hot encoding it into two 0/1 columns.

When comparing results with R's method, the coefficients match, but the StdErr and pValue are way off.

R method -
{code}
nnet::multinom
{code}

Is there anything special I need to do for multinom, or is it a bug?
Or is there a particular way I need to use R to compare results with multinom?

*UPDATE:*
Is it mandatory to have a ref_category-like column for a categorical independent variable?
I removed hot_encoded_GENDER_col_val1 from the list of independent variables, and the results now match R's output.

Is there any documentation or reference to confirm this?
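One possible explanation for the behaviour described in the UPDATE, offered as a sketch and not as something confirmed in this issue: when the model includes an intercept (the leading `1` in the array) together with both one-hot gender columns, the two indicators always sum to the intercept column, so the design matrix is rank deficient. The Python snippet below illustrates this with made-up gender values (not data from the report) by computing the determinant of X^T X with and without one indicator dropped. It uses only the standard library, and the helper names `gram` and `det` are hypothetical.

```python
from fractions import Fraction


def gram(rows):
    """Return X^T X for a design matrix given as a list of rows."""
    k = len(rows[0])
    return [
        [sum(Fraction(r[i]) * r[j] for r in rows) for j in range(k)]
        for i in range(k)
    ]


def det(matrix):
    """Determinant via exact Gaussian elimination (no rounding error)."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot available: the matrix is singular
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result  # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


# Illustrative data only; these values are assumptions, not from the issue.
gender = ["M", "F", "F", "M", "F"]

# Intercept plus BOTH indicators: column 2 + column 3 == column 1 on every row.
full = [[1, int(g == "M"), int(g == "F")] for g in gender]

# Intercept plus ONE indicator: "M" becomes the implicit reference level.
reduced = [[1, int(g == "F")] for g in gender]

det_full = det(gram(full))
det_reduced = det(gram(reduced))

print("det(X^T X), both indicators:", det_full)
print("det(X^T X), one indicator dropped:", det_reduced)
```

A zero determinant means X^T X cannot be inverted, and the same linear dependence carries over to the weighted information matrix from which standard errors are usually derived, so different packages may report different (or meaningless) standard errors for such a model. R's formula interface drops one level of a factor by default (treatment contrasts), which would be consistent with the results matching once one indicator column is removed.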
> Multinomial results not matching with R method
> ----------------------------------------------
>
>                 Key: MADLIB-1317
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1317
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Multinomial Logistic Regression
>            Reporter: Pratik
>            Priority: Major