[ https://issues.apache.org/jira/browse/MADLIB-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818226#comment-16818226 ]
Orhan Kislal commented on MADLIB-1317:
--------------------------------------

w/ [~khannaekta]

Hi Pratik,

One-hot encoding is the preferred way to handle categorical independent variables for multinomial regression. Closing this JIRA per your confirmation. Let us know if you have any other questions.

> Multinomial results not matching with R method
> ----------------------------------------------
>
>                 Key: MADLIB-1317
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1317
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Multinomial Logistic Regression
>            Reporter: Pratik
>            Priority: Major
>
> Hi team,
> I am using the MADlib multinomial method on my dataset with a categorical
> independent variable (one-hot encoded), as below.
>
> {code:java}
> SELECT
>     CASE WHEN multinom IS NOT NULL THEN TRUE ELSE FALSE END
> FROM
>     madlib.multinom(
>         'TEMP_TEST_1',
>         'TEMP_TEST_1_OP',
>         'dep_var_col',
>         'ARRAY[ 1,hot_encoded_GENDER_col_val1, hot_encoded_GENDER_col_val2]',
>         '1',--REF CATEGORY
>         'logit',
>         NULL,
>         'max_iter=100,optimizer=irls,tolerance=0.0001',
>         TRUE
>     );{code}
> Gender being a categorical column, I am one-hot encoding it into two 0|1 columns.
> When comparing the results with R's method, the coefficients match but the
> StdErr and pValue are way off.
> R method -
> {code:java}
> nnet::multinom
> {code}
>
> Is there anything special I need to do for multinom, or is it a bug?
> Or is there a particular way I need to use R to compare results with multinom?
> *UPDATE:*
> Is it mandatory to have a reference category (like ref_category) for a
> categorical independent variable?
> I removed hot_encoded_GENDER_col_val1 from the list of independent variables,
> and the results now match R's output.
>
> Is there any documentation or reference to confirm this?
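
On the reference-category question in the quoted issue: a likely explanation, which this thread does not itself confirm, is that an intercept plus one indicator column per category level makes the design matrix rank-deficient, because the indicator columns sum to the intercept column. Standard errors come from inverting the information matrix, which is singular in that case, so they are not uniquely defined and tools may disagree. The sketch below is plain Python, not MADlib or R code; the `gender` values and the `matrix_rank` helper are hypothetical and for illustration only.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical GENDER values, for illustration only.
gender = ["F", "M", "M", "F", "M"]
levels = sorted(set(gender))

# Intercept plus one indicator per level, as in ARRAY[1, val1, val2].
full = [[1] + [int(g == lv) for lv in levels] for g in gender]

# Intercept plus indicators for every level except the first (the reference).
reduced = [[1] + [int(g == lv) for lv in levels[1:]] for g in gender]

full_rank = matrix_rank(full)        # 3 columns in `full`
reduced_rank = matrix_rank(reduced)  # 2 columns in `reduced`
print(full_rank, reduced_rank)
```

With the full encoding the rank is lower than the number of columns; dropping one indicator restores full column rank, which is consistent with the results matching R once one column was removed.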