GitHub user actuaryzhang commented on the issue:

    https://github.com/apache/spark/pull/18140
  
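    A simple example showing that `stringIndexerOrderType = "alphabetDesc"` makes `spark.glm` reproduce the coefficients of R's `glm`: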
    ```
    > df <- createDataFrame(as.data.frame(Titanic, stringsAsFactors = FALSE))
    > rModel <- stats::glm(Freq ~ Sex + Age, family = "gaussian",
    +                      data = as.data.frame(df))
    > summary(rModel)$coefficients
                  Estimate Std. Error   t value    Pr(>|t|)
    (Intercept)   91.34375   35.99417  2.537737 0.016790098
    SexMale       78.81250   41.56249  1.896241 0.067931094
    AgeChild    -123.93750   41.56249 -2.981956 0.005752153
     
    > model <- spark.glm(df, Freq ~ Sex + Age, family = "gaussian")
    > summary(model)$coefficients
                 Estimate Std. Error    t value    Pr(>|t|)
    (Intercept) -32.59375   35.99417 -0.9055286 0.372647658
    Sex_Male     78.81250   41.56249  1.8962412 0.067931094
    Age_Adult   123.93750   41.56249  2.9819558 0.005752153
    
    > model2 <- spark.glm(df, Freq ~ Sex + Age, family = "gaussian",
    +                     stringIndexerOrderType = "alphabetDesc")
    > summary(model2)$coefficients
                  Estimate Std. Error   t value    Pr(>|t|)
    (Intercept)   91.34375   35.99417  2.537737 0.016790098
    Sex_Male      78.81250   41.56249  1.896241 0.067931094
    Age_Child   -123.93750   41.56249 -2.981956 0.005752153
    ```
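
    Reading the three tables together, the default `spark.glm` fit appears to be the same model as the R fit, only with a different reference level for Age (Child instead of Adult). The arithmetic below is a sketch of that reading, using only the numbers printed above; the symbols (beta_0 for the intercept, beta_Child and beta_Adult for the Age coefficients) are notation introduced here for illustration.

    ```latex
    % R / alphabetDesc parameterization: reference level is (Female, Adult)
    %   \beta_0 = 91.34375, \quad \beta_{\text{Child}} = -123.93750
    % Default spark.glm parameterization: reference level is (Female, Child)
    \begin{align*}
      \beta_0^{\text{Spark}} &= \beta_0 + \beta_{\text{Child}}
                              = 91.34375 - 123.93750 = -32.59375, \\
      \beta_{\text{Adult}}^{\text{Spark}} &= -\beta_{\text{Child}} = 123.93750 .
    \end{align*}
    % Check with a fitted value for (Male, Adult), where \beta_{\text{Male}} = 78.81250:
    \begin{align*}
      \text{R:}     \quad & 91.34375 + 78.81250 = 170.15625, \\
      \text{Spark:} \quad & -32.59375 + 78.81250 + 123.93750 = 170.15625 .
    \end{align*}
    ```

    Both parameterizations give the same fitted values, which is consistent with the Sex coefficient, the standard errors of the Sex and Age coefficients, and their p-values being identical across the tables; only the intercept and the sign of the Age coefficient change.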

