Github user rnowling commented on the pull request:

    https://github.com/apache/spark/pull/4087#issuecomment-70446766

[~leahmcguire], Thanks for the patch! A few comments:

1. PySpark calls the Scala API for MLlib, so for API compatibility, we can't use enumerations on the public APIs. I suggest using a string for the train() functions but keeping the enumeration for the internal API.
2. Can you create a new JIRA for updating the PySpark MLlib NB API? I can post details on what needs to change there -- if you don't want to do the PR for that, I can.
3. The populateMatrix function is verbose. Breeze seems to support element-wise operations (https://github.com/scalanlp/breeze/wiki/Linear-Algebra-Cheat-Sheet), which might negate the need for the populateMatrix function.
4. Can you update the MLlib docs in docs/mllib-naive-bayes.md?

Thanks!
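To illustrate point 1, here is a rough sketch of a string on the public side with an enumeration kept internally. All names here (ModelTypeSketch, ModelType, parse, the "multinomial"/"bernoulli" strings, and the simplified train() signature) are assumptions for illustration, not the names used in the patch, and train() only returns a description instead of fitting a model.

```scala
import java.util.Locale

// Hypothetical sketch: none of these names come from the patch itself.
object ModelTypeSketch {
  // Internal API: an enumeration keeps the set of model types closed.
  object ModelType extends Enumeration {
    val Multinomial, Bernoulli = Value
  }

  // Translate the public string into the internal enumeration value.
  def parse(name: String): ModelType.Value =
    name.toLowerCase(Locale.ROOT) match {
      case "multinomial" => ModelType.Multinomial
      case "bernoulli"   => ModelType.Bernoulli
      case other =>
        throw new IllegalArgumentException(s"Unknown model type: $other")
    }

  // Public entry point: takes a String so non-Scala callers can pass it.
  // Returns a description only; a real train() would fit a model here.
  def train(lambda: Double, modelType: String): String =
    parse(modelType) match {
      case ModelType.Multinomial => s"multinomial model, lambda=$lambda"
      case ModelType.Bernoulli   => s"bernoulli model, lambda=$lambda"
    }
}
```

With this shape, a Python caller only ever passes a plain string, and an unrecognized value fails early in parse() rather than deep inside training.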