Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/1269#issuecomment-66529072 @akopich Thanks for the updates. It looks like rebasing did not work correctly (looking at the 10K+ lines in this PR!). It should be possible to fix with rebase + conflict resolution. But since most updates in this PR are in separate files, it may be easiest to get a clean copy of master, copy the files & changes in, and then force push to update this PR's branch (or make a new PR). About Enumerator / TokenIndexer: Renaming sounds good. You're right that both APIs (setters vs. constructor arguments) are used in spark.mllib, but in the new spark.ml API, only setters will be used. (And hopefully spark.ml will become the primary ML package soon.) That's my main reason for recommending setters. About Dirichlet: Since there isn't a precedent to follow, I'd recommend putting it in a new stat/impl/ folder and leaving it private. A later PR could add public APIs for some common distributions.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org