Repository: spark Updated Branches: refs/heads/master 943a684b9 -> d20a976e8
[SPARK-20192][SPARKR][DOC] SparkR migration guide to 2.2.0

## What changes were proposed in this pull request?

Update the R Programming Guide with migration notes for SparkR 2.2.0.

## How was this patch tested?

Manually.

Author: Felix Cheung <felixcheun...@hotmail.com>

Closes #17816 from felixcheung/r22relnote.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d20a976e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d20a976e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d20a976e

Branch: refs/heads/master
Commit: d20a976e8918ca8d607af452301e8014fe14e64a
Parents: 943a684
Author: Felix Cheung <felixcheun...@hotmail.com>
Authored: Mon May 1 21:03:48 2017 -0700
Committer: Felix Cheung <felixche...@apache.org>
Committed: Mon May 1 21:03:48 2017 -0700

----------------------------------------------------------------------
 docs/sparkr.md | 8 ++++++++
 1 file changed, 8 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/d20a976e/docs/sparkr.md
----------------------------------------------------------------------
diff --git a/docs/sparkr.md b/docs/sparkr.md
index 16b1ef6..6dbd02a 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -644,3 +644,11 @@ You can inspect the search path in R with [`search()`](https://stat.ethz.ch/R-ma
 ## Upgrading to SparkR 2.1.0

  - `join` no longer performs Cartesian Product by default, use `crossJoin` instead.
+
+## Upgrading to SparkR 2.2.0
+
+ - A `numPartitions` parameter has been added to `createDataFrame` and `as.DataFrame`. When splitting the data, the partition position calculation now matches the one in Scala.
+ - The method `createExternalTable` has been deprecated in favor of `createTable`. Either method can be called to create an external or managed table. Additional catalog methods have also been added.
+ - By default, `derby.log` is now saved to `tempdir()`. It is created when a SparkSession is instantiated with `enableHiveSupport` set to `TRUE`.
+ - `spark.lda` was not setting the optimizer correctly; this has been fixed.
+ - Several model summary outputs have been updated to return `coefficients` as a `matrix`. This includes `spark.logit`, `spark.kmeans`, and `spark.glm`. The model summary output of `spark.gaussianMixture` now includes the log-likelihood as `loglik`.
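The new `numPartitions` parameter described above can be exercised as follows (a minimal sketch, not part of the commit; it assumes a Spark distribution is available and an active SparkR session can be started):

```r
library(SparkR)
sparkR.session()  # assumes SPARK_HOME points at a Spark 2.2.0+ installation

# Request 4 partitions when converting a local R data.frame
df <- createDataFrame(faithful, numPartitions = 4)

# as.DataFrame accepts the same parameter
df2 <- as.DataFrame(faithful, numPartitions = 4)

sparkR.session.stop()
```

When `numPartitions` is omitted, SparkR falls back to the default parallelism of the underlying SparkContext.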
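The `createExternalTable`-to-`createTable` migration might look like the sketch below (not from the commit; the table names, path, and data source are hypothetical, and a running SparkR session is assumed):

```r
library(SparkR)
sparkR.session()

# Managed table: no path supplied, Spark manages the storage
managed <- createTable("people_managed", source = "parquet",
                       schema = structType(structField("name", "string")))

# External table: supplying a path keeps the data in place,
# matching the behavior of the deprecated createExternalTable
external <- createTable("people_ext", path = "/tmp/people.parquet",
                        source = "parquet")

sparkR.session.stop()
```

Both calls return a `SparkDataFrame` backed by the new catalog entry.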
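The changed summary shape (`coefficients` as a `matrix`) can be observed with, for example, `spark.glm` (a sketch under the assumption of an active SparkR session; column names with dots are rewritten to underscores when `iris` is converted):

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(iris)
model <- spark.glm(df, Sepal_Length ~ Sepal_Width, family = "gaussian")

s <- summary(model)
class(s$coefficients)  # a matrix as of SparkR 2.2.0

sparkR.session.stop()
```

Code that previously indexed `coefficients` as a plain vector or data.frame may need adjusting to matrix subsetting, e.g. `s$coefficients[, "Estimate"]`.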