Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22079#discussion_r209731698

    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala ---
    @@ -144,7 +144,7 @@ object ChiSqSelectorModel extends Loader[ChiSqSelectorModel] {
           val dataArray = Array.tabulate(model.selectedFeatures.length) { i =>
             Data(model.selectedFeatures(i))
           }
    -      spark.createDataFrame(dataArray).repartition(1).write.parquet(Loader.dataPath(path))
    +      spark.createDataFrame(sc.makeRDD(dataArray, 1)).write.parquet(Loader.dataPath(path))
    --- End diff --

    @jiangxb1987 and @bersprockets, SPARK-22905 consists of two commits:
    - ChiSqSelector (https://github.com/apache/spark/pull/20088)
    - GaussianMixtureModel (https://github.com/apache/spark/pull/20113)

    If we want to include SPARK-22905 here, it would be better to make that explicit and complete by putting `[SPARK-22905]` in the PR title and including both patches.