[ https://issues.apache.org/jira/browse/SPARK-22905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16306020#comment-16306020 ]
zhengruifeng commented on SPARK-22905:
--------------------------------------

[~WeichenXu123] I checked and found that the same issue exists in {{GaussianMixtureModel}}; otherwise it looks fine.

> Fix ChiSqSelectorModel save implementation
> ------------------------------------------
>
>                 Key: SPARK-22905
>                 URL: https://issues.apache.org/jira/browse/SPARK-22905
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 2.2.1
>            Reporter: Weichen Xu
>            Assignee: Weichen Xu
>             Fix For: 2.3.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Currently, `ChiSqSelectorModel` saves its data with:
> {code}
> spark.createDataFrame(dataArray).repartition(1).write...
> {code}
> The default number of partitions used by createDataFrame is "defaultParallelism".
> The current RoundRobinPartitioning does not guarantee that "repartition" produces
> rows in the same order as the local array. We need to fix this.
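
The ordering problem described in the issue can be illustrated with a small Python sketch that does not use Spark at all. It is only a simulation under stated assumptions: the local array is cut into contiguous slices (one per partition), and collapsing them into a single partition concatenates the slices in whatever order they happen to arrive. The helper names `split_into_partitions` and `merge_partitions` are hypothetical and are not part of any Spark API.

```python
def split_into_partitions(rows, num_partitions):
    """Cut a local list into contiguous slices, one per simulated partition."""
    n = len(rows)
    return [
        rows[(i * n) // num_partitions:((i + 1) * n) // num_partitions]
        for i in range(num_partitions)
    ]


def merge_partitions(partitions, arrival_order):
    """Concatenate partitions in the given arrival order.

    Stands in for collapsing many partitions into one when the order in
    which the pieces arrive is not guaranteed (an assumption of this sketch).
    """
    return [row for idx in arrival_order for row in partitions[idx]]


# Local array of (index, value) rows, analogous to dataArray in the issue.
data = list(enumerate(["a", "b", "c", "d", "e", "f"]))

# Several partitions, merged in an arbitrary order: the row order changes.
parts = split_into_partitions(data, 3)
merged = merge_partitions(parts, [2, 0, 1])

# A single partition from the start: there is nothing to reorder.
single = merge_partitions(split_into_partitions(data, 1), [0])

# An explicit index column lets a reader restore the original order.
restored = sorted(merged)
```

In this simulation, `merged` no longer matches `data`, while `single` and `restored` do. This suggests two possible ways to avoid the problem: keep the data in one partition from the beginning, or store an explicit index with each row and sort on load. Whether either matches the fix actually applied in Spark is not stated in this message.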