Liang-Chi Hsieh created SPARK-28722:
---------------------------------------

             Summary: Change sequential label sorting in StringIndexer fit to parallel
                 Key: SPARK-28722
                 URL: https://issues.apache.org/jira/browse/SPARK-28722
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 3.0.0
            Reporter: Liang-Chi Hsieh


The fit method in StringIndexer sorts the labels of each input column sequentially when there are multiple input columns. As the number of input columns grows, the time spent sorting labels grows sharply as well, which makes StringIndexer hard to use in practice with hundreds of input columns.
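
To illustrate the idea in the summary, here is a minimal sketch in plain Python, not Spark's actual implementation. The names sort_labels, fit_sequential and fit_parallel are hypothetical, and the ordering (descending frequency, ties broken alphabetically) is an assumption modelled on StringIndexer's frequency-based ordering. Each column's labels are sorted independently, so one task per column can be submitted to a pool instead of looping over the columns one by one.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sort_labels(values):
    """Order one column's distinct labels by descending frequency, ties alphabetically."""
    counts = Counter(values)
    return sorted(counts, key=lambda label: (-counts[label], label))


def fit_sequential(columns):
    """Sort the labels of each column one after another."""
    return {name: sort_labels(values) for name, values in columns.items()}


def fit_parallel(columns, max_workers=4):
    """Submit one sorting task per column, then collect the results by column name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(sort_labels, values)
            for name, values in columns.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Illustrative data only: two small input columns.
columns = {
    "color": ["red", "blue", "red", "green", "blue", "red"],
    "size": ["M", "L", "M", "S"],
}

print(fit_parallel(columns))
```

Both functions return the same orderings; only the scheduling differs. In CPython, threads will not speed up CPU-bound sorting because of the global interpreter lock, so the sketch shows the task structure rather than a measured speedup. On the JVM, where Spark runs, the same one-task-per-column pattern can execute concurrently.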