Liang-Chi Hsieh created SPARK-28722:
---------------------------------------

             Summary: Change sequential label sorting in StringIndexer fit to parallel
                 Key: SPARK-28722
                 URL: https://issues.apache.org/jira/browse/SPARK-28722
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 3.0.0
            Reporter: Liang-Chi Hsieh


When there are multiple input columns, the fit method in StringIndexer sorts 
the labels of each column sequentially, one column after another. As the number 
of input columns increases, the time spent on label sorting increases 
dramatically, which makes the method hard to use in practice with hundreds of 
input columns.
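
To illustrate the idea, here is a minimal Python sketch, not Spark's actual
Scala implementation. It assumes each column's labels are ordered by descending
frequency with ties broken alphabetically, and the names sort_labels,
fit_sequential and fit_parallel are hypothetical. Because each column's sort is
independent of the others, the per-column work can be submitted as separate
tasks instead of running in a loop.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sort_labels(counts):
    """Order one column's labels by descending count, ties broken alphabetically."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ordered]


def fit_sequential(columns):
    """Sort the labels of each column one after another."""
    return {name: sort_labels(Counter(values)) for name, values in columns.items()}


def fit_parallel(columns, max_workers=4):
    """Sort the labels of each column as an independent task in a worker pool."""
    names = list(columns)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = pool.map(lambda n: sort_labels(Counter(columns[n])), names)
        return dict(zip(names, results))


# Toy input: column name -> raw string values (illustrative data only).
columns = {
    "color": ["red", "blue", "red", "green", "red", "blue"],
    "size": ["M", "L", "M", "S"],
}

labels_seq = fit_sequential(columns)
labels_par = fit_parallel(columns)
```

Both functions produce the same label ordering; only the scheduling differs.
In CPython, threads do not speed up CPU-bound sorting, so this sketch shows the
structure of the change rather than its performance benefit. On the JVM, where
Spark runs, independent sorts can execute on separate cores.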





--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

