Github user facaiy commented on the issue:

    https://github.com/apache/spark/pull/17383

Hi, since the work was done a long time ago, I reviewed it myself.

After careful review: because SparseVector is stored in compressed sparse row format, the only benefit of this PR would be reduced data storage, and that would come at the cost of performance. For tree methods, however, it is uncommon to handle features of very large dimension, so the PR does not satisfy me.

I prefer [SPARK-3717: DecisionTree, RandomForest: Partition by feature](https://issues.apache.org/jira/browse/SPARK-3717) as an alternative, which should benefit both performance and storage, if I understand correctly.

So the PR is closed. Thanks to everyone for the reviews and comments.
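
For readers who want the storage-versus-performance trade-off made concrete, here is a minimal sketch in plain Python. It is not Spark's code: `SparseVec` and its `get` method are hypothetical stand-ins, and the sketch only assumes that a sparse vector keeps sorted non-zero indices alongside their values, so that reading one feature needs a search rather than a direct array access.

```python
from bisect import bisect_left


class SparseVec:
    """Hypothetical sparse vector: sorted indices plus aligned values."""

    def __init__(self, size, indices, values):
        self.size = size
        self.indices = indices  # sorted positions of the non-zero entries
        self.values = values    # non-zero values, aligned with indices

    def get(self, i):
        # Binary search over the stored indices: O(log nnz) per lookup,
        # whereas a dense array answers the same question in O(1).
        pos = bisect_left(self.indices, i)
        if pos < len(self.indices) and self.indices[pos] == i:
            return self.values[pos]
        return 0.0  # any index that is not stored is an implicit zero


# Same 1000-dimensional vector in both layouts (illustrative data only).
dense = [0.0] * 1000
dense[3] = 1.5
dense[742] = -2.0
sparse = SparseVec(1000, [3, 742], [1.5, -2.0])

# Storage: 1000 slots for dense, 2 indices + 2 values for sparse.
dense_slots = len(dense)
sparse_slots = len(sparse.indices) + len(sparse.values)
```

The sparse layout stores far fewer numbers, but every feature read pays for a search, which matters when tree training reads feature values repeatedly.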