Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17383
Hi, since this work was finished a long time ago, I have reviewed it
myself.
After a careful review: since SparseVector is in compressed sparse row format,
the only benefit of the PR would be fo
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17383
Sure, @WeichenXu123, perhaps in one or two weeks. Is that OK?
By the way, I think using a sparse representation can only reduce memory
usage, and it does so at the cost of compute performance. Hen
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17383
@facaiy Can you run a benchmark first (by generating random test data),
so we can see how much of a speedup this gives?
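
For illustration only, here is a minimal sketch of what such a benchmark on random data might look like. It is plain Python rather than Spark code, it is not the benchmark used in this PR, and every name in it (`make_sparse`, `to_dense`, `sparse_get`, `bench`) is hypothetical. It assumes a sparse vector is stored as a sorted index array plus a value array, and it compares storage slots and per-feature lookup time against a dense list.

```python
import bisect
import random
import time


def make_sparse(dim, nnz, rng):
    """Build a random sparse vector as (sorted indices, values). Hypothetical helper."""
    indices = sorted(rng.sample(range(dim), nnz))
    values = [rng.random() for _ in indices]
    return indices, values


def to_dense(dim, indices, values):
    """Expand the sparse pair into a dense list with explicit zeros."""
    dense = [0.0] * dim
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


def sparse_get(indices, values, i):
    """Look up feature i by binary search; missing entries are implicit zeros."""
    pos = bisect.bisect_left(indices, i)
    if pos < len(indices) and indices[pos] == i:
        return values[pos]
    return 0.0


def bench(dim=10_000, nnz=100, lookups=100_000, seed=0):
    """Time random feature lookups on both layouts and report storage slots."""
    rng = random.Random(seed)
    indices, values = make_sparse(dim, nnz, rng)
    dense = to_dense(dim, indices, values)
    queries = [rng.randrange(dim) for _ in range(lookups)]

    t0 = time.perf_counter()
    dense_sum = sum(dense[i] for i in queries)
    t1 = time.perf_counter()
    sparse_sum = sum(sparse_get(indices, values, i) for i in queries)
    t2 = time.perf_counter()

    return {
        "dense_slots": dim,          # one stored value per dimension
        "sparse_slots": 2 * nnz,     # one index plus one value per non-zero
        "dense_seconds": t1 - t0,
        "sparse_seconds": t2 - t1,
        "sums_match": dense_sum == sparse_sum,  # sanity check on correctness
    }


if __name__ == "__main__":
    print(bench())
```

Timings from a sketch like this only indicate the general trade-off between storage and lookup cost; a real measurement would need to run against the actual Spark code path.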
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17383
Thank you for the comment.
Very good questions. At least for me, the answer to both questions is no. In
most cases, we feed dense raw data into a tree model. However, if large dimensions are
required,