GitHub user sethah commented on the issue:

    https://github.com/apache/spark/pull/17556
  
    I don't mind the weighted midpoints. However, if many training points for
a continuous feature share the exact same value, we are assuming that the test
set may contain points that are close to, but not exactly at, those values.
Since our training data was clustered at those particular values, that may not
be a good assumption. I could live with either method, but I have a slight
preference for matching the other libraries.
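
    To make the trade-off concrete, here is a minimal sketch of the two ways
to choose a split threshold between two adjacent distinct feature values. It
assumes "weighted midpoint" means a count-weighted average of the two values;
the function names and the example counts are hypothetical and are not taken
from the PR.

```python
def midpoint(left, right):
    """Plain midpoint between two adjacent distinct feature values."""
    return (left + right) / 2.0


def weighted_midpoint(left, left_count, right, right_count):
    """Count-weighted average of two adjacent distinct feature values.

    Assumption: this is one plausible reading of "weighted midpoint";
    the threshold is pulled toward the value with more training points.
    """
    total = left_count + right_count
    return (left * left_count + right * right_count) / total


# Hypothetical training data: 9 points at 1.0 and 1 point at 2.0.
plain = midpoint(1.0, 2.0)                       # 1.5
weighted = weighted_midpoint(1.0, 9, 2.0, 1)     # pulled toward 1.0

# A test point slightly above the cluster, e.g. 1.2, falls on the same side
# as the 1.0 cluster under the plain midpoint, but on the other side under
# the weighted threshold.
test_point = 1.2
same_side_plain = test_point <= plain
same_side_weighted = test_point <= weighted
```

    Under these assumptions, the weighted threshold sits close to the heavily
populated value, so whether it helps depends on whether test points really
land near, but not on, the clustered values.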


