GitHub user BryanCutler opened a pull request:

    https://github.com/apache/spark/pull/11832

    [SPARK-13963][ML] Adding binary toggle param to HashingTF

    ## What changes were proposed in this pull request?
    Adds a binary toggle parameter to ml.feature.HashingTF and to
    mllib.feature.HashingTF, since the former wraps the latter. When set to
    true, the parameter sets every non-zero term count to 1, turning
    term-count features into binary values that are well suited to discrete
    probability models.
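
For illustration only, the effect of the toggle can be sketched in plain Python. This is not the Spark implementation: the function name `hashing_tf`, its parameters, and the use of CRC32 as the hash function are all assumptions made for this sketch, and Spark's own hashing will assign different indices.

```python
import zlib
from collections import Counter


def hashing_tf(terms, num_features=1 << 10, binary=False):
    """Hypothetical helper: hash terms into a sparse {index: value} vector."""
    counts = Counter()
    for term in terms:
        # CRC32 is a stand-in hash chosen so the sketch is deterministic.
        index = zlib.crc32(term.encode("utf-8")) % num_features
        counts[index] += 1
    if binary:
        # Binary mode: every non-zero term count becomes 1.0.
        return {index: 1.0 for index in counts}
    return {index: float(count) for index, count in counts.items()}


doc = ["spark", "spark", "hashing", "tf"]
counts = hashing_tf(doc)               # raw term counts per hashed index
flags = hashing_tf(doc, binary=True)   # presence/absence per hashed index
```

Both calls produce the same set of non-zero indices; only the values differ, with the repeated term "spark" contributing a count of at least 2 in the first vector and exactly 1.0 in the second.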
    
    ## How was this patch tested?
    Added unit tests for ML and MLlib
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BryanCutler/spark binary-param-HashingTF-SPARK-13963

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11832
    
----
commit a5ff3309c0d07e57177374133130803eb98ebffb
Author: Bryan Cutler <cutl...@gmail.com>
Date:   2016-03-18T21:19:19Z

    [SPARK-13963] Adding binary toggle to HashingTF in ml/mllib

commit 31097231769860b86d1d3234ebf7d4e95f96e5cb
Author: Bryan Cutler <cutl...@gmail.com>
Date:   2016-03-18T21:19:48Z

    Added unit test for HashingTF binary toggle

commit ca1436166a1292f92d72408c10cf606623b31bbd
Author: Bryan Cutler <cutl...@gmail.com>
Date:   2016-03-18T21:26:34Z

    fixed param description text

----


