GitHub user yongtang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/12079#discussion_r58083774

    --- Diff: python/pyspark/ml/feature.py ---
    @@ -512,6 +512,16 @@ class HashingTF(JavaTransformer, HasInputCol, HasOutputCol, HasNumFeatures, Java
         .. versionadded:: 1.3.0
         """
    +    """
    +    Binary toggle to control term frequency counts.
    +    If true, all non-zero counts are set to 1. This is useful for discrete probabilistic
    +    models that model binary events rather than integer counts.
    +    (default = False)
    +    """
    +    binary = Param(Params._dummy(), "binary",
    +                   "Binary toggle to control term frequency counts",
    +                   typeConverter=TypeConverters.toBoolean)
    +
         @keyword_only
         def __init__(self, numFeatures=1 << 18, inputCol=None, outputCol=None):
    --- End diff --

    Thanks @yanboliang, I just updated the pull request to address the issues you raised. Let me know if there is anything else.
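
    For anyone following the thread, here is a rough, standalone sketch of what the binary toggle described in the docstring does to term-frequency counts. It is not the Spark implementation: the function name hashing_tf, the dict-based sparse output, and the use of zlib.crc32 as the hash are all assumptions made only for illustration.

```python
import zlib
from collections import Counter


def hashing_tf(terms, num_features=1 << 18, binary=False):
    """Hypothetical helper: map terms to a sparse {index: value} dict.

    zlib.crc32 is a stand-in hash picked because it is deterministic;
    it is not claimed to be the hash function Spark uses.
    """
    # Count how many terms land in each hashed bucket.
    counts = Counter(
        zlib.crc32(t.encode("utf-8")) % num_features for t in terms
    )
    if binary:
        # Binary toggle: every non-zero count is set to 1.
        return {idx: 1.0 for idx in counts}
    return {idx: float(c) for idx, c in counts.items()}


doc = ["a", "b", "a", "c", "a"]
tf_counts = hashing_tf(doc)               # integer counts per bucket
tf_binary = hashing_tf(doc, binary=True)  # presence/absence per bucket
```

    Both calls produce the same set of bucket indices; only the values differ, which is the behavior a model of binary events rather than integer counts would want.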