[ 
https://issues.apache.org/jira/browse/SPARK-26899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-26899.
-------------------------------
    Resolution: Not A Problem

This isn't "Major", I don't think a code comment counts as a doc problem, and 
I don't think it's wrong: it just states what a count-min sketch is for in 
general.

> CountMinSketchAgg ExpressionDescription is not so correct
> ---------------------------------------------------------
>
>                 Key: SPARK-26899
>                 URL: https://issues.apache.org/jira/browse/SPARK-26899
>             Project: Spark
>          Issue Type: Documentation
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: tomzhu
>            Priority: Major
>
> Hi all, there is a not-quite-correct comment in CountMinSketchAgg.scala. 
> The ExpressionDescription says:
> {code:java}
> @ExpressionDescription(
>       usage = """
>       _FUNC_(col, eps, confidence, seed) - Returns a count-min sketch of a column with the given esp,
>       confidence and seed. The result is an array of bytes, which can be deserialized to a
>       `CountMinSketch` before usage. Count-min sketch is a probabilistic data structure used for
>       cardinality estimation using sub-linear space.
>       """,
>       since = "2.2.0")
> {code}
> The description says *Count-min sketch is a probabilistic data structure 
> used for cardinality estimation*. Actually, count-min sketch is mainly used 
> for point queries and self-join size queries. How can it support 
> cardinality estimation? A fix might be better.
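
For readers unfamiliar with the structure under discussion, the point query
mentioned in the report can be illustrated with a minimal count-min sketch.
This is a sketch under stated assumptions, not Spark's implementation: the
class name, the SHA-256 based row hashing, and the depth/width values are all
chosen here for illustration only.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch (hypothetical, not Spark's class)."""

    def __init__(self, depth, width):
        self.depth = depth  # number of hash rows
        self.width = width  # counters per row
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Assumption: derive one hash per row by salting SHA-256 with the row id.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Point query: the minimum over rows never underestimates the true
        # count, but hash collisions may push it above the true count.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch(depth=5, width=272)
for word in ["a"] * 3 + ["b"] * 2 + ["c"]:
    sketch.add(word)

print(sketch.estimate("a"))  # an upper-biased estimate of how often "a" was added
```

Each row sums to the total number of items added, so the table answers "how
often did this item occur?" for a given item rather than storing the items
themselves.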



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)