AlexanderSaydakov commented on code in PR #35:
URL: https://github.com/apache/datasketches-bigquery/pull/35#discussion_r1757639709


##########
kll/sqlx/kll_sketch_float_get_cdf.sqlx:
##########
@@ -24,20 +24,36 @@ RETURNS ARRAY<FLOAT64>
 LANGUAGE js
 OPTIONS (
   library=["gs://$GCS_BUCKET/kll_sketch.js"],
-  description = '''Returns an approximation to the Cumulative Distribution Function (CDF), which is the cumulative analog of the PMF,
-of the input stream given a set of split points.
-Param sketch: the given sketch in serialized form.
+  description = '''Returns an approximation to the Cumulative Distribution Function (CDF)
+of the input stream as an array of cumulative probabilities defined by the given split_points.
+
+Param sketch: the given sketch as BYTES.
+
 Param split_points: an array of M unique, monotonically increasing values
-that divide the input domain into M+1 consecutive disjoint intervals.
-Param inclusive: if true the rank of a value includes its own weight,
-and therefore if the sketch contains values equal to a slit point,
-then in CDF such values are included into the interval to the left of split point.
-Otherwise they are included into the interval to the right of split point.
-Returns: an array of M+1 values which are a consecutive approximation to the CDF
-of the input stream given the split_points. The value at array position j of the returned
-CDF array is the sum of the returned values in positions 0 through j of the PMF array.
-This can be viewed as array of ranks of the given split points plus one more value that is always 1.
-For more details: https://datasketches.apache.org/docs/KLL/KLLSketch.html'''
+  (of the same type as the input values to the sketch)
+  that divide the input value domain into M+1 overlapping intervals.
+  
+  The start of each interval is below the lowest input value retained by the sketch
+  (corresponding to a zero rank or zero probability).
+  
+  The end of each interval is the associated split-point except for the top interval
+  where the end is the maximum input value of the stream.
+
+Param inclusive: if true and the upper boundary of an interval equals a value retained by the sketch, the interval will include that value.
+  If the lower boundary of an interval equals a value retained by the sketch, the interval will exclude that value.
+
+  If false and the upper boundary of an interval equals a value retained by the sketch, the interval will exclude that value.
+  If the lower boundary of an interval equals a value retained by the sketch, the interval will include that value.
+
+Returns: the CDF as a monotonically increasing FLOAT64 array of M+1 cumulative probablities on the interval (0.0, 1.0].

Review Comment:
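   Zero is included, isn't it? A split point below the smallest value retained by the sketch has rank 0, so the range would be [0.0, 1.0] rather than (0.0, 1.0].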
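
A minimal sketch of the case in question, assuming an exact CDF over a plain array rather than the KLL sketch itself. `exactCdf` is a hypothetical helper written only for this illustration; it is not part of `kll_sketch.js`.

```javascript
// Illustration only: exact empirical CDF over a small array.
// exactCdf is a hypothetical helper, not the library's implementation.
function exactCdf(values, splitPoints, inclusive) {
  const n = values.length;
  const cdf = splitPoints.map((sp) => {
    // Count the values at or below (inclusive) or strictly below (exclusive) the split point.
    const count = values.filter((v) => (inclusive ? v <= sp : v < sp)).length;
    return count / n;
  });
  cdf.push(1.0); // the final entry always covers the whole stream
  return cdf;
}

// The first split point (5) lies below every input value, so its cumulative probability is 0.
const result = exactCdf([10, 20, 30, 40], [5, 20], true);
console.log(result);
```

With M = 2 split points the array has M+1 = 3 entries, and the first entry is exactly 0, which is why a closed lower bound seems more accurate here.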



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

