Peng-Lei commented on pull request #33132:
URL: https://github.com/apache/spark/pull/33132#issuecomment-874496061


   > > Why are the changes needed?
   > > 35926
   > 
   > @Peng-Lei I would like to see why the changes are really needed. Could you 
explain that, please.
   > 
   > Just in case, I would like to propose to focus on feature parity with 
`CalendarIntervalType` in the release 3.2. So, users could smoothly replace it 
by new ANSI types. Could you re-check Spark's code base, and point out the 
places where we miss handling of the ANSI interval types.
   
   @MaxGekk Firstly, the `WIDTH_BUCKET` function assigns values to buckets 
(individual segments) in an equiwidth histogram. The ANSI SQL Standard syntax 
is `WIDTH_BUCKET(expression, min, max, buckets)`. 
[Reference](https://www.oreilly.com/library/view/sql-in-a/9780596155322/re91.html).
 Secondly, `WIDTH_BUCKET` currently only supports `Double`. Of course, we can 
cast `Int` to `Double` to use it, but we cannot cast `YearMonthIntervalType` to 
`Double`. Finally, I think it has a real use case, e.g. a histogram of 
employees' years of service, where `years of service` is a column of 
`YearMonthIntervalType`.
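   For reference, the standard bucket-assignment semantics can be sketched in plain Scala (this is illustrative only, independent of Spark's actual `WidthBucket` expression; the object and method names are mine):

```scala
// A minimal sketch of ANSI WIDTH_BUCKET semantics for Double inputs,
// the only type Spark's width_bucket supports today.
object WidthBucketSketch {
  // Returns the 1-based index of the equiwidth bucket containing `value`
  // over the range [min, max) split into `numBuckets` buckets.
  // Per the SQL standard, values below min go to bucket 0 and values
  // at or above max go to bucket numBuckets + 1.
  def widthBucket(value: Double, min: Double, max: Double, numBuckets: Long): Long = {
    require(numBuckets > 0, "number of buckets must be positive")
    if (value < min) 0L
    else if (value >= max) numBuckets + 1
    else (((value - min) / (max - min)) * numBuckets).toLong + 1
  }
}
```

   Supporting `YearMonthIntervalType` would amount to mapping an interval to its underlying month count before this computation.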
   Maybe I'm wrong. When I studied the `WIDTH_BUCKET` code, I wondered if it 
could support `YearMonthIntervalType`. Thank you for reminding me. I will check 
whether any handling of the ANSI interval types is missing, and test the ANSI 
interval type features that were added.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


