I don't think it would be a release blocker. I think we can implement them across multiple releases.

On Fri, May 26, 2023 at 1:01 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Thank you for the proposal.
>
> I'm wondering if we are going to consider them as release blockers or not.
>
> In general, I don't think those SQL functions should be available in all
> languages as release blockers.
> (Especially in R or new Spark Connect languages like Go and Rust).
>
> If they are not release blockers, we may allow some existing or future
> community PRs only before feature freeze (= branch cut).
>
> Thanks,
> Dongjoon.
>
>
> On Wed, May 24, 2023 at 7:09 PM Jia Fan <fan...@apache.org> wrote:
>
>> +1
>> It is important that different APIs can be used to call the same function.
>>
>> On Thu, May 25, 2023 at 01:48, Ryan Berti <rbe...@netflix.com.invalid> wrote:
>>
>>> During my recent experience developing functions, I found that
>>> identifying locations (sql + connect functions.scala + functions.py,
>>> FunctionRegistry, + whatever is required for R) and standards for adding
>>> function signatures was not straightforward (should you use optional args
>>> or overload functions? which col/lit helpers should be used when?). Are
>>> there docs describing all of the locations + standards for defining a
>>> function? If not, that'd be great to have too.
>>>
>>> Ryan Berti
>>>
>>> Senior Data Engineer | Ads DE
>>>
>>>
>>> On Wed, May 24, 2023 at 12:44 AM Enrico Minack <i...@enrico.minack.dev>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> Functions available in SQL (more generally, in one API) should be
>>>> available in all APIs. I am very much in favor of this.
>>>>
>>>> Enrico
>>>>
>>>>
>>>> On 24.05.23 at 09:41, Hyukjin Kwon wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I would like to discuss adding all SQL functions into the Scala, Python
>>>> and R APIs.
>>>> We have around 175 SQL functions that do not exist in Scala, Python
>>>> and R.
>>>> For example, we don’t have pyspark.sql.functions.percentile but you
>>>> can invoke
>>>> it as a SQL function, e.g., SELECT percentile(...).
>>>>
>>>> The reason why we do not have all functions in the first place is that
>>>> we wanted to
>>>> only add commonly used functions, see also
>>>> https://github.com/apache/spark/pull/21318 (which I agreed with at that
>>>> time)
>>>>
>>>> However, this has been raised multiple times over the years, from the OSS
>>>> community, dev mailing list, JIRAs, stackoverflow, etc.
>>>> It seems to be confusing which functions are available and which are not.
>>>>
>>>> Yes, we have a workaround. We can call all expressions by expr("...")
>>>> or call_udf("...", Columns ...)
>>>> But it still seems that it’s not very user-friendly because users expect
>>>> them to be available under the functions namespace.
>>>>
>>>> Therefore, I would like to propose adding all expressions into all
>>>> languages so that Spark is simpler and less confusing, e.g., about which
>>>> API is in functions or not.
>>>>
>>>> Any thoughts?
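
For readers who have not used the expr("...") workaround mentioned in the original proposal, here is a minimal sketch of what it looks like from Python. The helper `sql_call`, the column name `value`, and the DataFrame `df` are assumptions made up for illustration and are not part of Spark; only `pyspark.sql.functions.expr` and the SQL function `percentile` come from the thread. The helper only builds the SQL expression string, so it runs without Spark installed.

```python
def sql_call(name, *args):
    """Build a SQL expression string such as 'percentile(value, 0.5)'.

    Hypothetical helper for illustration only; not part of Spark.
    Arguments are inserted verbatim, so pass column names or SQL literals.
    """
    return f"{name}({', '.join(str(a) for a in args)})"


# Expression for a SQL function that (per the thread) has no direct
# counterpart in pyspark.sql.functions.
expression = sql_call("percentile", "value", 0.5)
print(expression)

# With PySpark installed and a DataFrame `df` that has a numeric column
# `value` (both assumed here), the string could be passed to expr:
#
#     from pyspark.sql.functions import expr
#     df.select(expr(expression))
```

The point of the proposal is that users would not need this string-based detour if every SQL function were also exposed under the functions namespace.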