I don't think it'd be a release blocker. I think we can implement them
across multiple releases.

On Fri, May 26, 2023 at 1:01 AM Dongjoon Hyun <dongjoon.h...@gmail.com>
wrote:

> Thank you for the proposal.
>
> I'm wondering if we are going to consider them as release blockers or not.
>
> In general, I don't think making those SQL functions available in all
> languages should be a release blocker.
> (Especially in R or new Spark Connect languages like Go and Rust).
>
> If they are not release blockers, we may allow some existing or future
> community PRs only before feature freeze (= branch cut).
>
> Thanks,
> Dongjoon.
>
>
> On Wed, May 24, 2023 at 7:09 PM Jia Fan <fan...@apache.org> wrote:
>
>> +1
>> It is important that different APIs can be used to call the same function.
>>
>> Ryan Berti <rbe...@netflix.com.invalid> wrote on Thu, May 25, 2023 at 01:48:
>>
>>> During my recent experience developing functions, I found that
>>> identifying locations (sql + connect functions.scala + functions.py,
>>> FunctionRegistry, + whatever is required for R) and standards for adding
>>> function signatures was not straightforward (should you use optional args
>>> or overload functions? which col/lit helpers should be used when?). Are
>>> there docs describing all of the locations + standards for defining a
>>> function? If not, that'd be great to have too.
>>>
>>> Ryan Berti
>>>
>>> Senior Data Engineer  |  Ads DE
>>>
>>>
>>> On Wed, May 24, 2023 at 12:44 AM Enrico Minack <i...@enrico.minack.dev>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> Functions available in SQL (or, more generally, in one API) should be
>>>> available in all APIs. I am very much in favor of this.
>>>>
>>>> Enrico
>>>>
>>>>
>>>> On 24.05.23 at 09:41, Hyukjin Kwon wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I would like to discuss adding all SQL functions to the Scala, Python and
>>>> R APIs.
>>>> Around 175 SQL functions do not exist in Scala, Python or R.
>>>> For example, we don’t have pyspark.sql.functions.percentile, but you
>>>> can invoke it as a SQL function, e.g., SELECT percentile(...).
>>>>
>>>> The reason we do not have all functions in the first place is that
>>>> we wanted to add only commonly used functions; see also
>>>> https://github.com/apache/spark/pull/21318 (which I agreed with at that
>>>> time).
>>>>
>>>> However, this has been raised multiple times over the years, from the OSS
>>>> community, dev mailing list, JIRAs, Stack Overflow, etc.
>>>> It seems to be confusing which functions are available and which are not.
>>>>
>>>> Yes, we have a workaround. We can call all expressions by expr("...")
>>>> or call_udf("...", Columns ...).
>>>> But it still does not seem very user-friendly, because users expect
>>>> them to be available under the functions namespace.
>>>>
>>>> Therefore, I would like to propose adding all expressions to all
>>>> languages so that Spark is simpler and less confusing, e.g., about which
>>>> API is in functions and which is not.
>>>>
>>>> Any thoughts?
>>>>
>>>>
>>>>
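
For readers of the archive, the expr("...") workaround mentioned in the original proposal might look roughly like the sketch below in PySpark. This is an illustration only, not code from the thread: the helper names percentile_sql and select_percentile are hypothetical, and the sketch assumes a PySpark install where percentile is reachable only as a SQL function.

```python
from typing import Any


def percentile_sql(column: str, percentage: float) -> str:
    """Build the SQL text for a percentile(column, percentage) call.

    Hypothetical helper: it only formats a string, so it needs no Spark.
    """
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be between 0.0 and 1.0")
    return f"percentile({column}, {percentage})"


def select_percentile(df: Any, column: str, percentage: float) -> Any:
    """Return a one-row DataFrame holding the requested percentile.

    `df` is assumed to be a pyspark.sql.DataFrame. PySpark is imported
    lazily so the string helper above stays usable without Spark installed.
    """
    # expr() parses a SQL expression string into a Column; this is the
    # workaround the thread refers to for functions that are missing from
    # pyspark.sql.functions.
    from pyspark.sql.functions import expr

    return df.select(expr(percentile_sql(column, percentage)).alias("pct"))
```

Given an active SparkSession named spark, a call such as select_percentile(spark.range(1, 11), "id", 0.5).show() would exercise the workaround. The proposal in the thread would make the string-building step unnecessary by exposing such functions directly under the functions namespace.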
