Hi Kant,

Set udfDeterministic to false if the results of your UDF are
non-deterministic, for example if they depend on random numbers. The Catalyst
optimizer will then not cache and reuse the UDF's results.
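
Here is a toy sketch of why the flag matters. It is plain Python with no Spark involved: evaluate_twice and its deterministic argument are made up for illustration and are not Spark APIs, and the real optimizer is far more involved than this.

```python
import random


def evaluate_twice(func, arg, deterministic):
    # Hypothetical helper, not a Spark API: mimics a plan in which the same
    # expression func(arg) appears twice.
    first = func(arg)
    # If the function is declared deterministic, the first result can be
    # reused; otherwise the function has to be invoked again.
    second = first if deterministic else func(arg)
    return first, second


def add_noise(x):
    # Non-deterministic: the result depends on a random number.
    return x + random.random()


if __name__ == "__main__":
    # Wrongly declared deterministic: one draw, reused for both values.
    print(evaluate_twice(add_noise, 1.0, deterministic=True))
    # Declared non-deterministic: the function is called once per use.
    print(evaluate_twice(add_noise, 1.0, deterministic=False))
```

As far as I know, on the PySpark side (2.3 and later) this is the flag that calling asNondeterministic() on a UDF switches to false.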

On Mon, Apr 2, 2018 at 12:11 PM, kant kodali <kanth...@gmail.com> wrote:

> Looks like there is spark.udf().registerPython() like below.
>
> public void registerPython(java.lang.String name,
> org.apache.spark.sql.execution.python.UserDefinedPythonFunction udf)
>
>
> Can anyone describe what the *udfDeterministic* parameter does in the method
> signature below?
>
> public UserDefinedPythonFunction(java.lang.String name, 
> org.apache.spark.api.python.PythonFunction func, 
> org.apache.spark.sql.types.DataType dataType, int pythonEvalType, boolean 
> udfDeterministic) { /* compiled code */ }
>
>
> On Sun, Apr 1, 2018 at 3:46 PM, kant kodali <kanth...@gmail.com> wrote:
>
>> Hi All,
>>
>> All of our Spark code is in Java. Is there a way to register Python UDFs
>> using the Java API such that the registered UDFs can be used from raw
>> Spark SQL?
>> If there is any other way to achieve this goal please suggest!
>>
>> Thanks
>>
>>
>
