Is there a way to register a Python UDF using the Java API?

2018-04-01 Thread kant kodali
Hi All,

All of our Spark code is in Java. Is there a way to register Python UDFs
using the Java API so that the registered UDFs can be used from raw
Spark SQL?
If there is any other way to achieve this goal, please suggest it!

Thanks


Re: Is there a way to register a Python UDF using the Java API?

2018-04-02 Thread kant kodali
It looks like there is spark.udf().registerPython(), with the signature below.

public void registerPython(java.lang.String name,
org.apache.spark.sql.execution.python.UserDefinedPythonFunction udf)


Can anyone describe what the *udfDeterministic* parameter does in the
constructor signature below?

public UserDefinedPythonFunction(java.lang.String name,
org.apache.spark.api.python.PythonFunction func,
org.apache.spark.sql.types.DataType dataType, int pythonEvalType,
boolean udfDeterministic) { /* compiled code */ }


Re: Is there a way to register a Python UDF using the Java API?

2018-04-02 Thread Bryan Cutler
Hi Kant,

The udfDeterministic flag should be set to false if the results of your UDF
are non-deterministic, for example if they depend on random numbers, so that
the Catalyst optimizer will not cache and reuse results.
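
To make the idea concrete, here is a toy sketch in plain Java. It is not Spark
code and does not show how Catalyst is implemented; ToyUdf, evaluate and
invocationsFor are made-up names used only for illustration. It models an
engine that may reuse an earlier result for the same input when a function is
flagged deterministic, and that must call the function every time when it is
not.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Toy model only: none of these names exist in Spark.
final class DeterminismSketch {

    /** A function body plus a flag playing the role of udfDeterministic. */
    static final class ToyUdf {
        final Function<Integer, Integer> body;
        final boolean deterministic;

        ToyUdf(Function<Integer, Integer> body, boolean deterministic) {
            this.body = body;
            this.deterministic = deterministic;
        }
    }

    /** Applies the UDF to each input, reusing results only if the flag allows it. */
    static int[] evaluate(ToyUdf udf, int[] inputs) {
        Map<Integer, Integer> seen = new HashMap<>();
        int[] out = new int[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            int in = inputs[i];
            if (udf.deterministic) {
                // Same input always gives the same output, so reuse is safe.
                out[i] = seen.computeIfAbsent(in, udf.body);
            } else {
                // Output may differ per call, so every row needs its own call.
                out[i] = udf.body.apply(in);
            }
        }
        return out;
    }

    /** Counts how often the body runs for the inputs {7, 7, 7}. */
    static int invocationsFor(boolean deterministic) {
        AtomicInteger calls = new AtomicInteger();
        ToyUdf udf = new ToyUdf(x -> {
            calls.incrementAndGet();
            return x * 2;
        }, deterministic);
        evaluate(udf, new int[] {7, 7, 7});
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("deterministic=true  -> " + invocationsFor(true) + " call(s)");
        System.out.println("deterministic=false -> " + invocationsFor(false) + " call(s)");
    }
}
```

In this toy model the body runs once for three identical inputs when the flag
is true and three times when it is false. If the body drew a random number, the
first case would wrongly hand the same value to every row, which is the kind of
reuse the flag is meant to rule out.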
