Re: use WithColumn with external function in a java jar

2017-08-29 Thread purna pradeep
Thanks, I'll check it out.

On Mon, Aug 28, 2017 at 10:22 PM, Praneeth Gayam wrote:

> You can create a UDF which will invoke your Java lib:
>
> def calculateExpense: UserDefinedFunction = udf((pexpense: String, cexpense: String) =>
>   new MyJava().calculateExpense(pexpense.toDouble, cexpense.toDouble))
>


Re: use WithColumn with external function in a java jar

2017-08-28 Thread Praneeth Gayam
You can create a UDF which will invoke your Java lib:

import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.udf

def calculateExpense: UserDefinedFunction = udf((pexpense: String, cexpense: String) =>
  new MyJava().calculateExpense(pexpense.toDouble, cexpense.toDouble))
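For reference, a minimal sketch of how this UDF could then be applied with withColumn, assuming a DataFrame `df` that already contains the string columns `pexpense` and `cexpense` (the column names come from the thread; the helper name is illustrative):

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Illustrative helper: df is assumed to already hold the string columns pexpense and cexpense
def addExpense(df: DataFrame): DataFrame =
  df.withColumn("expense", calculateExpense(col("pexpense"), col("cexpense")))

Note that `toDouble` will throw on null or non-numeric strings, so the inputs may need to be validated (or the UDF made null-safe) before running this on real data.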

use WithColumn with external function in a java jar

2017-08-28 Thread purna pradeep
I have data in a DataFrame with the below columns:

1) File format is CSV
2) All the below columns' datatypes are String

employeeid,pexpense,cexpense

Now I need to create a new DataFrame which has a new column called `expense`,
calculated from the columns `pexpense` and `cexpense`.
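
(For context, a minimal sketch of how such a DataFrame might be loaded; the file path is hypothetical and the header option assumes the CSV has a header row:)

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("expense-example").getOrCreate()

// Without an explicit schema or inferSchema, every CSV column is read as a String
val df = spark.read.option("header", "true").csv("/path/to/expenses.csv")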

The tricky part is that the calculation is not a **UDF** I created, but an external
function that needs to be imported from a Java library; it takes primitive types as
arguments - in this case `pexpense` and `cexpense` - to calculate the value required
for the new column.

The external function signature:

public class MyJava {

    public Double calculateExpense(Double pexpense, Double cexpense) {
        // calculation
    }

}

So how can I invoke that external function to create a new calculated column? Can I
register that external function as a UDF in my Spark application?
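
(For reference, a thin Scala wrapper around the Java method can indeed be registered as a named UDF; a minimal sketch, assuming Spark 2.x, the `spark`/`df` values from the sketch above, and an illustrative registration name:)

import org.apache.spark.sql.functions.expr

// Register a Scala closure that delegates to the external Java method
spark.udf.register("calculateExpense", (pexpense: String, cexpense: String) =>
  new MyJava().calculateExpense(pexpense.toDouble, cexpense.toDouble))

// The registered name can then be used in SQL-style expressions as well as the DataFrame API
val result = df.withColumn("expense", expr("calculateExpense(pexpense, cexpense)"))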

Stack Overflow reference:

https://stackoverflow.com/questions/45928007/use-withcolumn-with-external-function