I have data in a DataFrame with the columns below.
1) The file format is CSV
2) All of the columns below have datatype String
employeeid,pexpense,cexpense
Now I need to create a new DataFrame that has a new column called `expense`,
which is calculated from the columns `pexpense` and `cexpense`.
The tricky part is that the calculation is not a **UDF** that I wrote myself.
It is an external function that has to be imported from a Java library and
that takes primitive types as arguments (in this case `pexpense` and
`cexpense`) to calculate the value required for the new column.
The external function signature:

    public class MyJava
    {
        public Double calculateExpense(Double pexpense, Double cexpense) {
            // calculation
        }
    }

So how can I invoke that external function to create the new calculated
column? Can I register that external function as a UDF in my Spark
application?
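
To make the question concrete, here is a minimal sketch in plain Java of the adapter I imagine would sit between the String columns and the library call. `ExpenseAdapter` and `expenseFromStrings` are hypothetical names, the `MyJava` class here is only a stand-in for the real library, and the sum inside it is a placeholder because the actual calculation is not something I can show. Returning `null` for missing or unparseable values is also just an assumption about how bad input should be handled.

```java
// Stand-in for the external library class; the real calculation is unknown,
// so the sum below is only a placeholder assumption.
class MyJava {
    public Double calculateExpense(Double pexpense, Double cexpense) {
        return pexpense + cexpense; // placeholder, not the real algorithm
    }
}

// Hypothetical adapter: converts the String column values to Double and
// delegates to the external method.
class ExpenseAdapter {
    static Double expenseFromStrings(String pexpense, String cexpense) {
        // Missing values cannot be parsed, so propagate null instead.
        if (pexpense == null || cexpense == null) {
            return null;
        }
        try {
            return new MyJava().calculateExpense(
                    Double.valueOf(pexpense.trim()),
                    Double.valueOf(cexpense.trim()));
        } catch (NumberFormatException e) {
            // Unparseable text (for example "abc") also yields null.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(expenseFromStrings("100.5", "20")); // prints 120.5
        System.out.println(expenseFromStrings("abc", "20"));   // prints null
    }
}
```

If this is the right direction, I assume a function of this shape could be passed to `spark.udf().register(...)` as a `UDF2<String, String, Double>` with `DataTypes.DoubleType` as the return type, and then applied with `withColumn("expense", callUDF(...))`. I have not verified that, and I am unsure whether the library class needs to be serializable or can simply be instantiated inside the function as above.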
Stack Overflow reference:
https://stackoverflow.com/questions/45928007/use-withcolumn-with-external-function