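You can create a UDF that invokes your Java library:

    import org.apache.spark.sql.expressions.UserDefinedFunction
    import org.apache.spark.sql.functions.udf

    // The columns are Strings, so convert them before calling the Java method.
    def calculateExpense: UserDefinedFunction =
      udf((pexpense: String, cexpense: String) =>
        new MyJava().calculateExpense(pexpense.toDouble, cexpense.toDouble))
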
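Since every column in the CSV is a String, the conversion inside the UDF will throw on a null or non-numeric cell. Here is a minimal sketch of a helper the UDF body could delegate to instead. `ExpenseAdapter`, `parse`, and `safeExpense` are hypothetical names, and the body of `MyJava.calculateExpense` is only a placeholder (a sum), because the real calculation is not shown in this thread. Returning null for bad input is an assumption about what you want; Spark should then surface it as a null in the new column.

```java
import java.util.Optional;

// Stand-in for the external library class from the question. The real
// calculation is not shown in the thread; summing is only a placeholder.
class MyJava {
    public Double calculateExpense(Double pexpense, Double cexpense) {
        return pexpense + cexpense; // placeholder, not the real algorithm
    }
}

// Hypothetical helper: converts the String column values, then calls the library.
class ExpenseAdapter {
    // Parses one CSV cell; empty when the cell is null, blank,
    // or cannot be parsed as a double.
    static Optional<Double> parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            return Optional.of(Double.parseDouble(raw.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Returns null if either input is unusable, otherwise the library result.
    static Double safeExpense(String pexpense, String cexpense) {
        Optional<Double> p = parse(pexpense);
        Optional<Double> c = parse(cexpense);
        if (!p.isPresent() || !c.isPresent()) {
            return null;
        }
        return new MyJava().calculateExpense(p.get(), c.get());
    }
}
```

With something like this on the classpath, the lambda passed to `udf` would call `ExpenseAdapter.safeExpense(pexpense, cexpense)` rather than converting the strings itself.
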
On Tue, Aug 29, 2017 at 6:53 AM, purna pradeep <purna2prad...@gmail.com> wrote:

> I have data in a DataFrame with below columns
>
> 1) Fileformat is csv
> 2) All below column datatypes are String
>
> employeeid,pexpense,cexpense
>
> Now I need to create a new DataFrame which has new column called
> `expense`, which is calculated based on columns `pexpense`, `cexpense`.
>
> The tricky part is the calculation algorithm is not an **UDF** function
> which I created, but it's an external function that needs to be imported
> from a Java library which takes primitive types as arguments - in this case
> `pexpense`, `cexpense` - to calculate the value required for new column.
>
> The external function signature
>
> public class MyJava
> {
>     public Double calculateExpense(Double pexpense, Double cexpense) {
>         // calculation
>     }
> }
>
> So how can I invoke that external function to create a new calculated
> column. Can I register that external function as UDF in my Spark
> application?
>
> Stackoverflow reference
>
> https://stackoverflow.com/questions/45928007/use-withcolumn-with-external-function