Thanks, I'll check it out.

On Mon, Aug 28, 2017 at 10:22 PM Praneeth Gayam <praneeth.ga...@gmail.com> wrote:
> You can create a UDF which will invoke your Java lib:
>
> def calculateExpense: UserDefinedFunction = udf((pexpense: String, cexpense: String) =>
>   new MyJava().calculateExpense(pexpense.toDouble, cexpense.toDouble))
>
> On Tue, Aug 29, 2017 at 6:53 AM, purna pradeep <purna2prad...@gmail.com> wrote:
>
>> I have data in a DataFrame with the below columns:
>>
>> employeeid, pexpense, cexpense
>>
>> 1) The file format is CSV
>> 2) All of the above columns have String datatype
>>
>> Now I need to create a new DataFrame which has a new column called
>> `expense`, which is calculated from the columns `pexpense` and `cexpense`.
>>
>> The tricky part is that the calculation algorithm is not a **UDF** that I
>> created, but an external function that needs to be imported from a Java
>> library, and it takes primitive types as arguments - in this case
>> `pexpense` and `cexpense` - to calculate the value required for the new
>> column.
>>
>> The external function signature:
>>
>> public class MyJava {
>>     public Double calculateExpense(Double pexpense, Double cexpense) {
>>         // calculation
>>     }
>> }
>>
>> So how can I invoke that external function to create a new calculated
>> column? Can I register that external function as a UDF in my Spark
>> application?
>>
>> Stack Overflow reference:
>>
>> https://stackoverflow.com/questions/45928007/use-withcolumn-with-external-function
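For reference, the suggested pattern can be sketched end to end as below. This is a minimal sketch, not the original poster's code: the body of `calculateExpense` is elided in the thread, so a placeholder sum is assumed here, and the Spark-specific lines are left as comments since they need `spark-sql` on the classpath.

```scala
// Stand-in for the external Java class; the real calculation is not shown
// in the thread, so a simple sum is assumed here purely for illustration.
class MyJava {
  def calculateExpense(pexpense: Double, cexpense: Double): Double =
    pexpense + cexpense
}

object ExpenseUdfSketch {
  // The function the UDF wraps, kept as a plain Scala function so it can
  // be tested without a SparkSession. It mirrors the lambda in the reply:
  // parse the String columns to Double, then delegate to the Java lib.
  def expense(pexpense: String, cexpense: String): Double =
    new MyJava().calculateExpense(pexpense.toDouble, cexpense.toDouble)

  // In a Spark application (requires org.apache.spark:spark-sql):
  //
  //   import org.apache.spark.sql.functions.{col, udf}
  //   val calculateExpense = udf(expense _)
  //   val result = df.withColumn("expense",
  //     calculateExpense(col("pexpense"), col("cexpense")))
  //
  // `result` then has the original String columns plus the computed
  // Double column `expense`.
}
```

Note that despite the question's wording, `Double` in the Java signature is the boxed type, not a primitive; Spark's `udf` handles the Scala `Double` mapping either way.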