One thing I noticed inside the UDF is that the original column names from
the data frame have disappeared and the columns are called col1, col2, etc.
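
A possible workaround, sketched in plain Scala with no Spark dependency: since the struct is built from df.columns in order, a value can be looked up by position instead of by name. The names columns, indexOf and valueOf are hypothetical stand-ins, and the Seq[Double] only models a Row; in Spark the equivalent read would presumably be r.getDouble(i), which I have not verified against this job.

```scala
// Plain-Scala model of positional lookup; not Spark code.
object PositionalLookup {
  // Hypothetical stand-in for df.columns
  val columns: Array[String] = Array("x1", "x2", "x3")

  // Map each original column name to its position in the struct
  val indexOf: Map[String, Int] = columns.zipWithIndex.toMap

  // Hypothetical stand-in for a Row: values arrive in column order
  def valueOf(row: Seq[Double], name: String): Double = row(indexOf(name))

  def main(args: Array[String]): Unit = {
    val row = Seq(0.5, 1.5, 2.5)
    println(valueOf(row, "x2")) // prints 1.5
  }
}
```

The index map would be built once outside the UDF and captured by the closure, so the lookup no longer depends on the field names seen inside it.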

Regards
Meeraj

On Sat, Nov 26, 2016 at 7:31 PM, Meeraj Kunnumpurath <
mee...@servicesymphony.com> wrote:

> Hello,
>
> I have a dataset of features on which I want to compute the likelihood
> value for implementing gradient ascent for estimating coefficients. I have
> written a UDF that computes the probability function on each feature as
> shown below.
>
> def getLikelihood(cfs : List[(String, Double)], df: DataFrame) = {
>   val pr = udf((r: Row) => {
>     cfs.foldLeft(0.0)((x, y) => x * 1 / Math.pow(Math.E, r.getAs[Double](y._1) * y._2))
>   })
>   df.withColumn("probabibility", pr(struct(df.columns.map(df(_)) : _*))).agg(sum('probabibility)).first.get(0)
> }
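
Separately from the compile error, the fold in the UDF may be worth a second look. The sketch below is plain Scala with no Spark dependency and uses a hypothetical Map in place of the Row, with made-up coefficients. It suggests that a product seeded with 0.0 stays at 0.0 for ordinary finite inputs, whereas 1.0 is the multiplicative identity. Whether a product is the intended computation is an assumption on my part.

```scala
// Plain-Scala model of the fold inside the UDF; not Spark code.
object LikelihoodFold {
  // Product over the coefficients of 1 / e^(x * c), starting from `seed`.
  def product(seed: Double,
              cfs: List[(String, Double)],
              row: Map[String, Double]): Double =
    cfs.foldLeft(seed)((acc, c) => acc * 1 / math.pow(math.E, row(c._1) * c._2))

  def main(args: Array[String]): Unit = {
    val cfs = List("x1" -> 0.0, "x2" -> 0.0) // hypothetical coefficients
    val row = Map("x1" -> 2.0, "x2" -> 3.0)  // hypothetical feature values
    println(product(0.0, cfs, row)) // prints 0.0: a zero seed zeroes the product
    println(product(1.0, cfs, row)) // prints 1.0: every term is 1 / e^0
  }
}
```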
>
> When I run it I get a long exception trace listing some generated code, as
> shown below.
>
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 2445, Column 34: Expression "scan_isNull1" is not an rvalue
> at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:10174)
> at org.codehaus.janino.UnitCompiler.toRvalueOrCompileException(UnitCompiler.java:6036)
> at org.codehaus.janino.UnitCompiler.getConstantValue2(UnitCompiler.java:4440)
> at org.codehaus.janino.UnitCompiler.access$9900(UnitCompiler.java:185)
> at org.codehaus.janino.UnitCompiler$11.visitAmbiguousName(UnitCompiler.java:4417)
>
> This is line 2445 in the generated code:
>
> /* 2445 */     Object project_arg = scan_isNull1 ? null : project_converter2.apply(scan_value1);
>
> Many thanks
>
>
>
> --
> *Meeraj Kunnumpurath*
> *Director and Executive Principal*
> *Service Symphony Ltd*
> *00 44 7702 693597*
> *00 971 50 409 0169*
> *mee...@servicesymphony.com*
>


