Haha ok, it's one of those days. Array isn't valid. RTFM, and it says a
Catalyst array maps to a Scala Seq, which makes sense.
So it works! Two follow-up questions:
1 - Is this the best approach?
2 - What if I want my expression to return multiple rows? - my binary
classification model gives me a
Not sure how that would work. Really, I want to tack an extra column onto
the DF with a UDF that can take a Row object.
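
For what it's worth, here is a minimal sketch of the "UDF that takes a Row" idea. It assumes a Spark version in which a struct column is handed to a Scala UDF as a Row. RowScoring, score and withScore are hypothetical names, and score is only a stand-in for a real model (it just sums the numeric fields).

```scala
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.sql.functions.{col, struct, udf}

object RowScoring {
  // Hypothetical stand-in for a real model: sums every numeric field of the row.
  def score(row: Row): Double =
    (0 until row.length).map { i =>
      row.get(i) match {
        case n: java.lang.Number => n.doubleValue()
        case _                   => 0.0 // nulls and non-numeric fields are ignored
      }
    }.sum

  // Packs all columns into a single struct column; the UDF receives it as a Row.
  def withScore(df: DataFrame): DataFrame = {
    val scoreUdf = udf((row: Row) => score(row))
    df.withColumn("score", scoreUdf(struct(df.columns.map(col): _*)))
  }
}
```

Usage would be something like RowScoring.withScore(df), which returns the original columns plus a "score" column. Wrapping the columns in struct sidesteps the fact that a UDF has a fixed number of arguments.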
On Tue, Sep 8, 2015 at 1:54 AM, Jörn Franke wrote:
> Can you use a map or list with different properties as one parameter?
> Alternatively a string
Sorry for the spam - I had some success:

    case class ScoringDF(function: Row => Double) extends Expression {
      val dataType = DataTypes.DoubleType
      override type EvaluatedType = Double
      override def eval(input: Row): EvaluatedType = {
        function(input)
      }
      override def nullable: Boolean =
So basically I need something like:

    df.withColumn("score", new Column(new Expression {
      ...
      def eval(input: Row = null): EvaluatedType = myModel.score(input)
      ...
    }))
But I can't do this, so how can I make a UDF, or something like it, that can
take in a Row and pass back a double value or some
Is it possible to have a UDF which takes a variable number of arguments?
e.g. df.select(myUdf($"*")) fails with
org.apache.spark.sql.AnalysisException: unresolved operator 'Project
[scalaUDF(*) AS scalaUDF(*)#26];
What I would like to do is pass in a generic data frame which can then be
passed
Can you use a map or list with different properties as one parameter?
Alternatively a string where parameters are Comma-separated...
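
A rough sketch of the "list as one parameter" suggestion, in case it helps: wrap the columns with array(...) and declare the UDF parameter as a Seq, since a Catalyst array reaches Scala as a Seq rather than an Array. This assumes every feature column is a non-null DoubleType column. SeqScoring, score and withScore are made-up names, and score (a plain mean) is only a placeholder for a real model.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{array, col, udf}

object SeqScoring {
  // Hypothetical placeholder for a real model: the mean of the features.
  def score(features: Seq[Double]): Double =
    if (features.isEmpty) 0.0 else features.sum / features.size

  // Assumes all featureCols are non-null DoubleType columns.
  def withScore(df: DataFrame, featureCols: Seq[String]): DataFrame = {
    // The parameter type is Seq[Double], not Array[Double].
    val scoreUdf = udf((features: Seq[Double]) => score(features))
    df.withColumn("score", scoreUdf(array(featureCols.map(col): _*)))
  }
}
```

Calling SeqScoring.withScore(df, df.columns) would then score over however many columns the frame happens to have, which gets close to a variable-argument UDF without needing myUdf($"*").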
On Mon, Sep 7, 2015 at 8:35, Night Wolf wrote:
> Is it possible to have a UDF which takes a variable number of arguments?
>
> e.g.