[ 
https://issues.apache.org/jira/browse/SPARK-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianshi Huang updated SPARK-6382:
---------------------------------
    Description: 
Currently the scope of UDF registration is global. This is unsuitable for 
libraries that are built on top of DataFrame, as many operations have to be done 
by registering a UDF first.

Please provide a way for binding temporary UDFs.

e.g.

{code}
withUDF(("merge_map", (m1: Map[String, Double], m2: Map[String, Double]) => m1 ++ m2),
    ...) {
  sql("select merge_map(d1.map, d2.map) from d1, d2 where d1.id = d2.id")
}
{code}

Also, the UDF registry is a mutable HashMap; refactoring it to an immutable one 
would make more sense.
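
To make the request concrete, here is a rough sketch of how scoped binding could sit on top of an immutable registry. It is plain Scala with no Spark dependency, and every name in it (ScopedUdfSketch, UdfRegistry, Udf, the three-parameter-list withUDF) is hypothetical and not an existing Spark API. The UDF type is also fixed to a two-map signature only to keep the example short.

```scala
// Hypothetical sketch: none of these names are Spark APIs.
object ScopedUdfSketch {
  // Assumption: a UDF is narrowed to "merge two maps" to keep the example small.
  type Udf = (Map[String, Double], Map[String, Double]) => Map[String, Double]

  // Immutable registry: register returns a new registry and leaves this one unchanged.
  final case class UdfRegistry(functions: Map[String, Udf] = Map.empty) {
    def register(name: String, f: Udf): UdfRegistry =
      copy(functions = functions + (name -> f))

    def lookup(name: String): Option[Udf] = functions.get(name)
  }

  // Runs `body` against a copy of `base` extended with `udfs`.
  // The extended registry is only reachable inside `body`.
  def withUDF[A](base: UdfRegistry)(udfs: (String, Udf)*)(body: UdfRegistry => A): A = {
    val scoped = udfs.foldLeft(base) { case (reg, (name, f)) => reg.register(name, f) }
    body(scoped)
  }

  // Usage: merge_map is visible inside the block and absent from the base registry.
  val mergeMap: Udf = (m1, m2) => m1 ++ m2
  val global: UdfRegistry = UdfRegistry()

  val merged: Option[Map[String, Double]] =
    withUDF(global)("merge_map" -> mergeMap) { scoped =>
      scoped.lookup("merge_map").map(f => f(Map("a" -> 1.0), Map("b" -> 2.0)))
    }
}
```

Because the registry is immutable, there is nothing to clean up when the block exits: the temporary binding simply goes out of scope, and concurrent callers holding the base registry never observe it.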

Jianshi


  was:
Currently the scope of UDF registration is global. It's unsuitable for 
libraries that's built on top of DataFrame, as many operations has to be done 
by registering a UDF first.

Please provide a way for binding temporary UDFs.

e.g.

{code}
withUDF(("merge_map", (m1: Map[String, Double], m2: Map[String, Double]) => m2 
++ m2),
    ...) {
  sql("select merge_map(d1.map, d2.map) from d1, d2 where d1.id = d2.id")
}
{code}

Also UDF registry is a mutable Hashmap, refactoring it to a immutable one makes 
more sense.

Jianshi



> withUDF(...) {...} for supporting temporary UDF definitions in the scope
> ------------------------------------------------------------------------
>
>                 Key: SPARK-6382
>                 URL: https://issues.apache.org/jira/browse/SPARK-6382
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.3.0, 1.3.1
>            Reporter: Jianshi Huang


