Re: registering udf to use in spark.sql('select...

2016-08-04 Thread Mich Talebzadeh
Yes, pretty straightforward: define, register and use.

    def cleanupCurrency (word : String) : Double = {
      word.toString.substring(1).replace(",", "").toDouble
    }
    sqlContext.udf.register("cleanupCurrency", cleanupCurrency(_:String))
    val a = df.filter(col("Total") > "").map(p => Invoices(p(0).
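For readers following the thread from the PySpark side, here is a rough Python analogue of the Scala define/register/use pattern above. It is only a sketch: the names `cleanup_currency` and `register_cleanup_currency` are made up for illustration, and it assumes a PySpark 2.x session is passed in.

```python
def cleanup_currency(word):
    """Drop the leading currency symbol and thousands separators, e.g. '$1,234.50' -> 1234.5."""
    return float(str(word)[1:].replace(",", ""))


def register_cleanup_currency(spark):
    """Register the function under the SQL name 'cleanupCurrency' (hypothetical helper).

    Assumes `spark` is a pyspark.sql.SparkSession; pyspark is imported lazily
    so the plain function above can be used without Spark installed.
    """
    from pyspark.sql.types import DoubleType

    spark.udf.register("cleanupCurrency", cleanup_currency, DoubleType())
```

After registration the function can be called by name inside a SQL string, for example `spark.sql("select cleanupCurrency(Total) from invoices")`, where `invoices` stands in for whatever temporary view holds the data.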

Re: registering udf to use in spark.sql('select...

2016-08-04 Thread Nicholas Chammas
No, SQLContext is not disappearing. The top-level class is replaced by SparkSession, but you can always get the underlying context from the session. You can also use SparkSession.udf.register(), which is
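A minimal sketch of the SparkSession.udf.register() route, assuming PySpark 2.x. The helper name `register_and_query`, the sample rows, and the SQL name `squareIt` are illustrative assumptions and not taken from the thread beyond the `squareIt` function itself.

```python
def square_it(x):
    """Plain Python function to expose to Spark SQL."""
    return x * x


def register_and_query(spark):
    """Register square_it for use inside SQL strings and run a query (hypothetical helper).

    Assumes `spark` is a pyspark.sql.SparkSession.
    """
    from pyspark.sql.types import LongType

    # Register under a SQL-visible name and declare the return type.
    spark.udf.register("squareIt", square_it, LongType())

    # Illustrative data; the column name follows the original question.
    df = spark.createDataFrame([(1,), (2,), (3,)], ["adgroupid"])
    df.createOrReplaceTempView("df")

    # The registered name can now be used in spark.sql() ...
    via_sql = spark.sql("select squareIt(adgroupid) as function_result from df")
    # ... and in selectExpr(), since both parse SQL expressions.
    via_select_expr = df.selectExpr("squareIt(adgroupid) as function_result")
    return via_sql, via_select_expr
```

Registration by name is what makes the function visible to the SQL parser, which is why it works inside a string for both spark.sql() and selectExpr().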

Re: registering udf to use in spark.sql('select...

2016-08-04 Thread Ben Teeuwen
Yes, but I don’t want to use it in a select() call. I want either selectExpr() or spark.sql(), with the udf being called inside a string. Now I got it to work using "sqlContext.registerFunction('encodeOneHot_udf', encodeOneHot, VectorUDT())". But this sqlContext approach will disappear, right? So I’m cur

Re: registering udf to use in spark.sql('select...

2016-08-04 Thread Nicholas Chammas
Have you looked at pyspark.sql.functions.udf and the associated examples?

On Thu, Aug 4, 2016 at 9:10 AM, Ben Teeuwen wrote:
> Hi,
>
> I’d like to use a UDF in pyspark 2.0. As in ..
>
> def squareIt(x):
>     return x * x
>
> # register the function and define return type
> ….
>
> spark.sql("""s
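For reference, a small sketch of what the pyspark.sql.functions.udf suggestion looks like in practice, assuming PySpark 2.x. The helper name `add_squared_column` and the output column name are illustrative assumptions.

```python
def square_it(x):
    """Plain Python function to wrap as a column-level UDF."""
    return x * x


def add_squared_column(df, column="adgroupid"):
    """Apply square_it through the DataFrame API (hypothetical helper).

    Assumes `df` is a pyspark.sql.DataFrame with a numeric column `column`.
    """
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import LongType

    # udf() wraps the function and declares its return type.
    square_udf = udf(square_it, LongType())
    return df.withColumn("function_result", square_udf(col(column)))
```

A UDF built this way is a Python object used in select() or withColumn(); it is not registered under a SQL name, so it cannot be called from inside a spark.sql() or selectExpr() string without a separate registration step.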

registering udf to use in spark.sql('select...

2016-08-04 Thread Ben Teeuwen
Hi,

I’d like to use a UDF in pyspark 2.0. As in ..

    def squareIt(x):
        return x * x

    # register the function and define return type
    ….

    spark.sql("""select myUdf(adgroupid, 'extra_string_parameter') as function_result from df""")

How can I register the function? I only see reg