Hello Riccardo,

I was able to make it run. The problem is that HiveContext doesn't exist anymore in Spark 2.0.2, as far as I can see, but the method enableHiveSupport exists to add the Hive functionality to SparkSession. To enable this, the spark-hive_2.11 dependency is needed.
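[Editor's note: as a minimal sketch of the pattern described above, assuming the spark-hive_2.11 artifact is on the classpath; the app name is a hypothetical example:]

```scala
import org.apache.spark.sql.SparkSession

// In Spark 2.x, SparkSession replaces the old HiveContext; Hive
// functionality is enabled on the builder before the session is created.
val spark = SparkSession.builder()
  .appName("hive-enabled-session")  // hypothetical name
  .enableHiveSupport()              // requires spark-hive on the classpath
  .getOrCreate()

// The catalog interface mentioned in the docs is then available, e.g.:
spark.catalog.listTables().show()
```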
In the Spark API docs this is not well explained; it only says that SQLContext and HiveContext are now part of SparkSession:

"SparkSession is now the new entry point of Spark that replaces the old SQLContext and HiveContext. Note that the old SQLContext and HiveContext are kept for backward compatibility. A new catalog interface is accessible from SparkSession - existing API on databases and tables access such as listTables, createExternalTable, dropTempView, cacheTable are moved here."

I think it would be a good idea to document enableHiveSupport as well.

Thanks,

On Wed, Jun 14, 2017 at 5:13 AM, Takeshi Yamamuro <linguin....@gmail.com> wrote:

> You can use the function w/o Hive, and you can try:
>
> scala> Seq(1.0, 8.0).toDF("a").selectExpr("percentile_approx(a, 0.5)").show
>
> +------------------------------------------------+
> |percentile_approx(a, CAST(0.5 AS DOUBLE), 10000)|
> +------------------------------------------------+
> |                                             8.0|
> +------------------------------------------------+
>
> // maropu
>
> On Wed, Jun 14, 2017 at 5:04 PM, Riccardo Ferrari <ferra...@gmail.com> wrote:
>
>> Hi Andres,
>>
>> I can't find the reference; last time I searched for that, I found that
>> 'percentile_approx' is only available via the Hive context. You should
>> register a temp table and use it from there.
>>
>> Best,
>>
>> On Tue, Jun 13, 2017 at 8:52 PM, Andrés Ivaldi <iaiva...@gmail.com> wrote:
>>
>>> Hello, I'm trying to use percentile_approx in my SQL query, but it's
>>> like the Spark context can't find the function.
>>>
>>> I'm using it like this:
>>>
>>> import org.apache.spark.sql.functions._
>>> import org.apache.spark.sql.DataFrameStatFunctions
>>>
>>> val e = expr("percentile_approx(Cantidadcon0234514)")
>>> df.agg(e).show()
>>>
>>> and the exception is:
>>>
>>> org.apache.spark.sql.AnalysisException: Undefined function:
>>> 'percentile_approx'. This function is neither a registered temporary
>>> function nor a permanent function registered
>>>
>>> I've also tried with callUDF.
>>>
>>> Regards.
>>>
>>> --
>>> Ing. Ivaldi Andres
>>
>
> --
> ---
> Takeshi Yamamuro

--
Ing. Ivaldi Andres
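[Editor's note: the failing call in the quoted thread also omits the percentage argument that percentile_approx requires. A minimal sketch of a working call, following Takeshi's example (the local master setting and app name are hypothetical, added only to make the snippet self-contained):]

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.expr

// A local session only for demonstration purposes.
val spark = SparkSession.builder()
  .appName("percentile-approx-demo")  // hypothetical name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val df = Seq(1.0, 8.0).toDF("a")

// percentile_approx takes the column AND a percentage (here the median);
// calling it with the column alone raises an analysis error.
df.agg(expr("percentile_approx(a, 0.5)")).show()
```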