To confirm: Does the error happen during view creation, or when we read the view later?
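For example, something along these lines (a rough sketch, assuming a session where the Sedona functions are already registered; the view name "probe" is just a placeholder) should show which statement actually throws:

import org.apache.spark.sql.AnalysisException

try {
  // Step 1: does analysis fail while the view is being created?
  spark.sql(
    """
      |CREATE OR REPLACE TEMP VIEW probe AS
      |SELECT ST_PolygonFromEnvelope(-126.79, 24.86, -64.63, 50.0) AS shape
      |""".stripMargin)
  println("view created fine, reading it back...")
  // Step 2: or only once the stored view is re-analyzed on read?
  spark.table("probe").show()
} catch {
  case e: AnalysisException => println(s"failed with: ${e.getMessage}")
}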
On Mon, Nov 1, 2021 at 11:28 PM Adam Binford <adam...@gmail.com> wrote:

> I don't have a minimal reproduction right now, but here are the more
> relevant code snippets.
>
> Stacktrace:
>
> org.apache.spark.sql.AnalysisException: Undefined function:
> 'ST_PolygonFromEnvelope'. This function is neither a registered temporary
> function nor a permanent function registered in the database 'default'.;
> line 2 pos 50
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.failFunctionLookup(SessionCatalog.scala:1562)
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1660)
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1677)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$27$$anonfun$applyOrElse$114.$anonfun$applyOrElse$116(Analyzer.scala:2150)
>   at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:60)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$27$$anonfun$applyOrElse$114.applyOrElse(Analyzer.scala:2150)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$27$$anonfun$applyOrElse$114.applyOrElse(Analyzer.scala:2137)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
>   at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
>
> Expression definition:
>
> case class ST_PolygonFromEnvelope(inputExpressions: Seq[Expression])
>     extends Expression with CodegenFallback with UserDataGeneratator {
>
>   override def nullable: Boolean = false
>
>   override def eval(input: InternalRow): Any = {
>     val minX = inputExpressions(0).eval(input) match {
>       case a: Double => a
>       case b: Decimal => b.toDouble
>     }
>     val minY = inputExpressions(1).eval(input) match {
>       case a: Double => a
>       case b: Decimal => b.toDouble
>     }
>     val maxX = inputExpressions(2).eval(input) match {
>       case a: Double => a
>       case b: Decimal => b.toDouble
>     }
>     val maxY = inputExpressions(3).eval(input) match {
>       case a: Double => a
>       case b: Decimal => b.toDouble
>     }
>
>     val coordinates = new Array[Coordinate](5)
>     coordinates(0) = new Coordinate(minX, minY)
>     coordinates(1) = new Coordinate(minX, maxY)
>     coordinates(2) = new Coordinate(maxX, maxY)
>     coordinates(3) = new Coordinate(maxX, minY)
>     coordinates(4) = coordinates(0)
>     val geometryFactory = new GeometryFactory()
>     val polygon = geometryFactory.createPolygon(coordinates)
>     new GenericArrayData(GeometrySerializer.serialize(polygon))
>   }
>
>   override def dataType: DataType = GeometryUDT
>
>   override def children: Seq[Expression] = inputExpressions
> }
>
> Function registration:
>
> Catalog.expressions.foreach(f => {
>   val functionIdentifier =
>     FunctionIdentifier(f.getClass.getSimpleName.dropRight(1))
>   val expressionInfo = new ExpressionInfo(
>     f.getClass.getCanonicalName,
>     functionIdentifier.database.orNull,
>     functionIdentifier.funcName)
>   sparkSession.sessionState.functionRegistry.registerFunction(
>     functionIdentifier,
>     expressionInfo,
>     f)
> })
>
> On Mon, Nov 1, 2021 at 10:43 AM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> Hi Adam,
>>
>> Thanks for reporting this issue! Do you have the full stacktrace or a
>> code snippet to reproduce the issue at Spark side? It looks like a bug,
>> but it's not obvious to me how this bug can happen.
>>
>> Thanks,
>> Wenchen
>>
>> On Sat, Oct 30, 2021 at 1:08 AM Adam Binford <adam...@gmail.com> wrote:
>>
>>> Hi devs,
>>>
>>> I'm working on getting Apache Sedona upgraded to work with Spark 3.2 and
>>> ran into a weird issue I wanted to get some feedback on. The PR and the
>>> discussion so far can be found here:
>>> https://github.com/apache/incubator-sedona/pull/557
>>>
>>> To sum up quickly: the library defines custom expressions and registers
>>> them using sparkSession.sessionState.functionRegistry.registerFunction.
>>> One of the unit tests is now failing because the function can't be found
>>> when a temporary view that uses the function is created in pure SQL.
>>>
>>> Examples:
>>>
>>> This fails with "Undefined function: 'ST_PolygonFromEnvelope'. This
>>> function is neither a registered temporary function nor a permanent
>>> function registered in the database 'default'.":
>>>
>>> spark.sql(
>>>   """
>>>     |CREATE OR REPLACE TEMP VIEW pixels AS
>>>     |SELECT pixel, shape FROM pointtable
>>>     |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
>>>   """.stripMargin)
>>>
>>> // Test visualization partitioner
>>> val zoomLevel = 2
>>> val newDf = VizPartitioner(spark.table("pixels"), zoomLevel, "pixel",
>>>   new Envelope(0, 1000, 0, 1000))
>>>
>>> But both of these work fine:
>>>
>>> val table = spark.sql(
>>>   """
>>>     |SELECT pixel, shape FROM pointtable
>>>     |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
>>>   """.stripMargin)
>>>
>>> // Test visualization partitioner
>>> val zoomLevel = 2
>>> val newDf = VizPartitioner(table, zoomLevel, "pixel",
>>>   new Envelope(0, 1000, 0, 1000))
>>>
>>> and
>>>
>>> val table = spark.sql(
>>>   """
>>>     |SELECT pixel, shape FROM pointtable
>>>     |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
>>>   """.stripMargin)
>>> table.createOrReplaceTempView("pixels")
>>>
>>> // Test visualization partitioner
>>> val zoomLevel = 2
>>> val newDf = VizPartitioner(spark.table("pixels"), zoomLevel, "pixel",
>>>   new Envelope(0, 1000, 0, 1000))
>>>
>>> So the main question is: is this a feature or a bug?
>>>
>>> --
>>> Adam Binford
>>
>
> --
> Adam Binford
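For anyone trying to reproduce this outside Sedona, here is a rough, self-contained sketch of the same registration path against Spark 3.2. Only the functionRegistry.registerFunction call mirrors Adam's snippet above; the MyUpper expression, the function name my_upper, and the table/view names are hypothetical stand-ins:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.FunctionIdentifier
import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionInfo, UnaryExpression}
import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
import org.apache.spark.sql.types.{DataType, StringType}
import org.apache.spark.unsafe.types.UTF8String

// Hypothetical stand-in for ST_PolygonFromEnvelope: upper-cases a string.
case class MyUpper(child: Expression) extends UnaryExpression with CodegenFallback {
  override def dataType: DataType = StringType
  override protected def nullSafeEval(input: Any): Any =
    UTF8String.fromString(input.asInstanceOf[UTF8String].toString.toUpperCase)
  // Required by the new tree-transform API in Spark 3.2.
  override protected def withNewChildInternal(newChild: Expression): MyUpper =
    copy(child = newChild)
}

object Repro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // Register the expression the same way the Sedona snippet above does.
    val id = FunctionIdentifier("my_upper")
    spark.sessionState.functionRegistry.registerFunction(
      id,
      new ExpressionInfo(classOf[MyUpper].getCanonicalName, null, id.funcName),
      (children: Seq[Expression]) => MyUpper(children.head))

    spark.range(3).selectExpr("CAST(id AS STRING) AS s").createOrReplaceTempView("t")

    // A plain SELECT resolves the function fine...
    spark.sql("SELECT my_upper(s) FROM t").show()

    // ...the question is whether CREATE TEMP VIEW does the same on 3.2.
    spark.sql("CREATE OR REPLACE TEMP VIEW v AS SELECT my_upper(s) AS u FROM t")
    spark.table("v").show()
  }
}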