To confirm: Does the error happen during view creation, or when we read the view later?
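For example, something along these lines (a rough sketch, assuming a session where the Sedona functions are already registered; the view name "probe" is just a placeholder) should show which statement actually throws:

import org.apache.spark.sql.AnalysisException

try {
  // Step 1: does analysis fail while the view is being created?
  spark.sql(
    """
      |CREATE OR REPLACE TEMP VIEW probe AS
      |SELECT ST_PolygonFromEnvelope(-126.79, 24.86, -64.63, 50.0) AS shape
      |""".stripMargin)
  println("view created fine, reading it back...")
  // Step 2: or only once the stored view is re-analyzed on read?
  spark.table("probe").show()
} catch {
  case e: AnalysisException => println(s"failed with: ${e.getMessage}")
}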
On Mon, Nov 1, 2021 at 11:28 PM Adam Binford <adam...@gmail.com> wrote:

> I don't have a minimal reproduction right now, but here are the more
> relevant code snippets.
>
> Stacktrace:
>
> org.apache.spark.sql.AnalysisException: Undefined function:
> 'ST_PolygonFromEnvelope'. This function is neither a registered temporary
> function nor a permanent function registered in the database 'default'.;
> line 2 pos 50
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.failFunctionLookup(SessionCatalog.scala:1562)
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1660)
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1677)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$27$$anonfun$applyOrElse$114.$anonfun$applyOrElse$116(Analyzer.scala:2150)
>   at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:60)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$27$$anonfun$applyOrElse$114.applyOrElse(Analyzer.scala:2150)
>   at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$27$$anonfun$applyOrElse$114.applyOrElse(Analyzer.scala:2137)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
>   at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
>
> Expression definition:
>
> case class ST_PolygonFromEnvelope(inputExpressions: Seq[Expression])
>     extends Expression with CodegenFallback with UserDataGeneratator {
>
>   override def nullable: Boolean = false
>
>   override def eval(input: InternalRow): Any = {
>     val minX = inputExpressions(0).eval(input) match {
>       case a: Double => a
>       case b: Decimal => b.toDouble
>     }
>     val minY = inputExpressions(1).eval(input) match {
>       case a: Double => a
>       case b: Decimal => b.toDouble
>     }
>     val maxX = inputExpressions(2).eval(input) match {
>       case a: Double => a
>       case b: Decimal => b.toDouble
>     }
>     val maxY = inputExpressions(3).eval(input) match {
>       case a: Double => a
>       case b: Decimal => b.toDouble
>     }
>
>     val coordinates = new Array[Coordinate](5)
>     coordinates(0) = new Coordinate(minX, minY)
>     coordinates(1) = new Coordinate(minX, maxY)
>     coordinates(2) = new Coordinate(maxX, maxY)
>     coordinates(3) = new Coordinate(maxX, minY)
>     coordinates(4) = coordinates(0)
>     val geometryFactory = new GeometryFactory()
>     val polygon = geometryFactory.createPolygon(coordinates)
>     new GenericArrayData(GeometrySerializer.serialize(polygon))
>   }
>
>   override def dataType: DataType = GeometryUDT
>
>   override def children: Seq[Expression] = inputExpressions
> }
>
> Function registration:
>
> Catalog.expressions.foreach(f => {
>   val functionIdentifier =
>     FunctionIdentifier(f.getClass.getSimpleName.dropRight(1))
>   val expressionInfo = new ExpressionInfo(
>     f.getClass.getCanonicalName,
>     functionIdentifier.database.orNull,
>     functionIdentifier.funcName)
>   sparkSession.sessionState.functionRegistry.registerFunction(
>     functionIdentifier,
>     expressionInfo,
>     f)
> })
>
> On Mon, Nov 1, 2021 at 10:43 AM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> Hi Adam,
>>
>> Thanks for reporting this issue! Do you have the full stacktrace or a
>> code snippet to reproduce the issue at Spark side? It looks like a bug,
>> but it's not obvious to me how this bug can happen.
>>
>> Thanks,
>> Wenchen
>>
>> On Sat, Oct 30, 2021 at 1:08 AM Adam Binford <adam...@gmail.com> wrote:
>>
>>> Hi devs,
>>>
>>> I'm working on getting Apache Sedona upgraded to work with Spark 3.2 and
>>> ran into a weird issue I wanted to get some feedback on. The PR and the
>>> discussion so far can be found here:
>>> https://github.com/apache/incubator-sedona/pull/557
>>>
>>> To sum up quickly: the library defines custom expressions and registers
>>> them using sparkSession.sessionState.functionRegistry.registerFunction.
>>> One of the unit tests is now failing because the function can't be found
>>> when a temporary view that uses the function is created in pure SQL.
>>>
>>> Examples:
>>>
>>> This fails with "Undefined function: 'ST_PolygonFromEnvelope'. This
>>> function is neither a registered temporary function nor a permanent
>>> function registered in the database 'default'.":
>>>
>>> spark.sql(
>>>   """
>>>     |CREATE OR REPLACE TEMP VIEW pixels AS
>>>     |SELECT pixel, shape FROM pointtable
>>>     |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
>>>   """.stripMargin)
>>>
>>> // Test visualization partitioner
>>> val zoomLevel = 2
>>> val newDf = VizPartitioner(spark.table("pixels"), zoomLevel, "pixel",
>>>   new Envelope(0, 1000, 0, 1000))
>>>
>>> But both of these work fine:
>>>
>>> val table = spark.sql(
>>>   """
>>>     |SELECT pixel, shape FROM pointtable
>>>     |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
>>>   """.stripMargin)
>>>
>>> // Test visualization partitioner
>>> val zoomLevel = 2
>>> val newDf = VizPartitioner(table, zoomLevel, "pixel",
>>>   new Envelope(0, 1000, 0, 1000))
>>>
>>> and
>>>
>>> val table = spark.sql(
>>>   """
>>>     |SELECT pixel, shape FROM pointtable
>>>     |LATERAL VIEW EXPLODE(ST_Pixelize(shape, 1000, 1000, ST_PolygonFromEnvelope(-126.790180,24.863836,-64.630926,50.000))) AS pixel
>>>   """.stripMargin)
>>> table.createOrReplaceTempView("pixels")
>>>
>>> // Test visualization partitioner
>>> val zoomLevel = 2
>>> val newDf = VizPartitioner(spark.table("pixels"), zoomLevel, "pixel",
>>>   new Envelope(0, 1000, 0, 1000))
>>>
>>> So the main question is: is this a feature or a bug?
>>>
>>> --
>>> Adam Binford
>>
>
> --
> Adam Binford
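For anyone trying to reproduce this outside Sedona, here is a rough, self-contained sketch of the same registration path against Spark 3.2. Only the functionRegistry.registerFunction call mirrors Adam's snippet above; the MyUpper expression, the function name my_upper, and the table/view names are hypothetical stand-ins:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.FunctionIdentifier
import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionInfo, UnaryExpression}
import org.apache.spark.sql.catalyst.expressions.codegen.CodegenFallback
import org.apache.spark.sql.types.{DataType, StringType}
import org.apache.spark.unsafe.types.UTF8String

// Hypothetical stand-in for ST_PolygonFromEnvelope: upper-cases a string.
case class MyUpper(child: Expression) extends UnaryExpression with CodegenFallback {
  override def dataType: DataType = StringType
  override protected def nullSafeEval(input: Any): Any =
    UTF8String.fromString(input.asInstanceOf[UTF8String].toString.toUpperCase)
  // Required by the new tree-transform API in Spark 3.2.
  override protected def withNewChildInternal(newChild: Expression): MyUpper =
    copy(child = newChild)
}

object Repro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // Register the expression the same way the Sedona snippet above does.
    val id = FunctionIdentifier("my_upper")
    spark.sessionState.functionRegistry.registerFunction(
      id,
      new ExpressionInfo(classOf[MyUpper].getCanonicalName, null, id.funcName),
      (children: Seq[Expression]) => MyUpper(children.head))

    spark.range(3).selectExpr("CAST(id AS STRING) AS s").createOrReplaceTempView("t")

    // A plain SELECT resolves the function fine...
    spark.sql("SELECT my_upper(s) FROM t").show()

    // ...the question is whether CREATE TEMP VIEW does the same on 3.2.
    spark.sql("CREATE OR REPLACE TEMP VIEW v AS SELECT my_upper(s) AS u FROM t")
    spark.table("v").show()
  }
}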