Martin Andersson created SEDONA-221:
---------------------------------------

             Summary: Outer join throws NPE for null geometries
                 Key: SEDONA-221
                 URL: https://issues.apache.org/jira/browse/SEDONA-221
             Project: Apache Sedona
          Issue Type: Bug
            Reporter: Martin Andersson


The following query throws a NullPointerException when a geometry is null.
{code}
select /*+ BROADCAST(t2) */ * from t1 left join t2 on st_intersects(t1.geom, t2.geom)
{code}
{code}
java.lang.NullPointerException
        at org.locationtech.jts.io.WKBReader.read(WKBReader.java:159)
        at org.apache.sedona.sql.utils.GeometrySerializer$.deserialize(GeometrySerializer.scala:50)
        at org.apache.spark.sql.sedona_sql.strategy.join.TraitJoinQueryBase.$anonfun$toSpatialRDD$1(TraitJoinQueryBase.scala:45)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
        at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
{code}

The failure occurs when the streaming side is mapped to a SpatialRDD. The NPE
does not occur for an inner join with null geometries. I suspect Spark pushes
down an IS NOT NULL predicate in that case, since rows with null geometries
would be excluded from an inner join anyway.
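
To illustrate why a null filter is harmless for an inner join but not for a
left join, here is a minimal sketch in plain Scala. Box, Rec and
NullFilterDemo are hypothetical stand-ins invented for this example. They are
not Sedona or Spark types, and the sketch does not show what Spark's optimizer
actually does.

```scala
// Hypothetical stand-ins: Box models a geometry, Rec models a row.
// None plays the role of a null geometry. These are not Sedona types.
final case class Box(minX: Double, minY: Double, maxX: Double, maxY: Double) {
  def intersects(o: Box): Boolean =
    minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY
}
final case class Rec(id: Int, geom: Option[Box])

object NullFilterDemo {
  // A spatial predicate is never true when either geometry is null.
  def matches(l: Rec, r: Rec): Boolean =
    (for (a <- l.geom; b <- r.geom) yield a.intersects(b)).getOrElse(false)

  def innerJoin(left: Seq[Rec], right: Seq[Rec]): Seq[(Rec, Rec)] =
    for (l <- left; r <- right if matches(l, r)) yield (l, r)

  // Every left row is emitted at least once, padded with None when unmatched.
  def leftJoin(left: Seq[Rec], right: Seq[Rec]): Seq[(Rec, Option[Rec])] =
    left.flatMap { l =>
      val hits = right.filter(r => matches(l, r))
      if (hits.isEmpty) Seq((l, Option.empty[Rec]))
      else hits.map(r => (l, Option(r)))
    }

  val t1: Seq[Rec] = Seq(Rec(1, Some(Box(0, 0, 2, 2))), Rec(2, None))
  val t2: Seq[Rec] = Seq(Rec(10, Some(Box(1, 1, 3, 3))))
  // The same input with null geometries filtered out up front.
  val t1NonNull: Seq[Rec] = t1.filter(_.geom.isDefined)
}
```

In this model, filtering t1 first leaves the inner join result unchanged, but
the left join loses the row with id 2. A left join therefore cannot rely on
such a filter and has to handle the null geometry itself.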

Looking at the code, I suspect there are more errors in the new broadcast join
types. The InternalRow is encoded in the user data field of the geometry, which
doesn't work if the geometry is null. For a left join, the InternalRow on the
left side has to be emitted even if its geometry is null. Instead of using a
SpatialRDD, it might be better to map the RDD[InternalRow] to an
RDD[Pair[Geometry, InternalRow]] where the Geometry may be null.
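
A rough sketch of the pairing idea, using plain Scala collections in place of
RDDs so that it runs on its own. Envelope, SimpleRow and PairedJoinSketch are
hypothetical stand-ins for Geometry, InternalRow and the join code. None of
them exist in Sedona, and Option is used here as one possible way to model a
nullable geometry.

```scala
// Hypothetical stand-ins, not Sedona or Spark types.
final case class Envelope(minX: Double, minY: Double, maxX: Double, maxY: Double) {
  def intersects(o: Envelope): Boolean =
    minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY
}
final case class SimpleRow(values: Seq[Any])

object PairedJoinSketch {
  type Paired = (Option[Envelope], SimpleRow)

  // Keep the row next to the geometry instead of inside its user data,
  // so a null geometry does not cause the row to be lost.
  def pair(rows: Iterator[SimpleRow], geomOrdinal: Int): Iterator[Paired] =
    rows.map { row =>
      val geom = Option(row.values(geomOrdinal)).collect { case e: Envelope => e }
      (geom, row)
    }

  // Left outer join of a streamed side against an in-memory (broadcast) side.
  def leftOuterJoin(
      streamed: Iterator[Paired],
      broadcast: Seq[Paired]): Iterator[(SimpleRow, Option[SimpleRow])] =
    streamed.flatMap { case (geom, row) =>
      // A null geometry on either side never matches.
      val hits = geom.toSeq.flatMap { g =>
        broadcast.collect { case (Some(bg), bRow) if g.intersects(bg) => bRow }
      }
      if (hits.isEmpty) Iterator.single((row, Option.empty[SimpleRow]))
      else hits.iterator.map(b => (row, Option(b)))
    }
}
```

With this shape, a streamed row whose geometry column is null still comes out
of the join once, with no match on the broadcast side.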
