Kristin Cowalcijk created SEDONA-560: ----------------------------------------
Summary: Spatial join involving dataframe containing 0 partition throws exception Key: SEDONA-560 URL: https://issues.apache.org/jira/browse/SEDONA-560 Project: Apache Sedona Issue Type: Bug Affects Versions: 1.6.0 Reporter: Kristin Cowalcijk Sedona cannot handle dataframes containing 0 partitions properly when performing spatial join. For example: {code:python} schema = StructType([ StructField("id", IntegerType(), True) ]) # Create empty RDD empty_rdd = spark.sparkContext.emptyRDD() # Create empty DataFrame empty_df = spark.createDataFrame(empty_rdd, schema) df_point = spark.range(0, 10).toDF("id").withColumn('geom', expr("ST_Point(id, id)")) df_poly = empty_df.withColumn("poly", expr("ST_Buffer(ST_Point(id, id), 2)")).drop("geom") spark.conf.set("sedona.join.autoBroadcastJoinThreshold", "-1") df_point.join(broadcast(df_poly), expr("ST_Intersects(poly, geom)")).count() {code} failed with the following error message: {code:java} Py4JJavaError: An error occurred while calling o107.showString. : java.util.NoSuchElementException: next on empty iterator at scala.collection.Iterator$$anon$2.next(Iterator.scala:41) at scala.collection.Iterator$$anon$2.next(Iterator.scala:39) at scala.collection.IterableLike.head(IterableLike.scala:109) at scala.collection.IterableLike.head$(IterableLike.scala:108) at scala.collection.AbstractIterable.head(Iterable.scala:56) at org.apache.spark.sql.sedona_sql.strategy.join.SpatialIndexExec.doExecuteBroadcast(SpatialIndexExec.scala:63) {code} This does not only happen to broadcast join, range join also has problems: {code:python} df_point.join(df_poly, expr("ST_Intersects(poly, geom)")).count() 24/05/27 10:50:13.989 WARN RangeJoinExec: [SedonaSQL] Join dominant side partition number 8 is larger than 1/2 of the dominant side count 10 24/05/27 10:50:13.989 WARN RangeJoinExec: [SedonaSQL] Try to use follower side partition number 0 Number of partitions must be >= 0 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)