Hi Jia,

Thanks for your email and the hint. I checked, and my use of RddA and RddB is correct (I modeled it on your helpful example: https://github.com/apache/sedona/blob/master/binder/ApacheSedonaSQL_SpatialJoin_AirportsPerCountry.ipynb ).
I now got the join of lines within polygons to work. The issue was that creating a Sedona spatial RDD from a polygon GeoJSON requires a zero-distance buffer, st_buffer(geometry, 0), to fix data issues, but this should not be used for point or linestring GeoJSONs (see my functions below in case you are interested).

Not urgent: I did revisit the Sedona community server, but I still cannot write a message there.

I also have a non-urgent question, please: when I have one spatially extensive RDD (e.g. worldwide extent) and one spatially small RDD (e.g. the extent of Arizona, US), which one should I use to determine the partitioning? The part of the spatially extensive RDD that overlaps the small RDD would be very small. I could subset the spatially extensive RDD first, but this would also involve some effort. Perhaps this is what I should do.

Thanks a lot,
Mark

    from pyspark.sql import DataFrame, SparkSession
    from sedona.utils.adapter import Adapter


    def get_rdd_point_or_line(spark: SparkSession, df: DataFrame,
                              geojson_temp_name: str, geojson: str = 'geojson'):
        # Points and linestrings: parse the GeoJSON as-is, without buffering.
        df.createOrReplaceTempView("temptable")
        df = spark.sql(
            f"select ST_GeomFromGeoJSON(temptable.{geojson}) as geometry, "
            f"{geojson} as {geojson_temp_name} from temptable")
        rdd = Adapter.toSpatialRdd(df, "geometry")
        rdd.analyze()
        return rdd


    def get_rdd_poly(spark: SparkSession, df: DataFrame,
                     geojson_temp_name: str, geojson: str = 'geojson'):
        # Polygons: st_buffer(..., 0) fixes the data issues mentioned above.
        df.createOrReplaceTempView("temptable")
        df = spark.sql(
            f"select st_buffer(ST_GeomFromGeoJSON(temptable.{geojson}),0) as geometry, "
            f"{geojson} as {geojson_temp_name} from temptable")
        rdd = Adapter.toSpatialRdd(df, "geometry")
        rdd.analyze()
        return rdd

On Fri, Jan 6, 2023 at 5:21 PM Jia Yu <ji...@apache.org> wrote:

> And please try to switch the left and right side of the join and see if the
> result changes.
>
> JoinQuery.SpatialJoinQueryFlat(RddA, RddB, considerBoundaryIntersection)
> means: check whether each object in RddA is CONTAINED BY each object in RddB,
> also considering the case of boundary intersection (not fully contained).
>
> In your case, a line cannot contain a polygon, but a polygon can contain a
> line. Make sure you get the order correct.
>
> Thanks,
> Jia
>
> On Fri, Jan 6, 2023 at 3:16 PM Mark Broich <mark.bro...@mapbox.com.invalid>
> wrote:
>
> > Thank you Jia for confirming that Sedona supports polygon-linestring
> > joins. I did set 'ConsiderBoundaryIntersection' to true. Am still looking
> > for the mistake in my code...
> > Regards, Mark
> >
> >
> > On Thu, Jan 5, 2023 at 6:44 PM Jia Yu <ji...@apache.org> wrote:
> >
> > > Hi Mark,
> > >
> > > Sedona supports polygon-linestring joins. Did you set
> > > 'ConsiderBoundaryIntersection' to true? See:
> > >
> > > https://sedona.apache.org/1.3.1-incubating/tutorial/core-python/#write-a-spatial-join-query
> > >
> > > This is the last parameter in Sedona Python
> > > JoinQueryRaw.SpatialJoinQueryFlat().
> > >
> > > Thanks,
> > > Jia
> > >
> > > ---------- Forwarded message ---------
> > > From: Mark Broich <mark.bro...@mapbox.com>
> > > Date: Thu, Jan 5, 2023 at 4:27 PM
> > > Subject: JoinQueryRaw.SpatialJoinQueryFlat for polygon - linestring join?
> > > To: <dev-i...@sedona.apache.org>
> > > Cc: <dev-ow...@sedona.apache.org>
> > >
> > >
> > > Hi all,
> > >
> > > I am trying to use JoinQueryRaw.SpatialJoinQueryFlat() to join polygons and
> > > linestrings but the result is empty despite overlap in the polygons and
> > > linestrings.
> > >
> > > I am wondering if JoinQueryRaw.SpatialJoinQueryFlat can do the join I am
> > > after or if I need to do a RangeQuery.SpatialRangeQuery().
> > >
> > > Also, how do I get posting rights on the Apache Sedona community server
> > > <https://discord.gg/9A3k5dEBsY>?
> > >
> > > Tnx for any pointers. Regards, Mark
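
A note on the idea, raised at the top of this thread, of subsetting the spatially extensive dataset before the join: the sketch below shows only the underlying envelope (bounding-box) prefilter in plain Python. It is not Sedona code. The names BBox, envelope_of, intersects, and prefilter are hypothetical, the coordinates are rough made-up values, and both datasets are assumed to use the same coordinate reference system. In Sedona, the equivalent step would presumably be a range query against the small RDD's envelope, but that mapping is an assumption and is not confirmed anywhere in this thread.

```python
from typing import Iterable, List, Tuple

# (min_x, min_y, max_x, max_y); hypothetical representation of a feature's extent
BBox = Tuple[float, float, float, float]


def envelope_of(boxes: Iterable[BBox]) -> BBox:
    """Return the smallest bounding box that covers all input boxes."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("cannot compute the envelope of an empty dataset")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def prefilter(features: Iterable[Tuple[str, BBox]], window: BBox) -> List[Tuple[str, BBox]]:
    """Keep only the features whose bounding box intersects the window."""
    return [f for f in features if intersects(f[1], window)]


# Small dataset: two made-up polygons whose extents lie roughly within Arizona.
small_extents = [(-114.8, 31.3, -112.0, 35.0), (-112.5, 33.0, -109.0, 37.0)]

# Large dataset: made-up linestrings with a worldwide spread.
large_features = [
    ("phoenix_road", (-112.2, 33.3, -111.9, 33.6)),
    ("berlin_road", (13.3, 52.4, 13.5, 52.6)),
    ("border_line", (-109.1, 36.9, -108.5, 37.5)),
]

window = envelope_of(small_extents)
candidates = prefilter(large_features, window)
candidate_names = [name for name, _ in candidates]
```

The prefilter is deliberately coarse: it only discards features that cannot possibly match, so the exact join still has to run on the remaining candidates. Whether this pays off depends on how cheaply the large dataset can be scanned once.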