Hi, I’m a Computer Engineering graduate and I am writing about the possibility of contributing to the Apache Sedona project. During my work I ran into a limitation: the KNNQuery operation can only be run against a single query point, not against a whole dataset of query geometries. The contribution would therefore extend the library with a new signature for SpatialKnnQuery:
public static <U extends Geometry, T extends Geometry> List<T> SpatialKnnQuery(SpatialRDD<T> spatialRDD, SpatialRDD<U> datasetPoint, Integer k, boolean useIndex)

The solution I’ve tried is similar to the one used for the join query. In short, I partition both datasets spatially, zip the corresponding partitions together, and then iterate over each pair of partitions, computing the nearest-neighbour query.

I’d like to know whether this would be a good proposal for a contribution, and I have some questions about the idea:

1. Can the contribution be limited to the RDD API, or should it cover the SQL API too?
2. Can the contribution be limited to the Scala/Java API, or should it cover the Python API too?
3. Do the tests need to be run locally, or should I deploy something like a cluster?

This would be my first contribution to an open-source project, so I’m not very experienced with these kinds of procedures. I want to be sure that I can develop and submit my solution in a correct environment: where can I find a guide or documentation with all the steps to follow after a possible approval?

Looking forward to your response.

Best regards,
Alessandro