Apache Sedona contribution

Alessandro Calvio Fri, 26 Mar 2021 03:56:51 -0700

Hi,
I am a Computer Engineering graduate and I am writing about the possibility of 
contributing to the Apache Sedona project.
During my work I ran into a limitation: the KNNQuery operation can only be run 
against a single query point, not against a whole dataset of points.
The contribution would therefore add a new overload of SpatialKnnQuery to the 
library:


public static <U extends Geometry, T extends Geometry> List<T> SpatialKnnQuery(
    SpatialRDD<T> spatialRDD,
    SpatialRDD<U> datasetPoint,
    Integer k,
    boolean useIndex
)

The approach I have tried is similar to the one used for the join query. In 
short, I partition both datasets spatially, zip the matching partitions 
together, and then iterate over each pair of partitions, computing the 
nearest-neighbour query within it.
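
To make the idea concrete, here is a minimal sketch in plain Java, without 
Spark or Sedona. Everything in it is an assumption made for illustration: the 
names ZippedKnnSketch, Point, partition, knnJoin and cellSize are hypothetical, 
a uniform grid stands in for a real spatial partitioner, and a map lookup by 
cell key stands in for zipping RDD partitions. It is not the Sedona API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ZippedKnnSketch {

    /** Hypothetical 2D point; stands in for a real Geometry type. */
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }

        double distanceTo(Point o) {
            return Math.hypot(x - o.x, y - o.y);
        }

        @Override
        public String toString() {
            return "(" + x + ", " + y + ")";
        }
    }

    /** Assign each point to a uniform grid cell (assumed partitioning scheme). */
    static Map<String, List<Point>> partition(List<Point> points, double cellSize) {
        Map<String, List<Point>> cells = new HashMap<>();
        for (Point p : points) {
            String key = (long) Math.floor(p.x / cellSize) + ":" + (long) Math.floor(p.y / cellSize);
            cells.computeIfAbsent(key, c -> new ArrayList<>()).add(p);
        }
        return cells;
    }

    /**
     * For every query point, return up to k nearest data points found in the
     * SAME grid cell. Result keys are the query Point instances themselves.
     */
    static Map<Point, List<Point>> knnJoin(List<Point> data, List<Point> queries,
                                           int k, double cellSize) {
        Map<String, List<Point>> dataCells = partition(data, cellSize);
        Map<String, List<Point>> queryCells = partition(queries, cellSize);
        Map<Point, List<Point>> result = new HashMap<>();

        // "Zip" step: pair each query partition with the data partition of the same cell.
        for (Map.Entry<String, List<Point>> cell : queryCells.entrySet()) {
            List<Point> local = dataCells.getOrDefault(cell.getKey(), Collections.emptyList());
            for (Point q : cell.getValue()) {
                // Brute-force kNN inside the partition: sort by distance, keep the first k.
                List<Point> nearest = local.stream()
                        .sorted(Comparator.comparingDouble((Point p) -> q.distanceTo(p)))
                        .limit(k)
                        .collect(Collectors.toList());
                result.put(q, nearest);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Point> data = Arrays.asList(
                new Point(1, 1), new Point(2, 2), new Point(3, 3), new Point(15, 15));
        List<Point> queries = Arrays.asList(new Point(0, 0), new Point(14, 14));
        knnJoin(data, queries, 2, 10.0)
                .forEach((q, nn) -> System.out.println(q + " -> " + nn));
    }
}
```

One caveat of this simplified version: a query point near a cell border can 
have true nearest neighbours in an adjacent cell, and a purely per-partition 
search will miss them or return fewer than k results. A complete solution 
would need some way to handle those boundary cases.
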
I would like to know whether this would be a suitable contribution, and I have 
a few questions about the idea:

  1.  Can the contribution be limited to the RDD API, or should it cover the 
SQL API too?
  2.  Can the contribution be limited to the Scala/Java API, or should it 
cover the Python API too?
  3.  Do the tests need to run locally, or should I deploy something like a 
cluster?

This would be my first contribution to an open-source project, so I am not very 
experienced with this kind of procedure. I want to be sure that I develop and 
submit my solution in the correct environment: where can I find a guide or 
documentation covering the steps to follow if the proposal is approved?

Waiting for a response,
Best regards,
Alessandro.
