Thanks Jia for forwarding this to the appropriate place! As some background, I've used arrow-testing [1] when implementing integration tests for Arrow IPC data and parquet-testing [2] when implementing geometry/geography in Parquet C++, and have come to appreciate the value of cultivated test data in promoting a library/format. I've also been an observer of the slow move of the arrow monorepo [3] to split into multiple smaller repos to better engage subcommunities (e.g., arrow-js, arrow-julia). While the spatial subcommunity passionate about low-level testing is admittedly small, the apache/sedona repo has some great testing infrastructure that might find a wider audience in a dedicated repository.
In parallel, I've spent some time on the geoarrow-data [4] repository as a basis for testing/writing about GeoArrow implementations and collecting a few openly licensed non-trivial datasets in a variety of formats. I'm happy to contribute anything interesting to a future sedona-testing (or not if it's out of scope!). Probably the most relevant is just the raw list of WKT covering the full matrix of geometry types and dimensions, as Jia noted.

Cheers,

-dewey

[1] https://github.com/apache/arrow-testing
[2] https://github.com/apache/parquet-testing
[3] https://github.com/apache/arrow
[4] https://github.com/geoarrow/geoarrow-data

On Fri, May 23, 2025 at 5:18 PM Matthew Powers <matthewkevinpow...@gmail.com> wrote:

> +1 (non-binding)
>
> On Fri, May 23, 2025 at 5:57 PM Jia Yu <ji...@apache.org> wrote:
>
> > Hi all,
> >
> > I’d like to help move forward with a discussion on our GitHub Discussion
> > [1]: creating a dedicated sedona-testing repository, similar to what Apache
> > Parquet and Arrow projects have done (e.g., apache/parquet-testing,
> > apache/arrow-testing).
> >
> > As Dewey mentioned, test cases, test data, and examples could cover
> > hundreds of geospatial functions. Extracting this into a dedicated repo
> > would make these resources easier to discover, reuse, and even integrate
> > with external tools or projects.
> >
> > I’m strongly in favor of this idea and think it can benefit both Sedona
> > developers and the broader ecosystem. Dewey also offered to contribute data
> > from geoarrow/geoarrow-data, including the WKT/WKB test suite, which would
> > be a great addition.
> >
> > Would love to hear what others think. If there’s general support, I'd like
> > to kick start the voting process.
> >
> > Best,
> > Jia
> >
> > [1] https://github.com/apache/sedona/discussions/1950