Mosaic by Databricks Labs <https://github.com/databrickslabs/mosaic>

On Tue, Jan 17, 2023 at 3:53 PM Grigory Pomadchin <dau...@gmail.com> wrote:

> Hey Mo,
>
> That is awesome, great to hear!
>
> Best,
>
> Grigory
>
> On Tue, Jan 17, 2023 at 9:03 AM Mo Sarwat <mosar...@apache.org> wrote:
>
>> Grigory,
>>
>> Thanks a lot for chiming in - I really like the PostGIS to PostgreSQL
>> analogy. That is exactly what Sedona (an Apache project) is to Spark.
>> Spark core should remain light / generic enough (similar to PostgreSQL)
>> and all spatial functionality should be pluggable extensions (Sedona).
>> Otherwise, the core will be unnecessarily heavy to maintain, release,
>> and integrate.
>>
>> Sedona already supports geo-hashing among much other standard geospatial
>> functionality, which works seamlessly with Spark without any issues for
>> the end user. If there is something missing, I would highly recommend
>> that we bring it to the Sedona community, and that will directly feed
>> into the benefit of Spark users who are doing geo.
>>
>> Implementing geospatial functionality in Spark core will be a
>> replication of work already done. Databricks, for instance, already uses
>> Sedona internally with their geospatial capabilities.
>>
>> Finally, I would like to mention that I am totally willing to be
>> corrected on that, especially if you tried Sedona with Spark and found
>> that it does not serve the purpose at all. But please try it first, and
>> let's come up with a few capabilities it cannot provide unless they are
>> implemented in Spark core. Then we can suggest those capabilities to
>> the Spark community.
>>
>> Thanks,
>> -Mo
>>
>>
>> On 2023/01/17 03:09:06 Grigory Pomadchin wrote:
>> > Hey folks,
>> >
>> > Traditionally GIS functionality is distributed a bit separately -
>> > PostGIS is a great example; and indeed for GIS needs Sedona /
>> > GeoMesa / GeoWave may work out; I think GeoMesa implements GeoHash
>> > (see
>> > https://www.geomesa.org/documentation/stable/user/spark/sparksql_functions.html
>> > - could be used as an inspiration at least);
>> >
>> > I'm pretty sure Databricks provides some GIS functions (H3) at this
>> > point. Could that be an argument for having something in the core /
>> > officially supported by the Spark community?
>> >
>> > I'd really love to see some relatively lightweight (JTS + Proj4j /
>> > SIS) library with basic expressions and optimization rules in the
>> > wild that is usable primarily through the Spark native interfaces,
>> > so there is no need to figure out the API / way to set it up and / or
>> > resolve peculiar dependencies. It could be a step towards
>> > standardizing Spark GIS types.
>> >
>> > Best,
>> >
>> > Grigory
>> >
>> > On Mon, Jan 16, 2023 at 6:21 PM Mo Sarwat <mosar...@apache.org> wrote:
>> >
>> > > Martin, thanks for chiming in and mentioning Apache SIS. However,
>> > > Mark was asking about Geo in Spark, which Sedona already supports.
>> > >
>> > > Yet, I like the idea of keeping all dependencies within the Apache
>> > > family. I believe a good solution would be for you (or the SIS
>> > > community at large) to include Apache SIS in Sedona to replace libs
>> > > like GeoTools. The Sedona community would definitely welcome your
>> > > contribution :)
>> > >
>> > > Regards,
>> > > -Mo
>> > >
>> > > On 2023/01/16 22:24:14 Martin Desruisseaux wrote:
>> > > > Hello Mark
>> > > >
>> > > > Indeed Sedona is surely a serious candidate. Maybe one aspect to
>> > > > take into consideration, depending on how "core" the geospatial
>> > > > services would be, is that Sedona depends on an LGPL library
>> > > > (GeoTools, bundled separately) for map projections, Shapefile and
>> > > > GeoTIFF support. So those features could not be in core, since
>> > > > category X dependencies shall be optional.
>> > > >
>> > > > Regarding referencing by coordinates (including map projections),
>> > > > I'm aware of 3 libraries having a license compatible with Apache:
>> > > >
>> > > > * Apache SIS (Apache license)
>> > > > * PROJ4J (Apache license)
>> > > > * PROJ-JNI (MIT license)
>> > > >
>> > > > PROJ-JNI is a binding to the PROJ native library using the Java
>> > > > Native Interface (JNI). PROJ is the most well-known map projection
>> > > > library, but it is difficult to bundle native code in a Java
>> > > > application.
>> > > >
>> > > > I'm not in a neutral position to say that, but I believe that
>> > > > Apache SIS is the most powerful open source pure-Java referencing
>> > > > library. But it is relatively big, about 4 MB for the referencing
>> > > > module with its dependencies, not counting the optional EPSG
>> > > > geodetic dataset (because it is not compatible with the Apache
>> > > > license). Apache SIS is not the library with the largest number of
>> > > > map projections (PROJ4J has more), but it handles some difficult
>> > > > problems and scales well with three- or four-dimensional data (or
>> > > > more).
>> > > >
>> > > > PROJ4J is a lightweight library which may be sufficient if data
>> > > > are mostly two-dimensional (limited 3D support also seems
>> > > > possible) and if an uncertainty of a few metres in coordinate
>> > > > transformations (depending on how datum shifts are specified) is
>> > > > acceptable.
>> > > >
>> > > > It is possible to write some code in an
>> > > > implementation-independent way using GeoAPI interfaces, which aim
>> > > > to do what JDBC interfaces do for databases. Apache SIS and
>> > > > PROJ-JNI are implementations of GeoAPI interfaces, so by using
>> > > > those interfaces you can let users choose between those two
>> > > > implementations. I think that GeoAPI wrappers could easily be
>> > > > contributed to PROJ4J as well if there is a desire for that.
>> > > >
>> > > > Regarding Geohash, if we are talking about the algorithm
>> > > > described at https://en.wikipedia.org/wiki/Geohash, then SIS
>> > > > already supports it. SIS also supports the Military Grid
>> > > > Reference System (MGRS), which can be seen as another kind of
>> > > > geohash with better characteristics.
>> > > >
>> > > > Regards,
>> > > >
>> > > > Martin
>> > > >
>> > > > ---------------------------------------------------------------------
>> > > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

--
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge
+47 480 94 297