Hey Mo,

That is awesome, great to hear!

Best,

Grigory

On Tue, Jan 17, 2023 at 9:03 AM Mo Sarwat <mosar...@apache.org> wrote:

> Grigory,
>
> Thanks a lot for chiming in - I really like the PostGIS to PostgreSQL
> analogy. That is exactly what Sedona (an Apache project) is to Spark. Spark
> core should remain light / generic enough (similar to PostgreSQL) and all
> spatial functionalities should be pluggable extensions (Sedona). Otherwise,
> the core will be unnecessarily heavy to maintain, release, and integrate.
>
> Sedona already supports geo-hashing among many other standard geospatial
> functions, which work seamlessly with Spark without any issues for the
> end user. If there is something missing, I would highly recommend that we
> bring it to the Sedona community, and that will directly feed into the
> benefit of Spark users who are doing geo.
>
> Implementing geospatial functionality in Spark core would duplicate work
> that has already been done. Databricks, for instance, already uses
> Sedona internally with their geospatial capabilities.
>
> Finally, I would like to mention that I am totally willing to be corrected
> on that, especially if you tried Sedona with Spark and found that it
> does not serve the purpose at all. But please try it first, and let's come
> up with a few capabilities it cannot provide unless they are implemented in
> Spark core. Then we can suggest those capabilities to the Spark
> community.
>
> Thanks,
> -Mo
>
>
> On 2023/01/17 03:09:06 Grigory Pomadchin wrote:
> > Hey folks,
> >
> > Traditionally GIS functionality is distributed a bit separately - i.e.
> > PostGIS is a great example; and indeed for GIS needs Sedona / GeoMesa /
> > GeoWave may work out; I think GeoMesa implements GeoHash (see
> > https://www.geomesa.org/documentation/stable/user/spark/sparksql_functions.html
> > - could be used as an inspiration at least);
> >
> > I'm pretty sure Databricks provides some GIS functions (H3) at this
> > point. Could be an argument for having something in the core /
> > officially supported by the Spark community?
> >
> > I'd really love to see some relatively lightweight (JTS + Proj4j / SIS)
> > library with basic expressions and optimization rules in the wild that is
> > usable in the Spark native interfaces primarily; so there is no need to
> > figure out the API / way to set it up and / or resolve peculiar
> > dependencies. Could be a step towards Spark GIS types standardization.
> >
> > Best,
> >
> > Grigory
> >
> > On Mon, Jan 16, 2023 at 6:21 PM Mo Sarwat <mosar...@apache.org> wrote:
> >
> > > Martin, thanks for chiming in and mentioning Apache SIS. However, Mark
> > > was asking about Geo in Spark, which Sedona already supports.
> > >
> > > Yet, I like the idea of keeping all dependencies within the Apache
> > > family. I believe a good solution would be for you (or the SIS
> > > community at large) to include Apache SIS in Sedona to replace libs
> > > like GeoTools. The Sedona community would definitely welcome your
> > > contribution :)
> > >
> > > Regards,
> > > -Mo
> > >
> > > On 2023/01/16 22:24:14 Martin Desruisseaux wrote:
> > > > Hello Mark
> > > >
> > > > Indeed Sedona is surely a serious candidate. Maybe one aspect to
> > > > take into consideration, depending on how "core" the geospatial
> > > > services would be, is that Sedona depends on an LGPL library
> > > > (GeoTools, bundled separately) for map projections, Shapefile and
> > > > GeoTIFF support. So those features could not be in core, since
> > > > category X dependencies shall be optional.
> > > >
> > > > Regarding referencing by coordinates (including map projections),
> > > > I'm aware of 3 libraries having a license compatible with Apache:
> > > >
> > > > * Apache SIS (Apache License)
> > > > * PROJ4J (Apache license)
> > > > * PROJ-JNI (MIT license)
> > > >
> > > > PROJ-JNI is a binding to the PROJ native library using Java Native
> > > > Interface (JNI). PROJ is the most well known map projection library,
> > > > but it is difficult to bundle native code in a Java application.
> > > >
> > > > I'm not in a neutral position to say that, but I believe that Apache
> > > > SIS is the most powerful open source pure-Java referencing library.
> > > > But it is relatively big, about 4 MB for the referencing module with
> > > > its dependencies, not counting the optional EPSG geodetic dataset
> > > > (because it is not compatible with the Apache license). Apache SIS is
> > > > not the library with the largest number of map projections (PROJ4J
> > > > has more), but it handles some difficult problems and scales well
> > > > with three- or four-dimensional data (or more).
> > > >
> > > > PROJ4J is a lightweight library which may be sufficient if data are
> > > > mostly two-dimensional (limited 3D support seems also possible) and
> > > > if an uncertainty of a few metres in coordinate transformations
> > > > (depending on how datum shifts are specified) is acceptable.
> > > >
> > > > It is possible to write some code in an implementation-independent
> > > > way using GeoAPI interfaces, which aim to do what JDBC interfaces do
> > > > for databases. Apache SIS and PROJ-JNI are implementations of GeoAPI
> > > > interfaces, so by using those interfaces you can let users choose
> > > > among those two implementations. I think that GeoAPI wrappers could
> > > > easily be contributed to PROJ4J as well if there is a desire for that.
> > > >
> > > > Regarding Geohash, if we are talking about the algorithm described at
> > > > https://en.wikipedia.org/wiki/Geohash, then SIS already supports it.
> > > > SIS also supports the Military Grid Reference System (MGRS), which
> > > > can be seen as another kind of geohash with better characteristics.
> > > >
> > > > Regards,
> > > >
> > > >     Martin
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> > > >
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> > >
> > >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
