Mosaic by Databricks Labs <https://github.com/databrickslabs/mosaic>



On Tue, Jan 17, 2023 at 15:53, Grigory Pomadchin <dau...@gmail.com> wrote:

> Hey Mo,
>
> That is awesome, great to hear!
>
> Best,
>
> Grigory
>
> On Tue, Jan 17, 2023 at 9:03 AM Mo Sarwat <mosar...@apache.org> wrote:
>
>> Grigory,
>>
>> Thanks a lot for chiming in. I really like the PostGIS to PostgreSQL
>> analogy. That is exactly what Sedona (an Apache project) is to Spark. Spark
>> core should remain light / generic enough (similar to PostgreSQL) and all
>> spatial functionalities should be pluggable extensions (Sedona). Otherwise,
>> the core will be unnecessarily heavy to maintain, release, and integrate.
>>
>> Sedona already supports geohashing, among many other standard geospatial
>> functions, all of which work seamlessly with Spark without any issues for
>> the end user. If there is something missing, I would highly recommend that
>> we bring it to the Sedona community, and that will directly benefit Spark
>> users who are doing geo.
>>
>> Implementing geospatial functionality in Spark core would replicate work
>> that has already been done. Databricks, for instance, already uses Sedona
>> internally for its geospatial capabilities.
>>
>> Finally, I would like to mention that I am totally willing to be
>> corrected on that, especially if you have tried Sedona with Spark and found
>> that it does not serve the purpose at all. But please try it first, and
>> let's come up with a few capabilities it cannot provide unless they are
>> implemented in Spark core. Then we can suggest those capabilities to
>> the Spark community.
>>
>> Thanks,
>> -Mo
>>
>>
>> On 2023/01/17 03:09:06 Grigory Pomadchin wrote:
>> > Hey folks,
>> >
>> > Traditionally, GIS functionality is distributed somewhat separately;
>> > PostGIS is a great example. Indeed, for GIS needs Sedona / GeoMesa /
>> > GeoWave may work out. I think GeoMesa implements GeoHash (see
>> > https://www.geomesa.org/documentation/stable/user/spark/sparksql_functions.html,
>> > which could be used as inspiration at least).
>> >
>> > I'm pretty sure Databricks provides some GIS functions (H3) at this
>> > point. Could that be an argument for having something in the core /
>> > officially supported by the Spark community?
>> >
>> > I'd really love to see a relatively lightweight (JTS + Proj4j / SIS)
>> > library in the wild, with basic expressions and optimization rules, that
>> > is usable primarily through the Spark native interfaces, so there is no
>> > need to figure out the API, how to set it up, or how to resolve peculiar
>> > dependencies. It could be a step towards standardizing Spark GIS types.
>> >
>> > Best,
>> >
>> > Grigory
>> >
>> > On Mon, Jan 16, 2023 at 6:21 PM Mo Sarwat <mosar...@apache.org> wrote:
>> >
>> > > Martin, thanks for chiming in and mentioning Apache SIS. However, Mark
>> > > was asking about Geo in Spark, which Sedona already supports.
>> > >
>> > > Yet, I like the idea of keeping all dependencies within the Apache
>> > > family. I believe a good solution would be for you (or the SIS community
>> > > at large) to include Apache SIS in Sedona to replace libs like GeoTools.
>> > > The Sedona community would definitely welcome your contribution :)
>> > >
>> > > Regards,
>> > > -Mo
>> > >
>> > > On 2023/01/16 22:24:14 Martin Desruisseaux wrote:
>> > > > Hello Mark
>> > > >
>> > > > Indeed Sedona is surely a serious candidate. One aspect to take into
>> > > > consideration, depending on how "core" the geospatial services would
>> > > > be, is that Sedona depends on an LGPL library (GeoTools, bundled
>> > > > separately) for map projections, Shapefile and GeoTIFF support. So
>> > > > those features could not be in core, since category X dependencies
>> > > > shall be optional.
>> > > >
>> > > > Regarding referencing by coordinates (including map projections), I'm
>> > > > aware of 3 libraries having a license compatible with Apache:
>> > > >
>> > > > * Apache SIS (Apache License)
>> > > > * PROJ4J (Apache License)
>> > > > * PROJ-JNI (MIT license)
>> > > >
>> > > > PROJ-JNI is a binding to the PROJ native library using the Java Native
>> > > > Interface (JNI). PROJ is the most well known map projection library,
>> > > > but it is difficult to bundle native code in a Java application.
>> > > >
>> > > > I'm not in a neutral position to say this, but I believe that Apache
>> > > > SIS is the most powerful open source pure-Java referencing library.
>> > > > But it is relatively big, about 4 MB for the referencing module with
>> > > > its dependencies, not counting the optional EPSG geodetic dataset
>> > > > (because it is not compatible with the Apache license). Apache SIS is
>> > > > not the library with the largest number of map projections (PROJ4J
>> > > > has more), but it handles some difficult problems and scales well
>> > > > with three- or four-dimensional data (or more).
>> > > >
>> > > > PROJ4J is a lightweight library which may be sufficient if data are
>> > > > mostly two-dimensional (limited 3D support also seems possible) and if
>> > > > an uncertainty of a few metres in coordinate transformations
>> > > > (depending on how datum shifts are specified) is acceptable.
>> > > >
>> > > > It is possible to write some code in an implementation-independent
>> > > > way using GeoAPI interfaces, which aim to do what JDBC interfaces do
>> > > > for databases. Apache SIS and PROJ-JNI are implementations of GeoAPI
>> > > > interfaces, so by using those interfaces you can let users choose
>> > > > between those two implementations. I think that GeoAPI wrappers could
>> > > > easily be contributed to PROJ4J as well if there is a desire for that.
>> > > >
>> > > > Regarding Geohash, if we are talking about the algorithm described at
>> > > > https://en.wikipedia.org/wiki/Geohash, then SIS already supports it.
>> > > > SIS also supports the Military Grid Reference System (MGRS), which can
>> > > > be seen as another kind of geohash with better characteristics.
>> > > >
>> > > > Regards,
>> > > >
>> > > >     Martin
>> > > >
>> > > >
>> ---------------------------------------------------------------------
>> > > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>> > > >
>> > > >
>> > >
>> > >
>> > >
>> >
>>
>>
>>
>

-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297
