Re: SparkR build with AppVeyor, broken by external reason

2023-01-16 Thread Dongjoon Hyun
Thank you for checking and sharing, Hyukjin. :)

Dongjoon.

On Mon, Jan 16, 2023 at 4:37 PM Hyukjin Kwon wrote:

> Hi all,
>
> AppVeyor is currently broken, presumably because of the flaky GitHub
> authorization issue (
> https://help.appveyor.com/discussions/problems/11287-the-build-phase-is-set-to-msbuild-mode-default-but-no-visual-studio-project-or-solution-files-were-found
> ).
>
> The AppVeyor build is specific to SparkR (on Windows), so it can be ignored
> in most cases for now.
>


Re: [Suggest] Add geo function to core

2023-01-16 Thread Grigory Pomadchin
Hey folks,

Traditionally, GIS functionality is distributed somewhat separately; PostGIS
is a great example. For GIS needs, Sedona / GeoMesa / GeoWave may indeed work
out. I think GeoMesa implements GeoHash; see
https://www.geomesa.org/documentation/stable/user/spark/sparksql_functions.html
(it could serve as inspiration at least).

I'm pretty sure Databricks provides some GIS functions (H3) at this point.
Could that be an argument for having something in the core, officially
supported by the Spark community?

I'd really love to see a relatively lightweight library (JTS + Proj4j / SIS)
in the wild, with basic expressions and optimization rules, that is usable
primarily through the Spark native interfaces, so there is no need to
figure out the API or how to set it up, or to resolve peculiar
dependencies. It could be a step towards standardizing Spark GIS types.

Best,

Grigory

On Mon, Jan 16, 2023 at 6:21 PM Mo Sarwat wrote:

> Martin, thanks for chiming in and mentioning Apache SIS. However, Mark was
> asking about Geo in Spark, which Sedona already supports.
>
> Yet, I like the idea of keeping all dependencies within the Apache family.
> I believe a good solution would be for you (or the SIS community at large)
> to include Apache SIS in Sedona to replace libs like GeoTools. The Sedona
> community would definitely welcome your contribution :)
>
> Regards,
> -Mo
>
> On 2023/01/16 22:24:14 Martin Desruisseaux wrote:
> > Hello Mark
> >
> > Indeed Sedona is surely a serious candidate. Maybe one aspect to take into
> > consideration, depending on how "core" the geospatial services would be,
> > is that Sedona depends on an LGPL library (GeoTools, bundled separately)
> > for map projections, Shapefile and GeoTIFF support. So those features could
> > not be in core since category X dependencies shall be optional.
> >
> > Regarding referencing by coordinates (including map projections), I'm
> > aware of 3 libraries having a license compatible with Apache:
> >
> > * Apache SIS (Apache License)
> > * PROJ4J (Apache license)
> > * PROJ-JNI (MIT license)
> >
> > PROJ-JNI is a binding to the PROJ native library using the Java Native
> > Interface (JNI). PROJ is the most well-known map projection library, but
> > it is difficult to bundle native code in a Java application.
> >
> > I'm not in a neutral position to say that, but I believe that Apache SIS
> > is the most powerful open source pure-Java referencing library. But it is
> > relatively big, about 4 MB for the referencing module with its
> > dependencies, not counting the optional EPSG geodetic dataset (because it
> > is not compatible with the Apache license). Apache SIS is not the library
> > with the largest number of map projections (PROJ4J has more), but it
> > handles some difficult problems and scales well with three- or
> > four-dimensional data (or more).
> >
> > PROJ4J is a lightweight library which may be sufficient if data are
> > mostly two-dimensional (limited 3D support also seems possible) and if an
> > uncertainty of a few metres in coordinate transformations (depending on
> > how datum shifts are specified) is acceptable.
> >
> > It is possible to write some code in an implementation-independent way
> > using GeoAPI interfaces, which aim to do what JDBC interfaces do for
> > databases. Apache SIS and PROJ-JNI are implementations of GeoAPI
> > interfaces, so by using those interfaces you can let users choose between
> > those two implementations. I think that GeoAPI wrappers could easily be
> > contributed to PROJ4J as well if there is a desire for that.
> >
> > Regarding Geohash, if we are talking about the algorithm described at
> > https://en.wikipedia.org/wiki/Geohash, then SIS already supports it. SIS
> > also supports the Military Grid Reference System (MGRS), which can be seen
> > as another kind of geohash with better characteristics.
> >
> > Regards,
> >
> > Martin
> >
> > -
> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >
> >
>


SparkR build with AppVeyor, broken by external reason

2023-01-16 Thread Hyukjin Kwon
Hi all,

AppVeyor is currently broken, presumably because of the flaky GitHub
authorization issue (
https://help.appveyor.com/discussions/problems/11287-the-build-phase-is-set-to-msbuild-mode-default-but-no-visual-studio-project-or-solution-files-were-found
).

The AppVeyor build is specific to SparkR (on Windows), so it can be ignored
in most cases for now.


Re: [Suggest] Add geo function to core

2023-01-16 Thread Mo Sarwat
Martin, thanks for chiming in and mentioning Apache SIS. However, Mark was 
asking about Geo in Spark, which Sedona already supports. 

Yet, I like the idea of keeping all dependencies within the Apache family. I
believe a good solution would be for you (or the SIS community at large) to
include Apache SIS in Sedona to replace libs like GeoTools. The Sedona
community would definitely welcome your contribution :)

Regards,
-Mo

On 2023/01/16 22:24:14 Martin Desruisseaux wrote:
> Hello Mark
> 
> Indeed Sedona is surely a serious candidate. Maybe one aspect to take into
> consideration, depending on how "core" the geospatial services would be, is
> that Sedona depends on an LGPL library (GeoTools, bundled separately) for map
> projections, Shapefile and GeoTIFF support. So those features could not be in
> core since category X dependencies shall be optional.
> 
> Regarding referencing by coordinates (including map projections), I'm aware 
> of 3 libraries having a license compatible with Apache:
> 
> * Apache SIS (Apache License)
> * PROJ4J (Apache license)
> * PROJ-JNI (MIT license)
> 
> PROJ-JNI is a binding to the PROJ native library using the Java Native
> Interface (JNI). PROJ is the most well-known map projection library, but it
> is difficult to bundle native code in a Java application.
> 
> I'm not in a neutral position to say that, but I believe that Apache SIS is
> the most powerful open source pure-Java referencing library. But it is
> relatively big, about 4 MB for the referencing module with its dependencies,
> not counting the optional EPSG geodetic dataset (because it is not compatible
> with the Apache license). Apache SIS is not the library with the largest
> number of map projections (PROJ4J has more), but it handles some difficult
> problems and scales well with three- or four-dimensional data (or more).
> 
> PROJ4J is a lightweight library which may be sufficient if data are mostly
> two-dimensional (limited 3D support also seems possible) and if an
> uncertainty of a few metres in coordinate transformations (depending on how
> datum shifts are specified) is acceptable.
> 
> It is possible to write some code in an implementation-independent way using
> GeoAPI interfaces, which aim to do what JDBC interfaces do for databases.
> Apache SIS and PROJ-JNI are implementations of GeoAPI interfaces, so by using
> those interfaces you can let users choose between those two implementations.
> I think that GeoAPI wrappers could easily be contributed to PROJ4J as well if
> there is a desire for that.
> 
> Regarding Geohash, if we are talking about the algorithm described at
> https://en.wikipedia.org/wiki/Geohash, then SIS already supports it. SIS
> also supports the Military Grid Reference System (MGRS), which can be seen as
> another kind of geohash with better characteristics.
> 
> Regards,
> 
> Martin
> 




Re: [Suggest] Add geo function to core

2023-01-16 Thread Martin Desruisseaux
Hello Mark

Indeed Sedona is surely a serious candidate. Maybe one aspect to take into
consideration, depending on how "core" the geospatial services would be, is
that Sedona depends on an LGPL library (GeoTools, bundled separately) for map
projections, Shapefile and GeoTIFF support. So those features could not be in
core since category X dependencies shall be optional.

Regarding referencing by coordinates (including map projections), I'm aware of 
3 libraries having a license compatible with Apache:

* Apache SIS (Apache License)
* PROJ4J (Apache license)
* PROJ-JNI (MIT license)

PROJ-JNI is a binding to the PROJ native library using the Java Native
Interface (JNI). PROJ is the most well-known map projection library, but it is
difficult to bundle native code in a Java application.

I'm not in a neutral position to say that, but I believe that Apache SIS is
the most powerful open source pure-Java referencing library. But it is
relatively big, about 4 MB for the referencing module with its dependencies,
not counting the optional EPSG geodetic dataset (because it is not compatible
with the Apache license). Apache SIS is not the library with the largest
number of map projections (PROJ4J has more), but it handles some difficult
problems and scales well with three- or four-dimensional data (or more).

PROJ4J is a lightweight library which may be sufficient if data are mostly
two-dimensional (limited 3D support also seems possible) and if an uncertainty
of a few metres in coordinate transformations (depending on how datum shifts
are specified) is acceptable.

It is possible to write some code in an implementation-independent way using
GeoAPI interfaces, which aim to do what JDBC interfaces do for databases.
Apache SIS and PROJ-JNI are implementations of GeoAPI interfaces, so by using
those interfaces you can let users choose between those two implementations. I
think that GeoAPI wrappers could easily be contributed to PROJ4J as well if
there is a desire for that.
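
To illustrate the general idea of coding against an interface rather than a
concrete library, here is a small Python sketch. It does not use GeoAPI,
Apache SIS, PROJ-JNI or PROJ4J; the names Projection, PlateCarree,
SphericalMercator and project_all are hypothetical and exist only for this
example. The Mercator formula is the common spherical approximation, not a
geodetically rigorous transformation.

```python
import math
from typing import List, Protocol, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees


class Projection(Protocol):
    """Hypothetical interface: anything with a forward() method qualifies."""

    def forward(self, lon: float, lat: float) -> Point: ...


class PlateCarree:
    """Trivial implementation: returns the degrees unchanged."""

    def forward(self, lon: float, lat: float) -> Point:
        return (lon, lat)


class SphericalMercator:
    """Spherical Mercator approximation, result in metres."""

    RADIUS = 6378137.0  # sphere radius commonly used for web maps

    def forward(self, lon: float, lat: float) -> Point:
        x = self.RADIUS * math.radians(lon)
        y = self.RADIUS * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
        return (x, y)


def project_all(points: List[Point], projection: Projection) -> List[Point]:
    # Caller code depends only on the interface, so either implementation
    # (or a future third one) can be passed in without changes here.
    return [projection.forward(lon, lat) for lon, lat in points]
```

Calling project_all(points, PlateCarree()) or
project_all(points, SphericalMercator()) swaps the implementation without
touching the calling code, which is the property the interface-based approach
is meant to give.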

Regarding Geohash, if we are talking about the algorithm described at
https://en.wikipedia.org/wiki/Geohash, then SIS already supports it. SIS
also supports the Military Grid Reference System (MGRS), which can be seen as
another kind of geohash with better characteristics.
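
For readers unfamiliar with the algorithm, the encoding step described on the
Wikipedia page can be sketched in a few lines of Python. This is an
illustration only, not the Apache SIS implementation; the function name
geohash_encode and its parameters are made up for this example.

```python
# Geohash alphabet: digits plus lowercase letters, without a, i, l and o.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 11) -> str:
    """Encode a latitude/longitude pair (degrees) as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # bits accumulated for the current character
    bit_count = 0     # how many bits are in `bits` so far
    use_lon = True    # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


print(geohash_encode(42.6, -5.6, precision=5))  # prints "ezs42"
```

Each extra character narrows the cell further, so a shorter hash is always a
prefix of a longer hash for the same point.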

Regards,

Martin




Re: [Suggest] Add geo function to core

2023-01-16 Thread Mo Sarwat
Mark,

There is already another Apache project (namely Apache Sedona) that provides 
comprehensive support of geospatial operations in Spark. Please check it out:

GitHub: https://github.com/apache/sedona
Website: https://sedona.apache.org

Please feel free to contribute more geospatial functions to Sedona too!

Regards,
-Mo

On 2023/01/06 18:03:38 Mark Andreev wrote:
> Hi,
> 
> I suggest adding geographical functions to Apache Spark core, like
> ClickHouse (https://clickhouse.com/docs/en/sql-reference/functions/geo/).
> 
> - Geographical Coordinates Functions
> - Geohash Functions
> - H3 Indexes
> - S2 Indexes
> 
> What do you think? What is the current policy on core evolution? Should we
> create a separate module (a standalone repository outside of Apache) and,
> if it succeeds, merge it into the main branch?
> 
> --
> Best regards,
> Mark Andreev
> 
