Replace LGPL dependencies by Apache ones?

2022-12-09 Thread Martin Desruisseaux

Hello all

The Sedona root pom.xml file declares some GeoTools modules as 
dependencies that users must add themselves [1]. But I do not know whether 
those dependencies are optional. Can Sedona work without them, as 
required by [2]? Alternatively, would it be conceivable to replace them 
with Apache SIS [3]? The GeoTools modules that are used seem to be for 
referencing services and GeoTIFF support, both of which Apache SIS 
offers as well. The API for referencing services should be similar, 
making the transition easier.


Is there any interest in such a migration? If yes, I would be glad to help.

    Martin

[1]https://github.com/apache/incubator-sedona/blob/5c0f92701b45d49dbab78a5366776e30b0fef834/pom.xml#L135
[2]https://www.apache.org/legal/resolved.html#optional
[3]https://sis.apache.org/


Re: Replace LGPL dependencies by Apache ones?

2022-12-09 Thread Mo Sarwat
Martin,

That sounds like a good idea - thanks for the proposal. But, we first need
to consider the following:

1. Make sure that it will not break other components in Sedona.
2. The speed of SIS is at least the same as GeoTools. Better if
faster, for sure.

If those 2 points are covered, I believe we can go ahead with migration to
SIS. We will definitely love to have you as a Sedona contributor 😊

Thanks,
Mo


Re: Replace LGPL dependencies by Apache ones?

2022-12-09 Thread Martin Desruisseaux

Hello Mo

Thanks for the reply.

On 09/12/2022 at 14:13, Mo Sarwat wrote:

> That sounds like a good idea - thanks for the proposal. But, we first need to
> consider the following:
>
> 1. Make sure that will not break other components in Sedona
> 2. The speed of using SIS is at least the same as GeoTools. Better if faster
> for sure


It is difficult to know without trying; I have no benchmark with 
GeoTools. A benchmark with PROJ (the C library) was done 5 years ago 
[1], but is outdated because Java 9 vastly increased the speed of 
trigonometric functions, and PROJ 6 changed its architecture making it 
closer to Apache SIS (search "SIS" on [2]). I think that the only way to 
answer those questions would be to try migration in a branch.
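
As a rough illustration of how a first number could be obtained before a real branch exists, here is a micro-benchmark on the kind of trigonometric kernel that map projections run. This is only a sketch under stated assumptions: it is neither SIS nor GeoTools code, the class name is hypothetical, a spherical Mercator formula stands in for a real projection, and the point and round counts are arbitrary. A serious comparison would call the actual libraries, preferably under a harness such as JMH.

```java
/** Hypothetical stand-in for a projection kernel; not SIS or GeoTools code. */
final class ProjectionBenchmark {
    /** Sphere radius in metres (the WGS 84 semi-major axis, used here as an assumption). */
    static final double R = 6378137.0;

    /** Forward spherical Mercator: (lon, lat) pairs in degrees become (x, y) in metres, in place. */
    static void forward(double[] coords) {
        for (int i = 0; i < coords.length; i += 2) {
            double lambda = Math.toRadians(coords[i]);
            double phi = Math.toRadians(coords[i + 1]);
            coords[i] = R * lambda;
            coords[i + 1] = R * Math.log(Math.tan(Math.PI / 4 + phi / 2));
        }
    }

    /** Returns the mean time per point, in nanoseconds, over the given number of rounds. */
    static double nanosPerPoint(int points, int rounds) {
        double[] template = new double[points * 2];
        for (int i = 0; i < points; i++) {
            template[2 * i] = -180 + 360.0 * i / points;     // longitudes
            template[2 * i + 1] = -80 + 160.0 * i / points;  // latitudes, kept away from the poles
        }
        long total = 0;
        for (int r = 0; r < rounds; r++) {
            double[] work = template.clone();                // copy outside the timed region
            long start = System.nanoTime();
            forward(work);
            total += System.nanoTime() - start;
        }
        return (double) total / ((long) points * rounds);
    }

    public static void main(String[] args) {
        nanosPerPoint(100_000, 20);                          // warm-up so the JIT compiles forward()
        System.out.printf("%.1f ns/point%n", nanosPerPoint(100_000, 50));
    }
}
```

Running the same class on Java 8 and on a newer runtime would give a first feel for the trigonometric speed-up mentioned above, but not for the relative speed of the two libraries.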


    Martin

[1]https://www.geomatys.com/2017/08/28/proj-4-versus-apache-sis-a-performance-comparison/
[2]https://gdalbarn.com/


Re: Replace LGPL dependencies by Apache ones?

2022-12-14 Thread Martin Desruisseaux

Hello all

To follow up on this question, is there any interest in experimenting with 
a replacement of the GeoTools dependency (LGPL license) by Apache SIS (not 
knowing in advance whether the experiment would be conclusive)? If 
yes, where should the branch be?


Regarding dependencies, looking at the root pom.xml file [1] leaves me a 
little confused. All non-test dependencies are declared with the scope 
${dependency.scope}, where the ${dependency.scope} property is set to 
"provided" by default. This pom.xml file seems to implement an "all or 
nothing" behavior regarding dependency scope management. I would have 
expected the default scope ("compile") for the set of dependencies that 
are essential for Sedona to work, and the "provided" scope for optional 
dependencies. Are all the dependencies really optional?


I think that the NOTICE file [2] needs to list the licenses of dependencies. 
Maybe [3] can be used as a source of inspiration? Note that gt-epsg is 
under the EPSG Terms of Use, not only LGPL. A similar issue raised for 
Apache Calcite goes deeper into this topic and how it can be resolved [4].


I mention those licensing topics because I guess they are the reason for 
the way dependencies are declared in the root pom.xml. If I create a 
branch to experiment with replacing some dependencies, it would help to 
understand the purpose of those declarations.


    Thanks,

        Martin

[1]https://github.com/apache/incubator-sedona/blob/sedona-1.3.1-incubating-rc1/pom.xml
[2]https://github.com/apache/incubator-sedona/blob/sedona-1.3.1-incubating-rc1/NOTICE
[3]https://github.com/apache/sis/blob/master/NOTICE
[4]https://github.com/locationtech/proj4j/issues/90


Re: Replace LGPL dependencies by Apache ones?

2022-12-14 Thread Jia Yu
Dear Martin,

Thanks for your suggestion!

1. I am interested in exploring replacing geotools with Proj4J or Apache
SIS if this replacement can serve this purpose: this library can be shipped
together in Sedona binary (Maven 'compile' scope) and users no longer need
to manually add another dependency like the geotools-wrapper [1].

Given this purpose, I can see there are a few features that Apache SIS
lacks. Please correct me if I am wrong, since you are an expert on Apache
SIS.

(1) Compatibility with both Java 8 and Java 11. SIS only works on Java 11;
however, many Sedona users are still on Java 8. We have no plan to drop
Java 8 support any time soon.
(2) Support for reading ESRI shapefiles.
(3) Support for GeoTIFF read and write. SIS supports GeoTIFF read, but
Sedona also writes results to GeoTIFF.
(4) Apache SIS also does not ship the EPSG dataset in its binary, meaning
that users still have to manually add another dependency.

2. Sedona pom files are kind of very complicated since we have modules for
both Spark and Flink. The core logic is:

(1) The root pom does not have any compile-scope dependencies. All these
dependencies are just for development purposes. We never package Spark and
Flink dependencies because we expect users to add Spark and Flink
dependencies on their own.
(2) If it is really necessary, dependencies that need to be shipped in
Sedona binary are put in the corresponding child module pom under 'compile'
scope.
(3) There are a few exceptions: JTS, jts2geojson, GeoTools [2]. We don't
ship them in binary because these libraries are very commonly used in GIS
applications. Packaging them to Sedona might end up polluting the user's
environment.
(4) Sedona has a special module called sedona-python-adapter which is
designated for non-JVM world users (Python and R). In this module, we
package all dependencies in compile scope since these users usually don't
know much about Maven packaging. However, geotools is excluded due to the
license issue.

3. If you think using SIS can serve the purpose mentioned above, you can
create a fork of Sedona and work on it. Then you can create a PR on Sedona
and the CI will automatically test the code.

Thanks,
Jia

[1] https://sedona.apache.org/setup/maven-coordinates/#use-sedona-fat-jars
[2]
https://sedona.apache.org/setup/maven-coordinates/#use-sedona-and-third-party-jars-separately


Re: Replace LGPL dependencies by Apache ones?

2022-12-15 Thread Martin Desruisseaux

Hello Jia

On 15/12/2022 at 02:26, Jia Yu wrote:

> 1. I am interested in exploring replacing geotools with Proj4J or 
> Apache SIS if this replacement can serve this purpose: this library 
> can be shipped together in Sedona binary (Maven 'compile' scope) and 
> users no longer need to manually add another dependency like the 
> geotools-wrapper.


Yes, Apache SIS is of course under Apache license except the EPSG 
dataset which needs to be added explicitly by the user. More on it in 
item (4).



> (1) Compatibility with both Java 8 and Java 11. SIS only works for 
> Java 11 however many Sedona users are still on Java 8. We have no plan 
> to get rid of Java 8 support any time soon.


All SIS versions up to SIS 1.3 (to be released soon) are compatible with 
Java 8. The next version (SIS 1.4) will indeed require Java 11, but 
maybe SIS 1.3 would be sufficient until Sedona can upgrade? If needed, 
we can do some patch releases of SIS 1.3 for bug fixes.




> (2) Support of ESRI shapefile read.

True, this part is missing. This work was started years ago by a contributor 
but is not finished.



> (3) Support of GeoTIFF read and write. SIS supports GeoTIFF read but 
> Sedona also writes result to GeoTIFF.


True, GeoTIFF write capability is missing (planned but not done yet). 
A TIFF World File could be used to mitigate this gap, but I admit it is 
not as nice.
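
For readers unfamiliar with the World File fallback: it is a six-line text file stored next to the image that holds only the affine pixel-to-map transform, with no CRS information, which is why it is less convenient than real GeoTIFF tags. A minimal sketch of producing one follows. The class and method names are hypothetical, and this is neither SIS nor Sedona code.

```java
import java.util.Locale;

/** Hypothetical helper that formats the content of a World File (e.g. a .tfw file). */
final class WorldFile {
    /**
     * Formats the six coefficients in World File order (A, D, B, E, C, F), where
     *   x = A*col + B*row + C
     *   y = D*col + E*row + F
     * and (C, F) is the map location of the centre of the upper-left pixel.
     */
    static String format(double a, double d, double b, double e, double c, double f) {
        StringBuilder sb = new StringBuilder();
        for (double v : new double[] {a, d, b, e, c, f}) {
            // Locale.ROOT guarantees a '.' decimal separator whatever the platform locale.
            sb.append(String.format(Locale.ROOT, "%.10f%n", v));
        }
        return sb.toString();
    }

    /** North-up raster without rotation; (minX, maxY) is the outer corner of the upper-left pixel. */
    static String northUp(double minX, double maxY, double pixelWidth, double pixelHeight) {
        return format(pixelWidth, 0, 0, -pixelHeight,
                      minX + pixelWidth / 2, maxY - pixelHeight / 2);
    }

    public static void main(String[] args) {
        // Example values only: a 30 m grid whose upper-left corner is at (500000, 4650000).
        System.out.print(northUp(500000, 4650000, 30, 30));
    }
}
```

Because the file carries no CRS, a reader still needs the coordinate reference system from elsewhere, for example a companion .prj file.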



> (4) Apache SIS also does not ship EPSG dataset in its binary meaning 
> that the users still have to manually add another dependency.


Yes, but the EPSG Terms of Use have been classified as a Category X license 
by Apache [1], and no library can ship those data without the EPSG Terms of 
Use [2]; doing otherwise is called "relicensing", and no one can do 
that except the copyright owner. The libraries that bundle EPSG data 
without the EPSG Terms of Use are not legally correct. Proj4J fixed 
this problem recently with a separate JAR, similar to what SIS does 
[3]. SIS provides different ways for the user to add the 
EPSG data [4].



> 2. Sedona pom files are kind of very complicated since we have modules 
> for both Spark and Flink. The core logic is: (…snip…)



Thanks for the explanation. Indeed, I understand a little bit more now.


> 3. If you think using SIS can serve the purpose mentioned above, you 
> can create a fork of Sedona and work on it. Then you can create a PR 
> on Sedona and the CI will automatically test the code.


It is hard to know how well it serves the purpose without trying. With 
SIS 1.3 there would be a loss of functionality on one hand (GeoTIFF 
writer and Shapefile) but some new functionality on the other hand. 
The coordinate transformation services provided by SIS are more 
advanced, and I would also be curious to know how the GeoTIFF reader 
compares. But I'm not sure that the automatic CI tests would be successful 
without some PMC decision about the altered functionality. What I do not 
know is how critical those features would be for Sedona.


    Martin

[1]https://issues.apache.org/jira/browse/LEGAL-183
[2]https://epsg.org/terms-of-use.html
[3]https://github.com/locationtech/proj4j/issues/90
[4]https://sis.apache.org/epsg.html


Re: Replace LGPL dependencies by Apache ones?

2022-12-16 Thread Martin Desruisseaux

On a minor note, in the following links:


[1] https://sedona.apache.org/setup/maven-coordinates/#use-sedona-fat-jars
[2] 
https://sedona.apache.org/setup/maven-coordinates/#use-sedona-and-third-party-jars-separately


The sentence "This libary is under GNU Lesser General Public License 
(LGPL) license" should be completed with "and under EPSG terms of use" 
if the JAR contains gt-epsg.jar.


    Martin




Re: Replace LGPL dependencies by Apache ones?

2022-12-17 Thread Jia Yu
Hi Martin,

Thanks for your email.

1. I will update the Sedona website to better explain the EPSG terms of use.
2. GeoTIFF read/write is integral to the Sedona raster functions. We don't
want to lose it. Shapefile read is also important. So it looks like we cannot
avoid GeoTools anyway. Given this, I am reluctant to add Apache SIS
as an additional dependency unless the performance improvement on
ST_Transform is significant.

Other PMC members, please feel free to chime in.

Thanks,
Jia


Re: Replace LGPL dependencies by Apache ones?

2022-12-19 Thread Martin Andersson
I'm not a PMC member. I'm just a happy Sedona user and occasional
contributor.

I think that the minimum Java version is the key issue.

Red Hat will maintain OpenJDK 8 until 2026. Hadoop and Spark still support
Java 8. My guess is that they will do so for many years.

GeoTools is moving to Java 11. They might even move to Java 17 soon. But
we can stick with an older version since it is very mature and has all the
features we need.
https://osgeo-org.atlassian.net/browse/GEOT-7254

This could be an opportunity for Apache SIS to get an edge on GeoTools, if
you revert the decision to drop Java 8 and instead decide to adopt a big
data friendly policy. That would position Apache SIS as the spatial library
of choice in the data space.
https://projects.apache.org/projects.html?category#big-data

The missing features are less concerning to me. The Sedona community could
work with the SIS community to implement them. It would be worth it in the
long run if we have a stable GIS library that follows Hadoop's Java version
policy.

I'm struggling to find any upside to dropping Java 8. There are no major
language features in Java 11. New APIs in the SDK can be used with
multi-release JARs.

Br,
Martin


-- 
Regards,
Martin


Re: Replace LGPL dependencies by Apache ones?

2022-12-21 Thread Jia Yu
Folks, sorry for the late reply. I was travelling for the past few days.

Martin Andersson's suggestion makes sense to me. If Apache SIS decides to
provide Java 8 support in the long run, we can try to replace the GeoTools
CRS transformation with Apache SIS.

Thanks,
Jia


Re: Replace LGPL dependencies by Apache ones?

2022-12-22 Thread Martin Desruisseaux

Hello Martin and Jia

On 22/12/2022 at 08:25, Jia Yu wrote:

> Martin Andersson's suggestion makes sense to me. If Apache SIS 
> decides to provide Java 8 support in the long run, we can try to 
> replace GeoTools CRS transformation with Apache SIS.


It would be possible to backport SIS 1.4+ in a Java 8 branch. We did 
something similar for Java 7 some years ago, so we have experience with 
this process. But it is a significant effort (a few thousand lines of 
code have changed since the SIS 1.3 release candidate to take advantage 
of Java 11 methods in Math, NIO, Arrays, collections, TIFF tags, etc.). 
In order to evaluate whether a Java 8 branch is worthwhile, it may be 
desirable to first test "Sedona on SIS" in a branch somewhere and see 
whether it is considered satisfactory.


A little bit of context regarding GeoTools versus Apache SIS: from 2002 
to 2009 I was the main author of the GeoTools metadata, referencing and base 
grid coverage classes, excluding I/O (so GeoTIFF is an independent work). 
The Apache SIS metadata, referencing and grid coverage modules are a 
continuation of the work I started in GeoTools, with bug fixes, 
improvements and upgrades to newer ISO/OGC standards (e.g. GML and ISO 
19162, a.k.a. "WKT 2"). Apache SIS has been used as a source of 
inspiration for a major upgrade of the PROJ library [1].


Performance is one aspect, but accuracy is another important aspect. A 
video from IOGP (creator of the EPSG dataset) explains how an error of a 
few tens of centimeters has cost them millions of dollars [2]. Not every 
project needs high confidence, but some groups such as OSDU (Open 
Subsurface Data Universe) require software to pass the GIGS tests 
(Geospatial Integrity of Geoscience Software) [3] before caring about 
performance. Apache SIS does not yet pass the full GIGS test suite or 
implement all features shown in the IOGP video, but we are working on that 
in collaboration with people from the GIGS and OGC working groups.


Nevertheless, the performance of Apache SIS and GeoTools referencing 
services should be similar, since their code has the same roots. SIS got 
improvements, but I did not measure their performance impact. In any 
case, performance depends a lot on how a library is used. The 
accuracy mentioned in the previous paragraph is not always needed, so we 
should allow users to control the performance/accuracy tradeoff (still a 
work in progress in SIS). Furthermore, transforming points is easy but 
transforming geometries or rasters involves additional complexities. SIS 
has some features such as the Jacobian matrix which, when put in the hands 
of a mathematically skilled developer, allow nice optimizations. This 
is used by SIS for envelope and raster transformations.
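
To make the Jacobian remark concrete: the derivative of a transform at a point is the matrix of partial derivatives, and it gives a local linear approximation of the transform around that point. The sketch below uses a spherical Mercator formula and plain arrays as stand-ins. It is not the SIS API, and every name in it is an assumption made for illustration.

```java
/** Hypothetical illustration of a transform and its Jacobian; not SIS code. */
final class MercatorJacobian {
    /** Sphere radius in metres (assumed value). */
    static final double R = 6378137.0;

    /** Spherical Mercator: (lambda, phi) in radians to (x, y) in metres. */
    static double[] transform(double lambda, double phi) {
        return new double[] { R * lambda, R * Math.log(Math.tan(Math.PI / 4 + phi / 2)) };
    }

    /** Analytic Jacobian at (lambda, phi): rows are (dx/dlambda, dx/dphi) and (dy/dlambda, dy/dphi). */
    static double[][] derivative(double lambda, double phi) {
        return new double[][] { { R, 0 }, { 0, R / Math.cos(phi) } };
    }

    /** First-order estimate of transform(lambda + dLambda, phi + dPhi) from one transform and one Jacobian. */
    static double[] linearEstimate(double lambda, double phi, double dLambda, double dPhi) {
        double[] p = transform(lambda, phi);
        double[][] j = derivative(lambda, phi);
        return new double[] {
            p[0] + j[0][0] * dLambda + j[0][1] * dPhi,
            p[1] + j[1][0] * dLambda + j[1][1] * dPhi
        };
    }

    public static void main(String[] args) {
        double phi = Math.toRadians(45), step = 1e-4;
        double[] exact = transform(0, phi + step);
        double[] approx = linearEstimate(0, phi, 0, step);
        System.out.printf("exact y = %.3f m, linear estimate = %.3f m%n", exact[1], approx[1]);
    }
}
```

The design point is that once the Jacobian is known at a sample point, nearby points can be estimated with a few multiplications instead of new logarithm and tangent evaluations, and the size of the derivative tells the caller how densely an envelope or raster needs to be sampled.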


In summary, if there is interest, creating a "Sedona on SIS" branch 
may take some time. The result would need to be evaluated before deciding 
whether there is a desire to put in more effort, e.g. in a SIS Java 8 branch.


    Regards,

        Martin

[1]https://gdalbarn.com/  — search "SIS" on that page.
[2]https://www.youtube.com/watch?v=IKM-bR6SwVs
[3]https://gigs.iogp.org/


Re: Replace LGPL dependencies by Apache ones?

2023-01-18 Thread Martin Andersson
Sorry for the late reply. The holidays and an endless stream of colds kept
me busy.

I think that Sedona and SIS joining forces would be great. But maybe the
timing isn't right.

I don't think a maintenance branch is going to work. Once the big data
community drops java 8, in a few years, and Sedona is ready to upgrade SIS
you will already be several java releases ahead and we'll be stuck on a new
maintenance branch.

If you are interested in joining the big data community, I would advise you
to look into multi-release JARs and be very conservative when it comes to
bumping the minimum Java version.

Multi-release JARs would not only help you with backwards compatibility. They
would also help you take advantage of new Java features, for those running
on newer runtimes, before you bump the minimum Java version.

For instance, Jackson runs on Java 8 but supports records on newer Java
runtimes, all from the same build and JAR file.
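
One way a library compiled for Java 8 can still recognise records on newer runtimes is reflective feature detection, sketched below. Whether Jackson does it this way or through a multi-release JAR is not stated in this thread, and the class name is hypothetical. In a multi-release JAR, the same idea is expressed by placing alternative class files under META-INF/versions/N/ and setting "Multi-Release: true" in the manifest.

```java
import java.lang.reflect.Method;

/** Hypothetical helper: compiles on Java 8, detects records when the runtime supports them. */
final class RecordSupport {
    /** Class.isRecord() is standard since Java 16; this field is null on runtimes without it. */
    private static final Method IS_RECORD = lookup();

    private static Method lookup() {
        try {
            return Class.class.getMethod("isRecord");
        } catch (NoSuchMethodException e) {
            return null;                       // older runtime: no record support
        }
    }

    /** True if the running JVM exposes Class.isRecord(). */
    static boolean runtimeHasRecords() {
        return IS_RECORD != null;
    }

    /** True if the given type is a record; always false on runtimes without records. */
    static boolean isRecord(Class<?> type) {
        if (IS_RECORD == null) {
            return false;
        }
        try {
            return (Boolean) IS_RECORD.invoke(type);
        } catch (ReflectiveOperationException e) {
            return false;                      // treat any reflection failure as "not a record"
        }
    }

    public static void main(String[] args) {
        System.out.println("Runtime has records: " + runtimeHasRecords());
        System.out.println("String is a record: " + isRecord(String.class));
    }
}
```

Reflection keeps everything in one class file at the cost of a slower call path, whereas a multi-release JAR lets the newer variant call the new API directly but requires building against more than one Java version.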

Br,
Martin Andersson



Re: Replace LGPL dependencies by Apache ones?

2023-01-19 Thread Martin Desruisseaux

Hello Martin

Indeed, we use multi-release JAR files for other projects. But in the Apache 
SIS case, the upgrade from Java 8 to 11 implies changes pervasive enough that 
it is uncertain whether a multi-release JAR would be easier than a branch. 
In any case, the decision to bump the minimum Java version is up to the 
community, which includes the users. If a user (could be Sedona) objects 
to a future bump, this is taken into account.


One thing about the timing is that a migration (if desired) may require 
more work as time passes. For example, a recent request on this mailing 
list was about fetching values in a GeoTIFF file. Apache SIS already 
supports that on GeoTIFF and netCDF among others, handling the CRS 
issues, in N dimensions when the format supports it, and efficiently on 
large files when the format supports tiling (geospatial software does 
some big data on its own, since its developers work with space agencies). 
Work developed on top of a different library would probably be done 
differently, making an eventual migration less likely.


Whether a migration would be desirable or not depends on the technical 
merits of libraries, but also licensing is maybe not a negligible 
aspect. Given that Sedona is all about bringing geospatial 
functionalities to Spark, it seems a little bit surprising that 
"referencing by coordinates" services and "grid coverages" (a.k.a. 
rasters) are not core to Sedona?


But anyway, I understand that for a migration proposal to be considered, 
we need a prototype that can be evaluated for functionality and 
performance. I will not have time to work on a Sedona branch for the 
next few months, however (especially if acceptance is uncertain), but 
maybe some experiment would be possible in spring.


    Martin

