Re: JTSplus fork status

2020-10-15 Thread Jim Hughes

Hi all,

I'd suggest that we look for alternatives around the changes to 
userData.  When I've talked to Martin Davis (lead JTS developer), he's 
been a little unsure and cautious around how userData is used.


As one suggestion, is it possible to separate out comparison and 
serialization around userData?


Cheers,

Jim

On 10/15/2020 1:32 AM, Felix Cheung wrote:

They are not too bad - IMO it will be good to pursue getting these changes
accepted in jts


On Wed, Oct 14, 2020 at 10:01 PM Jia Yu  wrote:


Not yet. I am pretty sure that they will not accept all of the changes if you
look at them here:

https://github.com/locationtech/jts/compare/master...jiayuasu:master?expand=1
Especially, the "userdata" part and "findLeafBounds" function.

I am gonna create a PR to JTS, and only keep acceptable changes. This way,
we can at least merge some functions to JTS. But I believe we have to use
this fork in the future.

On Wed, Oct 14, 2020 at 7:59 AM Netanel Malka  wrote:


Hi Jia,
I saw that you merged the PR that makes Sedona depend on JTS.
Did you also try to open a PR in the JTS project?

I wondered if they agreed to accept your changes.


BR,
Netanel Malka.



Re: First Sedona release

2020-11-12 Thread Jim Hughes

Hi all,

As a JTS committer, I have tried to request that the Sedona project 
discuss the desired changes to JTS previously.  I'd still encourage that.


JTS is an active project and I feel that maintaining a fork of JTS is 
unnecessary and inappropriate.


Cheers,

Jim

On 11/11/20 9:04 PM, Felix Cheung wrote:

Ah. You will need to publish it in order for the dependency chain to work
on Maven Central

However, since you are not the project owner there you might need to
publish that under a different artifact id.

In general, it would be best to avoid hard forking another project like
this.


On Wed, Nov 11, 2020 at 1:05 PM Jia Yu  wrote:


Hi Netanel,

That links to this git submodule:
https://github.com/jiayuasu/jts/blob/1.16.x/modules/core/pom.xml#L6

I can easily fix this by changing the version number here to 1.16.2
excluding "SNAPSHOT":
https://github.com/jiayuasu/jts/blob/1.16.x/modules/core/pom.xml#L6

Will this solve the problem?

On Wed, Nov 11, 2020 at 7:40 AM Netanel Malka wrote:


Hi Folks,

I tried to make a release (dry-run) following
publishing-maven-artifacts
<https://infra.apache.org/publishing-maven-artifacts.html>, and I
encountered an issue.

On sedona-core, we have jts-core as a dependency with the SNAPSHOT
version.
(link
<https://github.com/apache/incubator-sedona/blob/2e60fc07b0eae78ccae3876d970e677fc9319c40/core/pom.xml#L37>)

As a prerequisite to the release process, we cannot have dependencies in a
SNAPSHOT version.


Do you have any clue about how to solve this?


On Mon, 9 Nov 2020 at 21:22, Netanel Malka  wrote:


OK. Thanks Felix.


Updates:

   *   Opened a ticket for INFRA to Enable Nexus Access For Sedona
       <https://issues.apache.org/jira/browse/INFRA-21085>
   *   Followed this guide
       <https://infra.apache.org/publishing-maven-artifacts.html> to test
       the Maven release process
   *   I hope to create a PR soon for adjusting the build to deploy to
       the ASF Nexus repository
   *   The key that signs the artifacts was created and tested.

Do we want to create a candidate release for the current master branch?

Netanel Malka,
Big Data Consultant

From: Felix Cheung 
Sent: Wednesday, November 4, 2020 19:57
To: dev@sedona.apache.org
Cc: Jinxuan Wu; Mohamed Sarwat; Netanel Malka; Paweł Kociński; Zongsi Zhang

Subject: Re: First Sedona release

1) No, you don’t need the KEYS file in GitHub, only on the release share:
https://dist.apache.org/repos/dist/dev/incubator/

2) As a podling, you add to
https://dist.apache.org/repos/dist/dev/incubator/
When you commit via svn you will be able to add a “directory” for Sedona.

2a) For the release, you basically do an svn rename to move from the dev
“path” to the release “path”.

3) If you have Java-based artifacts, yes. You will publish to Nexus,
staging first, and when the release is signed off, you can click on the
interface to make it official, which then automatically syncs to Maven
Central.

Here is an example script that does release signing and publication to
Nexus (and staging before release):

https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh


On Wed, Nov 4, 2020 at 2:50 AM Netanel Malka wrote:

Hi,

I followed the release-signing
<https://infra.apache.org/release-signing.html> doc and created a key for
signing and hashing.

I have a few questions:

1. Should the KEYS file also be added to the project root directory on
GitHub? (I saw it in Apache Ant.)
2. I saw in release-policy_upload-ci
<http://www.apache.org/legal/release-policy.html#upload-ci> that we need
to add a release candidate to
https://dist.apache.org/repos/dist/dev/
However, there does not seem to be a directory with Sedona as the
TLP name. How can we get a directory with that name? (Also for the
*release*.)
3. Do we need to push the artifacts to the ASF Nexus Repository as well
(besides Maven Central)?


Thanks.

On Mon, 2 Nov 2020 at 19:21, Netanel Malka wrote:


Thanks Felix.

I would be delighted to help.
I can start with the GPG.
Can I test it on some artifact, or do I need to wait for the first
release?


On Mon, 2 Nov 2020 at 03:17, Felix Cheung wrote:

Great progress!

To add,
A) I’d strongly recommend the WIP disclaimer - it would be much easier to
pass with it in the first release
https://incubator.apache.org/policy/incubation.html#disclaimers

B) More info on signing and checksums:
https://infra.apache.org/release-signing.html

C) The signing key should be the individual’s, and the public key should be
published and also listed in the KEYS file - the KEYS file should be located
next to the staging (and later release) location, see above

D) “correct place” - this is in reference to the ASF official staging
server
http://www.apache.org/legal/release-policy.html#stage
And can be “uploaded” by committing to svn
http://www.a

Re: First Sedona release

2020-11-12 Thread Jim Hughes

Hi Mo,

I can definitely help.  The first step will be for Jia to push a PR for 
the JTS changes.  (Since they are his changes, I cannot do this on his 
behalf.)


From talking to the lead JTS developer, he wanted to see the previous 
PR (from months/a year+ ago) split up.  I think the initial PR should be 
used to discuss what changes are sensible for JTS and where we'll need 
to push some of the changes to Sedona.


Concretely, I noticed that the Sedona JTS fork changes the toString on 
Geometry to include printing out the userData.  I imagine that may cause 
trouble for downstream JTS users, so it'd be good to find an 
alternative.  One suggestion would be to add a static method in Sedona 
for printing a Geometry with its userData object.
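
A rough sketch of what such a Sedona-side static helper might look like is below. To stay self-contained it does not depend on JTS: GeometryLike is a hypothetical stand-in for the two accessors it needs (JTS Geometry does expose toText() and getUserData()), and GeometryFormatter, toStringWithUserData, SamplePoint and the tab separator are illustrative assumptions, not existing Sedona or JTS API.

```java
import java.util.Objects;

// Illustrative helper: formats a geometry together with its userData
// without touching the geometry's own toString().
final class GeometryFormatter {
    private GeometryFormatter() {}

    // Hypothetical stand-in for the two JTS Geometry accessors used below.
    interface GeometryLike {
        String toText();       // WKT representation
        Object getUserData();  // arbitrary attached payload, may be null
    }

    // Returns the WKT alone when there is no userData, otherwise
    // "<WKT>\t<userData>" (the separator is an assumption).
    static String toStringWithUserData(GeometryLike geometry) {
        Objects.requireNonNull(geometry, "geometry");
        Object userData = geometry.getUserData();
        if (userData == null) {
            return geometry.toText();
        }
        return geometry.toText() + "\t" + userData;
    }

    // Fixed-value implementation, used only to exercise the helper.
    static final class SamplePoint implements GeometryLike {
        private final Object userData;
        SamplePoint(Object userData) { this.userData = userData; }
        @Override public String toText() { return "POINT (1 2)"; }
        @Override public Object getUserData() { return userData; }
    }

    public static void main(String[] args) {
        System.out.println(toStringWithUserData(new SamplePoint("id=7")));
        System.out.println(toStringWithUserData(new SamplePoint(null)));
    }
}
```

Keeping the formatting in a helper like this would leave Geometry.toString() unchanged for other JTS users, which is the concern raised above.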


Cheers,

Jim

On 11/12/20 12:32 PM, Mohamed Sarwat wrote:

Folks,

I totally agree with Jim on that. Jim, would you like to take the lead on that 
- I trust that you can bring this task to completion. Jia, would you please let 
us know how we can incorporate the changes into the JTS master branch?

Thanks,



Re: First Sedona release

2020-11-16 Thread Jim Hughes

Hi Jia,

Thanks for putting up the PRs.  Martin and I have commented on them.  If 
you are interested in a more real-time discussion than the PRs, Martin 
and I are both in the JTS Gitter (https://gitter.im/locationtech/jts).


To ask directly, please do not fork JTS.  You will be unable to publish 
1.16.2 artifacts on Maven central.  Finding another way to do this will 
cause confusion.


Cheers,

Jim

On 11/16/20 2:28 AM, Jia Yu wrote:

Dear all,

Thanks for all your suggestions.

1. To completely solve the long-overdue JTS issue, I made a Sedona PR and
two JTS PRs. @Jim Hughes, @Paweł Kociński, I, and probably Martin from JTS
will take care of these PRs in the coming days.
(1) Sedona PR: https://github.com/apache/incubator-sedona/pull/488
(2) JTS PRs: https://github.com/locationtech/jts/pull/633
https://github.com/locationtech/jts/pull/634

2. To move forward with the first release, I have deleted the "SNAPSHOT" in
my JTS 1.16 fork.
Most likely, we have to move forward with my JTS 1.16 fork in the first
Sedona release because of the conflict among JTStoGeoJSON, GeoTools, and
JTS 1.17.
So @Netanel Malka, could you please do another dry-run of the first Sedona
release on this Sedona branch, sedona-1.0-doc:
https://github.com/apache/incubator-sedona/tree/sedona-1.0-doc

Thanks,
Jia


Re: First Sedona release

2020-11-23 Thread Jim Hughes

Hi all,

Has the fact that one of the dependencies is LGPL (GeoTools) been 
discussed / addressed?  (See 
https://www.apache.org/legal/resolved.html#category-x)


I'm asking since I don't know if the ASF has any recommended work 
arounds for shipping code with licenses that it does not approve of.


Cheers,

Jim

On 11/23/20 1:41 PM, Felix Cheung wrote:

I can help review around Dec 13 to give a first pass. It should give you an
easier path to IPMC vote.


On Sun, Nov 22, 2020 at 10:50 PM Jia Yu  wrote:


Hi Pawel and everyone,

Let's do this in the first Sedona release. But can you please first fix the
Python API for our Move-to-JTS PR, and then work on this one? If this
Python RDD-DF Adapter PR might slow down our progress of releasing Sedona
before Christmas, we can postpone it to Sedona 1.0.1 or 1.1.0.

@everyone
Our top priority is to cut the first Sedona release ASAP. Users have been
waiting for almost six months. Let's push hard to publish the first Sedona
release to Maven Central and PyPI before Christmas. To make that happen,
here is the plan:

Finalize coding and documentation before Dec 6:
1. I believe the Move-to-JTS PR will be done in around one week.
2. Then we can accept Paweł's Python RDD-DF Adapter PR, if necessary
3. I will work on Sedona documentation.
4. @Netanel will work on Sedona support of Spark 2.4 and Scala 2.11. I will
first create a branch for it to illustrate some necessary changes in Sedona
SQL for Spark 2.4.

Final walk-through before Dec 13
1. Netanel can test the release management for Sedona.
2. Other committers can go through the docs, release notes

Community voting before Dec 20
1. Sedona community voting: before Dec 16
2. Apache Incubator voting: before Dec 20

Push to Maven Central and PyPI before Dec 24

Please feel free to comment if you have any suggestions!

Jia

On Sun, Nov 22, 2020 at 9:51 AM Paweł Kociński wrote:


Hi,
I saw some users reported a need to improve the Python RDD API in two
scenarios:

- converting a spatial flat join result to a df
- saving a spatial flat join result directly to external storage

Currently, SerDe between the JVM and Python adds to the time needed to
compute the result. I have a local branch with tests where this
functionality is available (I need 3-4 days to make it 100% ready); in the
two scenarios above there will be almost no difference between the Python
API and the Scala or Java API. Should I create a PR to include this feature
in the first Sedona release?
Regards,
Paweł


Re: First Sedona release

2020-11-24 Thread Jim Hughes

Hi all,

Felix, good to know that a WIP disclaimer is standard practice and will 
let things move forward!


Jia, I believe that page is explaining that a portion of the code in 
various GeoTools modules has other licenses on it.  As such, gt-main is 
mostly LGPL with some BSD code as well.


Cheers,

Jim

On 11/23/2020 9:50 PM, Jia Yu wrote:

Thank you, Felix. I will use the WIP disclaimer.

To answer Jim's question, GeoTools components use different licenses:
https://docs.geotools.org/latest/userguide/welcome/license.html

GT-main uses BSD, so its binary can be included in Sedona's release.
Other components in GeoTools use LGPL, but Sedona only uses them for CRS
transformation. I already set the dependency scope to "provided" in
Sedona's POM.xml. If a user wants to use CRS transformation in Sedona, they
will have to add some GeoTools library by themselves.


On Mon, Nov 23, 2020 at 6:24 PM Felix Cheung  wrote:


On Mon, Nov 23, 2020 at 6:03 PM Felix Cheung wrote:


I’d strongly recommend the community to move towards the first release
with the WIP disclaimer



https://incubator.apache.org/policy/incubation.html#work_in_progress_disclaimer

https://incubator.apache.org/policy/incubation.html#releases


As for the LGPL dependency specifically, a replacement will be needed?



To clarify, it is OK to note this in the WIP disclaimer, so it can be
released with this.





Re: First Sedona release

2020-12-10 Thread Jim Hughes

Hi all,

It may be worth discussing with the JTS team directly what their schedule is 
rather than guessing at it.


I am for finding a way for Sedona to work with JTS with the least 
friction for the Sedona development team and the Sedona users.  I feel 
that copying or forking complex codebases will likely lead to bigger 
issues downstream.


Also, is the only hang-up around the serialization of R-Trees? If so, 
could you use reflection with JTS 1.17.0?  That change may be pretty 
quick to make...
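
As a rough illustration of the reflection idea, the sketch below reads a private field from an object without modifying its class. SealedIndex and its root field are hypothetical stand-ins, not JTS classes, and readPrivateField is an illustrative name. Whether the R-Tree state Sedona needs in JTS 1.17.0 can be reached this way is an assumption that would have to be checked against the JTS source.

```java
import java.lang.reflect.Field;

final class ReflectionAccessSketch {
    private ReflectionAccessSketch() {}

    // Hypothetical stand-in for a library class whose state is not public.
    static final class SealedIndex {
        private final String root;
        SealedIndex(String root) { this.root = root; }
    }

    // Reads a field declared on the target's own class (private included)
    // by name. Fields inherited from a superclass are not searched.
    static Object readPrivateField(Object target, String fieldName)
            throws ReflectiveOperationException {
        Field field = target.getClass().getDeclaredField(fieldName);
        field.setAccessible(true); // bypass the private modifier
        return field.get(target);
    }

    public static void main(String[] args) throws Exception {
        SealedIndex index = new SealedIndex("root-node");
        System.out.println(readPrivateField(index, "root"));
    }
}
```

The trade-off is that field names become a hidden contract: a rename in a later library release would only fail at runtime, so a lookup like this is usually pinned to a specific library version and covered by a test.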


Cheers,

Jim

On 12/9/20 10:35 PM, Jia Yu wrote:

Hi Felix, Jim and Netanel and other Sedona committers,

As you know, my JTS PR has been accepted to JTS 1.18-SNAPSHOT and we are
waiting for the official release of JTS 1.18 on Maven. However, I didn't
see a clear date when JTS 1.18 will be published. I guess this will take
one or two months to happen.

Currently, the Sedona 1.0.0 release is blocked by this issue (Maven Central
does not allow SNAPSHOTs as dependencies). Since we are so desperate to
publish Sedona 1.0.0 as soon as possible, I propose copying the latest JTS
source code into sedona-core for our 1.0.0 release. In a future release
(say Sedona 1.0.1), we can drop the JTS source code and use their Maven
release. The JTS source code is dual-licensed under the Eclipse Public
License 2.0 and the Eclipse Distribution License 1.0 (a BSD-style license),
so it is safe to keep it in Sedona.

What do you think? @Jim Hughes   Is this a good idea?

Thanks,
Jia

On Fri, Dec 4, 2020 at 10:43 PM Jia Yu  wrote:


Hi Netanel,

So for Sedona SQL 1.0.0 on Spark 2.4, we can do
"sedona-sql_2.11-2.4-1.0.0-incubator" , right?

Sedona 1.0 on Spark 2.4 and 3.0 will be compiled against Scala 2.11 and
2.12. I believe this can be done via different compilation target in Maven.

I am currently looking at whether I can do conditional compilation using
Maven (similar to C++ #ifdef) because there is a change in Aggregator in
Spark 3.0. Otherwise I always need to maintain a separate branch for Sedona
on Spark 2.4

It looks OK to me.

On Fri, Dec 4, 2020 at 1:12 AM Netanel Malka  wrote:


Hi,
I think that we can follow the Apache Spark convention as you can see
here
<https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.12/3.0.1/>.
For example:
sedona-sql_2.11-2.4, where 2.11 -> scala version and 2.4 -> spark version

  What do you think?
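
The naming scheme proposed here can be written down as a one-line rule. The sketch below only illustrates the string layout from the example above (module, then "_" and the Scala version, then "-" and the Spark version); ArtifactNaming and artifactId are hypothetical names, and nothing here reflects what Sedona finally published.

```java
final class ArtifactNaming {
    private ArtifactNaming() {}

    // Builds "<module>_<scalaVersion>-<sparkVersion>",
    // e.g. module "sedona-sql", Scala "2.11", Spark "2.4".
    static String artifactId(String module, String scalaVersion, String sparkVersion) {
        return module + "_" + scalaVersion + "-" + sparkVersion;
    }

    public static void main(String[] args) {
        System.out.println(artifactId("sedona-sql", "2.11", "2.4"));
        System.out.println(artifactId("sedona-sql", "2.12", "3.0"));
    }
}
```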


On Fri, 4 Dec 2020 at 10:34, Jia Yu  wrote:


Dear all,

The current status:
1. The Move-to-JTS PR has been merged to the master branch. If JTS 1.18 gets
published in a few weeks, we will use the latest JTS. Otherwise, we still
need to use my fork for this release. But the Sedona API is now finalized.
From the user's perspective, using my fork or the official JTS release
should not make any difference.
2. Sedona doc update is in progress. I am half way there. You can track
the progress here: https://github.com/apache/incubator-sedona/pull/493
3. I will create a separate branch to test Spark 2.4 over this weekend.
4. Pawel is working on his improvement on RDD-SQL Python adapter.

Question:

What is the most appropriate maven artifact name for Sedona on Spark
2.4? I used to put "sedona-sql_2.4". But it looks like "_2.4" is usually
reserved for specifying the Scala version. How about "sedona-sql-spark2"?
Should we also use "sedona-sql-spark3" for Spark 3.0?

Thanks,
Jia


Re: First Sedona release

2020-12-10 Thread Jim Hughes

Hi Jia,

A JTS 1.18.0 release would not be just for Apache Sedona. ;) Getting it 
out sooner would let other projects adopt it sooner (I'm thinking of 
GeoTools and GeoServer).  I'm excited to see the improvements to the 
overlay operations...


I've traded some emails and chats with Martin.  It sounds like he is ok 
with cutting JTS 1.18.0 in the next week; I'll be working with him and 
Jody to do our best to make that happen.


Anyhow, in terms of shading, there are a few things I'd suggest. First, 
I'd suggest that libraries which can function as libraries have a 
version of the jar which does not include any dependencies.  If you go 
along with that, sedona-core should produce a jar on its own and another 
module could build a "batteries included" jar for users to drop into Spark.


Separate from that, I'd recommend that when you copy entire files into a 
project that you change the package for those classes. Concretely, you 
could just prepend org.apache.sedona to the package names for those 5 
classes.  (This assumes that it is possible.  Sometimes there may be 
issues around package protected access, etc.)


As it stands right now, if a user tries to use Sedona with any other 
library that pulls in JTS, then they will be at the mercy of the class 
loading order.  If the JTS jar comes in elsewhere, your versions of the 
RTree may not be loaded!  The exception would look like a JTS issue and 
it would be fairly confusing for most people to debug.
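
One way to see which copy of a class won the class loading race is to ask
the JVM where it loaded the class from. The following is only a sketch:
ClassOriginDemo and describeOrigin are made-up names for illustration, and
the JTS class name passed in main is an assumption about which index class
is of interest.

```java
import java.security.CodeSource;

final class ClassOriginDemo {

    /** Describes where the named class would be loaded from, without throwing. */
    static String describeOrigin(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class<?> cls = Class.forName(
                    className, false, ClassOriginDemo.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes from the JDK's bootstrap loader report no code source.
            return source == null
                    ? "bootstrap (JDK)"
                    : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Assumed class of interest; prints a jar/directory URL when JTS is present.
        System.out.println(
                describeOrigin("org.locationtech.jts.index.strtree.STRtree"));
    }
}
```

If the printed location is a stock JTS jar rather than the Sedona jar, the
copied classes are being shadowed.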


With those issues taken together, a user could load up a sedona-core jar 
(which wouldn't have JTS or org.wololo.geojson) with a different version 
of JTS potentially provided by another project and be able to use Sedona 
and other projects together.


Thanks for working through the issues to be able to use a release of 
JTS.  Hopefully we can knock this out over the next week, and if not, 
you do have an approach which would let you release Sedona.


Cheers,

Jim

On 12/10/2020 2:33 PM, Jia Yu wrote:

Hi Jim,

Thanks for your feedback.

1. I indeed asked Martin, Jody, and you in the JTS Gitter chat. It looks
like Martin still needs some time to fix some functions. In fact, I feel it
is inappropriate to push Martin, an OSS contributor, to cut a release just
for us :)
2. I also saw your comment on the GitHub PR. My current solution in that PR
is to use the JTS 1.17.1 official release + 5 copied JTS index classes. I
also use the maven shade plugin to filter out the 5 corresponding classes
in the JTS 1.17.1 jar (
https://github.com/apache/incubator-sedona/pull/495/files#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8R278)
to avoid duplicates. Do you think I should even use the shade plugin to
relocate these classes to a different path?

Thanks,
Jia

On Thu, Dec 10, 2020 at 6:25 AM Jim Hughes  wrote:


Hi all,

It may be worth discussing with the JTS team directly what their schedule
is rather than guessing at it.

I am for finding a way for Sedona to work with JTS with the least
friction for the Sedona development team and the Sedona users.  I feel
that copying or forking complex codebases will likely lead to bigger
issues downstream.

Also, is the only hang-up around the serialization of R-Trees? If so,
could you use reflection with JTS 1.17.0?  That change may be pretty
quick to make...
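
To illustrate the reflection idea in general terms, here is a minimal
sketch. It does not touch JTS: OpaqueIndex is a hypothetical stand-in for
a library class whose private state a serializer needs, and the field name
nodeCapacity is likewise an assumption for illustration, not a statement
about the real JTS classes.

```java
import java.lang.reflect.Field;

final class ReflectionSketch {

    /** Hypothetical stand-in for a library class that hides its state. */
    static final class OpaqueIndex {
        private final int nodeCapacity;

        OpaqueIndex(int nodeCapacity) {
            this.nodeCapacity = nodeCapacity;
        }
    }

    /**
     * Reads a private field declared directly on the target's class.
     * Fields inherited from a superclass would need a walk up the hierarchy.
     */
    static Object readPrivateField(Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // bypass the private modifier
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        OpaqueIndex index = new OpaqueIndex(10);
        System.out.println(readPrivateField(index, "nodeCapacity"));
    }
}
```

The trade-off is that such code depends on private field names, so it can
break silently when the library is upgraded.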

Cheers,

Jim

On 12/9/20 10:35 PM, Jia Yu wrote:

Hi Felix, Jim and Netanel and other Sedona committers,

As you know, my JTS PR has been accepted to JTS 1.18-SNAPSHOT and we are
waiting for the official release of JTS 1.18 on Maven. However, I didn't
see a clear date when JTS 1.18 will be published. I guess this will take
one or two months to happen.

Currently, the Sedona 1.0.0 release is blocked by this issue (Maven Central
does not allow SNAPSHOTs to be dependencies). Since we are so desperate to
publish Sedona 1.0.0 as soon as possible, I propose to copy the latest JTS
source code into Sedona-core in our 1.0.0 release. In a future release
(say Sedona 1.0.1), we can drop the JTS source code and use their Maven
release. JTS source code is dual-licensed under Eclipse Public License 2.0
and Eclipse Distribution License 1.0 (a BSD-style license). So it is safe
to keep it in Sedona.

What do you think? @Jim Hughes   Is this a good idea?

Thanks,
Jia

On Fri, Dec 4, 2020 at 10:43 PM Jia Yu  wrote:


Hi Netanel,

So for Sedona SQL 1.0.0 on Spark 2.4, we can do
"sedona-sql_2.11-2.4-1.0.0-incubator" , right?

Sedona 1.0 on Spark 2.4 and 3.0 will be compiled against Scala 2.11 and
2.12. I believe this can be done via different compilation targets in
Maven.

I am currently looking at whether I can do conditional compilation using
Maven (similar to C++ #ifdef) because there is a change in Aggregator in
Spark 3.0. Otherwise I will always need to maintain a separate branch for
Sedona on Spark 2.4.

It looks OK to me.

On Fri, Dec 4, 2020 at 1:12 AM Netanel Malka wrote:

Hi,
I think that we can follow the Apach

Re: First Sedona release

2020-12-23 Thread Jim Hughes

Hi all,

Good news; I've just released JTS 1.18.0!

Thanks again to Jia for contributing the changes necessary to get JTS to 
work with Sedona.


Cheers,

Jim

On 12/21/20 7:25 AM, Netanel Malka wrote:

I succeeded in pushing the snapshots.

On Mon, 21 Dec 2020 at 12:04, Netanel Malka  wrote:


Thanks, but unfortunately, it's not working.
I got the prompt for the PGP passphrase at the release:prepare phase.

It looks like I don't have permission to push to the Sedona
nexus artifactory.

I will try to fix that later.

On Mon, 21 Dec 2020 at 11:55, Jia Yu  wrote:


@Netanel Malka 

Sometimes, if you are using Mac, you need to enter the following in your
terminal before using GPG key to sign an artifact:
https://gist.github.com/jiayuasu/8bab8ecb0234dfc280264fb587fd8b01

GPG_TTY=$(tty)
export GPG_TTY



On Mon, Dec 21, 2020 at 1:52 AM Netanel Malka 
wrote:


Hi Jia,
I tried to deploy but I got a 401 Unauthorized error, full error:
https://gist.github.com/netanel246/04c5be423d242a3bb9ef9a300c8817c8

I created a settings.xml file with my apache user and an encrypted
password. I also have a GPG key.
Did you encounter this problem?


Thanks,
Netanel Malka.


On Sun, 20 Dec 2020 at 20:12, Netanel Malka 
wrote:


That's great!!
Hope to try it today.


On Fri, 18 Dec 2020 at 10:36, Jia Yu  wrote:


Hi Netanel and Paweł,

The JTS issue has been resolved. I am now waiting for the JTS 1.18 release,
but we are currently using 1.17.1 + copied files. So we are good anyway.

So the next step will be documentation and staging the first release.
Although I really want to resolve the ST_Transform lock contention issue,
it requires a new ST_FlipCoordinate which may take a few days. I will see
whether I can finish this by Christmas but I am not sure.

@Netanel Malka  Could you please compile the master branch and try to
deploy a SNAPSHOT release on your own? I have pushed a few snapshots but I
would like to see whether you can do it too. Please follow the steps here:
https://gist.github.com/jiayuasu/849e1f3bf7a2dd11593ca27c14e9e92d

@Paweł Kociński  Step 1. Could you please update the new Python Adaptor
documentation? Step 2. Could you please try to deploy a SNAPSHOT release
to PyPI? You can find some help here:
https://incubator.apache.org/guides/distribution.html

Thank you very much!
Jia


Re: Contributing Apache Sedona(Raster Dataframes)

2021-04-19 Thread Jim Hughes

Hi Shantanu,

I'd be interested to know how your work would compare to existing 
projects which provide raster support in Spark.  LocationTech GeoTrellis 
has existed for several years and provides that support already.  Also, 
LocationTech RasterFrames builds on top of GeoTrellis to provide PySpark 
and Spark SQL support for data science with respect to raster-based 
dataframes.


Cheers,

Jim

On 4/18/21 1:25 PM, Shantanu Aggarwal wrote:

Hello All,

I am currently a graduate student at Arizona State University and would
like to propose raster data frames written in PySpark that can be
incorporated into Apache Sedona to load satellite images and perform
various map algebra operations on them.

How can I add my constructors as a part of the Python folder? Is there a
separate guide on how to contribute?

Hope to hear from you soon!


Very Respectfully
Shantanu Aggarwal
Masters In Science
Arizona State University