Hi,

I saw that some users have reported a need to improve the Python RDD API in two scenarios:
- converting a spatial flat join result to a DataFrame
- saving a spatial flat join result directly to external storage
Currently, SerDe between the JVM and Python adds extra time to computing the result. I have a local branch with tests where this functionality is available (I need 3-4 days to make it 100% ready). In the two scenarios above, there will be almost no difference between the Python API and the Scala or Java API.

Should I create a PR to include this feature in the first Sedona release?

Regards,
Paweł

On Mon, 16 Nov 2020 at 08:29, Jia Yu <jiayu198...@gmail.com> wrote:

> Dear all,
>
> Thanks for all your suggestions.
>
> 1. To completely solve the long-overdue JTS issue, I made a Sedona PR and
> two JTS PRs. @Jim Hughes <jhug...@ccri.com>, @Paweł Kociński
> <pawel93kocin...@gmail.com>, I, and probably Martin from JTS will take
> care of these PRs in the coming days.
> (1) Sedona PR: https://github.com/apache/incubator-sedona/pull/488
> (2) JTS PR: https://github.com/locationtech/jts/pull/633
> https://github.com/locationtech/jts/pull/634
>
> 2. To move forward with the first release, I have deleted the "SNAPSHOT"
> in my JTS 1.16 fork.
> Most likely, we have to move forward with my JTS 1.16 fork in the first
> Sedona release because of the conflict among JTStoGeoJSON, GeoTools, and
> JTS 1.17.
> So @Netanel Malka <netanel...@gmail.com> could you please do another
> dry-run on the Sedona first release on this Sedona branch: sedona-1.0-doc:
> https://github.com/apache/incubator-sedona/tree/sedona-1.0-doc
>
> Thanks,
> Jia
>
> On Thu, Nov 12, 2020 at 11:36 AM Jim Hughes <jhug...@ccri.com> wrote:
>
>> Hi Mo,
>>
>> I can definitely help. The first step will be for Jia to push a PR for
>> the JTS changes. (Since they are his changes, I cannot do this on his
>> behalf.)
>>
>> From talking to the lead JTS developer, he wanted to see the previous
>> PR (from months/a year+ ago) split up. I think the initial PR should be
>> used to discuss what changes are sensible for JTS and where we'll need
>> to push some of the changes to Sedona.
>>
>> Concretely, I noticed that the Sedona JTS fork changes the toString on
>> Geometry to include printing out the userData. I imagine that may cause
>> trouble for downstream JTS users, so it'd be good to find an
>> alternative. One suggestion would be to add a static method in Sedona
>> for printing a Geometry with its userData object.
>>
>> Cheers,
>>
>> Jim
>>
>> On 11/12/20 12:32 PM, Mohamed Sarwat wrote:
>> > Folks,
>> >
>> > I totally agree with Jim on that. Jim, would you like to take the lead
>> > on that - I trust that you can bring this task to completion. Jia, would
>> > you please let us know how we can incorporate the changes into the JTS
>> > master branch?
>> >
>> > Thanks,
>> >
>> >> On Nov 12, 2020, at 10:10 AM, Jim Hughes <jhug...@ccri.com> wrote:
>> >>
>> >> Hi all,
>> >>
>> >> As a JTS committer, I have tried to request that the Sedona project
>> >> discuss the desired changes to JTS previously. I'd still encourage that.
>> >>
>> >> JTS is an active project and I feel that maintaining a fork of JTS is
>> >> unnecessary and inappropriate.
>> >>
>> >> Cheers,
>> >>
>> >> Jim
>> >>
>> >>> On 11/11/20 9:04 PM, Felix Cheung wrote:
>> >>> Ah. You will need to publish it in order for the dependency chain to
>> >>> work on Maven Central
>> >>>
>> >>> However, since you are not the project owner there you might need to
>> >>> publish that under a different artifact id.
>> >>>
>> >>> In general, it would be best to avoid hard forking another project
>> >>> like this.
>> >>>
>> >>>
>> >>>> On Wed, Nov 11, 2020 at 1:05 PM Jia Yu <jiayu198...@gmail.com> wrote:
>> >>>>
>> >>>> Hi Netanel,
>> >>>>
>> >>>> That links to this git submodule:
>> >>>> https://github.com/jiayuasu/jts/blob/1.16.x/modules/core/pom.xml#L6
>> >>>>
>> >>>> I can easily fix this by changing the version number here to 1.16.2
>> >>>> excluding "SNAPSHOT":
>> >>>> https://github.com/jiayuasu/jts/blob/1.16.x/modules/core/pom.xml#L6
>> >>>>
>> >>>> Will this solve the problem?
>> >>>>
>> >>>> On Wed, Nov 11, 2020 at 7:40 AM Netanel Malka <netanel...@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi Folks,
>> >>>>>
>> >>>>> I tried to make a release (dry-run) following
>> >>>>> publishing-maven-artifacts
>> >>>>> <https://infra.apache.org/publishing-maven-artifacts.html>, and I
>> >>>>> encountered an issue.
>> >>>>>
>> >>>>> On sedona-core, we have jts-core as a dependency with the SNAPSHOT
>> >>>>> version.
>> >>>>> (link
>> >>>>> <https://github.com/apache/incubator-sedona/blob/2e60fc07b0eae78ccae3876d970e677fc9319c40/core/pom.xml#L37>)
>> >>>>>
>> >>>>> As a prerequisite to the release process, we cannot have
>> >>>>> dependencies in a SNAPSHOT version.
>> >>>>>
>> >>>>>
>> >>>>> Do you have any clue about how to solve this?
>> >>>>>
>> >>>>>
>> >>>>> On Mon, 9 Nov 2020 at 21:22, Netanel Malka <netan...@sela.co.il> wrote:
>> >>>>>
>> >>>>>> OK. Thanks Felix.
>> >>>>>>
>> >>>>>>
>> >>>>>> Updates:
>> >>>>>>
>> >>>>>> * Opened a ticket for INFRA to Enable Nexus Access For Sedona
>> >>>>>>   <https://issues.apache.org/jira/browse/INFRA-21085>
>> >>>>>> * Followed this
>> >>>>>>   <https://infra.apache.org/publishing-maven-artifacts.html> guide
>> >>>>>>   to test the maven release process
>> >>>>>> * I hope to create a PR soon for adjusting the build to deploy to
>> >>>>>>   the ASF Nexus repository
>> >>>>>> * The key that signs the artifacts was created and tested.
>> >>>>>>
>> >>>>>> Do we want to create a candidate release for the current master
>> >>>>>> branch?
>> >>>>>>
>> >>>>>> Netanel Malka,
>> >>>>>> Big Data Consultant
>> >>>>>> ________________________________
>> >>>>>> From: Felix Cheung <felixche...@apache.org>
>> >>>>>> Sent: Wednesday, November 4, 2020 19:57
>> >>>>>> To: dev@sedona.apache.org
>> >>>>>> Cc: Jinxuan Wu; Mohamed Sarwat; Netanel Malka; Paweł Kociński; Zongsi Zhang
>> >>>>>> Subject: Re: First Sedona release
>> >>>>>>
>> >>>>>> 1) No, you don’t need the KEYS file on GitHub, only on the release
>> >>>>>> share
>> >>>>>> https://dist.apache.org/repos/dist/dev/incubator/
>> >>>>>>
>> >>>>>> 2) as podling you add to
>> >>>>>> https://dist.apache.org/repos/dist/dev/incubator/
>> >>>>>> When you commit via svn you will be able to add a “directory” for
>> >>>>>> Sedona
>> >>>>>>
>> >>>>>> 2a) for release, you basically do a svn rename to move from dev to
>> >>>>>> release “path”
>> >>>>>>
>> >>>>>> 3) if you have java based artifacts, yes. You will publish to Nexus,
>> >>>>>> staging first and when release is signed off, you can click on the
>> >>>>>> interface to make it official, which then automatically syncs to
>> >>>>>> Maven central.
>> >>>>>>
>> >>>>>> Here is a script for example that does release signing and
>> >>>>>> publication to Nexus (and staging before release)
>> >>>>>>
>> >>>>>> https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh
>> >>>>>> On Wed, Nov 4, 2020 at 2:50 AM Netanel Malka <netanel...@gmail.com> wrote:
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> I followed the release-signing
>> >>>>>> <https://infra.apache.org/release-signing.html> doc and created a
>> >>>>>> key for signing and hashing.
>> >>>>>>
>> >>>>>> I have a few questions:
>> >>>>>>
>> >>>>>> 1. Should the KEYS file also be added to the project root directory
>> >>>>>> on Github?
>> >>>>>> (I saw it in Apache Ant)
>> >>>>>> 2. I saw in release-policy_upload-ci
>> >>>>>> <http://www.apache.org/legal/release-policy.html#upload-ci> that we
>> >>>>>> need to add a release candidate to
>> >>>>>> https://dist.apache.org/repos/dist/*dev*/<TLP name>/. However, there
>> >>>>>> does not seem to be a directory with Sedona as the TLP name. How may
>> >>>>>> we be able to get a directory with that name? (Also for the
>> >>>>>> *release*)
>> >>>>>> 3. Do we need to push the artifacts also to ASF Nexus Repository
>> >>>>>> (besides Maven Central)?
>> >>>>>>
>> >>>>>>
>> >>>>>> Thanks.
>> >>>>>>
>> >>>>>> On Mon, 2 Nov 2020 at 19:21, Netanel Malka <netanel...@gmail.com> wrote:
>> >>>>>>
>> >>>>>>> Thanks Felix.
>> >>>>>>>
>> >>>>>>> I would be delighted to help.
>> >>>>>>> I can start with the GPG.
>> >>>>>>> Can I test it on some artifact, or do I need to wait for the first
>> >>>>>>> release?
>> >>>>>>> On Mon, 2 Nov 2020 at 03:17, Felix Cheung <felixche...@apache.org> wrote:
>> >>>>>>>> Great progress!
>> >>>>>>>>
>> >>>>>>>> To add,
>> >>>>>>>> A) I’d strongly recommend the WIP disclaimer - it would be much
>> >>>>>>>> easier to pass with in the first release
>> >>>>>>>> https://incubator.apache.org/policy/incubation.html#disclaimers
>> >>>>>>>>
>> >>>>>>>> B) more info in signing, checksum
>> >>>>>>>> https://infra.apache.org/release-signing.html
>> >>>>>>>>
>> >>>>>>>> C) signing key should be individual’s and (public key) published
>> >>>>>>>> and also listed in KEYS file - KEYS file should be located next to
>> >>>>>>>> the staging (and later release) location, see above
>> >>>>>>>>
>> >>>>>>>> D) “correct place” - this is in reference to ASF official staging
>> >>>>>>>> server
>> >>>>>>>> http://www.apache.org/legal/release-policy.html#stage
>> >>>>>>>> And can be “uploaded” by committing to svn
>> >>>>>>>> http://www.apache.org/legal/release-policy.html#upload-ci
>> >>>>>>>>
>> >>>>>>>> E) python / PyPI -
>> >>>>>>>> https://incubator.apache.org/guides/distribution.html#pypi
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Sun, Nov 1, 2020 at 2:17 PM Jia Yu <ji...@apache.org> wrote:
>> >>>>>>>>> Hi Netanel, Pawel and other committers,
>> >>>>>>>>>
>> >>>>>>>>> While Pawel is working on Python code of Sedona 1.0, let's focus
>> >>>>>>>>> on other parts required by the release. Netanel, can you help me
>> >>>>>>>>> with all the ASF incubator requirement items that are not DONE?
>> >>>>>>>>>
>> >>>>>>>>> *Here is a checklist for our first Sedona release*
>> >>>>>>>>>
>> >>>>>>>>> *ASF incubator requirement
>> >>>>>>>>> (https://incubator.apache.org/guides/releasemanagement.html, we
>> >>>>>>>>> probably should read ASF release requirement as well):*
>> >>>>>>>>>
>> >>>>>>>>> 1. Include the word incubating in the release file name: DONE.
>> >>>>>>>>> Please see the POM.xml in all directories.
>> >>>>>>>>>
>> >>>>>>>>> 2. Include an ASF LICENSE and NOTICE file: DONE. Please see the
>> >>>>>>>>> GitHub repo.
>> >>>>>>>>>
>> >>>>>>>>> 3. Have valid checksums or signatures: I believe signature should
>> >>>>>>>>> be done by the GPG key. Not sure about the checksum. I am also not
>> >>>>>>>>> sure about the GPG key requirement of ASF. I used a GPG key to
>> >>>>>>>>> sign releases of GeoSpark in the past.
>> >>>>>>>>>
>> >>>>>>>>> 4. Be placed in the correct place on the ASF’s infrastructure: we
>> >>>>>>>>> should place our releases in two places: Maven and PyPI. Not sure
>> >>>>>>>>> how to relate them to ASF.
>> >>>>>>>>>
>> >>>>>>>>> 5. Have a KEYS file to validate the release: this should be the
>> >>>>>>>>> public key of our GPG key?
>> >>>>>>>>>
>> >>>>>>>>> *Sedona requirement*
>> >>>>>>>>>
>> >>>>>>>>> 1. Python path name, file headers, and jars
>> >>>>>>>>> 2. Project website docs: documentation should use the name,
>> >>>>>>>>> Sedona, in all tutorials. We should also include the situation of
>> >>>>>>>>> GeoTools dependencies.
>> >>>>>>>>> Thanks,
>> >>>>>>>>> Jia
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Wed, Oct 14, 2020 at 10:08 PM Jia Yu <ji...@apache.org> wrote:
>> >>>>>>>>>> Hi folks,
>> >>>>>>>>>>
>> >>>>>>>>>> We will be working on the first Sedona release. Please see the
>> >>>>>>>>>> JIRA ticket here:
>> >>>>>>>>>> https://issues.apache.org/jira/projects/SEDONA/issues/SEDONA-3?filter=allopenissues
>> >>>>>>>>>> Do you think there are any outstanding issues to be fixed as
>> >>>>>>>>>> well?
>> >>>>>>>>>> Thanks,
>> >>>>>>>>>> Jia
>> >>>>>>>>>>
>> >>>>>>> --
>> >>>>>>> Best regards,
>> >>>>>>> Netanel Malka.
>> >>>>>>>
>> >>>>>> --
>> >>>>>> Best regards,
>> >>>>>> Netanel Malka.
>> >>>>>>
>> >>>>> --
>> >>>>> Best regards,
>> >>>>> Netanel Malka.
>> >>>>>
>> >>