Dear Sir or Madam,

we are currently determining whether the Apache Sedona environment is suitable for our project in the Software Engineering department of Raytheon Anschütz GmbH, Germany.
Specifically, we are testing the Apache Sedona environment with GeoJSON files in the OGC standard. Some problems during testing prompted this request for help.

We want to work with the Python Spark solution and a typical converted GeoJSON file. As a first approach we are trying to:

* read from GeoJSON and save to an RDD,
* run a simple query,
* use the native RTREE index and save to permanent indexed storage,
* afterward, run a range query with a query window, using the RTREE index.

_________________________________________
Do you have a short, good tutorial for us? The Jupyter notebook examples are not very precise.
_________________________________________

Already the first step, READING a standard GeoJSON file, does not work as expected. We used the native GeoJsonReader.readToGeometryRDD from sedona.core.formatMapper.

"test.geojson" looks like this (source: https://geojson.org/):

    {
      "type": "FeatureCollection",
      "features": [
        {
          "type": "Feature",
          "geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
          "properties": {"prop0": "value0"}
        },
        {
          "type": "Feature",
          "geometry": {
            "type": "LineString",
            "coordinates": [[102.0, 0.0], [103.0, 1.0], [104.0, 0.0], [105.0, 1.0]]
          },
          "properties": {"prop0": "value0", "prop1": 0.0}
        },
        {
          "type": "Feature",
          "geometry": {
            "type": "Polygon",
            "coordinates": [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]]
          },
          "properties": {"prop0": "value0", "prop1": {"this": "that"}}
        }
      ]
    }

Only through trial and error did we find out that the following non-standard "GeoJSON" layout works, with one Feature per line and no enclosing FeatureCollection.

"test_updated.geojson" looks like this:

    { "type": "Feature", "geometry": {"type": "Point", "coordinates": [102.0, 0.5]}, "properties": {"prop0": "value0"} }
    { "type": "Feature", "geometry": {"type": "LineString", "coordinates": [[102.0, 0.0], [103.0, 1.0], [104.0, 0.0], [105.0, 1.0]]}, "properties": {"prop0": "value0", "prop1": 0.0} }
    { "type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]]}, "properties": {"prop0": "value0", "prop1": {"this": "that"}} }

To query the file data we used Adapter.toDf from sedona.utils.adapter. Reading the file was only possible with this uncommon GeoJSON layout.

_________________________________________
Why does the Apache Sedona environment not support the native GeoJSON format?
_________________________________________

Our environment:

    Ubuntu 22.04 LTS
    openjdk version "1.8.0_312"
    spark-3.0.3
    apache-sedona[spark] 1.2.0
    attrs 21.4.0
    shapely 1.8.2
    pyspark 3.3.0
    py4j 0.10.9.5

We are also wondering why the documentation for the Python solution is not complete:
Python doc - Apache Sedona(tm) (incubating) <https://sedona.apache.org/api/python-api/>

Have a nice day and best wishes from Kiel/Germany.

Jan Rittenbach
Werkstudent Software Engineering (ESW)
Raytheon Anschütz GmbH, Zeyestr. 16-24, 24106 Kiel, Deutschland
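
P.S. For reference, here is a minimal sketch of the conversion we ended up doing by hand: turning a standard FeatureCollection into the one-Feature-per-line layout that the reader accepted in our tests. It uses only the Python standard library. The helper name feature_collection_to_lines and the small sample dictionary are hypothetical illustrations, not part of Sedona, and we do not know whether this is the intended workflow.

```python
import json


def feature_collection_to_lines(collection):
    """Return one compact JSON string per Feature.

    Hypothetical helper for illustration; not a Sedona API.
    """
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    # json.dumps without indent never emits newlines, so each Feature
    # stays on exactly one line of the output file.
    return [json.dumps(feature) for feature in collection.get("features", [])]


# Shortened sample in the style of the geojson.org example.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
            "properties": {"prop0": "value0"},
        },
        {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [[102.0, 0.0], [103.0, 1.0]],
            },
            "properties": {"prop0": "value0", "prop1": 0.0},
        },
    ],
}

lines = feature_collection_to_lines(sample)
print("\n".join(lines))
```

Writing the joined lines to a file such as test_updated.geojson gives the layout shown above, which is what we then passed to GeoJsonReader.readToGeometryRDD.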
