Hello,

I've been trying to run Sedona for Python on Databricks for 2 days and I
think I've stumbled upon a bug.

*Configuration*:

   - Spark 3.0.1
   - Scala 2.12
   - Python 3.7

*Libraries*:

   - apache-sedona (from PyPI)
   - org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating
   (from Maven)

*What I'm trying to do:*

I'm trying to load a series of shapefiles into a DataFrame for
geospatial analysis. See the code snippet below, based on your example notebook
<https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaCore.ipynb>:


> from sedona.core.formatMapper.shapefileParser import ShapefileReader
> from sedona.register import SedonaRegistrator
> from sedona.utils.adapter import Adapter
>
> SedonaRegistrator.registerAll(spark)
> shape_rdd = ShapefileReader.readToGeometryRDD(spark.sparkContext,
> file_name)
> df = Adapter.toDf(shape_rdd, spark)
>

*Bug*:

The call to ShapefileReader.readToGeometryRDD() currently throws the
following error:

> Py4JJavaError: An error occurred while calling
> z:org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD.
> : java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException
> at
> org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:79)
> at
> org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:66)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
> at py4j.Gateway.invoke(Gateway.java:295)
> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> at py4j.commands.CallCommand.execute(CallCommand.java:79)
> at py4j.GatewayConnection.run(GatewayConnection.java:251)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException:
> org.opengis.referencing.FactoryException
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
> at
> com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:352)


Adding the org.apache.sedona:sedona-core-3.0_2.12:1.0.0-incubating library
from Maven doesn't solve the error. Adding the org.datasyslab:geospark:1.3.1
library from Maven does solve it, but creates conflicts with the
underlying org.locationtech.jts dependencies. This makes me think that the
sedona-python-adapter is missing an OpenGIS dependency.
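
For reference, here is a minimal sketch of how the missing class could be
checked from a notebook cell before calling the reader. It assumes the
PySpark-internal py4j view `spark._jvm` is accessible and that the thread's
context class loader is the one that sees the attached libraries (I have not
confirmed this on Databricks). `jvm_class_available` is a hypothetical helper
name, not part of Sedona.

```python
def jvm_class_available(jvm, class_name):
    """Return True if the JVM behind `jvm` can load `class_name`.

    `jvm` is expected to be a py4j JVM view such as `spark._jvm`
    (an internal PySpark attribute; assumption, not a public API).
    """
    try:
        # Ask the current thread's context class loader for the class.
        loader = jvm.java.lang.Thread.currentThread().getContextClassLoader()
        loader.loadClass(class_name)
        return True
    except Exception:
        # py4j surfaces ClassNotFoundException as a Python exception.
        return False


# Intended use in a notebook where `spark` already exists:
#   jvm_class_available(spark._jvm, "org.opengis.referencing.FactoryException")
```

If this returns False with only the sedona-python-adapter attached, it would
support the idea that the OpenGIS classes are not on the classpath.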

Regards,
G. Dugernier

-- 



Grégory Dugernier
Software Engineer

g...@aloalto.com <f...@aloalto.com>
+32 (0)484 11 26 09

www.aloalto.com
+32 (0)2 736 10 17
