Hi Gregory,

Thanks for letting us know. This is not a bug. We cannot bundle the GeoTools
jars because of license issues, so they have to be added separately. But
indeed we forgot to update the docs and Jupyter notebook examples. I just
updated them. Please read them here:

https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaSQL.ipynb

(Make sure you disable the browser cache or open it in an incognito
window.)

http://sedona.apache.org/download/overview/#install-sedona-python

In short, you need to add the following Maven coordinates to the SparkSession
configuration in the notebook:

spark = SparkSession. \
    builder. \
    appName('appName'). \
    config("spark.serializer", KryoSerializer.getName). \
    config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
    config("spark.jars.repositories",
           'https://repo.osgeo.org/repository/release,'
           'https://download.java.net/maven/2'). \
    config('spark.jars.packages',
           'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
           'org.geotools:gt-main:24.0,'
           'org.geotools:gt-referencing:24.0,'
           'org.geotools:gt-epsg-hsql:24.0'). \
    getOrCreate()
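
Both settings are comma-separated strings, and a wrapped line can easily slip
a stray space into a URL or coordinate. As a minimal sketch (the helper
join_csv and the list names are hypothetical, not part of Sedona or PySpark;
the coordinates are the ones listed above), you could build the values from
Python lists:

```python
# Hypothetical helper for illustration; not part of Sedona or PySpark.
def join_csv(items):
    """Join entries with commas, stripping surrounding whitespace from each."""
    cleaned = [item.strip() for item in items]
    if any(not item for item in cleaned):
        raise ValueError("empty entry in list")
    return ",".join(cleaned)


# Extra Maven repositories that host the GeoTools artifacts.
REPOSITORIES = [
    "https://repo.osgeo.org/repository/release",
    "https://download.java.net/maven/2",
]

# Sedona Python adapter plus the GeoTools jars it needs at runtime.
PACKAGES = [
    "org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating",
    "org.geotools:gt-main:24.0",
    "org.geotools:gt-referencing:24.0",
    "org.geotools:gt-epsg-hsql:24.0",
]

jars_repositories = join_csv(REPOSITORIES)
jars_packages = join_csv(PACKAGES)
```

The two strings can then be passed as
config("spark.jars.repositories", jars_repositories) and
config("spark.jars.packages", jars_packages) in the builder chain.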

On Wed, Feb 10, 2021 at 2:35 AM Grégory Dugernier <[email protected]> wrote:

> Hello,
>
> I've been trying to run Sedona for Python on Databricks for 2 days and I
> think I've stumbled upon a bug.
>
> *Configuration*:
>
>    - Spark 3.0.1
>    - Scala 2.12
>    - Python 3.7
>
> *Libraries*:
>
>    - apache-sedona (from PyPI)
>    - org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating
>    (from Maven)
>
> *What I'm trying to do:*
>
> I'm trying to load a series of Shapefiles into a dataframe for
> geospatial analysis. See the code snippet below, based on your example
> notebook
> <https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaCore.ipynb>
>
>
> > from sedona.core.formatMapper.shapefileParser import ShapefileReader
> > from sedona.register import SedonaRegistrator
> > from sedona.utils.adapter import Adapter
> >
> > SedonaRegistrator.registerAll(spark)
> > shape_rdd = ShapefileReader.readToGeometryRDD(spark.sparkContext,
> > file_name)
> > df = Adapter.toDf(shape_rdd, spark)
> >
>
> *Bug*:
>
> The ShapefileReader.readToGeometryRDD() currently throws the following
> error:
>
> > Py4JJavaError: An error occurred while calling
> > z:org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD.
> > : java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException
> >     at org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:79)
> >     at org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:66)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.lang.reflect.Method.invoke(Method.java:498)
> >     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> >     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
> >     at py4j.Gateway.invoke(Gateway.java:295)
> >     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> >     at py4j.commands.CallCommand.execute(CallCommand.java:79)
> >     at py4j.GatewayConnection.run(GatewayConnection.java:251)
> >     at java.lang.Thread.run(Thread.java:748)
> > Caused by: java.lang.ClassNotFoundException: org.opengis.referencing.FactoryException
> >     at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
> >     at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
>
>
> Adding the org.apache.sedona:sedona-core-3.0_2.12:1.0.0-incubating library
> from Maven doesn't solve the error. Adding the org.datasyslab:geospark:1.3.1
> library from Maven solves the error, but it creates conflicts with the
> underlying org.locationtech.jts dependencies. This makes me think there is
> a missing OpenGIS dependency in the sedona-python-adapter.
>
> Regards,
> G. Dugernier
>
> --
>
>
>
> Grégory Dugernier
> Software Engineer
>
> [email protected] <[email protected]>
> +32 (0)484 11 26 09
>
> www.aloalto.com
> +32 (0)2 736 10 17
>
