Hi Gregory,

Can you please try to install the jars on the Databricks cluster?
For example, go to Clusters -> choose your cluster -> Libraries -> Install New:

1. Coordinates: org.geotools:gt-main:24.0
2. Repo: https://repo.osgeo.org/repository/release/

I did this successfully. Please let me know if it solves your problem.

On 2021/02/10 13:16:50, Grégory Dugernier <g...@aloalto.com> wrote:
> Thank you for the quick reply!
>
> It seems my particular situation is a bit more complex than that, since I'm
> running the notebook on a Databricks cluster, and the default Spark config
> doesn't seem to allow for more jar repositories (GeoTools isn't on Maven
> Central), nor does creating a new SparkSession appear to work. I've tried
> to download the jars and add them manually to the cluster, but that doesn't
> seem to work either. But at least I know where the issue's at!
>
> Thanks again for your help,
> Regards
>
> On Wed, 10 Feb 2021 at 12:22, Jia Yu <ji...@apache.org> wrote:
>
> > Hi Gregory,
> >
> > Thanks for letting us know. This is not a bug. We cannot include GeoTools
> > jars due to license issues. But indeed we forgot to update the docs and
> > Jupyter notebook examples. I just updated them. Please read them here:
> >
> > https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaSQL.ipynb
> > (Make sure you disable the browser cache or open it in an incognito
> > window)
> >
> > http://sedona.apache.org/download/overview/#install-sedona-python
> >
> > In short, you need to add the following coordinates in the notebook:
> >
> > from pyspark.sql import SparkSession
> > from sedona.utils import KryoSerializer, SedonaKryoRegistrator
> >
> > spark = SparkSession. \
> >     builder. \
> >     appName('appName'). \
> >     config("spark.serializer", KryoSerializer.getName). \
> >     config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
> >     config("spark.jars.repositories",
> >            'https://repo.osgeo.org/repository/release,'
> >            'https://download.java.net/maven/2'). \
> >     config('spark.jars.packages',
> >            'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
> >            'org.geotools:gt-main:24.0,'
> >            'org.geotools:gt-referencing:24.0,'
> >            'org.geotools:gt-epsg-hsql:24.0'). \
> >     getOrCreate()
> >
> > On Wed, Feb 10, 2021 at 2:35 AM Grégory Dugernier <g...@aloalto.com> wrote:
> >
> >> Hello,
> >>
> >> I've been trying to run Sedona for Python on Databricks for 2 days and I
> >> think I've stumbled upon a bug.
> >>
> >> *Configuration*:
> >>
> >>    - Spark 3.0.1
> >>    - Scala 2.12
> >>    - Python 3.7
> >>
> >> *Libraries*:
> >>
> >>    - apache-sedona (from PyPI)
> >>    - org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating
> >>    (from Maven)
> >>
> >> *What I'm trying to do:*
> >>
> >> I'm trying to load a series of Shapefiles into a dataframe for
> >> geospatial analysis. See the code snippet below, based on your example
> >> notebook
> >> <https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaCore.ipynb>
> >>
> >> > from sedona.core.formatMapper.shapefileParser import ShapefileReader
> >> > from sedona.register import SedonaRegistrator
> >> > from sedona.utils.adapter import Adapter
> >> >
> >> > SedonaRegistrator.registerAll(spark)
> >> > shape_rdd = ShapefileReader.readToGeometryRDD(spark.sparkContext, file_name)
> >> > df = Adapter.toDf(shape_rdd, spark)
> >>
> >> *Bug*:
> >>
> >> ShapefileReader.readToGeometryRDD() currently throws the following
> >> error:
> >>
> >> > Py4JJavaError: An error occurred while calling z:org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD.
> >> > : java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException
> >> >     at org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:79)
> >> >     at org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:66)
> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >> >     at java.lang.reflect.Method.invoke(Method.java:498)
> >> >     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> >> >     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
> >> >     at py4j.Gateway.invoke(Gateway.java:295)
> >> >     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> >> >     at py4j.commands.CallCommand.execute(CallCommand.java:79)
> >> >     at py4j.GatewayConnection.run(GatewayConnection.java:251)
> >> >     at java.lang.Thread.run(Thread.java:748)
> >> > Caused by: java.lang.ClassNotFoundException: org.opengis.referencing.FactoryException
> >> >     at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> >> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
> >> >     at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
> >> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
> >>
> >> Adding the org.apache.sedona:sedona-core-3.0_2.12:1.0.0-incubating library
> >> from Maven doesn't solve the error. Adding the
> >> org.datasyslab:geospark:1.3.1 library from Maven solves the error, but it
> >> creates conflicts with the underlying org.locationtech.jts dependencies.
> >> This makes me think there is a missing OpenGIS dependency in the
> >> sedona-python-adapter.
> >>
> >> Regards,
> >> G. Dugernier
> >>
> >> --
> >>
> >> Grégory Dugernier
> >> Software Engineer
> >>
> >> g...@aloalto.com <f...@aloalto.com>
> >> +32 (0)484 11 26 09
> >>
> >> www.aloalto.com
> >> +32 (0)2 736 10 17
> >>
> >> --
> >>
> >> DISCLAIMER : The content of this e-mail message does not constitute a
> >> commitment of S.A. ALOALTO N.V. or its subsidiaries/affiliates. This
> >> e-mail and any attachments thereto may contain information which is
> >> confidential and/or protected by intellectual property rights and are
> >> intended for the intended recipient only. Any use of the information
> >> contained herein (including, but not limited to, total or partial
> >> reproduction, communication or distribution in any form) by persons
> >> other than the designated recipient(s) is prohibited. If an addressing
> >> or transmission error has misdirected this e-mail, please notify the
> >> author, either by telephone or by e-mail and delete the material from
> >> any computer.
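
A follow-up note for anyone who would rather script the setup than click through the cluster UI. The snippet below is only a minimal sketch, not an official Sedona or Databricks recipe. It assembles the comma-separated spark.jars.packages and spark.jars.repositories values from the coordinates quoted in this thread, and it builds a request body in the shape I understand the Databricks Libraries API (POST /api/2.0/libraries/install) to expect. The helper names and the cluster ID are hypothetical, and the payload shape is an assumption to check against the current Databricks documentation. No network call is made.

```python
import json

# GeoTools artifacts named in this thread. They are hosted on the OSGeo
# repository rather than on Maven Central.
OSGEO_REPO = "https://repo.osgeo.org/repository/release/"
GEOTOOLS_COORDINATES = [
    "org.geotools:gt-main:24.0",
    "org.geotools:gt-referencing:24.0",
    "org.geotools:gt-epsg-hsql:24.0",
]


def spark_jar_settings(coordinates, repositories):
    """Return the two Spark settings as comma-separated strings.

    Hypothetical helper. The check is deliberately simple: it only accepts
    the group:artifact:version form used in this thread.
    """
    for coordinate in coordinates:
        if len(coordinate.split(":")) != 3:
            raise ValueError(
                f"not a group:artifact:version coordinate: {coordinate}"
            )
    return {
        "spark.jars.packages": ",".join(coordinates),
        "spark.jars.repositories": ",".join(repositories),
    }


def databricks_install_payload(cluster_id, coordinates, repo):
    """Build a request body with one Maven library entry per coordinate.

    Hypothetical helper. The field names are an assumption based on the
    Databricks Libraries API; verify them before sending the request.
    """
    return {
        "cluster_id": cluster_id,
        "libraries": [
            {"maven": {"coordinates": coordinate, "repo": repo}}
            for coordinate in coordinates
        ],
    }


# Values to pass to SparkSession.builder.config(key, value).
settings = spark_jar_settings(GEOTOOLS_COORDINATES, [OSGEO_REPO])
for key, value in settings.items():
    print(f"{key} = {value}")

# "0000-000000-example" is a placeholder, not a real cluster ID.
payload = databricks_install_payload(
    "0000-000000-example", GEOTOOLS_COORDINATES, OSGEO_REPO
)
print(json.dumps(payload, indent=2))
```

On Databricks the cluster-library route is probably the one to prefer, since Gregory reported that changing spark.jars.repositories from inside the notebook did not seem to take effect there.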