Hi Gregory,

Please let us know if you get your issue fixed. I know many of our users
are also using Databricks clusters, so we are interested in the solution too.

Thanks,
Jia

On Wed, Feb 10, 2021 at 5:17 AM Grégory Dugernier <g...@aloalto.com> wrote:

> Thank you for the quick reply!
>
> It seems my particular situation is a bit more complex than that, since
> I'm running the notebook on a Databricks cluster. The default Spark config
> doesn't seem to allow additional jar repositories (GeoTools isn't on
> Maven Central), and creating a new SparkSession doesn't appear to work
> either. I've also tried downloading the jars and adding them manually to
> the cluster, without success. But at least I know where the issue is!
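
One possible explanation, offered as an assumption rather than a diagnosis: a Databricks notebook already has a running session, so getOrCreate() may simply return it, and launch-time settings such as spark.jars.packages would then have no effect. A minimal sketch for checking which of the GeoTools coordinates quoted further down in this thread are absent from the session's configured packages. The helper name missing_packages is hypothetical, and the check only inspects the session configuration, so jars installed through the cluster's library UI may not show up in it.

```python
def missing_packages(configured, required):
    """Return the required Maven coordinates absent from a spark.jars.packages value."""
    present = {c.strip() for c in configured.split(",") if c.strip()}
    return [coord for coord in required if coord not in present]


# GeoTools coordinates taken from the reply quoted later in this thread.
REQUIRED = [
    "org.geotools:gt-main:24.0",
    "org.geotools:gt-referencing:24.0",
    "org.geotools:gt-epsg-hsql:24.0",
]

# In a notebook the configured value could be read from the live session, e.g.
#   configured = spark.sparkContext.getConf().get("spark.jars.packages", "")
# Here an example value stands in for it.
configured = "org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating"

print(missing_packages(configured, REQUIRED))
```

If the list comes back non-empty, the coordinates never reached the session configuration, which would be consistent with the behaviour described above.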
>
> Thanks again for your help,
> Regards
>
> On Wed, 10 Feb 2021 at 12:22, Jia Yu <ji...@apache.org> wrote:
>
>> Hi Gregory,
>>
>> Thanks for letting us know. This is not a bug. We cannot include GeoTools
>> jars due to license issues. But indeed we forgot to update the docs and
>> Jupyter notebook examples. I just updated them. Please read them here:
>>
>>
>> https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaSQL.ipynb
>>
>> (Make sure you disable the browser cache or open it in an incognito
>> window)
>> http://sedona.apache.org/download/overview/#install-sedona-python
>>
>> In short, you need to add the following coordinates in the notebook:
>>
>> from pyspark.sql import SparkSession
>> from sedona.utils import KryoSerializer, SedonaKryoRegistrator
>>
>> spark = SparkSession.builder \
>>     .appName('appName') \
>>     .config("spark.serializer", KryoSerializer.getName) \
>>     .config("spark.kryo.registrator", SedonaKryoRegistrator.getName) \
>>     .config("spark.jars.repositories",
>>             'https://repo.osgeo.org/repository/release,'
>>             'https://download.java.net/maven/2') \
>>     .config('spark.jars.packages',
>>             'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
>>             'org.geotools:gt-main:24.0,'
>>             'org.geotools:gt-referencing:24.0,'
>>             'org.geotools:gt-epsg-hsql:24.0') \
>>     .getOrCreate()
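
Both spark.jars.repositories and spark.jars.packages take a single comma-separated string, and a stray space or line break inside that string is easy to introduce when the snippet gets re-wrapped by a mail client. A minimal sketch of building the two values from lists instead. The helper name join_coordinates is hypothetical; the URLs and coordinates are the ones given above.

```python
def join_coordinates(items):
    """Join Maven coordinates or repository URLs into one comma-separated string."""
    cleaned = [item.strip() for item in items if item.strip()]
    return ",".join(cleaned)


REPOSITORIES = [
    "https://repo.osgeo.org/repository/release",
    "https://download.java.net/maven/2",
]
PACKAGES = [
    "org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating",
    "org.geotools:gt-main:24.0",
    "org.geotools:gt-referencing:24.0",
    "org.geotools:gt-epsg-hsql:24.0",
]

jars_repositories = join_coordinates(REPOSITORIES)
jars_packages = join_coordinates(PACKAGES)

# These two strings would then be passed to the builder as
#   .config("spark.jars.repositories", jars_repositories)
#   .config("spark.jars.packages", jars_packages)
print(jars_repositories)
print(jars_packages)
```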
>>
>> On Wed, Feb 10, 2021 at 2:35 AM Grégory Dugernier <g...@aloalto.com> wrote:
>>
>>> Hello,
>>>
>>> I've been trying to run Sedona for Python on Databricks for 2 days and I
>>> think I've stumbled upon a bug.
>>>
>>> *Configuration*:
>>>
>>>    - Spark 3.0.1
>>>    - Scala 2.12
>>>    - Python 3.7
>>>
>>> *Libraries*:
>>>
>>>    - apache-sedona (from PyPI)
>>>    - org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating
>>>    (from Maven)
>>>
>>> *What I'm trying to do:*
>>>
>>> I'm trying to load a series of Shapefiles into a dataframe for
>>> geospatial analysis. See the code snippet below, based on your example
>>> notebook
>>> <https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaCore.ipynb>
>>>
>>>
>>> > from sedona.core.formatMapper.shapefileParser import ShapefileReader
>>> > from sedona.register import SedonaRegistrator
>>> > from sedona.utils.adapter import Adapter
>>> >
>>> > SedonaRegistrator.registerAll(spark)
>>> > shape_rdd = ShapefileReader.readToGeometryRDD(spark.sparkContext,
>>> > file_name)
>>> > df = Adapter.toDf(shape_rdd, spark)
>>> >
>>>
>>> *Bug*:
>>>
>>> The ShapefileReader.readToGeometryRDD() currently throws the following
>>> error:
>>>
>>> > Py4JJavaError: An error occurred while calling
>>> > z:org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD.
>>> > : java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException
>>> > at org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:79)
>>> > at org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:66)
>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> > at java.lang.reflect.Method.invoke(Method.java:498)
>>> > at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>>> > at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
>>> > at py4j.Gateway.invoke(Gateway.java:295)
>>> > at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>>> > at py4j.commands.CallCommand.execute(CallCommand.java:79)
>>> > at py4j.GatewayConnection.run(GatewayConnection.java:251)
>>> > at java.lang.Thread.run(Thread.java:748)
>>> > Caused by: java.lang.ClassNotFoundException: org.opengis.referencing.FactoryException
>>> > at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
>>> > at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
>>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
>>>
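
The useful part of a trace like this is the name of the class the JVM could not find; here it sits in the org.opengis package, which per the reply above comes with the GeoTools jars. A minimal sketch for pulling the missing class names out of a trace pasted as text. The function name missing_classes and the regular expression are illustrative assumptions, not part of Sedona or Py4J.

```python
import re

# Matches "NoClassDefFoundError: a/b/C" and "ClassNotFoundException: a.b.C".
MISSING_CLASS = re.compile(
    r"(?:NoClassDefFoundError|ClassNotFoundException):\s*([\w./$]+)"
)


def missing_classes(trace):
    """Return the distinct class names a JVM stack trace reports as missing."""
    names = []
    for match in MISSING_CLASS.finditer(trace):
        # Normalise the slash-separated form to dotted notation.
        name = match.group(1).replace("/", ".").rstrip(".")
        if name not in names:
            names.append(name)
    return names


trace = """
: java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.opengis.referencing.FactoryException
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
"""

print(missing_classes(trace))  # ['org.opengis.referencing.FactoryException']
```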
>>>
>>> Adding the org.apache.sedona:sedona-core-3.0_2.12:1.0.0-incubating library
>>> from Maven doesn't solve the error. Adding the org.datasyslab:geospark:1.3.1
>>> library from Maven solves the error, but it creates conflicts with the
>>> underlying org.locationtech.jts dependencies. This makes me think there is
>>> a missing OpenGIS dependency in the sedona-python-adapter.
>>>
>>> Regards,
>>> G. Dugernier
>>>
>>> --
>>>
>>>
>>>
>>> Grégory Dugernier
>>> Software Engineer
>>>
>>> g...@aloalto.com <f...@aloalto.com>
>>> +32 (0)484 11 26 09
>>>
>>> www.aloalto.com
>>> +32 (0)2 736 10 17
>>>
>>> --
>>>
>>>
>>>
>>>
>>> DISCLAIMER : The content of this e-mail message does not constitute a
>>> commitment of S.A. ALOALTO N.V. or its subsidiaries/affiliates. This e-mail
>>> and any attachments thereto may contain information which is confidential
>>> and/or protected by intellectual property rights and are intended for the
>>> intended recipient only. Any use of the information contained herein
>>> (including, but not limited to, total or partial reproduction,
>>> communication or distribution in any form) by persons other than the
>>> designated recipient(s) is prohibited. If an addressing or transmission
>>> error has misdirected this e-mail, please notify the author, either by
>>> telephone or by e-mail and delete the material from any computer.
>>>
>>>