Hi,
I was trying to install Sedona on Databricks and ran into some problems with 
the process described at the following URL:

https://sedona.apache.org/1.5.0/setup/databricks/

I used curl to download the jar files to the Shared workspace and configured 
the init script in the cluster settings, with the init script itself also 
stored in the Shared workspace. That is how I had set up an older cluster that 
I had been using up until a few weeks ago.

In case you didn't know, Databricks has moved init scripts off DBFS. On Azure, 
they now require init scripts to be stored in ADLS or in Volumes. I had to move 
not just the init script to a volume but the jars as well. I tried moving only 
the init script and leaving the jars in the Shared workspace, but that did not 
work. Either way, DBFS is not an option.

Here are some changes that I made to your setup process that worked. I am 
sending them in case you find them useful.

%sh
# Create JAR directory for Sedona
mkdir -p "/<Volume Name>/<schema name>/<init path>/sedona/jars/"

# Download the dependencies from Maven into the volume
curl -o "/<Volume Name>/<schema name>/<init path>/sedona/jars/geotools-wrapper-1.5.1-28.2.jar" \
  https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/1.5.1-28.2/geotools-wrapper-1.5.1-28.2.jar

curl -o "/<Volume Name>/<schema name>/<init path>/sedona/jars/sedona-spark-shaded-3.4_2.12-1.5.1.jar" \
  https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.4_2.12/1.5.1/sedona-spark-shaded-3.4_2.12-1.5.1.jar
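A possible sanity check after the downloads, as a minimal sketch rather than part of the official setup: plain `curl -o` can save an HTML error page under the jar's name if the URL is wrong, and the cluster would then start with a broken jar. Since a jar is a zip archive, a valid file should start with the bytes "PK". The function name `check_jars` is hypothetical, and the directory is passed in as an argument.

```shell
#!/bin/bash
# check_jars is a hypothetical helper, not part of Sedona or Databricks.
# It fails if the directory has no jars, or if any jar does not start
# with the zip magic bytes "PK".
check_jars() {
  local dir="$1"
  local found=0
  local jar
  for jar in "$dir"/*.jar; do
    # If the glob matched nothing, the literal pattern is left as-is; skip it.
    [ -e "$jar" ] || continue
    found=1
    if [ "$(head -c 2 "$jar")" != "PK" ]; then
      echo "not a valid jar: $jar" >&2
      return 1
    fi
  done
  if [ "$found" -eq 0 ]; then
    echo "no jars found in $dir" >&2
    return 1
  fi
  echo "all jars in $dir look valid"
}
```

In a notebook cell this could be called as `check_jars "/<Volume Name>/<schema name>/<init path>/sedona/jars"` once the placeholders are filled in.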


Init script:

#!/bin/bash
#
# File: sedona-init.sh
#
# On cluster startup, this script will copy the Sedona jars to the cluster's
# default jar directory.
# In order to activate Sedona functions, remember to add the Sedona extensions
# to your Spark configuration:
# "spark.sql.extensions org.apache.sedona.viz.sql.SedonaVizExtensions,org.apache.sedona.sql.SedonaSqlExtensions"

#cp -p /Workspace/Shared/sedona/1.5.1/*.jar /databricks/jars/
cp "/<Volume Name>/<schema name>/<init path>/sedona/jars/"*.jar /databricks/jars
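When the jars sat in a location the cluster could not read, nothing pointed clearly at the cause. A more defensive variant of the copy step could make that visible in the init script log. This is only a sketch under assumptions: `install_jars` is a hypothetical helper name, and the source and destination directories are passed in as arguments rather than hard-coded.

```shell
#!/bin/bash
# install_jars is a hypothetical helper, not part of Sedona or Databricks.
# It copies every jar from a source directory to a destination directory
# and returns non-zero with a message if the source is missing or empty.
install_jars() {
  local src="$1"
  local dest="$2"
  local count=0
  local jar

  if [ ! -d "$src" ]; then
    echo "sedona-init: source directory not found: $src" >&2
    return 1
  fi

  for jar in "$src"/*.jar; do
    # If the glob matched nothing, the literal pattern is left as-is; skip it.
    [ -e "$jar" ] || continue
    cp "$jar" "$dest"/ || return 1
    count=$((count + 1))
  done

  if [ "$count" -eq 0 ]; then
    echo "sedona-init: no jars in $src" >&2
    return 1
  fi
  echo "sedona-init: copied $count jar(s) to $dest"
}
```

In the init script, the plain `cp` line could then become `install_jars "/<Volume Name>/<schema name>/<init path>/sedona/jars" /databricks/jars || exit 1`, so that a missing or empty jar directory fails the script with a message instead of passing silently.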



I stored the init script in
/<Volume Name>/<schema name>/<init path>/sedona_init.sh

In the cluster details, under Init Scripts, I added an entry of type Volume 
with the file path above.

Hope this helps
William Komp
