golfalot opened a new issue, #1477:
URL: https://github.com/apache/sedona/issues/1477
## Expected behavior
return result rows/table
## Actual behavior
crash with stack trace
java.lang.NoSuchMethodError: 'void
org.geotools.coverage.grid.GridGeometry2D.<init>(org.opengis.coverage.grid.GridEnvelope,
org.opengis.referencing.datum.PixelInCell,
org.opengis.referencing.operation.MathTransform,
org.opengis.referencing.crs.CoordinateReferenceSystem,
org.geotools.util.factory.Hints)'
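A `NoSuchMethodError` generally means the class loaded at runtime lacks a method signature that the calling code was compiled against, which often points to mismatched or duplicated jars on the classpath. As a rough diagnostic sketch (not a confirmed cause for this issue), the following lists which local jar files contain the `GridGeometry2D` class. The helper name `jars_containing` and the directory it is given are hypothetical.

```python
import zipfile
from pathlib import Path

# Class named in the NoSuchMethodError above, written as a path inside a jar.
CLASS_ENTRY = "org/geotools/coverage/grid/GridGeometry2D.class"


def jars_containing(jar_dir: str, entry: str = CLASS_ENTRY) -> list[str]:
    """Return the names of the .jar files in jar_dir that contain the given entry."""
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        # A jar is a zip archive, so the standard zipfile module can read it.
        with zipfile.ZipFile(jar) as archive:
            if entry in archive.namelist():
                hits.append(jar.name)
    return hits
```

Calling `jars_containing("/path/to/downloaded/jars")` on the folder holding the uploaded workspace jars would show whether more than one jar ships this class; more than one hit would be worth investigating.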
## Steps to reproduce the problem
```python
from sedona.spark import SedonaContext

# Maven coordinates take the form groupId:artifactId:version
config = SedonaContext.builder(). \
    config('spark.jars.packages',
           'org.apache.sedona:sedona-spark-shaded-3.4_2.12:1.6.0,'
           'org.datasyslab:geotools-wrapper:1.6.0-28.2'). \
    getOrCreate()
sedona = SedonaContext.create(config)
```
```python
from pyspark.sql import functions as f

df = sedona.read.format("binaryFile").load("/raw/GIS_Raster_Data/samples/test.nc")
df2 = df.withColumn("raster", f.expr("RS_FromNetCDF(content, 'O3')"))
df2.createOrReplaceTempView("raster_table")
# this command throws the error
sedona.sql("SELECT RS_Value(raster, 3, 4, 1) FROM raster_table").show()
```
Raster sourced from:
https://github.com/apache/sedona/blob/master/spark/common/src/test/resources/raster/netcdf/test.nc
## Settings
Sedona version = 1.6.0
Apache Spark version = 3.4
API type = Python
Scala version = 2.12.17
Java version = 11
Python version = 3.10
Environment = Azure Synapse Spark Pool
# Additional background
We're using Azure Synapse with DEP (data exfiltration protection) enabled,
which means no outbound internet access, so all packages must be obtained
manually before being uploaded as "Workspace packages", which can then be
enabled on the Spark pools.
## A configuration that works (no error)
### Spark pool
- Apache Spark version = 3.4
- Scala version = 2.12.17
- Java version = 11
- Python version = 3.10
### Packages
Java
- sedona-spark-shaded-3.4_2.12-1.5.3.jar
- geotools-wrapper-1.5.3-28.2.jar
Python
- apache_sedona-1.5.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- shapely-2.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
## A configuration that causes the error
### Spark pool (identical to above)
- Apache Spark version = 3.4
- Scala version = 2.12.17
- Java version = 11
- Python version = 3.10
### Packages
Java
- sedona-spark-shaded-3.4_2.12-1.6.0.jar
- geotools-wrapper-1.6.0-28.2.jar
Python
- click_plugins-1.1.1-py2.py3-none-any.whl
- affine-2.4.0-py3-none-any.whl
- apache_sedona-1.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- cligj-0.7.2-py3-none-any.whl
- rasterio-1.3.10-cp310-cp310-manylinux2014_x86_64.whl
- shapely-2.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- snuggs-1.4.7-py3-none-any.whl
**Stating the obvious:** there are many more packages listed in the failing
scenario. See below for the convoluted steps needed to establish which packages
are required for a baseline Synapse Spark pool.
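The two Python package lists above can be compared mechanically. This is a minimal sketch that assumes the standard wheel naming scheme `<distribution>-<version>-<tags>.whl`; the helper `name_and_version` is hypothetical, and the file names are copied from the two configurations.

```python
# Wheel file names copied from the working and failing configurations above.
WORKING = [
    "apache_sedona-1.5.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "shapely-2.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]
FAILING = [
    "click_plugins-1.1.1-py2.py3-none-any.whl",
    "affine-2.4.0-py3-none-any.whl",
    "apache_sedona-1.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "cligj-0.7.2-py3-none-any.whl",
    "rasterio-1.3.10-cp310-cp310-manylinux2014_x86_64.whl",
    "shapely-2.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "snuggs-1.4.7-py3-none-any.whl",
]


def name_and_version(wheel: str) -> tuple[str, str]:
    """Split a wheel file name into (distribution, version)."""
    # Assumes "<distribution>-<version>-<tags>.whl", where the
    # distribution part itself contains no hyphen.
    distribution, version = wheel.split("-")[:2]
    return distribution, version


working = dict(name_and_version(w) for w in WORKING)
failing = dict(name_and_version(w) for w in FAILING)

# Distributions present only in the failing configuration.
added = sorted(set(failing) - set(working))
# Distributions present in both, but at different versions.
changed = {
    name: (working[name], failing[name])
    for name in failing
    if name in working and working[name] != failing[name]
}

print("added:", added)      # ['affine', 'click_plugins', 'cligj', 'rasterio', 'snuggs']
print("changed:", changed)  # {'apache_sedona': ('1.5.3', '1.6.0')}
```

On the Python side, the failing configuration differs only by the `apache_sedona` version and by the added packages, which appear to be `rasterio` and its dependencies.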
# How to establish Python package dependencies for a Synapse Spark pool
## Identify Operating System
https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-34-runtime
=> Mariner 2.0
## Create a VM and apply baseline configuration
https://github.com/microsoft/azurelinux/blob/2.0/toolkit/docs/quick_start/quickstart.md
### Get conda
```bash
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
sudo bash Miniforge3-Linux-x86_64.sh -b -p /usr/lib/miniforge3
export PATH="/usr/lib/miniforge3/bin:$PATH"
```
### Apply baseline Synapse configuration
```bash
sudo tdnf -y install gcc g++
wget https://raw.githubusercontent.com/Azure-Samples/Synapse/main/Spark/Python/Synapse-Python310-CPU.yml
conda env create -n synapse-env -f Synapse-Python310-CPU.yml
source activate synapse-env
```
### Install pip packages and determine which packages are downloaded above and beyond the baseline packages
create the requirements file (`input-user-req.txt`)
```bash
# echo "apache-sedona==1.5.3" > input-user-req.txt
echo "apache-sedona==1.6.0" > input-user-req.txt
```
install apache-sedona and dependencies
```bash
pip install -r input-user-req.txt > pip_output.txt
```
list the packages that pip downloaded
```bash
cat pip_output.txt | grep Downloading
```
Use the above output to identify the `.whl` files to download and add to Synapse.
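The `grep` step can also be scripted to produce a clean list of wheel file names. This is a minimal sketch, assuming pip prints lines of the form `Downloading <file-or-url> (<size>)`; the exact format may differ between pip versions, and `wheels_from_pip_output` is a hypothetical helper name.

```python
import re

# Captures the token that follows "Downloading" on a pip output line.
# Assumption: pip prints "Downloading <file-or-url> (<size>)".
_DOWNLOADING = re.compile(r"^\s*Downloading\s+(\S+)")


def wheels_from_pip_output(text: str) -> list[str]:
    """Return the distinct .whl file names on 'Downloading' lines, in order."""
    found: list[str] = []
    for line in text.splitlines():
        match = _DOWNLOADING.match(line)
        if not match:
            continue
        # Keep only the file name if pip printed a full URL.
        name = match.group(1).rsplit("/", 1)[-1]
        # Drop any URL fragment or query string (e.g. "#sha256=...").
        name = re.split(r"[#?]", name, maxsplit=1)[0]
        if name.endswith(".whl") and name not in found:
            found.append(name)
    return found
```

For example, `wheels_from_pip_output(open("pip_output.txt").read())` would return the wheel names to fetch on a machine with internet access before uploading them as workspace packages.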
# Full stack trace of error
```python
---
Py4JJavaError