fmendezlopez opened a new issue, #1776:
URL: https://github.com/apache/sedona/issues/1776
Hello,
I get an error when reading a GeoTIFF file and calling the
"RS_FromGeoTiff" function.
The code:
`val sedona = SedonaContext.create(datioSparkSession.getSparkSession)
SedonaVizRegistrator.registerAll(sedona)
val filePath = DatioFileSystem.get().qualify("/in/staging/kris/custom/Aqueduct_FL100_2030_RCP45.tif").string()
sedona.read
  .format("binaryFile")
  .load(filePath)
  .selectExpr("RS_FromGeoTiff(content) as raster", "path")
  .selectExpr("raster", "RS_Metadata(raster) as metadata")
  .show(false)`
The error thrown:
`2025-01-29T09:44:40,061 [task-result-getter-1/134] [WARN] org.apache.spark.scheduler.TaskSetManager - Lost task 0.1 in stage 0.0 (TID 1) (ip-10-60-253-200.eu-south-2.compute.internal executor 13): org.apache.spark.sql.sedona_sql.expressions.InferredExpressionException: Exception occurred while evaluating expression RS_FromGeoTiff - inputs: [[B@44d7c680], cause: null
    at org.apache.spark.sql.sedona_sql.expressions.InferredExpression$.throwExpressionInferenceException(InferredExpression.scala:149)
    at org.apache.spark.sql.sedona_sql.expressions.InferredExpression.eval(InferredExpression.scala:113)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:408)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:888)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:888)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
    at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
    at org.apache.spark.scheduler.Task.run(Task.scala:141)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1541)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.IllegalArgumentException
    at sun.misc.Unsafe.copyMemory(Native Method)
    at com.esotericsoftware.kryo.io.UnsafeOutput.writeBytes(UnsafeOutput.java:378)
    at com.esotericsoftware.kryo.io.UnsafeOutput.writeFloats(UnsafeOutput.java:348)
    at org.apache.sedona.common.raster.serde.KryoUtil.writeFloatArrays(KryoUtil.java:234)
    at org.apache.sedona.common.raster.serde.DataBufferSerializer.write(DataBufferSerializer.java:58)
    at org.apache.sedona.common.raster.serde.AWTRasterSerializer.write(AWTRasterSerializer.java:48)
    at org.apache.sedona.common.raster.DeepCopiedRenderedImage.write(DeepCopiedRenderedImage.java:453)
    at org.apache.sedona.common.raster.serde.Serde$SerializableState.write(Serde.java:125)
    at org.apache.sedona.common.raster.serde.Serde.serialize(Serde.java:173)
    at org.apache.spark.sql.sedona_sql.expressions.raster.implicits$RasterEnhancer.serialize(implicits.scala:46)
    at org.apache.spark.sql.sedona_sql.expressions.InferrableRasterTypes$.rasterSerializer(InferrableRasterTypes.scala:47)
    at org.apache.spark.sql.sedona_sql.expressions.InferredRasterExpression$.$anonfun$rasterSerializer$1(InferredRasterExpression.scala:54)
    at org.apache.spark.sql.sedona_sql.expressions.InferredExpression.eval(InferredExpression.scala:107)
    ... 19 more`
I have tried the following:
- Same code with another file --> no error thrown
- Opening the file with QGIS --> the layer loads successfully
- Executing in a cluster environment with more memory --> same error
- Same code in Python --> a different error is thrown:
`2025-01-29T11:28:27,041 [Thread-42/107] [DEBUG] com.amazonaws.emr.recordserver.connector.spark.sql.SparkPlanValidator - plan is Project [metadata#31, raster#27, point#32, **org.apache.spark.sql.sedona_sql.expressions.raster.RS_Contains** AS rs_contains(raster, point)#36]
+- Project [raster#27, rs_metadata(raster#27) AS metadata#31, **org.apache.spark.sql.sedona_sql.expressions.ST_Point** AS point#32]
   +- Project [**org.apache.spark.sql.sedona_sql.expressions.raster.RS_FromGeoTiff** AS raster#27, path#19]
      +- Relation [path#19,modificationTime#20,length#21L,content#22] binaryFile
--
2025-01-29T11:28:27,051 [Thread-11/37] [ERROR] dataproc.Main - Exception: [NOT_INT] Argument `n` should be an int, got bool.
`
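A possible explanation for the Scala failure, offered only as an unverified assumption: the innermost frames of the first trace show Kryo's `UnsafeOutput.writeFloats` failing inside `Unsafe.copyMemory` with an `IllegalArgumentException`, which a negative byte count could produce. If the byte count of a band's float array is computed with 32-bit arithmetic, any band with more than 2^29 pixels would overflow. The Scala sketch below only illustrates that arithmetic. `FloatArrayByteSize` and its methods are hypothetical names, not part of Sedona or Kryo, and the grid size in the usage note is an illustrative value, not the measured size of this file.

```scala
// Hypothetical helper (not a Sedona or Kryo API): shows how the byte count of
// a float array can overflow a signed 32-bit Int for very large rasters.
object FloatArrayByteSize {
  // Byte count computed with 32-bit arithmetic (4 bytes per float).
  // ASSUMPTION: the serializer computes the size this way; not verified.
  def byteCountAsInt(numFloats: Int): Int = numFloats << 2

  // The same quantity computed with 64-bit arithmetic, which cannot
  // overflow for any valid array length.
  def byteCountAsLong(numFloats: Int): Long = numFloats.toLong * 4L

  // True when the 32-bit result no longer matches the real byte count.
  def overflows(numFloats: Int): Boolean =
    byteCountAsInt(numFloats).toLong != byteCountAsLong(numFloats)
}
```

For example, `FloatArrayByteSize.overflows(43200 * 21600)` is true, while `FloatArrayByteSize.overflows(1000 * 1000)` is false. This would be consistent with a smaller file working and extra memory not helping, but it remains a guess until the raster dimensions are checked.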
Could you please help me address this issue?
Thank you in advance.