[
https://issues.apache.org/jira/browse/SEDONA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732374#comment-17732374
]
Jia Yu commented on SEDONA-289:
-------------------------------
Thanks for finding this. The `geotiff` data source internally calls
ST_Transform on behalf of the user, which can lead to inconsistent results.
This data source is not well designed and may cause issues in other cases
as well. With SEDONA-251 and SEDONA-292 in place, we are ready to deprecate
it and will remove it after a few releases.
Therefore, we will not fix this bug. In Sedona 1.4.1, we will provide a full
end-to-end raster ETL pipeline.
> On the correctness of accessing geometry attribute value of raster data
> -----------------------------------------------------------------------
>
> Key: SEDONA-289
> URL: https://issues.apache.org/jira/browse/SEDONA-289
> Project: Apache Sedona
> Issue Type: Bug
> Affects Versions: 1.4.0
> Reporter: Liam An
> Priority: Major
>
> When reading GeoTIFF raster data with Sedona, the coordinates of the returned
> geometry attribute do not match the raster extent reported by GeoTools or
> ArcGIS.
> For example:
>
> {code:java}
> // Read the raster envelope with GeoTools
> File file = new File("D:\\test.tif");
> AbstractGridFormat format = GridFormatFinder.findFormat(file);
> GeoTiffReader reader = new GeoTiffReader(file,
>     new Hints(Hints.FORCE_LONGITUDE_FIRST_AXIS_ORDER, Boolean.TRUE));
> GridCoverage2D coverage = reader.read(null);
> Envelope env = coverage.getEnvelope();
> System.out.println(env.toString());
> // Output: GeneralEnvelope[(120.1190185546875, 30.13824462890625),
> //         (120.23162841796875, 30.2398681640625)]
> {code}
> {code:scala}
> // Read the raster geometry with Sedona
> val spark = SparkSession.builder()
>   // .config("spark.serializer", classOf[KryoSerializer].getName)
>   // .config("spark.kryo.registrator", classOf[SedonaKryoRegistrator].getName)
>   .master("local[*]")
>   .appName("sedonasqlScalaTest")
>   .getOrCreate()
> SedonaSQLRegistrator.registerAll(spark)
> var geotiffDF = spark.read.format("geotiff")
>   .option("dropInvalid", true)
>   .option("readToCRS", "EPSG:4326") // 3857
>   .load("D:\\test.tif")
> geotiffDF.printSchema()
> geotiffDF = geotiffDF.selectExpr(
>   "image.origin as origin", "image.geometry as geometry",
>   "image.height as height", "image.width as width",
>   "image.nBands as nBands", "image.data as data")
> val fr = geotiffDF.first()
> val geom: String = fr.getString(1)
> println("geom value is : " + geom)
> // Output: POLYGON ((120.11910247802734 30.239782333374023,
> //   120.11910247802734 30.138330459594727, 120.2315444946289 30.138330459594727,
> //   120.2315444946289 30.239782333374023, 120.11910247802734 30.239782333374023))
> {code}
> The two extents above do not match.
> Please check the "decode" method in the "GeotiffSchema.scala" file.
>
>
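To put a number on the mismatch reported in the quoted issue, the following is
a minimal sketch in plain Java that subtracts the GeoTools envelope from the
bounding box of the Sedona polygon, edge by edge. The class name EnvelopeDiff
and the {minX, minY, maxX, maxY} array layout are assumptions made for this
illustration only; the coordinates are copied from the two outputs in the
issue. It measures the gap and says nothing about its cause.

```java
import java.util.Locale;

public class EnvelopeDiff {
    // Hypothetical helper: envelopes are {minX, minY, maxX, maxY} in degrees.
    // Returns observed - reference for each edge.
    public static double[] diff(double[] reference, double[] observed) {
        double[] d = new double[4];
        for (int i = 0; i < 4; i++) {
            d[i] = observed[i] - reference[i];
        }
        return d;
    }

    public static void main(String[] args) {
        // Envelope printed by GeoTools in the issue.
        double[] geotools = {120.1190185546875, 30.13824462890625,
                             120.23162841796875, 30.2398681640625};
        // Bounding box of the POLYGON printed by the Sedona geotiff reader.
        double[] sedona = {120.11910247802734, 30.138330459594727,
                           120.2315444946289, 30.239782333374023};

        String[] names = {"minX", "minY", "maxX", "maxY"};
        double[] d = diff(geotools, sedona);
        for (int i = 0; i < 4; i++) {
            // A positive value means the Sedona edge lies east/north of GeoTools.
            System.out.printf(Locale.ROOT, "%s: %+.3e%n", names[i], d[i]);
        }
    }
}
```

With the values from the issue, each edge differs by between 8e-5 and 9e-5
degrees, and the Sedona polygon lies entirely inside the GeoTools envelope.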