jiayuasu opened a new issue, #2651:
URL: https://github.com/apache/sedona/issues/2651

   ## Summary
   
   The GeoPackage V2 DataSource reader (`GeoPackageTable`) does not implement `SupportsMetadataColumns`, so queries like `SELECT _metadata.file_name FROM geopackage...` fail.
   
   This is the same issue as SEDONA-729 (which tracks the shapefile reader), 
but for the GeoPackage reader.
   
   ## Expected behavior
   
   ```scala
   val df = spark.read.format("geopackage").load("/path/to/data.gpkg")
   df.select("_metadata.file_path", "_metadata.file_name", "_metadata.file_size").show()
   ```
   
   The above query should return file-level metadata for each row.
   
   ## Current behavior
   
   The `_metadata` column is not available because `GeoPackageTable` extends 
`FileTable` but does not implement `SupportsMetadataColumns`.
   
   ## Fix
   
   `GeoPackageTable` should implement `SupportsMetadataColumns` and expose a 
`_metadata` column with the standard struct fields (`file_path`, `file_name`, 
`file_size`, `file_block_start`, `file_block_length`, 
`file_modification_time`), similar to the fix being applied for shapefiles in 
SEDONA-729.
   
   The corresponding scan builder, scan, and partition reader factory will also 
need to be updated to propagate and populate the metadata schema.
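   
   As a rough illustration of the table-side change described above, the sketch below declares a `_metadata` struct column through Spark's `SupportsMetadataColumns` interface. The trait name `FileMetadataColumns` is hypothetical and is not taken from the Sedona code base; the field types are assumed to mirror what Spark's built-in file sources report. The scan builder, scan, and partition reader changes are not shown.
   
   ```scala
   import org.apache.spark.sql.connector.catalog.{MetadataColumn, SupportsMetadataColumns}
   import org.apache.spark.sql.types._

   // Hypothetical mix-in (name invented for illustration): a V2 file table
   // such as GeoPackageTable could extend this to advertise `_metadata`.
   trait FileMetadataColumns extends SupportsMetadataColumns {

     // Struct fields listed in the issue; types are an assumption based on
     // the `_metadata` column of Spark's built-in file sources.
     private val fileMetadataStruct: StructType = StructType(Seq(
       StructField("file_path", StringType, nullable = false),
       StructField("file_name", StringType, nullable = false),
       StructField("file_size", LongType, nullable = false),
       StructField("file_block_start", LongType, nullable = false),
       StructField("file_block_length", LongType, nullable = false),
       StructField("file_modification_time", TimestampType, nullable = false)))

     // Spark calls this to learn which hidden columns the table can produce.
     override def metadataColumns(): Array[MetadataColumn] = Array(
       new MetadataColumn {
         override def name(): String = "_metadata"
         override def dataType(): DataType = fileMetadataStruct
         override def isNullable(): Boolean = false
       })
   }
   ```
   
   Declaring the column only makes `_metadata` resolvable during analysis; the values still have to be filled in per file by the partition reader, which is why the rest of the read path needs updating as well.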
   
   ## Notes
   
   - The OSM PBF reader is **not affected** because it uses the V1 DataSource 
API (`FileFormat`), which gets `_metadata` support automatically from Spark.
   

