vsachinrao opened a new issue, #759:
URL: https://github.com/apache/incubator-xtable/issues/759

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/incubator-xtable/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### Please describe the bug 🐞
   
   Hi, I am trying to convert an Iceberg table, created with the Athena engine and registered in the AWS Glue catalog, to Delta format. The goal is to enable catalog federation into Unity Catalog without copying the actual data files.
   The error seems to suggest that, even with Glue as the source catalog, the Iceberg source still uses the Hadoop-tables approach to resolve the latest version pointer. It looks for a version-hint.text file, but in this case the version pointer is managed by the Glue catalog and the metadata files follow the 00000-<uuid>.metadata.json naming convention rather than v0.metadata.json.
   The error, the YAML file, and the Python code used to run the XTable command are provided below.
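
   For reference, with an Athena/Glue Iceberg table the current metadata pointer is stored as a table parameter in Glue rather than in a version-hint.text file. Below is a minimal boto3 sketch to confirm this (a hypothetical check, assuming default AWS credentials and the database, table, and region from the YAML further down):

   ```python
   import boto3

   # Inspect the Glue table to confirm that the Iceberg metadata pointer lives in
   # the "metadata_location" table parameter, not in a version-hint.text file on S3.
   glue = boto3.client("glue", region_name="eu-central-1")
   table = glue.get_table(DatabaseName="xtable_sync_db", Name="table_sync_with_unity")["Table"]

   params = table.get("Parameters", {})
   print(params.get("table_type"))          # expected: ICEBERG
   print(params.get("metadata_location"))   # e.g. s3://.../metadata/00000-<uuid>.metadata.json
   ```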
   
   **It failed with the following error:**
   
   python coverter_athena.py
   Running: /usr/lib/jvm/java-11-openjdk-amd64/bin/java -cp 
/xtable/incubator-xtable/xtable-utilities/target/xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:/xtable/hudi-hive-sync-0.15.0.jar:/xtable/hudi-common-0.15.0.jar:/xtable/hudi-hadoop-mr-bundle-0.15.0.jar:/xtable/hudi-sync-common-0.15.0.jar
 org.apache.xtable.utilities.RunCatalogSync --catalogSyncConfig athena_sync.yaml
   WARNING: Runtime environment or build system does not support multi-release 
JARs. This will impact location-based features.
   2025-11-10 15:37:35 WARN  org.apache.hadoop.util.NativeCodeLoader:60 - 
Unable to load native-hadoop library for your platform... using builtin-java 
classes where applicable
   WARNING: An illegal reflective access operation has occurred
   WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform 
(file:/xtable/incubator-xtable/xtable-utilities/target/xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar)
 to constructor java.nio.DirectByteBuffer(long,int)
   WARNING: Please consider reporting this to the maintainers of 
org.apache.spark.unsafe.Platform
   WARNING: Use --illegal-access=warn to enable warnings of further illegal 
reflective access operations
   WARNING: All illegal access operations will be denied in a future release
   2025-11-10 15:37:43 WARN  org.apache.hadoop.metrics2.impl.MetricsConfig:138 
- Cannot locate configuration: tried 
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
   2025-11-10 15:37:44 INFO  
org.apache.spark.sql.delta.storage.DelegatingLogStore:60 - LogStore 
`LogStoreAdapter(io.delta.storage.S3SingleDriverLogStore)` is used for scheme 
`s3`
   2025-11-10 15:37:46 INFO  org.apache.spark.sql.delta.DeltaLog:60 - Creating 
initial snapshot without metadata, because the directory is empty
   2025-11-10 15:37:50 INFO  org.apache.spark.sql.delta.InitialSnapshot:60 - 
[tableId=cbab2b53-407e-46f9-83ce-5bfc964ba4ca] Created snapshot 
InitialSnapshot(path=s3://xxxx/xtable-sync-test/table_sync_with_unity/_delta_log,
 version=-1, 
metadata=Metadata(708d9d00-94da-49bc-906c-90b8dde18e17,null,null,Format(parquet,Map()),null,List(),Map(),Some(1762785470724)),
 
logSegment=LogSegment(s3://xxxx/xtable-sync-test/table_sync_with_unity/_delta_log,-1,List(),None,-1),
 checksumOpt=None)
   2025-11-10 15:37:51 INFO  
org.apache.xtable.conversion.ConversionController:338 - No previous 
InternalTable sync for target. Falling back to snapshot sync.
   2025-11-10 15:37:51 WARN  
org.apache.iceberg.hadoop.HadoopTableOperations:325 - Error reading version 
hint file 
s3://xxxx/xtable-sync-test/table_sync_with_unity/metadata/version-hint.text
   java.io.FileNotFoundException: No such file or directory: 
s3://xxxx/xtable-sync-test/table_sync_with_unity/metadata/version-hint.text
           at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4156) 
~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:4007)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.hadoop.fs.s3a.S3AFileSystem.extractOrFetchSimpleFileStatus(S3AFileSystem.java:5649)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$executeOpen$4(S3AFileSystem.java:1861)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.hadoop.fs.s3a.S3AFileSystem.executeOpen(S3AFileSystem.java:1859) 
~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:1834) 
~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:997) 
~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.iceberg.hadoop.HadoopTableOperations.findVersion(HadoopTableOperations.java:318)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.iceberg.hadoop.HadoopTableOperations.refresh(HadoopTableOperations.java:104)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.iceberg.hadoop.HadoopTableOperations.current(HadoopTableOperations.java:84)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at org.apache.iceberg.hadoop.HadoopTables.load(HadoopTables.java:94) 
~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.iceberg.IcebergTableManager.lambda$getTable$1(IcebergTableManager.java:60)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at java.util.Optional.orElseGet(Optional.java:369) [?:?]
           at 
org.apache.xtable.iceberg.IcebergTableManager.getTable(IcebergTableManager.java:60)
 [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.iceberg.IcebergConversionSource.initSourceTable(IcebergConversionSource.java:98)
 [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.iceberg.IcebergConversionSource.getSourceTable(IcebergConversionSource.java:77)
 [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.iceberg.IcebergConversionSource.getCurrentSnapshot(IcebergConversionSource.java:147)
 [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.spi.extractor.ExtractFromSource.extractSnapshot(ExtractFromSource.java:40)
 [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.conversion.ConversionController.syncSnapshot(ConversionController.java:281)
 [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.conversion.ConversionController.syncTableFormats(ConversionController.java:203)
 [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.conversion.ConversionController.syncTableAcrossCatalogs(ConversionController.java:136)
 [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.utilities.RunCatalogSync.main(RunCatalogSync.java:193) 
[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
   2025-11-10 15:37:51 ERROR org.apache.xtable.utilities.RunCatalogSync:197 - 
Error running sync for s3://xxxx/xtable-sync-test/table_sync_with_unity
   org.apache.iceberg.exceptions.NoSuchTableException: Table does not exist at 
location: s3://xxxx/xtable-sync-test/table_sync_with_unity
           at org.apache.iceberg.hadoop.HadoopTables.load(HadoopTables.java:97) 
~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.iceberg.IcebergTableManager.lambda$getTable$1(IcebergTableManager.java:60)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at java.util.Optional.orElseGet(Optional.java:369) ~[?:?]
           at 
org.apache.xtable.iceberg.IcebergTableManager.getTable(IcebergTableManager.java:60)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.iceberg.IcebergConversionSource.initSourceTable(IcebergConversionSource.java:98)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.iceberg.IcebergConversionSource.getSourceTable(IcebergConversionSource.java:77)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.iceberg.IcebergConversionSource.getCurrentSnapshot(IcebergConversionSource.java:147)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.spi.extractor.ExtractFromSource.extractSnapshot(ExtractFromSource.java:40)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.conversion.ConversionController.syncSnapshot(ConversionController.java:281)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.conversion.ConversionController.syncTableFormats(ConversionController.java:203)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.conversion.ConversionController.syncTableAcrossCatalogs(ConversionController.java:136)
 ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
           at 
org.apache.xtable.utilities.RunCatalogSync.main(RunCatalogSync.java:193) 
[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
   XTable sync completed successfully
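
   For comparison, resolving the same table through the Glue catalog itself (rather than as a path-based Hadoop table) picks up the Glue-managed metadata pointer. A rough PyIceberg sketch, assuming pyiceberg[glue] is installed and AWS credentials/region come from the environment (shown only to illustrate where the pointer lives, not part of the failing run):

   ```python
   from pyiceberg.catalog import load_catalog

   # Load the table via the Glue catalog; the current metadata file is resolved
   # from the catalog's pointer instead of a version-hint.text file on S3.
   catalog = load_catalog("glue", **{"type": "glue"})
   table = catalog.load_table("xtable_sync_db.table_sync_with_unity")
   print(table.metadata_location)  # points at .../metadata/00000-<uuid>.metadata.json
   ```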
   
   **My YAML file is:**
   sourceCatalog:
     catalogId: "source-catalog-id"
     catalogType: "GLUE"
     catalogProperties:
       externalCatalog.glue.region: "eu-central-1"
   
   targetCatalogs:
      - catalogId: "target-catalog-id-glue"
        catalogSyncClientImpl: "org.apache.xtable.glue.GlueCatalogSyncClient"
        catalogProperties:
           externalCatalog.glue.region: "eu-central-1"
   datasets:
      - sourceCatalogTableIdentifier:
           tableIdentifier:
              hierarchicalId: "xtable_sync_db.table_sync_with_unity"
        targetCatalogTableIdentifiers:
           - catalogId: "target-catalog-id-glue"
             tableFormat: "DELTA"
             tableIdentifier:
                hierarchicalId: "xtable_sync_db.delta_table_sync_with_unity"
   
   **My Python code to invoke the command is:**
   
   import subprocess
   import os
   
   
   java11_bin = "/usr/lib/jvm/java-11-openjdk-amd64/bin/java"
   
   
   xtable_jar = "/xtable/incubator-xtable/xtable-utilities/target/xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar"
   # The Hudi jars below were added because I got a missing MultiPartKeysValueExtractor class error
   hudi_hive_sync_jar = "/xtable/hudi-hive-sync-0.15.0.jar"
   hudi_common_jar = "/home/sachin/Learn/new_begining/xtable/hudi-common-0.15.0.jar"
   hudi_hadoop_mr_jar = "/home/sachin/Learn/new_begining/xtable/hudi-hadoop-mr-bundle-0.15.0.jar"
   hudi_sync_common_jar = "/home/sachin/Learn/new_begining/xtable/hudi-sync-common-0.15.0.jar"
   
   
   classpath = ":".join([
       xtable_jar,
       hudi_hive_sync_jar,
       hudi_common_jar,
       hudi_hadoop_mr_jar,
       hudi_sync_common_jar
   ])
   
   
   catalog_sync_config = "athena_sync.yaml"
   
   def run_xtable_sync():
       command = [
           java11_bin,
           "-cp",
           classpath,
           "org.apache.xtable.utilities.RunCatalogSync",
           "--catalogSyncConfig",
           catalog_sync_config
       ]
   
       try:
           print(f"Running: {' '.join(command)}")
           subprocess.run(command, check=True)
           print("XTable sync completed successfully")
       except subprocess.CalledProcessError as e:
           print(f"Error during XTable sync: {e}")
       except Exception as e:
           print(f"Unexpected error: {e}")
   
   if __name__ == "__main__":
       run_xtable_sync()
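
   Note: the final "XTable sync completed successfully" above is printed by this script even though RunCatalogSync logged an ERROR, because the Java process apparently still exits with code 0 and subprocess.run(check=True) does not raise. A hypothetical variant that surfaces the failure by scanning the captured output:

   ```python
   import subprocess

   def run_xtable_sync_checked(command):
       # Capture the Java output and fail explicitly if the sync logged an ERROR,
       # since the exit code alone does not reflect the failed table sync.
       result = subprocess.run(command, check=True, capture_output=True, text=True)
       output = result.stdout + result.stderr
       print(output)
       if " ERROR " in output:
           raise RuntimeError("XTable sync reported errors; see log output above")
   ```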
   
   Thanks for your help!
   
   Regards,
   
   ### Are you willing to submit PR?
   
   - [ ] I am willing to submit a PR!
   - [ ] I am willing to submit a PR but need help getting started!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   

