vsachinrao opened a new issue, #759: URL: https://github.com/apache/incubator-xtable/issues/759
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/incubator-xtable/issues?q=is%3Aissue) and found no similar issues.

### Please describe the bug 🐞

Hi, I am trying to convert an Iceberg table, created with the Athena engine and registered in the AWS Glue catalog, to Delta format. The goal is to federate the table into Unity Catalog without copying the underlying data files.

The error suggests that when the Glue catalog is the source for Iceberg, XTable still uses the Hadoop-tables approach to find the latest version pointer: it looks for a `version-hint.text` file. In this case, however, the version pointer is managed by the Glue catalog, and the metadata files follow the `00000-<uuid>.metadata.json` naming convention rather than `v0.metadata.json`. The error, the YAML file, and the Python code used to run the XTable command are provided below.

**It failed with this error:**

```
$ python coverter_athena.py
Running: /usr/lib/jvm/java-11-openjdk-amd64/bin/java -cp /xtable/incubator-xtable/xtable-utilities/target/xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:/xtable/hudi-hive-sync-0.15.0.jar:/xtable/hudi-common-0.15.0.jar:/xtable/hudi-hadoop-mr-bundle-0.15.0.jar:/xtable/hudi-sync-common-0.15.0.jar org.apache.xtable.utilities.RunCatalogSync --catalogSyncConfig athena_sync.yaml
WARNING: Runtime environment or build system does not support multi-release JARs. This will impact location-based features.
2025-11-10 15:37:35 WARN org.apache.hadoop.util.NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/xtable/incubator-xtable/xtable-utilities/target/xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2025-11-10 15:37:43 WARN org.apache.hadoop.metrics2.impl.MetricsConfig:138 - Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
2025-11-10 15:37:44 INFO org.apache.spark.sql.delta.storage.DelegatingLogStore:60 - LogStore `LogStoreAdapter(io.delta.storage.S3SingleDriverLogStore)` is used for scheme `s3`
2025-11-10 15:37:46 INFO org.apache.spark.sql.delta.DeltaLog:60 - Creating initial snapshot without metadata, because the directory is empty
2025-11-10 15:37:50 INFO org.apache.spark.sql.delta.InitialSnapshot:60 - [tableId=cbab2b53-407e-46f9-83ce-5bfc964ba4ca] Created snapshot InitialSnapshot(path=s3://xxxx/xtable-sync-test/table_sync_with_unity/_delta_log, version=-1, metadata=Metadata(708d9d00-94da-49bc-906c-90b8dde18e17,null,null,Format(parquet,Map()),null,List(),Map(),Some(1762785470724)), logSegment=LogSegment(s3://xxxx/xtable-sync-test/table_sync_with_unity/_delta_log,-1,List(),None,-1), checksumOpt=None)
2025-11-10 15:37:51 INFO org.apache.xtable.conversion.ConversionController:338 - No previous InternalTable sync for target. Falling back to snapshot sync.
2025-11-10 15:37:51 WARN org.apache.iceberg.hadoop.HadoopTableOperations:325 - Error reading version hint file s3://xxxx/xtable-sync-test/table_sync_with_unity/metadata/version-hint.text
java.io.FileNotFoundException: No such file or directory: s3://xxxx/xtable-sync-test/table_sync_with_unity/metadata/version-hint.text
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4156) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:4007) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.extractOrFetchSimpleFileStatus(S3AFileSystem.java:5649) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$executeOpen$4(S3AFileSystem.java:1861) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.executeOpen(S3AFileSystem.java:1859) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:1834) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:997) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.iceberg.hadoop.HadoopTableOperations.findVersion(HadoopTableOperations.java:318) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.iceberg.hadoop.HadoopTableOperations.refresh(HadoopTableOperations.java:104) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.iceberg.hadoop.HadoopTableOperations.current(HadoopTableOperations.java:84) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.iceberg.hadoop.HadoopTables.load(HadoopTables.java:94) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.iceberg.IcebergTableManager.lambda$getTable$1(IcebergTableManager.java:60) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at java.util.Optional.orElseGet(Optional.java:369) [?:?]
    at org.apache.xtable.iceberg.IcebergTableManager.getTable(IcebergTableManager.java:60) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.iceberg.IcebergConversionSource.initSourceTable(IcebergConversionSource.java:98) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.iceberg.IcebergConversionSource.getSourceTable(IcebergConversionSource.java:77) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.iceberg.IcebergConversionSource.getCurrentSnapshot(IcebergConversionSource.java:147) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.spi.extractor.ExtractFromSource.extractSnapshot(ExtractFromSource.java:40) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.conversion.ConversionController.syncSnapshot(ConversionController.java:281) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.conversion.ConversionController.syncTableFormats(ConversionController.java:203) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.conversion.ConversionController.syncTableAcrossCatalogs(ConversionController.java:136) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.utilities.RunCatalogSync.main(RunCatalogSync.java:193) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
2025-11-10 15:37:51 ERROR org.apache.xtable.utilities.RunCatalogSync:197 - Error running sync for s3://xxxx/xtable-sync-test/table_sync_with_unity
org.apache.iceberg.exceptions.NoSuchTableException: Table does not exist at location: s3://xxxx/xtable-sync-test/table_sync_with_unity
    at org.apache.iceberg.hadoop.HadoopTables.load(HadoopTables.java:97) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.iceberg.IcebergTableManager.lambda$getTable$1(IcebergTableManager.java:60) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at java.util.Optional.orElseGet(Optional.java:369) ~[?:?]
    at org.apache.xtable.iceberg.IcebergTableManager.getTable(IcebergTableManager.java:60) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.iceberg.IcebergConversionSource.initSourceTable(IcebergConversionSource.java:98) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.iceberg.IcebergConversionSource.getSourceTable(IcebergConversionSource.java:77) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.iceberg.IcebergConversionSource.getCurrentSnapshot(IcebergConversionSource.java:147) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.spi.extractor.ExtractFromSource.extractSnapshot(ExtractFromSource.java:40) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.conversion.ConversionController.syncSnapshot(ConversionController.java:281) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.conversion.ConversionController.syncTableFormats(ConversionController.java:203) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.conversion.ConversionController.syncTableAcrossCatalogs(ConversionController.java:136) ~[xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
    at org.apache.xtable.utilities.RunCatalogSync.main(RunCatalogSync.java:193) [xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar:0.2.0-SNAPSHOT]
XTable sync completed successfully
```

**My YAML file is:**

```yaml
sourceCatalog:
  catalogId: "source-catalog-id"
  catalogType: "GLUE"
  catalogProperties:
    externalCatalog.glue.region: "eu-central-1"
targetCatalogs:
  - catalogId: "target-catalog-id-glue"
    catalogSyncClientImpl: "org.apache.xtable.glue.GlueCatalogSyncClient"
    catalogProperties:
      externalCatalog.glue.region: "eu-central-1"
datasets:
  - sourceCatalogTableIdentifier:
      tableIdentifier:
        hierarchicalId: "xtable_sync_db.table_sync_with_unity"
    targetCatalogTableIdentifiers:
      - catalogId: "target-catalog-id-glue"
        tableFormat: "DELTA"
        tableIdentifier:
          hierarchicalId: "xtable_sync_db.delta_table_sync_with_unity"
```

**My Python code to invoke the command is:**

```python
import subprocess

java11_bin = "/usr/lib/jvm/java-11-openjdk-amd64/bin/java"
xtable_jar = "/xtable/incubator-xtable/xtable-utilities/target/xtable-utilities_2.12-0.2.0-SNAPSHOT-bundled.jar"
# The four Hudi jars below were added after a missing MultiPartKeysValueExtractor class error
hudi_hive_sync_jar = "/xtable/hudi-hive-sync-0.15.0.jar"
hudi_common_jar = "/home/sachin/Learn/new_begining/xtable/hudi-common-0.15.0.jar"
hudi_hadoop_mr_jar = "/home/sachin/Learn/new_begining/xtable/hudi-hadoop-mr-bundle-0.15.0.jar"
hudi_sync_common_jar = "/home/sachin/Learn/new_begining/xtable/hudi-sync-common-0.15.0.jar"

classpath = ":".join([
    xtable_jar,
    hudi_hive_sync_jar,
    hudi_common_jar,
    hudi_hadoop_mr_jar,
    hudi_sync_common_jar,
])

catalog_sync_config = "athena_sync.yaml"

def run_xtable_sync():
    command = [
        java11_bin,
        "-cp", classpath,
        "org.apache.xtable.utilities.RunCatalogSync",
        "--catalogSyncConfig", catalog_sync_config,
    ]
    try:
        print(f"Running: {' '.join(command)}")
        subprocess.run(command, check=True)
        print("XTable sync completed successfully")
    except subprocess.CalledProcessError as e:
        print(f"Error during XTable sync: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")

if __name__ == "__main__":
    run_xtable_sync()
```

Thanks for your help!

Regards,

### Are you willing to submit PR?

- [ ] I am willing to submit a PR!
- [ ] I am willing to submit a PR but need help getting started!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
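To make the naming-convention mismatch from the report concrete, here is a small illustrative sketch (not part of XTable; the helper name and regexes are mine, based on the two layouts described above): Hadoop-style Iceberg tables write `v<N>.metadata.json` and record the current version in `version-hint.text`, while catalog-managed tables (such as those created by Athena in Glue) write `<sequence>-<uuid>.metadata.json` and keep the current pointer in the catalog, so no hint file ever exists.

```python
import re

# Hadoop table layout: vN.metadata.json, with the latest N recorded in
# a version-hint.text file next to the metadata files.
HADOOP_STYLE = re.compile(r"^v(\d+)\.metadata\.json$")

# Catalog-managed layout (e.g. Athena/Glue): <5-digit sequence>-<uuid>.metadata.json;
# the current file is tracked by the catalog, not by a hint file.
CATALOG_STYLE = re.compile(
    r"^(\d{5})-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
    r"\.metadata\.json$"
)

def classify_metadata_file(name: str) -> str:
    """Return which Iceberg metadata-file naming convention `name` follows."""
    if HADOOP_STYLE.match(name):
        return "hadoop"
    if CATALOG_STYLE.match(name):
        return "catalog"
    return "unknown"

print(classify_metadata_file("v3.metadata.json"))  # hadoop
print(classify_metadata_file(
    "00000-0a1b2c3d-1111-2222-3333-444455556666.metadata.json"))  # catalog
```

A reader looking only for the Hadoop pattern (as `HadoopTableOperations` does) would classify every file in an Athena-created `metadata/` directory as unknown, which is consistent with the `NoSuchTableException` above.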
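For reference, Glue-registered Iceberg tables expose the current metadata pointer through the table's `metadata_location` parameter in the Glue Data Catalog, which is what a catalog-aware reader would consult instead of `version-hint.text`. A minimal sketch, assuming a boto3 Glue client (the helper function is hypothetical; a stub client stands in for `boto3.client("glue", region_name="eu-central-1")` so the shape is visible without AWS access):

```python
def get_iceberg_metadata_location(glue_client, database: str, table: str) -> str:
    """Read the current Iceberg metadata pointer from a Glue table's parameters."""
    response = glue_client.get_table(DatabaseName=database, Name=table)
    return response["Table"]["Parameters"]["metadata_location"]

class _StubGlueClient:
    """Stand-in for a real boto3 Glue client, showing the expected response shape."""
    def get_table(self, DatabaseName: str, Name: str) -> dict:
        return {
            "Table": {
                "Name": Name,
                "Parameters": {
                    "table_type": "ICEBERG",
                    "metadata_location": (
                        "s3://bucket/path/metadata/"
                        "00000-0a1b2c3d-1111-2222-3333-444455556666.metadata.json"
                    ),
                },
            }
        }

loc = get_iceberg_metadata_location(
    _StubGlueClient(), "xtable_sync_db", "table_sync_with_unity")
print(loc)
```

If the fix is for XTable's Iceberg source to honor the configured catalog, resolving the table through this parameter (rather than via `HadoopTables.load` on the table location) would avoid the `version-hint.text` lookup entirely.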
