Re: [I] unable to create target format Delta with source format as Iceberg when the source table is on S3 [incubator-xtable]

via GitHub Tue, 14 May 2024 08:58:05 -0700


dipankarmazumdar commented on issue #431:
URL: https://github.com/apache/incubator-xtable/issues/431#issuecomment-2110594035


   @rajender07 - I am not really sure about this particular error. However, I 
tried reproducing this on my end and I was able to translate from ICEBERG to 
DELTA using the setup I suggested.
   
   ## ICEBERG TABLE CONFIG & CREATION:
   ```
   import pyspark
   from pyspark.sql import SparkSession
   import os

   # Iceberg Hadoop catalog named hdfs_catalog, with its warehouse on S3
   conf = (
       pyspark.SparkConf()
           .setAppName('app_name')
           .set('spark.jars.packages', 'org.apache.hadoop:hadoop-aws:3.3.4,org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.4.3,software.amazon.awssdk:bundle:2.17.178,software.amazon.awssdk:url-connection-client:2.17.178')
           .set('spark.sql.extensions', 'org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions')
           .set('spark.sql.catalog.hdfs_catalog', 'org.apache.iceberg.spark.SparkCatalog')
           .set('spark.sql.catalog.hdfs_catalog.type', 'hadoop')
           .set('spark.sql.catalog.hdfs_catalog.warehouse', 's3a://my-bucket/new_iceberg/')
           .set('spark.sql.catalog.hdfs_catalog.io-impl', 'org.apache.iceberg.aws.s3.S3FileIO')
   )
   spark = SparkSession.builder.config(conf=conf).getOrCreate()
   print("Spark Running")

   # Create the source table and insert a few rows
   spark.sql("CREATE TABLE hdfs_catalog.table1 (name string) USING iceberg")
   spark.sql("INSERT INTO hdfs_catalog.table1 VALUES ('Alex'), ('Dipankar'), ('Mary')")
   ```
   
   ## my_config.yaml
   ```
   sourceFormat: ICEBERG
   targetFormats:
     - DELTA
   datasets:
     -
       tableBasePath: s3://my-bucket/new_iceberg/table1/
       tableDataPath: s3://my-bucket/new_iceberg/table1/data
       tableName: table1
   ```
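
The paths in this config follow from the Spark catalog settings: the table lives directly under the warehouse, and the data files sit in its `data` folder. A minimal sketch of deriving the YAML from the warehouse path and table name, assuming a Hadoop catalog with no namespace and the `s3a://` to `s3://` scheme change used in this example. `dataset_config` is a hypothetical helper, not part of XTable.

```python
def dataset_config(warehouse, table, source="ICEBERG", targets=("DELTA",)):
    """Render a dataset config shaped like my_config.yaml (hypothetical helper)."""
    # The Spark catalog uses s3a:// while the example config uses s3://.
    base = warehouse.rstrip("/").replace("s3a://", "s3://", 1) + "/" + table
    lines = [f"sourceFormat: {source}", "targetFormats:"]
    lines += [f"  - {target}" for target in targets]
    lines += [
        "datasets:",
        "  -",
        f"    tableBasePath: {base}/",
        f"    tableDataPath: {base}/data",
        f"    tableName: {table}",
    ]
    return "\n".join(lines) + "\n"


config_text = dataset_config("s3a://my-bucket/new_iceberg/", "table1")
print(config_text)
```

Writing `config_text` to `my_config.yaml` should give the same file as above, apart from indentation.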
   
   ## Run Sync
   `java -jar utilities/target/utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml`
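
If the sync needs to be triggered from a script, for example right after the PySpark job, the same command can be built as an argument list. This is only a sketch: it assumes the jar path shown above, that it is run from the repository root, and that `java` is on the PATH. `sync_command` and `run_sync` are hypothetical names.

```python
import subprocess

# Jar path taken from the command above; adjust to your build output.
DEFAULT_JAR = "utilities/target/utilities-0.1.0-SNAPSHOT-bundled.jar"


def sync_command(config_path, jar=DEFAULT_JAR):
    """Return the sync invocation as an argument list (no shell quoting needed)."""
    return ["java", "-jar", jar, "--datasetConfig", config_path]


def run_sync(config_path, jar=DEFAULT_JAR):
    """Run the sync; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(sync_command(config_path, jar), check=True)


# Show the command without executing it.
print(" ".join(sync_command("my_config.yaml")))
```

Calling `run_sync("my_config.yaml")` would then execute the sync and fail loudly if the process exits with an error.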


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
