Re: [I] AWS Glue Hudi to ICEBERG Tables Fails [incubator-xtable]

via GitHub Mon, 04 Mar 2024 05:37:32 -0800


soumilshah1995 commented on issue #362:
URL: https://github.com/apache/incubator-xtable/issues/362#issuecomment-1976599431


   Hello,
   
   I hope this message finds you well. I wanted to share with you the progress 
I've made in writing data into Hudi using PySpark. You can find the code 
implementation in this [GitHub 
repository](https://github.com/soumilshah1995/aws-hudi-delta-iceberg-interoperability/blob/main/glue_jupyter_workspace/Untitled2.ipynb).
   
   Here's a snippet of the sample data I've been working with:
   
   ```
   +--------------------+--------------+--------+----------+------------------+--------------------+--------------------+
   |         customer_id|          name|   state|      city|             email|          created_at|             address|
   +--------------------+--------------+--------+----------+------------------+--------------------+--------------------+
   |7dd63c8b-d588-4f3...|Shannon Fields|New York|Millerport|[email protected]|2024-03-02T12:45:...|344 Bates Flats S...|
   +--------------------+--------------+--------+----------+------------------+--------------------+--------------------+
   ```
   
   Regarding the improvements needed in the notebook (Untitled2.ipynb), I've 
noted that there might be nullable fields in the schema, which could pose an 
issue. To address this in PySpark, we need to ensure that all nullable fields 
are handled appropriately.
   
   Here's a suggestion on how to handle nullable fields in PySpark code:
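
The snippet itself is not included in this message. As a minimal sketch only (not necessarily the approach taken in the notebook), one way to handle nullable fields is to rewrite the schema so that every nullability flag is off. The helper name `with_non_nullable_fields` and the `tags` column are hypothetical; the dict layout is assumed to match what PySpark's `df.schema.jsonValue()` returns.

```python
from typing import Any, Dict


def with_non_nullable_fields(schema_json: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a Spark schema dict with all nullability flags set to False.

    Hypothetical helper: it assumes the dict follows the JSON layout produced by
    PySpark's StructType.jsonValue() and does not modify its input.
    """

    def visit_type(node: Any) -> Any:
        # Primitive types are plain strings such as "string" or "long".
        if not isinstance(node, dict):
            return node
        node = dict(node)  # shallow copy so the caller's dict is untouched
        kind = node.get("type")
        if kind == "struct":
            node["fields"] = [visit_field(f) for f in node["fields"]]
        elif kind == "array":
            node["containsNull"] = False
            node["elementType"] = visit_type(node["elementType"])
        elif kind == "map":
            node["valueContainsNull"] = False
            node["keyType"] = visit_type(node["keyType"])
            node["valueType"] = visit_type(node["valueType"])
        return node

    def visit_field(field: Dict[str, Any]) -> Dict[str, Any]:
        field = dict(field)
        field["nullable"] = False
        field["type"] = visit_type(field["type"])
        return field

    return visit_type(schema_json)


# Two columns: customer_id is from the sample data above; tags is a
# hypothetical array column added only to show the nested case.
example_schema = {
    "type": "struct",
    "fields": [
        {"name": "customer_id", "type": "string", "nullable": True, "metadata": {}},
        {
            "name": "tags",
            "type": {"type": "array", "elementType": "string", "containsNull": True},
            "nullable": True,
            "metadata": {},
        },
    ],
}

strict_schema = with_non_nullable_fields(example_schema)
```

With PySpark available, the resulting dict could be turned back into a schema with `StructType.fromJson(strict_schema)` and applied with `spark.createDataFrame(df.rdd, schema)`. Rows that actually contain nulls in a column marked non-nullable would typically fail at that point, so filling or dropping them first (for example with `df.na.fill` or `df.na.drop`) may be necessary.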


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

