Re: [I] Support source/sink for plain Parquet/ORC/Avro Tables [incubator-xtable]

via GitHub Tue, 30 Apr 2024 22:35:14 -0700


the-other-tim-brown commented on issue #166:
URL: https://github.com/apache/incubator-xtable/issues/166#issuecomment-2088005911


   > @the-other-tim-brown I'm trying to find a good first issue to ramp up on
   > XTable. Can I take a look at this one? Perhaps we can split it into
   > separate issues. One initial task could be to add support for the Parquet
   > input data format, for example. I'm not sure what the code looks like, but
   > ultimately we can create something modular enough to extend to Avro or
   > other formats later, if that hasn't been done already. I would be
   > interested in discussing possible approaches to filling in the
   > partitioning and statistics info...
   > 
   I think it makes sense to start with just one of the file formats like 
Parquet. We can discuss how to get the info you would need.
   
   > However, just to check that I understand the scenario correctly: if today
   > I wanted to bootstrap two different systems, Hudi and Iceberg, with
   > existing Parquet files, couldn't I use the native capabilities of either
   > system for the initial import, and then use the current XTable to generate
   > the metadata files for the other system?
   
   Yes, you could do that as well.
   
   There is another issue I had my eye on that I could guide you through as 
well if you are interested: 
https://github.com/apache/incubator-xtable/issues/411
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [I] Support source/sink for plain Parquet/ORC/Avro Tables [incubator-xtable]
