kaxil opened a new issue, #62736:
URL: https://github.com/apache/airflow/issues/62736

   
   ## Add Apache Iceberg table provider support for DataFusion AnalyticsOperator
   
   ### Description
   
   The DataFusion AnalyticsOperator currently supports querying data from 
registered object stores, but it does not yet support **Apache Iceberg** as a 
table provider.
   
   DataFusion's Python bindings can register Iceberg tables (loaded through 
`pyiceberg`) as table providers, allowing users to query them directly. See 
the [DataFusion Data Sources 
documentation](https://datafusion.apache.org/python/user-guide/data-sources.html#apache-iceberg) 
for details.
   
   ### Motivation
   
   Apache Iceberg is a widely adopted open table format for large-scale 
analytic datasets. Adding Iceberg table provider support to the 
AnalyticsOperator would enable users to:
   
   - Query Iceberg tables directly within Airflow DAGs using DataFusion
   - Build analytics pipelines on top of Iceberg-managed data lakes without 
leaving the Airflow ecosystem
   
   ### Requested Changes
   
   - Integrate the `datafusion` Iceberg table provider into the 
AnalyticsOperator
   - Allow users to register and query Iceberg tables, identified by catalog, 
namespace, and table name
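
As a rough illustration of the registration step requested above, here is a minimal sketch. It is not the operator's actual design: `IcebergTableRef` and `register_iceberg_table` are hypothetical names invented for this example, and the code assumes that `catalog` behaves like a `pyiceberg` catalog (exposing `load_table`) and that `ctx` behaves like a DataFusion `SessionContext` (exposing `register_table_provider`). Neither library is imported, so the helper can be exercised with stand-in objects.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class IcebergTableRef:
    """Hypothetical pointer to one Iceberg table (not an existing Airflow class)."""

    namespace: str
    table: str
    alias: Optional[str] = None  # name to expose in SQL; defaults to the table name

    @property
    def identifier(self) -> str:
        # Dotted "namespace.table" form, as used when loading a table from a catalog.
        return f"{self.namespace}.{self.table}"

    @property
    def registered_name(self) -> str:
        return self.alias or self.table


def register_iceberg_table(ctx: Any, catalog: Any, ref: IcebergTableRef) -> str:
    """Load `ref` from a catalog and register it on a DataFusion-like context.

    `ctx` is expected to be a datafusion SessionContext and `catalog` a
    pyiceberg catalog; both are typed as Any so this sketch imports neither.
    """
    iceberg_table = catalog.load_table(ref.identifier)
    # Assumption: the DataFusion release in use exposes register_table_provider;
    # other releases may name the registration method differently.
    ctx.register_table_provider(ref.registered_name, iceberg_table)
    return ref.registered_name
```

With both libraries installed, a caller might obtain `catalog` from `pyiceberg.catalog.load_catalog(...)` and `ctx` from `datafusion.SessionContext()`, call `register_iceberg_table(ctx, catalog, IcebergTableRef("default", "test"))`, and then query the registered name through `ctx.sql(...)`. How catalog connection settings would be passed through the AnalyticsOperator is left open here.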
   
   ### Reference
   
   - DataFusion Iceberg Data Source Guide: 
https://datafusion.apache.org/python/user-guide/data-sources.html#apache-iceberg
   

